Trino Fest 2024 recap

Jun 24, 2024 • Manfred Moser, Cole Bowden, Monica Miller

Trino Fest 2024 is successfully in the books! While over 100 enthusiastic members of the community gathered in Boston, over 650 virtual attendees joined us worldwide to learn from our expert speakers as they discussed topics such as table formats, enhancements and optimizations, and use cases with Trino both large and small. And now it is your chance to revisit the presentations or catch up on everything you missed.

Impressions

Judging from early attendee and speaker feedback, everyone enjoyed the event. When we asked which sessions the audience liked, we got answers like:

They were all very insightful.
All of it, but especially the realtime demos to see speed difference on query optimization.
and All of them, nothing was missed!

Just like some attendees, our speakers travelled from Europe, Asia, and other places, and enjoyed the event.

Thanks for organizing the awesome event and inviting me for the talk!
Was great to finally meet you and we had a great time at Trino Fest!
Thanks for a great event last week. It was a pleasure to meet you all.

Many of us also met Commander Bun Bun, and we sent greetings to the remote audience as well.

The keynote, the sessions, and all the talk in the hallways confirmed that Trino continues to thrive and expand in usage. Large companies like Apple, Microsoft, LinkedIn, Amazon, and many other users openly talk about shipping Trino as part of their products and using it internally as well. Smaller companies either run Trino themselves or take advantage of Trino-based products for all their data platform needs. Our sessions for Trino Fest offered something for everyone to learn.

Sponsors

Bringing the event together was only possible thanks to the great Trino events team around Anna Schibli at our main sponsor Starburst, and the assistance from all our other sponsors. A heartfelt thank you from Commander Bun Bun and all of us goes out to you!

Sessions

Now, here is what you are really looking for: all the talks, speakers, short recaps, slide decks, video recordings, and the Q&A sessions that followed, ready for you. Enjoy!

What’s new in Trino this summer
Presented by Martin Traverso from Starburst.

Martin recapped everything that’s happened in Trino over the last six months, taking a look at the biggest new features and how Trino development is going better than ever. He also gave a sneak peek at what we can expect soon in Trino.
Video recording | Slides

Reducing query cost and query runtimes of Trino powered analytics platforms
Presented by Jonas Irgens Kylling from Dune.

Jonas gave a detailed talk about how Dune has improved the performance of their Trino deployment with a few key tweaks. These include leveraging caching with Alluxio, advanced cluster management, and storing, sampling, and filtering query results.
Video recording | Slides

Enhancing Trino’s query performance and data management with Hudi: innovations and future
Presented by Ethan Guo from Onehouse.

Ethan gave a look into development on Hudi and Trino’s Hudi connector, explaining multi-modal indexing and how it can improve query performance. He also gave an overview of the roadmap and future of the connector.
Video recording | Slides

Trino Engineering @ Microsoft
Presented by George Fisher and Ishan Patwa from Microsoft.

George and Ishan gave a deep dive into what’s been going on with Microsoft’s deployment and management of Trino. This included clients and integrations, result caching, a sharded SQL connector, deep debugging and monitoring, and seamless security integration with Azure.
Video recording

Enhancing data governance in Trino with the OpenLineage integration
Presented by Alok Kumar Prusty from Apple.

Alok’s lightning talk is all about how Apple deployed OpenLineage, an open framework for data lineage collection and analysis, and built a Trino plugin to publish OpenLineage-compliant events that can be viewed and monitored.
Video recording

Best practices and insights when migrating to Apache Iceberg for data engineers
Presented by Amit Gilad from Cloudinary.

Amit shared how Cloudinary expanded their data lake to use Apache Iceberg. He demonstrated how moving from Snowflake to an open table format allowed them to reduce storage costs and leverage different query and processing engines to run more powerful analytics at scale.
Video recording | Slides

Trino query intelligence: insights, recommendations, and predictions
Presented by Marton Bod from Apple.

Marton’s lightning talk explored how Apple has monitored and stored metadata for every Trino query execution, then used that data for real-time cluster dashboarding, self-service troubleshooting, and automatic generation of recommendations for users.
Video recording

The open source journey of the Trino Delta Lake Connector
Presented by Marius Grama from Starburst.

Marius went into a deep dive on all the work and collaboration that’s gone into making the Delta Lake connector in Trino a robust, first-class connector. Casual discussions, engineers working together, GitHub issues filed by the community, and innovative contributions have all come together, and Marius’ talk shows why an open source community is so powerful.
Video recording | Slides

Tiny Trino; new perspectives in small data
Presented by Ben Jeter and Thomas Zugibe from Executive Homes.

Ben and Tommy explore how Executive Homes uses Trino’s robust suite of integrations to handle data at a small scale. Instead of petabytes, how about a handful of gigabytes in several different systems? It’s something that Trino is well-equipped to handle thanks to how well-supported it is in the data ecosystem, and they explain why.
Video recording | Slides

Bridging the divide: running Trino SQL on a vector data lake powered by Lance
Presented by Lei Xu from LanceDB and Noah Shpak from Character.ai.

Lei and Noah give an overview of LanceDB, how it works, and what makes it a great database for multimodal AI. Then they dive into a Trino connector for Lance, and explore how Trino slots into Character.AI’s workload to blend analytics with training and generating new models.
Video recording | Slides

How FourKites runs a scalable and cost-effective log analytics solution to handle petabytes of logs
Presented by Arpit Garg from FourKites.

With nearly a petabyte of logs being managed at FourKites, it shouldn’t be a huge surprise that they’ve turned to Trino to handle understanding and analyzing them. Arpit discusses how they’ve scaled log ingestion, strategically used S3 with Parquet to minimize storage costs, transformed and extracted those logs at scale, and leveraged Trino to search and explore the datasets with Superset as a frontend for visualization.
Video recording | Slides

Observing Trino
Presented by Matt Stephenson from Starburst.

Starburst has built a comprehensive observability platform around Trino to better serve its users and customers. Matt explored all the components of it, including how to integrate with Jaeger, Prometheus, and ELK.
Video recording | Slides

Accelerate Performance at Scale: Best Practices for Trino with Amazon S3
Presented by Dai Ozaki from AWS.

Dai’s talk explores best practices to get the most out of using Trino in conjunction with Amazon S3. He discusses partitioning, scaling workloads, reducing latency, and resolving common bottlenecks, providing valuable insights for anyone trying to manage and deploy Trino with S3.
Video recording | Slides

What’s next

While you are busy catching up, we are still working hard on a recap of the Trino Contributor Congregation. We also had a lot of great conversations that led us to follow-up action items such as more pull requests to review, new contributors to onboard, and more projects to work on.

Make sure to join the community on Slack to learn more in the next little while.

Oh, and one last thing…

Trino Summit 2024 registration is open

See you soon,

Manfred, Cole, and Monica
