Trino Fest 2024 is successfully in the books! While over 100 enthusiastic members of the community gathered in Boston, over 650 virtual attendees joined us worldwide to learn from our expert speakers as they discussed topics such as table formats, enhancements and optimizations, and use cases with Trino both large and small. And now it is your chance to revisit the presentations or catch up on everything you missed.
Impressions
Judging from early attendee and speaker feedback, everyone enjoyed the event. When we asked which sessions the audience liked, we got answers like:
- They were all very insightful.
- All of it, but especially the realtime demos to see speed difference on query optimization.
- All of them, nothing was missed!
Just like some attendees, our speakers travelled from Europe, Asia, and other places, and they enjoyed the event as well:
- Thanks for organizing the awesome event and inviting me for the talk!
- Was great to finally meet you and we had a great time at Trino Fest!
- Thanks for a great event last week. It was a pleasure to meet you all.
Many of us also met Commander Bun Bun, and we sent greetings to the remote audience as well.
The keynote, the sessions, and all the talk in the hallways confirmed that Trino continues to thrive and expand in usage. Large companies like Apple, Microsoft, LinkedIn, Amazon, and many other users openly talk about shipping Trino as part of their products and using it internally as well. Smaller companies either run Trino themselves or take advantage of Trino-based products for all their data platform needs. Our sessions for Trino Fest offered something to learn for everyone.
Sponsors
Bringing the event together was only possible thanks to the great Trino events team around Anna Schibli at our main sponsor Starburst, and the assistance from all our other sponsors. A heartfelt thank you from Commander Bun Bun and all of us goes out to you!
Sessions
Now, here is what you are really looking for: all the talks, speakers, short recaps, slide decks, video recordings, and the Q&A sessions that followed, ready for you. Enjoy!
What’s new in Trino this summer
Presented by Martin Traverso from
Starburst.
Martin recapped everything that’s happened in Trino over the last six months,
taking a look at the biggest new features and how Trino development is going
better than ever. He also gave a sneak peek at what we can expect soon in Trino.
Video recording
| Slides
Reducing query cost and query runtimes of Trino powered analytics platforms
Presented by Jonas Irgens Kylling from
Dune.
Jonas gave a detailed talk about how Dune has improved the performance of
Trino with a few key tweaks. That includes leveraging caching with Alluxio,
advanced cluster management, and storing, sampling, and filtering query results.
Video recording
| Slides
Enhancing Trino’s query performance and data management with Hudi: innovations and future
Presented by Ethan Guo from
Onehouse.
Ethan gave a look into development on Hudi and Trino’s Hudi connector,
explaining multi-modal indexing and how it can improve query performance. He
also gave an overview of the roadmap and future of the connector.
Video recording
| Slides
Trino Engineering @ Microsoft
Presented by George Fisher and Ishan Patwa from
Microsoft.
George and Ishan gave a deep dive into what’s been going on with Microsoft’s
deployment and management of Trino. This included clients and integrations,
result caching, a sharded SQL connector, deep debugging and monitoring, and
seamless security integration with Azure.
Video recording
Enhancing data governance in Trino with the OpenLineage integration
Presented by Alok Kumar Prusty from
Apple.
Alok’s lightning talk is all about how Apple deployed OpenLineage, an open
framework for data lineage collection and analysis, and built a Trino plugin to
publish OpenLineage-compliant events that can be viewed and monitored.
Video recording
Best practices and insights when migrating to Apache Iceberg for data engineers
Presented by Amit Gilad from
Cloudinary.
Amit shared how Cloudinary expanded their data lake to use Apache Iceberg. He
demonstrated how moving from Snowflake to an open table format allowed them to
reduce storage costs and leverage different query and processing engines to run
more powerful analytics at scale.
Video recording
| Slides
Trino query intelligence: insights, recommendations, and predictions
Presented by Marton Bod from Apple.
Marton’s lightning talk explored how Apple has monitored and stored metadata for
every Trino query execution, then used that data for real-time cluster
dashboarding, self-service troubleshooting, and automatic generation of
recommendations for users.
Video recording
The open source journey of the Trino Delta Lake Connector
Presented by Marius Grama from
Starburst.
Marius went into a deep dive on all the work and collaboration that’s gone into
making the Delta Lake connector in Trino a robust, first-class connector. Casual
discussions, engineers working together, GitHub issues filed by the community,
and innovative contributions have all come together, and Marius’ talk shows why
an open source community is so powerful.
Video recording
| Slides
Tiny Trino; new perspectives in small data
Presented by Ben Jeter and Thomas Zugibe from
Executive Homes.
Ben and Tommy explore how Executive Homes uses Trino’s robust suite of
integrations to handle data at a small scale. Instead of petabytes, how about a
handful of gigabytes in several different systems? It’s something that Trino is
well-equipped to handle thanks to how well-supported it is in the data
ecosystem, and they explain why.
Video recording
| Slides
Bridging the divide: running Trino SQL on a vector data lake powered by Lance
Presented by Lei Xu from LanceDB
and Noah Shpak from Character.ai.
Lei and Noah give an overview of LanceDB, how it works, and what makes it a
great database for multimodal AI. Then they dive into a Trino connector for
Lance, and explore how Trino slots into Character.AI’s workload to blend
analytics with training and generating new models.
Video recording
| Slides
How FourKites runs a scalable and cost-effective log analytics solution to
handle petabytes of logs
Presented by Arpit Garg from
FourKites.
With nearly a petabyte of logs being managed at FourKites, it shouldn’t be a
huge surprise that they’ve turned to Trino to understand and analyze
them. Arpit discusses how they’ve scaled log ingestion, strategically used S3
with Parquet to minimize storage costs, transformed and extracted those logs at
scale, and leveraged Trino to search and explore the datasets with Superset as a
frontend for visualization.
Video recording
| Slides
Observing Trino
Presented by Matt Stephenson from
Starburst.
Starburst has built a comprehensive observability platform around Trino to
better serve its users and customers. Matt explored all the components of it,
including how to integrate with Jaeger, Prometheus, and ELK.
Video recording
| Slides
Accelerate Performance at Scale: Best Practices for Trino with Amazon S3
Presented by Dai Ozaki from AWS.
Dai’s talk explores best practices to get the most out of using Trino in
conjunction with Amazon S3. He discusses partitioning, scaling workloads,
reducing latency, and resolving common bottlenecks, providing valuable insights
for anyone trying to manage and deploy Trino with S3.
Video recording
| Slides
What’s next
While you are busy catching up, we are still working hard on a recap of the Trino Contributor Congregation. We also had a lot of great conversations that led to follow-up action items such as more pull requests to review, new contributors to onboard, and more projects to work on.
Make sure to join the community on Slack to learn more in the coming weeks.
Oh, and one last thing…
See you soon,
Manfred, Cole, and Monica