51: Trino cools off with PopSQL

Video

Audio

Hosts

Cole Bowden, Developer Advocate at Starburst
Manfred Moser, Director of Technical Content at Starburst (@simpligility)

Guests

Jake Peterson, Head of Customer Success at PopSQL
Matthew Peveler, Software Engineer at PopSQL, MasterOdin on GitHub

Releases 423-427

Official highlights from Martin Traverso:

Trino 423

Schema evolution for nested fields
Support for comments on materialized view columns
Support for CASCADE option in DROP SCHEMA for Clickhouse, MariaDB, MySQL, Oracle and SingleStore
Various performance improvements

Trino 424

Improved performance for JSON, CSV, text and related formats in Hive
Support for CASCADE in DROP SCHEMA for PostgreSQL and Iceberg
Improved coordinator CPU utilization for large clusters

Trino 425

Improved performance of GROUP BY.
Support for check constraints in MERGE for Delta Lake connector.
Support for the Decimal128 in MongoDB connector.

Trino 426

Support for SET/RESET SESSION AUTHORIZATION.
Improved performance of aggregations over decimal values.
Support for TRUNCATE TABLE in Delta Lake connector.
Support for Databricks 13.3 LTS.

Trino 427

Improved performance for GROUP BY and DISTINCT.
Support for pushing down UPDATE statements into connectors.
Support for reading Delta Lake tables with Deletion Vectors.
Faster writing to Parquet files in Delta Lake and Iceberg.
Support for querying tags in Iceberg.

Concept of the episode: PopSQL

It may be familiar to some of our viewers to describe an environment where key queries and dashboards are buried in someone’s personal workspace, and you have to go ask them directly every time you want to check on your metrics. When you’re running a world-class, highly-performant query engine like Trino and investing time and resources into maintaining it, shouldn’t you treat your queries like a first-class, collaborative, versioned system, too?

PopSQL, a playful spin on the word popsicle, solves the sadness that is disorganized and siloed insights by centralizing queries into a platform that has versioning, security, and a suite of collaborative tools comparable to Google Drive. Want to work with your teammate on a query? You can open up the same editor and see the same thing. Want to see what that query someone ran last week was to see how the new feature is doing? It’s there. Have a suggestion to improve something? Leave a comment. Realize your suggestion was wrong and need to undo the change? You can view past versions of the query.

PopSQL and Trino make sense together. PopSQL provides a best-in-class interface for organizing, collaborating, and working together on all of your SQL queries across the business, and Trino handles running those queries at unparalleled speeds. They go hand-in-hand for treating your data and SQL analytics as first class citizens. In today’s episode, we’ll be exploring what PopSQL is, how it integrates with Trino, and how the engineers at PopSQL have done some cool things with Trino to make the integration better than ever before. We’ll start with that last one, actually.

Concept of the episode: A new Node.js adapter for Trino

Trino in the frontend is… a tricky thing. We can go ahead and admit that the Trino web UI isn’t going to win any awards for design or functionality. And while a couple Node-based libraries exist out there, including presto-client-node and lento. But presto-client-node lacked support for streaming and had some issues handling 500 errors, and lento doesn’t quite support Trino out of the box and only supports single streams, which wasn’t ideal for PopSQL’s distributed architecture. So when PopSQl’s engineers went to build their frontend and integrate with Trino, what did they do? Build their own adapter.

We’ll talk about how it was implemented, what key features it unlocks, and why it makes using PopSQL with Trino an even better experience.

Demo of the episode: Using PopSQL with Trino

It’s hard to write show notes for a demo, because you can’t really experience the demo by reading about what’s happening. But as a surface-level overview, we’ll be going over:

Setting up a connection
The schema explorer
The SQL editor
Query scheduling

PR of the episode: #57 on trino-gateway: Release version 3

Last week, the community officially released the trino-gateway, a proxy and load balancer that enables large operations to run multiple Trino clusters in harmony with each other to serve big queries and small queries alike. If you or your organization have a need for more than one Trino cluster and want the seamless experience of being able to connect to any of them through a single interface, then check it out! It’s the product of many months of effort and should be a fantastic solution for running Trino at the absolute largest scales.

To learn more about it, you should check out the blog post announcing its first release.

Rounding out

Trino Summit, the biggest Trino event of the year, is coming up on the 13th and 14th of December, and like Trino Fest, it’ll be fully virtual. If you’d like to give a talk about anything related to Trino, we’re looking for speakers now. Submit your talk here! If you’d rather attend, you can also go register to attend now.

Prior to Trino Summit, if you’d like to learn about SQL from the absolute experts, we’ve also announced the Trino Training Series that we’ll be running as a buildup to the summit. Register now and look forward to four great sessions starting from the ground up and ending with some key tricks and Trino specifics that even a seasoned SQL veteran may not know about.

If you want to learn more about Trino, get the definitive guide from O’Reilly. You can download the free PDF or buy the book online.

Music for the show is from the Megaman 6 Game Play album by Krzysztof Slowikowski.

Do you ❤️ Trino? Give us a 🌟 on GitHub

Trino Community Broadcast