What an exciting month we had in August! August marked the tenth birthday of the Trino project. Don’t worry if you missed all the excitement, as we’ve condensed it all into this post.
Blog posts
We felt it necessary to chronicle the larger events that happened in the last decade of the project through the lens of where we are today.
- Why leaving Facebook/Meta was the best thing we could do for the Trino Community
- A decade of query engine innovation
- Happy tenth birthday Trino!
We shared these posts on Hacker News, and both the Facebook post and the query innovation post hit the front page. This resulted in one of the largest numbers of page views on the Trino website in a single day: more than 25k views!
Trino ten-year timeline video
Another way we celebrated was creating an epic ten-year montage video that chronicles the incredible journey starting with the Presto project’s humble beginnings, and how it evolved into the success that Trino is today:
Birthday celebration with the creators of Trino
To cap things off last month, we hosted a meetup with the creators to reflect on the last ten years, laugh and listen to some stories from the early days, talk about the exciting features currently launching, and speculate on the next ten years of Trino. Here are some highlights you missed:
Adding dynamic catalogs
Dain discusses what dynamic catalogs could look like in Trino. Currently, to add a catalog in Trino, you need to add the new catalog configuration file and then restart Trino. With dynamic catalogs, you can add and remove catalogs at runtime with no restart required. There is still no guarantee of exactly when this feature will arrive, but some of the foundations are currently being added. Dain dives into this a bit more in this clip.
Vectorization and performance
As more marketing around vectorized databases has come up recently, many have asked whether Trino will follow the trend. This question comes up at an interesting time, as Trino now requires Java 17 to run. Java 17 comes with a lot of capabilities to vectorize, and while we are excited to start looking into these capabilities, simply updating workloads to use vectorization doesn’t pack the performance punch that many would expect. The answer is more complex:
- Do modern workloads benefit from vectorization? See Martin’s answer to this
- Is there a benefit to vectorization over Java’s auto-vectorization? Sometimes, but Dain elaborates on when
- If not vectorization, what type of performance improvements does Trino focus on? Martin and Dain list some simple but impactful ones
- The debate around query-time optimization versus runtime adaptation. Which should you optimize first?
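To make the auto-vectorization point above a little more concrete, here is a minimal sketch of the kind of loops a JIT compiler can often turn into SIMD instructions on its own. The class and method names are hypothetical and this is not Trino code. Whether a given loop is actually vectorized depends on the JVM version and the CPU, so treat it as an illustration of loop shape rather than a performance claim.

```java
// Illustrative only: ColumnSum and its methods are hypothetical names,
// not part of Trino.
final class ColumnSum {
    private ColumnSum() {}

    // Element-wise update over primitive arrays: a simple counted loop with
    // no branches and no dependency between iterations. Loops of this shape
    // are typical candidates for the JIT compiler's auto-vectorization.
    static void addInPlace(long[] target, long[] delta) {
        int n = Math.min(target.length, delta.length);
        for (int i = 0; i < n; i++) {
            target[i] += delta[i];
        }
    }

    // Reduction over a primitive array. Each iteration depends on the running
    // total, so whether the JIT vectorizes it varies more by JVM and hardware.
    static long sum(long[] values) {
        long total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    public static void main(String[] args) {
        long[] a = {1, 2, 3, 4};
        long[] b = {10, 20, 30, 40};
        addInPlace(a, b);
        System.out.println(sum(a)); // prints 110
    }
}
```

Nothing in this code asks for vectorization explicitly. The point is that plain loops over primitive arrays already give the JIT room to do that work, which is part of why hand-written vectorization does not automatically win.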
Polymorphic table functions
One feature that is top of mind for everyone in the Trino project is polymorphic table functions, or simply “table functions” as Dain prefers to call them.
- What is a table function? David and Dain discuss standard and polymorphic table functions
- Could we rewrite the Google Sheets connector as a table function? David and Dain discuss how this would work
- Why table functions are so incredibly powerful. Eric and Dain talk about why PTFs are a game changer
If you want to learn more about polymorphic table functions, check out the recent Trino Community Broadcast episode that covers the potential of these functions in much more detail.
The early days of Presto and Trino
We wanted to get some insight into what the early days of the project looked like, and how Martin, Dain, David, and Eric began the daunting task of designing and building a distributed query engine from scratch. Some of the discussions were interesting while others were downright hilarious. Here are some steps you can take to write your own query engine, at least if you want to do it the way the Trino creators did it:
- Look up a bunch of research papers to see how others are doing this 📑. Video
- Side note: Papers tend to be highly aspirational and skip important fundamentals. Video
- Address the real challenges of making a query engine. Video
- Take your initial version and just throw it away 😂🗑🚮. Video
- Expand outside the initial use cases by learning from other companies and building community 👥. Video
- Cause a brownout on the Facebook network 📉. Video
- Realize the system you replaced was actually faster in some cases, but for all the wrong reasons ❌🙅. Video
After a lot of the initial work was done, Presto was deployed at Facebook and open sourced soon after. From there, the velocity of the project picked up, and once the project was independent of Facebook, feature development took off even more. While everything may seem calculated in hindsight, it was a lot of hard work to grow the community and adoption around Presto and now Trino. The creators knew they were making a project that would be used outside the walls of Facebook, but they could never have anticipated the sheer scale of adoption Trino would see.
Conclusion
We hope you enjoyed all the fun we had celebrating these first ten years of the Trino project. We are thrilled to think of what the following decades will bring. We’d like to leave you with closing thoughts from Dain: