
Trino Community Broadcast

71: Fake it real good

Feb 27, 2025

Introduction

Manfred Moser and Cole Bowden are joined by Jan Waś to learn about the new Faker connector and the Datafaker library. You can use it to emulate data that does not exist on any storage, shape it as you need, and then learn real SQL, build real reports, and make some real charts - all with fake data.

Video

Audio

Trino Community Broadcast 71: Fake it real good (1:18:42)

Hosts

Guest

Releases

The following are some highlights of the recent releases:

Trino 471

  • Add AI functions for textual tasks on data using OpenAI, Anthropic, or other LLMs with Ollama as the backend.
  • Add support for logging output to the console in JSON format (useful in containers).
  • Support additional Python libraries for use with Python user-defined functions.
  • Remove the RPM package.
  • Add local file system support.
  • Add support for S3 Tables in Iceberg connector.

As always, numerous performance improvements, bug fixes, and other features were added as well.

Trino Gateway 14

Our first Trino Gateway release of 2025 shipped, and it is packed with great new features and fixes. Some examples are the following:

  • Rules editor in the web interface
  • Automatic database schema update and support for Oracle
  • Trino cluster monitoring with JMX and OpenMetrics

Introducing Jan Waś

Jan, also known as nineinchnick on GitHub, is a very active Trino contributor with a wide range of his own plugins and projects. He is a subproject maintainer for the Helm charts and the Grafana plugin, and is heavily involved in the GitHub Actions setup and numerous other efforts. Jan resides in Poland. When he is not working on Trino, you can find him at metal, electronics, and even opera concerts across Europe or at home playing video games.

Datafaker, Faker connector, and Trino

We talk about using simulated data from the TPC-H and TPC-DS connectors to learn SQL, and for other scenarios such as benchmarking, testing SQL support, and validating other connectors and data sources. This leads us to the limitations of these connectors and how the Faker connector is the next step.

Jan tells us about the Datafaker library and his motivation to create a connector, and how it eventually landed in Trino itself.

Demo time

Jan shows us how to configure the connector and then demos a number of use cases, from learning SQL to populating and testing other data sources.
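
For readers who want to follow along, configuring the connector comes down to adding a catalog properties file. The following is a minimal sketch and not the exact configuration from the demo: the file name `etc/catalog/faker.properties` and all values are assumptions chosen for illustration, and the property names should be checked against the Faker connector documentation for your Trino version.

```properties
# etc/catalog/faker.properties (assumed file name; the catalog is then called "faker")
connector.name=faker
# Assumed value: share of NULLs generated for nullable columns
faker.null-probability=0.1
# Assumed value: row count returned when a query has no LIMIT clause
faker.default-limit=1000
# Assumed value: locale used by Datafaker when generating values
faker.locale=en
```

With a catalog like this in place, tables created in it are addressed as `faker.<schema>.<table>`, and the generated values follow the column types and properties you declare on them.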

Resources

Rounding out

Watch the recording of the Trino contributor call or read the minutes.

Join us for upcoming events and let us know if you want to be a guest:

  • Trino Community Broadcast 72: Keeping the lake clean, all about Lakekeeper
  • Trino Community Broadcast 73: Wrapping Trino packages with a bow