Do you ❤️ Trino? Give us a 🌟 on GitHub

Trino Community Broadcast

60: Trino calling AI

May 22, 2024

Introduction

Isa Inalcik from BestSecret chats with us about his exploration to create a Trino connector that interacts with generative AI tools.

Video

Audio

 

Hosts

Guest

Releases and news

Trino 446

  • Add support for the Snowflake catalog in the Iceberg connector.
  • Add support for reading S3 objects restored from Glacier storage in the Hive connector.
  • Add support for unsupported type handling configuration in the Snowflake connector.

Trino 447

  • Add support for SHOW CREATE FUNCTION.
  • Require Java 22.
  • Add support for concurrent DELETE and TRUNCATE in the Delta Lake connector.
  • Remove support for Phoenix 5.1.x and earlier.

Trino 448

  • Improve performance of reading from Parquet files.
  • Add support for caching Glue metadata with the update to use the V2 REST interface.

Trino Gateway 8 and 9

  • Add support support for configurable router policies with two new policies available.
  • Add a Helm chart for deployment.
  • Add new website.

We also had a new Trino Helm chart release 0.20.0.

Jan Waś is now also subproject maintainer of the go client and the Helm charts.

Impressions from the Iceberg Summit

Last week, Cole attended the Iceberg Summit with a special Trino perspective, and we chat about his impressions and major take-aways.

Guest Isa Inalcik from BestSecret

Isa is a highly skilled data expert with over a decade of hands-on experience in software development lifecycle. He is well versed with many data tools including Trino/Starburst Enterprise Platform, Snowflake, Airflow, Apache Spark, Hive, Apache Iceberg, dbt, and others.

Trino at BestSecret

At BestSecret, a leading online retailer for fashion and lifestyle in Europe, Isa spearheads the development of efficient and resilient ELT/ETL pipelines and the implementation of data and AI-driven solutions. We chat in more details about their setup and use cases, his solutions, and challenges he is facing.

Generative AI interest and use cases

Isa has been following the waves of interest in AI and sees the following use cases related to data and Trino:

  • Media (Audio,Video,Image): Extract information out of images.
  • Object categorization: Categorize objects on images, videos.
  • Data masking: For anonymizing sensitive data from unstructured text.
  • Data extraction: To pull structured information from unstructured text.
  • Sentiment analysis: For gauging the sentiment of textual data.
  • Language detection or translation: For language detection or translating.
  • Summarization: To generate concise summaries from lengthy texts.

This inspired him to try an integration of the new emerging LLMs with Trino.

Trino SPI

Trino uses a service provider interface (SPI) to allow developers to create plugins for features such as connectors, security integrations and custom functions. This is crucial for business to implement required functionality and enabled Isa to work on a plugin to support custom functions that call LLMs.

The OpenAI API specification also allowed him to create one function that can be used with different LLM backends.

Proof of concept and demo

We look at the concept and implementation that Isa developed with the following architecture:

Isa’s trino-ai repository contains source code and more details as mentioned in his post on LinkedIn and used in the demo.

Other resources

Rounding out

Trino Fest news:

Other topics:

If you want to learn more about Trino, check out the definitive guide from O’Reilly. You can get the free PDF from Starburst or buy the English, Polish, Chinese, or Japanese edition.

Music for the show is from the Megaman 6 Game Play album by Krzysztof Slowikowski.