Data sources

Data sources provide the data for Trino to query. To access a data source, configure a catalog with the Trino connector required for that data source. With the catalog in place, you can use any supported client to query the data source with SQL and the features of that client.
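As a rough illustration of what configuring a catalog involves, the Python sketch below renders the text of a catalog properties file. In a typical Trino deployment such a file lives at etc/catalog/<catalog name>.properties, and the connector.name property selects the connector. The helper name render_catalog_properties, the host example.net, and the credentials are placeholders chosen for this example and are not part of Trino.

```python
def render_catalog_properties(connector_name, properties=None):
    """Return the text of a catalog properties file.

    connector_name: value for the connector.name property, e.g. "postgresql".
    properties: optional dict of further connector-specific properties.
    """
    lines = [f"connector.name={connector_name}"]
    for key, value in (properties or {}).items():
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Hypothetical PostgreSQL catalog; host, database, and credentials are placeholders.
postgres_catalog = render_catalog_properties(
    "postgresql",
    {
        "connection-url": "jdbc:postgresql://example.net:5432/database",
        "connection-user": "root",
        "connection-password": "secret",
    },
)
print(postgres_catalog)
```

Saved as, for example, etc/catalog/postgres.properties, this would expose the database under a catalog named postgres. Each connector has its own set of properties, so treat the keys above as specific to this one example.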

Official data sources

The connectors for the following data sources are developed and maintained by the Trino community.

Amazon Kinesis

Integration developed and maintained by the Trino community

Amazon Kinesis cost-effectively processes and analyzes streaming data at any scale as a fully managed service. With Kinesis, you can ingest real-time data, such as video, audio, application logs, website clickstreams, and IoT telemetry data, for machine learning (ML), analytics, and other applications.

Use an Amazon Kinesis stream as a data source in Trino by configuring a catalog with the Kinesis connector.

Amazon Redshift

Integration developed and maintained by the Trino community

Amazon Redshift uses SQL to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes, using AWS-designed hardware and machine learning to deliver the best price performance at any scale.

Use an Amazon Redshift data warehouse as a data source in Trino by configuring a catalog with the Redshift connector.

Apache Accumulo

Integration developed and maintained by the Trino community

Apache Accumulo® is a sorted, distributed key-value store that provides robust, scalable data storage and retrieval.

Use an Apache Accumulo key-value store as a data source in Trino by configuring a catalog with the Accumulo connector.

Apache Cassandra

Integration developed and maintained by the Trino community

Apache Cassandra is an open source NoSQL distributed database trusted by thousands of companies for scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data.

Use an Apache Cassandra database as a data source in Trino by configuring a catalog with the Cassandra connector.

Apache Druid

Integration developed and maintained by the Trino community

Druid is a high performance, in-memory, real-time analytics database that delivers sub-second queries on streaming and batch data at scale and under load.

Use an Apache Druid database as a data source in Trino by configuring a catalog with the Druid connector.

Apache Hive

Integration developed and maintained by the Trino community

Apache Hive is a distributed, fault-tolerant data warehouse system that enables analytics at a massive scale and facilitates reading, writing, and managing petabytes of data residing in distributed storage using SQL.

Use an Apache Hive data warehouse as a data source in Trino by configuring a catalog with the Hive connector.

Apache Hudi

Integration developed and maintained by the Trino community

Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi reimagines slow old-school batch data processing with a powerful new incremental processing framework for low latency minute-level analytics.

Use an Apache Hudi data lake as a data source in Trino by configuring a catalog with the Hudi connector.

Apache Iceberg

Integration developed and maintained by the Trino community

Apache Iceberg is a high-performance format for huge analytic tables. Iceberg brings the reliability and simplicity of SQL tables to big data, while making it possible for engines like Spark, Trino, Flink, Presto, Hive and Impala to safely work with the same tables, at the same time.

Use an Apache Iceberg data lakehouse as a data source in Trino by configuring a catalog with the Iceberg connector.

Apache Ignite

Integration developed and maintained by the Trino community

Apache Ignite is a distributed in‑memory database for high‑performance applications. It scales across memory, disk, and multiple machines without compromise.

Use an Apache Ignite database as a data source in Trino by configuring a catalog with the Apache Ignite connector.

Apache Kafka

Integration developed and maintained by the Trino community

Apache Kafka is an open source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.

Use an Apache Kafka event stream as a data source in Trino by configuring a catalog with the Kafka connector.

Apache Kudu

Integration developed and maintained by the Trino community

Apache Kudu is an open source distributed data storage engine that makes fast analytics on fast and changing data easy.

Use an Apache Kudu data store as a data source in Trino by configuring a catalog with the Kudu connector.

Apache Phoenix

Integration developed and maintained by the Trino community

Apache Phoenix enables OLTP and operational analytics in Hadoop for low latency applications by combining the best of both worlds:

  • The power of standard SQL and JDBC APIs with full ACID transaction capabilities and
  • The flexibility of late-bound, schema-on-read capabilities from the NoSQL world by leveraging HBase as its backing store

Use an Apache Phoenix key-value store as a data source in Trino by configuring a catalog with the Phoenix connector.

Apache Pinot

Integration developed and maintained by the Trino community

Apache Pinot is a real-time distributed OLAP datastore, designed to answer OLAP queries with low latency.

Use an Apache Pinot datastore as a data source in Trino by configuring a catalog with the Pinot connector.

ClickHouse

Integration developed and maintained by the Trino community

ClickHouse is the fastest and most resource efficient open source real-time database for applications and analytics.

Use a ClickHouse database as a data source in Trino by configuring a catalog with the ClickHouse connector.

Delta Lake

Integration developed and maintained by the Trino community

Delta Lake is an open source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive, and with APIs for Scala, Java, Rust, Ruby, and Python.

Use a Delta Lake data lakehouse as a data source in Trino by configuring a catalog with the Delta Lake connector.

Elasticsearch

Integration developed and maintained by the Trino community

Elasticsearch is a distributed, RESTful search and analytics engine capable of addressing a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data for lightning fast search, fine‑tuned relevancy, and powerful analytics that scale with ease.

Use an Elasticsearch index as a data source in Trino by configuring a catalog with the Elasticsearch connector.

Exasol

Integration developed and maintained by the Trino community

Exasol is an in-memory, column-oriented, relational database management system known for its high performance and its usage of a massively parallel processing architecture.

Use an Exasol database as a data source in Trino by configuring a catalog with the Exasol connector.

Google BigQuery

Integration developed and maintained by the Trino community

BigQuery is a serverless and cost-effective enterprise data warehouse that works across clouds and scales with your data. Use built-in ML/AI and BI for insights at scale.

Use a Google BigQuery data warehouse as a data source in Trino by configuring a catalog with the BigQuery connector.

Google Sheets

Integration developed and maintained by the Trino community

Google Sheets enables you to create and collaborate on online spreadsheets in real time and from any device.

Use a Google Sheets spreadsheet as a data source in Trino by configuring a catalog with the Google Sheets connector.

MariaDB

Integration developed and maintained by the Trino community

MariaDB Server is one of the most popular open source relational databases. It’s made by the original developers of MySQL and guaranteed to stay open source. It is part of most cloud offerings and the default in most Linux distributions.

Use a MariaDB database as a data source in Trino by configuring a catalog with the MariaDB connector.

Microsoft SQL Server

Integration developed and maintained by the Trino community

Microsoft SQL Server is a proprietary relational database management system developed by Microsoft. Microsoft provides different editions of Microsoft SQL Server, aimed at different audiences and for workloads ranging from small single-machine applications to large Internet-facing applications with many concurrent users.

Use a Microsoft SQL Server database as a data source in Trino by configuring a catalog with the SQL Server connector.

MongoDB

Integration developed and maintained by the Trino community

MongoDB is a source-available cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas.

Use a MongoDB database as a data source in Trino by configuring a catalog with the MongoDB connector.

MySQL

Integration developed and maintained by the Trino community

MySQL is the world’s most popular open source relational database management system (RDBMS).

Use a MySQL database as a data source in Trino by configuring a catalog with the MySQL connector.

OpenSearch

Integration developed and maintained by the Trino community

OpenSearch is a scalable, flexible, and extensible open-source software suite for search, analytics, and observability applications. OpenSearch offers a vendor-agnostic toolset you can use to build secure, high-performance, cost-efficient applications. OpenSearch includes a data store and search engine, a visualization and user interface, and a library of plugins you can use to tailor your tools to your requirements.

Use an OpenSearch index as a data source in Trino by configuring a catalog with the OpenSearch connector.

Oracle

Integration developed and maintained by the Trino community

Oracle database services and products offer customers cost-optimized and high-performance versions of Oracle Database, the world’s leading converged, multi-model database management system.

Use an Oracle database as a data source in Trino by configuring a catalog with the Oracle connector.

PostgreSQL

Integration developed and maintained by the Trino community

PostgreSQL is the world’s most advanced open source relational database. PostgreSQL is a powerful system with over 35 years of active development that has earned it a strong reputation for reliability, feature robustness, and performance.

Use a PostgreSQL database as a data source in Trino by configuring a catalog with the PostgreSQL connector.

Prometheus

Integration developed and maintained by the Trino community

Prometheus is an open source systems monitoring and alerting toolkit with a very active developer and user community. Prometheus collects and stores its metrics as time series data, i.e. metrics information is stored with the timestamp at which it was recorded, alongside optional key-value pairs called labels.

Use a Prometheus database as a data source in Trino by configuring a catalog with the Prometheus connector.

Trino also supports observability with OpenTelemetry, and therefore Prometheus.

Redis

Integration developed and maintained by the Trino community

Redis is an open source, in-memory data store used by millions of developers as a database, cache, streaming engine, and message broker.

Use a Redis data store as a data source in Trino by configuring a catalog with the Redis connector.

SingleStore

Integration developed and maintained by the Trino community

SingleStoreDB is a unified data engine for transactional and analytical workloads, used to power fast, real-time analytics and applications.

Use a SingleStore database as a data source in Trino by configuring a catalog with the SingleStore connector.

Snowflake

Integration developed and maintained by the Trino community

Snowflake is a Data Cloud platform provider. Snowflake easily enables governed access to near-infinite amounts of data, cutting-edge tools, applications, and services. With the Data Cloud, you can collaborate locally and globally to reveal new insights, create previously unforeseen business opportunities, and identify and know your customers in the moment with seamless and relevant experiences.

Use a Snowflake data cloud as a data source in Trino by configuring a catalog with the Snowflake connector.

TPC

Integration developed and maintained by the Trino community

TPC is a non-profit corporation focused on developing data-centric benchmark standards and disseminating objective, verifiable data to the industry.

The Trino TPC-H and TPC-DS connectors are data generators that provide the benchmark data sets for direct querying or copying into other data sources for testing and benchmarking.
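The sketch below shows, under stated assumptions, how the generated benchmark data might be addressed and copied with SQL. It assumes a catalog named tpch that uses the TPC-H connector, whose tiny schema contains a nation table, and a second, writable catalog named postgres with a public schema. Both catalog names and the Python helper names are hypothetical, and the code only builds the SQL text for you to run in a client of your choice.

```python
def qualified_name(catalog, schema, table):
    """Build a fully qualified table name in catalog.schema.table form."""
    return f"{catalog}.{schema}.{table}"


def copy_table_sql(source, target):
    """Build a CREATE TABLE AS statement that copies source into target."""
    return f"CREATE TABLE {target} AS SELECT * FROM {source}"


# Assumed catalog names: "tpch" (TPC-H connector) and "postgres" (hypothetical target).
source = qualified_name("tpch", "tiny", "nation")
target = qualified_name("postgres", "public", "nation")

query_sql = f"SELECT name FROM {source} ORDER BY name"
copy_sql = copy_table_sql(source, target)

print(query_sql)
print(copy_sql)
```

The first statement queries the generated data directly, and the second copies it into another data source, matching the two uses described above.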

Vertica

Integration developed and maintained by the Trino community

Vertica, also known as the OpenText Analytics Database, is a column-oriented analytic database designed to manage large, fast-growing volumes of data, with fast query performance for data warehouses and other query-intensive applications.

Use a Vertica database as a data source in Trino by configuring a catalog with the Vertica connector.


Other data sources

The connectors for the following data sources are developed and maintained by other communities and vendors.

Git

Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

Use a Git repository as a data source in Trino by configuring a catalog with the Git connector.

Gravitino

Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages the metadata directly in different sources, types, and regions. It also provides users with unified metadata access for data and AI assets, and is available as an open source project.

Use Gravitino as a data source in Trino by configuring a catalog with the Gravitino connector.

OpenAPI

OpenAPI is a specification language for REST APIs that provides a standardized means to define your API.

Use any REST API that publishes an OpenAPI specification as a data source in Trino by configuring a catalog with the OpenAPI connector, and avoid having to generate a client.

VAST

VAST is a data platform that includes storage and database services.

Use a VAST data store as a data source in Trino by configuring a catalog with the VAST Trino connector.