The Trino community in Japan held an online event on October 5th, 2023. This article summarizes the event and gives an overview of each presentation.
Watch a replay of the whole event, or jump to specific timestamps and topics of interest.
This year, there were four sessions:
- Trino, Starburst Galaxy, and Enterprise
- Log infrastructure using Trino and Iceberg
- Data infrastructure using Spark and Trino on bare metal k8s
- Getting started with Trino and a transactional data lake with serverless Athena
Trino, Starburst Galaxy, and Enterprise
The first session was presented by Yuya Ebihara (me) from Starburst. I covered the changes made to Trino in 2022 and 2023, as well as the features of Starburst Galaxy and Starburst Enterprise. I also introduced the press release announcing the partnership between Starburst and Dell Technologies in Japan.
Log infrastructure using Trino and Iceberg
The second session was presented by Tadahisa Kamijo from Sakura Internet. He explained the requirements for their new analytics environment: concurrent reads and writes, schema evolution, record-level modification, restoring past snapshots, and addressing performance issues with the Hive metastore. They decided to use Trino and Iceberg to meet these requirements. Kamijo-san also introduced the file layout in Iceberg and demonstrated how to debug Iceberg files using their Java client.
Data infrastructure using Spark and Trino on bare metal k8s
The third session was presented by Yasukazu Nagatomi from MicroAd. They started migrating from Impala to Trino with the following goals: separating compute and storage, refreshing and utilizing table and column statistics even for large tables, and supporting schema evolution. Nagatomi-san shared how they use two Trino features, fault-tolerant execution and spill-to-disk. This is the first public use case of these features in Japan.
Getting started with Trino and a transactional data lake with serverless Athena
The last session was presented by Sotaro Hikita from AWS. Athena is a serverless service for ad hoc analytics built on Trino and Presto. It supports not only data in S3 but also various other data sources via Federated Query. Athena supports both read and write operations on Iceberg tables, while Hudi and Delta Lake tables support read operations only.
We sincerely appreciate the participation of community members in Japan. Thank you so much for watching the live event. We are planning to hold an in-person event next year. See you next time!