Official highlights from Martin Traverso:
delete_orphan_files table procedures for Iceberg.
Additional highlights worth a mention according to Manfred:
DISTINCT .. LIMIT operator when input data is dictionary encoded.
user property requirement in JDBC driver.
internal-communication.shared-secret value with authentication usage, a breaking change for many users who have not set that secret.
To adopt Trino you typically need to run it on a cluster of machines. These can be bare metal servers, virtual machines, or even containers. The Trino project provides a few binary packages to allow you to install Trino:
All of them include a bunch of Java libraries that constitute Trino and all the plugins. As a result there are only a few requirements. You need a Linux operating system, since some of the libraries and code require Linux indirectly, and a Java 11 runtime.
Beyond that, there is just the bin/launcher script, which is highly recommended but not required. It can be used as a service script or for manual start, stop, and status of Trino, and only needs Python.
The tarball is a gzip-compressed tar archive. For installation you just need to extract the archive anywhere. It contains the following directory structure:
bin, the launcher script and related files
lib, all globally needed libraries
plugin, connectors and other plugins, each with their own libraries in separate sub-directories
You need to create the etc directory with the needed configuration, since the tarball does not include any defaults, and you cannot start the application without it.
Note that all the files are within the created directory.
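As a rough illustration of what creating the etc directory involves, the following Python sketch writes a bare-bones set of configuration files into an extracted tarball directory. The function name `write_minimal_etc`, the `demo` environment name, the 4 GB heap size, the port, and the choice of the tpch catalog are all assumptions made for this example, not project defaults. Real deployments need values tuned to their hardware and data sources.

```python
from pathlib import Path
import tempfile
import uuid


def write_minimal_etc(install_dir: Path, coordinator: bool = True,
                      http_port: int = 8080,
                      discovery_uri: str = "http://localhost:8080") -> Path:
    """Write a minimal etc directory below install_dir (hypothetical helper)."""
    etc = install_dir / "etc"
    (etc / "catalog").mkdir(parents=True, exist_ok=True)

    # node.properties: identity of this node and where it keeps its data.
    (etc / "node.properties").write_text(
        "node.environment=demo\n"          # assumed name, same on all nodes
        f"node.id={uuid.uuid4()}\n"        # must be unique per node
        f"node.data-dir={install_dir / 'data'}\n"
    )

    # jvm.config: one JVM option per line; the heap size is a placeholder.
    (etc / "jvm.config").write_text("-server\n-Xmx4G\n")

    # config.properties: a worker differs from a coordinator mainly in the
    # coordinator flag; discovery.uri always points at the coordinator.
    (etc / "config.properties").write_text(
        f"coordinator={'true' if coordinator else 'false'}\n"
        f"http-server.http.port={http_port}\n"
        f"discovery.uri={discovery_uri}\n"
    )

    # One catalog file per data source; tpch needs no external system.
    (etc / "catalog" / "tpch.properties").write_text("connector.name=tpch\n")
    return etc


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        etc_dir = write_minimal_etc(Path(tmp) / "trino-server")
        for path in sorted(etc_dir.rglob("*")):
            print(path.relative_to(etc_dir))
```

Calling the same helper with `coordinator=False` and a different `http_port` gives a starting point for a worker on the same machine, since two nodes cannot share one port.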
The RPM archive is suitable for RPM-based Linux distributions, but testing is not very thorough across different versions and distributions.
It adapts the tarball content to the Linux file system hierarchy, hooks the launcher script up as a daemon script, and adds default configuration files. That allows you to start Trino right after installing the archive, and to have it start automatically on system restarts.
Locations used are /var/lib/trino, and others. These are configured via the launcher script parameters.
In a nutshell the RPM adds some convenience, but narrows down the supported Linux distributions. It still requires Java and Python installation and management.
The container image for Trino adds the necessary Linux, Java, and Python, and adapts Trino to the container setup.
The container adds even more convenience, since it is ready to use out of the box. It allows usage on Kubernetes with the help of the Helm charts, and includes the required operating system and application parts automatically.
All three packages that Trino ships are just defaults. They all require further configuration to adapt Trino to your specific needs in terms of hardware, connected data sources, security configuration, and so on. All of this can be done manually or with many existing tools.
However, you can also take it a step further and create your own package suited to your needs. The tarball can be used as the source for any customization to create your own package. The following is a list of options and scenarios:
You can also use brew on macOS, but that is not suitable for production usage. It is more of a convenience to get a local Trino for playing around.
Currently Java 11 is required for Trino. Java 17 is the latest and greatest Java LTS release with lots of good performance, security, and language improvements. The community has been working hard to make Java 17 support a reality. At this stage core Trino fully supports Java 17. Starburst Galaxy for example uses Java 17.
The maintainers and contributors would like to move to fully support and also require Java 17 soon. Here is where your input comes in, and we ask that you let us know your thoughts about questions such as the following:
Let us know on the #dev channel on Trino Slack or ping us directly. You can also chime in on the roadmap issue.
The PR of the episode was submitted by GitHub user whutpencil, and adds a significant new feature to the web UI. It exposes the information, namely statistics for each worker, in brand new pages. What a great effort! Special thanks also go out to Dawid Adamek (dedep) for the review.
In the demo of the month Manfred shows a worker installation to add to a local tarball install of a coordinator, and then demos the Web UI with the new feature from the pull request of the month.
Full question from Slack: I was trying the Delta Lake connector. I noticed that write operations are supported for tables stored on Azure ADLS Gen2, S3 and S3-compatible storage. Does that mean write operations are not supported for tables stored on HDFS?
Answer: HDFS is always implicitly supported for data lake connectors. It isn’t called out because it is assumed.
The confusion actually came from an error message returned when the user tried to insert into a Delta Lake table they had created in Spark. They tried inserting a record into the table through IntelliJ IDEA and received the following error message:
Unsupported target SQL type: -155
They thought the problem might be the wrong data type for the birthdate column, and then used the statement below to insert a record into the table:
INSERT INTO presto.people10m (id, firstname, middlename, lastname, gender, birthdate, ssn, salary) VALUES (1, 'a', 'b', 'c', 'male', timestamp '1990-01-01 00:00:00 +00:00', 'd', 10);
However, they got an error message like this:
Query 20220419_031201_00015_8qe76 failed: Cannot write to table in hdfs://masters/presto.db/people10m; hdfs not supported
This was an issue on the IntelliJ client.
Trino Meetup groups
If you want to learn more about Trino, check out the definitive guide from O’Reilly. You can download the free PDF or buy the book online.
Music for the show is from the Megaman 6 Game Play album by Krzysztof Słowikowski.