We discuss the Trino 347 release notes: https://trino.io/docs/current/release/release-347.html
Official release announcement from Martin Traverso:
We’re happy to announce the release of Presto 347! This version includes:
Notes from Manfred:
Parser/Analyzer
Planner/Optimizer
Use EXPLAIN to learn what is planned. Also refer to chapter 4 in Trino: The Definitive Guide, pg. 50.
This week’s pull request, https://github.com/trinodb/trino/pull/730, comes from one of the co-creators, Martin Traverso. It removes duplicate predicates in logical binary expressions (AND, OR) and canonicalizes commutative arithmetic expressions and comparisons to handle a larger number of variants. Canonicalize is a big word, but all it means is that when the same logic or data has multiple representations, they are rewritten into a single, agreed-upon normal form.
For example, the expression COALESCE(a * (2 * 3), 1 - 1) is equivalent to COALESCE(6 * a, 0), as the constant expressions 2 * 3 and 1 - 1 can be simplified to the static integers 6 and 0.
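To make the idea concrete, here is a minimal Python sketch of canonicalization on a toy expression tree. It is not Trino's implementation (Trino's optimizer is written in Java); the tuple encoding, the `canonicalize` function, and the "constants first" ordering rule are all assumptions made for illustration. It folds constant arithmetic, orders the operands of commutative operators, and removes duplicate predicates from AND/OR. Canonicalizing comparisons is left out.

```python
# Toy expression encoding (an assumption for this sketch):
#   int -> constant, str -> column name, tuple -> (operator, operand, ...)
COMMUTATIVE = {"+", "*"}
FOLD = {
    "+": lambda x, y: x + y,
    "-": lambda x, y: x - y,
    "*": lambda x, y: x * y,
}


def sort_key(expr):
    # Order operands as: constants, then column names, then nested expressions.
    if isinstance(expr, int):
        return (0, repr(expr))
    if isinstance(expr, str):
        return (1, expr)
    return (2, repr(expr))


def canonicalize(expr):
    if not isinstance(expr, tuple):
        return expr  # constants and column names are already canonical
    op, *args = expr
    args = [canonicalize(a) for a in args]

    if op in ("AND", "OR"):
        # Flatten nested terms of the same operator and drop duplicates.
        flat = []
        for a in args:
            parts = a[1:] if isinstance(a, tuple) and a[0] == op else (a,)
            for p in parts:
                if p not in flat:
                    flat.append(p)
        return flat[0] if len(flat) == 1 else (op, *flat)

    if op in FOLD and all(isinstance(a, int) for a in args):
        return FOLD[op](*args)  # constant folding, e.g. 2 * 3 -> 6

    if op in COMMUTATIVE:
        args.sort(key=sort_key)  # a * 6 and 6 * a both become 6 * a

    return (op, *args)


expr = ("COALESCE", ("*", "a", ("*", 2, 3)), ("-", 1, 1))
print(canonicalize(expr))  # ('COALESCE', ('*', 6, 'a'), 0)
```

Because every variant of an expression collapses to the same tree, later steps only have to recognize one shape instead of many.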
This is an example of optimizing the logical plan, because the rewrite works on the SQL expressions of the query itself. It differs from the distributed plan: at this stage we are not determining how the plan will be distributed or where it will run, and we are not applying the further optimizations handled by the cost-based optimizer, such as predicate pushdown. We’ll talk about this step more in the next episode. For now, let’s cover a few examples.
The format of the EXPLAIN used is graphviz. The online tool used during the show is Viz.js. You can paste the output of your EXPLAIN queries to visualize the query in a tree form.
In this week’s question, we answer:
How should I allocate memory properties? CPU : 16Core MEM:64GB
Before answering this, we should make sure a few things about memory are clear.
Space needed that the user is capable of reasoning about:
Settings
query.max-memory-per-node - maximum amount of user memory that a query is allowed to use on a given worker.
query.max-memory (without the -per-node at the end) - This config caps the amount of user memory used by a single query over all worker nodes in your cluster.
Memory needed to facilitate internal usage
NOTE: There are no settings for this memory as it is implicitly set by the user and total memory settings. Use this to calculate system memory:
query.max-total-memory-per-node - query.max-memory-per-node
query.max-total-memory - query.max-memory
Total Memory = System + User, but there are only properties for total and user memory.
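As a small worked example of that subtraction, the Python sketch below derives the implied system memory from a total limit and a user limit. The function name `system_memory_gb` and all of the numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
def system_memory_gb(total_gb: float, user_gb: float) -> float:
    """System memory implied by a total/user pair of limits with the same scope."""
    if user_gb > total_gb:
        raise ValueError("user memory limit cannot exceed the total memory limit")
    return total_gb - user_gb


# Hypothetical limits, for illustration only.
# query.max-total-memory-per-node - query.max-memory-per-node
per_node = system_memory_gb(total_gb=20, user_gb=10)
# query.max-total-memory - query.max-memory
cluster = system_memory_gb(total_gb=100, user_gb=60)

print(per_node, cluster)
```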
Settings
query.max-total-memory-per-node - maximum amount of total memory that a query is allowed to use on a given worker.
query.max-total-memory (without the -per-node at the end) - This config caps the total memory used by a single query over all worker nodes in your cluster.
The final setting I would like to cover is memory.heap-headroom-per-node. This config sets aside memory in the JVM heap for allocations that are not tracked by Presto. You can typically go with the default for this setting, which is 30% of the JVM’s max heap size (-Xmx setting).
Presto is a Java application, which means it runs on the JVM. None of these memory settings mean anything unless the JVM that Presto runs on has sufficient memory set aside. So how do I know I am allocating sufficient memory for my settings?
query.max-total-memory-per-node + memory.heap-headroom-per-node < -Xmx setting (Java heap)
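The inequality above can be checked with a few lines of Python. This is only a sketch: the function name `fits_in_heap` is hypothetical, and the 48 GB heap on a 64 GB machine is an illustrative assumption, not a recommendation from the show. The headroom falls back to the 30% default mentioned earlier when no explicit value is given.

```python
from typing import Optional


def fits_in_heap(
    xmx_gb: float,
    max_total_per_node_gb: float,
    headroom_gb: Optional[float] = None,
) -> bool:
    """Check query.max-total-memory-per-node + memory.heap-headroom-per-node < -Xmx."""
    if headroom_gb is None:
        headroom_gb = 0.3 * xmx_gb  # default headroom: 30% of the max heap size
    return max_total_per_node_gb + headroom_gb < xmx_gb


# Hypothetical worker: 64 GB of RAM with an assumed -Xmx of 48 GB.
print(fits_in_heap(xmx_gb=48, max_total_per_node_gb=30))  # True
print(fits_in_heap(xmx_gb=48, max_total_per_node_gb=40))  # False
```

With a 48 GB heap the default headroom is about 14.4 GB, so any query.max-total-memory-per-node at or above roughly 33.6 GB fails the check.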
Dain covers the proportions in detail in the recent training videos. Here’s a snippet of what he recommends.
All in all, try to estimate the amount of memory needed by your max anticipated query load, and if possible try to get even more than your estimate. Once Presto is discovered by users, they will start to use it even more and demands on the system will grow.
Blogs
Upcoming events
Latest training from David, Dain, and Martin (now with timestamps!):
Presto Summit Series - Real world usage
Podcasts:
If you want to learn more about Presto yourself, you should check out the O’Reilly book Trino: The Definitive Guide. You can download the free PDF or buy the book online.
Music for the show is from the Megaman 6 Game Play album by Krzysztof Słowikowski.