Python user-defined functions#
A Python user-defined function is a user-defined function that uses the Python programming language and statements for the definition of the function.
Warning
Python user-defined functions are an experimental feature.
Python UDF declaration#
Declare a Python UDF as inline or catalog UDF with the following steps:
Use the FUNCTION keyword to declare the UDF name and parameters.
Add the
RETURNS
declaration to specify the data type of the result.Set the
LANGUAGE
toPYTHON
.Declare the name of the Python function to call with the
handler
property in theWITH
block.Use
$$
to enclose the Python code after theAS
keyword.Add the function from the handler property and ensure it returns the declared data type.
Expand your Python code section to implement the function using the available Python language.
The following snippet shows pseudo-code:
FUNCTION python_udf_name(input_parameter data_type)
RETURNS result_data_type
LANGUAGE PYTHON
WITH (handler = 'python_function')
AS $$
...
def python_function(input):
return ...
...
$$
A minimal example declares the UDF doubleup
that returns the input integer
value x
multiplied by two. The example shows declaration as Inline user-defined functions and
invocation with the value 21
to yield the result 42
.
Set the language to PYTHON
to override the default SQL
for SQL user-defined functions.
The Python code is enclosed with ``$$` and must use valid formatting.
WITH
FUNCTION doubleup(x integer)
RETURNS integer
LANGUAGE PYTHON
WITH (handler = 'twice')
AS $$
def twice(a):
return a * 2
$$
SELECT doubleup(21);
-- 42
The same UDF can also be declared as Catalog user-defined functions.
Refer to the Example Python UDFs for more complex use cases and examples.
Python language details#
The Trino Python UDF integrations uses Python 3.13.0 in a sandboxed environment. Python code runs within a WebAssembly (WASM) runtime within the Java virtual machine running Trino.
Python language rules including indents must be observed.
Python UDFs therefore only have access to the Python language and core libraries included in the sandboxed runtime. Access to external resources with network or file system operations is not supported. Usage of other Python libraries as well as command line tools or package managers is not supported.
The following libraries are explicitly removed from the runtime and therefore not available within a Python UDF:
bdb
concurrent
curses
ensurepip
doctest
idlelib
multiprocessing
pdb
pydoc
socketserver*
sqlite3
ssl
subprocess*
tkinter
turtle*
unittest
venv
webbrowser*
wsgiref
xmlrpc
Type mapping#
The following table shows supported Trino types and their corresponding Python types for input and output values of a Python UDF:
Trino type |
Python type |
---|---|
row |
tuple |
array |
list |
map |
dict |
boolean |
bool |
tinyint |
int |
smallint |
int |
integer |
int |
bigint |
int |
real |
float |
double |
float |
decimal |
decimal.Decimal |
varchar |
str |
varbinary |
bytes |
date |
datetime.date |
time |
datetime.time |
time with time zone |
datetime.time with datetime.tzinfo |
timestamp |
datetime.datetime |
timestamp with time zone |
datetime.datetime with datetime.tzinfo 1 |
interval year to month |
int as the number of months |
interval day to second |
datetime.timedelta |
json |
str |
uuid |
uuid.UUID |
ipaddress |
ipaddress.IPv4Address or ipaddress.IPv6Address |
Date and time#
Python datetime objects only support microsecond precision. Trino argument values with greater precision arerounded when converted to Python values, and Python return values are rounded if the Trino return type has less than microsecond precision.
Only fixed offset time zones are supported. Timestamps with political time zones have the zone converted to the zone’s offset for the timestamp’s instant.