Python user-defined functions#

A Python user-defined function is a user-defined function that uses the Python programming language and statements for the definition of the function.

Warning

Python user-defined functions are an experimental feature.

Python UDF declaration#

Declare a Python UDF as inline or catalog UDF with the following steps:

  • Use the FUNCTION keyword to declare the UDF name and parameters.

  • Add the RETURNS declaration to specify the data type of the result.

  • Set the LANGUAGE to PYTHON.

  • Declare the name of the Python function to call with the handler property in the WITH block.

  • Use $$ to enclose the Python code after the AS keyword.

  • Add the function from the handler property and ensure it returns the declared data type.

  • Expand your Python code section to implement the function using the available Python language.

The following snippet shows pseudo-code:

  FUNCTION python_udf_name(input_parameter data_type)
    RETURNS result_data_type
    LANGUAGE PYTHON
    WITH (handler = 'python_function')
    AS $$
    ...
    def python_function(input):
        return ...
    ...
    $$

A minimal example declares the UDF doubleup that returns the input integer value x multiplied by two. The example shows declaration as Inline user-defined functions and invocation with the value 21 to yield the result 42.

Set the language to PYTHON to override the default SQL for SQL user-defined functions. The Python code is enclosed with ``$$` and must use valid formatting.

WITH
  FUNCTION doubleup(x integer)
    RETURNS integer
    LANGUAGE PYTHON
    WITH (handler = 'twice')
    AS $$
    def twice(a):
        return a * 2
    $$
SELECT doubleup(21);
-- 42

The same UDF can also be declared as Catalog user-defined functions.

Refer to the Example Python UDFs for more complex use cases and examples.

Python language details#

The Trino Python UDF integrations uses Python 3.13.0 in a sandboxed environment. Python code runs within a WebAssembly (WASM) runtime within the Java virtual machine running Trino.

Python language rules including indents must be observed.

Python UDFs therefore only have access to the Python language and core libraries included in the sandboxed runtime. Access to external resources with network or file system operations is not supported. Usage of other Python libraries as well as command line tools or package managers is not supported.

The following libraries are explicitly removed from the runtime and therefore not available within a Python UDF:

  • bdb

  • concurrent

  • curses

  • ensurepip

  • doctest

  • idlelib

  • multiprocessing

  • pdb

  • pydoc

  • socketserver*

  • sqlite3

  • ssl

  • subprocess*

  • tkinter

  • turtle*

  • unittest

  • venv

  • webbrowser*

  • wsgiref

  • xmlrpc

Type mapping#

The following table shows supported Trino types and their corresponding Python types for input and output values of a Python UDF:

File system support properties#

Trino type

Python type

row

tuple

array

list

map

dict

boolean

bool

tinyint

int

smallint

int

integer

int

bigint

int

real

float

double

float

decimal

decimal.Decimal

varchar

str

varbinary

bytes

date

datetime.date

time

datetime.time

time with time zone

datetime.time with datetime.tzinfo

timestamp

datetime.datetime

timestamp with time zone

datetime.datetime with datetime.tzinfo 1

interval year to month

int as the number of months

interval day to second

datetime.timedelta

json

str

uuid

uuid.UUID

ipaddress

ipaddress.IPv4Address or ipaddress.IPv6Address

Date and time#

Python datetime objects only support microsecond precision. Trino argument values with greater precision arerounded when converted to Python values, and Python return values are rounded if the Trino return type has less than microsecond precision.

Only fixed offset time zones are supported. Timestamps with political time zones have the zone converted to the zone’s offset for the timestamp’s instant.