User Defined Functions (UDFs)#
User-Defined Functions (UDFs) are a mechanism to run Python code on the data that lives in a DataFrame.
A UDF can be used just like an Expression, allowing users to express computation that Daft should execute lazily.
To write a UDF, use the @udf decorator, which can decorate either a Python function or a Python class, producing a StatelessUDF or a StatefulUDF respectively.
For more details, please consult the UDF User Guide.
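For a quick side-by-side of the two flavors, here is a minimal sketch (the names add_one and Repeat are illustrative only, not part of the API): decorating a function yields a stateless UDF, while decorating a class with __init__ and __call__ yields a stateful one.
>>> import daft
>>>
>>> @daft.udf(return_dtype=daft.DataType.int64())
... def add_one(x: daft.Series):  # function -> StatelessUDF
...     return [v + 1 for v in x.to_pylist()]
>>>
>>> @daft.udf(return_dtype=daft.DataType.string())
... class Repeat:  # class -> StatefulUDF
...     def __init__(self, times: int = 2):
...         self.times = times
...
...     def __call__(self, s: daft.Series):
...         return [v * self.times for v in s.to_pylist()]
>>>
>>> df = daft.from_pydict({"x": [1, 2], "s": ["a", "b"]})
>>> df = df.with_column("x_plus_one", add_one(df["x"]))
>>> df = df.with_column("s_repeated", Repeat(df["s"]))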
Creating UDFs#
- udf(*, return_dtype: DataType, num_cpus: Optional[float] = None, num_gpus: Optional[float] = None, memory_bytes: Optional[int] = None, batch_size: Optional[int] = None) → Callable[[Union[Callable[[...], Union[Series, ndarray, list]], type]], daft.udf.StatelessUDF | daft.udf.StatefulUDF] [source]#
@udf
Decorator to convert a Python function or class into a StatelessUDF or StatefulUDF respectively.
UDFs allow users to run arbitrary Python code on the outputs of Expressions.
Note
In most cases, UDFs will be slower than a native kernel/expression because of the required Rust and Python overheads. If your computation can be expressed using Daft expressions, you should do so instead of writing a UDF. If your UDF expresses a common use-case that isn’t already covered by Daft, you should file a ticket or contribute this functionality back to Daft as a kernel!
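For instance (a minimal illustration of the note above), adding a constant to a column needs no UDF at all, since native expressions already support arithmetic:
>>> import daft
>>> df = daft.from_pydict({"x": [1, 2, 3]})
>>> # Native expression: no per-batch Python UDF overhead
>>> df = df.with_column("x_plus_10", df["x"] + 10)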
In the example below, we create a UDF that:
- Receives data under the argument name x
- Converts the x Daft Series into a Python list using x.to_pylist()
- Adds a Python constant value c to every element in x
- Returns a new list of Python values which will be coerced to the specified return type: return_dtype=DataType.int64()
We can call our UDF on a dataframe using any of the dataframe projection operations (df.with_column(), df.select(), etc.).
Example
>>> import daft
>>> @daft.udf(return_dtype=daft.DataType.int64())
... def add_constant(x: daft.Series, c=10):
...     return [v + c for v in x.to_pylist()]
>>>
>>> df = daft.from_pydict({"x": [1, 2, 3]})
>>> df = df.with_column("new_x", add_constant(df["x"], c=20))
>>> df.show()
╭───────┬───────╮
│ x     ┆ new_x │
│ ---   ┆ ---   │
│ Int64 ┆ Int64 │
╞═══════╪═══════╡
│ 1     ┆ 21    │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 2     ┆ 22    │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 3     ┆ 23    │
╰───────┴───────╯
(Showing first 3 of 3 rows)
Resource Requests#
You can also hint Daft about the resources that your UDF will require to run. For example, the following UDF requires 2 CPUs to run. On a machine/cluster with 8 CPUs, Daft will be able to run up to 4 instances of this UDF at once, giving you a concurrency of 4!
>>> import daft
>>> @daft.udf(return_dtype=daft.DataType.int64(), num_cpus=2)
... def udf_needs_2_cpus(x: daft.Series):
...     return x
>>>
>>> df = daft.from_pydict({"x": [1, 2, 3]})
>>> df = df.with_column("new_x", udf_needs_2_cpus(df["x"]))
>>> df.show()
╭───────┬───────╮
│ x     ┆ new_x │
│ ---   ┆ ---   │
│ Int64 ┆ Int64 │
╞═══════╪═══════╡
│ 1     ┆ 1     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 2     ┆ 2     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 3     ┆ 3     │
╰───────┴───────╯
(Showing first 3 of 3 rows)
Your UDF's resources can also be overridden before you call it, like so:
>>> import daft
>>> @daft.udf(return_dtype=daft.DataType.int64(), num_cpus=4)
... def udf_needs_4_cpus(x: daft.Series):
...     return x
>>>
>>> # Override the num_cpus to 2 instead
>>> udf_needs_2_cpus = udf_needs_4_cpus.override_options(num_cpus=2)
>>>
>>> df = daft.from_pydict({"x": [1, 2, 3]})
>>> df = df.with_column("new_x", udf_needs_2_cpus(df["x"]))
>>> df.show()
╭───────┬───────╮
│ x     ┆ new_x │
│ ---   ┆ ---   │
│ Int64 ┆ Int64 │
╞═══════╪═══════╡
│ 1     ┆ 1     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 2     ┆ 2     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 3     ┆ 3     │
╰───────┴───────╯
(Showing first 3 of 3 rows)
- param return_dtype:
Returned type of the UDF
- type return_dtype:
DataType
- param num_cpus:
Number of CPUs to allocate each running instance of your UDF. Note that this is purely used for placement (e.g. if your machine has 8 CPUs and you specify num_cpus=4, then Daft can run at most 2 instances of your UDF at a time). The default None indicates that Daft is free to allocate as many instances of the UDF as it wants to.
- param num_gpus:
Number of GPUs to allocate each running instance of your UDF. This is used for placement and also for allocating the appropriate GPU to each UDF using CUDA_VISIBLE_DEVICES.
- param memory_bytes:
Amount of memory to allocate each running instance of your UDF in bytes. If your UDF is experiencing out-of-memory errors, this parameter can help hint Daft that each UDF requires a certain amount of heap memory for execution.
- param batch_size:
Enables batching of the input into batches of at most this size. Results between batches are concatenated (see the sketch after this list).
- returns:
UDF decorator that converts a user-provided Python function into a UDF which can be called on Expressions
- rtype:
Callable[[UserProvidedPythonFunction], UDF]
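Below is a minimal sketch showing the batch_size and memory_bytes hints together (the values of 4 rows and 512 MB are purely illustrative, not recommendations):
>>> import daft
>>>
>>> # Hypothetical sizing: batches of at most 4 rows, ~512 MB of heap hinted per instance
>>> @daft.udf(
...     return_dtype=daft.DataType.int64(),
...     batch_size=4,
...     memory_bytes=512 * 1024 * 1024,
... )
... def square(x: daft.Series):
...     return [v * v for v in x.to_pylist()]
>>>
>>> df = daft.from_pydict({"x": list(range(10))})
>>> df = df.with_column("x_squared", square(df["x"]))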
Using UDFs#
- class StatelessUDF(common_args: CommonUDFArgs, name: str, func: Callable[[...], Union[Series, ndarray, list]], return_dtype: DataType)[source]#
A StatelessUDF is produced by calling @udf over a Python function.
- __call__(*args, **kwargs) → Expression [source]#
Call the UDF using some input Expressions, producing a new Expression that can be used by a DataFrame.
- Parameters:
*args – Positional arguments to be passed to the UDF. These can be either Expressions or Python values.
**kwargs – Keyword arguments to be passed to the UDF. These can be either Expressions or Python values.
- Returns:
A new Expression representing the UDF call, which can be used in DataFrame operations.
- Return type:
Expression
Note
When passing arguments to the UDF, you can use a mix of Expressions (e.g., df["column"]) and Python values. Expressions will be evaluated for each row, while Python values will be passed as-is to the UDF.
Example
>>> import daft
>>> @daft.udf(return_dtype=daft.DataType.float64())
... def multiply_and_add(x: daft.Series, y: float, z: float):
...     return x.to_arrow().to_numpy() * y + z
>>>
>>> df = daft.from_pydict({"x": [1, 2, 3]})
>>> df = df.with_column("result", multiply_and_add(df["x"], 2.0, z=1.5))
>>> df.show()
╭───────┬─────────╮
│ x     ┆ result  │
│ ---   ┆ ---     │
│ Int64 ┆ Float64 │
╞═══════╪═════════╡
│ 1     ┆ 3.5     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 2     ┆ 5.5     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 3     ┆ 7.5     │
╰───────┴─────────╯
(Showing first 3 of 3 rows)
- override_options(*, num_cpus: float | None = <object object>, num_gpus: float | None = <object object>, memory_bytes: int | None = <object object>, batch_size: int | None = <object object>) → StatelessUDF [source]#
Replace the resource requests for running each instance of your UDF.
For instance, if your UDF requires 4 CPUs to run, you can configure it like so:
>>> import daft
>>>
>>> @daft.udf(return_dtype=daft.DataType.string())
... def example_stateless_udf(inputs):
...     # You will have access to 4 CPUs here if you configure your UDF correctly!
...     return inputs
>>>
>>> # Parametrize the UDF to run with 4 CPUs
>>> example_stateless_udf_4CPU = example_stateless_udf.override_options(num_cpus=4)
>>>
>>> df = daft.from_pydict({"foo": [1, 2, 3]})
>>> df = df.with_column("bar", example_stateless_udf_4CPU(df["foo"]))
- Parameters:
num_cpus – Number of CPUs to allocate each running instance of your UDF. Note that this is purely used for placement (e.g. if your machine has 8 CPUs and you specify num_cpus=4, then Daft can run at most 2 instances of your UDF at a time).
num_gpus – Number of GPUs to allocate each running instance of your UDF. This is used for placement and also for allocating the appropriate GPU to each UDF using CUDA_VISIBLE_DEVICES.
memory_bytes – Amount of memory to allocate each running instance of your UDF in bytes. If your UDF is experiencing out-of-memory errors, this parameter can help hint Daft that each UDF requires a certain amount of heap memory for execution.
batch_size – Enables batching of the input into batches of at most this size. Results between batches are concatenated.
- class StatefulUDF(common_args: CommonUDFArgs, name: str, cls: type, return_dtype: DataType, init_args: Optional[Tuple[Tuple[Any, ...], Dict[str, Any]]] = None, concurrency: Optional[int] = None)[source]#
A StatefulUDF is produced by calling @udf over a Python class, allowing state to be maintained between calls; it can be further parametrized at runtime with custom concurrency, resources, and init args.
- Example of a Stateful UDF:
>>> import daft
>>>
>>> @daft.udf(return_dtype=daft.DataType.string())
... class MyStatefulUdf:
...     def __init__(self, prefix: str = "Goodbye"):
...         self.prefix = prefix
...
...     def __call__(self, name: daft.Series) -> list:
...         return [f"{self.prefix}, {n}!" for n in name.to_pylist()]
>>>
>>> MyHelloStatefulUdf = MyStatefulUdf.with_init_args(prefix="Hello")
>>>
>>> df = daft.from_pydict({"name": ["Alice", "Bob", "Charlie"]})
>>> df = df.with_column("greeting", MyHelloStatefulUdf(df["name"]))
>>> df.show()
╭─────────┬─────────────────╮
│ name    ┆ greeting        │
│ ---     ┆ ---             │
│ Utf8    ┆ Utf8            │
╞═════════╪═════════════════╡
│ Alice   ┆ Hello, Alice!   │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Bob     ┆ Hello, Bob!     │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Charlie ┆ Hello, Charlie! │
╰─────────┴─────────────────╯
(Showing first 3 of 3 rows)
The state (in this case, the prefix) is maintained across calls to the UDF. Most commonly, this state is used for things such as ML models which should be downloaded and loaded into memory once for multiple invocations.
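A minimal sketch of that pattern follows; the "model" here is a trivial stand-in so the example stays self-contained, whereas a real UDF would download and construct an actual ML model in __init__:
>>> import daft
>>>
>>> @daft.udf(return_dtype=daft.DataType.int64())
... class RunModel:
...     def __init__(self):
...         # Expensive setup happens once per UDF instance rather than once per call.
...         self.model = lambda value: value * 2  # stand-in for a loaded model
...
...     def __call__(self, data: daft.Series) -> list:
...         return [self.model(v) for v in data.to_pylist()]
>>>
>>> df = daft.from_pydict({"x": [1, 2, 3]})
>>> df = df.with_column("prediction", RunModel(df["x"]))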
- __call__(*args, **kwargs) → Expression [source]#
Call the UDF using some input Expressions, producing a new Expression that can be used by a DataFrame.
- Parameters:
*args – Positional arguments to be passed to the UDF. These can be either Expressions or Python values.
**kwargs – Keyword arguments to be passed to the UDF. These can be either Expressions or Python values.
- Returns:
A new Expression representing the UDF call, which can be used in DataFrame operations.
- Return type:
Expression
Note
When passing arguments to the UDF, you can use a mix of Expressions (e.g., df["column"]) and Python values. Expressions will be evaluated for each row, while Python values will be passed as-is to the UDF.
Example
>>> import daft
>>>
>>> @daft.udf(return_dtype=daft.DataType.float64())
... class MultiplyAndAdd:
...     def __init__(self, multiplier: float = 2.0):
...         self.multiplier = multiplier
...
...     def __call__(self, x: daft.Series, z: float) -> list:
...         return [val * self.multiplier + z for val in x.to_pylist()]
>>>
>>> df = daft.from_pydict({"x": [1, 2, 3]})
>>> df = df.with_column("result", MultiplyAndAdd(df["x"], z=1.5))
>>> df.show()
╭───────┬─────────╮
│ x     ┆ result  │
│ ---   ┆ ---     │
│ Int64 ┆ Float64 │
╞═══════╪═════════╡
│ 1     ┆ 3.5     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 2     ┆ 5.5     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 3     ┆ 7.5     │
╰───────┴─────────╯
(Showing first 3 of 3 rows)
- override_options(*, num_cpus: float | None = <object object>, num_gpus: float | None = <object object>, memory_bytes: int | None = <object object>, batch_size: int | None = <object object>) → StatefulUDF [source]#
Replace the resource requests for running each instance of your UDF.
For instance, if your UDF requires 4 CPUs to run, you can configure it like so:
>>> import daft
>>>
>>> @daft.udf(return_dtype=daft.DataType.string())
... def example_stateless_udf(inputs):
...     # You will have access to 4 CPUs here if you configure your UDF correctly!
...     return inputs
>>>
>>> # Parametrize the UDF to run with 4 CPUs
>>> example_stateless_udf_4CPU = example_stateless_udf.override_options(num_cpus=4)
>>>
>>> df = daft.from_pydict({"foo": [1, 2, 3]})
>>> df = df.with_column("bar", example_stateless_udf_4CPU(df["foo"]))
- Parameters:
num_cpus – Number of CPUs to allocate each running instance of your UDF. Note that this is purely used for placement (e.g. if your machine has 8 CPUs and you specify num_cpus=4, then Daft can run at most 2 instances of your UDF at a time).
num_gpus – Number of GPUs to allocate each running instance of your UDF. This is used for placement and also for allocating the appropriate GPU to each UDF using CUDA_VISIBLE_DEVICES.
memory_bytes – Amount of memory to allocate each running instance of your UDF in bytes. If your UDF is experiencing out-of-memory errors, this parameter can help hint Daft that each UDF requires a certain amount of heap memory for execution.
batch_size – Enables batching of the input into batches of at most this size. Results between batches are concatenated.
- with_concurrency(concurrency: int) → StatefulUDF [source]#
Override the concurrency of this StatefulUDF, which tells Daft how many instances of your StatefulUDF to run concurrently.
Example:
>>> import daft
>>>
>>> @daft.udf(return_dtype=daft.DataType.string(), num_gpus=1)
... class MyUDFThatNeedsAGPU:
...     def __init__(self, text=" world"):
...         self.text = text
...
...     def __call__(self, data):
...         return [x + self.text for x in data.to_pylist()]
>>>
>>> # New UDF that will have 8 concurrent running instances (will require 8 total GPUs)
>>> MyUDFThatNeedsAGPU_8_concurrency = MyUDFThatNeedsAGPU.with_concurrency(8)
- with_init_args(*args, **kwargs) → StatefulUDF [source]#
Replace initialization arguments for the Stateful UDF when calling __init__ at runtime on each instance of the UDF.
Example:
>>> import daft
>>>
>>> @daft.udf(return_dtype=daft.DataType.string())
... class MyInitializedClass:
...     def __init__(self, text=" world"):
...         self.text = text
...
...     def __call__(self, data):
...         return [x + self.text for x in data.to_pylist()]
>>>
>>> # Create a customized version of MyInitializedClass by overriding the init args
>>> MyInitializedClass_CustomInitArgs = MyInitializedClass.with_init_args(text=" my old friend")
>>>
>>> df = daft.from_pydict({"foo": ["hello", "hello", "hello"]})
>>> df = df.with_column("bar_world", MyInitializedClass(df["foo"]))
>>> df = df.with_column("bar_custom", MyInitializedClass_CustomInitArgs(df["foo"]))
>>> df.show()
╭───────┬─────────────┬─────────────────────╮
│ foo   ┆ bar_world   ┆ bar_custom          │
│ ---   ┆ ---         ┆ ---                 │
│ Utf8  ┆ Utf8        ┆ Utf8                │
╞═══════╪═════════════╪═════════════════════╡
│ hello ┆ hello world ┆ hello my old friend │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ hello ┆ hello world ┆ hello my old friend │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ hello ┆ hello world ┆ hello my old friend │
╰───────┴─────────────┴─────────────────────╯
(Showing first 3 of 3 rows)