User Defined Functions (UDFs)
- daft.udf(*, return_dtype: daft.datatype.DataType) -> Callable[[Callable[[...], daft.series.Series]], daft.udf.UDF]
Decorator to convert a Python function into a UDF.
UDFs allow users to run arbitrary Python code on the outputs of Expressions.
Note
In most cases, UDFs will be slower than a native kernel/expression because of the required Rust and Python overheads. If your computation can be expressed using Daft expressions, you should do so instead of writing a UDF. If your UDF expresses a common use-case that isn’t already covered by Daft, you should file a ticket or contribute this functionality back to Daft as a kernel!
In the example below, we create a UDF that:
1. Receives data under the argument name x
2. Converts the x Daft Series into a Python list using x.to_pylist()
3. Adds a Python constant value c to every element in x
4. Returns a new list of Python values, which will be coerced to the specified return type: return_dtype=DataType.int64()
We can call our UDF on a dataframe using any of the dataframe projection operations (df.with_column(), df.select(), etc.).
Example
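>>> from daft import udf
>>> from daft.datatype import DataType
>>> from daft.series import Series
>>>
>>> @udf(return_dtype=DataType.int64())
... def add_constant(x: Series, c=10):
...     return [v + c for v in x.to_pylist()]
>>>
>>> df = df.with_column("new_x", add_constant(df["x"], c=20))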
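The body of add_constant is ordinary Python, so its logic can be checked without Daft installed. The sketch below is only an illustration: FakeSeries is a hypothetical stand-in, not part of Daft, that mimics nothing except the to_pylist() method used in the example, and the function is left undecorated.

```python
class FakeSeries:
    """Hypothetical stand-in for daft.series.Series (NOT part of Daft).

    It mimics only the to_pylist() method that the example relies on.
    """

    def __init__(self, values):
        self._values = list(values)

    def to_pylist(self):
        # Return a copy of the data as a plain Python list
        return list(self._values)


def add_constant(x, c=10):
    # Same body as the UDF in the example, without the @udf decorator
    return [v + c for v in x.to_pylist()]


result = add_constant(FakeSeries([1, 2, 3]), c=20)
print(result)  # [21, 22, 23]
```

Once the logic behaves as expected, the same function body can be wrapped with @udf(return_dtype=DataType.int64()) as in the example above.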
- Parameters
return_dtype (DataType) – Returned type of the UDF
- Returns
UDF decorator - converts a user-provided Python function into a UDF that can be called on Expressions
- Return type
Callable[[UserProvidedPythonFunction], UDF]