Expressions#

Daft Expressions allow you to express some computation that needs to happen in a DataFrame.

This page provides an overview of all the functionality that is provided by Daft Expressions.

Constructors#

col

Creates an Expression referring to the column with the provided name

lit

Creates an Expression representing a column with every value set to the provided value

Generic#

Expression.alias

Gives the expression a new name, which is its column's name in the DataFrame schema and the name by which subsequent expressions can refer to the results of this expression.

Expression.cast

Casts an expression to the given datatype if possible

Expression.if_else

Conditionally choose values between two expressions using the current boolean expression as a condition

Expression.is_null

Checks if values in the Expression are Null (a special value indicating missing data)

Expression.not_null

Checks if values in the Expression are not Null (a special value indicating missing data)

Expression.apply

Apply a function on each value in a given expression

Numeric#

Expression.__abs__

Absolute of a numeric expression (abs(expr))

Expression.__add__

Adds two numeric expressions or concatenates two string expressions (e1 + e2)

Expression.__sub__

Subtracts two numeric expressions (e1 - e2)

Expression.__mul__

Multiplies two numeric expressions (e1 * e2)

Expression.__truediv__

True divides two numeric expressions (e1 / e2)

Expression.__mod__

Takes the mod of two numeric expressions (e1 % e2)

Expression.ceil

The ceiling of a numeric expression (expr.ceil())

Logical#

Expression.__invert__

Inverts a boolean expression (~e)

Expression.__and__

Takes the logical AND of two boolean expressions (e1 & e2)

Expression.__or__

Takes the logical OR of two boolean expressions (e1 | e2)

Expression.__lt__

Compares if an expression is less than another (e1 < e2)

Expression.__le__

Compares if an expression is less than or equal to another (e1 <= e2)

Expression.__eq__

Compares if an expression is equal to another (e1 == e2)

Expression.__ne__

Compares if an expression is not equal to another (e1 != e2)

Expression.__gt__

Compares if an expression is greater than another (e1 > e2)

Expression.__ge__

Compares if an expression is greater than or equal to another (e1 >= e2)

Membership#

Checking if an expression is a member of a list of values

daft.expressions.Expression.is_in(other)

Checks if values in the Expression are in the provided list

Strings#

The following methods are available under the expr.str attribute.

Expression.str.contains

Checks whether each string contains the given pattern in a string column

Expression.str.endswith

Checks whether each string ends with the given pattern in a string column

Expression.str.startswith

Checks whether each string starts with the given pattern in a string column

Expression.str.concat

Concatenates two string expressions together

Expression.str.length

Retrieves the length for a UTF-8 string column

Expression.str.split

Splits each string on the given pattern, into one or more strings.

Expression.str.lower

Convert UTF-8 string to all lowercase

Expression.str.upper

Convert UTF-8 string to all upper

Expression.str.lstrip

Strip whitespace from the left side of a UTF-8 string

Expression.str.rstrip

Strip whitespace from the right side of a UTF-8 string

Temporal#

Expression.dt.date

Retrieves the date for a datetime column

Expression.dt.hour

Retrieves the day for a datetime column

Expression.dt.day

Retrieves the day for a datetime column

Expression.dt.month

Retrieves the month for a datetime column

Expression.dt.year

Retrieves the year for a datetime column

Expression.dt.day_of_week

Retrieves the day of the week for a datetime column, starting at 0 for Monday and ending at 6 for Sunday

List#

Expression.list.join

Joins every element of a list using the specified string delimiter

Expression.list.lengths

Gets the length of each list

Expression.list.get

Gets the element at an index in each list

Struct#

Expression.struct.get

Retrieves one field from a struct column

Image#

Expression.image.decode

Decodes the binary data in this column into images.

Expression.image.encode

Encode an image column as the provided image file format, returning a binary column of encoded bytes.

Expression.image.resize

Resize image into the provided width and height.

Expression.image.crop

Crops images with the provided bounding box

Partitioning#

Expression.partitioning.days

Partitioning Transform that returns the number of days since epoch (1970-01-01)

Expression.partitioning.hours

Partitioning Transform that returns the number of hours since epoch (1970-01-01)

Expression.partitioning.months

Partitioning Transform that returns the number of months since epoch (1970-01-01)

Expression.partitioning.years

Partitioning Transform that returns the number of years since epoch (1970-01-01)

Expression.partitioning.iceberg_bucket

Partitioning Transform that returns the Hash Bucket following the Iceberg Specification of murmur3_32_x86 https://iceberg.apache.org/spec/#appendix-b-32-bit-hash-requirements

Expression.partitioning.iceberg_truncate

Partitioning Transform that truncates the input to a standard width w following the Iceberg Specification.

URLs#

Expression.url.download

Treats each string as a URL, and downloads the bytes contents as a bytes column