daft.Expression.list.distinct

Expression.list.distinct() → Expression

Returns a list of unique elements in each list, preserving order of first occurrence and ignoring nulls.

Example

>>> import daft
>>> df = daft.from_pydict({"a": [[1, 2, 2, 3], [4, 4, 6, 2], [6, 7, 1], [None, 1, None, 1]]})
>>> df.select(df["a"].list.distinct()).show()
╭─────────────╮
│ a           │
│ ---         │
│ List[Int64] │
╞═════════════╡
│ [1, 2, 3]   │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [4, 6, 2]   │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [6, 7, 1]   │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [1]         │
╰─────────────╯

(Showing first 4 of 4 rows)

Note that null values are ignored:

>>> df = daft.from_pydict({"a": [[None, None], [1, None, 1], [None]]})
>>> df.select(df["a"].list.distinct()).show()
╭─────────────╮
│ a           │
│ ---         │
│ List[Int64] │
╞═════════════╡
│ []          │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [1]         │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ []          │
╰─────────────╯

(Showing first 3 of 3 rows)

Returns:

An expression whose lists contain only the unique, non-null elements of each input list, in order of first occurrence.

Return type:

Expression
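
The per-list behavior described above (keep the first occurrence of each value, drop nulls) can be modeled in plain Python. This is only an illustrative sketch of the documented semantics, not Daft's implementation; the helper name `distinct_preserving_order` is hypothetical and is not part of the Daft API.

```python
from typing import Hashable, List, Optional


def distinct_preserving_order(values: List[Optional[Hashable]]) -> List[Hashable]:
    """Hypothetical reference model of the documented per-list semantics."""
    seen = set()
    result = []
    for value in values:
        # Nulls (None) are ignored entirely, as in the documented examples.
        if value is None:
            continue
        # Keep a value only the first time it appears.
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


# Same input rows as the first documented example.
rows = [[1, 2, 2, 3], [4, 4, 6, 2], [6, 7, 1], [None, 1, None, 1]]
print([distinct_preserving_order(row) for row in rows])
# prints [[1, 2, 3], [4, 6, 2], [6, 7, 1], [1]]
```

A list that contains only nulls maps to an empty list under this model, which matches the second example above.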