daft.Expression.list.distinct
- Expression.list.distinct() → Expression
Returns a list of unique elements in each list, preserving order of first occurrence and ignoring nulls.
Example
>>> import daft
>>> df = daft.from_pydict({"a": [[1, 2, 2, 3], [4, 4, 6, 2], [6, 7, 1], [None, 1, None, 1]]})
>>> df.select(df["a"].list.distinct()).show()
╭─────────────╮
│ a           │
│ ---         │
│ List[Int64] │
╞═════════════╡
│ [1, 2, 3]   │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [4, 6, 2]   │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [6, 7, 1]   │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [1]         │
╰─────────────╯

(Showing first 4 of 4 rows)
Note that null values are ignored:
>>> df = daft.from_pydict({"a": [[None, None], [1, None, 1], [None]]})
>>> df.select(df["a"].list.distinct()).show()
╭─────────────╮
│ a           │
│ ---         │
│ List[Int64] │
╞═════════════╡
│ []          │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [1]         │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ []          │
╰─────────────╯

(Showing first 3 of 3 rows)
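To make the documented semantics concrete, the per-list behavior (keep the first occurrence of each value, drop nulls) can be modeled in plain Python. This is only an illustrative sketch and not how Daft implements the operation; the helper name `distinct_preserving_order` is hypothetical and is not part of the Daft API.

```python
from typing import Hashable, List, Optional, TypeVar

T = TypeVar("T", bound=Hashable)


def distinct_preserving_order(values: List[Optional[T]]) -> List[T]:
    """Hypothetical helper modeling the documented per-list semantics."""
    seen = set()
    result: List[T] = []
    for value in values:
        if value is None:
            # Nulls are ignored, so they never appear in the output.
            continue
        if value not in seen:
            # Only the first occurrence of each value is kept.
            seen.add(value)
            result.append(value)
    return result


# Same input rows as the first example above.
rows = [[1, 2, 2, 3], [4, 4, 6, 2], [6, 7, 1], [None, 1, None, 1]]
deduped = [distinct_preserving_order(row) for row in rows]
print(deduped)  # [[1, 2, 3], [4, 6, 2], [6, 7, 1], [1]]
```

Under this model, a list that contains only nulls maps to an empty list, which matches the second example above.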
- Returns:
An expression with lists containing only unique elements
- Return type:
Expression