daft.DataFrame.distinct


DataFrame.distinct() -> DataFrame

Computes the unique rows of the DataFrame, dropping duplicates. Two rows count as duplicates when they have the same value in every column.

Example

>>> import daft
>>> df = daft.from_pydict({"x": [1, 2, 2], "y": [4, 5, 5], "z": [7, 8, 8]})
>>> unique_df = df.distinct()
>>> unique_df.show()
╭───────┬───────┬───────╮
│ x     ┆ y     ┆ z     │
│ ---   ┆ ---   ┆ ---   │
│ Int64 ┆ Int64 ┆ Int64 │
╞═══════╪═══════╪═══════╡
│ 2     ┆ 5     ┆ 8     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 1     ┆ 4     ┆ 7     │
╰───────┴───────┴───────╯

(Showing first 2 of 2 rows)
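
To make the row-level semantics concrete, here is a minimal plain-Python sketch of what "unique rows" means for the same input. It is an illustration only, not how Daft implements `distinct()`. The helper name `distinct_rows` is hypothetical and not part of the Daft API. The sketch keeps the first occurrence of each row in input order, whereas the Daft output above lists the rows in a different order, so callers should probably not rely on row order after `distinct()`.

```python
def distinct_rows(columns):
    """Drop duplicate rows from column-oriented data.

    `columns` maps column name -> list of values (same shape as the dict
    passed to daft.from_pydict). Hypothetical helper, for illustration only.
    """
    names = list(columns)
    seen = set()
    unique = []
    # Walk the data row by row; a row is the tuple of values across ALL columns.
    for row in zip(*(columns[name] for name in names)):
        if row not in seen:
            seen.add(row)
            unique.append(row)
    # Convert the surviving rows back to column-oriented form.
    return {name: [row[i] for row in unique] for i, name in enumerate(names)}


data = {"x": [1, 2, 2], "y": [4, 5, 5], "z": [7, 8, 8]}
result = distinct_rows(data)
print(result)  # {'x': [1, 2], 'y': [4, 5], 'z': [7, 8]}
```

Rows that agree in only some columns are kept: `{"a": [1, 1], "b": [2, 3]}` has no duplicate rows, because the rows `(1, 2)` and `(1, 3)` differ in column `b`.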

Returns:

A DataFrame containing only the unique rows.

Return type:

DataFrame