daft.DataFrame.sort

DataFrame.sort(by: Union[Expression, str, List[Union[Expression, str]]], desc: Union[bool, List[bool]] = False) -> DataFrame

Sorts the DataFrame globally.

Note

  • Since this is a global sort, it requires an expensive repartition, which can be quite slow.

  • Supports multicolumn sorts, and each column can have its own descending flag.

Example

>>> import daft
>>> from daft import col
>>> df = daft.from_pydict({"x": [3, 2, 1], "y": [6, 4, 5]})
>>> sorted_df = df.sort(col('x') + col('y'))
>>> sorted_df.show()
╭───────┬───────╮
│ x     ┆ y     │
│ ---   ┆ ---   │
│ Int64 ┆ Int64 │
╞═══════╪═══════╡
│ 2     ┆ 4     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 1     ┆ 5     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 3     ┆ 6     │
╰───────┴───────╯

(Showing first 3 of 3 rows)

You can also sort by multiple columns and specify the descending flag (desc) for each column:

>>> df = daft.from_pydict({"x": [1, 2, 1, 2], "y": [9, 8, 7, 6]})
>>> sorted_df = df.sort(["x", "y"], [True, False])
>>> sorted_df.show()
╭───────┬───────╮
│ x     ┆ y     │
│ ---   ┆ ---   │
│ Int64 ┆ Int64 │
╞═══════╪═══════╡
│ 2     ┆ 6     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 2     ┆ 8     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 1     ┆ 7     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 1     ┆ 9     │
╰───────┴───────╯

(Showing first 4 of 4 rows)

Parameters:
  • by (Union[ColumnInputType, List[ColumnInputType]]) – column(s) to sort by. Can be a str or an Expression, or a list of either.

  • desc (Union[bool, List[bool]], optional) – whether to sort in descending order. Pass a list to set the flag per column. Defaults to False.

Returns:

Sorted DataFrame.

Return type:

DataFrame
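
To make the per-column desc semantics concrete, the following is a plain-Python sketch that reproduces the ordering of the second example above on a list of row dicts. It is not how Daft implements sort: sort_rows is a hypothetical helper written only for illustration, and applying a single bool to every key is an assumption of this sketch.

```python
from functools import cmp_to_key


def sort_rows(rows, by, desc=False):
    """Sort dict rows by several keys, each with its own direction.

    Hypothetical helper for illustration only; not part of Daft.
    """
    # Assumption of this sketch: a single bool applies to every key.
    flags = [desc] * len(by) if isinstance(desc, bool) else list(desc)
    if len(flags) != len(by):
        raise ValueError("desc must have one flag per sort key")

    def compare(a, b):
        # Compare key by key; the first key that differs decides the order.
        for key, descending in zip(by, flags):
            if a[key] == b[key]:
                continue
            result = -1 if a[key] < b[key] else 1
            return -result if descending else result
        return 0

    return sorted(rows, key=cmp_to_key(compare))


rows = [{"x": 1, "y": 9}, {"x": 2, "y": 8}, {"x": 1, "y": 7}, {"x": 2, "y": 6}]
result = sort_rows(rows, ["x", "y"], [True, False])
print([(r["x"], r["y"]) for r in result])  # [(2, 6), (2, 8), (1, 7), (1, 9)]
```

Here x is sorted in descending order first, and rows that tie on x are then ordered by y ascending, which matches the table shown for df.sort(["x", "y"], [True, False]).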