daft.DataFrame.sort
- DataFrame.sort(by: Union[Expression, str, List[Union[Expression, str]]], desc: Union[bool, List[bool]] = False) -> DataFrame
Sorts the DataFrame globally.
Note
Since this is a global sort, it requires an expensive repartition, which can be quite slow.
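To illustrate why a global sort implies a repartition, here is a minimal pure-Python sketch of one common approach, range partitioning: sample the data, pick boundaries, move every value into the partition that owns its range, then sort each partition locally. This is an illustration only and is not taken from Daft's implementation; the function name `global_sort` and its parameters are hypothetical.

```python
import bisect
import random


def global_sort(partitions, num_out=None, sample_size=8, seed=0):
    """Globally sort values spread across several partitions (lists).

    Hypothetical helper for illustration; not part of Daft.
    """
    num_out = num_out or len(partitions)
    rng = random.Random(seed)

    # 1. Sample each partition to estimate the overall value distribution.
    sample = sorted(
        v
        for part in partitions
        for v in rng.sample(part, min(sample_size, len(part)))
    )

    # 2. Pick num_out - 1 boundaries that split the sample into ranges.
    boundaries = (
        [sample[len(sample) * i // num_out] for i in range(1, num_out)]
        if sample
        else []
    )

    # 3. Repartition: every value moves to the partition owning its range.
    #    In a distributed engine this data movement is the expensive step.
    out = [[] for _ in range(num_out)]
    for part in partitions:
        for v in part:
            out[bisect.bisect_right(boundaries, v)].append(v)

    # 4. Sort each partition locally; reading them in order is globally sorted.
    return [sorted(p) for p in out]


parts = [[9, 1, 5], [3, 8, 2], [7, 4, 6]]
sorted_parts = global_sort(parts)
flat = [v for p in sorted_parts for v in p]
print(flat)
```

Step 3 touches every row, which is why a global sort costs more than sorting each partition independently.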
Supports multi-column sorts, and each column can have its own descending flag.
Example
>>> import daft
>>> from daft import col
>>> df = daft.from_pydict({"x": [3, 2, 1], "y": [6, 4, 5]})
>>> sorted_df = df.sort(col('x') + col('y'))
>>> sorted_df.show()
╭───────┬───────╮
│ x     ┆ y     │
│ ---   ┆ ---   │
│ Int64 ┆ Int64 │
╞═══════╪═══════╡
│ 2     ┆ 4     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 1     ┆ 5     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 3     ┆ 6     │
╰───────┴───────╯
(Showing first 3 of 3 rows)
You can also sort by multiple columns, and specify the ‘descending’ flag for each column:
>>> df = daft.from_pydict({"x": [1, 2, 1, 2], "y": [9, 8, 7, 6]})
>>> sorted_df = df.sort(["x", "y"], [True, False])
>>> sorted_df.show()
╭───────┬───────╮
│ x     ┆ y     │
│ ---   ┆ ---   │
│ Int64 ┆ Int64 │
╞═══════╪═══════╡
│ 2     ┆ 6     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 2     ┆ 8     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 1     ┆ 7     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 1     ┆ 9     │
╰───────┴───────╯
(Showing first 4 of 4 rows)
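The ordering in the multi-column example above can be reproduced with a small pure-Python sketch of the semantics: earlier keys take priority, and each key carries its own direction. This is only a model of the observable behavior, using stable sorts applied from the last key to the first; the helper `sort_rows` is hypothetical and says nothing about how Daft executes the sort internally.

```python
def sort_rows(rows, by, desc=False):
    """Sort a list of dict rows by one or more keys, each with its own direction.

    Hypothetical helper for illustration; not part of Daft.
    """
    keys = [by] if isinstance(by, str) else list(by)
    # A single bool applies to every key; a list gives one flag per key.
    flags = [desc] * len(keys) if isinstance(desc, bool) else list(desc)
    if len(flags) != len(keys):
        raise ValueError("desc must have one flag per sort key")

    result = list(rows)
    # Python's sort is stable (also with reverse=True), so sorting by the
    # least significant key first leaves ties ordered by the later keys.
    for key, descending in reversed(list(zip(keys, flags))):
        result.sort(key=lambda row, k=key: row[k], reverse=descending)
    return result


rows = [{"x": x, "y": y} for x, y in zip([1, 2, 1, 2], [9, 8, 7, 6])]
for row in sort_rows(rows, ["x", "y"], [True, False]):
    print(row["x"], row["y"])
# prints:
# 2 6
# 2 8
# 1 7
# 1 9
```

The rows come out with x descending and, within each x value, y ascending, matching the table shown above.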
- Parameters:
by (Union[ColumnInputType, List[ColumnInputType]]) – column(s) to sort by. Can be a str or an Expression, or a list of either.
desc (Union[bool, List[bool]], optional) – whether to sort in descending order. Pass a list to set the flag per column. Defaults to False.
- Returns:
Sorted DataFrame.
- Return type:
DataFrame