daft.DataFrame.pivot
- DataFrame.pivot(group_by: Union[Expression, str, Iterable[Union[Expression, str]]], pivot_col: Union[Expression, str], value_col: Union[Expression, str], agg_fn: str, names: Optional[List[str]] = None) -> DataFrame
Pivots a column of the DataFrame and performs an aggregation on the values.
Note
Passing the list of distinct values to pivot on via names is more efficient, because it avoids a distinct operation. If names is omitted, Daft runs a distinct operation on the pivot column to determine the unique values to pivot on.
Example
>>> import daft
>>> data = {
...     "id": [1, 2, 3, 4],
...     "version": ["3.8", "3.8", "3.9", "3.9"],
...     "platform": ["macos", "macos", "macos", "windows"],
...     "downloads": [100, 200, 150, 250],
... }
>>> df = daft.from_pydict(data)
>>> df = df.pivot("version", "platform", "downloads", "sum")
>>> df.show()
╭─────────┬─────────┬───────╮
│ version ┆ windows ┆ macos │
│ ---     ┆ ---     ┆ ---   │
│ Utf8    ┆ Int64   ┆ Int64 │
╞═════════╪═════════╪═══════╡
│ 3.9     ┆ 250     ┆ 150   │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 3.8     ┆ None    ┆ 300   │
╰─────────┴─────────┴───────╯

(Showing first 2 of 2 rows)
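To make the group, pivot, and aggregate steps in the example above concrete, here is a plain-Python sketch of the same computation for the "sum" case. The helper pivot_sum is hypothetical and is not part of Daft; it only mirrors the semantics shown in the example output, and says nothing about how Daft implements the operation.

```python
def pivot_sum(rows, group_by, pivot_col, value_col, names=None):
    """Hypothetical reference helper (not a Daft API): pivot with a sum."""
    if names is None:
        # Stand-in for the distinct step: unique pivot values, first-seen order.
        names = list(dict.fromkeys(row[pivot_col] for row in rows))

    totals = {}  # group key -> {pivot value -> running sum}
    for row in rows:
        pivot_value = row[pivot_col]
        if pivot_value not in names:
            continue  # this sketch ignores values that are not listed in names
        cells = totals.setdefault(row[group_by], {})
        cells[pivot_value] = cells.get(pivot_value, 0) + row[value_col]

    # One output row per group; pivot values with no input rows become None.
    return [
        {group_by: group, **{name: cells.get(name) for name in names}}
        for group, cells in totals.items()
    ]


rows = [
    {"id": 1, "version": "3.8", "platform": "macos", "downloads": 100},
    {"id": 2, "version": "3.8", "platform": "macos", "downloads": 200},
    {"id": 3, "version": "3.9", "platform": "macos", "downloads": 150},
    {"id": 4, "version": "3.9", "platform": "windows", "downloads": 250},
]
result = pivot_sum(rows, "version", "platform", "downloads")
```

Each distinct platform becomes a column, each version becomes a row, and the downloads in each (version, platform) cell are summed, which matches the table printed by df.show() up to row and column order.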
- Parameters:
group_by (ManyColumnsInputType) – columns to group by
pivot_col (Union[str, Expression]) – column to pivot
value_col (Union[str, Expression]) – column to aggregate
agg_fn (str) – aggregation function to apply
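names (Optional[List[str]]) – names of the pivoted columns, i.e. the distinct values of pivot_col to pivot on; if None, Daft determines them with a distinct operation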
- Returns:
DataFrame with pivoted columns
- Return type: