daft.DataFrame.groupby
- DataFrame.groupby(*group_by: Union[Expression, str, Iterable[Union[Expression, str]]]) → GroupedDataFrame
Performs a GroupBy on the DataFrame: rows are grouped by the given columns so that aggregations can be computed per group.
Example
>>> import daft
>>> from daft import col
>>> df = daft.from_pydict({
...     "pet": ["cat", "dog", "dog", "cat"],
...     "age": [1, 2, 3, 4],
...     "name": ["Alex", "Jordan", "Sam", "Riley"]
... })
>>> grouped_df = df.groupby("pet").agg(
...     col("age").min().alias("min_age"),
...     col("age").max().alias("max_age"),
...     col("pet").count().alias("count"),
...     col("name").any_value()
... )
>>> grouped_df.show()
╭──────┬─────────┬─────────┬────────┬────────╮
│ pet  ┆ min_age ┆ max_age ┆ count  ┆ name   │
│ ---  ┆ ---     ┆ ---     ┆ ---    ┆ ---    │
│ Utf8 ┆ Int64   ┆ Int64   ┆ UInt64 ┆ Utf8   │
╞══════╪═════════╪═════════╪════════╪════════╡
│ cat  ┆ 1       ┆ 4       ┆ 2      ┆ Alex   │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ dog  ┆ 2       ┆ 3       ┆ 2      ┆ Jordan │
╰──────┴─────────┴─────────┴────────┴────────╯
(Showing first 2 of 2 rows)
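As a rough mental model of what the example above computes, the following plain-Python sketch buckets rows by the `pet` column and then reduces each bucket. It is only an illustration of group-then-aggregate semantics, not how Daft executes the query. The helpers `group_rows` and `summarize` are hypothetical names, and picking the first name seen is just one possible behaviour for `any_value()`.

```python
from collections import defaultdict

# Same sample data as the example above.
data = {
    "pet": ["cat", "dog", "dog", "cat"],
    "age": [1, 2, 3, 4],
    "name": ["Alex", "Jordan", "Sam", "Riley"],
}


def group_rows(columns, key):
    """Bucket row indices by the value in the `key` column (hypothetical helper)."""
    groups = defaultdict(list)
    for i, value in enumerate(columns[key]):
        groups[value].append(i)
    return groups


def summarize(columns, key):
    """Per group: min and max of `age`, row count, and one arbitrary `name`."""
    result = {}
    for value, rows in group_rows(columns, key).items():
        ages = [columns["age"][i] for i in rows]
        result[value] = {
            "min_age": min(ages),
            "max_age": max(ages),
            "count": len(rows),
            # Assumption: any single value from the group is acceptable;
            # this sketch takes the first one encountered.
            "name": columns["name"][rows[0]],
        }
    return result


summary = summarize(data, "pet")
print(summary["cat"])
print(summary["dog"])
```

Each key of `summary` corresponds to one output row of the grouped result, and each inner dictionary corresponds to the aggregated columns.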
- Parameters:
*group_by (Union[Expression, str, Iterable[Union[Expression, str]]]) – columns to group by, given as column names, expressions, or iterables of either
- Returns:
A grouped DataFrame on which aggregations (for example via agg) can be run
- Return type: