daft.DataFrame.sample

DataFrame.sample(fraction: float, with_replacement: bool = False, seed: Optional[int] = None) → DataFrame

Samples a fraction of rows from the DataFrame.

Example

>>> import daft
>>> df = daft.from_pydict({"x": [1, 2, 3], "y": [4, 5, 6], "z": [7, 8, 9]})
>>> sampled_df = df.sample(0.5)
>>> # The sampled rows vary from run to run.
>>> # Here is one possible output:
>>> # ╭───────┬───────┬───────╮
>>> # │ x     ┆ y     ┆ z     │
>>> # │ ---   ┆ ---   ┆ ---   │
>>> # │ Int64 ┆ Int64 ┆ Int64 │
>>> # ╞═══════╪═══════╪═══════╡
>>> # │ 2     ┆ 5     ┆ 8     │
>>> # ├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
>>> # │ 3     ┆ 6     ┆ 9     │
>>> # ╰───────┴───────┴───────╯

Parameters:
  • fraction (float) – fraction of rows to sample.

  • with_replacement (bool, optional) – whether to sample with replacement. Defaults to False.

  • seed (Optional[int], optional) – random seed. Defaults to None.

Returns:

DataFrame with a fraction of rows.

Return type:

DataFrame