# daft.Expression.approx_percentiles#

Expression.approx_percentiles(percentiles: float | list[float]) Expression[source]#

Calculates the approximate percentile(s) for a column of numeric values

For numeric columns, we use the sketches_ddsketch crate. This is a Rust implementation of the paper DDSketch: A Fast and Fully-Mergeable Quantile Sketch with Relative-Error Guarantees (Masson et al.)

1. Null values are ignored in the computation of the percentiles

2. If all values are Null then the result will also be Null

3. If `percentiles` are supplied as a single float, then the resultant column is a `Float64` column

4. If `percentiles` is supplied as a list, then the resultant column is a `FixedSizeList[Float64; N]` column, where `N` is the length of the supplied list.

Example of a global calculation of approximate percentiles:

```>>> df = daft.from_pydict({"scores": [1, 2, 3, 4, 5, None]})
>>> df = df.agg(
>>>     df["scores"].approx_percentiles(0.5).alias("approx_median_score"),
>>>     df["scores"].approx_percentiles([0.25, 0.5, 0.75]).alias("approx_percentiles_scores"),
>>> )
>>> df.show()
╭─────────────────────┬────────────────────────────────╮
│ approx_median_score ┆ approx_percentiles_scores      │
│ ---                 ┆ ---                            │
│ Float64             ┆ FixedSizeList[Float64; 3]      │
╞═════════════════════╪════════════════════════════════╡
│ 2.9742334234767167  ┆ [1.993661701417351, 2.9742334… │
╰─────────────────────┴────────────────────────────────╯
(Showing first 1 of 1 rows)
```

Example of a grouped calculation of approximate percentiles:

```>>> df = daft.from_pydict({
>>>     "class":  ["a", "a", "a", "b", "c"],
>>>     "scores": [1, 2, 3, 1, None],
>>> })
>>> df = df.groupby("class").agg(
>>>     df["scores"].approx_percentiles(0.5).alias("approx_median_score"),
>>>     df["scores"].approx_percentiles([0.25, 0.5, 0.75]).alias("approx_percentiles_scores"),
>>> )
>>> df.show()
╭───────┬─────────────────────┬────────────────────────────────╮
│ class ┆ approx_median_score ┆ approx_percentiles_scores      │
│ ---   ┆ ---                 ┆ ---                            │
│ Utf8  ┆ Float64             ┆ FixedSizeList[Float64; 3]      │
╞═══════╪═════════════════════╪════════════════════════════════╡
│ c     ┆ None                ┆ None                           │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ a     ┆ 1.993661701417351   ┆ [0.9900000000000001, 1.993661… │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ b     ┆ 0.9900000000000001  ┆ [0.9900000000000001, 0.990000… │
╰───────┴─────────────────────┴────────────────────────────────╯
(Showing first 3 of 3 rows)
```
Parameters:

percentiles – the percentile(s) at which to find approximate values at. Can be provided as a single float or a list of floats.

Returns:

`FixedSizeList[Float64, len(percentiles)]`.

Return type:

A new expression representing the approximate percentile(s). If `percentiles` was a single float, this will be a new `Float64` expression. If `percentiles` was a list of floats, this will be a new expression with type