daft.functions.monotonically_increasing_id

monotonically_increasing_id() → Expression

Generates a column of monotonically increasing unique ids.

The implementation puts the partition number in the upper 28 bits and the row number within each partition in the lower 36 bits. This allows for 2^28 ≈ 268 million partitions and 2^36 ≈ 68 billion rows per partition. As a result, ids are unique and increasing, but they are not consecutive across partitions.
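The bit layout described above can be sketched in plain Python. This is only an illustration of the arithmetic, not Daft's internal code; the names `encode_id`, `decode_id`, `ROW_BITS`, and `PARTITION_BITS` are hypothetical and are not part of the daft API.

```python
# Hypothetical helpers illustrating the documented 28/36 bit split.
# They do not call into daft and are not part of its API.
PARTITION_BITS = 28              # upper bits: partition number
ROW_BITS = 36                    # lower bits: row number within the partition
ROW_MASK = (1 << ROW_BITS) - 1   # mask selecting the lower 36 bits


def encode_id(partition: int, row: int) -> int:
    """Pack a partition number and a row number into one 64-bit id."""
    if not 0 <= partition < (1 << PARTITION_BITS):
        raise ValueError("partition number does not fit in 28 bits")
    if not 0 <= row < (1 << ROW_BITS):
        raise ValueError("row number does not fit in 36 bits")
    return (partition << ROW_BITS) | row


def decode_id(value: int) -> tuple[int, int]:
    """Split an id back into (partition number, row number)."""
    return value >> ROW_BITS, value & ROW_MASK


print(encode_id(0, 1))         # 1
print(encode_id(1, 0))         # 68719476736
print(decode_id(68719476737))  # (1, 1)
```

Under this layout, the first row of partition 1 gets the id 2^36 = 68719476736, which is why ids jump between partitions rather than continuing from the last id of the previous partition.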

Example

>>> import daft
>>> from daft.functions import monotonically_increasing_id
>>> df = daft.from_pydict({"a": [1, 2, 3, 4]}).into_partitions(2)
>>> df = df.with_column("id", monotonically_increasing_id())
>>> df.show()
╭───────┬─────────────╮
│ a     ┆ id          │
│ ---   ┆ ---         │
│ Int64 ┆ UInt64      │
╞═══════╪═════════════╡
│ 1     ┆ 0           │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2     ┆ 1           │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 3     ┆ 68719476736 │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 4     ┆ 68719476737 │
╰───────┴─────────────╯
Returns:

An expression that generates monotonically increasing unique ids.

Return type:

Expression