daft.DataFrame.write_parquet
- DataFrame.write_parquet(root_dir: Union[str, pathlib.Path], compression: str = 'snappy', partition_cols: Optional[List[Union[daft.expressions.expressions.Expression, str]]] = None) -> daft.dataframe.dataframe.DataFrame
Writes the DataFrame as parquet files, returning a new DataFrame with paths to the files that were written.
Files will be written to <root_dir>/* with randomly generated UUIDs as the file names.
Currently generates one parquet file per partition unless partition_cols are used, in which case the number of files can equal the number of partitions times the number of values of the partition column.
Note
This call is blocking and will execute the DataFrame when called.
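The file-count rule described above can be illustrated with a small standard-library sketch. This is not Daft's implementation: the helper `sketch_output_paths`, the `.parquet` suffix, and the flat directory layout are all assumptions made for illustration, and the product it computes is only the upper bound implied by "can equal".

```python
import uuid
from pathlib import Path


def sketch_output_paths(root_dir, num_partitions, partition_values=None):
    """Hypothetical model of the output layout (not Daft's code).

    Produces one UUID-named path per partition, or one per
    (partition, distinct partition value) pair when values are given.
    """
    root = Path(root_dir)
    # Without partition_cols, each partition maps to exactly one file.
    groups_per_partition = len(set(partition_values)) if partition_values else 1
    return [
        root / f"{uuid.uuid4()}.parquet"  # ".parquet" suffix is an assumption
        for _ in range(num_partitions * groups_per_partition)
    ]


# 4 partitions, no partition_cols: one file per partition.
paths_plain = sketch_output_paths("out", num_partitions=4)

# 4 partitions, a partition column with 3 distinct values: up to 4 * 3 files.
paths_split = sketch_output_paths(
    "out", num_partitions=4, partition_values=["a", "b", "c"]
)

print(len(paths_plain), len(paths_split))
```

In practice the real count can be lower than the product, since a given partition may not contain every value of the partition column.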
- Parameters
root_dir (Union[str, pathlib.Path]) – root file path to write parquet files to.
compression (str, optional) – compression algorithm. Defaults to “snappy”.
partition_cols (Optional[List[ColumnInputType]], optional) – How to subpartition each partition further. Currently only supports Column Expressions with any calls. Defaults to None.
- Returns
A DataFrame containing the filenames that were written out as strings.
- Return type