daft.DataFrame.write_parquet


DataFrame.write_parquet(root_dir: Union[str, Path], compression: str = 'snappy', partition_cols: Optional[List[Union[Expression, str]]] = None, io_config: Optional[IOConfig] = None) → DataFrame

Writes the DataFrame as Parquet files and returns a new DataFrame containing the paths of the files that were written.

Files will be written to <root_dir>/* with randomly generated UUIDs as the file names.

Note

This call is blocking and executes the DataFrame when called.

Parameters:
  • root_dir (Union[str, Path]) – root directory path to write Parquet files to.

  • compression (str, optional) – compression algorithm. Defaults to “snappy”.

  • partition_cols (Optional[List[Union[Expression, str]]], optional) – columns or expressions by which to further subdivide each partition when writing. Defaults to None.

  • io_config (Optional[IOConfig], optional) – configurations to use when interacting with remote storage.

Returns:

A DataFrame containing the paths of the written files as strings.

Return type:

DataFrame