daft.DataFrame.write_csv

DataFrame.write_csv(root_dir: Union[str, Path], partition_cols: Optional[List[Union[Expression, str]]] = None, io_config: Optional[IOConfig] = None) -> DataFrame

Writes the DataFrame as CSV files and returns a new DataFrame containing the paths of the files that were written.

Files will be written to <root_dir>/* with randomly generated UUIDs as the file names.

Currently, one CSV file is generated per partition. If partition_cols is used, each partition can be split further, so the number of files can be as large as the number of partitions times the number of distinct values of the partition column.
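The file-count rule above can be sketched as plain arithmetic. The helper below is hypothetical and not part of Daft; it only computes the upper bound described in the text, assuming a single partition column.

```python
from typing import Optional


def max_csv_files(num_partitions: int, num_partition_values: Optional[int] = None) -> int:
    """Upper bound on the number of CSV files written (hypothetical helper, not a Daft API)."""
    if num_partitions < 0:
        raise ValueError("num_partitions must be non-negative")
    if num_partition_values is None:
        # No partition_cols: one file per partition.
        return num_partitions
    # With partition_cols: each partition may be split once per distinct value.
    return num_partitions * num_partition_values


print(max_csv_files(4))     # 4
print(max_csv_files(4, 3))  # 12
```

The actual number of files can be lower, since a given partition does not necessarily contain every value of the partition column.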

Note

This call is blocking and will execute the DataFrame when called.

Parameters:
  • root_dir (Union[str, Path]) – root file path to write CSV files to.

  • partition_cols (Optional[List[ColumnInputType]], optional) – How to subpartition each partition further. Defaults to None.

  • io_config (Optional[IOConfig], optional) – configuration to use when interacting with remote storage. Defaults to None.

Returns:

A DataFrame containing the paths of the files that were written, as strings.

Return type:

DataFrame
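Example

A minimal usage sketch, assuming Daft is installed and that daft.from_pydict and DataFrame.to_pydict are available in the installed version. The sample data and the helper name write_sales_csv are hypothetical and used only for illustration. The columns of the returned DataFrame are not assumed here; the result is simply converted to a dictionary for inspection.

```python
import tempfile


def write_sales_csv(root_dir):
    """Write a small DataFrame as CSV files under root_dir and return the file listing."""
    import daft  # third-party; imported here so defining the helper does not require Daft

    # Hypothetical sample data, for illustration only.
    df = daft.from_pydict({
        "region": ["eu", "eu", "us"],
        "amount": [10, 20, 30],
    })

    # Blocking call: the DataFrame is executed and the files are written here.
    written = df.write_csv(root_dir, partition_cols=["region"])

    # `written` is itself a DataFrame describing the files that were written.
    return written.to_pydict()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(write_sales_csv(tmp))
```

For remote storage, the same call would take an io_config argument alongside a remote root_dir; that case is omitted here because it depends on credentials and the storage backend.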