Input/Output#

In-Memory Data#

Python Objects#

daft.from_pylist

Creates a DataFrame from a list of dictionaries.

daft.from_pydict

Creates a DataFrame from a Python dictionary.

daft.DataFrame.to_pydict

Converts the current DataFrame to a Python dictionary.

Arrow#

daft.from_arrow

Creates a DataFrame from a pyarrow Table.

daft.DataFrame.to_arrow

Converts the current DataFrame to a pyarrow Table.

Pandas#

daft.from_pandas

Creates a Daft DataFrame from a pandas DataFrame.

daft.DataFrame.to_pandas

Converts the current DataFrame to a pandas DataFrame.

File Paths#

daft.from_glob_path

Creates a DataFrame of file paths and other metadata from a glob path.

Files#

Parquet#

daft.read_parquet

Creates a DataFrame from Parquet file(s).

daft.DataFrame.write_parquet

Writes the DataFrame as Parquet files, returning a new DataFrame with paths to the files that were written.

CSV#

daft.read_csv

Creates a DataFrame from CSV file(s).

daft.DataFrame.write_csv

Writes the DataFrame as CSV files, returning a new DataFrame with paths to the files that were written.

JSON#

daft.read_json

Creates a DataFrame from line-delimited JSON file(s).

Integrations#

Ray Datasets#

daft.from_ray_dataset

Creates a DataFrame from a Ray Dataset.

daft.DataFrame.to_ray_dataset

Converts the current DataFrame to a Ray Dataset, which is useful for running distributed ML model training in Ray.

Dask#

daft.from_dask_dataframe

Creates a Daft DataFrame from a Dask DataFrame.

daft.DataFrame.to_dask_dataframe

Converts the current Daft DataFrame to a Dask DataFrame.