daft.read_parquet
- daft.read_parquet(path: Union[str, List[str]], schema_hints: Optional[Dict[str, daft.datatype.DataType]] = None, fs: Optional[fsspec.spec.AbstractFileSystem] = None, io_config: Optional[IOConfig] = None, use_native_downloader: bool = False) -> daft.dataframe.dataframe.DataFrame
Creates a DataFrame from Parquet file(s)
Example
>>> df = daft.read_parquet("/path/to/file.parquet")
>>> df = daft.read_parquet("/path/to/directory")
>>> df = daft.read_parquet("/path/to/files-*.parquet")
>>> df = daft.read_parquet("s3://path/to/files-*.parquet")
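As a rough illustration of the wildcard forms above, the following standard-library sketch shows which names a shell-style pattern such as files-*.parquet would select. It does not call Daft: the file names are hypothetical, and Daft's own glob expansion may differ in details (for example, in how * treats path separators).

```python
import fnmatch

# Hypothetical file names, used only for illustration.
candidates = [
    "/path/to/files-0.parquet",
    "/path/to/files-1.parquet",
    "/path/to/other.parquet",
    "/path/to/files-2.csv",
]

pattern = "/path/to/files-*.parquet"

# fnmatchcase does case-sensitive, shell-style wildcard matching.
# This approximates glob semantics; it is not Daft's implementation.
matched = [name for name in candidates if fnmatch.fnmatchcase(name, pattern)]

for name in matched:
    print(name)
```

Only the two files-N.parquet names satisfy the pattern; other.parquet and the .csv file are skipped.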
- Parameters
path (str or list[str]) – Path, or list of paths, to Parquet file(s) or a directory (wildcards are allowed)
schema_hints (dict[str, DataType]) – A mapping from column names to datatypes. Passing this option disables all schema inference on the data being read and raises an error if the data is incompatible with the hints.
fs (fsspec.AbstractFileSystem) – fsspec FileSystem to use for reading data. By default, Daft will automatically construct a FileSystem instance internally.
io_config (IOConfig) – Config to be used with the native downloader
use_native_downloader (bool) – Whether to use the native downloader instead of PyArrow for reading Parquet. This is currently experimental. Defaults to False.
- Returns
parsed DataFrame
- Return type