daft.read_json

daft.read_json(path: Union[str, List[str]], schema_hints: Optional[Dict[str, DataType]] = None, io_config: Optional[IOConfig] = None, use_native_downloader: bool = True, _buffer_size: Optional[int] = None, _chunk_size: Optional[int] = None) -> DataFrame

Creates a DataFrame from line-delimited JSON file(s)

Example

>>> df = daft.read_json("/path/to/file.json")
>>> df = daft.read_json("/path/to/directory")
>>> df = daft.read_json("/path/to/files-*.json")
>>> df = daft.read_json("s3://path/to/files-*.json")
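
"Line-delimited" means that each line of the file holds one complete JSON object. As a minimal sketch of that layout, the snippet below writes such a file using only the standard library; the file name and the records are hypothetical and chosen purely for illustration.

```python
import json
import os
import tempfile

# Hypothetical sample records; each one becomes a single line of the file.
records = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
]

# Write one JSON object per line (line-delimited JSON).
sample_path = os.path.join(tempfile.mkdtemp(), "sample.json")
with open(sample_path, "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```

The resulting `sample_path` can then be passed to `daft.read_json` like any of the paths shown above.
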

Parameters:
  • path (str or list[str]) – Path, or list of paths, to JSON file(s) or a directory (wildcards are allowed)

  • schema_hints (dict[str, DataType]) – A mapping from column names to DataTypes. Columns listed here override the corresponding columns of the inferred schema; all other columns keep their inferred types.

  • io_config (IOConfig) – Config to be used with the native downloader

  • use_native_downloader (bool) – Whether to use the native downloader instead of PyArrow for reading JSON. This is currently experimental.

Returns:

parsed DataFrame

Return type:

DataFrame
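
As a sketch of how `schema_hints` might be used, the helper below forces one column to a 64-bit integer and leaves every other column to schema inference. The helper name `read_events` and the column name `id` are hypothetical; it assumes a Daft version in which `read_json` accepts `schema_hints` as in the signature above, and in which `DataType.int64()` is available.

```python
from typing import Any, Dict, List, Union


def read_events(path: Union[str, List[str]]) -> Any:
    """Read line-delimited JSON, overriding the inferred type of `id`."""
    # Imported lazily so this sketch can be loaded without daft installed.
    import daft
    from daft import DataType

    # Hypothetical column name "id"; unlisted columns keep inferred types.
    hints: Dict[str, Any] = {"id": DataType.int64()}
    return daft.read_json(path, schema_hints=hints)
```

A call such as `read_events("/path/to/files-*.json")` would return a DataFrame in which only the hinted column deviates from the inferred schema.
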