daft.from_glob_path
- daft.from_glob_path(path: str, fs: Optional[fsspec.spec.AbstractFileSystem] = None) → daft.dataframe.dataframe.DataFrame
Creates a DataFrame of file paths and other metadata from a glob path.
This method supports wildcards:
“*” matches any number of any characters including none
“?” matches any single character
“[…]” matches any single character in the brackets
“**” recursively matches any number of layers of directories
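The wildcard rules above can be tried out without Daft. The following is a minimal sketch that uses Python's standard-library glob module, which follows similar rules when called with recursive=True. It is not Daft's implementation (Daft globs through fsspec, so edge cases may differ), and the helper names make_tree and match are hypothetical.

```python
import glob
import os
import tempfile


def make_tree(root):
    """Create a small tree of empty .jpeg files to glob against."""
    files = [
        "a.jpeg",
        "image-1.jpeg",
        "image-22.jpeg",
        "sub/image-2.jpeg",
        "sub/deep/b.jpeg",
    ]
    for rel in files:
        full = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()


def match(root, pattern):
    """Return sorted matches, relative to root, with '/' separators."""
    hits = glob.glob(os.path.join(root, *pattern.split("/")), recursive=True)
    return sorted(os.path.relpath(h, root).replace(os.sep, "/") for h in hits)


with tempfile.TemporaryDirectory() as root:
    make_tree(root)
    top_level = match(root, "*.jpeg")            # "*": files directly under root
    recursive = match(root, "**/*.jpeg")         # "**": any depth of directories
    single_char = match(root, "**/image-?.jpeg") # "?": exactly one character
    char_class = match(root, "image-[12].jpeg")  # "[...]": one character from the set

print(top_level)
print(recursive)
print(single_char)
print(char_class)
```

Note that image-22.jpeg is excluded by both "?" and "[12]", since each of those matches exactly one character.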
The returned DataFrame will have the following columns:
path: the path to the file/directory
size: size of the object in bytes
type: either “file” or “directory”
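To make the three columns concrete, here is a rough standard-library sketch that produces one record per matched path with the same path, size, and type fields. It only illustrates the shape of the rows; the function glob_metadata is hypothetical, and Daft itself obtains this metadata from the fsspec filesystem, so reported values (directory sizes in particular) may differ.

```python
import glob
import os
import tempfile


def glob_metadata(pattern):
    """Return one dict per match with path, size (bytes), and type."""
    rows = []
    for path in sorted(glob.glob(pattern, recursive=True)):
        rows.append(
            {
                "path": path,
                "size": os.path.getsize(path),
                "type": "directory" if os.path.isdir(path) else "file",
            }
        )
    return rows


with tempfile.TemporaryDirectory() as root:
    # One directory and one 5-byte file directly under root.
    os.mkdir(os.path.join(root, "images"))
    with open(os.path.join(root, "notes.txt"), "w") as f:
        f.write("hello")

    rows = glob_metadata(os.path.join(root, "*"))
    types = {os.path.basename(r["path"]): r["type"] for r in rows}
    sizes = {os.path.basename(r["path"]): r["size"] for r in rows}

print(types)
print(sizes["notes.txt"])
```

The size reported for a directory is platform-dependent, so only file sizes should be relied on in a sketch like this.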
Example
>>> df = daft.from_glob_path("/path/to/files/*.jpeg")
>>> df = daft.from_glob_path("/path/to/files/**/*.jpeg")
>>> df = daft.from_glob_path("/path/to/files/**/image-?.jpeg")
- Parameters
path (str) – Path to files on disk (allows wildcards).
fs (fsspec.AbstractFileSystem) – fsspec FileSystem to use for globbing and fetching metadata. By default, Daft will automatically construct a FileSystem instance internally.
- Returns
- DataFrame containing the path to each file as a row, along with other metadata parsed from the provided filesystem.
- Return type