daft.expressions.expressions.ExpressionUrlNamespace.download

ExpressionUrlNamespace.download(max_worker_threads: int = 8, on_error: Union[Literal['raise'], Literal['null']] = 'raise', fs: Optional[fsspec.spec.AbstractFileSystem] = None) → daft.expressions.expressions.Expression

Treats each string in the column as a URL and downloads its contents, producing a column of bytes.

Parameters
  • max_worker_threads – The maximum number of threads to use for downloading URLs, defaults to 8

  • on_error – Behavior when a URL download fails: “raise” raises the error immediately, while “null” logs the error and falls back to a Null value. Defaults to “raise”.

  • fs (fsspec.AbstractFileSystem) – fsspec FileSystem to use for downloading data. By default, Daft will automatically construct a FileSystem instance internally.

Returns

A BYTES expression holding the contents of each URL, or None where an error occurred during download and on_error is “null”.

Return type

UdfExpression
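
Example

The max_worker_threads and on_error semantics described under Parameters can be illustrated with a minimal standard-library sketch. This is not Daft's implementation: download_all is a hypothetical helper, it uses urllib.request instead of fsspec, and it leaves out the fs parameter entirely. It only shows how a bounded thread pool and the two error modes might interact.

```python
import logging
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional

logger = logging.getLogger(__name__)


def download_all(
    urls: List[str],
    max_worker_threads: int = 8,
    on_error: str = "raise",
) -> List[Optional[bytes]]:
    """Hypothetical stand-in that mimics the documented behavior."""
    if on_error not in ("raise", "null"):
        raise ValueError(f"on_error must be 'raise' or 'null', got {on_error!r}")

    def fetch(url: str) -> Optional[bytes]:
        try:
            with urllib.request.urlopen(url) as response:
                return response.read()
        except Exception as exc:
            if on_error == "raise":
                # Propagate the failure to the caller.
                raise
            # on_error == "null": log the failure and substitute a null value.
            logger.error("Failed to download %s: %s", url, exc)
            return None

    # At most max_worker_threads downloads run concurrently;
    # pool.map returns results in the same order as the input URLs.
    with ThreadPoolExecutor(max_workers=max_worker_threads) as pool:
        return list(pool.map(fetch, urls))
```

For instance, calling download_all with a data: URL and a path that does not exist, using on_error="null", yields the decoded bytes for the first entry and None for the second. With the default on_error="raise", the same call raises instead of returning a partial result.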