Configuration

Setting the Runner

Control the execution backend that Daft will run on by calling these functions once at the start of your application.

daft.context.set_runner_py

Set the runner for executing Daft dataframes to your local Python interpreter - this is the default behavior.

daft.context.set_runner_ray

Set the runner for executing Daft dataframes to a Ray cluster.
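
As a minimal sketch of the "call once at the start of your application" pattern, the helper below picks a runner from an environment variable and then applies it. The variable name `APP_DAFT_RUNNER` and both helper names are hypothetical, and the `address` keyword passed to `set_runner_ray` is an assumption; check the signatures in your installed Daft version.

```python
import os


def pick_runner(env):
    """Return "py" or "ray" from a hypothetical APP_DAFT_RUNNER setting."""
    name = env.get("APP_DAFT_RUNNER", "py").strip().lower()
    if name not in ("py", "ray"):
        raise ValueError(f"unknown runner: {name!r}")
    return name


def configure_runner(name, ray_address=None):
    """Apply the chosen runner; call this once, before building any dataframes."""
    import daft  # imported lazily so pick_runner works without Daft installed

    if name == "ray":
        # Assumption: set_runner_ray takes the Ray cluster address as `address`.
        daft.context.set_runner_ray(address=ray_address)
    else:
        daft.context.set_runner_py()


# Typical startup call (requires Daft):
# configure_runner(pick_runner(os.environ))
```

Keeping the choice of runner separate from the call that applies it makes the selection logic easy to check without a Ray cluster available.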

Setting configurations

Configure Daft in various ways during execution.

daft.set_planning_config

Globally sets various configuration parameters which control Daft plan construction behavior.

daft.planning_config_ctx

Context manager that wraps set_planning_config and resets the config to its original setting afterwards.

daft.set_execution_config

Globally sets configuration parameters which control various aspects of Daft execution.

daft.execution_config_ctx

Context manager that wraps set_execution_config and resets the config to its original setting afterwards.

I/O Configurations

Configure behavior when Daft interacts with storage (e.g. credentials, retry policies, and other knobs that control performance and resource usage).

These configurations are most often passed as inputs to Daft DataFrame reading I/O functions, such as those in Dataframe Creation.

daft.io.IOConfig

Create configurations to be used when accessing storage.

daft.io.S3Config

Create configurations to be used when accessing an S3-compatible system.

daft.io.S3Credentials

Create credentials to be used when accessing an S3-compatible system.

daft.io.GCSConfig

Create configurations to be used when accessing Google Cloud Storage.

daft.io.AzureConfig

Create configurations to be used when accessing Azure Blob Storage.
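
To show how these objects fit together with a read function, here is a minimal sketch that builds an `IOConfig` wrapping an `S3Config` and passes it to a reader. The helper names are hypothetical, and the keyword names (`region_name`, `endpoint_url`, `anonymous`, and `io_config` on `daft.read_parquet`) are assumptions that should be checked against your installed Daft version.

```python
def s3_options(region, endpoint_url=None, anonymous=False):
    """Collect keyword arguments for S3Config, omitting unset optional values."""
    options = {"region_name": region, "anonymous": anonymous}
    if endpoint_url is not None:
        options["endpoint_url"] = endpoint_url
    return options


def make_io_config(**s3_kwargs):
    """Build an IOConfig carrying an S3Config (requires Daft)."""
    from daft.io import IOConfig, S3Config  # lazy import keeps s3_options standalone

    return IOConfig(s3=S3Config(**s3_kwargs))


def read_public_parquet(path, region):
    """Read Parquet files from a public bucket using anonymous access."""
    import daft

    io_config = make_io_config(**s3_options(region, anonymous=True))
    # Assumption: read_parquet accepts the configuration as `io_config`.
    return daft.read_parquet(path, io_config=io_config)
```

The same shape would apply to `GCSConfig` and `AzureConfig`: construct the provider-specific config, wrap it in an `IOConfig`, and hand that to the read function.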