daft.io.S3Config
- class daft.io.S3Config(region_name=None, endpoint_url=None, key_id=None, session_token=None, access_key=None, max_connections=None, retry_initial_backoff_ms=None, connect_timeout_ms=None, read_timeout_ms=None, num_tries=None, retry_mode=None, anonymous=None, verify_ssl=None, check_hostname_ssl=None)
Creates a configuration to use when accessing an S3-compatible system.
- Parameters
region_name – Name of the region to use when accessing AWS S3, defaults to “us-east-1”. If the wrong region is provided, Daft will attempt to auto-detect the bucket’s region at the cost of extra S3 requests.
endpoint_url – URL of the S3 endpoint, defaults to the AWS endpoints
key_id – AWS Access Key ID, defaults to auto-detection from the current environment
access_key – AWS Secret Access Key, defaults to auto-detection from the current environment
max_connections – Maximum number of connections to S3 at any time, defaults to 1024
session_token – AWS Session Token, required only if key_id and access_key are temporary credentials
retry_initial_backoff_ms – Initial backoff duration in milliseconds for an S3 retry, defaults to 1000ms
connect_timeout_ms – Time to wait, in milliseconds, when establishing a connection to S3, defaults to 60 seconds (60000 ms)
read_timeout_ms – Time to wait, in milliseconds, for the first byte of a response from S3, defaults to 60 seconds (60000 ms)
num_tries – Number of attempts to make a connection, defaults to 5
retry_mode – Retry mode when a request fails, currently supported values are standard and adaptive
anonymous – Whether or not to use “anonymous mode”, which will access S3 without any credentials
verify_ssl – Whether or not to verify SSL certificates; if set to False, Daft will access S3 without checking whether the certificates are valid. Defaults to True
check_hostname_ssl – Whether or not to verify the hostname when verifying SSL certificates (skipping this check was the legacy behavior for OpenSSL), defaults to True
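The timeout and backoff parameters above are given in milliseconds, while some of their defaults are quoted in seconds, so a small conversion helper can keep call sites readable. The sketch below is illustrative only: `s3_tuning_kwargs` is a hypothetical helper (not part of Daft) that converts second-based settings into the millisecond keyword arguments listed above and rejects retry modes other than the two documented ones.

```python
# Hypothetical helper, not part of Daft: builds keyword arguments for S3Config.
SUPPORTED_RETRY_MODES = ("standard", "adaptive")  # the two values listed in the parameter docs


def s3_tuning_kwargs(connect_timeout_s=60.0, read_timeout_s=60.0,
                     initial_backoff_s=1.0, num_tries=5, retry_mode="standard"):
    """Convert second-based settings into S3Config's millisecond keyword arguments."""
    if retry_mode not in SUPPORTED_RETRY_MODES:
        raise ValueError(f"unsupported retry_mode: {retry_mode!r}")
    return {
        "connect_timeout_ms": round(connect_timeout_s * 1000),
        "read_timeout_ms": round(read_timeout_s * 1000),
        "retry_initial_backoff_ms": round(initial_backoff_s * 1000),
        "num_tries": num_tries,
        "retry_mode": retry_mode,
    }


kwargs = s3_tuning_kwargs(connect_timeout_s=10, read_timeout_s=30, retry_mode="adaptive")
print(kwargs)
```

The resulting dictionary can be unpacked into the constructor, for example `S3Config(**kwargs)`.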
Example
>>> import daft
>>> from daft.io import IOConfig, S3Config
>>> io_config = IOConfig(s3=S3Config(key_id="xxx", access_key="xxx"))
>>> daft.read_parquet("s3://some-path", io_config=io_config)
- __init__()
Methods
__init__()
Attributes
access_key – AWS Secret Access Key
connect_timeout_ms – AWS Connection Timeout in Milliseconds
endpoint_url – S3-compatible endpoint to use
key_id – AWS Access Key ID
max_connections – AWS max connections
num_tries – AWS Number Retries
read_timeout_ms – AWS Read Timeout in Milliseconds
region_name – Region to use when accessing AWS S3
retry_initial_backoff_ms – AWS Retry Initial Backoff Time in Milliseconds
retry_mode – AWS Retry Mode
session_token – AWS Session Token