Daft 0.0.13 Release Notes
The Daft 0.0.13 release fixes some issues with typing and adds new functionality for loading from files on disk. The highlights are:
Improved unified API + User documentation published on www.getdaft.io
Adds support for multi-column DataFrame.sort
DataFrame.explode, which explodes a Python column of iterable objects into multiple rows
DataFrame.from_files, which loads a DataFrame of filepaths and file metadata
@polars_udf added, which works similarly to @udf but provides function inputs as Polars Series instead of Numpy arrays. Polars Series is a more efficient target for casting our underlying Arrow data representation, and it handles NaN vs Null semantics correctly.
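The NaN vs Null distinction mentioned above is easy to conflate. A plain-Python sketch of why keeping them separate matters (illustrative only, not the Daft or Polars API):

```python
import math

# A column with both a missing value (None) and a genuine NaN value.
# Arrow and Polars keep these distinct: None means "no value at all",
# while NaN is a real floating-point value.
column = [1.0, None, float("nan"), 4.0]

# Null count: entries that are missing entirely.
nulls = sum(1 for x in column if x is None)

# NaN count: entries that exist but are not-a-number.
nans = sum(1 for x in column if x is not None and math.isnan(x))

print(nulls, nans)  # prints: 1 1

# A representation with no Null concept (e.g. a plain float array) would
# be forced to encode the missing value as NaN too, collapsing the two
# counts into 0 nulls / 2 NaNs and losing information.
```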
DataFrame.explode explodes a Python column of iterable objects into multiple rows.
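The row-level behaviour of exploding an iterable column can be sketched in plain Python (a sketch of the semantics, not the Daft API itself):

```python
# Each row has a scalar "id" and an iterable "letters" column.
rows = [
    {"id": 1, "letters": ["a", "b"]},
    {"id": 2, "letters": ["c"]},
]

# Exploding "letters" emits one row per element, repeating the
# other columns alongside each element.
exploded = [
    {**row, "letters": item}
    for row in rows
    for item in row["letters"]
]

print(exploded)
# [{'id': 1, 'letters': 'a'}, {'id': 1, 'letters': 'b'}, {'id': 2, 'letters': 'c'}]
```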
DataFrame creation from files
DataFrame.from_files loads a DataFrame of filepaths and file metadata.
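The shape of such a file-listing table can be sketched with the standard library (the column names below are illustrative assumptions, not Daft's actual schema):

```python
import os
import tempfile

# Build a tiny directory of files to scan.
d = tempfile.mkdtemp()
for name in ("a.csv", "b.csv"):
    with open(os.path.join(d, name), "w") as f:
        f.write("x,y\n1,2\n")

# One record per file: the filepath plus basic metadata, similar in
# spirit to what a file-listing DataFrame would hold.
records = [
    {
        "filepath": os.path.join(d, name),
        "size": os.path.getsize(os.path.join(d, name)),
    }
    for name in sorted(os.listdir(d))
]

for r in records:
    print(r["filepath"], r["size"])
```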
DataFrame.sort can now run on multiple columns.
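A multi-column sort is equivalent to ordering by a composite key, as in this plain-Python sketch (illustrative of the semantics, not the Daft API):

```python
rows = [
    {"dept": "b", "salary": 50},
    {"dept": "a", "salary": 70},
    {"dept": "a", "salary": 60},
]

# Sort by dept ascending, then salary descending: the second column
# only breaks ties left by the first.
ordered = sorted(rows, key=lambda r: (r["dept"], -r["salary"]))

print([(r["dept"], r["salary"]) for r in ordered])
# [('a', 70), ('a', 60), ('b', 50)]
```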
Bug fixes
Arrow Negative Slice Bug Fix #229
Fix ExpressionExecutor eval’s dispatching of OperatorEvaluator #227
Fix bug in search sorted when table is empty and has no chunks #224
Fix random spaces appearing in long strings in tables #210
Allow RayRunner to proceed when Ray context has already been initialized #203
Other changes
Refactor UDFs to create properly typed Blocks #232
Downgrade pyarrow for compatibility with Ray Data #221
Read files from storage with DataFrame.from_files #214
DataFrame.explode for splatting sequences of data into rows #208
Use Polars as the user-interface for UDFs #200
Sphinx Documentation on GitHub Pages #186
Selection and configuration of backend (PyRunner vs RayRunner) #178