Daft 0.1.2 Release Notes

The Daft 0.1.2 release features performance improvements, bug fixes, and the first Daft logical types!

New Features

Extension Types for Ray Runner and Embedding Logical Type

This release adds our first “Logical Type”: Embeddings!

An Embedding is a “Logical Type” backed by a Fixed Size List: every value in the column is a list with the same number of elements. Embeddings are common in Machine Learning and AI applications.

See: #929
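
To make the idea of a Fixed Size List concrete, here is a minimal sketch in plain Python. It is not Daft's API: the `FixedSizeListColumn` class and its methods are hypothetical names invented for illustration. It assumes an Arrow-style layout in which all values sit in one flat buffer and row `i` occupies the slice `[i * size, (i + 1) * size)`.

```python
from array import array
from dataclasses import dataclass, field


@dataclass
class FixedSizeListColumn:
    """Hypothetical stand-in for a fixed-size-list column (not Daft's API)."""

    size: int  # number of elements in every row, e.g. the embedding dimension
    values: array = field(default_factory=lambda: array("f"))  # flat float32 buffer

    def append(self, row):
        # Every row must have exactly `size` elements; that is the whole contract.
        if len(row) != self.size:
            raise ValueError(f"expected {self.size} values, got {len(row)}")
        self.values.extend(row)

    def __len__(self):
        # Row count follows from the buffer length; no per-row offsets are stored.
        return len(self.values) // self.size

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        start = i * self.size
        return list(self.values[start:start + self.size])


# Two 3-dimensional "embeddings" stored back to back in one buffer.
column = FixedSizeListColumn(size=3)
column.append([1.0, 0.0, 0.5])
column.append([0.25, 2.0, 4.0])
```

Because the length is fixed, a row can be located by arithmetic alone, which is one reason this layout suits embedding vectors whose dimension is known up front.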

Enhancements

  • Use PyArrow filesystem for tabular file reads #939

  • [I/O] Port to pyarrow filesystems by default. #942

  • Memoize ray.get for batch metadata lookup #937

  • [I/O] Expose user-provided fsspec filesystem arg in read APIs. #931

  • Introduce Logical Arrays and SeriesLike Trait #920

  • [Extension Types] Add support for cross-lang extension types. #899

Bug Fixes

  • Fix concats for extension arrays on old versions of PyArrow #944

Build Changes

  • [ci] enable pyrunner for 310 #946

  • Add Pyarrow 6.0 in matrix for CI testing #945

  • Update requirement of tabulate to >=0.9.0 #940

  • unpin numpy for 3.7 to get dependabot to stop complaining #938

  • Bump slackapi/slack-github-action from 1.23.0 to 1.24.0 #936

  • Bump hypothesis from 6.75.2 to 6.75.3 #928

  • Bump dask from 2023.4.1 to 2023.5.0 #927

  • Bump serde from 1.0.162 to 1.0.163 #921

Documentation

  • Add comment to explain __future__ annotations isort rule in dataframe.py #947

  • [Embedding tutorial] Suggest running on GPU cluster #932

  • Embeddings tutorial #930