Implement Rust core for dataset abstractions #1066

@mdekstrand

We now have a Rust accelerator and already use it for several operations in dataset. It would likely be easier to maintain the dataset abstractions, and possibly more performant in some cases, if a Rust kernel managed the core of Dataset and DatasetBuilder. The Python code would then provide a more convenient interface and wrapper functionality (e.g. conversion to tensor engines), instead of bouncing between Rust and Python for core functionality.

Some of the data processing would also be easier to read if it were written directly in Rust instead of chaining together (sometimes buggy) PyArrow compute kernels.
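To make the proposed split concrete, here is a rough sketch of what a Rust-side core could look like. Everything in it is hypothetical: the fields, the method names (`add_interaction`, `build`, `user_items`, and so on), and the storage layout are illustrative assumptions and do not reflect the existing Dataset or DatasetBuilder. It also leaves out the Python binding layer and any Arrow interop; it only shows the builder and vocabulary logic living in one place in plain Rust.

```rust
use std::collections::BTreeSet;

/// Hypothetical immutable core: sorted ID vocabularies plus interactions
/// stored as (user index, item index) pairs. Not the real Dataset layout.
#[derive(Debug)]
pub struct Dataset {
    users: Vec<String>,
    items: Vec<String>,
    interactions: Vec<(usize, usize)>,
}

impl Dataset {
    pub fn user_count(&self) -> usize {
        self.users.len()
    }

    pub fn item_count(&self) -> usize {
        self.items.len()
    }

    pub fn interaction_count(&self) -> usize {
        self.interactions.len()
    }

    /// Look up an item's index in the sorted vocabulary by binary search.
    pub fn item_index(&self, id: &str) -> Option<usize> {
        self.items
            .binary_search_by(|probe| probe.as_str().cmp(id))
            .ok()
    }

    /// Item IDs a user has interacted with; empty if the user is unknown.
    pub fn user_items(&self, user: &str) -> Vec<&str> {
        let Ok(u) = self.users.binary_search_by(|p| p.as_str().cmp(user)) else {
            return Vec::new();
        };
        self.interactions
            .iter()
            .filter(|(ui, _)| *ui == u)
            .map(|(_, ii)| self.items[*ii].as_str())
            .collect()
    }
}

/// Hypothetical mutable builder that accumulates raw (user, item) rows.
#[derive(Default)]
pub struct DatasetBuilder {
    rows: Vec<(String, String)>,
}

impl DatasetBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn add_interaction(&mut self, user: &str, item: &str) -> &mut Self {
        self.rows.push((user.to_string(), item.to_string()));
        self
    }

    /// Build sorted, de-duplicated vocabularies, then encode each row as an
    /// index pair and drop duplicate interactions.
    pub fn build(&self) -> Dataset {
        let users: Vec<String> = self
            .rows
            .iter()
            .map(|(u, _)| u.clone())
            .collect::<BTreeSet<_>>()
            .into_iter()
            .collect();
        let items: Vec<String> = self
            .rows
            .iter()
            .map(|(_, i)| i.clone())
            .collect::<BTreeSet<_>>()
            .into_iter()
            .collect();
        let mut interactions: Vec<(usize, usize)> = self
            .rows
            .iter()
            .map(|(u, i)| {
                // Every ID came from `rows`, so both lookups succeed.
                (users.binary_search(u).unwrap(), items.binary_search(i).unwrap())
            })
            .collect();
        interactions.sort_unstable();
        interactions.dedup();
        Dataset { users, items, interactions }
    }
}
```

In a layout like this, the Python Dataset and DatasetBuilder classes would hold a handle to the Rust objects and delegate to them, keeping only conveniences such as tensor-engine conversion on the Python side.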
