Arrow PyCapsule Interface support #12530

@wjones127

Description

In the Arrow project, we recently created a new protocol for sharing Arrow data in Python: the Arrow PyCapsule Interface. One of its goals is to allow exporting and importing Arrow data in Python without necessarily using PyArrow as an intermediary. For example, DuckDB can read from Polars DataFrames and LazyFrames, but only if PyArrow is installed. Once this protocol is implemented, that integration would be possible without PyArrow.

The protocol lets Arrow-exportable objects be recognized by the presence of one of several dunder methods.
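As a rough illustration of that duck-typed detection, here is a minimal sketch. The three dunder names come from the Arrow PyCapsule Interface; the helper `arrow_export_kind` and the order in which it checks them are assumptions made for this example, not part of any library.

```python
from typing import Optional

# Dunder names defined by the Arrow PyCapsule Interface.
# The priority order (stream, then array, then schema) is an assumption of this sketch.
_ARROW_DUNDERS = (
    ("__arrow_c_stream__", "stream"),
    ("__arrow_c_array__", "array"),
    ("__arrow_c_schema__", "schema"),
)


def arrow_export_kind(obj: object) -> Optional[str]:
    """Hypothetical helper: report which Arrow export method `obj` offers, if any."""
    for name, kind in _ARROW_DUNDERS:
        if callable(getattr(obj, name, None)):
            return kind
    return None
```

A consumer could call a check like this before deciding whether it needs to fall back to a PyArrow-based conversion.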

Polars could implement this in a few ways:

  • Add Arrow PyCapsule dunder methods to Polars objects
    • That would be: DataFrame, Series, DataType
  • Support Arrow PyCapsules in polars.from_arrow
  • Support Arrow PyCapsules in polars.DataFrame constructor
    • You already support pd.DataFrame, so it would make logical sense to support reading rectangular-shaped Arrow data.
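To make the export and import sides concrete, here is a minimal sketch under stated assumptions. `FrameLike` is a hypothetical stand-in, not the Polars API. It accepts any object exposing `__arrow_c_stream__` (the import side) and exposes the same method itself (the export side). A real implementation would build its own data from the capsule rather than forwarding to the source.

```python
from typing import Any, Optional


class FrameLike:
    """Hypothetical DataFrame-like container used only for illustration."""

    def __init__(self, data: Any) -> None:
        # Import side: accept anything that advertises an Arrow C stream.
        if not callable(getattr(data, "__arrow_c_stream__", None)):
            raise TypeError(
                f"{type(data).__name__} does not export an Arrow C stream"
            )
        self._source = data

    def __arrow_c_stream__(self, requested_schema: Optional[object] = None) -> object:
        # Export side: the protocol expects a PyCapsule to be returned.
        # This sketch simply forwards the request to the wrapped producer.
        return self._source.__arrow_c_stream__(requested_schema)
```

Because the wrapper both consumes and produces the same dunder method, one `FrameLike` can be passed to another without either side importing PyArrow.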

I'd be happy to contribute this to the repo, if these ideas sound good.

Labels: enhancement (new feature or an improvement of an existing feature)