Description
Obstore supports returning an Arrow RecordBatch from each chunk in obstore.list and returning an Arrow Table from obstore.list_with_delimiter.
I would like this to be an obstore implementation detail instead of a requirement of all obspec implementations.
I had hoped that I would be able to remove the return_arrow keyword from obspec's list methods but still allow obstore's implementation to add the return_arrow keyword, as long as it defaults to False and returns a list[ObjectMeta] by default. However, it looks like this doesn't pass pylance:

See what I tried in #14
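
For anyone reading without #14 open, here is a minimal sketch of the general shape of the idea. It is not the code from #14. The names ListProtocol, ArrowCapableStore, ArrowBatchStandIn and total_size are hypothetical, ObjectMeta is reduced to two fields, and chunking/streaming is ignored entirely. It only shows that the pattern works at runtime; whether a given type checker accepts the overloaded method as satisfying the protocol is exactly the open question above.

```python
from __future__ import annotations

from typing import Literal, Protocol, TypedDict, overload


class ObjectMeta(TypedDict):
    """Simplified stand-in for the real ObjectMeta (assumption: two fields only)."""

    path: str
    size: int


class ArrowBatchStandIn:
    """Hypothetical placeholder for an Arrow RecordBatch; not a real Arrow object."""

    def __init__(self, rows: list[ObjectMeta]) -> None:
        self.rows = rows


class ListProtocol(Protocol):
    """Hypothetical obspec-style protocol whose list() has no return_arrow keyword."""

    def list(self, prefix: str | None = None) -> list[ObjectMeta]: ...


class ArrowCapableStore:
    """Hypothetical implementation that adds return_arrow, defaulting to False."""

    def __init__(self, objects: list[ObjectMeta]) -> None:
        self._objects = objects

    @overload
    def list(
        self, prefix: str | None = None, *, return_arrow: Literal[False] = False
    ) -> list[ObjectMeta]: ...
    @overload
    def list(
        self, prefix: str | None = None, *, return_arrow: Literal[True]
    ) -> ArrowBatchStandIn: ...

    def list(
        self, prefix: str | None = None, *, return_arrow: bool = False
    ) -> list[ObjectMeta] | ArrowBatchStandIn:
        # Filter by prefix, then pick the return shape from the keyword.
        matches = [
            o for o in self._objects if prefix is None or o["path"].startswith(prefix)
        ]
        return ArrowBatchStandIn(matches) if return_arrow else matches


def total_size(store: ListProtocol, prefix: str | None = None) -> int:
    """Generic code written only against the protocol (no return_arrow)."""
    return sum(meta["size"] for meta in store.list(prefix))
```

Code written against ListProtocol never sees return_arrow, while callers that know they hold an ArrowCapableStore can still opt in with return_arrow=True.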
That said, since obspec's list is defined in terms of the Arrow PyCapsule Interface, setting return_arrow=True allows for very generic programming. The return type could be a pandas, Polars, DuckDB, pyarrow, nanoarrow, or arro3 object, or anything else that supports the protocol. (There is a wrinkle that list requires something that implements the ArrowArray interface, which I'm not sure pandas or Polars define, since they only have a concept of a multiple-chunked data structure.)
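
To make the wrinkle concrete, here is a rough sketch of how the two flavours of the PyCapsule Interface could be told apart. The dunder names `__arrow_c_array__` and `__arrow_c_stream__` come from the Arrow PyCapsule Interface; everything else (the protocol class names, arrow_export_kind, and the two dummy classes) is hypothetical and only for illustration. The dummies do not produce real capsules.

```python
from __future__ import annotations

from typing import Protocol, runtime_checkable


@runtime_checkable
class ArrowArrayExportable(Protocol):
    """Objects exporting one contiguous array or record batch."""

    def __arrow_c_array__(
        self, requested_schema: object | None = None
    ) -> tuple[object, object]: ...


@runtime_checkable
class ArrowStreamExportable(Protocol):
    """Objects exporting a stream of chunks (e.g. a chunked table)."""

    def __arrow_c_stream__(self, requested_schema: object | None = None) -> object: ...


def arrow_export_kind(obj: object) -> str:
    """Report which PyCapsule export method, if any, an object exposes."""
    if isinstance(obj, ArrowArrayExportable):
        return "array"
    if isinstance(obj, ArrowStreamExportable):
        return "stream"
    return "none"


class DummyBatch:
    """Hypothetical single-chunk object; returns placeholders, not real capsules."""

    def __arrow_c_array__(self, requested_schema=None):
        return (object(), object())


class DummyChunkedTable:
    """Hypothetical multi-chunk object that only offers the stream export."""

    def __arrow_c_stream__(self, requested_schema=None):
        return object()
```

With this, arrow_export_kind(DummyBatch()) gives "array" and arrow_export_kind(DummyChunkedTable()) gives "stream". A return type annotated against the array-style protocol would accept only the first kind, which is the concern for libraries that only model multi-chunk data.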