It would be great if:
- you could select columns when reading from parquet, or, even better, select from the schema hierarchy in general for more deeply structured datasets (see the sketch after this list)
- you allowed reading row-group X from a parquet dataset; this would make it possible to distribute the work across threads or even a cluster. Of course, the reader would need to reveal how many row-groups the file contains
- some to_buffers kind of method existed to expose the internal buffers of an arrow structure, in the order defined in the arrow docs, together with the corresponding from_buffers (see the second sketch below)
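A minimal sketch of what the first two points could look like, borrowing the shape of pyarrow's existing ParquetFile API (num_row_groups, read_row_group, columns=) and a made-up file name purely as a stand-in for whatever API this library ends up exposing:

```python
import pyarrow.parquet as pq

# Open the file lazily; no row-groups are read yet.
pf = pq.ParquetFile("data.parquet")

# The reader exposes how many row-groups the file contains, so the work
# can be sharded across threads or cluster workers, one row-group each.
n = pf.num_row_groups

# Read a single row-group, materialising only the columns we need.
# (The request above goes further: selecting subfields from the schema
# hierarchy, not just top-level columns.)
table = pf.read_row_group(0, columns=["id", "value"])
```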
Doing all of this would essentially answer what is envisaged in dask/fastparquet#931: getting what we really need out of arrow without the cruft. It would interoperate nicely with awkward, for example.
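For the to_buffers/from_buffers point, a rough sketch of the round trip, using pyarrow's Array.buffers() and Array.from_buffers() only to illustrate the shape of the API being asked for:

```python
import pyarrow as pa

arr = pa.array([1, None, 3], type=pa.int64())

# The buffers come back in the order defined by the Arrow spec:
# validity bitmap first, then the values buffer.
bufs = arr.buffers()

# Rebuild an equivalent array from the same buffers, zero-copy.
rebuilt = pa.Array.from_buffers(pa.int64(), len(arr), bufs)
assert rebuilt.equals(arr)
```

awkward's own from_buffers-style constructors work from a similar flat set of buffers, which is where the interop mentioned above would happen.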
Other nice-to-haves (and I realise you wish to keep the scope as small as possible), sketched below:
- parquet filters
- str and dt compute functions
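To make those concrete, this is roughly what the two look like in pyarrow today (the column names year, name and timestamp are made up for the example); a slimmed-down equivalent is what is being suggested:

```python
import pyarrow.compute as pc
import pyarrow.parquet as pq

# Row filtering with predicate push-down: row-groups whose statistics
# cannot match the predicate are skipped entirely.
table = pq.read_table("data.parquet", filters=[("year", ">=", 2020)])

# String kernel: upper-case a utf8 column.
upper = pc.utf8_upper(table["name"])

# Datetime kernel: extract the year from a timestamp column.
years = pc.year(table["timestamp"])
```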