open_parquet + remote server #25

@jbogaardt

Hi,

This may be a usage question rather than a bug, and I apologize if this is not the right place to ask. I am struggling to limit the columns read from a parquet file served by an intake server. Based on the docs and the example notebook, I think this should work:

>>> import intake
>>> import intake_parquet
>>> intake_parquet.__version__, intake.__version__
('0.2.3', '0.6.3')
>>> cat = intake.open_catalog('intake://localhost:5555')
>>> type(cat.big_parquet)
intake.container.dataframe.RemoteDataFrame
>>> pq = intake.open_parquet(cat.big_parquet, columns=['Column 1'])
>>> type(pq)
intake_parquet.source.ParquetSource
>>> pq.read()
...
TypeError: argument of type 'RemoteDataFrame' is not iterable
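
My guess at where the error comes from, as a minimal sketch that does not use intake at all: open_parquet may expect a path string as its first argument and run a membership test on it, which fails for any object that is neither a container nor iterable. The names `FakeRemoteDataFrame`, `has_glob`, and `try_has_glob` below are hypothetical and only illustrate that mechanism; they are not intake's actual code.

```python
class FakeRemoteDataFrame:
    """Hypothetical stand-in for a catalog entry object; not intake's real class."""


def has_glob(urlpath):
    # Hypothetical check of the kind a file-based source might run on a path string.
    return "*" in urlpath


def try_has_glob(urlpath):
    """Return the check's result, or the exception type name if the check fails."""
    try:
        return has_glob(urlpath)
    except TypeError as exc:
        # A membership test on an object without __contains__/__iter__ raises TypeError.
        return type(exc).__name__


print(try_has_glob("data/*.parquet"))       # a plain string works
print(try_has_glob(FakeRemoteDataFrame()))  # an arbitrary object does not
```

If that assumption holds, the first argument would need to be a URL or path string rather than a catalog entry, which is not something I can supply for a source that only exists on the remote server.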

If I read the source directly, without intake.open_parquet, it works fine, but then I have no way to limit the columns:

>>> cat.big_parquet.read()

Is this the expected behavior? Apologies in advance if I missed it in the docs.
