to_pandas in PandasQueryCompiler loses meta information about indices and columns #1726

@dchigarev

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • Modin version (modin.__version__): 0.7.3+198.ge4e3fecb
  • Python version: 3.7.5
  • Code we can use to reproduce:
if __name__ == "__main__":
    import modin.pandas as pd
    import numpy as np

    data = {"A": np.arange(256)}

    index = pd.MultiIndex.from_tuples((i, i*2) for i in np.arange(257)).drop(0)

    md_df = pd.DataFrame(data, index=index)

    print("md_df.index:", len(md_df.index.levels[0]))  # md_df.index: 257 
    print("pd_df.index:", len(md_df._to_pandas().index.levels[0]))  # pd_df.index: 256

Describe the problem

to_pandas loses information about index levels. That information is needed by operations that work with index levels internally (such as unstack).
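For context, here is a minimal sketch in plain pandas (no Modin involved) of the kind of level metadata in question. After `drop`, a MultiIndex keeps the unused entry in `levels`, but an index rebuilt from its values does not. Rebuilding via `from_tuples` is only an illustrative assumption about how such metadata can get lost; it is not a claim about what PandasQueryCompiler does internally.

```python
import pandas as pd  # plain pandas, not modin.pandas

# Same shape of index as in the reproducer: 257 tuples, then drop the first key.
full = pd.MultiIndex.from_tuples([(i, i * 2) for i in range(257)])
index = full.drop(0)

# The dropped key is gone from the rows, but it is still listed in the levels.
n_rows = len(index)
n_levels_kept = len(index.levels[0])

# Hypothetical lossy round trip: rebuild the index from its values only.
rebuilt = pd.MultiIndex.from_tuples(index.tolist())
n_levels_rebuilt = len(rebuilt.levels[0])

# Both indices hold the same values; only the level metadata differs.
same_values = index.equals(rebuilt)

print("rows:", n_rows)
print("levels kept after drop:", n_levels_kept)
print("levels after rebuild:", n_levels_rebuilt)
print("same values:", same_values)
```

The two indices compare equal by value, so the difference only shows up in code that reads `levels` directly.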

Labels

bug 🦗Something isn't working
