Write Support: Aligning Column Order in pyarrow.Table with Iceberg Table Schema for Write Operations #199

Closed as not planned
@HonahX

Description

Feature Request / Improvement

This feature request arises from the discussion in #183 and depends on the write support outlined in #41. Once PR #41 is merged, the following problem arises:

When writing or overwriting data in an Iceberg table, the column order of the pyarrow.Table, or of a Parquet file without field-ids, must match the existing table schema. If the order differs, writing the data to a Parquet file using the Iceberg table's schema, or appending an existing Parquet file to the table, can cause subsequent reads to return incorrect data.

To mitigate this issue, we might want a mechanism that maps the column names in the pyarrow.Table to the Iceberg table schema. This would let us assign the correct field-ids to the respective columns when writing data.
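
A rough sketch of what such a name-based mapping could look like is shown here. It is not the PyIceberg API: `align_to_schema` and the `(field_id, name)` pairs standing in for the Iceberg schema are hypothetical, and the sketch assumes a flat schema in which the incoming columns must match the table's columns exactly.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical stand-in for an Iceberg schema: (field_id, column_name) pairs
# listed in the table's declared column order. Not the PyIceberg API.
SchemaFields = Sequence[Tuple[int, str]]


def align_to_schema(
    incoming_columns: Sequence[str], schema_fields: SchemaFields
) -> Tuple[List[str], Dict[str, int]]:
    """Match incoming columns to the schema by name.

    Returns the column names in schema order and a name -> field-id map.
    Raises ValueError if the two sets of column names differ.
    """
    expected = [name for _, name in schema_fields]
    expected_set = set(expected)
    incoming_set = set(incoming_columns)

    missing = [name for name in expected if name not in incoming_set]
    unexpected = [name for name in incoming_columns if name not in expected_set]
    if missing or unexpected:
        raise ValueError(
            f"column mismatch: missing={missing}, unexpected={unexpected}"
        )

    field_ids = {name: field_id for field_id, name in schema_fields}
    return expected, field_ids


# Example: the incoming data has the right columns in the wrong order.
schema_fields = [(1, "id"), (2, "name"), (3, "ts")]
ordered, field_ids = align_to_schema(["ts", "id", "name"], schema_fields)
```

With a pyarrow.Table in hand, `table.select(ordered)` would then reorder its columns to the schema order before the write, and `field_ids` gives the id to attach to each column.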
