@gdlg gdlg commented Aug 28, 2025

This pull request adds a post-processing step to the schema conversion logic in the converter registry, so that attribute renaming and deletion are handled explicitly after all field-type conversions. The main change is the new AttributeRemapperConverter, which is used internally to finalize attribute names and drop unused attributes so that the output schema matches the target schema. The conversion path-finding logic and the related tests have been updated to reflect this step, and some minor improvements were made to converter implementation details.
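To illustrate what "rename and drop" means as a final step, here is a minimal sketch on plain dictionaries. It is not the AttributeRemapperConverter implementation; the function name `remap_attributes`, the `mapping` argument, and the temporary names such as `tmp_0` are hypothetical and chosen only for this example.

```python
from typing import Any, Mapping


def remap_attributes(row: Mapping[str, Any], mapping: Mapping[str, str]) -> dict[str, Any]:
    """Keep only the attributes listed in `mapping` and rename them.

    `mapping` maps a source attribute name to its final (target) name.
    Any attribute of `row` that is not a key of `mapping` is dropped.
    This is a hypothetical helper, not part of the project's API.
    """
    missing = [src for src in mapping if src not in row]
    if missing:
        raise KeyError(f"missing source attributes: {missing}")
    return {dst: row[src] for src, dst in mapping.items()}


# Intermediate conversion steps may leave temporary names behind;
# the final step renames the wanted ones and discards the rest.
intermediate = {"tmp_0": "image-bytes", "tmp_1": (480, 640), "scratch": None}
final = remap_attributes(intermediate, {"tmp_0": "image", "tmp_1": "image_shape"})
```

Doing the renaming once at the end means the earlier conversion steps can use whatever names are convenient without affecting the final schema.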

Converter registry and path finding improvements

  • Added AttributeRemapperConverter, a special converter that performs attribute renaming and selection (dropping others) as a post-processing step after field-type conversions. This converter is not registered globally but is used internally by the conversion path finder.
  • Updated the A* search and conversion path logic to append an AttributeRemapperConverter when attribute names or the set of attributes do not match the target schema, ensuring that the output schema is correct.
  • Modified _SchemaState equality and hashing to ignore attribute names, focusing only on field types and their properties, since renaming is now handled in post-processing.
  • Improved _get_applicable_converters to simplify the logic and ensure uniqueness of temporary names during conversion steps.
  • Updated the heuristic cost function to ignore attribute names, as they are now fixed in post-processing.
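The idea of comparing search states while ignoring attribute names can be sketched as follows. This is an assumption-laden illustration, not the actual _SchemaState code: the class `SchemaState`, its `fields` layout as `(name, type)` pairs, and the string type labels are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True, eq=False)
class SchemaState:
    """Hypothetical search state: a tuple of (attribute_name, field_type) pairs."""

    fields: tuple

    def _key(self) -> frozenset:
        # Only the multiset of field types matters; names are ignored
        # because a final remapping step can fix them afterwards.
        return frozenset(Counter(ftype for _, ftype in self.fields).items())

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, SchemaState):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self) -> int:
        return hash(self._key())


a = SchemaState((("tmp_0", "image"), ("tmp_1", "shape")))
b = SchemaState((("img", "image"), ("img_shape", "shape")))
c = SchemaState((("img", "image"), ("mask", "mask")))

visited = {a}
```

With this kind of equality, states that differ only by naming collapse into one entry in the visited set, which keeps a search from exploring paths that differ only by renames.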

Converter implementation details

  • Minor changes to image converters to handle shape column renaming consistently, reflecting the new attribute remapping logic.

Testing updates

  • Added AttributeRemapperConverter to the test imports and updated all relevant tests to expect an additional converter in the conversion path, verifying that attribute remapping is performed as a final step.

Summary

How to test

Checklist

  • I have added unit tests to cover my changes.
  • I have added integration tests to cover my changes.
  • I have added the description of my changes into CHANGELOG.
  • I have updated the documentation accordingly.

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below).
# Copyright (C) 2025 Intel Corporation
#
# SPDX-License-Identifier: MIT

@gdlg gdlg requested a review from AlbertvanHouten August 28, 2025 09:04
Signed-off-by: Grégoire Payen de La Garanderie <[email protected]>
@gdlg gdlg merged commit adedb25 into develop Sep 1, 2025
16 checks passed
@gdlg gdlg deleted the gppayend/attribute-renaming branch September 25, 2025 11:43