
Add deduplication pass for initializer tensors #67


Merged

Conversation

AbhishekHerbertSamuel
Contributor

Summary

This PR adds a new graph transformation pass: DeduplicateInitializersPass.

It removes duplicate initializer tensors (typically model weights) based on a unique fingerprint derived from:

  • Tensor byte content (tobytes())
  • Data type (dtype)
  • Shape

All redundant initializers are removed, and nodes referencing them are updated to use the canonical (first-seen) tensor.


Implementation Details

  • Fingerprints are tracked using a dictionary: (tobytes, dtype, shape) → name
  • Redundant initializers are removed using graph.initializers.pop(...)
  • Node inputs are updated via node.replace_input_with(...) for correctness and safety
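
The core of the pass can be sketched with plain dictionaries (an illustrative sketch only — the actual pass operates on onnx_ir Graph objects, and the function and parameter names here are hypothetical):

```python
import numpy as np

def deduplicate_initializers(initializers, node_inputs):
    """Collapse duplicate initializer arrays onto the first-seen name.

    initializers: dict of name -> np.ndarray
    node_inputs: list of per-node input-name lists
    """
    seen = {}   # (bytes, dtype, shape) -> canonical name
    remap = {}  # duplicate name -> canonical name
    for name, arr in list(initializers.items()):
        key = (arr.tobytes(), str(arr.dtype), arr.shape)
        if key in seen:
            remap[name] = seen[key]
            initializers.pop(name)  # drop the redundant tensor
        else:
            seen[key] = name
    # rewrite node inputs to point at the canonical tensors
    new_inputs = [[remap.get(n, n) for n in ins] for ins in node_inputs]
    return initializers, new_inputs
```

Two initializers with identical bytes, dtype, and shape collapse to one entry, and every node input that referenced the duplicate is rewritten to the surviving name.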

Benefits

  • Reduces memory and file size by eliminating duplicated weight tensors
  • Simplifies graph structure for downstream optimization and export

File Added

  • src/onnx_ir/passes/common/deduplicate_initializers.py

Closes

Closes #66

@AbhishekHerbertSamuel force-pushed the add-deduplicate-initializers-pass branch from f99fa0c to ae8f078 on June 5, 2025 at 07:57
Member

@justinchuby left a comment

It’s fine to use AI for contributions. Please ensure, however, that the code actually works.

@AbhishekHerbertSamuel
Contributor Author

Thank you for the feedback, Justin. I'll verify that it works before sending it here.

…bgraph traversal

Address reviewer feedback:
- Optimized memory by grouping by dtype and shape before comparing values
- Used iterate_graph to handle subgraphs
- Validated on normal and subgraph models; deduplication works as expected

Signed-off-by: Abhishek Herbert Samuel <[email protected]>
@AbhishekHerbertSamuel
Contributor Author

AbhishekHerbertSamuel commented Jun 6, 2025

Hi Justin,

Thanks again for your feedback! I've verified that the updated implementation works as intended. Here's the test setup and output (I ran the test locally and didn't push it here):

Local file path for the test: /Users/abhishekherbertsamuel/ir-py/src/test_local_dedup.py

Test code:

import numpy as np
from onnx_ir._core import Graph, Node, Tensor, Value
from onnx_ir.passes.common.deduplicate_initializers import DeduplicateInitializersPass

def test_normal_and_subgraph_dedup():
    print("\n=== TEST: Normal Graph and Subgraph Deduplication ===")

    # Shared tensor content
    arr = np.array([1, 2, 3])
    t1 = Tensor(arr)
    t2 = Tensor(arr.copy())  # clone with same content

    # Main graph values
    v1 = Value(name="w1", const_value=t1)
    v2 = Value(name="w2", const_value=t2)

    # Subgraph has its own separate Value object (same tensor, new graph-safe instance)
    sub_tensor = Tensor(arr.copy())
    sub_val = Value(name="w3", const_value=sub_tensor)

    # Subgraph node and graph
    sub_node = Node("", "Conv", inputs=[sub_val], outputs=[])
    subgraph = Graph(
        inputs=[],
        outputs=[],
        nodes=[sub_node],
        initializers=[sub_val],
        name="subgraph",
    )

    # Main graph node
    main_node = Node("", "Add", inputs=[v1, v2], outputs=[])

    # Attach subgraph manually to the node (mimics nested block structure)
    main_node.blocks = [subgraph]

    # Construct main graph
    parent_graph = Graph(
        inputs=[],
        outputs=[],
        nodes=[main_node],
        initializers=[v1, v2],
        name="main_graph",
    )

    print("Before Deduplication:")
    print("Main Graph Initializers:", list(parent_graph.initializers.keys()))
    print("Main Node inputs:", [v.name for v in main_node.inputs])
    print("Subgraph Initializers:", list(subgraph.initializers.keys()))
    print("Subgraph Node inputs:", [v.name for v in sub_node.inputs])

    # Apply deduplication
    DeduplicateInitializersPass().apply(parent_graph)

    print("\nAfter Deduplication:")
    print("Main Graph Initializers:", list(parent_graph.initializers.keys()))
    print("Main Node inputs:", [v.name for v in main_node.inputs])
    print("Subgraph Initializers:", list(subgraph.initializers.keys()))
    print("Subgraph Node inputs:", [v.name for v in sub_node.inputs])

if __name__ == "__main__":
    test_normal_and_subgraph_dedup()

Test screenshot: [attachment: Screenshot 2025-06-06 at 11:58:10 AM]

If I have missed out on anything, please let me know.

With regards,
Abhishek Herbert Samuel

@AbhishekHerbertSamuel
Contributor Author

Hi @justinchuby,

I've pushed the finalized implementation and test as separate, signed commits. The following have been addressed:

DeduplicateInitializersPass: Added under passes/common, follows repo conventions, uses (dtype, shape) → {tobytes: name} grouping for memory efficiency, and traverses all subgraphs via RecursiveGraphIterator.

Test coverage: A dedicated unittest verifies correct deduplication in the main graph and ensures subgraphs remain isolated.

Coding standards: Followed the structure and documentation style of other passes (e.g., topological_sort.py).

Commit signed: Used -s with a clean message summarizing the functionality.

I have also attached a screenshot of the unit test which passed successfully on my local copy of this repository.

Please let me know if any final changes are needed. Thanks again for your guidance and mentorship throughout this PR!

Best,
Abhishek Herbert Samuel

@justinchuby
Member

Please feel free to ask questions when you are going through the code base or need help understanding parts of the code. It would be helpful to take a look at other existing passes and usages to ensure they are implemented in a similar style.

@justinchuby
Member

My concern with this pass in particular is that we are using the full bytes in the lookup table, which is memory-intensive. I wonder if there is a good (efficient) hash method that can be applied to the bytes content, so we can use the hash value in the lookup table. Only when the hash matches would we compare the actual bytes.

@AbhishekHerbertSamuel
Contributor Author

Hi @justinchuby,
Thanks a lot for your detailed feedback :)

I’ll update the class to inherit from ir.passes.InPlacePass as suggested and move the main logic into the call method, following the repo’s conventions (like in constant_manipulation.py).
I’ll also change the test imports to follow the module-only import guideline — thanks for pointing me to the correct example!

Regarding the memory concern:
You're absolutely right — using tobytes() directly is memory-intensive. I’ll switch to using sha256 to hash the tensor bytes first, which helps group potential duplicates quickly. Then, to avoid any risk of false positives from rare hash collisions, I’ll still compare the full bytes only when the hashes match. This keeps things memory-efficient while still being safe and accurate. Thanks again for the suggestion!
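
The hash-then-verify scheme described above can be sketched as follows (hypothetical standalone code over numpy arrays, not the pass's actual implementation):

```python
import hashlib
import numpy as np

def find_duplicates(initializers):
    """Map each duplicate initializer name to its canonical (first-seen) name.

    Buckets tensors by (dtype, shape, sha256 digest) so the lookup table holds
    a 32-byte digest instead of the full tensor bytes; raw bytes are compared
    only when digests collide, guarding against false positives.
    """
    buckets = {}  # (dtype, shape, digest) -> list of (name, bytes)
    remap = {}    # duplicate name -> canonical name
    for name, arr in initializers.items():
        data = arr.tobytes()
        digest = hashlib.sha256(data).digest()
        key = (str(arr.dtype), arr.shape, digest)
        for other_name, other_data in buckets.setdefault(key, []):
            if data == other_data:  # confirm byte-for-byte equality
                remap[name] = other_name
                break
        else:
            buckets[key].append((name, data))
    return remap
```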

Will push the changes shortly. Please let me know if I missed anything else. Appreciate your guidance!

Warm regards,
Abhishek Herbert Samuel

- Implemented DeduplicateInitializersPass to remove redundant initializers
  with identical shape, dtype, and values within individual graphs.
- Ensured deduplication is confined to the same graph scope (no cross-subgraph merging).
- Added unit tests covering:
  - Exact duplicates
  - Different shapes/dtypes
  - Scalars
  - Multiple duplicates
  - Non-deduplicable distinct values
- Removed subgraph-related tests due to ONNX serialization behavior omitting their initializers.

Signed-off-by: Abhishek Herbert Samuel <[email protected]>
@AbhishekHerbertSamuel
Contributor Author

Hi @justinchuby,
I've pushed the finalized version of DeduplicateInitializersPass along with a focused set of unit tests. The current tests comprehensively validate deduplication behavior across various scenarios—shape, dtype, scalar, and value uniqueness.

Tests involving subgraph initializers were removed, as ONNX drops those during serialization, making them unreliable to assert against. Let me know if you'd like a different strategy for subgraph coverage.

Thanks again for your guidance throughout!

Warm regards,
Abhishek Herbert Samuel


codecov bot commented Jun 9, 2025

Codecov Report

Attention: Patch coverage is 84.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 74.44%. Comparing base (d41327e) to head (a039526).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
...onnx_ir/passes/common/initializer_deduplication.py 84.00% 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #67      +/-   ##
==========================================
+ Coverage   74.39%   74.44%   +0.05%     
==========================================
  Files          37       38       +1     
  Lines        4648     4673      +25     
  Branches      950      954       +4     
==========================================
+ Hits         3458     3479      +21     
- Misses        839      841       +2     
- Partials      351      353       +2     


@justinchuby self-assigned this Jun 9, 2025
@AbhishekHerbertSamuel force-pushed the add-deduplicate-initializers-pass branch from a00be10 to 6b3e0b7 on June 11, 2025 at 08:40
@AbhishekHerbertSamuel
Contributor Author

Sure @justinchuby, will fix it and maintain code consistency :)

@justinchuby requested a review from Copilot June 13, 2025 03:08
Copilot

This comment was marked as outdated.

@AbhishekHerbertSamuel
Contributor Author

Thank you @xadupre @inisis @justinchuby for the feedback. Will make the requested changes and ensure that the PR is ready to be merged.

…nd size limit

- Avoids comparing large tensors >1024 elements to reduce performance overhead
- Compares shape and dtype before accessing tensor content
- Adds test coverage for subgraph deduplication (If node branches)
- Passes all linters: ruff, mypy, editorconfig

Signed-off-by: Abhishek Herbert Samuel <[email protected]>
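
The size-limit and ordering ideas in this commit can be sketched as a fingerprint helper (illustrative only; the constant and function names are hypothetical):

```python
import numpy as np

MAX_ELEMENTS = 1024  # skip tensors larger than this to bound the cost of dedup

def fingerprint(arr):
    """Return a dedup key for a small tensor, or None for one too large.

    Checks the cheap metadata (element count, dtype, shape) before touching
    the tensor's bytes, so large tensors are never serialized at all.
    """
    if arr.size > MAX_ELEMENTS:
        return None  # too large: leave it out of the dedup table
    return (str(arr.dtype), arr.shape, arr.tobytes())
```

Two small tensors deduplicate iff their fingerprints compare equal; tensors above the limit are simply left alone.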
@AbhishekHerbertSamuel
Contributor Author

@xadupre @justinchuby @inisis I have made the requested changes. Please check and let me know if it's ready for merging or if other changes need to be made prior to that. Thank you once again :)

@AbhishekHerbertSamuel
Contributor Author

Hi @justinchuby, is the code I submitted fine? Please let me know if there are any issues so that I can resolve them. As of now, 20/21 checks have passed (with 1 skipped).

@justinchuby
Member

Will take a look soon, thanks!

@justinchuby
Member

Thanks for your contribution. I updated your code to simplify some of the logic and moved to a simple byte comparison for now, because we have a small enough size limit.

@justinchuby changed the title Add deduplication pass for initializer tensors (#66) → Add deduplication pass for initializer tensors Jun 19, 2025
@justinchuby requested a review from Copilot June 19, 2025 20:22
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This pull request adds a new graph transformation pass to deduplicate initializer tensors based on their content, data type, and shape.

  • Introduces DeduplicateInitializersPass to remove redundant initializer tensors.
  • Adds unit tests to verify deduplication behavior across various scenarios.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
src/onnx_ir/passes/common/initializer_deduplication_test.py Adds unit tests for deduplication behavior
src/onnx_ir/passes/common/initializer_deduplication.py Implements the deduplication pass for initializer tensors

@justinchuby merged commit d8fa011 into onnx:main Jun 19, 2025
21 checks passed
@AbhishekHerbertSamuel
Contributor Author

@justinchuby thank you for the mentorship and support throughout this PR. This was my first time contributing to an open-source repository and I learnt a lot through this process. @xadupre @inisis @titaiwangms thank you for the constructive suggestions on this PR and related PRs (#98, #99), which helped bring this to completion. Looking forward to learning and building more in the ONNX community :)

Warm regards,
Abhishek Herbert Samuel


Successfully merging this pull request may close these issues.

Create a tensor de-duplication pass
4 participants