Adding connected_components() Method to Data and HeteroData for extracting connected components #10388
wsad1
left a comment
Left some comments.
Will take another look this week.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #10388 +/- ##
==========================================
- Coverage 86.11% 85.97% -0.14%
==========================================
Files 496 502 +6
Lines 33655 35207 +1552
==========================================
+ Hits 28981 30269 +1288
- Misses 4674 4938 +264
Refactor hetero data connected components tests for clarity and conciseness.
wsad1
left a comment
Thanks for the work, @jesseangelis.
Summary
This PR adds a new method, connected_components(), to both the HeteroData and Data classes in PyTorch Geometric. It identifies and extracts disjoint connected components from a (heterogeneous or homogeneous) graph using a union-find algorithm.
Motivation
While PyG provides convenient utilities such as subgraph, edge_subgraph, and node_type_subgraph, it does not currently offer a built-in method to extract connected components from graphs. This functionality could be useful for:
This PR provides a general-purpose implementation that complements existing subgraph utilities.
API
List[Data] or List[HeteroData], with each item corresponding to a connected component.
node_type_subgraph or edge_type_subgraph if they wish to limit the types involved.
Highlights
HeteroData
Discussion Points
Based on feedback from the original issue, this version avoids introducing new arguments like allowed_edge_types or allowed_node_types and instead suggests combining the new method with existing filtering utilities.
Happy to further extend or refine the method based on team preferences!
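For readers unfamiliar with the union-find approach named in the Summary, here is a minimal plain-Python sketch of how it groups nodes into connected components. It is not the code from this PR: the function name connected_components_sketch, its arguments, and the edge-list input are hypothetical and chosen only for illustration. It also assumes an undirected, homogeneous graph whose nodes are numbered 0 to num_nodes - 1.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def connected_components_sketch(
    num_nodes: int, edges: Sequence[Tuple[int, int]]
) -> List[List[int]]:
    """Group node indices into connected components using union-find.

    Hypothetical helper for illustration; not part of PyTorch Geometric.
    Edges are treated as undirected.
    """
    parent = list(range(num_nodes))  # each node starts as its own root
    size = [1] * num_nodes           # tree sizes, used for union by size

    def find(x: int) -> int:
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a: int, b: int) -> None:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return  # already in the same component
        # Attach the smaller tree under the larger one.
        if size[root_a] < size[root_b]:
            root_a, root_b = root_b, root_a
        parent[root_b] = root_a
        size[root_a] += size[root_b]

    for src, dst in edges:
        union(src, dst)

    # Collect the nodes that share a root into one component.
    groups: Dict[int, List[int]] = defaultdict(list)
    for node in range(num_nodes):
        groups[find(node)].append(node)
    return sorted(groups.values())


components = connected_components_sketch(6, [(0, 1), (1, 2), (3, 4)])
print(components)  # [[0, 1, 2], [3, 4], [5]]
```

Isolated nodes (node 5 above) come out as single-node components. Applying the same idea to a heterogeneous graph would presumably need node indices of different types mapped into one shared index space first; that step is an assumption here and is not shown.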