Summary
Add a validate_graph_components() function that validates the structural consistency of constructed networkx.DiGraph graph components before they are serialized to disk. This catches construction bugs at the source rather than downstream in neural-lam.
Problem
Currently, graph construction bugs are only discovered when:
to_pyg() tries to serialize the graph and hits unexpected shapes/missing attributes
neural-lam's load_graph() fails with cryptic FileNotFoundError or TypeError (as documented in the format mismatch audit in neural-lam#339)
Or worse — they pass silently and produce wrong training results
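Recent examples of bugs that would have been caught earlier:
connect_nodes_across_graphs: grid nodes mapped to wrong mesh nodes
#93: nx_draw_with_pos_and_attr silently overwrites custom edge_cmap due to wrong key check
#95: node_features_values returns a list of tensors instead of a tensor when list_from_attribute is None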
With multiple mesh_layout variants being added (rectilinear in #81, triangular in #92, icosahedral in #76, prebuilt in #91), the surface area for construction bugs is growing. We need a systematic way to validate graph components.
Relationship to existing work
neural-lam#323 validates the on-disk .pt tensor format after export. This proposal validates the NetworkX DiGraph objects before export — they are complementary.
#42 ("Assert that all nodes are in g2m") is a specific instance of one check. This proposal generalizes it into a reusable validation framework alongside other checks.
neural-lam#339 (the wmg↔neural-lam bridge RFC) identifies that format mismatches between the two repos are a key pain point. Catching structural issues before serialization would prevent many of these mismatches from occurring.
Checks to implement
Bidirectional grid coverage (generalizes #42)
Every grid node that appears in g2m should also appear in m2g (or be explicitly marked as boundary-only). Flags grid nodes that are "encoded to mesh" but never "decoded from mesh", which would cause silent data loss.
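A minimal sketch of this check, assuming g2m edges point from grid to mesh, m2g edges point from mesh to grid, and the caller supplies the set of grid node identifiers. The helper name find_undecoded_grid_nodes and its boundary_only parameter are hypothetical and not part of weather-model-graphs.

```python
import networkx as nx


def find_undecoded_grid_nodes(g2m, m2g, grid_nodes, boundary_only=frozenset()):
    """Return grid nodes that are encoded (g2m source) but never decoded (m2g target)."""
    encoded = {u for u, _ in g2m.edges() if u in grid_nodes}
    decoded = {v for _, v in m2g.edges() if v in grid_nodes}
    # Nodes explicitly marked as boundary-only are allowed to lack a decode edge.
    return encoded - decoded - set(boundary_only)


g2m = nx.DiGraph([("grid0", "mesh0"), ("grid1", "mesh0")])
m2g = nx.DiGraph([("mesh0", "grid0")])
missing = find_undecoded_grid_nodes(g2m, m2g, {"grid0", "grid1"})
```

Here grid1 is sent to the mesh but never receives a decoded value, so it is the only node reported.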
Edge feature consistency
All edges within a component have the same set of attributes (len, vdiff, component, etc.) and consistent dimensions. For example, vdiff should be 2D for projected CRS and 3D for geographic CRS — not mixed within a single component.
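One way this check could look, as a sketch: take the first edge as the reference and compare every other edge's attribute names and value shapes against it. The function name check_edge_features is an assumption, and the sketch does not know the declared CRS, so it only detects mixed dimensions within a component.

```python
import networkx as nx
import numpy as np


def check_edge_features(graph):
    """Return a list of problems; an empty list means the edges are consistent."""
    problems = []
    ref_keys = None   # attribute names seen on the first edge
    ref_shapes = {}   # attribute name -> shape seen first
    for u, v, attrs in graph.edges(data=True):
        keys = set(attrs)
        if ref_keys is None:
            ref_keys = keys
        elif keys != ref_keys:
            problems.append(
                f"edge ({u}, {v}): attributes {sorted(keys)} != {sorted(ref_keys)}"
            )
        for name, value in attrs.items():
            shape = np.shape(value)  # scalars and strings have shape ()
            if ref_shapes.setdefault(name, shape) != shape:
                problems.append(
                    f"edge ({u}, {v}): {name} has shape {shape}, "
                    f"expected {ref_shapes[name]}"
                )
    return problems


g = nx.DiGraph()
g.add_edge(0, 1, len=1.0, vdiff=np.array([1.0, 0.0]))
g.add_edge(1, 2, len=1.0, vdiff=np.array([0.0, 1.0, 0.0]))  # 3D mixed with 2D
problems = check_edge_features(g)
```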
Mesh hierarchy completeness (hierarchical only)
For hierarchical graphs: every level-L mesh node has at least one up edge to level L+1 and one down edge from L+1. No orphan mesh nodes at any level.
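A sketch of the orphan check, under the assumption that every mesh node carries an integer "level" attribute and that up and down edges live in the same DiGraph. Both the attribute name and the helper find_orphan_mesh_nodes are illustrative and may not match how the library stores hierarchy information.

```python
import networkx as nx


def find_orphan_mesh_nodes(graph):
    """Return non-top-level nodes lacking an up edge to, or a down edge from, level L+1."""
    levels = nx.get_node_attributes(graph, "level")
    if not levels:
        return []
    top = max(levels.values())
    orphans = []
    for node, level in levels.items():
        if level == top:
            continue  # the top level has no level above it
        has_up = any(levels.get(v) == level + 1 for v in graph.successors(node))
        has_down = any(levels.get(u) == level + 1 for u in graph.predecessors(node))
        if not (has_up and has_down):
            orphans.append(node)
    return orphans


g = nx.DiGraph()
g.add_node("a", level=0)
g.add_node("b", level=0)
g.add_node("c", level=1)
g.add_edges_from([("a", "c"), ("c", "a"), ("b", "c")])  # "b" has no down edge
orphans = find_orphan_mesh_nodes(g)
```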
No degenerate structures
No self-loops (edge from a node to itself)
No empty components (a component with zero edges)
No disconnected subgraphs within a single component (e.g., mesh nodes that form isolated clusters)
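The three conditions above map fairly directly onto existing networkx calls. The sketch below interprets "disconnected" as "not weakly connected", which is an assumption; the function name find_degenerate_structures is hypothetical.

```python
import networkx as nx


def find_degenerate_structures(graph):
    """Return a list of problems found in a single graph component."""
    problems = []
    self_loops = list(nx.selfloop_edges(graph))
    if self_loops:
        problems.append(f"{len(self_loops)} self-loop(s)")
    if graph.number_of_edges() == 0:
        problems.append("component has no edges")
    # Weak connectivity ignores edge direction when grouping nodes.
    n_parts = nx.number_weakly_connected_components(graph)
    if n_parts > 1:
        problems.append(f"{n_parts} disconnected subgraphs")
    return problems


g = nx.DiGraph([(0, 1), (1, 0), (2, 2), (3, 4)])
problems = find_degenerate_structures(g)
```

In this example, node 2 only has a self-loop and nodes 3 and 4 form an isolated cluster, so both the self-loop check and the connectivity check report a problem.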
Coordinate sanity
All node pos attributes are finite (no NaN, no inf), and within reasonable bounds for the declared CRS (e.g., latitude in [-90, 90] for geographic).
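A possible shape for this check, assuming that for a geographic CRS pos is stored as (longitude, latitude) in degrees. That ordering, the geographic flag, and the name find_bad_positions are assumptions made for illustration only.

```python
import networkx as nx
import numpy as np


def find_bad_positions(graph, geographic=False):
    """Return (node, reason) pairs for missing, non-finite, or out-of-range pos values."""
    bad = []
    for node, pos in graph.nodes(data="pos"):
        if pos is None:
            bad.append((node, "missing pos"))
            continue
        arr = np.asarray(pos, dtype=float)
        if not np.all(np.isfinite(arr)):
            bad.append((node, "non-finite pos"))
        elif geographic and not -90.0 <= arr[1] <= 90.0:
            # Assumes index 1 holds latitude in degrees.
            bad.append((node, "latitude out of range"))
    return bad


g = nx.DiGraph()
g.add_node("a", pos=(10.0, 55.0))
g.add_node("b", pos=(float("nan"), 0.0))
g.add_node("c", pos=(10.0, 95.0))
bad = find_bad_positions(g, geographic=True)
```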
Component labeling consistency
Edge component attributes match the expected values (g2m, m2m, m2g) and are uniform within each subgraph.
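As a sketch, the labeling check could collect the set of component values on a subgraph's edges and report anything unexpected or mixed. The expected label set comes from the text above; the helper name check_component_labels is hypothetical.

```python
import networkx as nx

EXPECTED_COMPONENTS = {"g2m", "m2m", "m2g"}


def check_component_labels(graph):
    """Return a list of problems with the edges' "component" attribute."""
    labels = {attrs.get("component") for _, _, attrs in graph.edges(data=True)}
    problems = []
    unexpected = labels - EXPECTED_COMPONENTS
    if unexpected:
        # key=str keeps sorting safe when a label is missing (None).
        problems.append(f"unexpected labels: {sorted(unexpected, key=str)}")
    if len(labels) > 1:
        problems.append(f"mixed labels: {sorted(labels, key=str)}")
    return problems


g = nx.DiGraph()
g.add_edge(0, 1, component="g2m")
g.add_edge(1, 2, component="m2m")  # different label in the same subgraph
problems = check_component_labels(g)
```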
Usage in CI / tests
Besides running in CI and tests, validation is also valuable for mesh_layout="prebuilt" (#91) graphs, where users provide their own mesh: we can verify that it meets structural requirements before attempting serialization.
Implementation plan
Add weather_model_graphs/validation.py with validate_graph_components() and individual check functions
Add corresponding tests in tests/test_validation.py
Optionally integrate as an automatic step in to_pyg() / to_neural_lam() (with a validate=True flag)
Add a brief section to documentation
No new dependencies are required: the validation operates purely on networkx.DiGraph objects using existing networkx APIs.
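To make the plan concrete, here is one way the entry point might be structured: a list of small check functions, each returning messages, aggregated over named components. This is only a sketch under stated assumptions. The signature of validate_graph_components(), the raise_on_error parameter, the GraphValidationError class, and the two inline checks are hypothetical and do not reflect the final API.

```python
import networkx as nx


class GraphValidationError(ValueError):
    """Hypothetical error raised when any check reports a problem."""


def check_not_empty(graph):
    return [] if graph.number_of_edges() else ["component has no edges"]


def check_no_self_loops(graph):
    return [f"self-loop on node {u}" for u, _ in nx.selfloop_edges(graph)]


# Further checks would be appended here as they are implemented.
CHECKS = (check_not_empty, check_no_self_loops)


def validate_graph_components(components, raise_on_error=True):
    """Run every check on every named component and collect the messages."""
    problems = [
        f"{name}: {message}"
        for name, graph in components.items()
        for check in CHECKS
        for message in check(graph)
    ]
    if problems and raise_on_error:
        raise GraphValidationError("\n".join(problems))
    return problems


components = {"g2m": nx.DiGraph([("grid0", "mesh0")]), "m2m": nx.DiGraph()}
problems = validate_graph_components(components, raise_on_error=False)
```

Returning the collected messages when raise_on_error is False lets a test suite assert on specific problems, while the raising mode would suit an optional validate=True step before export.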