feat: add icosahedral mesh layout with grid-to-mesh connectivity #76

Open
yuvraajnarula wants to merge 61 commits into mllam:main from yuvraajnarula:feature/icosahedral-connectivity

Conversation

@yuvraajnarula yuvraajnarula commented Feb 28, 2026

Describe your changes

This PR adds mesh_layout='icosahedral' to support global graph generation in WMG, addressing the architectural gap identified in #71.

Key additions:

  • create/mesh/layouts/icosahedral.py - Implements the icosahedral mesh hierarchy using trimesh (co-developed with @mandeepsingh2007)
  • Grid-to-mesh (g2m) radius-based connectivity adapted from neural-lam's prob_model_global branch (lines 224-242)
  • Notebook demo in /docs/icosahedral_global_graph.ipynb showing end-to-end pipeline
  • Helper utilities for spherical coordinate conversion (lat/lon ↔ cartesian)
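For reference, the spherical coordinate helpers can be sketched roughly as follows (function names here are illustrative, not necessarily the ones used in the PR):

```python
import numpy as np

def latlon_to_cartesian(lat, lon, radius=1.0):
    """Convert lat/lon in degrees to 3D Cartesian points on a sphere."""
    lat_r, lon_r = np.radians(lat), np.radians(lon)
    x = radius * np.cos(lat_r) * np.cos(lon_r)
    y = radius * np.cos(lat_r) * np.sin(lon_r)
    z = radius * np.sin(lat_r)
    return np.stack([x, y, z], axis=-1)

def cartesian_to_latlon(xyz):
    """Inverse mapping: 3D Cartesian points back to lat/lon in degrees."""
    x, y, z = xyz[..., 0], xyz[..., 1], xyz[..., 2]
    r = np.sqrt(x**2 + y**2 + z**2)
    lat = np.degrees(np.arcsin(z / r))
    lon = np.degrees(np.arctan2(y, x))
    return lat, lon
```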

Motivation:
Currently, WMG's mesh generation assumes rectilinear grids, which breaks for global domains (Issue #71). This PR decouples mesh layout from connectivity by introducing:

  • mesh_layout='icosahedral' for quasi-uniform spherical sampling
  • g2m connections using haversine-ready distance calculations
  • Foundation for future spherical layouts (HEALPix, etc.)
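As a rough sketch of the radius-based g2m idea, assuming both grid and mesh positions are already 3D Cartesian points on the unit sphere (`g2m_radius_edges` is a hypothetical name, not the PR's actual API):

```python
import numpy as np
from scipy.spatial import KDTree

def g2m_radius_edges(grid_xyz, mesh_xyz, radius):
    """Connect each grid point to every mesh node within a chord radius.

    Distances are 3D Euclidean (chord lengths), which avoids lat/lon
    distortion; the chord can later be converted to a great-circle
    arc length if needed.
    """
    tree = KDTree(mesh_xyz)
    # For each grid point, indices of all mesh nodes within `radius`
    neighbours = tree.query_ball_point(grid_xyz, r=radius)
    edges = [(g, m) for g, ms in enumerate(neighbours) for m in ms]
    return np.array(edges)
```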

Dependencies:

  • trimesh - Lightweight mesh generation (avoiding GraphCast dependency)
  • numpy, scipy (already in WMG dependencies)

Issue Link

#71

Type of change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📖 Documentation (Addition or improvements to documentation)

Checklist before requesting a review

  • My branch is up-to-date with the target branch - if not update your fork with the changes from the target branch (use pull with --rebase option if possible).
  • I have performed a self-review of my code
  • For any new/modified functions/classes I have added docstrings that clearly describe its purpose, expected inputs and returned values
  • I have placed in-line comments to clarify the intent of any hard-to-understand passages of my code
  • I have updated the documentation to cover introduced code changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have given the PR a name that clearly describes the change, written in imperative form (context).
  • I have requested a reviewer and an assignee (assignee is responsible for merging)

Checklist for reviewers

Each PR comes with its own improvements and flaws. The reviewer should check the following:

  • the code is readable
  • the code is well tested
  • the code is documented (including return types and parameters)
  • the code is easy to maintain

Author checklist after completed review

  • I have added a line to the CHANGELOG describing this change, in a section
    reflecting type of change (add section where missing):
    • added: when you have added new functionality
    • changed: when default behaviour of the code has been changed
    • fixes: when your contribution fixes a bug

Checklist for assignee

  • PR is up to date with the base branch
  • the tests pass
  • author has added an entry to the changelog (and designated the change as added, changed or fixed)
  • Once the PR is ready to be merged, squash commits and merge the PR.

@mandeepsingh2007

@yuvraajnarula Since we are collaborating on this PR, could you please invite me as a Collaborator to your fork of weather-model-graphs?

This way, I can directly commit and push the trimesh icosahedral mesh generation script to the PR branch myself. It will make it much easier for both of us to iterate on the code without having to send files back and forth!

@yuvraajnarula
Author

@Raj-Taware Great point about the BallTree dependency. The 3D projection + KDTree approach is cleaner and matches what we're already doing for g2m connections.

One question:
The len attribute calculation you mentioned (2*arcsin(euclid/2)) - does this work consistently for points on the unit sphere?

@Raj-Taware

> @Raj-Taware Great point about the BallTree dependency. The 3D projection + KDTree approach is cleaner and matches what we're already doing for g2m connections.
>
> One question: The len attribute calculation you mentioned (2*arcsin(euclid/2)) - does this work consistently for points on the unit sphere?

Hey! Great work getting trimesh up and running.

Yes, the formula works as long as both coordinates are projected onto a unit sphere.

If you want to generalise it to a sphere of radius R, the formula becomes
d = 2*R*arcsin(C/(2*R))

But since the create_icosahedral_mesh function normalises the vertices by their norms, you are already creating unit vectors, and generate_icosahedral_mesh also sets the radius to 1.0, so I think the original formula can be used.

By the way, when using arcsin I would recommend clipping the input to handle floating-point rounding errors. Sometimes KDTree might return a distance like 2.00000000000004, which will cause np.arcsin(2.00000000000004/2) = np.arcsin(1.00000000000002) to produce a NaN with a warning.

Hope this helps !
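A minimal sketch of the clipped chord-to-arc conversion suggested above (helper name hypothetical):

```python
import numpy as np

def chord_to_arc(chord, radius=1.0):
    """Great-circle arc length from a 3D chord length on a sphere."""
    # Clip guards against floating-point chords slightly above 2R,
    # which would push arcsin's argument past 1 and yield NaN.
    x = np.clip(chord / (2.0 * radius), -1.0, 1.0)
    return 2.0 * radius * np.arcsin(x)
```

With radius=1.0 this reduces to the original 2*arcsin(euclid/2) formula, but stays finite for inputs like 2.00000000000004.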

@yuvraajnarula
Author

Tracking

Note : Core icosahedral work now moved to dedicated issue #77. This PR focuses on connectivity primitives that will feed into that effort.

@yuvraajnarula
Author

@mandeepsingh2007 Now that the PR is taking shape, wanted to check in – how does the implementation look from your end?

Specifically curious if:

  • The mesh generation integration works smoothly with your trimesh code
  • Any adjustments you'd recommend on my end (g2m, dispatch logic, etc.)
  • Anything you think we should add/change before marking ready for review

Your mesh work is the foundation here, so your feedback matters. Thanks again for collaborating on this!

@mandeepsingh2007

@yuvraajnarula thanks for putting all this together! The implementation is definitely taking great shape.

To answer your points:

Mesh Integration: The trimesh base generation looks smooth and is establishing the correct spherical topology as expected.

g2m & Dispatch Logic: The current logic works perfectly for establishing our baseline. However, keeping Joel's recent Slack note about networkx bottlenecks in mind, we should note that scaling this to high-resolution global grids will eventually require us to shift toward fully vectorized bulk operations (to avoid iterative graph-building overhead). But for the scope of this initial PR, your current setup is spot on.

Ready for review? I think it’s good to go! Keeping this PR tightly scoped to just establishing the lightweight icosahedral layout (without GraphCast dependencies) is the perfect first step. We can tackle the hierarchical scaling and tensor optimizations in subsequent issues.

```python
dlat = source_pos_2d[0] - target_pos_2d[0]
dlon = source_pos_2d[1] - target_pos_2d[1]
dlon = (dlon + 180) % 360 - 180  # wrap longitude difference into [-180, 180)
G_connect.edges[source_node, target_node]["vdiff"] = np.array([dlat, dlon])
```
Contributor


One thing worth flagging here. find_containing_triangle correctly computes and returns the barycentric weights, but they don't seem to be making it onto the edges in connect_nodes_across_graphs:

G_connect.edges[source_node, target_node]["vdiff"] = np.array([dlat, dlon])
# weights returned by find_containing_triangle not stored here

Storing them as a weights edge attribute would let to_pyg include them in the edge features alongside len and vdiff, which gives the model a geometric prior on each mesh node's contribution to a grid point rather than having to learn it purely from data. That's kind of the nice thing about containing_triangle over nearest_neighbours so would be good to preserve it.

G_connect.edges[source_node, target_node]["weights"] = barycentric_weight_for_this_node

That said, we could skip the weights entirely since the current LAM model operates on a learned weight basis anyway. But if we are going through the effort of computing them it seems like a shame not to pass them through to the model as well.
Might just be an oversight, wanted to flag it!
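For illustration, the barycentric weights of a point with respect to a triangle can be obtained from a small least-squares solve; this is only a sketch of the kind of quantity `find_containing_triangle` returns, not the PR's actual implementation:

```python
import numpy as np

def barycentric_weights(point, tri_vertices):
    """Weights w with sum(w) == 1 such that w @ tri_vertices ~= point.

    tri_vertices is a (3, 3) array of the triangle's corner positions.
    Solved as a 4x3 least-squares system: three coordinate equations
    plus the sum-to-one constraint.
    """
    A = np.vstack([tri_vertices.T, np.ones(3)])  # xyz rows + constraint row
    b = np.append(point, 1.0)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w
```

Once computed, the weight for each mesh node could then be stored on the corresponding g2m edge (e.g. as a `weights` attribute) so `to_pyg` can export it alongside len and vdiff.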

Author


@Joltsy10 Good flag
I'll add them alongside len and vdiff in the containing_triangle block. The model can then choose to use them or not, but at least the information isn't lost.

Appreciate you catching these edge cases.

Member


Great idea to add these weights (maybe as barycentric_weight) as a node attribute. If you have done that @yuvraajnarula should we close this conversation?

@yuvraajnarula
Author

@mandeepsingh2007 @Joltsy10

Pushed two commits:

  1. refactor: optimize mesh-to-grid connectivity with spatial indexing
  2. chore: add early exit + optimized function in containing_triangle case

@Joltsy10 – your review comments on performance and weights are addressed. The containing_triangle path now uses spatial indexing and stores barycentric weights as edge attributes.

@mandeepsingh2007 – mesh integration still solid? The optimizations shouldn't affect your generation logic.

Next: Generic methods (nearest_neighbour, etc.) could also use 3D spatial indexing for icosahedral cases. I'm going to keep this PR focused on containing_triangle since that's the critical path. We can optimize the generic methods in a follow-up PR if needed.

Thoughts?

@mandeepsingh2007

mandeepsingh2007 commented Mar 2, 2026

@yuvraajnarula Yes, the mesh integration is completely solid. The spatial indexing you added for the containing_triangle path sits downstream from the coordinate generation, so the underlying trimesh geometry remains perfectly intact and mathematically sound.

I also 100% agree with your approach for the next steps. Keeping this PR strictly scoped to the containing_triangle critical path and the baseline icosahedral layout is the smartest move. It prevents scope creep and makes this much easier for the mentors to review. We should definitely freeze the scope here, get this foundation merged, and tackle the generic methods in subsequent PRs.

This looks fantastic. From my end on the mesh side, this is ready to go!

- Replace 2D lat/lon vdiff with 3D Cartesian in flat and hierarchical mesh edge creation
- Apply same fix to G2M and M2G edges in connect_nodes_across_graphs
- Add tests verifying 3D vdiff for icosahedral and 2D backward compatibility for rectilinear
@leifdenby
Member

> Still waiting on confirmation for mesh node features: @leifdenby, should icosahedral graphs use 2D lat/lon or 3D Cartesian for the pos stored in mesh_features?

@Joltsy10: taking inspiration from @joeloskarsson's work on global probabilistic forecasting, what was done there (https://github.com/mllam/neural-lam/blob/prob_model_global/create_global_grid_features.py#L66) was to create a positional embedding (just sin and cos of the longitude and cos of the latitude), which makes me think we should store just the lat/lon coordinates as node attributes (and not the unit-sphere coordinates). What do you think?

@Joltsy10
Contributor

> @Joltsy10: taking inspiration from @joeloskarsson's work on global probabilistic forecasting, what was done there (https://github.com/mllam/neural-lam/blob/prob_model_global/create_global_grid_features.py#L66) was to create a positional embedding (just sin and cos of the longitude and cos of the latitude), which makes me think we should store just the lat/lon coordinates as node attributes (and not the unit-sphere coordinates). What do you think?

Oh yes, I was thinking of ways to embed the positional coordinates and considered the sin/cos embedding, which would make it 3D. But rather than computing it here in wmg, we could just pass the 2D lat/lon and embed it later in neural-lam. So having lat/lon as node attributes and embedding them later in neural-lam seems good to me.
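For reference, the embedding described above (sin/cos of longitude plus cos of latitude) could look like this on the model side (function name hypothetical):

```python
import numpy as np

def latlon_positional_features(lat_deg, lon_deg):
    """3-component embedding: sin/cos of longitude plus cos of latitude.

    Continuous across the dateline (the sin/cos pair has no jump at
    +/-180) and degenerate-free at the poles, where cos(lat) -> 0.
    """
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    return np.stack([np.sin(lon), np.cos(lon), np.cos(lat)], axis=-1)
```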

Member

@leifdenby leifdenby left a comment


Left some feedback on the notebooks (good idea to separate them into two!). Both are really great; just some suggestions for simplification and adjustment. Once you've revised the one on the internals I will give it another read, which will help me understand what the code is trying to achieve, and I'll then read the implementation next.

Also, what do you think about putting the notebook about the internals in docs/internals/? That way we can, over time, build up a collection of notebooks that describe the API functionality of weather-model-graphs separately from those that describe the internal implementation.

@yuvraajnarula
Author

@leifdenby
Reverting the visualization changes from this PR to keep #76 focused on core icosahedral graph generation.

I'm opening a new issue and will track 3D visualization work there. Following the discussion in #20, the plan is:

  • Add plot_3d.render_with_plotly(graph) in a separate PR
  • Use level attribute for Z-axis (grid nodes default to -1)
  • Batch edges with None separators for performance
  • Plotly as optional dependency

Happy to iterate on the API there. Thanks for the guidance!
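The None-separator batching mentioned in the plan can be sketched as pure data preparation (`batch_edge_coords` is a hypothetical helper; the resulting arrays would be passed to a single plotly `go.Scatter3d` trace so all edges render in one call rather than one trace per edge):

```python
def batch_edge_coords(node_pos, edges):
    """Flatten edge segments into single x/y/z lists with None separators.

    node_pos maps node id -> (x, y, z); each edge contributes its two
    endpoints followed by None, which tells Plotly to lift the pen.
    """
    xs, ys, zs = [], [], []
    for u, v in edges:
        (x0, y0, z0), (x1, y1, z1) = node_pos[u], node_pos[v]
        xs += [x0, x1, None]
        ys += [y0, y1, None]
        zs += [z0, z1, None]
    return xs, ys, zs
```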

yuvraajnarula added a commit to yuvraajnarula/weather-model-graphs that referenced this pull request Mar 26, 2026
@Prince637-boo

Prince637-boo commented Mar 30, 2026

Hi @yuvraajnarula @Raj-Taware @mandeepsingh2007, great progress on the icosahedral implementation. After reviewing the latest commits, I have a few concerns regarding global consistency:

1. Coordinate system: While lat/lon is 'pragmatic', using 3D Cartesian coordinates for mesh_features would avoid the pole singularity issue, which is critical for GNN stability in global models.

2. Efficiency: Have we considered using torch-cluster for the spatial indexing instead of scipy? This would keep the entire pipeline on the GPU and avoid costly CPU-GPU transfers during graph construction.

3. Hierarchical links: Does the current mesh_layout logic pre-compute the edges between different icosahedral levels? This is vital for implementing multi-scale message passing (pooling/unpooling) later on.

@Joltsy10
Contributor

@Prince637-boo

  1. We did discuss using 3D coordinates over lat/lon, but to follow the validator spec currently being worked on in neural-lam we settled on lat/lon on the wmg side, with the sin/cos embedding applied on the neural-lam model side.
  2. The graph generation happens only once here in wmg; the files are then handed over to neural-lam for training/inference. I assume you think we will be generating the graph during training, but that is not the case: the graph is generated once here, and the only things that change during training are the params. We will not run into this CPU/GPU handoff.
  3. Yes, the edges are precomputed; you can look at the visualization in feat: add 3D interactive graph visualisation with Plotly #118 to get a better idea.

@Prince637-boo

Prince637-boo commented Mar 30, 2026 via email

@yuvraajnarula
Author

@Prince637-boo
Thanks for the thoughtful questions!

To add to what @Joltsy10 already clarified:

1. Coordinate system : We opted for lat/lon in the graph to stay validator‑compliant. The neural‑lam side can apply sin/cos embeddings (as in the prob_model_global branch), which handles the polar continuity nicely.

2. Efficiency : Since graph generation is a one‑time offline step, scipy’s KDTree is sufficient for now. Keeping it CPU‑side avoids pulling in PyTorch GPU dependencies into WMG. If we ever need to generate graphs repeatedly (e.g., dynamic graphs), torch‑cluster would be a good candidate, definitely worth keeping in mind for the future.

3. Hierarchical links : Yes, the edges between levels are pre‑computed. The hierarchical layout in #118 visualises them clearly (intra‑level in blue, up/down in red/green). For multi‑scale message passing, the structure is there to use directly.

Happy to answer any more questions!

@Prince637-boo

Thanks @yuvraajnarula and @Joltsy10 for the detailed responses!

1. Coordinate system: Understood. Using sin/cos embeddings on the model side is a smart way to maintain validator compliance while solving the polar discontinuity.

2. Efficiency: That makes sense. Since the graph is static and pre-computed, the CPU overhead isn't a bottleneck for training. I agree that keeping WMG lightweight without forcing GPU dependencies is the right call for now.

3. Hierarchical links: I've been looking at the visualization in #118, and the concentric layout is a great improvement for clarity. It's much easier to trace the up/down edges now.

I'm currently diving deeper into the plot_3d.py logic. I'd love to help polish the global visualization further, perhaps by adding a coastline/map layer to give the icosahedral grid more geographical context. Looking forward to contributing!


7 participants