Sentinel-2 EOPF GeoZarr #16
- Implement functions to create Cloud Optimized GeoTIFF (COG) style overview levels.
- Calculate overview levels based on native dimensions and downsampling logic.
- Create overview templates maintaining native CRS and spatial attributes.
- Populate overview arrays with downsampled data using numpy methods.
- Verify coordinates and CRS in overview levels.
- Plot overview levels using xarray's native plotting capabilities.
…GeoZarr V3 This document outlines the core requirements, gaps in the current EOPF Zarr format, and the implementation process for converting EOPF datasets to comply with GeoZarr V3 standards. It includes detailed sections on CF compliance, spatial reference systems, multiscale support, and validation processes, along with a comparison between EOPF Zarr and GeoZarr V3.
"3": {}
}
}
}
I thought we agreed to not use TMS grid?
Sorry, this is a leftover from my previous experiment.
#### Overview Level Requirements

Each overview level MUST:
- Follow COG-style /2 downsampling (1:1, 1:2, 1:4, 1:8, etc.)
the /2 downsampling is not mandatory in COGs
This is the actual requirement, right:
A COG file SHALL contain reduced resolution subfiles each one reducing the resolution by a minimum factor of 2 and a maximum factor of 10 from the previous one.
I think that is the type of requirement we should translate to GeoZarr.
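A minimal sketch of how that kind of requirement could be checked for a set of multiscale levels, assuming each level is described only by its pixel width. The name `check_overview_factors` and its parameters are illustrative and not part of any existing GeoZarr or geozarr-examples API; the widths are example values based on the 10980-pixel Sentinel-2 10 m grid.

```python
def check_overview_factors(widths, min_factor=2.0, max_factor=10.0):
    """Return the reduction factor between consecutive levels.

    `widths` lists the pixel width of each level, full resolution first.
    Raises ValueError if any step falls outside [min_factor, max_factor].
    (Hypothetical helper, written only to illustrate the quoted requirement.)
    """
    factors = []
    for level, (prev, curr) in enumerate(zip(widths, widths[1:]), start=1):
        factor = prev / curr
        if not min_factor <= factor <= max_factor:
            raise ValueError(
                f"level {level}: factor {factor:.3f} is outside "
                f"[{min_factor}, {max_factor}]"
            )
        factors.append(factor)
    return factors


# Strict halving passes the check.
print(check_overview_factors([10980, 5490, 2745]))
# Non-power-of-two steps also pass, as long as each is between 2 and 10.
print(check_overview_factors([10980, 3660, 915]))
```

Because the check only bounds the ratio between neighbouring levels, it accepts the /2 pyramid as one valid case rather than the only one.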
### 3. Coordinate Arrays for All Levels

All GeoZarr V3 datasets MUST include proper coordinate arrays at every resolution level:
If we require `GeoTransform`, then we don't really need coordinate arrays. IMO having those arrays at each level would be painful to handle.
FYI: in COG we have one top level GeoTransform and then only the shape of the additional overviews
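To illustrate why the overview shapes are enough, here is a rough sketch that derives an overview's transform from the top-level one. It assumes a GDAL-ordered geotransform tuple and overviews that cover exactly the same extent as the full-resolution grid. `overview_geotransform` is a hypothetical helper, not a GDAL or rasterio function, and the origin values are made up.

```python
def overview_geotransform(gt, full_shape, overview_shape):
    """Scale a GDAL-ordered geotransform to an overview level.

    gt: (x_origin, pixel_width, row_rotation, y_origin, col_rotation, pixel_height)
    full_shape, overview_shape: (rows, cols)
    Assumes the overview spans the same extent as the full-resolution grid.
    """
    full_rows, full_cols = full_shape
    ov_rows, ov_cols = overview_shape
    sx = full_cols / ov_cols  # how many full-res columns one overview column spans
    sy = full_rows / ov_rows  # how many full-res rows one overview row spans
    x0, a, b, y0, d, e = gt
    # The origin (upper-left corner) is unchanged; only the per-pixel terms scale.
    return (x0, a * sx, b * sy, y0, d * sx, e * sy)


# Illustrative 10 m north-up grid in a UTM-like CRS.
full_gt = (300000.0, 10.0, 0.0, 5000040.0, 0.0, -10.0)
print(overview_geotransform(full_gt, (10980, 10980), (5490, 5490)))
```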
Amazing, I really appreciate you spending time on this! I only took a quick look but think it's mostly on the right track.
"spatial_ref": {
"attrs": {
"crs_wkt": "PROJCS[\"WGS 84 / UTM zone 32N\"...]",
"spatial_ref": "PROJCS[\"WGS 84 / UTM zone 32N\"...]",
"spatial_ref": "PROJCS[\"WGS 84 / UTM zone 32N\"...]",
I think only `crs_wkt` should be included because `spatial_ref` is not found as an attribute in CF-compliant NetCDF files or as a tag in GeoTIFFs; it's a redundancy added by rioxarray for interoperability with GDAL in memory. IMO it's best not to persist those redundancies on disk.
| Issue | Description | Impact |
|-------|-------------|---------|
| **Missing CF Standard Names** | Variables lack `standard_name` attributes | Reduces interoperability with CF-compliant tools |
What proportion of the EOPF data can be accurately described by the existing CF standard names? I was under the impression that data variable standardization was a bit more challenging with satellite products relative to climate / forecasting data.
| **Incomplete CRS Information** | CRS stored only as `proj:epsg` attribute | Limited compatibility with rioxarray and geospatial tools |
| **Missing Grid Mapping** | No `grid_mapping` attribute linking to spatial reference | Geospatial tools can't detect coordinate system |
| **No Multiscale Support** | Lacks overview levels and multiscale metadata | Poor performance for multi-scale visualization |
| **Missing Coordinate Arrays** | Overview levels lack proper x/y coordinate arrays | Cannot perform spatial operations on overview data |
For data that can be described by an affine transformation (e.g., regular grids), the coordinate arrays should not be required because they increase storage space, increase data transfer/load times, and can introduce errors associated with numerical precision relative to functional coordinates. Geospatial processing libraries can produce explicit coordinate arrays if needed based on the affine transformation.
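A sketch of what producing explicit coordinates on demand could look like for a north-up grid, in plain Python. It assumes a GDAL-ordered geotransform and pixel-center coordinates. `pixel_center_coords` is a hypothetical name used only for illustration, and the origin values are made up.

```python
def pixel_center_coords(gt, shape):
    """Build 1D x/y pixel-center coordinates from a GDAL-ordered geotransform.

    gt: (x_origin, pixel_width, row_rotation, y_origin, col_rotation, pixel_height)
    shape: (rows, cols)
    Only north-up (unrotated) grids are handled; rotated grids would need
    2D coordinate arrays instead.
    """
    x0, a, b, y0, d, e = gt
    if b != 0 or d != 0:
        raise ValueError("rotated grids cannot be described by 1D coordinates")
    rows, cols = shape
    # The +0.5 offset moves from the pixel's upper-left corner to its center.
    xs = [x0 + (col + 0.5) * a for col in range(cols)]
    ys = [y0 + (row + 0.5) * e for row in range(rows)]
    return xs, ys


# Illustrative 10 m grid: six numbers replace rows + cols stored coordinates.
xs, ys = pixel_center_coords((300000.0, 10.0, 0.0, 5000040.0, 0.0, -10.0), (2, 3))
print(xs)
print(ys)
```

Each coordinate is computed directly from its index, so no rounding error accumulates along the axis and nothing has to be stored or transferred per level.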
Summary
Branching from the initial PR from @wietzesuijker🙏🏼, this PR is a WIP for experimenting with GeoZarr from EOPF Sentinel-2.
For now, it simply adds Cloud Optimized GeoTIFF (COG) style multiscale functionality to the geozarr-examples repository, implementing proper overview levels that maintain native projections and follow industry best practices.
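For readers unfamiliar with COG-style overviews, the level calculation can be sketched roughly as below. This is not the code in `cog_multiscales.py`; the function name, the factor of 2, the floor division, and the 256-pixel stopping size are all assumptions chosen for illustration.

```python
def overview_shapes(height, width, min_size=256, factor=2):
    """List (rows, cols) for each level, full resolution first.

    Each level divides the previous one by `factor` (floor division) and the
    pyramid stops before the smaller dimension would drop below `min_size`.
    (Hypothetical helper; the stopping rule is an assumption.)
    """
    shapes = [(height, width)]
    while min(shapes[-1]) // factor >= min_size:
        rows, cols = shapes[-1]
        shapes.append((rows // factor, cols // factor))
    return shapes


# Sentinel-2 10 m bands are 10980 x 10980 pixels.
for level, shape in enumerate(overview_shapes(10980, 10980)):
    print(level, shape)
```

The native CRS applies unchanged to every level; only the shape and the pixel size differ from one level to the next.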
Changes
🆕 New Helper Module:
src/geozarr_examples/cog_multiscales.py
A comprehensive utility module providing:
🔄 Updated Notebook:
docs/examples/06_multiscales_as_WebMercatorQuad_EOPFZarrV3.ipynb
Key Features
✅ Follows COG Conventions
✅ Maintains Geospatial Integrity
✅ Modern Implementation
Next
I hope this experimental implementation can provide a foundation for COG-style multiscale GeoZarr datasets for EOPF.
@maxrjones @briannapagan @vincentsarago
Please let me know if I am going in the right direction. I'm still trying to get all the keys here