
Unexpected results? Strange array shapes? Missing data? Read here first. #152

@gjoseph92


If stackstac.stack is producing unexpected results, it's possible (but not certain!) that the problem is that the STAC metadata doesn't match up with the actual GeoTIFFs.

stackstac determines the resolution, bounds, CRS, array size, etc. up front only from the STAC metadata—it's careful to not look at the underlying data (GeoTIFFs) at all. If the STAC metadata says, for example, that an item is 1024x1024 pixels at 5m resolution, but the GeoTIFF is actually 1m resolution, then stackstac will pick an output bounding box and resolution 5x larger than what the actual data calls for.
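To make the 5x example concrete, here is a minimal arithmetic sketch. It is not stackstac code. `extent` is a hypothetical helper, and it assumes the footprint is simply the pixel count times the pixel size.

```python
def extent(shape, resolution):
    """Ground extent (height, width) covered by a raster of `shape` pixels.

    Hypothetical helper for illustration only; not part of stackstac.
    """
    rows, cols = shape
    return rows * resolution, cols * resolution


# What the STAC metadata claims: 1024x1024 pixels at 5 m.
claimed = extent((1024, 1024), 5.0)
# What the GeoTIFF actually contains: 1024x1024 pixels at 1 m.
actual = extent((1024, 1024), 1.0)

# How much larger the metadata-derived footprint is along each axis.
scale = claimed[0] / actual[0]
# Fraction of the metadata-derived area that the real data covers.
covered_fraction = (actual[0] * actual[1]) / (claimed[0] * claimed[1])

print(claimed, actual, scale, covered_fraction)
```

In this made-up case the real pixels would fill only a small corner of the output grid, and would be resampled to the coarser resolution on top of that.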

Additionally, while compute-ing each dask chunk, stackstac skips even opening files that don't spatially overlap with the chunk, according to STAC metadata. If the STAC metadata is wrong, and a file does in fact overlap, then stackstac will never know, and your result may have unexpected sections of NaNs/missing data.
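The skipping behavior can be pictured as a plain bounding-box overlap test. The sketch below only illustrates the idea and is not stackstac's actual implementation. `bounds_overlap`, `files_to_open`, and all of the coordinates are made up for the example.

```python
def bounds_overlap(a, b):
    """True if two (minx, miny, maxx, maxy) boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def files_to_open(chunk_bounds, footprints):
    """Names of files whose footprint overlaps the chunk."""
    return [name for name, fp in footprints.items() if bounds_overlap(chunk_bounds, fp)]


chunk = (0, 0, 100, 100)

# Where the file's pixels really are: overlapping the chunk.
true_footprints = {"scene.tif": (50, 50, 150, 150)}
# Where the (incorrect) STAC metadata says they are: somewhere else entirely.
stac_footprints = {"scene.tif": (200, 200, 300, 300)}

opened_with_correct_metadata = files_to_open(chunk, true_footprints)
opened_with_wrong_metadata = files_to_open(chunk, stac_footprints)

print(opened_with_correct_metadata)  # the file is read
print(opened_with_wrong_metadata)    # the file is skipped, so the chunk stays empty
```

With the wrong footprint the file is never opened for this chunk, which is how a metadata error turns into a block of NaNs in the result.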

What can you do about it?

  • Check whether the STAC metadata and the actual data disagree. Use gdalinfo or xr.open_rasterio to inspect the spatial parameters of a few of the files in question, and compare them to the STAC entries for those items. This is a fairly manual process.
  • Try setting resolution=, epsg=, and bounds= explicitly.
  • If you still get missing data with that, you'll need to pre-process the STAC metadata and correct it yourself before passing it into stackstac.stack.
  • In all cases, please open an issue with the data provider (Microsoft Planetary Computer, etc.).
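The manual comparison in the first step can be partly scripted. The following is a rough sketch, assuming the items carry the projection extension fields `proj:epsg`, `proj:shape`, and `proj:transform` (some catalogs use other fields, or none). `compare_proj_metadata` and `read_actual` are hypothetical helpers, and the values in the example are invented. `read_actual` needs rasterio installed and is not called here.

```python
import math


def compare_proj_metadata(stac_props, actual, tol=1e-6):
    """Return {field: (stac_value, actual_value)} for each field that disagrees."""
    mismatches = {}
    if "proj:epsg" in stac_props and stac_props["proj:epsg"] != actual["epsg"]:
        mismatches["proj:epsg"] = (stac_props["proj:epsg"], actual["epsg"])
    if "proj:shape" in stac_props and list(stac_props["proj:shape"]) != list(actual["shape"]):
        mismatches["proj:shape"] = (stac_props["proj:shape"], actual["shape"])
    if "proj:transform" in stac_props:
        # Only the first six affine coefficients carry information.
        claimed = list(stac_props["proj:transform"])[:6]
        real = list(actual["transform"])[:6]
        if not all(math.isclose(c, r, abs_tol=tol) for c, r in zip(claimed, real)):
            mismatches["proj:transform"] = (claimed, real)
    return mismatches


def read_actual(href):
    """Read the real spatial parameters of a GeoTIFF (requires rasterio)."""
    import rasterio  # imported lazily so the comparison works without it

    with rasterio.open(href) as ds:
        return {
            "epsg": ds.crs.to_epsg() if ds.crs else None,
            "shape": ds.shape,  # (height, width), same order as proj:shape
            "transform": tuple(ds.transform)[:6],
        }


# Invented values: metadata claims 5 m pixels, the file has 1 m pixels.
stac_props = {
    "proj:epsg": 32613,
    "proj:shape": [1024, 1024],
    "proj:transform": [5.0, 0.0, 500000.0, 0.0, -5.0, 4000000.0],
}
actual = {
    "epsg": 32613,
    "shape": (1024, 1024),
    "transform": (1.0, 0.0, 500000.0, 0.0, -1.0, 4000000.0),
}

mismatches = compare_proj_metadata(stac_props, actual)
print(mismatches)
```

In real use you would pass `item.properties` (or the asset's own fields, if the projection info lives there) and `read_actual(asset_href)`. Any non-empty result is worth including in the issue you open with the data provider.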

If there isn't a mismatch between STAC metadata and actual data, and you're getting unexpected results, there are a couple other things to be aware of:

  • If you're seeing half-pixel offsets, be aware of the xy_coords='topleft' default. If you're working with rioxarray, you may want to use stackstac.stack(..., xy_coords='center').
  • Also be aware of the snap_bounds=True default, especially if you're passing custom bounds.
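Both effects are easy to see with a little arithmetic. The sketch below illustrates the concepts only and is not stackstac's code. `pixel_coords` and `snap` are hypothetical helpers, and the snapping rule shown (expand outward to whole multiples of the resolution) is an assumption about what "snapping" means here.

```python
import math


def pixel_coords(origin, resolution, n, xy_coords="topleft"):
    """x coordinates labelling n pixels, at their left edge or their center."""
    offset = 0.5 if xy_coords == "center" else 0.0
    return [origin + (i + offset) * resolution for i in range(n)]


def snap(bounds, resolution):
    """Expand (minx, miny, maxx, maxy) outward to whole multiples of resolution."""
    minx, miny, maxx, maxy = bounds
    return (
        math.floor(minx / resolution) * resolution,
        math.floor(miny / resolution) * resolution,
        math.ceil(maxx / resolution) * resolution,
        math.ceil(maxy / resolution) * resolution,
    )


# Same pixels, two labelling conventions: they differ by half a pixel.
x_topleft = pixel_coords(500000.0, 10.0, 3)
x_center = pixel_coords(500000.0, 10.0, 3, xy_coords="center")

# Custom bounds that are not aligned to the 10 m grid get widened.
custom_bounds = (500003.0, 4000007.0, 500021.0, 4000018.0)
snapped = snap(custom_bounds, 10.0)

print(x_topleft, x_center)
print(snapped)
```

If your output extent is slightly larger than the bounds you passed, or coordinates are shifted by exactly half the resolution relative to another tool, one of these two defaults is the likely cause.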

If you've checked all those things, and you're still getting unexpected results, then there's probably a stackstac bug. Please open an issue!

Past issues ultimately due to incorrect STAC metadata or xy_coords/snap_bounds:
