Description
We wish to be able to query a spatiotemporal "catalog" using the RasterSource interface. This could be a GeoTrellis layer with a temporal dimension, or it could be a STAC catalog where multiple COGs (Cloud Optimized GeoTiffs) are available with differing timestamps.
In either case, we are currently limited because the RasterSource interface presents a single raster, both in its metadata and in its read methods.
It is not desirable to further complicate the RasterSource interface by adding temporal query capability. Further, it is likely that different time slices will be spread over multiple files. Switching the source of IO and potentially reading metadata should not be hidden from the user of the RasterSource interface, as those are potentially expensive and error-prone operations that must be explicit and carefully considered by the user.
Instead, we should preserve RasterSource as a "spatial slice" of an otherwise potentially multi-dimensional dataset. Therefore, "something" should be able to produce RasterSource instances given a spatiotemporal query. Let's call this something RasterCatalog, because it maps nicely onto a STAC catalog.
At a minimum:
trait RasterCatalog {
  def find(query: Query): Seq[RasterSource]
}
Since RasterSource as an interface is supposed to be lazy, we can safely return a large number of them. Additionally, if the query source (such as a STAC catalog) has already recorded the raster metadata, that metadata could be included in a specialized wrapper, further delaying the initial header read until the first RasterSource.read() call is made.
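The specialized wrapper idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the `RasterSource` trait here is a minimal stand-in, not the real GeoTrellis interface, and `RasterMetadata` and `CatalogRasterSource` are hypothetical names. Metadata comes from the catalog record, and the underlying source is opened only on the first `read()`.

```scala
// Stand-in types for illustration only; the real RasterSource has a richer interface.
final case class RasterMetadata(cols: Int, rows: Int, bandCount: Int)

trait RasterSource {
  def metadata: RasterMetadata
  def read(): Array[Double]
}

// Hypothetical wrapper: metadata is supplied up front from the catalog record,
// so asking for it never touches the file.
final class CatalogRasterSource(
  val metadata: RasterMetadata,
  open: () => RasterSource // assumed factory that performs the actual IO
) extends RasterSource {
  // Evaluated at most once, on the first read() call.
  private lazy val underlying: RasterSource = open()

  def read(): Array[Double] = underlying.read()
}
```

With this shape, a catalog can hand back many wrappers cheaply, and the cost of opening a file is paid only by the caller who actually reads from it.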
The Query type should encapsulate the query as an ADT, so it can be inspected, serialized and optimized before actual execution.
Critically, this interface should avoid making any assumptions about the structure of the data, because it can commonly take many forms.
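One possible shape for such an ADT is sketched below. The constructors (`All`, `Intersects`, `During`, `And`, `Or`), the `BBox` type, and the `simplify` pass are all assumptions made for illustration, not an agreed design. The point is only that a query represented as plain data can be rewritten before any IO happens.

```scala
import java.time.Instant

// Hypothetical bounding box; a real implementation would likely reuse an existing extent type.
final case class BBox(xmin: Double, ymin: Double, xmax: Double, ymax: Double)

sealed trait Query
object Query {
  case object All extends Query                                     // matches everything
  final case class Intersects(bbox: BBox) extends Query             // spatial filter
  final case class During(from: Instant, to: Instant) extends Query // temporal filter
  final case class And(left: Query, right: Query) extends Query
  final case class Or(left: Query, right: Query) extends Query

  // Example optimization: remove redundant All nodes before execution.
  def simplify(q: Query): Query = q match {
    case And(l, r) =>
      (simplify(l), simplify(r)) match {
        case (All, x) => x
        case (x, All) => x
        case (x, y)   => And(x, y)
      }
    case Or(l, r) =>
      (simplify(l), simplify(r)) match {
        case (All, _) | (_, All) => All
        case (x, y)              => Or(x, y)
      }
    case other => other
  }
}
```

Because every node is a case class or case object, the same tree could also be pattern matched by each catalog implementation to translate it into its native filter (for example, STAC search parameters).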
For instance:
- GeoTrellis spatiotemporal layer/pyramid
- STAC catalog
- Folder of COGs with timestamp encoded in filename
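To make the last case concrete, here is a rough sketch of how a folder-backed catalog might recover the temporal dimension from filenames. The naming convention `<name>_<yyyyMMdd>.tif` and the `FilenameTimestamps` helper are assumptions for illustration; a real folder could use any scheme.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.Try

object FilenameTimestamps {
  // Assumed convention: anything ending in _<yyyyMMdd>.tif
  private val Pattern = """.*_(\d{8})\.tif$""".r
  private val Format = DateTimeFormatter.BASIC_ISO_DATE

  // Returns None when the name does not follow the convention
  // or the digits do not form a valid calendar date.
  def dateOf(filename: String): Option[LocalDate] = filename match {
    case Pattern(d) => Try(LocalDate.parse(d, Format)).toOption
    case _          => None
  }

  // Keeps only the files whose date falls within [from, to], inclusive.
  def between(files: Seq[String], from: LocalDate, to: LocalDate): Seq[String] =
    files.filter(f => dateOf(f).exists(d => !d.isBefore(from) && !d.isAfter(to)))
}
```

A catalog built this way can answer a temporal query from a directory listing alone, without opening any of the files, which fits the goal of keeping IO explicit.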