This repository contains a framework, STACpopulator, that can be used to implement concrete populators (see implementations) that build a STAC Catalog, Collections and Items from various dataset/catalog sources and push them to a server node using the STAC API.
It can also be used to export data from an existing STAC API or catalog to files on disk. These can then later
be used to populate a STAC API with the DirectoryLoader implementation.
It can also be used to update a STAC collection's extents and/or summaries based on the STAC items that already are part of the collection. It does this by iterating through the items in the collection and updating the relevant collection properties accordingly.
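To illustrate the idea of deriving a collection's extents from its items, here is a minimal sketch in plain Python. It is not the code used by this repository: the function names `parse_datetime` and `compute_extent` are hypothetical, and the sketch assumes every item carries a 4-element `bbox` and a non-null `properties.datetime`. It also ignores bounding boxes that cross the antimeridian.

```python
from datetime import datetime


def parse_datetime(value):
    """Parse an RFC 3339 timestamp such as '2020-01-01T00:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def compute_extent(items):
    """Return a STAC-style extent covering every item (hypothetical helper).

    Assumes each item is a dictionary with a 4-element `bbox` and a
    `properties.datetime` string.
    """
    west = south = float("inf")
    east = north = float("-inf")
    start = end = None
    for item in items:
        x_min, y_min, x_max, y_max = item["bbox"]
        # Grow the bounding box so that it contains this item.
        west, south = min(west, x_min), min(south, y_min)
        east, north = max(east, x_max), max(north, y_max)
        # Track the earliest and latest timestamps seen so far.
        moment = parse_datetime(item["properties"]["datetime"])
        start = moment if start is None or moment < start else start
        end = moment if end is None or moment > end else end
    if start is None:
        raise ValueError("cannot compute an extent without any items")
    return {
        "spatial": {"bbox": [[west, south, east, north]]},
        "temporal": {"interval": [[start.isoformat(), end.isoformat()]]},
    }
```

The returned dictionary follows the layout of the `extent` field of a STAC Collection, so it could be assigned to a collection document before it is written back.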
The framework is centered around a Python Abstract Base Class, STACpopulatorBase, that implements the shared logic
for populating a STAC catalog. This class provides abstract methods that must be overridden by implementations, which
supply the logic for constructing the STAC representation of each item in the collection being processed.
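The pattern described above can be sketched as follows. This is a simplified illustration, not the real STACpopulatorBase: the class names `PopulatorSketch` and `ExamplePopulator`, the method names `create_stac_item` and `ingest`, and the constructor arguments are all assumptions made for this example. Check the actual base class for the methods it requires.

```python
from abc import ABC, abstractmethod


class PopulatorSketch(ABC):
    """Hypothetical stand-in for the base class: owns the iteration logic."""

    def __init__(self, collection_id, sources):
        self.collection_id = collection_id
        self.sources = sources  # mapping of item name -> raw metadata

    @abstractmethod
    def create_stac_item(self, item_name, item_data):
        """Build the STAC Item dictionary for one source record."""

    def ingest(self):
        """Build one STAC Item per source record and return them all."""
        return [
            self.create_stac_item(name, data)
            for name, data in self.sources.items()
        ]


class ExamplePopulator(PopulatorSketch):
    """Concrete implementation: only knows how to describe a single item."""

    def create_stac_item(self, item_name, item_data):
        return {
            "type": "Feature",
            "stac_version": "1.0.0",
            "id": item_name,
            "collection": self.collection_id,
            "geometry": None,
            "properties": {"datetime": item_data["time"]},
            "links": [],
            "assets": {},
        }
```

In this sketch `ingest` only returns the generated dictionaries; a real populator would also publish them to the STAC API.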
Provided implementations of STACpopulatorBase:
| Implementation | Description |
|---|---|
| RDPS_CRIM | Crawls a THREDDS Catalog for RDPS NCML-annotated NetCDF references to publish corresponding STAC Collection and Items. |
| HRDPS_CRIM | Crawls a THREDDS Catalog for HRDPS NCML-annotated NetCDF references to publish corresponding STAC Collection and Items. |
| CMIP6_UofT | Crawls a THREDDS Catalog for CMIP6 NCML-annotated NetCDF references to publish corresponding STAC Collection and Items. |
| DirectoryLoader | Crawls a subdirectory hierarchy of pre-generated STAC Collections and Items to publish to a STAC API endpoint. |
| CORDEX-CMIP6_Ouranos | Crawls a THREDDS Catalog for CORDEX-CMIP6 NetCDF references to publish corresponding STAC Collection and Items. |
Either with Python directly (in an environment of your choosing):
pip install .
# OR
make install

With development packages:
pip install .[dev]
# OR
make install-dev

You should then be able to call the STAC populator CLI with the following commands:
# obtain the installed version of the STAC populator
stac-populator --version
# obtain general help about available commands
stac-populator --help
# obtain general help about available STAC populator implementations
stac-populator run --help
# obtain help specifically for the execution of a STAC populator implementation
stac-populator run [implementation] --help
# obtain general help about exporting STAC catalogs to a directory on disk
stac-populator export --help
# obtain general help about updating STAC collections based on their items
stac-populator update-collection --help

The CMIP6 stac-populator extension requires that the pyessv-archive data
files be installed. To install this package to the default location in your home directory at ~/.esdoc/pyessv-archive:
git clone https://github.com/ES-DOC/pyessv-archive ~/.esdoc/pyessv-archive
# OR
make setup-pyessv-archive

You can also choose to install them to a location on disk other than the default:
git clone https://github.com/ES-DOC/pyessv-archive /some/other/place
# OR
PYESSV_ARCHIVE_HOME=/some/other/place make setup-pyessv-archive

Note:
If you have installed the pyessv-archive data files to a non-default
location, you need to specify that location with the PYESSV_ARCHIVE_HOME environment variable. For example,
if you've installed the pyessv-archive files to /some/other/place then run the following before executing
any of the example commands above:
export PYESSV_ARCHIVE_HOME=/some/other/place

You can also employ the pre-built Docker image, which can be called as follows,
where [command] corresponds to any of the above example operations.
docker run -ti ghcr.io/crim-ca/stac-populator:0.12.0 [command]

Note:
If files need to be provided as input or obtained as output when running a command with Docker, you will need to either
mount the files individually or mount a workspace directory using -v {local-path}:{docker-path} to make them
accessible to the command inside the Docker container.
The provided docker-compose configuration file can be used to launch a test STAC server.
Consider using make docker-start to start this server, and make docker-stop to stop it.
Alternatively, you can also use your own STAC server accessible from any remote location.
To run the STAC populator, follow the steps from Installation and Execution.
For additional validation, you can also run the test suite with coverage analysis:
make test-cov

We welcome any contributions to this codebase. To submit suggested changes, please do the following:
- create a new feature branch off of master
- update the code, write/update tests, write/update documentation
- submit a pull request targeting the master branch
This codebase uses the ruff formatter and linter to enforce style policies.
To check that your changes conform to these policies please run:
ruff format
ruff check

You can also set up pre-commit hooks that will run these checks before you create any commit in this repo:
pre-commit install

Unit tests use the pytest-recording package to cache network responses. This allows the tests to be run offline and allows them to reliably pass regardless of whether a remote resource is available or not.
Whenever you're writing tests that make a request to an external resource, please use the @pytest.mark.vcr
decorator and record a new cassette (response cache) which can be committed to version control with the new
tests.