From 10288469b93657e9f57030122ab734097d4b60a7 Mon Sep 17 00:00:00 2001 From: Rob Emanuele Date: Tue, 8 Sep 2020 10:26:43 -0400 Subject: [PATCH] Add landsat STAC creation tutorial for STAC sprint 6. --- docs/tutorials.rst | 10 +- docs/tutorials/creating-a-landsat-stac.ipynb | 2270 ++++++++++++++++++ 2 files changed, 2279 insertions(+), 1 deletion(-) create mode 100644 docs/tutorials/creating-a-landsat-stac.ipynb diff --git a/docs/tutorials.rst b/docs/tutorials.rst index fdf0a0806..d3c5ee9a1 100644 --- a/docs/tutorials.rst +++ b/docs/tutorials.rst @@ -27,10 +27,18 @@ How to create STAC Catalogs with PySTAC This was a tutorial that was part of a 30 minute presentation at the `community STAC sprint `_ in Arlington, VA in November 2019. It runs through creating a STAC of image or label items from the `SpaceNet 5 `_ dataset. +Creating a Landsat 8 STAC +------------------------- + +- :tutorial:`GitHub version ` +- :ref:`Docs version ` + +This tutorial was presented at [Cloud Native Geospatial Outreach Day](https://sites.google.com/radiant.earth/cng-agenda/) on September 8th, 2020. It shows how to create a STAC collection from a subset of Landsat 8 scenes over a location. + Adding New and Custom Extensions -------------------------------- -- :tutorial:`GitHub version ` +- :tutorial:`GitHub version ` - :ref:`Docs version ` This tutorial goes over how to contribute new extensions to PySTAC as well as how to implement diff --git a/docs/tutorials/creating-a-landsat-stac.ipynb b/docs/tutorials/creating-a-landsat-stac.ipynb new file mode 100644 index 000000000..053310976 --- /dev/null +++ b/docs/tutorials/creating-a-landsat-stac.ipynb @@ -0,0 +1,2270 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Creating a STAC of Landsat data\n", + "\n", + "In this tutorial we create a STAC of Landsat 8 data from Amazon Web Service's [Open Data program](https://registry.opendata.aws/landsat-8/). There's a lot of Landsat scenes, so we'll only take a subset of scenes that are from a specific year and over a specific location. We'll translate existing metadata about each scene to STAC information, utilizing the `eo`, `view`, and `proj` extensions. Finally we'll write out the STAC catalog to our local machine, allowing us to use [stac-browser](https://github.com/radiantearth/stac-browser) to preview the images.\n", + "\n", + "### Requirements\n", + "\n", + "To run this tutorial you'll have needed to install PySTAC with the validation extra. To do this, use:\n", + "\n", + "```\n", + "pip install pystac[validation]\n", + "```\n", + "\n", + "Also to run this notebook you'll need [jupyter](https://jupyter.org/) installed locally as well. If you're running in a docker container, make sure that port `5555` is exposed if you want to run the server at the end of the notebook." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import pystac" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Reading Landsat 8 scene data\n", + "\n", + "AWS keeps a scene list that includes information about where the scene is located and how to access it's information. Landsat 8 was reorganized into a [Collection](https://www.usgs.gov/land-resources/nli/landsat/landsat-collections?qt-science_support_page_related_con=2#qt-science_support_page_related_con) system in the past, and so there's two places to read a scene list from (as [mentioned on the aws page](https://docs.opendata.aws/landsat-pds/readme.html)). We'll pull from the data that is organized as Collection 1 data, which is where all new processed data is added since 2017." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "import csv\n", + "import gzip\n", + "from io import StringIO\n", + "from urllib.request import urlopen\n", + "\n", + "# Collection 1 data\n", + "url = 'https://landsat-pds.s3.amazonaws.com/c1/L8/scene_list.gz'\n", + "\n", + "# Read and unzip the content\n", + "response = urlopen(url)\n", + "gunzip_response = gzip.GzipFile(fileobj=response)\n", + "content = gunzip_response.read()\n", + "\n", + "# Read the scenes in as dictionaries\n", + "scenes = list(csv.DictReader(StringIO(content.decode(\"utf-8\"))))" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "2073901" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(scenes)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As you can see, there are a lot of scenes! Even though STAC items contain just the metadata for the scenes (and not the bulky raster image), this would still be a lot of data and files to work with for this tutorial.\n", + "\n", + "Let's see what one of the scenes looks like, and then filter on those properties to scope things down." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "OrderedDict([('productId', 'LC08_L1TP_149039_20170411_20170415_01_T1'),\n", + " ('entityId', 'LC81490392017101LGN00'),\n", + " ('acquisitionDate', '2017-04-11 05:36:29.349932'),\n", + " ('cloudCover', '0.0'),\n", + " ('processingLevel', 'L1TP'),\n", + " ('path', '149'),\n", + " ('row', '39'),\n", + " ('min_lat', '29.22165'),\n", + " ('min_lon', '72.41205'),\n", + " ('max_lat', '31.34742'),\n", + " ('max_lon', '74.84666'),\n", + " ('download_url',\n", + " 'https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/149/039/LC08_L1TP_149039_20170411_20170415_01_T1/index.html')])" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "scenes[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As you can see, we have both a date and a location that we can filter on." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Filter scenes by a location\n", + "\n", + "Let's pick a location to filter the scenes by. Here we choose Philly, but feel free to change the location by modifying the latitude and longitude coordinates below and changing the location name: " + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "lat, lon = 39.9526, -75.1652\n", + "location_name = \"Philly\"\n", + "filter_year = '2020'" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll use the coordinates and the year to filter out any unwanted scenes:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "def keep_scene(scene):\n", + " contains_location = float(scene['min_lat']) < lat and float(scene['max_lat']) > lat and \\\n", + " float(scene['min_lon']) < lon and float(scene['max_lon']) > lon\n", + " is_correct_year = '{}-'.format(filter_year) in scene['acquisitionDate']\n", + " return contains_location and is_correct_year\n", + " \n", + "location_scenes = [scene for scene in scenes if keep_scene(scene)]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This should leave us with a much more manageable subset of scenes:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "52" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(location_scenes)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll be working with a single scene through the next few sections, so let's use the first scene in our list:" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "OrderedDict([('productId', 'LC08_L1GT_015032_20200103_20200103_01_RT'),\n", + " ('entityId', 'LC80150322020003LGN00'),\n", + " ('acquisitionDate', '2020-01-03 15:46:15.625235'),\n", + " ('cloudCover', '100.0'),\n", + " ('processingLevel', 'L1GT'),\n", + " ('path', '15'),\n", + " ('row', '32'),\n", + " ('min_lat', '39.23189'),\n", + " ('min_lon', '-77.83282'),\n", + " ('max_lat', '41.37536'),\n", + " ('max_lon', '-75.02666'),\n", + " ('download_url',\n", + " 'https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/015/032/LC08_L1GT_015032_20200103_20200103_01_RT/index.html')])" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "scene = location_scenes[0]\n", + "scene" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Read metadata from the MTL file\n", + "\n", + "Landsat 8 metadata is contained in an `MTL` file that is alongside of the `download_url` file specified in the scene data. Let's read the MTL file for the first scene and see what it looks like.\n", + "\n", + "First we define a function that reads a file based on the `download_url` location - we'll be using this a lot to get file URLs releated to a scene:" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "def get_asset_url(scene, suffix):\n", + " product_id = scene['productId']\n", + " download_url = scene['download_url'] \n", + " asset_filename = '{}_{}'.format(product_id, suffix)\n", + " return download_url.replace('index.html', asset_filename)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can then use function to get the MTL file for our scene. Notice we use `pystac.STAC_IO.read_text` - this is the method that PySTAC uses to read text as it crawls a STAC. It can read from the local filesystem or HTTP/HTTPS by default. Also, it can be extended to read from other sources such as cloud providers - [see the documentation here](https://pystac.readthedocs.io/en/0.5/concepts.html#using-stac-io). For now we'll use it directly as an easy way to read a text file from an HTTPS source:" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "GROUP = L1_METADATA_FILE\n", + " GROUP = METADATA_FILE_INFO\n", + " ORIGIN = \"Image courtesy of the U.S. Geological Survey\"\n", + " REQUEST_ID = \"0502001033167_00013\"\n", + " LANDSAT_SCENE_ID = \"LC80150322020003LGN00\"\n", + " LANDSAT_PRODUCT_ID = \"LC08_L1GT_015032_20200103_20200103_01_RT\"\n", + " COLLECTION_NUMBER = 01\n", + " FILE_DATE = 2020-01-03T21:39:14Z\n", + " STATION_ID = \"LGN\"\n", + " PROCESSING_SOFTWARE_VERSION = \"LPGS_13.1.0\"\n", + " END_GROUP = METADATA_FILE_INFO\n", + " GROUP = PRODUCT_METADATA\n", + " DATA_TYPE = \"L1GT\"\n", + " COLLECTION_CATEGORY = \"RT\"\n", + " ELEVATION_SOURCE = \"GLS2000\"\n", + " OUTPUT_FORMAT = \"GEOTIFF\"\n", + " SPACECRAFT_ID = \"LANDSAT_8\"\n", + " SENSOR_ID = \"OLI_TIRS\"\n", + " WRS_PATH = 15\n", + " WRS_ROW = 32\n", + " NADIR_OFFNADIR = \"NADIR\"\n", + " TARGET_WRS_PATH = 15\n", + " TARGET_WRS_ROW = 32\n", + " DATE_ACQUIRED = 2020-01-03\n", + " SCENE_CENTER_TIME = \"15:46:15.6252350Z\"\n", + " CORNER_UL_LAT_PRODUCT = 41.37536\n", + " CORNER_UL_LON_PRODUCT = -77.83282\n", + " CORNER_UR_LAT_PRODUCT = 41.41024\n", + " CORNER_UR_LON_PRODUCT = -75.02752\n", + " CORNER_LL_LAT_PRODUCT = 39.23189\n", + " CORNER_LL_LON_PRODUCT = -77.74460\n", + " CORNER_LR_LAT_PRODUCT = 39.26423\n", + " CORNER_LR_LON_PRODUCT = -75.02666\n", + " CORNER_UL_PROJECTION_X_PRODUCT = 263100.000\n", + " CORNER_UL_PROJECTION_Y_PRODUCT = 4584300.000\n", + " CORNER_UR_PROJECTION_X_PRODUCT = 497700.000\n", + " CORNER_UR_PROJECTION_Y_PRODUCT = 4584300.000\n", + " CORNER_LL_PROJECTION_X_PRODUCT = 263100.000\n", + " CORNER_LL_PROJECTION_Y_PRODUCT = 4346100.000\n", + " CORNER_LR_PROJECTION_X_PRODUCT = 497700.000\n", + " CORNER_LR_PROJECTION_Y_PRODUCT = 4346100.000\n", + " PANCHROMATIC_LINES = 15881\n", + " PANCHROMATIC_SAMPLES = 15641\n", + " REFLECTIVE_LINES = 7941\n", + " REFLECTIVE_SAMPLES = 7821\n", + " THERMAL_LINES = 7941\n", + " THERMAL_SAMPLES = 7821\n", + " FILE_NAME_BAND_1 = \"LC08_L1GT_015032_20200103_20200103_01_RT_B1.TIF\"\n", + " FILE_NAME_BAND_2 = \"LC08_L1GT_015032_20200103_20200103_01_RT_B2.TIF\"\n", + " FILE_NAME_BAND_3 = \"LC08_L1GT_015032_20200103_20200103_01_RT_B3.TIF\"\n", + " FILE_NAME_BAND_4 = \"LC08_L1GT_015032_20200103_20200103_01_RT_B4.TIF\"\n", + " FILE_NAME_BAND_5 = \"LC08_L1GT_015032_20200103_20200103_01_RT_B5.TIF\"\n", + " FILE_NAME_BAND_6 = \"LC08_L1GT_015032_20200103_20200103_01_RT_B6.TIF\"\n", + " FILE_NAME_BAND_7 = \"LC08_L1GT_015032_20200103_20200103_01_RT_B7.TIF\"\n", + " FILE_NAME_BAND_8 = \"LC08_L1GT_015032_20200103_20200103_01_RT_B8.TIF\"\n", + " FILE_NAME_BAND_9 = \"LC08_L1GT_015032_20200103_20200103_01_RT_B9.TIF\"\n", + " FILE_NAME_BAND_10 = \"LC08_L1GT_015032_20200103_20200103_01_RT_B10.TIF\"\n", + " FILE_NAME_BAND_11 = \"LC08_L1GT_015032_20200103_20200103_01_RT_B11.TIF\"\n", + " FILE_NAME_BAND_QUALITY = \"LC08_L1GT_015032_20200103_20200103_01_RT_BQA.TIF\"\n", + " ANGLE_COEFFICIENT_FILE_NAME = \"LC08_L1GT_015032_20200103_20200103_01_RT_ANG.txt\"\n", + " METADATA_FILE_NAME = \"LC08_L1GT_015032_20200103_20200103_01_RT_MTL.txt\"\n", + " CPF_NAME = \"LC08CPF_20200101_20200331_01.02\"\n", + " BPF_NAME_OLI = \"LO8BPF20200103153915_20200103162306.02\"\n", + " BPF_NAME_TIRS = \"LT8BPF20200101032539_20200101033839.01\"\n", + " RLUT_FILE_NAME = \"LC08RLUT_20150303_20431231_01_12.h5\"\n", + " END_GROUP = PRODUCT_METADATA\n", + " GROUP = IMAGE_ATTRIBUTES\n", + " CLOUD_COVER = 100.00\n", + " CLOUD_COVER_LAND = 100.00\n", + " IMAGE_QUALITY_OLI = 9\n", + " IMAGE_QUALITY_TIRS = 7\n", + " TIRS_SSM_MODEL = \"PRELIMINARY\"\n", + " TIRS_SSM_POSITION_STATUS = \"ESTIMATED\"\n", + " TIRS_STRAY_LIGHT_CORRECTION_SOURCE = \"TIRS\"\n", + " ROLL_ANGLE = -0.001\n", + " SUN_AZIMUTH = 158.89720783\n", + " SUN_ELEVATION = 23.89789093\n", + " EARTH_SUN_DISTANCE = 0.9832512\n", + " SATURATION_BAND_1 = \"N\"\n", + " SATURATION_BAND_2 = \"N\"\n", + " SATURATION_BAND_3 = \"N\"\n", + " SATURATION_BAND_4 = \"N\"\n", + " SATURATION_BAND_5 = \"N\"\n", + " SATURATION_BAND_6 = \"N\"\n", + " SATURATION_BAND_7 = \"N\"\n", + " SATURATION_BAND_8 = \"N\"\n", + " SATURATION_BAND_9 = \"N\"\n", + " TRUNCATION_OLI = \"UPPER\"\n", + " END_GROUP = IMAGE_ATTRIBUTES\n", + " GROUP = MIN_MAX_RADIANCE\n", + " RADIANCE_MAXIMUM_BAND_1 = 786.17712\n", + " RADIANCE_MINIMUM_BAND_1 = -64.92276\n", + " RADIANCE_MAXIMUM_BAND_2 = 805.05499\n", + " RADIANCE_MINIMUM_BAND_2 = -66.48170\n", + " RADIANCE_MAXIMUM_BAND_3 = 741.85126\n", + " RADIANCE_MINIMUM_BAND_3 = -61.26232\n", + " RADIANCE_MAXIMUM_BAND_4 = 625.57080\n", + " RADIANCE_MINIMUM_BAND_4 = -51.65984\n", + " RADIANCE_MAXIMUM_BAND_5 = 382.81812\n", + " RADIANCE_MINIMUM_BAND_5 = -31.61325\n", + " RADIANCE_MAXIMUM_BAND_6 = 95.20338\n", + " RADIANCE_MINIMUM_BAND_6 = -7.86193\n", + " RADIANCE_MAXIMUM_BAND_7 = 32.08864\n", + " RADIANCE_MINIMUM_BAND_7 = -2.64989\n", + " RADIANCE_MAXIMUM_BAND_8 = 707.97394\n", + " RADIANCE_MINIMUM_BAND_8 = -58.46472\n", + " RADIANCE_MAXIMUM_BAND_9 = 149.61400\n", + " RADIANCE_MINIMUM_BAND_9 = -12.35517\n", + " RADIANCE_MAXIMUM_BAND_10 = 22.00180\n", + " RADIANCE_MINIMUM_BAND_10 = 0.10033\n", + " RADIANCE_MAXIMUM_BAND_11 = 22.00180\n", + " RADIANCE_MINIMUM_BAND_11 = 0.10033\n", + " END_GROUP = MIN_MAX_RADIANCE\n", + " GROUP = MIN_MAX_REFLECTANCE\n", + " REFLECTANCE_MAXIMUM_BAND_1 = 1.210700\n", + " REFLECTANCE_MINIMUM_BAND_1 = -0.099980\n", + " REFLECTANCE_MAXIMUM_BAND_2 = 1.210700\n", + " REFLECTANCE_MINIMUM_BAND_2 = -0.099980\n", + " REFLECTANCE_MAXIMUM_BAND_3 = 1.210700\n", + " REFLECTANCE_MINIMUM_BAND_3 = -0.099980\n", + " REFLECTANCE_MAXIMUM_BAND_4 = 1.210700\n", + " REFLECTANCE_MINIMUM_BAND_4 = -0.099980\n", + " REFLECTANCE_MAXIMUM_BAND_5 = 1.210700\n", + " REFLECTANCE_MINIMUM_BAND_5 = -0.099980\n", + " REFLECTANCE_MAXIMUM_BAND_6 = 1.210700\n", + " REFLECTANCE_MINIMUM_BAND_6 = -0.099980\n", + " REFLECTANCE_MAXIMUM_BAND_7 = 1.210700\n", + " REFLECTANCE_MINIMUM_BAND_7 = -0.099980\n", + " REFLECTANCE_MAXIMUM_BAND_8 = 1.210700\n", + " REFLECTANCE_MINIMUM_BAND_8 = -0.099980\n", + " REFLECTANCE_MAXIMUM_BAND_9 = 1.210700\n", + " REFLECTANCE_MINIMUM_BAND_9 = -0.099980\n", + " END_GROUP = MIN_MAX_REFLECTANCE\n", + " GROUP = MIN_MAX_PIXEL_VALUE\n", + " QUANTIZE_CAL_MAX_BAND_1 = 65535\n", + " QUANTIZE_CAL_MIN_BAND_1 = 1\n", + " QUANTIZE_CAL_MAX_BAND_2 = 65535\n", + " QUANTIZE_CAL_MIN_BAND_2 = 1\n", + " QUANTIZE_CAL_MAX_BAND_3 = 65535\n", + " QUANTIZE_CAL_MIN_BAND_3 = 1\n", + " QUANTIZE_CAL_MAX_BAND_4 = 65535\n", + " QUANTIZE_CAL_MIN_BAND_4 = 1\n", + " QUANTIZE_CAL_MAX_BAND_5 = 65535\n", + " QUANTIZE_CAL_MIN_BAND_5 = 1\n", + " QUANTIZE_CAL_MAX_BAND_6 = 65535\n", + " QUANTIZE_CAL_MIN_BAND_6 = 1\n", + " QUANTIZE_CAL_MAX_BAND_7 = 65535\n", + " QUANTIZE_CAL_MIN_BAND_7 = 1\n", + " QUANTIZE_CAL_MAX_BAND_8 = 65535\n", + " QUANTIZE_CAL_MIN_BAND_8 = 1\n", + " QUANTIZE_CAL_MAX_BAND_9 = 65535\n", + " QUANTIZE_CAL_MIN_BAND_9 = 1\n", + " QUANTIZE_CAL_MAX_BAND_10 = 65535\n", + " QUANTIZE_CAL_MIN_BAND_10 = 1\n", + " QUANTIZE_CAL_MAX_BAND_11 = 65535\n", + " QUANTIZE_CAL_MIN_BAND_11 = 1\n", + " END_GROUP = MIN_MAX_PIXEL_VALUE\n", + " GROUP = RADIOMETRIC_RESCALING\n", + " RADIANCE_MULT_BAND_1 = 1.2987E-02\n", + " RADIANCE_MULT_BAND_2 = 1.3299E-02\n", + " RADIANCE_MULT_BAND_3 = 1.2255E-02\n", + " RADIANCE_MULT_BAND_4 = 1.0334E-02\n", + " RADIANCE_MULT_BAND_5 = 6.3239E-03\n", + " RADIANCE_MULT_BAND_6 = 1.5727E-03\n", + " RADIANCE_MULT_BAND_7 = 5.3008E-04\n", + " RADIANCE_MULT_BAND_8 = 1.1695E-02\n", + " RADIANCE_MULT_BAND_9 = 2.4715E-03\n", + " RADIANCE_MULT_BAND_10 = 3.3420E-04\n", + " RADIANCE_MULT_BAND_11 = 3.3420E-04\n", + " RADIANCE_ADD_BAND_1 = -64.93575\n", + " RADIANCE_ADD_BAND_2 = -66.49500\n", + " RADIANCE_ADD_BAND_3 = -61.27457\n", + " RADIANCE_ADD_BAND_4 = -51.67017\n", + " RADIANCE_ADD_BAND_5 = -31.61957\n", + " RADIANCE_ADD_BAND_6 = -7.86350\n", + " RADIANCE_ADD_BAND_7 = -2.65042\n", + " RADIANCE_ADD_BAND_8 = -58.47642\n", + " RADIANCE_ADD_BAND_9 = -12.35764\n", + " RADIANCE_ADD_BAND_10 = 0.10000\n", + " RADIANCE_ADD_BAND_11 = 0.10000\n", + " REFLECTANCE_MULT_BAND_1 = 2.0000E-05\n", + " REFLECTANCE_MULT_BAND_2 = 2.0000E-05\n", + " REFLECTANCE_MULT_BAND_3 = 2.0000E-05\n", + " REFLECTANCE_MULT_BAND_4 = 2.0000E-05\n", + " REFLECTANCE_MULT_BAND_5 = 2.0000E-05\n", + " REFLECTANCE_MULT_BAND_6 = 2.0000E-05\n", + " REFLECTANCE_MULT_BAND_7 = 2.0000E-05\n", + " REFLECTANCE_MULT_BAND_8 = 2.0000E-05\n", + " REFLECTANCE_MULT_BAND_9 = 2.0000E-05\n", + " REFLECTANCE_ADD_BAND_1 = -0.100000\n", + " REFLECTANCE_ADD_BAND_2 = -0.100000\n", + " REFLECTANCE_ADD_BAND_3 = -0.100000\n", + " REFLECTANCE_ADD_BAND_4 = -0.100000\n", + " REFLECTANCE_ADD_BAND_5 = -0.100000\n", + " REFLECTANCE_ADD_BAND_6 = -0.100000\n", + " REFLECTANCE_ADD_BAND_7 = -0.100000\n", + " REFLECTANCE_ADD_BAND_8 = -0.100000\n", + " REFLECTANCE_ADD_BAND_9 = -0.100000\n", + " END_GROUP = RADIOMETRIC_RESCALING\n", + " GROUP = TIRS_THERMAL_CONSTANTS\n", + " K1_CONSTANT_BAND_10 = 774.8853\n", + " K2_CONSTANT_BAND_10 = 1321.0789\n", + " K1_CONSTANT_BAND_11 = 480.8883\n", + " K2_CONSTANT_BAND_11 = 1201.1442\n", + " END_GROUP = TIRS_THERMAL_CONSTANTS\n", + " GROUP = PROJECTION_PARAMETERS\n", + " MAP_PROJECTION = \"UTM\"\n", + " DATUM = \"WGS84\"\n", + " ELLIPSOID = \"WGS84\"\n", + " UTM_ZONE = 18\n", + " GRID_CELL_SIZE_PANCHROMATIC = 15.00\n", + " GRID_CELL_SIZE_REFLECTIVE = 30.00\n", + " GRID_CELL_SIZE_THERMAL = 30.00\n", + " ORIENTATION = \"NORTH_UP\"\n", + " RESAMPLING_OPTION = \"CUBIC_CONVOLUTION\"\n", + " END_GROUP = PROJECTION_PARAMETERS\n", + "END_GROUP = L1_METADATA_FILE\n", + "END\n", + "\n" + ] + } + ], + "source": [ + "mtl_url = get_asset_url(scene, 'MTL.txt')\n", + "print(pystac.STAC_IO.read_text(mtl_url))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The MTL file contains metadata in a text format that's a bit hard to use as-is; we can parse things out to a `dict` for easier access:" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [], + "source": [ + "def get_metadata(url):\n", + " \"\"\" Convert Landsat MTL file to dictionary of metadata values \"\"\"\n", + " mtl = {}\n", + " mtl_text = pystac.STAC_IO.read_text(url)\n", + " for line in mtl_text.split('\\n'):\n", + " meta = line.replace('\\\"', \"\").strip().split('=')\n", + " if len(meta) > 1:\n", + " key = meta[0].strip()\n", + " item = meta[1].strip()\n", + " if key != \"GROUP\" and key != \"END_GROUP\":\n", + " mtl[key] = item\n", + " return mtl" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'ORIGIN': 'Image courtesy of the U.S. Geological Survey',\n", + " 'REQUEST_ID': '0502001033167_00013',\n", + " 'LANDSAT_SCENE_ID': 'LC80150322020003LGN00',\n", + " 'LANDSAT_PRODUCT_ID': 'LC08_L1GT_015032_20200103_20200103_01_RT',\n", + " 'COLLECTION_NUMBER': '01',\n", + " 'FILE_DATE': '2020-01-03T21:39:14Z',\n", + " 'STATION_ID': 'LGN',\n", + " 'PROCESSING_SOFTWARE_VERSION': 'LPGS_13.1.0',\n", + " 'DATA_TYPE': 'L1GT',\n", + " 'COLLECTION_CATEGORY': 'RT',\n", + " 'ELEVATION_SOURCE': 'GLS2000',\n", + " 'OUTPUT_FORMAT': 'GEOTIFF',\n", + " 'SPACECRAFT_ID': 'LANDSAT_8',\n", + " 'SENSOR_ID': 'OLI_TIRS',\n", + " 'WRS_PATH': '15',\n", + " 'WRS_ROW': '32',\n", + " 'NADIR_OFFNADIR': 'NADIR',\n", + " 'TARGET_WRS_PATH': '15',\n", + " 'TARGET_WRS_ROW': '32',\n", + " 'DATE_ACQUIRED': '2020-01-03',\n", + " 'SCENE_CENTER_TIME': '15:46:15.6252350Z',\n", + " 'CORNER_UL_LAT_PRODUCT': '41.37536',\n", + " 'CORNER_UL_LON_PRODUCT': '-77.83282',\n", + " 'CORNER_UR_LAT_PRODUCT': '41.41024',\n", + " 'CORNER_UR_LON_PRODUCT': '-75.02752',\n", + " 'CORNER_LL_LAT_PRODUCT': '39.23189',\n", + " 'CORNER_LL_LON_PRODUCT': '-77.74460',\n", + " 'CORNER_LR_LAT_PRODUCT': '39.26423',\n", + " 'CORNER_LR_LON_PRODUCT': '-75.02666',\n", + " 'CORNER_UL_PROJECTION_X_PRODUCT': '263100.000',\n", + " 'CORNER_UL_PROJECTION_Y_PRODUCT': '4584300.000',\n", + " 'CORNER_UR_PROJECTION_X_PRODUCT': '497700.000',\n", + " 'CORNER_UR_PROJECTION_Y_PRODUCT': '4584300.000',\n", + " 'CORNER_LL_PROJECTION_X_PRODUCT': '263100.000',\n", + " 'CORNER_LL_PROJECTION_Y_PRODUCT': '4346100.000',\n", + " 'CORNER_LR_PROJECTION_X_PRODUCT': '497700.000',\n", + " 'CORNER_LR_PROJECTION_Y_PRODUCT': '4346100.000',\n", + " 'PANCHROMATIC_LINES': '15881',\n", + " 'PANCHROMATIC_SAMPLES': '15641',\n", + " 'REFLECTIVE_LINES': '7941',\n", + " 'REFLECTIVE_SAMPLES': '7821',\n", + " 'THERMAL_LINES': '7941',\n", + " 'THERMAL_SAMPLES': '7821',\n", + " 'FILE_NAME_BAND_1': 'LC08_L1GT_015032_20200103_20200103_01_RT_B1.TIF',\n", + " 'FILE_NAME_BAND_2': 'LC08_L1GT_015032_20200103_20200103_01_RT_B2.TIF',\n", + " 'FILE_NAME_BAND_3': 'LC08_L1GT_015032_20200103_20200103_01_RT_B3.TIF',\n", + " 'FILE_NAME_BAND_4': 'LC08_L1GT_015032_20200103_20200103_01_RT_B4.TIF',\n", + " 'FILE_NAME_BAND_5': 'LC08_L1GT_015032_20200103_20200103_01_RT_B5.TIF',\n", + " 'FILE_NAME_BAND_6': 'LC08_L1GT_015032_20200103_20200103_01_RT_B6.TIF',\n", + " 'FILE_NAME_BAND_7': 'LC08_L1GT_015032_20200103_20200103_01_RT_B7.TIF',\n", + " 'FILE_NAME_BAND_8': 'LC08_L1GT_015032_20200103_20200103_01_RT_B8.TIF',\n", + " 'FILE_NAME_BAND_9': 'LC08_L1GT_015032_20200103_20200103_01_RT_B9.TIF',\n", + " 'FILE_NAME_BAND_10': 'LC08_L1GT_015032_20200103_20200103_01_RT_B10.TIF',\n", + " 'FILE_NAME_BAND_11': 'LC08_L1GT_015032_20200103_20200103_01_RT_B11.TIF',\n", + " 'FILE_NAME_BAND_QUALITY': 'LC08_L1GT_015032_20200103_20200103_01_RT_BQA.TIF',\n", + " 'ANGLE_COEFFICIENT_FILE_NAME': 'LC08_L1GT_015032_20200103_20200103_01_RT_ANG.txt',\n", + " 'METADATA_FILE_NAME': 'LC08_L1GT_015032_20200103_20200103_01_RT_MTL.txt',\n", + " 'CPF_NAME': 'LC08CPF_20200101_20200331_01.02',\n", + " 'BPF_NAME_OLI': 'LO8BPF20200103153915_20200103162306.02',\n", + " 'BPF_NAME_TIRS': 'LT8BPF20200101032539_20200101033839.01',\n", + " 'RLUT_FILE_NAME': 'LC08RLUT_20150303_20431231_01_12.h5',\n", + " 'CLOUD_COVER': '100.00',\n", + " 'CLOUD_COVER_LAND': '100.00',\n", + " 'IMAGE_QUALITY_OLI': '9',\n", + " 'IMAGE_QUALITY_TIRS': '7',\n", + " 'TIRS_SSM_MODEL': 'PRELIMINARY',\n", + " 'TIRS_SSM_POSITION_STATUS': 'ESTIMATED',\n", + " 'TIRS_STRAY_LIGHT_CORRECTION_SOURCE': 'TIRS',\n", + " 'ROLL_ANGLE': '-0.001',\n", + " 'SUN_AZIMUTH': '158.89720783',\n", + " 'SUN_ELEVATION': '23.89789093',\n", + " 'EARTH_SUN_DISTANCE': '0.9832512',\n", + " 'SATURATION_BAND_1': 'N',\n", + " 'SATURATION_BAND_2': 'N',\n", + " 'SATURATION_BAND_3': 'N',\n", + " 'SATURATION_BAND_4': 'N',\n", + " 'SATURATION_BAND_5': 'N',\n", + " 'SATURATION_BAND_6': 'N',\n", + " 'SATURATION_BAND_7': 'N',\n", + " 'SATURATION_BAND_8': 'N',\n", + " 'SATURATION_BAND_9': 'N',\n", + " 'TRUNCATION_OLI': 'UPPER',\n", + " 'RADIANCE_MAXIMUM_BAND_1': '786.17712',\n", + " 'RADIANCE_MINIMUM_BAND_1': '-64.92276',\n", + " 'RADIANCE_MAXIMUM_BAND_2': '805.05499',\n", + " 'RADIANCE_MINIMUM_BAND_2': '-66.48170',\n", + " 'RADIANCE_MAXIMUM_BAND_3': '741.85126',\n", + " 'RADIANCE_MINIMUM_BAND_3': '-61.26232',\n", + " 'RADIANCE_MAXIMUM_BAND_4': '625.57080',\n", + " 'RADIANCE_MINIMUM_BAND_4': '-51.65984',\n", + " 'RADIANCE_MAXIMUM_BAND_5': '382.81812',\n", + " 'RADIANCE_MINIMUM_BAND_5': '-31.61325',\n", + " 'RADIANCE_MAXIMUM_BAND_6': '95.20338',\n", + " 'RADIANCE_MINIMUM_BAND_6': '-7.86193',\n", + " 'RADIANCE_MAXIMUM_BAND_7': '32.08864',\n", + " 'RADIANCE_MINIMUM_BAND_7': '-2.64989',\n", + " 'RADIANCE_MAXIMUM_BAND_8': '707.97394',\n", + " 'RADIANCE_MINIMUM_BAND_8': '-58.46472',\n", + " 'RADIANCE_MAXIMUM_BAND_9': '149.61400',\n", + " 'RADIANCE_MINIMUM_BAND_9': '-12.35517',\n", + " 'RADIANCE_MAXIMUM_BAND_10': '22.00180',\n", + " 'RADIANCE_MINIMUM_BAND_10': '0.10033',\n", + " 'RADIANCE_MAXIMUM_BAND_11': '22.00180',\n", + " 'RADIANCE_MINIMUM_BAND_11': '0.10033',\n", + " 'REFLECTANCE_MAXIMUM_BAND_1': '1.210700',\n", + " 'REFLECTANCE_MINIMUM_BAND_1': '-0.099980',\n", + " 'REFLECTANCE_MAXIMUM_BAND_2': '1.210700',\n", + " 'REFLECTANCE_MINIMUM_BAND_2': '-0.099980',\n", + " 'REFLECTANCE_MAXIMUM_BAND_3': '1.210700',\n", + " 'REFLECTANCE_MINIMUM_BAND_3': '-0.099980',\n", + " 'REFLECTANCE_MAXIMUM_BAND_4': '1.210700',\n", + " 'REFLECTANCE_MINIMUM_BAND_4': '-0.099980',\n", + " 'REFLECTANCE_MAXIMUM_BAND_5': '1.210700',\n", + " 'REFLECTANCE_MINIMUM_BAND_5': '-0.099980',\n", + " 'REFLECTANCE_MAXIMUM_BAND_6': '1.210700',\n", + " 'REFLECTANCE_MINIMUM_BAND_6': '-0.099980',\n", + " 'REFLECTANCE_MAXIMUM_BAND_7': '1.210700',\n", + " 'REFLECTANCE_MINIMUM_BAND_7': '-0.099980',\n", + " 'REFLECTANCE_MAXIMUM_BAND_8': '1.210700',\n", + " 'REFLECTANCE_MINIMUM_BAND_8': '-0.099980',\n", + " 'REFLECTANCE_MAXIMUM_BAND_9': '1.210700',\n", + " 'REFLECTANCE_MINIMUM_BAND_9': '-0.099980',\n", + " 'QUANTIZE_CAL_MAX_BAND_1': '65535',\n", + " 'QUANTIZE_CAL_MIN_BAND_1': '1',\n", + " 'QUANTIZE_CAL_MAX_BAND_2': '65535',\n", + " 'QUANTIZE_CAL_MIN_BAND_2': '1',\n", + " 'QUANTIZE_CAL_MAX_BAND_3': '65535',\n", + " 'QUANTIZE_CAL_MIN_BAND_3': '1',\n", + " 'QUANTIZE_CAL_MAX_BAND_4': '65535',\n", + " 'QUANTIZE_CAL_MIN_BAND_4': '1',\n", + " 'QUANTIZE_CAL_MAX_BAND_5': '65535',\n", + " 'QUANTIZE_CAL_MIN_BAND_5': '1',\n", + " 'QUANTIZE_CAL_MAX_BAND_6': '65535',\n", + " 'QUANTIZE_CAL_MIN_BAND_6': '1',\n", + " 'QUANTIZE_CAL_MAX_BAND_7': '65535',\n", + " 'QUANTIZE_CAL_MIN_BAND_7': '1',\n", + " 'QUANTIZE_CAL_MAX_BAND_8': '65535',\n", + " 'QUANTIZE_CAL_MIN_BAND_8': '1',\n", + " 'QUANTIZE_CAL_MAX_BAND_9': '65535',\n", + " 'QUANTIZE_CAL_MIN_BAND_9': '1',\n", + " 'QUANTIZE_CAL_MAX_BAND_10': '65535',\n", + " 'QUANTIZE_CAL_MIN_BAND_10': '1',\n", + " 'QUANTIZE_CAL_MAX_BAND_11': '65535',\n", + " 'QUANTIZE_CAL_MIN_BAND_11': '1',\n", + " 'RADIANCE_MULT_BAND_1': '1.2987E-02',\n", + " 'RADIANCE_MULT_BAND_2': '1.3299E-02',\n", + " 'RADIANCE_MULT_BAND_3': '1.2255E-02',\n", + " 'RADIANCE_MULT_BAND_4': '1.0334E-02',\n", + " 'RADIANCE_MULT_BAND_5': '6.3239E-03',\n", + " 'RADIANCE_MULT_BAND_6': '1.5727E-03',\n", + " 'RADIANCE_MULT_BAND_7': '5.3008E-04',\n", + " 'RADIANCE_MULT_BAND_8': '1.1695E-02',\n", + " 'RADIANCE_MULT_BAND_9': '2.4715E-03',\n", + " 'RADIANCE_MULT_BAND_10': '3.3420E-04',\n", + " 'RADIANCE_MULT_BAND_11': '3.3420E-04',\n", + " 'RADIANCE_ADD_BAND_1': '-64.93575',\n", + " 'RADIANCE_ADD_BAND_2': '-66.49500',\n", + " 'RADIANCE_ADD_BAND_3': '-61.27457',\n", + " 'RADIANCE_ADD_BAND_4': '-51.67017',\n", + " 'RADIANCE_ADD_BAND_5': '-31.61957',\n", + " 'RADIANCE_ADD_BAND_6': '-7.86350',\n", + " 'RADIANCE_ADD_BAND_7': '-2.65042',\n", + " 'RADIANCE_ADD_BAND_8': '-58.47642',\n", + " 'RADIANCE_ADD_BAND_9': '-12.35764',\n", + " 'RADIANCE_ADD_BAND_10': '0.10000',\n", + " 'RADIANCE_ADD_BAND_11': '0.10000',\n", + " 'REFLECTANCE_MULT_BAND_1': '2.0000E-05',\n", + " 'REFLECTANCE_MULT_BAND_2': '2.0000E-05',\n", + " 'REFLECTANCE_MULT_BAND_3': '2.0000E-05',\n", + " 'REFLECTANCE_MULT_BAND_4': '2.0000E-05',\n", + " 'REFLECTANCE_MULT_BAND_5': '2.0000E-05',\n", + " 'REFLECTANCE_MULT_BAND_6': '2.0000E-05',\n", + " 'REFLECTANCE_MULT_BAND_7': '2.0000E-05',\n", + " 'REFLECTANCE_MULT_BAND_8': '2.0000E-05',\n", + " 'REFLECTANCE_MULT_BAND_9': '2.0000E-05',\n", + " 'REFLECTANCE_ADD_BAND_1': '-0.100000',\n", + " 'REFLECTANCE_ADD_BAND_2': '-0.100000',\n", + " 'REFLECTANCE_ADD_BAND_3': '-0.100000',\n", + " 'REFLECTANCE_ADD_BAND_4': '-0.100000',\n", + " 'REFLECTANCE_ADD_BAND_5': '-0.100000',\n", + " 'REFLECTANCE_ADD_BAND_6': '-0.100000',\n", + " 'REFLECTANCE_ADD_BAND_7': '-0.100000',\n", + " 'REFLECTANCE_ADD_BAND_8': '-0.100000',\n", + " 'REFLECTANCE_ADD_BAND_9': '-0.100000',\n", + " 'K1_CONSTANT_BAND_10': '774.8853',\n", + " 'K2_CONSTANT_BAND_10': '1321.0789',\n", + " 'K1_CONSTANT_BAND_11': '480.8883',\n", + " 'K2_CONSTANT_BAND_11': '1201.1442',\n", + " 'MAP_PROJECTION': 'UTM',\n", + " 'DATUM': 'WGS84',\n", + " 'ELLIPSOID': 'WGS84',\n", + " 'UTM_ZONE': '18',\n", + " 'GRID_CELL_SIZE_PANCHROMATIC': '15.00',\n", + " 'GRID_CELL_SIZE_REFLECTIVE': '30.00',\n", + " 'GRID_CELL_SIZE_THERMAL': '30.00',\n", + " 'ORIENTATION': 'NORTH_UP',\n", + " 'RESAMPLING_OPTION': 'CUBIC_CONVOLUTION'}" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "metadata = get_metadata(mtl_url)\n", + "metadata" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Create a STAC Item from a scene\n", + "\n", + "Now that we have metadata for the scene let's use it to create a [STAC Item](https://github.com/radiantearth/stac-spec/blob/v1.0.0-beta.2/item-spec/item-spec.md).\n", + "\n", + "We can use the `help` method to see the signature of the `__init__` method on `pystac.Item`. You can also call `help` directly on `pystac.Item` for broader documentation, or check the [API docs for Item here](https://pystac.readthedocs.io/en/0.5/api.html#item)." + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Help on function __init__ in module pystac.item:\n", + "\n", + "__init__(self, id, geometry, bbox, datetime, properties, stac_extensions=None, href=None, collection=None, extra_fields=None)\n", + " Initialize self. See help(type(self)) for accurate signature.\n", + "\n" + ] + } + ], + "source": [ + "help(pystac.Item.__init__)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can see we'll need at least an `id`, `geometry`, `bbox`, and `datetime`. Properties is required, but can be an empty dictionary that we fill out on the Item once it's created." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Item `id`\n", + "\n", + "For the Item's `id`, we'll use the scene ID. We'll chop off the last 5 characters as they are repeated for each ID and so aren't necessary: " + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [], + "source": [ + "def get_item_id(metadata):\n", + " return metadata['LANDSAT_SCENE_ID'][:-5]" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'LC80150322020003'" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "item_id = get_item_id(metadata)\n", + "item_id" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Item `datetime`\n", + "\n", + "Here we parse the datetime of the Item from two metadata fields that describe the date and time the scene was captured:" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [], + "source": [ + "from dateutil.parser import parse\n", + "\n", + "def get_datetime(metadata):\n", + " return parse('%sT%s' % (metadata['DATE_ACQUIRED'], metadata['SCENE_CENTER_TIME']))" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "datetime.datetime(2020, 1, 3, 15, 46, 15, 625235, tzinfo=tzutc())" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "item_datetime = get_datetime(metadata)\n", + "item_datetime" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Item `bbox`\n", + "\n", + "Here we read in the bounding box information from the scene and transform it into the format of the Item's `bbox` property:" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [], + "source": [ + "def get_bbox(metadata):\n", + " coords = [[\n", + " [float(metadata['CORNER_UL_LON_PRODUCT']), float(metadata['CORNER_UL_LAT_PRODUCT'])],\n", + " [float(metadata['CORNER_UR_LON_PRODUCT']), float(metadata['CORNER_UR_LAT_PRODUCT'])],\n", + " [float(metadata['CORNER_LR_LON_PRODUCT']), float(metadata['CORNER_LR_LAT_PRODUCT'])],\n", + " [float(metadata['CORNER_LL_LON_PRODUCT']), float(metadata['CORNER_LL_LAT_PRODUCT'])],\n", + " [float(metadata['CORNER_UL_LON_PRODUCT']), float(metadata['CORNER_UL_LAT_PRODUCT'])]\n", + " ]]\n", + " lats = [c[1] for c in coords[0]]\n", + " lons = [c[0] for c in coords[0]]\n", + " return [min(lons), min(lats), max(lons), max(lats)]" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[-77.83282, 39.23189, -75.02666, 41.41024]" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "item_bbox = get_bbox(metadata)\n", + "item_bbox" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Item `geometry`\n", + "\n", + "Getting the geometry of the scene is a little more tricky. The bounding box will be a axis-aligned rectangle of the area the scene occupies, but will not represent the true footprint of the image - Landsat 8 scenes are \"tilted\" according the the coordinate reference system, so there will be areas in the corner where no image data exists. When constructing a STAC Item it's best if you have the Item geometry represent the true footprint of the assets.\n", + "\n", + "To get the footprint of the scene we'll read in another metadata file that lives alongside the MTL - the `ANG.txt` file. This function uses the ANG file and the bbox to construct the GeoJSON polygon that represents the footprint of the scene:" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [], + "source": [ + "def get_geometry(scene, bbox):\n", + " url = get_asset_url(scene, 'ANG.txt')\n", + " sz = []\n", + " coords = []\n", + " ang_text = pystac.STAC_IO.read_text(url)\n", + " for line in ang_text.split('\\n'):\n", + " if 'BAND01_NUM_L1T_LINES' in line or 'BAND01_NUM_L1T_SAMPS' in line:\n", + " sz.append(float(line.split('=')[1]))\n", + " if 'BAND01_L1T_IMAGE_CORNER_LINES' in line or 'BAND01_L1T_IMAGE_CORNER_SAMPS' in line:\n", + " coords.append([float(l) for l in line.split('=')[1].strip().strip('()').split(',')])\n", + " if len(coords) == 2:\n", + " break\n", + " dlon = bbox[2] - bbox[0]\n", + " dlat = bbox[3] - bbox[1]\n", + " lons = [c/sz[1] * dlon + bbox[0] for c in coords[1]]\n", + " lats = [((sz[0] - c)/sz[0]) * dlat + bbox[1] for c in coords[0]]\n", + " coordinates = [[\n", + " [lons[0], lats[0]], [lons[1], lats[1]], [lons[2], lats[2]], [lons[3], lats[3]], [lons[0], lats[0]]\n", + " ]]\n", + " \n", + " return {'type': 'Polygon', 'coordinates': coordinates}" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'type': 'Polygon',\n", + " 'coordinates': [[[-77.24189342621209, 41.40875922921183],\n", + " [-75.02888361862159, 40.97145077644256],\n", + " [-75.62135920205826, 39.23299959871144],\n", + " [-77.83174166996123, 39.67752690559828],\n", + " [-77.24189342621209, 41.40875922921183]]]}" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "item_geometry = get_geometry(scene, item_bbox)\n", + "item_geometry" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This would be a good time to check our work - we can print out the GeoJSON and use [geojson.io](http://geojson.io/) to check and make sure we're using scenes that overlap our location. If this footprint is somewhere unexpected in the world, make sure the Lat/Long coordinates are correct and in the right order!" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"type\": \"Polygon\",\n", + " \"coordinates\": [\n", + " [\n", + " [\n", + " -77.24189342621209,\n", + " 41.40875922921183\n", + " ],\n", + " [\n", + " -75.02888361862159,\n", + " 40.97145077644256\n", + " ],\n", + " [\n", + " -75.62135920205826,\n", + " 39.23299959871144\n", + " ],\n", + " [\n", + " -77.83174166996123,\n", + " 39.67752690559828\n", + " ],\n", + " [\n", + " -77.24189342621209,\n", + " 41.40875922921183\n", + " ]\n", + " ]\n", + " ]\n", + "}\n" + ] + } + ], + "source": [ + "import json\n", + "\n", + "print(json.dumps(item_geometry, indent=2))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Create the item\n", + "\n", + "Now that we have the required attributes for an Item we can create it:" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [], + "source": [ + "item = pystac.Item(id=item_id, \n", + " datetime=item_datetime,\n", + " geometry=item_geometry,\n", + " bbox=item_bbox,\n", + " properties={})" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "PySTAC has a `validate` method on STAC objects, which you can use to make sure you're constructing things correctly. If there's an issue the following line will throw an exception:" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [], + "source": [ + "item.validate()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Add Ground Sample Distance to common metadata\n", + "\n", + "We'll add the Ground Sample Distance that is defined as part of the Item [Common Metadata](https://github.com/radiantearth/stac-spec/blob/v1.0.0-beta.2/item-spec/common-metadata.md). We define this on the Item leve as 30 meters, which is the GSD for most of the bands of Landsat 8. However, there are some bands that have a different resolution; we will account for this by setting the GSD explicitly for each of those bands below." + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": {}, + "outputs": [], + "source": [ + "item.common_metadata.gsd = 30.0" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Adding the EO extension\n", + "\n", + "STAC has a rich [set of extensions](https://github.com/radiantearth/stac-spec/tree/v1.0.0-beta.2/extensions) that allow STAC objects to encode information that is not part of the core spec but is used widely and standardized. An example of this is the [eo extension](https://github.com/radiantearth/stac-spec/tree/v1.0.0-beta.2/extensions/eo), which encapsulates data that that represents a snapshot of the earth for a single date and time.\n", + "\n", + "We can enable the `eo` extension for this item by using the `ext` property that exists on all STAC objects:" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [], + "source": [ + "item.ext.enable('eo')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Add cloud cover\n", + "\n", + "Here we add cloud cover from the metadata as part of the `eo` extension." + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [], + "source": [ + "def get_cloud_cover(metadata):\n", + " return float(metadata['CLOUD_COVER'])" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "100.0" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "item.ext.eo.cloud_cover = get_cloud_cover(metadata)\n", + "item.ext.eo.cloud_cover" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Adding assets\n", + "\n", + "STAC Items contain a list of [Assets](https://github.com/radiantearth/stac-spec/blob/v1.0.0-beta.2/item-spec/item-spec.md#asset-object), which are a list of files that relate to the Item. In our case we'll be cataloging each file related to the scene, including the Landsat 8 band files as well as the metadata files associated with the scene.\n", + "\n", + "Here we define a dictionary that describes the band assets. We use the `eo` extension's `Band` class to encapsulate information about the band each file represents, and also specify the Ground Sample Distance of each band:" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": {}, + "outputs": [], + "source": [ + "from pystac.extensions.eo import Band\n", + "\n", + "landsat_band_info = {\n", + " 'B1': {\n", + " 'band': Band.create(name=\"B1\", common_name=\"coastal\", center_wavelength=0.48, full_width_half_max=0.02),\n", + " 'gsd': 30.0\n", + " },\n", + " 'B2': {\n", + " 'band': Band.create(name=\"B2\", common_name=\"blue\", center_wavelength=0.44, full_width_half_max=0.06),\n", + " 'gsd': 30.0\n", + " },\n", + " 'B3': {\n", + " 'band': Band.create(name=\"B3\", common_name=\"green\", center_wavelength=0.56, full_width_half_max=0.06),\n", + " 'gsd': 30.0\n", + " },\n", + " 'B4': {\n", + " 'band': Band.create(name=\"B4\", common_name=\"red\", center_wavelength=0.65, full_width_half_max=0.04),\n", + " 'gsd': 30.0\n", + " },\n", + " 'B5': {\n", + " 'band': Band.create(name=\"B5\", common_name=\"nir\", center_wavelength=0.86, full_width_half_max=0.03),\n", + " 'gsd': 30.0\n", + " },\n", + " 'B6': {\n", + " 'band': Band.create(name=\"B6\", common_name=\"swir16\", center_wavelength=1.6, full_width_half_max=0.08),\n", + " 'gsd': 30.0\n", + " },\n", + " 'B7': {\n", + " 'band': Band.create(name=\"B7\", common_name=\"swir22\", center_wavelength=2.2, full_width_half_max=0.2),\n", + " 'gsd': 30.0\n", + " },\n", + " 'B8': {\n", + " 'band': Band.create(name=\"B8\", common_name=\"pan\", center_wavelength=0.59, full_width_half_max=0.18),\n", + " 'gsd': 15.0\n", + " },\n", + " 'B9': {\n", + " 'band': Band.create(name=\"B9\", common_name=\"cirrus\", center_wavelength=1.37, full_width_half_max=0.02),\n", + " 'gsd': 30.0\n", + " },\n", + " 'B10': {\n", + " 'band': Band.create(name=\"B10\", common_name=\"lwir11\", center_wavelength=10.9, full_width_half_max=0.8),\n", + " 'gsd': 100.0\n", + " },\n", + " 'B11': {\n", + " 'band': Band.create(name=\"B11\", common_name=\"lwir12\", center_wavelength=12.0, full_width_half_max=1.0),\n", + " 'gsd': 100.0\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There are also other non-band assets associated with a scene, and we specify the Asset's URL and media type here, along with the key we will refer to each asset by:" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [], + "source": [ + "def get_other_assets(scene):\n", + " return {\n", + " 'thumbnail': {\n", + " 'href': get_asset_url(scene, 'thumb_large.jpg'),\n", + " 'media_type': pystac.MediaType.JPEG\n", + " },\n", + " 'index': {\n", + " 'href': get_asset_url(scene, 'index.html'),\n", + " 'media_type': 'application/html'\n", + " },\n", + " 'ANG': {\n", + " 'href': get_asset_url(scene, 'ANG.txt'),\n", + " 'media_type': 'text/plain'\n", + " },\n", + " 'MTL': {\n", + " 'href': get_asset_url(scene, 'MTL.txt'),\n", + " 'media_type': 'text/plain'\n", + " },\n", + " 'BQA': {\n", + " 'href': get_asset_url(scene, 'BQA.TIF'),\n", + " 'media_type': pystac.MediaType.GEOTIFF\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With this information we can now define a method that adds all the relevant assets for a scene and add them to an item:" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": {}, + "outputs": [], + "source": [ + "def add_assets(item, scene):\n", + " # Add bands\n", + " for band_id, band_info in landsat_band_info.items():\n", + " band_url = get_asset_url(scene, '{}.TIF'.format(band_id))\n", + " asset = pystac.Asset(href=band_url, media_type=pystac.MediaType.COG)\n", + " bands = [band_info['band']]\n", + " item.ext.eo.set_bands(bands, asset)\n", + " item.add_asset(band_id, asset)\n", + " \n", + " # If this asset has a different GSD than the item, set it on the asset\n", + " if band_info['gsd'] != item.common_metadata.gsd:\n", + " item.common_metadata.set_gsd(band_info['gsd'], asset)\n", + " \n", + " # Add other assets\n", + " for asset_id, asset_info in get_other_assets(scene).items():\n", + " item.add_asset(asset_id, \n", + " pystac.Asset(href=asset_info['href'], media_type=asset_info['media_type']))\n", + " " + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": {}, + "outputs": [], + "source": [ + "add_assets(item, scene)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The logic for the Assets is such that if the `gsd` of the Asset is different from the Item's GSD (30 meters), the Asset's GSD will be specified directly on the Asset. We can see this by comparing the `dict` encoding of the Assets for band 10 and band 3:" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'href': 'https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/015/032/LC08_L1GT_015032_20200103_20200103_01_RT/LC08_L1GT_015032_20200103_20200103_01_RT_B10.TIF',\n", + " 'type': 'image/tiff; application=geotiff; profile=cloud-optimized',\n", + " 'eo:bands': [{'name': 'B10',\n", + " 'common_name': 'lwir11',\n", + " 'center_wavelength': 10.9,\n", + " 'full_width_half_max': 0.8}],\n", + " 'gsd': 100.0}" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "item.assets['B10'].to_dict()" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'href': 'https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/015/032/LC08_L1GT_015032_20200103_20200103_01_RT/LC08_L1GT_015032_20200103_20200103_01_RT_B3.TIF',\n", + " 'type': 'image/tiff; application=geotiff; profile=cloud-optimized',\n", + " 'eo:bands': [{'name': 'B3',\n", + " 'common_name': 'green',\n", + " 'center_wavelength': 0.56,\n", + " 'full_width_half_max': 0.06}]}" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "item.assets['B3'].to_dict()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here we see the tumbnail asset, which does not include the band information for the `eo` extension as it does not represent any of the Item's bands:" + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'href': 'https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/015/032/LC08_L1GT_015032_20200103_20200103_01_RT/LC08_L1GT_015032_20200103_20200103_01_RT_thumb_large.jpg',\n", + " 'type': 'image/jpeg'}" + ] + }, + "execution_count": 35, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "item.assets['thumbnail'].to_dict()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Add projection information\n", + "\n", + "We can specify the EPSG code for the scene as part of the [projection extension](https://github.com/radiantearth/stac-spec/tree/v1.0.0-beta.2/extensions/projection). The below method figures out the correct UTM Zone EPSG code based on the center latitude of the scene:" + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": {}, + "outputs": [], + "source": [ + "def get_epsg(metadata, min_lat, max_lat):\n", + " if 'UTM_ZONE' in metadata:\n", + " center_lat = (min_lat + max_lat)/2.0\n", + " return int(('326' if center_lat > 0 else '327') + metadata['UTM_ZONE'])\n", + " else:\n", + " return None" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "32618" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "item.ext.enable('projection')\n", + "item.ext.projection.epsg = get_epsg(metadata, item.bbox[1], item.bbox[3]) \n", + "item.ext.projection.epsg" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Add view geometry information\n", + "\n", + "The [View Geometry](https://github.com/radiantearth/stac-spec/tree/v1.0.0-beta.2/extensions/view) extension specifies information related to angles of sensors and other radiance angles that affect the view of resulting data. The Landsat 8 metadata specifies two of these parameters, so we add them to our Item:" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": {}, + "outputs": [], + "source": [ + "def get_view_info(metadata):\n", + " return { 'sun_azimuth': float(metadata['SUN_AZIMUTH']),\n", + " 'sun_elevation': float(metadata['SUN_ELEVATION']) }" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'datetime': '2020-01-03T15:46:15.625235Z',\n", + " 'gsd': 30.0,\n", + " 'eo:cloud_cover': 100.0,\n", + " 'proj:epsg': 32618,\n", + " 'view:sun_azimuth': 158.89720783,\n", + " 'view:sun_elevation': 23.89789093}" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "item.ext.enable('view')\n", + "view_info = get_view_info(metadata)\n", + "item.ext.view.sun_azimuth = view_info['sun_azimuth']\n", + "item.ext.view.sun_elevation = view_info['sun_elevation']\n", + "item.properties" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that we've added all the metadata to the item, let's check the validator to make sure we've specified everything correctly. The validation logic will take into account the new extensions that have been enabled and validate against the proper schemas for those extensions." + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": {}, + "outputs": [], + "source": [ + "item.validate()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Building the Collection\n", + "\n", + "Now that we know how to build an item for a scene, let's build the collection that will contain all the Items.\n", + "\n", + "If we look at the `__init__` method for `pystac.Collection`, we can see what properties are required:" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Help on function __init__ in module pystac.collection:\n", + "\n", + "__init__(self, id, description, extent, title=None, stac_extensions=None, href=None, extra_fields=None, license='proprietary', keywords=None, providers=None, properties=None, summaries=None)\n", + " Initialize self. See help(type(self)) for accurate signature.\n", + "\n" + ] + } + ], + "source": [ + "help(pystac.Collection.__init__)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Collection `id`\n", + "\n", + "We'll use the location name we defined above in the ID for our Collection:" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'Philly-landsat-8'" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "collection_id = '{}-landsat-8'.format(location_name)\n", + "collection_id" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Collection `title`\n", + "\n", + "Here we set a simple title for our collection." + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'2020 Landsat 8 images over Philly'" + ] + }, + "execution_count": 43, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "collection_title = '2020 Landsat 8 images over {}'.format(location_name)\n", + "collection_title" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Collection `description`\n", + "\n", + "Here we give a brief description of the Collection. If this were a real Collection that were being published, I'd recommend putting a much more detailed description to ensure anyone using your STAC knows what they are working with!\n", + "\n", + "Notice we are using [Markdown](https://www.markdownguide.org/) to write the description. The `description` field can be Markdown to help tools that render information about STAC to display the information in a more readable way." + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "### Philly Landsat 8\n", + "\n", + "A collection of Landsat 8 scenes around Philly in 2020.\n", + "\n" + ] + } + ], + "source": [ + "collection_description = '''### {} Landsat 8\n", + "\n", + "A collection of Landsat 8 scenes around {} in 2020.\n", + "'''.format(location_name, location_name)\n", + "print(collection_description)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Collection `extent`\n", + "\n", + "A Collection specifies the spatial and temporal extent of all the item it contains. Since Landsat 8 spans the globe, we'll simply put a global extent here. We'll also specify an open-ended time interval that starts with the first datetime for scenes hosted by AWS.\n", + "\n", + "Towards the end of the notebook, we'll use a method to easily scope this down to cover the times and space the Items occupy once we've added all the items." + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": {}, + "outputs": [], + "source": [ + "from datetime import datetime\n", + "\n", + "spatial_extent = pystac.SpatialExtent([[-180, -90, 180, 90]])\n", + "temporal_extent = pystac.TemporalExtent([[datetime(2013, 6, 1), None]])\n", + "collection_extent = pystac.Extent(spatial_extent, temporal_extent)" + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": {}, + "outputs": [], + "source": [ + "collection = pystac.Collection(id=collection_id,\n", + " title=collection_title,\n", + " description=collection_description,\n", + " extent=collection_extent)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can now look at our Collection as a `dict` to check our values." + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'id': 'Philly-landsat-8',\n", + " 'stac_version': '1.0.0-beta.2',\n", + " 'description': '### Philly Landsat 8\\n\\nA collection of Landsat 8 scenes around Philly in 2020.\\n',\n", + " 'links': [{'rel': 'root', 'href': None, 'type': 'application/json'}],\n", + " 'title': '2020 Landsat 8 images over Philly',\n", + " 'extent': {'spatial': {'bbox': [[-180, -90, 180, 90]]},\n", + " 'temporal': {'interval': [['2013-06-01T00:00:00Z', None]]}},\n", + " 'license': 'proprietary'}" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "collection.to_dict()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Set the license\n", + "\n", + "Notice the `license` above is `proprietary`. This is the default in PySTAC if no license is specified; however Landsat 8 is certainly not proprietary (thankfully!), so let's change the license to the correct [SPDX](https://spdx.org/licenses/) string for public domain data:" + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": {}, + "outputs": [], + "source": [ + "collection_license = 'PDDL-1.0'" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Set the providers\n", + "\n", + "A collection will specify the providers of the data, including what role they have played. We can set our provider information by instantiating `pystac.Provider` objects:" + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "metadata": {}, + "outputs": [], + "source": [ + "collection.providers = [\n", + " pystac.Provider(name='USGS', roles=['producer'], url='https://landsat.usgs.gov/'),\n", + " pystac.Provider(name='Planet Labs', roles=['processor'], url='https://github.com/landsat-pds/landsat_ingestor'),\n", + " pystac.Provider(name='AWS', roles=['host'], url='https://landsatonaws.com/')\n", + "]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Create items for each scene\n", + "\n", + "We created an Item for a single scene above. This method consolidates that logic into a single method that can construct an Item from a scene, so we can create an Item for every scene in our subset:" + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "metadata": {}, + "outputs": [], + "source": [ + "def item_from_scene(scene):\n", + " mtl_url = get_asset_url(scene, 'MTL.txt')\n", + " metadata = get_metadata(mtl_url)\n", + " \n", + " bbox = get_bbox(metadata)\n", + " item = pystac.Item(id=get_item_id(metadata), \n", + " datetime=get_datetime(metadata),\n", + " geometry=get_geometry(scene, bbox),\n", + " bbox=bbox,\n", + " properties={})\n", + " \n", + " item.common_metadata.gsd = 30.0\n", + " \n", + " item.ext.enable('eo')\n", + " item.ext.eo.cloud_cover = get_cloud_cover(metadata)\n", + " \n", + " add_assets(item, scene)\n", + " \n", + " item.ext.enable('projection')\n", + " item.ext.projection.epsg = get_epsg(metadata, item.bbox[1], item.bbox[3]) \n", + "\n", + " item.ext.enable('view')\n", + " view_info = get_view_info(metadata)\n", + " item.ext.view.sun_azimuth = view_info['sun_azimuth']\n", + " item.ext.view.sun_elevation = view_info['sun_elevation']\n", + "\n", + " item.validate()\n", + " \n", + " return item\n", + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here we create an item per scene and add it to our collection. Since this is reading multiple metadata files per scene from the internet, it may take a little bit to run:" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "metadata": {}, + "outputs": [], + "source": [ + "for scene in location_scenes:\n", + " item = item_from_scene(scene)\n", + " collection.add_item(item)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Reset collection extent based on items\n", + "\n", + "Now that we've added all the item we can use the `update_extent_from_items` method on the Collection to set the extent based on the contained items:" + ] + }, + { + "cell_type": "code", + "execution_count": 52, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'spatial': {'bbox': [[-77.88298, 39.23073, -73.46322, 41.41025]]},\n", + " 'temporal': {'interval': [['2020-01-03T15:46:15.625235Z',\n", + " '2020-08-30T15:46:11.765407Z']]}}" + ] + }, + "execution_count": 52, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "collection.update_extent_from_items()\n", + "collection.extent.to_dict()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Set the HREFs for everything in the catalog\n", + "\n", + "We've been building up our Collection and Items in memory. This has been convenient as it allows us not to think about file paths as we construct our Catalog. However, a STAC is not valid without any HREFs! \n", + "\n", + "We can use the `normalize_hrefs` method to set all the HREFs in the entire STAC based on a root directory. This will use the [STAC Best Practices](https://github.com/radiantearth/stac-spec/blob/master/best-practices.md#catalog-layout) recommendations for STAC file layout for each Catalog, Collection and Item in the STAC.\n", + "\n", + "Here we use that method and set the root directory to a subdirectory of our user's `home` directory:" + ] + }, + { + "cell_type": "code", + "execution_count": 53, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 53, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from pathlib import Path\n", + "\n", + "root_path = str(Path.home() / '{}-landsat-stac'.format(location_name))\n", + "\n", + "collection.normalize_hrefs(root_path)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that we have all the Collection's data set and HREFs in place we can validate the entire STAC using `validate_all`, which recursively crawls through a catalog and validates every STAC object in the catalog:" + ] + }, + { + "cell_type": "code", + "execution_count": 54, + "metadata": {}, + "outputs": [], + "source": [ + "collection.validate_all()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Write the catalog locally\n", + "\n", + "Now that we have our complete, validated STAC in memory, let's write it out. This is a simple as calling `save` on the Collection. We need to specify the type of catalog in order to property write out links - these types are described again in the STAC [Best Practices](https://github.com/radiantearth/stac-spec/blob/master/best-practices.md#use-of-links) documentation.\n", + "\n", + "We'll use the \"self contained\" type, which uses relative paths and does not specify absolute \"self\" links to any object. This makes the catalog more portable, as it remains valid even if you copy the STAC to new locations." + ] + }, + { + "cell_type": "code", + "execution_count": 55, + "metadata": {}, + "outputs": [], + "source": [ + "collection.save(pystac.CatalogType.SELF_CONTAINED)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that we've written our STAC out we probably want to view it. We can use the `describe` method to print out a simple representation of the catalog:" + ] + }, + { + "cell_type": "code", + "execution_count": 56, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "* \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n", + " * \n" + ] + } + ], + "source": [ + "collection.describe()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can also use the `to_dict` method on individual STAC objects in order to see the data, as we've been doing in the tutorial:" + ] + }, + { + "cell_type": "code", + "execution_count": 57, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'id': 'Philly-landsat-8',\n", + " 'stac_version': '1.0.0-beta.2',\n", + " 'description': '### Philly Landsat 8\\n\\nA collection of Landsat 8 scenes around Philly in 2020.\\n',\n", + " 'links': [{'rel': 'root',\n", + " 'href': './collection.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020003/LC80150322020003.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020012/LC80140322020012.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020019/LC80150322020019.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020012/LC80140322020012.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020019/LC80150322020019.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020028/LC80140322020028.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020035/LC80150322020035.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020028/LC80140322020028.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020044/LC80140322020044.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020051/LC80150322020051.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020051/LC80150322020051.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020051/LC80150322020051.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020051/LC80150322020051.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020060/LC80140322020060.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020067/LC80150322020067.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020060/LC80140322020060.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020067/LC80150322020067.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020076/LC80140322020076.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020083/LC80150322020083.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020076/LC80140322020076.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020092/LC80140322020092.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020099/LC80150322020099.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020092/LC80140322020092.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020108/LC80140322020108.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020108/LC80140322020108.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020115/LC80150322020115.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020099/LC80150322020099.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020124/LC80140322020124.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020131/LC80150322020131.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020124/LC80140322020124.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020140/LC80140322020140.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020147/LC80150322020147.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020131/LC80150322020131.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020140/LC80140322020140.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020156/LC80140322020156.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020147/LC80150322020147.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020156/LC80140322020156.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020163/LC80150322020163.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020172/LC80140322020172.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020179/LC80150322020179.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020163/LC80150322020163.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020188/LC80140322020188.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020172/LC80140322020172.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020179/LC80150322020179.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020195/LC80150322020195.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020211/LC80150322020211.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020195/LC80150322020195.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020188/LC80140322020188.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020204/LC80140322020204.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020227/LC80150322020227.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80140322020220/LC80140322020220.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'item',\n", + " 'href': './LC80150322020243/LC80150322020243.json',\n", + " 'type': 'application/json'},\n", + " {'rel': 'self',\n", + " 'href': '/Users/rob/Philly-landsat-stac/collection.json',\n", + " 'type': 'application/json'}],\n", + " 'title': '2020 Landsat 8 images over Philly',\n", + " 'extent': {'spatial': {'bbox': [[-77.88298, 39.23073, -73.46322, 41.41025]]},\n", + " 'temporal': {'interval': [['2020-01-03T15:46:15.625235Z',\n", + " '2020-08-30T15:46:11.765407Z']]}},\n", + " 'license': 'proprietary',\n", + " 'providers': [{'name': 'USGS',\n", + " 'roles': ['producer'],\n", + " 'url': 'https://landsat.usgs.gov/'},\n", + " {'name': 'Planet Labs',\n", + " 'roles': ['processor'],\n", + " 'url': 'https://github.com/landsat-pds/landsat_ingestor'},\n", + " {'name': 'AWS', 'roles': ['host'], 'url': 'https://landsatonaws.com/'}]}" + ] + }, + "execution_count": 57, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "collection.to_dict()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "However, if we want to browse our STAC more interactively, we can use the [stac-browser](https://github.com/radiantearth/stac-browser) tool to read our local STAC.\n", + "\n", + "We can use this simple Python server (copied from [this gist](https://gist.github.com/acdha/925e9ffc3d74ad59c3ea)) to serve our our directory at port 5555:" + ] + }, + { + "cell_type": "code", + "execution_count": 58, + "metadata": {}, + "outputs": [ + { + "ename": "KeyboardInterrupt", + "evalue": "", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 14\u001b[0m \u001b[0;32mwith\u001b[0m \u001b[0mHTTPServer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'localhost'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m5555\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mCORSRequestHandler\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mhttpd\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 15\u001b[0;31m \u001b[0mhttpd\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mserve_forever\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m~/anaconda/lib/python3.6/socketserver.py\u001b[0m in \u001b[0;36mserve_forever\u001b[0;34m(self, poll_interval)\u001b[0m\n\u001b[1;32m 234\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 235\u001b[0m \u001b[0;32mwhile\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__shutdown_request\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 236\u001b[0;31m \u001b[0mready\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mselector\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mselect\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpoll_interval\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 237\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mready\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 238\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_handle_request_noblock\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m~/anaconda/lib/python3.6/selectors.py\u001b[0m in \u001b[0;36mselect\u001b[0;34m(self, timeout)\u001b[0m\n\u001b[1;32m 374\u001b[0m \u001b[0mready\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 375\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 376\u001b[0;31m \u001b[0mfd_event_list\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_poll\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpoll\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtimeout\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 377\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mInterruptedError\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 378\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mready\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mKeyboardInterrupt\u001b[0m: " + ] + } + ], + "source": [ + "import os\n", + "from http.server import HTTPServer, SimpleHTTPRequestHandler\n", + "\n", + "os.chdir(root_path)\n", + "\n", + "class CORSRequestHandler(SimpleHTTPRequestHandler):\n", + " def end_headers(self):\n", + " self.send_header('Access-Control-Allow-Origin', '*')\n", + " self.send_header('Access-Control-Allow-Methods', 'GET')\n", + " self.send_header('Cache-Control', 'no-store, no-cache, must-revalidate')\n", + " return super(CORSRequestHandler, self).end_headers()\n", + "\n", + "\n", + "with HTTPServer(('localhost', 5555), CORSRequestHandler) as httpd:\n", + " httpd.serve_forever()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now you can follow the [stac-browser](https://github.com/radiantearth/stac-browser#running) instructions for starting a stac-browser instance and point it at `http://localhost:5555/collection.json` to serve out the STAC!\n", + "\n", + "To quit the server, use the `Kernel` -> `Interrupt` menu option." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Acknowledgements\n", + "\n", + "Credit to [sat-stac-landsat](https://github.com/sat-utils/sat-stac-landsat) off of which a lot of this code was based." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.5" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}