Add RetryStacIO #986

Merged: 3 commits, Feb 15, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -18,6 +18,7 @@
 - `__geo_interface__` for items ([#885](https://github.com/stac-utils/pystac/pull/885))
 - Optional `strategy` parameter to `catalog.add_items()` ([#967](https://github.com/stac-utils/pystac/pull/967))
 - `start_datetime` and `end_datetime` arguments to the `Item` constructor ([#918](https://github.com/stac-utils/pystac/pull/918))
+- `RetryStacIO` ([#986](https://github.com/stac-utils/pystac/pull/986))

 ### Removed

10 changes: 9 additions & 1 deletion README.md
@@ -34,7 +34,15 @@ optional `orjson` requirements:
 pip install pystac[orjson]
 ```

-### Install from source:
+If you would like to use a custom `RetryStacIO` class for automatically retrying
+network requests when reading with PySTAC, you'll need
+[`urllib3`](https://urllib3.readthedocs.io/en/stable/):
+
+```shell
+pip install pystac[urllib3]
+```
+
+### Install from source

 ```shell
 git clone https://github.com/stac-utils/pystac.git
1 change: 1 addition & 0 deletions docs/conf.py
@@ -232,6 +232,7 @@
 intersphinx_mapping = {
     "python": ("https://docs.python.org/3", None),
     "dateutil": ("https://dateutil.readthedocs.io/en/stable", None),
+    "urllib3": ("https://urllib3.readthedocs.io/en/stable", None),
 }

 # -- Substutition variables
12 changes: 12 additions & 0 deletions docs/installation.rst
@@ -66,6 +66,18 @@ additional functionality:

         pip install pystac[orjson]

+* ``urllib3``
+
+    Installs the additional `urllib3 <https://github.com/urllib3/urllib3>`__ dependency.
+    For now, this is only used in :py:class:`pystac.stac_io.RetryStacIO`, but it
+    may be used more extensively in the future.
+
+    To install:
+
+    .. code-block:: bash
+
+        pip install pystac[urllib3]
+
 Versions
 ========

3 changes: 2 additions & 1 deletion pystac/catalog.py
@@ -320,7 +320,8 @@ def add_items(
         items: Iterable[Item],
         strategy: Optional[HrefLayoutStrategy] = None,
     ) -> None:
-        """Adds links to multiple :class:`~pystac.Item`s.
+        """Adds links to multiple :class:`Items <pystac.Item>`.
+
         This method will set each item's parent to this object, and their root to
         this Catalog's root.

70 changes: 68 additions & 2 deletions pystac/stac_io.py
@@ -23,6 +23,14 @@
 except ImportError:
     orjson = None  # type: ignore[assignment]

+# Is urllib3 available?
+try:
+    import urllib3  # noqa
+except ImportError:
+    HAS_URLLIB3 = False
+else:
+    HAS_URLLIB3 = True
+
 if TYPE_CHECKING:
     from pystac.catalog import Catalog
     from pystac.stac_object import STACObject
@@ -281,9 +289,8 @@ def read_text_from_href(self, href: str) -> str:

             href : The URI of the file to open.
         """
-        parsed = safe_urlparse(href)
         href_contents: str
-        if parsed.scheme != "":
+        if _is_url(href):
             try:
                 req = Request(href, headers=self.headers)
                 with urlopen(req) as f:
@@ -373,3 +380,62 @@ def _report_duplicate_object_names(
         else:
             result[key] = value
     return result
+
+
+def _is_url(href: str) -> bool:
+    parsed = safe_urlparse(href)
+    return parsed.scheme != ""
+
+
+if HAS_URLLIB3:
+    from typing import cast
+
+    from urllib3 import PoolManager
+    from urllib3.util import Retry
+
+    class RetryStacIO(DefaultStacIO):
+        """A customized StacIO that retries requests, using
+        :py:class:`urllib3.util.retry.Retry`.
+
+        The headers are passed to :py:class:`DefaultStacIO`. If retry is not
+        provided, a default retry is used.
+
+        To use this class, you'll need to install PySTAC with urllib3:
+
+        .. code-block:: shell
+
+            pip install pystac[urllib3]
+
+        """
+
+        retry: Retry
+        """The :py:class:`urllib3.util.retry.Retry` to use with all reading network
+        requests."""
+
+        def __init__(
+            self,
+            headers: Optional[Dict[str, str]] = None,
+            retry: Optional[Retry] = None,
+        ):
+            super().__init__(headers)
+            self.retry = retry or Retry()
+
+        def read_text_from_href(self, href: str) -> str:
+            """Reads file as a UTF-8 string, with retry support.
+
+            Args:
+                href : The URI of the file to open.
+            """
+            if _is_url(href):
+                # TODO provide a pooled StacIO to enable more efficient network
+                # access (probably named `PooledStacIO`).
+                http = PoolManager()
+                try:
+                    response = http.request(
+                        "GET", href, retries=self.retry  # type: ignore
+                    )
+                    return cast(str, response.data.decode("utf-8"))
+                except HTTPError as e:
+                    raise Exception("Could not read uri {}".format(href)) from e
+            else:
+                return super().read_text_from_href(href)
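
A minimal usage sketch for the class added in `pystac/stac_io.py`, assuming PySTAC is installed with the `urllib3` extra. The import is guarded because `RetryStacIO` is only defined when urllib3 is importable, mirroring the `HAS_URLLIB3` check in the diff. The helper name `make_retrying_io` and the retry values are illustrative choices, not part of this PR; `total`, `backoff_factor`, and `status_forcelist` are standard `urllib3.util.Retry` arguments.

```python
try:
    # RetryStacIO only exists in pystac.stac_io when urllib3 is installed.
    from pystac.stac_io import RetryStacIO
    from urllib3.util import Retry
except ImportError:
    RetryStacIO = None  # type: ignore[assignment,misc]
    Retry = None  # type: ignore[assignment,misc]


def make_retrying_io(total: int = 5, backoff_factor: float = 0.5):
    """Hypothetical helper: build a RetryStacIO, or return None if unavailable."""
    if RetryStacIO is None:
        return None
    retry = Retry(
        total=total,                       # overall cap on retry attempts
        backoff_factor=backoff_factor,     # controls the sleep between attempts
        status_forcelist=[502, 503, 504],  # also retry on these HTTP statuses
    )
    # Headers are optional and are forwarded to DefaultStacIO.
    return RetryStacIO(retry=retry)


stac_io = make_retrying_io()
# With a real catalog URL, the object could then be passed to a reader, e.g.:
#   catalog = pystac.Catalog.from_file(href, stac_io=stac_io)
```

Local paths are unaffected by the retry settings: `read_text_from_href` only uses urllib3 when `_is_url(href)` is true and otherwise falls back to `DefaultStacIO`.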
6 changes: 3 additions & 3 deletions pystac/utils.py
@@ -308,9 +308,9 @@ def datetime_to_str(dt: datetime, timespec: str = "auto") -> str:
     Args:
         dt : The datetime to convert.
         timespec: An optional argument that specifies the number of additional
-        terms of the time to include. Valid options are 'auto', 'hours',
-        'minutes', 'seconds', 'milliseconds' and 'microseconds'. The default value
-        is 'auto'.
+            terms of the time to include. Valid options are 'auto', 'hours',
+            'minutes', 'seconds', 'milliseconds' and 'microseconds'. The default value
+            is 'auto'.

     Returns:
         str: The ISO8601 (RFC 3339) formatted string representing the datetime.
29 changes: 13 additions & 16 deletions requirements-test.txt
@@ -1,23 +1,20 @@
-mypy==1.0.0
-flake8==6.0.0
 black==23.1.0
-pytest==7.2.1
-pytest-cov==4.0.0
-pytest-mock==3.10.0
 codespell==2.2.2
-isort==5.12.0
-
-jsonschema==4.17.3
 coverage==7.1.0
 doc8==0.11.2
-jinja2<4.0
+flake8==6.0.0
 html5lib==1.1
-
+isort==5.12.0
+jinja2<4.0
+jsonschema==4.17.3
+mypy==1.0.0
+orjson==3.8.6
+pre-commit==3.0.4
+pytest-cov==4.0.0
+pytest-mock==3.10.0
+pytest-vcr==1.0.2
+pytest==7.2.1
 types-html5lib==1.1.11.11
-types-python-dateutil==2.8.19.7
 types-orjson==3.6.2
-
-pre-commit==3.0.4
-
-# optional dependencies
-orjson==3.8.6
+types-python-dateutil==2.8.19.7
+types-urllib3==1.26.25.5
6 changes: 5 additions & 1 deletion setup.py
@@ -24,7 +24,11 @@
     py_modules=[splitext(basename(path))[0] for path in glob("pystac/*.py")],
     python_requires=">=3.8",
     install_requires=["python-dateutil>=2.7.0"],
-    extras_require={"validation": ["jsonschema>=4.0.1"], "orjson": ["orjson>=3.5"]},
+    extras_require={
+        "validation": ["jsonschema>=4.0.1"],
+        "orjson": ["orjson>=3.5"],
+        "urllib3": ["urllib3>=1.26"],
+    },
     license="Apache Software License 2.0",
     license_files=["LICENSE"],
     zip_safe=False,