Added support for summaries #264

Merged 33 commits on Jun 2, 2021
Changes from 1 commit
Commits
f50541a
added summaries (WIP)
volaya Feb 20, 2021
d2bebb3
Merge branch 'main' of https://github.com/stac-utils/pystac into summ…
volaya May 11, 2021
ad1d252
updated creation of summaries
volaya May 12, 2021
7e6f0fd
fixed imports in tests
volaya May 12, 2021
1e8d474
fixed call in create_summary method
volaya May 12, 2021
968df84
minor changes to summaries
volaya May 12, 2021
86e09e7
collection now do not know now about summaries and summarizers
volaya May 13, 2021
c5ec2c3
some fixes for typing
volaya May 13, 2021
1d133ee
allow to set a limit for elements in list summaries
volaya May 13, 2021
ccdcd91
use extended information for summarize strategies
volaya May 14, 2021
b21b230
renamed SummaryStrategy.UNDEFINED to SummaryStrategy.DEFAULT
volaya May 14, 2021
a7e1747
Autoformat summaries.py
lossyrob May 18, 2021
1bde14e
Remove type parameter from summary methods.
lossyrob May 18, 2021
af126fb
test_case_5 is not a collection; use items from catalog in test
lossyrob May 18, 2021
a2aac4f
Prefer `Dict[str, Any]` type
lossyrob May 18, 2021
1bf5d26
added some tests for summaries and some minor fixes
volaya May 18, 2021
60fe198
Merge remote-tracking branch 'origin/main' into summaries
lossyrob May 18, 2021
09a2a40
Solve Comparable issues
lossyrob May 18, 2021
41045be
Remove __future__.annotations import
lossyrob May 18, 2021
e362712
Remove deprecation warning by using .assertEqual
lossyrob May 18, 2021
018c527
some fixes and minor changes for summaries
volaya May 21, 2021
ec69f89
Merge branch 'summaries' of https://github.com/volaya/pystac into sum…
volaya May 21, 2021
0223b10
flake8 fixes
volaya May 21, 2021
4c76a2b
formatting fix
volaya May 21, 2021
e175aa6
Use Protocol from typing_extensions for pre-3.8 Python
lossyrob May 21, 2021
dc861b0
Merge remote-tracking branch 'origin/main' into summaries
lossyrob May 27, 2021
93438e5
Remove type parameter from RangeSummary.from_dict
lossyrob May 27, 2021
3912937
Type changes to pass mypy tests
lossyrob May 27, 2021
17ab5a1
cache fields definition in summaries
volaya Jun 1, 2021
eb527e1
use fields-normalized.json for default fields definition file
volaya Jun 1, 2021
ce4e682
Merge remote-tracking branch 'stac-utils/main' into summaries
duckontheweb Jun 1, 2021
6857381
Fix CI failures
duckontheweb Jun 1, 2021
fa8767e
added entry to changelog
volaya Jun 2, 2021
49 changes: 48 additions & 1 deletion pystac/collection.py
@@ -1,3 +1,6 @@
+import os
+import json
+import numbers
 from collections import abc
 from datetime import datetime
 import dateutil.parser
@@ -9,6 +12,16 @@
 from pystac.utils import datetime_to_str


+fieldsfilename = os.path.join(os.path.dirname(__file__),
+                              "resources", "fields-normalized.json")
Review comment (Contributor):

Would it make sense to load it from a CDN by default? That file is expected to change pretty often (mostly to add new fields)...

+with open(fieldsfilename) as f:
+    jsonfields = json.load(f)
+summaryfields = {}
+for name, desc in jsonfields["metadata"].items():
+    if desc.get("summary", True):
+        summaryfields[name] = {"mergeArrays": desc.get("mergeArrays", False)}
+
+
 class Collection(Catalog):
     """A Collection extends the Catalog spec with additional metadata that helps
     enable discovery.
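
The module-level code in this hunk reduces the fields definition file to a small lookup of summarizable fields. A minimal sketch of that filtering step follows. The sample JSON, its field entries, and the helper name `build_summary_fields` are assumptions made for illustration; only the `"metadata"` key and the `"summary"` / `"mergeArrays"` flags come from the diff.

```python
import json

# Hypothetical excerpt in the shape the diff reads: a top-level "metadata"
# object whose entries may carry "summary" and "mergeArrays" flags.
SAMPLE_FIELDS_JSON = """
{
  "metadata": {
    "platform": {"label": "Platform"},
    "instruments": {"label": "Instruments", "mergeArrays": true},
    "datetime": {"label": "Time of Data", "summary": false}
  }
}
"""


def build_summary_fields(fields_definition):
    """Return {field name: {"mergeArrays": bool}} for summarizable fields."""
    return {
        name: {"mergeArrays": desc.get("mergeArrays", False)}
        for name, desc in fields_definition["metadata"].items()
        if desc.get("summary", True)  # a missing flag means "summarize"
    }


summary_fields = build_summary_fields(json.loads(SAMPLE_FIELDS_JSON))
print(summary_fields)
```

With these defaults a field is summarized unless it opts out with `"summary": false`, and list values are kept whole unless the field opts in with `"mergeArrays": true`.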
@@ -88,9 +101,43 @@ def __init__(self,
     def __repr__(self):
         return '<Collection id={}>'.format(self.id)

-    def add_item(self, item, title=None):
+    def create_summary(self):
+        """Creates a summary from current items
+        It will remove the content of the previous collection summary, in case it exists
+        """
+        self.summaries = {}
+        for item in self.get_items():
+            self.update_summary_with_item(item)
+
+    def update_summary_with_item(self, item):
+        if self.summaries is None:
+            self.summaries = {}
+        for k, v in item.properties.items():
+            if k in summaryfields:
+                if isinstance(v, list):
+                    if k not in self.summaries:
+                        self.summaries[k] = []
+                    if summaryfields[k]["mergeArrays"]:
+                        self.summaries[k] = list(set(self.summaries[k]) | set(v))
+                    else:
+                        if v not in self.summaries[k]:
+                            self.summaries[k].append(v)
+                elif isinstance(v, numbers.Number):
+                    if k not in self.summaries:
+                        self.summaries[k] = {"min": v, "max": v}
+                    else:
+                        self.summaries[k] = {"min": min([v, self.summaries[k]["min"]]),
+                                             "max": max([v, self.summaries[k]["max"]])}
+                else:
+                    if k not in self.summaries:
+                        self.summaries[k] = []
+                    if v not in self.summaries[k]:
+                        self.summaries[k].append(v)
+
+    def add_item(self, item, title=None, update_summary=True):
         super(Collection, self).add_item(item, title)
         item.set_collection(self)
+        self.update_summary_with_item(item)

     def to_dict(self, include_self_link=True):
         d = super(Collection, self).to_dict(include_self_link)
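
The summarizing rules in this hunk can be tried outside of pystac with plain dictionaries. The sketch below is a standalone illustration, not the code merged in this PR: `SUMMARY_FIELDS`, `update_summaries`, and `add_item_properties` are hypothetical names, and the field flags are made up. It deliberately differs from the diff in three ways: merged arrays keep first-seen order instead of going through `set`, booleans are not treated as numbers, and the `update_summary` flag is actually consulted (in the diff as shown, `add_item` accepts the flag but calls `update_summary_with_item` unconditionally).

```python
import numbers

# Hypothetical stand-in for the module-level `summaryfields` mapping.
SUMMARY_FIELDS = {
    "platform": {"mergeArrays": False},
    "eo:cloud_cover": {"mergeArrays": False},
    "instruments": {"mergeArrays": True},
}


def update_summaries(summaries, properties, fields=SUMMARY_FIELDS):
    """Fold one item's properties into `summaries` in place and return it."""
    for key, value in properties.items():
        if key not in fields:
            continue  # only fields marked as summarizable are tracked
        if isinstance(value, list) and fields[key]["mergeArrays"]:
            # Merge element-wise, keeping first-seen order.
            merged = summaries.setdefault(key, [])
            for element in value:
                if element not in merged:
                    merged.append(element)
        elif isinstance(value, numbers.Number) and not isinstance(value, bool):
            # Numeric values collapse into a min/max range.
            current = summaries.get(key)
            if current is None:
                summaries[key] = {"min": value, "max": value}
            else:
                current["min"] = min(current["min"], value)
                current["max"] = max(current["max"], value)
        else:
            # Anything else (including unmerged lists) is kept as a
            # list of distinct values.
            distinct = summaries.setdefault(key, [])
            if value not in distinct:
                distinct.append(value)
    return summaries


def add_item_properties(summaries, properties, update_summary=True):
    """Hypothetical add_item analogue that honors the update_summary flag."""
    if update_summary:
        update_summaries(summaries, properties)
    return summaries


summaries = {}
add_item_properties(summaries, {
    "platform": "landsat-8",
    "eo:cloud_cover": 12.5,
    "instruments": ["oli", "tirs"],
})
add_item_properties(summaries, {
    "platform": "landsat-8",
    "eo:cloud_cover": 3,
    "instruments": ["oli"],
    "title": "not a summarized field",
})
print(summaries)
```

Running this yields one distinct `platform` value, a `min`/`max` range of 3 to 12.5 for `eo:cloud_cover`, and the merged `instruments` list `["oli", "tirs"]`; the `title` property is skipped because it is not in the fields mapping.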