Description
Using pystac[validation] 1.8.3
I am creating collections with a large number of items and was surprised by how long it took to save them. I have done some very preliminary tests, and the save time seems to grow much faster than linearly with the number of items in a collection.
For example, saving a catalog with 1 collection takes the following time, depending on the item count:
| Items | Save time |
|---|---|
| 200 | 0.225s |
| 2000 | 5.439s |
| 10000 | 105.975s |
If I create 5 collections with 2000 items each, saving takes about 25s. So the same total number of items is saved, but it takes roughly 4 times less time when they are split across multiple collections.
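As a rough sanity check on the growth rate, the exponent can be estimated from the timings in the table above. This is only a back-of-the-envelope sketch: `scaling_exponent` is a helper name made up for illustration, and three measurements are too few to fit anything reliably. It also compares the 5 x 2000 case against what the single 2000-item measurement would predict if each collection were saved independently.

```python
import math

# Timings copied from the table above: item count -> save time in seconds.
timings = {200: 0.225, 2000: 5.439, 10000: 105.975}


def scaling_exponent(n1, t1, n2, t2):
    """Return k such that t2 / t1 == (n2 / n1) ** k (slope on a log-log plot)."""
    return math.log(t2 / t1) / math.log(n2 / n1)


counts = sorted(timings)
pairs = list(zip(counts, counts[1:]))
exponents = [scaling_exponent(a, timings[a], b, timings[b]) for a, b in pairs]

for (a, b), k in zip(pairs, exponents):
    print(f"{a} -> {b} items: exponent ~ {k:.2f}")

# Assumption: collections are saved independently of each other, so
# 5 collections of 2000 items should cost about 5 times one such collection.
predicted_split = 5 * timings[2000]
print(f"predicted for 5 x 2000 items: {predicted_split:.1f}s (measured: about 25s)")
```

An exponent of 1 would mean linear scaling and 2 would mean quadratic. Both estimates fall between those values, which points to polynomial growth in the per-collection item count, not exponential growth, and the 5 x 2000 prediction lands close to the measured 25s.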
Any ideas why this could be happening?
Here is a very rough testing script:
import time
from datetime import (
    datetime,
    timedelta,
)
from pystac import (
    Item,
    Catalog,
    CatalogType,
    Collection,
    Extent,
    SpatialExtent,
    TemporalExtent,
)
from pystac.layout import TemplateLayoutStrategy
numdays = 10000
number_of_collections = 1
base = datetime.today()
times = [base - timedelta(days=x) for x in range(numdays)]
catalog = Catalog(
    id = "test",
    description = "catalog to test performance",
    title = "performance test catalog",
    catalog_type=CatalogType.RELATIVE_PUBLISHED,
)
spatial_extent = SpatialExtent([
    [-180.0, -90.0, 180.0, 90.0],
])
temporal_extent = TemporalExtent([[datetime.now()]])
extent = Extent(spatial=spatial_extent, temporal=temporal_extent)
for idx in range(number_of_collections):
    collection = Collection(
        id="big_collection%s"%idx,
        title="collection for items",
        description="some desc",
        extent=extent
    )
    for t in times:
        item = Item(
            id = t.isoformat(),
            bbox=[-180.0, -90.0, 180.0, 90.0],
            properties={},
            geometry = None,
            datetime = t,
        )
        collection.add_item(item)
    collection.update_extent_from_items()
    catalog.add_child(collection)
strategy = TemplateLayoutStrategy(item_template="${collection}/${year}")
catalog.normalize_hrefs("https://exampleurl.com/", strategy=strategy)
start_time = time.perf_counter()
catalog.save(dest_href="../test_build/")
end_time = time.perf_counter()
print(f"Saving Time : {end_time - start_time:0.6f}" )
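To narrow down where the time goes, the timed call could be run under the standard-library profiler. The following is a minimal sketch, not a diagnosis: `profile_call` and `demo_workload` are hypothetical names introduced here, and the stand-in workload only exists so the snippet runs without pystac installed.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


# Stand-in workload (hypothetical) used only to demonstrate the helper.
def demo_workload(n):
    return sum(i * i for i in range(n))


result, report = profile_call(demo_workload, 1000)
print(report)

# In the script above, the timed call could be replaced with:
# _, report = profile_call(catalog.save, dest_href="../test_build/")
# print(report)
```

Comparing the top entries for the 2000-item and 10000-item runs should show which calls account for the extra time.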