Open
Description
It may occur that the items of an ItemCollection do not share all the property keys. An example:
import pystac_client
import stac_geoparquet
catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
)
search = catalog.search(
    collections=["sentinel-2-l2a"],
    bbox=[6.5425, 47.9044, 6.5548, 47.9091],
    datetime="2024-07-20/2024-08-11",
    query={"eo:cloud_cover": {"lt": 30.}},
    sortby="datetime",
)
coll = search.item_collection()
print(set(coll[0].properties.keys()).symmetric_difference(coll[1].properties.keys()))
# {'s2:dark_features_percentage'} # this property is missing in coll[1:3], due to a different processing baseline (05.11 instead of 05.10)
records = coll.to_dict()["features"]
stac_geoparquet.to_geodataframe(records)
# *** ValueError: All arrays must be of the same length

In stac-geoparquet <= 3.2, the GeoDataFrame was built from a list of dicts, which introduced NaN wherever a property was missing. Since commit #fb798f4 (included in version 4.0+), the GeoDataFrame is built from a dict of lists (for speed, I suppose), so a property missing from any item makes the operation fail with the error at L177: All arrays must be of the same length
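To make the difference concrete, here is a minimal pure-Python sketch of how a dict of lists ends up with columns of unequal length when one item lacks a key. It is only an illustration of the general pattern, not the actual stac-geoparquet code; the `items` data and the `columns` and `lengths` names are made up for the example.

```python
from collections import defaultdict

# Two simplified "properties" dicts; the second one lacks a key,
# as happens with the newer processing baseline in the example above.
items = [
    {"datetime": "2024-07-21", "s2:dark_features_percentage": 0.1},
    {"datetime": "2024-07-26"},
]

# Column-wise accumulation (dict of lists): each key only grows
# when an item actually carries that key.
columns = defaultdict(list)
for item in items:
    for key, value in item.items():
        columns[key].append(value)

# The columns no longer have the same length, which is the situation
# a DataFrame constructor rejects.
lengths = {key: len(values) for key, values in columns.items()}
print(lengths)
```

With a list of dicts, each row keeps its own keys, so the gap can be filled per row; with a dict of lists, the information about which item was missing the value is already lost.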
As an ItemCollection cannot guarantee that all properties are shared by all items (or am I wrong about that?):
- wouldn't it be wise to remove properties that are not shared (e.g. properties with fewer values than the others) or fill in the missing values?
- or is it intended that this issue is left to the user?
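For reference, if this is left to the user, a possible workaround could look like the sketch below: it fills every missing property key with `None` before the records are converted. `fill_missing_properties` is a hypothetical helper written for this issue, not part of stac-geoparquet or pystac, and I have not verified that the filled records then convert cleanly with stac-geoparquet 4.0+.

```python
from copy import deepcopy


def fill_missing_properties(records, fill_value=None):
    """Return copies of STAC item dicts whose "properties" all share the same keys.

    Hypothetical helper: keys missing from an item are added with `fill_value`.
    """
    # Collect the union of property keys across all items.
    all_keys = set()
    for record in records:
        all_keys.update(record.get("properties", {}))

    filled = []
    for record in records:
        record = deepcopy(record)  # leave the caller's records untouched
        properties = record.setdefault("properties", {})
        for key in all_keys:
            properties.setdefault(key, fill_value)
        filled.append(record)
    return filled


# Simplified stand-ins for coll.to_dict()["features"]
records = [
    {"id": "a", "properties": {"datetime": "2024-07-21T10:00:00Z",
                               "s2:dark_features_percentage": 0.1}},
    {"id": "b", "properties": {"datetime": "2024-07-26T10:00:00Z"}},
]
filled = fill_missing_properties(records)
```

The alternative from the first bullet (dropping properties that are not shared) would use the intersection of the key sets instead of the union.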