region-id inconsistent between depressions.shp and regions.shp

Firstly, thank you for providing an excellent package/resource.

* lidar version: 0.7.1
* Python version: 3.11.0
* Operating System: Windows 10

### Description

Describe what you were trying to get done.
- Short-term goal, and where the problem occurs: aligning depression regions with their individual nested depressions.

Tell us what happened, what went wrong, and what you expected to happen.
- The `region-id` values in `regions.shp`/`regions_info.csv`, which are generated by `ExtractSinks()`, differ from those in `depressions.shp`/`depressions_info.csv`, which are generated by `DelineateDepressions()`.
- I would expect the `region-id` values to be consistent between the two sets of outputs.

### What I Did

```python
import geopandas as gpd
import pandas as pd
from lidar import ExtractSinks, DelineateDepressions

outdir = 'tmp/'
min_size = 50      # minimum number of pixels as a depression
min_depth = 30     # minimum depth as a depression
interval = 10      # slicing interval for the level-set method
bool_shp = False   # output shapefiles for each individual level

sink_path = ExtractSinks('sample.tif', min_size, outdir)

dep_id_path, dep_level_path = DelineateDepressions(sink_path, min_size, min_depth, interval, outdir, bool_shp)

# read in output and combine info csv with geometry from shapefile
# depressions
depressions = gpd.read_file('tmp/depressions.shp')
depressions = depressions.dissolve('id').reset_index() # makes for a tidier merge with depressions_info.csv (creates multipolygons)
dep_info = pd.read_csv('tmp/depressions_info.csv')
gdf = depressions.merge(dep_info, on='id')

# regions
regions = gpd.read_file('tmp/regions.shp').sort_values(by='id')
reg_info = pd.read_csv('tmp/regions_info.csv')
regions = regions.merge(reg_info, left_on='id', right_on='region-id').sort_values(by='id').reset_index(drop=True)
```

At this stage I would expect the `region-id` in `regions` to match that in `gdf`; however, they do not. Below is an illustration, which I think also provides a way to handle the discrepancy (a spatial join via `.overlay()`):

```python
import matplotlib.pyplot as plt
import rioxarray as rio

# read in raster (for plotting)
region_raster = rio.open_rasterio('tmp/region.tif')

# to enable comparison of region-ids intersection of depressions and regions shapefiles
overlaid = (gdf
            .dissolve('region-id')   # aggregate by region-id
            .reset_index() # to ensure result gdf has both region-id_1 and region-id_2 (where _2 is the value that correctly corresponds with those in regions.shp)
            .overlay(regions, keep_geom_type=False)
)

###### plotting

# random region number
R = 310

fig, axs = plt.subplots(figsize=[15,6],ncols=5)

# plot raster 
(region_raster==R).plot(ax=axs[0], add_colorbar=False)

# plot from regions shapefile
regions.loc[regions['region-id']==R].plot(ax=axs[1])

# plot from depressions shapefile
gdf.loc[gdf['region-id']==R].plot(column='level', ax=axs[2])

# plot from intersection of depressions and regions
overlaid.loc[overlaid['region-id_1']==R].plot(ax=axs[3])

# plot from intersection of depressions and regions
overlaid.loc[overlaid['region-id_2']==R].plot(ax=axs[4])

# tidy up axes limits, and labels etc...
axs[0].set_xlim(axs[1].get_xlim())
axs[0].set_ylim(axs[1].get_ylim())
axs[0].set_aspect('equal')

axs[0].set_title(f'region:{R}\nfrom region.tif')
axs[1].set_title(f'region:{R}\nfrom regions.shp')
axs[2].set_title(f'region:{R}\nfrom depressions.shp')
axs[3].set_title(f'region:{overlaid.loc[overlaid["region-id_1"]==R,"region-id_2"].values[0]}\nfrom regions.shp')
axs[4].set_title(f'region:{overlaid.loc[overlaid["region-id_2"]==R,"region-id_1"].values[0]}\nfrom depressions.shp')

plt.subplots_adjust(wspace=0.35)

print(f"region: {R} in the depressions.shp file corresponds to region: {overlaid.loc[overlaid['region-id_1']==R,'region-id_2'].values[0]} in regions.shp")
print(f"region: {R} in the regions.shp file corresponds to region: {overlaid.loc[overlaid['region-id_2']==R,'region-id_1'].values[0]} in depressions.shp")
```
>region: 310 in the depressions.shp file corresponds to region: 394 in regions.shp
>region: 310 in the regions.shp file corresponds to region: 245 in depressions.shp

![image](https://user-images.githubusercontent.com/92921576/225363312-0b3bcc72-4cea-41ce-8616-1ebfceed1cc5.png)

### The question
So, I think the question is: is the discrepancy between the two sets of `region-id` values expected/normal?
If yes, is `.overlay()` the best way to handle it and reconcile `regions.shp`/`regions_info.csv` with `depressions.shp`/`depressions_info.csv`, or is there a more straightforward way?
If no, have I done something wrong?
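
To make the reconciliation I have in mind concrete, here is a minimal sketch of turning the overlay result into a one-to-one lookup. It assumes the overlay can be reduced to rows of (`region-id_1`, `region-id_2`, overlap area). The function name `build_region_lookup` and the area values are hypothetical and are not lidar output. It picks the largest total overlap rather than the first matching row, since `keep_geom_type=False` can also return zero-area contacts along shared boundaries.

```python
from collections import defaultdict


def build_region_lookup(pairs):
    """Map each depressions-side region id to the regions-side id it overlaps most.

    `pairs` is an iterable of (dep_region_id, reg_region_id, overlap_area) tuples.
    """
    # total overlap area per (depressions id, regions id) combination
    totals = defaultdict(float)
    for dep_id, reg_id, area in pairs:
        totals[(dep_id, reg_id)] += area

    # for each depressions id keep the regions id with the largest total overlap
    best = {}
    for (dep_id, reg_id), area in totals.items():
        if dep_id not in best or area > best[dep_id][1]:
            best[dep_id] = (reg_id, area)
    return {dep_id: reg_id for dep_id, (reg_id, _) in best.items()}


# invented rows for illustration only: the ids echo the example above,
# the areas are made up
pairs = [
    (310, 394, 120.0),
    (310, 395, 0.0),   # zero-area contact, e.g. a shared boundary
    (245, 310, 80.0),
    (245, 310, 15.5),  # a second polygon piece of the same pair
]
lookup = build_region_lookup(pairs)
print(lookup)
```

With the real data, the rows could presumably come from something like `zip(overlaid['region-id_1'], overlaid['region-id_2'], overlaid.area)`, although I have not checked this against every region.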

thank you


