Skip to content

various fixes and additions for IO #19

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
Mar 4, 2023
Merged

various fixes and additions for IO #19

merged 23 commits into from
Mar 4, 2023

Conversation

giovp
Copy link
Member

@giovp giovp commented Feb 27, 2023

  • IO fixes for xenium
  • IO fixes for visium
  • add mcmicro
  • add steinbock

should close respective PRs once this is merged

@giovp giovp requested a review from LucaMarconato February 27, 2023 13:04
@codecov-commenter
Copy link

codecov-commenter commented Feb 27, 2023

Codecov Report

Merging #19 (3907d68) into main (e061d47) will decrease coverage by 2.36%.
The diff coverage is 32.38%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #19      +/-   ##
==========================================
- Coverage   44.90%   42.54%   -2.36%     
==========================================
  Files          11       13       +2     
  Lines         432      604     +172     
==========================================
+ Hits          194      257      +63     
- Misses        238      347     +109     
Impacted Files Coverage Δ
src/spatialdata_io/readers/_utils/_read_10x_h5.py 19.14% <ø> (ø)
src/spatialdata_io/readers/_utils/_utils.py 28.26% <0.00%> (-0.63%) ⬇️
src/spatialdata_io/readers/cosmx.py 19.08% <7.59%> (-11.51%) ⬇️
src/spatialdata_io/readers/visium.py 35.00% <13.63%> (+0.15%) ⬆️
src/spatialdata_io/readers/xenium.py 34.06% <17.85%> (-2.93%) ⬇️
src/spatialdata_io/readers/steinbock.py 40.90% <40.90%> (ø)
src/spatialdata_io/readers/mcmicro.py 43.18% <43.18%> (ø)
src/spatialdata_io/__init__.py 100.00% <100.00%> (ø)
src/spatialdata_io/_constants/_constants.py 100.00% <100.00%> (ø)

imread_kwargs,
image_models_kwargs,
)
labels[f"{dataset_id}_nuclei"] = _get_labels(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one note (nothing to do): in the example the labels of the nuclei coincide with the other cell labels.

shape_size=scalefactors["spot_diameter_fullres"],
index=adata.obs_names,
geometry=0,
radius=np.repeat(scalefactors["spot_diameter_fullres"], len(adata)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would modify the parser signature to allow also a single float here. This actually already works because when we assign a float to the column in the dataframe the value is copied for each row.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true it works already, clarified there

adata.obs = metadata
transform = Scale([1.0 / specs["pixel_size"], 1.0 / specs["pixel_size"]], axes=("x", "y"))
diameters = 2 * np.sqrt(adata.obs[XeniumKeys.CELL_AREA].to_numpy() / np.pi) / specs["pixel_size"]
circles = ShapesModel.parse(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also not a fan of this but I have restored it because this is the only way that we have at the moment to view the data with napari. The polygons are too slow, takes minutes to render the napari view. One option could be to rasterize the polygons into labels. Shall I do it? (I started working on rasterization already so I could code it today)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I load the circles for the nuclei (not the cell boundaries anymore) otherwise the circles are overlapping in napari.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a bug in the radius size, fixed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean I understand the napari thing but can't it be done temporarily there in case? I don't think it's any useful and it's not really an element provided by the tech but some type of processing, would keep it removed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can think of a few alternative options:

  • we keep this code in the io until a better solution
  • we make the user manually create circles every time he/she is working with Xenium data (in particular the has to manually pass the radius column to the parser of circles)
  • I implement the rasterization of polygons to labels and this is called by napari when too many polygons are detected. This function is probably slow, so better if the user calls it beforehand (or even if we call it in cosmx(): if I can implement it lazily that would be the best, so cosmx() returns fast and the time is used only when saving to zarr.)
  • napari shows only the centroids (it can't show the shape sizes since it doesn't know that the radii are stored in a column of the table).
  • We make a function to compute the area of polygons and we create a function to create circles with radius from the polygons. As above, we either call this function automatically in napari, either we call it in cosmx(), either the user will call it manually if needed.

wdyt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok fine for leaving it in, would add an argument like centroids as shapes then to have it at least optional

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I would add the arguments. I will implement gradually the rasterization of labels and the "circularization" of polygons and labels, then we can test what works best.


transform = Scale([1.0 / specs["pixel_size"], 1.0 / specs["pixel_size"]], axes=("x", "y"))
# points = PointsModel.parse(coords=arr, annotations=annotations, transformations={"global": transform})
points = PointsModel.parse(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we get

INFO     Instance key `cell_id` could be of type `pd.Categorical`. Consider     
         casting it.    

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this warning would keep for now and then maybe remove

@LucaMarconato
Copy link
Member

LucaMarconato commented Feb 28, 2023

I made some changes in spatialdata-sandbox (in this small PR: https://github.com/giovp/spatialdata-sandbox/pull/17/files), in napari_spatialdata (simply pushed to the spatialdata branch, and to this PR. Now I push my edits of this PR.

@LucaMarconato
Copy link
Member

In my edits I in particular added the coordinate systems to steinbock_io and changed all the values of region, and all the rows in the region_key column to have the full path, for instance instead of cells, we need /labels/cells, etc. Probably we should go for unique names soon (scverse/spatialdata#124) and simplify that.

@giovp
Copy link
Member Author

giovp commented Mar 3, 2023

ok great, gonna check that all files are consistent mirroring spatialdata sandbox and then will merge.

@giovp
Copy link
Member Author

giovp commented Mar 4, 2023

one thing is that I added spatial coordinates in obsm in every tech, mostly because then we can use squidpy out of the box.

will merge this even without re-review hope it's fine

@giovp giovp merged commit 61177e7 into main Mar 4, 2023
@giovp giovp deleted the fixes/io branch March 4, 2023 20:32
This was referenced Mar 4, 2023
lucas-diedrich pushed a commit to lucas-diedrich/spatialdata-io that referenced this pull request Nov 26, 2024
Co-authored-by: Luca Marconato <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants