improviements to transforms: robustness and ergonomics #100

LucaMarconato · 2023-01-08T19:38:01Z

This PR improves the stability and ergonomics of coordinate systems and, in particular, helps with #39.

Detailed list of what will be implemented.

Some tick boxes have been deleted because I decided to perform a deeper code restructuring (introducing new transformation classes); where an item was already implemented, I deleted its code.

For the moment I allow only one transformation per element and I am not using the coordinate systems at all. The code is already prepared for that, but I will do it in the next PR to keep this one more contained.

Major (updated plan: new transformation classes)

Major (original plan)

The following crossed-out points are no longer needed with the new transformation classes:
~~- [X] simplify the logic of inference of coordinate systems in Sequence (function check_and_infer_coordinate_systems())~~
~~- [x] make Sequence also accept the concatenation of transformations with axes mismatch (e.g. xy with cyx), by automatically adding the appropriate axis permutations, insertions and deletions~~
~~- [x] setter for elements to check that the transformation makes sense (if not, suggests how to fix it) (see my next comment for a downside of this)~~
~~- [ ] save the affine transformation to the file to avoid verbosity~~

- better __repr__ for coordinate transformations
- helper function to replace a transformation within an element without having to rewrite the element data (on disk; in memory this is already possible)
  - shapes
  - points
  - polygons
  - images and labels
    - SpatialImage
    - MultiscaleSpatialImage

Minor

- improved Sequence transformation: the input and output coordinate systems of the concatenation are now inferred at init
- better tests for Sequence, covering the case of inference of the intermediate coordinate system for nested Sequence transformations
- coordinate system types simplified. Previously we planned to allow input_coordinate_system and output_coordinate_system to be strings inside a transformation object, to deal with some edge cases of implicit coordinate systems. This complexity is no longer needed: thanks to the schema mechanism, the spatial elements are more structured and we can easily infer the implicit coordinate system from them.
- helper function to get affine matrices between coordinate systems (Affine.from_input_output_coordinate_systems())
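
To illustrate the last item, here is a minimal sketch of how an affine matrix between two coordinate systems could be derived from their axis names alone. It is plain Python and not the code of Affine.from_input_output_coordinate_systems(); the function name affine_between_axes is hypothetical, and the sketch assumes that output axes missing from the input are simply filled with zeros.

```python
def affine_between_axes(input_axes, output_axes):
    """Homogeneous matrix that maps input axes to output axes by name.

    Hypothetical helper, for illustration only. Rows follow output_axes,
    columns follow input_axes, and the last row/column is the homogeneous part.
    """
    n_in, n_out = len(input_axes), len(output_axes)
    matrix = [[0.0] * (n_in + 1) for _ in range(n_out + 1)]
    for row, axis in enumerate(output_axes):
        if axis in input_axes:
            # Copy the coordinate of the input axis with the same name.
            matrix[row][input_axes.index(axis)] = 1.0
    matrix[n_out][n_in] = 1.0
    return matrix


# Dropping the channel axis and reordering: cyx -> xy
cyx_to_xy = affine_between_axes(("c", "y", "x"), ("x", "y"))
for row in cyx_to_xy:
    print(row)
```

Matching axes by name rather than by position is what makes the permutation and the dropped axis fall out of the same loop.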

Bonus

…ansform from input and output coordinate system; 2. fixed bug in sequence (now inferring coordinate systems from its components); 3. added missing test for inference of intermediate coordinate systems for nested sequences

codecov · 2023-01-08T19:39:10Z

Codecov Report

Attention: Patch coverage is 93.19556% with 135 lines in your changes missing coverage. Please review.

Project coverage is 87.76%. Comparing base (0a455dc) to head (458ba9d).
Report is 604 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| spatialdata/_core/ngff/ngff_coordinate_system.py | 79.05% | 31 Missing ⚠️ |
| spatialdata/_core/transformations.py | 94.32% | 27 Missing ⚠️ |
| spatialdata/_core/_transform_elements.py | 86.84% | 20 Missing ⚠️ |
| spatialdata/_core/_spatialdata_ops.py | 85.00% | 18 Missing ⚠️ |
| spatialdata/_core/ngff/ngff_transformations.py | 96.95% | 15 Missing ⚠️ |
| spatialdata/_core/_spatialdata.py | 94.30% | 9 Missing ⚠️ |
| spatialdata/_core/core_utils.py | 96.12% | 5 Missing ⚠️ |
| spatialdata/_io/read.py | 93.84% | 4 Missing ⚠️ |
| spatialdata/_core/models.py | 96.34% | 3 Missing ⚠️ |
| spatialdata/utils.py | 95.91% | 2 Missing ⚠️ |

... and 1 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #100      +/-   ##
==========================================
+ Coverage   84.22%   87.76%   +3.54%     
==========================================
  Files          19       22       +3     
  Lines        1914     3105    +1191     
==========================================
+ Hits         1612     2725    +1113     
- Misses        302      380      +78

| Files with missing lines | Coverage Δ |
| --- | --- |
| spatialdata/__init__.py | `100.00% <100.00%> (ø)` |
| spatialdata/_core/_spatial_query.py | `77.21% <100.00%> (+0.67%)` ⬆️ |
| spatialdata/_io/format.py | `87.62% <100.00%> (+8.24%)` ⬆️ |
| spatialdata/_io/write.py | `96.27% <98.79%> (+0.75%)` ⬆️ |
| spatialdata/utils.py | `84.33% <95.91%> (+15.76%)` ⬆️ |
| spatialdata/_core/models.py | `85.53% <96.34%> (+1.71%)` ⬆️ |
| spatialdata/_io/read.py | `97.66% <93.84%> (+0.20%)` ⬆️ |
| spatialdata/_core/core_utils.py | `95.32% <96.12%> (+5.50%)` ⬆️ |
| spatialdata/_core/_spatialdata.py | `78.67% <94.30%> (+6.34%)` ⬆️ |
| spatialdata/_core/ngff/ngff_transformations.py | `96.95% <96.95%> (ø)` |

... and 4 more

LucaMarconato · 2023-01-11T23:57:13Z

EDIT: this post refers to the first tentative implementation. I removed the code that generated these complex transformations in favor of the new transformation classes.

As mentioned in my previous post, a downside of "setter for elements to check that the transformation makes sense (if not, tries to fix it)" is that it generates very verbose transformations.

For instance this code here:

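import numpy as np
from pprint import pprint

# Scale, Image2DModel, get_default_coordinate_system, set_transform and
# get_transform were provided by spatialdata at the time of this comment.


def test_assign_xyz_scale_to_cyx_image():
    # Scale defined on xyz, then assigned to an image with cyx axes.
    xyz_cs = get_default_coordinate_system(("x", "y", "z"))
    scale = Scale(np.array([2, 3, 4]), input_coordinate_system=xyz_cs, output_coordinate_system=xyz_cs)
    image = Image2DModel.parse(np.zeros((10, 10, 10)), dims=("c", "y", "x"))
    set_transform(image, scale)
    t = get_transform(image)
    pprint(t.to_dict())
    print(t.to_affine().affine)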

Leads to this transformation here:

ByDimension (c, y, x -> c, y, x)
    Sequence (x, y -> x, y, z)
        ByDimension (x, y -> x, y, z)
            MapAxis (y, x -> x, y)
                y <- y
                x <- x
            Affine (y -> z)
                [0. 0.]
                [0. 1.]
        Scale (x, y, z -> x, y, z)
            [2. 3. 4.]
    Identity (c -> c)

Which, saved to JSON, is insanely verbose:

{'input': {'axes': [{'name': 'c', 'type': 'channel'},
                    {'name': 'y', 'type': 'space', 'unit': 'unit'},
                    {'name': 'x', 'type': 'space', 'unit': 'unit'}],
           'name': 'cyx'},
 'output': {'axes': [{'name': 'c', 'type': 'channel'},
                     {'name': 'y', 'type': 'space', 'unit': 'unit'},
                     {'name': 'x', 'type': 'space', 'unit': 'unit'}],
            'name': 'cyx'},
 'transformations': [{'input': {'axes': [{'name': 'x',
                                          'type': 'space',
                                          'unit': 'unit'},
                                         {'name': 'y',
                                          'type': 'space',
                                          'unit': 'unit'}],
                                'name': "xyz_subset ['x', 'y']"},
                      'output': {'axes': [{'name': 'x',
                                           'type': 'space',
                                           'unit': 'unit'},
                                          {'name': 'y',
                                           'type': 'space',
                                           'unit': 'unit'},
                                          {'name': 'z',
                                           'type': 'space',
                                           'unit': 'unit'}],
                                 'name': 'xyz'},
                      'transformations': [{'input': {'axes': [{'name': 'x',
                                                               'type': 'space',
                                                               'unit': 'unit'},
                                                              {'name': 'y',
                                                               'type': 'space',
                                                               'unit': 'unit'}],
                                                     'name': "xyz_subset ['x', "
                                                             "'y']"},
                                           'output': {'axes': [{'name': 'x',
                                                                'type': 'space',
                                                                'unit': 'unit'},
                                                               {'name': 'y',
                                                                'type': 'space',
                                                                'unit': 'unit'},
                                                               {'name': 'z',
                                                                'type': 'space',
                                                                'unit': 'unit'}],
                                                      'name': 'xyz'},
                                           'transformations': [{'input': {'axes': [{'name': 'y',
                                                                                    'type': 'space',
                                                                                    'unit': 'unit'},
                                                                                   {'name': 'x',
                                                                                    'type': 'space',
                                                                                    'unit': 'unit'}],
                                                                          'name': 'cyx_subset '
                                                                                  "['y', "
                                                                                  "'x']"},
                                                                'mapAxis': {'x': 'x',
                                                                            'y': 'y'},
                                                                'output': {'axes': [{'name': 'x',
                                                                                     'type': 'space',
                                                                                     'unit': 'unit'},
                                                                                    {'name': 'y',
                                                                                     'type': 'space',
                                                                                     'unit': 'unit'}],
                                                                           'name': 'xyz_subset '
                                                                                   "['x', "
                                                                                   "'y']"},
                                                                'type': 'mapAxis'},
                                                               {'affine': [[0.0,
                                                                            0.0]],
                                                                'input': {'axes': [{'name': 'y',
                                                                                    'type': 'space',
                                                                                    'unit': 'unit'}],
                                                                          'name': 'cyx_subset '
                                                                                  "['y']"},
                                                                'output': {'axes': [{'name': 'z',
                                                                                     'type': 'space',
                                                                                     'unit': 'unit'}],
                                                                           'name': 'xyz_subset '
                                                                                   "['z']"},
                                                                'type': 'affine'}],
                                           'type': 'byDimension'},
                                          {'input': {'axes': [{'name': 'x',
                                                               'type': 'space',
                                                               'unit': 'unit'},
                                                              {'name': 'y',
                                                               'type': 'space',
                                                               'unit': 'unit'},
                                                              {'name': 'z',
                                                               'type': 'space',
                                                               'unit': 'unit'}],
                                                     'name': 'xyz'},
                                           'output': {'axes': [{'name': 'x',
                                                                'type': 'space',
                                                                'unit': 'unit'},
                                                               {'name': 'y',
                                                                'type': 'space',
                                                                'unit': 'unit'},
                                                               {'name': 'z',
                                                                'type': 'space',
                                                                'unit': 'unit'}],
                                                      'name': 'xyz'},
                                           'scale': [2.0, 3.0, 4.0],
                                           'type': 'scale'}],
                      'type': 'sequence'},
                     {'input': {'axes': [{'name': 'c', 'type': 'channel'}],
                                'name': "cyx_subset ['c']"},
                      'output': {'axes': [{'name': 'c', 'type': 'channel'}],
                                 'name': "cyx_subset ['c']"},
                      'type': 'identity'}],
 'type': 'byDimension'}

The corresponding affine matrix is the following:

Affine (c, y, x -> c, y, x)
    [1. 0. 0. 0.]
    [0. 3. 0. 0.]
    [0. 0. 2. 0.]
    [0. 0. 0. 1.]

So maybe we should just save the affine matrix to disk, which seems reasonable:

{'affine': [[1.0, 0.0, 0.0, 0.0], [0.0, 3.0, 0.0, 0.0], [0.0, 0.0, 2.0, 0.0]],
 'input': {'axes': [{'name': 'c', 'type': 'channel'},
                    {'name': 'y', 'type': 'space', 'unit': 'unit'},
                    {'name': 'x', 'type': 'space', 'unit': 'unit'}],
           'name': 'cyx'},
 'output': {'axes': [{'name': 'c', 'type': 'channel'},
                     {'name': 'y', 'type': 'space', 'unit': 'unit'},
                     {'name': 'x', 'type': 'space', 'unit': 'unit'}],
            'name': 'cyx'},
 'type': 'affine'}
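
To make the collapse from the nested transformation to the affine matrix easier to follow, here is a small sketch in plain Python. It is not the code from this PR: scale_to_affine is a hypothetical helper, and it assumes that scale factors for axes absent from the element (here z) are dropped while axes without a factor (here c) are left untouched.

```python
def scale_to_affine(scale_by_axis, target_axes):
    """Express a per-axis scale as a homogeneous affine matrix on target_axes.

    Hypothetical helper, for illustration only.
    """
    n = len(target_axes)
    matrix = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i, axis in enumerate(target_axes):
        # Axes without a scale factor keep an identity entry.
        matrix[i][i] = float(scale_by_axis.get(axis, 1.0))
    matrix[n][n] = 1.0
    return matrix


# The xyz scale [2, 3, 4] from the test above, assigned to a cyx image.
affine = scale_to_affine({"x": 2, "y": 3, "z": 4}, ("c", "y", "x"))
for row in affine:
    print(row)
```

The rows printed here correspond to the 4x4 matrix shown above; the saved version stores only the first three rows, leaving the homogeneous row implicit.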

LucaMarconato · 2023-01-13T19:04:19Z

After discussing with @giovp, I will refactor into separate classes to divide what is purely an NGFF transform from what is built on top of it for ergonomics. The reasons are the following:

- The NGFF classes can in principle be moved to ome-zarr-py or to an object-oriented representation of the storage, which is something that we want.
- This approach will greatly reduce the code complexity. In the previous comments I described the implications of the approach proposed in "Influence on the order of axes when declaring different affine transformations" (#39). With the new classes the problem of that issue will be solved and what is written to file will be clearer (e.g. if the user takes a cyx Scale from a cyx image and assigns it to xy points, what is saved to disk will be an xy Scale, not a complex transformation or its affine representation). The NGFF classes will be used only when reading or writing; the in-memory representation will use the new classes.
- The new classes will be based on xarray, as described in "Using xarray to make transformations less ambiguous" (#47), and the order of axes will be irrelevant. Also, all the transformations will carry unambiguous axis information (by contrast, the NGFF Sequence transform can contain transformations that don't specify their axes, which increases the code complexity).
- New functionality that will be easily supported: concatenating any sequence of transformations independently of the axes they operate on, and assigning any transformation to any element, again independently of the axes it works on.
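
As a rough sketch of the idea (not the planned implementation, which is meant to build on xarray), a transformation that stores its parameters by axis name can be restricted to any element's axes and concatenated with any other. The class NamedScale and its methods are hypothetical names used only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamedScale:
    """Scale whose factors are keyed by axis name (hypothetical class)."""

    factors: dict  # axis name -> scale factor

    def for_axes(self, axes):
        """Factors in the order of the given axes; missing axes are not scaled."""
        return [float(self.factors.get(axis, 1.0)) for axis in axes]

    def then(self, other):
        """Concatenate two scales, whatever axes each of them operates on."""
        axes = set(self.factors) | set(other.factors)
        return NamedScale(
            {a: self.factors.get(a, 1.0) * other.factors.get(a, 1.0) for a in axes}
        )


# A cyx scale assigned to xy points reduces to an xy scale.
cyx_scale = NamedScale({"c": 1, "y": 3, "x": 2})
xy_factors = cyx_scale.for_axes(("x", "y"))

# Scales on different axes can still be concatenated.
combined = NamedScale({"x": 2}).then(NamedScale({"z": 4, "x": 0.5}))
xyz_factors = combined.for_axes(("x", "y", "z"))
```

Because every factor is tied to an axis name, the order in which axes are declared never enters the computation.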

The old version of the code can be found in the previous commit, as well as in a separate branch created from it. If the new code works as planned I'll delete the branch (which I already tagged for archiving purposes and can be found here).

LucaMarconato · 2023-02-02T19:25:46Z

So one major comment I have so far (by just checking the _spatialdata.py file) is the type of API we want to have. Copying it from the design doc:

import spatialdata as sd
from spatialdata import SpatialData

sdata = SpatialData(...)
points = sd.transform(sdata.points["image1"], tgt="tgt_space")
sdata = sd.transform(sdata, tgt="tgt_space")

I think we should have one way to apply a transform to either a SpatialData object or its elements, and maybe we also shouldn't allow the user to pass the element directly, but only its name and type (e.g. {"labels": "sample1"}). I think this would prevent mistakes from mixing up copies/views of the elements already in the SpatialData object with the transformed ones. wdyt?

I think that we need two functions:

- one for getting a transformation between one target and one source, and this would need to be a method of SpatialData since it needs to build the digraph of the transformations of the elements that are inside. This is currently the function map_coordinate_systems(). We can change the name, it's a bit confusing. Maybe map()?
- one method to actually transform one element to a coordinate system. Again this needs to be a method of SpatialData because it needs to call the function above internally.
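
To make the first function more concrete, here is a minimal sketch of finding a transformation through a graph of coordinate systems. It is not the code of map_coordinate_systems(): scale_between is a hypothetical name, a single scale factor stands in for a real transformation object, and the sketch assumes every transformation is invertible so that edges can be walked in both directions.

```python
from collections import deque


def scale_between(edges, source, target):
    """Compose scale factors along a path found by breadth-first search.

    edges maps (source_cs, target_cs) to a scale factor, used here as a
    stand-in for a real transformation. Hypothetical helper.
    """
    graph = {}
    for (src, dst), factor in edges.items():
        graph.setdefault(src, []).append((dst, factor))
        # Assumption: transformations are invertible, so add the reverse edge.
        graph.setdefault(dst, []).append((src, 1.0 / factor))

    queue = deque([(source, 1.0)])
    visited = {source}
    while queue:
        node, total = queue.popleft()
        if node == target:
            return total
        for neighbour, factor in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, total * factor))
    raise ValueError(f"no path from {source!r} to {target!r}")


# Two elements registered to the same coordinate system.
edges = {("image", "global"): 2.0, ("points", "global"): 4.0}
image_to_points = scale_between(edges, "image", "points")
```

The search needs the transformations of all the elements at once, which is why the function has to see the whole container rather than a single element.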

The need to have methods of SpatialData is the reason why I couldn't go with this idea from the design doc:

points = sd.transform(sdata.points["image1"], tgt="tgt_space")

Still we could change the second function to this:

points = sd.transform(sdata, element_type='labels', element_name='cells', tgt="tgt_space")

Here are the pros and cons of this second approach.
Pro:

- we can rewrite set_transformation(), remove_transformation() and remove_all_transformations() to take the sdata object as first argument, followed by the string description of the element as above. In this way we can remove the *_in_memory() methods, since all the functions can be static methods (or just global functions). I really like this.

Con:

- passing both the element_type and the element_name is (IMHO very) cumbersome. One way to get around this would be to require unique names for elements. I explore this possibility in an issue that I have just created: "Discussion: unique names for elements, pro and cons" (#124).

My proposal is to require unique names for elements (separate PR) and after that to change the second function as described (another PR). For the moment, I don't think we will have problems of copy/views, things should work.
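
A small sketch of why unique names would help: if no two elements share a name, the element type can be recovered from the name alone. The function find_element and the nested-dict layout are assumptions made for this example, not the SpatialData API.

```python
def find_element(elements_by_type, name):
    """Return (element_type, element) for a name, assuming names are unique.

    elements_by_type is a plain dict such as {"labels": {...}, "points": {...}}.
    Hypothetical helper, for illustration only.
    """
    matches = [
        (element_type, elements[name])
        for element_type, elements in elements_by_type.items()
        if name in elements
    ]
    if not matches:
        raise KeyError(name)
    if len(matches) > 1:
        # Without unique names the caller would also have to pass the type.
        types = [element_type for element_type, _ in matches]
        raise ValueError(f"name {name!r} is ambiguous across {types}")
    return matches[0]


store = {"labels": {"cells": "labels-data"}, "points": {"spots": "points-data"}}
found = find_element(store, "cells")
```

With such a lookup the call could shrink to something like sd.transform(sdata, element_name='cells', tgt="tgt_space").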

kevinyamauchi

Thanks @LucaMarconato ! I think this is a solid step in the right direction. I am approving so that we can keep things moving forward. I think we will have to continue to iterate on API, etc., but it will be easier to do so when we actually use it to make vignettes, etc.

> one for getting a transformation between one target and one source, and this would need to be a method of SpatialData since it needs to build the digraph of the transformations of the elements that are inside. This is currently the function map_coordinate_systems(). We can change the name, it's a bit confusing. Maybe map()?

Perhaps it could be called something like get_transformation_between_coordinate_systems()? Probably too long, but more descriptive. In any case, I agree that map_coordinate_systems() is probably not the most clear name.

Does this method actually need to be on SpatialData? It seems to me that one could still construct the directed graph of the transformations on a given element. Perhaps I'm missing something though.

spatialdata/_core/_spatialdata.py

LucaMarconato · 2023-02-12T00:28:31Z

Thanks for the comments @kevinyamauchi. I have removed the get/set/remove transformation functions (8 functions in total) and replaced them with 3 functions living outside the SpatialData class. I have also moved map_coordinate_systems() out of the SpatialData class and renamed it to get_transformation_between_coordinate_systems().

LucaMarconato · 2023-02-12T00:29:23Z

I will now merge the new points IO PR. It will take some time to fix all the tests, but then we'll be ready to merge!

giovp · 2023-02-13T09:52:41Z

@LucaMarconato quick thing I noticed while working on #132: PointsModel.parse still requires an annotation to be passed in the case of nd.array input. I think this should not be enforced since the feature_key is not mandatory. Was it fixed here or shall I push a quick fix to main?

LucaMarconato · 2023-02-13T09:54:59Z

Hi, true, I didn't actually fix it. A quick fix would be great, thanks.

LucaMarconato added 3 commits January 8, 2023 20:55

removed type str for input_coordinate_system and output_coordinate_sy…

2107071

…stem

simplified logic of Sequence coordinate systems inference

e1e12a1

now Sequence works also for mismatched coordinate systems

ad3e7b2

LucaMarconato changed the title ~~3 improviements to transforms: 1. helper function to get an affine tr…~~ improviements to transforms: robustness and ergonomics Jan 9, 2023

LucaMarconato and others added 7 commits January 10, 2023 00:25

transformations can be applied to any element and axes are automatica…

9a6d75f

…lly adjusted

better repr for transforms; wip on transforms with mismatching axes

1bb4e93

[pre-commit.ci] auto fixes from pre-commit.com hooks

370970d

for more information, see https://pre-commit.ci

wip, some tests passing

53b27ce

all tests passing (but mypy)

db0c786

Merge branch 'feature/transform_ergonomics' of https://github.com/scv…

3985195

…erse/spatialdata into feature/transform_ergonomics

[pre-commit.ci] auto fixes from pre-commit.com hooks

e8962d7

for more information, see https://pre-commit.ci

LucaMarconato added 5 commits January 12, 2023 14:09

simplified parser, all tests passing, fixed mypy

5482193

Merge branch 'feature/transform_ergonomics' of https://github.com/scv…

32eb7ab

…erse/spatialdata into feature/transform_ergonomics

complex trans converted to affine; fixed bug with sequences and misma…

62639c8

…tching cs; added relative test

wip

2f6ea94

wip2

20f2e3a

LucaMarconato added 10 commits January 16, 2023 14:12

fixed problem with io (more in a future pr); started refactoring out …

017b2b9

…ngff transforms

added basic structure; added identity and tests

172c861

restored the previous ngff sequence axes inference logic from 358a338

d12b992

added MapAxis and tests

1d0cb58

added translation and scale, with tests

208f896

added affine, with tests

e208a37

added sequence, with tests; bugfix

00c50e8

added repr

8874c31

code for transforming coordinates (+ tests), working on transforming …

e09e3a8

…elements

trying to fix tests in ubuntu

c644035

LucaMarconato and others added 2 commits February 2, 2023 19:56

Apply suggestions from code review

95acd24

Co-authored-by: Giovanni Palla

[pre-commit.ci] auto fixes from pre-commit.com hooks

96d9a4f

for more information, see https://pre-commit.ci

LucaMarconato mentioned this pull request Feb 2, 2023

Discussion: unique names for elements, pro and cons #124

Closed

LucaMarconato mentioned this pull request Feb 2, 2023

napari reader for multiscale-spatial-image datasets spatial-image/multiscale-spatial-image#71

Open

LucaMarconato added 2 commits February 6, 2023 14:13

small fixes after giovp review

1e92fed

Merge branch 'feature/transform_ergonomics' of https://github.com/scv…

7af34c4

…erse/spatialdata into feature/transform_ergonomics

kevinyamauchi approved these changes Feb 6, 2023

starting adding new functions for set/get/del transform

77e7863

LucaMarconato and others added 11 commits February 12, 2023 01:33

Apply suggestions from code review

f69704e

Co-authored-by: Kevin Yamauchi

finished adding the new set/get/remove transfomation func

b9118af

Merge branch 'feature/transform_ergonomics' of https://github.com/scv…

92f1aa7

…erse/spatialdata into feature/transform_ergonomics

wip new points

a595489

Merge branch 'main' into feature/transform_ergonomics

652d0eb

almost all tests passing

eb10b3b

[pre-commit.ci] auto fixes from pre-commit.com hooks

1ef2cf0

for more information, see https://pre-commit.ci

all tests passing

0164eb5

Merge branch 'feature/transform_ergonomics' of https://github.com/scv…

7c07a07

…erse/spatialdata into feature/transform_ergonomics

readded pyarrow dependency

545349f

fixed bug with PointsFormat (validation failed after io because of mi…

3eac031

…ssing optional FEATURE_KEY); added tests

giovp mentioned this pull request Feb 13, 2023

unify polygons and shapes elements #132

Merged

LucaMarconato added 3 commits February 13, 2023 13:46

fix in points parse

3a31ae8

Merge branch 'main' into feature/transform_ergonomics

e1745f5

updated design doc; fix __repr__; fix points model

458ba9d

LucaMarconato merged commit d0f0fa0 into main Feb 13, 2023

LucaMarconato deleted the feature/transform_ergonomics branch February 22, 2023 21:27
