[RFC] Add ElementWiseMult #983

Merged · 10 commits merged into master from ky-add-field-flatness on Mar 2, 2019
Conversation

@kevinyamauchi (Collaborator) commented Feb 5, 2019

Objective

This PR introduces a pipeline component for field flatness correction and image normalization via a user-supplied correction image.

Overview

Per our discussion in #945, I did the following:

For illumination correction, the user supplies the full correction matrix. This method uses the Filter PipelineComponent. The user-supplied matrix is a 5-D tensor, and any singleton dimension is broadcast. For example, to correct illumination uniformly across (x, y), the user would pass a matrix of shape (1, 1, 1, x, y). To scale channels uniformly, the user would pass (1, c, 1, 1, 1), where the c dimension has length equal to the number of channels.
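
For intuition, a minimal pure-numpy sketch of the broadcasting behaviour described above (the array names and shapes are illustrative, not part of the PR):

import numpy as np

# illustrative stack with shape (r, c, z, y, x) = (1, 3, 5, 4, 4)
stack = np.ones((1, 3, 5, 4, 4))

# correct illumination uniformly across (x, y): singleton r, c, z
xy_corr = np.random.rand(1, 1, 1, 4, 4)
corrected_xy = stack * xy_corr  # broadcasts over r, c, z

# scale each channel uniformly: singleton everywhere except c
ch_corr = np.array([1.0, 0.5, 2.0]).reshape(1, 3, 1, 1, 1)
corrected_ch = stack * ch_corr  # broadcasts over r, z, y, x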

Usage

import numpy as np
import xarray as xr

from starfish.imagestack.imagestack import ImageStack
import starfish

# Make the ImageStack
im = np.ones((1, 3, 5, 2, 2))
stack = ImageStack.from_numpy_array(im)

# Make the correction matrix: singleton in every dim except c, which matches the 3 channels
corr_mat = np.ones((1, 1, 1, 1, 3))
corr_mat[0, 0, 0, 0, 1] = 0.5  # scale channel 1 by 0.5
# label the axes; the dims need not be in the ImageStack's order
corr_mat_xr = xr.DataArray(corr_mat, dims=('r', 'x', 'z', 'y', 'c'))

# Correct...
filter_mult = starfish.image.Filter.ElementWiseMult(mult_mat=corr_mat_xr)
stack2 = filter_mult.run(stack, in_place=False, verbose=True)

To do

  • Add tests (we can probably use variants of the usage example above)

Questions

  • I am not familiar with click. Can I supply an ndarray as a default parameter?
  • For the _DEFAULT_TESTING_PARAMETERS = {"corr_mat": 0}: I assume these are passed as kwargs into some test. Where are those tests?
  • Currently, I use group_by to chunk by ROUND, CH, and Z (where appropriate) and numpy broadcasting for X and Y; a conceptual sketch of both options follows this list. I think this is preferable, as it is compatible with the multiprocessing. We could also just multiply the mult_mat with the image directly, thanks to the magic of numpy broadcasting, but I don't think that would use the multiprocessing as implemented. Thoughts?
  • Is there a better way to get the group_by axes (see the run() method)?
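
A minimal sketch of the two options weighed in the third question above, in plain numpy (the chunked loop stands in for the group_by path; all names and shapes are illustrative):

import numpy as np

stack = np.ones((1, 3, 5, 8, 8))        # (r, c, z, y, x)
corr = np.full((1, 3, 1, 1, 1), 0.5)    # per-channel correction matrix

# Option 1: chunk by (round, channel, z) and multiply each (y, x) plane;
# each chunk could be handed off to a separate worker process.
out_chunked = np.empty_like(stack)
for r in range(stack.shape[0]):
    for c in range(stack.shape[1]):
        for z in range(stack.shape[2]):
            # singleton correction dims index at 0; the (1, 1) slice broadcasts over (y, x)
            out_chunked[r, c, z] = stack[r, c, z] * corr[0, c, 0]

# Option 2: a single direct multiply; numpy broadcasts the singleton dims.
out_direct = stack * corr

assert np.allclose(out_chunked, out_direct)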

@codecov-io commented Feb 5, 2019

Codecov Report

Merging #983 into master will increase coverage by 0.07%.
The diff coverage is 90%.


@@            Coverage Diff             @@
##           master     #983      +/-   ##
==========================================
+ Coverage   88.48%   88.56%   +0.07%     
==========================================
  Files         164      167       +3     
  Lines        6139     6216      +77     
==========================================
+ Hits         5432     5505      +73     
- Misses        707      711       +4
Impacted Files Coverage Δ
starfish/image/_filter/element_wise_mult.py 90% <90%> (ø)
starfish/test/image/filter/test_linear_unmixing.py 100% <0%> (ø)
starfish/image/_filter/linear_unmixing.py 96.77% <0%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ae5311a...e9e99ef.

@kevinyamauchi (Collaborator, Author)

@ambrosejcarr Do you think we should pass in the option to rescale or clip (i.e., pass rescale to preserve_float_range())? I think we should.
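
For context, a tiny numpy sketch of the two behaviours being weighed here; this illustrates clip-versus-rescale semantics in general and is not the preserve_float_range() implementation:

import numpy as np

vals = np.array([0.2, 0.8, 1.6])   # an element-wise multiply can push values above 1

clipped = np.clip(vals, 0, 1)      # clip: saturate at 1     -> [0.2, 0.8, 1.0]
rescaled = vals / vals.max()       # rescale: divide by max  -> [0.125, 0.5, 1.0]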

@ttung (Collaborator) left a comment

I think a lot of your code assumes a certain ordering of the axes in ImageStack, which is explicitly not guaranteed.

@kevinyamauchi (Collaborator, Author)

Ah, yes it does. I did not realize that wasn't guaranteed. Is there a way to get the order of the axes in the image passed to _mult()?

@ambrosejcarr (Member) commented Feb 9, 2019

This is great, @kevinyamauchi.

> Ah, yes it does. I did not realize that wasn't guaranteed. Is there a way to get the order of the axes in the image passed to _mult()?

I think the right approach here would be to accept an xarray as the matrix argument, with the axes labeled. In the below example, you can think of xx as similar to the ImageStack and yy as the illumination correction input.

Here's my best attempt. It gets stuck on something in xarray, which I've asked the xarray maintainers about. Depending on what they say, we may be able to implement the functionality or a workaround.

import starfish
import xarray as xr
import numpy as np

# 3d array of ones
x = np.ones([10, 10, 10])
# 3d [0, 9] range over 2nd axis, singletons in 1st and 3rd axes
y = np.arange(10)[None, :, None]

# create DataArrays
xx = xr.DataArray(x, dims=('x', 'y', 'z'))
# note reversal of (x, y) dimensions
yy = xr.DataArray(y, dims=('y', 'x', 'z'))

# if xx is our ImageStack, this provides the aligned-axes guarantee
yy_aligned = yy.transpose(*xx.dims)

# I wish this worked, see pydata/xarray#2171
xx * yy_aligned

[Edit: the workaround isn't as gross as I thought it would be. Multiplying by the bare .values array bypasses xarray's dim-name alignment, and the transpose above has already put the axes in matching positional order.]

xx * yy_aligned.values

@kevinyamauchi (Collaborator, Author) commented Feb 9, 2019

I was trying to use the ImageStack.apply() method to leverage the multiprocessing support that comes with it. As I understand it, ImageStack.apply() uses _processing_workflow(), which passes an ndarray of the selected tiles to the filter function. Is there a way to force it to pass an xarray instead?

@kevinyamauchi (Collaborator, Author)

@ambrosejcarr, I made the changes we discussed in our call. Would you mind taking a look? I updated the usage snippet in the PR description.

@ambrosejcarr (Member)

> I was trying to use the ImageStack.apply() method to leverage the multiprocessing support that comes with it. As I understand it, ImageStack.apply() uses _processing_workflow(), which passes an ndarray of the selected tiles to the filter function. Is there a way to force it to pass an xarray instead?

#1035

@ambrosejcarr (Member)

@kevinyamauchi I ran some tests at scale to try to understand whether we need the multiprocessing.

In [0]: import starfish
   ...: import os
   ...: import xarray as xr
   ...: import numpy as np
   ...: from starfish.imagestack.parser.crop import CropParameters  # noqa 
   ...: experiment = starfish.Experiment.from_json(os.path.expanduser('~/scratch/seqfish/experiment.json')) 
   ...: fov = experiment['fov_000'] 
   ...: crop_params = CropParameters(x_slice=slice(0, 1024), y_slice=slice(1024, 2048)) 
   ...: image = fov.get_image('primary', crop_params)

In [1]: mult_array = xr.DataArray(np.arange(12)[None, :, None, None, None], dims=('r', 'c', 'z', 'y', 'x'))

In [2]: ewm = starfish.image.Filter.ElementWiseMultiply(mult_array)

In [3]: %timeit -n1 -r1 ewm.run(image, n_processes=8)
38.7 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

In [4]: %timeit -n1 -r1 image.xarray * mult_array.values
24.2 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

In [5]: res_mp = ewm.run(image)

In [6]: res_direct = image.xarray * mult_array.values

While it does eventually get faster if you add enough processes (20), I suspect it's better to just leverage numpy's vectorization. Do you mind if I simplify the method to use a single process?
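
For reference, a minimal sketch of what the single-process path could look like, following the direct multiply in [4]/[6] above (this builds on the earlier xarray examples and is not the merged implementation; shapes are reduced to keep it runnable):

import numpy as np
import xarray as xr

# stand-ins for image.xarray and the correction tensor, in (r, c, z, y, x) order
stack = xr.DataArray(np.ones((4, 12, 27, 64, 64)), dims=('r', 'c', 'z', 'y', 'x'))
mult_array = xr.DataArray(np.arange(12)[None, :, None, None, None],
                          dims=('r', 'c', 'z', 'y', 'x'))

# align the correction tensor to whatever axis order the stack uses,
# then multiply by the bare ndarray so plain numpy broadcasting does the work
aligned = mult_array.transpose(*stack.dims)
result = stack * aligned.values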

@kevinyamauchi (Collaborator, Author)

My intuition was that for our current tile size this would be the case. I am less sure how this works if people bring really large tiles (e.g., a stitched image of a piece of tissue). What do you think about that? Is that in scope for near-term starfish goals? I'm fine with either approach!

@ambrosejcarr (Member)

> My intuition was that for our current tile size this would be the case. I am less sure how this works if people bring really large tiles (e.g., a stitched image of a piece of tissue). What do you think about that? Is that in scope for near-term starfish goals? I'm fine with either approach!

For larger tiles, @ttung is working on a workflow runner that will break them up such that they fit in memory and send them off to individual compute nodes. I think the single-process case here makes more sense.

I omitted the size from the test above, but it is (4, 12, 27, 1024, 1024), so it's a good-sized test for a single FoV.

@ambrosejcarr force-pushed the ky-add-field-flatness branch from d801353 to c98afee on March 2, 2019 05:53
@ambrosejcarr merged commit 0b909d2 into master on Mar 2, 2019
@ambrosejcarr deleted the ky-add-field-flatness branch on March 2, 2019 14:58