
Add scan. #531


Merged
merged 4 commits on Jun 12, 2025

Conversation

@dcherian (Contributor):

Closes #277

# Here we diverge from Blelloch, who runs a balanced tree algorithm to calculate the scan.
# Instead we generalize by recursively applying the scan to `reduced`.
# 3a. First we merge to a decent intermediate chunksize since reduced.chunksize[axis] == 1
new_chunksize = min(reduced.shape[axis], reduced.chunksize[axis] * 5)
@dcherian (Contributor, Author):
Need input here on choosing a new intermediate chunksize to rechunk to, based on memory info.

@tomwhite (Member):

There are a couple of things to consider here: the number of chunks to combine at each stage, and the memory limits.

The first is like `split_every` in reduction, where the default is 4, although 6 or 8 may be better for larger workloads.

For the second, we should make sure the new chunksize is no larger than `(x.spec.allowed_mem - x.spec.reserved_mem) // 4`, where the factor of 4 comes about because of the {compressed, uncompressed} * {input, output} copies.

There is an error case where this memory constraint means the new chunksize is no larger than the existing one, so the computation can't proceed. The user can fix this either by reducing the chunksize or by increasing the memory. This is similar to this case:

cubed/cubed/core/ops.py

Lines 985 to 991 in 88c5dc4

# single axis: see how many result chunks fit in max_mem
# factor of 4 is memory for {compressed, uncompressed} x {input, output}
target_chunk_size = (max_mem - chunk_mem) // (chunk_mem * 4)
if target_chunk_size <= 1:
    raise ValueError(
        f"Not enough memory for reduction. Increase allowed_mem ({allowed_mem}) or decrease chunk size"
    )
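The two constraints above can be sketched as a small helper. The parameter names (`allowed_mem`, `reserved_mem`, `split_every`) mirror the discussion but are illustrative, not cubed's exact API:

```python
def new_scan_chunksize(shape_along_axis, chunksize_along_axis, itemsize,
                       allowed_mem, reserved_mem, split_every=4):
    """Pick an intermediate chunksize (in elements) for the scan rechunk step.

    Bounded both by a split_every-style fan-in and by the memory budget.
    """
    # Factor of 4 accounts for {compressed, uncompressed} x {input, output} copies.
    mem_limit_elems = ((allowed_mem - reserved_mem) // 4) // itemsize
    # split_every-style fan-in: combine at most `split_every` chunks per stage.
    candidate = min(shape_along_axis, chunksize_along_axis * split_every)
    new_chunksize = min(candidate, mem_limit_elems)
    # Error case: memory constraint prevents any growth, so recursion can't proceed.
    if new_chunksize <= chunksize_along_axis and chunksize_along_axis < shape_along_axis:
        raise ValueError(
            "Not enough memory for scan. Increase allowed_mem or decrease chunk size"
        )
    return new_chunksize
```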

shape=scanned.shape,
dtype=scanned.dtype,
chunks=scanned.chunks,
extra_projected_mem=scanned.chunkmem * 2, # arbitrary
@dcherian (Contributor, Author):

Need input here too.

@tomwhite (Member):

This should be the memory allocated to read from the side inputs (`scanned` and `increment` here). We double the chunk memory to account for reading the compressed Zarr chunk, so the result would be

extra_projected_mem=scanned.chunkmem * 2 + increment.chunkmem * 2

(There's an open issue #288 to make this a bit more transparent.)

@tomwhite (Member) left a comment:

Are you going to add a user-facing cumulative_sum function from the Array API? This would be a good function for the unit tests to test.
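As a sketch of what such a unit test could assert, using plain NumPy as the reference implementation (the Array API's `cumulative_sum` is an inclusive scan along an axis; the cubed-facing name here is assumed, not confirmed by this thread):

```python
import numpy as np

def reference_cumulative_sum(x, axis=0):
    # Inclusive scan along `axis`, matching np.cumsum and the Array API
    # default (include_initial=False).
    return np.cumsum(x, axis=axis)

a = np.arange(1, 7).reshape(2, 3)   # [[1, 2, 3], [4, 5, 6]]
result = reference_cumulative_sum(a, axis=1)
# result == [[1, 3, 6], [4, 9, 15]]
```

A cubed unit test would compute the same input through the new scan-backed function and compare against this NumPy result.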



# Blelloch (1990) out-of-core algorithm.
# 1. First, scan blockwise
scanned = blockwise(func, "ij", array, "ij", axis=axis)
@tomwhite (Member):

Using `map_blocks` would be simpler and would avoid the 2D assumption.
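The Blelloch-style structure being discussed (scan blockwise, scan the per-block totals, then add each block's increment) can be sketched in plain NumPy. `blocked_cumsum` is an illustrative name, not cubed's implementation; note it works along any axis, which is why avoiding the 2D `"ij"` assumption matters:

```python
import numpy as np

def blocked_cumsum(x, block_size, axis=0):
    """Out-of-core-style cumulative sum: scan per block, then add increments."""
    blocks = np.split(x, range(block_size, x.shape[axis], block_size), axis=axis)
    # 1. First, scan blockwise (each block independently).
    scanned = [np.cumsum(b, axis=axis) for b in blocks]
    # 2. Take the last element of each scanned block (the block total).
    totals = [np.take(s, [-1], axis=axis) for s in scanned]
    # 3. Scanning the totals gives the increment to add to each later block.
    increments = np.cumsum(np.concatenate(totals, axis=axis), axis=axis)
    # 4. Add block i's increment (the running total of blocks 0..i-1).
    out = [scanned[0]]
    for i, s in enumerate(scanned[1:]):
        out.append(s + np.take(increments, [i], axis=axis))
    return np.concatenate(out, axis=axis)
```

For example, `blocked_cumsum(np.arange(10), block_size=4)` matches `np.cumsum(np.arange(10))`.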

@@ -1442,3 +1443,120 @@ def smallest_blockdim(blockdims):
m = ntd[0]
out = ntd
return out


def wrapper_binop(
@tomwhite (Member):

Maybe call it something like `_scan_binop` to link it to the scan implementation? I've been using a naming convention like that elsewhere in the file.

@tomwhite tomwhite mentioned this pull request Aug 1, 2024
@tomwhite (Member):

I'm going to merge this, and then do some follow-up PRs to add cumulative sum and product functions.

@dcherian (Contributor, Author):

Sorry I dropped it. Does it still work?

@tomwhite (Member):

> Sorry I dropped it. Does it still work?

No problem! I just wrote a cumulative sum test that passes, so yes! It uses `map_direct`, which has since been deprecated in favour of `map_selection`, so I'd like to change that too (which may be a bit fiddly).

As I said, I'm very happy to push this forward building on the work you did, unless you'd like to take a look? My goal is to close it so we can mark #438 as complete.

@tomwhite (Member):

> It uses `map_direct`, which has since been deprecated in favour of `map_selection`, so I'd like to change that too (which may be a bit fiddly).

Actually it's not possible to use `map_selection`, since it only works on single arrays, and we have two (`scanned` and `increment`). Instead we can use `general_blockwise` with a key function that returns the relevant chunk in `increments` that corresponds to the chunk in the main array (slightly tricky since the chunking differs). I'll try to do that as a follow-on.
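The index-translation part of that key function can be sketched in pure Python. Both function names below are illustrative (not cubed's API); the idea is to map an output chunk's start offset to the increment chunk that covers it, since the two arrays are chunked differently:

```python
def chunk_offsets(chunks):
    """Start offset of each chunk along one axis, e.g. (4, 4, 2) -> [0, 4, 8]."""
    offsets, start = [], 0
    for c in chunks:
        offsets.append(start)
        start += c
    return offsets

def increment_chunk_index(out_index, out_chunks, inc_chunks):
    """Map an output chunk index to the increment chunk covering its start offset."""
    start = chunk_offsets(out_chunks)[out_index]
    inc_offsets = chunk_offsets(inc_chunks)
    # The last increment chunk whose start offset is <= this chunk's start.
    return max(i for i, off in enumerate(inc_offsets) if off <= start)
```

A `general_blockwise` key function would use a translation like this to return, for each output chunk, the matching chunk keys from both `scanned` and `increment`.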

@dcherian (Contributor, Author):

I won't be able to take a look till the weekend, so feel free to take over!

@tomwhite tomwhite marked this pull request as ready for review June 11, 2025 16:52
@tomwhite (Member):

I think this is basically working now - the test failures are unrelated. I'll leave it open for a bit in case you want to have a look @dcherian.

@dcherian (Contributor, Author):

Ah, I was so close! LGTM. Thanks for finishing it up.

@tomwhite tomwhite merged commit 7de9081 into cubed-dev:main Jun 12, 2025
16 of 19 checks passed
@tomwhite (Member):

Thanks for your work on this @dcherian!

Successfully merging this pull request may close these issues:

- Add scan / prefix sum primitive

2 participants