PR: Add note for functions that are not compatible with static memory allocation #168

steff456 · 2021-04-22T06:34:45Z

This PR marks the functions discussed in #164 with a note regarding boolean array indexing.

kgryte

@steff456 Thanks for working on this. I think we need to clarify the wording in the notes, and I had a question about whether the note is applicable to `where`.


kgryte

LGTM


rgommers

Thanks @steff456. I pushed a change to make this a more unique/recognizable admonition.

I'm not sure about saying "it's incompatible with static memory allocation and JIT compilation". It makes these functions harder to implement, but it's not fundamentally incompatible. For example, the Numba implementation of `unique` is only 6 lines long. It is also possible to pre-allocate the maximum amount of memory one would need for the `unique()` call and then use that.

So I'd probably say something more along the lines of "the shape of the output array for this function depends on the data values in the input array, hence it is difficult to implement for libraries that rely on static memory allocation or JIT compilation. See REF for more details". Then, in that reference (a separate page under "Design topics"), we should say that libraries may choose to leave out functions marked as data-dependent; if they do so, they must leave out all of them and clearly indicate in their documentation that they do so.
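The pre-allocation idea above can be illustrated with a minimal sketch. It is pure Python, with lists standing in for arrays, and is not taken from Numba or any other library; the name `unique_padded` and the `fill` parameter are hypothetical. The buffer is sized for the worst case (every element distinct) before any value is inspected, and the number of valid entries is returned separately.

```python
from typing import List, Tuple


def unique_padded(values: List[int], fill: int = 0) -> Tuple[List[int], int]:
    """Hypothetical helper: sorted unique values in a fixed-size buffer.

    Returns (buffer, count). The buffer always has len(values) slots;
    only the first `count` slots hold unique values, the rest hold `fill`.
    """
    size = len(values)        # worst case: every element is distinct
    buffer = [fill] * size    # allocated before looking at any value
    count = 0
    for v in sorted(values):
        # A sorted scan only needs to compare with the last value written.
        if count == 0 or buffer[count - 1] != v:
            buffer[count] = v
            count += 1
    return buffer, count


# Same input shape, same buffer shape, different number of valid entries.
buf_a, n_a = unique_padded([3, 1, 3, 1])
buf_b, n_b = unique_padded([4, 2, 9, 7])
print(buf_a, n_a)
print(buf_b, n_b)
```

The output shape is static here, but the caller has to carry `count` around, which is part of why this is harder to implement rather than impossible.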

This PR addresses gh-84, and there are more relevant comments there. In particular, boolean indexing should get the same warning.
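To see why boolean indexing falls into the same category, here is a small pure-Python sketch. Lists stand in for arrays, and `boolean_index` is a hypothetical name, not a function from the specification. The length of the result equals the number of `True` entries in the mask, so it cannot be derived from the input shapes alone.

```python
from typing import List


def boolean_index(values: List[int], mask: List[bool]) -> List[int]:
    """Hypothetical stand-in for x[mask] on a one-dimensional array."""
    if len(values) != len(mask):
        raise ValueError("values and mask must have the same length")
    return [v for v, keep in zip(values, mask) if keep]


x = [10, 20, 30, 40]
few = boolean_index(x, [True, False, False, False])
many = boolean_index(x, [True, True, False, True])

# Both masks have shape (4,), yet the results differ in length.
print(len(few), len(many))
```

The only bound known ahead of time is the upper one: the result can never be longer than `x`.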

shoyer · 2021-05-03T21:10:10Z

> I'm not sure about saying "it's incompatible with static memory allocation and JIT compilation". It makes it harder to implement, but it's not fundamentally incompatible. For example, the Numba implementation of unique is only 6 lines long.

I agree, it's really implementation dependent. It's fine in Numba's JIT because Numba has types for dynamically sized arrays, which is different from the case for JAX.

I like @rgommers's suggestion, but rather than "hence it is difficult to implement for libraries that rely on static memory allocation or JIT compilation", I might say "hence it can be difficult to implement for libraries that build computation graphs for arrays without knowing their values." And then we can point to examples like JAX, Dask, etc.
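A rough sketch of what "building a computation graph without knowing values" means for shapes is given below. It does not reflect how JAX or Dask are implemented; `Shape`, `infer_add_shape`, and `infer_unique_shape` are hypothetical names, and `None` is an assumed marker for a dimension that is unknown until the data is available.

```python
from typing import Optional, Tuple

# None marks a dimension whose size is unknown until the values are computed.
Shape = Tuple[Optional[int], ...]


def infer_add_shape(a: Shape, b: Shape) -> Shape:
    """Element-wise add of equally shaped inputs: shape known from shapes alone."""
    if a != b:
        raise ValueError("this sketch only handles equal shapes")
    return a


def infer_unique_shape(a: Shape) -> Shape:
    """unique: one-dimensional result whose length depends on the values."""
    return (None,)


static_result = infer_add_shape((2, 3), (2, 3))
dynamic_result = infer_unique_shape((2, 3))
print(static_result, dynamic_result)
```

Every operation downstream of `infer_unique_shape` inherits the unknown dimension, which is where graph-building libraries run into trouble.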

szha · 2021-05-03T22:10:50Z

Agreed. Dynamic shapes make ahead-of-time memory planning impossible and thus affect JIT compilers that require full knowledge of shapes and types, as well as AOT compilation.

kgryte · 2021-05-11T04:00:06Z

spec/API_specification/searching_functions.md

+:::{admonition} Data-dependent output shape
+:class: important
+
+The shape of the output array for this function depends on the data values in the input array, hence it can be difficult to implement for libraries that build computation graphs for arrays without knowing their values. See {ref}`indexing` section for more details.


Not clear to me what we are referring to in the "indexing section". I think, instead, what is wanted is to refer to a separate page under "Design topics" where we specifically discuss APIs which return arrays having data-dependent shapes (for further discussion, see here).

steff456 · 2021-05-11

Should we refer to this PR or to the boolean indexing section inside the spec or both?

kgryte · 2021-05-11

I think we need to add a new section in "Design Topics" (via this PR) and then reference that section.

kgryte · 2021-05-11T04:00:59Z

spec/API_specification/set_functions.md

+:::{admonition} Data-dependent output shape
+:class: important
+
+The shape of the output array for this function depends on the data values in the input array, hence it can be difficult to implement for libraries that build computation graphs for arrays without knowing their values. See {ref}`indexing` section for more details.


Same comment as above.


kgryte · 2021-06-07T18:29:21Z

@steff456 I submitted gh-193 as a PR to supersede this one, as I mucked up the Git history. Sorry about that! If gh-193 is acceptable, we can close this PR out.

kgryte · 2021-06-07T18:33:01Z

Closing out given gh-193.

Add note for functions that doesn't support boolean array indexing

8815028

steff456 requested a review from kgryte April 22, 2021 06:34

steff456 self-assigned this Apr 22, 2021

kgryte requested changes Apr 26, 2021


kgryte changed the title ~~PR: Add note for functions that doesn't support boolean array indexing~~ PR: Add note for functions that are not compatible with static memory allocation Apr 26, 2021

steff456 and others added 3 commits April 26, 2021 12:23

clarify the notes

2757e73

remove note from where function

b98c908

kgryte approved these changes Apr 27, 2021


rgommers and others added 3 commits April 27, 2021 20:40

Change deployment to github actions for github pages (data-apis#176)

b0f4ca0

Specify the input types for logaddexp (data-apis#175)

0c5d913

rgommers added the Narrative Content Narrative documentation content. label May 3, 2021

Create a unique admonition for data-dependent shapes

ffc4d0c

rgommers requested changes May 3, 2021


rgommers requested a review from shoyer May 3, 2021 20:11

Add data type guidance (data-apis#173)

2e759ef

kgryte mentioned this pull request May 6, 2021

Proposal: add APIs for getting and setting elements via a list of indices (i.e., take, put, etc) #177

Open

Rewrite note with the given suggestions

204e283

kgryte requested changes May 11, 2021


rgommers and others added 8 commits May 11, 2021 21:49

Add Cholesky function specification (data-apis#110)

76730f1

* Add Cholesky spec * Update description * Return an array rather than a tuple * Update dtype requirements * Update dtype requirements * Move API to submodule

Add specification for computing the dot product (linalg: vecdot) (dat…

36fd440

…a-apis#137) * Add dot product specification * Update specification * Fix output shape * Update description * Update copy * Update copy * Rename to inner_dot * Update dtype requirements * Rename to vecdot

Add specification for computing a tensor contraction (linalg: tensord…

0768fc8

…ot) (data-apis#136) * Add tensordot specification * Update data type requirements * Update dtype requirements * Fix missing header

kgryte and others added 23 commits May 11, 2021 21:58

Add specification for computing the qr factorization (data-apis#126)

a33a710

* Add qr specification * Update copy * Add dtype requirements * Update copy * Move API to submodule

Add specification for computing singular values using singular value …

3add230

…decomposition (linalg: svdvals) (data-apis#160) * Add svdvals specification * Add article * Move API to submodule

Add specification for computing the eigenvalues and eigenvectors of a…

fe3b410

… real symmetric matrix (linalg: eigh) (data-apis#161) * Add specification for eigh * Move API to submodule

Add specification for computing the eigenvalues of a symmetric matrix…

a1d00d5

… (linalg: eigvalsh) (data-apis#162) * Add eigvalsh * Fix duplicate target * Move API to submodule

Add matrix_power specification (data-apis#112)

30b0391

* Add matrix_power spec * Update description * Add article * Add note on possible exceptions * Add exception * Update dtype requirements * Update copy * Move API to submodule

Fix meshgrid formatting (data-apis#178)

d00d255

Add linear algebra design principles (data-apis#149)

83c6be3

* Add design principles * Fix grammar * Add additional principle and note regarding namespaces * Remove note

Fix matmul formatting (data-apis#179)

c2ecccc

Fix norm formatting (data-apis#181)

9efd284

Move linear algebra APIs to an extension (data-apis#182)

03a2b41

Fix a typo in the spec (data-apis#184)

0429693

Add a requirement for the producer of a DLPack capsule(data-apis#186)

2472846

Improve documentation for stream argument to __dlpack__ (data-api…

e3927fd

…s#185) Closes data-apisgh-183 (SYCL related, don't require an integer for `stream`) The need to document the ownership of `stream` came up in pytorch/pytorch#57781.

Clarify that constants are Python scalars (data-apis#169)

192b4d7

Fix typo in definition of __lshift__ (data-apis#190)

63c592e

Merge remote-tracking branch 'steff/add-boolean-indexing-notes' into …

4c8142c

…add-boolean-indexing-notes

Update notes

0a61b94

Add document stub

c9541aa

Add admonition for boolean array indexing and update reference

b1fd96e

Add explainer on data-dependent output shapes

9f336ae

kgryte mentioned this pull request Jun 7, 2021

Clarify guidance on operations having data-dependent output shapes #193

Merged

kgryte closed this Jun 7, 2021
