Skip to content

PR: Add note for functions that are not compatible with static memory allocation #168

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 41 commits into from

Conversation

steff456
Copy link
Member

This PR mark the functions discussed in #164 with a note regarding boolean array indexing.

@steff456 steff456 requested a review from kgryte April 22, 2021 06:34
@steff456 steff456 self-assigned this Apr 22, 2021
Copy link
Contributor

@kgryte kgryte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@steff456 Thanks for working on this. I think we need to clarify the wording in the notes, and I had a question whether where is applicable.

@kgryte kgryte changed the title PR: Add note for functions that doesn't support boolean array indexing PR: Add note for functions that are not compatible with static memory allocation Apr 26, 2021
steff456 and others added 3 commits April 26, 2021 12:23
…ta-apis#174)

* Some small fixes to function signatures to make them valid Python

* Fix an issue that was breaking the spec parsing in the test suite

* Call the argument to broadcast_arrays 'arrays'

This is consistent with meshgrid().
Copy link
Contributor

@kgryte kgryte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

rgommers and others added 3 commits April 27, 2021 20:40
…a-apis#167)

* Update specification for arange

Addresses comments in data-apisgh-85 and data-apisgh-107

* Update the specification for `full` and `full_like`

Addresses comments in data-apisgh-85 and data-apisgh-107

* Update specification for `linspace`

Addresses comments in data-apisgh-85 and data-apisgh-107

* Update specification for `empty`, `ones`, `zeros`

Addresses comments in data-apisgh-85 and data-apisgh-107

* Update specification for `eye`

This is useful/needed because `M` is not a descriptive name
and that name does not match between different array libraries.

* Update specification for `expand_dims`, `roll` and `reshape`

Address comment in data-apisgh-85

* One more change to `eye`, more descriptive positional arguments

* Address the default integer dtype issue for 32/64-bit Python

Closes data-apisgh-151

* Update signature of `broadcast_to`

Address a review comment; makes it consistent with other functions
using `shape`.
@rgommers rgommers added the Narrative Content Narrative documentation content. label May 3, 2021
Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @steff456. I pushed a change to make this a more unique/recognizable admonition:

image

I'm not sure about saying "it's incompatible with static memory allocation and JIT compilation". It makes it harder to implement, but it's not fundamentally incompatible. For example, the Numba implementation of unique is only 6 lines long. And it is possible to pre-allocate memory, the maximum amount one would need for the unique() call, and then use that. So I'd probably say something more along the lines of "the shape of the output array for this function depends on the data values in the input array, hence it is difficult to implement for libraries that rely on static memory allocation or JIT compilation. See REF for more details". And then in that reference (a separate page under "Design topics") we should say that libraries may choose to leave out functions marked as data-dependent; if they do so they must leave out all of them and clearly indicate in their documentation that they do so.

This PR addresses gh-84, and there's more relevant comments there. In particular, boolean indexing should get the same warning.

@rgommers rgommers requested a review from shoyer May 3, 2021 20:11
@shoyer
Copy link
Contributor

shoyer commented May 3, 2021

I'm not sure about saying "it's incompatible with static memory allocation and JIT compilation". It makes it harder to implement, but it's not fundamentally incompatible. For example, the Numba implementation of unique is only 6 lines long.

I agree, it's really implementation dependent. It's fine in Numba's JIT because Numba has types for dynamically sized arrays, which is different from the case for JAX.

I like @rgommers's suggestion, but rather than "hence it is difficult to implement for libraries that rely on static memory allocation or JIT compilation", I might say "hence it can be difficult to implement for libraries that build computation graphs for arrays without knowing their values." And then we can point to examples like JAX, Dask, etc.

@szha
Copy link
Member

szha commented May 3, 2021

Agreed. Dynamic shapes make it impossible for ahead of time memory planning and thus affects JIT that require full knowledge of shape/type, or AOT compilation.

:::{admonition} Data-dependent output shape
:class: important

The shape of the output array for this function depends on the data values in the input array, hence it can be difficult to implement for libraries that build computation graphs for arrays without knowing their values. See {ref}`indexing` section for more details.
Copy link
Contributor

@kgryte kgryte May 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not clear to me what we are referring to in the "indexing section". I think, instead, what is wanted is to refer to a separate page under "Design topics" where we specifically discuss APIs which return arrays having data-dependent shapes (for further discussion, see here).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we refer to this PR or to the boolean indexing section inside the spec or both?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to add a new section in "Design Topics" (via this PR) and then reference that section.

:::{admonition} Data-dependent output shape
:class: important

The shape of the output array for this function depends on the data values in the input array, hence it can be difficult to implement for libraries that build computation graphs for arrays without knowing their values. See {ref}`indexing` section for more details.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above.

rgommers and others added 8 commits May 11, 2021 21:49
As suggested by @shoyer in data-apis#97 (comment)
This makes it possible to predict resulting rank of output array, which is
otherwise undetermined (see discussion in data-apisgh-97).

Using squeeze without specifying the axis in library code often results in
unintuitive behaviour. For the common use case of extracting a scalar from a size-1 2-D array,
this gets a little more verbose (e.g. `np.squeeze(np.array([[1]]), axis=(0, 1))`),
but that's probably a price worth paying for the extra clarity.

Also changes specified behaviour for a given axis not having size 1 to raising
a `ValueError`, which is what NumPy does. This wasn't considered before, and the current
description seems simply incorrect.

Finally, this makes `squeeze` the exact inverse of `expand_dims`, which is probably
a good thing.
* Add Cholesky spec

* Update description

* Return an array rather than a tuple

* Update dtype requirements

* Update dtype requirements

* Move API to submodule
* Add slogdet spec

* Add note

* Update description

* Reformat description and update return value type

* Add article

* Simplify description

* Update copy

* Update copy

* Update dtype requirements

* Update dtype requirement for returned arrays

* Update copy

* Update copy

* Move API to submodule
…ta-apis#118)

* Add pinv specification

* Fix statement on broadcasting

* Fix type annotation

* Rename keyword argument and drop scaling factor

* Update copy

* Rename keyword argument

* Fix argument name

* Update dtype requirements

* Fix missing period

* Update copy

* Update copy

* Update dtype requirements

* Update copy

* Move API to submodule
…a-apis#137)

* Add dot product specification

* Update specification

* Fix output shape

* Update description

* Update copy

* Update copy

* Rename to inner_dot

* Update dtype requirements

* Rename to vecdot
…ot) (data-apis#136)

* Add tensordot specification

* Update data type requirements

* Update dtype requirements

* Fix missing header
* Add solve specification

* Remove statement

* Update to return an array, rather than a single-element tuple

* Update language

* Update copy

* Update description

* Add note regarding broadcast compatibility

* Add support for providing an ordinate vector

* Update dtype requirements

* Update copy

* Move API to submodule
* Add SVD spec

* Update spec

* Update to follow NumPy

* Fix annotation

* Update type

* Update descriptions

* Update annotation

* Return a tuple only when `compute_uv` is `True`

* Fix type annotation

* Update copy

Co-authored-by: Leo Fang <[email protected]>

* Update copy

Co-authored-by: Leo Fang <[email protected]>

* Update copy

Co-authored-by: Leo Fang <[email protected]>

* Update copy

Co-authored-by: Leo Fang <[email protected]>

* Fix docs

* Update dtype requirements

* Update copy

* Always return u and v singular vectors

* Update return type

* Add article

* Move API to submodule

Co-authored-by: Leo Fang <[email protected]>
kgryte and others added 23 commits May 11, 2021 21:58
…r matrix equation (linalg: lstsq) (data-apis#119)

* Stub spec

* Document keyword

* Update spec

* Add missing parenthesis

* Remove duplicate words

* Rename keyword argument and add support for setting tolerance to a float

* Update copy

* Reorder sentences

* Rename keyword argument

* Fix name

* Update copy

* Add support for providing an ordinate vector

* Update dtype requirements

* Update dtype requirements

* Update type annotation

* Update copy

* Move API to submodule
* Add qr specification

* Update copy

* Add dtype requirements

* Update copy

* Move API to submodule
…decomposition (linalg: svdvals) (data-apis#160)

* Add svdvals specification

* Add article

* Move API to submodule
… real symmetric matrix (linalg: eigh) (data-apis#161)

* Add specification for eigh

* Move API to submodule
… (linalg: eigvalsh) (data-apis#162)

* Add eigvalsh

* Fix duplicate target

* Move API to submodule
…ata-apis#134)

* Add matmul specification

* Document return value dtype

* Document input array dtypes

* Add note

* Require at least one dimension

* Document exceptions

* Update array object signatures and fix typos

* Fix merge
…s of a matrix (linalg: matrix_rank) (data-apis#128)

* Add matrix_rank specification

* Update copy

* Reorder sentences

* Fix missing clause

* Rename keyword argument

* Update copy

* Update dtype requirements

* Update wording

* Move API to submodule
* Add matrix_power spec

* Update description

* Add article

* Add note on possible exceptions

* Add exception

* Update dtype requirements

* Update copy

* Move API to submodule
* Add design principles

* Fix grammar

* Add additional principle and note regarding namespaces

* Remove note
…s#185)

Closes data-apisgh-183 (SYCL related, don't require an integer for `stream`)

The need to document the ownership of `stream` came up in pytorch/pytorch#57781.
@kgryte
Copy link
Contributor

kgryte commented Jun 7, 2021

@steff456 I submitted gh-193 as a PR to supersede this one, as I mucked up the Git history. Sorry about that! If gh-193 is acceptable, we can close this PR out.

@kgryte
Copy link
Contributor

kgryte commented Jun 7, 2021

Closing out given gh-193.

@kgryte kgryte closed this Jun 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Narrative Content Narrative documentation content.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants