Add specifications for array manipulation functions #42

Merged: 14 commits, Sep 28, 2020

Conversation

@kgryte (Contributor) commented Sep 14, 2020

This PR

  • adds specifications for array manipulation functions.
  • is derived from comparing API signatures across array libraries.

Notes

  • This list is an initial set of array manipulation functions which can pave the way for additional specs in subsequent pull requests. These functions were identified as having the broadest support among array libraries and relatively high usage among downstream libraries.

  • Some comments/questions regarding particular APIs...

    • concat: CuPy requires a tuple rather than a sequence for the first argument. Went with tuple as more consistent with the rest of the specification (e.g., we require a list of axes to be specified as a tuple, not a sequence). What happens if the provided arrays have different dtypes? What should be the dtype of the returned array? How do type promotion rules factor in here?

      • Answer: regular type promotion rules. Between data type families, behavior is left unspecified (see the sketch after this list).
    • expand_dims: NumPy supports providing a tuple or an int. All other array libraries considered support only an int. Torch names this method unsqueeze. Went with expand_dims and only accepting an int for the second positional argument.

    • flip: TensorFlow lacks this exact API. Torch/CuPy spec axis/dims as a positional argument. Based the proposal on NumPy, where axis is a keyword argument, as more versatile.

    • reshape: Torch requires a tuple (does not allow an int). TensorFlow requires shape to be an int32/int64 tensor. NumPy allows providing an int as shorthand. Based the proposal on Torch's more restricted API for consistency.

    • roll: TensorFlow requires tensors for axis and shifts.

    • squeeze: Torch only allows specifying one axis and does not error if you attempt to squeeze non-singleton dimensions. NumPy/TensorFlow error if you attempt to squeeze a dimension which is not 1. Sided with Torch regarding error behavior, as it is not clear why attempting to squeeze a non-singleton dimension should error.

    • stack: CuPy requires a tuple rather than a sequence for the first argument. Went with tuple for same reasons as in concat. Same dtype question(s) apply as for concat above.
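
For illustration, a minimal NumPy sketch of the concat/stack dtype answer above (the spec's concat corresponds to NumPy's concatenate here; this is a hedged demo, not normative text):

```python
import numpy as np

# Within a dtype family, regular type promotion applies:
x = np.ones((2, 3), dtype=np.float32)
y = np.ones((2, 3), dtype=np.float64)

out = np.concatenate((x, y), axis=0)  # spec name: concat; first argument is a tuple
print(out.shape, out.dtype)           # (4, 3) float64

out = np.stack((x, y), axis=0)        # stack promotes the same way
print(out.shape, out.dtype)           # (2, 2, 3) float64

# Across dtype families (e.g., int32 with float32), the result dtype is
# left unspecified by the spec; libraries may differ.
```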

@rgommers (Member) commented Sep 14, 2020

  • concat: CuPy requires a tuple rather than a sequence for the first argument. Went with tuple as more consistent with the rest of the specification (e.g., we require a list of axes to be specified as a tuple, not a sequence). What happens if the provided arrays have different dtypes? What should be the dtype of the returned array? How do type promotion rules factor in here?

I'd say apply the regular type casting rules: regular within-dtype-family upcasting (intxx, floatxx), with cross-family behavior left undefined. Quickly checked NumPy and PyTorch; that's what they both seem to do.

  • flip: TensorFlow lacks this exact API. Torch/CuPy spec axis/dims as a positional argument. Based the proposal on NumPy, where axis is a keyword argument, as more versatile.

This makes sense to include (but leaving out fliplr and flipud). The indexing-based equivalent ([..., ::-1, ...]) is too weird.
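
As a quick NumPy illustration of why the keyword form beats the slicing equivalent:

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

# Keyword form: explicit about which axis is reversed.
a = np.flip(x, axis=1)

# Indexing-based equivalent: a reversed slice in the right position.
b = x[:, ::-1]

assert (a == b).all()
```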

  • reshape: Torch requires a tuple (does not allow an int). TensorFlow requires shape to be an int32/int64 tensor. Based the proposal on NumPy, as providing an int is convenient shorthand.

I'd be okay with just a tuple for consistency. It's pretty unusual to do reshape with an integer (or a 1-D array in general), and some_1d_array.shape will give a length-1 tuple.
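
A small NumPy sketch of the trade-off (variable names are illustrative):

```python
import numpy as np

x = np.arange(6)

# Tuple form, as a Torch-style tuple-only spec would require:
y = x.reshape((2, 3))

# NumPy's integer shorthand, which a tuple-only spec drops:
z = x.reshape(6)   # same as x.reshape((6,))

# A 1-D array's shape is already a length-1 tuple:
print(x.shape)     # (6,)
```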

  • roll: TensorFlow requires tensors for axis and shifts.

That's a little odd, regular ints/tuples should be fine I'd think.
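
For comparison, a minimal sketch of NumPy's roll, which happily takes plain ints/tuples:

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

print(np.roll(x, shift=1, axis=0))            # int shift, int axis
print(np.roll(x, shift=(1, 2), axis=(0, 1)))  # tuple shift, tuple axis
```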

  • squeeze: Torch only allows specifying one axis and does not error if you attempt to squeeze non-singleton dimensions. NumPy/TensorFlow error if you attempt to squeeze a dimension which is not 1. Sided with Torch regarding error behavior, as it is not clear why attempting to squeeze a non-singleton dimension should error.

+1
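
A minimal sketch of the behavioral difference, showing NumPy's error versus the Torch-style no-op the proposal sides with:

```python
import numpy as np

x = np.ones((2, 1, 3))

# Squeezing a singleton dimension works everywhere:
print(np.squeeze(x, axis=1).shape)  # (2, 3)

# NumPy (like TensorFlow) raises on a non-singleton axis:
try:
    np.squeeze(x, axis=0)
except ValueError as exc:
    print("NumPy raises:", exc)

# Torch's torch.squeeze(x, dim=0) would instead return the input unchanged,
# which is the behavior the proposal adopts.
```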

  • stack: CuPy requires a tuple rather than a sequence for the first argument. Went with tuple for same reasons as in concat. Same dtype question(s) apply as for concat above.

Sounds good to me.

Functions I noticed that you left out are:

  • moveaxis, swapaxes, rollaxis
  • ravel, flatten
  • all other *stack ones
  • all *split ones
  • tile
  • repeat
  • block
  • delete, insert, append, resize
  • rot90
  • expand_dims
  • transpose

Of all those, the ones I'd at least consider for inclusion in addition to the ones in this PR are moveaxis, expand_dims and transpose.

@kgryte (Contributor, Author) commented Sep 14, 2020

Transpose was added with the linear algebra functions. See here.

@kgryte (Contributor, Author) commented Sep 21, 2020

Updates:

  • reshape: shape can now only be a tuple.
  • concat/stack: added output array data type guidance.

@kgryte (Contributor, Author) commented Sep 21, 2020

Added expand_dims and updated the OP accordingly.
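
For reference, a minimal NumPy sketch of the agreed signature (a single int for axis; Torch calls this unsqueeze):

```python
import numpy as np

x = np.ones((2, 3))

# One new singleton axis at position 1:
print(np.expand_dims(x, axis=1).shape)  # (2, 1, 3)
```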

Re: moveaxis. Did not include this API in this round, as it does not appear to be present in either Torch or TensorFlow.

This PR should be ready for another round of review based on the aforementioned changes.

@kgryte (Contributor, Author) commented Sep 24, 2020

I can provide a follow-up PR with manipulation functions which, while not universally implemented across array libraries, are commonly implemented and commonly used by downstream libraries (e.g., tile and repeat).

@rgommers (Member) commented Sep 28, 2020

I can provide a follow-up PR with manipulation functions which, while not universally implemented across array libraries, are commonly implemented and commonly used by downstream libraries (e.g., tile and repeat).

I think it may make sense to open an issue for those with a summary, but wait to add them to the standard until they're implemented universally. They're not all that important, I'd say.

EDIT: or do nothing, not sure we need un-actionable issues right now.

@rgommers (Member) left a comment

Changes LGTM, and no further comments, so merging. Thanks @kgryte
