Proposal to standardize element-wise arithmetic operations #9


Closed
kgryte opened this issue Jul 9, 2020 · 20 comments

kgryte (Contributor) commented Jul 9, 2020

Based on our analysis of array library APIs, we know that element-wise arithmetic operations are both universally implemented and commonly used. Accordingly, this issue proposes to standardize the following arithmetic operations:

Arithmetic Operations

  • add
  • subtract
  • multiply (mul)
  • divide (div)

Criteria

  1. Commonly implemented across array libraries.
  2. Commonly used by array library consumers.
  3. Operates on two arrays.

Questions

  1. Naming conventions? Currently, this proposal is biased toward verbose API names following NumPy.
  2. Are there any APIs listed above which should not be standardized?
  3. Are there basic arithmetic operations not listed above which should be standardized? Preferably, any additions should be supported by usage data.
kgryte (Contributor, Author) commented Jul 20, 2020

I compiled generalized signatures for the element-wise arithmetic operations listed above (one per library); the raw signature data can be found here.

NumPy

numpy.<name>(x1, x2, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) → ndarray

CuPy

cupy.<name>(x1, x2, out=None, dtype=None) → ndarray

dask.array

(no element-wise arithmetic functions documented; see the discussion of dask.array.add later in this thread)

JAX

jax.numpy.<name>(x1, x2) → ndarray

MXNet

np.<name>(x1, x2, out=None, **kwargs) → ndarray

PyTorch

torch.<name>(input, other, out=None) → Tensor

TensorFlow

tf.math.<name>(x, y, name=None) → Tensor

The minimum common API across most libraries is

<name>(x1, x2, out=None)

For example,

add(x1, x2, out=None)
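
For illustration, here is that minimum common signature exercised in NumPy (one of the surveyed libraries), where out may currently be passed positionally:

import numpy as np

x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([4.0, 5.0, 6.0])

out = np.empty_like(x1)
result = np.add(x1, x2, out)  # NumPy accepts `out` positionally today
assert result is out          # the sum is written into and returned as `out`
print(out)                    # [5. 7. 9.]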

Proposal

Signature of the form:

<name>(x1, x2, *, out=None)

APIs:

add(x1, x2, *, out=None)
subtract(x1, x2, *, out=None)
multiply(x1, x2, *, out=None)
divide(x1, x2, *, out=None)

Notes

Optional arguments are keyword-only for the following reasons:

  1. Avoid potential positional variation amongst library implementations.
  2. Favor explicit interfaces and minimize readers' need to intuit an optional positional argument's meaning (see the sketch below).
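
A minimal sketch of the proposed keyword-only signature, using NumPy as a stand-in backend (the wrapper and its body are illustrative, not part of the proposal):

import numpy as np

def add(x1, x2, *, out=None):
    # `out` is keyword-only, so positional variation across library
    # implementations is ruled out by construction.
    return np.add(x1, x2, out=out)

x1 = np.arange(3)
x2 = np.ones(3, dtype=x1.dtype)
y = np.empty(3, dtype=x1.dtype)

add(x1, x2, out=y)  # OK: `out` must be named at the call site
# add(x1, x2, y)    # TypeError: add() takes 2 positional arguments but 3 were given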

kgryte (Contributor, Author) commented Jul 30, 2020

See #12 for a draft specification.

shoyer (Contributor) commented Jul 30, 2020

Meta note: it might be more descriptive to call these "binary arithmetic operations". An operation like abs(x) or -x is also arguably "arithmetic".

My discussion about positional-only arguments and out from #8 is equally relevant here -- we should make a shared decision for both.

kgryte (Contributor, Author) commented Jul 30, 2020

@shoyer Re: naming. The operations in this proposal are not intended to be exclusive. My intent with the issue name was simply to distinguish it from the other proposals, but point taken; this informs how we might categorize APIs upon formal inclusion in the specification.

kgryte (Contributor, Author) commented Jul 30, 2020

And agreed regarding out.

@rgommers This may be a good topic of discussion to add to the agenda for the next meeting.

shoyer (Contributor) commented Aug 19, 2020

Taking a step back: do we need these binary arithmetic operations at all when we have access to Python's infix operators?

I'm glad we have codified these semantics, but perhaps it would suffice to recommend using either +/+= or operator.add/operator.iadd rather than defining your own add function? At the very least, we should formalize the relationship between these.
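
For concreteness, a small sketch (assuming NumPy arrays) of the equivalences being suggested:

import operator
import numpy as np

x1 = np.arange(3)
x2 = np.ones(3, dtype=x1.dtype)

# operator.add(a, b) is exactly a + b:
assert np.array_equal(operator.add(x1, x2), x1 + x2)

# operator.iadd(a, b) performs a += b and returns a:
y = x1.copy()
assert np.array_equal(operator.iadd(y, x2), x1 + x2)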

rgommers (Member) commented:

That's a good point. I personally can't remember having ever used np.add over +.

Did a quick search of the SciPy code base, and the only uses of np.add are via np.add.reduce, and that's because at some point in the past np.add.reduce was significantly faster than np.sum (illustrated below).
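
For reference, a small illustration (mine, not drawn from the SciPy code base) of the np.add.reduce idiom:

import numpy as np

x = np.arange(10)

# np.add is a ufunc, so it exposes ufunc methods such as .reduce;
# np.add.reduce(x) computes the same value as np.sum(x).
assert np.add.reduce(x) == np.sum(x) == 45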

That said, PyTorch uses torch.add all over the place, also without out or another keyword. It's typed more strictly than __add__, so there's probably a reason.

shoyer (Contributor) commented Aug 19, 2020

> That said, PyTorch uses torch.add all over the place, also without out or another keyword. It's typed more strictly than __add__, so there's probably a reason.

tensorflow.add() is also used all over the place. As far as I can tell, it's not for any particularly good reason -- mostly just copied from some early examples, possibly examples written by coders who didn't know Python terribly well. It does have an optional name parameter, but that's rarely used.

In theory, tf.add() does have slightly stricter typing semantics (everything gets converted into a Tensor), but TensorFlow is in the process of adding its own dispatch system with __tf_dispatch__, so this will also change.
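
A quick illustration of that conversion behavior, assuming TensorFlow 2.x with eager execution (the op name below is arbitrary):

import tensorflow as tf

# tf.add converts plain Python scalars into Tensors before adding;
# the optional `name` argument only labels the op for graph tooling.
result = tf.add(1, 2, name="example_add")
print(type(result))    # an EagerTensor under TF 2.x eager execution
print(result.numpy())  # 3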

I know PyTorch also has experimental dispatch, so I suspect the situation there could be pretty similar.

kgryte (Contributor, Author) commented Aug 20, 2020

Downstream libraries do use, e.g., numpy.add (see here and here). An example of numpy.add usage in the pandas test suite can be found here.

Apart from stricter typing semantics, functional equivalents may be preferred over operator equivalents for purposes of fluent interfaces, lazy evaluation, etc., so I would advise against recommending operator equivalents and leaving element-wise arithmetic interfaces unstandardized. These interfaces are widely implemented among the analyzed array libraries.

shoyer (Contributor) commented Aug 20, 2020

> Downstream libraries do use, e.g., numpy.add (see here and here). An example of numpy.add usage in the pandas test suite can be found here.

This test is explicitly checking overrides of NumPy's ufuncs (which np.add is), so I don't think this is a great example.

> Apart from stricter typing semantics, functional equivalents may be preferred over operator equivalents for purposes of fluent interfaces, lazy evaluation, etc.

operator.add(a, b) from Python's standard library is a builtin function that is exactly equivalent to a + b. We don't need something new for that.

> These interfaces are widely implemented among the analyzed array libraries.

True, but many of these libraries, including CuPy, Dask and JAX, try to copy the NumPy interface exactly. A function like add() existing may be more of an indication that it was easy to add than an indication that it is actually useful.

(Note that your list is missing dask.array.add, but that does seem to exist.)

kgryte (Contributor, Author) commented Aug 20, 2020

@shoyer I don't see dask.array.add in its docs. Perhaps you can point me to where I can find it? If it exists (apart from as a dunder method), I'd like to update the comparison data.

Re: pandas examples. I simply pulled one usage from a GitHub search. Other examples include here and here. You may be able to find others, or they may all fall into the same category. All this to say: downstream libraries do use these functional equivalents. According to our record data, every downstream library we've analyzed (pandas, dask.array, xarray, scikit-image, matplotlib) invoked add, subtract, divide, and/or multiply when we ran its test suite.

Re: operator.add. Aware.

Re: other array libraries. Would be good to have some record data for API consumption beyond NumPy. But this is still a WIP.

kgryte (Contributor, Author) commented Aug 20, 2020

As a further comment, I think it's worth reiterating the main stated goal of the consortium, which is to coalesce around what is presently implemented across array libraries. Meaning, we aren't writing an array library spec from first principles.

As such, we'd need to ask: if we left add, subtract, multiply, and divide out of the spec, what would be the desired outcome? Would NumPy simply drop support for element-wise arithmetic interfaces (meaning that numpy.add would no longer exist)? Or would NumPy retain these interfaces? And if retained, would other array libraries continue to provide such interfaces (potentially following NumPy's lead)? If the answer to the last two questions is "yes", then what we're after here is ensuring uniformity amongst array libraries so that users/devs can expect similar signatures as they move from library to library.

I recognize, @shoyer, that your concern is forward-looking. You'd rather not impose undue burdens on future array libraries. However, unless most/all of the currently analyzed array libraries remove these interfaces, we're left with de facto standardization rather than de jure, and without the consistency guarantees afforded by specification compliance.

saulshanabrook (Contributor) commented Aug 20, 2020

Well, one advantage of not having them included, even if all array libraries continue to provide them, is that downstream users writing to the spec won't use them. So it could have some influence there, if we think users should be doing a + b over np.add(a, b) because it's more Pythonic, and there should be one and only one way to do things?

shoyer (Contributor) commented Aug 20, 2020

> @shoyer I don't see dask.array.add in its docs. Perhaps you can point me to where I can find it? If it exists (apart from as a dunder method), I'd like to update the comparison data.

It may not be documented, but it definitely exists:

In [4]: import dask.array

In [5]: dask.array.add(1, 1)
Out[5]: 2

In [6]: type(dask.array.add)
Out[6]: dask.array.ufunc.ufunc

> As such, we'd need to ask: if we left add, subtract, multiply, and divide out of the spec, what would be the desired outcome? Would NumPy simply drop support for element-wise arithmetic interfaces (meaning that numpy.add would no longer exist)? Or would NumPy retain these interfaces?

NumPy is certainly not going to drop numpy.add, regardless of what we decide here, but I don't think that should be remotely relevant for our decision-making here.

There is guaranteed to be a large list of functionality in all of these array libraries that doesn't get standardized. If we insist that libraries remove existing functionality, this spec would never get off the ground.

If our spec is open to extensibility, then I can see a case for allowing optional functions for arithmetic, e.g., so TensorFlow can expose the optional name parameter. And certainly, if a project like TensorFlow provides an add() function, its default behavior should be guaranteed to exactly match +.

shoyer (Contributor) commented Aug 20, 2020

> As a further comment, I think it's worth reiterating the main stated goal of the consortium, which is to coalesce around what is presently implemented across array libraries. Meaning, we aren't writing an array library spec from first principles.

It would be worth stating this clearly somewhere. I couldn't find it in the announcement blog post or in any of the written documents.

I agree with not inventing APIs from first principles, but I think it would be a mistake to blindly copy redundant features just because they exist in all of the libraries investigated. In my opinion, if there are no use cases aside from backwards compatibility, it shouldn't exist in our standard.

kgryte (Contributor, Author) commented Aug 20, 2020

Re: removing APIs. My point was not that this is what we'd advocate for, but that these arithmetic functions will live on and continue to be implemented (possibly due to prior art, as is the case now), whether or not we include them in the spec, owing to their "fundamental" nature. If our stance is that the spec is strictly about what "ought" to be the standard (prescriptive), instead of codifying (and, in some cases, normalizing) the fundamental APIs which already exist (descriptive), we miss the opportunity to ensure uniform expectations across array libraries.

Re: large list of functionality. Agreed and recognized. Note, however, that the operations listed here, while arguably redundant, were included because (a) they are implemented across all analyzed array libraries and (b) they are used by downstream libraries according to our data. If they had no usage, they would not have been proposed for specification.

It is possible that, if add et al. were not in NumPy, no other array library would include them, and it is possible that all array libraries view the operations as redundant (and only grudgingly implemented them), but I would prefer to see data confirming this. Array libraries were never obligated to follow NumPy, and certainly many (e.g., TensorFlow and PyTorch) have chosen to chart their own paths, so I am hesitant to subscribe to the notion that every array library "blindly" copies redundant features just because NumPy has them. In which case, I'd like to hear more feedback from the respective libraries about their motivations for including these APIs. Given that these APIs are widely used by downstream libraries (at least according to record data), I would like to assume the position of (devil's) advocate on behalf of these users and hear the positive case for inclusion.

Re: backward compatibility. This is actually something deserving of a larger discussion, both in terms of the current state of the art and the future evolution of the specification. Personally, I take a much stronger stance in favor of backward compatibility, even if it entails redundant functionality, in order to minimize harm. Meaning, backward compatibility can be reason enough. But this is something we must collectively discuss and decide.

Re: stated goal. Perhaps not succinctly stated as such, but this goal runs through the blog post, particularly the sections describing "what is an API standard", "Be conservative in choices made", and "A data-driven approach", and it is discussed again in the tooling repo README.

rgommers (Member) commented:

> I agree with not inventing APIs from first principles, but I think it would be a mistake to blindly copy redundant features just because they exist in all of the libraries investigated. In my opinion, if there are no use cases aside from backwards compatibility, it shouldn't exist in our standard.

I do tend to agree with this.

I think one very important point that we need to emphasize is what it means for something to not be included in the spec. We had the same discussion with complex dtypes, where Travis thought it was a big issue, but really it doesn't matter all that much. All it indicates is "you cannot write code that will work the same with all array libraries using complex dtypes at this point in time".

In this case, there is some harm, but it's very minor. No library is going to remove anything; it just means that in case you have a way of retrieving only what's in the standard (which I hope all libraries will add), add won't be included. No functionality is lost if the signature is only add(x1, x2, out=None). So the one thing people may have to do, when trying to make existing code spec-compliant, is change some lines of code from add(x1, x2) to x1 + x2.

That said, there is some overhead in operator functions:

In [1]: import operator                                                                                                  

In [2]: x1 = np.arange(10)                                                                                               

In [3]: x2 = x1                                                                                                          

In [4]: %timeit operator.add(x1, x2)                                                                                     
535 ns ± 14.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [5]: %timeit np.add(x1, x2)                                                                                           
476 ns ± 8.88 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [6]: %timeit operator.add(x1, x2)                                                                                     
530 ns ± 2.89 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [7]: %timeit np.add(x1, x2)                                                                                           
471 ns ± 4.46 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

I don't think anyone is going to care about or even notice those ~50 ns, though, given that the overhead on NumPy ufuncs is an order of magnitude larger, and torch.add's is several times larger still - I suspect TensorFlow et al. are in a similar situation.

So the one situation where the API gets worse is when people write:

for op in (np.sum,
           np.mean,
           ...
           np.add):
    ...

There, operator.add would look uglier, as the sketch below shows.
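
A hypothetical side-by-side of the two styles (illustrative only):

import operator
import numpy as np

x1 = np.arange(10)
x2 = np.ones(10, dtype=x1.dtype)

# With the functional APIs, every entry in the tuple is a NumPy callable:
for op in (np.add, np.subtract, np.multiply):
    print(op(x1, x2))

# With the operator module, the builtins stick out amongst NumPy names:
for op in (operator.add, operator.sub, operator.mul):
    print(op(x1, x2))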

> In theory, tf.add() does have slightly stricter typing semantics (everything gets converted into a Tensor), but TensorFlow is in the process of adding its own dispatch system with __tf_dispatch__, so this will also change.

Thanks, interesting pointer for TensorFlow! Regarding the decision here, I'm not sure dispatch matters. The dispatcher should still end up in the correct __add__ or __radd__ method, just like when using +.

tl;dr I'm not convinced either way yet.

saulshanabrook (Contributor) commented:

> Thanks, interesting pointer for TensorFlow! Regarding the decision here, I'm not sure dispatch matters. The dispatcher should still end up in the correct __add__ or __radd__ method, just like when using +.

I don't think this has been noted explicitly yet, but I think another difference is that np.add can coerce the output type to an array, regardless of input types, whereas operator.add might not. I.e., if both of my inputs may be either ints or arrays, the behavior will differ between the two functions (see the sketch below).

It seems like you can kind of get around this by wrapping the result in asarray, or whatever the method is to create an array from a primitive type?
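
A minimal sketch of the difference (NumPy-specific; the exact scalar type returned for int inputs is platform-dependent):

import operator
import numpy as np

a, b = 2, 3  # plain Python ints

print(type(operator.add(a, b)))  # <class 'int'> -- stays a Python scalar
print(type(np.add(a, b)))        # <class 'numpy.int64'> -- a NumPy scalar type

# Wrapping the result in asarray recovers array semantics:
print(type(np.asarray(operator.add(a, b))))  # <class 'numpy.ndarray'>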

kgryte (Contributor, Author) commented Aug 20, 2020

@shoyer Thanks for all your input on this. Your feedback has been really helpful in clarifying our thinking and in ensuring we consider the broader picture.

Based on the discussion in today's meeting, beyond saying that the results of x1 + x2 should exactly match the results of add(x1, x2), is there anything else you feel the spec should include regarding the relationship between the functional APIs and their equivalent operators? Do you have particular phrasing in mind?
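
For instance, that guarantee could be checked mechanically; a hypothetical conformance-style test, with xp standing in for any compliant namespace:

import numpy as np

xp = np  # stand-in for any spec-compliant array namespace

x1 = np.arange(5, dtype=np.float64)
x2 = np.arange(5, 10, dtype=np.float64)

# The proposed guarantee: each functional API exactly matches its operator.
pairs = [(xp.add, x1 + x2), (xp.subtract, x1 - x2),
         (xp.multiply, x1 * x2), (xp.divide, x1 / x2)]
for func, via_operator in pairs:
    assert np.array_equal(func(x1, x2), via_operator)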

rgommers (Member) commented:

I added an entry to the tracking issue for Python operators, dunder methods and equivalent library functions/methods.

The rest is done here, so closing.
