Proposal to standardize element-wise arithmetic operations #9
Comments
I compiled generalized signatures (with respect to each of the above-listed interfaces for each library) for element-wise arithmetic operations; the raw signature data can be found here.
NumPy
CuPy
dask.array
JAX
MXNet
PyTorch
TensorFlow
The minimum common API across most libraries is
For example,
Proposal
Signature of the form:
APIs:
Notes
Optional arguments as keyword-only arguments for the following reasons:
See #12 for a draft specification. |
Meta note: it might be more descriptive to call these "binary arithmetic operations". An operation like My discussion about positional-only arguments and |
@shoyer Re: naming. The operations in this proposal are not intended to be exclusive. My intent with the issue name was simply to distinguish from the other proposals, but point taken and informs how we might categorize APIs upon formal inclusion in the specification. |
And agreed regarding @rgommers This may be a good topic of discussion to add to the agenda for the next meeting. |
Taking a step back: do we need these binary arithmetic operations at all when we have access to Python's infix operators? I'm glad we have codified these semantics, but perhaps it would suffice to recommend using either |
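To make the question concrete, here is a minimal sketch of the two spellings side by side. `Vec` is a hypothetical stand-in for an array type and `add` is a hypothetical functional wrapper; neither is any library's actual API. Only `operator.add` comes from the Python standard library.

```python
import operator


class Vec:
    """Hypothetical stand-in for an array type; not any library's API."""

    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        # Element-wise addition: this is what the infix `+` dispatches to.
        return Vec(a + b for a, b in zip(self.data, other.data))

    def __eq__(self, other):
        return isinstance(other, Vec) and self.data == other.data

    def __repr__(self):
        return f"Vec({self.data})"


def add(x, y):
    """Hypothetical functional spelling; it simply defers to the operator protocol."""
    return operator.add(x, y)


x, y = Vec([1, 2, 3]), Vec([10, 20, 30])
infix_result = x + y                  # infix operator
functional_result = add(x, y)         # functional equivalent
stdlib_result = operator.add(x, y)    # standard-library spelling
```

In this sketch all three calls go through `Vec.__add__`, so they produce equal results. Whether a real library's functional form accepts extra keyword arguments or other operand types is outside what this toy example shows.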
That's a good point. I personally can't remember having ever used Did a quick search of the SciPy code base, and the only uses of That said, PyTorch uses |
In theory, I know PyTorch also has experimental dispatch, so I suspect the situation there could be pretty similar. |
Downstream libraries do use, e.g., Apart from stricter typing semantics, functional equivalents may be preferred over operator equivalents for purposes of fluent interfaces, lazy evaluation, etc, so I might advise against recommending operator equivalents and not standardizing element-wise arithmetic operation interfaces. These interfaces are widely implemented among analyzed array libraries. |
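As a rough illustration of why a named function can be convenient, the sketch below passes a functional `add` to a higher-order helper and defers a call for later evaluation. The `add` and `deferred` helpers are hypothetical and operate on plain Python lists; they are assumptions for illustration, not the interface of any analyzed array library.

```python
from functools import reduce


def add(x, y):
    """Hypothetical element-wise add over equal-length Python lists."""
    if len(x) != len(y):
        raise ValueError("operands must have the same length")
    return [a + b for a, b in zip(x, y)]


# A named function can be handed directly to higher-order helpers.
rows = [[1, 2], [3, 4], [5, 6]]
column_sums = reduce(add, rows)


def deferred(func, *args):
    """Record a call now and run it later (a toy form of lazy evaluation)."""
    return lambda: func(*args)


pending = deferred(add, [1, 2], [10, 20])  # nothing is computed yet
result = pending()                         # the addition happens here
```

The same effect is possible with `operator.add` on objects that overload `+`, so this is an ergonomic argument rather than a capability one.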
This test is explicitly checking overrides of NumPy's ufuncs (which
True, but many of these libraries, including CuPy, Dask and JAX, try to copy the NumPy interface exactly. A function like (Note that your list is missing |
@shoyer I don't see Re: pandas examples. I simply pulled one usage from a GitHub search. Other examples include here and here. You may be able to find others, or they may all fall into the same category. All this is to say that downstream libraries do use these functional equivalents, as evidenced by the record data where all downstream libraries we've analyzed (pandas, dask.array, xarray, scikit-image, matplotlib) invoked Re: operator.add. Aware. Re: other array libraries. Would be good to have some record data for API consumption beyond NumPy. But this is still a WIP. |
As a further comment, I think it's worth reiterating the main stated goal of the consortium, which is to coalesce around what is presently implemented across array libraries. Meaning, we aren't writing an array library spec from first principles. As such, we'd need to ask, if we left I recognize @shoyer that your concern is forward-looking. You'd rather not impose undue burdens on future array libraries. However, unless most/all the currently analyzed array libraries remove these interfaces, we're left with de facto standardization, rather than de jure, and without the consistency guarantees afforded by specification compliance. |
Well one advantage of not having them included, even if all array libraries continue to provide them, is that then downstream users won't use them if they are writing to the spec. So it could have some influence there, if we think users should be doing |
It may not be documented, but it definitely exists:
NumPy is certainly not going to drop There is guaranteed to be a large list of functionality in all of these array libraries that doesn't get standardized. If we insist that libraries remove existing functionality, this spec would never get off the ground. If our spec is open to extensibility, then I can see a case for allowing optional functions for arithmetic, e.g., so TensorFlow can expose the optional |
It would be worth stating this clearly somewhere. I couldn't find it in the announcement blog post or in any of the written documents. I agree with not inventing APIs from first principles, but I think it would be a mistake to blindly copy redundant features just because they exist in all of the libraries investigated. In my opinion, if there are no use-cases aside from backwards compatibility, it shouldn't exist in our standard. |
Re: removing APIs. My point was not that this is what we'd advocate for, but that these arithmetic functions will live on and continue to be implemented (possibly due to prior art, as is the case now), whether or not we include these functions in the spec due to their "fundamental" nature. If our stance is that the spec is strictly about what "ought" to be the standard (prescriptive) instead of codifying (and, in some cases, normalizing) the fundamental APIs which already exist (descriptive), we miss the opportunity to ensure uniform expectations across array libraries. Re: large list of functionality. Agreed and recognized. Note, however, that the operations listed here, while arguably redundant, were included because (a) they are implemented across all analyzed array libraries and (b) are used by downstream libraries according to our data. If they had no usage, they would not have been proposed for specification. It is possible that, if Re: backward compatibility. This is actually something deserving of a larger discussion. Both in terms of current state of the art, as well as the future evolution of the specification. Personally, I take a much stronger stance in favor of backward compatibility, even if it entails redundant functionality, in order to minimize harm. Meaning, backward compatibility can be reason enough. But this is something we must collectively discuss and decide. Re: stated goal. Perhaps not succinctly stated as such, but this goal runs through the blog post, particularly in describing "what is an API standard", "Be conservative in choices made", and "A data-driven approach", and is discussed again in the tooling repo README. |
I do tend to agree with this. I think one very important point that we need to emphasize is what it means for something to not be included in the spec. We had the same discussion with complex dtypes, where Travis thought it was a big issue, but really it doesn't matter all that much. All it indicates is "you cannot write code that will work the same with all array libraries using complex dtypes at this point in time". In this case, there is some harm, but it's very minor. No library is going to remove anything, it just means that in case you have a way of retrieving only what's in the standard (which I hope all libraries will add), That said, there is some overhead in
I don't think anyone is going to care or even notice this 50 ns though, given that the overhead on numpy ufuncs is an order of magnitude larger, So the one situation where the API gets worse is when people write:
There
Thanks, interesting pointer for TensorFlow! Regarding the decision here, I'm not sure dispatch matters. The dispatcher should still end up in the correct tl;dr I'm not convinced either way yet. |
I don't think this has been noted explicitly yet, but I think another difference is that It seems like you could kind of get around this by wrapping the result in an |
@shoyer Thanks for all your input on this. Your feedback has been really helpful in helping clarify thinking and ensure we consider the broader picture. Based on discussion in today's meeting, beyond saying that the results of |
I added an entry to the tracking issue for Python operators, dunder methods and equivalent library functions/methods. The rest is done here, so closing. |
Based on the analysis of array library APIs, we know that performing element-wise arithmetic operations is both universally implemented and commonly used. Accordingly, this issue proposes to standardize the following arithmetic operations:
Arithmetic Operations
Criterion
Questions