
Proposal to standardize element-wise elementary mathematical functions #8


Closed
kgryte opened this issue Jul 9, 2020 · 40 comments

@kgryte
Contributor

kgryte commented Jul 9, 2020

Based on the analysis of array library APIs, we know that evaluating element-wise elementary mathematical functions is both universally implemented and commonly used. Accordingly, this issue proposes to standardize the following elementary mathematical functions:

Special Functions

  • abs (absolute)
  • exp
  • log
  • sqrt

Rounding

  • ceil
  • floor
  • trunc
  • round

Trigonometry

  • sin
  • cos
  • tan
  • asin (arcsin)
  • acos (arccos)
  • atan (arctan)
  • sinh
  • cosh
  • tanh
  • asinh (arcsinh)
  • acosh (arccosh)
  • atanh (arctanh)

Criterion

  1. Commonly implemented across array libraries.
  2. Commonly used by array library consumers.
  3. Operates on a single array (e.g., this criterion excludes atan2).

Questions

  1. Naming conventions? The above is biased toward C naming conventions (e.g., atan vs arctan).
  2. Are there any APIs listed above which should not be standardized?
  3. Are there elementary mathematical functions not listed above which should be standardized? Preferably, any additions should be supported by usage data.
  4. Should the standard mandate a minimum precision to ensure portability? (e.g., JavaScript's lack of minimum precision specification offers a cautionary tale)
@rgommers
Member

rgommers commented Jul 9, 2020

Thanks @kgryte.

Trigonometry seems fine to me.

Regarding rounding:

  • you left out round. I think it is commonly implemented once you handle the round, around, round_ aliases as all being the same. So round should probably be considered for inclusion.
  • leaving out fix seems fine, it's basically the same as trunc.
  • leaving out rint seems also fine, I've never seen anyone use it.

Regarding special functions: there's a lot more than the four you listed, maybe worth a closer look. I think leaving out all the base-2 and base-10 ones, and things like sinc, makes sense to me. logaddexp is probably worth considering, it's quite useful and implemented by all libraries.
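
For context on why logaddexp earns its place: computing log(exp(a) + exp(b)) directly overflows for large inputs, while the rearranged form stays finite. A minimal sketch (the helper name is mine; NumPy ships this as numpy.logaddexp):

```python
import numpy as np

def logaddexp(a, b):
    # log(exp(a) + exp(b)) computed stably: factor out the larger term
    # so the remaining exponential is always <= 1 and cannot overflow.
    m = np.maximum(a, b)
    return m + np.log1p(np.exp(-np.abs(a - b)))

print(logaddexp(1000.0, 1000.0))     # ~1000.693, where the naive form gives inf
print(np.logaddexp(1000.0, 1000.0))  # NumPy's built-in agrees
```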

@kgryte
Contributor Author

kgryte commented Jul 9, 2020

@rgommers Re: round. You're right. I've added it to the OP.

Re: special functions. Agreed that there are more. In the OP, I've only included the most universal for the time being with the thought that we could make a follow-up proposal with additional special functions (e.g., erf, erfc, logaddexp, log1p, expm1, etc), which are "higher-order" and whose implementations depend on the list of proposed APIs above.

@kgryte
Contributor Author

kgryte commented Jul 20, 2020

I compiled generalized signatures (with respect to each of the above listed interfaces for each library) for element-wise elementary mathematical functions, where the raw signature data can be found here.

NumPy

numpy.<name>(x, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) → ndarray

CuPy

cupy.<name>(x, out=None, dtype=None) → ndarray

dask.array

dask.array.<name>(x, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) → array

JAX

jax.numpy.<name>(x) → ndarray

MXNet

np.<name>(x, out=None, **kwargs) → ndarray

PyTorch

torch.<name>(input, out=None) → Tensor

Tensorflow

tf.math.<name>(x, name=None) → Tensor

In short, the minimum common API across most libraries is

<name>(x, out=None)

For example,

sin(x, out=None)

Proposal

Signature of the form:

<name>(x, *, out=None)

APIs:

abs(x, *, out=None)
exp(x, *, out=None)
log(x, *, out=None)
sqrt(x, *, out=None)

ceil(x, *, out=None)
floor(x, *, out=None)
trunc(x, *, out=None)
round(x, *, out=None)

sin(x, *, out=None)
cos(x, *, out=None)
tan(x, *, out=None)
asin(x, *, out=None)
acos(x, *, out=None)
atan(x, *, out=None)
sinh(x, *, out=None)
cosh(x, *, out=None)
tanh(x, *, out=None)
asinh(x, *, out=None)
acosh(x, *, out=None)
atanh(x, *, out=None)

Notes

Optional arguments as keyword-only arguments for the following reasons:

  1. Avoid potential positional variation amongst library implementations.
  2. Favor explicit interfaces and minimize readers' need to intuit an optional positional argument's meaning.
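
As a sketch of the keyword-only convention (a hypothetical wrapper around NumPy, not the spec itself):

```python
import numpy as np

def sin(x, *, out=None):
    # Everything after the bare * must be passed by keyword.
    return np.sin(x, out=out)

a = np.zeros(3)
sin(a, out=a)      # OK: out passed by keyword
try:
    sin(a, a)      # rejected: out cannot be passed positionally
except TypeError as e:
    print("rejected:", e)
```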

@rgommers
Member

Thanks @kgryte. A couple of thoughts on signatures:

  • It may be useful to make all functions use keyword-only arguments (i.e. put * after the positional arguments).
  • For the syntax of the signatures, the square brackets are perhaps best left out. I've always found those a little odd. They come from EBNF notation, where they indicate "optional". That makes perfect sense for a language-independent grammar, as well as for signatures in languages where you can't otherwise tell whether something is optional. In Python, however, the default for a keyword argument (=None or similar) already unambiguously says that it's optional, making the [ ] just visual noise.

@kgryte
Contributor Author

kgryte commented Jul 23, 2020

@rgommers Thanks! Updated to use keyword-only arguments and removed square brackets (where applicable).

@oleksandr-pavlyk
Contributor

Just making sure, the keyword where=True present in numpy.ufunc is left out in the standard?

@kgryte
Contributor Author

kgryte commented Jul 23, 2020

@oleksandr-pavlyk That's up to us to decide. What I wrote above reflects the "common" API signature across the various array libraries. Perhaps the Torch and Tensorflow contributors can discuss why a where keyword is not currently supported in their libraries.

Regardless, even if not supported now, we could later update the specification to include a where keyword, provided we can reach consensus.

@rgommers
Member

where= is fairly new, and coverage is still incomplete even in NumPy itself (e.g. see this PR adding where= to numpy.mean, scheduled for v1.20.0).

It's nice to have, but not used a lot. Given no other library except Dask supports it, I'd suggest it should be left out until that situation changes.

@kgryte
Contributor Author

kgryte commented Jul 30, 2020

See #12 for a draft specification.

@shoyer
Contributor

shoyer commented Jul 30, 2020

Two high level comments:

  1. Are we willing to use positional-only parameters? This is a Python 3.8+ only feature. If so, the universal signature might be <name>(x, /, out=None), which is well aligned with how these APIs are used in practice.

  2. Do we care about complete coverage for basic mathematical operations, or are we happy to rely on support for builtin Python arithmetic operations? I ask because "unary minus", i.e., -x, is one obvious omission here.

@shoyer
Contributor

shoyer commented Jul 30, 2020

One other longer comment, on the out keyword argument. I would lean against including it in the spec, or at least reserving it for a larger discussion about whether we want to support both "mutable" and "immutable" arrays.

Mutation can be challenging to support in some execution models (at least without another layer of indirection), which is why several projects currently don't support it (TensorFlow and JAX) or only support it half-heartedly (e.g., Dask). The commonality between these libraries is that they build up abstract computations, which is then transformed (e.g., for autodiff) and/or executed in parallel. Even NumPy has "read only" arrays.

I'm particularly concerned about new projects that implement this API, which might find the need to support mutation burdensome.

(out can certainly improve performance with libraries like NumPy, but in practice I've seen it used only rarely. In most cases where in-place arithmetic is essential for performance, it's worthwhile to rewrite the computation outside of Python, anyways, or with some form of jit compilation)

@rgommers
Member

Are we willing to use positional-only parameters? This is a Python 3.8+ only feature. If so, the universal signature might be <name>(x, /, out=None), which is well aligned with how these APIs are used in practice.

Good suggestion, I'd be in favor. I don't see a big issue with >=py38, and there's a major upside: it side-steps the issue with differently named positional arguments (e.g. what numpy et al. call x, pytorch calls input - with / no one will have to spend mental energy on whether or not to rename that).

@kgryte
Contributor Author

kgryte commented Jul 30, 2020

We can also keep the keyword-only requirements. In which case, the universal signature would be

<name>(x, /, *, out=None)
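
A sketch of that combined form (Python 3.8+ syntax; the wrapper around NumPy is hypothetical). The / makes x positional-only, which also sidesteps the x-vs-input naming difference between NumPy and PyTorch:

```python
import numpy as np

def sqrt(x, /, *, out=None):
    # x is positional-only (before /); out is keyword-only (after *).
    return np.sqrt(x, out=out)

print(sqrt(np.array([4.0])))       # OK: [2.]
try:
    sqrt(x=np.array([4.0]))        # rejected: x cannot be passed by keyword
except TypeError as e:
    print("rejected:", e)
```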

@kgryte
Contributor Author

kgryte commented Jul 30, 2020

@shoyer Thanks for the suggestion. I can update the proposals accordingly.

@rgommers
Member

Mutation can be challenging to support in some execution models (at least without another layer of indirection), which is why several projects currently don't support it (TensorFlow and JAX) or only support it half-heartedly (e.g., Dask).

We talked about this last week, and @alextp said TensorFlow was planning to add mutability and didn't see a real issue with supporting out=.

I think removing out= would not be a big issue, however removing mutability would leave out entire classes of algorithms. E.g. just counting how much SciPy uses += gives an impression. Some of that could be rewritten, some iterative algorithms would get a lot slower.

In most cases where in-place arithmetic is essential for performance, it's worthwhile to rewrite the computation outside of Python, anyways, or with some form of jit compilation)

Yeah I guess that's the trade-off. SciPy for example has optimizers and linear algebra code where individual iterations are expensive and they're therefore (or for "I'm short on time to Cythonize" reasons) in Python.

I don't have a good feeling for how painful this is for TensorFlow/JAX/Dask. I could make an estimate of how bad it'd be for SciPy (gut feeling: not good). @amueller not sure if scikit-learn would have a similar problem?

@saulshanabrook
Contributor

I think removing out= would not be a big issue, however removing mutability would leave out entire classes of algorithms. E.g. just counting how much SciPy uses += gives an impression. Some of that could be rewritten, some iterative algorithms would get a lot slower.

Just wanna be careful about language here. These would be a lot slower using NumPy and SciPy, but writing things in an immutable manner does not, a priori, make them slower if, as mentioned, you use a JIT or whole-program optimization.

@shoyer
Contributor

shoyer commented Jul 30, 2020

We talked about this last week, and @alextp said TensorFlow was planning to add mutability and didn't see a real issue with supporting out=.

It's true that mutation isn't a major issue in current TensorFlow (v2, with eager mode), but in projects like TensorFlow v1 (explicit graph-based execution) mutation is hard. I don't think we want to preclude projects like that.

E.g. just counting how much SciPy uses += gives an impression. Some of that could be rewritten, some iterative algorithms would get a lot slower.

It's worth noting that += in Python does not necessarily imply in-place mutation, if you don't implement __iadd__. Both TensorFlow and JAX (and Python's builtin float) support += even though they don't implement them in-place.

I suspect that infix operations like +=, -=, *= and /= might cover most of the use cases for in-place operations without requiring an out argument.
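
To illustrate the point that += need not mutate: for a type without __iadd__, Python falls back to __add__ and rebinds the name. A minimal (hypothetical) immutable type:

```python
class ImmutableScalar:
    # No __iadd__ defined, so += falls back to __add__ plus rebinding.
    def __init__(self, value):
        self.value = value
    def __add__(self, other):
        return ImmutableScalar(self.value + other)

x = ImmutableScalar(1.0)
old = x
x += 2.0            # equivalent to x = x + 2.0: a new object is bound to x
print(x.value)      # 3.0
print(old.value)    # 1.0 -- the original object was never mutated
```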

@shoyer
Contributor

shoyer commented Jul 30, 2020

Let me suggest isnan() as another basic mathematical function. (Sometimes x != x is used as an alternative, but that's much less readable.)

kgryte added a commit that referenced this issue Jul 30, 2020
@rgommers
Member

I suspect that infix operations like +=, -=, *= and /= might cover most of the use cases for in-place operations without requiring an out argument.

Just checking: things like x[:2, :] = some_other_array and view modifications are fine too for graph-based execution?

@alextp

alextp commented Aug 3, 2020

For tf1 we can support mutation at the Python level without supporting mutation at the graph level (think SSA conversion applied between the Python program and the graph).

@kgryte
Contributor Author

kgryte commented Aug 5, 2020

Along a similar vein, wouldn't supporting mutation at the Python level be generally possible for all array libraries, including future array libraries, regardless of the underlying implementation (graph or otherwise)? It may not always be performant and may involve additional data copying, but this would seem, based on my understanding, straightforward and would not impose an undue burden.

rgommers pushed a commit that referenced this issue Aug 11, 2020
Additional changes:

* Make positional-parameters positional-only (based on discussion here: #8 (comment))

* Document broadcasting behavior

* Add document specifying the behavior of the `out` keyword argument

* Add explainer for arrays of unequal rank

* Add convention regarding shape immutability for output arrays

* Add data types document
@shoyer
Contributor

shoyer commented Aug 13, 2020

Along a similar vein, wouldn't supporting mutation at the Python level be generally possible for all array libraries, including future array libraries, regardless of the underlying implementation (graph or otherwise)? It may not always be performant and may involve additional data copying, but this would seem, based on my understanding, straightforward and would not impose an undue burden.

It's definitely always possible to support mutation at the Python level via some sort of wrapper layer.

dask.array is perhaps a good example of this. It supports mutating operations and out in some cases, but its support for mutation is still rather limited. For example, it doesn't support assignment like x[:2, :] = some_other_array.

Part of the reason for this is historical -- originally Dask didn't support mutation, aligning its external API with its internal graph based execution model. TensorFlow, JAX and Autograd currently don't support mutation for the same reason.

Dask has now relaxed this hard constraint, but some popular NumPy idioms like indexing assignment still aren't supported. I suspect the reason for this is that in some cases it is impossible to do with the efficiency you would expect from NumPy -- the entire computational graph might need to get rewritten, even if only one element of an array is assigned. Rather than adding such performance cliffs, dask doesn't support such assignment at all.

The details for PyTorch are different, but it cites similar sounding reasons about rewriting entire graphs for not recommending in-place operations:
https://pytorch.org/docs/stable/notes/autograd.html#in-place-operations-with-autograd


Going forward, I guess we have a few options:

  1. Require support for in-place operations. Libraries that don't support mutation fully will need to write a wrapper layer, even if it would be inefficient.
  2. Make support for in-place operations optional. Arrays can indicate whether they support mutation via some standard API, e.g., like NumPy's ndarray.flags.writeable.
  3. Don't include support for in-place operations in the spec. This is a conservative choice, one which might have negative performance consequences (but it's a little hard to say without looking carefully). At the very least, it might require a library like SciPy to retain a special path for numpy.ndarray objects.
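
NumPy already exposes a flag along the lines of option 2, so a consumer could branch on it rather than fail mid-computation. A sketch (the where-based fallback is just one possible rewrite):

```python
import numpy as np

x = np.arange(4.0)
x.flags.writeable = False      # NumPy's existing opt-out of mutation

try:
    x[2] = 1.0                 # NumPy raises ValueError for read-only arrays
except ValueError as e:
    print("read-only:", e)

# Branch on the flag instead of relying on mutation:
if x.flags.writeable:
    x[2] = 1.0
else:
    x = np.where(np.arange(x.size) == 2, 1.0, x)
print(x)                       # [0. 1. 1. 3.]
```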

@alextp

alextp commented Aug 13, 2020 via email

@oleksandr-pavlyk
Contributor

Recognizing that mutation is still useful, but we are leaning towards not including it in the standard, would the standard provide a recommendation that if the library does support mutation, it should be through the out= keyword?

@rgommers
Member

rgommers commented Aug 13, 2020

Recognizing that mutation is still useful, but we are leaning towards not including it in the standard, would the standard provide a recommendation that if the library does support mutation, it should be through the out= keyword?

I think it's the opposite. The out= keyword wouldn't be a big loss, but element and slice assignment is another story. If something is completely immutable, it can be hard to even build up a new array.
E.g. try constructing [0, 0, 1, 0] - the normal way would be

x = zeros(4)
x[2] = 1

No mutation at all is extremely limiting. (EDIT: imagine a larger-sized version, to avoid the answer of "just do array([0, 0, 1, 0]) directly".)

@saulshanabrook
Contributor

or we ensure there is adequate coverage of non-mutating operators (like jax's non-mutating slice assign, for example).

Yeah, if we don't allow __setitem__ then we should have a pure version of setting an item...

That would be my preference, because it requires the least from backends and is easiest to reason about. The semantics for which kinds of getitem return a view and which return a copy are subtle, and so it seems easier to just not include any mutation semantics in the spec at all.

@rgommers
Member

adequate coverage of non mutating operators (like jax's non-mutating slice assign, for example).

Read up on that at https://jax.readthedocs.io/en/latest/notebooks/Common_Gotchas_in_JAX.html#%F0%9F%94%AA-In-Place-Updates.

I understand why JAX needs to do it like that, but it seems like a few bridges too far for the average PyData library author or end user.

I'd have a preference for doing

jax_array[1, :] = 1.0

and making JAX add the magic to transform that to

new_jax_array = index_update(jax_array, index[1, :], 1.)

Or otherwise have some exception/optional behavior. Things like index_update are going to impose quite some mental load, even if we can convince NumPy, Dask, CuPy, Sparse et al. to add such functions.

@shoyer
Contributor

shoyer commented Aug 13, 2020

E.g. try constructing [0, 0, 1, 0]

I agree, it's a bit tricky to do something like this in Dask currently.

I'll mention two tricks that often suffice based on my experience working around these types of limitations:

  1. Use where for selection, e.g., where(arange(4) == 2, 1, 0)
  2. Calculate the "inverse" of the assignment operator in terms of indexing, e.g., y = array([0, 1]); x = y[[0, 0, 1, 0]] in this case

Some version of (2) always works, though it can be tricky to work out (especially with current APIs). The duality between indexing and assignment is the difference between specifying where elements come from or where they end up.
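
Both tricks are easy to check in NumPy (the target [0, 0, 1, 0] is the example from above):

```python
import numpy as np

# Trick 1: selection via where, no assignment needed.
a = np.where(np.arange(4) == 2, 1, 0)

# Trick 2: the "inverse" of the assignment, expressed as fancy indexing:
# say where each output element comes from, instead of where it goes.
y = np.array([0, 1])
b = y[[0, 0, 1, 0]]

print(a)   # [0 0 1 0]
print(b)   # [0 0 1 0]
```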

@oleksandr-pavlyk
Contributor

The out= keyword wouldn't be a big loss

Perhaps, but if I am using it in my application, what are my options when migrating my code to the pydata_api?
Making a path specific to numpy.ndarray which I know supports it?

If a pydata_api conformant library supports out= keyword, I would like to make sure that standard consumer (think scipy) would be able to use it.

I would therefore like a standard to recommend, as opposed to require, that population of pre-allocated arrays by element-wise functions, if implemented, be exposed via out= keyword.

@rgommers
Member

That would be my preference, because it requires the least from backends and is easiest to reason about.

Keep in mind that you're reasoning from a computer science or graph library author point of view. Easier for those stakeholders, way harder for people writing code with the resulting API.

Reminds me of people who find Haskell or OCaml easy - if it fits your brain it's awesome, but that's a pretty small fraction of developers.

I agree, it's a bit tricky to do something like this in Dask currently.

I think Dask still gets away with it mostly, because it is a container for mutable chunks (NumPy, CuPy), so users don't run into it as often.

@rgommers
Member

If a pydata_api conformant library supports out= keyword, I would like to make sure that standard consumer (think scipy) would be able to use it.

func(x, out=y) is often easily rewritten as y = func(x). I don't expect many issues for SciPy with out=.

@shoyer
Contributor

shoyer commented Aug 13, 2020

adequate coverage of non mutating operators (like jax's non-mutating slice assign, for example).

Read up on that at https://jax.readthedocs.io/en/latest/notebooks/Common_Gotchas_in_JAX.html#%F0%9F%94%AA-In-Place-Updates.

We should update that doc page, but I'll just note that JAX has a more recent improved syntax for indexing assignment:
x.at[idx].set(y) vs x[idx] = y

One advantage of the non-mutating version is that JAX can have reliable assigning arithmetic on array slices with x.at[idx].add(y) (x[idx] += y doesn't work if x[idx] returns a copy).

A disadvantage is that doing this sort of thing inside a loop is almost always a bad idea unless you have a JIT compiler, because every indexing assignment operation makes a full copy. So the naive translation of an efficient Python loop that fills out an array row by row would now make a copy in each step. Instead, you'd have to rewrite that loop to use something like concatenate (which in my experience is already about as efficient as using indexing assignment).
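
The copy-per-update cost can be seen without JAX at all; here is a pure-NumPy stand-in for x.at[idx].set(y) (the helper name is mine) next to the stack-the-rows rewrite:

```python
import numpy as np

def functional_set(x, idx, value):
    # Pure-NumPy stand-in for JAX's x.at[idx].set(value): copy, then write.
    # One full copy per call is what makes naive loops quadratic without a JIT.
    out = x.copy()
    out[idx] = value
    return out

x = np.zeros((3, 4))
for i in range(3):
    x = functional_set(x, i, i)      # copies the whole array each iteration

# The rewrite: build the rows once, then assemble them in a single step.
rows = [np.full(4, float(i)) for i in range(3)]
x2 = np.stack(rows)
print(np.array_equal(x, x2))         # True
```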

@rgommers
Member

x.at[idx].add(y)

That is a bit nicer, thanks.

(x[idx] += y doesn't work if x[idx] returns a copy).

Not sure I understand, if x[idx] returns a copy then x[idx] += y does nothing - that's a clear bug that in some algorithm with decent tests would immediately show up. It's not a very pressing concern I'd think, I can't remember ever encountering such a bug in SciPy or another library.

@oleksandr-pavlyk
Contributor

If a pydata_api conformant library supports out= keyword, I would like to make sure that standard consumer (think scipy) would be able to use it.

func(x, out=y) is often easily rewritten as y = func(x). I don't expect many issues for SciPy with out=.

Yes, func(x, out=y) can be rewritten as y = func(x) with the important difference that new memory will be allocated for storing the result of func(x) and y becomes a new array object wrapping around the newly allocated memory block.
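
The difference is easy to observe in NumPy by watching the buffer address (a rough diagnostic via the array interface):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 5)
y = np.empty_like(x)

np.sin(x, out=y)                          # writes into y's existing buffer
buf = y.__array_interface__["data"][0]
np.sin(x, out=y)                          # reuses the same memory
assert buf == y.__array_interface__["data"][0]

y2 = np.sin(x)                            # allocates a fresh buffer instead
assert y2.__array_interface__["data"][0] != buf
```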

@shoyer
Contributor

shoyer commented Aug 13, 2020

I agree that indexing assignment is important, much more important (in my opinion) than an out keyword argument. The non-mutating version in JAX works well for graph-based libraries like JAX/Dask, but I doubt it's a realistic replacement for the central role of assignment in libraries like NumPy/CuPy.

Are we open to "optional" components of the spec? If so, I might just say __setitem__ should either work in such-and-such way or raise TypeError.

@shoyer
Contributor

shoyer commented Aug 13, 2020

(x[idx] += y doesn't work if x[idx] returns a copy).

Not sure I understand, if x[idx] returns a copy then x[idx] += y does nothing - that's a clear bug that in some algorithm with decent tests would immediately show up. It's not a very pressing concern I'd think, I can't remember ever encountering such a bug in SciPy or another library.

I didn't mean to imply that this is an existing bug, just that it's a little hard to write efficient code like this currently in NumPy. Instead you either have to loop over indices or use the slightly obscure np.add.at method.
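
The buffered-vs-unbuffered difference with duplicate indices, for reference:

```python
import numpy as np

idx = [1, 1, 2]

x1 = np.zeros(4)
x1[idx] += 1          # buffered: the duplicate index 1 is only counted once
print(x1)             # [0. 1. 1. 0.]

x2 = np.zeros(4)
np.add.at(x2, idx, 1) # unbuffered: duplicates accumulate as expected
print(x2)             # [0. 2. 1. 0.]
```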

@rgommers
Member

Are we open to "optional" components of the spec?

In general yes I think, although we have discussed it more in the context of functionality that may be less central to an array library (e.g. an fft submodule). And we haven't figured out exactly how that should look yet.

If so, I might just say __setitem__ should either work in such-and-such way or raise TypeError.

I'm trying to imagine what that implies for a user. They should then write code like:

try:
    x[:2] = tmp
except TypeError:
    # handle immutable arrays (e.g from JAX)
    x = jax.ops.index_update(x, jax.ops.index[:2], tmp)

But ideally not JAX-specific then - if the special-casing really involves import jax then that kind of breaks the whole idea of the standard. So I think if we'd make __setitem__ optional, there must be a standard function it gets replaced with.

The other problem is that the above example isn't exactly equivalent (x can be a view), and I suspect there's no easy way for JAX to know if NumPy et al. would have deemed x to be a view.

@rgommers
Member

After sleeping on it, maybe a better alternative to the above try-except example is to include a "feature flag" for mutability in the standard. So library authors using the standard are aware of the issue and have an easy way of raising a clear error for the (probably small) subset of functionality in their library that relies on mutation.

Right now JAX says

TypeError: '<class 'jax.interpreters.xla.DeviceArray'>' object does not support item assignment. JAX arrays are immutable; perhaps you want jax.ops.index_update or jax.ops.index_add instead?

which is pretty clear, but I'm thinking it should be (if coming from, e.g., SciPy):

TypeError: {arraytype} does not support mutation, and therefore this function does not support {arraytype}.
See [here](some-link) for more details.

And then raise that exception right at the top of the function, rather than halfway through where `x[:2]` is used.

@rgommers
Member

Summarized the mutation discussion in gh-24 and added two more options on how to deal with mutation. A dedicated issue seemed useful.

@rgommers
Member

Mutation discussion is continued in gh-24, out= is removed, and all the element-wise functions that were the original topic here have been added and are in master. So I'll close this issue.
