
Add specification for computing the pseudo-inverse (linalg: pinv) #118


Merged · 15 commits into main from pinv · May 12, 2021

Conversation

@kgryte (Contributor) commented Jan 25, 2021

This PR

  • specifies the interface for computing the (Moore-Penrose) pseudo-inverse.
  • is derived from comparing signatures across array libraries.

Notes

  • Following Torch, MXNet, TF, NumPy, and JAX, this proposal allows for providing a stack of square matrices. CuPy does not currently support providing stacks.

  • Dask does not provide an API for computing the pseudo-inverse.

  • TF supports a validate_args argument for embedding additional validations within its computational graph.

  • NumPy, MXNet, and Torch (latest master) support a hermitian keyword argument to indicate that more efficient computation methods may be used. This PR omits this keyword, as it is more of an implementation detail than a generalizable API.

  • NumPy, MXNet, CuPy, and Torch set the default rcond value to 1e-15, while JAX and TF compute a default value based on the machine epsilon associated with the input array data type and the number of rows/cols. This PR follows JAX and TF in computing the default value (as 1e-15 does not make sense for non-float64 input, such as float32 or bfloat16) and in requiring that rcond be a broadcast-compatible array (or a float).

  • This proposal renames the rcond keyword argument to rtol in order to unify keyword arguments across pinv, lstsq, and matrix_rank, all of which support specifying relative tolerances. The default value is also the same across these APIs.

  • Question: should this return a namedtuple to allow for a variable number of returns (see API for variable number of returns in linalg #95)? SciPy, e.g., does support returning multiple values (the matrix B along with the effective rank of the result). In theory, other info could be returned, such as error info, but not clear whether this is enough of a forward-looking concern.

    • Answer: no.
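For concreteness, the proposed behavior can be sketched via the SVD with the dtype-dependent default tolerance described above. This is an illustrative sketch, not the normative spec; the function name and the exact default formula are assumptions for demonstration:

```python
import numpy as np

def pinv_sketch(x, rtol=None):
    # Sketch of the proposed pinv behavior: Moore-Penrose pseudo-inverse
    # via the SVD, zeroing singular values below rtol * (largest singular value).
    U, s, Vh = np.linalg.svd(x, full_matrices=False)
    if rtol is None:
        # Default based on the input dtype's machine epsilon and max(M, N),
        # rather than a fixed 1e-15 (which is meaningless for float32 input).
        rtol = max(x.shape[-2], x.shape[-1]) * np.finfo(x.dtype).eps
    cutoff = rtol * s.max(axis=-1, keepdims=True)
    large = s > cutoff
    # Invert only the singular values above the cutoff; the rest become zero.
    s_inv = np.where(large, 1.0 / np.where(large, s, 1.0), 0.0)
    # Reassemble V @ diag(s_inv) @ U^H; leading dimensions act as the stack.
    return np.swapaxes(Vh, -1, -2).conj() @ (
        s_inv[..., None] * np.swapaxes(U, -1, -2).conj()
    )
```

For well-conditioned input this agrees with np.linalg.pinv; the difference only shows up in the cutoff applied to near-zero singular values for lower-precision dtypes.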

@rgommers (Member)

Question: should this return a tuple to allow for a variable number of returns (see #95)? SciPy, e.g., does support returning multiple values (the matrix B along with the effective rank of the result). In theory, other info could be returned, such as error info, but not clear whether this is enough of a forward-looking concern.

I'd say no. We should avoid variable number of returns as much as possible. Returning a tuple doesn't help at all; changing the tuple length would still be a serious backwards compat break.

I'll comment on gh-95; after looking at some of these PRs, that's worth reconsidering.

@rgommers (Member) commented Jan 26, 2021

NumPy, MXNet, CuPy, and Torch set the default rcond value to 1e-15, while JAX and TF compute a default value based on the machine epsilon associated with the input array data type and the number of rows/cols. This PR follows JAX and TF in computing the default value (as 1e-15 does not make sense for non-float64 input, such as float32 or bfloat16) and requiring that rcond be a broadcast compatible array.

The default value here is given as 10.0 * max(M, N) * eps. I see a couple of potential issues:

@kgryte (Contributor, author) commented Jan 28, 2021

Re: the 10.0 factor. Yeah, I am not sure of the reasoning behind JAX and TF's use of the factor. That factor is absent from lstsq because it is not used by NumPy, and it was not clear to me whether the rcond defaults for pinv and lstsq should match.
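The dtype sensitivity that motivated dropping a fixed 1e-15 default is easy to see by comparing machine epsilons directly (a quick illustration, not part of the spec; the dimension 100 is an arbitrary example):

```python
import numpy as np

# Machine epsilon differs by roughly nine orders of magnitude between
# float32 and float64, so a fixed cutoff of 1e-15 sits far below the
# noise floor of float32 singular values and would never filter anything.
eps32 = np.finfo(np.float32).eps   # ~1.19e-07
eps64 = np.finfo(np.float64).eps   # ~2.22e-16

n = 100                            # illustrative max(M, N)
default32 = n * eps32              # cutoff that scales with the dtype
default64 = n * eps64
```

A 1e-15 cutoff is below eps64 times any realistic dimension only for float64; for float32 it is effectively zero, which is the core argument for computing the default from the input dtype.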

@kgryte (Contributor, author) commented Jan 28, 2021

Re: namedtuple. Another alternative is to simply return a dictionary.

@kgryte (Contributor, author) commented Feb 16, 2021

Renamed rcond to tol in order to unify similar keyword arguments across pinv, lstsq, and matrix_rank. Removed the 10.0 scaling factor; the default tolerances are now computed consistently across all three of the aforementioned APIs.
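With the scaling factor removed, the three APIs can share one tolerance rule. A hedged sketch of matrix_rank using the identical cutoff logic (function name and body are illustrative, for a single matrix rather than a stack):

```python
import numpy as np

def matrix_rank_sketch(x, rtol=None):
    # Illustrative: same default relative tolerance as the pinv/lstsq
    # sketches in this proposal, derived from the input dtype's epsilon.
    s = np.linalg.svd(x, compute_uv=False)
    if rtol is None:
        rtol = max(x.shape[-2], x.shape[-1]) * np.finfo(x.dtype).eps
    # Rank = number of singular values above rtol * (largest singular value).
    return int(np.count_nonzero(s > rtol * s.max()))
```

Sharing the cutoff matters for consistency: the rank that pinv effectively treats a matrix as having should match what matrix_rank reports for the same input.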

@leofang (Contributor) left a comment
Just a quick comment 🙂

@kgryte (Contributor, author) commented Mar 4, 2021

Renamed tol to rtol to more explicitly indicate relative tolerance and pave the way for future specification evolution (e.g., atol).

@leofang (Contributor) commented Mar 5, 2021

btw CuPy will support batched pinv in the upcoming v9.0 (cupy/cupy#4686).

@rgommers rgommers added the API extension Adds new functions or objects to the API. label Mar 20, 2021
@rgommers rgommers force-pushed the main branch 2 times, most recently from 2f8f5e4 to 0607525 Compare April 19, 2021 20:22
@kgryte (Contributor, author) commented May 12, 2021

Thanks, @leofang, for the review! This is ready for merge...

@kgryte kgryte merged commit 9893373 into main May 12, 2021
@kgryte kgryte deleted the pinv branch May 12, 2021 04:52
3 participants