Linear algebra namespace overview #165

Closed
kgryte opened this issue Apr 15, 2021 · 2 comments
Labels
topic: Linear Algebra Linear algebra.

Comments

@kgryte
Contributor

kgryte commented Apr 15, 2021

In NumPy (and other array libraries), linear algebra APIs can be found in both the top-level namespace and a separate linear algebra (linalg) sub-namespace.

The purpose of this issue is to provide an overview of how array libraries (notably NumPy and TensorFlow) currently delineate linear algebra APIs and to provide a place for discussing whether the specification should adopt a similar division and what that division should be.

As discussed in gh-147 and in meetings, whether the specification should adopt a similar division is a matter of some debate.

Notably, some array libraries, such as Dask, may have difficulty implementing some linear algebra APIs. This raises the question of whether we should include "essential", relatively straightforward linear algebra APIs in the top-level namespace (i.e., those whose implementations do not require factorization: APIs of the BLAS variety, rather than of the LAPACK variety) and place the remaining APIs in a sub-namespace, which could be an optional extension.

Some individuals have proposed that we include all linear algebra APIs (mirroring NumPy) in a sub-namespace linalg, but also mirror a select subset of APIs in the top-level namespace (e.g., matmul and linalg.matmul). The benefits are that (1) linear algebra APIs in the top-level namespace would be guaranteed, providing a minimal subset of linear algebra APIs in all standard-conforming array libraries, and (2) for consistency, a user could simply request the linalg sub-namespace and get all linear algebra APIs (something which is not possible in, e.g., NumPy today).
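To make the mirroring idea concrete, here is a minimal sketch in plain Python. The namespace `xp`, its layout, and the stand-in functions are all hypothetical and exist only for illustration; they are not taken from any real array library.

```python
from types import SimpleNamespace


def matmul(a, b):
    """Multiply two matrices given as nested lists (rows of numbers)."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def det2(m):
    """Determinant of a 2x2 matrix; stands in for a linalg-only API."""
    (a, b), (c, d) = m
    return a * d - b * c


# Hypothetical array namespace (an assumption, not a real library).
# `linalg` holds every linear algebra API; the top level mirrors only matmul.
linalg = SimpleNamespace(matmul=matmul, det=det2)
xp = SimpleNamespace(matmul=matmul, linalg=linalg)

# The mirrored name refers to the same function object in both places.
assert xp.matmul is xp.linalg.matmul

product = xp.linalg.matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

In this sketch a user who wants every linear algebra API only needs `xp.linalg`, while `xp.matmul` stays available to code that never touches the sub-namespace.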

What follows is an overview of NumPy's linear algebra API division, which is adopted by several other array libraries (e.g., Dask, CuPy, JAX, etc.).

NumPy

Top-Level Namespace

  • cross
  • diagonal
  • einsum
  • matmul
  • outer
  • tensordot
  • trace
  • transpose
  • vecdot

Note: essentially sum-products and basic array manipulation (transpose, diagonal).
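As a rough illustration of the note above, several of these operations can be written as plain sums of products or simple index shuffles, with no factorization step. The functions below are a pure-Python sketch over nested lists, written for this illustration; they are not NumPy's implementations and ignore broadcasting, dtypes, and higher-dimensional inputs.

```python
def vecdot(u, v):
    # Sum of element-wise products of two equal-length vectors.
    return sum(x * y for x, y in zip(u, v))


def matmul(a, b):
    # C[i][j] is the sum over k of A[i][k] * B[k][j].
    return [[vecdot(row, col) for col in zip(*b)] for row in a]


def outer(u, v):
    # Every pairwise product; no summation is needed.
    return [[x * y for y in v] for x in u]


def trace(a):
    # Sum of the main-diagonal entries.
    return sum(a[i][i] for i in range(min(len(a), len(a[0]))))


def transpose(a):
    # Pure index manipulation: rows become columns.
    return [list(col) for col in zip(*a)]
```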

Linalg Namespace

  • cholesky
  • det
  • eig
  • eigh
  • eigvals
  • eigvalsh
  • inv
  • lstsq
  • matrix_power
  • matrix_rank
  • norm
  • pinv
  • qr
  • slogdet
  • solve
  • svd
  • svdvals

TensorFlow

TensorFlow adopts a similar division, but places the following APIs in a linalg sub-namespace, rather than the top-level namespace...

Top-Level

  • einsum
  • tensordot
  • transpose

Linalg Namespace

  • ...
  • cross
  • diag (diagonal)
  • matmul
  • trace

Paths Forward

  1. Everything in the top-level namespace.

    • Advantages:

      1. A single namespace affords the simplest import strategy and mental model. No need to import a separate module. No need for users to recall that a separate module exists.
    • Disadvantages:

      1. No "grouping" of related APIs.
      2. Requires that all specification-conforming array libraries implement all linear algebra APIs. (note: depending on one's views, this may be an advantage.)
  2. Everything in a linear algebra sub-namespace.

    • Advantages:

      1. Related APIs are grouped together.
      2. Possible to make linear algebra APIs an "extension" (i.e., optional).
    • Disadvantages:

      1. Requires that a specification-conforming array library implement all linear algebra APIs in the sub-namespace.
      2. Slightly less convenient, as users need to import a sub-package.
  3. Split linear algebra across the top-level and a sub-namespace.

    • Advantages:

      1. Allows delineating "universal" linear algebra APIs which every specification-conforming array library should implement from those which require factorizations and may be more difficult to implement.
    • Disadvantages:

      1. Imposes a more complicated mental model. From a user's POV, it is not clear why one linear algebra API exists in the top-level namespace, while another lives in a sub-namespace. A division based on factorization (BLAS vs LAPACK) leaks implementation details of which a user is unlikely to be aware.
  4. Split across top-level and sub-namespace and mirror top-level APIs in the sub-namespace (e.g., matmul and linalg.matmul).

    • Advantages:

      1. Allows for certain basic linear algebra APIs to be mandatory, while others could be optional (i.e., implemented as part of an extension).
      2. Allows users wanting all linear algebra APIs to just import the sub-package and not have to remember which APIs are implemented in the top-level vs sub-namespace.
    • Disadvantages:

      1. Duplicated APIs.
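
One way to read option (4) is as a pair of checks: a small set of top-level APIs is mandatory, and the linalg sub-namespace is optional but, when present, must contain everything, including the mirrored names. The sketch below encodes that reading. The two name sets and the helper `check_namespace` are hypothetical, chosen only for illustration from the NumPy lists above; they do not represent an agreed-upon split.

```python
from types import SimpleNamespace

# Hypothetical split (an assumption for this sketch, not a decided list).
REQUIRED_TOP_LEVEL = {"matmul", "tensordot", "transpose", "vecdot"}
LINALG_ONLY = {"cholesky", "det", "inv", "qr", "solve", "svd"}


def check_namespace(xp):
    """Return a list of problems found under this reading of option (4)."""
    problems = []
    for name in sorted(REQUIRED_TOP_LEVEL):
        if not hasattr(xp, name):
            problems.append(f"missing required top-level API: {name}")

    linalg = getattr(xp, "linalg", None)
    if linalg is None:
        # The extension is optional, so its absence is not a problem.
        return problems

    # If the extension exists, it must also mirror the top-level APIs.
    for name in sorted(REQUIRED_TOP_LEVEL | LINALG_ONLY):
        if not hasattr(linalg, name):
            problems.append(f"missing linalg API: {name}")
    return problems


def _stub(*args, **kwargs):
    raise NotImplementedError


# A library with only the mandatory APIs and no linalg extension.
minimal = SimpleNamespace(**{name: _stub for name in REQUIRED_TOP_LEVEL})

# A library whose linalg extension forgets to mirror the top-level APIs.
partial = SimpleNamespace(
    **{name: _stub for name in REQUIRED_TOP_LEVEL},
    linalg=SimpleNamespace(**{name: _stub for name in LINALG_ONLY}),
)

minimal_report = check_namespace(minimal)
partial_report = check_namespace(partial)
```

Here `minimal` passes because the extension is simply absent, while `partial` is flagged for each top-level name that its linalg sub-namespace fails to mirror.
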
@kgryte
Contributor Author

kgryte commented Apr 26, 2021

Based on offline discussions and the meeting discussion on 03/06/2021, option (4) seems most agreeable and, while not a perfect solution, assuages competing concerns (mandatory vs optional) at the expense of a small number of duplicated APIs.

@rgommers
Member

This was implemented in gh-182, closing.
