Merge README content into the main index doc page (#97)

mrocklin · hameerabbasi · commit 678de28959f9 · 2018-02-01T08:50:04.000+01:00
* Merge README content into the main index doc page

This adds some informative content to the documentation
and centralizes our prose in one place.

* Range of changes to docs.

* Fix all broken links in the docs.
* Add useful links to README.
diff --git a/README.rst b/README.rst
@@ -3,147 +3,11 @@ Sparse Multidimensional Arrays
 
 |Build Status|
 
-This implements sparse multidimensional arrays on top of NumPy and
-Scipy.sparse.  It generalizes the scipy.sparse.coo_matrix_ layout but extends
-beyond just rows and columns to an arbitrary number of dimensions.
+This library provides multi-dimensional sparse arrays.
 
-The original motivation is for machine learning algorithms, but it is
-intended for somewhat general use.
+* `Documentation <https://sparse.pydata.org/en/latest>`_
+* `Contributing <https://github.com/pydata/sparse/blob/master/docs/contributing.rst>`_
+* `Bug Reports/Feature Requests <https://github.com/pydata/sparse/issues>`_
 
-This Supports
---------------
-
--  NumPy ufuncs (where zeros are preserved)
--  Binary operations with other ``COO`` objects, where zeros are preserved.
--  Binary operations with Scipy sparse matrices, where zeros are preserved.
--  Binary operations with scalars, where zeros are preserved.
--  Broadcasting binary operations and ``broadcast_to``.
--  Reductions (sum, max, min, prod, ...)
--  Reshape
--  Transpose
--  Tensordot
--  triu, tril
--  Slicing with integers, lists, and slices (with no step value)
--  Concatenation and stacking
-
-This may yet support
---------------------
-
-A "does not support" list is hard to build because it is infinitely long.
-However the following things are in scope, relatively doable, and not yet built
-(help welcome).
-
--  Incremental buliding of arrays and inplace updates
--  More operations supported by Numpy Numpy arrays, such as ``argmin`` and ``argmax``.
--  Array building functions such as ``eye``, ``spdiags``. See `building sparse matrices`_.
--  Linear algebra operations such as ``inv``, ``norm`` and ``solve``. See scipy.sparse.linalg_.
-
-There are no plans to support
------------------------------
-
--  Parallel computing (though Dask.array may use this in the future)
-
-Example
--------
-
-::
-
-   pip install sparse
-
-.. code-block:: python
-
-   import numpy as np
-   n = 1000
-   ndims = 4
-   nnz = 1000000
-   coords = np.random.randint(0, n - 1, size=(ndims, nnz))
-   data = np.random.random(nnz)
-
-   import sparse
-   x = sparse.COO(coords, data, shape=((n,) * ndims))
-   x
-   # <COO: shape=(1000, 1000, 1000, 1000), dtype=float64, nnz=1000000>
-
-   x.nbytes
-   # 16000000
-
-   y = sparse.tensordot(x, x, axes=((3, 0), (1, 2)))
-
-   y
-   # <COO: shape=(1000, 1000, 1000, 1000), dtype=float64, nnz=1001588>
-
-   z = y.sum(axis=(0, 1, 2))
-   z
-   # <COO: shape=(1000,), dtype=float64, nnz=999>
-
-   z.todense()
-   # array([ 244.0671803 ,  246.38455787,  243.43383158,  256.46068737,
-   #         261.18598416,  256.36439011,  271.74177584,  238.56059193,
-   #         ...
-
-
-How does this work?
--------------------
-
-Scipy.sparse implements decent 2-d sparse matrix objects for the standard
-layouts, notably for our purposes
-`CSR, CSC, and COO <https://en.wikipedia.org/wiki/Sparse_matrix>`_.  However it
-doesn't include support for sparse arrays of greater than 2 dimensions.
-
-This library extends the COO layout, which stores the row index, column index,
-and value of every element:
-
-=== === ====
-row col data
-=== === ====
-  0   0   10
-  0   2   13
-  1   3    9
-  3   8   21
-=== === ====
-
-It is straightforward to extend the COO layout to an arbitrary number of
-dimensions:
-
-==== ==== ==== === ====
-dim1 dim2 dim3 ... data
-==== ==== ==== === ====
-  0    0     0   .   10
-  0    0     3   .   13
-  0    2     2   .    9
-  3    1     4   .   21
-==== ==== ==== === ====
-
-This makes it easy to *store* a multidimensional sparse array, but we still
-need to reimplement all of the array operations like transpose, reshape,
-slicing, tensordot, reductions, etc., which can be quite challenging in
-general.
-
-Fortunately in many cases we can leverage the existing SciPy.sparse algorithms
-if we can intelligently transpose and reshape our multi-dimensional array into
-an appropriate 2-d sparse matrix, perform a modified sparse matrix
-operation, and then reshape and transpose back.  These reshape and transpose
-operations can all be done at numpy speeds by modifying the arrays of
-coordinates.  After scipy.sparse runs its operations (coded in C) then we can
-convert back to using the same path of reshapings and transpositions in
-reverse.
-
-This approach is not novel; it has been around in the multidimensional array
-community for a while.  It is also how some operations in numpy work.  For example
-the ``numpy.tensordot`` function performs transposes and reshapes so that it can
-use the ``numpy.dot`` function for matrix multiplication which is backed by
-fast BLAS implementations.  The ``sparse.tensordot`` code is very slight
-modification of ``numpy.tensordot``, replacing ``numpy.dot`` with
-``scipy.sprarse.csr_matrix.dot``.
-
-
-LICENSE
--------
-
-This is licensed under New BSD-3
-
-.. _scipy.sparse.coo_matrix: https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_matrix.html
-.. _building sparse matrices: https://docs.scipy.org/doc/scipy/reference/sparse.html#functions
-.. _scipy.sparse.linalg: https://docs.scipy.org/doc/scipy/reference/sparse.linalg.html
 .. |Build Status| image:: https://travis-ci.org/pydata/sparse.svg?branch=master
    :target: https://travis-ci.org/pydata/sparse
diff --git a/docs/changelog.rst b/docs/changelog.rst
@@ -15,7 +15,7 @@ Changelog
 -  Fix nnz for scalars (:pr:`48`) `Hameer Abbasi`_
 -  Update README (:pr:`50`) (:pr:`53`) `Hameer Abbasi`_
 -  Fix large concatenations and stacks (:pr:`50`) `Hameer Abbasi`_
--  Add __array_ufunc__ for __call__ and reduce (:pr:`r9`) `Hameer Abbasi`_
+-  Add __array_ufunc__ for __call__ and reduce (:pr:`49`) `Hameer Abbasi`_
 -  Update documentation (:pr:`54`) `Hameer Abbasi`_
 -  Flake8 and coverage in pytest (:pr:`59`) `Nils Werner`_
 -  Copy constructor (:pr:`55`) `Nils Werner`_
diff --git a/docs/construct.rst b/docs/construct.rst
@@ -9,7 +9,7 @@ You can construct :obj:`COO` arrays from coordinates and value data.
 
 The :code:`coords` parameter contains the indices where the data is nonzero,
 and the :code:`data` parameter contains the data corresponding to those indices.
-For example, the following code will generate a :math:`5 \times 5` identity
+For example, the following code will generate a :math:`5 \times 5` diagonal
 matrix:
 
 .. code-block:: python
@@ -53,9 +53,9 @@ explicitly. For example, if we did the following without the
    data = [1, 4, 2, 1]
    s = COO(coords, data, shape=(5, 5))
 
-From :obj:`scipy.sparse.spmatrix` objects
------------------------------------------
-To construct :obj:`COO` array from :obj:`scipy.sparse.spmatrix`
+From :doc:`Scipy sparse matrices <generated/scipy.sparse.spmatrix>`
+-------------------------------------------------------------------
+To construct a :obj:`COO` array from :obj:`spmatrix <scipy.sparse.spmatrix>`
 objects, you can use the :obj:`COO.from_scipy_sparse` method. As an
 example, if :code:`x` is a :obj:`scipy.sparse.spmatrix`, you can
 do the following to get an equivalent :obj:`COO` array:
@@ -64,8 +64,8 @@ do the following to get an equivalent :obj:`COO` array:
 
    s = COO.from_scipy_sparse(x)
 
-From :obj:`numpy.ndarray` objects
----------------------------------
+From :doc:`Numpy arrays <reference/generated/numpy.ndarray>`
+------------------------------------------------------------
 To construct :obj:`COO` arrays from :obj:`numpy.ndarray`
 objects, you can use the :obj:`COO.from_numpy` method. As an
 example, if :code:`x` is a :obj:`numpy.ndarray`, you can
@@ -100,7 +100,7 @@ dictionary or is set to :code:`dtype('float64')` if that is not present.
 .. code-block:: python
 
    s = DOK((6, 5, 2))
-   s2 = DOK((2, 3, 4), dtype=np.float64)
+   s2 = DOK((2, 3, 4), dtype=np.uint8)
 
 After this, you can build the array by assigning arrays or scalars to elements
 or slices of the original array. Broadcasting rules are followed.
@@ -114,7 +114,7 @@ perform arithmetic or other operations on it.
 
 .. code-block:: python
 
-   s2 = COO(s)
+   s3 = COO(s)
 
 In addition, it is possible to access single elements of the :obj:`DOK` array
 using normal Numpy indexing.
@@ -128,8 +128,8 @@ using normal Numpy indexing.
 
 Converting :obj:`COO` objects to other Formats
 ----------------------------------------------
-:obj:`COO` arrays can be converted to :obj:`numpy.ndarray` objects,
-or to some :obj:`scipy.sparse.spmatrix` subclasses via the following
+:obj:`COO` arrays can be converted to :doc:`Numpy arrays <reference/generated/numpy.ndarray>`,
+or to some :obj:`spmatrix <scipy.sparse.spmatrix>` subclasses via the following
 methods:
 
 * :obj:`COO.todense`: Converts to a :obj:`numpy.ndarray` unconditionally.
diff --git a/docs/contributing.rst b/docs/contributing.rst
@@ -17,6 +17,18 @@ If you're not already familiar with it, we follow the `fork and pull model
 <https://help.github.com/articles/about-collaborative-development-models/>`_
 on GitHub.
 
+Filing Issues
+-------------
+If you find a bug or would like a new feature, you might want to `consider
+filing a new issue on GitHub <https://github.com/pydata/sparse/issues>`_. Before
+you open a new issue, please make sure of the following:
+
+* This should go without saying, but make sure what you are requesting is within
+  the scope of this project.
+* The bug/feature is still present/missing on the ``master`` branch on GitHub.
+* A similar issue or pull request isn't already open. If one already is, it's better
+  to contribute to the discussion there.
+
 Running/Adding Unit Tests
 -------------------------
 It is best if all new functionality and/or bug fixes have unit tests added
@@ -25,9 +37,9 @@ with each use-case.
 Since we support both Python 2.7 and Python 3.5 and newer, it is recommended
 to test with at least these two versions before committing your code or opening
 a pull request. We use `pytest <https://docs.pytest.org/en/latest/>`_ as our unit
-testing framework, with the pytest-cov extension to check code coverage and
-pytest-flake8 to check code style. You don't need to configure these extensions
-yourself. Once you've configured your environment, you can just :code:`cd` to
+testing framework, with the ``pytest-cov`` extension to check code coverage and
+``pytest-flake8`` to check code style. You don't need to configure these extensions
+yourself. Once you've configured your environment, you can just ``cd`` to
 the root of your repository and run
 
 .. code-block:: bash
diff --git a/docs/index.rst b/docs/index.rst
@@ -1,28 +1,78 @@
 Sparse
 ======
 
-Introduction
-------------
-In many scientific applications, arrays come up that are mostly empty or filled
-with zeros. These arrays are aptly named *sparse arrays*. However, it is a matter
-of choice as to how these are stored. One may store the full array, i.e., with all
-the zeros included. This incurs a significant cost in terms of memory and
-performance when working with these arrays.
-
-An alternative way is to store them in a standalone data structure that keeps track
-of only the nonzero entries. Often, this improves performance and memory consumption
-but most operations on sparse arrays have to be re-written. :obj:`sparse` tries to
-provide one such data structure. It isn't the only library that does this. Notably,
-:obj:`scipy.sparse` achieves this, along with `pysparse <http://pysparse.sourceforge.net/>`_.
+This implements sparse arrays of arbitrary dimension on top of :obj:`numpy` and :obj:`scipy.sparse`.
+It generalizes the :obj:`scipy.sparse.coo_matrix` and :obj:`scipy.sparse.dok_matrix` layouts,
+but extends beyond just rows and columns to an arbitrary number of dimensions.
+
+Additionally, this project maintains compatibility with the :obj:`numpy.ndarray` interface
+rather than the :obj:`numpy.matrix` interface used in :obj:`scipy.sparse`.
+
+These differences make this project useful in certain situations
+where scipy.sparse matrices are not well suited,
+but it should not be considered a full replacement.
+It lacks layouts that are not easily generalized like CSR/CSC
+and depends on scipy.sparse for some computations.
+
 
 Motivation
 ----------
-So why use :obj:`sparse`? Well, the other libraries mentioned are mostly limited to
-two-dimensional arrays. In addition, inter-compatibility with :obj:`numpy` is
-hit-or-miss. :obj:`sparse` strives to achieve inter-compatibility with
-:obj:`numpy.ndarray`, and provide mostly the same API. It defers to :obj:`scipy.sparse`
-when it is convenient to do so, and writes custom implementations of operations where
-this isn't possible. It also supports general N-dimensional arrays.
+
+Sparse arrays, or arrays that are mostly empty or filled with zeros,
+are common in many scientific applications.
+To save space we often avoid storing these arrays in traditional dense formats,
+and instead choose different data structures.
+Our choice of data structure can significantly affect our storage and computational
+costs when working with these arrays.
+
+
+Design
+------
+
+The main data structure in this library follows the
+`Coordinate List (COO) <https://en.wikipedia.org/wiki/Sparse_matrix#Coordinate_list_(COO)>`_
+layout for sparse matrices, but extends it to multiple dimensions.
+
+The COO layout stores the row index, column index, and value of every element:
+
+=== === ====
+row col data
+=== === ====
+  0   0   10
+  0   2   13
+  1   3    9
+  3   8   21
+=== === ====
+
+It is straightforward to extend the COO layout to an arbitrary number of
+dimensions:
+
+==== ==== ==== === ====
+dim1 dim2 dim3 ... data
+==== ==== ==== === ====
+  0    0     0   .   10
+  0    0     3   .   13
+  0    2     2   .    9
+  3    1     4   .   21
+==== ==== ==== === ====
+
+This makes it easy to *store* a multidimensional sparse array, but we still
+need to reimplement all of the array operations like transpose, reshape,
+slicing, tensordot, reductions, etc., which can be challenging in general.
+
+Fortunately in many cases we can leverage the existing :obj:`scipy.sparse`
+algorithms if we can intelligently transpose and reshape our multi-dimensional
+array into an appropriate 2-d sparse matrix, perform a modified sparse matrix
+operation, and then reshape and transpose back.  These reshape and transpose
+operations can all be done at numpy speeds by modifying the arrays of
+coordinates.  After scipy.sparse runs its operations (often written in C), we
+can convert back by following the same path of reshapings and transpositions in
+reverse.
+
+LICENSE
+-------
+
+This library is licensed under BSD-3.
 
 .. toctree::
    :maxdepth: 3
@@ -33,5 +83,7 @@ this isn't possible. It also supports general N-dimensional arrays.
    construct
    operations
    generated/sparse
-   contribute
+   contributing
    changelog
+
+.. _scipy.sparse: https://docs.scipy.org/doc/scipy/reference/sparse.html
diff --git a/docs/operations.rst b/docs/operations.rst
diff --git a/docs/quickstart.rst b/docs/quickstart.rst