Compressed Sparse Matrices

The CSR class is the entry point for pure Python code to work with the CSR package.

class csr.CSR(nrows=None, ncols=None, nnz=None, ptrs=None, inds=None, vals=None, R=None)

Simple compressed sparse row matrix. This is like scipy.sparse.csr_matrix, with a couple of useful differences:

  • The value array is optional, for cases in which only the matrix structure is required.

  • The value array, if present, is always double-precision.

You generally don’t want to create this class yourself with the constructor. Instead, use one of its class methods.

It is backed by a separate storage type (csr._CSR) that can be passed through Numba-compiled functions. Nopython-compiled equivalents of many of its methods are available as functions in the csr.native_ops module; these take the underlying tuple (accessible as R) as a parameter.

If you need to pass an instance off to a Numba-compiled function, use R:

_some_numba_fun(csr.R)
R

the named tuple containing the actual matrix data

Type

_CSR

nrows

the number of rows.

Type

int

ncols

the number of columns.

Type

int

nnz

the number of entries.

Type

int

rowptrs

the row pointers.

Type

numpy.ndarray

colinds

the column indices.

Type

numpy.ndarray

values

the values, or None if only the matrix structure is stored.

Type

numpy.ndarray or None
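
To make the attributes above concrete, the following sketch writes out the three arrays by hand for a small 3×3 matrix and rebuilds the dense form from them. Plain Python lists stand in for the NumPy arrays, and the matrix itself is an illustrative assumption rather than anything taken from the package.

```python
# Dense matrix being represented (illustrative):
#   [[1, 0, 2],
#    [0, 0, 3],
#    [4, 5, 0]]
nrows, ncols = 3, 3
rowptrs = [0, 2, 3, 5]               # row i occupies positions rowptrs[i]:rowptrs[i+1]
colinds = [0, 2, 2, 0, 1]            # column index of each stored entry
values = [1.0, 2.0, 3.0, 4.0, 5.0]   # value of each stored entry
nnz = len(colinds)

# Rebuild the dense form to check that the arrays describe the matrix.
dense = [[0.0] * ncols for _ in range(nrows)]
for i in range(nrows):
    for pos in range(rowptrs[i], rowptrs[i + 1]):
        dense[i][colinds[pos]] = values[pos]
```

Note that rowptrs has nrows + 1 entries and that its last entry equals nnz.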

Constructing Matrices

In addition to the CSR constructor, there are several utility methods for constructing sparse matrices.

classmethod CSR.from_coo(rows, cols, vals, shape=None, rpdtype=numpy.int32)

Create a CSR matrix from data in COO format.

Parameters
  • rows (array-like) – the row indices.

  • cols (array-like) – the column indices.

  • vals (array-like) – the data values; can be None.

  • shape (tuple) – the array shape, or None to infer from row & column indices.
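
As a rough illustration of what a COO-to-CSR conversion involves, the sketch below counts the entries per row, turns the counts into row pointers, and scatters each entry into its row. The function coo_to_csr is a hypothetical helper written for this example with plain lists; it is not the package's implementation, and it keeps entries within a row in input order.

```python
def coo_to_csr(rows, cols, vals=None, shape=None):
    """Hypothetical helper: convert COO triples to CSR arrays (plain lists)."""
    if shape is None:
        # Infer the shape from the largest row and column index.
        nrows = max(rows) + 1 if rows else 0
        ncols = max(cols) + 1 if cols else 0
    else:
        nrows, ncols = shape

    # Count entries per row, then take a running sum to get row pointers.
    rowptrs = [0] * (nrows + 1)
    for r in rows:
        rowptrs[r + 1] += 1
    for i in range(nrows):
        rowptrs[i + 1] += rowptrs[i]

    # Scatter each entry into the next free slot of its row.
    next_slot = rowptrs[:-1].copy()
    colinds = [0] * len(rows)
    values = None if vals is None else [0.0] * len(rows)
    for k, (r, c) in enumerate(zip(rows, cols)):
        pos = next_slot[r]
        colinds[pos] = c
        if values is not None:
            values[pos] = float(vals[k])
        next_slot[r] += 1
    return (nrows, ncols), rowptrs, colinds, values


shape, rowptrs, colinds, values = coo_to_csr(
    rows=[2, 0, 1, 0, 2], cols=[0, 0, 2, 2, 1], vals=[4, 1, 3, 2, 5]
)
```

Passing vals=None in this sketch yields a structure-only result, mirroring the optional value array described above.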

classmethod CSR.empty(nrows, ncols, row_nnzs=None)

Create an uninitialized CSR matrix.

Parameters
  • nrows (int) – the number of rows.

  • ncols (int) – the number of columns.

  • row_nnzs (array-like) – the number of nonzero entries for each row, or None for an empty matrix.
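
The per-row counts determine the whole storage layout. A minimal sketch, assuming the conventional CSR layout in which row pointers are the running total of the row counts (the text above does not say how the package computes them):

```python
from itertools import accumulate

row_nnzs = [2, 1, 2]                         # entries wanted in each row
rowptrs = [0] + list(accumulate(row_nnzs))   # running total of the counts
nnz = rowptrs[-1]                            # total number of slots to allocate
```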

Accessing Rows

The CSR data itself is exposed through attributes. There are also several methods to extract row data in a more convenient form.

CSR.row_extent(row)

Get the extent of a row in the underlying column index and value arrays.

Parameters

row (int) – the row index.

Returns

(s, e), where the row occupies positions \([s, e)\) in the CSR data.

Return type

tuple

CSR.row_cs(row)

Get the column indices for the stored values of a row.

CSR.row_vs(row)

Get the stored values of a row. If only the matrix structure is stored, this returns a vector of 1s.
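
The relationship between these three accessors can be sketched with plain lists. The functions below are hypothetical stand-ins that share the method names for readability; they are not the package's code.

```python
rowptrs = [0, 2, 3, 5]
colinds = [0, 2, 2, 0, 1]
values = [1.0, 2.0, 3.0, 4.0, 5.0]


def row_extent(row):
    # Half-open range [s, e) of the row within colinds and values.
    return rowptrs[row], rowptrs[row + 1]


def row_cs(row):
    s, e = row_extent(row)
    return colinds[s:e]


def row_vs(row, values=values):
    s, e = row_extent(row)
    if values is None:
        # Structure-only matrix: report 1 for every stored entry.
        return [1.0] * (e - s)
    return values[s:e]
```

For example, row 2 of this matrix spans positions 3 and 4, so its column indices and values are slices of the shared arrays over that range.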

CSR.row(row)

Return a row of this matrix as a dense ndarray.

Parameters

row (int) – the row index.

Returns

the row, with 0s in the place of missing values. If the CSR only stores matrix structure, the returned vector has 1s where the CSR records an entry.

Return type

numpy.ndarray
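
Expanding a row to dense form amounts to scattering its stored values into a zero vector. The helper dense_row below is a hypothetical sketch using plain lists, including the structure-only case described above.

```python
def dense_row(row, ncols, rowptrs, colinds, values=None):
    """Hypothetical stand-in that expands one CSR row to a dense list."""
    out = [0.0] * ncols
    for pos in range(rowptrs[row], rowptrs[row + 1]):
        # Structure-only matrices mark stored entries with 1.
        out[colinds[pos]] = 1.0 if values is None else values[pos]
    return out


rowptrs = [0, 2, 3, 5]
colinds = [0, 2, 2, 0, 1]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
with_values = dense_row(0, 3, rowptrs, colinds, values)
structure_only = dense_row(0, 3, rowptrs, colinds)
```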

Transforming and Manipulating Matrices

CSR.copy(include_values=True, *, copy_structure=True)

Create a copy of this CSR.

Parameters
  • include_values (bool) – whether to copy the values or only the structure.

  • copy_structure (bool) – whether to copy the structure (index & pointers) or share with the original matrix.

CSR.subset_rows(begin, end)

Subset the rows in this matrix.

CSR.filter_nnzs(filt)

Filter the values along the full NNZ axis.

Parameters

filt (ndarray) – a logical array of length nnz that indicates the values to keep.

Returns

The filtered sparse matrix.

Return type

CSR
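
Filtering along the NNZ axis drops individual stored entries, so the row pointers have to be rebuilt to match. A minimal sketch with plain lists, where filter_entries is a hypothetical helper and the threshold mask is an arbitrary example:

```python
def filter_entries(rowptrs, colinds, values, keep):
    """Hypothetical helper: keep only entries whose mask value is true."""
    new_ptrs = [0]
    new_cols, new_vals = [], []
    for i in range(len(rowptrs) - 1):
        for pos in range(rowptrs[i], rowptrs[i + 1]):
            if keep[pos]:
                new_cols.append(colinds[pos])
                new_vals.append(values[pos])
        # The row ends wherever its surviving entries end.
        new_ptrs.append(len(new_cols))
    return new_ptrs, new_cols, new_vals


rowptrs = [0, 2, 3, 5]
colinds = [0, 2, 2, 0, 1]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
keep = [v >= 3.0 for v in values]    # one flag per stored entry
new_ptrs, new_cols, new_vals = filter_entries(rowptrs, colinds, values, keep)
```

Row 0 loses both of its entries here, which shows up as two equal consecutive row pointers rather than as a removed row.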

CSR.transpose(values=True)

Transpose a CSR matrix.

Parameters

values (bool) – whether to include the values in the transpose.

Returns

the transpose of this matrix (or, equivalently, this matrix in CSC format).

Return type

CSR
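
The equivalence with CSC can be seen by transposing by hand: the row pointers of the transpose are the column pointers of the original. The function transpose_csr below is a hypothetical plain-list sketch, not the package's implementation.

```python
def transpose_csr(nrows, ncols, rowptrs, colinds, values):
    """Hypothetical helper: transpose CSR arrays held in plain lists."""
    # Count entries per column; the running sum gives the new row pointers.
    t_ptrs = [0] * (ncols + 1)
    for c in colinds:
        t_ptrs[c + 1] += 1
    for j in range(ncols):
        t_ptrs[j + 1] += t_ptrs[j]

    # Walk the original rows in order, so each new row is sorted by index.
    next_slot = t_ptrs[:-1].copy()
    t_inds = [0] * len(colinds)
    t_vals = [0.0] * len(colinds)
    for i in range(nrows):
        for pos in range(rowptrs[i], rowptrs[i + 1]):
            dst = next_slot[colinds[pos]]
            t_inds[dst] = i
            t_vals[dst] = values[pos]
            next_slot[colinds[pos]] += 1
    return t_ptrs, t_inds, t_vals


# [[1, 0, 2], [0, 0, 3], [4, 5, 0]] transposes to [[1, 0, 4], [0, 0, 5], [2, 3, 0]].
t_ptrs, t_inds, t_vals = transpose_csr(
    3, 3, [0, 2, 3, 5], [0, 2, 2, 0, 1], [1.0, 2.0, 3.0, 4.0, 5.0]
)
```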

CSR.normalize_rows(normalization)

Normalize the rows of the matrix.

Note

The normalization ignores missing values instead of treating them as 0.

Parameters

normalization (str) –

The normalization to perform. Can be one of:

  • 'center' - center rows about the mean

  • 'unit' - convert rows to a unit vector

Returns

The normalization values for each row.

Return type

numpy.ndarray
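
The note about missing values means that statistics are computed over stored entries only. The sketch below illustrates this with plain lists; normalize_rows_sketch is a hypothetical function, and it assumes that the returned per-row value is the mean for 'center' and the L2 norm for 'unit', which the text above does not spell out.

```python
import math


def normalize_rows_sketch(rowptrs, values, normalization):
    """Hypothetical stand-in: normalize stored values in place, row by row."""
    stats = []
    for i in range(len(rowptrs) - 1):
        s, e = rowptrs[i], rowptrs[i + 1]
        row = values[s:e]
        if normalization == "center":
            # Mean over stored entries only; absent entries are not counted as 0.
            stat = sum(row) / len(row) if row else 0.0
            values[s:e] = [v - stat for v in row]
        elif normalization == "unit":
            stat = math.sqrt(sum(v * v for v in row))
            if stat > 0:
                values[s:e] = [v / stat for v in row]
        else:
            raise ValueError(f"unknown normalization: {normalization}")
        stats.append(stat)
    return stats


centered = [1.0, 2.0, 3.0, 4.0, 5.0]
means = normalize_rows_sketch([0, 2, 3, 5], centered, "center")

unit = [3.0, 4.0]
norms = normalize_rows_sketch([0, 2], unit, "unit")
```

In the centered example, row 0 stores the values 1 and 2 in a row of width 3; its mean is 1.5, not 1.0, because the absent entry is ignored.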

CSR.drop_values()

Remove the value array from this CSR. This is an in-place operation.

CSR.fill_values(value)

Fill the values of this CSR with the specified value. If the CSR is structure-only, a value array is added. This is an in-place operation.

Arithmetic

CSRs do not yet support the full suite of SciPy/NumPy matrix operations, but they do support multiplications:

CSR.mult_vec(v)

Multiply this matrix by a vector.

Parameters

v (numpy.ndarray) – A vector, of length ncols.

Returns

\(A\vec{v}\), as a vector.

Return type

numpy.ndarray
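
A sparse matrix-vector product only touches stored entries: each output element is the dot product of one row's values with the matching elements of the vector. The function mult_vec_sketch below is a hypothetical plain-list illustration, not the package's compiled routine.

```python
def mult_vec_sketch(nrows, rowptrs, colinds, values, v):
    """Hypothetical stand-in: compute A v for CSR arrays held in plain lists."""
    out = [0.0] * nrows
    for i in range(nrows):
        for pos in range(rowptrs[i], rowptrs[i + 1]):
            out[i] += values[pos] * v[colinds[pos]]
    return out


# A = [[1, 0, 2], [0, 0, 3], [4, 5, 0]], v = [1, 2, 3]
result = mult_vec_sketch(
    3, [0, 2, 3, 5], [0, 2, 2, 0, 1], [1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 2.0, 3.0]
)
```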

CSR.multiply(other, *, transpose=False)

Multiply this matrix by another.

Parameters
  • other (CSR) – the other matrix.

  • transpose (bool) – if True, compute \(AB^{T}\) instead of \(AB\).

Returns

the product of the two matrices.

Return type

CSR
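
The difference between the two modes can be sketched with plain lists. In the code below, each matrix is a (rowptrs, colinds, values) triple, and rows_of and multiply_sketch are hypothetical helpers written for this example; the package's own algorithm may differ.

```python
def rows_of(m):
    """Yield each row of a (rowptrs, colinds, values) triple as {col: value}."""
    rowptrs, colinds, values = m
    for i in range(len(rowptrs) - 1):
        s, e = rowptrs[i], rowptrs[i + 1]
        yield dict(zip(colinds[s:e], values[s:e]))


def multiply_sketch(a, b, transpose=False):
    """Hypothetical stand-in: sparse AB, or AB^T when transpose is true."""
    b_rows = list(rows_of(b))
    rowptrs, colinds, values = [0], [], []
    for a_row in rows_of(a):
        acc = {}
        if transpose:
            # (AB^T)[i, j] is the dot product of row i of A and row j of B.
            for j, b_row in enumerate(b_rows):
                shared = a_row.keys() & b_row.keys()
                if shared:
                    acc[j] = sum(a_row[k] * b_row[k] for k in shared)
        else:
            # (AB)[i, j] sums A[i, k] * B[k, j] over the stored entries.
            for k, a_ik in a_row.items():
                for j, b_kj in b_rows[k].items():
                    acc[j] = acc.get(j, 0.0) + a_ik * b_kj
        for j in sorted(acc):
            colinds.append(j)
            values.append(acc[j])
        rowptrs.append(len(colinds))
    return rowptrs, colinds, values


# A = [[1, 0, 2], [0, 0, 3], [4, 5, 0]]
A = ([0, 2, 3, 5], [0, 2, 2, 0, 1], [1.0, 2.0, 3.0, 4.0, 5.0])
AA = multiply_sketch(A, A)
AAT = multiply_sketch(A, A, transpose=True)
```

With transpose set, the second matrix is read row by row instead of being looked up by the first matrix's column indices, so no explicit transpose is built in this sketch.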

SciPy Integration

CSR matrices can be converted to and from SciPy sparse matrices (in any layout):

classmethod CSR.from_scipy(mat, copy=True)

Convert a scipy sparse matrix to a CSR.

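Parameters
  • mat (scipy.sparse.spmatrix) – a SciPy sparse matrix.

  • copy (bool) – if False, reuse the SciPy storage if possible.

Returns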

a CSR matrix.

Return type

CSR

CSR.to_scipy()

Convert a CSR matrix to a SciPy scipy.sparse.csr_matrix. Avoids copying if possible.

Parameters

self (CSR) – A CSR matrix.

Returns

A SciPy sparse matrix with the same data.

Return type

scipy.sparse.csr_matrix