Compressed Sparse Matrices

The CSR class is the entry point for pure Python code to work with the CSR package.

class csr.CSR(nrows, ncols, nnz, rps, cis, vs, cast=True)

Simple compressed sparse row matrix. This is like scipy.sparse.csr_matrix, with a few useful differences:

  • The value array is optional, for cases in which only the matrix structure is required.

  • The value array, if present, is always double-precision.

  • It is usable from code compiled in Numba’s nopython mode.

You generally don’t want to create this class yourself with the constructor. Instead, use one of its class or static methods.

Not all methods are available from Numba, and a few have restricted signatures. The documentation for each method notes deviations when in Numba-compiled code.

At the Numba level, matrices with and without value arrays have different types. For the most part, this is transparent. However, a Numba function that uses the values array only when it is present must be written in two versions, with numba.extending.overload() dispatching to the correct one. The CSR source code contains several examples of this pattern. The method CSRType.has_values() reports whether a CSR type instance has values.

nrows

the number of rows.

Type

int

ncols

the number of columns.

Type

int

nnz

the number of entries.

Type

int

rowptrs

the row pointers.

Type

numpy.ndarray

colinds

the column indices.

Type

numpy.ndarray

values

the values.

Type

numpy.ndarray or None
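
To make the attribute layout concrete, the following sketch builds the three CSR arrays for a small dense matrix using only NumPy. It does not use the csr package; the variable names simply mirror the attributes listed above, and the int32 pointer type is an assumption made for illustration.

```python
import numpy as np

# Dense matrix used only for illustration.
dense = np.array([[1.0, 0.0, 2.0],
                  [0.0, 0.0, 3.0],
                  [4.0, 5.0, 0.0]])

nrows, ncols = dense.shape

# np.nonzero reports entries in row-major order, which is the CSR storage order.
row_idx, colinds = np.nonzero(dense)
values = dense[row_idx, colinds]
nnz = len(values)

# rowptrs has nrows + 1 entries; row i occupies positions rowptrs[i]:rowptrs[i + 1].
rowptrs = np.zeros(nrows + 1, dtype=np.int32)
rowptrs[1:] = np.cumsum(np.bincount(row_idx, minlength=nrows))

print(rowptrs, colinds, values)
```

For a structure-only matrix, the values array would simply be absent (None), while rowptrs and colinds stay the same.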

Constructing Matrices

In addition to the CSR constructor, there are several utility methods for constructing sparse matrices.

classmethod CSR.from_coo(rows, cols, vals, shape=None, *, rpdtype=<class 'numpy.int32'>)

Create a CSR matrix from data in COO format.

Parameters
  • rows (array-like) – the row indices.

  • cols (array-like) – the column indices.

  • vals (array-like) – the data values; can be None.

  • shape (tuple) – the array shape, or None to infer from row & column indices.
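
As a rough illustration of what a COO-to-CSR conversion involves, the sketch below groups COO triples by row with NumPy. The helper coo_to_csr is hypothetical and is not the package's implementation; in particular, how the real method orders entries within a row is not stated here, so the stable sort is an assumption.

```python
import numpy as np

def coo_to_csr(rows, cols, vals, shape=None):
    """Hypothetical helper: group COO triples by row to get CSR arrays."""
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    if shape is None:
        # Infer the shape from the largest row and column index.
        shape = (int(rows.max()) + 1, int(cols.max()) + 1)
    nrows, ncols = shape

    # A stable sort by row keeps the input order within each row.
    order = np.argsort(rows, kind="stable")

    rowptrs = np.zeros(nrows + 1, dtype=np.int32)
    rowptrs[1:] = np.cumsum(np.bincount(rows, minlength=nrows))
    colinds = cols[order].astype(np.int32)
    # vals may be None for a structure-only matrix.
    values = None if vals is None else np.asarray(vals, dtype=np.float64)[order]
    return nrows, ncols, rowptrs, colinds, values

nrows, ncols, rowptrs, colinds, values = coo_to_csr(
    rows=[2, 0, 1, 0, 2],
    cols=[0, 0, 2, 2, 1],
    vals=[4.0, 1.0, 3.0, 2.0, 5.0],
)
print(rowptrs, colinds, values)
```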

classmethod CSR.empty(nrows, ncols, row_nnzs=None, values=True)

Create an uninitialized CSR matrix.

Parameters
  • nrows (int) – the number of rows.

  • ncols (int) – the number of columns.

  • row_nnzs (array-like) – the number of nonzero entries for each row, or None for an empty matrix.

  • values (bool) – whether it has values or only structure.

Constructing from Numba

Numba does not provide access to CSR’s class methods; instead, use the creation functions (these also work from pure Python):

csr.create(nrows, ncols, nnz, rowptrs, colinds, values)

Create a CSR.

csr.create_novalues(nrows, ncols, nnz, rowptrs, colinds)

Create a CSR without values.

csr.create_empty(nrows, ncols)

Create an empty CSR of the specified size.

Note

This function can be used from Numba.

csr.create_from_sizes(nrows, ncols, sizes)

Create a CSR with uninitialized values and specified row sizes.

Parameters
  • nrows (int) – the number of rows

  • ncols (int) – the number of columns

  • sizes (numpy.ndarray) – the number of nonzero values in each row
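
The relationship between row sizes and row pointers can be sketched in plain NumPy as follows. The helper layout_from_sizes is hypothetical and only shows how per-row counts determine the storage layout; the dtypes are assumptions.

```python
import numpy as np

def layout_from_sizes(nrows, ncols, sizes):
    """Hypothetical helper: allocate CSR storage for the given row sizes."""
    sizes = np.asarray(sizes)
    if len(sizes) != nrows:
        raise ValueError("sizes must have one entry per row")

    # Row pointers are the running total of the row sizes, starting at 0.
    rowptrs = np.zeros(nrows + 1, dtype=np.int64)
    rowptrs[1:] = np.cumsum(sizes)
    nnz = int(rowptrs[-1])

    # Storage is allocated but left uninitialized; the caller fills it in.
    colinds = np.empty(nnz, dtype=np.int32)
    values = np.empty(nnz, dtype=np.float64)
    return rowptrs, colinds, values

rowptrs, colinds, values = layout_from_sizes(3, 4, [2, 0, 3])
print(rowptrs)
```

A row of size 0 shows up as two equal consecutive row pointers.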

Accessing Rows

The CSR data itself is exposed through attributes. There are also several methods to extract row data in a more convenient form.

CSR.row_extent(row)

Get the extent of a row in the underlying column index and value arrays.

Parameters

row (int) – the row index.

Returns

(s, e), where the row occupies positions \([s, e)\) in the CSR data.

Return type

tuple
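
The meaning of the returned extent can be illustrated on raw arrays. In this sketch the free function row_extent is a hypothetical stand-in that works on a plain row-pointer array rather than on a CSR object.

```python
import numpy as np

# CSR arrays for [[1, 0, 2], [0, 0, 3], [4, 5, 0]].
rowptrs = np.array([0, 2, 3, 5])
colinds = np.array([0, 2, 2, 0, 1])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

def row_extent(rowptrs, row):
    """Hypothetical stand-in: half-open range [s, e) of a row's entries."""
    return int(rowptrs[row]), int(rowptrs[row + 1])

s, e = row_extent(rowptrs, 2)
row_cols = colinds[s:e]   # column indices stored for row 2
row_vals = values[s:e]    # values stored for row 2
print(s, e, row_cols, row_vals)
```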

CSR.row_cs(row)

Get the column indices for the stored values of a row.

CSR.row_vs(row)

Get the stored values of a row. If only the matrix structure is stored, this returns a vector of 1s.

CSR.row(row)

Return a row of this matrix as a dense ndarray.

Parameters

row (int) – the row index.

Returns

the row, with 0s in the place of missing values. If the CSR only stores matrix structure, the returned vector has 1s where the CSR records an entry.

Return type

numpy.ndarray
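
The dense-row behavior described above, including the structure-only case, can be sketched on raw arrays. The function dense_row is a hypothetical illustration, not the package's code.

```python
import numpy as np

# CSR arrays for [[1, 0, 2], [0, 0, 3], [4, 5, 0]].
rowptrs = np.array([0, 2, 3, 5])
colinds = np.array([0, 2, 2, 0, 1])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

def dense_row(rowptrs, colinds, values, ncols, row):
    """Hypothetical illustration: expand one CSR row into a dense vector."""
    s, e = rowptrs[row], rowptrs[row + 1]
    out = np.zeros(ncols)
    # With no value array, mark each recorded entry with 1.
    out[colinds[s:e]] = 1.0 if values is None else values[s:e]
    return out

with_values = dense_row(rowptrs, colinds, values, 3, 0)
structure_only = dense_row(rowptrs, colinds, None, 3, 0)
print(with_values, structure_only)
```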

Transforming and Manipulating Matrices

CSR.copy(include_values=True, *, copy_structure=True)

Create a copy of this CSR.

Parameters
  • include_values (bool) – whether to copy the values or only the structure.

  • copy_structure (bool) – whether to copy the structure (index & pointers) or share with the original matrix.

CSR.subset_rows(begin, end)

Subset the rows in this matrix.

Note

This method is not available from Numba.

Parameters
  • begin (int) – the first row index to include.

  • end (int) – one past the last row to include.

Returns

the matrix containing only a subset of the rows. It shares storage with the original matrix to the extent possible.

Return type

CSR
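
One way a row subset can share storage is sketched below on raw arrays. The function subset_rows here is a hypothetical free function; exactly which arrays the real method shares is not specified in this document, so the sharing shown is an assumption.

```python
import numpy as np

# CSR arrays for [[1, 0, 2], [0, 0, 3], [4, 5, 0]].
rowptrs = np.array([0, 2, 3, 5])
colinds = np.array([0, 2, 2, 0, 1])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

def subset_rows(rowptrs, colinds, values, begin, end):
    """Hypothetical sketch: keep rows begin..end-1."""
    s, e = rowptrs[begin], rowptrs[end]
    # Row pointers must be rebased to start at 0, so this is a new array.
    sub_rowptrs = rowptrs[begin:end + 1] - s
    # Basic slices are views, so column indices and values are shared.
    return sub_rowptrs, colinds[s:e], values[s:e]

sub_rowptrs, sub_colinds, sub_values = subset_rows(rowptrs, colinds, values, 1, 3)
print(sub_rowptrs, sub_colinds, sub_values)
```

Because the slices are views, writing to sub_values in this sketch would also change the original values array.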

CSR.filter_nnzs(filt)

Filter the values along the full NNZ axis.

Note

This method is not available from Numba.

Parameters

filt (ndarray) – a logical array of length nnz that indicates the values to keep.

Returns

The filtered sparse matrix.

Return type

CSR
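
Filtering along the NNZ axis means the mask has one entry per stored value, and the row pointers must be recomputed afterwards. The sketch below shows this on raw arrays; the free function filter_nnzs is a hypothetical illustration.

```python
import numpy as np

# CSR arrays for [[1, 0, 2], [0, 0, 3], [4, 5, 0]].
rowptrs = np.array([0, 2, 3, 5])
colinds = np.array([0, 2, 2, 0, 1])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

def filter_nnzs(rowptrs, colinds, values, filt):
    """Hypothetical sketch: keep only the stored entries where filt is True."""
    filt = np.asarray(filt, dtype=bool)
    nrows = len(rowptrs) - 1
    # Row index of every stored entry, recovered from the row sizes.
    row_of = np.repeat(np.arange(nrows), np.diff(rowptrs))
    # Count the surviving entries per row and rebuild the row pointers.
    new_rowptrs = np.zeros(nrows + 1, dtype=rowptrs.dtype)
    new_rowptrs[1:] = np.cumsum(np.bincount(row_of[filt], minlength=nrows))
    return new_rowptrs, colinds[filt], values[filt]

new_rowptrs, new_colinds, new_values = filter_nnzs(
    rowptrs, colinds, values, values > 2.0)
print(new_rowptrs, new_colinds, new_values)
```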

CSR.transpose(include_values=True)

Transpose a CSR matrix.

Note

In Numba, this method takes no parameters. Call transpose_structure() for a structure-only transpose.

Parameters

include_values (bool) – whether to include the values in the transpose.

Returns

the transpose of this matrix (or, equivalently, this matrix in CSC format).

Return type

CSR
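
The equivalence between a transposed CSR and the CSC form of the original can be sketched by regrouping the stored entries by column. The function below is a hypothetical NumPy illustration on raw arrays and makes no claim about how the package implements transposition.

```python
import numpy as np

# CSR arrays for [[1, 0, 2], [0, 0, 3], [4, 5, 0]].
rowptrs = np.array([0, 2, 3, 5])
colinds = np.array([0, 2, 2, 0, 1])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

def transpose(nrows, ncols, rowptrs, colinds, values):
    """Hypothetical sketch: CSR arrays of the transposed matrix."""
    # Row index of every stored entry in the original matrix.
    row_of = np.repeat(np.arange(nrows), np.diff(rowptrs))
    # Group entries by column; a stable sort keeps rows ordered within a column.
    order = np.argsort(colinds, kind="stable")
    t_rowptrs = np.zeros(ncols + 1, dtype=rowptrs.dtype)
    t_rowptrs[1:] = np.cumsum(np.bincount(colinds, minlength=ncols))
    # Original row indices become the column indices of the transpose.
    return t_rowptrs, row_of[order], values[order]

t_rowptrs, t_colinds, t_values = transpose(3, 3, rowptrs, colinds, values)
print(t_rowptrs, t_colinds, t_values)
```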

CSR.normalize_rows(normalization)

Normalize the rows of the matrix.

Note

The normalization ignores missing values instead of treating them as 0.

Note

This method is not available from Numba.

Parameters

normalization (str) –

The normalization to perform. Can be one of:

  • 'center' - center rows about the mean

  • 'unit' - convert rows to a unit vector

Returns

The normalization values for each row.

Return type

numpy.ndarray
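
The note about ignoring missing values is easiest to see in a small sketch. The function below is a hypothetical illustration on raw arrays; it assumes that 'center' subtracts the mean of the stored values, that 'unit' divides by the L2 norm, and that the values are modified in place. None of these details are confirmed by this document.

```python
import numpy as np

def normalize_rows(rowptrs, values, normalization):
    """Hypothetical sketch: normalize stored values row by row, in place."""
    nrows = len(rowptrs) - 1
    norms = np.zeros(nrows)
    for i in range(nrows):
        s, e = rowptrs[i], rowptrs[i + 1]
        if s == e:
            continue  # empty row: nothing to normalize
        vs = values[s:e]  # a view, so updates write through to values
        if normalization == "center":
            norms[i] = vs.mean()  # mean of stored entries only
            vs -= norms[i]
        elif normalization == "unit":
            norms[i] = np.linalg.norm(vs)  # assumed L2 norm
            if norms[i] > 0:
                vs /= norms[i]
        else:
            raise ValueError("unknown normalization: " + normalization)
    return norms

# Stored values of [[1, 0, 2], [0, 0, 3], [4, 5, 0]].
rowptrs = np.array([0, 2, 3, 5])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
means = normalize_rows(rowptrs, values, "center")
print(means, values)
```

In this sketch the mean of the first row is 1.5, computed over its two stored entries, rather than 1.0, which would result from treating the missing entry as 0.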

CSR.drop_values()

Remove the value array from this CSR. This is an in-place operation.

Warning

This method is deprecated.

Note

This method is not available from Numba.

CSR.fill_values(value)

Fill the values of this CSR with the specified value. If the CSR is structure-only, a value array is added. This is an in-place operation.

Warning

This method is deprecated.

Note

This method is not available from Numba.

Arithmetic

CSRs do not yet support the full suite of SciPy/NumPy matrix operations, but they do support multiplications:

CSR.mult_vec(v)

Multiply this matrix by a vector.

Parameters

v (numpy.ndarray) – A vector, of length ncols.

Returns

\(A\vec{x}\), as a vector.

Return type

numpy.ndarray
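
A sparse matrix-vector product over CSR storage reduces to one dot product per row, restricted to the stored entries. The following is a minimal sketch on raw arrays; the free function mult_vec is hypothetical and is not the package's kernel.

```python
import numpy as np

# CSR arrays for [[1, 0, 2], [0, 0, 3], [4, 5, 0]].
rowptrs = np.array([0, 2, 3, 5])
colinds = np.array([0, 2, 2, 0, 1])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

def mult_vec(rowptrs, colinds, values, x):
    """Hypothetical sketch: compute A x from CSR arrays."""
    nrows = len(rowptrs) - 1
    y = np.zeros(nrows)
    for i in range(nrows):
        s, e = rowptrs[i], rowptrs[i + 1]
        # Only stored entries contribute; pick the matching elements of x.
        y[i] = np.dot(values[s:e], x[colinds[s:e]])
    return y

x = np.array([1.0, 2.0, 3.0])
y = mult_vec(rowptrs, colinds, values, x)
print(y)
```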

CSR.multiply(other, transpose=False)

Multiply this matrix by another.

Note

In Numba, transpose is a mandatory positional argument. Numba users may wish to directly use the kernel API.

Parameters
  • other (CSR) – the other matrix.

  • transpose (bool) – if True, compute \(AB^{T}\) instead of \(AB\).

Returns

the product of the two matrices.

Return type

CSR

SciPy Integration

CSR matrices can be converted to and from SciPy sparse matrices (in any layout):

classmethod CSR.from_scipy(mat, copy=True)

Convert a scipy sparse matrix to a CSR.

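Parameters
  • mat (scipy.sparse.spmatrix) – a SciPy sparse matrix.

  • copy (bool) – if False, reuse the SciPy storage if possible.

Returns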

a CSR matrix.

Return type

CSR

CSR.to_scipy()

Convert a CSR matrix to a SciPy scipy.sparse.csr_matrix. Avoids copying if possible.

Parameters

self (CSR) – A CSR matrix.

Returns

A SciPy sparse matrix with the same data.

Return type

scipy.sparse.csr_matrix
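
For orientation, the three CSR arrays correspond to SciPy's data, indices, and indptr arrays. The sketch below builds a scipy.sparse.csr_matrix directly from raw arrays and reads them back. It does not call the csr package, and whether the package constructs the SciPy matrix in exactly this way is an assumption.

```python
import numpy as np
from scipy.sparse import csr_matrix

# CSR arrays for [[1, 0, 2], [0, 0, 3], [4, 5, 0]].
rowptrs = np.array([0, 2, 3, 5], dtype=np.int32)
colinds = np.array([0, 2, 2, 0, 1], dtype=np.int32)
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# SciPy's (data, indices, indptr) form takes values, colinds, rowptrs in that order.
mat = csr_matrix((values, colinds, rowptrs), shape=(3, 3))

# Going the other way, the same arrays are exposed as attributes.
rp, ci, vs = mat.indptr, mat.indices, mat.data
print(mat.toarray())
```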