Compressed Sparse Matrices

The CSR class is the entry point for pure Python code to work with the CSR package.

class csr.CSR(nrows, ncols, nnz, rps, cis, vs, _cast=True)

Simple compressed sparse row matrix. This is like scipy.sparse.csr_matrix, with a few useful differences:

  • The value array is optional, for cases in which only the matrix structure is required.

  • The value array, if present, is always double-precision.

  • It is usable from code compiled in Numba’s nopython mode.

You generally don’t want to create this class yourself with the constructor. Instead, use one of its class or static methods. If you do use the constructor, note that the class may reuse the arrays you pass, but does not guarantee that it will do so.

Not all methods are available from Numba, and a few have restricted signatures. The documentation for each method notes deviations when in Numba-compiled code.

At the Numba level, matrices with and without value arrays have different types. For the most part, this is transparent. However, a Numba function that works on the values array only when it is present must be written in two versions, with numba.extending.overload() dispatching to the correct one. There are several examples of doing this in the CSR source code. The method CSRType.has_values() reports whether a CSR type instance has values.

nrows

the number of rows.

Type:

int

ncols

the number of columns.

Type:

int

nnz

the number of entries.

Type:

int

rowptrs

the row pointers.

Type:

numpy.ndarray

colinds

the column indices.

Type:

numpy.ndarray

values

the values.

Type:

numpy.ndarray or None
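
As a rough illustration of how these attributes fit together, the sketch below lays out a small 3×4 matrix in CSR form using plain Python lists. The variable names mirror the attributes above; the example data is made up, and the library itself stores NumPy arrays rather than lists.

```python
# Dense matrix being represented (illustrative data only):
#   [[1.0, 0.0, 2.0, 0.0],
#    [0.0, 0.0, 0.0, 0.0],
#    [0.0, 3.0, 0.0, 4.0]]
nrows, ncols = 3, 4
rowptrs = [0, 2, 2, 4]          # row i occupies positions rowptrs[i]:rowptrs[i + 1]
colinds = [0, 2, 1, 3]          # column index of each stored entry
values = [1.0, 2.0, 3.0, 4.0]   # value of each stored entry
nnz = rowptrs[-1]               # total number of stored entries

# Expand back to a dense list of lists to check the layout.
dense = [[0.0] * ncols for _ in range(nrows)]
for i in range(nrows):
    for pos in range(rowptrs[i], rowptrs[i + 1]):
        dense[i][colinds[pos]] = values[pos]
```

Note that rowptrs has nrows + 1 entries, and that the empty middle row shows up as two equal consecutive pointers.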

Constructing Matrices

In addition to the CSR constructor, there are several utility methods for constructing sparse matrices.

classmethod CSR.from_coo(rows, cols, vals, shape=None, *, rpdtype=numpy.int32)

Create a CSR matrix from data in COO format.

Parameters:
  • rows (array-like) – the row indices.

  • cols (array-like) – the column indices.

  • vals (array-like) – the data values; can be None.

  • shape (tuple) – the array shape, or None to infer from row & column indices.
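
To make the COO-to-CSR conversion concrete, here is a minimal pure-Python sketch of one common way to do it (count entries per row, prefix-sum into row pointers, then scatter). The helper name coo_to_csr is hypothetical, and this is not necessarily how from_coo is implemented; it only shows how the inputs relate to the resulting arrays.

```python
def coo_to_csr(rows, cols, vals, shape=None):
    """Hypothetical helper: build CSR arrays (as lists) from COO triples."""
    if shape is None:
        # Infer the shape from the largest row and column index.
        nrows = max(rows) + 1 if rows else 0
        ncols = max(cols) + 1 if cols else 0
    else:
        nrows, ncols = shape

    # Count entries per row, then prefix-sum the counts into row pointers.
    rowptrs = [0] * (nrows + 1)
    for r in rows:
        rowptrs[r + 1] += 1
    for i in range(nrows):
        rowptrs[i + 1] += rowptrs[i]

    # Scatter each entry into the next free slot of its row.
    next_slot = rowptrs[:-1].copy()
    colinds = [0] * len(rows)
    values = None if vals is None else [0.0] * len(rows)
    for k, (r, c) in enumerate(zip(rows, cols)):
        pos = next_slot[r]
        colinds[pos] = c
        if values is not None:
            values[pos] = vals[k]
        next_slot[r] += 1
    return nrows, ncols, rowptrs, colinds, values


result = coo_to_csr([2, 0, 2, 0], [1, 0, 3, 2], [3.0, 1.0, 4.0, 2.0])
structure_only = coo_to_csr([2, 0, 2, 0], [1, 0, 3, 2], None)
```

Passing None for the values yields a structure-only result, mirroring the vals parameter described above.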

classmethod CSR.empty(nrows, ncols, row_nnzs=None, values=True)

Create an uninitialized CSR matrix.

Parameters:
  • nrows (int) – the number of rows.

  • ncols (int) – the number of columns.

  • row_nnzs (array-like) – the number of nonzero entries for each row, or None for an empty matrix.

  • values (bool, str, or numpy.dtype) – whether it has values or only structure; can be a NumPy data type to specify a type other than f8.

Constructing from Numba

Numba does not provide access to CSR’s class methods; instead, use the creation functions (these also work from pure Python):

csr.create(nrows, ncols, nnz, rowptrs, colinds, values)

Create a CSR.

csr.create_novalues(nrows, ncols, nnz, rowptrs, colinds)

Create a CSR without values.

csr.create_empty(nrows, ncols)

Create an empty CSR of the specified size.

Note

This function can be used from Numba.

csr.create_from_sizes(nrows, ncols, sizes)

Create a CSR with uninitialized values and specified row sizes.

This function is Numba-accessible, but is limited to creating matrices with fewer than \(2^{31}\) nonzero entries and present value arrays.

Parameters:
  • nrows (int) – the number of rows

  • ncols (int) – the number of columns

  • sizes (numpy.ndarray) – the number of nonzero values in each row
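
The relationship between per-row sizes and row pointers can be sketched in plain Python as a cumulative sum. This is only an illustration of the layout under the assumption that row pointers are the running total of the sizes; the sizes shown are made up, and the zero-filled lists stand in for the uninitialized storage the function allocates.

```python
from itertools import accumulate

sizes = [2, 0, 2]                          # illustrative per-row entry counts
rowptrs = [0] + list(accumulate(sizes))    # running total gives each row's start
nnz = rowptrs[-1]

# Placeholder storage for the caller to fill in afterwards.
colinds = [0] * nnz
values = [0.0] * nnz
```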

Accessing Rows

The CSR data itself is exposed through attributes. There are also several methods to extract row data in a more convenient form.

CSR.row_extent(row)

Get the extent of a row in the underlying column index and value arrays.

Parameters:

row (int) – the row index.

Returns:

(s, e), where the row occupies positions \([s, e)\) in the CSR data.

Return type:

tuple

CSR.row_cs(row)

Get the column indices for the stored values of a row.

CSR.row_vs(row)

Get the stored values of a row. If only the matrix structure is stored, this returns a vector of 1s.

CSR.row(row)

Return one or more rows of this matrix as a dense ndarray.

Parameters:

row (int or numpy.ndarray) – the row index or indices.

Returns:

the row, with 0s in the place of missing values. If the CSR only stores matrix structure, the returned vector has 1s where the CSR records an entry.

Return type:

numpy.ndarray
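
The row accessors above can be pictured with the following pure-Python sketch, which works on plain lists rather than on a CSR object. The function names echo the methods for readability, but they are hypothetical stand-ins, and dense_row only handles a single row index.

```python
rowptrs = [0, 2, 2, 4]
colinds = [0, 2, 1, 3]
values = [1.0, 2.0, 3.0, 4.0]
ncols = 4


def row_extent(row):
    # Half-open range [s, e) of the row in colinds and values.
    return rowptrs[row], rowptrs[row + 1]


def row_cs(row):
    s, e = row_extent(row)
    return colinds[s:e]


def row_vs(row, vals=values):
    s, e = row_extent(row)
    if vals is None:
        # Structure-only matrix: every stored entry counts as 1.
        return [1.0] * (e - s)
    return vals[s:e]


def dense_row(row, vals=values):
    out = [0.0] * ncols
    for c, v in zip(row_cs(row), row_vs(row, vals)):
        out[c] = v
    return out
```

Calling dense_row(0, None) treats the matrix as structure-only, so the result holds 1s at the stored positions.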

Transforming and Manipulating Matrices

CSR.copy(include_values=True, *, copy_structure=True)

Create a copy of this CSR.

Parameters:
  • include_values (bool) – whether to copy the values or only the structure.

  • copy_structure (bool) – whether to copy the structure (index & pointers) or share with the original matrix.

CSR.subset_rows(begin, end)

Subset the rows in this matrix.

Note

This method is not available from Numba.

Parameters:
  • begin (int) – the first row index to include.

  • end (int) – one past the last row to include.

Returns:

the matrix only containing a subset of the rows. It shares storage with the original matrix to the extent possible.

Return type:

CSR
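
A sketch of what row subsetting involves at the array level, using plain lists: the column indices and values are a contiguous slice, and the row pointers are re-based to start at zero. The helper is hypothetical; Python list slices copy, whereas slices of NumPy arrays are views, which is one way storage could be shared.

```python
def subset_rows(rowptrs, colinds, values, begin, end):
    """Hypothetical helper: keep rows begin..end-1 of a CSR given as lists."""
    s, e = rowptrs[begin], rowptrs[end]
    # Re-base the row pointers so the first kept row starts at position 0.
    new_rowptrs = [p - s for p in rowptrs[begin:end + 1]]
    return new_rowptrs, colinds[s:e], values[s:e]


sub = subset_rows([0, 2, 2, 4], [0, 2, 1, 3], [1.0, 2.0, 3.0, 4.0], 1, 3)
```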

CSR.filter_nnzs(filt)

Filter the values along the full NNZ axis.

Note

This method is not available from Numba.

Parameters:

filt (ndarray) – a logical array of length nnz that indicates the values to keep.

Returns:

The filtered sparse matrix.

Return type:

CSR
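
Filtering along the NNZ axis means the mask has one element per stored entry, not per row or per cell. The sketch below shows the idea on plain lists: entries are kept or dropped, and the row pointers are rebuilt to match. The helper name and data are illustrative assumptions.

```python
def filter_nnzs(rowptrs, colinds, values, filt):
    """Hypothetical helper: keep stored entries where filt is true."""
    new_rowptrs = [0]
    new_colinds, new_values = [], []
    for i in range(len(rowptrs) - 1):
        for pos in range(rowptrs[i], rowptrs[i + 1]):
            if filt[pos]:
                new_colinds.append(colinds[pos])
                new_values.append(values[pos])
        # Each row ends wherever the kept entries currently end.
        new_rowptrs.append(len(new_colinds))
    return new_rowptrs, new_colinds, new_values


filtered = filter_nnzs(
    [0, 2, 2, 4], [0, 2, 1, 3], [1.0, 2.0, 3.0, 4.0],
    [True, False, False, True],
)
```

The number of rows is unchanged; only the entries within each row are thinned out.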

CSR.transpose(include_values=True)

Transpose a CSR matrix.

Note

In Numba, this method takes no parameters. Call transpose_structure() for a structure-only transpose.

Parameters:

include_values (bool) – whether to include the values in the transpose.

Returns:

the transpose of this matrix (or, equivalently, this matrix in CSC format).

Return type:

CSR
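
One standard way to transpose a CSR matrix is a counting sort on the column indices. The sketch below shows that technique on plain lists; it is an assumption about the general approach, not a description of the library's kernel, and the helper name is hypothetical.

```python
def transpose(nrows, ncols, rowptrs, colinds, values):
    """Hypothetical helper: transpose CSR arrays via a counting sort."""
    # Count entries per column; these become the rows of the transpose.
    t_rowptrs = [0] * (ncols + 1)
    for c in colinds:
        t_rowptrs[c + 1] += 1
    for j in range(ncols):
        t_rowptrs[j + 1] += t_rowptrs[j]

    # Scatter each entry into the next free slot of its column.
    next_slot = t_rowptrs[:-1].copy()
    t_colinds = [0] * len(colinds)
    t_values = [0.0] * len(colinds)
    for i in range(nrows):
        for pos in range(rowptrs[i], rowptrs[i + 1]):
            dst = next_slot[colinds[pos]]
            t_colinds[dst] = i
            t_values[dst] = values[pos]
            next_slot[colinds[pos]] += 1
    return ncols, nrows, t_rowptrs, t_colinds, t_values


t = transpose(3, 4, [0, 2, 2, 4], [0, 2, 1, 3], [1.0, 2.0, 3.0, 4.0])
```

A structure-only transpose would follow the same steps and simply skip the value array.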

CSR.normalize_rows(normalization)

Normalize the rows of the matrix.

Note

The normalization ignores missing values instead of treating them as 0.

Note

This method is not available from Numba.

Parameters:

normalization (str) –

The normalization to perform. Can be one of:

  • 'center' - center rows about the mean

  • 'unit' - convert rows to a unit vector

Returns:

The normalization values for each row.

Return type:

numpy.ndarray
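
The note about ignoring missing values can be illustrated with a small pure-Python sketch: statistics are computed over the stored entries of each row only. Several details here are assumptions of the sketch rather than documented behavior: that 'unit' uses the Euclidean norm, that values are modified in place, and that empty rows report 0.0.

```python
import math


def normalize_rows(rowptrs, values, normalization):
    """Hypothetical helper: normalize stored values per row, in place."""
    if normalization not in ('center', 'unit'):
        raise ValueError('unknown normalization: ' + normalization)
    stats = []
    for i in range(len(rowptrs) - 1):
        s, e = rowptrs[i], rowptrs[i + 1]
        row = values[s:e]
        if normalization == 'center':
            # Mean over stored entries only; absent cells are not zeros here.
            stat = sum(row) / len(row) if row else 0.0
            values[s:e] = [v - stat for v in row]
        else:
            # Euclidean norm of the stored entries (assumed for this sketch).
            stat = math.sqrt(sum(v * v for v in row))
            if stat > 0:
                values[s:e] = [v / stat for v in row]
        stats.append(stat)
    return stats


centered = [1.0, 3.0, 3.0, 4.0]
means = normalize_rows([0, 2, 2, 4], centered, 'center')

unit = [3.0, 4.0]
norms = normalize_rows([0, 2], unit, 'unit')
```

In the centered example, the first row's mean is 2.0 (over its two stored entries), not 1.0 (over all four columns).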

CSR.drop_values()

Remove the value array from this CSR. This is an in-place operation.

Warning

This method is deprecated.

Note

This method is not available from Numba.

CSR.fill_values(value)

Fill the values of this CSR with the specified value. If the CSR is structure-only, a value array is added. This is an in-place operation.

Warning

This method is deprecated.

Note

This method is not available from Numba.

Arithmetic

CSRs do not yet support the full suite of SciPy/NumPy matrix operations, but they do support multiplications:

CSR.mult_vec(v)

Multiply this matrix by a vector.

Parameters:

v (numpy.ndarray) – A vector, of length ncols.

Returns:

\(A\vec{v}\), as a vector.

Return type:

numpy.ndarray
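
For reference, a sparse matrix-vector product over CSR arrays reduces to one dot product per row, touching only the stored entries. The following is a minimal sketch on plain lists with a hypothetical helper name, not the library's kernel.

```python
def mult_vec(nrows, rowptrs, colinds, values, x):
    """Hypothetical helper: compute A x for a CSR matrix given as lists."""
    out = [0.0] * nrows
    for i in range(nrows):
        acc = 0.0
        for pos in range(rowptrs[i], rowptrs[i + 1]):
            acc += values[pos] * x[colinds[pos]]
        out[i] = acc
    return out


y = mult_vec(3, [0, 2, 2, 4], [0, 2, 1, 3], [1.0, 2.0, 3.0, 4.0],
             [1.0, 2.0, 3.0, 4.0])
```

The result has one element per row of the matrix, and an empty row contributes 0.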

CSR.multiply(other, transpose=False)

Multiply this matrix by another.

Note

In Numba, transpose is a mandatory positional argument. Numba users may wish to directly use the kernel API.

Parameters:
  • other (CSR) – the other matrix.

  • transpose (bool) – if True, compute \(AB^{T}\) instead of \(AB\).

Returns:

the product of the two matrices.

Return type:

CSR
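
The transpose=True case is convenient in CSR form because entry (i, j) of \(AB^{T}\) is the dot product of row i of A with row j of B, and both are stored contiguously. The sketch below shows this on plain lists with a deliberately naive loop; the helper name and tuple layout are assumptions, and a real kernel would be considerably more efficient.

```python
def multiply_abt(a, b):
    """Hypothetical helper: CSR arrays of A B^T.

    a and b are (nrows, rowptrs, colinds, values) tuples with equal ncols.
    """
    a_n, a_rp, a_ci, a_vs = a
    b_n, b_rp, b_ci, b_vs = b
    # Index each row of B by column for quick lookups.
    b_rows = [
        dict(zip(b_ci[b_rp[j]:b_rp[j + 1]], b_vs[b_rp[j]:b_rp[j + 1]]))
        for j in range(b_n)
    ]
    rowptrs, colinds, values = [0], [], []
    for i in range(a_n):
        for j in range(b_n):
            dot, overlap = 0.0, False
            for pos in range(a_rp[i], a_rp[i + 1]):
                c = a_ci[pos]
                if c in b_rows[j]:
                    dot += a_vs[pos] * b_rows[j][c]
                    overlap = True
            # Store an entry only when the two rows share a column.
            if overlap:
                colinds.append(j)
                values.append(dot)
        rowptrs.append(len(colinds))
    return rowptrs, colinds, values


m = (3, [0, 2, 2, 4], [0, 2, 1, 3], [1.0, 2.0, 3.0, 4.0])
product = multiply_abt(m, m)
```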

SciPy Integration

CSR matrices can be converted to and from SciPy sparse matrices (in any layout):

classmethod CSR.from_scipy(mat, copy=True)

Convert a scipy sparse matrix to a CSR.

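Parameters:
  • mat – the SciPy sparse matrix to convert.

  • copy (bool) – whether to copy the matrix data rather than reuse the SciPy storage.

Returns: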

a CSR matrix.

Return type:

CSR

CSR.to_scipy()

Convert a CSR matrix to a SciPy scipy.sparse.csr_matrix. Avoids copying if possible.

Parameters:

self (CSR) – A CSR matrix.

Returns:

A SciPy sparse matrix with the same data.

Return type:

scipy.sparse.csr_matrix