Compressed Sparse Matrices
The CSR class is the entry point for pure Python code to work with the CSR package.
- class csr.CSR(nrows, ncols, nnz, rps, cis, vs, _cast=True)
Simple compressed sparse row matrix. This is like scipy.sparse.csr_matrix, with a few useful differences:
The value array is optional, for cases in which only the matrix structure is required.
The value array, if present, is always double-precision.
It is usable from code compiled in Numba’s nopython mode.
You generally don’t want to create this class yourself with the constructor. Instead, use one of its class or static methods. If you do use the constructor, note that the class may store the arrays you pass directly rather than copying them, but this is not guaranteed.
Not all methods are available from Numba, and a few have restricted signatures. The documentation for each method notes deviations when in Numba-compiled code.
At the Numba level, matrices with and without value arrays have different types. For the most part, this is transparent, but if you want to write a Numba function that works on the values array only when it is present, you need to write two versions of the function and use numba.extending.overload() to dispatch to the correct one. There are several examples of doing this in the CSR source code. The method CSRType.has_values() lets you quickly see if a CSR type instance has values or not.
- rowptrs
the row pointers.
- Type
- colinds
the column indices.
- Type
- values
the values.
- Type
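The three attributes above together describe the compressed sparse row layout. The following is a minimal sketch of that layout using plain Python lists in place of arrays; `dense_to_csr` is a hypothetical helper written for illustration and is not part of the CSR package.

```python
# Illustrative only: plain lists stand in for the arrays held by a CSR matrix.
def dense_to_csr(dense):
    """Build (rowptrs, colinds, values) from a dense list-of-lists matrix."""
    rowptrs = [0]
    colinds = []
    values = []
    for row in dense:
        for col, x in enumerate(row):
            if x != 0:
                colinds.append(col)
                values.append(float(x))
        # rowptrs[i + 1] marks where row i ends in colinds and values
        rowptrs.append(len(colinds))
    return rowptrs, colinds, values


dense = [
    [1, 0, 2],
    [0, 0, 0],
    [0, 3, 0],
]
rowptrs, colinds, values = dense_to_csr(dense)
print(rowptrs)  # [0, 2, 2, 3]
print(colinds)  # [0, 2, 1]
print(values)   # [1.0, 2.0, 3.0]
```

The row-pointer list has one more entry than the matrix has rows, and an empty row shows up as two equal consecutive pointers.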
Constructing Matrices
In addition to the CSR constructor, there are several utility methods for constructing sparse matrices.
- classmethod CSR.from_coo(rows, cols, vals, shape=None, *, rpdtype=<class 'numpy.int32'>)
Create a CSR matrix from data in COO format.
- Parameters
rows (array-like) – the row indices.
cols (array-like) – the column indices.
vals (array-like) – the data values; can be None.
shape (tuple) – the array shape, or None to infer from row & column indices.
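To make the COO-to-CSR conversion concrete, here is a pure-Python sketch of one common way to do it (count entries per row, prefix-sum, then scatter). It is an assumption about the general technique, not the package's implementation; `coo_to_csr` is a hypothetical name, and the sketch does not sort columns within a row.

```python
def coo_to_csr(rows, cols, vals=None, shape=None):
    """Convert COO triples to (nrows, ncols, rowptrs, colinds, values)."""
    if shape is None:
        # infer the shape from the largest indices seen
        nrows = max(rows) + 1 if rows else 0
        ncols = max(cols) + 1 if cols else 0
    else:
        nrows, ncols = shape

    # count the entries in each row, then prefix-sum into row pointers
    rowptrs = [0] * (nrows + 1)
    for r in rows:
        rowptrs[r + 1] += 1
    for i in range(nrows):
        rowptrs[i + 1] += rowptrs[i]

    # scatter each entry into the next free slot of its row
    next_slot = rowptrs[:-1].copy()
    colinds = [0] * len(rows)
    values = None if vals is None else [0.0] * len(rows)
    for k, (r, c) in enumerate(zip(rows, cols)):
        pos = next_slot[r]
        colinds[pos] = c
        if values is not None:
            values[pos] = float(vals[k])
        next_slot[r] += 1
    return nrows, ncols, rowptrs, colinds, values


nrows, ncols, rowptrs, colinds, values = coo_to_csr(
    rows=[2, 0, 0], cols=[1, 2, 0], vals=[3.0, 2.0, 1.0]
)
print(rowptrs)  # [0, 2, 2, 3]
print(colinds)  # [2, 0, 1]
print(values)   # [2.0, 1.0, 3.0]
```

Passing `vals=None` yields a structure-only result in this sketch, mirroring the optional value array described above.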
- classmethod CSR.empty(nrows, ncols, row_nnzs=None, values=True)
Create an uninitialized CSR matrix.
- Parameters
nrows (int) – the number of rows.
ncols (int) – the number of columns.
row_nnzs (array-like) – the number of nonzero entries for each row, or None for an empty matrix.
values (bool, str, or numpy.dtype) – whether it has values or only structure; can be a NumPy data type to specify a type other than f8.
Constructing from Numba
Numba does not provide access to CSR’s class methods; instead, use the creation functions (these also work from pure Python):
- csr.create(nrows, ncols, nnz, rowptrs, colinds, values)
Create a CSR.
- csr.create_novalues(nrows, ncols, nnz, rowptrs, colinds)
Create a CSR without values.
- csr.create_empty(nrows, ncols)
Create an empty CSR of the specified size.
Note
This function can be used from Numba.
- csr.create_from_sizes(nrows, ncols, sizes)
Create a CSR with uninitialized values and specified row sizes.
This function is Numba-accessible, but is limited to creating matrices with fewer than \(2^{31}\) nonzero entries and present value arrays.
- Parameters
nrows (int) – the number of rows
ncols (int) – the number of columns
sizes (numpy.ndarray) – the number of nonzero values in each row
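The relationship between per-row sizes and row pointers can be sketched as a cumulative sum. The code below is a pure-Python illustration with hypothetical names (`rowptrs_from_sizes`, `INT32_LIMIT`); the bound check simply mirrors the \(2^{31}\) limit stated above and is not taken from the package source.

```python
from itertools import accumulate

INT32_LIMIT = 2**31  # bound taken from the limit stated in the text above


def rowptrs_from_sizes(sizes):
    """Turn per-row entry counts into a row-pointer list of length nrows + 1."""
    rowptrs = [0] + list(accumulate(sizes))
    if rowptrs[-1] >= INT32_LIMIT:
        raise ValueError("too many nonzero entries for this sketch")
    return rowptrs


sizes = [2, 0, 1]
rowptrs = rowptrs_from_sizes(sizes)
nnz = rowptrs[-1]
# Placeholder storage; a caller would fill these in afterwards.
colinds = [0] * nnz
values = [0.0] * nnz
print(rowptrs)  # [0, 2, 2, 3]
```

After allocation, the caller is expected to write column indices and values into each row's slice before using the matrix.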
Accessing Rows
The CSR data itself is exposed through attributes. There are also several methods to extract row data in a more convenient form.
- CSR.row_extent(row)
Get the extent of a row in the underlying column index and value arrays.
- CSR.row_cs(row)
Get the column indices for the stored values of a row.
- CSR.row_vs(row)
Get the stored values of a row. If only the matrix structure is stored, this returns a vector of 1s.
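As a rough picture of what these three accessors compute, the following pure-Python sketch works directly on row-pointer, column-index, and value lists. It assumes an extent is a `(start, end)` pair into the underlying arrays; the free functions are hypothetical stand-ins for the methods, not the package's code.

```python
def row_extent(rowptrs, row):
    """Return (start, end) of a row in the column-index and value arrays."""
    return rowptrs[row], rowptrs[row + 1]


def row_cs(rowptrs, colinds, row):
    """Return the column indices stored for a row."""
    start, end = row_extent(rowptrs, row)
    return colinds[start:end]


def row_vs(rowptrs, values, row):
    """Return the stored values of a row, or 1s for a structure-only matrix."""
    start, end = row_extent(rowptrs, row)
    if values is None:
        return [1.0] * (end - start)
    return values[start:end]


rowptrs = [0, 2, 2, 3]
colinds = [0, 2, 1]
values = [1.0, 2.0, 3.0]

print(row_extent(rowptrs, 0))       # (0, 2)
print(row_cs(rowptrs, colinds, 0))  # [0, 2]
print(row_vs(rowptrs, values, 2))   # [3.0]
print(row_vs(rowptrs, None, 0))     # [1.0, 1.0]
```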
Transforming and Manipulating Matrices
- CSR.copy(include_values=True, *, copy_structure=True)
Create a copy of this CSR.
- CSR.subset_rows(begin, end)
Subset the rows in this matrix.
Note
This method is not available from Numba.
- CSR.filter_nnzs(filt)
Filter the values along the full NNZ axis.
Note
This method is not available from Numba.
- CSR.transpose(include_values=True)
Transpose a CSR matrix.
Note
In Numba, this method takes no parameters. Call transpose_structure() for a structure-only transpose.
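For intuition, a CSR transpose can be sketched in pure Python by counting entries per column and scattering each entry into its new row. This is one standard approach and is only assumed to resemble what the package does; `transpose_csr` is a hypothetical helper.

```python
def transpose_csr(nrows, ncols, rowptrs, colinds, values=None):
    """Transpose CSR arrays; pass values=None for a structure-only transpose."""
    nnz = rowptrs[nrows]

    # count the entries in each column; columns become rows of the transpose
    t_rowptrs = [0] * (ncols + 1)
    for c in colinds[:nnz]:
        t_rowptrs[c + 1] += 1
    for j in range(ncols):
        t_rowptrs[j + 1] += t_rowptrs[j]

    # scatter each entry (r, c) into the next free slot of transposed row c
    next_slot = t_rowptrs[:-1].copy()
    t_colinds = [0] * nnz
    t_values = None if values is None else [0.0] * nnz
    for r in range(nrows):
        for k in range(rowptrs[r], rowptrs[r + 1]):
            pos = next_slot[colinds[k]]
            t_colinds[pos] = r
            if t_values is not None:
                t_values[pos] = values[k]
            next_slot[colinds[k]] += 1
    return t_rowptrs, t_colinds, t_values


# 3x3 matrix with entries (0,0)=1, (0,2)=2, (2,1)=3
t_rowptrs, t_colinds, t_values = transpose_csr(
    3, 3, [0, 2, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]
)
print(t_rowptrs)  # [0, 1, 2, 3]
print(t_colinds)  # [0, 2, 0]
print(t_values)   # [1.0, 3.0, 2.0]
```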
- CSR.normalize_rows(normalization)
Normalize the rows of the matrix.
Note
The normalization ignores missing values instead of treating them as 0.
Note
This method is not available from Numba.
- Parameters
normalization (str) –
The normalization to perform. Can be one of:
'center' - center rows about the mean
'unit' - convert rows to a unit vector
- Returns
The normalization values for each row.
- Return type
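The note about missing values is easiest to see in code. The sketch below is a pure-Python illustration under several assumptions: that 'unit' means the Euclidean (L2) norm, that values are updated in place, and that empty rows report 0.0. The function is a hypothetical stand-in, not the package's method.

```python
import math


def normalize_rows(rowptrs, values, normalization):
    """Normalize stored values row by row; return the per-row normalization values."""
    nrows = len(rowptrs) - 1
    norms = [0.0] * nrows  # empty rows keep 0.0 (a choice made by this sketch)
    for r in range(nrows):
        start, end = rowptrs[r], rowptrs[r + 1]
        if start == end:
            continue
        row = values[start:end]
        if normalization == "center":
            # mean over the stored entries only, not over all ncols columns
            norm = sum(row) / len(row)
            values[start:end] = [x - norm for x in row]
        elif normalization == "unit":
            # assumed to be the Euclidean (L2) norm of the stored entries
            norm = math.sqrt(sum(x * x for x in row))
            if norm > 0:
                values[start:end] = [x / norm for x in row]
        else:
            raise ValueError(f"unknown normalization: {normalization!r}")
        norms[r] = norm
    return norms


values = [1.0, 3.0, 4.0]
norms = normalize_rows([0, 2, 2, 3], values, "center")
print(norms)   # [2.0, 0.0, 4.0]
print(values)  # [-1.0, 1.0, 0.0]

unit_values = [3.0, 4.0]
unit_norms = normalize_rows([0, 2], unit_values, "unit")
print(unit_norms)  # [5.0]
```

In the 'center' example, row 0 has two stored entries in a wider matrix, and its mean is 2.0 (over the stored entries) rather than a smaller value diluted by implicit zeros.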
- CSR.drop_values()
Remove the value array from this CSR. This is an in-place operation.
Warning
This method is deprecated.
Note
This method is not available from Numba.
- CSR.fill_values(value)
Fill the values of this CSR with the specified value. If the CSR is structure-only, a value array is added. This is an in-place operation.
Warning
This method is deprecated.
Note
This method is not available from Numba.
Arithmetic
CSRs do not yet support the full suite of SciPy/NumPy matrix operations, but they do support multiplications:
- CSR.mult_vec(v)
Multiply this matrix by a vector.
- Parameters
v (numpy.ndarray) – A vector, of length ncols.
- Returns
\(A\vec{v}\), as a vector.
- Return type
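The matrix-vector product over CSR arrays can be sketched in a few lines of pure Python. This is an illustration of the general technique, not the package's kernel; treating a structure-only matrix as all 1s is an assumption borrowed from the row_vs description.

```python
def mult_vec(nrows, rowptrs, colinds, values, v):
    """Compute A @ v for a CSR matrix given as plain lists."""
    out = [0.0] * nrows
    for r in range(nrows):
        for k in range(rowptrs[r], rowptrs[r + 1]):
            # structure-only matrices are treated as having 1s (assumption)
            x = 1.0 if values is None else values[k]
            out[r] += x * v[colinds[k]]
    return out


# 3x3 matrix with entries (0,0)=1, (0,2)=2, (2,1)=3
result = mult_vec(3, [0, 2, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0], [1.0, 10.0, 100.0])
print(result)  # [201.0, 0.0, 30.0]
```

Only stored entries are visited, so the cost is proportional to the number of nonzeros rather than to nrows × ncols.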
- CSR.multiply(other, transpose=False)
Multiply this matrix by another.
Note
In Numba, transpose is a mandatory positional argument. Numba users may wish to directly use the kernel API.
- Parameters
- Returns
CSR: the product of the two matrices.
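For a sense of what a sparse-sparse product involves, here is a pure-Python sketch using row-wise accumulation into a dictionary. It is a textbook approach shown for illustration, not the package's kernel. It assumes both inputs carry values and does not model the transpose option; `multiply_csr` and the tuple layout are hypothetical.

```python
def multiply_csr(a, b):
    """Multiply two matrices given as (nrows, ncols, rowptrs, colinds, values)."""
    a_nrows, a_ncols, a_rp, a_ci, a_vs = a
    b_nrows, b_ncols, b_rp, b_ci, b_vs = b
    if a_ncols != b_nrows:
        raise ValueError("inner dimensions do not match")

    rowptrs, colinds, values = [0], [], []
    for r in range(a_nrows):
        acc = {}  # column index -> accumulated value for output row r
        for k in range(a_rp[r], a_rp[r + 1]):
            inner = a_ci[k]
            for j in range(b_rp[inner], b_rp[inner + 1]):
                acc[b_ci[j]] = acc.get(b_ci[j], 0.0) + a_vs[k] * b_vs[j]
        for c in sorted(acc):
            colinds.append(c)
            values.append(acc[c])
        rowptrs.append(len(colinds))
    return a_nrows, b_ncols, rowptrs, colinds, values


# A = [[1, 2], [0, 3]], B = [[0, 4], [5, 0]]
A = (2, 2, [0, 2, 3], [0, 1, 1], [1.0, 2.0, 3.0])
B = (2, 2, [0, 1, 2], [1, 0], [4.0, 5.0])
product = multiply_csr(A, B)
print(product[2])  # [0, 2, 3]
print(product[3])  # [0, 1, 0]
print(product[4])  # [10.0, 4.0, 15.0]
```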
SciPy Integration
CSR matrices can be converted to and from SciPy sparse matrices (in any layout):
- classmethod CSR.from_scipy(mat, copy=True)
Convert a SciPy sparse matrix to a CSR.
- Parameters
mat (scipy.sparse.spmatrix) – a SciPy sparse matrix.
copy (bool) – if False, reuse the SciPy storage if possible.
- Returns
a CSR matrix.
- Return type
- CSR.to_scipy()
Convert a CSR matrix to a SciPy scipy.sparse.csr_matrix. Avoids copying if possible.
- Parameters
self (CSR) – A CSR matrix.
- Returns
A SciPy sparse matrix with the same data.
- Return type