It is wasteful to store zeros elements in a `sparse matrix`

, especially for incrementally data. When constructing `tf-idf`

and `bag-of-words`

features or saving `graph ajacent matrix`

, non-efficient sparse matrix storage might lead to the `memory error`

. To circumvent this problems, efficient sparse matrix storage is a choice.

# Sparse matrix

A martix is sparse when its sparsity is greater than 0.5, where the **sparsity** of a matrix is the # of zero-valued elements divided by the total # of elements (which is equal to 1 minus the **density** of the matrix).^{[2]}

# Storage

Each entry in the array represents the element of the matrix and is accessed by the two indices $i$ (row index) and $j$ (colomn index).

Sparse matrix can reduce sustantial memory requirements by storing only the non-zero entries.

Formats can be divided into:

those support

efficient modification, such as DOK, LIL or COO, which are typically used to construct the matrices.

those that support efficient access and matrix operations, such as CSR or CSC.

## Dictionary of keys (DOK)

- DOK is storing a dictionary that mays
`(row, column)`

-pairs to the valu of the elements. Elements that are missing from the dictionary are zeros. - DOK based sparse matrix is
*efficient for constructing sparse matrices incrementally*. - Allows for efficient O(1) access of individual elements. Duplicates are not allowed. Can be efficiently converted to a coo_matrix once constructed.
^{[3]}

1 | class scipy.sparse.dok_matrix( |

## List of lists (LIL)

- LIL stores one list per row, with each entry containing the column index and the value, where entries are sorted by column index for faster indexing.
^{[4]} *Also efficient for constructing sparse matrices incrementally*

1 | class scipy.sparse.lil_matrix( |

**Advantages** of the LIL format

- supports
`flexible slicing`

- changes to the matrix sparsity structure are efficient

**Disadvantages** of the LIL format

**arithmetic operations**LIL + LIL are slow (consider**CSR**or**CSC**)- slow column slicing (consider CSC)
- slow matrix vector products (consider CSR or CSC)

Intended **Usage**

LIL is a convenient format for constructing sparse matrices

once a matrix has been constructed, convert toCSRorCSCformat forfast arithmetic and matrix vector operations

consider using the`COO format`

whenconstructing large matrices

## Coordinate list (COO)

- COO stores a list of
`(row, column, value)`

tuples, where entries are sorted first by row index and then by column index. ALso known as the`ijv`

or`triplet`

format.^{[5]} *Also efficient for constructing sparse matrices incrementally*`A[i[k], j[k]] = data[k]`

1 | class scipy.sparse.coo_matrix( |

**Advantages** of the COO format

- facilitates fast conversion among sparse formats
- permits duplicate entries (see example)
- very fast conversion to and from CSR/CSC formats

**Disadvantages** of the COO format

- does not directly support
`arithmetic operations`

and`slicing`

Intended **Usage**

- COO is a fast format for
*constructing sparse matrices* - Once a matrix has been constructed, convert to
*CSR*or*CSC*format for*fast arithmetic and matrix vector operations* - By default when converting to CSR or CSC format, duplicate (i,j) entries will be summed together. This facilitates efficient construction of finite element matrices and the like. (see example)

## Compressed sparse row (CSR, CRS or Yale format)

The *compressed sparse row* (CSR) or *compressed row storage* (CRS) or Yale format stores the matrix in three 1-dim arrays `(data, col_indices, row_indptr)`

, respectively containing non-zero values.

Let `NNZ`

denote the # of nonzero entries.^{[6]}

- The arrays
`data`

(non-zero values) and`col_indices`

(column indices) are of length`NNZ`

. - The array
`row_indptr`

has one element per row in the matrix and encodes the index in`data`

where the given row starts. - The first value of
`row_indptr`

is always 0 and the last is always set to`NNZ`

.

Insantiate with:

`csr_matrix(D)`

, with a dense matrix or rank-2 ndarray D`csr_matrix(S)`

, with another sparse matrix S (equivalent to S.tocsr())`csr_matrix((M, N), [dtype])`

, to construct an empty matrix with shape (M, N) dtype is optional, defaulting to dtype=’d’.`csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)])`

, where data, row_ind and col_ind satisfy the relationship a[row_ind[k], col_ind[k]] = data[k].`csr_matrix((data, indices, indptr), [shape=(M, N)])`

.`row = row_indptr[i: i+1]`

`col = indices[row]`

`A[row,col] = data[row]`

**Advantages** of the CSC format

- efficient arithmetic operations CSR + CSR, CSR * CSR, etc.
- efficient column slicing
- fast matrix vector products

**Disadvantages** of the CSC format

- slow column slicing operations (consider CSC)
- changes to the sparsity structure are expensive (consider LIL or DOK)

## Compressed sparse column (CSC or CCS)

CSC is similar to CSR except that values are read first by column, storing row indices for each value.

1 | 0, 2, 3, 6]) # col_indptr indptr = np.array([ |

**Advantages** of the CSC format

- efficient arithmetic operations CSC + CSC, CSC * CSC, etc.
- efficient column slicing
- fast matrix vector products (CSR, BSR may be faster)

**Disadvantages** of the CSC format

- slow row slicing operations (consider CSR)
- changes to the sparsity structure are expensive (consider LIL or DOK)

# Shuffle sparse matrix

When generating batches of training/test data, `sklearn.utils.shuffle`

can help.

1 | from sklearn.utils import shuffle |

# Feed to TensorFlow

## Convert to dense matrix

1 | x_dense_batch = x_csr_batch.todense() |

## Feed to `tf.sparse.placeholder`

1 | import tensorflow as tf |

After feeding into the tensorflow computational graph, use `tf.sparse`

since *SparseTensor* cannot be directly feed into a *Tensor* op!

Use

`tf.sparse`

1

2

3w = tf.Variable(tf.random_normal([dim1, dim2], mean=.0, stddev=.01))

b = tf.Variable(tf.random_normal([dim2], mean=.0, stddev=.01))

sp_x = tf.sparse.sparse_dense_matmul(sparse_input, w) + bConvert the

`SparseTensor`

to dense`Tensor`

1

2

3

4

5

6

7

8

9

10

11

12

13tf.sparse.to_dense(

sp_input, default_value=None, validate_indices=True, name=None

)

"""

Args:

-------------------

sp_input: SparseTensor

default_value: Scalar value to set for indices not specified in sp_input. Defaults to zero.

validate_indices: boolean (default True). If True, indices are checked to assure sorted lexicographic order and no repeats.

name (optional): a name prefix for returned tensors.

"""

dense_x = tf.sparse.to_dense(sparse_input)