Sparse matrix formats

Sparse matrix formats
Dr.SethPatton Profile Pic
Dr.SethPatton,United Kingdom,Teacher
Published Date:22-07-2017
Your Website URL(Optional)
Comment
ECE 250 Algorithms and Data Structures Sparse matrix formats Douglas Wilhelm Harder, M.Math. LEL Department of Electrical and Computer Engineering University of Waterloo Waterloo, Ontario, Canada ece.uwaterloo.ca dwharderalumni.uwaterloo.ca © 2006-2013 by Douglas Wilhelm Harder. Some rights reserved.Sparse matrix formats 2 Outline In this topic, we will cover: – Sparse matrices – A row-column-value representation – The old Yale sparse matrix data structure – Examples – Appropriate useSparse matrix formats 3 Dense versus sparse matrices A dense N × M matrix is one where most elements are non-zero A dense matrix data structure is one that has a memory location allocated for each possible matrix entry – These require Q(NM) memory A sparse matrix is where 90 % or more of the entries are zero – If m is the number of non-zero entries, • The density is m/MN • The row density is m/N – Operations and storage should depend on m – For sparse matrices, m = o(MN)Sparse matrix formats 4 Example A dense 1 585 478 square matrix of doubles would occupy 18 TiB The matrix representation of the AMD G3 processor has only 7 660 826 non-zero entries – The density is 0.000003 – The matrix is used for circuit simulation – Simulations requires solving a system of 1,585,478 equations in that many unknowns http://www.cise.ufl.edu/research/sparse/matrices/AMD/G3_circuit.htmlSparse matrix formats 5 Dense square matrix data structure 2 2 This class stores N × N matrices with 8 + 4N + 8N = Q(N ) bytes class Matrix private: int N; double matrix; Matrix::Matrix( int n ): public: N( std::max(1, n) ), Matrix( int n ); matrix( new double N ) // ... matrix0 = new doubleNN; ; for ( int i = 1; i N; ++i ) matrixi = matrix0 + iN; pointer arithmeticSparse matrix formats 6 Row-Column-Value Representation As an example: 1.5 0 0 0 0 0 0 0   0 2.3 0 1.4 0 0 0 0   0 0 3.7 0 02.7 0 0  01.6 0 2.3 0 0 0 0  A  0 0 0 0 5.8 0 0 0  0 0 0 0 0 7.4 0 0  matrix  0 0 1.9 0 0 0 4.9 0   00000003.6  1.5 0 0 0 0 0 0 0 0 2.3 0 1.4 0 0 0 0 0 0 3.7 0 0 -2.7 0 0 0 -1.6 0 2.3Sparse matrix formats 7 Row-Column-Value Representation Suppose we store triplets for each non-zero value: – We would have (0, 0, 1.5), (1, 1, 2.3), (1, 3, 1.4), (2, 2, 3.7), … 1.5 0 0 0 0 0 0 0   0 2.3 0 1.4 0 0 0 0   0 0 3.7 0 02.7 0 0  01.6 0 2.3 0 0 0 0  A  0 0 0 0 5.8 0 0 0  0 0 0 0 0 7.4 0 0   0 0 1.9 0 0 0 4.9 0   00000003.6 – The class would be  class Triplet int row; int column; double value; ;Sparse matrix formats 8 Row-Column-Value Representation Now the memory requirements are Q(m) class Sparse_matrix private: int matrix_size; int N; int array_capacity; Triplet entries; public: Sparse_matrix( int n, int c = 128 ); ; Sparse_matrix( int n, int c ): matrix_size( 0 ), N( std::max(1, n) ), array_capacity( std::max(1, c) ), entries( new Tripletarray_capacity ) // does nothing Sparse matrix formats 9 Row-Column-Value Representation For the same example, we have: 1.5 0 0 0 0 0 0 0   0 2.3 0 1.4 0 0 0 0   0 0 3.7 0 02.7 0 0  01.6 0 2.3 0 0 0 0  A  0 0 0 0 5.8 0 0 0  0 0 0 0 0 7.4 0 0   entries 0 0 1.9 0 0 0 4.9 0   00000003.6  0 1 1 2 2 3 3 4 5 6 6 7 0 1 3 2 5 1 3 4 5 2 6 7 1.5 2.3 1.4 3.7 –2.7 –1.6 2.3 5.8 7.4 1.9 4.9 3.6Sparse matrix formats 10 Row-Column-Value Representation We fill in the entries in row-major order 1.5 0 0 0 0 0 0 0   0 2.3 0 1.4 0 0 0 0   0 0 3.7 0 02.7 0 0  01.6 0 2.3 0 0 0 0  A  0 0 0 0 5.8 0 0 0  0 0 0 0 0 7.4 0 0   entries 0 0 1.9 0 0 0 4.9 0   00000003.6  0 1 1 2 2 3 3 4 5 6 6 7 0 1 3 2 5 1 3 4 5 2 6 7 1.5 2.3 1.4 3.7 –2.7 –1.6 2.3 5.8 7.4 1.9 4.9 3.6Sparse matrix formats 11 Row-Column-Value Representation The AMD G3 circuit now requires 117 MiB versus 18 TiB http://www.cise.ufl.edu/research/sparse/matrices/AMD/G3_circuit.htmlSparse matrix formats 12 Accessing Entries Adding and erasing entries (setting to 0) requires a lot of work – They are require O(m) copies 1.5 0 0 0 0 0 0 0   0 2.3 0 1.4 0 0 0 0   0 0 3.7 0 02.7 0 0  01.6 0 2.3 0 0 0 0  A  0 0 0 0 5.8 0 0 0  0 0 0 0 0 7.4 0 0  entries  0 0 1.9 0 0 0 4.9 0   00000003.6  0 1 1 2 2 3 3 4 5 6 6 7 0 1 3 2 5 1 3 4 5 2 6 7 1.5 2.3 1.4 3.7 –2.7 –1.6 2.3 5.8 7.4 1.9 4.9 3.6Sparse matrix formats 13 Accessing Entries By using row-major order, the indices are lexicographically ordered – We may search using a binary search: O(ln(m)) 1.5 0 0 0 0 0 0 0   0 2.3 0 1.4 0 0 0 0   0 0 3.7 0 02.7 0 0  01.6 0 2.3 0 0 0 0  A  0 0 0 0 5.8 0 0 0  0 0 0 0 0 7.4 0 0  entries  0 0 1.9 0 0 0 4.9 0   00000003.6  0 1 1 2 2 3 3 4 5 6 6 7 0 1 3 2 5 1 3 4 5 2 6 7 1.5 2.3 1.4 3.7 –2.7 –1.6 2.3 5.8 7.4 1.9 4.9 3.6Sparse matrix formats 14 Old Yale Sparse Matrix Format Storing the row, column, and value of each entry has some undesirable characteristics – Accessing an entry is O(ln(m)) • For the AMD G3, lg(7,660,826) = 23 – If each row has approximately the same number of entries, we can use an interpolation search: O(ln(ln(m))) • For the AMD G3, ln(ln(7,660,826)) = 3Sparse matrix formats 15 Old Yale Sparse Matrix Format The original (old) Yale sparse matrix format: – Reduces access time to lg(m/N) and uses less memory – For circuits, it is seldom that m/N 10 • For the AMD G3 circuits, m/N 5 It was developed by Eisenstat et al. in the late 1970sSparse matrix formats 16 Old Yale Sparse Matrix Format Note that the arrays have redundancy – Any row which contains more than one entry is stored in successive entries1.5 0 0 0 0 0 0 0  0 2.3 0 1.4 0 0 0 0   0 0 3.7 0 0 0 0 0   01.6 0 2.3 9.9 0 0 0 A  0 0 0 0 5.8 0 0 0   0 0 0 0 0 7.4 0 0  0 0 1.9 0 0 0 4.9 0   0 0 0 0 0 0 0 3.6  entries entries 0 1 2 3 4 5 6 7 8 9 10 11 0 1 1 2 3 3 3 4 5 6 6 7 0 1 3 2 5 1 3 4 5 2 6 7 1.5 2.3 1.4 3.7 –1.6 2.3 9.9 5.8 7.4 1.9 4.9 3.6Sparse matrix formats 17 Old Yale Sparse Matrix Format Suppose we store only the location of the first entry of each row: – The first entry in Row 0 is in entry 0 1.5 0 0 0 0 0 0 0 – The first entry in Row 3 is in entry 4  0 2.3 0 1.4 0 0 0 0   0 0 3.7 0 0 0 0 0   01.6 0 2.3 9.9 0 0 0 A  0 0 0 0 5.8 0 0 0   0 0 0 0 0 7.4 0 0  0 0 1.9 0 0 0 4.9 0   0 0 0 0 0 0 0 3.6  entries 0 1 2 3 4 5 6 7 8 9 10 11 0 1 1 2 3 3 3 4 5 6 6 7 0 1 3 2 5 1 3 4 5 2 6 7 1.5 2.3 1.4 3.7 –1.6 2.3 9.9 5.8 7.4 1.9 4.9 3.6Sparse matrix formats 18 Old Yale Sparse Matrix Format Let us remove that redundancy by creating a new array IA where IA(i) stores the location in JA and A of the first non-zero entry in th the i row1.5 0 0 0 0 0 0 0  0 2.3 0 1.4 0 0 0 0  – Rename the column and value  0 0 3.7 0 0 0 0 0  arrays to JA and A, respectively  01.6 0 2.3 9.9 0 0 0 A  0 0 0 0 5.8 0 0 0   0 0 0 0 0 7.4 0 0  0 0 1.9 0 0 0 4.9 0   0 0 0 0 0 0 0 3.6  IA 0 1 2 3 4 5 6 7 8 0 1 3 4 7 8 9 11 12 0 1 2 3 4 5 6 7 8 9 10 11 JA 0 1 3 2 5 1 3 4 5 2 6 7 A 1.5 2.3 1.4 3.7 –1.6 2.3 9.9 5.8 7.4 1.9 4.9 3.6Sparse matrix formats 19 Old Yale Sparse Matrix Format rd For example, the first entry of the 3 row is located at position IA(3) 1.5 0 0 0 0 0 0 0  0 2.3 0 1.4 0 0 0 0   0 0 3.7 0 0 0 0 0   01.6 0 2.3 9.9 0 0 0 A  0 0 0 0 5.8 0 0 0   0 0 0 0 0 7.4 0 0  0 0 1.9 0 0 0 4.9 0   0 0 0 0 0 0 0 3.6  IA 0 1 2 3 4 5 6 7 8 0 1 3 4 7 8 9 11 12 0 1 2 3 4 5 6 7 8 9 10 11 JA 0 1 3 2 5 1 3 4 5 2 6 7 A 1.5 2.3 1.4 3.7 –1.6 2.3 9.9 5.8 7.4 1.9 4.9 3.6Sparse matrix formats 20 Old Yale Sparse Matrix Format th Also, the first entry of the 4 row is located at position IA(4) – The entries of Row 4 are in 4, 5, 6 1.5 0 0 0 0 0 0 0  0 2.3 0 1.4 0 0 0 0   0 0 3.7 0 0 0 0 0   01.6 0 2.3 9.9 0 0 0 A  0 0 0 0 5.8 0 0 0   0 0 0 0 0 7.4 0 0  0 0 1.9 0 0 0 4.9 0   0 0 0 0 0 0 0 3.6  IA 0 1 2 3 4 5 6 7 8 0 1 3 4 7 8 9 11 12 0 1 2 3 4 5 6 7 8 9 10 11 JA 0 1 3 2 5 1 3 4 5 2 6 7 A 1.5 2.3 1.4 3.7 –1.6 2.3 9.9 5.8 7.4 1.9 4.9 3.6