Scientific Computing (wi4201)
C. Vuik and D.J.P. Lahaye
2015
Delft University of Technology
Faculty of Electrical Engineering, Mathematics and Computer Science
Delft Institute of Applied Mathematics

Contents

1 Introduction
  1.1 Errors in scientific computing

2 Concepts from Numerical Linear Algebra
  2.1 Vectors and Matrices
    2.1.1 Rank of a Matrix
    2.1.2 Kronecker Product of Matrices
  2.2 Eigenvalues, Eigenvectors and Spectrum
    2.2.1 Eigenvalues of Tridiagonal Matrices
  2.3 Symmetry
  2.4 Positive Definiteness
  2.5 Vector and Matrix Norms
    2.5.1 Inner Product
    2.5.2 Vector Norms
    2.5.3 Matrix Norms
  2.6 Condition Number
  2.7 Spectral Radius
  2.8 Irreducibility and Diagonal Dominance
  2.9 Positive Inverse
  2.10 Perron-Frobenius Theorem
  2.11 A Collection of Matrices
  2.12 Exercises

3 The Model Problem and Its Finite Difference Discretization
  3.1 Introduction
  3.2 Motivating Examples
  3.3 Classification of Second Order Partial Differential Equations
  3.4 The Model Problem
  3.5 Finite Difference Discretization
  3.6 Linear System Formulation
  3.7 Properties of the Discrete Linear System
    3.7.1 Discrete Spectrum
  3.8 Bi-Harmonic Equation
  3.9 Beyond the Model Problem
    3.9.1 Elliptic Problems
    3.9.2 Parabolic Problems
    3.9.3 Dense Matrices in PDE Problems
    3.9.4 Sparse Matrices in non-PDE Problems
  3.10 List of Model Problems
  3.11 Computer Implementation
  3.12 Exercises

4 Direct Solution Methods
  4.1 Introduction
  4.2 Floating Point Arithmetic
  4.3 Perturbation Analysis
    4.3.1 Perturbation in the right-hand side only
    4.3.2 Perturbation in both matrix and right-hand side
    4.3.3 Diagonal Scaling and Conditioning
  4.4 The Gaussian Elimination Method
    4.4.1 Existence and Uniqueness of the LU-Factorization
    4.4.2 Computing the LU-Factorization
    4.4.3 Forward and Backward Substitution
    4.4.4 Computational Cost
  4.5 Gaussian Elimination in the Presence of Round-Off Errors
    4.5.1 Solve Stage
    4.5.2 Factorization Stage
  4.6 Gaussian Elimination with Pivoting
    4.6.1 Gaussian Elimination with Partial Pivoting
    4.6.2 Gaussian Elimination with Complete Pivoting
  4.7 Diagonally Dominant Matrices
  4.8 Cholesky Decomposition
    4.8.1 Existence and Uniqueness
    4.8.2 Computing the Cholesky Factor
    4.8.3 Work
    4.8.4 Stability
  4.9 Band Matrices
    4.9.1 Gaussian Elimination without Pivoting
    4.9.2 Occurrence of Fill-in
    4.9.3 Gaussian Elimination with Partial Pivoting
  4.10 General Sparse Matrices
    4.10.1 Yale Format
    4.10.2 Compressed Sparse Row (CSR or CRS) Format
    4.10.3 Compressed Sparse Column (CSC or CCS) Format
    4.10.4 Sparse Direct Solvers
  4.11 Tridiagonal Systems
  4.12 Conclusions and Outlook
  4.13 Exercises

5 Basic Iterative Methods
  5.1 Introduction
  5.2 Why Iterating?
  5.3 Prototypes of BIMs
    5.3.1 The Method of Jacobi and Gauss-Seidel
    5.3.2 The Richardson and Damped Jacobi Method
    5.3.3 The Successive Overrelaxation Method
  5.4 To Converge or Not To Converge
    5.4.1 A General Result
    5.4.2 Matrix Norm Bounds
    5.4.3 Regular Splittings
    5.4.4 Diagonal Dominance
    5.4.5 Convergence of SOR(ω)
  5.5 Speed of Convergence of BIMs
    5.5.1 A General Result
    5.5.2 Speed of Convergence of Damped Jacobi
    5.5.3 Speed of Convergence of SOR(ω)

Chapter 1  Introduction

Computational Science and Engineering (CSE) is a rapidly growing multidisciplinary area with connections to the sciences, engineering, mathematics and computer science. CSE focuses on the development of problem-solving methodologies and robust tools for the solution of scientific and engineering problems. We believe that CSE will play an important, if not dominating, role in the future of the scientific discovery process and of engineering design. Below we give a more detailed description of Computational Science and Engineering. For more information on CSE we refer to the Society for Industrial and Applied Mathematics (SIAM).

In CSE we deal with the simulation of processes as they occur, among others, in the engineering, physical and economic sciences. Examples of application areas are fluid dynamics (the aerodynamics of cars and aircraft, combustion processes, pollution spreading), semiconductor technology (growing of crystals, oxidation processes), weather and climate prediction (the formation and tracking of tornados, global warming), applied physics (many-particle simulations, protein folding, drug design), but also financial mathematics (prediction of stock and option prices). Simulation is nowadays an equal and indispensable partner, next to theoretical and experimental research, in the advance of scientific knowledge. It is characteristic of CSE that practically relevant results are typically achieved by combining methods and research results from different scientific areas (see Figure 1.1).

Figure 1.1: Embedding of computational science and engineering.
The application area brings the typical problem-dependent know-how: the knowledge to formulate the problem and the model, and to verify the computed results by means of real experiments. Applied mathematics deals with the definition of a mathematical model, provides existence and uniqueness results, and develops efficient methods to solve the mathematical model on a computer. Computer science typically takes care of the usability and programming of modern computers and designs powerful software packages. Especially the interplay of the disciplines is necessary for success. For instance, it is possible to parallelize a poor mathematical method and implement it on a supercomputer, but this does not help us much. What we need instead is a continuing development of mathematical models and numerical algorithms, the transfer of these algorithms into powerful and user-friendly software, and running this software on state-of-the-art computers whose performance steadily increases.

Basically, we can distinguish a number of important steps in simulation:

- Setting up a model. At the start of each simulation we have the development of a mathematical model of the process of interest. This must be a simplified image of reality that contains the relevant phenomena. It must be formulated such that the model has a (unique) solution. Often we obtain a system of (not analytically solvable) differential equations.

- The analytical treatment. Analytical tools can be used to obtain properties of the solution: existence, uniqueness, maximum principles, etc. Furthermore, for simple problems an analytical solution can be found. In a number of cases approximate solutions can be derived using an asymptotic approach.

- The numerical treatment. Since a computer can only handle discrete numbers, we have to discretize the model (the equations and the solution domain), so that the computer can deal with it. For the solution of the matrix problem resulting from the discrete equations, mathematicians provide efficient methods.

- The implementation. Next to the choice of computer language and data structures, distributed computation in particular is of major importance.

- The embedding. In industrial applications the numerical simulation is just one step in the product development. Therefore, we need interfaces, so that we can link the simulation programme with CAD tools. Only in this way is it possible, for example, to use aerodynamic car simulation results on drag coefficients at an early stage of the design process.

- The visualization. Typically, after a simulation we have huge data sets (not only one drag coefficient as in the example above). Often we need the velocity in the complete flow domain, or, for the optimal control of a robot, one is interested in the path of the robot arm. We need to present such results with the help of computer graphics, so visualization is very important.

- The validation. After long computer times with many, many computations, we have the result of a simulation. It is of primary importance to verify the results obtained.

1.1 Errors in scientific computing

In scientific computing, we wish to compute an exact solution $u$, but we obtain an approximation $\hat{u}_h$ on a numerical grid after a number of iterations of an iterative solution method implemented on a computer. An important aspect of scientific computing is therefore the understanding of the errors made during the computation of solutions. The aim is to find an approximation $\hat{u}_h$ for the solution $u$ with a known accuracy.
One has to consider the possible errors occurring and their influence on the accuracy of the approximation. Obtaining an approximate solution without knowing anything about the approximation error is useless. Error measures include the absolute and relative error in some norm $\|\cdot\|$:

  $\|u - \hat{u}_h\|$      estimate of the absolute error,
  $\|u - \hat{u}_h\| / \|u\|$   estimate of the relative error.

An obvious area of conflict is that such estimates are only of interest if they are more or less realistic. Especially for academic model problems it may often be possible to obtain sharp error estimates, but this is far less trivial, or even not yet achieved in research, for solutions of real-life, highly nonlinear systems of partial differential equations. An overview of the errors that can be made in solving real-life problems with mathematical methods in scientific computing is presented in Figure 1.2.

Figure 1.2: Overview of errors made in scientific computing. Schematically: a real phenomenon is turned into a mathematical model by idealisation (modelling error); measurements or other data are included (error in data); the resulting mathematical problem is treated with numerical methods (errors in the solution of the problem: discretization, truncation error); the numerical method is executed on a computer (round-off errors), yielding the numerical solution as numbers.

The notion of the modelling error is as follows. Practical problems often come from physical, chemical, biological or economic applications. It is necessary to set up a mathematical model, which here is typically a partial differential equation (PDE). In order to analyze only certain features of a complicated problem, a linear simplification of a nonlinear process may be used, for simplicity. In fluid mechanics, for example, one distinguishes between laminar (steady, ordered) and turbulent (unsteady, nonlinear) flows. Whereas the solution of laminar flow problems is fully understood, the modelling of turbulent flows is still a subject of research. Another distinction is between mathematical models involving friction, leading to the Navier-Stokes equations, and models without friction, the Euler equations. Including friction in the mathematical model requires more complicated numerical solution methods than a frictionless model. Insight into the impact of the choice of mathematical model on the results obtained is indispensable.

Data errors may occur because, except for trivial cases, most data that enter a computation are afflicted with errors, for example measurement errors. They can have a big or a small effect on the numerical solution. This typically depends on the condition of the problem. If a problem is well-posed, small changes in the data do not have a significant effect on the solution, whereas for an ill-posed problem the opposite is true.

Discretization errors occur because we need to solve the (continuous) problem under consideration on a numerical grid, as an analytic solution is not available. The approximation of a function on a discrete grid means that we try to recover the function from its values at a finite number of points. As only $N$ values are obtained, one ends up with an approximate solution curve (see Figure 1.3).

Figure 1.3: An approximate solution curve of $f(x)$ as only $N$ values are obtained on a numerical grid.

Round-off errors occur because computers only use digital numbers of finite length. Practically, this means that numbers are already rounded during multiplication and division.
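As a concrete illustration of such rounding (an added sketch, not part of the original notes; it assumes Python with NumPy, whereas the notes themselves refer to Matlab), the snippet below shows that individual floating-point operations are rounded and that these small errors can accumulate:

    import numpy as np

    # A single operation is already rounded: 0.1 and 0.2 have no exact binary
    # representation, so the computed sum differs from the (also rounded) literal 0.3.
    print(0.1 + 0.2 == 0.3)        # False
    print(np.finfo(float).eps)     # unit round-off of double precision, about 2.2e-16

    # Accumulation of round-off: repeatedly adding 0.1 in single precision
    # drifts away from the exact value 100000.0.
    total = np.float32(0.0)
    for _ in range(1_000_000):
        total += np.float32(0.1)
    print(total)                   # noticeably different from 100000.0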
Each separate round-off error may be small; however, during long calculations many small errors can add up and make a solution totally useless. In scientific computing one therefore distinguishes stable from unstable algorithms, the first class leading to reliable numerical solutions.

Chapter 2  Concepts from Numerical Linear Algebra

In this chapter we introduce concepts from numerical linear algebra. Textbook references for this chapter include [38, 52].

2.1 Vectors and Matrices

In this chapter we will use bold lower-case Roman letters to denote real-valued vectors, e.g. $u \in \mathbb{R}^n$ with components $u_i$, where $1 \leq i \leq n$. We will use the notation $A$ and $R$ to denote a real-valued square and rectangular matrix, respectively. More precisely, we will assume that $A \in \mathbb{R}^{n \times n}$ with components $a_{ij}$, where $1 \leq i, j \leq n$, and that $R \in \mathbb{R}^{m \times n}$ with components $r_{ij}$, where $1 \leq i \leq m$ and $1 \leq j \leq n$. We will also write $A = (a_{ij})$. We will use the Matlab-like notation $A(i,:)$ and $A(:,j)$ to denote the $i$-th row and $j$-th column of $A$, respectively. We will use the symbols $0_{n \times n}$ and $I_{n \times n}$ to denote the $n \times n$ zero and identity matrix, respectively. We will denote by $|A|$ and $|R|$ the matrices obtained by taking the absolute value of $A$ and $R$ entry-wise, respectively.

In what follows we will say that $A$ is positive and write $A \geq 0_{n \times n}$ iff all components satisfy $a_{ij} \geq 0$. This notion induces a partial ordering on the set of $n \times n$ matrices by defining $A_1 \leq A_2$ to hold if and only if (iff) $A_2 - A_1 \geq 0_{n \times n}$. It is easy to verify that the product of positive matrices is again positive. We will denote the (left and right) inverse of $A$ by $A^{-1}$.

2.1.1 Rank of a Matrix

The rank of a matrix $R \in \mathbb{R}^{m \times n}$ is defined as the size of the largest collection of linearly independent rows or columns of the matrix. It can be computed as the dimension of the largest square non-singular submatrix of $R$.

2.1.2 Kronecker Product of Matrices

In these lecture notes, matrices and vectors represent discretized differential operators and fields. In particular circumstances the discrete differential operator in two and higher dimensions can be represented as a Kronecker product of its one-dimensional counterparts.

Definition 2.1.1 Given two rectangular matrices $R_1 \in \mathbb{R}^{m_1 \times n_1}$ and $R_2 \in \mathbb{R}^{m_2 \times n_2}$, their Kronecker product, denoted by $R_1 \otimes R_2 \in \mathbb{R}^{m_1 m_2 \times n_1 n_2}$, is formed by replacing each entry $r_{1,ij}$ of $R_1$ by $r_{1,ij} R_2$.

Example 2.1.1
$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \otimes \begin{pmatrix} 0 & 5 \\ 6 & 7 \end{pmatrix}
= \begin{pmatrix} 1\cdot 0 & 1\cdot 5 & 2\cdot 0 & 2\cdot 5 \\ 1\cdot 6 & 1\cdot 7 & 2\cdot 6 & 2\cdot 7 \\ 3\cdot 0 & 3\cdot 5 & 4\cdot 0 & 4\cdot 5 \\ 3\cdot 6 & 3\cdot 7 & 4\cdot 6 & 4\cdot 7 \end{pmatrix}
= \begin{pmatrix} 0 & 5 & 0 & 10 \\ 6 & 7 & 12 & 14 \\ 0 & 15 & 0 & 20 \\ 18 & 21 & 24 & 28 \end{pmatrix}$$

2.2 Eigenvalues, Eigenvectors and Spectrum

Definition 2.2.1 The non-zero vector $v^k \in \mathbb{R}^n \setminus \{0\}$ is an eigenvector corresponding to the eigenvalue $\lambda_k \in \mathbb{C}$ iff $A v^k = \lambda_k v^k$. The algebraic multiplicity of $\lambda_k$ is defined as the multiplicity of $\lambda_k$ as a root of the characteristic equation $\det(A - \lambda I) = 0$. The geometric multiplicity of $\lambda_k$ is defined as the dimension of the space spanned by the corresponding eigenvectors. The set of all eigenvalues of $A$ is called the spectrum of $A$ and will be denoted as $\sigma(A)$. To explicitly link the eigenvalue to the matrix we will use the notation $\lambda_k(A)$.

Two $n \times n$ matrices $A_1$ and $A_2$ are called similar iff a non-singular $n \times n$ matrix $V$ exists such that $A_2 = V^{-1} A_1 V$. If $A_1$ and $A_2$ are similar, then they have the same spectrum. A matrix is called diagonalizable if it is similar to a diagonal matrix. If $A$ is diagonalizable and can therefore be written as $A = V D V^{-1}$ for some matrices $V$ and $D$, the $k$-th power of $A$ can be computed as $A^k = V D^k V^{-1}$ [38].

The concept of singular values generalizes the concept of eigenvalues from square to rectangular matrices.
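The diagonalization-based formula for matrix powers mentioned above can be checked numerically. The following NumPy sketch (an added illustration, not part of the original notes) compares $A^k$ computed by repeated multiplication with $V D^k V^{-1}$ for a randomly chosen matrix, which is diagonalizable with probability one:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4))      # a random 4x4 matrix

    # Eigendecomposition A = V D V^{-1}
    eigvals, V = np.linalg.eig(A)

    k = 5
    A_power = np.linalg.matrix_power(A, k)                    # A^k by repeated multiplication
    A_power_eig = V @ np.diag(eigvals**k) @ np.linalg.inv(V)  # V D^k V^{-1}

    # The two results agree up to round-off (the eigendecomposition may be complex,
    # but the imaginary parts of V D^k V^{-1} are then negligibly small).
    print(np.allclose(A_power, A_power_eig))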
2.2.1 Eigenvalues of Tridiagonal Matrices

As tridiagonal matrices will often appear in this course, we give the following result on the eigenvalues of these matrices.

Theorem 2.2.1 Assume that $a$, $b$ and $c$ are real numbers such that $bc \neq 0$, and denote by $\imath$ the imaginary unit ($\imath^2 = -1$). The eigenvalues of the tridiagonal Toeplitz matrix
$$A = \begin{pmatrix}
a & b & 0 & \cdots & \cdots & \cdots & 0 \\
c & a & b & 0 & \cdots & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \ddots & & \vdots \\
0 & \cdots & \cdots & 0 & c & a & b \\
0 & \cdots & \cdots & \cdots & 0 & c & a
\end{pmatrix} \in \mathbb{R}^{n \times n}$$
are given by
$$\lambda_k(A) = a + 2\sqrt{bc}\,\cos\!\left(\frac{k\pi}{n+1}\right), \quad k = 1, \ldots, n.$$
The proof of this theorem, as well as the details of the frequently occurring case $b = c$, is left as an exercise.

2.3 Symmetry

Definition 2.3.1 The transpose of $A$, denoted by $A^T$, is the $n \times n$ matrix with components $(A^T)_{ij} = a_{ji}$.

Example 2.3.1 If $R \in \mathbb{R}^{m \times n}$, then $R^T \in \mathbb{R}^{n \times m}$, $R^T R \in \mathbb{R}^{n \times n}$ and $R R^T \in \mathbb{R}^{m \times m}$.

Definition 2.3.2 The matrix $A$ is symmetric iff $A^T = A$.

The theory of symmetric matrices is very rich, as argued e.g. in Chapter 7 of [38]. We recall here some basic facts.

Theorem 2.3.1 The eigenvalues of a symmetric matrix $A$ are real, i.e., $A^T = A \Rightarrow \sigma(A) \subset \mathbb{R}$.

A proof of this theorem is given in e.g. [52] (Theorem 1.9). This theorem implies that the eigenvalues of a symmetric matrix can be ordered increasingly as
$$\sigma(A) = \{\lambda_1 \leq \lambda_2 \leq \ldots \leq \lambda_n\}, \qquad (2.1)$$
where each eigenvalue is repeated according to its algebraic multiplicity. We will use the notation $\lambda_{\min}(A) = \lambda_1(A)$ and $\lambda_{\max}(A) = \lambda_n(A)$.

An $n \times n$ matrix $Q$ is called orthogonal iff its $n$ columns are orthonormal, i.e., iff $Q^T Q = I$. The following theorem will be very helpful in computing the $k$-th power of a symmetric matrix.

Theorem 2.3.2 A symmetric matrix $A$ is orthogonally diagonalizable, i.e., there exist an orthogonal matrix $Q$ and a diagonal matrix $D$ such that $A = Q^T D Q$. The entries of $D$ are the eigenvalues of $A$ and the columns of $Q$ span the eigenspaces of $A$.

2.4 Positive Definiteness

If $u$ is a column (standing) vector, then $u^T$ is a row (lying) vector, and the product $u^T A u$ is a scalar. This scalar is zero if $u = 0$, independently of $A$. Other choices for $u$ are considered in the following definition.

Definition 2.4.1 The matrix $A$ is called positive definite (positive semi-definite) iff
$$\forall u \in \mathbb{R}^n \setminus \{0\}: \quad u^T A u > 0 \quad (u^T A u \geq 0). \qquad (2.2)$$

In the next theorem the spectrum of symmetric positive definite (SPD) and symmetric positive semi-definite (SPSD) matrices is considered.

Theorem 2.4.1 The spectrum of a symmetric positive definite (positive semi-definite) matrix $A$ is strictly positive (non-negative), i.e., $A$ SPD $\Rightarrow \sigma(A) \subset \mathbb{R}^+$ ($A$ SPSD $\Rightarrow \sigma(A) \subset \mathbb{R}^+_0$).

2.5 Vector and Matrix Norms

2.5.1 Inner Product

The inner product is a positive definite bilinear form on $\mathbb{R}^n$. More precisely, we have the following definition.

Definition 2.5.1 Denoting by $c \in \mathbb{R}$ a scalar and by $u, v, w \in \mathbb{R}^n$ vectors, the inner product $\langle \cdot,\cdot \rangle : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is a function that satisfies the following properties:
1. symmetry: $\langle u, v \rangle = \langle v, u \rangle$
2. linearity in the first argument: $\langle cu, v \rangle = c \langle u, v \rangle$ and $\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle$
3. positive definiteness: $\langle u, u \rangle \geq 0$ with equality only for $u = 0$.

Example 2.5.1 The scalar product $u^T v = \sum_{i=1}^n u_i v_i$ is an inner product on $\mathbb{R}^n$.

Example 2.5.2 Given an SPD matrix $A$, the function $\langle u, v \rangle_A = u^T A v$ is an inner product on $\mathbb{R}^n$ (verify this). Two non-zero vectors $u$ and $v$ for which $\langle u, v \rangle_A = 0$ are called $A$-conjugate.

A useful relation satisfied by any inner product is the so-called Cauchy-Schwarz inequality
$$\langle x, y \rangle^2 \leq \langle x, x \rangle \, \langle y, y \rangle. \qquad (2.3)$$

2.5.2 Vector Norms

Vector norms define a distance measure in $\mathbb{R}^n$. More precisely, we have the following definition.
Definition 2.5.2 Denoting by $c \in \mathbb{R}$ a scalar and by $u, v \in \mathbb{R}^n$ vectors, a vector norm $\|\cdot\| : \mathbb{R}^n \to \mathbb{R}$ is a function that satisfies the following properties:
1. positivity: $\|u\| \geq 0$ with equality only for $u = 0$
2. homogeneity: $\|cu\| = |c| \, \|u\|$
3. triangle inequality: $\|u + v\| \leq \|u\| + \|v\|$.

Example 2.5.3 Given an inner product $\langle \cdot,\cdot \rangle$, the function $u \mapsto \sqrt{\langle u, u \rangle}$ defines a norm (verify). Taking in particular the scalar product as inner product, we arrive at the Euclidean norm denoted as $\|u\|_2 = \sqrt{u^T u}$. Taking the inner product induced by an SPD matrix $A$, we arrive at the $A$-norm denoted as $\|u\|_A = \sqrt{u^T A u}$.

The Euclidean norm is a special instance of a $p$-norm, defined next.

Definition 2.5.3 Given $1 \leq p \leq \infty$, the $p$-norm (Hölder norm) of a vector $u \in \mathbb{R}^n$, denoted as $\|u\|_p$, is defined by
$$\|u\|_p = \left[ \sum_{i=1}^n |u_i|^p \right]^{1/p}. \qquad (2.4)$$
We in particular have that
$$\|u\|_1 = |u_1| + \ldots + |u_n| \qquad (2.5)$$
$$\|u\|_2 = \left( u_1^2 + \ldots + u_n^2 \right)^{1/2} \qquad (2.6)$$
$$\|u\|_\infty = \max_{i=1,\ldots,n} |u_i|. \qquad (2.7)$$

An orthogonal transformation $Q \in \mathbb{R}^{n \times n}$ leaves the Euclidean ($p = 2$) norm of a vector invariant, i.e., if $Q^T Q = I$ then $\|Qu\|_2 = \|u\|_2$.

2.5.3 Matrix Norms

The analysis of matrix algorithms frequently requires the use of matrix norms. For example, the quality of a linear system solver may be poor if the matrix of coefficients is nearly singular. To quantify the notion of near-singularity we need a measure of distance on the space of $m \times n$ matrices. The concepts of a vector norm and of an operator induced norm allow us to define such a metric.

Definition 2.5.4 Given $1 \leq p \leq \infty$, the $p$-norm (Hölder norm) of a matrix $R \in \mathbb{R}^{m \times n}$, denoted as $\|R\|_p$, is defined by
$$\|R\|_p = \sup_{u \in \mathbb{R}^n \setminus \{0\}} \frac{\|Ru\|_p}{\|u\|_p}. \qquad (2.8)$$
The norm $\|R\|_p$ can be regarded as the maximum amplification factor of $R$.

Proposition 2.5.1 It can be shown that the above definition is equivalent to
$$\|R\|_p = \max_{u \in \mathbb{R}^n \setminus \{0\}} \frac{\|Ru\|_p}{\|u\|_p} = \max_{\|u\|_p \leq 1} \|Ru\|_p = \max_{\|u\|_p = 1} \|Ru\|_p, \qquad (2.9)$$
and that the matrix $p$-norm satisfies the usual properties of a norm, i.e.,
$$\|R\|_p \geq 0 \qquad (2.10)$$
$$\|cR\|_p = |c| \, \|R\|_p \qquad (2.11)$$
$$\|R_1 + R_2\|_p \leq \|R_1\|_p + \|R_2\|_p \qquad (2.12)$$
as well as the so-called submultiplicative property, i.e., that for any $R_1 \in \mathbb{R}^{m \times q}$ and $R_2 \in \mathbb{R}^{q \times n}$
$$\|R_1 R_2\|_p \leq \|R_1\|_p \, \|R_2\|_p. \qquad (2.13)$$
The proof of this proposition is left as an exercise.

As an orthogonal transformation $Q$ leaves the Euclidean norm of a vector invariant, we have from the definition that $\|Q\|_2 = 1$. It appears that orthogonal transformations also leave the Euclidean norm of a matrix invariant. More precisely, we have the following proposition.

Proposition 2.5.2 Given an $m \times n$ matrix $R$ and orthogonal matrices $Q$ and $Z$ of appropriate size, we have that
$$\|QRZ\|_2 = \|R\|_2. \qquad (2.14)$$

This implies, in case of a symmetric $n \times n$ matrix $A$ that can be diagonalized as $A = Q^T D Q$, that $\|A\|_2 = \|D\|_2 = \max_i |\lambda_i(A)|$. If $A$ is SPD, then $\max_i |\lambda_i(A)| = \lambda_{\max}(A)$. We have thus proven the following proposition.

Proposition 2.5.3 If the $n \times n$ matrix $A$ is SPD, then
$$\|A\|_2 = \lambda_{\max}(A). \qquad (2.15)$$

Example of a non-$p$-norm. The Frobenius norm of a matrix $R$ is defined as
$$\|R\|_F = \sqrt{\sum_{i=1}^m \sum_{j=1}^n |r_{ij}|^2}. \qquad (2.16)$$
It can be viewed as the Euclidean norm of the vector obtained by stacking all rows (or columns) of $R$. It is much easier to compute than the matrix $p$-norm and therefore very useful in numerical linear algebra.

Computing the 1-, 2- and $\infty$-norm of a matrix. In the cases $p = 1$, $p = 2$ and $p = \infty$, the following expressions allow one to compute the matrix $p$-norm in practice:
$$\|R\|_1 = \max_{1 \leq j \leq n} \sum_{i=1}^m |r_{ij}| \quad \text{(maximum absolute column sum)} \qquad (2.17)$$
$$\|R\|_2 = \sqrt{\max_{1 \leq k \leq n} \lambda_k(R^T R)} = \sqrt{\lambda_{\max}(R^T R)} \qquad (2.18)$$
$$\|R\|_\infty = \max_{1 \leq i \leq m} \sum_{j=1}^n |r_{ij}| \quad \text{(maximum absolute row sum)}. \qquad (2.19)$$
The proof of these expressions is left as an exercise.
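The following NumPy sketch (an added illustration, not part of the original notes) checks expressions (2.17)-(2.19) against the norms computed by numpy.linalg.norm for a small rectangular matrix:

    import numpy as np

    R = np.array([[ 1., -2.,  3.],
                  [-4.,  5., -6.]])

    # (2.17): maximum absolute column sum
    norm1 = np.max(np.sum(np.abs(R), axis=0))
    # (2.18): square root of the largest eigenvalue of R^T R
    norm2 = np.sqrt(np.max(np.linalg.eigvalsh(R.T @ R)))
    # (2.19): maximum absolute row sum
    norminf = np.max(np.sum(np.abs(R), axis=1))

    print(np.isclose(norm1, np.linalg.norm(R, 1)))        # True
    print(np.isclose(norm2, np.linalg.norm(R, 2)))        # True
    print(np.isclose(norminf, np.linalg.norm(R, np.inf))) # True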
Expression (2.18) in particular implies that, in case of an $n \times n$ matrix $A$ that is SPD, $\|A\|_2 = \lambda_{\max}(A)$.

Example 2.5.4 For
$$A = \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}$$
we have that $\|A\|_1 = \|A\|_2 = \|A\|_\infty = 3$.

Let $e$ be the vector of appropriate size having only 1 as entries, that is $e = (1, 1, \ldots, 1)^T$. Then the condition $|R| e \leq C e$ for some constant $C$ states that all absolute row sums of $R$ are bounded by $C$. In this case the maximum absolute row sum is bounded by $C$ as well, i.e., $\|R\|_\infty \leq C$; that is, we have proven the following proposition.

Proposition 2.5.4
$$|R| e \leq C e \Rightarrow \|R\|_\infty \leq C. \qquad (2.20)$$

2.6 Condition Number

Given the linear system $Au = f$, a small perturbation in the right-hand side vector $f \to f + \Delta f$ will cause a (not necessarily small) perturbation in the solution $u \to u + \Delta u$. We will see later that the spectral condition number of the system matrix $A$, denoted as $\kappa_p(A)$, allows us to bound the magnitude of the perturbation $\Delta u$ in terms of the magnitude of the perturbation $\Delta f$.

Definition 2.6.1 The spectral condition number measured in $p$-norm $\kappa_p(A)$ of an $n \times n$ matrix $A$ is defined as
$$\kappa_p(A) = \|A\|_p \, \|A^{-1}\|_p. \qquad (2.21)$$

Observe that for any $p$, $\kappa_p(A) \in [1, \infty)$ (why?). Using the relation (2.18), the condition number in the 2-norm can be expressed as
$$\kappa_2(A) = \sqrt{\frac{\lambda_{\max}(A^T A)}{\lambda_{\min}(A^T A)}}. \qquad (2.22)$$
In case $A$ is symmetric, the above expression reduces to
$$\kappa_2(A) = \frac{|\lambda_{\max}(A)|}{|\lambda_{\min}(A)|}, \qquad (2.23)$$
and in case $A$ is SPD to
$$\kappa_2(A) = \frac{\lambda_{\max}(A)}{\lambda_{\min}(A)}. \qquad (2.24)$$

2.7 Spectral Radius

The spectral radius will be used in the study of the convergence of stationary iterative methods.

Definition 2.7.1 The spectral radius $\rho(A)$ of a matrix $A \in \mathbb{R}^{n \times n}$ is defined as
$$\rho(A) = \max_{i=1,\ldots,n} \{ |\lambda_i| : \lambda_i \in \sigma(A) \}. \qquad (2.25)$$

Note that in general $\rho(A) \notin \sigma(A)$. The computation of the spectral radius is typically not straightforward. The following theorem gives an upper bound on the spectral radius that, using e.g. (2.17), (2.18) or (2.19), is easier to compute.

Theorem 2.7.1 Given any submultiplicative matrix norm $\|\cdot\|$, we have
$$\rho(A) \leq \|A\|. \qquad (2.26)$$

Proof. Assume $(\lambda, u)$ is any eigenvalue-eigenvector pair of $A$. Then $Au = \lambda u$, and thus by virtue of the submultiplicative property
$$|\lambda| \, \|u\| = \|\lambda u\| = \|Au\| \leq \|A\| \, \|u\| \ \Rightarrow\ |\lambda| \leq \|A\|.$$
The result then follows from the fact that $\lambda$ was chosen arbitrarily. $\square$

Power of a Matrix. The following important theorem links the spectral radius with the $k$-th power of a matrix $A$, denoted by $A^k$ for $k \geq 1$.

Theorem 2.7.2
$$\rho(A) < 1 \ \Leftrightarrow\ \lim_{k \to \infty} A^k = 0_{n \times n}. \qquad (2.27)$$

Proof. ($\Rightarrow$) For the proof we assume $A$ to be diagonalizable, i.e., we assume that a matrix $P$ and a diagonal matrix $D$ exist such that $A = P D P^{-1}$. The columns of $P$ span the eigenspaces of $A$ and the diagonal entries of $D$ are the eigenvalues of $A$. (In case $A$ is not diagonalizable, an argument similar to the one that follows can be made using the Jordan decomposition of $A$.) This implies that $A^k = P D^k P^{-1}$, which in turn implies that if $\rho(A) < 1$ we have that $\lim_{k \to \infty} A^k = 0_{n \times n}$.
($\Leftarrow$) Assume $u^n$ to be a unit eigenvector corresponding to the eigenvalue $\lambda_n$ with maximum modulus, i.e., $A u^n = \lambda_n u^n$, or
$$A^k u^n = \lambda_n^k u^n, \qquad (2.28)$$
which implies, by taking the 2-norm on both sides,
$$|\lambda_n|^k = \|A^k u^n\|_2 \to 0. \qquad (2.29)$$
This shows that $\rho(A) < 1$. $\square$

The following result allows us to quantify the speed of convergence of the limiting process in Theorem 2.7.2.

Theorem 2.7.3 (Gelfand's formula) For any matrix norm $\|\cdot\|$, we have that
$$\lim_{k \to \infty} \sqrt[k]{\|A^k\|} = \rho(A). \qquad (2.30)$$

This shows that the spectral radius of $A$ gives the asymptotic growth rate of the norm of $A^k$. The proof of this result can be found in the literature.

Proposition 2.5.3 can be generalized to a linear combination of powers of $A$.
Assume therefore $p_k(x) = \sum_{j=0}^k c_j x^j$ to be a $k$-th degree polynomial and $A$ to be a symmetric (not necessarily positive definite) matrix with $A = Q^T D Q$ its orthogonal spectral decomposition. Then the orthogonality of $Q$ implies that $p_k(A) = Q^T p_k(D) Q$ and therefore $\|p_k(A)\|_2 = \|p_k(D)\|_2 = \max_{1 \leq i \leq n} |p_k(\lambda_i)|$. We have thus proven the following.

Proposition 2.7.1 If the $n \times n$ matrix $A$ is symmetric, then
$$\|p_k(A)\|_2 = \max_{1 \leq i \leq n} |p_k(\lambda_i)|. \qquad (2.31)$$

Next we consider power series of matrices.

Power Series of a Matrix. For scalar arguments $x$ we have that the power series
$$\sum_{k=0}^\infty x^k = \frac{1}{1-x} \quad \text{converges if } |x| < 1. \qquad (2.32)$$
The next theorem generalizes this result to $n \times n$ matrices.

Theorem 2.7.4
$$\rho(A) < 1 \ \Leftrightarrow\ (I - A) \text{ is non-singular and } \sum_{k=0}^\infty A^k = (I - A)^{-1}. \qquad (2.33)$$

Proof. ($\Rightarrow$) If $\rho(A) < 1$, then $I - A$ has eigenvalues bounded away from zero and is therefore non-singular. We furthermore have the equality
$$(I - A^{k+1}) = (I - A)(I + A + A^2 + \ldots + A^k), \qquad (2.34)$$
or equivalently
$$(I - A)^{-1} (I - A^{k+1}) = I + A + A^2 + \ldots + A^k. \qquad (2.35)$$
Taking the limit as $k \to \infty$ and taking Theorem 2.7.2 into account then yields the desired result.
($\Leftarrow$) A necessary condition for the power series $\sum_{k=0}^\infty A^k$ to converge is that $\lim_{k \to \infty} \|A^k\|_p = 0$. This implies that $\lim_{k \to \infty} A^k = 0$, and again by Theorem 2.7.2 the desired result follows. $\square$

2.8 Irreducibility and Diagonal Dominance

In case the diagonal entries of $A$ are large compared to the off-diagonal entries, useful estimates for the location of the eigenvalues of $A$ can be obtained from Gershgorin's theorem. To formalize this idea we first introduce the concept of irreducibility and denote by $P$ an $n \times n$ permutation matrix such that $PA$ ($AP$) corresponds to a permutation of the rows (columns) of $A$. The matrix $A$ is called reducible if a permutation matrix $P$ exists such that $P A P^T$ is block upper triangular. If in particular $A$ is reducible and symmetric, then $P A P^T$ is block diagonal, implying that $A$ after permutation is made up of non-interconnected blocks. This motivates the following definition.

Definition 2.8.1 The matrix $A$ is called irreducible iff no permutation matrix $P$ exists such that $P A P^T$ is block upper triangular.

Next we define the concept of diagonal dominance.

Definition 2.8.2 The matrix $A$ is called row diagonally dominant iff
$$|a_{ii}| \geq \sum_{j=1, j \neq i}^n |a_{ij}|, \quad i = 1, \ldots, n, \qquad (2.36)$$
with strict inequality for at least one $i$.

In the following definition stronger requirements are imposed.

Definition 2.8.3 The matrix $A$ is called row strictly diagonally dominant iff
$$|a_{ii}| > \sum_{j=1, j \neq i}^n |a_{ij}|, \quad i = 1, \ldots, n. \qquad (2.37)$$

The concepts of column diagonal dominance and column strict diagonal dominance can be defined in a similar fashion. Note that in the above definitions the sign of the off-diagonal entries is irrelevant. The concept of irreducibility is linked to that of diagonal dominance in the following definition.

Definition 2.8.4 The matrix $A$ is called irreducibly diagonally dominant iff $A$ is irreducible and diagonally dominant.

Diagonal dominance of $A$ and its spectrum can be related through the following theorem, which we state here for matrices having possibly complex eigenvalues.

Theorem 2.8.1 (Gershgorin) If $\lambda \in \sigma(A)$, then $\lambda$ is located in one of the $n$ closed disks in the complex plane that are centered at $a_{ii}$ and have radius
$$\rho_i = \sum_{j=1, j \neq i}^n |a_{ij}|, \qquad (2.38)$$
i.e.,
$$\lambda \in \sigma(A) \ \Rightarrow\ \exists i \ \text{such that} \ |a_{ii} - \lambda| \leq \sum_{j=1, j \neq i}^n |a_{ij}|. \qquad (2.39)$$

This theorem is a powerful tool, as it allows one to certify that symmetric irreducibly diagonally dominant matrices are SPD without computing their eigenvalues explicitly. Details can be found in [52].
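As a small illustration of Theorem 2.8.1 (an added sketch, not part of the original notes), the following NumPy snippet computes the Gershgorin disk centers and radii of a matrix and checks that every eigenvalue lies in at least one closed disk:

    import numpy as np

    A = np.array([[ 4., -1.,  0.],
                  [-1.,  4., -1.],
                  [ 0., -1.,  4.]])

    centers = np.diag(A)                                  # disk centers a_ii
    radii = np.sum(np.abs(A), axis=1) - np.abs(centers)   # off-diagonal absolute row sums

    for lam in np.linalg.eigvals(A):
        # each eigenvalue must satisfy |a_ii - lambda| <= rho_i for at least one i
        in_some_disk = np.any(np.abs(centers - lam) <= radii + 1e-12)
        print(lam, in_some_disk)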
2.9 Positive Inverse

In the following definition the positivity of the inverse is the key feature.

Definition 2.9.1 The matrix $A$ is an M-matrix iff it satisfies the following four properties:
1. $a_{ii} > 0$ for $i = 1, \ldots, n$;
2. $a_{ij} \leq 0$ for $i \neq j$, $i, j = 1, \ldots, n$;
3. $A$ is non-singular;
4. $A^{-1} \geq 0$.

In the context of the discretization of partial differential equations, the M-matrix property guarantees useful properties such as the absence of wiggles in convection-dominated flows and the positivity of concentrations in chemical engineering applications. In the context of iterative solution methods, the M-matrix property guarantees the convergence of the basic schemes. While the M-matrix property is therefore highly desirable, verifying whether or not a matrix satisfies the fourth of the above conditions is hard (if not impossible) in practice. Therefore the following result has a large practical value.

Theorem 2.9.1 If the matrix $A$ satisfies the following three properties:
1. $a_{ii} > 0$ for $i = 1, \ldots, n$;
2. $a_{ij} \leq 0$ for $i \neq j$, $i, j = 1, \ldots, n$;
3. $A$ is irreducibly diagonally dominant;
then $A$ is an M-matrix.

Note the absence of any condition on $A^{-1}$ in the above conditions.

2.10 Perron-Frobenius Theorem

Convergence results for stationary iterative methods critically hinge on the theorem of Perron-Frobenius (named after Oskar Perron (1880-1975) and Ferdinand Georg Frobenius (1849-1917)). Carl Meyer is quoted as saying:

"In addition to saying something useful, the Perron-Frobenius theory is elegant. It is a testament to the fact that beautiful mathematics eventually tends to be useful, and useful mathematics eventually tends to be beautiful."

Theorem 2.10.1 (Perron-Frobenius) Let $A$ be a real $n \times n$ nonnegative irreducible matrix. Then $\lambda = \rho(A)$, the spectral radius of $A$, is a simple eigenvalue of $A$. Moreover, there exists an eigenvector $u$ with positive elements associated with this eigenvalue.

We use this theorem to prove the following.

Theorem 2.10.2 Let $A$ be a real $n \times n$ nonnegative irreducible matrix. Then
$$\rho(A) < 1 \ \Leftrightarrow\ (I - A) \text{ is non-singular and } (I - A)^{-1} \geq 0. \qquad (2.40)$$

Proof. ($\Rightarrow$) From Theorem 2.7.4 it follows that if $\rho(A) < 1$ then $I - A$ is non-singular and $\sum_{k=0}^\infty A^k = (I - A)^{-1}$. The fact that the product and sum of positive matrices is again positive then implies that $(I - A)^{-1}$ is non-negative.
($\Leftarrow$) By Theorem 2.10.1, a nonnegative vector $u$ exists such that $Au = \rho(A) u$, or
$$(I - A)u = (1 - \rho(A)) u \ \Leftrightarrow\ \frac{1}{1 - \rho(A)} u = (I - A)^{-1} u. \qquad (2.41)$$
As both $(I - A)^{-1}$ and $u$ are non-negative and $(I - A)^{-1}$ is non-singular, we have that $1 - \rho(A) > 0$. $\square$

2.11 A Collection of Matrices

In this section we introduce a collection of matrices that we will use as examples in the remainder of these notes.

Example 2.11.1 The Hilbert matrix $H \in \mathbb{R}^{n \times n}$ is a square matrix with unit fractions as entries,
$$h_{ij} = \frac{1}{i + j - 1}. \qquad (2.42)$$
Clearly $H$ is symmetric. It can be shown that $H$ is positive definite.

Example 2.11.2 The Pei matrix $P \in \mathbb{R}^{n \times n}$ is the square matrix that, for $\alpha \in \mathbb{R}$, can be defined through its entries as
$$p_{ij} = \begin{cases} \alpha & \text{if } i = j \\ 1 & \text{if } i \neq j. \end{cases} \qquad (2.43)$$

Example 2.11.3 The Vandermonde matrix $M \in \mathbb{R}^{n \times n}$ is, for a given vector $v \in \mathbb{R}^n$, defined through its entries as
$$m_{ij} = v_i^{\,n-j}. \qquad (2.44)$$
If $v$ holds the $n$ complex roots of unity, then $M$ represents the discrete Fourier transform.

Example 2.11.4 The graph Laplacian and weighted graph Laplacian are constructed on a graph. We therefore consider an oriented graph $\mathcal{G} = \mathcal{G}(V, E)$ with $n$ vertices $v \in V$ and $m$ edges $e \in E$. We consider the $n \times m$ integer matrix $G$ to be the node-edge incidence matrix.
Each column of $G$ corresponds to an edge of $\mathcal{G}$ and has an entry equal to $+1$ in the row corresponding to the node in which the edge starts and an entry $-1$ in the row corresponding to the node in which the edge ends. All the other entries in the column are zero. The graph Laplacian is the $n \times n$ matrix
$$L = G G^T. \qquad (2.45)$$
Given an $m \times m$ diagonal matrix $W$ that attributes a strictly positive weight $w_j$ to each edge $e_j$ (a maximum flow capacity, for instance), the weighted graph Laplacian is the $n \times n$ matrix
$$L_W = G W G^T. \qquad (2.46)$$
As the column sums of $G$ equal zero, $G^T$ times the vector having all components equal to 1 is zero. The constant vector thus lies in the null space of $L$ and $L_W$. This implies that $L$ and $L_W$ are singular. We will therefore sometimes consider the submatrix obtained from $L$ (or $L_W$) by deleting its first row and column. This submatrix is referred to as the first principal submatrix of $L$ (or $L_W$) and is non-singular.

As a special case we consider the weighted graph Laplacian for the simple graph with 5 nodes connected by 4 unevenly weighted edges, i.e., the graph
$$V = \{1, 2, 3, 4, 5\}, \quad E = \{\{1,2\}, \{2,3\}, \{3,4\}, \{4,5\}\}, \quad W = \mathrm{diag}(1, 1, 10^4, 10^4).$$

2.12 Exercises

Exercise 2.12.1 Calculate the Euclidean inner products of the following pairs of vectors:
a) $(1, -7, 12, 2)^T$, $(3, 2, 0, 5)^T$
b) $(\sin x, \cos x)^T$, $(\cos x, \sin x)^T$.

Exercise 2.12.2 Calculate the Euclidean norm of the following vectors:
a) $(1, 5, 2)^T$
b) $(\sin x, \cos x)^T$.

Exercise 2.12.3 Calculate the uniform (infinity) norm of the vectors in Exercises 2.12.1 and 2.12.2. In 2.12.2 b), determine the points $x$, $0 \leq x \leq 2\pi$, for which the vector $(\sin x, \cos x)^T$ has the largest infinity norm.

Exercise 2.12.4 Two vectors $u$ and $v$ are said to be orthogonal if $\langle u, v \rangle = 0$. In the following, take $\langle \cdot,\cdot \rangle$ to be the Euclidean inner product.
a) Find a vector which is orthogonal to $(a, b)^T$.
b) Use a) to find a vector which is orthogonal to $(\sin x, \cos x)^T$. Sketch the two vectors for $x = \pi/4$.

Exercise 2.12.5 Show that $\sup_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p} = \max_{\|x\|_p = 1} \|Ax\|_p$ for $p \geq 1$.

Exercise 2.12.6 Show that $\|A\|_1 = \max_{1 \leq j \leq n} \sum_{i=1}^m |a_{ij}|$ for $A \in \mathbb{R}^{m \times n}$ (the maximum absolute column sum).

Exercise 2.12.7 Show that $\|A\|_2 = \sqrt{\lambda_{\max}(A^T A)}$.

Exercise 2.12.8 Let the matrix $A$ be given by
$$A = \begin{pmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{pmatrix}.$$
The quantity $\|v\|_E = \langle Av, v \rangle^{1/2} = \left( (Av)^T v \right)^{1/2}$ is called the energy norm of $v$ with respect to $A$.
a) Calculate the energy norm (with respect to $A$) of the vector $v = (v_1, v_2, v_3, v_4)^T$.
b) Show that the energy norm calculated in a) is positive, unless $v$ is the zero vector.

Exercise 2.12.9 The matrix norm induced by a vector norm $\|\cdot\|$ is defined by
$$\|A\| = \sup_{v \neq 0} \frac{\|Av\|}{\|v\|} = \sup_{\|v\| = 1} \|Av\|.$$
The spectral norm $\|A\|_S$ of a symmetric positive definite matrix $A$ is given by
$$\|A\|_S = \lambda_{\max}(A).$$
One can show that the matrix norm induced by the Euclidean norm is the spectral norm (see Proposition 2.5.2). This exercise shows that the above statement is true for the matrix $A$ given by
$$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}.$$
a) Show that $A$ is symmetric positive definite.
b) Compute the spectral norm of $A$ by determining its eigenvalues.
c) Determine the norm of $A$ induced by the Euclidean norm by using the method of Lagrange multipliers to calculate $\sup_{\|v\| = 1} \|Av\|$. Compare your answer with the largest eigenvalue from b).

Exercise 2.12.10 What is the condition number of the identity matrix $I$ in any $p$-norm?

Exercise 2.12.11 Show the following properties of symmetric positive definite matrices:
1. A principal submatrix of a matrix $A \in \mathbb{R}^{n \times n}$ is obtained from $A$ by deleting rows and columns corresponding to the same index set.
A principal minor is the determinant of a principal submatrix. A leading principal submatrix of $A$ consists of the entries of $A$ taken from the first $k$ rows and columns of $A$, where $1 \leq k \leq n$. A leading principal minor is the determinant of a leading principal submatrix. Show that a symmetric matrix $A$ is positive definite iff all its leading principal minors are positive.
2. If $A = (a_{ij})$ is symmetric positive definite, then $a_{ii} > 0$ for all $i$.
3. If $A = (a_{ij})$ is symmetric positive definite, then the largest element (in magnitude) of the whole matrix must lie on the diagonal.
4. The sum of two symmetric positive definite matrices is symmetric positive definite.

Exercise 2.12.12 Show the following properties of the condition number of a matrix:
1. $\kappa_p(A) \geq 1$ for any $p$-norm;
2. $\kappa(\alpha A) = \kappa(A)$, where $\alpha$ is any non-zero scalar, for any given norm;
3. $\kappa_p(A) = 1$ iff $A$ is a non-zero scalar multiple of an orthogonal matrix, i.e., $A^T A = \alpha I$, where $\alpha \neq 0$ (note that this property of an orthogonal matrix $A$ makes the matrix very attractive for use in numerical computations);
4. $\kappa_2(A^T A) = \kappa_2(A)^2$;
5. $\kappa_2(A) = \kappa_2(A^T)$ and $\kappa(A) = \kappa(A^{-1})$ for any induced norm;
6. for any given norm, $\kappa(AB) \leq \kappa(A)\,\kappa(B)$ if the matrices $A$ and $B$ are compatible for matrix multiplication.

Exercise 2.12.13 In this exercise we compare the determinant and the condition number of a matrix $A$ as measures for the near-singularity of $A$.
1. Consider the matrix $A = (a_{ij}) \in \mathbb{R}^{n \times n}$ with $a_{ii} = 1$ and $a_{ij} = -1$ if $j > i$. Perform the following numerical experiment to verify that $A$ is nearly singular for large $n$. Consider the linear system $Au = f$ to be such that the vector $u$ with all components equal to 1 is its solution. This can easily be accomplished by, for a given value of $n$, setting the right-hand side vector $f$ equal to $f = Au$ (by a matrix-vector multiplication). Given the right-hand side vector $f$ and the system matrix $A$, solve the linear system for $u$ for various values of $n$ (using the backslash direct solver u = A\f in Matlab, for instance). Let the computed solution be denoted by $\hat{u}$. List for various values of $n$ the relative error $\|u - \hat{u}\|_2 / \|u\|_2$ and the relative residual norm $\|f - A\hat{u}\|_2 / \|f\|_2$ of the computed solution. You may use Table 2.1 to this end.
2. Consider again the matrix $A = (a_{ij})$ with $a_{ii} = 1$ and $a_{ij} = -1$ if $j > i$. Show that $A$ has determinant equal to 1. Show that $A$ has condition number $\kappa_\infty(A) = n\, 2^{n-1}$ by computing $\|A\|_\infty$, $A^{-1}$ (using Gaussian elimination, for instance) and $\|A^{-1}\|_\infty$. Does the determinant explain the effect observed in part (1) of the exercise? Does the condition number explain the observed effect in part (1) of the exercise?
3. Repeat part (1) and part (2) of the exercise using as matrix the diagonal matrix of order $n$ of the form $A = \mathrm{diag}(0.1, 0.1, \ldots, 0.1) \in \mathbb{R}^{n \times n}$. You may use the 2-norm to measure the conditioning of the matrix.

Table 2.1: Numerical study of matrix conditioning.

  problem size n | relative error $\|u - \hat{u}\|_2 / \|u\|_2$ | relative residual $\|f - A\hat{u}\|_2 / \|f\|_2$
  10             |                                              |
  20             |                                              |
  30             |                                              |
  40             |                                              |
  50             |                                              |

Exercise 2.12.14 Demonstrate Theorem 2.2.1.

Exercise 2.12.15 Draw the unit sphere for the 1-, 2- and $\infty$-norm in $\mathbb{R}^2$.

Exercise 2.12.16 Show that the 1-, 2- and $\infty$-matrix norms are equivalent, i.e., show that for any $\alpha, \beta \in \{1, 2, \infty\}$ and for any $A \in \mathbb{R}^{n \times n}$ numbers $r$ and $s$ exist such that
$$r \|A\|_\alpha \leq \|A\|_\beta \leq s \|A\|_\alpha.$$

Exercise 2.12.17 Show that for any $A \in \mathbb{R}^{n \times n}$ it holds that
$$\|A\|_2 \leq \sqrt{\|A\|_1 \, \|A\|_\infty}.$$

Exercise 2.12.18 Prove Gelfand's formula and verify it experimentally.

Exercise 2.12.19 Prove Gershgorin's theorem and verify it experimentally.

Exercise 2.12.20 Prove the Perron-Frobenius theorem stated in Theorem 2.10.1.
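For readers who prefer Python over Matlab, the following NumPy sketch (an added illustration, not part of the original notes) sets up the numerical experiment described in part (1) of Exercise 2.12.13; the matrix, right-hand side and error measures follow the exercise text, while filling in Table 2.1 and interpreting the results is left to the reader:

    import numpy as np

    for n in (10, 20, 30, 40, 50):
        # upper triangular matrix with a_ii = 1 and a_ij = -1 for j > i
        A = np.eye(n) - np.triu(np.ones((n, n)), k=1)
        u_exact = np.ones(n)
        f = A @ u_exact                   # right-hand side such that u = (1,...,1)^T solves Au = f

        u_hat = np.linalg.solve(A, f)     # direct solve (plays the role of Matlab's backslash)

        rel_error = np.linalg.norm(u_exact - u_hat) / np.linalg.norm(u_exact)
        rel_residual = np.linalg.norm(f - A @ u_hat) / np.linalg.norm(f)
        print(f"n = {n:3d}   relative error = {rel_error:.2e}   relative residual = {rel_residual:.2e}")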
