



Lecture notes on Linear Algebra

ShawnPacinocal,United States,Researcher
Published Date:09-07-2017
David Lerner
Department of Mathematics
University of Kansas

These are notes of a course given in Fall, 2007 and 2008 to the Honors sections of our elementary linear algebra course. Their comments and corrections have greatly improved the exposition.

© 2007, 2008 D. E. Lerner

Chapter 1 Matrices and matrix algebra

1.1 Examples of matrices

Definition: A matrix is a rectangular array of numbers and/or variables. For instance

$$A = \begin{pmatrix} 4 & -2 & 0 & -3 & 1 \\ 5 & 1.2 & -0.7 & x & 3 \\ \pi & -3 & 4 & 6 & 27 \end{pmatrix}$$

is a matrix with 3 rows and 5 columns (a 3×5 matrix). The 15 entries of the matrix are referenced by the row and column in which they sit: the (2,3) entry of $A$ is $-0.7$. We may also write $a_{23} = -0.7$, $a_{24} = x$, etc. We indicate the fact that $A$ is 3×5 (this is read as "three by five") by writing $A_{3\times 5}$. Matrices can be enclosed in square brackets as well as large parentheses. That is, both

$$\begin{bmatrix} 2 & 4 \\ 1 & -6 \end{bmatrix} \quad\text{and}\quad \begin{pmatrix} 2 & 4 \\ 1 & -6 \end{pmatrix}$$

are perfectly good ways to write this 2×2 matrix.

Real numbers are 1×1 matrices. A vector such as

$$\mathbf{v} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}$$

is a 3×1 matrix. We will generally use upper case Latin letters as symbols for general matrices, boldface lower case letters for the special case of vectors, and ordinary lower case letters for real numbers.

Definition: Real numbers, when used in matrix computations, are called scalars.

Matrices are ubiquitous in mathematics and the sciences. Some instances include:

• Systems of linear algebraic equations (the main subject matter of this course) are normally written as simple matrix equations of the form $A\mathbf{x} = \mathbf{y}$.
• The derivative of a function $f : \mathbb{R}^3 \to \mathbb{R}^2$ is a 2×3 matrix.
• First order systems of linear differential equations are written in matrix form.
• The symmetry groups of mathematics and physics, some of which we'll look at later, are groups of matrices.
• Quantum mechanics can be formulated using infinite-dimensional matrices.
1.2 Operations with matrices

Matrices of the same size can be added or subtracted by adding or subtracting the corresponding entries:

$$\begin{pmatrix} 2 & 1 \\ -3 & 4 \\ 7 & 0 \end{pmatrix} + \begin{pmatrix} 6 & -1.2 \\ \pi & x \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 8 & -0.2 \\ \pi - 3 & 4 + x \\ 8 & -1 \end{pmatrix}.$$

Definition: If the matrices $A$ and $B$ have the same size, then their sum is the matrix $A + B$ defined by

$$(A + B)_{ij} = a_{ij} + b_{ij}.$$

Their difference is the matrix $A - B$ defined by

$$(A - B)_{ij} = a_{ij} - b_{ij}.$$

Definition: A matrix $A$ can be multiplied by a scalar $c$ to obtain the matrix $cA$, where

$$(cA)_{ij} = c\,a_{ij}.$$

This is called scalar multiplication. We just multiply each entry of $A$ by $c$. For example

$$-3\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} -3 & -6 \\ -9 & -12 \end{pmatrix}.$$

Definition: The m×n matrix whose entries are all 0 is denoted $0_{m\times n}$ (or, more often, just by 0 if the dimensions are obvious from context). It's called the zero matrix.

Definition: Two matrices $A$ and $B$ of the same size are equal if all their corresponding entries are equal:

$$A = B \iff a_{ij} = b_{ij} \text{ for all } i, j.$$

Definition: If the number of columns of $A$ equals the number of rows of $B$, then the product $AB$ is defined by

$$(AB)_{ij} = \sum_{s=1}^{k} a_{is} b_{sj}.$$

Here $k$ is the number of columns of $A$ or rows of $B$. If the summation sign is confusing, this could also be written as

$$(AB)_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{ik}b_{kj}.$$

Example:

$$\begin{pmatrix} 1 & 2 & 3 \\ -1 & 0 & 4 \end{pmatrix}\begin{pmatrix} -1 & 0 \\ 4 & 2 \\ 1 & 3 \end{pmatrix} = \begin{pmatrix} 1\cdot(-1) + 2\cdot 4 + 3\cdot 1 & 1\cdot 0 + 2\cdot 2 + 3\cdot 3 \\ -1\cdot(-1) + 0\cdot 4 + 4\cdot 1 & -1\cdot 0 + 0\cdot 2 + 4\cdot 3 \end{pmatrix} = \begin{pmatrix} 10 & 13 \\ 5 & 12 \end{pmatrix}$$

If $AB$ is defined, then the number of rows of $AB$ is the same as the number of rows of $A$, and the number of columns is the same as the number of columns of $B$:

$$A_{m\times n} B_{n\times p} = (AB)_{m\times p}.$$

Why define multiplication like this? The answer is that this is the definition that corresponds to what shows up in practice.

Example: Recall from calculus (Exercise) that if a point $(x, y)$ in the plane is rotated counterclockwise about the origin through an angle $\theta$ to obtain a new point $(x', y')$, then

$$x' = x\cos\theta - y\sin\theta$$
$$y' = x\sin\theta + y\cos\theta.$$

In matrix notation, this can be written

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}.$$

If the new point $(x', y')$ is now rotated through an additional angle $\varphi$ to get $(x'', y'')$, then

$$\begin{aligned}
\begin{pmatrix} x'' \\ y'' \end{pmatrix}
&= \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix}\begin{pmatrix} x' \\ y' \end{pmatrix} \\
&= \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix}\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} \\
&= \begin{pmatrix} \cos\theta\cos\varphi - \sin\theta\sin\varphi & -(\cos\theta\sin\varphi + \sin\theta\cos\varphi) \\ \cos\theta\sin\varphi + \sin\theta\cos\varphi & \cos\theta\cos\varphi - \sin\theta\sin\varphi \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} \\
&= \begin{pmatrix} \cos(\theta+\varphi) & -\sin(\theta+\varphi) \\ \sin(\theta+\varphi) & \cos(\theta+\varphi) \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}
\end{aligned}$$

This is obviously correct, since it shows that the point has been rotated through the total angle of $\theta + \varphi$. So the right answer is given by matrix multiplication as we've defined it, and not some other way.

Matrix multiplication is not commutative: in English, $AB \neq BA$, for arbitrary matrices $A$ and $B$. For instance, if $A$ is 3×5 and $B$ is 5×2, then $AB$ is 3×2, but $BA$ is not defined. Even if both matrices are square and of the same size, so that both $AB$ and $BA$ are defined and have the same size, the two products are not generally equal.

♣ Exercise: Write down two 2×2 matrices and compute both products. Unless you've been very selective, the two products won't be equal.

Another example: If

$$A = \begin{pmatrix} 2 \\ 3 \end{pmatrix}, \quad\text{and}\quad B = \begin{pmatrix} 1 & 2 \end{pmatrix},$$

then

$$AB = \begin{pmatrix} 2 & 4 \\ 3 & 6 \end{pmatrix}, \quad\text{while}\quad BA = (8).$$

Two fundamental properties of matrix multiplication:

1. If $AB$ and $AC$ are defined, then $A(B + C) = AB + AC$.
2. If $AB$ is defined, and $c$ is a scalar, then $A(cB) = c(AB)$.

♣ Exercise: Prove the two properties listed above. (Both these properties can be proven by showing that, in each equation, the $(i,j)$ entry on the right hand side of the equation is equal to the $(i,j)$ entry on the left.)

Definition: The transpose of the matrix $A$, denoted $A^t$, is obtained from $A$ by making the first row of $A$ into the first column of $A^t$, the second row of $A$ into the second column of $A^t$, and so on. Formally,

$$a^t_{ji} = a_{ij}.$$

So

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}^t = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}.$$

Here's one consequence of the non-commutativity of matrix multiplication: If $AB$ is defined, then $(AB)^t = B^t A^t$ (and not $A^t B^t$ as you might expect).

Example: If

$$A = \begin{pmatrix} 2 & 1 \\ 3 & 0 \end{pmatrix}, \quad\text{and}\quad B = \begin{pmatrix} -1 & 2 \\ 4 & 3 \end{pmatrix},$$

then

$$AB = \begin{pmatrix} 2 & 7 \\ -3 & 6 \end{pmatrix}, \quad\text{so}\quad (AB)^t = \begin{pmatrix} 2 & -3 \\ 7 & 6 \end{pmatrix}.$$

And

$$B^t A^t = \begin{pmatrix} -1 & 4 \\ 2 & 3 \end{pmatrix}\begin{pmatrix} 2 & 3 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 2 & -3 \\ 7 & 6 \end{pmatrix}$$

as advertised.

♣ Exercise: Can you show that $(AB)^t = B^t A^t$? You need to write out the $(i,j)$th entry of both sides and then observe that they're equal.

Definition: $A$ is square if it has the same number of rows and columns. An important instance is the identity matrix $I_n$, which has ones on the main diagonal and zeros elsewhere:

Example:

$$I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Often, we'll just write $I$ without the subscript for an identity matrix, when the dimension is clear from the context. The identity matrices behave, in some sense, like the number 1. If $A$ is n×m, then $I_n A = A$, and $A I_m = A$.

Definition: Suppose $A$ and $B$ are square matrices of the same dimension, and suppose that $AB = I = BA$. Then $B$ is said to be the inverse of $A$, and we write this as $B = A^{-1}$. Similarly, $B^{-1} = A$. For instance, you can easily check that

$$\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$

and so these two matrices are inverses of one another:

$$\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}^{-1} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}.$$

Example: Not every square matrix has an inverse. For instance

$$A = \begin{pmatrix} 3 & 1 \\ 3 & 1 \end{pmatrix}$$

has no inverse.

♣ Exercise: Show that the matrix $A$ in the above example has no inverse. Hint: Suppose that

$$B = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

is the inverse of $A$. Then we must have $BA = I$. Write this out and show that the equations for the entries of $B$ are inconsistent.

♣ Exercise: Which 1×1 matrices are invertible, and what are their inverses?

♣ Exercise: Show that if

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \quad\text{and}\quad ad - bc \neq 0, \quad\text{then}\quad A^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$

Hint: Multiply $A$ by the given expression for $A^{-1}$ and show that it equals $I$. If $ad - bc = 0$, then the matrix is not invertible. You should probably memorize this formula.

♣ Exercise: Show that if $A$ has an inverse, then it's unique; that is, if $B$ and $C$ are both inverses of $A$, then $B = C$. (Hint: Consider the product $BAC = (BA)C = B(AC)$.)
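The definitions of Chapter 1 are easy to try out on a computer. The following is a minimal sketch, assuming matrices are stored as Python lists of rows; the helper names `matmul`, `transpose` and `inverse_2x2` are made up for this illustration and are not part of the notes. It recomputes the products from the examples above and checks the rule $(AB)^t = B^t A^t$ and the 2×2 inverse formula on the matrices already used in the text.

```python
from fractions import Fraction

def matmul(A, B):
    """Return AB using (AB)_ij = sum over s of a_is * b_sj."""
    k = len(B)  # rows of B; must equal the number of columns of A
    if any(len(row) != k for row in A):
        raise ValueError("AB is undefined: columns of A != rows of B")
    return [[sum(A[i][s] * B[s][j] for s in range(k))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    """Rows of A become the columns of the result."""
    return [list(col) for col in zip(*A)]

def inverse_2x2(A):
    """Inverse of [[a, b], [c, d]] via the ad - bc formula (exact fractions)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc = 0, so the matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[2, 1], [3, 0]]
B = [[-1, 2], [4, 3]]
AB = matmul(A, B)   # [[2, 7], [-3, 6]], as in the transpose example
BA = matmul(B, A)   # [[4, -1], [17, 4]], so AB != BA here

# (AB)^t equals B^t A^t for this pair
print(transpose(AB) == matmul(transpose(B), transpose(A)))

# The pair of mutually inverse matrices from the text
print(inverse_2x2([[2, 1], [1, 1]]))
```

Using `Fraction` keeps the arithmetic exact, so equality tests such as the transpose check are not disturbed by rounding.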
Chapter 2 Matrices and systems of linear equations

2.1 The matrix form of a linear system

You have all seen systems of linear equations such as

$$\begin{aligned} 3x + 4y &= 5 \\ 2x - y &= 0. \end{aligned} \tag{2.1}$$

This system can be solved easily: Multiply the 2nd equation by 4, and add the two resulting equations to get $11x = 5$ or $x = 5/11$. Substituting this into either equation gives $y = 10/11$. In this case, a solution exists (obviously) and is unique (there's just one solution, namely $(5/11, 10/11)$).

We can write this system as a matrix equation, in the form $A\mathbf{x} = \mathbf{y}$:

$$\begin{pmatrix} 3 & 4 \\ 2 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 5 \\ 0 \end{pmatrix}. \tag{2.2}$$

Here

$$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix}, \quad\text{and}\quad \mathbf{y} = \begin{pmatrix} 5 \\ 0 \end{pmatrix}, \quad\text{and}\quad A = \begin{pmatrix} 3 & 4 \\ 2 & -1 \end{pmatrix}$$

is called the coefficient matrix.

This formula works because if we multiply the two matrices on the left, we get the 2×1 matrix equation

$$\begin{pmatrix} 3x + 4y \\ 2x - y \end{pmatrix} = \begin{pmatrix} 5 \\ 0 \end{pmatrix}.$$

And the two matrices are equal if both their entries are equal, which holds only if both equations in (2.1) are satisfied.

2.2 Row operations on the augmented matrix

Of course, rewriting the system in matrix form does not, by itself, simplify the way in which we solve it. The simplification results from the following observation:

The variables $x$ and $y$ can be eliminated from the computation by simply writing down a matrix in which the coefficients of $x$ are in the first column, the coefficients of $y$ in the second, and the right hand side of the system is the third column:

$$\begin{pmatrix} 3 & 4 & 5 \\ 2 & -1 & 0 \end{pmatrix}. \tag{2.3}$$

We are using the columns as "place markers" instead of $x$, $y$ and the = sign. That is, the first column consists of the coefficients of $x$, the second has the coefficients of $y$, and the third has the numbers on the right hand side of (2.1).

We can do exactly the same operations on this matrix as we did on the original system¹:

$$\begin{pmatrix} 3 & 4 & 5 \\ 8 & -4 & 0 \end{pmatrix} : \text{Multiply the 2nd eqn by 4}$$

$$\begin{pmatrix} 3 & 4 & 5 \\ 11 & 0 & 5 \end{pmatrix} : \text{Add the 1st eqn to the 2nd}$$

$$\begin{pmatrix} 3 & 4 & 5 \\ 1 & 0 & \frac{5}{11} \end{pmatrix} : \text{Divide the 2nd eqn by 11}$$

The second equation now reads $1\cdot x + 0\cdot y = 5/11$, and we've solved for $x$; we can now substitute for $x$ in the first equation to solve for $y$ as above.
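The three row operations just performed on the augmented matrix (2.3) can be replayed step by step. This is only a sketch, assuming the matrix is stored as a list of rows of exact fractions; the variable name `M` is chosen here for illustration.

```python
from fractions import Fraction

# Augmented matrix (2.3) for 3x + 4y = 5, 2x - y = 0
M = [[Fraction(3), Fraction(4), Fraction(5)],
     [Fraction(2), Fraction(-1), Fraction(0)]]

M[1] = [4 * v for v in M[1]]                 # multiply the 2nd eqn by 4
M[1] = [u + v for u, v in zip(M[0], M[1])]   # add the 1st eqn to the 2nd
M[1] = [v / 11 for v in M[1]]                # divide the 2nd eqn by 11

x = M[1][2]                                  # second row reads 1*x + 0*y = 5/11
y = (M[0][2] - M[0][0] * x) / M[0][1]        # substitute into 3x + 4y = 5

print(x, y)
```

Both original equations can then be checked directly: `3*x + 4*y` should come out as 5 and `2*x - y` as 0.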
Definition: The matrix in (2.3) is called the augmented matrix of the system, and can be written in matrix shorthand as $(A|\mathbf{y})$.

Even though the solution to the system of equations is unique, it can be solved in many different ways (all of which, clearly, must give the same answer). For instance, start with the same augmented matrix

$$\begin{pmatrix} 3 & 4 & 5 \\ 2 & -1 & 0 \end{pmatrix}.$$

$$\begin{pmatrix} 1 & 5 & 5 \\ 2 & -1 & 0 \end{pmatrix} : \text{Replace eqn 1 with eqn 1 - eqn 2}$$

$$\begin{pmatrix} 1 & 5 & 5 \\ 0 & -11 & -10 \end{pmatrix} : \text{Subtract 2 times eqn 1 from eqn 2}$$

$$\begin{pmatrix} 1 & 5 & 5 \\ 0 & 1 & \frac{10}{11} \end{pmatrix} : \text{Divide eqn 2 by } -11 \text{ to get } y = 10/11$$

¹ The purpose of this lecture is to remind you of the mechanics for solving simple linear systems. We'll give precise definitions and statements of the algorithms later.

The second equation tells us that $y = 10/11$, and we can substitute this into the first equation $x + 5y = 5$ to get $x = 5/11$. We could even take this one step further:

$$\begin{pmatrix} 1 & 0 & \frac{5}{11} \\ 0 & 1 & \frac{10}{11} \end{pmatrix} : \text{We added } -5(\text{eqn 2}) \text{ to eqn 1}$$

The complete solution can now be read off from the matrix. What we've done is to eliminate $x$ from the second equation (the 0 in position (2,1)) and $y$ from the first (the 0 in position (1,2)).

♣ Exercise: What's wrong with writing the final matrix as

$$\begin{pmatrix} 1 & 0 & 0.45 \\ 0 & 1 & 0.91 \end{pmatrix}?$$

The system above consists of two linear equations in two unknowns. Each equation, by itself, is the equation of a line in the plane and so has infinitely many solutions. To solve both equations simultaneously, we need to find the points, if any, which lie on both lines. There are 3 possibilities: (a) there's just one solution (the usual case), (b) there is no solution (if the two lines are parallel and distinct), or (c) there are an infinite number of solutions (if the two lines coincide).

♣ Exercise: (Do this before continuing with the text.) What are the possibilities for 2 linear equations in 3 unknowns? That is, what geometric object does each equation represent, and what are the possibilities for solution(s)?
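For two equations in two unknowns, the three possibilities (a), (b), (c) described above can be told apart by simple arithmetic on the coefficients. The function below is a rough sketch, not something taken from the notes: the name `classify` and its argument order are invented for this illustration, and it assumes each equation really is a line (its two coefficients are not both zero). It uses the quantity $a_1 b_2 - a_2 b_1$, which is non-zero exactly when the two lines are not parallel.

```python
def classify(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2.

    Assumes (a1, b1) and (a2, b2) are each not (0, 0).
    """
    if a1 * b2 - a2 * b1 != 0:
        return "unique solution"            # the lines cross in one point
    # Parallel lines: they coincide exactly when the full equations
    # are proportional, right hand sides included.
    if a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1:
        return "infinitely many solutions"  # the two lines coincide
    return "no solution"                    # parallel and distinct

print(classify(3, 4, 5, 2, -1, 0))   # the system (2.1)
print(classify(1, 1, 1, 2, 2, 2))    # same line written twice
print(classify(1, 1, 1, 1, 1, 2))    # parallel, distinct lines
```

The same question for two equations in three unknowns is left to the exercise above; this sketch deliberately covers only the planar case.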
2.3 More variables

Let's add another variable and consider two equations in three unknowns:

$$\begin{aligned} 2x - 4y + z &= 1 \\ 4x + y - z &= 3 \end{aligned} \tag{2.4}$$

Rather than solving this directly, we'll work with the augmented matrix for the system, which is

$$\begin{pmatrix} 2 & -4 & 1 & 1 \\ 4 & 1 & -1 & 3 \end{pmatrix}.$$

We proceed in more or less the same manner as above; that is, we try to eliminate $x$ from the second equation, and $y$ from the first, by doing simple operations on the matrix. Before we start, observe that each time we do such an operation, we are, in effect, replacing the original system of equations by an equivalent system which has the same solutions. For instance, if we multiply the first equation by the number 2, we get a "new" equation which has exactly the same solutions as the original.

♣ Exercise: This is also true if we replace, say, equation 2 with equation 2 plus some multiple of equation 1. Why?

So, to business:

$$\begin{pmatrix} 1 & -2 & \frac{1}{2} & \frac{1}{2} \\ 4 & 1 & -1 & 3 \end{pmatrix} : \text{Mult eqn 1 by 1/2}$$

$$\begin{pmatrix} 1 & -2 & \frac{1}{2} & \frac{1}{2} \\ 0 & 9 & -3 & 1 \end{pmatrix} : \text{Mult eqn 1 by } -4 \text{ and add it to eqn 2}$$

$$\begin{pmatrix} 1 & -2 & \frac{1}{2} & \frac{1}{2} \\ 0 & 1 & -\frac{1}{3} & \frac{1}{9} \end{pmatrix} : \text{Mult eqn 2 by 1/9} \tag{2.5}$$

$$\begin{pmatrix} 1 & 0 & -\frac{1}{6} & \frac{13}{18} \\ 0 & 1 & -\frac{1}{3} & \frac{1}{9} \end{pmatrix} : \text{Add (2)eqn 2 to eqn 1} \tag{2.6}$$

The matrix (2.5) is called an echelon form of the augmented matrix. The matrix (2.6) is called the reduced echelon form. (Precise definitions of these terms will be given in the next lecture.) Either one can be used to solve the system of equations. Working with the echelon form in (2.5), the two equations now read

$$\begin{aligned} x - 2y + z/2 &= 1/2 \\ y - z/3 &= 1/9. \end{aligned}$$

So $y = z/3 + 1/9$. Substituting this into the first equation gives

$$\begin{aligned} x &= 2y - z/2 + 1/2 \\ &= 2(z/3 + 1/9) - z/2 + 1/2 \\ &= z/6 + 13/18. \end{aligned}$$

♣ Exercise: Verify that the reduced echelon matrix (2.6) gives exactly the same solutions. This is as it should be. All equivalent systems of equations have the same solutions.

2.4 The solution in vector notation

We see that for any choice of $z$, we get a solution to (2.4). Taking $z = 0$, the solution is $x = 13/18$, $y = 1/9$. But if $z = 1$, then $x = 8/9$, $y = 4/9$ is the solution. Similarly for any other choice of $z$, which for this reason is called a free variable.
If we write $z = t$, a more familiar expression for the solution is

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \frac{t}{6} + \frac{13}{18} \\ \frac{t}{3} + \frac{1}{9} \\ t \end{pmatrix} = t\begin{pmatrix} \frac{1}{6} \\ \frac{1}{3} \\ 1 \end{pmatrix} + \begin{pmatrix} \frac{13}{18} \\ \frac{1}{9} \\ 0 \end{pmatrix}. \tag{2.7}$$

This is of the form $\mathbf{r}(t) = t\mathbf{v} + \mathbf{a}$, and you will recognize it as the (vector) parametric form of a line in $\mathbb{R}^3$. This (with $t$ a free variable) is called the general solution to the system (2.4). If we choose a particular value of $t$, say $t = 3\pi$, and substitute into (2.7), then we have a particular solution.

♣ Exercise: Write down the augmented matrix and solve these. If there are free variables, write your answer in the form given in (2.7) above. Also, give a geometric interpretation of the solution set (e.g., the common intersection of three planes in $\mathbb{R}^3$).

1. $3x + 2y - 4z = 3$
   $-x - 2y + 3z = 4$

2. $2x - 4y = 3$
   $3x + 2y = -1$
   $x - y = 10$

3. $x + y + 3z = 4$

It is now time to think about what we've just been doing:

• Can we formalize the algorithm we've been using to solve these equations?
• Can we show that the algorithm always works? That is, are we guaranteed to get all the solutions if we use the algorithm? Alternatively, if the system is inconsistent (i.e., no solutions exist), will the algorithm say so?

Let's write down the different 'operations' we've been using on the systems of equations and on the corresponding augmented matrices:

1. We can multiply any equation by a non-zero real number (scalar). The corresponding matrix operation consists of multiplying a row of the matrix by a scalar.
2. We can replace any equation by the original equation plus a scalar multiple of another equation. Equivalently, we can replace any row of a matrix by that row plus a multiple of another row.
3. We can interchange two equations (or two rows of the augmented matrix); we haven't needed to do this yet, but sometimes it's necessary, as we'll see in a bit.

Definition: These three operations are called elementary row operations.
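The three elementary row operations can be written as three small functions acting on a matrix stored as a list of rows. The sketch below is illustrative only: the names `scale`, `add_multiple` and `swap` are invented here, and rows are numbered from 0 as is usual in Python. It replays the reduction of the augmented matrix of (2.4) to the reduced echelon form (2.6) and then checks the general solution (2.7) for a few values of $t$.

```python
from fractions import Fraction

def scale(M, i, c):
    """Operation 1: multiply row i by a non-zero scalar c."""
    if c == 0:
        raise ValueError("the scalar must be non-zero")
    M[i] = [c * v for v in M[i]]

def add_multiple(M, src, dst, c):
    """Operation 2: replace row dst by (row dst) + c * (row src)."""
    M[dst] = [d + c * s for s, d in zip(M[src], M[dst])]

def swap(M, i, j):
    """Operation 3: interchange rows i and j (not needed in this example)."""
    M[i], M[j] = M[j], M[i]

# Augmented matrix of 2x - 4y + z = 1, 4x + y - z = 3
M = [[Fraction(v) for v in row] for row in [[2, -4, 1, 1], [4, 1, -1, 3]]]

scale(M, 0, Fraction(1, 2))     # mult eqn 1 by 1/2
add_multiple(M, 0, 1, -4)       # add -4(eqn 1) to eqn 2
scale(M, 1, Fraction(1, 9))     # mult eqn 2 by 1/9   -> echelon form (2.5)
add_multiple(M, 1, 0, 2)        # add 2(eqn 2) to eqn 1 -> reduced form (2.6)

def general_solution(t):
    """The point on the solution line (2.7) with parameter t."""
    t = Fraction(t)
    return (t / 6 + Fraction(13, 18), t / 3 + Fraction(1, 9), t)

for t in (0, 1, -5):
    x, y, z = general_solution(t)
    print(t, 2 * x - 4 * y + z, 4 * x + y - z)   # right hand sides of (2.4)
```

Each printed line should show the right hand sides 1 and 3 of (2.4), whatever value of $t$ is used.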
In the next lecture, we'll assemble the solution algorithm, and show that it can be reformulated in terms of matrix multiplication.

Chapter 3 Elementary row operations and their corresponding matrices

3.1 Elementary matrices

As we'll see, any elementary row operation can be performed by multiplying the augmented matrix $(A|\mathbf{y})$ on the left by what we'll call an elementary matrix. Just so this doesn't come as a total shock, let's look at some simple matrix operations:

• Suppose $EA$ is defined, and suppose the first row of $E$ is $(1, 0, 0, \ldots, 0)$. Then the first row of $EA$ is identical to the first row of $A$.
• Similarly, if the $i$th row of $E$ is all zeros except for a 1 in the $i$th slot, then the $i$th row of the product $EA$ is identical to the $i$th row of $A$.
• It follows that if we want to change only row $i$ of the matrix $A$, we should multiply $A$ on the left by some matrix $E$ with the following property: Every row of $E$ other than row $i$ should be the same as the corresponding row of the identity matrix.

The procedure that we illustrate below is used to reduce any matrix to echelon form (not just augmented matrices). The way it works is simple: the elementary matrices $E_1, E_2, \ldots$ are formed by (a) doing the necessary row operation on the identity matrix to get $E$, and then (b) multiplying $A$ on the left by $E$.

Example: Let

$$A = \begin{pmatrix} 3 & 4 & 5 \\ 2 & -1 & 0 \end{pmatrix}.$$

1. To multiply the first row of $A$ by 1/3, we can multiply $A$ on the left by the elementary matrix

$$E_1 = \begin{pmatrix} \frac{1}{3} & 0 \\ 0 & 1 \end{pmatrix}.$$

(Since we don't want to change the second row of $A$, the second row of $E_1$ is the same as the second row of $I_2$.) The first row is obtained by multiplying the first row of $I$ by 1/3. The result is

$$E_1 A = \begin{pmatrix} 1 & \frac{4}{3} & \frac{5}{3} \\ 2 & -1 & 0 \end{pmatrix}.$$

You should check this on your own. Same with the remaining computations.

2. To add $-2$(row 1) to row 2 in the resulting matrix, multiply it by

$$E_2 = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}.$$

The general rule here is the following: To perform an elementary row operation on the matrix $A$, first perform the operation on the corresponding identity matrix to obtain an elementary matrix; then multiply $A$ on the left by this elementary matrix.

3.2 The echelon and reduced echelon (Gauss-Jordan) form

Continuing with the problem, we obtain

$$E_2 E_1 A = \begin{pmatrix} 1 & \frac{4}{3} & \frac{5}{3} \\ 0 & -\frac{11}{3} & -\frac{10}{3} \end{pmatrix}.$$

Note the order of the factors: $E_2 E_1 A$ and not $E_1 E_2 A$.

Now multiply row 2 of $E_2 E_1 A$ by $-3/11$ using the matrix

$$E_3 = \begin{pmatrix} 1 & 0 \\ 0 & -\frac{3}{11} \end{pmatrix},$$

yielding the echelon form

$$E_3 E_2 E_1 A = \begin{pmatrix} 1 & \frac{4}{3} & \frac{5}{3} \\ 0 & 1 & \frac{10}{11} \end{pmatrix}.$$

Last, we clean out the second column by adding $(-4/3)$(row 2) to row 1. The corresponding elementary matrix is

$$E_4 = \begin{pmatrix} 1 & -\frac{4}{3} \\ 0 & 1 \end{pmatrix}.$$

Carrying out the multiplication, we obtain the Gauss-Jordan form of the augmented matrix

$$E_4 E_3 E_2 E_1 A = \begin{pmatrix} 1 & 0 & \frac{5}{11} \\ 0 & 1 & \frac{10}{11} \end{pmatrix}.$$

Naturally, we get the same result as before, so why bother? The answer is that we're developing an algorithm that will work in the general case. So it's about time to formally identify our goal in the general case. We begin with some definitions.

Definition: The leading entry of a matrix row is the first non-zero entry in the row, starting from the left. A row without a leading entry is a row of zeros.

Definition: The matrix $R$ is said to be in echelon form provided that

1. The leading entry of every non-zero row is a 1.
2. If the leading entry of row $i$ is in position $k$, and the next row is not a row of zeros, then the leading entry of row $i+1$ is in position $k+j$, where $j \geq 1$.
3. All zero rows are at the bottom of the matrix.

The following matrices are in echelon form:

$$\begin{pmatrix} 1 & \ast \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & \ast & \ast \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad\text{and}\quad \begin{pmatrix} 0 & 1 & \ast & \ast & \ast \\ 0 & 0 & 1 & \ast & \ast \\ 0 & 0 & 0 & 1 & \ast \end{pmatrix}.$$

Here the asterisks ($\ast$) stand for any number at all, including 0.

Definition: The matrix $R$ is said to be in reduced echelon form if (a) $R$ is in echelon form, and (b) each leading entry is the only non-zero entry in its column.
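The worked example of Sections 3.1 and 3.2 can be replayed by building each elementary matrix from the identity, exactly as the general rule prescribes, and then multiplying on the left. This is a minimal sketch under stated assumptions: matrices are lists of rows of exact fractions, rows are numbered from 0, and the helper names (`identity`, `scale_matrix`, `add_matrix`, `matmul`) are invented for this illustration.

```python
from fractions import Fraction as F

def identity(n):
    """The n x n identity matrix I_n."""
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]

def scale_matrix(n, i, c):
    """Elementary matrix: I_n with row i multiplied by c."""
    E = identity(n)
    E[i] = [c * v for v in E[i]]
    return E

def add_matrix(n, src, dst, c):
    """Elementary matrix: I_n with c * (row src) added to row dst."""
    E = identity(n)
    E[dst] = [d + c * s for s, d in zip(E[src], E[dst])]
    return E

def matmul(A, B):
    """Matrix product AB from the definition."""
    return [[sum(A[i][s] * B[s][j] for s in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[F(3), F(4), F(5)], [F(2), F(-1), F(0)]]

E1 = scale_matrix(2, 0, F(1, 3))     # multiply row 1 by 1/3
E2 = add_matrix(2, 0, 1, F(-2))      # add -2(row 1) to row 2
E3 = scale_matrix(2, 1, F(-3, 11))   # multiply row 2 by -3/11
E4 = add_matrix(2, 1, 0, F(-4, 3))   # add (-4/3)(row 2) to row 1

R = A
for E in (E1, E2, E3, E4):           # builds E4 E3 E2 E1 A, one factor at a time
    R = matmul(E, R)

print(R)
```

The loop multiplies on the left in the order the operations are performed, which is why the final product reads $E_4 E_3 E_2 E_1 A$ from left to right; swapping two of the factors generally changes the result.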
The reduced echelon form of a matrix is also called the Gauss-Jordan form.

The following matrices are in reduced row echelon form:

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & \ast & 0 & \ast \\ 0 & 0 & 1 & \ast \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad\text{and}\quad \begin{pmatrix} 0 & 1 & 0 & 0 & \ast \\ 0 & 0 & 1 & 0 & \ast \\ 0 & 0 & 0 & 1 & \ast \end{pmatrix}.$$

♣ Exercise: Suppose $A$ is 3×5. What is the maximum number of leading 1's that can appear when it's been reduced to echelon form? Same question for $A_{5\times 3}$. Can you generalize your results to a statement for $A_{m\times n}$? (State it as a theorem.)

Once a matrix has been brought to echelon form, it can be put into reduced echelon form by cleaning out the non-zero entries in any column containing a leading 1. For example, if

$$R = \begin{pmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

which is in echelon form, then it can be reduced to Gauss-Jordan form by adding $(-2)$(row 2) to row 1, and then $(-3)$(row 3) to row 1. Thus

$$\begin{pmatrix} 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & -5 & 3 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

and

$$\begin{pmatrix} 1 & 0 & -3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & -5 & 3 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & -5 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

Note that column 3 cannot be "cleaned out" since there's no leading 1 there.

3.3 The third elementary row operation

There is one more elementary row operation and corresponding elementary matrix we may need. Suppose we want to reduce the following matrix to Gauss-Jordan form

$$A = \begin{pmatrix} 2 & 2 & -1 \\ 0 & 0 & 3 \\ 1 & -1 & 2 \end{pmatrix}.$$

Multiplying row 1 by 1/2, and then adding $-$(row 1) to row 3 leads to

$$E_2 E_1 A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}\begin{pmatrix} \frac{1}{2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 2 & 2 & -1 \\ 0 & 0 & 3 \\ 1 & -1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 1 & -\frac{1}{2} \\ 0 & 0 & 3 \\ 0 & -2 & \frac{5}{2} \end{pmatrix}.$$

Now we can clearly do 2 more operations to get a leading 1 in the (2,3) position, and another leading 1 in the (3,2) position. But this won't be in echelon form (why not?). We need to interchange rows 2 and 3. This corresponds to changing the order of the equations, and evidently doesn't change the solutions.
We can accomplish this by multiplying on the left with a matrix obtained from $I$ by interchanging rows 2 and 3:

$$E_3 E_2 E_1 A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 1 & -\frac{1}{2} \\ 0 & 0 & 3 \\ 0 & -2 & \frac{5}{2} \end{pmatrix} = \begin{pmatrix} 1 & 1 & -\frac{1}{2} \\ 0 & -2 & \frac{5}{2} \\ 0 & 0 & 3 \end{pmatrix}.$$

♣ Exercise: Without doing any written computation, write down the Gauss-Jordan form for this matrix.

♣ Exercise: Use elementary matrices to reduce

$$A = \begin{pmatrix} 2 & 1 \\ -1 & 3 \end{pmatrix}$$

to Gauss-Jordan form. You should wind up with an expression of the form

$$E_k \cdots E_2 E_1 A = I.$$

What is another name for the matrix $B = E_k \cdots E_2 E_1$?

Chapter 4 Elementary matrices, continued

We have identified 3 types of row operations and their corresponding elementary matrices. To repeat the recipe: These matrices are constructed by performing the given row operation on the identity matrix:

1. To multiply $\text{row}_j(A)$ by the scalar $c$, use the matrix $E$ obtained from $I$ by multiplying the $j$th row of $I$ by $c$.
2. To add $(c)(\text{row}_j(A))$ to $\text{row}_k(A)$, use the identity matrix with its $k$th row replaced by $(\ldots, c, \ldots, 1, \ldots)$. Here $c$ is in position $j$ and the 1 is in position $k$. All other entries are 0.
3. To interchange rows $j$ and $k$, use the identity matrix with rows $j$ and $k$ interchanged.

4.1 Properties of elementary matrices

1. Elementary matrices are always square. If the operation is to be performed on $A_{m\times n}$, then the elementary matrix $E$ is m×m. So the product $EA$ has the same dimension as the original matrix $A$.

2. Elementary matrices are invertible. If $E$ is elementary, then $E^{-1}$ is the matrix which undoes the operation that created $E$, and $E^{-1}EA = IA = A$; the matrix followed by its inverse does nothing to $A$.

Examples:

• $$E = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}$$ adds $(-2)(\text{row}_1(A))$ to $\text{row}_2(A)$. Its inverse is $$E^{-1} = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix},$$ which adds $(2)(\text{row}_1(A))$ to $\text{row}_2(A)$. You should check that the product of these two is $I_2$.

• If $E$ multiplies the second row of a 2×2 matrix by $\frac{1}{2}$, then $$E^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}.$$

• If $E$ interchanges two rows, then $E = E^{-1}$. For instance $$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = I.$$

♣ Exercise:

1.
If $A$ is 3×4, what is the elementary matrix that (a) subtracts $(7)(\text{row}_3(A))$ from $\text{row}_2(A)$? (b) interchanges the first and third rows? (c) multiplies $\text{row}_1(A)$ by 2?

2. What are the inverses of the matrices in exercise 1?

3. ($\ast$) Do elementary matrices commute? That is, does it matter in which order they're multiplied? Give an example or two to illustrate your answer.

4. ($\ast$) In a manner analogous to the above, define three elementary column operations and show that they can be implemented by multiplying $A_{m\times n}$ on the right by elementary n×n column matrices.

4.2 The algorithm for Gaussian elimination

We can now formulate the algorithm which reduces any matrix first to row echelon form, and then, if needed, to reduced echelon form:

1. Begin with the (1,1) entry. If it's some number $a \neq 0$, divide through row 1 by $a$ to get a 1 in the (1,1) position. If it is zero, then interchange row 1 with another row to get a nonzero (1,1) entry and proceed as above. If every entry in column 1 is zero, go to the top of column 2 and, by multiplication and permuting rows if necessary, get a 1 in the (1,2) slot. If column 2 won't work, then go to column 3, etc. If you can't arrange for a leading 1 somewhere in row 1, then your original matrix was the zero matrix, and it's already reduced.

2. You now have a leading 1 in some column. Use this leading 1 and operations of the type $(a)\,\text{row}_i(A) + \text{row}_k(A) \to \text{row}_k(A)$ to replace every entry in the column below the location of the leading 1 by 0. When you're done, the column will look like

$$\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$

3. Now move one column to the right, and one row down and attempt to repeat the process, getting a leading 1 in this location. You may need to permute this row with a row below it. If it's not possible to get a non-zero entry in this position, move right one column and try again.
At the end of this second procedure, your matrix might look like

$$\begin{pmatrix} 1 & \ast & \ast & \ast \\ 0 & 0 & 1 & \ast \\ 0 & 0 & 0 & \ast \end{pmatrix},$$

where the second leading entry is in column 3. Notice that once a leading 1 has been installed in the correct position and the column below this entry has been zeroed out, none of the subsequent row operations will change any of the elements in the column. For the matrix above, no subsequent row operations in our reduction process will change any of the entries in the first 3 columns.

4. The process continues until there are no more positions for leading entries: we either run out of rows or columns or both, because the matrix has only a finite number of each. We have arrived at the row echelon form.

The three matrices below are all in row echelon form:

$$\begin{pmatrix} 1 & \ast & \ast & \ast & \ast \\ 0 & 0 & 1 & \ast & \ast \\ 0 & 0 & 0 & 1 & \ast \end{pmatrix}, \quad\text{or}\quad \begin{pmatrix} 1 & \ast & \ast \\ 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad\text{or}\quad \begin{pmatrix} 1 & \ast & \ast \\ 0 & 1 & \ast \\ 0 & 0 & 1 \end{pmatrix}$$

Remark: The description of the algorithm doesn't involve elementary matrices. As a practical matter, it's much simpler to just do the row operation directly on $A$, instead of writing down an elementary matrix and multiplying the matrices. But the fact that we could do this with the elementary matrices turns out to be quite useful theoretically.

♣ Exercise: Find the echelon form for each of the following:

$$\begin{pmatrix} 0 & 4 \\ 7 & -2 \end{pmatrix}, \quad \begin{pmatrix} 3 & 2 & -1 & 4 \\ 2 & -5 & 2 & 6 \end{pmatrix}, \quad (3, 4), \quad \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \\ 7 & 8 \end{pmatrix}$$

4.3 Observations

(1) The leading entries progress strictly downward, from left to right. We could just as easily have written an algorithm in which the leading entries progress downward as we move from right to left, or upwards from left to right. Our choice is purely a matter of convention, but this is the convention used by most people.

Definition: The matrix $A$ is upper triangular if any entry $a_{ij}$ with $i > j$ satisfies $a_{ij} = 0$.

(2) The row echelon form of the matrix is upper triangular.

(3) To continue the reduction to Gauss-Jordan form, it is only necessary to use each leading 1 to clean out any remaining non-zero entries in its column.
For the first matrix in (4.2) above, the Gauss-Jordan form will look like

$$\begin{pmatrix} 1 & \ast & 0 & 0 & \ast \\ 0 & 0 & 1 & 0 & \ast \\ 0 & 0 & 0 & 1 & \ast \end{pmatrix}$$

Of course, cleaning out the columns may lead to changes in the entries labelled with $\ast$.

4.4 Why does the algorithm (Gaussian elimination) work?

Suppose we start with the system of equations $A\mathbf{x} = \mathbf{y}$. The augmented matrix is $(A|\mathbf{y})$, where the coefficients of the variable $x_1$ are the numbers in $\text{col}_1(A)$, the 'equals' sign is represented by the vertical line, and the last column of the augmented matrix is the right hand side of the system.

If we multiply the augmented matrix by the elementary matrix $E$, we get $E(A|\mathbf{y})$. But this can also be written as $(EA|E\mathbf{y})$.

Example: Suppose

$$(A|\mathbf{y}) = \left(\begin{array}{cc|c} a & b & c \\ d & e & f \end{array}\right),$$

and we want to add two times the first row to the second, using the elementary matrix

$$E = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}.$$

The result is

$$E(A|\mathbf{y}) = \left(\begin{array}{cc|c} a & b & c \\ 2a + d & 2b + e & 2c + f \end{array}\right).$$

But, as you can easily see, the first two columns of $E(A|\mathbf{y})$ are just the entries of $EA$, and the last column is $E\mathbf{y}$, so $E(A|\mathbf{y}) = (EA|E\mathbf{y})$, and this works in general. (See the appropriate problem.)

So after multiplication by $E$, we have the new augmented matrix $(EA|E\mathbf{y})$, which corresponds to the system of equations $EA\mathbf{x} = E\mathbf{y}$. Now suppose $\mathbf{x}$ is a solution to $A\mathbf{x} = \mathbf{y}$. Multiplication of this equation by $E$ gives $EA\mathbf{x} = E\mathbf{y}$, so $\mathbf{x}$ solves this new system. And conversely, since $E$ is invertible, if $\mathbf{x}$ solves the new system, $EA\mathbf{x} = E\mathbf{y}$, multiplication by $E^{-1}$ gives $A\mathbf{x} = \mathbf{y}$, so $\mathbf{x}$ solves the original system. We have just proven the

Theorem: Elementary row operations applied to either $A\mathbf{x} = \mathbf{y}$ or the corresponding augmented matrix $(A|\mathbf{y})$ don't change the set of solutions to the system.
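The algorithm of Section 4.2 can be sketched in a few lines of code. What follows is one possible rendering, not the notes' own implementation: the function names `row_echelon` and `gauss_jordan` are invented here, the matrix is assumed to be a non-empty list of rows of equal length, rows and columns are numbered from 0, and exact fractions are used so that no rounding occurs. As the Remark in Section 4.2 suggests, the row operations are applied directly to the matrix rather than through elementary matrices.

```python
from fractions import Fraction

def row_echelon(M):
    """Reduce M to row echelon form; return (R, pivots).

    pivots lists the (row, column) position of each leading 1.
    """
    R = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots = []
    r = 0                                   # row where the next leading 1 goes
    for c in range(cols):
        if r == rows:
            break                           # ran out of rows
        # look at or below row r for a non-zero entry in column c
        p = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if p is None:
            continue                        # nothing usable: move right
        R[r], R[p] = R[p], R[r]             # interchange rows if needed
        lead = R[r][c]
        R[r] = [v / lead for v in R[r]]     # make the leading entry a 1
        for i in range(r + 1, rows):        # zero out the column below it
            f = R[i][c]
            R[i] = [v - f * u for u, v in zip(R[r], R[i])]
        pivots.append((r, c))
        r += 1
    return R, pivots

def gauss_jordan(M):
    """Continue to reduced echelon form by cleaning out above each leading 1."""
    R, pivots = row_echelon(M)
    for r, c in pivots:
        for i in range(r):
            f = R[i][c]
            R[i] = [v - f * u for u, v in zip(R[r], R[i])]
    return R

# Augmented matrix (2.3): the last column is the solution (5/11, 10/11)
print(gauss_jordan([[3, 4, 5], [2, -1, 0]]))

# The matrix of Section 3.3, which needs a row interchange
print(row_echelon([[2, 2, -1], [0, 0, 3], [1, -1, 2]])[0])
```

In line with the Theorem above, the solution read off from the reduced matrix also satisfies the original system (2.1), since only elementary row operations were used along the way.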