Lecture notes on linear algebra

David Lerner
Department of Mathematics
University of Kansas

These are notes of a course given in Fall 2007 and Fall 2008 to the Honors sections of our elementary linear algebra course. The students' comments and corrections have greatly improved the exposition.

© 2007, 2008 D. E. Lerner

Contents

1 Matrices and matrix algebra
  1.1 Examples of matrices
  1.2 Operations with matrices
2 Matrices and systems of linear equations
  2.1 The matrix form of a linear system
  2.2 Row operations on the augmented matrix
  2.3 More variables
  2.4 The solution in vector notation
3 Elementary row operations and their corresponding matrices
  3.1 Elementary matrices
  3.2 The echelon and reduced echelon (Gauss-Jordan) form
  3.3 The third elementary row operation
4 Elementary matrices, continued
  4.1 Properties of elementary matrices
  4.2 The algorithm for Gaussian elimination
  4.3 Observations
  4.4 Why does the algorithm (Gaussian elimination) work?
  4.5 Application to the solution(s) of Ax = y
5 Homogeneous systems
  5.1 Solutions to the homogeneous system
  5.2 Some comments about free and leading variables
  5.3 Properties of the homogeneous system for A_{mn}
  5.4 Linear combinations and the superposition principle
6 The inhomogeneous system Ax = y, y ≠ 0
  6.1 Solutions to the inhomogeneous system
  6.2 Choosing a different particular solution
7 Square matrices, inverses and related matters
  7.1 The Gauss-Jordan form of a square matrix
  7.2 Solutions to Ax = y when A is square
  7.3 An algorithm for constructing A^{-1}
8 Square matrices continued: Determinants
  8.1 Introduction
  8.2 Aside: some comments about computer arithmetic
  8.3 The formal definition of det(A)
  8.4 Some consequences of the definition
  8.5 Computations using row operations
  8.6 Additional properties of the determinant
9 The derivative as a matrix
  9.1 Redefining the derivative
  9.2 Generalization to higher dimensions
10 Subspaces
11 Linearly dependent and independent sets
  11.1 Linear dependence
  11.2 Linear independence
  11.3 Elementary row operations
12 Basis and dimension of subspaces
  12.1 The concept of basis
  12.2 Dimension
13 The rank-nullity (dimension) theorem
  13.1 Rank and nullity of a matrix
  13.2 The rank-nullity theorem
14 Change of basis
  14.1 The coordinates of a vector
  14.2 Notation
15 Matrices and linear transformations
  15.1 m×n matrices as functions from R^n to R^m
  15.2 The matrix of a linear transformation
  15.3 The rank-nullity theorem - version 2
  15.4 Choosing a useful basis for A
16 Eigenvalues and eigenvectors
  16.1 Definition and some examples
  16.2 Computations with eigenvalues and eigenvectors
  16.3 Some observations
  16.4 Diagonalizable matrices
17 Inner products
  17.1 Definition and first properties
  17.2 Euclidean space
18 Orthonormal bases and related matters
  18.1 Orthogonality and normalization
  18.2 Orthonormal bases
19 Orthogonal projections and orthogonal matrices
  19.1 Orthogonal decompositions of vectors
  19.2 Algorithm for the decomposition
  19.3 Orthogonal matrices
  19.4 Invariance of the dot product under orthogonal transformations
20 Projections onto subspaces and the Gram-Schmidt algorithm
  20.1 Construction of an orthonormal basis
  20.2 Orthogonal projection onto a subspace V
  20.3 Orthogonal complements
  20.4 Gram-Schmidt - the general algorithm
21 Symmetric and skew-symmetric matrices
  21.1 Decomposition of a square matrix into symmetric and skew-symmetric matrices
  21.2 Skew-symmetric matrices and infinitesimal rotations
  21.3 Properties of symmetric matrices
22 Approximations - the method of least squares
  22.1 The problem
  22.2 The method of least squares
23 Least squares approximations - II
  23.1 The transpose of A
  23.2 Least squares approximations - the normal equation
24 Appendix: Mathematical implications and notation

To the student

These are lecture notes for a first course in linear algebra; the prerequisite is a good course in calculus. The notes are quite informal, but they have been carefully read and criticized by two sections of honors students, and their comments and suggestions have been incorporated. Although I've tried to be careful, there are undoubtedly some errors remaining. If you find any, please let me know.

The material in these notes is absolutely fundamental for all mathematicians, physical scientists, and engineers. You will use everything you learn in this course in your further studies.
Although we can't spend too much time on applications here, three important ones are treated in some detail: the derivative (Chapter 9), Helmholtz's theorem on infinitesimal deformations (Chapter 21), and least squares approximations (Chapters 22 and 23).

These are notes, and not a textbook; they correspond quite closely to what is actually said and discussed in class. The intention is for you to use them instead of an expensive textbook, but to do this successfully, you will have to treat them differently:

• Before each class, read the corresponding lecture. You will have to read it carefully, and you'll need a pad of scratch paper to follow along with the computations. Some of the "easy" steps in the computations are omitted, and you should supply them. It's not the case that the "important" material is set off in italics or boxes and the rest can safely be ignored. Typically, you'll have to read each lecture two or three times before you understand it. If you've understood the material, you should be able to work most of the problems. At this point, you're ready for class. You can pay attention in class to whatever was not clear to you in the notes, and ask questions.

• The way most students learn math out of a standard textbook is to grab the homework assignment and start working, referring back to the text for any needed worked examples. That won't work here. The exercises are not all at the end of the lecture; they're scattered throughout the text. They are to be worked when you get to them. If you can't work the exercise, you don't understand the material, and you're just kidding yourself if you go on to the next paragraph. Go back, reread the relevant material and try again. Work all the unstarred exercises. If you can't do something, get help, or ask about it in class. Exercises are all set off by "♣ Exercise:", so they're easy to find. The ones with asterisks (*) are a bit more difficult.

• You should treat mathematics as a foreign language.
In particular, definitions must be memorized (just like new vocabulary words in French). If you don't know what the words mean, you can't possibly do the math. Go to the bookstore, and get yourself a deck of index cards. Each time you encounter a new word in the notes (you can tell, because the new words are set off by "Definition:"), write it down, together with its definition, and at least one example, on a separate index card. Memorize the material on the cards. At least half the time when students make a mistake, it's because they don't really know what the words in the problem mean.

• There's an appendix on proofs and symbols; it's not really part of the text, but you may want to check there if you come upon a symbol or some form of reasoning that's not clear.

• Along with definitions come proofs. Doing a proof is the mathematical analog of going to the physics lab and verifying, by doing the experiment, that the period of a pendulum depends in a specific way on its length. Once you've done the proof or experiment, you really know it's true; you're not taking someone else's word for it. The proofs in this course are (a) relatively easy, (b) unsurprising, in the sense that the subject is quite coherent, and (c) useful in practice, since the proof often consists of an algorithm which tells you how to do something.

This may be a new approach for some of you, but, in fact, this is the way the experts learn math and science: we read books or articles, working our way through them line by line, and asking questions when we don't understand. It may be difficult or uncomfortable at first, but it gets easier as you go along. Working this way is a skill that must be mastered by any aspiring mathematician or scientist (i.e., you).

To the instructor

These are lecture notes for our 2-credit introductory linear algebra course. They correspond pretty closely to what I said (or should have said) in class.
Two of our Math 291 classes have gone over the notes rather carefully and have made many useful suggestions which have been happily adopted. Although the notes are intended to replace the standard text for this course, they may be somewhat abbreviated for self-study.

How to use the notes: The way I've used the notes is simple: For each lecture, the students' homework was to read the section (either a chapter or half of one) of the text that would be discussed in class. Most students found this difficult (and somewhat unpleasant) at first; they had to read the material three or four times before it began to make sense. They also had to work (or at least attempt) all the unstarred problems before class. For most students, this took between one and three hours of real work per class. During the actual class period, I answered questions, worked problems, and tried to lecture as little as possible. This worked quite well for the first half of the course, but as the material got more difficult, I found myself lecturing more often: there were certain things that needed to be emphasized that might not come up in a discussion format. The students soon became accustomed to this, and even got to like it. Since this is the way real scientists learn (by working through papers on their own), it's a skill that must be mastered, and the sooner the better.

Students were required to buy a deck of 3×5 inch index cards, to write down each definition on one side of a card, and any useful examples or counterexamples on the back side. They had to memorize the definitions: at least 25% of the points on each exam were definitions.

The only problems collected and graded were the starred ones. Problems that caused trouble (quite a few) were worked out in class. There are not many standard "drill" problems. Students were encouraged to make them up if they felt they needed practice.
Comments on the material:

Chapters 1 through 8, covering the solution of linear algebraic systems of equations, contain material the students have, in principle, seen before. But there is more than enough new material to keep everyone interested: the use of elementary matrices for row operations and the definition of the determinant as an alternating form are two examples.

Chapter 9 (optional but useful) talks about the derivative as a linear transformation.

Chapters 10 through 16 cover the basic material on linear dependence, independence, basis, dimension, the dimension theorem, change of basis, linear transformations, and eigenvalues. The learning curve is fairly steep here, and this is certainly the most challenging part of the course for most students. The level of rigor is reasonably high for an introductory course; why shouldn't it be?

Chapters 17 through 21 cover the basics of inner products, orthogonal projections, orthonormal bases, orthogonal transformations and the connection with rotations, and diagonalization of symmetric matrices. Helmholtz's theorem (optional) on the infinitesimal motion of a non-rigid body is used to motivate the decomposition of the derivative into its symmetric and skew-symmetric pieces.

Chapters 22 and 23 go over the motivation and simple use of the least squares approximation. Most of the students have heard about this, but have no idea how or why it works. It's a nice application which is both easily understood and uses much of the material they've learned so far.

Other things: No use was made of graphing calculators. The important material can all be illustrated adequately with 2×2 matrices, where it's simpler to do the computations by hand (and, as an important byproduct, the students actually learn the algorithms). Most of our students are engineers and have some acquaintance with MatLab, which is what they'll use for serious work, so we're not helping them by showing them how to do things inefficiently.
In spite of this, every student brought a graphing calculator to every exam. I have no idea how they used them.

No general definition of vector space is given. Everything is done in subspaces of R^n, which seems to be more than enough at this level. The disadvantage is that there's no discussion of vector space isomorphisms, but I felt that the resulting simplification of the exposition justified this.

There are some shortcomings: The level of exposition probably varies too much; the demands on the student are not as consistent as one would like. There should be a few more figures. Undoubtedly there are typos and other minor errors; I hope there are no major ones.

Chapter 1
Matrices and matrix algebra

1.1 Examples of matrices

Definition: A matrix is a rectangular array of numbers and/or variables. For instance

    A = \begin{pmatrix} 4 & -2 & 0 & -3 & 1 \\ 5 & 1.2 & -0.7 & x & 3 \\ \pi & -3 & 4 & 6 & 27 \end{pmatrix}

is a matrix with 3 rows and 5 columns (a 3×5 matrix). The 15 entries of the matrix are referenced by the row and column in which they sit: the (2,3) entry of A is -0.7. We may also write a_{23} = -0.7, a_{24} = x, etc. We indicate the fact that A is 3×5 (this is read as "three by five") by writing A_{3×5}. Matrices can also be enclosed in square brackets as well as large parentheses. That is, both

    \begin{pmatrix} 2 & 4 \\ 1 & -6 \end{pmatrix}  and  \begin{bmatrix} 2 & 4 \\ 1 & -6 \end{bmatrix}

are perfectly good ways to write this 2×2 matrix. Real numbers are 1×1 matrices. A vector such as

    v = \begin{pmatrix} x \\ y \\ z \end{pmatrix}

is a 3×1 matrix. We will generally use upper case Latin letters as symbols for general matrices, boldface lower case letters for the special case of vectors, and ordinary lower case letters for real numbers.

Definition: Real numbers, when used in matrix computations, are called scalars.

Matrices are ubiquitous in mathematics and the sciences. Some instances include:

• Systems of linear algebraic equations (the main subject matter of this course) are normally written as simple matrix equations of the form Ax = y.

• The derivative of a function f: R^3 → R^2 is a 2×3 matrix.
• First order systems of linear differential equations are written in matrix form.

• The symmetry groups of mathematics and physics, some of which we'll look at later, are groups of matrices.

• Quantum mechanics can be formulated using infinite-dimensional matrices.

1.2 Operations with matrices

Matrices of the same size can be added or subtracted by adding or subtracting the corresponding entries:

    \begin{pmatrix} 2 & 1 \\ -3 & 4 \\ 7 & 0 \end{pmatrix} + \begin{pmatrix} 6 & -1.2 \\ \pi & x \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 8 & -0.2 \\ \pi - 3 & 4 + x \\ 8 & -1 \end{pmatrix}.

Definition: If the matrices A and B have the same size, then their sum is the matrix A + B defined by (A + B)_{ij} = a_{ij} + b_{ij}. Their difference is the matrix A - B defined by (A - B)_{ij} = a_{ij} - b_{ij}.

Definition: A matrix A can be multiplied by a scalar c to obtain the matrix cA, where (cA)_{ij} = c a_{ij}. This is called scalar multiplication. We just multiply each entry of A by c. For example

    -3 \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} -3 & -6 \\ -9 & -12 \end{pmatrix}.

Definition: The m×n matrix whose entries are all 0 is denoted 0_{mn} (or, more often, just by 0 if the dimensions are obvious from context). It's called the zero matrix.

Definition: Two matrices A and B are equal if all their corresponding entries are equal:

    A = B \iff a_{ij} = b_{ij} for all i, j.

Definition: If the number of columns of A equals the number of rows of B, then the product AB is defined by

    (AB)_{ij} = \sum_{s=1}^{k} a_{is} b_{sj}.

Here k is the number of columns of A or rows of B. If the summation sign is confusing, this could also be written as

    (AB)_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{ik}b_{kj}.

Example:

    \begin{pmatrix} 1 & 2 & 3 \\ -1 & 0 & 4 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 4 & 2 \\ 1 & 3 \end{pmatrix} = \begin{pmatrix} 1\cdot(-1) + 2\cdot 4 + 3\cdot 1 & 1\cdot 0 + 2\cdot 2 + 3\cdot 3 \\ -1\cdot(-1) + 0\cdot 4 + 4\cdot 1 & -1\cdot 0 + 0\cdot 2 + 4\cdot 3 \end{pmatrix} = \begin{pmatrix} 10 & 13 \\ 5 & 12 \end{pmatrix}

If AB is defined, then the number of rows of AB is the same as the number of rows of A, and the number of columns is the same as the number of columns of B:

    A_{m×n} B_{n×p} = (AB)_{m×p}.

Why define multiplication like this? The answer is that this is the definition that corresponds to what shows up in practice.
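The summation formula for (AB)_{ij} translates directly into code. Below is a minimal sketch in Python (plain lists of lists; the helper name `mat_mul` is ours, not the text's), checked against the worked example above:

```python
def mat_mul(A, B):
    """(AB)_ij = sum over s of a_is * b_sj, where A is m x k and B is k x p."""
    m, k, p = len(A), len(B), len(B[0])
    assert all(len(row) == k for row in A), "columns of A must equal rows of B"
    return [[sum(A[i][s] * B[s][j] for s in range(k)) for j in range(p)]
            for i in range(m)]

A = [[1, 2, 3],
     [-1, 0, 4]]
B = [[-1, 0],
     [4, 2],
     [1, 3]]
print(mat_mul(A, B))  # [[10, 13], [5, 12]]
```

Note how the (i,j) entry is built from row i of A and column j of B, exactly as in the definition.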
Example: Recall from calculus (Exercise) that if a point (x, y) in the plane is rotated counterclockwise about the origin through an angle θ to obtain a new point (x', y'), then

    x' = x cos θ - y sin θ
    y' = x sin θ + y cos θ.

In matrix notation, this can be written

    \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.

If the new point (x', y') is now rotated through an additional angle φ to get (x'', y''), then

    \begin{pmatrix} x'' \\ y'' \end{pmatrix} = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix}
      = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
      = \begin{pmatrix} \cos\theta\cos\phi - \sin\theta\sin\phi & -(\cos\theta\sin\phi + \sin\theta\cos\phi) \\ \cos\theta\sin\phi + \sin\theta\cos\phi & \cos\theta\cos\phi - \sin\theta\sin\phi \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
      = \begin{pmatrix} \cos(\theta+\phi) & -\sin(\theta+\phi) \\ \sin(\theta+\phi) & \cos(\theta+\phi) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.

This is obviously correct, since it shows that the point has been rotated through the total angle of θ + φ. So the right answer is given by matrix multiplication as we've defined it, and not some other way.

Matrix multiplication is not commutative: in English, AB ≠ BA for arbitrary matrices A and B. For instance, if A is 3×5 and B is 5×2, then AB is 3×2, but BA is not defined. Even if both matrices are square and of the same size, so that both AB and BA are defined and have the same size, the two products are not generally equal.

♣ Exercise: Write down two 2×2 matrices and compute both products. Unless you've been very selective, the two products won't be equal.

Another example: If

    A = \begin{pmatrix} 2 \\ 3 \end{pmatrix}, and B = \begin{pmatrix} 1 & 2 \end{pmatrix},

then

    AB = \begin{pmatrix} 2 & 4 \\ 3 & 6 \end{pmatrix}, while BA = (8).

Two fundamental properties of matrix multiplication:

1. If AB and AC are defined, then A(B + C) = AB + AC.

2. If AB is defined, and c is a scalar, then A(cB) = c(AB).

♣ Exercise: Prove the two properties listed above. (Both these properties can be proven by showing that, in each equation, the (i,j) entry on the right hand side of the equation is equal to the (i,j) entry on the left.)

Definition: The transpose of the matrix A, denoted A^t, is obtained from A by making the first row of A into the first column of A^t, the second row of A into the second column of A^t, and so on. Formally,

    a^t_{ji} = a_{ij}.

So

    \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}^t = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}.

Here's one consequence of the non-commutativity of matrix multiplication: If AB is defined, then (AB)^t = B^t A^t (and not A^t B^t as you might expect).

Example: If

    A = \begin{pmatrix} 2 & 1 \\ 3 & 0 \end{pmatrix}, and B = \begin{pmatrix} -1 & 2 \\ 4 & 3 \end{pmatrix},

then

    AB = \begin{pmatrix} 2 & 7 \\ -3 & 6 \end{pmatrix}, so (AB)^t = \begin{pmatrix} 2 & -3 \\ 7 & 6 \end{pmatrix}.

And

    B^t A^t = \begin{pmatrix} -1 & 4 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 2 & 3 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 2 & -3 \\ 7 & 6 \end{pmatrix},

as advertised.

♣ Exercise: Can you show that (AB)^t = B^t A^t? You need to write out the (i,j) entry of both sides and then observe that they're equal.

Definition: A is square if it has the same number of rows and columns. An important instance is the identity matrix I_n, which has ones on the main diagonal and zeros elsewhere. Example:

    I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.

Often, we'll just write I without the subscript for an identity matrix, when the dimension is clear from the context. The identity matrices behave, in some sense, like the number 1. If A is n×m, then I_n A = A, and A I_m = A.

Definition: Suppose A and B are square matrices of the same dimension, and suppose that AB = I = BA. Then B is said to be the inverse of A, and we write this as B = A^{-1}. Similarly, B^{-1} = A. For instance, you can easily check that

    \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},

and so these two matrices are inverses of one another:

    \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}  and  \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}^{-1} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}.

Example: Not every square matrix has an inverse. For instance

    A = \begin{pmatrix} 3 & 1 \\ 3 & 1 \end{pmatrix}

has no inverse.

♣ Exercise: Show that the matrix A in the above example has no inverse. Hint: Suppose that

    B = \begin{pmatrix} a & b \\ c & d \end{pmatrix}

is the inverse of A. Then we must have BA = I. Write this out and show that the equations for the entries of B are inconsistent.

♣ Exercise: Which 1×1 matrices are invertible, and what are their inverses?

♣ Exercise: Show that if

    A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, and ad - bc ≠ 0, then A^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.

Hint: Multiply A by the given expression for A^{-1} and show that it equals I. If ad - bc = 0, then the matrix is not invertible. You should probably memorize this formula.
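Both claims above, the identity (AB)^t = B^t A^t and the 2×2 inverse formula, are easy to spot-check numerically. A minimal sketch in Python (the helper names are ours; exact integer and rational arithmetic avoids rounding):

```python
from fractions import Fraction

def transpose(M):
    # Rows of M become columns of the result.
    return [list(col) for col in zip(*M)]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the formula in the exercise above."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    f = Fraction(1, det)
    return [[f * d, -f * b], [-f * c, f * a]]

# The transpose example from the text: (AB)^t = B^t A^t.
A = [[2, 1], [3, 0]]
B = [[-1, 2], [4, 3]]
assert transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A))

# The pair of inverses from the text.
assert inverse_2x2([[2, 1], [1, 1]]) == [[1, -1], [-1, 2]]

# The non-invertible example: ad - bc = 3*1 - 1*3 = 0.
try:
    inverse_2x2([[3, 1], [3, 1]])
except ValueError as e:
    print(e)  # matrix is not invertible
```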
♣ Exercise: Show that if A has an inverse, then it's unique; that is, if B and C are both inverses of A, then B = C. (Hint: Consider the product BAC = (BA)C = B(AC).)

Chapter 2
Matrices and systems of linear equations

2.1 The matrix form of a linear system

You have all seen systems of linear equations such as

    3x + 4y = 5
    2x - y = 0.     (2.1)

This system can be solved easily: Multiply the 2nd equation by 4, and add the two resulting equations to get 11x = 5 or x = 5/11. Substituting this into either equation gives y = 10/11. In this case, a solution exists (obviously) and is unique (there's just one solution, namely (5/11, 10/11)).

We can write this system as a matrix equation, in the form Ax = y:

    \begin{pmatrix} 3 & 4 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 5 \\ 0 \end{pmatrix}.     (2.2)

Here

    x = \begin{pmatrix} x \\ y \end{pmatrix}, and y = \begin{pmatrix} 5 \\ 0 \end{pmatrix}, and A = \begin{pmatrix} 3 & 4 \\ 2 & -1 \end{pmatrix}

is called the coefficient matrix. This formula works because if we multiply the two matrices on the left, we get the 2×1 matrix equation

    \begin{pmatrix} 3x + 4y \\ 2x - y \end{pmatrix} = \begin{pmatrix} 5 \\ 0 \end{pmatrix}.

And the two matrices are equal if both their entries are equal, which holds only if both equations in (2.1) are satisfied.

2.2 Row operations on the augmented matrix

Of course, rewriting the system in matrix form does not, by itself, simplify the way in which we solve it. The simplification results from the following observation:

The variables x and y can be eliminated from the computation by simply writing down a matrix in which the coefficients of x are in the first column, the coefficients of y in the second, and the right hand side of the system is the third column:

    \begin{pmatrix} 3 & 4 & 5 \\ 2 & -1 & 0 \end{pmatrix}.     (2.3)

We are using the columns as "place markers" instead of x, y and the = sign. That is, the first column consists of the coefficients of x, the second has the coefficients of y, and the third has the numbers on the right hand side of (2.1).
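As a quick sanity check that the matrix form (2.2) really encodes the system, multiplying A by the solution vector found above should reproduce the right hand side. A small Python sketch (exact rational arithmetic; not part of the original notes):

```python
from fractions import Fraction as F

A = [[3, 4], [2, -1]]        # coefficient matrix from (2.2)
x = [F(5, 11), F(10, 11)]    # the unique solution found above
y = [sum(a * xj for a, xj in zip(row, x)) for row in A]
print(y)  # [Fraction(5, 1), Fraction(0, 1)], i.e. the right hand side (5, 0)
```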
We can do exactly the same operations on this matrix as we did on the original system. (The purpose of this lecture is to remind you of the mechanics for solving simple linear systems; we'll give precise definitions and statements of the algorithms later.)

    \begin{pmatrix} 3 & 4 & 5 \\ 8 & -4 & 0 \end{pmatrix}   : Multiply the 2nd eqn by 4

    \begin{pmatrix} 3 & 4 & 5 \\ 11 & 0 & 5 \end{pmatrix}   : Add the 1st eqn to the 2nd

    \begin{pmatrix} 3 & 4 & 5 \\ 1 & 0 & \frac{5}{11} \end{pmatrix}   : Divide the 2nd eqn by 11

The second equation now reads 1·x + 0·y = 5/11, and we've solved for x; we can now substitute for x in the first equation to solve for y as above.

Definition: The matrix in (2.3) is called the augmented matrix of the system, and can be written in matrix shorthand as (A|y).

Even though the solution to the system of equations is unique, it can be solved in many different ways (all of which, clearly, must give the same answer). For instance, start with the same augmented matrix

    \begin{pmatrix} 3 & 4 & 5 \\ 2 & -1 & 0 \end{pmatrix}.

    \begin{pmatrix} 1 & 5 & 5 \\ 2 & -1 & 0 \end{pmatrix}   : Replace eqn 1 with eqn 1 - eqn 2

    \begin{pmatrix} 1 & 5 & 5 \\ 0 & -11 & -10 \end{pmatrix}   : Subtract 2 times eqn 1 from eqn 2

    \begin{pmatrix} 1 & 5 & 5 \\ 0 & 1 & \frac{10}{11} \end{pmatrix}   : Divide eqn 2 by -11 to get y = 10/11

The second equation tells us that y = 10/11, and we can substitute this into the first equation x + 5y = 5 to get x = 5/11. We could even take this one step further:

    \begin{pmatrix} 1 & 0 & \frac{5}{11} \\ 0 & 1 & \frac{10}{11} \end{pmatrix}   : We added -5(eqn 2) to eqn 1

The complete solution can now be read off from the matrix. What we've done is to eliminate x from the second equation (the 0 in position (2,1)) and y from the first (the 0 in position (1,2)).

♣ Exercise: What's wrong with writing the final matrix as

    \begin{pmatrix} 1 & 0 & 0.45 \\ 0 & 1 & 0.91 \end{pmatrix}?

The system above consists of two linear equations in two unknowns. Each equation, by itself, is the equation of a line in the plane and so has infinitely many solutions. To solve both equations simultaneously, we need to find the points, if any, which lie on both lines. There are 3 possibilities: (a) there's just one (the usual case), (b) there is no solution (if the two lines are parallel and distinct), or (c) there are an infinite number of solutions (if the two lines coincide).
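The row operations used above are mechanical enough to code directly. A minimal sketch in Python (function names are ours) that replays the second sequence of operations on the augmented matrix:

```python
from fractions import Fraction as F

def scale(M, i, c):
    """Multiply row i by the nonzero scalar c (in place)."""
    M[i] = [c * x for x in M[i]]

def add_multiple(M, i, j, c):
    """Replace row i by row i + c * row j (in place)."""
    M[i] = [x + c * y for x, y in zip(M[i], M[j])]

M = [[F(3), F(4), F(5)],      # augmented matrix (A|y) from (2.3)
     [F(2), F(-1), F(0)]]
add_multiple(M, 0, 1, F(-1))  # eqn 1 <- eqn 1 - eqn 2
add_multiple(M, 1, 0, F(-2))  # eqn 2 <- eqn 2 - 2(eqn 1)
scale(M, 1, F(-1, 11))        # divide eqn 2 by -11
add_multiple(M, 0, 1, F(-5))  # eqn 1 <- eqn 1 - 5(eqn 2)
print(M)  # rows [1, 0, 5/11] and [0, 1, 10/11]: x = 5/11, y = 10/11
```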
♣ Exercise: (Do this before continuing with the text.) What are the possibilities for 2 linear equations in 3 unknowns? That is, what geometric object does each equation represent, and what are the possibilities for solution(s)?

2.3 More variables

Let's add another variable and consider two equations in three unknowns:

    2x - 4y + z = 1
    4x + y - z = 3     (2.4)

Rather than solving this directly, we'll work with the augmented matrix for the system, which is

    \begin{pmatrix} 2 & -4 & 1 & 1 \\ 4 & 1 & -1 & 3 \end{pmatrix}.

We proceed in more or less the same manner as above; that is, we try to eliminate x from the second equation, and y from the first by doing simple operations on the matrix. Before we start, observe that each time we do such an operation, we are, in effect, replacing the original system of equations by an equivalent system which has the same solutions. For instance, if we multiply the first equation by the number 2, we get a "new" equation which has exactly the same solutions as the original.

♣ Exercise: This is also true if we replace, say, equation 2 with equation 2 plus some multiple of equation 1. Why?

So, to business:

    \begin{pmatrix} 1 & -2 & \frac{1}{2} & \frac{1}{2} \\ 4 & 1 & -1 & 3 \end{pmatrix}   : Mult eqn 1 by 1/2

    \begin{pmatrix} 1 & -2 & \frac{1}{2} & \frac{1}{2} \\ 0 & 9 & -3 & 1 \end{pmatrix}   : Mult eqn 1 by -4 and add it to eqn 2

    \begin{pmatrix} 1 & -2 & \frac{1}{2} & \frac{1}{2} \\ 0 & 1 & -\frac{1}{3} & \frac{1}{9} \end{pmatrix}   : Mult eqn 2 by 1/9     (2.5)

    \begin{pmatrix} 1 & 0 & -\frac{1}{6} & \frac{13}{18} \\ 0 & 1 & -\frac{1}{3} & \frac{1}{9} \end{pmatrix}   : Add (2)eqn 2 to eqn 1     (2.6)

The matrix (2.5) is called an echelon form of the augmented matrix. The matrix (2.6) is called the reduced echelon form. (Precise definitions of these terms will be given in the next lecture.) Either one can be used to solve the system of equations. Working with the echelon form in (2.5), the two equations now read

    x - 2y + z/2 = 1/2
    y - z/3 = 1/9.

So y = z/3 + 1/9. Substituting this into the first equation gives

    x = 2y - z/2 + 1/2 = 2(z/3 + 1/9) - z/2 + 1/2 = z/6 + 13/18.

♣ Exercise: Verify that the reduced echelon matrix (2.6) gives exactly the same solutions. This is as it should be. All equivalent systems of equations have the same solutions.
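Since z is left free, every value of z should give a solution of (2.4). The sketch below (Python, exact arithmetic; `solution` is our name for the formulas just derived) checks this for a few values:

```python
from fractions import Fraction as F

def solution(t):
    """x = t/6 + 13/18, y = t/3 + 1/9, z = t, as derived from (2.5)."""
    return (t * F(1, 6) + F(13, 18), t * F(1, 3) + F(1, 9), t)

for t in [F(0), F(1), F(3), F(-7, 2)]:
    x, y, z = solution(t)
    assert 2 * x - 4 * y + z == 1    # first equation of (2.4)
    assert 4 * x + y - z == 3        # second equation of (2.4)
print(solution(F(0)))  # (Fraction(13, 18), Fraction(1, 9), Fraction(0, 1))
```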
2.4 The solution in vector notation

We see that for any choice of z, we get a solution to (2.4). Taking z = 0, the solution is x = 13/18, y = 1/9. But if z = 1, then x = 8/9, y = 4/9 is the solution. Similarly for any other choice of z, which for this reason is called a free variable. If we write z = t, a more familiar expression for the solution is

    \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \frac{t}{6} + \frac{13}{18} \\ \frac{t}{3} + \frac{1}{9} \\ t \end{pmatrix} = t \begin{pmatrix} \frac{1}{6} \\ \frac{1}{3} \\ 1 \end{pmatrix} + \begin{pmatrix} \frac{13}{18} \\ \frac{1}{9} \\ 0 \end{pmatrix}.     (2.7)

This is of the form r(t) = tv + a, and you will recognize it as the (vector) parametric form of a line in R^3. This (with t a free variable) is called the general solution to the system (2.4). If we choose a particular value of t, say t = 3π, and substitute into (2.7), then we have a particular solution.

♣ Exercise: Write down the augmented matrix and solve these. If there are free variables, write your answer in the form given in (2.7) above. Also, give a geometric interpretation of the solution set (e.g., the common intersection of three planes in R^3.)

1. 3x + 2y - 4z = 3
   -x - 2y + 3z = 4

2. 2x - 4y = 3
   3x + 2y = -1
   x - y = 10

3. x + y + 3z = 4

It is now time to think about what we've just been doing:

• Can we formalize the algorithm we've been using to solve these equations?

• Can we show that the algorithm always works? That is, are we guaranteed to get all the solutions if we use the algorithm? Alternatively, if the system is inconsistent (i.e., no solutions exist), will the algorithm say so?

Let's write down the different "operations" we've been using on the systems of equations and on the corresponding augmented matrices:

1. We can multiply any equation by a non-zero real number (scalar). The corresponding matrix operation consists of multiplying a row of the matrix by a scalar.

2. We can replace any equation by the original equation plus a scalar multiple of another equation. Equivalently, we can replace any row of a matrix by that row plus a multiple of another row.

3.
We can interchange two equations (or two rows of the augmented matrix); we haven't needed to do this yet, but sometimes it's necessary, as we'll see in a bit.

Definition: These three operations are called elementary row operations.

In the next lecture, we'll assemble the solution algorithm, and show that it can be reformulated in terms of matrix multiplication.

Chapter 3
Elementary row operations and their corresponding matrices

3.1 Elementary matrices

As we'll see, any elementary row operation can be performed by multiplying the augmented matrix (A|y) on the left by what we'll call an elementary matrix. Just so this doesn't come as a total shock, let's look at some simple matrix operations:

• Suppose EA is defined, and suppose the first row of E is (1, 0, 0, ..., 0). Then the first row of EA is identical to the first row of A.

• Similarly, if the i-th row of E is all zeros except for a 1 in the i-th slot, then the i-th row of the product EA is identical to the i-th row of A.

• It follows that if we want to change only row i of the matrix A, we should multiply A on the left by some matrix E with the following property: every row of E except row i should be the corresponding row of the identity matrix.

The procedure that we illustrate below is used to reduce any matrix to echelon form (not just augmented matrices). The way it works is simple: the elementary matrices E_1, E_2, ... are formed by (a) doing the necessary row operation on the identity matrix to get E, and then (b) multiplying A on the left by E.

Example: Let

    A = \begin{pmatrix} 3 & 4 & 5 \\ 2 & -1 & 0 \end{pmatrix}.

1. To multiply the first row of A by 1/3, we can multiply A on the left by the elementary matrix

    E_1 = \begin{pmatrix} \frac{1}{3} & 0 \\ 0 & 1 \end{pmatrix}.
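To see the claim in action, the sketch below (Python, exact arithmetic, helper names ours) multiplies A on the left by E_1 and confirms that only the first row changes:

```python
from fractions import Fraction as F

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[F(3), F(4), F(5)],
     [F(2), F(-1), F(0)]]
E1 = [[F(1, 3), F(0)],     # the identity with its (1,1) entry replaced by 1/3
      [F(0), F(1)]]
print(mat_mul(E1, A))  # row 1 becomes [1, 4/3, 5/3]; row 2 is unchanged
```

Since E_1 agrees with the identity matrix in its second row, the second row of E_1 A is just the second row of A, exactly as the bullet points above predict.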