Lecture notes on linear algebra
Department of Mathematics
University of Kansas
These are notes of a course given in Fall 2007 and 2008 to the Honors sections of our
elementary linear algebra course. The students' comments and corrections have greatly
improved these notes.
© 2007, 2008 D. E. Lerner

Chapter 1
Matrices and matrix algebra
1.1 Examples of matrices
Definition: A matrix is a rectangular array of numbers and/or variables. For instance

        ( 4   -2     0   -3    1 )
    A = ( 5   1.2  -0.7    x    3 )
        ( π   -3     4     6   27 )

is a matrix with 3 rows and 5 columns (a 3×5 matrix). The 15 entries of the matrix are
referenced by the row and column in which they sit: the (2,3) entry of A is -0.7. We may
also write a_23 = -0.7, a_24 = x, etc. We indicate the fact that A is 3×5 (this is read as
"three by five") by writing A_{3×5}. Matrices can also be enclosed in square brackets as well as
large parentheses. That is, both
large parentheses. That is, both
    ( 2   4 )         [ 2   4 ]
    ( 1  -6 )   and   [ 1  -6 ]

are perfectly good ways to write this 2×2 matrix.
Real numbers are 1×1 matrices. A vector such as

        ( x )
    v = ( y )
        ( z )
is a 3× 1 matrix. We will generally use upper case Latin letters as symbols for general
matrices, boldface lower case letters for the special case of vectors, and ordinary lower case
letters for real numbers.
Deﬁnition: Real numbers, when used in matrix computations, are called scalars.
Matrices are ubiquitous in mathematics and the sciences. Some instances include:
• Systems of linear algebraic equations (the main subject matter of this course) are
normally written as simple matrix equations of the form Ax = y.
• The derivative of a function f : R^3 → R^2 is a 2×3 matrix.
• First order systems of linear diﬀerential equations are written in matrix form.
• The symmetry groups of mathematics and physics, some of which we’ll look at later,
are groups of matrices.
• Quantum mechanics can be formulated using inﬁnite-dimensional matrices.
1.2 Operations with matrices
Matrices of the same size can be added or subtracted by adding or subtracting the
corresponding entries:

    (  2   1 )   ( 6  -1.2 )   (  8    -0.2 )
    ( -3   4 ) + ( π    x  ) = ( π-3   4+x  )
    (  7   0 )   ( 1   -1  )   (  8    -1   )
Definition: If the matrices A and B have the same size, then their sum is the matrix A+B
defined by

    (A+B)_ij = a_ij + b_ij.

Their difference is the matrix A-B defined by

    (A-B)_ij = a_ij - b_ij.
Definition: A matrix A can be multiplied by a scalar c to obtain the matrix cA, where

    (cA)_ij = c·a_ij.

This is called scalar multiplication. We just multiply each entry of A by c. For example

    -3 ( 1  2 )   ( -3   -6 )
       ( 3  4 ) = ( -9  -12 )
Definition: The m×n matrix whose entries are all 0 is denoted 0_{m×n} (or, more often, just
by 0 if the dimensions are obvious from context). It's called the zero matrix.
Definition: Two matrices A and B are equal if all their corresponding entries are equal:

    A = B  ⟺  a_ij = b_ij for all i, j.
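These entrywise operations are easy to experiment with in NumPy (used here purely for illustration; the variable entry x from the earlier example is replaced by a concrete number, since NumPy arrays hold numbers):

```python
import numpy as np

# The 3x2 sum from the text, with x replaced by 5 so the array is numeric.
A = np.array([[2.0, 1.0], [-3.0, 4.0], [7.0, 0.0]])
B = np.array([[6.0, -1.2], [np.pi, 5.0], [1.0, -1.0]])

S = A + B                             # entrywise sum: S[i, j] = a_ij + b_ij
D = A - B                             # entrywise difference
C = -3 * np.array([[1, 2], [3, 4]])   # scalar multiplication: each entry times -3
```

Note that `+`, `-`, and scalar `*` on NumPy arrays act entrywise, exactly as in the definitions above.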
Definition: If A is m×k and B is k×n, then the product AB is defined by

    (AB)_ij = Σ_{s=1..k} a_is b_sj.

Here k is the number of columns of A, or rows of B.
If the summation sign is confusing, this could also be written as

    (AB)_ij = a_i1 b_1j + a_i2 b_2j + ··· + a_ik b_kj.
    (  1  2  3 ) ( -1  0 )   (   1·(-1)+2·4+3·1     1·0+2·2+3·3   )   ( 10  13 )
    ( -1  0  4 ) (  4  2 ) = ( (-1)·(-1)+0·4+4·1  (-1)·0+0·2+4·3  ) = (  5  12 )
                 (  1  3 )
If AB is defined, then the number of rows of AB is the same as the number of rows of A,
and the number of columns is the same as the number of columns of B:

    A_{m×n} B_{n×p} = (AB)_{m×p}.

This definition of the product may look arbitrary, but it corresponds exactly
to what shows up in practice.
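The sum-over-s definition is exactly what NumPy's `@` operator computes; here is the worked 2×3 times 3×2 example above as a check:

```python
import numpy as np

# The 2x3 times 3x2 product worked out in the text; the result is 2x2.
A = np.array([[1, 2, 3], [-1, 0, 4]])
B = np.array([[-1, 0], [4, 2], [1, 3]])

P = A @ B   # (AB)_ij = sum over s of a_is * b_sj
```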
Example: Recall from calculus (Exercise) that if a point (x, y) in the plane is rotated
counterclockwise about the origin through an angle θ to obtain a new point (x′, y′), then

    x′ = x cos θ - y sin θ
    y′ = x sin θ + y cos θ.

In matrix notation, this can be written

    ( x′ )   ( cos θ  -sin θ ) ( x )
    ( y′ ) = ( sin θ   cos θ ) ( y )
If the new point (x′, y′) is now rotated through an additional angle φ to get (x″, y″), then

    ( x″ )   ( cos φ  -sin φ ) ( x′ )
    ( y″ ) = ( sin φ   cos φ ) ( y′ )

           = ( cos φ  -sin φ ) ( cos θ  -sin θ ) ( x )
             ( sin φ   cos φ ) ( sin θ   cos θ ) ( y )

           = ( cos θ cos φ - sin θ sin φ   -(cos θ sin φ + sin θ cos φ) ) ( x )
             ( cos θ sin φ + sin θ cos φ     cos θ cos φ - sin θ sin φ  ) ( y )

           = ( cos(θ+φ)  -sin(θ+φ) ) ( x )
             ( sin(θ+φ)   cos(θ+φ) ) ( y )

This is obviously correct, since it shows that the point has been rotated through the total
angle of θ+φ. So the right answer is given by matrix multiplication as we've defined it, and
not some other way.
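The composition rule above can also be checked numerically; here is a sketch in NumPy (the particular angle values are arbitrary):

```python
import numpy as np

def rotation(theta):
    """2x2 matrix rotating the plane counterclockwise by theta."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

theta, phi = 0.3, 1.1
# Rotating by theta and then by phi is the same as rotating by theta + phi;
# the matrix product (applied in the order R(phi) R(theta)) says exactly that.
composed = rotation(phi) @ rotation(theta)
```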
Matrix multiplication is not commutative: in English, AB ≠ BA for arbitrary matrices
A and B. For instance, if A is 3×5 and B is 5×2, then AB is 3×2, but BA is not
deﬁned. Even if both matrices are square and of the same size, so that both AB and BA
are deﬁned and have the same size, the two products are not generally equal.
♣ Exercise: Write down two 2×2 matrices and compute both products. Unless you’ve been
very selective, the two products won’t be equal.
Another example: if A is a 2×1 column vector and B = (1  2) is a 1×2 row vector, then
AB is a 2×2 matrix, while BA is 1×1, so the two products cannot possibly be equal.
Two fundamental properties of matrix multiplication:
1. If AB and AC are deﬁned, then A(B +C)=AB+AC.
2. If AB is deﬁned, and c is a scalar, then A(cB) =c(AB).
♣ Exercise: Prove the two properties listed above. (Both these properties can be proven by
showing that, in each equation, the (i,j) entry on the right hand side of the equation is equal
to the (i,j) entry on the left.)
Definition: The transpose of the matrix A, denoted A^t, is obtained from A by making the
first row of A into the first column of A^t, the second row of A into the second column of A^t,
and so on. Formally,

    (a^t)_ij = a_ji.

For example,

    ( 1  2 )^t
    ( 3  4 )    = ( 1  3  5 ).
    ( 5  6 )      ( 2  4  6 )
Here's one consequence of the non-commutativity of matrix multiplication: if AB is defined,
then (AB)^t = B^t A^t (and not A^t B^t as you might expect).
Example: Let

    A = ( 2  1 ),   and   B = ( -1  2 ).
        ( 3  0 )             (  4  3 )

Then

    AB = (  2  7 ),   so   (AB)^t = ( 2  -3 ).
         ( -3  6 )                  ( 7   6 )

And

    B^t A^t = ( -1  4 ) ( 2  3 )   ( 2  -3 )
              (  2  3 ) ( 1  0 ) = ( 7   6 )
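Here is a quick numerical check of the transpose rule, using the matrices A and B above:

```python
import numpy as np

A = np.array([[2, 1], [3, 0]])
B = np.array([[-1, 2], [4, 3]])

# (AB)^t equals B^t A^t -- note the reversed order of the factors.
lhs = (A @ B).T
rhs = B.T @ A.T
```

Comparing `lhs` with `A.T @ B.T` instead shows that the naive (un-reversed) guess fails.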
♣ Exercise: Can you show that (AB)^t = B^t A^t? You need to write out the (i,j)-th entry of
both sides and then observe that they're equal.
Definition: A is square if it has the same number of rows and columns. An important
example is the n×n identity matrix I_n, which has 1's on the main diagonal and 0's elsewhere;
for instance,

          ( 1  0  0 )
    I_3 = ( 0  1  0 ).
          ( 0  0  1 )

Often, we'll just write I without the subscript for an identity matrix, when the dimension is
clear from the context. The identity matrices behave, in some sense, like the number 1. If
A is n×m, then I_n A = A, and A I_m = A.
Definition: Suppose A and B are square matrices of the same dimension, and suppose that
AB = I = BA. Then B is said to be the inverse of A, and we write this as B = A^{-1}.
Similarly, A = B^{-1}. For instance, you can easily check that

    ( 2  1 ) (  1  -1 )   ( 1  0 )
    ( 1  1 ) ( -1   2 ) = ( 0  1 ),

and so these two matrices are inverses of one another:

    ( 2  1 )^{-1}   (  1  -1 )            (  1  -1 )^{-1}   ( 2  1 )
    ( 1  1 )      = ( -1   2 )    and    ( -1   2 )       = ( 1  1 ).
Example: Not every square matrix has an inverse. For instance, the matrix

    A = ( 1  1 )
        ( 1  1 )

has no inverse.
♣ Exercise: Show that the matrix A in the above example has no inverse. Hint: Suppose that B
is the inverse of A. Then we must have BA = I. Write this out and show that the equations
for the entries of B are inconsistent.
♣ Exercise: Which 1×1 matrices are invertible, and what are their inverses?
♣ Exercise: Show that if

    A = ( a  b )   and   ad - bc ≠ 0,   then   A^{-1} = 1/(ad-bc) (  d  -b ).
        ( c  d )                                                  ( -c   a )

Hint: Multiply A by the given expression for A^{-1} and show that it equals I. If ad - bc = 0,
then the matrix is not invertible. You should probably memorize this formula.
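The formula in the exercise is easy to implement and check; here is a sketch in NumPy (`inv2` is our own helper name, introduced only for this illustration):

```python
import numpy as np

def inv2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via the ad - bc formula from the exercise."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return np.array([[d, -b], [-c, a]]) / det

# The 2x2 example from the text: [[2, 1], [1, 1]] has inverse [[1, -1], [-1, 2]].
A = np.array([[2, 1], [1, 1]])
Ainv = inv2(2, 1, 1, 1)
```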
♣ Exercise: Show that if A has an inverse, then it's unique; that is, if B and C are both
inverses of A, then B = C. (Hint: Consider the product BAC = (BA)C = B(AC).)
Chapter 2
Matrices and systems of linear equations
2.1 The matrix form of a linear system
You have all seen systems of linear equations such as
3x+4y = 5
2x−y = 0. (2.1)
This system can be solved easily: multiply the 2nd equation by 4, and add the two resulting
equations to get 11x = 5, or x = 5/11. Substituting this into either equation gives y = 10/11.
In this case, a solution exists (obviously) and is unique (there's just one solution, namely
x = 5/11, y = 10/11).
We can write this system as a matrix equation, in the form Ax = y:

    ( 3   4 ) ( x )   ( 5 )
    ( 2  -1 ) ( y ) = ( 0 ),                                    (2.2)

where

    x = ( x ),   y = ( 5 ),   and   A = ( 3   4 )
        ( y )        ( 0 )              ( 2  -1 )

is called the coefficient matrix.
This formula works because if we multiply the two matrices on the left, we get the 2×1 matrix

    ( 3x + 4y )
    ( 2x - y  ).

And the two matrices are equal if both their entries are equal, which holds only if both
equations in (2.1) are satisfied.
2.2 Row operations on the augmented matrix
Of course, rewriting the system in matrix form does not, by itself, simplify the way in which
we solve it. The simpliﬁcation results from the following observation:
The variables x and y can be eliminated from the computation by simply writing down a
matrix in which the coefficients of x are in the first column, the coefficients of y in the
second, and the right hand side of the system is the third column:

    ( 3   4 | 5 )
    ( 2  -1 | 0 )                                               (2.3)

We are using the columns as "place markers" instead of x, y and the = sign. That is, the
first column consists of the coefficients of x, the second has the coefficients of y, and the
third has the numbers on the right hand side of (2.1).
We can do exactly the same operations on this matrix as we did on the original system:

    ( 3   4 | 5 )
    ( 8  -4 | 0 )      : Multiply the 2nd eqn by 4

    ( 3   4 | 5 )
    ( 11  0 | 5 )      : Add the 1st eqn to the 2nd

    ( 3   4 | 5    )
    ( 1   0 | 5/11 )   : Divide the 2nd eqn by 11

The second equation now reads 1·x + 0·y = 5/11, and we've solved for x; we can now
substitute for x in the first equation to solve for y as above.
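The row operations above can be mimicked mechanically; the sketch below reproduces them in NumPy and compares the result with a library solver (the variable names are ours, chosen for this illustration):

```python
import numpy as np

# The system (2.1): 3x + 4y = 5, 2x - y = 0.
A = np.array([[3.0, 4.0], [2.0, -1.0]])
y = np.array([5.0, 0.0])

sol = np.linalg.solve(A, y)   # library solver, for comparison

# Reproduce the row operations from the text on the augmented matrix:
M = np.column_stack([A, y])   # ( 3  4 | 5 ; 2 -1 | 0 )
M[1] = 4 * M[1]               # multiply the 2nd eqn by 4
M[1] = M[1] + M[0]            # add the 1st eqn to the 2nd -> 11x = 5
x_val = M[1, 2] / M[1, 0]     # x = 5/11
```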
Definition: The matrix in (2.3) is called the augmented matrix of the system, and can
be written in matrix shorthand as (A|y).
Even though the solution to the system of equations is unique, it can be solved in many
different ways (all of which, clearly, must give the same answer). For instance, start with
the same augmented matrix

    ( 3   4 | 5 )
    ( 2  -1 | 0 )

    ( 1   5 | 5 )      : Replace eqn 1 with eqn 1 - eqn 2
    ( 2  -1 | 0 )

    ( 1    5 |  5  )   : Subtract 2 times eqn 1 from eqn 2
    ( 0  -11 | -10 )

    ( 1   5 | 5     )  : Divide eqn 2 by -11 to get y = 10/11
    ( 0   1 | 10/11 )
The purpose of this lecture is to remind you of the mechanics for solving simple linear systems. We’ll
give precise deﬁnitions and statements of the algorithms later.
The second equation tells us that y = 10/11, and we can substitute this into the first equation
x + 5y = 5 to get x = 5/11. We could even take this one step further:

    ( 1  0 | 5/11  )   : We added -5(eqn 2) to eqn 1
    ( 0  1 | 10/11 )
The complete solution can now be read off from the matrix. What we've done is to eliminate
x from the second equation (the 0 in position (2,1)) and y from the first (the 0 in position
(1,2)).
♣ Exercise: What's wrong with writing the final matrix as

    ( 1  0 | 0.45 )
    ( 0  1 | 0.91 ) ?
The system above consists of two linear equations in two unknowns. Each equation, by itself,
is the equation of a line in the plane and so has infinitely many solutions. To solve both
equations simultaneously, we need to find the points, if any, which lie on both lines. There
are 3 possibilities: (a) there's just one (the usual case), (b) there is no solution (if the two
lines are parallel and distinct), or (c) there are an infinite number of solutions (if the two
equations describe the same line).
♣ Exercise: (Do this before continuing with the text.) What are the possibilities for 2 linear
equations in 3 unknowns? That is, what geometric object does each equation represent, and
what are the possibilities for solution(s)?
2.3 More variables
Let's add another variable and consider two equations in three unknowns:

    2x - 4y + z = 1
    4x + y - z  = 3                                             (2.4)

The augmented matrix is

    ( 2  -4   1 | 1 )
    ( 4   1  -1 | 3 )
We proceed in more or less the same manner as above - that is, we try to eliminate x from
the second equation, and y from the ﬁrst by doing simple operations on the matrix. Before
we start, observe that each time we do such an operation, we are, in eﬀect, replacing the
original system of equations by an equivalent system which has the same solutions. For
instance, if we multiply the ﬁrst equation by the number 2, we get a ”new” equation which
has exactly the same solutions as the original.
♣ Exercise: This is also true if we replace, say, equation 2 with equation 2 plus some multiple
of equation 1. Why?
So, to business:

    ( 1  -2  1/2 | 1/2 )
    ( 4   1  -1  |  3  )       : Mult eqn 1 by 1/2

    ( 1  -2  1/2 | 1/2 )
    ( 0   9  -3  |  1  )       : Mult eqn 1 by -4 and add it to eqn 2

    ( 1  -2   1/2 | 1/2 )
    ( 0   1  -1/3 | 1/9 )      : Mult eqn 2 by 1/9              (2.5)

    ( 1   0  -1/6 | 13/18 )
    ( 0   1  -1/3 | 1/9   )    : Add (2)(eqn 2) to eqn 1        (2.6)
The matrix (2.5) is called an echelon form of the augmented matrix. The matrix (2.6) is
called the reduced echelon form. (Precise deﬁnitions of these terms will be given in the
next lecture.) Either one can be used to solve the system of equations. Working with the
echelon form in (2.5), the two equations now read

    x - 2y + z/2 = 1/2
    y - z/3 = 1/9.

So y = z/3 + 1/9. Substituting this into the first equation gives

    x = 2y - z/2 + 1/2 = z/6 + 13/18.
♣ Exercise: Verify that the reduced echelon matrix (2.6) gives exactly the same solutions. This
is as it should be. All equivalent systems of equations have the same solutions.
2.4 The solution in vector notation
We see that for any choice of z, we get a solution to (2.4). Taking z = 0, the solution is
x = 13/18, y = 1/9. But if z = 1, then x = 8/9, y = 4/9 is the solution. Similarly for any
other choice of z, which for this reason is called a free variable. If we write z = t, a more
familiar expression for the solution is

    ( x )   ( t/6 + 13/18 )     ( 1/6 )   ( 13/18 )
    ( y ) = ( t/3 + 1/9   ) = t ( 1/3 ) + ( 1/9   ).            (2.7)
    ( z )   (      t      )    (   1  )   (  0    )

This is of the form r(t) = tv + a, and you will recognize it as the (vector) parametric form
of a line in R^3. This (with t a free variable) is called the general solution to the system
(2.4). If we choose a particular value of t, say t = 3π, and substitute into (2.7), then we have
a particular solution.
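The claim that every value of t gives a solution can be checked directly; in the sketch below, `general_solution` is our own name for the map t ↦ tv + a from (2.7):

```python
import numpy as np

# The system (2.4): 2x - 4y + z = 1, 4x + y - z = 3.
A = np.array([[2.0, -4.0, 1.0],
              [4.0, 1.0, -1.0]])
rhs = np.array([1.0, 3.0])

def general_solution(t):
    # the parametric form t*v + a from (2.7)
    return t * np.array([1/6, 1/3, 1.0]) + np.array([13/18, 1/9, 0.0])

# For any t -- including the "particular solution" value t = 3*pi --
# the residual A @ r(t) - rhs should be the zero vector.
residual = A @ general_solution(3 * np.pi) - rhs
```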
♣ Exercise: Write down the augmented matrix and solve these. If there are free variables, write
your answer in the form given in (2.7) above. Also, give a geometric interpretation of the
solution set (e.g., the common intersection of three planes in R^3).
3x+2y−4z = 3
−x−2y+3z = 4
2x−4y = 3
3x+2y = −1
x−y = 10
x+y+3z = 4
It is now time to think about what we’ve just been doing:
• Can we formalize the algorithm we’ve been using to solve these equations?
• Can we show that the algorithm always works? That is, are we guaranteed to get all
the solutions if we use the algorithm? Alternatively, if the system is inconsistent (i.e.,
no solutions exist), will the algorithm say so?
Let’s write down the diﬀerent ‘operations’ we’ve been using on the systems of equations and
on the corresponding augmented matrices:
1. We can multiply any equation by a non-zero real number (scalar). The corresponding
matrix operation consists of multiplying a row of the matrix by a scalar.
2. We can replace any equation by the original equation plus a scalar multiple of another
equation. Equivalently, we can replace any row of a matrix by thatrow plus a multiple
of another row.
3. We can interchange two equations (or two rows of the augmented matrix); we haven’t
needed to do this yet, but sometimes it’s necessary, as we’ll see in a bit.
Deﬁnition: These three operations are called elementary row operations.
In the next lecture, we’ll assemble the solution algorithm, and show that it can be reformu-
lated in terms of matrix multiplication.
Chapter 3
Elementary row operations and their matrices
3.1 Elementary matrices
As we'll see, any elementary row operation can be performed by multiplying the augmented
matrix (A|y) on the left by what we'll call an elementary matrix. Just so this doesn't
come as a total shock, let’s look at some simple matrix operations:
• Suppose EA is defined, and suppose the first row of E is (1, 0, 0, ..., 0). Then the first
row of EA is identical to the first row of A.
• Similarly, if the i-th row of E is all zeros except for a 1 in the i-th slot, then the i-th row
of the product EA is identical to the i-th row of A.
• It follows that if we want to change only row i of the matrix A, we should multiply A
on the left by some matrix E with the following property:
Every row of E except row i should be the corresponding row of the identity matrix.
The procedure that we illustrate below is used to reduce any matrix to echelon form (not
just augmented matrices). The way it works is simple: the elementary matrices E_1, E_2, ...
are formed by (a) doing the necessary row operation on the identity matrix to get E, and
then (b) multiplying A on the left by E. Example: let

    A = ( 3   4 | 5 ).
        ( 2  -1 | 0 )
1. To multiply the first row of A by 1/3, we can multiply A on the left by the elementary
matrix

    E_1 = ( 1/3  0 ).
          (  0   1 )

(Since we don't want to change the second row of A, the second row of E_1 is the same
as the second row of I_2.) The first row is obtained by multiplying the first row of I_2 by
1/3. The result is

    E_1 A = ( 1  4/3 | 5/3 ).
            ( 2  -1  |  0  )
You should check this on your own. Same with the remaining computations.
2. To add -2(row 1) to row 2 in the resulting matrix, multiply it by

    E_2 = (  1  0 ).
          ( -2  1 )
The general rule here is the following: To perform an elementary row operation on
the matrix A, first perform the operation on the corresponding identity matrix
to obtain an elementary matrix; then multiply A on the left by this elementary matrix.
3.2 The echelon and reduced echelon (Gauss-Jordan) form
Continuing with the problem, we obtain

    E_2 E_1 A = ( 1    4/3 |   5/3 ).
                ( 0  -11/3 | -10/3 )

Note the order of the factors: E_2 E_1 A and not E_1 E_2 A!
Now multiply row 2 of E_2 E_1 A by -3/11 using the matrix

    E_3 = ( 1     0   ),
          ( 0  -3/11 )

yielding the echelon form

    E_3 E_2 E_1 A = ( 1  4/3 |  5/3  ).
                    ( 0   1  | 10/11 )
Last, we clean out the second column by adding (-4/3)(row 2) to row 1. The corresponding
elementary matrix is

    E_4 = ( 1  -4/3 ).
          ( 0    1  )

Carrying out the multiplication, we obtain the Gauss-Jordan form of the augmented matrix:

    E_4 E_3 E_2 E_1 A = ( 1  0 | 5/11  ).
                        ( 0  1 | 10/11 )
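As a check of this computation, multiplying out the four elementary matrices in NumPy reproduces the Gauss-Jordan form (the names E1 through E4 mirror E_1 through E_4 above):

```python
import numpy as np

# The augmented matrix ( 3  4 | 5 ; 2 -1 | 0 ) and the four elementary matrices.
A = np.array([[3.0, 4.0, 5.0], [2.0, -1.0, 0.0]])

E1 = np.array([[1 / 3, 0], [0, 1]])      # multiply row 1 by 1/3
E2 = np.array([[1, 0], [-2, 1]])         # add -2(row 1) to row 2
E3 = np.array([[1, 0], [0, -3 / 11]])    # multiply row 2 by -3/11
E4 = np.array([[1, -4 / 3], [0, 1]])     # add -4/3(row 2) to row 1

# Note the order: the first operation applied sits rightmost.
gauss_jordan = E4 @ E3 @ E2 @ E1 @ A
```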
Naturally, we get the same result as before, so why bother? The answer is that we're
developing an algorithm that will work in the general case. So it’s about time to formally
identify our goal in the general case. We begin with some deﬁnitions.
Deﬁnition: The leading entry of a matrix row is the ﬁrst non-zero entry in the row,
starting from the left. A row without a leading entry is a row of zeros.
Deﬁnition: The matrix R is said to be in echelon form provided that
1. The leading entry of every non-zero row is a 1.
2. If the leading entry of row i is in position k, and the next row is not a row of zeros,
then the leading entry of row i+1 is in position k+j, where j≥ 1.
3. All zero rows are at the bottom of the matrix.
The following matrices are in echelon form:

    ( 1  ∗ )     ( 1  ∗  ∗ )          ( 0  1  ∗  ∗  ∗ )
    ( 0  1 ),    ( 0  0  1 ),   and   ( 0  0  1  ∗  ∗ ).
                 ( 0  0  0 )          ( 0  0  0  1  ∗ )

Here the asterisks (∗) stand for any number at all, including 0.
Deﬁnition: The matrix R is said to be in reduced echelon form if (a) R is in echelon
form, and (b) each leading entry is the only non-zero entry in its column. The reduced
echelon form of a matrix is also called the Gauss-Jordan form.
The following matrices are in reduced row echelon form:

    ( 1  0 )     ( 1  ∗  0  ∗ )          ( 0  1  0  0  ∗ )
    ( 0  1 ),    ( 0  0  1  ∗ ),   and   ( 0  0  1  0  ∗ ).
                 ( 0  0  0  0 )          ( 0  0  0  1  ∗ )
♣ Exercise: Suppose A is 3×5. What is the maximum number of leading 1's that can appear
when it's been reduced to echelon form? Same questions for A^t. Can you generalize your
results to a statement for A_{m×n}? (State it as a theorem.)
Once a matrix has been brought to echelon form, it can be put into reduced echelon form
by cleaning out the non-zero entries in any column containing a leading 1. For example, if
    R = ( 1  2  -1  3 )
        ( 0  1   2  0 ),
        ( 0  0   0  1 )

which is in echelon form, then it can be reduced to Gauss-Jordan form by adding (-2)(row
2) to row 1, and then (-3)(row 3) to row 1. Thus

    ( 1  -2  0 ) ( 1  2  -1  3 )   ( 1  0  -5  3 )
    ( 0   1  0 ) ( 0  1   2  0 ) = ( 0  1   2  0 ),
    ( 0   0  1 ) ( 0  0   0  1 )   ( 0  0   0  1 )

and then

    ( 1  0  -3 ) ( 1  0  -5  3 )   ( 1  0  -5  0 )
    ( 0  1   0 ) ( 0  1   2  0 ) = ( 0  1   2  0 ).
    ( 0  0   1 ) ( 0  0   0  1 )   ( 0  0   0  1 )
Note that column 3 cannot be "cleaned out" since there's no leading 1 there.
3.3 The third elementary row operation
There is one more elementary row operation and corresponding elementary matrix we may
need. Suppose we want to reduce the following matrix to Gauss-Jordan form
    A = ( 2   2  -1 )
        ( 0   0   3 ).
        ( 1  -1   2 )

Multiplying row 1 by 1/2, and then adding -row 1 to row 3 leads to

    E_2 E_1 A = (  1  0  0 ) ( 1/2  0  0 ) ( 2   2  -1 )   ( 1   1  -1/2 )
                (  0  1  0 ) (  0   1  0 ) ( 0   0   3 ) = ( 0   0   3   ).
                ( -1  0  1 ) (  0   0  1 ) ( 1  -1   2 )   ( 0  -2   5/2 )
Now we can clearly do 2 more operations to get a leading 1 in the (2,3) position, and another
leading 1 in the (3,2) position. But this won't be in echelon form (why not?). We need to
interchange rows 2 and 3. This corresponds to changing the order of the equations, and
evidently doesn't change the solutions. We can accomplish this by multiplying on the left
with a matrix obtained from I by interchanging rows 2 and 3:

    E_3 E_2 E_1 A = ( 1  0  0 ) ( 1   1  -1/2 )   ( 1   1  -1/2 )
                    ( 0  0  1 ) ( 0   0   3   ) = ( 0  -2   5/2 ).
                    ( 0  1  0 ) ( 0  -2   5/2 )   ( 0   0   3   )
♣ Exercise: Without doing any written computation, write down the Gauss-Jordan form of the
matrix above.
♣ Exercise: Use elementary matrices to reduce the matrix A above
to Gauss-Jordan form. You should wind up with an expression of the form

    E_k ··· E_2 E_1 A = I.

What is another name for the matrix B = E_k ··· E_2 E_1?
Chapter 4
Elementary matrices, continued
We have identified 3 types of row operations and their corresponding elementary matrices.
To repeat the recipe: these matrices are constructed by performing the given row operation
on the identity matrix:
1. To multiply row_j(A) by the scalar c, use the matrix E obtained from I by multiplying
the j-th row of I by c.
2. To add (c)(row_j(A)) to row_k(A), use the identity matrix with its k-th row replaced by
(..., c, ..., 1, ...). Here c is in position j and the 1 is in position k. All other entries
are 0.
3. To interchange rows j and k, use the identity matrix with rows j and k interchanged.
4.1 Properties of elementary matrices
1. Elementary matrices are always square. If the operation is to be performed on
A_{m×n}, then the elementary matrix E is m×m. So the product EA has the same
dimensions as the original matrix A.
2. Elementary matrices are invertible. If E is elementary, then E^{-1} is the matrix
which undoes the operation that created E, and E^{-1}EA = IA = A; the matrix
followed by its inverse does nothing to A. For example,

    E = (  1  0 )
        ( -2  1 )

adds (-2)(row 1 of A) to row 2 of A. Its inverse is

    E^{-1} = ( 1  0 ),
             ( 2  1 )

which adds (2)(row 1 of A) to row 2 of A. You should check that the product of these
two is I_2.
• If E multiplies the second row of a 2×2 matrix by the non-zero scalar c, then

    E^{-1} = ( 1   0  ).
             ( 0  1/c )

• If E interchanges two rows, then E = E^{-1}. For instance,

    ( 0  1 )^{-1}   ( 0  1 )
    ( 1  0 )      = ( 1  0 ).
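Both inversion rules can be verified numerically; the matrices below are the 2×2 examples just discussed:

```python
import numpy as np

# Each elementary matrix is inverted by the elementary matrix that undoes it.
E_add  = np.array([[1, 0], [-2, 1]])   # adds -2(row 1) to row 2
E_back = np.array([[1, 0], [2, 1]])    # adds  2(row 1) to row 2

E_swap = np.array([[0, 1], [1, 0]])    # interchanges the two rows; its own inverse

undo_add  = E_back @ E_add    # should be the identity
undo_swap = E_swap @ E_swap   # should be the identity
```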
♣ Exercises:
1. If A is 3×4, what is the elementary matrix that (a) subtracts (7)(row_j(A)) from
row_k(A)? (b) interchanges the first and third rows? (c) multiplies row_j(A) by 2?
2. What are the inverses of the matrices in exercise 1?
3. (∗) Do elementary matrices commute? That is, does it matter in which order they're
multiplied? Give an example or two to illustrate your answer.
4. (∗) In a manner analogous to the above, define three elementary column operations
and show that they can be implemented by multiplying A_{m×n} on the right by
elementary n×n column matrices.
4.2 The algorithm for Gaussian elimination
We can now formulate the algorithm which reduces any matrix ﬁrst to row echelon form,
and then, if needed, to reduced echelon form:
1. Begin with the (1,1) entry. If it's some number a ≠ 0, divide through row 1 by a to
get a 1 in the (1,1) position. If it is zero, then interchange row 1 with another row to
get a nonzero (1,1) entry and proceed as above. If every entry in column 1 is zero,
go to the top of column 2 and, by multiplication and permuting rows if necessary, get
a 1 in the (1,2) slot. If column 2 won’t work, then go to column 3, etc. If you can’t
arrange for a leading 1 somewhere in row 1, then your original matrix was the zero
matrix, and it’s already reduced.
2. You now have a leading 1 in some column. Use this leading 1 and operations of the
type a·row_i(A) + row_k(A) → row_k(A) to replace every entry in the column below the
location of the leading 1 by 0. When you're done, the column will consist of the leading
1 with zeros below it.
3. Now move one column to the right, and one row down and attempt to repeat the
process, getting a leading 1 in this location. You may need to permute this row with
a row below it. If it’s not possible to get a non-zero entry in this position, move right
one column and try again. At the end of this second procedure, your matrix might look like

    ( 1  ∗  ∗  ∗ )
    ( 0  0  1  ∗ ),
    ( 0  0  0  ∗ )
where the second leading entry is in column 3. Notice that once a leading 1 has been
installed in the correct position and the column below this entry has been zeroed out,
none of the subsequent row operations will change any of the elements in the column.
For the matrix above, no subsequent row operations in our reduction process will
change any of the entries in the ﬁrst 3 columns.
4. The process continues until there are no more positions for leading entries – we either
run out of rows or columns or both because the matrix has only a ﬁnite number of
each. We have arrived at the row echelon form.
The three matrices below are all in row echelon form:

    ( 1  ∗  ∗  ∗  ∗ )        ( 1  ∗  ∗ )          ( 1  ∗  ∗ )
    ( 0  0  1  ∗  ∗ ),  or   ( 0  0  1 ),   or    ( 0  1  ∗ ).      (4.2)
    ( 0  0  0  1  ∗ )        ( 0  0  0 )          ( 0  0  1 )
                             ( 0  0  0 )
Remark: The description of the algorithm doesn’t involve elementary matrices. As a
practical matter, it’s much simpler to just do the row operation directly on A, instead of
writing down an elementary matrix and multiplying the matrices. But the fact that we could
do this with the elementary matrices turns out to be quite useful theoretically.
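The algorithm can be sketched as a short function that works directly on the rows, as the remark suggests. This is an illustration under our own naming (`echelon_form` is not from the notes), not an industrial-strength routine:

```python
import numpy as np

def echelon_form(A, tol=1e-12):
    """Reduce A to row echelon form (leading 1s, zeros below each leading 1),
    following the four steps described in the text."""
    M = A.astype(float).copy()
    rows, cols = M.shape
    r = 0                                 # row where the next leading 1 goes
    for c in range(cols):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, rows) if abs(M[i, c]) > tol), None)
        if pivot is None:
            continue                      # column c won't work; move right
        M[[r, pivot]] = M[[pivot, r]]     # interchange rows if necessary
        M[r] = M[r] / M[r, c]             # scale to get a leading 1
        for i in range(r + 1, rows):      # zero out the column below the leading 1
            M[i] = M[i] - M[i, c] * M[r]
        r += 1
        if r == rows:
            break
    return M
```

Applied to the augmented matrix ( 3 4 | 5 ; 2 -1 | 0 ) from Chapter 2, this reproduces the echelon form ( 1 4/3 | 5/3 ; 0 1 | 10/11 ) computed there with elementary matrices.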
♣ Exercise: Find the echelon form for each of the following:

    ( 3  4 ),   ( 0   4 ),   ( 3   2  -1  4 ),   (3, 4).
    ( 5  6 )    ( 7  -2 )    ( 2  -5   2  6 )
4.3 Remarks
(1) The leading entries progress strictly downward, from left to right. We could just as easily
have written an algorithm in which the leading entries progress downward as we move from
right to left, or upwards from left to right. Our choice is purely a matter of convention, but
this is the convention used by most people.
Definition: The matrix A is upper triangular if every entry a_ij with i > j satisfies a_ij = 0.
(2) The row echelon form of the matrix is upper triangular.
(3) To continue the reduction to Gauss-Jordan form, it is only necessary to use each leading
1 to clean out any remaining non-zero entries in its column. For the first matrix in (4.2)
above, the Gauss-Jordan form will look like

    ( 1  ∗  0  0  ∗ )
    ( 0  0  1  0  ∗ ).
    ( 0  0  0  1  ∗ )

Of course, cleaning out the columns may lead to changes in the entries labelled with ∗.
4.4 Why does the algorithm (Gaussian elimination) work?
Suppose we start with the system of equations Ax = y. The augmented matrix is (A|y),
where the coefficients of the variable x_j are the numbers in col_j(A), the 'equals' sign is
represented by the vertical line, and the last column of the augmented matrix is the right
hand side of the system.
If we multiply the augmented matrix by the elementary matrix E, we get E(A|y). But this
can also be written as (EA|Ey). For example, suppose the augmented matrix is

    ( a  b | c )
    ( d  e | f ),

and we want to add two times the first row to the second, using the elementary matrix

    E = ( 1  0 ).
        ( 2  1 )

The result is

    (  a      b    |  c    )
    ( 2a+d   2b+e  | 2c+f  ).

The first two columns of the result constitute EA, and the
last column is Ey, so E(A|y) = (EA|Ey), and this works in general.
So after multiplication by E, we have the new augmented matrix (EA|Ey), which corresponds
to the system of equations EAx = Ey. Now suppose x is a solution to Ax = y.
Multiplication of this equation by E gives EAx = Ey, so x solves this new system. And
conversely, since E is invertible, if x solves the new system EAx = Ey, multiplication by
E^{-1} gives Ax = y, so x solves the original system. We have just proven the
Theorem: Elementary row operations applied to either Ax = y or the corresponding augmented
matrix (A|y) don't change the set of solutions to the system.
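The theorem can be illustrated numerically: multiplying a system by an invertible elementary matrix leaves the solution set unchanged (the concrete system and matrix E below are our own choices for the illustration):

```python
import numpy as np

# The system (2.1): 3x + 4y = 5, 2x - y = 0.
A = np.array([[3.0, 4.0], [2.0, -1.0]])
y = np.array([5.0, 0.0])

# An elementary matrix: adds 2(row 1) to row 2.
E = np.array([[1.0, 0.0], [2.0, 1.0]])

x_orig = np.linalg.solve(A, y)          # solution of  Ax  = y
x_new  = np.linalg.solve(E @ A, E @ y)  # solution of EAx = Ey
```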