Linear algebra 1 lecture notes

Linear Algebra I
Course No. 100221
Fall 2006
Michael Stoll
With corrections, version of December 10, 2007

Contents
1. Preliminaries
2. What Is Linear Algebra?
3. Vector Spaces
4. Fields
5. Linear Subspaces and Linear Hulls
6. Linear Independence and Dimension
7. Digression: Infinite-Dimensional Vector Spaces and Zorn's Lemma
8. Linear Maps
9. Quotient Spaces
10. Digression: Finite Fields
11. Matrices
12. Computations with Matrices: Row and Column Operations
13. Linear Equations
14. Matrices and Linear Maps
15. Determinants
16. The Determinant and the Symmetric Group
17. Eigenvalues and Eigenvectors
References

1. Preliminaries

In the following, we will assume that you are familiar with the basic notions and notations of ('naive') set theory. This includes, for example, the following.

• Notation for sets: {1, 2, 3}, {x ∈ S : some property of x}, the empty set ∅.
• Symbols for the set of natural numbers: N = {1, 2, 3, ...}, N_0 = {0, 1, 2, 3, ...}, integers: Z, rational numbers: Q, real numbers: R, and complex numbers: C (which we will introduce shortly).
• Notations x ∈ S (x is an element of S), A ∩ B (intersection of A and B), A ∪ B (union of A and B), A \ B = {x ∈ A : x ∉ B} (set difference of A and B), A ⊂ B (A is a subset of B; this includes the case A = B — my convention), A ⊊ B (A is a proper subset of B).
• Pairs (a, b), triples (a, b, c) and general n-tuples (a_1, a_2, ..., a_n), and cartesian products A × B = {(a, b) : a ∈ A, b ∈ B} etc.
• The notion of a map f : A → B and properties like injectivity, surjectivity, bijectivity (we will recall these when we need them).
• The notion of an equivalence relation on a set S: formally, a relation on S is a subset R of S × S. For simplicity, we write a ∼ b for (a, b) ∈ R. Then R is an equivalence relation if it is reflexive (a ∼ a for all a ∈ S), symmetric (if a ∼ b, then b ∼ a), and transitive (if a ∼ b and b ∼ c, then a ∼ c). Given an equivalence relation on S, the set S has a natural decomposition into equivalence classes, and we can consider the quotient set S/∼: the set of all equivalence classes. We will recall this in more detail later.

We will also use notation like the following.
⋃_{i=1}^n A_i = A_1 ∪ A_2 ∪ ··· ∪ A_n, with ⋃_{i=1}^0 A_i = ∅
⋂_{i=1}^n A_i = A_1 ∩ A_2 ∩ ··· ∩ A_n; ⋂_{i=1}^0 A_i usually does not make sense
⋃_{A∈S} A is the union of all the sets A that are elements of S
∑_{i=1}^n a_i = a_1 + a_2 + ··· + a_n, and is zero for n = 0
∏_{i=1}^n a_i = a_1 a_2 ··· a_n, and is one for n = 0
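As a small aside, the conventions for empty sums and empty products above are the same ones built into most programming languages; a minimal Python illustration:

```python
import math

# An empty sum is 0 and an empty product is 1, exactly as in the
# conventions for ∑ and ∏ stated above.
print(sum([]))        # 0, the empty sum
print(math.prod([]))  # 1, the empty product
```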
2. What Is Linear Algebra?

Linear Algebra is the theory of 'linear structures'. So what is a linear structure? Well, the notion of linearity involves addition — you want to be able to form sums, and this addition should behave in the way you expect (it is commutative (x + y = y + x) and associative ((x + y) + z = x + (y + z)), has a zero element, and every element has a negative) — and multiplication, not between the elements of your structure, but by 'scalars', for example, real numbers. This scalar multiplication should also behave reasonably (you want to have 1·x = x and λ·(μ·x) = (λμ)·x) and be compatible with the addition (in the sense that both distributive laws hold: (λ + μ)x = λx + μx, λ(x + y) = λx + λy). We will make this precise in the next section.

A set with a linear structure in the sense of our discussion is called a linear space or vector space. So Linear Algebra studies these linear spaces and the maps between them that are compatible with the linear structure: linear maps. This may sound somewhat abstract, and indeed, it is. However, it is exactly this level of abstraction that makes Linear Algebra an extremely useful tool. The reason for this is that linear structures abound in mathematics, and so Linear Algebra has applications everywhere (see below). It is this method of abstraction that extracts the common features of various situations to create a general theory, which forms the basis of all essential progress in mathematics. It took the mathematicians quite a long time before they came up with our modern clean formulation of the theory.

As already hinted at above, many other structures in mathematics are built on linear spaces, usually by putting some additional structure on them. Here are some examples. Even though the various notions probably do not have much of a meaning to you now, this list should show you that you can expect to use what you will learn in this course quite heavily in your further studies.

• Geometry. First of all, there is of course the elementary analytic geometry of the plane and space, which is based on the fact that plane and space can be turned into linear spaces by choosing a point as the origin. Usually one adds some structure that allows us to talk about distances and angles. This is based on the 'dot product', which is a special case of an 'inner product', which turns a linear space into a Euclidean vector space. We will discuss this in detail later in the course.
A more advanced branch of geometry is Differential Geometry, which studies 'smooth' curves, surfaces and higher-dimensional 'manifolds'. Every point on such a manifold has a 'tangent space', which is a linear space, and there are many other linear spaces associated in a natural way to a manifold (for example, spaces of differential forms).

• Analysis. The notion of derivative is based on linearity — the derivative describes the best linear approximation to a function at a given point. In the case of functions of one variable, this comes down to the slope of the tangent line to the graph. However, the concept can be generalized to functions in several variables (which are functions defined on a vector space) and even to functions between manifolds. Also, the operations of differentiating and integrating functions are linear. Finally, spaces of functions usually carry a structure as a linear space (for example, the space of differentiable functions on the real line). Functional Analysis is concerned with these function spaces; usually one adds some structure by introducing a suitable 'norm', leading to Banach Spaces and Hilbert Spaces.

• Algebra. Algebra studies more general algebraic structures (like groups, rings, fields), many of which are based on linear spaces, like for example Lie Algebras. Then there is a whole bunch of so-called homology and cohomology theories involving (co-)homology groups, which are in many cases linear spaces. Such groups occur, for example, in Algebraic Topology, where they can be used to show that two spaces are essentially distinct from each other, in that one cannot continuously deform one into the other. They also play an important role in Algebraic Geometry.

Another, somewhat different, application (which may be interesting to the EECS, CS and ECE students) is to Coding Theory, where one tries to construct good 'error-correcting codes'. Essentially all the codes that are considered are linear codes, which means that the codewords form a vector space (where the scalar multiplication is not by real numbers, but by elements from a 'finite field' like the field F_2 = {0, 1}); see the sketch below.
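To make the coding-theory remark slightly more concrete, here is a minimal sketch; the three-fold repetition code is my choice of example, not one from the notes. Over F_2, vector addition is bitwise XOR, and the set of codewords is closed under it:

```python
from itertools import product

# Sketch: the 3-fold repetition code over F_2 = {0, 1}.
# Its codewords form a linear code: a vector subspace of F_2^3.
codewords = {(0, 0, 0), (1, 1, 1)}

def add(u, v):
    """Vector addition in F_2^n: component-wise addition mod 2 (XOR)."""
    return tuple((a + b) % 2 for a, b in zip(u, v))

# Closure under addition: the sum of any two codewords is a codeword.
assert all(add(u, v) in codewords for u, v in product(codewords, repeat=2))
# Scalar multiplication by 0 and 1 also stays inside.
assert all(tuple((s * a) % 2 for a in u) in codewords
           for s in (0, 1) for u in codewords)
print("The repetition code is a linear subspace of F_2^3.")
```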
To illustrate the versatility of Linear Algebra, let me write down a few linear equations from various areas of mathematics. Even though they involve objects of quite different nature, the general theory applies to all of them to provide (for example) general information on the structure of the solution set.

• A system of linear equations in real numbers:
x + y = 3, x + z = 5, y + z = 6.

• A linear recurrence relation for a sequence (F_n)_{n≥0} of real numbers (the Fibonacci numbers provide one solution):
F_{n+2} = F_{n+1} + F_n for all n ≥ 0.

• A linear ordinary differential equation for a (twice differentiable) function y : R → R (the sine and cosine functions solve it):
y'' + y = 0.

• A linear partial differential equation for a (twice differentiable) function f of time t and space x, y, z (the Heat Equation):
∂f/∂t = −∂²f/∂x² − ∂²f/∂y² − ∂²f/∂z².

The fact that Linear Algebra applies to all of them accounts for the general feeling that linear equations are 'easy', whereas nonlinear equations are 'difficult'.

3. Vector Spaces

In this section, we will give the complete formal definition of what a (real) vector space or linear space is. 'Real' here refers to the fact that the scalars are real numbers.

3.1. Definition. A real vector space or linear space over R is a set V, together with two maps + : V × V → V ('addition') and · : R × V → V ('scalar multiplication') satisfying the following axioms. Note that for λ·x, we usually write λx.
(1) For all x, y ∈ V, x + y = y + x (addition is commutative).
(2) For all x, y, z ∈ V, (x + y) + z = x + (y + z) (addition is associative).
(3) There is a 0 ∈ V such that for all x ∈ V, x + 0 = x (existence of a zero element).
(4) For every x ∈ V, there is −x ∈ V such that x + (−x) = 0 (existence of negatives).
(5) For all λ, μ ∈ R and x ∈ V, λ·(μ·x) = (λμ)·x (scalar multiplication is associative).
(6) For all x ∈ V, 1·x = x (multiplication by 1 is the identity).
(7) For all λ ∈ R and x, y ∈ V, λ(x + y) = λx + λy (distributivity I).
(8) For all λ, μ ∈ R and x ∈ V, (λ + μ)x = λx + μx (distributivity II).
The elements of a vector space are usually called vectors.

3.2. Remarks.
(1) The first four axioms above exactly state that (V, +) is an (additive) abelian group. (If you didn't know what an abelian group is, then this is the definition.)
(2) Instead of writing (V, +, ·) (which is the complete data for a vector space), we usually just write V, with the addition and scalar multiplication being understood.

Before we continue with the general theory, we should look at some examples.

3.3. Example. The simplest (and perhaps least interesting) example of a vector space is V = {0}, with addition given by 0 + 0 = 0 and scalar multiplication by λ·0 = 0 (these are the only possible choices). Trivial as it may seem, this vector space, called the zero space, is important. It plays a role in Linear Algebra similar to the role played by the empty set in Set Theory.

3.4. Example. The next (still not very interesting) example is V = R, with addition and multiplication the usual ones on real numbers. The axioms above in this case just reduce to well-known rules for computing with real numbers.

3.5. Example. Now we come to a very important example, which is the model of a vector space. We consider V = R^n, the set of n-tuples of real numbers. We define addition and scalar multiplication 'component-wise':
(x_1, x_2, ..., x_n) + (y_1, y_2, ..., y_n) = (x_1 + y_1, x_2 + y_2, ..., x_n + y_n)
λ·(x_1, x_2, ..., x_n) = (λx_1, λx_2, ..., λx_n)
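These component-wise operations are easy to model directly; a minimal sketch (the tuple-based helper names are mine), spot-checking the first distributive law on one example:

```python
def vec_add(x, y):
    """Component-wise addition in R^n."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def vec_scale(lam, x):
    """Scalar multiplication in R^n."""
    return tuple(lam * xi for xi in x)

x, y = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
lam = 2.5
# Spot check of the first distributive law, λ(x + y) = λx + λy:
assert vec_scale(lam, vec_add(x, y)) == vec_add(vec_scale(lam, x),
                                                vec_scale(lam, y))
```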
Of course, we now have to prove that our eight axioms are satisfied by our choice of (V, +, ·). In this case, this is very easy, since everything reduces to known facts about real numbers. As an example, let us show that the first distributive law is satisfied:
λ·((x_1, x_2, ..., x_n) + (y_1, y_2, ..., y_n)) = λ·(x_1 + y_1, x_2 + y_2, ..., x_n + y_n)
= (λ(x_1 + y_1), λ(x_2 + y_2), ..., λ(x_n + y_n))
= (λx_1 + λy_1, λx_2 + λy_2, ..., λx_n + λy_n)
= (λx_1, λx_2, ..., λx_n) + (λy_1, λy_2, ..., λy_n)
= λ(x_1, x_2, ..., x_n) + λ(y_1, y_2, ..., y_n).
Of course, for n = 2 and n = 3, this is more or less what you know as 'vectors' from high school. For n = 1, this example reduces to the previous one (if one identifies 1-tuples (x) with elements x); for n = 0, it reduces to the zero space. (Why? Well, like an empty product of numbers should have the value 1, an empty product of sets like R^0 has exactly one element, the empty tuple (), which we can call 0 here.)

3.6. Exercise. Write down proofs for the other seven axioms.

3.7. Example. The preceding example can be generalized. Note that we can identify R^n with the set of maps f : {1, 2, ..., n} → R, by letting such a map f correspond to the n-tuple (f(1), f(2), ..., f(n)). But there is no need to limit ourselves to these specific sets. So let us consider any set X and look at the set of all maps (or functions) from X to R:
V = R^X = {f : X → R}.
In order to get a vector space, we have to define addition and scalar multiplication. To define addition, for every pair of functions f, g : X → R, we have to define a new function f + g : X → R. The only reasonable way to do this is as follows ('point-wise'):
f + g : X → R, x ↦ f(x) + g(x),
or, in a more condensed form, by writing (f + g)(x) = f(x) + g(x). (Make sure that you understand these notations.) In a similar way, we define scalar multiplication:
λf : X → R, x ↦ λf(x).
We then have to check the axioms in order to convince ourselves that we really get a vector space. Let us again do the first distributive law as an example. We have to check that λ(f + g) = λf + λg, which means that for all x ∈ X, we have (λ(f + g))(x) = (λf + λg)(x). So let λ ∈ R, f, g : X → R, and x ∈ X. Then
(λ(f + g))(x) = λ((f + g)(x)) = λ(f(x) + g(x)) = λf(x) + λg(x) = (λf)(x) + (λg)(x) = (λf + λg)(x).
Note the parallelism of this proof with the one in the previous example. What do we get when X is the empty set?

3.8. Exercise. Write down proofs for the other seven axioms.

Before we can continue, we have to deal with a few little things.

3.9. Remark. In a vector space V, there is only one zero element.

Proof. The way to show that there is only one element with a given property is to assume there are two and then to show they are equal. So assume 0 and 0' were two zero elements of V. Then we must have
0 = 0 + 0' = 0' + 0 = 0'
(using axiom (3) for 0', axiom (1), and axiom (3) for 0, respectively), so 0 = 0'. □

3.10. Remark. In a vector space V, there is a unique negative for each element.

Proof. Let x ∈ V and assume that a, b ∈ V are both negatives of x, i.e., x + a = 0, x + b = 0. Then
a = a + 0 = a + (x + b) = (a + x) + b = (x + a) + b = 0 + b = b + 0 = b,
so a = b. □

3.11. Notation. As usual, we write x − y for x + (−y).

Here are some more harmless facts.

3.12. Remarks. Let (V, +, ·) be a real vector space.
(1) For all x ∈ V, we have 0·x = 0.
(2) For all x ∈ V, we have (−1)·x = −x.
(3) For λ ∈ R and x ∈ V such that λx = 0, we must have λ = 0 or x = 0.

Proof. Exercise. □
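The point-wise operations on R^X translate directly into code as well; a small sketch (the helper names are mine), spot-checking the distributive law at a few sample points:

```python
def f_add(f, g):
    """Point-wise sum of two functions X -> R."""
    return lambda x: f(x) + g(x)

def f_scale(lam, f):
    """Point-wise scalar multiple of a function X -> R."""
    return lambda x: lam * f(x)

f = lambda x: x * x
g = lambda x: 3.0 * x + 1.0
lam = 2.0
# λ(f + g) and λf + λg should agree at every point; we spot-check a few.
lhs = f_scale(lam, f_add(f, g))
rhs = f_add(f_scale(lam, f), f_scale(lam, g))
assert all(lhs(x) == rhs(x) for x in (-1.0, 0.0, 0.5, 2.0))
```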
4. Fields

So far, we have required our scalars to be real numbers. It turns out, however, that we can be quite a bit more general without adding any complications, by replacing the real numbers with other structures with similar properties. These structures are called fields.

In the formulation of the axioms below, we will use the shorthands '∀x ∈ X : ...' and '∃x ∈ X : ...' for the phrases 'for all x ∈ X, we have ...' and 'there exists some x ∈ X such that ...'.

4.1. Definition. A field is a set F, together with two maps + : F × F → F ('addition') and · : F × F → F ('multiplication'), written, as usual, (λ, μ) ↦ λ + μ and (λ, μ) ↦ λ·μ or λμ, respectively, satisfying the following axioms.
(1) ∀λ, μ ∈ F : λ + μ = μ + λ.
(2) ∀λ, μ, ν ∈ F : (λ + μ) + ν = λ + (μ + ν).
(3) ∃0 ∈ F ∀λ ∈ F : λ + 0 = λ.
(4) ∀λ ∈ F ∃−λ ∈ F : λ + (−λ) = 0.
(5) ∀λ, μ ∈ F : λμ = μλ.
(6) ∀λ, μ, ν ∈ F : (λμ)ν = λ(μν).
(7) ∃1 ∈ F : 1 ≠ 0 and ∀λ ∈ F : 1·λ = λ.
(8) ∀λ ∈ F \ {0} ∃λ^{-1} ∈ F : λ^{-1}λ = 1.
(9) ∀λ, μ, ν ∈ F : λ(μ + ν) = λμ + λν.

Well-known examples of fields are the field Q of rational numbers and the field R of real numbers. We will introduce the field C of complex numbers shortly. But there are other examples of fields. For example, the smallest possible field just has the required elements 0 and 1, with the only 'interesting' definition being that 1 + 1 = 0.

4.2. Exercise. Check the axioms for this field F_2 = {0, 1}.

As before for vector spaces, we have some simple statements that easily follow from the axioms.

4.3. Remarks. Let F be a field.
(1) The zero 0 and unit 1 of F are uniquely determined. Likewise, for every λ ∈ F, −λ is uniquely determined, and for every λ ∈ F \ {0}, λ^{-1} is uniquely determined.
(2) We have 0·λ = 0 and (−1)λ = −λ for all λ ∈ F. In particular (taking λ = −1), (−1)(−1) = 1.
(3) If λμ = 0 for λ, μ ∈ F, then λ = 0 or μ = 0 (or both).

Proof. Exercise. □

4.4. Remark. It is perhaps easier to remember the field axioms in the following way.
(1) (F, +) is an (additive) abelian group with zero element 0, called the additive group of F.
(2) (F \ {0}, ·) is a (multiplicative) abelian group, called the multiplicative group of F. (But note that the multiplication map is defined on all of F, not just on F \ {0}.)
(3) The distributive law holds: ∀λ, μ, ν ∈ F : λ(μ + ν) = λμ + λν.

Given this, we can now define the notion of an F-vector space for an arbitrary field F.

4.5. Definition. Let F be a field. An F-vector space or linear space over F is a set V, together with two maps + : V × V → V ('addition') and · : F × V → V ('scalar multiplication'), satisfying the vector space axioms given in Def. 3.1 with R replaced by F.

Note that in order to prove the statements in Remarks 3.9, 3.10 and 3.12, we only had to use that R is a field. Hence these statements are valid for F-vector spaces in general. The same observation applies to our examples of real vector spaces.

4.6. Examples.
(1) V = {0}, with the unique addition and scalar multiplication maps, is an F-vector space, again called the zero space (over F).
(2) F, with the addition and multiplication from its structure as a field, is an F-vector space.
(3) F^n, with addition and scalar multiplication defined component-wise, is an F-vector space.
(4) For any set X, the set F^X of all maps X → F, with addition and scalar multiplication defined point-wise, is an F-vector space.

4.7. Example. There are other examples that may appear more strange. Let X be any set, and let V be the set of all subsets of X. (For example, if X = {a, b}, then V has the four elements ∅, {a}, {b}, {a, b}.) We define addition on V as the symmetric difference: A + B = (A \ B) ∪ (B \ A) (this is the set of elements of X that are in exactly one of A and B). We define scalar multiplication by elements of F_2 in the only possible way: 0·A = ∅, 1·A = A. These operations turn V into an F_2-vector space.

To prove this assertion, we can check the vector space axioms (this is an instructive exercise; see also the sketch below).
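A quick computational spot check of Example 4.7 (a sketch using Python's built-in set type; the check is exhaustive here because X is small):

```python
from itertools import product

# V: all subsets of X = {a, b}, represented as frozensets.
V = [frozenset(), frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})]

def add(A, B):
    """Addition on V: the symmetric difference (A \\ B) ∪ (B \\ A)."""
    return A ^ B

# Closure and commutativity of addition, checked exhaustively.
assert all(add(A, B) in V and add(A, B) == add(B, A)
           for A, B in product(V, V))
# The empty set is the zero element, and every A is its own negative
# (consistent with 1 + 1 = 0 in F_2):
assert all(add(A, frozenset()) == A and add(A, A) == frozenset() for A in V)
```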
An alternative (and perhaps more elegant) way is to note that subsets of X correspond to maps X → F_2 (a map f corresponds to the subset {x ∈ X : f(x) = 1}) — there is a bijection between V and F_2^X — and this correspondence translates the addition and scalar multiplication we have defined on V into the ones we had defined earlier on F_2^X.

4.8. The Field of Complex Numbers. Besides real vector spaces, complex vector spaces are very important in many applications. These are linear spaces over the field of complex numbers, which we now introduce.

The first motivation for the introduction of complex numbers is a shortcoming of the real numbers: while positive real numbers have real square roots, negative real numbers do not. Since it is frequently desirable to be able to work with solutions to equations like x² + 1 = 0, we introduce a new number, called i, that has the property i² = −1. The set C of complex numbers then consists of all expressions a + bi, where a and b are real numbers. (More formally, one considers pairs of real numbers (a, b) and so identifies C with R² as sets.) In order to turn C into a field, we have to define addition and multiplication. If we want the multiplication to be compatible with the scalar multiplication on the real vector space R², then (bearing in mind the field axioms) there is no choice: we have to set
(a + bi) + (c + di) = (a + c) + (b + d)i
and
(a + bi)(c + di) = ac + adi + bci + bdi² = (ac − bd) + (ad + bc)i
(remember i² = −1). It is then an easy, but tedious, matter to show that the axioms hold. (Later, in the "Introductory Algebra" course, you will learn that there is a rather elegant way of doing this.)

If z = a + bi as above, then we call Re z = a the real part and Im z = b the imaginary part of z.

The least straightforward statement is probably the existence of multiplicative inverses. In this context, it is advantageous to introduce the notion of conjugate complex number.

4.9. Definition. If z = a + bi ∈ C, then the complex conjugate of z is z̄ = a − bi.

Note that zz̄ = a² + b² ≥ 0. We set |z| = √(zz̄); this is called the absolute value or modulus of z. It is clear that |z| = 0 only for z = 0; otherwise |z| > 0. We obviously have |z̄| = |z|, and the conjugate of z̄ is z again.

4.10. Remark.
(1) For all w, z ∈ C, the conjugate of w + z is w̄ + z̄, and the conjugate of wz is w̄z̄.
(2) For all z ∈ C \ {0}, we have z^{-1} = |z|^{-2}·z̄.
(3) For all w, z ∈ C, we have |wz| = |w|·|z|.

Proof.
(1) Exercise.
(2) First of all, |z| ≠ 0, so the expression makes sense. Now note that
(|z|^{-2}·z̄)·z = |z|^{-2}·zz̄ = |z|^{-2}·|z|² = 1.
(3) Exercise. □

For example:
1/(1 + 2i) = (1 − 2i)/((1 + 2i)(1 − 2i)) = (1 − 2i)/(1² + 2²) = (1 − 2i)/5 = 1/5 − (2/5)i.

4.11. Remark. Historically, the necessity of introducing complex numbers was realized through the study of cubic (and not quadratic) equations. The reason for this is that there is a solution formula for cubic equations that in some cases requires complex numbers in order to express a real solution. See Section 2.7 in Jänich's book [J].

The importance of the field of complex numbers lies in the fact that they provide solutions to all polynomial equations. This is the 'Fundamental Theorem of Algebra':
Every non-constant polynomial with complex coefficients has a root in C.
We will have occasion to use it later on. A proof, however, is beyond the scope of this course.

4.12. Definition. A complex vector space is a linear space over C.
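The inverse formula z^{-1} = |z|^{-2}·z̄ from Remark 4.10 is easy to check numerically; a small sketch using Python's built-in complex type, on the worked example z = 1 + 2i:

```python
# z^{-1} = conj(z) / |z|^2, using zz̄ = |z|^2 to avoid a square root.
z = 1 + 2j
inv = z.conjugate() / (z * z.conjugate()).real   # (1 - 2i) / 5
print(inv)                          # (0.2-0.4j), i.e. 1/5 - (2/5)i
assert abs(z * inv - 1) < 1e-12     # z * z^{-1} = 1 up to rounding
```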
5. Linear Subspaces and Linear Hulls

In many applications, we do not want to consider all elements of a given vector space V; rather, we only consider elements of a certain subset. Usually, it is desirable that this subset is again a vector space (with the addition and scalar multiplication it 'inherits' from V). In order for this to be possible, a minimal requirement certainly is that addition and scalar multiplication make sense on the subset.

5.1. Definition. Let V be an F-vector space. A subset U ⊂ V is called a vector subspace or linear subspace of V if it has the following properties.
(1) 0 ∈ U.
(2) If u_1, u_2 ∈ U, then u_1 + u_2 ∈ U.
(3) If λ ∈ F and u ∈ U, then λu ∈ U.
Here the addition and scalar multiplication are those of V.

Note that, given the third property, the first is equivalent to saying that U is non-empty. Indeed, let u ∈ U; then by (3), 0 = 0·u ∈ U.

We should justify the name 'subspace'.

5.2. Lemma. Let (V, +, ·) be an F-vector space. If U ⊂ V is a linear subspace of V, then (U, +|_{U×U}, ·|_{F×U}) is again an F-vector space.

The notation +|_{U×U} means that we take the addition map + : V × V → V, but restrict it to U × U. (Strictly speaking, we also restrict its target set from V to U. However, this is usually suppressed in the notation.)

Proof. By definition of what a linear subspace is, we really have well-defined addition and scalar multiplication maps on U. It remains to check the axioms. For the axioms that state 'for all ..., we have ...' and do not involve any existence statements, this is clear, since they hold (by assumption) even for all elements of V. This covers axioms (1), (2), (5), (6), (7) and (8). Axiom (3) is OK, since we require 0 ∈ U. Finally, for axiom (4), we need that for all u ∈ U, −u ∈ U as well. But −u = (−1)u, so this is OK by the third property of linear subspaces. □

It is time for some examples.

5.3. Examples. Let V be a vector space. Then {0} ⊂ V and V itself are linear subspaces of V.

5.4. Example. Consider V = R² and, for a ∈ R, U_a = {(x, y) ∈ R² : x + y = a}. When is U_a a linear subspace?

We check the first condition: 0 = (0, 0) ∈ U_a ⟺ 0 + 0 = a, so U_a can only be a linear subspace when a = 0. Let us check the other properties for U_0:
(x_1, y_1), (x_2, y_2) ∈ U_0 ⟹ x_1 + y_1 = 0, x_2 + y_2 = 0
⟹ (x_1 + x_2) + (y_1 + y_2) = 0
⟹ (x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2) ∈ U_0
and
λ ∈ R, (x, y) ∈ U_0 ⟹ x + y = 0
⟹ λx + λy = λ(x + y) = 0
⟹ λ(x, y) = (λx, λy) ∈ U_0.

5.5. Examples. Consider V = R^R = {f : R → R}, the set of real-valued functions on R. You will learn in Analysis that if f and g are continuous functions, then f + g is again continuous, and λf is continuous for any λ ∈ R. Of course, the zero function x ↦ 0 is continuous as well. Hence, the set of all continuous functions
C(R) = {f : R → R | f is continuous}
is a linear subspace of V.

Similarly, you will learn that sums and scalar multiples of differentiable functions are again differentiable. Also, derivatives respect sums and scalar multiplication: (f + g)' = f' + g', (λf)' = λf'. From this, we conclude that
C^n(R) = {f : R → R | f is n times differentiable and f^{(n)} is continuous}
is again a linear subspace of V.

In a different direction, consider the set of all periodic functions (with period 1):
U = {f : R → R | f(x + 1) = f(x) for all x ∈ R}.
The zero function is certainly periodic. If f and g are periodic, then
(f + g)(x + 1) = f(x + 1) + g(x + 1) = f(x) + g(x) = (f + g)(x),
so f + g is again periodic. Similarly, λf is periodic (for λ ∈ R). So U is a linear subspace of V.

The following result now tells us that U ∩ C(R), the set of all continuous periodic functions, is again a linear subspace.
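The closure computations for U_0 in Example 5.4 can also be spot-checked numerically; a minimal sketch (the helper names are mine):

```python
import random

def in_U0(v, tol=1e-12):
    """Membership test for U_0 = {(x, y) in R^2 : x + y = 0}."""
    return abs(v[0] + v[1]) < tol

# Random spot checks of the three subspace properties of Definition 5.1.
random.seed(0)
for _ in range(100):
    s, t, lam = (random.uniform(-5, 5) for _ in range(3))
    u1, u2 = (s, -s), (t, -t)                       # elements of U_0
    assert in_U0((0.0, 0.0))                        # 0 ∈ U_0
    assert in_U0((u1[0] + u2[0], u1[1] + u2[1]))    # closed under +
    assert in_U0((lam * u1[0], lam * u1[1]))        # closed under scalars
```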
5.6. Lemma. Let V be an F-vector space, and let U_1, U_2 ⊂ V be linear subspaces of V. Then U_1 ∩ U_2 is again a linear subspace of V.

More generally, if (U_i)_{i∈I} (with I ≠ ∅) is any family of linear subspaces of V, then their intersection U = ⋂_{i∈I} U_i is again a linear subspace of V.

Proof. It is sufficient to prove the second statement (take I = {1, 2} to obtain the first). We check the conditions.
(1) By assumption, 0 ∈ U_i for all i ∈ I. So 0 ∈ U.
(2) Let u_1, u_2 ∈ U. Then u_1, u_2 ∈ U_i for all i ∈ I, hence (by assumption) u_1 + u_2 ∈ U_i for all i ∈ I. But this means u_1 + u_2 ∈ U.
(3) Let λ ∈ F, u ∈ U. Then u ∈ U_i for all i ∈ I, hence (by assumption) λu ∈ U_i for all i ∈ I. This means that λu ∈ U. □

Note that in general, if U_1 and U_2 are linear subspaces, then U_1 ∪ U_2 is not (it is if and only if U_1 ⊂ U_2 or U_2 ⊂ U_1 — Exercise).

The property we just proved is very important, since it tells us that there is always a smallest linear subspace of V that contains a given subset S of V. This means that there is a linear subspace U of V such that S ⊂ U and such that U is contained in every other linear subspace of V that contains S.

5.7. Definition. Let V be a vector space, S ⊂ V a subset. The linear hull or linear span of S, or the linear subspace generated by S, is
L(S) = ⋂ {U ⊂ V : U linear subspace of V, S ⊂ U}.
(This notation means the intersection of all elements of the specified set: we intersect all linear subspaces containing S. Note that V itself is such a subspace, so this set of subspaces is non-empty, so by the preceding result, L(S) really is a linear subspace.)

If we want to indicate the field F of scalars, we write L_F(S). If v_1, v_2, ..., v_n ∈ V, we also write L(v_1, v_2, ..., v_n) for L({v_1, v_2, ..., v_n}).

If L(S) = V, we say that S generates V, or that S is a generating set for V.

Be aware that there are various different notations for linear hulls in the literature, for example ⟨S⟩ (which in LaTeX is written \langle S \rangle and not <S>).

5.8. Example. What do we get in the extreme case that S = ∅? Well, then we have to intersect all linear subspaces of V, so we get L(∅) = {0}.

Definition 5.7 above has some advantages and disadvantages. Its main advantage is that it is very elegant. Its main disadvantage is that it is rather abstract and non-constructive. To remedy this, we show that we can build the linear hull in a constructive way "from below" instead of abstractly "from above."

5.9. Example. Let us look at a specific case first. Given v_1, v_2 ∈ V, how can we describe L(v_1, v_2)?

According to the definition of linear subspaces, we must be able to add and multiply by scalars in L(v_1, v_2); also v_1, v_2 ∈ L(v_1, v_2). This implies that every element of the form λ_1 v_1 + λ_2 v_2 must be in L(v_1, v_2). So set
U = {λ_1 v_1 + λ_2 v_2 : λ_1, λ_2 ∈ F}
(where F is the field of scalars); then U ⊂ L(v_1, v_2). On the other hand, U is itself a linear subspace:
0 = 0·v_1 + 0·v_2 ∈ U
(λ_1 + μ_1)v_1 + (λ_2 + μ_2)v_2 = (λ_1 v_1 + λ_2 v_2) + (μ_1 v_1 + μ_2 v_2) ∈ U
(λλ_1)v_1 + (λλ_2)v_2 = λ(λ_1 v_1 + λ_2 v_2) ∈ U
(Exercise: which of the vector space axioms have we used where?)
Therefore, U is a linear subspace containing v_1 and v_2, and hence L(v_1, v_2) ⊂ U by our definition. We conclude that
L(v_1, v_2) = U = {λ_1 v_1 + λ_2 v_2 : λ_1, λ_2 ∈ F}.
This observation generalizes.

5.10. Definition. Let V be an F-vector space, v_1, v_2, ..., v_n ∈ V. A linear combination (or, more precisely, an F-linear combination) of v_1, v_2, ..., v_n is an element of V of the form
v = λ_1 v_1 + λ_2 v_2 + ··· + λ_n v_n
with λ_1, λ_2, ..., λ_n ∈ F. If n = 0, then the only linear combination of no vectors is (by definition) 0 ∈ V.

If S ⊂ V is any subset, then an (F-)linear combination on S is a linear combination of finitely many elements of S.
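In R^n, deciding whether a given w lies in L(v_1, v_2) amounts to solving the linear system λ_1 v_1 + λ_2 v_2 = w; a sketch assuming numpy is available:

```python
import numpy as np

v1, v2 = np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 1.0])
w = np.array([2.0, 3.0, 5.0])

# Solve lambda_1 * v1 + lambda_2 * v2 = w in the least-squares sense;
# w lies in the span exactly when the residual is (numerically) zero.
A = np.column_stack([v1, v2])
coeffs, *_ = np.linalg.lstsq(A, w, rcond=None)
residual = np.linalg.norm(A @ coeffs - w)
print(coeffs, residual)   # [2. 3.], ~0  =>  w = 2*v1 + 3*v2 ∈ L(v1, v2)
```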
5.11. Proposition. Let V be a vector space, v_1, v_2, ..., v_n ∈ V. Then the set of all linear combinations of v_1, v_2, ..., v_n is a linear subspace of V; it equals the linear hull L(v_1, v_2, ..., v_n).

More generally, let S ⊂ V be a subset. Then the set of all linear combinations on S is a linear subspace of V, equal to L(S).

Proof. Let U be the set of all linear combinations of v_1, v_2, ..., v_n. We have to check that U is a linear subspace of V. First of all, 0 ∈ U, since 0 = 0v_1 + 0v_2 + ··· + 0v_n (this even works for n = 0). To check that U is closed under addition, let v = λ_1 v_1 + λ_2 v_2 + ··· + λ_n v_n and w = μ_1 v_1 + μ_2 v_2 + ··· + μ_n v_n be two elements of U. Then
v + w = (λ_1 v_1 + λ_2 v_2 + ··· + λ_n v_n) + (μ_1 v_1 + μ_2 v_2 + ··· + μ_n v_n)
= (λ_1 + μ_1)v_1 + (λ_2 + μ_2)v_2 + ··· + (λ_n + μ_n)v_n
is again a linear combination of v_1, v_2, ..., v_n. Also, for λ ∈ F,
λv = λ(λ_1 v_1 + λ_2 v_2 + ··· + λ_n v_n) = (λλ_1)v_1 + (λλ_2)v_2 + ··· + (λλ_n)v_n
is a linear combination of v_1, v_2, ..., v_n. So U is indeed a linear subspace of V. We have v_1, v_2, ..., v_n ∈ U, since
v_j = 0·v_1 + ··· + 0·v_{j−1} + 1·v_j + 0·v_{j+1} + ··· + 0·v_n,
so, by definition of the linear hull, L(v_1, v_2, ..., v_n) ⊂ U. On the other hand, it is clear that any linear subspace containing v_1, v_2, ..., v_n has to contain all linear combinations of these vectors, hence U ⊂ L(v_1, v_2, ..., v_n). Therefore
L(v_1, v_2, ..., v_n) = U.
For the general case, the only possible problem is with checking that the set of linear combinations on S is closed under addition. For this, we observe that if v is a linear combination on the finite subset I of S and w is a linear combination on the finite subset J of S, then v and w can both be considered as linear combinations on the finite subset I ∪ J of S; now our argument above applies. □

5.12. Example. Let us consider again the vector space C(R) of continuous functions on R. The power functions f_n : x ↦ x^n (n = 0, 1, 2, ...) are certainly continuous and defined on R, so they are elements of C(R). We find that their linear hull L({f_n : n ∈ N_0}) is the linear subspace of polynomial functions, i.e., functions that are of the form
x ↦ a_n x^n + a_{n−1} x^{n−1} + ··· + a_1 x + a_0
with n ∈ N_0 and a_0, a_1, ..., a_n ∈ R.

6. Linear Independence and Dimension

This section essentially follows Chapter 3 in Jänich's book [J].

In the context of looking at linear hulls, it is a natural question whether we really need all the given vectors in order to generate their linear hull. Also (maybe in order to reduce waste...), it is interesting to consider minimal generating sets. These questions lead to the notions of linear independence and basis.

6.1. Definition. Let V be an F-vector space, v_1, v_2, ..., v_n ∈ V. We say that v_1, v_2, ..., v_n are linearly independent if for all λ_1, λ_2, ..., λ_n ∈ F,
λ_1 v_1 + λ_2 v_2 + ··· + λ_n v_n = 0
implies λ_1 = λ_2 = ··· = λ_n = 0. ("The zero vector cannot be written as a nontrivial linear combination of v_1, ..., v_n.")

In a similar way, we can define linear independence for arbitrary subsets of V. This is equivalent to the following: S ⊂ V is linearly independent if every choice of finitely many (distinct) vectors from S is linearly independent.

As a special case, the empty sequence or empty set of vectors is considered to be linearly independent.

If we want to refer to the field of scalars F, we say that the given vectors are F-linearly independent or linearly independent over F. If v_1, v_2, ..., v_n (resp., S) are not linearly independent, we say that they are linearly dependent.
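Over R^n, linear independence of v_1, ..., v_m can be tested by computing the rank of the matrix with these vectors as columns: they are independent exactly when the rank is m. A numpy sketch:

```python
import numpy as np

# Columns are the vectors to test.
A = np.column_stack([[1.0, 0.0, 1.0],
                     [0.0, 1.0, 1.0],
                     [1.0, 1.0, 2.0]])   # third column = first + second

# rank < number of columns  <=>  the columns are linearly dependent.
print(np.linalg.matrix_rank(A))   # 2, so these three vectors are dependent
```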
6.2. Example. For an easy example showing that the field of scalars matters in the context of linear independence, consider 1, i ∈ C, where C can be considered as a real or as a complex vector space. We then have that 1 and i are R-linearly independent (essentially by definition of C — 0 = 0·1 + 0·i, and this representation is unique), whereas they are C-linearly dependent — i·1 + (−1)·i = 0.

6.3. Remark. Let V be a vector space, v_1, v_2, ..., v_n ∈ V. Then v_1, v_2, ..., v_n are linearly dependent if and only if one of the v_j is a linear combination of the others, i.e., if and only if
L(v_1, v_2, ..., v_n) = L(v_1, ..., v_{j−1}, v_{j+1}, ..., v_n)
for some j ∈ {1, 2, ..., n}. A similar statement holds for subsets S ⊂ V.

Proof. Let us first assume that v_1, v_2, ..., v_n are linearly dependent. Then there are scalars λ_1, λ_2, ..., λ_n, not all zero, such that
λ_1 v_1 + λ_2 v_2 + ··· + λ_n v_n = 0.
Let j be such that λ_j ≠ 0. Then
v_j = −λ_j^{-1}(λ_1 v_1 + ··· + λ_{j−1} v_{j−1} + λ_{j+1} v_{j+1} + ··· + λ_n v_n).
Conversely, assume that v_j is a linear combination of the other vectors:
v_j = λ_1 v_1 + ··· + λ_{j−1} v_{j−1} + λ_{j+1} v_{j+1} + ··· + λ_n v_n.
Then
λ_1 v_1 + ··· + λ_{j−1} v_{j−1} − v_j + λ_{j+1} v_{j+1} + ··· + λ_n v_n = 0,
so the given vectors are linearly dependent.
Bearing in mind that S is linearly dependent if and only if some finite subset of S is linearly dependent, the last statement also follows. □

6.4. Example. In C(R), the functions
x ↦ 1, x ↦ sin x, x ↦ cos x, x ↦ sin²x, x ↦ cos²x
are linearly dependent, since 1 − sin²x − cos²x = 0 for all x ∈ R.
On the other hand, x ↦ 1, x ↦ sin x, x ↦ cos x are linearly independent. To see this, assume that λ + μ sin x + ν cos x = 0 for all x ∈ R. Plugging in x = 0, we obtain λ + ν = 0. For x = π, we get λ − ν = 0, which together imply λ = ν = 0. Then taking x = π/2 shows that μ = 0 as well.

6.5. Definition. Let V be a vector space. A sequence v_1, v_2, ..., v_n ∈ V (or a subset S ⊂ V) is called a basis of V if v_1, v_2, ..., v_n are (resp., S is) linearly independent, and V = L(v_1, v_2, ..., v_n) (resp., V = L(S)).

From Remark 6.3 above, we see that a basis of V is a minimal generating set of V, in the sense that we cannot leave out some element and still have a generating set. What is special about a basis among generating sets?

6.6. Lemma. If v_1, v_2, ..., v_n is a basis of the F-vector space V, then for every v ∈ V, there are unique scalars λ_1, λ_2, ..., λ_n ∈ F such that
v = λ_1 v_1 + λ_2 v_2 + ··· + λ_n v_n.

Proof. The existence of (λ_1, λ_2, ..., λ_n) ∈ F^n such that
v = λ_1 v_1 + λ_2 v_2 + ··· + λ_n v_n
follows from the fact that v_1, v_2, ..., v_n generate V.
To show uniqueness, assume that (μ_1, μ_2, ..., μ_n) ∈ F^n also satisfy
v = μ_1 v_1 + μ_2 v_2 + ··· + μ_n v_n.
Taking the difference, we obtain
0 = (λ_1 − μ_1)v_1 + (λ_2 − μ_2)v_2 + ··· + (λ_n − μ_n)v_n.
Since v_1, v_2, ..., v_n are linearly independent, it follows that
λ_1 − μ_1 = λ_2 − μ_2 = ··· = λ_n − μ_n = 0,
i.e., (λ_1, λ_2, ..., λ_n) = (μ_1, μ_2, ..., μ_n). □

6.7. Exercise. Formulate and prove the corresponding statement for subsets.

6.8. Example. The most basic example of a basis is the canonical basis of F^n. This is e_1, e_2, ..., e_n, where
e_1 = (1, 0, 0, ..., 0, 0)
e_2 = (0, 1, 0, ..., 0, 0)
...
e_n = (0, 0, 0, ..., 0, 1).
In a precise sense (to be discussed in detail later), knowing a basis v_1, v_2, ..., v_n of a vector space V allows us to express everything in V in terms of the standard vector space F^n.
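Lemma 6.6 is what makes 'coordinates' well-defined: with respect to a basis of R^n, the coefficients λ_i of any v can be computed by solving one linear system. A numpy sketch (the basis here is my own example):

```python
import numpy as np

# A basis of R^3 (as columns) and a vector to express in it.
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
v = np.array([2.0, 3.0, 4.0])

# Solve B @ lam = v; since the columns of B form a basis, the
# solution exists and is unique (Lemma 6.6).
lam = np.linalg.solve(B, v)
print(lam)                          # the coordinates of v in this basis
assert np.allclose(B @ lam, v)
```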
Since we seem to know "everything" about a vector space as soon as we know a basis, it makes sense to use bases to measure the "size" of vector spaces. In order for this to make sense, we need to know that any two bases of a given vector space have the same size. The key to this (and many other important results) is the following.

6.9. Basis Extension Theorem. Let V be an F-vector space, and let v_1, ..., v_r, w_1, ..., w_s ∈ V. If v_1, ..., v_r are linearly independent and V is generated by v_1, ..., v_r, w_1, ..., w_s, then by adding suitably chosen vectors from w_1, ..., w_s, we can extend v_1, ..., v_r to a basis of V.

Note that this is an existence theorem — what it says is that if we have a bunch of vectors that is 'too small' (linearly independent, but not necessarily generating) and a larger bunch of vectors that is 'too large' (generating but not necessarily linearly independent), then there is a basis 'in between'. However, it does not tell us how to actually find such a basis (i.e., how to select the w_j that we have to add).

The Basis Extension Theorem implies another important statement.

6.10. Exchange Lemma. If v_1, ..., v_n and w_1, ..., w_m are two bases of a vector space V, then for each i ∈ {1, 2, ..., n} there is some j ∈ {1, 2, ..., m} such that v_1, ..., v_{i−1}, w_j, v_{i+1}, ..., v_n is again a basis of V.

This says that we can trade any vector of our choice in the first basis for a vector in the second basis in such a way as to still have a basis. We will postpone the proofs and first look at some consequences.

6.11. Theorem. If v_1, v_2, ..., v_n and w_1, w_2, ..., w_m are two bases of a vector space V, then n = m.

Proof. Assume that, for example, n > m. By repeatedly applying the Exchange Lemma, we can successively replace v_1, v_2, ..., v_n by some w_j and still have a basis. Since there are more v's than w's, the resulting sequence must have repetitions and therefore cannot be linearly independent, contradiction. □

This implies that the following definition makes sense.

6.12. Definition. If the vector space V has a basis v_1, v_2, ..., v_n, then n ≥ 0 is called the dimension of V, written n = dim V = dim_F V.

6.13. Examples. The empty sequence is a basis of the zero space, so dim {0} = 0. The canonical basis of F^n has length n, so dim F^n = n.

6.14. Theorem. Let V be a vector space, dim V = n. Then any sequence of vectors v_1, ..., v_m ∈ V such that m > n is linearly dependent.

Proof. Let w_1, ..., w_n be a basis of V. Assume that v_1, ..., v_m were linearly independent. Then by the Basis Extension Theorem (note that
V = L(w_1, ..., w_n) = L(v_1, ..., v_m, w_1, ..., w_n) ),
we could extend v_1, ..., v_m to a basis of V by adding some vectors from w_1, ..., w_n. Since m > n, the resulting basis would be too long, contradiction. □

Note that this theorem is a quite strong existence statement: it guarantees the existence of a nontrivial linear relation among the given vectors without the need to do any computation. This is very useful in many applications. On the other hand, it is quite a different matter to actually find such a relation: the proof is non-constructive (as is usually the case with proofs by contradiction), and we usually need some computational method to exhibit a relation.

The preceding theorem tells us that in a vector space of (finite) dimension n, there is an upper bound (namely, n) for the length of a linearly independent sequence of vectors. We can use this to show that there are vector spaces that do not have dimension n for any n ≥ 0.
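As just remarked, exhibiting such a relation takes computation. For vectors in R^n, one standard method (my choice here, not one named in the notes) is to read a null-space vector off a singular value decomposition. A sketch with four vectors in R^3, where Theorem 6.14 guarantees a relation exists:

```python
import numpy as np

# Four vectors in R^3 as the columns of A; they must be dependent.
A = np.array([[1.0, 0.0, 1.0, 2.0],
              [0.0, 1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])

# The right-singular vector for the smallest singular value lies in
# the null space, i.e. it gives coefficients lambda, not all zero,
# with lambda_1 v_1 + ... + lambda_4 v_4 = 0.
_, s, Vt = np.linalg.svd(A)
lam = Vt[-1]
print(lam)
assert np.linalg.norm(A @ lam) < 1e-12 and np.linalg.norm(lam) > 0
```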
6.15. Example. Let us consider again the linear subspace of polynomial functions in C(R) (the vector space of continuous functions on R); compare Example 5.12. Let us call this space P:
P = {f ∈ C(R) : ∃n ∈ N_0 ∃a_0, ..., a_n ∈ R ∀x ∈ R : f(x) = a_n x^n + ··· + a_1 x + a_0}.
Denote as before by f_n the nth power function: f_n(x) = x^n. I claim that f_0, f_1, f_2, ... is linearly independent. Recall that this means that the only way of writing zero (i.e., the zero function) as a finite linear combination of the f_j is with all coefficients equal to zero. If we let n be the largest number such that f_n occurs in the linear combination, then it is clear that we can write the linear combination as
λ_0 f_0 + λ_1 f_1 + ··· + λ_n f_n = 0.
We have to show that this is only possible when λ_0 = λ_1 = ··· = λ_n = 0.
Note that our assumption means that
λ_n x^n + ··· + λ_1 x + λ_0 = 0 for all x ∈ R.
There are various ways to proceed from here. For example, we can make use of the fact that a polynomial of degree n ≥ 0 can have at most n zeros in R. Since there are infinitely many real numbers, the polynomial above has infinitely many zeros, hence it must be the zero polynomial (which does not have a well-defined degree).
Another possibility is to use induction on n (which, by the way, is implicit in the proof above: it is used in proving the statement on zeros of polynomials). Let us do this in detail. The claim we want to prove is
∀n ∈ N_0 ∀λ_0, ..., λ_n ∈ R : (∀x ∈ R : λ_n x^n + ··· + λ_0 = 0) ⟹ λ_0 = ··· = λ_n = 0.
We now have to establish the induction base: the claim holds for n = 0. This is easy — let λ_0 ∈ R and assume that for all x ∈ R, λ_0 = 0 (the function is constant here: it does not depend on x). Since there are real numbers, this implies λ_0 = 0.
Next, and this is usually the hard part, we have to do the induction step. We assume that the claim holds for a given n (this is the induction hypothesis) and deduce that it then also holds for n + 1. To prove the statement for n + 1, we have to consider coefficients λ_0, ..., λ_{n+1} ∈ R such that for all x ∈ R,
f(x) = λ_{n+1} x^{n+1} + λ_n x^n + ··· + λ_1 x + λ_0 = 0.
Now we want to use the induction hypothesis, so we have to reduce this to a statement involving a polynomial of degree at most n. One way of doing that is to borrow some knowledge from Analysis about differentiation. This tells us that the derivative of f is zero again, and that it is a polynomial function of degree ≤ n:
0 = f'(x) = (n+1)λ_{n+1} x^n + nλ_n x^{n−1} + ··· + λ_1.
Now we can apply the induction hypothesis to this polynomial function; it tells us that (n+1)λ_{n+1} = nλ_n = ··· = λ_1 = 0, hence λ_1 = ··· = λ_n = λ_{n+1} = 0. So f(x) = λ_0 is in fact constant, which finally implies λ_0 = 0 as well (by our reasoning for the induction base). This completes the induction step and therefore the whole proof.
Note that since P = L({f_n : n ∈ N_0}), we have shown that {f_n : n ∈ N_0} is a basis of P.
So we see that P cannot have a finite basis, since we can find arbitrarily many linearly independent elements. This motivates the following definition.

6.16. Definition. If a vector space V does not have a finite basis, then V is said to be infinite-dimensional, and we write dim V = ∞.

In particular, we see that dim P = ∞.

The following result shows that our intuition that dimension is a measure for the 'size' of a vector space is not too far off: larger spaces have larger dimension.
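A finite slice of the independence claim in Example 6.15 can be checked numerically: evaluating f_0, ..., f_n at n+1 distinct points gives a Vandermonde matrix, which has full column rank exactly when those restricted power functions admit no nontrivial relation. A numpy sketch:

```python
import numpy as np

n = 4
x = np.linspace(0.0, 1.0, n + 1)           # n+1 distinct sample points
# Column j holds the values of f_j(x) = x^j at the sample points.
V = np.vander(x, N=n + 1, increasing=True)

# Full column rank: there is no nontrivial lambda with
# sum(lambda_j * x^j) = 0 at all sample points, so f_0, ..., f_n
# are linearly independent.
print(np.linalg.matrix_rank(V))            # 5
```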
6.17. Lemma. Let U be a linear subspace of the vector space V. Then we have dim U ≤ dim V. If dim V is finite, then we have equality if and only if U = V.

Here we use the usual convention that n < ∞ for n ∈ N_0. Note that in the case that dim V is finite, the statement also asserts the existence of a (finite) basis of U.

Proof. There is nothing to show if dim V = ∞. So let us assume that V has a basis v_1, ..., v_n. If u_1, ..., u_m ∈ U are linearly independent, then m ≤ n by Thm. 6.14. Hence there is a sequence u_1, ..., u_m of linearly independent vectors in U of maximal length m (and m ≤ n). We claim that u_1, ..., u_m is in fact a basis of U. The first claim follows, since then dim U = m ≤ n = dim V.
We have to show that u_1, ..., u_m generate U. So assume that there is u ∈ U that is not a linear combination of the u_j. Then u_1, ..., u_m, u are linearly independent, which contradicts our choice of u_1, ..., u_m as a maximal linearly independent sequence in U. So there is no such u, hence U = L(u_1, ..., u_m).
To prove the second part, first note that dim U < dim V implies U ⊊ V (if U = V, a basis of U would also be a basis of V, but it is too short for that by Thm. 6.11). On the other hand, assume U ⊊ V, and consider a basis of U. It can be extended to a basis of V by the Basis Extension Theorem 6.9. Since it does not generate V, at least one element has to be added, which implies dim U < dim V. □

6.18. Examples. Since
P ⊂ C^∞(R) = ⋂_{n=0}^∞ C^n(R) ⊂ ··· ⊂ C²(R) ⊂ C¹(R) ⊂ C(R) ⊂ R^R,
all these spaces are infinite-dimensional.

6.19. Exercise. Let F be a finite field, and consider the F-vector space V of functions from F to F (so V = F^F in our earlier notation). Consider again the linear subspace of polynomial functions:
P_F = L_F(f_0, f_1, f_2, ...),
where f_n : x ↦ x^n (for x ∈ F). Show that dim_F P_F is finite.
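In the spirit of Exercise 6.19, the collapse can be seen concretely for F = F_2 (my choice of smallest case; this does not give away the general solution): already f_2 = f_1 as functions, since 0² = 0 and 1² = 1. A small enumeration:

```python
from itertools import product

F = (0, 1)  # the field F_2; arithmetic is mod 2

# A function F_2 -> F_2 is just its value table (f(0), f(1)).
all_functions = set(product(F, repeat=2))          # 4 functions in total
power = lambda n: tuple(pow(x, n, 2) for x in F)   # value table of f_n

print(power(1) == power(2) == power(3))  # True: x^n = x on F_2 for n >= 1
# So the polynomial functions are spanned by f_0 and f_1 alone:
span = {tuple((a * one + b * idt) % 2 for one, idt in zip(power(0), power(1)))
        for a in F for b in F}
assert span == all_functions
print(len(span))   # 4 = 2^2: two independent generators, dim_{F_2} P_{F_2} = 2
```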
We have seen that the intersection of linear subspaces is again a linear subspace, but the union usually is not. However, it is very useful to have a replacement for the union that has similar properties, but is a linear subspace. Note that the union of two (or more) sets is the smallest set that contains both (or all) of them. From this point of view, the following definition is natural.

6.20. Definition. Let V be a vector space, U_1, U_2 ⊂ V two linear subspaces. The sum of U_1 and U_2 is the linear subspace generated by U_1 ∪ U_2:
U_1 + U_2 = L(U_1 ∪ U_2).
More generally, if (U_i)_{i∈I} is a family of subspaces of V (I = ∅ is allowed here), then their sum is again
∑_{i∈I} U_i = L(⋃_{i∈I} U_i).

As before in our discussion of linear hulls, we want a more explicit description of these sums.

6.21. Lemma. If U_1 and U_2 are linear subspaces of the vector space V, then
U_1 + U_2 = {u_1 + u_2 : u_1 ∈ U_1, u_2 ∈ U_2}.
If (U_i)_{i∈I} is a family of linear subspaces of V, then
∑_{i∈I} U_i = {∑_{j∈J} u_j : J ⊂ I finite and u_j ∈ U_j for all j ∈ J}.

Proof. It is clear that the sets on the right hand side contain U_1 ∪ U_2, resp. ⋃_{i∈I} U_i, and that they are contained in the left hand sides (which are closed under addition). So it suffices to show that they are linear subspaces (this implies that the left hand sides are contained in them).
We have 0 = 0 + 0 (resp., 0 = ∑_{j∈∅} u_j), so 0 is an element of the right hand side sets. Closure under scalar multiplication is easy to see: λ(u_1 + u_2) = λu_1 + λu_2, and λu_1 ∈ U_1, λu_2 ∈ U_2, since U_1, U_2 are linear subspaces. Similarly,
λ ∑_{j∈J} u_j = ∑_{j∈J} λu_j,
and λu_j ∈ U_j, since U_j is a linear subspace. Finally, for u_1, u_1' ∈ U_1 and u_2, u_2' ∈ U_2, we have
(u_1 + u_2) + (u_1' + u_2') = (u_1 + u_1') + (u_2 + u_2')
with u_1 + u_1' ∈ U_1, u_2 + u_2' ∈ U_2. And for J_1, J_2 finite subsets of I, u_j ∈ U_j for j ∈ J_1, u_j' ∈ U_j for j ∈ J_2, we find
∑_{j∈J_1} u_j + ∑_{j∈J_2} u_j' = ∑_{j∈J_1∪J_2} v_j,
where v_j = u_j ∈ U_j if j ∈ J_1 \ J_2, v_j = u_j' ∈ U_j if j ∈ J_2 \ J_1, and v_j = u_j + u_j' ∈ U_j if j ∈ J_1 ∩ J_2. □

Now we have the following nice formula relating the dimensions of U_1, U_2, U_1 ∩ U_2 and U_1 + U_2. In the following, we use the convention that ∞ + n = n + ∞ = ∞ + ∞ = ∞ for n ∈ N_0.
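The notes break off at this point. The formula being referred to is presumably the standard dimension formula dim(U_1 + U_2) + dim(U_1 ∩ U_2) = dim U_1 + dim U_2 (a well-known fact, though not stated in the surviving text); as a stand-in, here is a numerical sanity check on a transparent example:

```python
import numpy as np

# In R^3, take U1 = L(e1, e2) and U2 = L(e2, e3); by construction
# U1 ∩ U2 = L(e2), so dim(U1 ∩ U2) = 1.
e1, e2, e3 = np.eye(3)
U1, U2 = np.column_stack([e1, e2]), np.column_stack([e2, e3])

dim_U1 = np.linalg.matrix_rank(U1)                    # 2
dim_U2 = np.linalg.matrix_rank(U2)                    # 2
# dim(U1 + U2) = rank of the generators of both spaces side by side.
dim_sum = np.linalg.matrix_rank(np.hstack([U1, U2]))  # 3
assert dim_sum + 1 == dim_U1 + dim_U2                 # 3 + 1 == 2 + 2
```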
