Lecture Notes on Numerical Methods for Engineering

lecture notes on numerical methods in engineering and sciences and numerical methods in engineering lecture notes, numerical methods for in matlab
EdenKelly Profile Pic
EdenKelly,United States,Professional
Published Date:12-07-2017
Your Website URL(Optional)
Lecture Notes on Numerical Methods for Engineering (?) Pedro Fortuny Ayuso UNIVERSIDAD DE OVIEDO E-mail address: fortunypedrouniovi.esCC BY: Copyright c 2011–2016 Pedro Fortuny Ayuso This work is licensed under the Creative Commons Attribution 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/es/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.Contents Introduction 5 1. Some minor comments 5 Chapter 1. Arithmetic and error analysis 7 1. Exponential notation 7 2. Error, basic definitions 10 3. Bounding the error 16 Chapter 2. Numerical Solutions to Non-linear Equations 19 1. Introduction 19 2. The Bisection Algorithm 20 3. Newton-Raphson’s Algorithm 22 4. The Secant Algorithm 24 5. Fixed Points 26 6. Annex: Matlab/Octave Code 32 Chapter 3. Numerical Solutions to Linear Systems of Equations 35 1. Gauss’ Algorithm and LU Factorization 35 2. Condition Number: behavior of the relative error 40 3. Fixed Point Algorithms 44 4. Annex: Matlab/Octave Code 46 Chapter 4. Interpolation 49 1. Linear (piecewise) interpolation 49 2. Can parabolas be used for this? 50 3. Cubic Splines: Continuous Curvature 51 4. The Lagrange Interpolating Polynomial 57 5. Approximate Interpolation 60 6. Code for some of the Algorithms 64 Chapter 5. Numerical Differentiation and Integration 69 1. Numerical Differentiation 69 2. Numerical Integration—Quadrature Formulas 71 Chapter 6. Differential Equations 79 1. Introduction 79 34 CONTENTS 2. The Basics 81 3. Discretization 82 4. Sources of error: truncation and rounding 85 5. Quadratures and Integration 86 6. Euler’s Method: Integrate Using the Left Endpoint 86 7. Modified Euler: the Midpoint Rule 87 8. Heun’s Method: the Trapezoidal Rule 89 Chapter 7. Multivariate and higher order ODEs 93 1. A two-variable example 93 2. Multivariate equations: Euler and Heun’s methods 96 3. From greater order to order one 98Introduction These notes cover what is taught in the classes of Numerical Meth- ods for Engineering in the School at Mieres. One should not expect more than that: a revision of what has been or will be explained dur- ing the course. For a more correct, deep and thorough explanation, ? one should go to the material referenced in the bibliography There is none. Simply, stated These notes are no substitute for a book (or two). and even more What is here may and may not cover completely the contents of the exam 1. Some minor comments My aim in these notes is mostly twofold:  To introduce the basic problems tackled by Numerical Cal- culus in their most simple fashion.  To get the students used to stating algorithms with precision and to understanding the idea of complexity. I also would like to be able to make the students aware of the im- portance of the conditioning of a numerical problem and the need to verify that the results obtained by a computer program match reality and fit the problem under study. But there is really no space (much less time in the course) for this. We shall refer, along the way, to several computing tools. Specif- ically, Matlab: I shall try to make all the code in this notes runnable on Octave but this text will only speak of Matlab, which is the software students are used to working with at Mieres. Wolfram Alpha: This is a powerful and very useful tool with which a large number of operations (mathematical and not) can be performed, on a simple web page. 56 INTRODUCTION What may be most interesting is that the students can:  State algorithms with precision, aware that an algorithm must be clear, finite and finish.  Follow the steps of an algorithm and roughly analyze its complexity (or at least, compare two of them). I would also like to stress the importance of discerning if an ap- proach suits a specific problem or not depending on its good or bad conditioning but, alas, there is also too little time in the course. . . It goes without saying that they will have to understand and be able to follow, step by step, the algorithms stated in the classes (this is an essential requisite). All the algorithms that will be explained are of easy memorization because they are either brief (most of them) or geometrically elemental (so that their verbal expression is easily to remember). Formal precision is much more important in this course than geometric ideas because numerical analysis deals with formal methods of solving specific problems, not with their geometrical or intuitive expression. This is the main rub of this course. You will need to memorize.CHAPTER 1 Arithmetic and error analysis Exponential notations, double precision floating point, the no- tion of error and some common sources of error in computations are summarily reviewed. The student should at least get a feeling of the importance of normalization of floating point arithmetic and that mistakes in its use can be critical. 1. Exponential notation Real numbers can have a finite or infinite number of digits. When working with them they have to be approximated using a finite (and usually small) number of digits. Even more, they should be ex- pressed in a standard way so that their magnitude and significant digits are easily noticed at first sight and so that sharing them among machines is a deterministic and efficient process. These require- ments have led to what is known as scientific or exponential nota- tion. We shall explain it roughly; what we intend is to transmit the underlying idea more than to detail all the rules (they have lots of particularities and exceptions). We shall use, along these notes, the following expressions: DEFINITION 1. An exponential notation of a real number is an ex- pression of the form C  A.B 10 C 10 A.B  A.BeC where A, B and C are natural numbers (possibly 0), is a sign (which may be elided if+). Any of those expressions refers to the real num- C ber(A+ 0.B) 10 (where 0.B is “nought dot B. . . ”). For example:  The number 3.123 is the same 3.123.  The number 0.01e 7 is 0.000000001 (eight zeroes after the dot and before the 1). 78 1. ARITHMETIC AND ERROR ANALYSIS 3  The number 10 2.3 is2300.  The number23.783e 1 is2.3783. In general, scientific notation assumes the number to the left of the decimal point is a single non-zero digit: DEFINITION 2. The standard exponential notation is the exponential notation in which A is between 1 and 9. Finally, machines usually store real numbers in a very specific way called floating point. DEFINITION 3. A floating point format is a specification of an ex- ponential notation in which the length of A plus the length of B is constant and in which the exponents (the number C) vary in a spe- cific range. Thus, a floating point format can only express a finite amount of numbers (those which can be written following the specifications of the format). The blueprint for the floating point standard for microchips (and for electronic devices in general) is document IEEE-754 (read “I-E- cube seven-five-four”). The acronym stands for “Institute of Elec- trical and Electronic Engineers”. The last version of the document dates from 2008. 1.1. The binary double precision format of IEEE-754. The dou- ble precision format is the IEEE specification for representing real numbers in sequences of 16, 32, 64, 128 bits (and more cases) and their decimal representation. The main properties of binary double precision with 64 bits are roughly explained in this section. In order to write numbers in double precision, 64 bits are used, that is, 64 binary digits. The first one stands for the sign (a 0 means +, a 1 means). The 11 following ones are used for the exponent (as will be shown) and the remaining 52 are used for what is called the mantissa. Thus, a double precision number has three parts: s, the sign, which can be 0 or 1; e, the exponent, which varies between 0 11 and 2 1 = 2047; and m, a 52-bit number. Given three values for s, e and m, the real number represented by them is:  If e6= 0 and e6= 2047 (i.e. if the exponent is not one of the end values), then s e1023 N =(1)  2  1.m, where 1.m means “one-dot-m” in binary. Notice —and this is the key— that the exponent is not the number represented by1. EXPONENTIAL NOTATION 9 the 11 bits of e but that it is “shifted to the right”. The expo- nent e = 01010101011, which in decimal is 683 represents, 6831023 340 in double precision format, the number 2 = 2 . Those e with a starting 0 bit correspond to negative powers of 2 and those having a starting 1 bit to positive powers (re- 10 call that 2 = 1024).  If e= 0 then, if m6= 0 (if there is a mantissa): s 1023 N =(1)  2  0.m, where 0.m means “zero-dot-m” in binary.  If e = 0 and m = 0, the number is either+0 or0, depend- ing on the sign s. This means that 0 has a sign in double precision arithmetic (which makes sense for technical rea- sons).  The case e = 2047 (when all the eleven digits of e are 1) is reserved for encoding “exceptions” like¥ and NaN (not- a-number-s). We shall not enter into details. As a matter of fact, the standard is much longer and thorough and includes a long list of requisites for electronic devices working in floating point (for example, it specifies how truncations of the main mathematical operations have to be performed to ensure exactness of the results whenever possible). The main advantage of floating point (and of double precision specifically) is, apart from standardization, that it enables computing with very small and very large numbers with a single format (the 1023 300 smallest storable number is 2 ' 10 and the largest one is 1023 300 2 ' 10 ). The trade off is that if one works simultaneously with both types of quantities, the first lose precision and tend to disappear (a truncation error takes place). However, used carefully, it is a hugely useful and versatile format. ? Examples? 1.2. Binary to decimal (and back) conversion. In this course, it is essential to be able to convert a number from binary to decimal representation and back. In these notes, we shall use the following notation: the expres- sion x a means that x is a variable, a is a value (so that it can be a number or another variable) and that x gets assigned the value of a. The expression u = c is the conditional statement “the value de- signed by u is the same as that designed by c”, which can be either true or false. We also denote m//n as the quotient of dividing the natural number m 0 by the natural number n 0, and m%n is the10 1. ARITHMETIC AND ERROR ANALYSIS remainder of that division. That is, m=(m//n) n+(m%n). Finally, if x is a real number x = A.B, the expressionfxg means the fractional part of x, i.e., the number 0.B. EXAMPLE 1. The following table shows the decimal and binary expansions of several numbers. The last binary expansion is approx- imate because it has an infinite number of figures. Decimal Binary 1 1 2 10 10 1010 0.5 0.1 7.25 111.01 0.1 0.000110011+ Algorithm 1 (in page 11) is a way to convert a decimal number A.B with a finite number of digits to its binary form. Algorithm 2 performs the reverse conversion. Notice that, as there are numbers with a finite quantity of decimal digits which cannot be expressed with a finite quantity of binary digits (the most obvious example being 0.1), one has to specify a number of fractional digits for the output (which implies that one does not necessarily obtain the very same number but a truncation). Converting from binary to decimal is “simpler” but requires ad- ditions on the go: one does not obtain a digit at each step because one has to add powers of two. This process is described in Algorithm 2. Notice that the number of digits of the output is not specified (it could be done but it would only make the algorithm more complex). i i Finally, in all those steps in which A 2 or B 2 is added, both i i A and B are either 0 or 1, so that this multiplication just means “ei- i i ther add or do not add” the corresponding power of 2 (this is what a binary digit means, anyway). 2. Error, basic definitions Whenever numbers with a finite quantity of digits are used in computations and whenever real measurements are performed, one has to acknowledge that errors will be made. This is not grave by it- self. What matters is having an estimate of their size and knowing that, as calculations are made, they can propagate. In the end, the best one can do is to bound the absolute error, that is, to know a reasonable2. ERROR, BASIC DEFINITIONS 11 Algorithm 1 Conversion from decimal to binary, without sign. Input: A.B a number in base 10, with B of finite length, k a positive integer (which is the desired number of fractional digits after the binary dot) k Output: a.b (the number x in binary format, truncated to 2 ) if A= 0 then a 0 and go to the FRACTIONAL PART end if ?INTEGRAL PART i 1, n A while n 0 do i i+ 1 x n%2 remainder i n n//2 quotient end while a x x . . . x the sequence of remainders in reverse 0 i i1 order ?FRACTIONAL PART if B= 0 then b 0 return a.b end if i 0, n 0.B while n 0 and i k do i i+ 1 m 2n if m 1 then b 1 i else b 0 i end if n fmg the fractional part of m end while b b b . . . b 1 2 i return a.b value (the bound) which is larger than the error, in order to assess, with certainty, how far the real value can be from the computed one. In what follows, an exact value x is assumed (a constant, a datum, the exact solution to a problem. . . ) and an approximation will be denoted x ˜.12 1. ARITHMETIC AND ERROR ANALYSIS Algorithm 2 Conversion from binary to decimal, without sign. Input: A.B, a number in binary format, k a non-negative integer (the number of the fractional binary digits to be used). Output: a.b, the decimal expression of the truncation up to preci- k sion 2 of the number A.B ?INTEGRAL PART Write A= A A . . . A (the binary digits) r r1 0 a 0, i 0 while i r do i a a+ A 2 i i i+ 1 end while ?FRACTIONAL PART if B= 0 then return a.0 end if b 0, i 0 while i k do i i+ 1 i b b+ B 2 i end while return a.b DEFINITION 4. The absolute error incurred when using x ˜ instead of x is the absolute value of their differencejx x ˜j. But, unless x is 0, one is usually more interested in the order of magnitude of the error; that is, “how relatively large the error is” as compared to the real quantity. DEFINITION 5. The relative error incurred when using x ˜ instead of x, assuming x6= 0, is the quotient jx ˜ xj jxj (which is always positive). We are not going to use a specific notation for them (some people useD andd, respectively). EXAMPLE 2. The constantp, which is the ratio between the length of the circumference and its diameter, is, approximately 3.1415926534+ (the trailing + indicates that the real number is larger than the rep- resentation). Assume one uses the approximationp ˜ = 3.14. Then2. ERROR, BASIC DEFINITIONS 13 ˜  The absolute error isjppj= 0.0015926534+. ˜ jppj 4  The relative error is ' 10  5.069573. p This last statement means that one is incurring an error of 5 ten- thousandths per unit (approx. 1/2000) each time 3.14 is used for p. Thus, if one adds 3.14 two thousand times, the error incurred when using this quantity instead of 2000p will be approximatelyp. This is the meaning of the relative error: its reciprocal is the number of times one has to add x ˜ in order to get an accrued error equal to the number being approximated. Before proceeding with the analysis of errors, let us define the two most common ways of approximating numbers using a finite quantity of digits: truncation and rounding. The precise definitions are too detailed for us to give. Start with a real number (with possi- bly an infinite quantity of digits): x = a a . . . a . a . . . a . . . 2 r n 1 r+1 notice that there are r digits to the left of the decimal point. Define: DEFINITION 6. The truncation of x to k (significant) digits is:  the number a a . . . a 0 . . . 0 (an integer with r digits), if k 1 2 k r,  the number a . . . a .a . . . a if k r. r 1 r+1 k That is, one just cuts (i.e. truncates) the numerical expression of x and adds zeroes to the right if the decimal part has not been reached. DEFINITION 7. The rounding of x to k (significant) digits is the following number:  If a 5, then the rounding is the same as the truncation. k+1  If 5 a  9, then the rounding is the truncation plus k+1 rk+1 10 . Remark: the rounding described here is called biased to plus infinity because when the k+ 1th digit is greater than or equal to 5, the approximation is always greater than the true value. The problem with rounding is that all the digits of the rounded num- ber can be different from the original value (think of 0.9999 rounded to 3 significant digits). The great advantage is that the error incurred when rounding is less than the one incurred when truncating (it can even be a half of it):14 1. ARITHMETIC AND ERROR ANALYSIS 1 EXAMPLE 3. If x = 178.299 and one needs 4 significant figures, then the truncation is x = 178.2, whereas the rounding is 178.3. The 1 absolute error in the former case is 0.099, while it is 0.001 in the latter. EXAMPLE 4. If x = 999.995 and 5 digits are being used, the trun- cation is x = 999.99, while the rounding is 1000.0. Even though all 1 the digits are different the absolute error incurred in both cases is the same: 0.005. This is what matters, not that the digits “match”. Why is then truncation important if rounding is always at least as precise and half the time better? Because when using floating point, one cannot prevent truncation from happening and it must be taken into account to bound the global error. As of today (2013) most of the computer programs working in double precision, perform their computations internally with many more digits and round their results to double precision. However, truncations always happen (there is a limited amount of available digits to the processor). 2.1. Sources of error. Error appears in different ways. On one hand, any measurement is subject to it (this why any measuring de- vice is sold with an estimated margin of precision); this is intrinsic to Nature and one can only take it into account and try to assess its magnitude (give a bound). On the other hand, computations per- formed in finite precision arithmetic both propagate these errors and give rise to new ones precisely because the quantity of available dig- its is finite. The following are some of the sources of error:  Measurement error, already mentioned. This is unavoidable.  Truncation error: happens whenever a number (datum or the result of a computation) has more digits than available and some of them must be “forgotten”.  Rounding error: takes place when a number is rounded to a specified precision.  Loss of significance (also called cancellation error): this appears when an operation on two numbers increases relative error substantially more than the absolute error. For example, when numbers of very similar magnitude are subtracted. The digits lost to truncation or rounding are too relevant and the relative error is huge. A classical example is the instabil- ity of the solution of the quadratic equation. 1 We have not explained the meaning of this term, but we shall not do so. Suffice it to say that “significant figures” means “exact figures” in an expression.2. ERROR, BASIC DEFINITIONS 15  Accumulation error: which appears when accumulating (with additions, essentially) small errors of the same sign a lot of times. This is what happened with the Patriot Missile Sta- 2 tions in 1991, during the Desert Storm operation . All the above errors may take place when working with finite arith- metic. The following rules (which describe the worst-case behaviors) apply:  When adding numbers of the same sign, the absolute error can be up to the sum of the absolute errors and the same can happen to the relative error.  When adding numbers of different sign, the absolute error be- haves like in the previous case but the relative error may behave wildly: 1000.2 1000.1 has only a significant digit, so that the relative error can be up to 10%, but the relative error in the 4 operands was at most 1 10 .  When multiplying, the absolute error is of the same magni- tude as the largest factor times the absolute error in the other factor. If both factors are of the same magnitude, the abso- lute error can be up to double the maximum absolute error times the maximum factor. The relative error is of the same magnitude as the maximum relative error in the factors (and is the sum if both factors are of the same magnitude).  When dividing by a number greater than or equal to 1, the ab- solute error is approximately the absolute error in the nu- merator divided by the denominator and the relative error is the relative error in the numerator (more or less like in multiplication). When dividing by numbers near 0, absolute precision is lost and later operations with numbers of the same magnitude will probably give rise to large cancellation errors. Example 5 is illustrative of this fact: the first line is assumed to be an exact computation, the second one an ap- proximation using a truncation to 6 significant figures in the divisor of the quotient. One can see that the “approxima- tion” is totally useless. An example of the bad behavior of division and addition is the following: 2 One can read a summary at http://www.cs.utexas.edu/˜downing/papers/PatriotA1992.pdf and the official report is also available: http://www.fas.org/spp/starwars/gao/im92026.htm.16 1. ARITHMETIC AND ERROR ANALYSIS EXAMPLE 5. Consider the following two computations, the first one is assumed to be “correct” while the second one is an approxi- mation: 33 26493 =0.256 (the exact result) . 0.0012456 33 26493 = 8.2488 (approximation) 0.001246 The relative error in the rounding of the denominator is only 3 4 10 (less than half a thousandth of a unit), but the relative error in the final result is 33.2 (which means that the obtained result is 33 times larger in absolute value than it should, so it is worthless as an “approximation”). Notice that not even the sign is correct. This is (essentially) the main problem of resolution methods involving divisions (like Gauss’ and, obviously, Cramer’s). When explaining Gauss’ method, convenient ways to choose an adequate denomina- tor will be shown (these are called pivoting strategies). As a general rule, the larger the denominator, the better. 3. Bounding the error As we have already said, one does not seek an exact knowledge of the error incurred during a measurement or the solution to a prob- lem, as this will likely be impossible. Rather, one aims to have an idea of it and, especially, to know a good bound of it. That is, to be able to assess the maximum value that the absolute error can take and this being a useful piece of information. Knowing that 2.71 is an approx- imation of e with an error less than 400 is worthless. Knowing that the error is less than 0.01 is useful. In general, the only possibility is to estimate a reasonably small number larger than the incurred error. This is bounding the error. 3.1. Some bounds. If one knows that a quantity is between two values, which is usually expressed in the form x = ae for some e 0, then the absolute error incurred when using a instead of x is unknown but is bounded bye. So that the relative error is, at most, this e divided by the smallest of the possible absolute values of x. Notice that this is tricky because if ae 0 and a+e 0 then there is no way to bound the relative error because x can be 0 and then there is no relative error (it is not defined for x = 0).3. BOUNDING THE ERROR 17 EXAMPLE 6. Ifp = 3.14 0.01, then the maximum absolute error incurred is 0.01, so that the relative error is at most 0.01 ' .003194 j3.13j (more or less 1/300). Notice that in order to get an upper bound of a quotient, the de- nominator must be bounded from below (the lesser the denominator, the greater the quotient). The rules in page 15 are essential in order to bound the error of a sequence of arithmetic operations. One has to be especially careful when dividing by numbers less than one, as this may lead to useless results like “the result is 7 with a relative error of 23.”CHAPTER 2 Numerical Solutions to Non-linear Equations The problem of solving the non-linear equation f(x)= 0 given f and some initial conditions is treated in this chapter. Each of the algorithms which will be explained in this chapter has its own advantages and disadvantages; one should not discard anyone a priori just for its “slowness” —for example, bisection. We shall see that the “best” one —namely, Newton-Raphson’s— has two drawbacks: it may converge far from the initial condition, becoming useless for finding a root “near” a specific point and it requires the computation of the value of the derivative of f at each step, which may be too costly. Notice that all the algorithms we present include stopping con- ditions which depend on the value of f(x) and are of Cauchy-type. This is done in order to simplify the exposition. If a specific quan- tity of exact figures is required, one needs to perform a much deeper study of the problem being solved, of the derivative of f and other elements, which are out of the scope of these notes. 1. Introduction Computing roots of functions, and especially of polynomials, is one of the classical problems of Mathematics. It used to be believed that any polynomial could be “explicitly solved” like the quadratic equation, via a formula involving radicals (roots of numbers). Galois Theory, developed in the beginning of the XIX Century, proved that this is not the case at all and that, as a matter of fact, most polynomi- als of degree greater than 4 are not solvable using radicals. However, the search for a closed formula for solving equations is just a way of putting off the real problem. In the end, and this is what matters: The only computations that can always be performed exactly are addition, subtraction and multiplication. 1920 2. NUMERICAL SOLUTIONS TO NON-LINEAR EQUATIONS 1 It is not even possible to divide and get exact results . From the beginning, people sought ways to quickly find approx- imate solutions to nonlinear equations involving the four rules: ad- dition, subtraction, multiplication and division. Two different kind of algorithms are explained in this chapter: those with a geometric meaning and the fixed point method, which requires the notion of contractivity. Before proceeding, let us remark two essential requirements in any root-computing algorithm:  The specification of a precision, that is an e 0 such that if j f(c)j e, then c is taken as an approximate root of f . This is needed because, as non-exact computations are performed, expecting a result f(c)= 0 is unrealistic, so that this equality is useless as a stopping condition.  A second stopping condition unrelated to the value of f . It may well happen that either the algorithm does not con- verge or that f has no roots, so that the statementj f(c)j e is never true. In any case, the algorithm must finish at some point. Otherwise, it is not an algorithm. This implies the need for a condition unrelated to f . This takes usually the form “perform no more than n steps,” where n is a counter. 2. The Bisection Algorithm Assume f is a continuous function on a closed interval a, b. Bolzano’s Theorem may be useful if its conditions hold: THEOREM (Bolzano). Let f :a, bR be a continuous function on a, b such that f(a) f(b) 0 (i.e. it changes sign between a and b). Then there is c2(a, b) with f(c)= 0. This statement asserts that if the sign of f changes on a, b then there is at least a root of f in it. One could sample the interval using i small sub-intervals (say of width(b a)/10 ) and seek, among this sub-intervals, one where the sign changes, thus nearing a root at each step. Actually, dividing the interval into two segments turns out to be much simpler. Start with a function f , continuous on the intervala, b with val- ues f(a) and f(b). Specify a desired precision e 0 (so that if 1 One could argue that using rational numbers solves this problem but, again, there is a point at which decimal expansions are needed.

Advise: Why You Wasting Money in Costly SEO Tools, Use World's Best Free SEO Tool Ubersuggest.