Parallel Computing Architecture

Parallel Processing
Denis Caromel, Arnaud Contes (Univ. Nice, ActiveEon)

Traditional Parallel Computing & HPC Solutions
► Parallel Computing
  • Principles
  • Parallel Computer Architectures
  • Parallel Programming Models
  • Parallel Programming Languages
► Grid Computing
  • Multiple Infrastructures
  • Using Grids
  • P2P
  • Clouds
► Conclusion

Parallel (Computing)
► Execution of several activities at the same time
  • 2 multiplications at the same time on 2 different processes
  • Printing a file on two printers at the same time

Why Parallel Computing?
► Save time (wall-clock time)
► Solve larger problems
► Parallel nature of the problem, so parallel models fit it best
► Provide concurrency (do multiple things at the same time)
► Take advantage of non-local resources
► Cost savings
► Overcome memory constraints
► Can be made highly fault-tolerant (replication)

What Applications?
► Traditional HPC
  • Nuclear physics
  • Fluid dynamics
  • Weather forecasting
  • Image processing, image synthesis, virtual reality
  • Petroleum
  • Virtual prototyping
  • Biology and genomics
► Enterprise applications
  • J2EE and Web servers
  • Business Intelligence
  • Banking, finance, insurance, risk analysis
  • Regression tests for large software
  • Storage of and access to large logs
  • Security: fingerprint matching, image behavior recognition

How to Parallelize?
► 3 steps:
  1. Breaking up the task into smaller tasks
  2. Assigning the smaller tasks to multiple workers to work on simultaneously
  3. Coordinating the workers
► Seems simple, doesn't it?

Additional Definitions
► Concurrency: simultaneous access to a resource, physical or logical; concurrent access to variables, resources, remote data
► Distribution: several address spaces
► Locality: data located on several hard disks

Parallelism vs Distribution vs Concurrency
► Parallelism sometimes proceeds from distribution:
  • Problem-domain parallelism
  • E.g.: collaborative computing
► Distribution sometimes proceeds from parallelism:
  • Solution-domain parallelism
  • E.g.: parallel computing on clusters
► Parallelism leads naturally to concurrency:
  • Several processes trying to print a file on a single printer

Levels of Parallelism: Hardware
► Bit-level parallelism
  • Hardware solution, based on increasing processor word size: 4 bits in the '70s, 64 bits nowadays
  • Focus on hardware capabilities for structuring
► Instruction-level parallelism
  • A goal of compiler and processor designers
  • Micro-architectural techniques: instruction pipelining, superscalar execution, out-of-order execution, register renaming
  • Focus on program instructions for structuring

Levels of Parallelism: Software
► Data parallelism (loop-level)
  • Distribution of data (lines, records, data structures, ...) over several computing entities
  • Working on a local structure or architecture to work in parallel on the original
  • Focus on the data for structuring
► Task parallelism
  • Task decomposition into sub-tasks
  • Shared memory between tasks, or communication between tasks through messages
  • Focus on tasks (activities, threads) for structuring
(A minimal code sketch of the three-step recipe, applied as loop-level data parallelism, follows.)
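As a concrete illustration of the three-step recipe above (break up, assign, coordinate) applied as loop-level data parallelism, here is a minimal Python sketch. The function names, the chunking scheme, and the worker count are illustrative assumptions, not taken from the slides.

# A minimal sketch of the three parallelization steps, applied as
# loop-level data parallelism: summing the squares of a list of numbers.
# Names and parameters are illustrative only.
from multiprocessing import Pool

def partial_sum_of_squares(chunk):
    """Work done independently by each worker on its own chunk of data."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers=4):
    # Step 1: break the task up by splitting the data into chunks.
    chunk_size = (len(data) + workers - 1) // workers
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    # Step 2: assign the chunks to multiple workers running simultaneously.
    with Pool(processes=workers) as pool:
        partial_results = pool.map(partial_sum_of_squares, chunks)

    # Step 3: coordinate the workers by combining their partial results.
    return sum(partial_results)

if __name__ == "__main__":
    numbers = list(range(1_000_000))
    print(parallel_sum_of_squares(numbers, workers=4))

The same structure (split, map, reduce) underlies most data-parallel programs; only the per-chunk work and the combining step change.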
Performance?
► Performance as time
  • Time spent between the start and the end of a computation
► Performance as rate
  • MIPS (millions of instructions per second)
  • Not equivalent on all architectures
► Peak performance
  • Maximal (theoretical) performance of a resource
  • Real code achieves only a fraction of the peak performance

Code Performance
► How to make code go fast: "High Performance"
► Performance conflicts with:
  • Correctness: by trying to write fast code, one can break it
  • Readability: multiplication/division by 2 versus bit shifting; fast code requires more lines; modularity can hurt performance (abstract design)
  • Portability: code that is fast on machine A can be slow on machine B; at the extreme, highly optimized code is not portable at all, and in fact is done in hardware

Speedup
[Figure: speedup versus number of processors, with curves for linear, sub-linear, and super-linear speedup.]

Super-Linear Speedup
► Rare
► Some reasons for speedup > p (efficiency > 1):
  • The parallel computer has p times as much RAM, so a higher fraction of program memory is in RAM instead of on disk (an important reason for using parallel computers)
  • The parallel computer is solving a slightly different, easier problem, or providing a slightly different answer
  • In developing the parallel program, a better algorithm was discovered; the older serial algorithm was not the best possible

Amdahl's Law
► Amdahl (1967) noted: given a program, let f be the fraction of time spent on operations that must be performed serially.
► Then for p processors: Speedup(p) ≤ 1 / (f + (1 − f)/p)
► Thus, no matter how many processors are used: Speedup ≤ 1/f
► Unfortunately, typically f was 10 to 20%
► Useful rule of thumb: if the maximal possible speedup is S, then S processors run at about 50% efficiency
(A small numeric sketch of this bound appears after the last slide of this section.)

Maximal Possible Speedup
[Figure: maximal possible speedup; plot not recoverable from the text export.]

Another View of Amdahl's Law
► If a significant fraction of the code (in terms of time spent in it) is not parallelizable, then parallelization is not going to pay off

Scalability
► Measure of the "effort" needed to maintain efficiency while adding processors
► For a given problem size, plot the efficiency Eff(p) for increasing values of p
  • It should stay close to a flat line
► Isoefficiency: at which rate does the problem size need to be increased to maintain efficiency?
  • By making a problem ridiculously large, one can typically achieve good efficiency
  • Problem: is that how the machine/code will be used?

Traditional Parallel Computing & HPC Solutions
► Parallel Computing
  • Principles
  • Parallel Computer Architectures
  • Parallel Programming Models
  • Parallel Programming Languages
► Grid Computing
  • Multiple Infrastructures
  • Using Grids
  • P2P
  • Clouds
► Conclusion

Michael Flynn's Taxonomy: Classification of Computer Architectures
► Architectures are classified by their instruction streams and data streams:
  • SISD (Single Instruction, Single Data): single-threaded process
  • MISD (Multiple Instruction, Single Data): pipeline architecture
  • SIMD (Single Instruction, Multiple Data): vector processing
  • MIMD (Multiple Instruction, Multiple Data): multi-threaded programming
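To make the Amdahl's Law bound from the slides above concrete, here is a small Python sketch that tabulates the speedup limit Speedup(p) ≤ 1 / (f + (1 − f)/p) and the corresponding efficiency (speedup divided by p, the standard definition). The specific values of f and p are illustrative only, not taken from the slides.

# A small numeric sketch of Amdahl's Law:
#   Speedup(p) <= 1 / (f + (1 - f) / p),   efficiency = speedup / p
# The serial fractions f and processor counts p below are illustrative.

def amdahl_speedup(f, p):
    """Upper bound on speedup for serial fraction f on p processors."""
    return 1.0 / (f + (1.0 - f) / p)

for f in (0.05, 0.10, 0.20):
    limit = 1.0 / f                    # speedup <= 1/f no matter how large p is
    print(f"f = {f:.2f}  (max speedup = {limit:.0f})")
    for p in (2, 8, 32, 128):
        s = amdahl_speedup(f, p)
        print(f"  p = {p:4d}: speedup <= {s:6.2f}, efficiency <= {s / p:.2f}")

Running it shows the rule of thumb from the slide: with f = 0.10 (maximal speedup 10), using about 10 processors already drops the efficiency to roughly 50%.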