Lecture notes on Distributed Computing

Dr. Henry Evans
Published Date:10-07-2017
UNIT – I

LESSON 1: DISTRIBUTED SYSTEMS

CONTENTS
1.0. Aim and Objectives
1.1. Introduction
1.2. Organization
1.3. Goals and Advantages
1.4. Disadvantages
1.5. Architecture
1.6. Concurrency
1.7. Languages
1.8. Let Us Sum Up
1.9. Lesson-End Activities
1.10. Points for Discussion
1.11. References

1.0. AIM AND OBJECTIVES

At the end of this lesson you will be able to understand:
• the concept of distributed computing,
• the organization of distributed computing,
• the advantages and limitations of distributed computing.

1.1. INTRODUCTION

Distributed computing is a method of computer processing in which different parts of a program run simultaneously on two or more computers that communicate with each other over a network. Distributed computing is a type of segmented or parallel computing, but the latter term is most commonly used to refer to processing in which different parts of a program run simultaneously on two or more processors that are part of the same computer. Both types of processing require that a program be segmented, that is, divided into sections that can run simultaneously. Distributed computing also requires that the division of the program take into account the different environments on which the different sections will run. For example, two computers are likely to have different file systems and different hardware components.

An example of distributed computing is BOINC, a framework in which large problems can be divided into many small problems which are distributed to many computers. Later, the small results are reassembled into a larger solution.

Distributed computing is a natural result of using networks to enable computers to communicate efficiently. But distributed computing is distinct from computer networking or fragmented computing. The latter refers to two or more computers interacting with each other, but not, typically, sharing the processing of a single program.
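
The divide, distribute, and reassemble pattern described above for BOINC can be illustrated with a minimal Python sketch. This is not BOINC's API: the names `split`, `work_unit`, and `distributed_sum_of_squares` are hypothetical, and a local thread pool stands in for the remote computers so that the example stays self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide a large problem into at most `parts` contiguous chunks."""
    size = max(1, -(-len(data) // parts))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def work_unit(chunk):
    """The small problem that each (simulated) computer solves on its own."""
    return sum(x * x for x in chunk)


def distributed_sum_of_squares(data, workers=4):
    chunks = split(data, workers)
    # Assumption: threads stand in for networked machines in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(work_unit, chunks))
    # Reassemble the small results into the larger solution.
    return sum(partial_results)


print(distributed_sum_of_squares(list(range(1, 101))))  # 338350
```

In a real deployment the chunks would travel over a network and the workers could differ in hardware and file systems, which is exactly the environmental difference the introduction points out.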
The World Wide Web is an example of a network, but not an example of distributed computing. Numerous technologies and standards are used to construct distributed computations, including some that are specially designed and optimized for that purpose, such as Remote Procedure Calls (RPC), Remote Method Invocation (RMI), and .NET Remoting.

1.2. ORGANIZATION

Organizing the interaction between the computers is of prime importance. To be able to use the widest possible range and types of computers, the protocol or communication channel should not contain or use any information that may not be understood by certain machines. Special care must also be taken that messages are delivered correctly and that invalid messages, which could otherwise bring down the system and perhaps the rest of the network, are rejected.

Another important factor is the ability to send software to another computer in a portable way so that it may execute and interact with the existing network. This may not always be possible or practical when using differing hardware and resources, in which case other methods must be used, such as cross-compiling or manually porting the software.

1.3. GOALS AND ADVANTAGES

There are many different types of distributed computing systems and many challenges to overcome in successfully designing one. The main goal of a distributed computing system is to connect users and resources in a transparent, open, and scalable way. Ideally this arrangement is drastically more fault tolerant and more powerful than many combinations of stand-alone computer systems.

Openness

Openness is the property of distributed systems such that each subsystem is continually open to interaction with other systems (see references). Web Services protocols are standards which enable distributed systems to be extended and scaled. In general, an open system that scales has an advantage over a perfectly closed and self-contained system.
Consequently, open distributed systems are required to meet the following challenges:

Monotonicity
Once something is published in an open system, it cannot be taken back.

Pluralism
Different subsystems of an open distributed system include heterogeneous, overlapping and possibly conflicting information. There is no central arbiter of truth in open distributed systems.

Unbounded nondeterminism
Asynchronously, different subsystems can come up and go down, and communication links can come in and go out between subsystems of an open distributed system. Therefore the time that it will take to complete an operation cannot be bounded in advance.

1.4. DISADVANTAGES

Technical issues

If not planned properly, a distributed system can decrease the overall reliability of computations if the unavailability of a node can cause disruption of the other nodes. Leslie Lamport famously quipped: "A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable."

Troubleshooting and diagnosing problems in a distributed system can also become more difficult, because the analysis may require connecting to remote nodes or inspecting communication between nodes.

Many types of computation are not well suited for distributed environments, typically owing to the amount of network communication or synchronization that would be required between nodes. If bandwidth, latency, or communication requirements are too significant, then the benefits of distributed computing may be negated and the performance may be worse than in a non-distributed environment.

Project-related problems

Distributed computing projects may generate data that is proprietary to private industry, even though the process of generating that data involves the resources of volunteers. This may result in controversy as private industry profits from the data which is generated with the aid of volunteers.
In addition, some distributed computing projects, such as biology projects that aim to develop thousands or millions of "candidate molecules" for solving various medical problems, may create vast amounts of raw data. This raw data may be useless by itself without refinement or without testing of candidate results in real-world experiments. Such refinement and experimentation may be so expensive and time-consuming that it may literally take decades to sift through the data. Until the data is refined, no benefits can be acquired from the computing work.

Other projects suffer from lack of planning on the part of their well-meaning originators. These poorly planned projects may not generate results that are palpable, or may not generate data that ultimately result in finished, innovative scientific papers. Sensing that a project may not be generating useful data, the project managers may decide to abruptly terminate the project without definitive results, wasting the electricity and computing resources used in the project. Volunteers may feel disappointed and abused by such outcomes. There is an obvious opportunity cost in devoting time and energy to a project that is ultimately useless, when that computing power could have been devoted to a better planned distributed computing project generating useful, concrete results.

Another problem with distributed computing projects is that they may devote resources to problems that may not ultimately be soluble, or to problems that are best pursued later, when desktop computing power becomes fast enough to make pursuit of such solutions practical.

Some distributed computing projects may also attempt to use computers to find solutions by number-crunching mathematical or physical models. With such projects there is the risk that the model may not be designed well enough to efficiently generate concrete solutions.
The effectiveness of a distributed computing project is therefore determined largely by the sophistication of the project creators.

1.5. ARCHITECTURE

Various hardware and software architectures are used for distributed computing. At a lower level, it is necessary to interconnect multiple CPUs with some sort of network, regardless of whether that network is printed onto a circuit board or made up of loosely coupled devices and cables. At a higher level, it is necessary to interconnect processes running on those CPUs with some sort of communication system.

Distributed programming typically falls into one of several basic architectures or categories: client-server, 3-tier architecture, N-tier architecture, distributed objects, loose coupling, or tight coupling.

• Client-server: Smart client code contacts the server for data, then formats and displays it to the user. Input at the client is committed back to the server when it represents a permanent change.
• 3-tier architecture: Three-tier systems move the client intelligence to a middle tier so that stateless clients can be used. This simplifies application deployment. Most web applications are 3-tier.
• N-tier architecture: N-tier refers typically to web applications which further forward their requests to other enterprise services. This type of application is the one most responsible for the success of application servers.
• Tightly coupled (clustered): refers typically to a set of highly integrated machines that run the same process in parallel, subdividing the task into parts that are computed individually by each machine and then put back together to make the final result.
• Peer-to-peer: an architecture where there is no special machine or machines that provide a service or manage the network resources. Instead all responsibilities are uniformly divided among all machines, known as peers. Peers can serve both as clients and servers.
• Space based: refers to an infrastructure that creates the illusion (virtualization) of one single address space. Data are transparently replicated according to application needs. Decoupling in time, space and reference is achieved.

Another basic aspect of distributed computing architecture is the method of communicating and coordinating work among concurrent processes. Through various message passing protocols, processes may communicate directly with one another, typically in a master/slave relationship. Alternatively, a "database-centric" architecture can enable distributed computing to be done without any form of direct inter-process communication, by utilizing a shared database.

1.6. CONCURRENCY

Distributed computing implements a kind of concurrency. It is so tightly interrelated with concurrent programming that the two are sometimes not taught as distinct subjects.

Multiprocessor systems

A multiprocessor system is simply a computer that has more than one CPU on its motherboard. If the operating system is built to take advantage of this, it can run different processes (or different threads belonging to the same process) on different CPUs.

Multicore systems

Intel CPUs from the late Pentium 4 era (Northwood and Prescott cores) employed a technology called Hyper-Threading that allowed more than one thread (usually two) to run on the same CPU. The more recent Sun UltraSPARC T1, AMD Athlon 64 X2, AMD Athlon FX, AMD Opteron, Intel Pentium D, Intel Core, Intel Core 2 and Intel Xeon processors feature multiple processor cores to also increase the number of concurrent threads they can run.

Multicomputer systems

A multicomputer may be considered to be either a loosely coupled NUMA computer or a tightly coupled cluster. Multicomputers are commonly used when strong compute power is required in an environment with restricted physical space or electrical power. Common suppliers include Mercury Computer Systems, CSPI, and SKY Computers.
Common uses include 3D medical imaging devices and mobile radar.

Computing taxonomies

The types of distributed systems are based on Flynn's taxonomy of systems: single instruction, single data (SISD); single instruction, multiple data (SIMD); multiple instruction, single data (MISD); and multiple instruction, multiple data (MIMD). Other taxonomies and architectures are covered under the general topic of computer architecture.

Computer clusters

A cluster consists of multiple stand-alone machines acting in parallel across a local high-speed network. Distributed computing differs from cluster computing in that computers in a distributed computing environment are typically not exclusively running "group" tasks, whereas clustered computers are usually much more tightly coupled. Distributed computing also often consists of machines which are widely separated geographically.

Grid computing

A grid uses the resources of many separate computers, loosely connected by a network (usually the Internet), to solve large-scale computation problems. Public grids may use idle time on many thousands of computers throughout the world. Such arrangements permit handling of data that would otherwise require the power of expensive supercomputers or would have been impossible to analyze.

1.7. LANGUAGES

Nearly any programming language that has access to the full hardware of the system could handle distributed programming given enough time and code. Remote procedure calls let a program invoke procedures (including operating system commands) on another machine over a network connection. Systems like CORBA, Microsoft DCOM, Java RMI and others try to map object-oriented design to the network. Loosely coupled systems communicate through intermediate documents that are typically human readable (e.g. XML, HTML, SGML, X.500, and EDI).
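
As an illustration of a remote procedure call carried in a human-readable document format, the following minimal sketch uses XML-RPC from the Python standard library. It is only one possible mechanism, chosen here for brevity. The function names `add`, `start_server`, and `remote_add` are hypothetical, and the "remote" machine is assumed to be the local host on a port picked by the operating system.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """Procedure that lives on the 'remote' side."""
    return a + b


def start_server():
    # Port 0 asks the OS for any free port; localhost stands in for a remote host.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def remote_add(a, b):
    server = start_server()
    port = server.server_address[1]
    try:
        proxy = ServerProxy(f"http://127.0.0.1:{port}")
        # Looks like a local call; arguments and result travel as XML over HTTP.
        return proxy.add(a, b)
    finally:
        server.shutdown()
        server.server_close()


print(remote_add(2, 3))  # 5
```

The caller never builds or parses the XML document itself; the proxy object hides the message exchange behind an ordinary method call.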
Languages specifically tailored for distributed programming are:
• Ada programming language
• Alef programming language
• E programming language
• Erlang programming language
• Limbo programming language
• Oz programming language
• ZPL (programming language)
• Orca programming language

1.8. LET US SUM UP

In this lesson we discussed the concept of distributed computing and its organization and architecture. We also discussed the advantages and limitations of distributed computing, along with topics such as concurrency and the various languages used to implement distributed computing.

1.9. LESSON-END ACTIVITIES

1. Explain the features of concurrency control in a distributed computing environment.

1.10. POINTS FOR DISCUSSION

1. List various application areas where distributed computing is used.

1.11. REFERENCES

• Attiya, Hagit and Welch, Jennifer (2004). Distributed Computing: Fundamentals, Simulations, and Advanced Topics. Wiley-Interscience. ISBN 0471453242.
• Lynch, Nancy A. (1997). Distributed Algorithms. Morgan Kaufmann. ISBN 1558603484.
• http://en.wikipedia.org/wiki/Distributed_computing

LESSON 2: DESIGNING OF DISTRIBUTED SYSTEMS

CONTENTS
2.0. Aim and Objectives
2.1. Introduction
2.2. Implementation Process
2.2.1. Distributed Mutual Exclusion (DME)
2.2.2. Centralized Approach
2.2.3. Fully Distributed Approach
2.2.4. Behavior of Fully Distributed Approach
2.3. Designing a Distributed Processing System
2.4. Let Us Sum Up
2.5. Lesson-End Activities
2.6. Points for Discussion
2.7. References

2.0. AIM AND OBJECTIVES

At the end of this lesson, you will be able to understand:
• Distributed Mutual Exclusion (DME),
• the centralized approach,
• the fully distributed approach,
• the behavior of the fully distributed approach, and
• the process of designing a distributed system.

2.1. INTRODUCTION

The term distributed system is used to describe a system with the following characteristics: it consists of several computers that do not share a memory or a clock; the computers communicate with each other by exchanging messages over a communication network; and each computer has its own memory and runs its own operating system. The resources owned and controlled by a computer are said to be local to it, while the resources owned and controlled by other computers and those that can only be accessed through the network are said to be remote. Typically, accessing remote resources is more expensive than accessing local resources because of the communication delays that occur in the network and the CPU overhead incurred to process communication protocols. Based on the context, the terms computer, node, host, site, machine, processor, and workstation are used interchangeably to denote a computer throughout this lesson.

2.2. IMPLEMENTATION PROCESS

• Associate a timestamp with each system event.
• Require that for every pair of events A and B, if A → B (A happened before B), then the timestamp of A is less than the timestamp of B.
• Within each process Pi, a logical clock LCi is associated. The logical clock can be implemented as a simple counter that is incremented between any two successive events executed within a process. The logical clock is monotonically increasing.
• A process advances its logical clock when it receives a message whose timestamp is greater than the current value of its logical clock.
• If the timestamps of two events A and B are the same, then the events are concurrent. We may use the process identity numbers to break ties and to create a total ordering.

2.2.1. Distributed Mutual Exclusion (DME)

Assumptions
• The system consists of n processes; each process Pi resides at a different processor.
• Each process has a critical section that requires mutual exclusion.

Requirement
• If Pi is executing in its critical section, then no other process Pj is executing in its critical section.

We present two algorithms to ensure the mutually exclusive execution of processes in their critical sections.

2.2.2. DME: Centralized Approach

• One of the processes in the system is chosen to coordinate the entry to the critical section.
• A process that wants to enter its critical section sends a request message to the coordinator.
• The coordinator decides which process can enter the critical section next, and it sends that process a reply message.
• When the process receives a reply message from the coordinator, it enters its critical section.
• After exiting its critical section, the process sends a release message to the coordinator and proceeds with its execution.
• This scheme requires three messages per critical-section entry: request, reply, and release.

2.2.3. DME: Fully Distributed Approach

• When process Pi wants to enter its critical section, it generates a new timestamp, TS, and sends the message request(Pi, TS) to all other processes in the system.
• When process Pj receives a request message, it may reply immediately or it may defer sending a reply back.
• When process Pi receives a reply message from all other processes in the system, it can enter its critical section.
• After exiting its critical section, the process sends reply messages to all its deferred requests.

The decision whether process Pj replies immediately to a request(Pi, TS) message or defers its reply is based on three factors:
• If Pj is in its critical section, then it defers its reply to Pi.
• If Pj does not want to enter its critical section, then it sends a reply immediately to Pi.
• If Pj wants to enter its critical section but has not yet entered it, then it compares its own request timestamp with the timestamp TS. If its own request timestamp is greater than TS, then it sends a reply immediately to Pi (Pi asked first). Otherwise, the reply is deferred.

2.2.4. Behavior of Fully Distributed Approach

• Freedom from deadlock is ensured.
• Freedom from starvation is ensured, since entry to the critical section is scheduled according to the timestamp ordering. The timestamp ordering ensures that processes are served in first-come, first-served order.
• The number of messages per critical-section entry is 2 × (n – 1). This is the minimum number of required messages per critical-section entry when processes act independently and concurrently.

2.3. DESIGNING A DISTRIBUTED PROCESSING SYSTEM

In general, designing a distributed operating system is more difficult than designing a centralized operating system for several reasons.

Transparency

We saw that one of the main goals of a distributed operating system is to make the existence of multiple computers invisible (transparent) and provide a single system image to its users.
That is, a distributed operating system must be designed in such a way that a collection of distinct machines connected by a communication subsystem appears to its users as a virtual uniprocessor. Achieving complete transparency is a difficult task and requires that several different aspects of transparency be supported by the distributed operating system. The eight forms of transparency identified by the International Standards Organization's Reference Model for Open Distributed Processing (ISO, 1992) are access transparency, location transparency, replication transparency, failure transparency, migration transparency, concurrency transparency, performance transparency, and scaling transparency. These transparency aspects are described below.

Access Transparency

Access transparency means that users should not need or be able to recognize whether a resource (hardware or software) is remote or local. This implies that the distributed operating system should allow users to access remote resources in the same way as local resources.

Location Transparency

The two main aspects of location transparency are as follows:
1. Name transparency. This refers to the fact that the name of a resource (hardware or software) should not reveal any hint as to the physical location of the resource.
2. User mobility. This refers to the fact that no matter which machine a user is logged onto, he or she should be able to access a resource with the same name.

Replication Transparency

For better performance and reliability, almost all distributed operating systems have the provision to create replicas (additional copies) of files and other resources on different nodes of the distributed system. In these systems, both the existence of multiple copies of a replicated resource and the replication activity should be transparent to the users.
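
Name transparency and replication transparency can be pictured with a small in-memory sketch. It is an illustration under simplifying assumptions, not how any particular distributed operating system implements naming: the classes `Node` and `NameService` and their methods are hypothetical, and writing to every replica synchronously is just the simplest consistency choice.

```python
class Node:
    """One machine in the system, holding its own local copies."""

    def __init__(self, name):
        self.name = name
        self.storage = {}


class NameService:
    """Maps location-independent names to the nodes that hold replicas."""

    def __init__(self):
        self._replicas = {}  # resource name -> list of Node

    def create(self, resource, data, nodes):
        for node in nodes:
            node.storage[resource] = data
        self._replicas[resource] = list(nodes)

    def read(self, resource):
        # The caller never learns which node served the request.
        node = self._replicas[resource][0]
        return node.storage[resource]

    def write(self, resource, data):
        # Every replica is updated so that all copies stay identical.
        for node in self._replicas[resource]:
            node.storage[resource] = data


a, b = Node("a"), Node("b")
names = NameService()
names.create("report.txt", "v1", [a, b])
names.write("report.txt", "v2")
print(names.read("report.txt"))  # v2
```

The user-visible name "report.txt" contains no hint of a node, and the user cannot tell that two copies exist.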
Failure Transparency

Failure transparency deals with masking partial failures in the system from the users, such as a communication link failure, a machine failure, or a storage device crash. A distributed operating system having the failure transparency property will continue to function, perhaps in a degraded form, in the face of partial failures.

Migration Transparency

For better performance, reliability, and security reasons, an object that is capable of being moved (such as a process or a file) is often migrated from one node to another in a distributed system.

Concurrency Transparency

Concurrency transparency means that each user has a feeling that he or she is the sole user of the system and other users do not exist in the system.

Performance Transparency

The aim of performance transparency is to allow the system to be automatically reconfigured to improve performance, as loads vary dynamically in the system.

Scaling Transparency

The aim of scaling transparency is to allow the system to expand in scale without disrupting the activities of the users.

Reliability

In general, distributed systems are expected to be more reliable than centralized systems due to the existence of multiple instances of resources. However, the existence of multiple instances of the resources alone cannot increase the system's reliability. Rather, the distributed operating system, which manages these resources, must be designed properly to increase the system's reliability by taking full advantage of this characteristic feature of a distributed system.

A fault is a mechanical or algorithmic defect that may generate an error. A fault in a system causes system failure. Depending on the manner in which a failed system behaves, system failures are of two types: fail-stop and Byzantine. For higher reliability, the fault-handling mechanisms of a distributed operating system must be designed properly to avoid faults, to tolerate faults, and to detect and recover from faults.
Commonly used methods for dealing with these issues are fault avoidance and fault tolerance.

Flexibility

Another important issue in the design of distributed operating systems is flexibility. Flexibility is the most important feature for open distributed systems. The design of a distributed operating system should be flexible for the following reasons:
1. Ease of modification.
2. Ease of enhancement.

Performance

If a distributed system is to be used, its performance must be at least as good as that of a centralized system. That is, when a particular application is run on a distributed system, its overall performance should be better than or at least equal to that of running the same application on a single-processor system. However, to achieve this goal, it is important that the various components of the operating system of a distributed system be designed properly; otherwise, the overall performance of the distributed system may turn out to be worse than that of a centralized system. Some design principles considered useful for better performance are as follows:
1. Batch if possible.
2. Cache whenever possible.
3. Minimize copying of data.
4. Minimize network traffic.
5. Take advantage of fine-grain parallelism for multiprocessing.

Scalability

Scalability refers to the capability of a system to adapt to increased service load. It is inevitable that a distributed system will grow with time, since it is very common to add new machines or an entire subnetwork to the system to take care of increased workload or organizational changes in a company. Therefore, a distributed operating system should be designed to easily cope with the growth of nodes and users in the system. That is, such growth should not cause serious disruption of service or significant loss of performance to users. Some guiding principles for designing scalable distributed systems are as follows:
1. Avoid centralized entities.
2. Avoid centralized algorithms.
3. Perform most operations on client workstations.
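
Several of the principles listed above (batch if possible, cache whenever possible, minimize network traffic, and perform most operations on the client) can be seen working together in a short sketch. The classes `RemoteStore` and `CachingClient` are hypothetical, and the `round_trips` counter is only a stand-in for real network cost.

```python
class RemoteStore:
    """Stand-in for a server; every call counts as one network round trip."""

    def __init__(self, data):
        self._data = dict(data)
        self.round_trips = 0

    def fetch_many(self, keys):
        self.round_trips += 1
        return {k: self._data[k] for k in keys}


class CachingClient:
    """Client-side wrapper that batches requests and caches results."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def get_many(self, keys):
        missing = [k for k in keys if k not in self._cache]
        if missing:
            # Batch: one request covers every key that is not cached yet.
            self._cache.update(self._store.fetch_many(missing))
        return [self._cache[k] for k in keys]


store = RemoteStore({"a": 1, "b": 2, "c": 3})
client = CachingClient(store)
client.get_many(["a", "b"])  # one round trip for both keys
client.get_many(["a", "b"])  # served entirely from the cache
client.get_many(["b", "c"])  # one round trip, for "c" only
print(store.round_trips)  # 2
```

This sketch ignores cache invalidation; a real system would also need a way to notice when the server's copy changes.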
Heterogeneity

A heterogeneous distributed system consists of interconnected sets of dissimilar hardware or software systems. Because of the diversity, designing heterogeneous distributed systems is far more difficult than designing homogeneous distributed systems in which each system is based on the same, or closely related, hardware and software. However, as a consequence of large scale, heterogeneity is often inevitable in distributed systems. Furthermore, heterogeneity is often preferred by users because heterogeneous distributed systems give them the flexibility to use different computer platforms for different applications.

Security

In order that the users can trust the system and rely on it, the various resources of a computer system must be protected against destruction and unauthorized access. Enforcing security in a distributed system is more difficult than in a centralized system because of the lack of a single point of control and the use of insecure networks for data communication. Therefore, as compared to a centralized system, enforcement of security in a distributed system has the following additional requirements:
1. It should be possible for the sender of a message to know that the message was received by the intended receiver.
2. It should be possible for the receiver of a message to know that the message was sent by the genuine sender.
3. It should be possible for both the sender and receiver of a message to be guaranteed that the contents of the message were not changed while it was in transfer.

Cryptography is the only known practical method for dealing with these security aspects of a distributed system.

Emulation of Existing Operating Systems

For commercial success, it is important that a newly designed distributed operating system be able to emulate existing popular operating systems such as UNIX.
With this property, new software can be written using the system call interface of the new operating system to take full advantage of its special features of distribution, while the vast amount of already existing software can also be run on the same system without being rewritten. Therefore, moving to the new distributed operating system will allow both types of software to be run side by side.

2.4. LET US SUM UP

In this lesson we discussed various implementation techniques, issues, and approaches in designing a distributed processing system. We also discussed the role of the distributed operating system, along with various concepts used in distributed systems.

2.5. LESSON-END ACTIVITIES

1. Explain the various reasons for designing applications in a distributed processing system.

2.6. POINTS FOR DISCUSSION

1. Differentiate between the centralized approach and the fully distributed approach.

2.7. REFERENCES

• Attiya, Hagit and Welch, Jennifer (2004). Distributed Computing: Fundamentals, Simulations, and Advanced Topics. Wiley-Interscience. ISBN 0471453242.
• Nadiminti, Dias de Assunção, Buyya (September 2006). "Distributed Systems and Recent Innovations: Challenges and Benefits". InfoNet Magazine, Volume 16, Issue 3, Melbourne, Australia.
• http://en.wikipedia.org/wiki/Distributed_computing

UNIT – II

LESSON 3: DISTRIBUTED PROCESSING SYSTEMS

CONTENTS
3.0. Aim and Objectives
3.1. Introduction
3.2. Pros and Cons of Distributed Processing
3.3. Distributed Computing System Models
3.4. Distributed Operating System
3.5. Let Us Sum Up
3.6. Lesson-End Activities
3.7. Points for Discussion
3.8. References

3.0. AIM AND OBJECTIVES

At the end of this lesson, you will be able to understand:
• the advantages and limitations of distributed processing,
• various types of distributed computing system models,
• the distributed operating system.

3.1. INTRODUCTION

The reasons behind the development of distributed systems were the availability of powerful microprocessors at low cost as well as significant advances in communication technology. The availability of powerful yet cheap microprocessors led to the development of powerful workstations that satisfy a single user's needs. These powerful stand-alone workstations satisfy user needs by providing such things as bit-mapped displays and visual interfaces, which traditional time-sharing mainframe systems do not support.

When a group of people works together, there is generally a need to communicate with each other, to share data, and to share expensive resources (such as high-quality printers, disk drives, etc.). This requires interconnecting computers and resources. Designing such systems became feasible with the availability of cheap and powerful microprocessors, and advances in communication technology.

When a few powerful workstations are interconnected and can communicate with each other, the total computing power available in such a system can be enormous. Such a system generally only costs tens of thousands of dollars. On the other hand, if one tries to obtain a single machine with computing power equal to that of a network of workstations, the cost can be as high as a few million dollars. Hence, the main advantage of distributed systems is that they have a decisive price/performance advantage over more traditional time-sharing systems.

3.2. PROS AND CONS OF DISTRIBUTED PROCESSING

Resource sharing: Since a computer can request a service from another computer by sending an appropriate request to it over the communication network, hardware and software resources can be shared among computers. For example, a printer, a compiler, a text processor, or a database at a computer can be shared with remote computers.

Enhanced performance: A distributed computing system is capable of providing rapid response time and higher system throughput.
This ability is mainly due to the fact that many tasks can be concurrently executed at different computers. Moreover, distributed systems can employ a load distributing technique to improve response time. In load distributing, tasks at heavily loaded computers are transferred to lightly loaded computers, thereby reducing the time tasks wait before receiving service.

Improved reliability and availability: A distributed computing system provides improved reliability and availability because a few components of the system can fail without affecting the availability of the rest of the system. Also, through the replication of data (e.g., files and directories) and services, distributed systems can be made fault tolerant. Services are processes that provide functionality (e.g., a file service provides file system management; a mail service provides an electronic mail facility).

Modular expandability: Distributed computing systems are inherently amenable to modular expansion because new hardware and software resources can be easily added without replacing the existing resources.

3.3. DISTRIBUTED COMPUTING SYSTEM MODELS

Various models are used for building distributed computing systems. These models can be broadly classified into five categories: minicomputer, workstation, workstation-server, processor-pool, and hybrid. They are briefly described below.

Minicomputer Model

The minicomputer model is a simple extension of the centralized time-sharing system. A distributed computing system based on this model consists of minicomputers (they may be large supercomputers as well) interconnected by a communication network. Each minicomputer usually has multiple users simultaneously logged on to it. For this, several interactive terminals are connected to each minicomputer. Each user is logged on to one specific minicomputer, with remote access to other minicomputers.
The network allows a user to access remote resources that are available on some machine other than the one on to which the user is currently logged. The minicomputer model may be used when resource sharing with remote users is desired (such as sharing information databases of different types, with each type of database located on a different machine). The early ARPANET is an example of a distributed computing system based on the minicomputer model.

Fig 3.1 Distributed computing system based on the minicomputer model (minicomputers, each with attached terminals, interconnected by a communication network).

Workstation Model

A distributed computing system based on the workstation model consists of several workstations interconnected by a communication network. A company's office or a university department may have several workstations scattered throughout a building or campus, each workstation equipped with its own disk and serving as a single-user computer. It has often been found that in such an environment, at any one time (especially at night), a significant proportion of the workstations are idle (not being used), resulting in the waste of large amounts of CPU time. Therefore, the idea of the workstation model is to interconnect all these workstations by a high-speed LAN so that idle workstations may be used to process jobs of users who are logged onto other workstations and do not have sufficient processing power at their own workstations to get their jobs processed efficiently.

Fig. 3.2 A distributed computing system based on the workstation model (workstations interconnected by a communication network).

Workstation-Server Model

The workstation model is a network of personal workstations, each with its own disk and a local file system.
A workstation with its own local disk is usually called a diskful workstation, and a workstation without a local disk is called a diskless workstation. With the proliferation of high-speed networks, diskless workstations have become more popular in network environments than diskful workstations, making the workstation-server model more popular than the workstation model for building distributed computing systems. A distributed computing system based on the workstation-server model consists of a few minicomputers and several workstations (most of which are diskless, but a few of which may be diskful) interconnected by a communication network.

Fig 3.3. Distributed computing system based on the workstation-server model (workstations and minicomputers used as file server, database server, and printer server, interconnected by a communication network).

Processor-Pool Model

The processor-pool model is based on the observation that most of the time a user does not need any computing power, but once in a while he or she may need a very large amount of computing power for a short time (e.g., when recompiling a program consisting of a large number of files after changing a basic shared declaration). Therefore, unlike the workstation-server model, in which a processor is allocated to each user, in the processor-pool model the processors are pooled together to be shared by the users as needed. The pool of processors consists of a large number of microcomputers and minicomputers attached to the network. Each processor in the pool has its own memory to load and run a system program or an application program of the distributed computing system.

Fig 3.4. A distributed computing system based on the processor-pool model (terminals, a run server, a file server, and the pool of processors, interconnected by a communication network).
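The idea of sharing pooled processors on demand can be illustrated with a minimal sketch. It is not taken from any real system: the class name ProcessorPool, its methods, and the policy of rejecting a request that exceeds the free processors are all assumptions made for illustration. The sketch only models the bookkeeping that a component such as the run server in Fig 3.4 might perform.

```python
class ProcessorPool:
    """Hypothetical bookkeeping for a shared pool of processors."""

    def __init__(self, processor_ids):
        # Processors that are currently idle and can be handed out.
        self.free = list(processor_ids)
        # Maps each user to the processors currently assigned to that user.
        self.allocated = {}

    def free_count(self):
        """Return the number of idle processors."""
        return len(self.free)

    def allocate(self, user, count):
        """Assign `count` idle processors to `user` for a burst of work.

        Assumed policy: the request fails as a whole if the pool cannot
        satisfy it. A real system might queue or partially grant it.
        """
        if count > len(self.free):
            raise RuntimeError("not enough free processors in the pool")
        granted = [self.free.pop() for _ in range(count)]
        self.allocated.setdefault(user, []).extend(granted)
        return granted

    def release(self, user):
        """Return all of a user's processors to the pool when the job ends."""
        returned = self.allocated.pop(user, [])
        self.free.extend(returned)
        return len(returned)


if __name__ == "__main__":
    pool = ProcessorPool(["p0", "p1", "p2", "p3"])
    pool.allocate("alice", 3)      # a large recompilation takes three processors
    print(pool.free_count())       # one processor is left for other users
    pool.release("alice")          # job done, processors go back to the pool
    print(pool.free_count())
```

The point of the sketch is that no processor belongs to a particular user: a user holds processors only for the duration of a job, which is what distinguishes this model from the workstation-server model.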
Hybrid Model

Of the four models described above, the workstation-server model is the most widely used for building distributed computing systems. This is because a large number of computer users only perform simple interactive tasks such as editing jobs, sending electronic mail, and executing small programs. The workstation-server model is ideal for such simple usage. However, in a working environment that has groups of users who often perform jobs needing massive computation, the processor-pool model is more attractive and suitable.

To combine the advantages of both models, a hybrid model may be used to build a distributed computing system. The hybrid model is based on the workstation-server model but with the addition of a pool of processors. The processors in the pool can be allocated dynamically for computations that are too large for workstations or that require several computers concurrently for efficient execution. In addition, the hybrid model gives a guaranteed response to interactive jobs by allowing them to be processed on the local workstations of the users.

3.4 DISTRIBUTED OPERATING SYSTEM

An operating system is a program that controls the resources of a computer system and provides its users with an interface or virtual machine that is more convenient to use than the bare machine. According to this definition, the two primary tasks of an operating system are as follows:

1. To present users with a virtual machine that is easier to program than the underlying hardware.
2. To manage the various resources of the system. This involves performing such tasks as keeping track of who is using which resource, granting resource requests, accounting for resource usage, and mediating conflicting requests from different programs and users.

The operating systems commonly used for distributed computing systems can be broadly classified into two types: network operating systems and distributed operating systems.
The three most important features commonly used to differentiate between these two types of operating systems are system image, autonomy, and fault tolerance capability. These features are explained below.

1. System image. The most important feature used to differentiate between the two types of operating systems is the image of the distributed computing system from the point of view of its users. In the case of a network operating system, the users view the distributed computing system as a collection of distinct machines connected by a communication subsystem. A distributed operating system hides the existence of multiple computers and provides a single-system image to its users. That is, it makes a collection of networked machines act as a virtual uniprocessor.

2. Autonomy. In the case of a network operating system, each computer of the distributed computing system has its own local operating system (the operating systems of different computers may be the same or different), and there is essentially no coordination at all among the computers except for the rule that when two processes on different computers communicate with each other, they must use a mutually agreed-on communication protocol. With a distributed operating system, there is a single system-wide operating system, and each computer of the distributed computing system runs a part of this global operating system. The distributed operating system tightly interweaves all the computers of the distributed computing
