Chapter 1: Introduction
• Data Independence and Distributed Data Processing
• Definition of Distributed Databases
• Promises of Distributed Databases
• Technical Problems to be Studied
Acknowledgements: I am indebted to Arturas Mazeika for providing me his slides of this course.
DDB 2008/09 J. Gamper
Syllabus
• Distributed DBMS Architecture
• Distributed Database Design
• Query Processing
• Transaction Management
• Distributed Concurrency Control
• Distributed DBMS Reliability
• Parallel Database Systems
Data Independence
• In the old days, programs stored data in regular files
• Each program has to maintain its own data
– huge overhead
Data Independence . . .
• The development of DBMS helped to fully achieve data independence (transparency)
• Provide centralized and controlled data maintenance and access
• Application is immune to physical and logical file organization
Data Independence . . .
• A distributed database system is the union of what appear to be two diametrically opposed
approaches to data processing: database systems and computer networks
– Computer networks promote a mode of work that goes against centralization
• Key issues to understand this combination
– The most important objective of DB technology is integration not centralization
– Integration is possible without centralization, i.e., integration of databases and
networking does not mean centralization (in fact, quite the opposite)
• Goal of distributed database systems: achieve data integration and data distribution
Distributed Computing/Data Processing
• A distributed computing system is a collection of autonomous processing elements
that are interconnected by a computer network. The elements cooperate in order to
perform the assigned task.
• The term “distributed” is very broadly used. The exact meaning of the word depends on
the context.
• Synonymous terms:
– distributed function
– distributed data processing
– satellite processing
– back-end processing
– dedicated/special purpose computers
– timeshared systems
– functionally modular systems
Distributed Computing/Data Processing . . .
• What can be distributed?
– Processing logic
• Classification of distributed systems with respect to various criteria
– Degree of coupling, i.e., how closely the processing elements are connected
∗ e.g., measured as ratio of amount of data exchanged to amount of local processing
∗ weak coupling, strong coupling
– Interconnection structure
∗ point-to-point connection between processing elements
∗ common interconnection channel
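The coupling measure above can be sketched in a few lines. This is only an illustration: the function names, the units (bytes and operation counts), and the threshold that separates weak from strong coupling are assumptions made for this example and are not fixed by the slides.

```python
def coupling_ratio(bytes_exchanged: float, local_ops: float) -> float:
    """Ratio of data exchanged with other elements to local processing."""
    if local_ops <= 0:
        raise ValueError("local_ops must be positive")
    return bytes_exchanged / local_ops


def classify_coupling(ratio: float, threshold: float = 1.0) -> str:
    # The threshold is an arbitrary, illustrative cut-off (assumption).
    return "strong coupling" if ratio >= threshold else "weak coupling"
```

A processing element that exchanges little data relative to its local work falls on the weak-coupling side of whatever threshold is chosen.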
Definition of DDB and DDBMS
• A distributed database (DDB) is a collection of multiple, logically interrelated databases
distributed over a computer network
• A distributed database management system (DDBMS) is the software that manages
the DDB and provides an access mechanism that makes this distribution transparent to
the user
• The terms DDBMS and DDBS are often used interchangeably
• Implicit assumptions
– Data is stored at a number of sites; each site logically consists of a single processor
– Processors at different sites are interconnected by a computer network (we do not
consider multiprocessors in DDBMS, cf. parallel systems)
– DDBS is a database, not a collection of files (cf. relational data model). Placement
and querying of data are affected by the access patterns of the user
– DDBMS is a collection of DBMSs (not a remote file system)
Definition of DDB and DDBMS . . .
• Example: The database consists of 3 relations, employees, projects, and
assignment, which are partitioned and stored at different sites (fragmentation).
• What are the problems with queries, transactions, concurrency, and reliability?
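To make the fragmentation example concrete, here is a minimal sketch of how a selection over a horizontally fragmented employees relation could be answered. The site names, attribute names, and sample tuples are hypothetical, and the "remote" access is simulated by a dictionary lookup; a real DDBS would ship subqueries over the network.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

# Hypothetical horizontal fragments of the employees relation, one per site.
SITES: Dict[str, List[Row]] = {
    "S1": [{"eno": "E1", "name": "Smith", "city": "Bolzano"}],
    "S2": [{"eno": "E2", "name": "Jones", "city": "Vienna"},
           {"eno": "E3", "name": "Rossi", "city": "Vienna"}],
}


def select_employees(predicate: Callable[[Row], bool]) -> List[Row]:
    """Evaluate a selection at every site and union the partial results."""
    result: List[Row] = []
    for site in sorted(SITES):
        # In a real DDBS this step would be a remote subquery.
        result.extend(row for row in SITES[site] if predicate(row))
    return result
```

Even this toy version hints at the problems the question raises: every site must be reachable for the answer to be complete, and an update that moves a tuple between fragments touches more than one site.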
What is not a DDBS?
• The following systems are parallel database systems and are quite different from (though
related to) distributed DB systems
– Shared memory
– Shared disk
– Shared nothing
– Central databases
Applications
• Manufacturing, especially multi-plant manufacturing
• Military command and control
• Hotel chains
• Any organization which has a decentralized organization structure
Promises of DDBSs
Distributed Database Systems deliver the following advantages:
• Higher reliability
• Improved performance
• Easier system expansion
• Transparency of distributed and replicated data
Promises of DDBSs . . .
Higher reliability
• Replication of components
• No single points of failure
• e.g., a broken communication link or processing element does not bring down the entire
system
• Distributed transaction processing guarantees the consistency of the database and
Promises of DDBSs . . .
Improved performance
• Proximity of data to its points of use
– Reduces remote access delays
– Requires some support for fragmentation and replication
• Parallelism in execution
– Inter-query parallelism
– Intra-query parallelism
• Update and read-only queries influence the design of DDBSs substantially
– If mostly read-only access is required, as much as possible of the data should be
replicated
– Writing becomes more complicated with replicated data
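The read/write trade-off with replicated data can be illustrated with a toy model. The sketch below assumes a read-one-write-all policy, which is one common scheme and is not named in the slides; the class name, site names, and the message counter are hypothetical.

```python
from typing import Dict, List


class ReplicatedItem:
    """One data item with a full copy at each listed site (toy model)."""

    def __init__(self, sites: List[str], value: int) -> None:
        self.copies: Dict[str, int] = {site: value for site in sites}
        self.messages = 0  # counts simulated remote accesses

    def read(self, local_site: str) -> int:
        # A read is served by the local copy if one exists.
        if local_site in self.copies:
            return self.copies[local_site]
        self.messages += 1
        return next(iter(self.copies.values()))

    def write(self, local_site: str, value: int) -> None:
        # A write must reach every copy to keep them consistent.
        for site in self.copies:
            if site != local_site:
                self.messages += 1
            self.copies[site] = value
```

With three copies, a local read costs no remote access while a single write costs two, which is why read-mostly workloads favour replication and update-heavy workloads do not.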
Promises of DDBSs . . .
Easier system expansion
• Issue is database scaling
• Emergence of microprocessor and workstation technologies
– Network of workstations much cheaper than a single mainframe computer
• Data communication cost versus telecommunication cost
• Increasing database size
Promises of DDBSs . . .
Transparency
• Refers to the separation of the higher-level semantics of the system from the lower-level
implementation issues
• A transparent system “hides” the implementation details from the users.
• A fully transparent DBMS provides high-level support for the development of complex
applications
Figure: (a) User wants to see one database; (b) Programmer sees many databases
Promises of DDBSs . . .
Various forms of transparency can be distinguished for DDBMSs:
• Network transparency (also called distribution transparency)
– Location transparency
– Naming transparency
• Replication transparency
• Fragmentation transparency
• Transaction transparency
– Concurrency transparency
– Failure transparency
• Performance transparency
Promises of DDBSs . . .
• Network/Distribution transparency allows a user to perceive a DDBS as a single,
logical entity
• The user is protected from the operational details of the network (or even does not know
about the existence of the network)
• The user does not need to know the location of data items, and a command used to
perform a task is independent of both the location of the data and the site where the
task is performed (location transparency)
• A unique name is provided for each object in the database (naming transparency)
– In the absence of this, users are required to embed the location name as part of an
object name
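Location transparency can be pictured as a catalog lookup performed by the system rather than by the user. The following is a minimal sketch under that assumption; the catalog layout, the storage dictionaries, and the function names `scan` and `move` are hypothetical and only simulate what a DDBMS would do internally.

```python
from typing import Dict, List

# Hypothetical global catalog: relation name -> site holding it.
CATALOG: Dict[str, str] = {"employees": "S1", "projects": "S2"}

# Hypothetical per-site storage.
STORAGE: Dict[str, Dict[str, List[str]]] = {
    "S1": {"employees": ["E1", "E2"]},
    "S2": {"projects": ["P1"]},
}


def scan(relation: str) -> List[str]:
    """Return all tuples of a relation; the caller never names a site."""
    site = CATALOG[relation]  # location resolved by the system
    return list(STORAGE[site][relation])


def move(relation: str, new_site: str) -> None:
    """Relocate a relation; user commands stay unchanged."""
    old_site = CATALOG[relation]
    STORAGE.setdefault(new_site, {})[relation] = STORAGE[old_site].pop(relation)
    CATALOG[relation] = new_site
```

Calling `scan("employees")` returns the same result before and after `move("employees", "S3")`, which is the property the slide describes: the command is independent of where the data resides.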
Promises of DDBSs . . .
Different ways to ensure naming transparency:
• Solution 1: Create a central name server; however, this results in
– loss of some local autonomy
– central site may become a bottleneck
– low availability (if the central site fails, the remaining sites cannot create new objects)
• Solution 2: Prefix object with identifier of site that created it
– e.g., branch created at site S1 might be named S1.BRANCH
– Also need to identify each fragment and its copies
– e.g., copy 2 of fragment 3 of Branch created at site S1 might be referred to as
S1.BRANCH.F3.C2
• An approach that resolves these problems uses aliases for each database object
– Thus, S1.BRANCH.F3.C2 might be known as local branch by user at site S1
– DDBMS has task of mapping an alias to appropriate database object
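The site-prefix naming scheme and the alias mapping can be sketched as follows. The name format follows the S1.BRANCH.F3.C2 example from the slide; the helper `global_name`, the class `AliasTable`, and the alias spelling `local_branch` are hypothetical names chosen for this illustration.

```python
from typing import Dict, Optional


def global_name(site: str, obj: str,
                fragment: Optional[int] = None,
                copy: Optional[int] = None) -> str:
    """Build a site-prefixed name such as S1.BRANCH.F3.C2."""
    parts = [site, obj]
    if fragment is not None:
        parts.append(f"F{fragment}")
    if copy is not None:
        parts.append(f"C{copy}")
    return ".".join(parts)


class AliasTable:
    """Per-site mapping from user-visible aliases to global names."""

    def __init__(self) -> None:
        self._aliases: Dict[str, str] = {}

    def define(self, alias: str, name: str) -> None:
        self._aliases[alias] = name

    def resolve(self, alias: str) -> str:
        return self._aliases[alias]
```

A user at site S1 would then work with the alias only, and the system would resolve it to the full name when the object is accessed.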