database management systems solutions manual fifth edition pdf free downlaod
University of Wisconsin
Madison, WI, USA
Ithaca, NY, USA
Jeﬀ Derstadt, Scott Selikoﬀ, and Lin Zhu
Ithaca, NY, USACONTENTS
1 INTRODUCTIONTODATABASESYSTEMS 1
2 INTRODUCTIONTODATABASEDESIGN 6
4 RELATIONALALGEBRAANDCALCULUS 28
5 SQL:QUERIES,CONSTRAINTS,TRIGGERS 45
6 DATABASEAPPLICATIONDEVELOPMENT 63
7 INTERNETAPPLICATIONS 66
8 OVERVIEWOFSTORAGEANDINDEXING 73
9 STORINGDATA:DISKSANDFILES 81
10 TREE-STRUCTUREDINDEXING 88
11 HASH-BASEDINDEXING 100
12 OVERVIEWOFQUERYEVALUATION 119
13 EXTERNALSORTING 126
14 EVALUATIONOFRELATIONALOPERATORS 131
15 ATYPICALQUERYOPTIMIZER 144
16 OVERVIEWOFTRANSACTIONMANAGEMENT 159
17 CONCURRENCYCONTROL 167
18 CRASHRECOVERY 179
19 SCHEMAREFINEMENTANDNORMALFORMS 189
20 PHYSICALDATABASEDESIGNANDTUNING 204
21 SECURITY 215PREFACE
It is not every question that deserves an answer.
Publius Syrus, 42 B.C.
I hope that most of the questions in this book deserve an answer. The set of questions
is unusually extensive, and is designed to reinforce and deepen students’ understanding
of the concepts covered in each chapter. There is a strong emphasis on quantitative
and problem-solving type exercises.
While I wrote some of the solutions myself, most were written originally by students
in the database classes at Wisconsin. I’d like to thank the many students who helped
in developing and checking the solutions to the exercises; this manual would not be
available without their contributions. In alphabetical order: X. Bao, S. Biao, M.
Chakrabarti, C. Chan, W. Chen, N. Cheung, D. Colwell, J. Derstadt, C. Fritz, V.
Ganti, J. Gehrke, G. Glass, V. Gopalakrishnan, M. Higgins, T. Jasmin, M. Krish-
naprasad, Y. Lin, C. Liu, M. Lusignan, H. Modi, S. Narayanan, D. Randolph, A.
Ranganathan, J. Reminga, A. Therber, M. Thomas, Q. Wang, R. Wang, Z. Wang and
J. Yuan. In addition, James Harrington and Martin Reames at Wisconsin and Nina
Tang at Berkeley provided especially detailed feedback.
Several students contributed to each chapter’s solutions, and answers were subse-
quently checked by me and by other students. This manual has been in use for several
semesters. I hope that it is now mostly accurate, but I’m sure it still contains er-
rors and omissions. If you are a student and you do not understand a particular
solution, contact your instructor; it may be that you are missing something, but it
may also be that the solution is incorrect If you discover a bug, please send me mail
(raghucs.wisc.edu) and I will update the manual promptly.
The latest version of this solutions manual is distributed freely through the Web; go
to the home page mentioned below to obtain a copy.
For More Information
The home page for this book is at URL:
This page is frequently updated and contains information about the book, past and
current users, and the software. This page also contains a link to all known errors in
the book, the accompanying slides, and the software. Since the solutions manual is
distributed electronically, all known errors are immediately ﬁxed and no list of errors is
maintained. Instructors are advised to visit this site periodically; they can also register
at this site to be notiﬁed of important changes by email.1
INTRODUCTION TO DATABASE
Exercise 1.1 Why would you choose a database system instead of simply storing data
in operating system ﬁles? When would it make sense not to use a database system?
Answer 1.1 A database is an integrated collection of data, usually so large that it
has to be stored on secondary storage devices such as disks or tapes. This data can
be maintained as a collection of operating system ﬁles, or stored in a DBMS (database
management system). The advantages of using a DBMS are:
Data independence and eﬃcient access. Database application programs are in-
dependent of the details of data representation and storage. The conceptual and
external schemas provide independence from physical storage decisions and logical
design decisions respectively. In addition, a DBMS provides eﬃcient storage and
retrieval mechanisms, including support for very large ﬁles, index structures and
Reduced application development time. Since the DBMS provides several impor-
tant functions required by applications, such as concurrency control and crash
recovery, high level query facilities, etc., only application-speciﬁc code needs to
be written. Even this is facilitated by suites of application development tools
available from vendors for many database management systems.
Data integrity and security. The view mechanism and the authorization facilities
of a DBMS provide a powerful access control mechanism. Further, updates to the
data that violate the semantics of the data can be detected and rejected by the
DBMS if users specify the appropriate integrity constraints.
Data administration. By providing a common umbrella for a large collection of
data that is shared by several users, a DBMS facilitates maintenance and data
administration tasks. A good DBA can eﬀectively shield end-users from the chores
of ﬁne-tuning the data representation, periodic back-ups etc.
12 Chapter 1
Concurrent access and crash recovery. A DBMS supports the notion of a trans-
action, which is conceptually a single user’s sequential program. Users can write
transactions as if their programs were running in isolation against the database.
The DBMS executes the actions of transactions in an interleaved fashion to obtain
good performance, but schedules them in such a way as to ensure that conﬂicting
operations are not permitted to proceed concurrently. Further, the DBMS main-
tains a continuous log of the changes to the data, and if there is a system crash,
it can restore the database to a transaction-consistent state. That is, the actions
of incomplete transactions are undone, so that the database state reﬂects only the
actions of completed transactions. Thus, if each complete transaction, executing
alone, maintains the consistency criteria, then the database state after recovery
from a crash is consistent.
If these advantages are not important for the application at hand, using a collection of
ﬁles may be a better solution because of the increased cost and overhead of purchasing
and maintaining a DBMS.
Exercise 1.2 What is logical data independence and why is it important?
Answer 1.2 Answer omitted.
Exercise 1.3 Explain the diﬀerence between logical and physical data independence.
Answer 1.3 Logical data independence means that users are shielded from changes
in the logical structure of the data, while physical data independence insulates users
from changes in the physical storage of the data. We saw an example of logical data
independence in the answer to Exercise 1.2. Consider the Students relation from that
example (and now assume that it is not replaced by the two smaller relations). We
could choose to store Students tuples in a heap ﬁle, with a clustered index on the
sname ﬁeld. Alternatively, we could choose to store it with an index on the gpa ﬁeld,
or to create indexes on both ﬁelds, or to store it as a ﬁle sorted by gpa. These storage
alternatives are not visible to users, except in terms of improved performance, since
they simply see a relation as a set of tuples. This is what is meant by physical data
Exercise 1.4 Explain the diﬀerence between external, internal, and conceptual sche-
mas. How are these diﬀerent schema layers related to the concepts of logical and
physical data independence?
Answer 1.4 Answer omitted.
Exercise 1.5 What are the responsibilities of a DBA? If we assume that the DBA
is never interested in running his or her own queries, does the DBA still need to
understand query optimization? Why?Introduction to Database Systems 3
Answer 1.5 The DBA is responsible for:
Designing the logical and physical schemas, as well as widely-used portions of the
Security and authorization.
Data availability and recovery from failures.
Database tuning: The DBA is responsible for evolving the database, in particular
the conceptual and physical schemas, to ensure adequate performance as user
A DBA needs to understand query optimization even if s/he is not interested in run-
ning his or her own queries because some of these responsibilities (database design
and tuning) are related to query optimization. Unless the DBA understands the per-
formance needs of widely used queries, and how the DBMS will optimize and execute
these queries, good design and tuning decisions cannot be made.
Exercise 1.6 Scrooge McNugget wants to store information (names, addresses, de-
scriptions of embarrassing moments, etc.) about the many ducks on his payroll. Not
surprisingly, the volume of data compels him to buy a database system. To save
money, he wants to buy one with the fewest possible features, and he plans to run it as
a stand-alone application on his PC clone. Of course, Scrooge does not plan to share
his list with anyone. Indicate which of the following DBMS features Scrooge should
pay for; in each case, also indicate why Scrooge should (or should not) pay for that
feature in the system he buys.
1. A security facility.
2. Concurrency control.
3. Crash recovery.
4. A view mechanism.
5. A query language.
Answer 1.6 Answer omitted.
Exercise 1.7 Which of the following plays an important role in representing informa-
tion about the real world in a database? Explain brieﬂy.
1. The data deﬁnition language.4 Chapter 1
2. The data manipulation language.
3. The buﬀer manager.
4. The data model.
Answer 1.7 Let us discuss the choices in turn.
The data deﬁnition language is important in representing information because it
is used to describe external and logical schemas.
The data manipulation language is used to access and update data; it is not
important for representing the data. (Of course, the data manipulation language
must be aware of how data is represented, and reﬂects this in the constructs that
The buﬀer manager is not very important for representation because it brings
arbitrary disk pages into main memory, independent of any data representation.
The data model is fundamental to representing information. The data model
determines what data representation mechanisms are supported by the DBMS.
The data deﬁnition language is just the speciﬁc set of language constructs available
to describe an actual application’s data in terms of the data model.
Exercise 1.8 Describe the structure of a DBMS. If your operating system is upgraded
to support some new functions on OS ﬁles (e.g., the ability to force some sequence of
bytes to disk), which layer(s) of the DBMS would you have to rewrite to take advantage
of these new functions?
Answer 1.8 Answer omitted.
Exercise 1.9 Answer the following questions:
1. What is a transaction?
2. Why does a DBMS interleave the actions of diﬀerent transactions instead of exe-
cuting transactions one after the other?
3. What must a user guarantee with respect to a transaction and database consis-
tency? What should a DBMS guarantee with respect to concurrent execution of
several transactions and database consistency?
4. Explain the strict two-phase locking protocol.
5. What is the WAL property, and why is it important?Introduction to Database Systems 5
Answer 1.9 Let us answer each question in turn:
1. A transaction is any one execution of a user program in a DBMS. This is the basic
unit of change in a DBMS.
2. A DBMS is typically shared among many users. Transactions from these users
can be interleaved to improve the execution time of users’ queries. By interleav-
ing queries, users do not have to wait for other user’s transactions to complete
fully before their own transaction begins. Without interleaving, if user A begins
a transaction that will take 10 seconds to complete, and user B wants to be-
gin a transaction, user B would have to wait an additional 10 seconds for user
A’s transaction to complete before the database would begin processing user B’s
3. A user must guarantee that his or her transaction does not corrupt data or insert
nonsense in the database. For example, in a banking database, a user must guar-
antee that a cash withdraw transaction accurately models the amount a person
removes from his or her account. A database application would be worthless if
a person removed 20 dollars from an ATM but the transaction set their balance
to zero A DBMS must guarantee that transactions are executed fully and in-
dependently of other transactions. An essential property of a DBMS is that a
transaction should execute atomically, or as if it is the only transaction running.
Also, transactions will either complete fully, or will be aborted and the database
returned to it’s initial state. This ensures that the database remains consistent.
4. Strict two-phase locking uses shared and exclusive locks to protect data. A trans-
action must hold all the required locks before executing, and does not release any
lock until the transaction has completely ﬁnished.
5. The WAL property aﬀects the logging strategy in a DBMS. The WAL, Write-
Ahead Log, property states that each write action must be recorded in the log
(on disk) before the corresponding change is reﬂected in the database itself. This
protects the database from system crashes that happen during a transaction’s
execution. By recording the change in a log before the change is truly made, the
database knows to undo the changes to recover from a system crash. Otherwise,
if the system crashes just after making the change in the database but before
the database logs the change, then the database would not be able to detect his
change during crash recovery.2
INTRODUCTION TO DATABASE
Exercise 2.1 Explain the following terms brieﬂy: attribute, domain, entity, relation-
ship, entity set, relationship set, one-to-many relationship, many-to-many relationship,
participation constraint, overlap constraint, covering constraint, weak entity set, aggre-
gation,and role indicator.
Answer 2.1 Term explanations:
Attribute - a property or description of an entity. A toy department employee
entity could have attributes describing the employee’s name, salary, and years of
Domain - a set of possible values for an attribute.
Entity - an object in the real world that is distinguishable from other objects such
as the green dragon toy.
Relationship - an association among two or more entities.
Entity set - a collection of similar entities such as all of the toys in the toy depart-
Relationship set - a collection of similar relationships
One-to-many relationship - a key constraint that indicates that one entity can be
associated with many of another entity. An example of a one-to-many relationship
is when an employee can work for only one department, and a department can
have many employees.
Many-to-many relationship - a key constraint that indicates that many of one
entity can be associated with many of another entity. An example of a many-
to-many relationship is employees and their hobbies: a person can have many
diﬀerent hobbies, and many people can have the same hobby.
6Introduction to Database Design 7
Participation constraint - a participation constraint determines whether relation-
ships must involve certain entities. An example is if every department entity has
a manager entity. Participation constraints can either be total or partial. A total
participation constraint says that every department has a manager. A partial
participation constraint says that every employee does not have to be a manager.
Overlap constraint - within an ISA hierarchy, an overlap constraint determines
whether or not two subclasses can contain the same entity.
Covering constraint - within an ISA hierarchy, a covering constraint determines
where the entities in the subclasses collectively include all entities in the superclass.
For example, with an Employees entity set with subclasses HourlyEmployee and
SalaryEmployee, does every Employee entity necessarily have to be within either
HourlyEmployee or SalaryEmployee?
Weak entity set - an entity that cannot be identiﬁed uniquely without considering
some primary key attributes of another identifying owner entity. An example is
including Dependent information for employees for insurance purposes.
Aggregation - a feature of the entity relationship model that allows a relationship
set to participate in another relationship set. This is indicated on an ER diagram
by drawing a dashed box around the aggregation.
Role indicator - If an entity set plays more than one role, role indicators describe
the diﬀerent purpose in the relationship. An example is a single Employee entity
set with a relation Reports-To that relates supervisors and subordinates.
Exercise 2.2 A university database contains information about professors (identiﬁed
by social security number, or SSN) and courses (identiﬁed by courseid). Professors
teach courses; each of the following situations concerns the Teaches relationship set. For
each situation, draw an ER diagram that describes it (assuming no further constraints
1. Professors can teach the same course in several semesters, and each oﬀering must
2. Professors can teach the same course in several semesters, and only the most
recent such oﬀering needs to be recorded. (Assume this condition applies in all
3. Every professor must teach some course.
4. Every professor teaches exactly one course (no more, no less).
5. Every professor teaches exactly one course (no more, no less), and every course
must be taught by some professor.8 Chapter 2
6. Now suppose that certain courses can be taught by a team of professors jointly,
but it is possible that no one professor in a team can teach the course. Model this
situation, introducing additional entity sets and relationship sets if necessary.
Answer 2.2 Answer omitted.
Exercise 2.3 Consider the following information about a university database:
Professors have an SSN, a name, an age, a rank, and a research specialty.
Projects have a project number, a sponsor name (e.g., NSF), a starting date, an
ending date, and a budget.
Graduate students have an SSN, a name, an age, and a degree program (e.g., M.S.
Each project is managed by one professor (known as the project’s principal inves-
Each project is worked on by one or more professors (known as the project’s
Professors can manage and/or work on multiple projects.
Each project is worked on by one or more graduate students (known as the
project’s research assistants).
When graduate students work on a project, a professor must supervise their work
on the project. Graduate students can work on multiple projects, in which case
they will have a (potentially diﬀerent) supervisor for each one.
Departments have a department number, a department name, and a main oﬃce.
Departments have a professor (known as the chairman) who runs the department.
Professors work in one or more departments, and for each department that they
work in, a time percentage is associated with their job.
Graduate students have one major department in which they are working on their
Each graduate student has another, more senior graduate student (known as a
student advisor) who advises him or her on what courses to take.
Design and draw an ER diagram that captures the information about the university.
Use only the basic ER model here; that is, entities, relationships, and attributes. Be
sure to indicate any key and participation constraints.Introduction to Database Design 9
Figure 2.1 ER Diagram for Exercise 2.3
age speciality pid start_date
ssn rank sponsor end_date
Dept Major senior
age deg_prog10 Chapter 2
Answer 2.3 The ER diagram is shown in Figure 2.1.
Exercise 2.4 A company database needs to store information about employees (iden-
tiﬁed by ssn,with salary and phone as attributes), departments (identiﬁed by dno,
with dname and budget as attributes), and children of employees (with name and age
as attributes). Employees work in departments; each department is managed by an
employee; a child must be identiﬁed uniquely by name when the parent (who is an
employee; assume that only one parent works for the company) is known. We are not
interested in information about a child once the parent leaves the company.
Draw an ER diagram that captures this information.
Answer 2.4 Answer omitted.
Exercise 2.5 Notown Records has decided to store information about musicians who
perform on its albums (as well as other company data) in a database. The company
has wisely chosen to hire you as a database designer (at your usual consulting fee of
Each musician that records at Notown has an SSN, a name, an address, and
a phone number. Poorly paid musicians often share the same address, and no
address has more than one phone.
Each instrument used in songs recorded at Notown has a unique identiﬁcation
number, a name (e.g., guitar, synthesizer, ﬂute) and a musical key (e.g., C, B-ﬂat,
Each album recorded on the Notown label has a unique identiﬁcation number, a
title, a copyright date, a format (e.g., CD or MC), and an album identiﬁer.
Each song recorded at Notown has a title and an author.
Each musician may play several instruments, and a given instrument may be
played by several musicians.
Each album has a number of songs on it, but no song may appear on more than
Each song is performed by one or more musicians, and a musician may perform a
number of songs.
Each album has exactly one musician who acts as its producer. A musician may
produce several albums, of course.Introduction to Database Design 11
Figure 2.2 ER Diagram for Exercise 2.5
Design a conceptual schema for Notown and draw an ER diagram for your schema.
The preceding information describes the situation that the Notown database must
model. Be sure to indicate all key and cardinality constraints and any assumptions
you make. Identify any constraints you are unable to capture in the ER diagram and
brieﬂy explain why you could not express them.
Answer 2.5 The ER diagram is shown in Figure 2.2.
name albumIdentifier speed
ssn copyrightDate title
dname title12 Chapter 2
Exercise 2.6 Computer Sciences Department frequent ﬂiers have been complaining to
Dane County Airport oﬃcials about the poor organization at the airport. As a result,
the oﬃcials decided that all information related to the airport should be organized
using a DBMS, and you have been hired to design the database. Your ﬁrst task is
to organize the information about all the airplanes stationed and maintained at the
airport. The relevant information is as follows:
Every airplane has a registration number, and each airplane is of a speciﬁc model.
The airport accommodates a number of airplane models, and each model is iden-
tiﬁed by a model number (e.g., DC-10) and has a capacity and a weight.
A number of technicians work at the airport. You need to store the name, SSN,
address, phone number, and salary of each technician.
Each technician is an expert on one or more plane model(s), and his or her exper-
tise may overlap with that of other technicians. This information about technicians
must also be recorded.
Traﬃc controllers must have an annual medical examination. For each traﬃc
controller, you must store the date of the most recent exam.
All airport employees (including technicians) belong to a union. You must store
the union membership number of each employee. You can assume that each
employee is uniquely identiﬁed by a social security number.
The airport has a number of tests that are used periodically to ensure that air-
planes are still airworthy. Each test has a Federal Aviation Administration (FAA)
test number, a name, and a maximum possible score.
The FAA requires the airport to keep track of each time a given airplane is tested
by a given technician using a given test. For each testing event, the information
needed is the date, the number of hours the technician spent doing the test, and
the score the airplane received on the test.
1. Draw an ER diagram for the airport database. Be sure to indicate the various
attributes of each entity and relationship set; also specify the key and participation
constraints for each relationship set. Specify any necessary overlap and covering
constraints as well (in English).
2. The FAA passes a regulation that tests on a plane must be conducted by a tech-
nician who is an expert on that model. How would you express this constraint in
the ER diagram? If you cannot express it, explain brieﬂy.
Answer 2.6 Answer omitted.Introduction to Database Design 13
Exercise 2.7 The Prescriptions-R-X chain of pharmacies has oﬀered to give you a
free lifetime supply of medicine if you design its database. Given the rising cost of
health care, you agree. Here’s the information that you gather:
Patients are identiﬁed by an SSN, and their names, addresses, and ages must be
Doctors are identiﬁed by an SSN. For each doctor, the name, specialty, and years
of experience must be recorded.
Each pharmaceutical company is identiﬁed by name and has a phone number.
For each drug, the trade name and formula must be recorded. Each drug is
sold by a given pharmaceutical company, and the trade name identiﬁes a drug
uniquely from among the products of that company. If a pharmaceutical company
is deleted, you need not keep track of its products any longer.
Each pharmacy has a name, address, and phone number.
Every patient has a primary physician. Every doctor has at least one patient.
Each pharmacy sells several drugs and has a price for each. A drug could be sold
at several pharmacies, and the price could vary from one pharmacy to another.
Doctors prescribe drugs for patients. A doctor could prescribe one or more drugs
for several patients, and a patient could obtain prescriptions from several doctors.
Each prescription has a date and a quantity associated with it. You can assume
that, if a doctor prescribes the same drug for the same patient more than once,
only the last such prescription needs to be stored.
Pharmaceutical companies have long-term contracts with pharmacies. A phar-
maceutical company can contract with several pharmacies, and a pharmacy can
contract with several pharmaceutical companies. For each contract, you have to
store a start date, an end date, and the text of the contract.
Pharmacies appoint a supervisor for each contract. There must always be a super-
visor for each contract, but the contract supervisor can change over the lifetime
of the contract.
1. Draw an ER diagram that captures the preceding information. Identify any con-
straints not captured by the ER diagram.
2. How would your design change if each drug must be sold at a ﬁxed price by all
3. How would your design change if the design requirements change as follows: If a
doctor prescribes the same drug for the same patient more than once, several such
prescriptions may have to be stored.14 Chapter 2
age address phy_ssn speciality
ssn name name exp_years
Patient Pri_physician Doctor
name address phone_num quentity
Pharmacy Sell Drug
Figure 2.3 ER Diagram for Exercise 2.7Introduction to Database Design 15
Answer 2.7 1. The ER diagram is shown in Figure 2.3.
2. If the drug is to be sold at a ﬁxed price we can add the price attribute to the Drug
entity set and eliminate the price from the Sell relationship set.
3. The date information can no longer be modeled as an attribute of Prescription.
We have to create a new entity set called Prescription date and make Prescription
a 4-way relationship set that involves this additional entity set.
Exercise 2.8 Although you always wanted to be an artist, you ended up being an ex-
pert on databases because you love to cook data and you somehow confused database
with data baste. Your old love is still there, however, so you set up a database company,
ArtBase, that builds a product for art galleries. The core of this product is a database
with a schema that captures all the information that galleries need to maintain. Gal-
leries keep information about artists, their names (which are unique), birthplaces, age,
and style of art. For each piece of artwork, the artist, the year it was made, its unique
title, its type of art (e.g., painting, lithograph, sculpture, photograph), and its price
must be stored. Pieces of artwork are also classiﬁed into groups of various kinds, for
example, portraits, still lifes, works by Picasso, or works of the 19th century; a given
piece may belong to more than one group. Each group is identiﬁed by a name (like
those just given) that describes the group. Finally, galleries keep information about
customers. For each customer, galleries keep that person’s unique name, address, total
amount of dollars spent in the gallery (very important), and the artists and groups of
art that the customer tends to like.
Draw the ER diagram for the database.
Answer 2.8 Answer omitted.
Exercise 2.9 Answer the following questions.
Explain the following terms brieﬂy: UML, use case diagrams, statechart dia-
grams, class diagrams, database diagrams, component diagrams,and deployment
Explain the relationship between ER diagrams and UML.
Answer 2.9 Not yet done.