Advantages of Document type definition

OliviaCutts,France,Teacher
Published Date:01-08-2017
xmlhtp1_06.fm Page 134 Monday, November 20, 2000 3:27 PM

6 Document Type Definition (DTD)

Objectives
• To understand what a DTD is.
• To be able to write DTDs.
• To be able to declare elements and attributes in a DTD.
• To understand the difference between general entities and parameter entities.
• To be able to use conditional sections with entities.
• To be able to use NOTATIONs.
• To understand how an XML document’s whitespace is processed.

To whom nothing is given, of him can nothing be required.
Henry Fielding

Like everything metaphysical, the harmony between thought and reality is to be found in the grammar of the language.
Ludwig Wittgenstein

Grammar, which knows how to control even kings.
Molière

Outline
6.1 Introduction
6.2 Parsers, Well-formed and Valid XML Documents
6.3 Document Type Declaration
6.4 Element Type Declarations
6.4.1 Sequences, Pipe Characters and Occurrence Indicators
6.4.2 EMPTY, Mixed Content and ANY
6.5 Attribute Declarations
6.5.1 Attribute Defaults (#REQUIRED, #IMPLIED, #FIXED)
6.6 Attribute Types
6.6.1 Tokenized Attribute Type (ID, IDREF, ENTITY, NMTOKEN)
6.6.2 Enumerated Attribute Types
6.7 Conditional Sections
6.8 Whitespace Characters
6.9 Case Study: Writing a DTD for the Day Planner Application
6.10 Internet and World Wide Web Resources
Summary • Terminology • Self-Review Exercises • Answers to Self-Review Exercises • Exercises

6.1 Introduction
In this chapter, we discuss Document Type Definitions (DTDs) that define an XML document’s structure (e.g., what elements, attributes, etc. are permitted in the document). An XML document is not required to have a corresponding DTD. However, DTDs are often recommended to ensure document conformity, especially in business-to-business (B2B) transactions, where XML documents are exchanged.
DTDs specify an XML document’s structure and are themselves defined using EBNF (Extended Backus-Naur Form) grammar, not the XML syntax introduced in Chapter 5.

Software Engineering Observation 6.1
A transition is underway in the XML community from DTDs to Schema (Chapter 7), which improve upon DTDs. Schema use XML syntax, not EBNF grammar.

6.2 Parsers, Well-formed and Valid XML Documents
Parsers are generally classified as validating or nonvalidating. A validating parser is able to read the DTD and determine whether or not the XML document conforms to it. If the document conforms to the DTD, it is referred to as valid. If the document fails to conform to the DTD but is syntactically correct, it is well formed but not valid. By definition, a valid document is well formed.

A nonvalidating parser is able to read the DTD, but cannot check the document against the DTD for conformity. If the document is syntactically correct, it is well formed.

We will discuss validating and nonvalidating parsers in greater depth in Chapters 8 and 9. In this chapter, we use Microsoft’s XML Validator to check for document conformance to a DTD. XML Validator is available at no charge from

   msdn.microsoft.com/downloads/samples/Internet/xml/xml_validator/sample.asp

6.3 Document Type Declaration
DTDs are introduced into XML documents using the document type declaration (i.e., DOCTYPE). A document type declaration is placed in the XML document’s prolog and begins with <!DOCTYPE and ends with >. The document type declaration can point to declarations that are outside the XML document (called the external subset) or can contain the declaration inside the document (called the internal subset). For example, an internal subset might look like

   <!DOCTYPE myMessage [
      <!ELEMENT myMessage ( #PCDATA )>
   ]>

The first myMessage is the name of the document type declaration.
Anything inside the square brackets ([ ]) constitutes the internal subset. As we will see momentarily, ELEMENT and #PCDATA are used in “element declarations.”

External subsets physically exist in a different file that typically ends with the .dtd extension, although this file extension is not required. External subsets are specified using either keyword SYSTEM or PUBLIC. For example, the DOCTYPE external subset might look like

   <!DOCTYPE myMessage SYSTEM "myDTD.dtd">

which points to the myDTD.dtd document. Using the PUBLIC keyword indicates that the DTD is widely used (e.g., the DTD for HTML documents). The DTD may be made available in well-known locations for more efficient downloading. We used such a DTD in Chapters 2 and 3 when we created HTML documents. The DOCTYPE

   <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
      "http://www.w3.org/TR/html4/strict.dtd">

uses the PUBLIC keyword to reference the well-known DTD for HTML version 4.01. XML parsers that do not have a local copy of the DTD may use the URL provided to download the DTD to perform validation.

Both the internal and external subset may be specified at the same time. For example, the DOCTYPE

   <!DOCTYPE myMessage SYSTEM "myDTD.dtd" [
      <!ELEMENT myElement ( #PCDATA )>
   ]>

contains declarations from the myDTD.dtd document as well as an internal declaration.

Software Engineering Observation 6.2
The document type declaration’s internal subset plus its external subset form the DTD.

Software Engineering Observation 6.3
The internal subset is visible only within the document in which it resides. Other external documents cannot be validated against it. DTDs that are used by many documents should be placed in the external subset.

6.4 Element Type Declarations
Elements are the primary building block used in XML documents and are declared in a DTD with element type declarations (ELEMENTs).
For example, to declare element myElement, we might write

   <!ELEMENT myElement ( #PCDATA )>

The element name (e.g., myElement) that follows ELEMENT is often called a generic identifier. The set of parentheses that follows the element name specifies the element’s allowed content and is called the content specification. Keyword PCDATA specifies that the element must contain parsable character data. This data will be parsed by the XML parser; therefore, any markup text (i.e., <, >, &, etc.) will be treated as markup. We will discuss the content specification in detail momentarily.

Common Programming Error 6.1
Attempting to use the same element name in multiple element type declarations is an error.

Figure 6.1 lists an XML document that contains a reference to an external DTD in the DOCTYPE. We use Microsoft’s XML Validator to check the document’s conformity against its DTD. Note: To use XML Validator, Internet Explorer 5 is required. In Chapters 8 and 9, we introduce parsers XML4J and Xerces, which can be used to check a document’s validity against a DTD programmatically. Using Java and one of these parsers provides a platform-independent way to validate XML documents.

The document type declaration (line 6) is named myMessage, the name of the root element. The element myMessage (lines 8–10) contains a single child element named message (line 9).

    1  <?xml version = "1.0"?>
    2
    3  <!-- Fig. 6.1: intro.xml -->
    4  <!-- Using an external subset -->
    5
    6  <!DOCTYPE myMessage SYSTEM "intro.dtd">
    7
    8  <myMessage>
    9     <message>Welcome to XML</message>
   10  </myMessage>

Fig. 6.1 XML document declaring its associated DTD.

    1  <!-- Fig. 6.2: intro.dtd -->
    2  <!-- External declarations -->
    3
    4  <!ELEMENT myMessage ( message )>
    5  <!ELEMENT message ( #PCDATA )>

Fig. 6.2 Validation using an external DTD.

Line 4 of the DTD (Fig. 6.2) declares element myMessage. Notice that the content specification contains the name message. This indicates that element myMessage contains exactly one child element named message. Because myMessage can only have an element as its content, it is said to have element content. Line 5 declares element message, whose content is of type PCDATA.

Note: Many XML Validator screen captures contain the term SCHEMA. The XML Validator is capable of validating an XML document against both DTDs and documents (called Schemas) that also define an XML document’s structure. In Chapter 7, we will discuss Schema and how they differ from DTDs.

Common Programming Error 6.2
Having a root element name other than the name specified in the document type declaration is an error.

If an XML document’s structure is inconsistent with its corresponding DTD but is syntactically correct, it is only well formed, not valid. Figure 6.3 shows the messages generated by Microsoft’s XML Validator when the required message element is omitted.

    1  <?xml version = "1.0"?>
    2
    3  <!-- Fig. 6.3 : intro-invalid.xml -->
    4  <!-- Simple introduction to XML markup -->
    5
    6  <!DOCTYPE myMessage SYSTEM "intro.dtd">
    7
    8  <!-- Root element missing child element message -->
    9  <myMessage>
   10  </myMessage>

Fig. 6.3 Non-valid XML document.

6.4.1 Sequences, Pipe Characters and Occurrence Indicators
DTDs allow the document author to define the order and frequency of child elements. The comma (,), called a sequence, specifies the order in which the elements must occur. For example,

   <!ELEMENT classroom ( teacher, student )>

specifies that element classroom must contain exactly one teacher element followed by exactly one student element. The content specification can contain any number of items in sequence.
Similarly, choices are specified using the pipe character (|), as in

   <!ELEMENT dessert ( iceCream | pastry )>

which specifies that element dessert must contain either one iceCream element or one pastry element, but not both. The content specification may contain any number of pipe character-separated choices.

An element’s frequency (i.e., number of occurrences) is specified by using either the plus sign (+), asterisk (*) or question mark (?) occurrence indicator (Fig. 6.4).

A plus sign indicates one or more occurrences. For example,

   <!ELEMENT album ( song+ )>

specifies that element album contains one or more song elements.

The frequency of an element group (i.e., two or more elements that occur in some combination) is specified by enclosing the element names inside the content specification with parentheses, followed by either the plus sign, asterisk or question mark. For example,

   <!ELEMENT album ( title, ( songTitle, duration )+ )>

indicates that element album contains one title element followed by one or more songTitle/duration element groups. At least one songTitle/duration group must follow title, and in each of these element groups, the songTitle must precede the duration. An example of markup that conforms to this is

   <album>
      <title>XML Classical Hits</title>

      <songTitle>XML Overture</songTitle>
      <duration>10</duration>

      <songTitle>XML Symphony 1.0</songTitle>
      <duration>54</duration>
   </album>

which contains one title element followed by two songTitle/duration groups.

   Occurrence Indicator   Description
   Plus sign ( + )        An element can appear any number of times, but must appear at least once (i.e., the element appears one or more times).
   Asterisk ( * )         An element is optional and, if used, can appear any number of times (i.e., the element appears zero or more times).
   Question mark ( ? )    An element is optional and, if used, can appear only once (i.e., the element appears zero or one times).

Fig. 6.4 Occurrence indicators.
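One way to build intuition for sequences, choices and occurrence indicators is to notice that an element-content specification behaves like a regular expression over the names of an element's children. The Python sketch below is an illustration only, not a description of how any validating parser is implemented. The helper names content_model_to_regex and children_conform are hypothetical, the name pattern is a simplification, and the translation covers element content only (no #PCDATA, EMPTY or ANY).

```python
import re

# Simplified pattern for an XML element name (an assumption for this sketch).
NAME = r"[A-Za-z_:][\w.\-:]*"

def content_model_to_regex(model):
    """Translate an element-content specification into a compiled regex."""
    compact = re.sub(r"\s+", "", model)
    # Each element name becomes a group that matches "name;".
    pattern = re.sub(NAME, lambda m: "(?:" + re.escape(m.group(0)) + ";)", compact)
    # A comma means "followed by", which is plain concatenation in a regex.
    # The pipe and the +, * and ? indicators already mean the same thing.
    pattern = pattern.replace(",", "")
    return re.compile(pattern)

def children_conform(model, child_names):
    """Return True if the ordered child element names satisfy the model."""
    sequence = "".join(name + ";" for name in child_names)
    return content_model_to_regex(model).fullmatch(sequence) is not None

ALBUM = "( title, ( songTitle, duration )+ )"
album_ok = children_conform(
    ALBUM, ["title", "songTitle", "duration", "songTitle", "duration"])
album_without_songs = children_conform(ALBUM, ["title"])
```

Here album_ok is True because the children match the declared sequence, while album_without_songs is False because the plus sign demands at least one songTitle/duration group.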
The asterisk (*) character indicates an optional element that, if used, can occur any number of times. For example,

   <!ELEMENT library ( book* )>

indicates that element library contains any number of book elements, including the possibility of none at all. Markup examples that conform to this are

   <library>
      <book>The Wealth of Nations</book>
      <book>The Iliad</book>
      <book>The Jungle</book>
   </library>

and

   <library></library>

Optional elements that, if used, may occur only once are followed by a question mark (?). For example,

   <!ELEMENT seat ( person? )>

indicates that element seat contains at most one person element. Examples of markup that conform to this are

   <seat>
      <person>Jane Doe</person>
   </seat>

and

   <seat></seat>

Now we consider three more complicated element type declarations and explain each. The declaration

   <!ELEMENT class ( number, ( instructor | assistant+ ),
      ( credit | noCredit ) )>

specifies that a class element must contain a number element, either one instructor element or one or more assistant elements, and either one credit element or one noCredit element. Markup examples that conform to this are

   <class>
      <number>123</number>
      <instructor>Dr. Harvey Deitel</instructor>
      <credit>4</credit>
   </class>

and

   <class>
      <number>456</number>
      <assistant>Tem Nieto</assistant>
      <assistant>Paul Deitel</assistant>
      <credit>3</credit>
   </class>

The declaration

   <!ELEMENT donutBox ( jelly?, lemon*,
      ( ( creme | sugar )+ | glazed ) )>

specifies that element donutBox can have zero or one jelly elements, followed by zero or more lemon elements, followed by one or more creme or sugar elements or exactly one glazed element.
Markup examples that conform to this are

   <donutBox>
      <jelly>grape</jelly>
      <lemon>half-sour</lemon>
      <lemon>sour</lemon>
      <lemon>half-sour</lemon>
      <glazed>chocolate</glazed>
   </donutBox>

and

   <donutBox>
      <sugar>semi-sweet</sugar>
      <creme>whipped</creme>
      <sugar>sweet</sugar>
   </donutBox>

The declaration

   <!ELEMENT farm ( farmer+, ( dog* | cat? ), pig*,
      ( goat | cow )?, ( chicken+ | duck* ) )>

indicates that element farm can have one or more farmer elements, any number of optional dog elements or an optional cat element, any number of optional pig elements, an optional goat or cow element and one or more chicken elements or any number of optional duck elements. Examples of markup that conform to this are

   <farm>
      <farmer>Jane Doe</farmer>
      <farmer>John Doe</farmer>
      <cat>Lucy</cat>
      <pig>Bo</pig>
      <chicken>Jill</chicken>
   </farm>

and

   <farm>
      <farmer>Red Green</farmer>
      <duck>Billy</duck>
      <duck>Sue</duck>
   </farm>

6.4.2 EMPTY, Mixed Content and ANY
Elements must be further refined by specifying the types of content they contain. In the last section, we introduced element content, indicating that an element can contain one or more child elements as its content. In this section, we introduce content specification types for describing non-element content.

In addition to element content, three other types of content exist: EMPTY, mixed content and ANY. Keyword EMPTY declares empty elements. Empty elements do not contain character data or child elements. For example,

   <!ELEMENT oven EMPTY>

declares element oven to be an empty element. The markup for an oven element would appear as

   <oven/>

in an XML document conforming to this declaration.

An element can also be declared as having mixed content. Such elements may contain any combination of elements and PCDATA. For example, the declaration

   <!ELEMENT myMessage ( #PCDATA | message )*>

indicates that element myMessage contains mixed content.
Markup conforming to this declaration might look like

   <myMessage>Here is some text, some
      <message>other text</message>and
      <message>even more text</message>.
   </myMessage>

Element myMessage contains two message elements and three instances of character data. Because of the *, element myMessage could have contained nothing.

Figure 6.5 specifies a DTD as an internal subset (lines 6–10) as opposed to an external subset (Fig. 6.1). In the prolog (line 1) we use the standalone attribute with a value of yes. An XML document is standalone if it does not reference an external subset. This DTD defines three elements: one that contains mixed content and two that contain parsed character data.

    1  <?xml version = "1.0" standalone = "yes"?>
    2
    3  <!-- Fig. 6.5 : mixed.xml -->
    4  <!-- Mixed content type elements -->
    5
    6  <!DOCTYPE format [
    7     <!ELEMENT format ( #PCDATA | bold | italic )*>
    8     <!ELEMENT bold ( #PCDATA )>
    9     <!ELEMENT italic ( #PCDATA )>
   10  ]>
   11
   12  <format>
   13     This is a simple formatted sentence.
   14     <bold>I have tried bold.</bold>
   15     <italic>I have tried italic.</italic>
   16     Now what?
   17  </format>

Fig. 6.5 Example of a mixed-content element.

Line 7 declares element format as a mixed content element. According to the declaration, the format element may contain either parsed character data (PCDATA), element bold or element italic. The asterisk indicates that the content can occur zero or more times. Lines 8 and 9 specify that bold and italic elements have PCDATA only for their content specification; they cannot contain child elements. Despite the fact that elements with PCDATA content specification cannot contain child elements, they are still considered to have mixed content. The comma (,), plus sign (+) and question mark (?) occurrence indicators cannot be used with mixed content elements that contain only PCDATA.
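The mixed-content form is rigid: #PCDATA comes first, alternatives are separated by pipe characters, and a group that names child elements must end with an asterisk. Because these are rules of DTD syntax, even a nonvalidating parser can report violations. As a hedged illustration, the Python sketch below feeds three variants of line 7 of Fig. 6.5 to the Expat parser through Python's standard xml.parsers.expat module and records which ones it accepts. The helper name dtd_is_accepted and the variable names are hypothetical, and the exact error message Expat reports is not shown because it may differ between versions.

```python
import xml.parsers.expat

def dtd_is_accepted(element_declaration):
    """Return True if Expat parses a tiny document that uses the declaration."""
    document = (
        "<!DOCTYPE format [\n"
        + element_declaration + "\n"
        + "<!ELEMENT bold ( #PCDATA )>\n"
        + "<!ELEMENT italic ( #PCDATA )>\n"
        + "]>\n"
        + "<format>text</format>"
    )
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True

# Line 7 of Fig. 6.5 as written, then two illegal variations of it.
legal = "<!ELEMENT format ( #PCDATA | bold | italic )*>"
with_comma = "<!ELEMENT format ( #PCDATA, bold | italic )*>"
without_asterisk = "<!ELEMENT format ( #PCDATA | bold | italic )>"

results = {
    "legal": dtd_is_accepted(legal),
    "with_comma": dtd_is_accepted(with_comma),
    "without_asterisk": dtd_is_accepted(without_asterisk),
}
```

Only the first declaration should be accepted; the other two are rejected before the parser ever reaches the format element.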
Figure 6.6 shows the results of changing the first pipe character in line 7 of Fig. 6.5 to a comma and the result of removing the asterisk. Both of these are illegal DTD syntax.

Common Programming Error 6.3
When declaring mixed content, not listing #PCDATA as the first item is an error.

Fig. 6.6 Illegal mixed-content element syntax.

An element declared as type ANY can contain any content, including PCDATA, elements or a combination of elements and PCDATA. Elements with ANY content can also be empty elements.

Common Programming Error 6.4
Child elements of an element declared as type ANY must have their own element type declarations.

Software Engineering Observation 6.4
Elements with ANY content are commonly used in the early stages of DTD development. Document authors typically replace ANY content with more specific content as the DTD evolves.

6.5 Attribute Declarations
In this section, we discuss attribute declarations. An attribute declaration specifies an attribute list for an element by using the ATTLIST attribute list declaration. An element can have any number of attributes. For example,

   <!ELEMENT x EMPTY>
   <!ATTLIST x y CDATA #REQUIRED>

declares EMPTY element x. The attribute declaration specifies that y is an attribute of x. Keyword CDATA indicates that y can contain any character text except for the <, >, &, ' and " characters. Note that the CDATA keyword in an attribute declaration has a different meaning than the CDATA section in an XML document we introduced in Chapter 5. Recall that in a CDATA section all characters are legal except the ]]> end tag. Keyword #REQUIRED specifies that the attribute must be provided for element x. We will say more about other keywords momentarily.
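To make the effect of #REQUIRED concrete, the following Python sketch reads the ATTLIST declarations from an internal subset and reports elements that omit a required attribute. It is a minimal illustration under stated assumptions: it relies on Python's standard xml.parsers.expat module (Expat itself is nonvalidating, so the check is written by hand), the function name missing_required_attributes is hypothetical, and only declarations in the internal subset are seen.

```python
import xml.parsers.expat

def missing_required_attributes(xml_text):
    """Return (element, attribute) pairs where a #REQUIRED attribute is absent."""
    required = {}   # element name -> set of required attribute names
    problems = []

    def on_attlist(element, attribute, attr_type, default, is_required):
        # Expat reports #REQUIRED as "no default value, required flag set".
        if is_required and default is None:
            required.setdefault(element, set()).add(attribute)

    def on_start(element, attributes):
        for attribute in sorted(required.get(element, ())):
            if attribute not in attributes:
                problems.append((element, attribute))

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = on_attlist
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return problems

DTD = "<!DOCTYPE x [ <!ELEMENT x EMPTY> <!ATTLIST x y CDATA #REQUIRED> ]>"
conforming = missing_required_attributes(DTD + '<x y = "some text"/>')
non_conforming = missing_required_attributes(DTD + "<x/>")
```

With the declaration from this section, conforming is an empty list, while non_conforming contains the pair ("x", "y") because attribute y was not provided.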
Figure 6.7 demonstrates how to specify attribute declarations for an element. Line 9 declares attribute id for element message. Attribute id contains required CDATA. Attribute values are normalized (i.e., consecutive whitespace characters are combined into one whitespace character). We discuss normalization in detail in Section 6.8. Line 14 assigns attribute id the value "445".

    1  <?xml version = "1.0"?>
    2
    3  <!-- Fig. 6.7: intro2.xml -->
    4  <!-- Declaring attributes -->
    5
    6  <!DOCTYPE myMessage [
    7     <!ELEMENT myMessage ( message )>
    8     <!ELEMENT message ( #PCDATA )>
    9     <!ATTLIST message id CDATA #REQUIRED>
   10  ]>
   11
   12  <myMessage>
   13
   14     <message id = "445">
   15        Welcome to XML
   16     </message>
   17
   18  </myMessage>

Fig. 6.7 Declaring attributes.

6.5.1 Attribute Defaults (#REQUIRED, #IMPLIED, #FIXED)
DTDs allow document authors to specify an attribute’s default value using attribute defaults, which we briefly touched upon in the last section. Keywords #IMPLIED, #REQUIRED and #FIXED are attribute defaults. Keyword #IMPLIED specifies that if the attribute does not appear in the element, then the application using the XML document can use whatever value (if any) it chooses.

Keyword #REQUIRED indicates that the attribute must appear in the element. The XML document is not valid if the attribute is missing. For example, the markup

   <message>number</message>

when checked against the DTD attribute list declaration

   <!ATTLIST message number CDATA #REQUIRED>

does not conform because attribute number is missing from element message.

An attribute declaration with default value #FIXED specifies that the attribute value is constant and cannot be different in the XML document. For example,

   <!ATTLIST address zip CDATA #FIXED "02115">

indicates that the value "02115" is the only value attribute zip can have. The XML document is not valid if attribute zip contains a value different from "02115".
If element address does not contain attribute zip, the default value "02115" is passed to the application using the XML document’s data.

6.6 Attribute Types
Attribute types are classified as either strings (CDATA), tokenized or enumerated. String attribute types do not impose any constraints on attribute values, other than disallowing the <, >, &, ' and " characters. Entity references (e.g., &lt;, &gt;, etc.) must be used for these characters. Tokenized attributes impose constraints on attribute values, such as which characters are permitted in an attribute name. We discuss tokenized attributes in the next section. Enumerated attributes are the most restrictive of the three types. They can take only one of the values listed in the attribute declaration. We will discuss enumerated attribute types in Section 6.6.2.

6.6.1 Tokenized Attribute Type (ID, IDREF, ENTITY, NMTOKEN)
Tokenized attribute types allow a DTD author to restrict the values used for attributes. For example, an author may want to have a unique ID for each element or only allow an attribute to have one or two different values. Four different tokenized attribute types exist: ID, IDREF, ENTITY and NMTOKEN.

Tokenized attribute type ID uniquely identifies an element. Attributes with type IDREF point to elements with an ID attribute. A validating parser verifies that every ID attribute type referenced by IDREF is in the XML document.

Common Programming Error 6.5
Using the same value for multiple ID attributes is a logic error: the document validated against the DTD is not valid.

Figure 6.8 lists an XML document that uses ID and IDREF attribute types. Element bookstore consists of element shipping and element book. Each shipping element describes a shipping method.

Line 9 declares attribute shipID as an ID type attribute (i.e., each shipping element has a unique identifier).
Lines 24–34 declare book elements with attribute shippedBy (line 11) of type IDREF. Attribute shippedBy points to one of the shipping elements by matching its shipID attribute.

If we assign shippedBy (line 28) the value "s3", an error occurs when we use Microsoft’s Validator (Fig. 6.9). No shipID attribute has a value "s3", which results in a non-valid XML document.

Common Programming Error 6.6
Not beginning a type attribute ID’s value with a letter, underscore (_) or a colon (:) is an error.

Common Programming Error 6.7
Providing more than one ID attribute type for an element is an error.

    1  <?xml version = "1.0"?>
    2
    3  <!-- Fig. 6.8: IDExample.xml -->
    4  <!-- Example for ID and IDREF values of attributes -->
    5
    6  <!DOCTYPE bookstore [
    7     <!ELEMENT bookstore ( shipping+, book+ )>
    8     <!ELEMENT shipping ( duration )>
    9     <!ATTLIST shipping shipID ID #REQUIRED>
   10     <!ELEMENT book ( #PCDATA )>
   11     <!ATTLIST book shippedBy IDREF #IMPLIED>
   12     <!ELEMENT duration ( #PCDATA )>
   13  ]>
   14
   15  <bookstore>
   16     <shipping shipID = "s1">
   17        <duration>2 to 4 days</duration>
   18     </shipping>
   19
   20     <shipping shipID = "s2">
   21        <duration>1 day</duration>
   22     </shipping>
   23
   24     <book shippedBy = "s2">
   25        Java How to Program 3rd edition.
   26     </book>
   27
   28     <book shippedBy = "s2">
   29        C How to Program 3rd edition.
   30     </book>
   31
   32     <book shippedBy = "s1">
   33        C++ How to Program 3rd edition.
   34     </book>
   35  </bookstore>

Fig. 6.8 XML document with ID and IDREF attribute types.

Fig. 6.9 Error displayed by XML Validator when an invalid ID is referenced.

Common Programming Error 6.8
Declaring attributes of type ID as #FIXED is an error.

In Chapter 5, we briefly introduced the concept of DTDs and entities.
Figure 5.4 (lang.xml) referenced lang.dtd, which contained the values for the entity references &assoc; and &text;. External subset lang.dtd contains the two entity declarations

   <!ENTITY assoc
      "&#1571;&#1587;&#1617;&#1608;&#1588;&#1616;&#1610;&#1614;&#1578;&#1618;&#1587;">

and

   <!ENTITY text
      "&#1575;&#1604;&#1610;&#1608;&#1606;&#1610;&#1603;&#1608;&#1583;">

for entities assoc and text. A parser replaces the entity references with their values. For example, consider the following entity declaration

   <!ENTITY digits "0123456789">

for digits. This entity might be used as follows

   <useAnEntity>&digits;</useAnEntity>

The entity reference &digits; is replaced by its value, so the value 0123456789 is placed inside the tags:

   <useAnEntity>0123456789</useAnEntity>

These entities are called general entities.

Related to entities are entity attributes, which indicate that an attribute has an entity for its value. These entity attributes are specified by using tokenized attribute type ENTITY. The primary constraint placed on ENTITY attribute types is that they must refer to external unparsed entities. An external unparsed entity refers to data stored outside the XML document (e.g., in a separate file) that will not be parsed by the XML parser. Figure 6.10 lists an XML document that demonstrates the use of entities and entity attribute types.

    1  <?xml version = "1.0"?>
    2
    3  <!-- Fig. 6.10: entityExample.xml -->
    4  <!-- ENTITY and ENTITY attribute types -->
    5
    6  <!DOCTYPE database [
    7     <!NOTATION html SYSTEM "iexplorer">
    8     <!ENTITY city SYSTEM "tour.html" NDATA html>
    9     <!ELEMENT database ( company+ )>
   10     <!ELEMENT company ( name )>
   11     <!ATTLIST company tour ENTITY #REQUIRED>
   12     <!ELEMENT name ( #PCDATA )>
   13  ]>
   14
   15  <database>
   16     <company tour = "city">
   17        <name>Deitel &amp; Associates, Inc.</name>
   18     </company>
   19  </database>

Fig. 6.10 XML document that contains an ENTITY attribute type.

Line 7

   <!NOTATION html SYSTEM "iexplorer">

declares a notation named html that refers to a SYSTEM identifier named "iexplorer". Notations provide information that an application using the XML document can use to handle unparsed entities. For example, the application using this document may choose to open Internet Explorer and load the document tour.html (line 8).

Line 8

   <!ENTITY city SYSTEM "tour.html" NDATA html>

declares an entity named city that refers to an external document (tour.html). Keyword NDATA indicates that the content of this external entity is not XML. The name of the notation (e.g., html) that handles this unparsed entity is placed to the right of NDATA.

Line 11

   <!ATTLIST company tour ENTITY #REQUIRED>

declares attribute tour for element company. Attribute tour specifies a required ENTITY attribute type. Line 16

   <company tour = "city">

assigns entity city to attribute tour. If we replaced line 16 with

   <company tour = "country">

the document fails to conform to the DTD because entity country does not exist. Figure 6.11 shows the message generated by XML Validator if country is used.

Common Programming Error 6.9
Not assigning an unparsed external entity to an attribute with attribute type ENTITY results in a non-valid XML document.

Attribute type ENTITIES may also be used in a DTD to indicate that an attribute has multiple entities for its value. Each entity is separated by a space. For example

   <!ATTLIST directory file ENTITIES #REQUIRED>

specifies that attribute file is required and contains one or more entities. An example of markup that conforms to this might look like

   <directory file = "animations graph1 graph2">

where animations, graph1 and graph2 are entities declared in a DTD.
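The constraint on ENTITY and ENTITIES attributes can be sketched in a few lines of Python. The example below is an assumption-laden illustration rather than a real validator: it uses Python's standard xml.parsers.expat module (which is nonvalidating, so the check is done by hand), the function name undeclared_entity_references and the template name DATABASE are hypothetical, and only the internal subset is examined. It collects the unparsed entities declared with NDATA, notes which attributes have type ENTITY or ENTITIES, and reports attribute values that name no declared unparsed entity.

```python
import xml.parsers.expat

def undeclared_entity_references(xml_text):
    """Return (element, attribute, value) triples that name no unparsed entity."""
    unparsed_entities = set()   # entity names declared with NDATA
    entity_attributes = {}      # element name -> set of ENTITY/ENTITIES attributes
    problems = []

    def on_entity(name, is_parameter, value, base, system_id, public_id, notation):
        if notation is not None:    # NDATA present: an unparsed entity
            unparsed_entities.add(name)

    def on_attlist(element, attribute, attr_type, default, required):
        if attr_type in ("ENTITY", "ENTITIES"):
            entity_attributes.setdefault(element, set()).add(attribute)

    def on_start(element, attributes):
        for attribute in sorted(entity_attributes.get(element, ())):
            # ENTITIES values are space-separated lists of entity names.
            for token in attributes.get(attribute, "").split():
                if token not in unparsed_entities:
                    problems.append((element, attribute, token))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity
    parser.AttlistDeclHandler = on_attlist
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return problems

# The document of Fig. 6.10 with the value of attribute tour left open.
DATABASE = """<!DOCTYPE database [
   <!NOTATION html SYSTEM "iexplorer">
   <!ENTITY city SYSTEM "tour.html" NDATA html>
   <!ELEMENT database ( company+ )>
   <!ELEMENT company ( name )>
   <!ATTLIST company tour ENTITY #REQUIRED>
   <!ELEMENT name ( #PCDATA )>
]>
<database>
   <company tour = "{}">
      <name>Deitel &amp; Associates, Inc.</name>
   </company>
</database>"""

with_city = undeclared_entity_references(DATABASE.format("city"))
with_country = undeclared_entity_references(DATABASE.format("country"))
```

With tour = "city" the list is empty; with tour = "country" it contains ("company", "tour", "country"), which mirrors the situation that XML Validator reports in Fig. 6.11.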
A more restrictive attribute type is attribute type NMTOKEN (name token), whose value consists of letters, digits, periods, underscores, hyphens and colon characters. For example, consider the declaration

   <!ATTLIST sportsClub phone NMTOKEN #REQUIRED>

which indicates sportsClub contains a required NMTOKEN phone attribute. An example of markup that conforms to this is

   <sportsClub phone = "555-111-2222">

An example that does not conform to this is

   <sportsClub phone = "555 555 4902">

because spaces are not allowed in an NMTOKEN attribute.

Fig. 6.11 Error generated by XML Validator when a DTD contains a reference to an undefined entity.

Similarly, when an NMTOKENS attribute type is declared, the attribute may contain multiple string tokens separated by spaces.

6.6.2 Enumerated Attribute Types
In this section, we discuss enumerated attribute types, which declare a list of possible values an attribute can have. The attribute must be assigned a value from this list to conform to the DTD. Enumerated type values are separated by pipe characters (|). For example, the declaration

   <!ATTLIST person gender ( M | F ) "F">

contains an enumerated attribute type declaration that allows attribute gender to have either the value M or F. A default value of "F" is specified to the right of the attribute type. Alternatively, a declaration such as

   <!ATTLIST person gender ( M | F ) #IMPLIED>

does not provide a default value for gender. This type of declaration might be used to validate a marked up mailing list that contains first names, last names, addresses, etc. The application that uses this mailing list may want to precede each name by either Mr., Ms. or Mrs. However, some first names are gender neutral (e.g., Chris, Sam, etc.), and the application may not know the person’s gender. In this case, the application has the flexibility to process the name in a gender neutral way.
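The difference between a declared default and #IMPLIED is visible to the application that consumes the parsed document. The following Python sketch is a hedged illustration that assumes Python's standard xml.parsers.expat module, which by default reports attributes defaulted from the internal subset alongside those written in the markup. The helper attributes_reported, the template MAILING_LIST and the mailingList root element are hypothetical names introduced for this example. Expat does not validate, so it would not reject a value outside the enumerated list.

```python
import xml.parsers.expat

def attributes_reported(xml_text, element_name):
    """Collect the attribute dictionaries Expat reports for one element name."""
    reported = []

    def on_start(name, attributes):
        if name == element_name:
            reported.append(dict(attributes))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return reported

# A small mailing list; {default} is filled in with each attribute default.
MAILING_LIST = """<!DOCTYPE mailingList [
   <!ELEMENT mailingList ( person* )>
   <!ELEMENT person ( #PCDATA )>
   <!ATTLIST person gender ( M | F ) {default}>
]>
<mailingList>
   <person>Chris</person>
   <person gender = "M">Sam</person>
</mailingList>"""

with_default = attributes_reported(MAILING_LIST.format(default='"F"'), "person")
implied = attributes_reported(MAILING_LIST.format(default="#IMPLIED"), "person")
```

With the default "F", the first person element is reported with gender "F" even though the markup omits it. With #IMPLIED, the same element is reported with no gender attribute at all, leaving the choice to the application.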
NOTATION is also an enumerated attribute type. For example, the declaration

   <!ATTLIST book reference NOTATION ( JAVA | C ) "C">

indicates that reference must be assigned either JAVA or C. If a value is not assigned, C is specified as the default. The notation for C might be declared as

   <!NOTATION C SYSTEM
      "http://www.deitel.com/books/2000/chtp3/chtp3_toc.htm">

6.7 Conditional Sections
DTDs provide the ability to include or exclude declarations using conditional sections. Keyword INCLUDE specifies that declarations are included, while keyword IGNORE specifies that declarations are excluded. For example, the conditional section

   <![INCLUDE[
      <!ELEMENT name ( #PCDATA )>
   ]]>

directs the parser to include the declaration of element name.
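The effect of the two keywords can be pictured as a preprocessing step over the DTD text. The Python sketch below is a deliberately simplified model, not how an XML parser works internally: the function name resolve_conditional_sections is hypothetical, the second declaration (element message) is invented for contrast, and nested conditional sections and parameter-entity references are not handled. Note also that in XML 1.0 conditional sections may appear only in the external subset (or in external parameter entities), so the text processed here stands for the contents of a .dtd file.

```python
import re

# Matches one non-nested conditional section and captures keyword and body.
CONDITIONAL = re.compile(r"<!\[\s*(INCLUDE|IGNORE)\s*\[(.*?)\]\]>", re.DOTALL)

def resolve_conditional_sections(dtd_text):
    """Keep the body of INCLUDE sections and drop IGNORE sections entirely."""
    def replace(match):
        keyword, body = match.group(1), match.group(2)
        return body.strip() if keyword == "INCLUDE" else ""
    return CONDITIONAL.sub(replace, dtd_text).strip()

EXTERNAL_DTD = """<![INCLUDE[
   <!ELEMENT name ( #PCDATA )>
]]>
<![IGNORE[
   <!ELEMENT message ( #PCDATA )>
]]>"""

effective = resolve_conditional_sections(EXTERNAL_DTD)
```

After resolution, effective holds only the declaration of element name; the declaration inside the IGNORE section has no effect.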
