An overview of XML Schema

The structure of an XML document is represented by an XML schema. An XML schema is also an XML document in the namespace http://www.w3.org/2001/XMLSchema.

Root element

The root element in a schema is schema. The schema namespace is specified in the root element with the declaration xmlns:xs="http://www.w3.org/2001/XMLSchema". A prefix other than xs may be used, such as xsd. A schema element may have the attributes targetNamespace, version, attributeFormDefault, and elementFormDefault.

  • targetNamespace specifies the namespace described in the schema. "An XML namespace is a collection of names, identified by a URI reference, which are used in XML documents as element types and attribute names." (http://www.w3.org/TR/1999/REC-xml-names-19990114/)
  • elementFormDefault and attributeFormDefault specify whether elements and attributes in the targetNamespace are required to be qualified by default.

A qualified name consists of a prefix that is mapped to a namespace URI, followed by a single colon, followed by the local name. The value unqualified (the default) indicates that local elements and attributes are specified without a namespace prefix. The value qualified indicates that local elements and attributes must be namespace-qualified, typically with a prefix. These defaults apply only to local elements and attributes; top-level elements and attributes always belong to the target namespace and must be qualified. The defaults may be overridden by the form attribute on an individual attribute or element declaration. The attribute version specifies the version of the schema document itself, not the version of the XML Schema language. An example of a schema element is shown in the following listing:

<xsd:schema version="1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://xdk11g.com/journal" xmlns:journal="http://xdk11g.com/journal"></xsd:schema>

All other schema constructs are specified in the schema element.
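
The effect of elementFormDefault described above can be sketched with a small schema and a matching instance. This is an illustrative sketch only: the catalog and title element names are assumptions chosen for the example, and the two fragments are separate documents.

```xml
<!-- Schema sketch: elementFormDefault="qualified" puts local elements in the target namespace -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://xdk11g.com/journal"
            xmlns:journal="http://xdk11g.com/journal"
            elementFormDefault="qualified">
  <!-- Top-level element: always in the target namespace -->
  <xsd:element name="catalog">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Local element: qualified only because of elementFormDefault -->
        <xsd:element name="title" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

```xml
<!-- Instance sketch: both the top-level and the local element carry the prefix -->
<journal:catalog xmlns:journal="http://xdk11g.com/journal">
  <journal:title>Oracle Magazine</journal:title>
</journal:catalog>
```

If elementFormDefault were left at its default of unqualified, the root would still be written as journal:catalog, but the local element would appear as an unprefixed title in no namespace.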

Element component

An XML document element is represented in a schema with the element construct. Some of the attributes an element may have are name, type, minOccurs, and maxOccurs.

  • name specifies the element name. Element and attribute tags can be used to either define a component or to use a component by reference. The attribute ref may be used instead of the attribute name. The attribute ref specifies a reference to a top-level element declaration.
  • type specifies the element type, which may be one of the schema built-in types, such as xsd:string or xsd:integer, or a defined simpleType or complexType. We shall discuss simpleType and complexType later.
  • The cardinality (the number of elements in a set) of an element is specified with the attributes minOccurs and maxOccurs. The attribute minOccurs specifies the minimum occurrences of an element, and the attribute maxOccurs specifies the maximum occurrences of an element. The attributes minOccurs and maxOccurs may only be specified for local elements, not top-level element declarations. Local elements are elements that are defined within another element definition or within a complexType definition. Top-level elements are elements that are defined at the top level directly within the root element schema.

An element may be declared in the schema element as a top-level declaration or in choice, sequence, all, or group declarations, which are discussed later. An example element declaration is shown in the following listing:

<xsd:element name="catalog" type="catalogType" minOccurs="0" maxOccurs="unbounded"/>
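
The relationship between name, ref, and cardinality can be sketched as follows. The container type catalogListType is a hypothetical name introduced for illustration, and the unprefixed references assume that the schema either has no target namespace or declares the target namespace as its default namespace.

```xml
<!-- Top-level declaration: minOccurs and maxOccurs are not allowed here -->
<xsd:element name="catalog" type="catalogType"/>

<!-- Hypothetical container type that reuses the declaration by reference -->
<xsd:complexType name="catalogListType">
  <xsd:sequence>
    <!-- Cardinality is set on the local reference, not on the top-level declaration -->
    <xsd:element ref="catalog" minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>
```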

An XML document attribute is represented in an XML schema with the attribute construct. Some of the attributes that an attribute construct may have are name, type, default, fixed, and use.

The attribute name specifies the attribute name. The attribute ref may be used instead of name to refer to a top-level attribute declaration; a group of attributes is referenced with the attributeGroup construct, which we shall discuss later. If ref is specified, the name and type attributes must not be specified.

The attribute type specifies the type of an attribute. The attribute default specifies the default value of an attribute. The attribute fixed specifies the fixed value of an attribute. Only one of default or fixed may be used, not both. The attribute use specifies whether the attribute is required, and may have the values optional, required, or prohibited. An attribute may be declared as a top-level declaration in the schema element or in attributeGroup, complexType, restriction, or extension elements. An example attribute declaration is shown in the following listing:

<xsd:attribute name="title" type="xsd:string" use="required"/>
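
The default and fixed attributes described above can be sketched in the same way. The attribute names and values below are illustrative assumptions, and the two declarations are meant to be local declarations inside a complexType or attributeGroup.

```xml
<!-- default: this value is supplied when the attribute is absent from the instance -->
<xsd:attribute name="edition" type="xsd:string" default="Jan-Feb 2008"/>

<!-- fixed: if the attribute is present, it must carry exactly this value -->
<xsd:attribute name="publisher" type="xsd:string" fixed="Oracle Publishing"/>
```

A declaration that carries default must leave use at optional, since a default value only has an effect when the attribute may be omitted.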

The schema construct attributeGroup specifies a group of attributes. An attributeGroup may refer to another attributeGroup declaration with the ref attribute. An example of an attributeGroup declaration is this:

<xsd:attributeGroup name="journalAttr">
<xsd:attribute name="title" type="xsd:string"/>
<xsd:attribute name="publisher" type="xsd:string"/>
<xsd:attribute name="edition" type="xsd:string"/>
</xsd:attributeGroup>
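
A group such as journalAttr is typically pulled into a type by reference. The following is a minimal sketch, assuming a hypothetical complex type named journalAttrType with empty content and an element named journal declared with that type; the unprefixed reference assumes no target namespace or a matching default namespace.

```xml
<!-- Hypothetical type: empty content plus the three attributes from the group -->
<xsd:complexType name="journalAttrType">
  <xsd:attributeGroup ref="journalAttr"/>
</xsd:complexType>

<xsd:element name="journal" type="journalAttrType"/>
```

```xml
<!-- Instance sketch: all three attributes are optional because use defaults to optional -->
<journal title="Oracle Magazine" publisher="Oracle Publishing" edition="Jan-Feb 2008"/>
```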

SimpleType component

The schema construct simpleType is used to constrain character data in elements and attributes. The text in an element is actually an XML node, which is a text node. The text node within an element is a text child node of the element. A simpleType may be defined with restriction, list, or union constructs. A new data type may be defined with simpleType. For example, define a simpleType called stringType to constrain the xsd:string data type to a minimum length of 25 as follows:

<xsd:simpleType name="stringType">
<xsd:restriction base="xsd:string">
<xsd:minLength value="25"/>
</xsd:restriction>
</xsd:simpleType>

The following is an example of an attribute that uses a simpleType definition where the attribute title will have a maximum length of 25:

<xsd:attribute name="title">
<xsd:simpleType>
<xsd:restriction base="xsd:string">
<xsd:maxLength value="25"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:attribute>

With simpleType, an element may be defined as a list of values of a specified data type using the list construct. For example, define an element that contains a list of integer values like this:

<xsd:simpleType name="listType">
<xsd:list itemType="xsd:integer"/>
</xsd:simpleType>

An element construct of type listType may be declared as follows in an XML schema:

<xsd:element name="list" type="listType"/>

An example of a listType element is shown here:

<list>1 5 10 12 15</list>

The union construct defines a collection of simple types. For example, define a simpleType that is the union of the xsd:string and xsd:integer simple types:

<xsd:simpleType name="unionType">
<xsd:union memberTypes="xsd:integer xsd:string"/>
</xsd:simpleType>

SimpleType unionType is a combination of xsd:integer and xsd:string, which implies that an element or attribute of type unionType may either be an integer or a string. In the built-in data type hierarchy, string and integer are defined as different data types.

An instance of an element, elementA, of type unionType may contain a string value as follows:

<elementA>Element A</elementA>

A unionType element may also contain an integer value as follows:

<elementA>25</elementA>
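
For completeness, the schema declaration that both instances above would validate against might look like the following sketch. The element name elementA comes from the examples; the unprefixed type reference assumes no target namespace or a matching default namespace.

```xml
<!-- Declares elementA with the union type defined earlier -->
<xsd:element name="elementA" type="unionType"/>
```

The member types of a union are tried in the order listed in memberTypes, so with xsd:integer listed first the value 25 is treated as an integer, while Element A falls through to xsd:string.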

The XML schema specification provides some built-in simple data types (http://www.w3.org/TR/xmlschema-2/#built-in-datatypes).

ComplexType component

The complexType construct is used to define the structure of an element. A complexType is used for the following:

  • Constraining an element definition by providing attribute declarations governing the appearance and content of attributes
  • Constraining element children to conform to a specified element-only or mixed-content model, or constraining character data to conform to a specified simple type definition, or constraining the element content to be empty
  • Deriving a complex type from another simple type or complex type

A complexType may be declared at the top level in the schema element or in an element declaration. If a complexType is defined at the top level, it is used in an element declaration using the type attribute. If an element uses a lot of complex type/simple type definitions, it is better to define the complex types at the top level to keep the element declaration simple. A sequence of elements is defined with the sequence construct within a complexType construct. An example is:

<xsd:complexType name="journalType">
<xsd:sequence>
<xsd:element name="title" type="xsd:string"/>
<xsd:element name="edition" type="xsd:string"/>
</xsd:sequence>
<xsd:attribute name="publisher" type="xsd:string"/>
</xsd:complexType>

An example of using a complexType in an element declaration with the type attribute is as follows:

<xsd:element name="journal" type="journalType"/>

The order of elements in an XML document must be the same as in the sequence construct. An example of using the journal element declaration in an XML document is as follows:

<journal publisher="Oracle Publishing">
<title>Oracle Magazine</title>
<edition>Jan-Feb 2008</edition>
</journal>

As the journal element is of type journalType, it has an attribute publisher and a sequence of elements title and edition with the title element preceding the edition element. If you need multiple elements, but do not want them to be in a particular order, use the all construct. An example of using the all construct is as follows:

<xsd:complexType name="journalType">
<xsd:all>
<xsd:element name="title" type="xsd:string"/>
<xsd:element name="edition" type="xsd:string"/>
</xsd:all>
</xsd:complexType>
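
To illustrate, and assuming a journal element declared with this all-based journalType, the following instance is valid even though edition appears before title:

```xml
<!-- Child order is free under xsd:all; each child still appears exactly once -->
<journal>
  <edition>Jan-Feb 2008</edition>
  <title>Oracle Magazine</title>
</journal>
```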

A complexType may also be defined with a choice of elements with the choice construct, which implies that an element of type journalType may contain either the date subelement or the edition subelement, but not both, and it must contain one of the two:

<xsd:complexType name="journalType">
<xsd:choice>
<xsd:element name="date" type="xsd:string"/>
<xsd:element name="edition" type="xsd:string"/>
</xsd:choice>
</xsd:complexType>

An example XML element journal with choice content is as follows:

<journal>
<date>Jan-Feb 2008</date>
</journal>

Similar to setting cardinality on individual elements, cardinality may be set on sequence, all, and choice constructs with minOccurs and maxOccurs. Cardinality set on a sequence construct implies that the sequence may be repeated the specified number of times. In XML Schema 1.0, an all construct is more limited: minOccurs may be 0 or 1, and maxOccurs must be 1. A complexType may also be defined to have mixed content, that is, text interleaved with elements. A mixed content complexType is specified with the attribute mixed, which has a Boolean value. The default value of mixed is false. The complexType journalType may be defined with mixed content as listed:

<xsd:complexType name="journalType" mixed="true">
<xsd:sequence>
<xsd:element name="title" type="xsd:string"/>
<xsd:element name="edition" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>

An example XML construct based on the mixed content schema construct is as follows:

<journal>
The journal is
<title>Oracle Magazine</title>
The edition is
<edition>Jan-Feb 2008</edition>
</journal>
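
The cardinality on a sequence construct mentioned earlier can be sketched as follows. The type name journalListType and the element name journals are hypothetical names introduced for this illustration.

```xml
<xsd:complexType name="journalListType">
  <!-- The whole title/edition pair may repeat one or more times -->
  <xsd:sequence minOccurs="1" maxOccurs="unbounded">
    <xsd:element name="title" type="xsd:string"/>
    <xsd:element name="edition" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:element name="journals" type="journalListType"/>
```

```xml
<!-- Instance sketch: the sequence occurs twice, each time in title, edition order -->
<journals>
  <title>Oracle Magazine</title>
  <edition>Jan-Feb 2008</edition>
  <title>Oracle Magazine</title>
  <edition>Mar-Apr 2008</edition>
</journals>
```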

A complexType may be defined with simpleContent and complexContent constructs. The simpleContent construct specifies restrictions and extensions on a text-only complex type, which is a complex type with character data, attributes, and no elements, or extensions on a simple type. In an example of a simpleContent with extension, an attribute is added to a text-only complexType. For example, define a text-only complex type journalType:

<xsd:complexType name="journalType">
<xsd:simpleContent>
<xsd:extension base="xsd:string">
<xsd:attribute name="publisher" type="xsd:string"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>

Define a complex type that uses a simple content extension to add an attribute to the journalType complex type:

<xsd:complexType name="journalTypeExtension">
<xsd:simpleContent>
<xsd:extension base="journalType">
<xsd:attribute name="edition" type="xsd:string"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>

An example XML element journal based on the journalTypeExtension is shown here:

<journal publisher="Oracle Publishing" edition="Jan-Feb 2008">Oracle Magazine</journal>

The journalType definition above is itself an example of simple content: its simpleContent extends the simple type xsd:string. A simpleContent may also be declared with a restriction, in which case the base must be a complex type with simple content rather than a simple type. In this example of a simpleContent with restriction, the text-only journalType defined earlier is restricted so that the element's character data has a maximum length of 10:

<xsd:element name="journal">
<xsd:complexType>
<xsd:simpleContent>
<xsd:restriction base="journalType">
<xsd:maxLength value="10"/>
</xsd:restriction>
</xsd:simpleContent>
</xsd:complexType>
</xsd:element>

An example XML document element journal with simple content restriction is shown here:

<journal>Oracle Mag</journal>

The complexContent construct is used to define extensions and restrictions on an element-only (which includes attributes), or mixed content complexType. In this example of a complexContent with extension, a complexType, named journalType here, is extended to add another element and attribute in complex type journalTypeExten as follows:

<xsd:complexType name="journalType">
<xsd:sequence>
<xsd:element name="title" type="xsd:string"/>
<xsd:element name="publisher" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="journalTypeExten">
<xsd:complexContent>
<xsd:extension base="journalType">
<xsd:sequence>
<xsd:element name="edition" type="xsd:string"/>
</xsd:sequence>
<xsd:attribute name="section" type="xsd:string"/>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>

An example XML document element journal of type journalTypeExten may appear as follows:

<journal section="XML">
<title>Oracle Magazine</title>
<publisher>Oracle Publishing</publisher>
<edition>Jan-Feb 2006</edition>
</journal>
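
A complexContent restriction works in the opposite direction: the derived type repeats the content model of the base type and narrows it. The following is a minimal sketch, assuming the element-only journalType above as the base; the type name journalTypeRestr and the fixed publisher value are hypothetical choices for illustration.

```xml
<xsd:complexType name="journalTypeRestr">
  <xsd:complexContent>
    <xsd:restriction base="journalType">
      <!-- The base content model is restated, with publisher narrowed to one value -->
      <xsd:sequence>
        <xsd:element name="title" type="xsd:string"/>
        <xsd:element name="publisher" type="xsd:string" fixed="Oracle Publishing"/>
      </xsd:sequence>
    </xsd:restriction>
  </xsd:complexContent>
</xsd:complexType>
```

Every instance that is valid against the restricted type should also be valid against the base type, which is why the restriction can only tighten what the base already allows.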

To review what we have discussed in the introduction on XML schema, the following table compares the simple type and complex type structures:

A more detailed discussion on XML schema structures is in the XML Schema Structures W3C Recommendation (http://www.w3.org/TR/xmlschema-1/).