STEP 8: tag and attPool, revisited

$Id: step8.sdoc 1.16 2000/11/01 14:58:13 murata Exp $

In STEP 2, tag was compared to an attribute-list declaration and attPool was compared to parameter entities describing attributes. In fact, RELAX provides a much more general framework.

The role attribute of tag elements

In addition to the name attribute, tag elements can have the role attribute. This section first considers the motivation for this extension and then introduces the attribute.

Switching content models depending on attribute values

Often, we would like to attach different content models to the same tag name, depending on attribute values. For example, we might want to switch the content model of the val element depending on its type attribute. If the attribute value is integer, the content model is a reference to the datatype integer. If it is string, the content model is a reference to the datatype string.

<!-- This is legal. -->
<val type="integer">10</val>

<!-- This is also legal. -->
<val type="string">foo bar</val>

<!-- This is illegal. -->
<val type="integer">foo bar</val>

Thus, we would like to switch content models (shown below) depending on whether the attribute value is integer or string.

<!-- Case 1: type="integer" -->
<elementRule role="val" type="integer"/>

<!-- Case 2: type="string" -->
<elementRule role="val" type="string"/>

However, as long as we use features covered in STEPs 0 thru 7, we have to attach content models to tag names. Attribute values are not taken into consideration. Thus, no matter what the value of the type attribute is, the same elementRule is used.
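
To make the limitation concrete, here is a minimal sketch that uses only the features of STEPs 0 thru 7. The two enumeration values and the choice of string as the content type are assumptions made for illustration.

<tag name="val">
  <attribute name="type" type="NMTOKEN" required="true">
    <enumeration value="integer"/>
    <enumeration value="string"/>
  </attribute>
</tag>

<!-- The only elementRule for val. It applies whatever the type attribute says. -->
<elementRule role="val" type="string"/>

Under this sketch, the following element is accepted although it should be illegal, because the value of the type attribute plays no part in selecting the content model.

<val type="integer">foo bar</val>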

Constraints represented by tag elements

tag elements take the following form. While the name attribute specifies a tag name, the role attribute specifies a role.

<tag name="tag-name" role="role-name">
  ...
</tag>

A tag element attaches a role to a collection of constraints on tag names and attributes. When a start tag (or empty-element tag) satisfies these constraints, this tag plays the specified role.

For example, consider a tag element as below:

<tag name="val" role="val-integer">
  <attribute name="type" type="NMTOKEN" required="true">
    <enumeration value="integer"/>
  </attribute>
</tag>

This tag element specifies that the tag name be val and the type attribute have the value integer. If a start tag (or empty-element tag) satisfies this constraint, this tag plays the val-integer role.

<val type="integer">

In the following tag element, the constraint on the type attribute is that the attribute value be string and the role name is val-string.

<tag name="val" role="val-string">
  <attribute name="type" type="NMTOKEN" required="true">
    <enumeration value="string"/>
  </attribute>
</tag>

The following start tag does not play the val-integer role, but plays the val-string role.

<val type="string">

Attributes may occur even if they are not specified by tag elements. For example, the following start tag has an attribute unknown, which is not specified by the previous tag element. This start tag still plays the role val-string, but a warning message will be issued.

<val type="string" unknown="">

How should we interpret tag elements without the role attribute, such as those in STEPs 1 thru 7? When the role attribute is omitted, it is assumed to have the value of the name attribute. Thus, the following two tag elements are semantically identical.

<tag name="foo">
  <attribute name="bar" type="int"/>
</tag>

<tag name="foo" role="foo">
  <attribute name="bar" type="int"/>
</tag>

The role attribute of elementRule elements

The role attribute of elementRule elements does not specify tag names, but rather specifies roles. Thus, we can switch hedge models for the same tag name, depending on attribute values.

If we use the roles val-string and val-integer shown in the previous example, we can have two elementRules for start tags of the tag name val. An elementRule that references the val-string role is concerned with start tags whose type attribute has the value string. An elementRule that references the val-integer role is concerned with start tags whose type attribute has the value integer.

<!-- Case 1: type="integer" -->

<tag name="val" role="val-integer">
  <attribute name="type" type="NMTOKEN" required="true">
    <enumeration value="integer"/>
  </attribute>
</tag>

<elementRule role="val-integer" label="val" type="integer"/>

<!-- Case 2: type="string" -->

<tag name="val" role="val-string">
  <attribute name="type" type="NMTOKEN" required="true">
    <enumeration value="string"/>
  </attribute>
</tag>

<elementRule role="val-string" label="val" type="string"/>

Note that both tag elements specify the tag name val and the type attribute. In RELAX, tag elements are not declarations, which may appear once and only once, but rather constraints, which may appear more than once for the same tag name.
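
The pieces above can be assembled into a complete module. The following is a minimal sketch: the vals wrapper element, the version numbers, and the empty target namespace are assumptions made for illustration and are not part of the original example.

<module moduleVersion="1.0"
        relaxCoreVersion="1.0"
        targetNamespace=""
        xmlns="http://www.xml.gr.jp/xmlns/relaxCore">

  <interface>
    <export label="vals"/>
  </interface>

  <!-- Hypothetical wrapper: any number of elements labeled val. -->
  <elementRule role="vals">
    <ref label="val" occurs="*"/>
  </elementRule>

  <tag name="vals"/>

  <!-- Case 1: type="integer" -->
  <tag name="val" role="val-integer">
    <attribute name="type" type="NMTOKEN" required="true">
      <enumeration value="integer"/>
    </attribute>
  </tag>

  <elementRule role="val-integer" label="val" type="integer"/>

  <!-- Case 2: type="string" -->
  <tag name="val" role="val-string">
    <attribute name="type" type="NMTOKEN" required="true">
      <enumeration value="string"/>
    </attribute>
  </tag>

  <elementRule role="val-string" label="val" type="string"/>

</module>

Against this sketch, the instance below should be legal: each val start tag plays exactly one of the two roles, and the elementRule for that role supplies the content model.

<vals>
  <val type="integer">10</val>
  <val type="string">foo bar</val>
</vals>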

Prohibition of references by ref elements

Roles referenced by ref elements may not be described by tag elements. If they are described, they must be described by attPool elements.

In the next example, a ref element references the foo role, which is described by a tag element. This example is thus a syntax error.

<tag name="foo"/>

<attPool role="bar">
  <ref role="foo"/>
</attPool>
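
For contrast, the following sketch shows a reference that is permitted. It assumes that the intent was to reuse attribute constraints, so the foo role is now described by an attPool element; the attribute name a is hypothetical.

<attPool role="foo">
  <attribute name="a" type="string"/>
</attPool>

<attPool role="bar">
  <ref role="foo"/>
</attPool>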

The none datatype, revisited

STEP 3 introduced the none datatype. none is useful for switching content models depending on the presence or absence of an attribute.

For example, suppose that <div class="sec"> and <div> require different content models. A role for the former, say divSec, can be described as below:

<tag name="div" role="divSec">
  <attribute name="class" type="string" required="true">
    <enumeration value="sec"/>
  </attribute>
</tag>

How do we describe a role for <div>, say divWithoutClass? One might think that the following example would work.

<tag name="div" role="divWithoutClass"/>

However, this description allows divWithoutClass even for <div class="sec">. Although the "undeclared attribute" message is issued, this start tag is assumed to play both roles. (1)

To explicitly disallow the class attribute, we have to use the none datatype and write as below:

<tag name="div" role="divWithoutClass">
  <attribute name="class" type="none"/>
</tag>

Since no character strings are permitted by the none datatype, any value specified for the class attribute prevents the start tag from playing the divWithoutClass role.
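
With both roles in place, each can be given its own elementRule. The following is a sketch only: the labels title and para are hypothetical and are assumed to be described elsewhere in the module. It also assumes that the class attribute is required for divSec, so that a bare <div> plays the divWithoutClass role only.

<!-- <div class="sec">: a title followed by paragraphs -->
<elementRule role="divSec" label="div">
  <sequence>
    <ref label="title"/>
    <ref label="para" occurs="*"/>
  </sequence>
</elementRule>

<!-- <div>: paragraphs only -->
<elementRule role="divWithoutClass" label="div">
  <ref label="para" occurs="*"/>
</elementRule>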

  1. RELAX allows tags to play roles even if they have undeclared attributes. There are two reasons for this design. First, traditional XML processors continue validation even if they encounter undeclared attributes. Second, HTML allows undeclared attributes.

attPool elements

Unlike parameter entities of DTDs, attPool elements are not expanded. tag elements and attPool elements are very similar and equally important in RELAX.

Constraints represented by attPool

We have observed that a tag element attaches a role to a collection of constraints on tag names and attributes. The only difference between attPool and tag is that attPool elements do not contain constraints on tag names. In other words, an attPool element attaches a role to a collection of constraints on attributes.

Consider the following attPool.

<attPool role="info">
  <attribute name="class" required="true">
    <enumeration value="informative"/>
  </attribute>
</attPool>

This attPool element specifies that the class attribute must be present with the value "informative", and attaches the info role to this constraint. There are no constraints on tag names. Because of this attPool, the following empty-element tag plays the info role.

<some class="informative"/>

Just as with tag, attributes not specified by attPool may occur. For example, the following empty-element tag also plays the info role.

<some class="informative" unknown=""/>

Prohibition of references by elementRule elements

Roles referenced by the role attribute of elementRule elements may not be described by attPool elements. If they are described, they must be described by tag elements.

The following elementRule references the info role, which is described by an attPool element. Thus, this example is a syntax error.

<attPool role="info"/>
<elementRule role="info" label="informative" type="emptyString"/>
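
One way to repair this example is sketched below, under the assumption that the rule is meant for an element named some, as in the earlier start tags. A tag element pulls in the attPool through a ref element, and the elementRule references the role of that tag. The role name some-info is hypothetical.

<attPool role="info">
  <attribute name="class" required="true">
    <enumeration value="informative"/>
  </attribute>
</attPool>

<tag name="some" role="some-info">
  <ref role="info"/>
</tag>

<elementRule role="some-info" label="informative" type="emptyString"/>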

Prohibition of role sharing by multiple tag or attPool elements

Multiple tag or attPool elements cannot share a single role.

In the following example, two tag elements share the bar role. Thus, this example is a syntax error.

<tag name="foo1" role="bar">
  <attribute name="a" type="string"/>
  ...
</tag>

<tag name="foo2" role="bar">
  <attribute name="b" type="string"/>
  ...
</tag>

In the next example, a role and tag name are both shared by two tag elements. This example is also a syntax error.

<tag name="foo" role="foo">
  <attribute name="a" type="string"/>
  ...
</tag>

<tag name="foo" role="foo">
  <attribute name="b" type="string"/>
  ...
</tag>

Even when the role attribute is omitted and the value of the name attribute is used, role sharing is prohibited. The two tag elements in the next example are identical to the two tag elements shown above. Thus, this example is also a syntax error.

<tag name="foo">
  <attribute name="a" type="string"/>
  ...
</tag>

<tag name="foo">
  <attribute name="b" type="string"/>
  ...
</tag>

In the following example, two attPool elements share the bar role. Thus, this example is a syntax error.

<attPool role="bar">
  <attribute name="a" type="string"/>
  ...
</attPool>

<attPool role="bar">
  <attribute name="b" type="string"/>
  ...
</attPool>

In this last example, a tag element and an attPool element share the bar role. Thus, this example is also a syntax error.

<attPool role="bar">
  <attribute name="a" type="string"/>
  ...
</attPool>

<tag role="bar" name="foo">
  <attribute name="b" type="string"/>
  ...
</tag>
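
The errors above can be avoided in two ways, sketched below under the assumption that string is an acceptable content type. If the two tag elements are meant to constrain the same start tag, their attributes can be merged into a single tag element. If they are meant to describe alternatives, each can be given a role of its own, and the two elementRules can still share a label. The role names bar1 and bar2 are hypothetical.

<!-- Option 1: one tag element that carries both attribute constraints. -->
<tag name="foo">
  <attribute name="a" type="string"/>
  <attribute name="b" type="string"/>
</tag>

<!-- Option 2: distinct roles whose elementRules share the label bar. -->
<tag name="foo1" role="bar1">
  <attribute name="a" type="string"/>
</tag>

<tag name="foo2" role="bar2">
  <attribute name="b" type="string"/>
</tag>

<elementRule role="bar1" label="bar" type="string"/>
<elementRule role="bar2" label="bar" type="string"/>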

Summary

In STEPs 0 thru 7, we have assumed that a tag element declares a tag name and attributes. Actually, a tag element attaches a role to a collection of constraints on tag names and attributes. In examples in STEPs 1 thru 7, roles and tag names coincide, but they are not always identical. In most cases, there are one-to-one correspondences among labels, roles, and tag names. But this is not always the case.

The following table summarizes the syntactical constructs that describe or reference tag names, labels, or roles.

Syntactical construct                Tag names, labels, or roles
-----------------------------------  ---------------------------------------------
The role attribute of elementRule    reference to roles described by tag
The label attribute of elementRule   description of labels
The label attribute of hedgeRule     description of labels
The label attribute of ref           reference to labels described by elementRule
The label attribute of hedgeRef      reference to labels described by hedgeRule
The name attribute of tag            description of tag names
The role attribute of tag            description of roles
The role attribute of attPool        description of roles
The role attribute of ref            reference to roles described by attPool

The following table summarizes whether tag names, labels, and roles occur in XML documents.

Types of names   In XML instances   In RELAX modules
---------------  -----------------  ---------------------------------------------------------------------
tag names        occur              occur as part of clauses
roles            do not occur       occur in clauses (descriptions of and references to roles)
labels           do not occur       occur in production rules (descriptions of and references to labels)

In traditional DTDs, it has been impossible to switch content models depending on attribute values, but RELAX has made it possible. The only extension required is the role attribute. This demonstrates the simplicity and descriptive power of RELAX. Enjoy and RELAX!