STEP 2: Migration from XML DTD (with parameter entities)

$Id: step2e.html 1.3 2000/02/29 12:01:43 murata Exp $

text by MURATA Makoto

html by NAMBA Ryosuke


You often have to write the same thing many times. The features in STEP 2 allow you to write a description once and reference it repeatedly. These features mimic the parameter entities of XML.

1. Parameter entities used in content models

hedgeRule allows you to write a hedge model once, name it, and reference it repeatedly. In other words, hedgeRule mimics parameter entities referenced from content models in a DTD.

1.1 Overview

The syntax of hedgeRule is shown below. foo is a name assigned to the hedge model of this hedgeRule.

<hedgeRule label="foo">
  ...element hedge model...
</hedgeRule>

To reference such a hedgeRule, we write <ref label="foo"/>. This ref is replaced with the element hedge model specified in the hedgeRule.

In the following example, the hedge model of the elementRule for the element type doc references a hedgeRule. This elementRule is borrowed from the module at the beginning of STEP 1; the part of its hedge model that follows title has been moved into a hedgeRule.

<hedgeRule label="doc.body">
  <ref label="para" occurs="*"/>
</hedgeRule>

<elementRule pred="doc">
  <sequence>
    <ref label="title"/>
    <ref label="doc.body"/>
  </sequence>
</elementRule>

The reference to doc.body is expanded as below:

<elementRule pred="doc">
  <sequence>
    <ref label="title"/>
    <ref label="para" occurs="*"/>
  </sequence>
</elementRule>

In this example, a hedgeRule is referenced from an elementRule, but a hedgeRule may also reference another hedgeRule.
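
As a minimal sketch of such nesting, the rules below route the content of doc through two hedgeRules. The label doc.content is a hypothetical name introduced only for this illustration; title, para, and doc.body are taken from the example above.

<hedgeRule label="doc.body">
  <ref label="para" occurs="*"/>
</hedgeRule>

<hedgeRule label="doc.content">
  <sequence>
    <ref label="title"/>
    <ref label="doc.body"/>
  </sequence>
</hedgeRule>

<elementRule pred="doc">
  <ref label="doc.content"/>
</elementRule>

Once both references are expanded, the elementRule for doc should describe the same content as before: a title followed by zero or more para elements.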

STEP 1 explained that ref is a reference to an element type. However, this STEP explains that ref is a reference to a parameter entity. As of STEP 2, we do not have a consistent story yet. However, if no names are shared by element types and parameter entities, we do not have any conflicts.

1.2 Permissible hedge models

hedgeRule can have element hedge models only. Datatype references or mixed hedge models are not permitted. For example, the following rules are not permitted.

<hedgeRule label="mixed.param">
  <mixed>
    <choice occurs="*">
      <ref label="em"/>
      <ref label="strong"/>
    </choice>
  </mixed>
</hedgeRule>

<hedgeRule label="string.param" type="string"/>

A mixed hedge model cannot itself be placed in a hedgeRule, but the element hedge model subordinate to it can reference a hedgeRule. An example is shown below. The mixed hedge model references phrase, and phrase is described by a hedgeRule.

<hedgeRule label="phrase">
  <choice>
    <ref label="em"/>
    <ref label="strong"/>
  </choice>
</hedgeRule>

<elementRule pred="p">
  <mixed>
    <ref label="phrase" occurs="*"/>
  </mixed>
</elementRule>

1.3 The occurs attribute

A ref that references a parameter entity can have occurs, and the element hedge model specified in a hedgeRule can also have occurs. In the following example, both have occurs.

<hedgeRule label="bar">
  <sequence occurs="+" >
    <ref label="foo1"/>
    <ref label="foo2"/>
  </sequence>
</hedgeRule>

<elementRule pred="foo">
  <ref label="bar" occurs="*"/>
</elementRule>

If this example is rewritten as a DTD, the expansion of the parameter entity bar is obvious.

<!ENTITY % bar "(foo1, foo2)+">
<!-- original --> <!ELEMENT foo (%bar;)*>
<!-- expanded --> <!ELEMENT foo ((foo1, foo2)+)*>

In RELAX, we need a redundant choice so that the occurs attribute can be specified twice. That is, a choice is created whose only subordinate is the element hedge model specified in the hedgeRule, and the occurs attribute of the ref is copied to this choice.

The following shows expansion of the above example. Observe that the choice, which is introduced during expansion, inherits occurs="*" from the ref.

<elementRule pred="foo">
  <choice occurs="*">
    <sequence occurs="+" >
      <ref label="foo1"/>
      <ref label="foo2"/>
    </sequence>
  </choice>
</elementRule>

1.4 Occurrence order of ref and hedgeRule

Unlike parameter entities in a DTD, a hedgeRule does not have to precede the ref elements that reference it. For example, the following is not an error.

<elementRule pred="doc">
  <sequence>
    <ref label="title"/>
    <ref label="doc.body"/>
  </sequence>
</elementRule>

<hedgeRule label="doc.body">
  <ref label="para" occurs="*"/>
</hedgeRule>

1.5 Illegal reference to itself

A hedgeRule may not reference itself, directly or indirectly. The following is an error since the hedge model for bar references bar itself.

<hedgeRule label="bar">
  <choice>
    <ref label="title"/>
    <ref label="bar" occurs="*"/>
  </choice>
</hedgeRule>

In the following example, the hedge model for bar1 references bar2 and the hedge model for bar2 references bar1. This is therefore also an error.

<hedgeRule label="bar1">
  <ref label="bar2" occurs="*"/>
</hedgeRule>

<hedgeRule label="bar2">
  <choice>
    <ref label="title"/>
    <ref label="bar1"/>
  </choice>
</hedgeRule>
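
By contrast, a loop that passes through an element type is, as far as we understand the rule, not a self-reference of this kind, because a ref to an element type is not replaced during expansion. The sketch below is an assumption-laden illustration: section and section.body are hypothetical names, while title and para come from the earlier examples.

<hedgeRule label="section.body">
  <choice occurs="*">
    <ref label="para"/>
    <ref label="section"/>
  </choice>
</hedgeRule>

<elementRule pred="section">
  <sequence>
    <ref label="title"/>
    <ref label="section.body"/>
  </sequence>
</elementRule>

Expanding section.body inside the elementRule stops after one step: the ref to section names an element type and is left as it is, so no hedgeRule is reached a second time.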

1.6 Use of empty

empty, introduced in STEP 1, is typically used in hedgeRule. An example is shown below:

<hedgeRule label="frontMatter">
  <empty/>
</hedgeRule>

<elementRule pred="section">
  <sequence>
    <ref label="title"/>
    <ref label="frontMatter"/>
    <ref label="para" occurs="*"/>
  </sequence>
</elementRule>

Users of this module can change the structure of section by customizing the description of frontMatter.
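
As an illustration of such a customization, a user might replace the description of frontMatter with the hedgeRule below. The element type abstract is hypothetical and is assumed to be described elsewhere in the module.

<hedgeRule label="frontMatter">
  <ref label="abstract" occurs="?"/>
</hedgeRule>

With this description, the unchanged elementRule for section would permit a title, then an optional abstract, then zero or more para elements. With the original <empty/>, the reference to frontMatter contributes nothing to the sequence.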

1.7 Use of none

none, introduced in STEP 1, is also used in hedgeRule. An example is shown below:

<hedgeRule label="local-block-class">
  <none/>
</hedgeRule>

<hedgeRule label="block-class">
  <choice>
    <ref label="para"/>
    <ref label="fig"/>
    <ref label="local-block-class"/>
  </choice>
</hedgeRule>

Users of this module can change the structure of block-class by customizing the description of local-block-class.
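
As a sketch of such a customization, the description of local-block-class could be replaced as follows. The element type table is a hypothetical addition and is assumed to be described elsewhere in the module.

<hedgeRule label="local-block-class">
  <ref label="table"/>
</hedgeRule>

With the original <none/>, the third branch of the choice can never match, so block-class amounts to para or fig. After the replacement, block-class would permit para, fig, or table. Note that <empty/> would not serve as the default here, since it would make the choice match an empty hedge as well.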

2. Parameter entities used in attribute-list declarations

attList allows you to declare attributes once and reference the declarations repeatedly. In other words, attList mimics parameter entities referenced from attribute-list declarations.

2.1 Overview

The syntax of attList is shown below. foo is a name of a parameter entity.

<attList pred="foo">
  ...attribute definitions...
</attList>

To reference such an attList, we write <ref pred="foo"/> before the attribute declarations. This ref is replaced with the attribute declarations specified in the attList.

In the following example, the tag for the element type title references an attList. This tag is borrowed from the module at the beginning of STEP 1 and rewritten. The role attribute, which is common to many element types, is described by an attList named common.att.

<attList pred="common.att">
  <attribute name="role" type="NMTOKEN"/>
</attList>

<tag name="title">
  <ref pred="common.att"/>
  <attribute name="number" required="true" type="integer"/>
</tag>

This ref is expanded as below:

<tag name="title">
  <attribute name="role" type="NMTOKEN"/>
  <attribute name="number" required="true" type="integer"/>
</tag>

In this example, the attList is referenced from a tag, but an attList can also be referenced from another attList.

2.2 Occurrence order of ref and attList

Unlike parameter entities in a DTD, an attList does not have to precede the ref elements that reference it. For example, the following is not an error.

<tag name="title">
  <ref pred="common.att"/>
  <attribute name="number" required="true" type="integer"/>
</tag>

<attList pred="common.att">
  <attribute name="role" type="NMTOKEN"/>
</attList>

2.3 Multiple ref elements

A single tag or attList may contain more than one ref element. In the following example, an attList element contains two ref elements. Required attributes are grouped as common-req.att and optional attributes are grouped as common-opt.att. Both are referenced from the attList element for common.att.

<attList pred="common.att">
  <ref pred="common-req.att"/>
  <ref pred="common-opt.att"/>
</attList>

<attList pred="common-req.att">
  <attribute name="role" type="NMTOKEN" required="true"/>
</attList>

<attList pred="common-opt.att">
  <attribute name="id" type="NMTOKEN"/>
</attList>
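
To see the combined effect, suppose the tag for title from section 2.1, which contains <ref pred="common.att"/> followed by the number attribute, is used together with these three attList elements. Expanding the references step by step, first common.att and then common-req.att and common-opt.att, should give the following:

<tag name="title">
  <attribute name="role" type="NMTOKEN" required="true"/>
  <attribute name="id" type="NMTOKEN"/>
  <attribute name="number" required="true" type="integer"/>
</tag>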

2.4 Illegal reference to itself

As in the case of hedgeRule, a direct or indirect reference to itself is an error. For example, the following is an error.

<attList pred="bar1">
  <ref pred="bar2"/>
  <attribute name="id" type="NMTOKEN"/>
</attList>

<attList pred="bar2">
  <ref pred="bar1"/>
</attList>
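
The example above is an indirect reference, since bar1 reaches itself through bar2. For comparison, a direct self-reference would look like the sketch below; the name bar is used only for illustration.

<attList pred="bar">
  <ref pred="bar"/>
  <attribute name="id" type="NMTOKEN"/>
</attList>

In both cases, replacing the ref with the referenced attribute declarations would never terminate, which is why such references are errors.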

3. Summary

STEP 2 covers almost all features of XML DTD. Enjoy and RELAX!


mura034@attglobal.net
