$Id: divideAndValidate.html 1.3 2001/03/11 12:30:52 murata Exp $
7 March, 2001
To illustrate validation in RELAX Namespace, I wrote a small Java program. Given a non-monolithic XML document, this program decomposes it into a collection of islands, each of which belongs to a single namespace.
Each island can be validated by the RELAX Core processor. Furthermore, processors for different schema languages (even DTDs!) can be applied to these islands.
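As a rough illustration of how islands might be handed to different processors, the following Java sketch keeps one validator per namespace name. The class IslandDispatcher, its methods, and the choice of a Predicate over the island's serialized text are assumptions made for this example only; they are not the API of org.iso_relax.dispatcher or of any RELAX Core processor.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical registry: one validator per namespace name.
final class IslandDispatcher {
    private final Map<String, Predicate<String>> validators = new HashMap<>();

    /** Associates a validator (any schema language) with a namespace name. */
    void register(String namespaceName, Predicate<String> validator) {
        validators.put(namespaceName, validator);
    }

    /** Validates one island with the validator registered for its namespace. */
    boolean validate(String namespaceName, String islandXml) {
        Predicate<String> validator = validators.get(namespaceName);
        if (validator == null) {
            throw new IllegalStateException("no validator for " + namespaceName);
        }
        return validator.test(islandXml);
    }
}
```

A real validator would wrap a RELAX Core processor or a DTD-validating parser; here any function from island text to a boolean can stand in for it.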
If an element e and its parent element e' belong to different namespaces, e is detached from e'. Instead of e, a dummy node is introduced as a child element of e'.
A dummy node belongs to the namespace "http://www.xml.gr.jp/xmlns/dummy". The attribute "namespaceName" of the dummy node indicates the namespace of e.
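The detach-and-replace rule can be sketched with the standard DOM API as follows. This is a minimal sketch, not the program's actual source: the class name IslandSplitter and its methods are hypothetical, and the choice to work on a DOM tree and to report inner islands before the islands that contained them is an assumption.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the RELAX Namespace distribution.
final class IslandSplitter {
    static final String DUMMY_NS = "http://www.xml.gr.jp/xmlns/dummy";

    /** Parses the document and returns the root element of each island, innermost first. */
    static List<Element> split(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getNamespaceURI()
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<Element> islands = new ArrayList<>();
            Element root = doc.getDocumentElement();
            detachForeignChildren(root, islands);
            islands.add(root); // what remains of the root is the last island
            return islands;
        } catch (Exception ex) {
            throw new IllegalArgumentException("cannot parse document", ex);
        }
    }

    private static void detachForeignChildren(Element parent, List<Element> islands) {
        // Snapshot the child elements first, because the loop below edits the tree.
        List<Element> children = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                children.add((Element) n);
            }
        }
        for (Element child : children) {
            detachForeignChildren(child, islands); // handle descendants first
            if (!Objects.equals(parent.getNamespaceURI(), child.getNamespaceURI())) {
                // e and e' differ in namespace: put a dummy in place of e.
                Element dummy = parent.getOwnerDocument().createElementNS(DUMMY_NS, "dummy");
                String ns = child.getNamespaceURI();
                dummy.setAttribute("namespaceName", ns == null ? "" : ns);
                parent.replaceChild(dummy, child); // child is now detached
                islands.add(child);
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<doc:doc xmlns:doc='urn:document' xmlns:table='urn:table'>"
                + "<doc:para>this is a para</doc:para>"
                + "<table:table number='1'><table:row><table:cell>"
                + "<doc:para>1st para</doc:para></table:cell></table:row></table:table>"
                + "</doc:doc>";
        // Prints one line per island: its namespace name and local name.
        for (Element island : split(xml)) {
            System.out.println(island.getNamespaceURI() + " " + island.getLocalName());
        }
    }
}
```

Each returned element has a single namespace apart from its dummy children, so it can be serialized and passed to whichever validator handles that namespace.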
Consider an XML document as below:
<doc:doc xmlns:doc="urn:document" xmlns:table="urn:table">
<doc:para>this is a para</doc:para>
<table:table number="1">
<table:row>
<table:cell>
<doc:para>1st para</doc:para>
<doc:para>2nd para</doc:para>
</table:cell>
</table:row>
</table:table>
<table:table number="2">
<table:row>
<table:cell>
<doc:para>3rd para</doc:para>
<doc:para>4th para</doc:para>
</table:cell>
</table:row>
</table:table>
</doc:doc>
This non-monolithic document is decomposed into seven islands.
namespace$ java -cp "/crimson-1.1/jaxp.jar;/crimson-1.1/crimson.jar;/sax2/sax.jar;/relax/namespace/src/" org.iso_relax.dispatcher.TestDispatcher -n explain.xml
*Island start: urn:document
<para>1st para</para>
*Island end
*Island start: urn:document
<para>2nd para</para>
*Island end
*Island start: urn:table
<table number="1">
<row>
<cell>
<dummy namespaceName="urn:document
xmlns="http://www.xml.gr.jp/xmlns/dummy"></dummy>
<dummy namespaceName="urn:document
xmlns="http://www.xml.gr.jp/xmlns/dummy"></dummy>
</cell>
</row>
</table>
*Island end
*Island start: urn:document
<para>3rd para</para>
*Island end
*Island start: urn:document
<para>4th para</para>
*Island end
*Island start: urn:table
<table number="2">
<row>
<cell>
<dummy namespaceName="urn:document
xmlns="http://www.xml.gr.jp/xmlns/dummy"></dummy>
<dummy namespaceName="urn:document
xmlns="http://www.xml.gr.jp/xmlns/dummy"></dummy>
</cell>
</row>
</table>
*Island end
*Island start: urn:document
<doc>
<para>this is a para</para>
<dummy namespaceName="urn:table xmlns="http://www.xml.gr.jp/xmlns/dummy"></dummy>
<dummy namespaceName="urn:table xmlns="http://www.xml.gr.jp/xmlns/dummy"></dummy>
</doc>
*Island end
namespace$
The RSS 1.0 specification uses namespaces heavily. The second example in Section 7 of that specification uses six namespaces. It is decomposed as follows:
java -cp "/crimson-1.1/jaxp.jar;/crimson-1.1/crimson.jar;/sax2/sax.jar;/relax/namespace/src/" org.iso_relax.dispatcher.TestDispatcher -n rssExample.xml
*Island start: http://purl.org/dc/elements/1.1/
<publisher>The O'Reilly Network</publisher>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<creator>Rael Dornfest (mailto:rael@oreilly.com)</creator>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<rights>Copyright © 2000 O'Reilly & Associates, Inc.</rights>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<date>2000-01-01T12:00+00:00</date>
*Island end
*Island start: http://purl.org/rss/1.0/modules/syndication/
<updatePeriod>hourly</updatePeriod>
*Island end
*Island start: http://purl.org/rss/1.0/modules/syndication/
<updateFrequency>2</updateFrequency>
*Island end
*Island start: http://purl.org/rss/1.0/modules/syndication/
<updateBase>2000-01-01T12:00+00:00</updateBase>
*Island end
*Island start: http://www.w3.org/1999/02/22-rdf-syntax-ns#
<Seq>
<li resource="http://c.moreover.com/click/here.pl?r123"></li>
</Seq>
*Island end
*Island start: http://purl.org/rss/1.0/
<channel {http://www.w3.org/1999/02/22-rdf-syntax-ns#}about="http://meerkat.oreillynet.com/?_fl=rss1.0">
<title>Meerkat</title>
<link>http://meerkat.oreillynet.com</link>
<description>Meerkat: An Open Wire Service</description>
<dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/rss/1.0/modules/syndication/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/rss/1.0/modules/syndication/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/rss/1.0/modules/syndication/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<image {http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource="http://meerkat.oreillynet.com/icons/meerkat-powered.jpg"></image>
<items>
<dummy namespaceName="http://www.w3.org/1999/02/22-rdf-syntax-ns# xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
</items>
<textinput {http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource="http://meerkat.oreillynet.com"></textinput>
</channel>
*Island end
*Island start: http://purl.org/rss/1.0/
<image {http://www.w3.org/1999/02/22-rdf-syntax-ns#}about="http://meerkat.oreillynet.com/icons/meerkat-powered.jpg">
<title>Meerkat Powered!</title>
<url>http://meerkat.oreillynet.com/icons/meerkat-powered.jpg</url>
<link>http://meerkat.oreillynet.com</link>
</image>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<description>
XML is placing increasingly heavy loads on the existing technical
infrastructure of the Internet.
</description>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<publisher>The O'Reilly Network</publisher>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<creator>Simon St.Laurent (mailto:simonstl@simonstl.com)</creator>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<rights>Copyright © 2000 O'Reilly & Associates, Inc.</rights>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<subject>XML</subject>
*Island end
*Island start: http://purl.org/rss/1.0/modules/company/
<name>XML.com</name>
*Island end
*Island start: http://purl.org/rss/1.0/modules/company/
<market>NASDAQ</market>
*Island end
*Island start: http://purl.org/rss/1.0/modules/company/
<symbol>XML</symbol>
*Island end
*Island start: http://purl.org/rss/1.0/
<item {http://www.w3.org/1999/02/22-rdf-syntax-ns#}about="http://c.moreover.com/click/here.pl?r123">
<title>XML: A Disruptive Technology</title>
<link>http://c.moreover.com/click/here.pl?r123</link>
<dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/rss/1.0/modules/company/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/rss/1.0/modules/company/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/rss/1.0/modules/company/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
</item>
*Island end
*Island start: http://purl.org/rss/1.0/modules/textinput/
<function>search</function>
*Island end
*Island start: http://purl.org/rss/1.0/modules/textinput/
<inputType>regex</inputType>
*Island end
*Island start: http://purl.org/rss/1.0/
<textinput {http://www.w3.org/1999/02/22-rdf-syntax-ns#}about="http://meerkat.oreillynet.com">
<title>Search Meerkat</title>
<description>Search Meerkat's RSS Database...</description>
<name>s</name>
<link>http://meerkat.oreillynet.com/</link>
<dummy namespaceName="http://purl.org/rss/1.0/modules/textinput/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/rss/1.0/modules/textinput/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
</textinput>
*Island end
*Island start: http://www.w3.org/1999/02/22-rdf-syntax-ns#
<RDF>
<dummy namespaceName="http://purl.org/rss/1.0/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/rss/1.0/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/rss/1.0/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
<dummy namespaceName="http://purl.org/rss/1.0/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
</RDF>
*Island end