$Id: divideAndValidate.html 1.3 2001/03/11 12:30:52 murata Exp $
7 March, 2001
To illustrate validation in RELAX Namespace, I wrote a small Java program. Given a non-monolithic XML document, this program decomposes it into a collection of islands, each of which belongs to a single namespace.
Each island can be validated by the RELAX Core processor. Furthermore, processors for different schema languages (even DTDs!) can be applied to these islands.
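How islands might be handed to different processors can be sketched in Java as below. This is only an illustration under stated assumptions, not the dispatcher of RELAX Namespace: the class IslandDispatcherSketch, the nested Island class, and the methods register and rejected are hypothetical names, and a plain Predicate stands in for a real processor such as RELAX Core or a DTD validator.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: chooses a validator for each island by its namespace name.
class IslandDispatcherSketch {

    /** Hypothetical island representation: a namespace name plus serialized XML. */
    static final class Island {
        final String namespaceName;
        final String xml;

        Island(String namespaceName, String xml) {
            this.namespaceName = namespaceName;
            this.xml = xml;
        }
    }

    // One validator per namespace name; each Predicate stands in for a real processor.
    private final Map<String, Predicate<Island>> validators = new LinkedHashMap<>();

    void register(String namespaceName, Predicate<Island> validator) {
        validators.put(namespaceName, validator);
    }

    /** Returns the islands that failed validation or have no registered validator. */
    List<Island> rejected(List<Island> islands) {
        List<Island> bad = new ArrayList<>();
        for (Island island : islands) {
            Predicate<Island> validator = validators.get(island.namespaceName);
            if (validator == null || !validator.test(island)) {
                bad.add(island);
            }
        }
        return bad;
    }
}
```

Keying the table on the namespace name is what would let one namespace be checked by a RELAX Core processor and another by a completely different kind of processor, since the dispatcher itself never looks inside an island.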
If an element e and its parent element e' belong to different namespaces, e is detached from e'. In place of e, a dummy node is introduced as a child element of e'.
A dummy node belongs to the namespace "http://www.xml.gr.jp/xmlns/dummy". The attribute "namespaceName" of the dummy node indicates the namespace of e.
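The detaching rule above can be sketched in Java with the standard DOM API. This is a minimal sketch and not the source of the actual program: the class name IslandSplitter and the methods split, detach, and ns are assumptions made for illustration, and the real program may well work on a SAX event stream rather than a DOM tree.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical DOM-based sketch of the decomposition into islands.
class IslandSplitter {
    static final String DUMMY_NS = "http://www.xml.gr.jp/xmlns/dummy";

    /** Parses the document and returns the island roots, innermost islands first. */
    static List<Element> split(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for getNamespaceURI()
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<Element> islands = new ArrayList<>();
        Element root = doc.getDocumentElement();
        detach(root, islands);
        islands.add(root); // what remains of the root is the last island
        return islands;
    }

    /** Replaces every child element whose namespace differs from its parent's by a dummy. */
    static void detach(Element parent, List<Element> islands) {
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before the tree is modified
            if (child instanceof Element) {
                Element e = (Element) child;
                detach(e, islands); // handle descendants first
                if (!ns(parent).equals(ns(e))) {
                    Element dummy = parent.getOwnerDocument()
                            .createElementNS(DUMMY_NS, "dummy");
                    dummy.setAttribute("namespaceName", ns(e));
                    parent.replaceChild(dummy, e); // e is now detached from its parent
                    islands.add(e);
                }
            }
            child = next;
        }
    }

    /** Namespace name of an element, with "" standing for "no namespace". */
    static String ns(Element e) {
        return e.getNamespaceURI() == null ? "" : e.getNamespaceURI();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a:a xmlns:a='urn:a' xmlns:b='urn:b'><b:b><a:c/></b:b></a:a>";
        for (Element island : split(xml)) {
            System.out.println(ns(island) + " " + island.getLocalName());
        }
    }
}
```

Because descendants are processed before their ancestors, the innermost islands are collected first and the island containing the document element comes last.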
Consider an XML document as below:
<doc:doc xmlns:doc="urn:document" xmlns:table="urn:table">
  <doc:para>this is a para</doc:para>
  <table:table number="1">
    <table:row>
      <table:cell>
        <doc:para>1st para</doc:para>
        <doc:para>2nd para</doc:para>
      </table:cell>
    </table:row>
  </table:table>
  <table:table number="2">
    <table:row>
      <table:cell>
        <doc:para>3rd para</doc:para>
        <doc:para>4th para</doc:para>
      </table:cell>
    </table:row>
  </table:table>
</doc:doc>
This non-monolithic document is decomposed into seven islands.
namespace$ java -cp "/crimson-1.1/jaxp.jar;/crimson-1.1/crimson.jar;/sax2/sax.jar;/relax/namespace/src/" org.iso_relax.dispatcher.TestDispatcher -n explain.xml
*Island start: urn:document
<para>1st para</para>
*Island end
*Island start: urn:document
<para>2nd para</para>
*Island end
*Island start: urn:table
<table number="1">
  <row>
    <cell>
      <dummy namespaceName="urn:document xmlns="http://www.xml.gr.jp/xmlns/dummy"></dummy>
      <dummy namespaceName="urn:document xmlns="http://www.xml.gr.jp/xmlns/dummy"></dummy>
    </cell>
  </row>
</table>
*Island end
*Island start: urn:document
<para>3rd para</para>
*Island end
*Island start: urn:document
<para>4th para</para>
*Island end
*Island start: urn:table
<table number="2">
  <row>
    <cell>
      <dummy namespaceName="urn:document xmlns="http://www.xml.gr.jp/xmlns/dummy"></dummy>
      <dummy namespaceName="urn:document xmlns="http://www.xml.gr.jp/xmlns/dummy"></dummy>
    </cell>
  </row>
</table>
*Island end
*Island start: urn:document
<doc>
  <para>this is a para</para>
  <dummy namespaceName="urn:table xmlns="http://www.xml.gr.jp/xmlns/dummy"></dummy>
  <dummy namespaceName="urn:table xmlns="http://www.xml.gr.jp/xmlns/dummy"></dummy>
</doc>
*Island end
namespace$
The RSS specification uses namespaces heavily. The second example in Section 7 has six namespaces. This document is decomposed as follows:
java -cp "/crimson-1.1/jaxp.jar;/crimson-1.1/crimson.jar;/sax2/sax.jar;/relax/namespace/src/" org.iso_relax.dispatcher.TestDispatcher -n rssExample.xml
*Island start: http://purl.org/dc/elements/1.1/
<publisher>The O'Reilly Network</publisher>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<creator>Rael Dornfest (mailto:rael@oreilly.com)</creator>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<rights>Copyright ? 2000 O'Reilly & Associates, Inc.</rights>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<date>2000-01-01T12:00+00:00</date>
*Island end
*Island start: http://purl.org/rss/1.0/modules/syndication/
<updatePeriod>hourly</updatePeriod>
*Island end
*Island start: http://purl.org/rss/1.0/modules/syndication/
<updateFrequency>2</updateFrequency>
*Island end
*Island start: http://purl.org/rss/1.0/modules/syndication/
<updateBase>2000-01-01T12:00+00:00</updateBase>
*Island end
*Island start: http://www.w3.org/1999/02/22-rdf-syntax-ns#
<Seq>
  <li resource="http://c.moreover.com/click/here.pl?r123"></li>
</Seq>
*Island end
*Island start: http://purl.org/rss/1.0/
<channel {http://www.w3.org/1999/02/22-rdf-syntax-ns#}about="http://meerkat.oreillynet.com/?_fl=rss1.0">
  <title>Meerkat</title>
  <link>http://meerkat.oreillynet.com</link>
  <description>Meerkat: An Open Wire Service</description>
  <dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/rss/1.0/modules/syndication/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/rss/1.0/modules/syndication/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/rss/1.0/modules/syndication/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <image {http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource="http://meerkat.oreillynet.com/icons/meerkat-powered.jpg"></image>
  <items>
    <dummy namespaceName="http://www.w3.org/1999/02/22-rdf-syntax-ns# xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  </items>
  <textinput {http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource="http://meerkat.oreillynet.com"></textinput>
</channel>
*Island end
*Island start: http://purl.org/rss/1.0/
<image {http://www.w3.org/1999/02/22-rdf-syntax-ns#}about="http://meerkat.oreillynet.com/icons/meerkat-powered.jpg">
  <title>Meerkat Powered!</title>
  <url>http://meerkat.oreillynet.com/icons/meerkat-powered.jpg</url>
  <link>http://meerkat.oreillynet.com</link>
</image>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<description>
  XML is placing increasingly heavy loads on the existing technical infrastructure of the Internet.
</description>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<publisher>The O'Reilly Network</publisher>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<creator>Simon St.Laurent (mailto:simonstl@simonstl.com)</creator>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<rights>Copyright ? 2000 O'Reilly & Associates, Inc.</rights>
*Island end
*Island start: http://purl.org/dc/elements/1.1/
<subject>XML</subject>
*Island end
*Island start: http://purl.org/rss/1.0/modules/company/
<name>XML.com</name>
*Island end
*Island start: http://purl.org/rss/1.0/modules/company/
<market>NASDAQ</market>
*Island end
*Island start: http://purl.org/rss/1.0/modules/company/
<symbol>XML</symbol>
*Island end
*Island start: http://purl.org/rss/1.0/
<item {http://www.w3.org/1999/02/22-rdf-syntax-ns#}about="http://c.moreover.com/click/here.pl?r123">
  <title>XML: A Disruptive Technology</title>
  <link>http://c.moreover.com/click/here.pl?r123</link>
  <dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/dc/elements/1.1/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/rss/1.0/modules/company/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/rss/1.0/modules/company/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/rss/1.0/modules/company/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
</item>
*Island end
*Island start: http://purl.org/rss/1.0/modules/textinput/
<function>search</function>
*Island end
*Island start: http://purl.org/rss/1.0/modules/textinput/
<inputType>regex</inputType>
*Island end
*Island start: http://purl.org/rss/1.0/
<textinput {http://www.w3.org/1999/02/22-rdf-syntax-ns#}about="http://meerkat.oreillynet.com">
  <title>Search Meerkat</title>
  <description>Search Meerkat's RSS Database...</description>
  <name>s</name>
  <link>http://meerkat.oreillynet.com/</link>
  <dummy namespaceName="http://purl.org/rss/1.0/modules/textinput/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/rss/1.0/modules/textinput/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
</textinput>
*Island end
*Island start: http://www.w3.org/1999/02/22-rdf-syntax-ns#
<RDF>
  <dummy namespaceName="http://purl.org/rss/1.0/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/rss/1.0/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/rss/1.0/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
  <dummy namespaceName="http://purl.org/rss/1.0/ xmlns="http://www.xml.gr.jp/xmlns/dummy"/>
</RDF>
*Island end