2013-11-26

What is an XML namespace - with ISO 19139 in mind

Today I had an opportunity to explain what an XML namespace is. This might happen again, so I am saving the text for future reference.




ISO/TS 19139 uses XML Namespaces, 1999 edition http://www.w3.org/TR/1999/REC-xml-names-19990114/.  There are several editions, but I don't think there are significant differences among them.
XML Namespaces is a mechanism for identifying XML elements and attributes defined by many different people.  An element is identified by its ``local part'' (the substring after the colon, or the entire name if there is no colon) and a namespace URI.  For example, our famous root element of ISO 19139 metadata is conceptually written as
    {http://www.isotc211.org/2005/gmd}MD_Metadata

where the braces {} enclose the namespace URI (this is an unofficial notation, often called Clark notation, used in libxml2-based tools such as lxml).
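
To make the brace notation concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree, which happens to report element names in the same {URI}local form.  The one-element document is a made-up example, not a valid ISO 19139 record.

```python
import xml.etree.ElementTree as ET

# A tiny, hypothetical document; only the root element matters here.
doc = '<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"/>'
root = ET.fromstring(doc)

# ElementTree reports the name as {namespace URI}local part.
print(root.tag)  # {http://www.isotc211.org/2005/gmd}MD_Metadata

# Split the expanded name back into its two identifying parts.
uri, local = root.tag[1:].split("}", 1)
print(uri)    # http://www.isotc211.org/2005/gmd
print(local)  # MD_Metadata
```

Note that the prefix "gmd" does not appear anywhere in the parsed name; only the URI and the local part survive.
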
An XML Schema defines the structure of the elements and attributes identified in this way.  For example,

  {http://www.isotc211.org/2005/gmd}MD_Metadata 

must have

  {http://www.isotc211.org/2005/gmd}dateStamp 

as a child element.  So far there is no ambiguity at all.

In real XML documents, unfortunately, the element is encoded in a weird way.  You have to choose a ``prefix'', which is a string composed of XML name characters.  In most cases "gmd" is used for http://www.isotc211.org/2005/gmd, but that is not mandatory.  Then the prefix and the local part are combined with a colon, like "gmd:MD_Metadata".

The relationship between the prefix and the namespace URI is declared by a ``namespace declaration'', which is an XML attribute of the form xmlns:prefix="namespace-URI".  For example, gmd is typically declared as xmlns:gmd="http://www.isotc211.org/2005/gmd".  If an element does not carry a namespace declaration for a prefix it needs, the parser uses the one on the parent element (or on the grandparent if that is missing, or on the great-grandparent, and so on up to the root).
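
The lookup through the ancestors can be sketched as follows, again with Python's xml.etree.ElementTree.  The document is a simplified, hypothetical fragment; the point is only that the child elements carry no declarations of their own.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"

# Both prefixes are declared once, on the root element only.
doc = f"""<gmd:MD_Metadata xmlns:gmd="{GMD}" xmlns:gco="{GCO}">
  <gmd:dateStamp>
    <gco:Date>2013-11-26</gco:Date>
  </gmd:dateStamp>
</gmd:MD_Metadata>"""

root = ET.fromstring(doc)

# Descendants resolve their prefixes through the ancestors' declarations.
resolved = [el.tag for el in root.iter()]
for name in resolved:
    print(name)

# If no ancestor declares the prefix, the parser rejects the document.
try:
    ET.fromstring("<gmd:MD_Metadata/>")
    unbound_rejected = False
except ET.ParseError:
    unbound_rejected = True
print(unbound_rejected)  # True
```
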

The same element may be written without a prefix.  In this special case an XML attribute of the form xmlns="namespace-URI" declares the relationship for unprefixed element names such as MD_Metadata.  The rest is the same.  This is the ``default namespace'', which is currently prohibited in the WCMP.  I'm sorry I can't explain why.  I consider it an unnecessary restriction, but it somehow slipped into the specification and I was unable to remove it at that time.  It's okay for me, since I can live with that.

[Note 2013-12-18: I learned that there is an implementation of the WIS catalogue that breaks on the default namespace declaration. No way.]
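
As a small check of the claim that the prefixed and the default-namespace forms describe the same elements, the sketch below parses both with Python's xml.etree.ElementTree and compares the resulting names.  The helper expanded_names is a hypothetical name introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"

# The same (simplified) content, once with a prefix and once with
# a default namespace declaration.
prefixed = f'<gmd:MD_Metadata xmlns:gmd="{GMD}"><gmd:dateStamp/></gmd:MD_Metadata>'
unprefixed = f'<MD_Metadata xmlns="{GMD}"><dateStamp/></MD_Metadata>'


def expanded_names(xml_text):
    """Return the {URI}local names of all elements in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


same = expanded_names(prefixed) == expanded_names(unprefixed)
print(same)  # True
```

This only shows what a namespace-aware parser sees; it says nothing about how a particular catalogue implementation behaves.
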

Attributes are a bit different.  It is worth noting that the default namespace declaration does not apply to unprefixed attributes.
An attribute without a prefix is called a ``per-element-type'' attribute, which is defined for each element type.  No namespace URI is defined for such attributes.  The attribute codeSpace in the elements of codelist type falls into this category.
An attribute with a prefix is called a ``global'' attribute, which must always be written with its prefix.  The attribute gco:nilReason is an example in ISO 19139.  One common mistake is to apply the prefix of the parent element to an attribute defined as per-element-type.  It just doesn't work.
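
The difference between the two kinds of attributes can be observed with Python's xml.etree.ElementTree.  This is a sketch under assumptions: the nesting below is simplified and not schema-valid, and the attribute values are made up.  The third child reproduces the common mistake of prefixing a per-element-type attribute.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"

# Hypothetical fragment; a default namespace is in effect for elements.
doc = f"""<MD_Metadata xmlns="{GMD}" xmlns:gmd="{GMD}" xmlns:gco="{GCO}">
  <dateStamp gco:nilReason="unknown"/>
  <MD_ScopeCode codeListValue="dataset" codeSpace="example"/>
  <MD_ScopeCode codeListValue="dataset" gmd:codeSpace="example"/>
</MD_Metadata>"""

root = ET.fromstring(doc)
date_stamp, good, bad = list(root)

# The default namespace applies to the element name ...
print(good.tag)  # {http://www.isotc211.org/2005/gmd}MD_ScopeCode

# ... but not to its unprefixed attributes, which have no namespace URI.
print(sorted(good.attrib))  # ['codeListValue', 'codeSpace']

# A global attribute keeps the namespace URI bound to its prefix.
print(list(date_stamp.attrib))  # ['{http://www.isotc211.org/2005/gco}nilReason']

# The mistake: gmd:codeSpace is a different attribute from codeSpace.
print(bad.get("codeSpace"))  # None
```
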

Please note that the prefix can be *arbitrarily* chosen.  Nobody said that the last three letters of the namespace URI should be the prefix.  But I hope nobody tries

<gml:MD_Metadata
  xmlns:gml="http://www.isotc211.org/2005/gmd"
  xmlns:html="http://www.isotc211.org/2005/gco"
  xmlns:excel="http://www.opengis.net/gml/3.2"
>

which is perfectly compliant but totally human-unfriendly.

If you are using common prefixes, you'll probably write it as follows:

<gmd:MD_Metadata
  xmlns:gmd="http://www.isotc211.org/2005/gmd"
  xmlns:gco="http://www.isotc211.org/2005/gco"
  xmlns:gml="http://www.opengis.net/gml/3.2"
>
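
To show that the choice of prefix really does not matter to software, here is a sketch that reads the same value from both styles of document with Python's xml.etree.ElementTree.  The child elements are added only for illustration, and the prefixes "m" and "c" in the lookup map are yet another arbitrary choice made by the reading side.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
GML = "http://www.opengis.net/gml/3.2"

# Human-unfriendly prefixes, bound to the usual namespace URIs.
unfriendly = f"""<gml:MD_Metadata
  xmlns:gml="{GMD}"
  xmlns:html="{GCO}"
  xmlns:excel="{GML}">
  <gml:dateStamp><html:Date>2013-11-26</html:Date></gml:dateStamp>
</gml:MD_Metadata>"""

# The common prefixes.
friendly = f"""<gmd:MD_Metadata
  xmlns:gmd="{GMD}"
  xmlns:gco="{GCO}"
  xmlns:gml="{GML}">
  <gmd:dateStamp><gco:Date>2013-11-26</gco:Date></gmd:dateStamp>
</gmd:MD_Metadata>"""

# The reader maps its own prefixes to the URIs; only the URIs must match.
ns = {"m": GMD, "c": GCO}
dates = [
    ET.fromstring(doc).find("m:dateStamp/c:Date", ns).text
    for doc in (unfriendly, friendly)
]
print(dates)  # ['2013-11-26', '2013-11-26']
```
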

These namespace declarations are typically all placed in the opening tag of the root element, but that is not a requirement.  It is possible to write xmlns:gml only at the top element of a subtree that contains gml elements, and it is also possible to attach the namespace declarations to every single XML element.  Wherever a declaration sits, though, it must use exactly the same namespace URI, because the URI (not the prefix) is what identifies the element.
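
The three placements can be compared with a short sketch in Python's xml.etree.ElementTree.  The fragment is hypothetical and the nesting is simplified (it is not schema-valid); names is a helper introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GML = "http://www.opengis.net/gml/3.2"

# 1. Everything declared on the root element.
at_root = f"""<gmd:MD_Metadata xmlns:gmd="{GMD}" xmlns:gml="{GML}">
  <gmd:extent>
    <gml:TimePeriod><gml:beginPosition>2013-01-01</gml:beginPosition></gml:TimePeriod>
  </gmd:extent>
</gmd:MD_Metadata>"""

# 2. xmlns:gml declared only at the top of the subtree that uses it.
at_subtree = f"""<gmd:MD_Metadata xmlns:gmd="{GMD}">
  <gmd:extent>
    <gml:TimePeriod xmlns:gml="{GML}"><gml:beginPosition>2013-01-01</gml:beginPosition></gml:TimePeriod>
  </gmd:extent>
</gmd:MD_Metadata>"""

# 3. Declarations repeated on every element.
everywhere = f"""<gmd:MD_Metadata xmlns:gmd="{GMD}">
  <gmd:extent xmlns:gmd="{GMD}">
    <gml:TimePeriod xmlns:gml="{GML}"><gml:beginPosition xmlns:gml="{GML}">2013-01-01</gml:beginPosition></gml:TimePeriod>
  </gmd:extent>
</gmd:MD_Metadata>"""


def names(xml_text):
    """Return the {URI}local names of all elements in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


all_equal = names(at_root) == names(at_subtree) == names(everywhere)
print(all_equal)  # True
```
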
