2013-12-18

XSLT to extract metadata from OAI-PMH GetRecord response

The GISCs (Global Information System Centres) of the WMO Information System exchange metadata records using OAI-PMH.  OAI-PMH is an HTTP-based protocol in which the server's response is an XML document that encapsulates one or more metadata records.

It sounds so easy: just extract /OAI-PMH/GetRecord/record/metadata/gmd:MD_Metadata.  The following command should suffice:

$ xmllint --xpath '//*[local-name()="MD_Metadata"][1]' input.xml > output.xml

Even with an older version of libxml2, where xmllint has no --xpath option, a few lines of equivalent XSLT would do the same job.  Or so I thought until today.  It turned out not to be good enough.

For some reason, WMO Core Metadata Profile version 1.3 prohibits the use of a default namespace declaration.  But the above command produces an unwanted default namespace declaration whenever the OAI-PMH response uses one.  Oh no!
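
To see where the unwanted declaration comes from, here is a minimal Python sketch (standard library only) that walks up from MD_Metadata and reports the nearest xmlns= declaration in scope.  The sample response is a made-up, heavily trimmed stand-in for a real GetRecord response, and the function name is hypothetical.

```python
from xml.dom import minidom

# Hypothetical, heavily trimmed GetRecord response; a real one has more elements.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord><record><metadata>
    <gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">
      <gmd:fileIdentifier/>
    </gmd:MD_Metadata>
  </metadata></record></GetRecord>
</OAI-PMH>"""


def inherited_default_namespace(element):
    """Return the default namespace URI in scope for element, or None."""
    node = element
    # Climb through the ancestors until the document node is reached.
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        if node.hasAttribute("xmlns"):
            # An empty value (xmlns="") undeclares the default namespace.
            return node.getAttribute("xmlns") or None
        node = node.parentNode
    return None


doc = minidom.parseString(SAMPLE)
md = doc.getElementsByTagNameNS("http://www.isotc211.org/2005/gmd", "MD_Metadata")[0]
print(inherited_default_namespace(md))
```

MD_Metadata itself declares only the gmd prefix, but the default namespace of the OAI-PMH envelope is still in scope on it, so a tool that copies the element together with its in-scope namespaces writes that declaration into the output.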

Removing a namespace declaration is a bit tricky.  The exclude-result-prefixes attribute applies only to literal result elements.  In practice that means writing <gmd:MD_Metadata> as a literal result element rather than generating it with xsl:copy-of, xsl:copy, or xsl:element.

<xsl:stylesheet version="1.0"
 xmlns:gmd="http://www.isotc211.org/2005/gmd"
 xmlns:oai="http://www.openarchives.org/OAI/2.0/"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 exclude-result-prefixes="oai"
 >
<xsl:output method="xml" omit-xml-declaration="no" />
<xsl:template match="/">
 <xsl:apply-templates select=".//oai:metadata/*[1]"/>
</xsl:template>
<xsl:template match="gmd:MD_Metadata">
 <!-- this is the literal result element, so exclude-result-prefixes applies here -->
 <gmd:MD_Metadata>
  <!-- you might wish to override xsi:schemaLocation with the one from the ISO standard -->
  <xsl:apply-templates select="*|@*|text()"/>
 </gmd:MD_Metadata>
</xsl:template>
<xsl:template match="*">
 <!-- xsl:copy-of would bring the unwanted xmlns= back on elements under MD_Metadata -->
 <xsl:copy>
  <xsl:apply-templates select="*|@*|text()"/>
 </xsl:copy>
</xsl:template>
<xsl:template match="@*">
 <xsl:copy-of select="." />
</xsl:template>
</xsl:stylesheet>
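
To check the result, something like the following Python sketch can confirm that no xmlns= declaration survives in the output.  It assumes the transformed record is available as a string; the function name and the two sample strings are made up for illustration and are not part of any WMO tooling.

```python
from xml.dom import minidom


def elements_declaring_default_ns(xml_text):
    """Return the names of all elements that carry an xmlns="..." declaration."""
    doc = minidom.parseString(xml_text)
    return [
        el.tagName
        for el in doc.getElementsByTagName("*")
        if el.hasAttribute("xmlns")
    ]


# Hypothetical outputs: one clean, one with the unwanted declaration.
clean = (
    '<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">'
    "<gmd:fileIdentifier/></gmd:MD_Metadata>"
)
dirty = (
    '<gmd:MD_Metadata xmlns="http://www.openarchives.org/OAI/2.0/" '
    'xmlns:gmd="http://www.isotc211.org/2005/gmd"/>'
)

print(elements_declaring_default_ns(clean))
print(elements_declaring_default_ns(dirty))
```

An empty list means the record is free of default namespace declarations; any names in the list point at the elements that still need attention.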
