2013-12-20

XSLT (XPath) cannot find where the namespace declaration is

Yesterday I got stuck on a tricky feature of XML.

Well, somehow I had to remove redundant default namespace declarations (the things that look like an attribute "xmlns") from a large number of huge XML documents.  That job is incomplete, and this is an interim memorandum.

Before actually doing the job, I wanted to see how many such declarations there are and where.  Surprisingly, XSLT cannot do this.  The "namespace::" axis of XPath does not match the literal text of the XML serialization; it matches conceptual namespace nodes, which are attached to every element within the scope of a declaration [XPath].  A declaration and its inherited copies therefore look identical, so XPath cannot detect a redundant namespace declaration.

I think it is necessary to write a program using an XML parser.  I wrote a Ruby script that works with the libxml2 Reader interface.
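The Ruby script itself is not reproduced here.  As a rough illustration of the same idea, here is a minimal Python sketch using the standard-library expat parser instead of the libxml2 Reader; the function name find_redundant_default_ns is made up for this note.  Without namespace processing, expat reports xmlns as an ordinary attribute, so each declaration can be compared with the default namespace inherited from the parent.

```python
import xml.parsers.expat


def find_redundant_default_ns(xml_bytes):
    """Return (line, element name, URI) for each redundant xmlns="..." declaration.

    A declaration is treated as redundant when it repeats the default
    namespace that is already in scope from an ancestor element.
    """
    hits = []
    # Default namespace in scope for each open element; "" means "none".
    stack = [""]

    # No namespace_separator given: expat leaves xmlns as a plain attribute.
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        inherited = stack[-1]
        declared = attrs.get("xmlns")
        if declared is not None and declared == inherited:
            hits.append((parser.CurrentLineNumber, name, declared))
        stack.append(inherited if declared is None else declared)

    def end(name):
        stack.pop()

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_bytes, True)
    return hits


if __name__ == "__main__":
    sample = b'<a xmlns="urn:x"><b xmlns="urn:x"><c xmlns="urn:y"/></b><d/></a>'
    for line, name, uri in find_redundant_default_ns(sample):
        print(line, name, uri)
```

This only reports the declarations; removing them safely is a separate job, and prefixed declarations (xmlns:foo) are ignored in this sketch.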



2013-12-18

XSLT to extract metadata from OAI-PMH GetRecord response

Between the GISCs of WMO Information System, a metadata record is exchanged using OAI-PMH.  OAI-PMH is an HTTP-based protocol, in which the server's response is an XML document that encapsulates metadata record(s).

It sounds so easy.  It's just a matter of extracting /OAI-PMH/GetRecord/record/metadata/gmd:MD_Metadata. The following command would suffice:

$ xmllint --xpath '//*[local-name()="MD_Metadata"][1]' input.xml > output.xml

Even with an older version of libxml2, a few lines of equivalent XSLT would do the same job. That is what I thought until today, but it turned out not to be good enough.

WMO Core Metadata Profile version 1.3 somehow prohibits the use of a default namespace declaration.  But the above command produces an undesired default namespace declaration if the OAI-PMH response uses one.  Oh no....!

It is a bit tricky to remove a namespace declaration.  The exclude-result-prefixes attribute works only on literal result elements.  That means you have to write <gmd:MD_Metadata> literally instead of using xsl:copy-of, xsl:copy or xsl:element.

<xsl:stylesheet version="1.0"
 xmlns:gmd="http://www.isotc211.org/2005/gmd"
 xmlns:oai="http://www.openarchives.org/OAI/2.0/"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 exclude-result-prefixes="oai"
 >
<xsl:output method="xml" omit-xml-declaration="no" />
<xsl:template match="/">
 <xsl:apply-templates select=".//oai:metadata/*[1]"/>
</xsl:template>
<xsl:template match="gmd:MD_Metadata">
 <!-- this is the literal result element -->
 <gmd:MD_Metadata>
  <!-- you might wish to override xsi:schemaLocation by ISO standard -->
  <xsl:apply-templates select="*|@*|text()"/>
 </gmd:MD_Metadata>
</xsl:template>
<xsl:template match="*">
 <!-- xsl:copy-of brings undesirable xmlns= even under MD_Metadata -->
 <xsl:copy>
 <xsl:apply-templates select="*|@*|text()"/>
 </xsl:copy>
</xsl:template>
<xsl:template match="@*">
 <xsl:copy-of select="." />
</xsl:template>
</xsl:stylesheet>

2013-12-17

Note on NASA DIF (Directory Interchange Format) and GCMD Keywords

For a long time I knew only the name of DIF (Directory Interchange Format), which is used in GCMD (Global Change Master Directory), a catalogue operated by NASA.  Recently I have been interacting with more and more people who are interested in using GCMD keywords in WMO/WIS Discovery Metadata, which is an extension of ISO 19139.

Resources I found in a quick search:
In the WIS community there was a question about gmd:keywordType and the uniqueness of gmd:thesaurusName.  Our profile, WCMP 1.3, requires that a given thesaurusName appear only once.  The GCMD keyword tables contain keywords of different types.  If a metadata creator wishes to use gmd:keywordType to clarify the category of the keywords, he/she has to split the MD_Keywords block by keywordType.

In the original mapping by GCMD, there is no such issue.  The ISO element "keyword" is mapped only from "Keyword" in DIF, which is free text.  But the most complex DIF element, "Parameters", is mapped to the old ISO element "category", which has probably been superseded by topicCategory; unfortunately that is an enumeration and hence no longer extensible.  So, very unfortunately, that mapping has no contemporary meaning.

So I move on to a more realistic mapping implementation by AADC.  It creates MD_Keywords from the following DIF elements:
Apparently we do need to deal with using the GCMD thesaurusName for multiple keyword types.

TT-ApMD-2 (see para 28) was aware of that situation, and recommended slightly changing the title in thesaurusName/*/title as follows:

"NASA/Global Change Master Directory (GCMD) Earth Science Keywords. Version 8.0.0.0.0.  (for theme)"

I know this is ugly and opinions still differ, and I really hope we reach some agreement....



WMO Common Code Table C-15 (Physical quantities) and QUDT unit of measurement

A new common code table C-15 (Physical quantities) is under development in the WMO Manual on Codes.  It is to be served as an online registry at http://codes.wmo.int/common/c-15.  In my understanding the primary motivation at the moment is to provide semantic descriptions of quantities used in the Aviation XML.

The table is of course a list of entries, each of which describes a quantity, for example "airTemperature" http://codes.wmo.int/common/c-15/me/airTemperature.  Looking at the table, there is a field "generalization" with the value "ThermodynamicTemperature" that links to http://qudt.org/vocab/quantity#ThermodynamicTemperature.

This is a link to QUDT.  The top page describes only the SI and CGS systems, but other conventional units seem to be covered as well.

2013-12-06

ambiguity in pressure level heights of TAC TEMP which really caused trouble


It has recently come to attention that unnatural values of geopotential height are sometimes reported in the BUFR TEMP message for 89532 SYOWA in Antarctica. That message is converted by RTH Tokyo from the traditional alphanumeric code (TAC) FM 35. The issue is partly a problem in the conversion software (handling of negative values), but it also partly stems from an inherent ambiguity in the TAC TEMP format: a location-independent algorithm fails to estimate the "upper digits", especially at 700 hPa.
TAC/BUFR conversion is commonly performed worldwide, and other converters might have the same problem, though the situation has not been surveyed yet.

2013-12-04

"iso" URN namespace - machine-readable reference to ISO standards


I found that IETF RFC 5141 http://tools.ietf.org/html/rfc5141 defines the "iso" URN namespace.  That makes it possible to cite ISO standards in a computer-readable manner.  The metadata standards used in the WMO Core Metadata Profile can be cited as follows:
  • urn:iso:std:iso:19115:2003:en
  • urn:iso:std:iso:19115:cor-1:2006:en
  • urn:iso:std:iso:ts:19139:2007:en
Interestingly enough, the author (from ISO) says ISO requests this namespace because the URN scheme using OIDs (such as urn:oid:1.0.19115 for ISO 19115) is not human-readable.
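To show what "computer-readable" buys us, here is a deliberately naive Python sketch that splits such a URN into fields.  It does not implement the full grammar of RFC 5141; the function name split_iso_urn and the rule "the first all-digit field is the document number" are my own simplifications, good enough only for the three examples above.

```python
def split_iso_urn(urn):
    """Naively split an urn:iso:std:... URN into its colon-separated fields.

    Simplified assumption (not the RFC 5141 grammar): the field after
    "std" is the originator, and the first all-digit field after that
    is the document number.
    """
    fields = urn.split(":")
    if len(fields) < 5 or [f.lower() for f in fields[:3]] != ["urn", "iso", "std"]:
        raise ValueError("not an urn:iso:std URN: %r" % (urn,))
    originator, rest = fields[3], fields[4:]
    number = next((f for f in rest if f.isdigit()), None)
    return {"originator": originator, "number": number, "fields": rest}


if __name__ == "__main__":
    for urn in (
        "urn:iso:std:iso:19115:2003:en",
        "urn:iso:std:iso:19115:cor-1:2006:en",
        "urn:iso:std:iso:ts:19139:2007:en",
    ):
        print(split_iso_urn(urn))
```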

I'm not trying to change WCMP (for example gmd:metadataStandardName), since (for now) I am not aware of any requirement that the field be computer-readable. But I think it is worth sharing.

[article also posted to WMO/IPET-MDRD]

2013-11-26

what is XML namespace - ISO 19139 in mind

Today I had an opportunity to explain what an XML namespace is. This might happen again, so I am saving the text for future reference.


2013-11-15

GML versions - why WCMP 1.3 specifies GML 3.2 only?

Question:
Why does the WMO Core Metadata Profile (WCMP) v1.3 say that only GML 3.2 must be used?  Wouldn't it be better to make it open-ended?  GML 3.3 is already published as an OGC standard, and further development might be coming.

Answer:
We need clear guidance on validation using XSD (W3C XML Schema) from an official source.  GML 3.2 was (and still is) the only version officially published by ISO.

utilizing XLink in WIS Discovery Metadata to compactify common information

When looking at multiple metadata records, we can easily find a huge number of repetitions of the same sizeable information, for example point of contact, citation, or reference system.  It is a natural idea to utilize the XLink attributes in the ISO 19139 XML Schema, and that has been discussed many times in the WIS community over the years.

But I have to say there is no tangible outcome so far.  Everybody likes to talk about the vision, but apparently nobody knows how to get there. I feel obliged to explain the already-known difficulties.

Note: the following is only my personal view, and is not to be considered the official position of any organization.

2013-11-12

text of Amendment 76 for ICAO Annex 3

It's not found in the docplan of IPET-DRMM.  I've just googled and found:

http://www.caa.mk/Upload/Document/MK/039e.pdf

Thanks to the Government of the Republic of Macedonia.

2013-11-07

Quick review - DataCite Metadata Schema v3.0

Data Publication is a recent movement in data-intensive science domains.  It tries to define the provision of data as a part of scholarly work to be acknowledged, and thus tries to make data citable from the literature.  ICSU WDS has established a working group on data publication, and DataCite is a non-profit organization (seemingly) involved in it.

They published the DataCite Metadata Schema v3.0, which is interesting.


2013-10-31

Stonyhurst heliographic coordinate system

Note for ongoing work..

The Stonyhurst heliographic coordinate system is defined in the following article, among other Sun-based coordinate systems:
THOMPSON, W. T. Coordinate systems for solar image data. Astronomy & Astrophysics, 2006, 449.2: 791-803. Also available at http://secchi.nrl.navy.mil/wiki/uploads/Main/coordinates.pdf
Findings: the term "heliographic latitude/longitude" is parallel to geographic lat/lons.  When we are aware of the oblateness of the Earth (i.e. almost everywhere except among weather modellers), we distinguish "geographic" and "geocentric" latitudes.  More general terms are "planetographic" and "planetocentric".  In the Stonyhurst system, we implicitly assume a spherical Sun (i.e. zero oblateness), so we use the term "heliographic" even though the mathematical formulae are based on heliocentric coordinates.

[WCMP] conceptual "extension" encoded as restrictions in terms of XML

The WMO Core Metadata Profile, version 1.3, says it is a restricted profile of ISO 19115:
"A category-1 profile places additional restrictions on the use of an International Standard to meet the more specific requirements of a given community"
But I found that some people describe it as an extended profile because it has additional codelists. Which is right?

In my view, both.

The apparently opposite terms reflect different levels of viewpoint: conceptual and concrete, or more frankly, UML and XML.

[WCMP] Questions on Schema and Schematrons re v1.2 and 1.3

There are many questions regarding WMO Core Metadata Profile (WCMP).  I think some are really worth sharing.

[WCMP] how to cite International Meteorological Vocabulary (WMO 182)?

Question:
I'm trying to use some words from IMV (the International Meteorological Vocabulary, WMO No. 182) as keywords for WIS Discovery metadata.
  • Is the latest version of IMV dated 1992?
  • All words in IMV are found in METEOTERM [http://www.wmo.int/pages/prog/lsp/meteoterm_wmo_en.html].  Should I indicate the URL inside thesaurusName identifier?  That would make documentation more useful. 
  • What would you do in the end?
  • How would you cite IMV in literature other than WIS metadata?
Answer:  there's no authorized recommendation from an appropriate body, but if there's an immediate need, I would do:

2013-10-21

2013 amendment of Manual on the WIS about metadata - including amendment procedure

Recommendation 8 (CBS-15), followed by EC approval, amended the Manual on the WIS, effective 1 July 2013.  Unfortunately the latest manual is not posted on the WMO website yet, probably pending translation.  Those who do not have time to check the 314 pages of the Final Report of CBS-15 may find the following excerpts useful.
CAUTION: NO WARRANTY AT ALL.

2013-08-27

libxml2 version 2.8.0 or later incorporates the patch for GML XSD

Daniel Veillard's libxml2 is one of the most commonly available XML parser libraries, and it includes an implementation of an XSD validator.  Versions 2.7.8 and earlier had a bug that broke the xmllint(1) command with the XSD schema of GML, and hence of ISO 19139.

I wrote a patch in 2011, and I found that it was incorporated in version 2.8.0.  Great!

2013-08-19

Update and maintenance of WIS OAI Monitor: id case insensitivity and reliability of incremental harvesting

In recent weeks I did extensive (but unfortunately incomplete) maintenance work on the monitoring site for OAI-PMH synchronization of WIS Discovery metadata.  I believe I ought to share what's happening.

= 1. Reliability of Incremental Harvesting

Apparently my monitor had been indicating an unnaturally large number of diffs (differences in the same metadata set between centres).  After performing a full harvest, the number dropped significantly.

That means that the changes (mostly additions) were not effectively harvested by incremental harvesting, i.e. the OAI-PMH "ListIdentifiers" request with the "from=" parameter set to a time a little (24h in my case) before the present.  I think that is due to some problem in the implementation or the operation, but I still need more information to give effective guidance.
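For readers unfamiliar with the request, here is a minimal Python sketch of how such an incremental ListIdentifiers URL can be built.  The base URL and the metadataPrefix value "iso19139" are placeholders of mine, not necessarily what any GISC uses, and seconds granularity in "from" is optional in OAI-PMH, so a given server may accept only dates.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def incremental_request_url(base_url, metadata_prefix, now,
                            lookback_hours=24, set_spec=None):
    """Build an OAI-PMH ListIdentifiers URL covering the last lookback_hours."""
    since = now - timedelta(hours=lookback_hours)
    params = {
        "verb": "ListIdentifiers",
        "metadataPrefix": metadata_prefix,
        # UTC timestamp in the YYYY-MM-DDThh:mm:ssZ form used by OAI-PMH
        "from": since.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint and prefix, for illustration only.
    print(incremental_request_url(
        "https://example.org/oai", "iso19139",
        datetime(2013, 8, 19, 12, 0, tzinfo=timezone.utc)))
```

The sketch does not follow resumption tokens, which a real harvester has to do for long lists.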

Anyway the monitoring software must be changed in its design strategy.

Right now it retrieves full OAI sets every 3 hours from relatively fast servers (>300 records/sec), and only "increments" (changes within the past 24 hours) are retrieved from relatively slow servers (around 3 records/sec).  This was deemed necessary to achieve high temporal resolution, but very unfortunately, the loss in incremental harvesting happens on the slow servers.  As a result, under the current situation, reliable monitoring has to use full harvesting for all kinds of servers.

Right now the entire WIS data catalogue consists of 1.4e5 records.  It's simple math that a full harvest from a 3 records/sec server takes about 13 hours (1.4e5 / 3 is about 46,700 seconds).  In reality it is not a nice thing to maintain even daily monitoring this way, since monitoring should not take up the majority of server resources.
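The arithmetic, written out as a tiny Python sketch (the function name is mine; the record count and rates are the ones quoted above):

```python
def full_harvest_hours(records, records_per_second):
    """Hours needed to pull `records` at a steady `records_per_second`."""
    return records / records_per_second / 3600.0


if __name__ == "__main__":
    catalogue_size = 1.4e5
    slow = full_harvest_hours(catalogue_size, 3)    # slow server
    fast = full_harvest_hours(catalogue_size, 300)  # fast server
    print("slow server: %.1f hours" % slow)
    print("fast server: %.1f minutes" % (fast * 60))
```

So the slow servers take about 13 hours per full harvest, while the fast ones need less than 8 minutes, which is why a 3-hourly full harvest is feasible only for the latter.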

Something has to be done.

= 2. Case-insensitivity of Metadata id's

Another source of the increase in #diffs is the case-insensitivity of the metadata identifier.  The first meeting of IPET-MDI agreed that the identifier should be treated as case-insensitive, and this was written into the WMO Core Metadata Profile, which is now an appendix to the Manual on WIS.

Before 2013-08-16T12Z, the monitor compared the OAI sets in a case-sensitive manner.  I didn't notice any problem when it was written.  But recently there were some records whose ids were partly lowercased.  Some centres followed the change, and others did not.  It is highly suspected that the synchronization does not work for the latter, but that is out of the scope of id-only monitoring.

Anyway, a rule is a rule, so I changed the software to compute the diffs case-insensitively.
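The monitor's own code is not shown here; the following Python sketch only illustrates the idea of the change.  The function name diff_ids and the sample identifiers are invented for this note.

```python
def diff_ids(ids_a, ids_b, case_sensitive=False):
    """Return (only_in_a, only_in_b) for two collections of metadata ids.

    With case_sensitive=False, ids that differ only in letter case are
    treated as the same record.
    """
    key = (lambda s: s) if case_sensitive else str.casefold
    a = {key(i): i for i in ids_a}
    b = {key(i): i for i in ids_b}
    only_a = sorted(a[k] for k in a.keys() - b.keys())
    only_b = sorted(b[k] for k in b.keys() - a.keys())
    return only_a, only_b


if __name__ == "__main__":
    # Illustrative ids only: the same record, partly lowercased at one centre.
    centre_a = ["urn:x-wmo:md:int.wmo.wis::ISMD01EDZW"]
    centre_b = ["urn:x-wmo:md:int.wmo.wis::ismd01edzw"]
    print(diff_ids(centre_a, centre_b, case_sensitive=True))
    print(diff_ids(centre_a, centre_b))
```

The case-sensitive comparison reports one spurious diff on each side, while the case-insensitive one reports none.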

2013-06-12

Outcome of WIGOS Metadata Task Team (TT-WMD-1)

Note: below is just my personal analysis of publicly-available documents.

There was a meeting TT-WMD-1 in March.

FR§5.2 says WIGOS will develop a web portal called WIR.  The existing OSCAR will be a component of WIR in the following structure:

WIR
 |
 +- SORT: standards of observing methods
 |  (I guess WIS metadata don't link to that in near future)
 |
 +- OSCAR: DB of requirements and capabilities of observations
    |
    +- OSCAR/Surface: evolution of Volume A
    +- OSCAR/Distributed: DBs made for each regional/programme community

FR§5.2.6 says the radar DB http://wwr.dmi.gov.tr/ is an example of OSCAR/Distributed.  I wonder where the existing satellite DB in OSCAR is located, perhaps as an instance of OSCAR/Distributed.

FR Appendix II (p.18) gives a draft spec of "Core WIGOS Metadata".  It's a list of elements with Mandatory/Conditional/Optional flags.  The elements are at the conceptual level, so the encoding or instantiation has to be developed before actual catalogue-building can start.

Regarding the future of Volume A, FR§6.3 says it's primarily the role of CBS, but in my understanding CIMO will augment a number of elements (such as shown in Doc.7 to that meeting http://www.wmo.int/pages/prog/www/WIGOS-WIS/meetings/TT-WMD-1/Doc7_Metadata.doc ).

FR§5.2.4 shows an idea of linkage from WIGOS (WIR) to WIS.  That must raise a requirement for some interface, i.e. rules for links (URLs) to data searched by some criteria (e.g. station, variable, equipment) and sustainable operation with an organizational arrangement (I mean which GISC provides the service).

Similarly it is conceivable to link from WIS to WIGOS.

2013-05-13

[DRMM] 2013 inter-session amendment of Manual on Codes approved

WMO Secretariat updated its operational newsletter
http://www.wmo.int/pages/prog/www/ois/Operational_Information/Newsletters/current_news_en.html

They say:

"
Adoption of amendments to the Manual on Codes (WMO No. 306) between CBS sessions

In accordance with the Procedure for the Adoption of Amendments to the Manual on Codes between CBS sessions, and as concurred by the president of CBS, the WMO Circular letter No. OBS/WIS/DRMM/DRC (PR-6688) (English http://goo.gl/CaA3y, French http://goo.gl/tOXFP, Russian http://goo.gl/rx7ow, Spanish http://goo.gl/ZSOXx, Arabic http://goo.gl/l3xPA) dated 25 February 2013, including draft amendments to the Manual on Codes, was dispatched to all WMO Members for comments.
The WMO Secretariat received replies approving the draft amendments in the Circular letter by 24 April 2013. The amendments are therefore approved in accordance with the Procedure
The president of CBS agreed to notify all WMO Members and members of EC of the approval of the amendments by the WWW Operational Newsletter.  The implementation date of amendments is on 14 November 2013."

2013-05-08

Fast track amendment (2013 round #1) of WMO Manual on Codes

WMO announced that the first round in 2013 of the fast track amendment (*) of the WMO Manual on Codes was approved on 28 April, to take effect on 8 May 2013, i.e. today.
 
The result is now posted on WMO website.  The latest versions of GRIB & BUFR tables are 11 and 20 respectively.

IPET-MDRD/TT-MDI reviews WIS discovery metadata records by WMO programmes

Recently our team had an opportunity to review a set of WIS discovery metadata (i.e. geographical metadata in ISO 19139 format) made by GAWSIS (GAW Station Info Service by MeteoSwiss).
 
 
That was a really good opportunity to refresh my memory, implement missing links, and practice interaction between WMO programmes.  Our role is to establish conventions (notations) to express various metadata that are specific to WMO, so it is essential to have input from WMO programmes.

2013-03-25

Presentation at 10th ECMWF Workshop on the Use of High Performance Computing in Meteorology (2002)

More than ten years ago I participated in the HPC workshop
(http://www.ecmwf.int/newsevents/meetings/workshops/2002/high_performance_computing/index.html)
and gave a presentation introducing software management work in JMA NWP.

The presentation file was posted on the ECMWF website, but it disappeared
recently. Luckily my colleague had a hard copy, so I can post a scanned image:
http://goo.gl/QNqax

P.S. PPT is also found: http://goo.gl/UilZX

2013-02-28

ddb.kishou.go.jp to be decommissioned soon

As I wrote before, the website ddb.kishou.go.jp is planned to be decommissioned shortly, due to its migration into WIS.  Here's a summary of instructions for migration:

2013-02-27

netCDF-C 4.3 RC2 released

There was an announcement from Unidata:
 
CMake for Windows sounds great, though I haven't tried it yet.

2013-02-15

WMO Core metadata profile v1.3 is now online

There was an announcement (http://goo.gl/AyKtN) from the WMO secretariat
that the version 1.3 of the WMO Core Metadata profile to ISO 19115 is now
available online: http://wis.wmo.int/2012/metadata/.



Just for my record....



Title: WCMP v1.3 Specification Part 1 – Conformance Requirements

URL:
http://wis.wmo.int/2012/metadata/WMO_Core_Metadata_Profile_v1.3_Specification_Part_1_v1.0FINALcorrected.pdf

Size: 297572

MD5: ccf036c7a4acd4dad67587adf6796c49

Date: 2013-01-15




Title: WCMP v1.3 Specification Part 2 – Abstract Test Suite, Data Dictionary
and Code Lists

URL:
http://wis.wmo.int/2012/metadata/WMO_Core_Metadata_Profile_v1.3_Specification_Part_2_v1.0FINAL.pdf

Size: 218077

MD5: 2ee503ba6111c38682903eb847f5f865

Date: 2013-02-04



Title: Manual on WIS amendment related to WCMP, and Management Procedure of
WCMP

URL:
http://wis.wmo.int/2012/metadata/WMO_Core_Profile_1-3_ManagementProcedurest-AfterCBS15.pdf#page=3

Size: 126334

MD5: c44df3ce1603e879debdc6a06c854c62

Date: 2012-09-21



URL: http://wis.wmo.int/2012/metadata/validationTestSuite/

Desc: empty directory where validation tools are to be placed



Please note that the above documents may be corrected (at the editorial
level). Corrections should be posted at
http://www.wmo.int/pages/prog/www/WIS/wiswiki/tiki-index.php?page=IPET-MDIchanges#Editorial_Changes_to_version_1.3
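Since the files may be silently replaced, the Size and MD5 values recorded above are useful for checking a local copy.  Here is a minimal Python sketch for that; the function name and the local file name in the usage comment are hypothetical.

```python
import hashlib
import os


def matches_record(path, expected_size, expected_md5):
    """Return True if the file at `path` has the recorded size and MD5 digest."""
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so that large files need not fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.lower()


# Usage, assuming Part 1 was saved locally as part1.pdf (hypothetical name):
#   matches_record("part1.pdf", 297572, "ccf036c7a4acd4dad67587adf6796c49")
```

MD5 is fine for spotting editorial replacement of a file, but it is not a safeguard against deliberate tampering.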

NetCDF CF Conventions are now OGC standard

The Open Geospatial Consortium (OGC) announced that they approved the
Climate and Forecast (CF) conventions of netCDF as their standard.



http://www.unidata.ucar.edu/mailing_lists/archives/galeon/2013/msg00002.html



Congratulations!

2013-02-08

[wis sync monitor] Beijing listing updated

If you are carefully watching my WIS sync monitor
https://twitter.com/wissync
you might have noticed that the differences between the record counts for the
set WIS-GISC-BEIJING among GISCs are significantly reduced. At 15Z yesterday
all diffs were in the tens of thousands (23701, 23701, 61244, 61130 in
http://toyoda-eizi.net/za/hist2013W05.zip/20130207T15/DIFF.html), now they
are one order smaller (3437, 3330, 1221, 1070 in
http://toyoda-eizi.net/za/hist2013W05.zip/20130208T06/DIFF.html).

That is due to (past poor) maintenance of the monitoring system.

I'm only doing "incremental harvesting" from the Beijing OAI-PMH server, since
a full download needs too much time. That should work, but it sometimes fails
to keep past records (on my side) or fails to harvest updates. I'm also
running an automatic diff-harvester triggered from the monitor, and these days
it has started deleting records from WIS-GISC-BEIJING in the GISC Tokyo
database. I've checked that it really does the right thing, and the diff is
updated as a result. Sorry for the poor maintenance these days.

2013-01-18

tricky switching of Meteosat 9 to 10...

Meteosat 10 will be brought to full operation next Monday. First of all,
congratulations!

This means the products are also going to be changed.
http://www.eumetsat.int/Home/Main/Satellites/SP_2012113011418367?l=en

One tricky thing is that BUFR messages on GTS "swap" headings. So far headings
starting with "IU" (IUVA01 EUMG for example) are Meteosat 9 and "IW" are
Meteosat 10 (I don't see such bulletins though). After 09UTC Monday, "IU"
will stand for Meteosat 10 and "IW" for Meteosat 9.

If your upstream GTS centres relay only "IU" messages, you will lose
Meteosat 9 bulletins. That may be an issue if your usage is sensitive to
the satellite/sensor code written inside BUFR.
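To make the swap rule concrete, here is a small Python sketch.  It assumes that "next Monday" means 2013-01-21 (the Monday after this post) and that the switch happens exactly at 09:00 UTC; both are my reading of the announcement, and the function name is made up.  It applies only to the EUMETSAT bulletins discussed here, not to "IU"/"IW" headings in general.

```python
from datetime import datetime, timezone

# Assumption: the Monday following 2013-01-18, at 09 UTC.
SWAP_TIME = datetime(2013, 1, 21, 9, 0, tzinfo=timezone.utc)


def satellite_for_heading(heading, issued):
    """Map an EUMETSAT bulletin heading (e.g. "IUVA01 EUMG") to a satellite."""
    prefix = heading[:2].upper()
    if prefix not in ("IU", "IW"):
        raise ValueError("heading not covered by the swap rule: %r" % (heading,))
    before = {"IU": "Meteosat 9", "IW": "Meteosat 10"}
    after = {"IU": "Meteosat 10", "IW": "Meteosat 9"}
    return (before if issued < SWAP_TIME else after)[prefix]


if __name__ == "__main__":
    t = datetime(2013, 1, 22, 0, 0, tzinfo=timezone.utc)
    print(satellite_for_heading("IUVA01 EUMG", t))
```

A decoder that reads the satellite identifier inside the BUFR message does not need such a table; the sketch is only for systems that select bulletins by heading.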

2013-01-15

WMO Core Metadata Profile V1.3 approved

Version 1.3 of the WMO Core Metadata Profile has been approved by the latest meeting of PTC (Presidents of Technical Commissions) on Monday: http://goo.gl/QfHG8
This is a profile of the international standard for geographical metadata, ISO 19115/19139, to be used in the context of the data catalogue of WIS (WMO Information System).
Please refer to http://goo.gl/L9HlK for the latest text that passed CBS.