2013-12-17

Note on NASA DIF (Directory Interchange Format) and GCMD Keywords

For a long time I knew DIF (Directory Interchange Format) only by name. It is the format used in GCMD (Global Change Master Directory), a catalogue operated by NASA.  Recently I have been interacting with more people who are interested in using GCMD keywords in WMO/WIS Discovery Metadata, which is an extension of ISO 19139.

Resources I found in a quick search:
In the WIS community there was a question about gmd:keywordType and the uniqueness of gmd:thesaurusName.  Our profile, WCMP 1.3, requires that a given thesaurusName appear only once in a record.  The GCMD keyword tables, however, contain keywords of different types.  If a metadata creator wishes to use gmd:keywordType to clarify the category of the keywords, he or she has to split them into separate MD_Keywords blocks, one per keywordType, and each of those blocks would then cite the same GCMD thesaurusName.
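
To make the splitting concrete, here is a minimal Python sketch (mine, not part of WCMP or of any GCMD tool) that groups keywords by type and emits one MD_Keywords block per type. The function name build_keyword_blocks and the sample keywords are assumptions for illustration only. What I loosely call keywordType is written as gmd:type/gmd:MD_KeywordTypeCode in ISO 19139. The sketch leaves out the codeList attribute and the citation date, so its output is not schema-valid.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def build_keyword_blocks(keywords, thesaurus_title):
    """Return one gmd:MD_Keywords element per keyword type.

    keywords is a list of (keyword_text, keyword_type) pairs
    (hypothetical input shape, chosen for this sketch).
    """
    # Group keyword texts by type, keeping first-seen order of types.
    by_type = defaultdict(list)
    for text, ktype in keywords:
        by_type[ktype].append(text)

    blocks = []
    for ktype, texts in by_type.items():
        block = ET.Element(f"{{{GMD}}}MD_Keywords")
        for text in texts:
            kw = ET.SubElement(block, f"{{{GMD}}}keyword")
            ET.SubElement(kw, f"{{{GCO}}}CharacterString").text = text
        # Keyword type; the mandatory codeList attribute is omitted here.
        type_el = ET.SubElement(block, f"{{{GMD}}}type")
        code = ET.SubElement(type_el, f"{{{GMD}}}MD_KeywordTypeCode")
        code.set("codeListValue", ktype)
        code.text = ktype
        # Every block cites the same thesaurus title.
        thesaurus = ET.SubElement(block, f"{{{GMD}}}thesaurusName")
        citation = ET.SubElement(thesaurus, f"{{{GMD}}}CI_Citation")
        title = ET.SubElement(citation, f"{{{GMD}}}title")
        ET.SubElement(title, f"{{{GCO}}}CharacterString").text = thesaurus_title
        blocks.append(block)
    return blocks


# Illustrative keywords only; not taken from a real record.
sample = [
    ("EARTH SCIENCE > ATMOSPHERE > PRECIPITATION", "theme"),
    ("CONTINENT > ANTARCTICA", "place"),
    ("EARTH SCIENCE > OCEANS > SEA ICE", "theme"),
]
for block in build_keyword_blocks(sample, "GCMD keywords (placeholder title)"):
    print(ET.tostring(block, encoding="unicode"))
```

Both resulting blocks carry the same thesaurus title, which is exactly what the WCMP 1.3 uniqueness rule objects to.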

In the original mapping by GCMD there is no such issue.  The ISO element "keyword" is mapped only from "Keyword" in DIF, which is free text.  But the most complex DIF element, "Parameters", is mapped to the old ISO element "category", which was probably superseded by topicCategory.  Unfortunately topicCategory is an enumeration and hence no longer extensible.  So, regrettably, that mapping has little meaning today.

So I moved on to a more realistic mapping implementation by AADC.  It creates MD_Keywords from the following DIF elements:
Apparently there is a real need to use the GCMD thesaurusName for multiple keyword types, and we have to take it into account.

TT-ApMD-2 (see para 28) was aware of that situation and recommended slightly changing thesaurusName/*/title, as follows:

"NASA/Global Change Master Directory (GCMD) Earth Science Keywords. Version 8.0.0.0.0.  (for theme)"

I know this is ugly and there are still differing opinions, and I really hope we reach some agreement....
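
The workaround can be sketched in a few lines of Python. This is only my reading of it: the function name disambiguate_titles, the (title, type) pair representation and the exact suffix format " (for <type>)" are assumptions modelled on the example title above, not wording taken from the TT-ApMD-2 report.

```python
from collections import Counter


def disambiguate_titles(blocks):
    """Return one thesaurus title per block, made unique where needed.

    blocks is a list of (thesaurus_title, keyword_type) pairs, one pair
    per MD_Keywords block (hypothetical representation for this sketch).
    A title used by only one block is left untouched.
    """
    counts = Counter(title for title, _ in blocks)
    return [
        f"{title} (for {ktype})" if counts[title] > 1 else title
        for title, ktype in blocks
    ]


gcmd = ("NASA/Global Change Master Directory (GCMD) "
        "Earth Science Keywords. Version 8.0.0.0.0.")
for title in disambiguate_titles([(gcmd, "theme"), (gcmd, "place")]):
    print(title)
```

Note that the suffix only helps when the blocks differ in type; two blocks with the same title and the same type would still collide.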



4 comments :

  1. I think that there is a strong argument for aiming to make the expression of names of things (people, organisations, thesauri, etc) consistent.

    Systems functionality and interoperability (now and future) will be hampered by the lack of consistent implementation.

    Appending the thesaurusName/title with "theme" is a solution to a local need, but seems to be at the cost of a broader need. Is it possible to consider broadening the rule, to allow more than one instance of one thesaurusName? If not, are there other solutions?

    Also: it seems unfortunate that when this vocabulary is so broad-ranging, the recommended citation only applies to the top-level, rather than to each subsection (Projects, Instruments, etc).

    Replies
    1. If I remember correctly, WCMP's requirement that thesaurusName.title be unique was explained as preparation for the new ISO 19115-1. I'll check what the argument was, to find the best compromise among the requests.

      By the way, do you plan to make real use of keywordType, for example by creating a separate search index for each keyword type? If so, that would be a persuasive reason for me to relax the WCMP requirement.

    2. Kate Roberts, 21/01/2014, 09:26

      I've just checked, and the new 19115-1:2014 hasn't changed with regard to keyword cardinality. Indeed, its introduction of KeywordClass (as an additional way to cluster keywords) might be a further argument against constraining (to 1) the number of instances of the same thesaurusName that are allowed.

      Re indexing of (and searching via) keywordType: I imagine that faceted searching could (and ideally would) operate as follows:
      1. choose "keyword";
      2. the system presents the options a) 'freetext', b) thesaurusName list, or c) keywordType list.

      A further argument against this constraint might be the mapping use case. Where a metadata record exists in another 19115 profile and has to be mapped to WCMP v1.3, if it contains more than one instance of the same thesaurusName, which block should be dropped?

    3. Kate Roberts, 21/01/2014, 09:34

      The last para here refers to GCMD's recommendation on how to cite their vocabularies [see: http://gcmd.nasa.gov/learn/keyword_list.html]
