2014-09-29

ECMWF workshop on closing GRIB-netCDF gap

I participated in the ECMWF workshop on closing the GRIB-netCDF gap http://www.ecmwf.int/en/workshop-closing-grib/netcdf-gap. It was a really exciting experience, and it is my greatest honour to have been the only participant from the Asia-Pacific region at the invitation-only meeting.  So far only the presentations are available, but I expect and hope that something like a summary will be posted as well.
The following are my notes, written down before I forget.  They are just personal notes, and the thoughts here are not necessarily the official position of anyone else, such as WMO or JMA.

GRIB 1-to-2 Compatibility of Parameter Tables

If my brain was not too damaged by English ale, there was a voice in the cocktail conversation saying that the parameter table of GRIB2 is (and all the other tables are) designed to be upward compatible with its counterpart(s) in GRIB Edition 1 (if any).  Unfortunately another voice also seems right: this important "principle" was never written down in any formal document such as the Manual or the relevant meeting reports.  But the principle apparently explains well the presence of redundant parameters in GRIB2.

As I presented, the GRIB2 parameter tables contain some redundancy, such as 0-0-4 Maximum temperature [K] or 0-1-8 Total precipitation [kg.m-2].  I don't care whether they were introduced because of the "principle".  In any case the principle had been forgotten (or had never existed in the first place) by the 2008 meeting, when the expert team tried to deprecate those parameters.  If GRIB2 had had a reference implementation including a GRIB1 converter, or a concrete validation procedure that included the handling of units, people would have realized the breaking impact at that time.

Removing components needs much more care than adding them.

Dead idea of netCDF as GRIB2

Mr Chris Little said that WMO "outvoted" the idea of making GRIB2 as netCDF + compression + WMO tables.  I didn't know that.  Formal records (such as CBS/EC reports) do not retain any trace of such an idea, but the story is consistent with early FWIS task team reports, which indicate much favour towards Unidata technologies.  Anyway, the idea of "netCDF as GRIB Next" has been raised repeatedly, and governance alone does not account well for the technical choice.  Many argued for the separation of data model and representation; that leads directly to the question of why WMO (actually operational meteorology) is not happy with a WMO-controlled convention on netCDF.

Most people talk about data size and compression.  As John Caron reported, netCDF will need a little more development to compete in size efficiency with GRIB using 16-bit packing and JPEG2K compression.  Furthermore, GRIB2 does have run-length encoding, which unfortunately lacks proper documentation.  But in my eyes this is an accidental (rather than fundamental) difference, and basically a matter of time.
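
For readers who have not met "16-bit packing": the following is a minimal sketch, assuming the GRIB2 simple packing relation Y * 10^D = R + X * 2^E, where R is the reference value, E the binary scale factor and D the decimal scale factor.  The function names and sample values are made up for illustration, and the sketch ignores details of the real format (for instance R is stored there as a 32-bit float, and JPEG2K compression would be applied on top of the integers).

```python
import math

def pack_simple(values, nbits=16, decimal_scale=0):
    """Quantize floats to unsigned integers of nbits each (hypothetical helper).

    Follows the simple packing relation Y * 10**D = R + X * 2**E.
    """
    scaled = [v * 10 ** decimal_scale for v in values]
    ref = min(scaled)                      # R: smallest scaled value
    span = max(scaled) - ref
    max_int = (1 << nbits) - 1             # largest integer that fits in nbits
    # Smallest E such that span / 2**E still fits in nbits.
    e = 0 if span == 0 else math.ceil(math.log2(span / max_int))
    packed = [min(max_int, round((s - ref) / 2 ** e)) for s in scaled]
    return ref, e, packed

def unpack_simple(ref, e, packed, decimal_scale=0):
    """Invert pack_simple; the result is exact only up to half a step of 2**E."""
    return [(ref + x * 2 ** e) / 10 ** decimal_scale for x in packed]

temps = [250.0, 273.15, 300.0]             # kelvin, made-up sample values
ref, e, packed = pack_simple(temps)
restored = unpack_simple(ref, e, packed)
```

Each value then costs 16 bits instead of 32 or 64, at the price of a rounding error of at most half of 2^E.  Nothing in this step is specific to GRIB, which is why I regard the size gap as accidental.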

I would rather stress the "sequential" nature of GRIB:
  • Two GRIB messages concatenated in one file still work. [as Baudouin pointed out]
  • At the beginning of output, the programmer does not have to fix the list of ensemble members, forecast times, vertical levels, or parameters.
  • The file size is kept to a minimum when the data form a sparse matrix in terms of level, parameter, or other keys.
  • The reader has to scan the entire file even if only one plane is of interest.
NetCDF3 has almost the opposite nature, which I would call "direct access".  There can be a middle way that could be called "indexed sequential", as in JMA's internal format NuSDaS.  I presented that argument at the Japan Geoscience Union meeting in April, but that community has little contact with Europe, so this was a really good opportunity to revisit it.
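
The first and last bullets can be sketched in a few lines.  This is only a sketch, assuming the GRIB Edition 2 layout of Section 0 (the octets "GRIB", two reserved octets, discipline, edition number, then the total message length as an 8-octet big-endian integer) and the end section "7777".  make_message and scan are hypothetical helpers, the payload is opaque filler rather than real GRIB sections, and real files may carry padding or GTS headers between messages, which this ignores.

```python
import struct

def make_message(payload: bytes, discipline: int = 0) -> bytes:
    """Build a toy GRIB2-shaped message: Section 0 + filler payload + '7777'."""
    total = 16 + len(payload) + 4          # Section 0 + payload + end section
    section0 = (b"GRIB" + b"\x00\x00" + bytes([discipline, 2])
                + struct.pack(">Q", total))
    return section0 + payload + b"7777"

def scan(buf: bytes):
    """Yield (offset, length) per message by walking the buffer front to back."""
    pos = 0
    while pos < len(buf):
        if buf[pos:pos + 4] != b"GRIB":
            raise ValueError(f"no GRIB indicator at offset {pos}")
        (length,) = struct.unpack_from(">Q", buf, pos + 8)
        if length < 20:                    # shorter than Section 0 + end section
            raise ValueError(f"implausible length at offset {pos}")
        if buf[pos + length - 4:pos + length] != b"7777":
            raise ValueError(f"message at offset {pos} does not end with 7777")
        yield pos, length
        pos += length                      # the next message starts right here

# Concatenating two messages needs no shared header to be rewritten.
combined = make_message(b"A" * 10) + make_message(b"B" * 3)
index = list(scan(combined))
```

The writer never declared in advance how many messages would follow, but the reader had to touch every message header to find the second one.  Keeping a list like index next to the data is roughly what I mean by "indexed sequential".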

Another point might sound ridiculous to some, but there are message switching systems on the GTS that require the last four octets of a binary message to be "7777".  If it were a university website I could easily change that in hours, but mission-critical operational systems are managed in a totally different way.

Anyway, the discussion made me less pessimistic about the future of GRIB 3, but I'd really like the reasoning to be made clear if GRIB Edition 3 is not to remain a name only.

Three-week Rule is not Short enough?

GRIB is defined in the WMO Manual on Codes, which is formally amended by the Congress, held once every four years.  That is way too long for data format maintenance, so authority has been delegated in various ways so that harmless amendments can be made twice a year.  But that is still slow: a proposal has to come from a national focal point and may wait a year in the worst case.  CF allows any amendment through email-based discussion, and a proposal is considered approved after three weeks without objection, which enables agile introduction of contemporary requirements.  Have you ever heard such an argument?

It was really shocking for me to hear Heiko Klein report that he has 1500 standard name candidates which he hesitates to send to the CF list.  To me that is a sign that something is failing in the CF framework.  The lack of a local namespace makes it impossible to share the data outside the institute.

There were many voices in the workshop (mainly from software developers) saying that the local tables of GRIB have to be published and consolidated in a computer-readable manner.  The discussion went in the direction of "don't worry, WMO will suppress the use of local numbers much harder", but that seems likely to reproduce the issue reported by Heiko.

Data creators are often busy.  They hate waiting for procedures, even for three weeks.  And there is a subtle fear that the worldwide coordination process will find a more logical formulation of the data.  So I often hear people trying to escape standardization by claiming "it's just locally used data".  Unfortunately the promise is short-lived in many cases, and we end up in greater trouble.

Semantic Web and Abstract Reference Software

Sorry, I'll write on this topic later.

