2013-12-06

ambiguity in pressure level heights of TAC TEMP, which really caused trouble


It has recently come to attention that unnatural geopotential height values are sometimes reported in the BUFR TEMP message for 89532 SYOWA in Antarctica. That message is converted by RTH Tokyo from the traditional alphanumeric code (TAC) FM 35. The issue is partly a problem in the conversion software (handling of negative values), but it also stems partly from an inherent ambiguity in the TAC TEMP format: a location-independent algorithm fails to estimate the "upper digits", especially at 700 hPa.
TAC/BUFR conversion is common worldwide, and other converters might have the same problem, though the situation has not been surveyed yet.



Background

FM 35-XI Ext. TEMP is one of the WMO traditional alphanumeric codes (TAC) used in operational meteorology to report radiosonde observations. The TACs were developed in the 20th century, when telegram and teletypewriter were the only means of telecommunication. The following is an example, Part A of the TEMP report for SYOWA on 2012-09-16T12Z:

USAA01 RJTD 161200
TTAA 66121 89532 99956 21528 15009 00814 /////
///// 92266 17318 06021 85895 19910 04537 70319
23316 05043 50472 36527 04523 40623 473// 02019
30808 609// 02017 25918 701// 36019 20049 751//
34021 15216 743// 30029 10451 765// 29539 88218
759// 01021 88136 743// 29529 77999 31313 75508
81131=

The message consists of groups, each of which is a run of non-whitespace characters (mostly digits). The rules for organizing the groups are described in the WMO Manual on Codes, but if you are not familiar with that document and its special style, there is an unofficial but practical guide to TEMP written by UNISYS.
Two groups, 00814 and 70319, represent the geopotential heights of the 1000 hPa and 700 hPa levels respectively. The last three digits are basically the lowest three digits of the height in metres (for levels at 700 hPa and below). If the value is negative, the three digits are 500 plus the absolute value of the height.
The rule sounds simple, and it caused no problem at all in the good old days, when a human plotter listened to the Morse code and a human forecaster drew the contours. But we are living in the iron age, dumb computers do everything, and the story becomes different.
Since the 1990s the WMO has developed a new system of codes called TDCF (table-driven code forms), which in practice means BUFR and CREX. BUFR (Binary Universal Form for the Representation of meteorological data) has more standardized handling of numbers. People are supposed to be freed from the ambiguity of data given only as the lowest three or two digits. But the migration to TDCF is still in the middle of its journey. There are many observing systems that cannot produce TDCF, so a large portion of the BUFR/CREX messages circulating on the GTS are converted from TAC.

Height at 1000 hPa level

In the BUFR TEMP message (IULK01 RJTD 161200 in the case shown above) the 1000 hPa height is found to be hundreds of metres higher than expected. The TAC group 00814 should be decoded as -314 m, but the "500 plus the absolute value" rule somehow slipped out of the specification. The group was therefore decoded as 814 m by the RTH TAC/BUFR converter and encoded as such into the BUFR message. Unfortunately it takes time to amend an operational system, and I can't give a concrete timeframe for fixing the issue.
That is regrettable and I'm really sorry on behalf of JMA/RTH-Tokyo. But there is still hope of addressing the issue, because the 1000 hPa height stays between -500 m and 500 m at all times for most fixed land stations on the Earth. When the natural range of variability grows beyond 1000 m, the problem has a different nature.
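To make the 1000 hPa rule concrete, here is a minimal sketch in Python. The function name is my own invention for illustration and this is not the RTH Tokyo converter code; it only assumes, as stated above, that the true height lies between -500 m and 500 m.

```python
def decode_1000hpa_height(group: str) -> int:
    """Decode a '00hhh' group of TEMP Part A into metres (hypothetical helper)."""
    if len(group) != 5 or not group.startswith("00") or not group[2:].isdigit():
        raise ValueError(f"not a usable 1000 hPa height group: {group!r}")
    hhh = int(group[2:])
    # Values of 500 and above encode a negative height: hhh = 500 + |height|.
    # Dropping this branch reproduces the bug described above (814 m).
    return -(hhh - 500) if hhh >= 500 else hhh


print(decode_1000hpa_height("00814"))  # -314
```

Missing data such as "00///" is simply rejected here; a real converter would of course have to pass it through as a missing value.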

Height at 700 hPa level

The variability of the pressure level height becomes larger at upper levels. Let's look at the 700 hPa level, where the ambiguity of the "reported three digits" is most evident.

The figure shows the 2012 annual minimum and maximum height of the 700 hPa level. The data processed is the 2.5-degree resolution analysis in GRIB that is publicly available at GISC Tokyo, chosen for its small volume, so it is smoothed to some degree. But I think the overall tendency is clear and would not be affected.
As a result, really unfortunately, the 700 hPa height has a range of variability broader than 1000 m. It is common on the Antarctic coast and the surrounding sea to experience heights lower than 2200 m or even 2100 m, while heights as high as 3300 m are seen in the mid-latitudes. We could compute higher-resolution or longer-term statistics, but the range would only get broader.
So three-digit values in the range of about 100 to 400 are ambiguous: the group 70319 can legitimately represent either 2319 m or 3319 m, and there is no way to tell which is right without some knowledge from outside the number itself, such as location-dependent statistics or internal QC using the hydrostatic equation.
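As a rough illustration of what a hydrostatic check could look like, the sketch below builds a first guess for the 700 hPa height from a lower level with the hypsometric equation and picks the candidate closest to it. Everything here is an assumption for illustration: the function names are hypothetical, the 925 hPa height of 266 m is my reading of the group 92266 in the example message, and the layer-mean temperature of 253 K is only a round illustrative value (dry air, no virtual temperature correction).

```python
import math

RD = 287.05   # gas constant of dry air, J kg-1 K-1
G0 = 9.80665  # standard gravity, m s-2


def thickness_m(p_bottom_hpa: float, p_top_hpa: float, mean_temp_k: float) -> float:
    """Hypsometric thickness of a layer with the given mean temperature."""
    return RD * mean_temp_k / G0 * math.log(p_bottom_hpa / p_top_hpa)


def pick_candidate(hhh: int, first_guess_m: float) -> int:
    """Choose the height ending in hhh that is closest to the first guess."""
    candidates = [thousands * 1000 + hhh for thousands in range(10)]
    return min(candidates, key=lambda z: abs(z - first_guess_m))


# First guess: assumed 925 hPa height plus the 925-700 hPa thickness.
first_guess = 266 + thickness_m(925, 700, 253.0)
print(round(first_guess), pick_candidate(319, first_guess))
```

With these inputs the first guess comes out near 2330 m, so 2319 m is chosen over 3319 m. The margin is several hundred metres, so the choice does not depend on the temperature being very accurate, but it does depend on the lower-level height being decoded correctly in the first place.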
In reality RTH Tokyo uses a globally constant threshold tuned for the mid-latitudes: it assumes the 700 hPa height falls between 2400 m and 3399 m. The above example was actually converted as 3319 m. Oh no.
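The effect of a fixed window can be sketched in a few lines of Python. This is not the actual converter code, just my reconstruction of the idea: the function name is hypothetical, and the 2000 m window start used for the second call is a made-up value for an Antarctic coastal station, not something taken from any operational system.

```python
def resolve_height(hhh: int, window_start_m: int) -> int:
    """Return the one height in [window_start_m, window_start_m + 999]
    whose lowest three digits are hhh (hypothetical helper)."""
    if not 0 <= hhh <= 999:
        raise ValueError(f"hhh out of range: {hhh}")
    return window_start_m + (hhh - window_start_m) % 1000


print(resolve_height(319, 2400))  # 3319, the globally constant window
print(resolve_height(319, 2000))  # 2319, a hypothetical Antarctic window
```

Any window of 1000 m gives exactly one answer, so the algorithm never fails visibly; it just gives the wrong answer silently when the station's real range does not fit the window.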

The way forward

It is not a great thing that BUFR reporting currently fails to solve one of the issues that the TDCF migration was meant to address. But I don't mean that we can simply extend the deadline of the TDCF migration and that the surviving TACs will solve every problem. I'd like to remark on two things:
  • TAC/BUFR conversion is not the proper way to migrate. More attention should be paid to how we can facilitate the migration toward observing systems that directly produce TDCF. Monitoring the quality, not only the quantity, of reports might be one idea.
  • The issue in the Antarctic TEMP BUFR was found recently, but it had been present for many years, probably since the beginning of Tokyo's BUFR distribution. That suggests that few people in the data-processing centers seriously considered the migration until recently.

This October JMA started assimilating SYNOP BUFR in NWP (for a limited number of stations). Many international colleagues congratulated me, and I am thankful for that, but the congratulations were always accompanied by questions about data quality. I don't think the above-mentioned issue is the only one in TAC/BUFR conversion, and I would like to encourage more feedback from the data users to the telecom community.
