-
Notifications
You must be signed in to change notification settings - Fork 98
Description
What happened?
Description:
I encountered an issue when converting 850hPa geopotential height (gh) to geopotential (z) using the formula z = gh * 9.80665 in NCEP GFS forecast GRIB files. After writing the converted data back to a GRIB file and reading it again, some values appear as 3.4028234663852886e+38 (IEEE 32-bit float max), while they were originally close to 9999 (likely missing values) before the conversion.
The issue persists across:
- earthkit-data (which depends on eccodes-python)
- eccodes-python directly
- eccodes C library
This suggests a problem in the underlying eccodes C library when handling missing values during data transformation and write-back.
Attachments:
I've attached a sample GRIB file (gh850.grib) containing the 850hPa geopotential height field from NCEP GFS that reproduces the issue.
Additional notes:
The issue appears specifically when:
- The original GRIB file uses a missing value marker (e.g., 9999)
- After numerical transformation, the missing value handling seems to break
- The output file contains IEEE max float values instead of correct values which close to 9999
I suspect this might be related to how eccodes handles missing value indicators during data modification. After further investigation, I tested earlier versions of the eccodes C library and found that version 2.26 processes the conversion correctly without generating the anomalous 3.4028234663852886e+38 values. This suggests that a regression was introduced in later versions of eccodes. Any insight or fix would be appreciated.
What are the steps to reproduce the bug?
Steps to reproduce:
- Read a GRIB file containing 850hPa geopotential height (gh) from NCEP GFS.
- Convert gh to z by multiplying by 9.80665.
- Write the modified field to a new GRIB file.
- Read the new file and observe that values originally near 9999 become 3.4028234663852886e+38.
Minimal reproducible example:
Using earthkit-data
import earthkit.data as ekd
# Read input GRIB file
gh = ekd.from_source("file", "gh850.grib")
f = gh[0]
# Convert gh to z
values = f.to_numpy() * 9.80665
f_new = f.clone(values=values, metadata=f.metadata().override(shortName="z"))
# Write back to new GRIB file
f_new.to_target("file", "z850.grib")
# Read back and check values
z = ekd.from_source("file", "z850.grib")
f = z[0]
md = f.metadata()
print(md["missingValue"], md["level"], f.values.max(), f.values.min())
# 9999 850 3.4028234663852886e+38 9277.9560546875Version
v2.44.0
Platform (OS and architecture)
ubuntu 22.04/Mac OS 15.7.3
Relevant log output
In [1]: import earthkit.data as ekd
In [2]: import numpy as np
In [3]: z = ekd.from_source("file", "z850.grib")
In [4]: f = z[0]
In [5]: values = f.to_numpy()
In [6]: indices = np.where(values > 10000000.0)
In [7]: indices
Out[7]:
(array([140, 602, 604, 617, 618, 619, 638, 647, 668]),
array([1245, 544, 484, 491, 1399, 12, 22, 1398, 802]))
In [8]: gh = ekd.from_source("file", "gh850.grib")
In [9]: f = gh[0]
In [10]: values_orig = f.to_numpy()
In [11]: values_new = f.to_numpy() * 9.80665
In [12]: values_orig[indices]
Out[12]:
array([1019.62425, 1019.62425, 1019.62425, 1019.62425, 1019.62425,
1019.62425, 1019.60825, 1019.60825, 1019.62425])
In [13]: values_new[indices]
Out[13]:
array([9999.09815126, 9999.09815126, 9999.09815126, 9999.09815126,
9999.09815126, 9999.09815126, 9998.94124486, 9998.94124486,
9999.09815126])Accompanying data
The sample GRIB file is gh850.tar.gz