Skip to content

Updated compression filer h5z-sperr v0.2.3 #335

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Mar 26, 2025

Conversation

t20100
Copy link
Member

@t20100 t20100 commented Mar 17, 2025

This MR updates h5z-sperr to version 0.2.1 0.2.3.
A new cd_values is added to pass "missing_value_mode"

@t20100 t20100 added this to the Next release milestone Mar 17, 2025
@t20100 t20100 force-pushed the update-h5z-sperr branch from d7989a8 to 90d8fcf Compare March 17, 2025 14:04
@t20100
Copy link
Member Author

t20100 commented Mar 17, 2025

@shaomeng, I found the follwing issue when trying to update h5z-sperr:

This test:

hdf5plugin/test/test.py

Lines 148 to 163 in 4eaa8f2

def testSperr(self):
"""Test reading Sperr compressed data"""
dirname = os.path.abspath(os.path.dirname(__file__))
for fname in ["sperr.h5"]:
fname = os.path.join(dirname, fname)
self.assertTrue(os.path.exists(fname),
"Cannot find %s file" % fname)
with h5py.File(fname, "r") as h5:
compressed = h5["f64_sperr"][()]
original = h5["f64_original"][()]
self.assertTrue(original.shape == compressed.shape,
"Incorrect shape")
self.assertTrue(original.dtype == compressed.dtype,
"Incorrect dtype")
self.assertTrue(numpy.allclose(original, compressed, atol=1e-3),
"Values should be close")

Leads to:

  testSperr (__main__.TestHDF5PluginRead)
  Test reading Sperr compressed data ... sh: line 1:  1278 Segmentation fault      (core dumped) python /project/test/test.py

Reading this file works with h5z-sperr v0.1.3 but seg fault when updating to h5z-sperr v0.2.1.

@t20100
Copy link
Member Author

t20100 commented Mar 17, 2025

There is also a build issue in the hdf5plugin project on Windows:

    INFO:root:"C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\link.exe" /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:src/hdf5/lib /LIBPATH:C:\Users\runneradmin\AppData\Local\Temp\cibw-run-oegt6uv8\cp38-win_amd64\build\venv\libs /LIBPATH:C:\Users\runneradmin\AppData\Local\pypa\cibuildwheel\Cache\nuget-cpython\python.3.8.10\tools\libs /LIBPATH:C:\Users\runneradmin\AppData\Local\pypa\cibuildwheel\Cache\nuget-cpython\python.3.8.10\tools /LIBPATH:C:\Users\runneradmin\AppData\Local\Temp\cibw-run-oegt6uv8\cp38-win_amd64\build\venv\PCbuild\amd64 /LIBPATH:build\temp.win-amd64-cpython-38 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\ATLMFC\lib\x64" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.22621.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.22621.0\um\x64" hdf5.lib charls.lib lz4.lib snappy.lib sperr.lib zfp.lib zlib.lib zstd.lib /EXPORT:H5PLget_plugin_info build\temp.win-amd64-cpython-38\Release\src/H5Z-SPERR/src/compactor.obj build\temp.win-amd64-cpython-38\Release\src/H5Z-SPERR/src/h5z-sperr.obj build\temp.win-amd64-cpython-38\Release\src/H5Z-SPERR/src/icecream.obj /OUT:build\lib.win-amd64-cpython-38\hdf5plugin\plugins\libh5sperr.dll /IMPLIB:build\temp.win-amd64-cpython-38\Release\src/H5Z-SPERR/src\libh5sperr.lib
       Creating library build\temp.win-amd64-cpython-38\Release\src/H5Z-SPERR/src\libh5sperr.lib and object build\temp.win-amd64-cpython-38\Release\src/H5Z-SPERR/src\libh5sperr.exp
    sperr.lib(h5zsperr_helper.obj) : error LNK2001: unresolved external symbol H5E_BADVALUE_g
    sperr.lib(h5zsperr_helper.obj) : error LNK2001: unresolved external symbol H5E_ERR_CLS_g
    sperr.lib(h5zsperr_helper.obj) : error LNK2001: unresolved external symbol H5E_PLINE_g
    build\lib.win-amd64-cpython-38\hdf5plugin\plugins\libh5sperr.dll : fatal error LNK1120: 3 unresolved externals

@shaomeng
Copy link

Hi @t20100 , thank you so much for updating hdf5plugin!

Regarding the linking error, I think it's just about a header, and the 0.2.2 release (https://github.com/NCAR/H5Z-SPERR/releases/tag/v0.2.2) should have fixed it. Could you give it a try?

Regarding the failed test, it's because that I broke backward compatibility... Does it mean that if I supply a few new test files, then we can have the tests pass?

@t20100 t20100 force-pushed the update-h5z-sperr branch from 90d8fcf to 53dc724 Compare March 19, 2025 16:02
@t20100
Copy link
Member Author

t20100 commented Mar 19, 2025

Regarding the linking error, I think it's just about a header, and the 0.2.2 release (https://github.com/NCAR/H5Z-SPERR/releases/tag/v0.2.2) should have fixed it. Could you give it a try?

It does not fix the issue. But the problem is in hdf5plugin that needs to provide the missing symbols to the lib instead of libhdf5.

Regarding the failed test, it's because that I broke backward compatibility... Does it mean that if I supply a few new test files, then we can have the tests pass?

The main problem is users won't be able to read older data and have their application to crash instead without an error message because they updated this package...
I was planning to make a minor release, so a breaking change won't go into it.
Also it would be great to detect older/unsupported compressed data and show an explicit error message in this case if it's not possible to keep the backward compatibility.

@t20100 t20100 removed this from the Next release milestone Mar 19, 2025
@shaomeng
Copy link

shaomeng commented Mar 20, 2025

It does not fix the issue. But the problem is in hdf5plugin that needs to provide the missing symbols to the lib instead of libhdf5.

Thanks for clarifying. Does it mean that I don't need to do anything to have it fixed?

The main problem is users won't be able to read older data and have their application to crash instead without an error message because they updated this package... I was planning to make a minor release, so a breaking change won't go into it. Also it would be great to detect older/unsupported compressed data and show an explicit error message in this case if it's not possible to keep the backward compatibility.

This makes sense. I've changed the code to check "compatibility versions" and raise a proper HDF5 error when they don't match, instead of going into segfaults. Would you mind giving the head of the main branch a try? I can tag a new release if it behaves as expected.
Let me think about the backward compatibility a little bit and then get back to you. I might have a way to support backward compatibility!

@shaomeng
Copy link

Hi @t20100 , I actually just implemented the ability to read files produced by earlier versions of H5Z-SPERR; it's not too bad ;) I've also tagged a new release, version 0.2.3, that includes this change. Do you mind giving it a test, and possibly make it into your next release?

t20100 added 3 commits March 24, 2025 16:24
9a825a5ff Update version to 0.2.3
7058d2a75 Merge pull request silx-kit#14 from shaomeng/main
02fd24d65 Merge branch 'NCAR:main' into main
804d0bf66 be able to read binaries produced by version 0.1.x
8775d06fa Merge pull request silx-kit#13 from shaomeng/main
20ccecb74 Merge branch 'NCAR:main' into main
1e009bcb8 make it an error when compatibility versions don't match
d6a744017 Merge pull request silx-kit#12 from shaomeng/main
13f3f6b73 fix a windows compilation issue
c07099a70 Add experimental clamping filter
f1d35dcc0 add a mini filter: h5z-clamp, with a filter ID of 45678
18fd6f9ce Update README.md
f9f4ee28e Merge pull request silx-kit#10 from shaomeng/bitmask_compactor
68e7942e3 minor
d7604708c bumps version number
f25ecfb24 Update README.md
e745e5bf4 minor
49fe13695 Update README.md
65ca29cec Update README.md with the filter behavior table
18bd130f6 correct mean calculation
5366f22a6 finish decompression code. Need to run tests
2ea811491 finished compression routine, work on decompression
a6b9c5cb5 WIP: compression
9db422acf improve the compactor
ca6df4c8f improve the replace functions
01af6d609 implement replace function
743b9d4cc add function h5zsperr_make_mask_nan(), still need to write a unit test for it
2d9e50d88 remove missing_value_mode 3 and 4, so my job is easier now
b5a0d05cf check in
69f885ee9 encode the magic number in set_local()
429ee9d61 use C++ to implement helper functions
b03cba98e function name change
c6de1094e check in
f0ff97bf8 finish packing and unpacking cd_values[]
06bc330fd re-work on the pack data type function
4d3954fdd add functions to check if an input array really has the specified type of missing values
ea73f7b02 use a separate file h5zsperr_helper to keep helper functions
90ae7d9ed set_local() function considers missing value flag
60148fc3b add another compactor unit test
a5eca6efe compactor works with encoding and decoding
6af5b99a6 improve compactor_comp_size()
76a43720e add compactor_comp_size() function
881ea955a change name to be compactor
8b671bfc8 rename bitstream to icecream
7b9af10cb remove the end pointer
fee5981c7 specify and test the memory usage of bitstream
3f262de1f implement rtell()
3672f78a0 finish bitstream class, adding unit tests
20b62230f WIP: bitstream class
4cf8a36d5 start working on bitmaskt compactor, add first function
ea0aec35a improve README
4c29f3efd Merge pull request silx-kit#8 from NCAR/chunk_dim_check
6ea4bf485 check that dataspace dimension can be divided by the chunk dimension
e344ce900 add link to hdf5plugin

git-subtree-dir: src/H5Z-SPERR
git-subtree-split: 9a825a5ff4de84458f092bc63135e466bc68c0cb
@t20100 t20100 force-pushed the update-h5z-sperr branch from 53dc724 to 005df01 Compare March 24, 2025 15:28
@t20100 t20100 changed the title Updated compression filer h5z-sperr v0.2.1 Updated compression filer h5z-sperr v0.2.3 Mar 24, 2025
@t20100
Copy link
Member Author

t20100 commented Mar 24, 2025

@shaomeng v0.2.3 passes the tests fine with the older sperr compressed hdf5 file, thanks!

I still have to fix the build issue on Windows: the problem is the build a plugin with a mixture of C and C++20 files with setuptools. I'll find a way

@shaomeng
Copy link

@shaomeng v0.2.3 passes the tests fine with the older sperr compressed hdf5 file, thanks!

Great! Do you think I need to supply a hdf5 that's compressed using the new version?

I still have to fix the build issue on Windows: the problem is the build a plugin with a mixture of C and C++20 files with setuptools. I'll find a way

If it helps, the C++ code only requires C++14, not C++20. I'm not sure if it makes things easier for you. Sorry :)

@t20100 t20100 marked this pull request as ready for review March 25, 2025 09:46
@t20100
Copy link
Member Author

t20100 commented Mar 25, 2025

Do you think I need to supply a hdf5 that's compressed using the new version?

If you a small have one, that'd be great, we can add it to the test files: https://github.com/silx-kit/hdf5plugin/tree/main/test

Windows compilation fixed by adding a macro to the C++ build part to declare hdf5 symbols as external while building sperr as a static lib.

Ready for review.

@shaomeng, would you mind having a look at the missing_value_mode argument support : f294eb7

@shaomeng
Copy link

If you a small have one, that'd be great, we can add it to the test files: https://github.com/silx-kit/hdf5plugin/tree/main/test

@shaomeng, would you mind having a look at the missing_value_mode argument support : f294eb7

I've created this test file:
sperr_missing_val.zip
The thing to verify upon decompression is that locations at the indexes of (1, 10, 100) are NaNs, while other locations have normal 64-bit floating point values.

Also, the missing value mode support looks good to me; thanks for making it happen!

@t20100 t20100 merged commit ae42c21 into silx-kit:main Mar 26, 2025
7 checks passed
@t20100 t20100 deleted the update-h5z-sperr branch March 26, 2025 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants