Support SWMR in HDF5 (updated) #2653


Status: Open. Wants to merge 24 commits into base: main
Conversation

@ZedThree (Contributor) commented Mar 7, 2023

I've taken the work @eivindlm started in #1448, and done the following:

  • merged in recent main and fixed conflicts
  • removed use of deleted macro
  • added --enable-hdf5-swmr option to configure.ac
  • changed the use of add_definitions in CMake to defining a macro in config.h, to be consistent with the autotools version
  • fixed a bug where H5Pset_libver_bounds wasn't set when opening a file
  • replaced calls to H5Pstart_swmr_write with file access flags
    • this was required to get the tests to pass

This feature needs some documentation, as it's not quite straightforward to use. Where's best to put that?

Should this also have a feature flag set in netcdf_meta.h, and in nc-config?

Given that this feature needs explicitly enabling at configure time, should it be added to an existing/new CI build?

eivindlm and others added 14 commits July 10, 2019 21:16
* main: (1776 commits)
  Invert solution as discussed at Unidata#2618
  Correct a potential null pointer dereference.
  Fix a logic error that was resulting in an easy-to-miss error when running configure.
  Fix issue with dangling test file not getting cleaned up.
  Turn nczarr zip support off by default in cmake, add a status line indicating whether nczarr-zip-support is available, in libnetcdf.settings.
  Update the version of the cache action used by github action from v2 to v3.
  Explicit cast to unsigned char.
  More issues returned by sanitizer, related to attempts to assign MAX_UNSIGNED_CHAR (255) to a variable of type char.
  Fixed an issue where memcpy was potentially passed a null pointer.
  Correct another uninitialized issue.
  Correct undefined variable error.
  Fixing issues uncovered by compiling with '-fsanitize=undefined' and running nc_test/nc_test, in support of Unidata#1983, amongst others.  The cast as was being used was undefined behavior, and had to be worked around in a style approximating C++'s 'reinterpret_cast'
  Remove a stray character at the head file.
  Fix a distcheck failure with nczarr_test/run_interop.sh
  Turn benchmarks off by default. They require netcdf4, additional logic is required in order to have them on by default.
  Add execute bit to test scripts
  Fix missing endif statement
  Add generated parallel tests for nc_perf, cmake-based build system.
  Correct typo in CMakeLists.txt
  Wiring performance benchmarks into cmake, cleaned up serial compression performance test dependency on MPI.
  ...
@edwardhartnett (Contributor)

Awesome work!

Definitely add a flag to netcdf_meta/nc-config for this!

@ZedThree (Contributor, Author) commented Mar 7, 2023

OK, I've added NC_HAS_HDF5_SWMR to netcdf_meta.h and --has-hdf5-swmr to nc-config.

@DennisHeimbigner (Collaborator)

HDF5 has lots of features that are not supported in netcdf-c.
In order to include such a feature, we need evidence that it is
in high demand.

@edwardhartnett (Contributor)

@DennisHeimbigner SWMR is a very useful feature for HPC users. It is frequently the case that a model is writing results and other processes want to see those results. To calculate the next timestep, all processors need some data from the previous timestep. The simplest solution would be for those reading processes to read the file and get what they want, but without SWMR it will not work.

A NOAA group just asked about this recently, so I believe it is a feature that will be used by multiple modeling groups. There's also a good chance that PIO would allow access to this feature, and PIO is used in several important models at NOAA and NCAR.

@edwardhartnett (Contributor)

@gsjaardema what do you think about SWMR?

@ZedThree (Contributor, Author) commented Mar 9, 2023

The inability to watch netCDF4 files while they are being written is the main complaint I get from researchers when upgrading applications and libraries. Using SWMR removes that limitation. This is a pretty basic QoL feature that would benefit a lot of users.

It would be even better if this feature could be enabled and used completely automatically, but it's not clear if that's even possible.

@gsjaardema (Contributor)

@gsjaardema what do you think about SWMR?

I agree with @ZedThree that many users want to be able to query the HDF5 file while it is being written and the inability to do that is a big complaint. Especially when it works for netcdf-3 and netcdf-5 but not netcdf-4. Explaining that to users is confusing.

At one time, SWMR did not work with parallel files; I don't remember if that has changed yet. We would at a minimum need to support parallel writing with serial read; parallel write and parallel read would be nice. Serial write/serial read, which SWMR definitely supports, is also useful for us though.

There is also a question of overhead: does use of SWMR entail any performance hit on writing? If none, or minimal, then I could see using it at all times; if there is a performance hit, then we would probably need it to be turned on/off on a file-by-file basis.

@ZedThree (Contributor, Author)

Just to be clear, in case there's any confusion, this is a completely opt-in feature currently, and requires netCDF to be configured and built with explicit support, as well as a runtime flag when opening the file.

I think SWMR is incompatible with parallel files, and only works in serial. I have seen some references to "VFD SWMR", "full SWMR", and "MWMR" (multiple writer, multiple reader), but I'm not sure of the availability of these features.

@Mitch3007 commented Nov 14, 2023

Hi all, great stuff! I was just wondering if there had been any updates on this work regarding merging into the main branch, or any other initiatives in NetCDF over the past six months that do something similar? Very keen to be able to read NetCDF4 model result files (in QGIS, for example) while a hydrodynamic model is still running. The ability to review results 'on-the-fly' as a model is running is an important part of flood and coastal modelling workflows, and this functionality would provide great value to researchers and practitioners.

@DennisHeimbigner (Collaborator) commented Nov 14, 2023

replaced calls to H5Pstart_swmr_write with file access flags; this was required to get the tests to pass

Where is this change? I do not see it in the files changed for this PR.

@ZedThree (Contributor, Author)

I was just wondering if there had been any updates on this work regarding building into the main branch or any other initiatives in NetCDF over the past six months that do something similar?

It looks like this branch has fallen out of date a little bit; I'll try and fix that up. I don't think there has been any other work on this though.

If you're able, it would be nice if you could try out this branch and see if there are any major issues I've missed!

Where is this change; I do not see it in the files changed for this PR.

This is just a difference in implementing this feature between #1448 and this PR, so it doesn't appear in the diff between this branch and main.

In libhdf5/hdf5open.c in #1448 there was:

    if (mode & NC_WRITE && mode & NC_HDF5_SWMR) {
      if ((retval = H5Fstart_swmr_write(h5->hdfid))) {
        BAIL(retval);
      }
    }

and in this PR I instead changed the file access flags:

      if((mode & NC_HDF5_SWMR)) {
        flags |= H5F_ACC_SWMR_WRITE;
      }

The former starts SWMR mode after the file has been opened; the latter opens the file immediately in SWMR mode. I don't recall exactly why this made a difference, except that it was required to get the tests to pass.

@Mitch3007 commented Nov 20, 2023

Hi @ZedThree (and an early disclaimer here about me being a hack), I've been testing the past few days and have compiled #2652 on top of HDF5 (have completed tests on both HDF5 1.10.11 and 1.14.1-2). For the most part it compiles successfully; however, tst_filter appears to fall over. See outputs at the bottom of this comment. Any advice is certainly welcome.

Despite tst_filter I've pushed on. Our program uses netcdf-fortran to write results, so to try and test your commits I've built v4.6.1 (https://github.com/Unidata/netcdf-fortran/tree/v4.6.1) on top of #2653. Sadly, I appear to still be getting file lock issues (no doubt user error here...).

Are you aware of anything within the netcdf-fortran libs that will also need to be updated to enable SWMR, or potentially any additional flags I should be setting when first opening our result files via nf90_create? Happy to send more information through. Many thanks for your help. Cheers, Mitch.

(screenshot attached)

=======================================================
   netCDF 4.9.3-development: nc_test4/test-suite.log
=======================================================

# TOTAL: 83
# PASS:  82
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: tst_filter
================

findplugin.sh loaded
final HDF5_PLUGIN_DIR=/home/mjs/dev/tuflowfv_netcdf/tuflowfv/tuflow_shared_external/netcdf-c/plugins/.libs
filter szip not available
*** Testing dynamic filters using API

*** Testing API: bzip2 compression.
show parameters for bzip2: level=9
show chunks: chunks=4,4,4,4

*** Testing API: bzip2 decompression.
data comparison: |array|=256
no data errors
*** Pass: API dynamic filter

*** Testing dynamic filters parameter passing
test1: compression.
test: nparams=14: params= 1 239 23 65511 27 77 93 1145389056 3287505826 1097305129 1 2147483648 4294967295 4294967295
dimsizes=4,4,4,4
chunksizes=4,4,4,4
>>> mismatch: tint64
>>> mismatch: tuint64
>>> mismatch: tfloat64
test1: decompression.
>>> mismatch: tint64
>>> mismatch: tuint64
>>> mismatch: tfloat64
data comparison: |array|=256
no data errors
test2: dimsize % chunksize != 0: compress.
test: nparams=14: params= 2 239 23 65511 27 77 93 1145389056 3287505826 1097305129 1 2147483648 4294967295 4294967295
dimsizes=4,4,4,4
chunksizes=4,4,4,4
>>> nbytes = 1024 chunk size = 1024
test2: dimsize % chunksize != 0: decompress.
>>> nbytes = 1024 chunk size = 1024
data comparison: |array|=256
no data errors
test3: dimsize % chunksize != 0: compress.
test: nparams=14: params= 3 239 23 65511 27 77 93 1145389056 3287505826 1097305129 1 2147483648 4294967295 4294967295
dimsizes=4,4,4,4
chunksizes=4,4,4,4
test3: error code = 0
test3: dimsize % chunksize != 0: decompress.
data comparison: |array|=256
no data errors
*** Pass: parameter passing
*** Testing dynamic filters using ncgen
*** Pass: ncgen dynamic filter
*** Testing dynamic filters using nccopy
	*** Testing simple filter application
	*** Pass: nccopy simple filter
	*** Testing '*' filter application
	*** Pass: nccopy '*' filter
	*** Testing 'v&v' filter application
	*** Pass: nccopy 'v|v' filter
	*** Testing pass-thru of filters
	*** Pass: pass-thru of filters
	*** Testing -F none
	*** Pass: -F none
	*** Testing -F var,none 
	*** Pass: -F var,none
*** Pass: all nccopy filter tests
*** Testing dynamic filters using ncgen with -lc
*** Pass: ncgen dynamic filter
*** Testing multiple filters

*** Testing Multi-filter application: filter set = bzip2 deflate noop
filters verified
show chunks: chunks=4,4,4,4
direction=compress id=40000 cd_nelmts=0 cd_values=

*** Testing Multi-filters.
filters verified
direction=decompress id=40000 cd_nelmts=0 cd_values=
data comparison: |array|=256
no data errors
*** nccopy -F with multiple filters
direction=compress id=40000 cd_nelmts=0 cd_values=
*** ncgen with multiple filters
*** Pass: multiple filters
*** Testing filter re-definition invocation
*** Testing multiple filter order of invocation on create
7a8,9
> direction=compress id=40000 cd_nelmts=1 cd_values= 0
> direction=compress id=40001 cd_nelmts=1 cd_values= 1
11,12d12
< direction=compress id=40000 cd_nelmts=1 cd_values= 0
< direction=compress id=40001 cd_nelmts=1 cd_values= 1

@magnusuMET (Contributor) left a comment

I might be mistaken but I think the version handling code is wrong in places. You'll always want the highest version possible, but limit downwards to 1.10 (when SWMR was introduced) when using this feature.

@ZedThree (Contributor, Author)

@magnusuMET Yes, I think you're right. Actually, I'm not sure why we can't just sidestep the issue and only call H5Pset_libver_bounds when enabling SWMR mode.

@ZedThree (Contributor, Author)

@magnusuMET You prompted me to look into this properly -- it turns out that since the original PR (#1448) that this one is based on, a new function hdf5set_format_compatibility was added which sets the libver bounds, and that was being called after the new call to H5Pset_libver_bounds here with the result that the values would get clobbered anyway.

The bounds that are currently set (in main) have H5F_LIBVER_LATEST for the high bound, so we don't have to do anything special for SWMR: simply creating a file in SWMR mode sets the file format to the HDF5 1.10 format (superblock 3).

Libver bounds already taken care of elsewhere, and merely opening the
file in SWMR mode will correctly set format to at least v110
@ZedThree (Contributor, Author)

I've added nc_test4/test_hdf5_swmr.sh that now properly exercises the SWMR capability. Running this test creates separate processes for simultaneously writing to and reading from the same file. This was very useful because it flagged that we need to refresh the metadata when reading dimension lengths. With that implemented, I think this is now the minimally useful version of the feature.

Things that are now missing:

  • testing in CI
  • some docs on how to actually use this
  • maybe ncwatch, the equivalent of h5watch for watching files (basically ncdump but in SWMR mode)

This is currently gated behind a build-time flag (as well as the runtime flag). Does that still make sense? I notice that the CI now only runs on HDF5 1.10.8+, so if 1.8.x is no longer supported, this feature should always be available (though not always turned on of course!). It would also mean not needing an additional CI job.

@WardF @DennisHeimbigner thoughts on how to proceed from here?

@DennisHeimbigner (Collaborator)

Has anyone tested this under Windows (Visual Studio)?

@ZedThree (Contributor, Author)

I don't have access to a Windows dev environment myself, unfortunately. Once the Windows tests are running on CI, we could use that?

@Mitch3007 commented Dec 17, 2024

I don't have access to a Windows dev environment myself unfortunately. When the Windows tests are running on CI, could use that?

Hi @ZedThree,

I have access to both Linux (CentOS 7) and Windows 10 Visual Studio 2019 build environments, so am happy to undertake some Visual Studio tests once I've got things up and running properly on Linux. I've had some time to look at this again over the past week; however, I'm struggling to build the SWMR tests within netcdf-c/nc_test4/CMakeLists.txt:

if (ENABLE_HDF5_SWMR)
  BUILD_BIN_TEST(test_hdf5_swmr_writer)
  BUILD_BIN_TEST(test_hdf5_swmr_reader)
  ADD_SH_TEST(nc_test4 test_hdf5_swmr)
endif()

The terminal output when running the tests (which doesn't include test_hdf5_swmr) is provided as follows.

-----------------------------
make  check-TESTS
make[2]: Entering directory `/home/mjs/dev/tuflowfv_netcdf/tuflowfv/tuflow_shared_external/netcdf-c/nc_test4'
make[3]: Entering directory `/home/mjs/dev/tuflowfv_netcdf/tuflowfv/tuflow_shared_external/netcdf-c/nc_test4'
PASS: tst_dims
PASS: tst_dims2
PASS: tst_dims3
PASS: tst_files
PASS: tst_files4
PASS: tst_vars
PASS: tst_varms
PASS: tst_unlim_vars
PASS: tst_converts
PASS: tst_converts2
PASS: tst_grps
PASS: tst_grps2
PASS: tst_compounds
PASS: tst_compounds2
PASS: tst_compounds3
PASS: tst_opaques
PASS: tst_strings
PASS: tst_strings2
PASS: tst_interops
PASS: tst_interops4
PASS: tst_interops5
PASS: tst_interops6
PASS: tst_interops_dims
PASS: tst_enums
PASS: tst_coords
PASS: tst_coords2
PASS: tst_coords3
PASS: tst_vars3
PASS: tst_vars4
PASS: tst_chunks
PASS: tst_chunks2
PASS: tst_utf8
PASS: tst_fills
PASS: tst_fills2
PASS: tst_fillbug
PASS: tst_xplatform
PASS: tst_xplatform2
PASS: tst_endian_fill
PASS: tst_atts
PASS: t_type
PASS: cdm_sea_soundings
PASS: tst_camrun
PASS: tst_vl
PASS: tst_atts1
PASS: tst_atts2
PASS: tst_vars2
PASS: tst_files5
PASS: tst_files6
PASS: tst_sync
PASS: tst_h_scalar
PASS: tst_rename
PASS: tst_rename2
PASS: tst_rename3
PASS: tst_h5_endians
PASS: tst_atts_string_rewrite
PASS: tst_hdf5_file_compat
PASS: tst_fill_attr_vanish
PASS: tst_rehash
PASS: tst_filterparser
PASS: tst_bug324
PASS: tst_types
PASS: tst_atts3
PASS: tst_put_vars
PASS: tst_elatefill
PASS: tst_udf
PASS: tst_put_vars_two_unlim_dim
PASS: tst_bug1442
PASS: tst_quantize
PASS: tst_h_transient_types
PASS: tst_alignment
PASS: tst_h_strbug
PASS: tst_h_refs
PASS: run_empty_vlen_test.sh
PASS: tst_v2
PASS: run_grp_rename.sh
PASS: tst_misc.sh
PASS: test_fillonly.sh
PASS: tst_fixedstring.sh
PASS: tst_filter.sh
PASS: tst_specific_filters.sh
PASS: tst_bloscfail.sh
PASS: tst_filter_vlen.sh
PASS: tst_filter_misc.sh
make[4]: Entering directory `/home/mjs/dev/tuflowfv_netcdf/tuflowfv/tuflow_shared_external/netcdf-c/nc_test4'
make[4]: Nothing to be done for `all'.
make[4]: Leaving directory `/home/mjs/dev/tuflowfv_netcdf/tuflowfv/tuflow_shared_external/netcdf-c/nc_test4'
============================================================================
Testsuite summary for netCDF 4.9.3-development
============================================================================
# TOTAL: 83
# PASS:  83
# SKIP:  0
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0
============================================================================

-----------------------------

Is there anything additional I need to add to my build environment to add these tests to the nc_test4 list?

As a test, I've also tried commenting out the ENABLE_HDF5_SWMR check as follows:

#if (ENABLE_HDF5_SWMR)
BUILD_BIN_TEST(test_hdf5_swmr_writer)
BUILD_BIN_TEST(test_hdf5_swmr_reader)
ADD_SH_TEST(nc_test4 test_hdf5_swmr)
#endif()

However this still doesn't seem to add the tests. Any thoughts on what I might be doing wrong or need to add? Very much appreciate any help you can provide.

I've attached the build script I'm using to build netcdf-c, which I hope provides some further context on how I'm building. build_netcdf.txt. I've renamed from .sh to .txt here to hopefully avoid any security issues.

The HDF version I'm building on is 1.14.1-2.

Cheers, Mitch.

@WardF (Member) commented Dec 17, 2024

@Mitch3007 When you run cmake, there should be a line in the output that either confirms `Found bash:` or reports `bash shell not found`. Can you take a look and see what's being reported? Similarly, I'll test this in my environment where I know I have bash, and then I will see if I can get the conflicts sorted. Thanks!

@Mitch3007

Hi @WardF, thanks for taking a look at this :)

I've reviewed the terminal output when building netcdf-c. I don't seem to see any 'Found bash:' or 'bash shell not found' output; however, the build environment looks to be using bash (BASH_VERSION = 4.2.46(2)-release). Does that look OK to you, or is there any specific log/output file I can take a look at further? I've attached the full terminal output (build_netcdf_all_terminal_output.txt) if it's of any use as reference.

For further context the build process I'm using is as follows:

  1. Setup the Intel OneAPI environment via a call to source set_dev_environment.sh (attached as set_dev_environment.txt).
  2. Run build_all.sh to build netcdf-c (attached as build_all.txt). In the attached terminal output, HDF5 had already been built previously, hence the --no_hdf5 flag. This script calls build_netcdf.sh (build_netcdf.txt). Note, I've commented out the netcdf-fortran build for the moment, as I'll cross that bridge once SWMR is confirmed to be working OK in my C Linux and Windows build environments.

Also of interest, I was expecting test_hdf5_swmr_reader.c and test_hdf5_swmr_writer.c to produce some .o files in netcdf-c/nc_test4, but I don't appear to be building them.

Thanks again, Mitch.

@Mitch3007 commented Dec 23, 2024

Hi @WardF ,

Despite the nc_test4 build issues mentioned above, I've been able to successfully build pull request commit ff7bde5 in both Linux and Windows build environments. To test SWMR, I'm using Fortran 90 and have put together two small Fortran prototype programs.

The first opens the file in SWMR mode, declares the file structure, then closes it and opens it again for writing. It writes out 20 time steps to a CF-compliant NetCDF grid on lon/lat coordinates.

The second program opens the NetCDF in SWMR read mode, reports some information on the length of the time dimension, then closes the file. There is a short pause (0.1 seconds) and then it reopens and repeats the process 200 times.

To get it working in Fortran I've pulled down netcdf-fortran 138a6b7 and have locally added the following to netcdf-fortran/fortran/netcdf_constants.F90:

! Flags for SWMR
integer, parameter, public :: nf90_hdf5_swmr = 65536

So @DennisHeimbigner and @ZedThree, SWMR appears to be working on Windows and Linux. Keen to hear where we need to go from here. Happy to upload my test programs if you think it would be useful. Please let me know if you need anything further from me to help progress this one.

@WardF (Member) commented Jan 3, 2025

The next step will be to figure out these conflicts; I will take a stab at that. Any additional thoughts @DennisHeimbigner?

@DennisHeimbigner (Collaborator)

The only thing that comes to mind is the threading issue.
Remember that while HDF5 has locking, netcdf does not.

@Mitch3007

Hi @WardF and @DennisHeimbigner, hope you've been well. If it helps, and you think a newbie such as me can manage it, I could try to work through the conflicts. I might need some help/oversight though. Hoping we can get it out into the world. This will be an excellent feature if we can get it over the line.

@DennisHeimbigner (Collaborator)

Refresh my memory. How is the netcdf-c library told to use swmr on a given file?

@Mitch3007

Hi @DennisHeimbigner, thanks very much for your help. There is an NC_HDF5_SWMR flag that can be optionally added to nc_create and nc_open (write and read modes). There are some examples here: ff7bde5, see nc_test4/test_hdf5_swmr_writer.c and nc_test4/test_hdf5_swmr_reader.c.

If it helps, this is the general approach as described by @ZedThree back in Nov 2024 in the thread above:

There's currently no docs for this feature, but to use it properly, you should first open the file normally and create all the variables and attributes. Then, close and reopen the file in SWMR mode (nc_open(FILE_NAME, NC_WRITE|NC_HDF5_SWMR, &ncid)) to append your data. At this point, you should be able to open the file for reading in a different process, also with the SWMR flag.

If you need any further information please let me know. Cheers, Mitch.

@DennisHeimbigner (Collaborator)

This is my problem: I (and Ward, I think) jealously guard the use of the nc_open/nc_create mode flags because
they are a very scarce resource. So I am always reluctant to use one for something as narrow as just HDF5
SWMR. There are a myriad of potential HDF5-specific features, and we could soon use up all of the netcdf
mode flags trying to accommodate them.

@Mitch3007

Hi Dennis, thanks for coming back to me and for the further info on the scarcity of the flags. I suspect you get hit with a lot of 'please help me, my issue is the most important to include', so here is my go at it too haha. I think there would be tremendous value to the modelling community if people could open their NetCDF model results whilst a model is still running. We were able to do this in earlier versions of HDF5 and NetCDF but lost this ability in recent versions after hitting file locking issues.

This pull request has been open for a while, but earlier on there was quite a bit of support from around March 2023 championing the feature. Thanks for considering the change. If anyone else has any further comments please drop them in the discussion.

Thanks, Mitch.

@gsjaardema (Contributor)

There is some discussion on making the mode flag 64-bit in #2128. I think the consensus was that it could be done prior to 5.0 since it doesn't change the file and should be a backwards compatible change...

@DennisHeimbigner (Collaborator)

There is some discussion on making the mode flag 64-bit in #2128. I think the consensus was that it could be done prior to 5.0 since it doesn't change the file and should be a backwards compatible change...

I do not think this is entirely accurate, for two reasons:

  1. internally, the mode is passed around as an integer (i.e. 32 bit)
  2. nc_open/nc_create assume an integer mode argument, so it would break users of previous versions of the library.

@gsjaardema (Contributor) commented May 29, 2025

There is some discussion on making the mode flag 64-bit in #2128. I think the consensus was that it could be done prior to 5.0 since it doesn't change the file and should be a backwards compatible change...

I do not think this is entirely accurate for two reasons:

  1. internally, the mode is passed around as an integer (i.e. 32 bit)
  2. nc_open/nc_create assume an integer mode argument, so it would break users of previous versions of the library.

My responses:

  1. Yes, there would need to be changes to the internal representation and uses of the mode in the library. But AFAIK, the mode isn't stored in the file.

  2. The argument is passed by value, and the compiler will implicitly convert a 32-bit integer to a 64-bit integer, so users would not need to update their client code unless they wanted to use new mode flags that are larger than 32 bits.

The main compatibility issue would be if there is a query mechanism for the mode, since it would now return a 64-bit value instead of a 32-bit one, which would require changes to the client code.
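The pass-by-value point can be illustrated with a small self-contained sketch (hypothetical names, not the actual netcdf API):

```c
#include <stdint.h>

/* Hypothetical 64-bit-mode variant of nc_open, used only to show the
 * calling convention; not the real netcdf API. */
static int64_t last_mode;

static int nc_open64_demo(const char *path, int64_t mode)
{
    (void)path;
    last_mode = mode;  /* record what the callee actually received */
    return 0;
}

static int widening_demo(void)
{
    /* Old client code passing a 32-bit int mode still compiles and
     * works: the argument is widened by value at the call site. */
    int old_mode = 0x1000;  /* an existing 32-bit flag, e.g. NC_NETCDF4 */
    nc_open64_demo("file.nc", old_mode);
    if (last_mode != 0x1000) return 1;

    /* New clients can use flags above bit 31 */
    int64_t new_mode = (int64_t)1 << 40;
    nc_open64_demo("file.nc", new_mode);
    if (last_mode != ((int64_t)1 << 40)) return 1;
    return 0;
}
```

The truncation hazard raised later in the thread is the converse case: a 64-bit mode passed to a function declared with a 32-bit parameter silently drops the high bits.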

@DennisHeimbigner (Collaborator)

One other point: a client expecting to use a 64-bit mode might end up using an older library expecting 32 bits.
In this case, the 64-bit arg will get truncated, and the truncation result may differ on different machines.
In any case, as you know, making the internal changes will be pervasive, so we would need to
marshal the necessary manpower. It is something we need to do, but the details need some thought.

So, the original issue stands: do we sacrifice a mode flag in the current 32 bit mode?
I believe there is a good case for it.

@DennisHeimbigner (Collaborator)

I just reviewed the currently defined mode flags in netcdf.h.
We have one unused bit in the lower 16 bits and 14 unused bits
in the upper 16 bits. So, assuming we move to a 64-bit (or wider) mode
in the near future, I see little problem in adding a SWMR flag.
A minor nitpick: I am not sure about calling it swmr; is there a more descriptive, non-HDF5-specific name?

@gsjaardema (Contributor)

At some point it might be good to look at mechanisms for passing information down into HDF5 related to different capabilities at that level. I don't remember if SWMR has any options/configuration, but if there is support for VFD and VOL in the future, then it would be nice to be able to specify configuration through a different mechanism than just lots of similar but different modes...

@gsjaardema (Contributor) commented May 29, 2025

A minor nitpick: I am not sure about calling it swmr; is there a more descriptive - non-hdf5 specific name?

Agree. Not sure what to call it, but SWMR is not very descriptive unless you know HDF5.

It is similar to the existing NC_SHARE in that it enables concurrent (read) access to a dataset by multiple processes.

@DennisHeimbigner (Collaborator)

I had forgotten about NC_SHARE. That flag is currently accepted only for netcdf-3 classic files, and even then
does not appear to be acted upon at all.
It might make sense to just reuse it as the SWMR flag for netcdf-4.
