Skip to content

Open mode is lost when opening via FSStore #975

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
d70-t opened this issue Feb 28, 2022 · 7 comments · Fixed by #976
Closed

Open mode is lost when opening via FSStore #975

d70-t opened this issue Feb 28, 2022 · 7 comments · Fixed by #976

Comments

@d70-t
Copy link
Contributor

d70-t commented Feb 28, 2022

This issue describes failed writes to xarray groups using FSStore where I'd expect the write to succeed. My current assumption is, that these issues occure due to mode-flags getting lost when the store is created implicitly (i.e. only an url is passed to zarr functions). I don't yet know how to fix this or if this is intended behaviour, thus I'm creating an issue.

Minimal, reproducible code sample, a copy-pastable example if possible

import zarr
import os

here = os.path.abspath(os.path.dirname(__file__))
url = "file://" + here + "/bug.zarr"  # intentionally using fsspec, problem occurs when using any FSStore backend

root = zarr.group(url, overwrite=True)

Problem description

The code above raises:

Traceback (most recent call last):
  File "/Users/tobi/Documents/code/zarrbug/test_write.py", line 7, in <module>
    root = zarr.group(url, overwrite=True)
  File "/usr/local/lib/python3.9/site-packages/zarr/hierarchy.py", line 1112, in group
    init_group(store, overwrite=overwrite, chunk_store=chunk_store,
  File "/usr/local/lib/python3.9/site-packages/zarr/storage.py", line 490, in init_group
    _init_group_metadata(store=store, overwrite=overwrite, path=path,
  File "/usr/local/lib/python3.9/site-packages/zarr/storage.py", line 504, in _init_group_metadata
    rmdir(store, path)
  File "/usr/local/lib/python3.9/site-packages/zarr/storage.py", line 138, in rmdir
    store.rmdir(path)  # type: ignore
  File "/usr/local/lib/python3.9/site-packages/zarr/storage.py", line 1259, in rmdir
    raise ReadOnlyError()
zarr.errors.ReadOnlyError: object is read-only

I'd expect that the code works (in particular due to overwrite=True)

(likely) related variant

import zarr    
import os    
    
here = os.path.abspath(os.path.dirname(__file__))    
url = "file://" + here + "/bug.zarr"    
store = zarr.storage.FSStore(url)    
    
root = zarr.group(store, overwrite=True)    
root.zeros('baz', shape=(10000, 10000), chunks=(1000, 1000), dtype='i4')    
zarr.consolidate_metadata(store)    
    
    
root = zarr.open_consolidated(url, mode="r+")    
root["baz"][0,0] = 7 

raises:

Traceback (most recent call last):
  File "/Users/tobi/Documents/code/zarrbug/test.py", line 14, in <module>
    root["baz"][0,0] = 7
  File "/usr/local/lib/python3.9/site-packages/zarr/core.py", line 1285, in __setitem__
    self.set_basic_selection(pure_selection, value, fields=fields)
  File "/usr/local/lib/python3.9/site-packages/zarr/core.py", line 1380, in set_basic_selection
    return self._set_basic_selection_nd(selection, value, fields=fields)
  File "/usr/local/lib/python3.9/site-packages/zarr/core.py", line 1680, in _set_basic_selection_nd
    self._set_selection(indexer, value, fields=fields)
  File "/usr/local/lib/python3.9/site-packages/zarr/core.py", line 1732, in _set_selection
    self._chunk_setitem(chunk_coords, chunk_selection, chunk_value, fields=fields)
  File "/usr/local/lib/python3.9/site-packages/zarr/core.py", line 1994, in _chunk_setitem
    self._chunk_setitem_nosync(chunk_coords, chunk_selection, value,
  File "/usr/local/lib/python3.9/site-packages/zarr/core.py", line 2005, in _chunk_setitem_nosync
    self.chunk_store[ckey] = self._encode_chunk(cdata)
  File "/usr/local/lib/python3.9/site-packages/zarr/storage.py", line 1177, in __setitem__
    raise ReadOnlyError()
zarr.errors.ReadOnlyError: object is read-only

I've explicitly instantiated the store when creating the dataset (which circuments the previous error).
But when trying to use the url directly for updating the dataset, I recieve a similar error (note that this error is raised on the line root["baz"][0,0] = 7).

context

The following xarray-based code raises the same (second) exception:

import numpy as np
import xarray as xr
import os

here = os.path.abspath(os.path.dirname(__file__))
url = "file://" + here + "/bug.zarr"

ds = xr.Dataset({"baz": (("x", "y"), np.zeros((10000, 10000), dtype="i4"))}).chunk({"x": 1000, "y": 1000})
ds.to_zarr(url, compute=False)

ds.isel(x=slice(1000), y=slice(1000)).to_zarr(url, mode="r+", region={"x": slice(1000), "y": slice(1000)})

Thanks @observingClouds wo showed me this example.

Version and installation information

  • Value of zarr.__version__: '2.11.0'
  • Value of numcodecs.__version__: '0.7.3'
  • Python 3.9.10
  • Operating system: Linux
  • zarr was installed using pip globally
@d70-t
Copy link
Contributor Author

d70-t commented Feb 28, 2022

For the second case (can't write chunks on consolidated store), my assumption is, that here the mode-Flag is lost, and as chunk_store defaults to store a few lines later, normalize_storage_arg creates a read-only FSStore.

I don't yet understand the use of the clobber-Argument, but probably normalize_storage_arg should be called with clobber=True, mode=mode in this case?
... but probably only if there's no chunk_store.

@d70-t
Copy link
Contributor Author

d70-t commented Feb 28, 2022

For the first case, this might be a similar one at this location. However as there's no mode-argument, there might have to be some logic which determines the correct flag in this case.

@joshmoore
Copy link
Member

seealso: #916, #712

@d70-t
Copy link
Contributor Author

d70-t commented Mar 1, 2022

From #712 I got the impression that clobber might be redundant with mode. The draft PR experiments with dropping the clobber-Argument and might fix this issue.

I didn't yet look deeper into the zarr.group case, as it might also be considered a user error by me (open_group does the trick...).

@joshmoore
Copy link
Member

I've struggled with the exact semantics of clobber as well. My interpretation was less redundant than a way to disable the mode, something like "if store is a string, ignore the mode" but the background there is unclear to me.

@d70-t
Copy link
Contributor Author

d70-t commented Mar 2, 2022

Yes, that's what I thought so too, but then, both, clobber and mode are only used if store is a string. So if store is not a string, it shouln't matter what mode is. Thus it should be fine to change mode to "r" when clobber=False, independent of if store is a string or not.

But yes, a bit more background on this would be great.

@sanketverma1704
Copy link
Member

This issue has been fixed in patch release 2.11.1. 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants