Contributor

d-v-b commented Oct 30, 2025

This PR changes the bytes dtype to support a base64-encoded string as a fill value. The change is motivated by compatibility with the fill value encoding used for variable-length bytes in Zarr V2 data, and with the current behavior of the variable_length_bytes data type in zarr-python, which I would like to deprecate in favor of bytes.

I also restructured some of the spec for clarity.

xref zarr-developers/zarr-python#3559

I am opening this as a draft because content is still missing from this PR: given that we have two different fill value encodings, we should recommend a default for writing data. I'm happy to choose either form as the default.
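
For readers following along, here is a minimal sketch of how an implementation might accept both fill value encodings. The helper names are hypothetical, and the sketch assumes the array form is a JSON list of integers in the range 0-255 and the string form is standard base64; the spec text in this PR is authoritative on the exact shapes.

```python
import base64


def encode_fill_value(value: bytes, form: str = "base64"):
    """Hypothetical helper: turn a bytes fill value into a JSON-compatible object."""
    if form == "base64":
        # Standard base64 alphabet, returned as an ASCII string.
        return base64.b64encode(value).decode("ascii")
    if form == "array":
        # Assumed array form: one integer (0-255) per byte.
        return list(value)
    raise ValueError(f"unknown fill value form: {form!r}")


def decode_fill_value(encoded) -> bytes:
    """Hypothetical helper: accept either encoding and return raw bytes."""
    if isinstance(encoded, str):
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(encoded, validate=True)
    if isinstance(encoded, list):
        # bytes() raises ValueError if any integer is outside 0-255.
        return bytes(encoded)
    raise TypeError(f"unsupported fill value encoding: {type(encoded).__name__}")


raw = b"\x00\x01\xff"
as_base64 = encode_fill_value(raw, "base64")
as_array = encode_fill_value(raw, "array")
```

A reader that dispatches on the JSON type, as above, can load metadata written with either form, regardless of which one is recommended as the write default.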

Member

LDeakin commented Oct 31, 2025

+1 for this

> should provide a recommendation for a default when writing data

Base64 is more compact and I'd say should be preferred. But in the near term, implementations probably want to stick with the array form until there is more implementation support.
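
As a rough illustration of the compactness point, the sketch below compares the JSON-serialized size of the two forms for an arbitrary 32-byte value. It assumes the array form is a list of integers in the range 0-255, and the sample value is made up for the comparison.

```python
import base64
import json

# Arbitrary 32-byte sample value (an assumption for illustration only).
fill = bytes(range(32))

# Array form: compact JSON list of integers, no whitespace.
array_json = json.dumps(list(fill), separators=(",", ":"))

# Base64 form: a single JSON string.
b64_json = json.dumps(base64.b64encode(fill).decode("ascii"))

print(len(array_json), len(b64_json))
```

The advantage grows with length. For very short values the array form can be the shorter one: a single zero byte is `[0]` as an array but `"AA=="` as base64.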

d-v-b marked this pull request as ready for review October 31, 2025 12:30

Contributor Author

d-v-b commented Oct 31, 2025

> Base64 is more compact and I'd say should be preferred. But in the near term, implementations probably want to stick with the array form until there is more implementation support.

I added a soft recommendation for the base64 version

Contributor Author

d-v-b commented Nov 3, 2025

I think this is ready for review

3 participants