Skip to content

Fix issue tuple as dimension #5477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Fix issue tuple as dimension #5477

wants to merge 1 commit into from

Conversation

thomashirtz
Copy link
Contributor

I edited dataset in core to try to solve the issue 4821. Some tests are still currently missing, also I believe there is still one more issue related to using tuple as dim.

The code xr.Dataset({"a": ([("a", "b")], [1])}) does not error yet.

  • Closes #4821
  • Tests added
  • Passes pre-commit run --all-files
  • User visible changes (including notable bug fixes) are documented in whats-new.rst

@max-sixty
Copy link
Collaborator

This looks good!

The one corner case I'm not sure this handles is when there is actually a tuple dimension. If that's correct, could we add a test for that case, and maybe here check that it's a tuple and it's not in the dimensions?

The code xr.Dataset({"a": ([("a", "b")], [1])}) does not error yet.

Great, tbc we want that construction to pass — we're saying it's allowable to have tuples as dimension names (while not encouraging it!)

Thanks @thomashirtz

@max-sixty
Copy link
Collaborator

This was quite close @thomashirtz — let me know if you'd be up for finishing it off. Thanks

@thomashirtz
Copy link
Contributor Author

I would like to continue to contribute to xarray, unfortunately I have many important matters theses days. I may contribute again from end of summer. If someone wants to tackle this issue until then, he can

@headtr1ck
Copy link
Collaborator

I think this should be also aligned with str | Iterable[Hashable], so maybe str | DataArray | Iterable[Hashable | DataArray]?
And then reverse the isinstance checks (do not capture tuple and list explicitly, but str).

@max-sixty
Copy link
Collaborator

I think this should be also aligned with str | Iterable[Hashable], so maybe str | DataArray | Iterable[Hashable | DataArray]?

Good point...

IIUC (and it's been a while since I thought through this properly), this would support allowing tuples as dimension names, but forcing them to be passed within an iterable to methods (e.g.method(dim=[('a','b')]) for a dimension named ('a', 'b'). So we'd keep str | Iterable[Hashable], and still raise an error if a tuple was passed directly (i.e. method(dim=('a','b'))

@headtr1ck
Copy link
Collaborator

So we'd keep str | Iterable[Hashable], and still raise an error if a tuple was passed directly (i.e. method(dim=('a','b'))

I would say that a tuple is fine, it IS an iterable of hashable ;)

@max-sixty
Copy link
Collaborator

I would say that a tuple is fine, it IS an iterable of hashable ;)

Sorry, yes, I was unclear above!

How should we handle dimensions which are themselves tuples? (i.e. a really unlikely corner-case, but one that IIRC inspired some of the discussion in #4821)

My thought was to interpret a tuple as an iterable of dimensions. If we have a dimension name of a tuple, then that requires passing within an Iterable — i.e. the method(dim=[('a','b')]). Since the mypy changes around str | Iterable[Hashable], this seems better than the options I enumerated at #4821 (comment).

Does that make sense? V possible my writing and / or thinking is still not clear!

@headtr1ck
Copy link
Collaborator

Yes exactly, that would be the intended way of working with dimensions/variables that are iterablea themself, like tuples.

@thomashirtz thomashirtz closed this by deleting the head repository Sep 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants