Use deepcopy recursively on numpy arrays #4379


Merged 1 commit into pydata:master on Aug 27, 2020

Conversation

@darikg (Contributor) commented Aug 26, 2020

(Not sure this is worth noting in whats-new)
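For context, a minimal sketch of the surprising behaviour described in the linked issue; the array contents and exact reproducer here are illustrative assumptions rather than this PR's actual test:

import copy

import numpy as np
import xarray as xr

# An object-dtype array holding a mutable Python object (hypothetical example).
arr = xr.DataArray(np.array([{"a": 1}], dtype=object))

copied = copy.deepcopy(arr)
copied.values[0]["a"] = 2

# With deepcopy applied recursively to the object elements, the original is
# left untouched; previously the dict was shared between original and copy.
print(arr.values[0]["a"])  # expected: 1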

@max-sixty (Collaborator)

Thanks for the PR @darikg !

Re tests failing — is there a chance python interns the object instance? Maybe worth trying with a less primitive object?

@keewis (Collaborator) commented Aug 26, 2020

it seems that's a backwards compatibility issue with numpy: 1.15 fails while 1.19 works. I'm not sure in which version the fix was introduced, though

@pep8speaks commented Aug 26, 2020

Hello @darikg! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-08-27 13:55:19 UTC

@darikg (Contributor, Author) commented Aug 26, 2020

Weird, I can't reproduce that locally, even after reverting to numpy 1.15. Trying with a non-object class

@darikg (Contributor, Author) commented Aug 26, 2020

Oh, sorry, I actually can reproduce it locally.

Turns out arrays in numpy 1.15 don't have __array_function__. So this:

if deep:
    if hasattr(data, "__array_function__") or isinstance(
        data, dask_array_type
    ):
        data = copy.deepcopy(data)
    elif not isinstance(data, PandasIndexAdapter):
        # pandas.Index is immutable
        data = np.array(data)

could just be

if deep and not isinstance(data, PandasIndexAdapter):
    data = copy.deepcopy(data)

But I'm not familiar with __array_function__ or why it's being used here; any thoughts?
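As a rough editorial illustration of what the hasattr check distinguishes (an assumption about intent, not xarray code): the __array_function__ attribute marks NEP 18 duck arrays (e.g. sparse or pint arrays, and plain ndarrays on recent numpy such as 1.19), which should be deep-copied directly rather than coerced through np.array(data):

import numpy as np

# A minimal duck array implementing the NEP 18 protocol (hypothetical class).
class DuckArray:
    def __array_function__(self, func, types, args, kwargs):
        return NotImplemented

print(hasattr(DuckArray(), "__array_function__"))  # True: deep-copied as-is
print(hasattr(np.ones(3), "__array_function__"))   # True on recent numpy, False on 1.15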

@keewis (Collaborator) commented Aug 26, 2020

thanks for tracking this down. The easiest way to fix the test would be to decorate it with

pytest.mark.skipif(not IS_NEP18_ACTIVE, reason="requires NEP18")

Even easier would be to bump numpy, but that doesn't seem to be possible right now.

We might also try to rewrite the condition to:

if (
    hasattr(data, "__array_function__")
    or isinstance(data, dask_array_type)
    or (not IS_NEP18_ACTIVE and isinstance(data, np.ndarray))
):
    data = copy.deepcopy(data)

Thoughts, @pydata/xarray?
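For reference, IS_NEP18_ACTIVE is a flag indicating whether numpy dispatches through the __array_function__ protocol. A minimal sketch of how such a flag can be computed follows; xarray's actual definition lives in its numpy-compatibility module and may differ in detail:

import numpy as np

def _is_nep18_active():
    # If NEP 18 dispatch is active, np.concatenate defers to the
    # __array_function__ implementation below and returns True.
    # Otherwise numpy coerces the checker to a 0-d object array and
    # np.concatenate raises ValueError.
    class _Checker:
        def __array_function__(self, func, types, args, kwargs):
            return True

    try:
        return np.concatenate([_Checker()])
    except ValueError:
        return False

IS_NEP18_ACTIVE = _is_nep18_active()

With such a flag available, the skipif marker above simply disables the new test on environments (like numpy 1.15) where the protocol is unavailable.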

@max-sixty (Collaborator)

This looks good to go! There's one small formatting fix, and then we can merge. Cheers @darikg

@keewis (Collaborator) commented Aug 27, 2020

actually, we should probably merge #4381 to fix the formatting.

@darikg (Contributor, Author) commented Aug 27, 2020

I went with @keewis's last suggestion because I think it makes the most sense for deepcopy to behave as expected even with older versions of numpy

@max-sixty (Collaborator)

Great, let's merge both?

@max-sixty merged commit 4aa7622 into pydata:master on Aug 27, 2020
@max-sixty (Collaborator)

@darikg Thank you for the contribution!

If you want to add a whatsnew entry, feel free to, either as a new PR or as part of any other contributions you're planning at the moment. It's not obligatory, though.

@darikg (Contributor, Author) commented Aug 27, 2020

Thanks for everybody's help!

Also just wanted to say, you guys have an incredible test suite.

Development

Successfully merging this pull request may close these issues.

Surprising deepcopy semantics with dtype='object'