[NO MRG] Test CI #952
Conversation
Am seeing the same failure with commit ( e15001b ) on

Edit: Should add this seems potentially related as tests are hanging in

Edit 2: Fortunately the changes in

Edit 3: Still seeing issues with 0.17.2, which came out a little over a week back. Trying 0.17.1.
We seem to be experiencing hangs in `test_sync.py` with this version, so exclude it from testing.
Co-authored-by: Josh Moore <[email protected]>
Here's a dependency diff between CI for commit ( 21739ad )'s run and commit ( e15001b )'s run, where the issue first appeared.

Diff:

```diff
27c27
< coverage==6.2
---
> coverage==6.3
38c38
< fasteners==0.17.2
---
> fasteners==0.17.3
62c62
< jsondiff==1.3.0
---
> jsondiff==1.3.1
68c68
< jupyter-client==7.1.1
---
> jupyter-client==7.1.2
78c78
< moto==2.3.2
---
> moto==3.0.1
82c82
< multidict==5.2.0
---
> multidict==6.0.2
90c90
< notebook==6.4.7
---
> notebook==6.4.8
103c103
< prometheus-client==0.12.0
---
> prometheus-client==0.13.0
107c107
< pure-eval==0.2.1
---
> pure-eval==0.2.2
115c115
< pyparsing==3.0.6
---
> pyparsing==3.0.7
```
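A diff in this `NcN`/`<`/`---`/`>` shape is what GNU `diff`'s default output looks like when comparing two `pip freeze` snapshots, one per CI run. A minimal sketch of reproducing such a comparison, using abbreviated, fabricated file contents (the real snapshots would come from each CI run's environment):

```shell
# Fabricated two-line snapshots standing in for full `pip freeze` output
printf 'coverage==6.2\nfasteners==0.17.2\n' > before.txt
printf 'coverage==6.3\nfasteners==0.17.3\n' > after.txt

# diff exits 1 when files differ, so `|| true` keeps a `set -e` shell alive
diff before.txt after.txt > deps.diff || true
cat deps.diff
```

In a real workflow, `before.txt` would be captured on the last-good commit's run and `after.txt` on the first failing one.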
Spotted issue ( harlowja/fasteners#86 ), which may also be related.
Looks like it was still hanging even with 0.16. Going to try running with more verbosity.
Canceled the tests and looked at the tests run. They were definitely using the
@joshmoore, it would be good to get your thoughts here on what we should do next. Should we debug further? Should we xfail the
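Besides xfail-ing, the other mitigation this PR ends up using is a hard per-test timeout (the commit log below mentions adding pytest's `--timeout` argument, which the pytest-timeout plugin provides). A rough stdlib sketch of that idea, using a hypothetical `Timeout` context manager built on `signal.alarm` (POSIX-only, main-thread-only; not what pytest-timeout actually does internally):

```python
import signal


class Timeout:
    """Hypothetical helper: abort a block that runs longer than `seconds`."""

    def __init__(self, seconds):
        self.seconds = seconds

    def _raise(self, signum, frame):
        raise TimeoutError(f"block exceeded {self.seconds}s")

    def __enter__(self):
        # Arrange for SIGALRM to fire after `seconds` and raise in this thread
        signal.signal(signal.SIGALRM, self._raise)
        signal.alarm(self.seconds)
        return self

    def __exit__(self, *exc):
        # Cancel any pending alarm whether or not the block finished in time
        signal.alarm(0)
        return False
```

A hung test wrapped this way raises `TimeoutError` instead of stalling the whole CI job.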
Hmmmm..... so we're sure it's just a change in fasteners and not (also) a change in the mainline that's leading to this? It would definitely be good to get 2.11 out. We could move the synchronizer tests out to their own workflow (which only gets run periodically?)
To be completely honest, I'm not sure. At the moment I'm torn between whether a new copy of

That said, CI on Davis' change ( 831e687 ) went in a couple months back (October) and issues only cropped up recently. Think we would have seen issues sooner if that was the cause.

All of the testing here is without the recent empty chunks change and it still has issues. So doesn't seem like that's the cause.

Finally, where the tests slow down is when testing

To summarize, there isn't an obvious cause outside of
Currently trying to reproduce locally with:
Hi, fasteners maintainer here. Saw a note in our issue about the deadlock and got interested. I'm puzzled by this thing: looking at the commit log, I see that your tests passed 8 days ago with fasteners 0.17.2, but didn't 2 days ago with fasteners 0.17.3. Which is weird, as you are using ProcessLock, which was not touched in 0.17.3.

Maybe it would be helpful to rerun your CI on the commit 21739ad (the one that was fine 8 days ago) and see if it still passes?
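For context on the lock being discussed: an interprocess lock of this kind is typically built on OS-level file locks. A minimal sketch of that pattern, assuming a simplified POSIX-only implementation via `fcntl.flock` (a hypothetical `SimpleFileLock`, not fasteners' actual code):

```python
import fcntl
import os


class SimpleFileLock:
    """Hypothetical sketch of a file-based interprocess lock."""

    def __init__(self, path):
        self.path = path
        self.fd = None

    def acquire(self):
        # Open (creating if needed) the lock file, then take an
        # exclusive advisory lock; blocks if another process holds it
        self.fd = os.open(self.path, os.O_CREAT | os.O_RDWR)
        fcntl.flock(self.fd, fcntl.LOCK_EX)

    def release(self):
        # Drop the lock and close the descriptor; the file itself remains
        fcntl.flock(self.fd, fcntl.LOCK_UN)
        os.close(self.fd)
        self.fd = None
```

Because `LOCK_EX` blocks indefinitely by default, a second process contending for the same file with no timeout is exactly the shape of hang being debugged here.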
Thanks, @psarka. Happy to give it a try. The previous action run was: https://github.com/zarr-developers/zarr-python/runs/4878605616?check_suite_focus=true

I've just re-launched. Let's see what happens.
Looks like it is hanging 🤯 Is there any other explanation, other than that something changed with GitHub Actions infrastructure? I'll rerun fasteners CI just in case.

Edit: fasteners are passing fine. I'm out of ideas.
Thanks for taking a look Paulius! 😄 Yeah the hang seems to be occurring with the

That said, we probably need to turn this into a smaller reproducer so it is a bit easier to dig into.
I got sidetracked by trying to reproduce locally with https://github.com/nektos/act (unsuccessfully). I will give up on that and try to find a way forward.
Plan discussed with @jakirkham:
Linux & Linux/docker don't show a hang locally when following the python-package.yml workflow instructions. Currently running with
Stranger still, we saw PR ( #955 ) pass. Trying to see if we can replicate that elsewhere. If so, maybe it was a GitHub Actions change?
Maybe it was a temporary issue in our segment of the GHA cloud?! (If so, a reminder for next time: contact GH earlier.)
Checking here as well ( #950 ), just to make sure this is consistently working.
Looks like things are working 🎉 Sorry for the false alarm @psarka, and thanks for the help 🙂 Going to go ahead and close.
* Set write_empty_chunks to default to False
* Add release entry for write_empty_chunks default
* add Empty chunks section to tutorial.rst
* add benchmarky example
* proper formatting of code block
* Fix abstore deprecated strings
* Also catch ValueError in all_equal

  The call to `np.any(array)` in zarr.util.all_equal triggers the following ValueError:

  ```
  > return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
  E ValueError: invalid literal for int() with base 10: 'baz'
  ```

  Extending the catch block allows test_array_with_categorize_filter to pass, but it's unclear if this points to a deeper issue.

* Add --timeout argument to all uses of pytest
* Pin fasteners to 0.16.3 (see #952)

Co-authored-by: Davis Vann Bennett <[email protected]>
Co-authored-by: Josh Moore <[email protected]>
Co-authored-by: jmoore <[email protected]>
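The all_equal fix in the commit list above amounts to treating a comparison that raises as "not all equal". A rough sketch of that idea, using a hypothetical `all_equal_sketch` (an assumption for illustration, not zarr's actual `zarr.util.all_equal`):

```python
import numpy as np


def all_equal_sketch(value, array):
    """Return True if every element of `array` equals `value`."""
    if array.size == 0:
        return False
    try:
        return bool(np.all(array == value))
    except (TypeError, ValueError):
        # Object arrays with mixed contents can raise mid-reduction,
        # e.g. ValueError: invalid literal for int() with base 10: 'baz';
        # treat any such failure as "not all equal" rather than crashing
        return False
```

The design question the commit message raises remains: swallowing `ValueError` here keeps the filter test passing, but a broad `except` can also mask a genuinely malformed input.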
Opening just to test the current state of CI