This repository was archived by the owner on Dec 19, 2018. It is now read-only.

Microsoft.AspNet.TestHost.TestServerTests.CancelAborts runs for a long time on CoreCLR on Linux #422

Closed
cesarblum opened this issue Oct 19, 2015 · 13 comments

Comments

@cesarblum
Contributor

Intermittent. I've seen it run in a timely fashion once, but then on a different build it ran for over 60 minutes (at which point the build was aborted).

@cesarblum
Contributor Author

This affects aspnet/Universe#304.

@muratg

muratg commented Oct 19, 2015

@CesarBS Do you know why it hangs? Were you able to debug?

@muratg

muratg commented Nov 20, 2015

@JunTaoLuo Can you take a look at this one?

@JunTaoLuo JunTaoLuo self-assigned this Nov 20, 2015
@muratg muratg added this to the 1.0.0-rc2 milestone Nov 23, 2015
@muratg

muratg commented Nov 25, 2015

@JunTaoLuo This is one of the high pri fundamentals work items that Eilon was talking about. Please have a look.

@JunTaoLuo
Contributor

This no longer seems to be reproducible. I have run the affected tests for over 250 iterations without seeing any hangs on CoreCLR. I'll re-enable these tests.

@JunTaoLuo
Contributor

CoreCLR tests re-enabled on CI. Mono hangs tracked in #507.

@JunTaoLuo
Contributor

This issue popped up on our CI again. I was able to reproduce it on my own VM once I reduced its resources to 1 CPU core / 2 GB RAM. We can temporarily reduce the likelihood of hangs on our CI by using agents with more resources. I suspect our TestHost is hanging due to a deadlock.

@JunTaoLuo
Contributor

Confirmed that this hang reproduces only on single-core agents/VMs, and not when -parallel none is specified for xunit. I will continue the investigation here. We need to figure out whether this is an issue in our TestServer or an issue in CoreCLR.

Meanwhile, the temporary workaround on our CI is to use an agent with two cores. We may also want to add -parallel none for good measure.

@muratg

muratg commented Dec 11, 2015

Thanks @JunTaoLuo! If it turns out to be a CoreCLR issue make sure to file a bug there before closing this one.

@JunTaoLuo
Contributor

Deadlock fix tested on CoreCLR using a single processor Linux VM.

@muratg

muratg commented Dec 14, 2015

@JunTaoLuo Do we know the root cause?

@JunTaoLuo
Contributor

@muratg Yes, I mentioned it in the PR. We were mixing async code in TestServer with blocking code in the test. This causes a deadlock in single-threaded environments where only one thread can be active at a time, for example when there is only one core. For future reference, here is an article that explains it better: http://blog.stephencleary.com/2012/07/dont-block-on-async-code.html.
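
For anyone reading this later, here is a rough sketch of the general pattern, not the actual TestServer code. It is written in Python rather than C# purely so it is easy to run, and every name in it (inner, blocking_caller, run) is made up for illustration. It assumes a worker pool as a stand-in for the "only one active thread" situation described above: a task blocks while waiting for work that needs the same, single worker. A timeout is used so the sketch reports the stall instead of hanging forever.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def inner():
    # Hypothetical stand-in for the async work scheduled by the server side.
    return "done"


def blocking_caller(pool, timeout):
    # Hypothetical stand-in for the test: it schedules work on the same
    # pool and then blocks until that work finishes.
    future = pool.submit(inner)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        # Without the timeout this wait would never return when the pool
        # has a single worker, because that worker is the one waiting.
        return "starved"


def run(workers, timeout=0.2):
    # Run the blocking caller on a pool with the given number of workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return pool.submit(blocking_caller, pool, timeout).result()


if __name__ == "__main__":
    print("1 worker: ", run(1))
    print("2 workers:", run(2))
```

With one worker the blocked caller occupies the only thread, so the inner work cannot start until the caller gives up. With two workers the inner work gets its own thread and completes, which is in line with the two-core workaround mentioned earlier in this thread. The usual fix is to avoid blocking on the pending work in the first place.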

@muratg

muratg commented Dec 14, 2015

Cool, thanks! 👌


3 participants