Skip to content
This repository was archived by the owner on Dec 18, 2018. It is now read-only.

Unskip and fix race in ConnectionClosedEvenIfAppSwallowsException #2632

Merged
merged 1 commit into from
May 31, 2018

Conversation

halter73
Copy link
Member

@halter73 halter73 commented May 31, 2018

#2464

This is the same fix @mikeharder made for RequestTimesOutWhenRequestBodyNotReceivedAtSpecifiedMinimumRate with #2589.

Copy link
Contributor

@ryanbrandenburg ryanbrandenburg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question/suggestion. If it works go for it.

systemClock.UtcNow += gracePeriod + TimeSpan.FromSeconds(1);
Assert.True(exceptionSwallowedEvent.Wait(TestConstants.DefaultTimeout), "ExceptionSwallowedEvent timed out.");
await exceptionSwallowedTcs.Task.DefaultTimeout();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are we asserting now? If there's nothing that we should/can assert I generally like to leave a comment noting the exception we're confirming we don't it, or something along those lines. That helps make it clear that we didn't just forget the Assert (which I've seen in our code-base), it was an active choice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DefaultTimeout() is an assertion. This is common in Kestrel functional tests. Another common assertion is awaiting a TCS where SetException is set in the app func. Adding comments everywhere pointing out this is an assertion would be super noisy.

Copy link
Member Author

@halter73 halter73 May 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same goes for Receive[ForcedEnd]. That's also an assertion that we use everywhere. Anyone who works at all with Kestrel functional tests will be familiar with this convention.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool beans.

@ryanbrandenburg
Copy link
Contributor

Just noticed that the CI failed. Looks like Expect100ContinueHonored is having problems where the response is empty still/again.

@halter73
Copy link
Member Author

halter73 commented May 31, 2018

@ryanbrandenburg That's a a problem we see a lot in our macOS libuv functional tests. It's not consistent at all which test fails, but it always surfaces as an empty response.

The test ran twice with the libuv transport. You can see that in the log for the successful test run, the request is handled, a response is written, and only after that does the server close the connection as indicated by the "disconnecting" and "sending FIN" logs.

[0.001s] [TestLifetime] [Information] Starting test "Expect100ContinueHonored-http://127.0.0.1:0" at "2018-05-31T00:28:54"
[0.002s] [Microsoft.AspNetCore.Hosting.Internal.WebHost] [Debug] Hosting starting
[0.014s] [Microsoft.AspNetCore.Hosting.Internal.WebHost] [Debug] Hosting started
[0.014s] [Microsoft.AspNetCore.Hosting.Internal.WebHost] [Debug] Loaded hosting startup assembly "Libuv.FunctionalTests, Version=2.2.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60"
[0.016s] [Microsoft.AspNetCore.Server.Kestrel] [Debug] Connection id ""0HLE6HK9PG7K6"" started.
[0.017s] [Microsoft.AspNetCore.Hosting.Internal.WebHost] [Information] Request starting HTTP/1.1 POST http:///  11
[0.017s] [Microsoft.AspNetCore.Server.Kestrel] [Debug] Connection id ""0HLE6HK9PG7K6"", Request id ""0HLE6HK9PG7K6:00000001"": started reading request body.
[0.018s] [Microsoft.AspNetCore.Server.Kestrel] [Debug] Connection id ""0HLE6HK9PG7K6"", Request id ""0HLE6HK9PG7K6:00000001"": done reading request body.
[0.021s] [Microsoft.AspNetCore.Hosting.Internal.WebHost] [Information] Request finished in 3.967ms 200 
[0.021s] [Microsoft.AspNetCore.Server.Kestrel] [Debug] Connection id ""0HLE6HK9PG7K6"" disconnecting.
[0.021s] [Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv] [Debug] Connection id ""0HLE6HK9PG7K6"" sending FIN.
[0.022s] [Microsoft.AspNetCore.Hosting.Internal.WebHost] [Debug] Hosting shutdown
[0.022s] [Microsoft.AspNetCore.Server.Kestrel] [Debug] Connection id ""0HLE6HK9PG7K6"" stopped.
[0.023s] [TestLifetime] [Information] Finished test "Expect100ContinueHonored-http://127.0.0.1:0" in 0.021416s

In the log for the failed test run the client sends a FIN immediately after opening connection prior to ever sending a request, so the server appropriately closes the connection immediately. This is indicated by the "received FIN" log by the server. This can only be logged if a -4095 status is observed by the libuv transport's read callback. Not only is the client not supposed to send a FIN prior to sending the request (which is obvious) it's also supposed to wait to send a FIN until after it receives a complete response, so there's no way this should be happening.

[0.001s] [TestLifetime] [Information] Starting test "Expect100ContinueHonored-http://127.0.0.1:0" at "2018-05-31T00:28:54"
[0.008s] [Microsoft.AspNetCore.Hosting.Internal.WebHost] [Debug] Hosting starting
[0.010s] [Microsoft.AspNetCore.Hosting.Internal.WebHost] [Debug] Hosting started
[0.010s] [Microsoft.AspNetCore.Hosting.Internal.WebHost] [Debug] Loaded hosting startup assembly "Libuv.FunctionalTests, Version=2.2.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60"
[0.011s] [Microsoft.AspNetCore.Server.Kestrel] [Debug] Connection id ""0HLE6HK9PG7K5"" started.
[0.011s] [Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv] [Debug] Connection id ""0HLE6HK9PG7K5"" received FIN.
[0.011s] [Microsoft.AspNetCore.Server.Kestrel] [Debug] Connection id ""0HLE6HK9PG7K5"" disconnecting.
[0.011s] [Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv] [Debug] Connection id ""0HLE6HK9PG7K5"" sending FIN.
[0.011s] [Microsoft.AspNetCore.Server.Kestrel] [Debug] Connection id ""0HLE6HK9PG7K5"" stopped.
[0.016s] [Microsoft.AspNetCore.Hosting.Internal.WebHost] [Debug] Hosting shutdown
[0.018s] [TestLifetime] [Information] Finished test "Expect100ContinueHonored-http://127.0.0.1:0" in 0.016812s

I would say it has to be a client bug except the client in this case is Kestrel's TestConnection which is written directly on top of a managed Socket. The Socket isn't shutdown or disposed until the failing assertion completes. More evidence that this isn't a client bug comes from the fact that the same tests run with the Socket transport never fail this way, and these tests also don't fail in this way on Windows.

@halter73
Copy link
Member Author

@ryanbrandenburg I created a single issue to track this class of failures at #2635. If you know of any other issues you filed that relates to this behavior, it would be helpful if you could mark those as dupes of #2635 and close them.

@halter73 halter73 merged commit 28dad2b into dev May 31, 2018
@halter73
Copy link
Member Author

Thanks for the quick approval @ryanbrandenburg

@halter73 halter73 deleted the halter73/2464 branch May 31, 2018 22:39
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants