Skip to content

Ensure Http Telemetry correctness #40338

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Aug 11, 2020

Conversation

MihaZupan
Copy link
Member

@MihaZupan MihaZupan commented Aug 4, 2020

The Http Telemetry implementation from #37619 has a few problems:

  • Checking Telemetry.IsEnabled on start and stop opens up a race condition where any request that is in-flight when EventSource is enabled will log a Stop event without a corresponding Start. This leads to the Current requests counter always under-reporting the number of requests by an offset, even going into negative numbers. This is almost impossible not to hit unless Telemetry is started before any request is made, effectively making the dotnet-counters tool unusable for this Telemetry.
  • The Stop event is not guaranteed to run every time (for example if the response stream isn't read till the end). This shows up as a perpetually increasing Current requests counter.

This PR solves these issues by keeping the telemetry state on the HttpRequestMessage object.
This way we only call Stop if we called Start.

This PR is an alternative fix over #38876, avoiding using a new finalizable object to track each request when Telemetry is enabled.

Instead, this PR removes an optimization for HttpConnection where we avoid using a finalizer if maxConnections is int.MaxValue. With this change, we always use a finalizable HttpConnection.

Contributes to #37428

Note: It might seem that the PR is removing calls to Telemetry - it is moving those calls that were always before a CompleteResponse/Dispose to those methods instead.

@MihaZupan MihaZupan added this to the 5.0.0 milestone Aug 4, 2020
@MihaZupan MihaZupan requested review from alnikola and a team August 4, 2020 20:24
@ghost
Copy link

ghost commented Aug 4, 2020

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

@aik-jahoda aik-jahoda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, needs look someone familiar with telemetry

Copy link
Contributor

@alnikola alnikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, this PR removes an optimization for HttpConnectionPool where we avoid using a finalizer if maxConnections is int.MaxValue. With this change, we always use a finalizable HttpConnectionPool.

To my knowledge, unlimited number of HTTP 1.1 connections is the most common case, thus I'm concerned with this change will decrease performance for the most of the users.

@MihaZupan
Copy link
Member Author

MihaZupan commented Aug 7, 2020

It will reduce perf, but not by much.
A finalizer at some level is required, having it on the HttpConnectionPool should impact performance the least.

A quick benchmark of the overhead of having a suppressed finalizer yields:

Method Mean
TestNoFinalizer 2.589 ns
TestFinalizer 440.708 ns
TestFinalizerSupressed 54.758 ns

We are talking about a difference in the area of ~50 ns per HttpConnection. I don't think this is a significant performance hit in the context of creating connections.

For comparison, every Http2ReadStream (not connection) has a finalizer.
The added SuppressFinalize call there should result in an order of magnitude higher difference for every stream.

Do you have a suggestion for an alternative approach that would avoid the finalizer overhead?

@MihaZupan MihaZupan requested a review from alnikola August 7, 2020 10:28
internal void MarkAsTrackedByTelemetry()
{
Debug.Assert(_sendStatus != MessageShouldEmitTelemetry);
_sendStatus = MessageShouldEmitTelemetry;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this going to overwrite the MessageAlreadySent set as part of MarkAsSent?

Copy link
Member

@stephentoub stephentoub Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, you consider both MessageAlreadySent and MessageShouldEmitTelemetry to mean already sent. That naming is strange to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would

private const int MessageNotYetSent = 0;
private const int MessageAlreadySent = 1;
-private const int MessageShouldEmitTelemetry = 2;
+private const int MessageAlreadySent_StopNotYetCalled = 2;

be more clear?


internal void OnStopped(bool aborted = false)
{
if (_sendStatus == MessageShouldEmitTelemetry && Interlocked.Exchange(ref _sendStatus, MessageAlreadySent) == MessageShouldEmitTelemetry)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this exchange necessary?

@stephentoub
Copy link
Member

cc: @geoffkizer

Copy link
Contributor

@alnikola alnikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, if it's really 50 ns per connection, that's fine. Let's proceed with this PR after all other comments were resolved.

@alnikola alnikola dismissed their stale review August 7, 2020 14:21

Ok, if it's really 50 ns per connection, that's fine. Let's proceed with this PR after all other comments were resolved.

@MihaZupan
Copy link
Member Author

/azp run runtime-libraries outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MihaZupan
Copy link
Member Author

/azp run runtime-libraries outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MihaZupan MihaZupan force-pushed the telemetry-http-counters2 branch from e7f0589 to 902a13d Compare August 9, 2020 19:13
@MihaZupan
Copy link
Member Author

/azp run runtime-libraries outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MihaZupan
Copy link
Member Author

/azp run runtime-libraries outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MihaZupan
Copy link
Member Author

/azp run runtime-libraries outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MihaZupan
Copy link
Member Author

Test failures: #40606, #40115

@MihaZupan MihaZupan merged commit 7600720 into dotnet:master Aug 11, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 7, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants