Skip to content

Resetable CancellationTokenSource registrations #4694

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
benaadams opened this issue Nov 19, 2015 · 36 comments
Closed

Resetable CancellationTokenSource registrations #4694

benaadams opened this issue Nov 19, 2015 · 36 comments

Comments

@benaadams
Copy link
Member

CancellationTokenSource is quite a heavyweight object and its not normally cancelled; however it can't be pooled or reused because its registrations cannot be cleared.

Its composed of the following objects:

ManualResetEvent m_kernelEvent
SparselyPopulatedArray<CancellationCallbackInfo>[] m_registeredCallbacksLists
int m_state
int m_threadIDExecutingCallbacks
bool m_disposed
CancellationTokenRegistration [] m_linkingRegistrations
CancellationCallbackInfo m_executingCallback
Timer m_timer

A ClearRegistrations() or Reset() function would be very helpful.

@davidfowl
Copy link
Member

/cc @stephentoub

@stephentoub
Copy link
Member

A ClearRegistrations() or Reset() function would be very helpful.

One of the key aspects of CTS / CancellationToken is that it's one-directional, only ever transitioning from non-canceled to canceled, never the reverse. Lots of code depends on this, otherwise it's extremely prone to race conditions. It also seems particularly dangerous to "clear registrations"... if someone's still registered, why is it safe to ignore those registrations? If you know it's safe, then you're also typically in a position to unregister because you were the one to register, at which point you can choose to pool/reuse the CTS if it's not yet been canceled.

If you have suggestions for making the implementation more lightweight under the existing surface area and wanted to submit PRs for that, we'd be very happy to review. I don't believe we should add the proposed surface area, however.

@benaadams
Copy link
Member Author

The use case wouldn't need to undo the cancel; but remove the registrations, i.e. it is still a one way cancel.

The token is passed on to external code so the owner of the CancellationSource doesn't have access to the CancellationTokenRegistration objects.

Use case:

Connection has a CancellationSource; handles many Requests which invoke user request handling code; when the connection is disconnected or the request is aborted the CancellationSource is triggered to cancel anything registered during the request.

use

However, if the connection hasn't been terminated; at the end of the request all the request and user code registrations need to be dumped as they are now out of scope and the CancellationSource prepared for the next request. The current way to do this is to dispose the CancellationSource and create another one.

In Kestrel's Frame.cs the Reset function disposes the CancellationToken and it is then recreated on the next loop ready for registrations.

The issue with this is that the CancellationSource is one of the highest allocations that happens in the loop.

So am looking for a lower GC pressure way of being able use a CancellationToken for a longer scoped object (Connection); but with a resetting scope for the registrations (Requests).

@stephentoub
Copy link
Member

The issue with this is that the CancellationSource is one of the highest allocations that happens in the loop.

How much of that is the CTS itself, and how much are the allocations associated with the registrations?

at the end of the request all the request and user code registrations need to be dumped as they are now out of scope and the CancellationSource prepared for the next request.

At that point, why is there anything still registered with the CTS? Doesn't that imply someone isn't disposing of their CancellationTokenRegistration?

@benaadams
Copy link
Member Author

How much of that is the CTS itself, and how much are the allocations associated with the registrations?

In the plaintext techEmpower performance benchmark; its all the CTS as there are no registrations.

At that point, why is there anything still registered with the CTS?

Its library code; can't enforce the user of the library not to make mistakes; and try to gracefully handle them rather than continuously build up references for over every request.

However, would you recommendation be to keep using the same CTS and assume that the registrations are correctly disposed?

@stephentoub
Copy link
Member

would you recommendation be to keep using the same CTS and assume that the registrations are correctly disposed?

No, I see what you're saying.

At the same time, I'm struggling with the notion of simply dropping registrations. You've given out a CancellationToken, and code has registered with that CancellationToken. At some point the thing that provided the CancellationToken says "that's it, I'm done"... if anything is still listening, e.g. someone kicked off a cancelable timer or had some pending database request that can now be canceled and so on, shouldn't the CTS be Cancel'd rather than dropped? And once it's canceled, it's transitioned from non-canceled to canceled, and it's no longer reusable.

@benaadams
Copy link
Member Author

Because its not cancelled, it completed. So the CancellationToken is for an out of band situation where it needs to be cancelled mid processing.

I think the standard approach would be for the connection to have its own CancellationTokenSource which is long lived and then use CreateLinkedTokenSource for each request with a new CancellationTokenSource which it can then dispose; but that's just frightening in number of allocations (e.g. when doing 1M+ requests per second)

@stephentoub
Copy link
Member

Because its not cancelled, it completed. So the CancellationToken is for an out of band situation where it needs to be cancelled mid processing.

What's the mechanism by which work I've kicked off gets canceled when the request completes then? Or the assumption is "just don't do that"? This is obviously up to will of whoever is handling out the CT, but to me, it makes perfect sense to say "I'm completing now, my job is done, if anyone out there related to my work is still going, you can cancel now because I'm outta here." So I'm not quite on board with the argument of "because it's not canceled, it completed".

With regards to clearing registrations, let's hypothetically say we decided that was a good idea and it was something we wanted to do. CancellationTokens are a thin wrapper for a CancellationTokenSource, with their only field being a reference back to the CTS. How do you propose that be handled? Someone who was given a CT and polls it will still be referencing the original CTS, which might now be used to represent a completely unrelated piece of work. We could add an Int64 field to the CTS that represents a version number, add an Int64 field to a CT that represents a version number, and say the connection between the two is only valid if those two numbers match, but now both the struct and the class are 8 bytes bigger, and every operation on the CT needs to do additional work to validate that its source is still valid. Did you have something else in mind?

@benaadams
Copy link
Member Author

What's the mechanism by which work I've kicked off gets canceled when the request completes then?

Its a serial pipeline so the request only completes when the work is done; however if the network connection is dropped, so there is nothing to receive the result produced at the end of the pipeline; then the pipeline needs to be stopped (if someone chooses to register for cancellation; or use the cancellation token in someway - which they may not do).

let's hypothetically say ...

Sounds quite good 👍 An extra 8 bytes for the lifetime of the connection is nothing compared to the 72 bytes per request, which for 4k requests is 288000 bytes (below)

cts-allocations

For 2M requests a second its 144 MB per second or 8.64 GB a minute that needs to be GC'd just for the CancellationTokenSource. The push for cutting memory is that GC is currently the main activity of the server (shown below); and that's with a lot of pooling already going on.

gc-percentage

Could have different flavours of CancellationTokenSource, ones that don't have the timer for example; or even a CancellationTokenSourceSlim that doesn't have much but has protected or public properties so it could be implemented in different ways via derived classes.

Though it would need a function that creates the CancellationToken's since its constructor is currently internal and only takes a CancellationTokenSource?

@stephentoub
Copy link
Member

As noted earlier, if the concern is about the size of CTS, I'd be happy to look at proposals for making the implementation better. For example, if we really believe that the most common case in general (and not just on one specific benchmark) is no registrations, then we could consider putting all of the CTS state into a lazy-allocating subobject. Or maybe there are ways of eliminating some fields or consolidating them, etc.

@stephentoub
Copy link
Member

An extra 8 bytes for the lifetime of the connection is nothing compared to the 72 bytes per request, which for 4k requests is 288000 bytes (below)

You're focused on one very specific benchmark. There are many other use cases. And increasing the size of this object by 10% should not be taken lightly when the concern to begin with is its size. I'm also not entirely sure my off-the-cuff comment would even work, e.g.with the synchronization necessary to prevent such a token from registering with its old CTS.

Its a serial pipeline so the request only completes when the work is done

How is the serial nature enforced? You're saying it's not possible in ASP.NET to schedule concurrent operations? Or you're talking about this specific benchmark?

@benaadams
Copy link
Member Author

The CTS isn't particularly big in the scheme of things; its the multiplication of its size that the main problem; because currently the only safe approach it to ditch it each time. Even a single new object (12 bytes?) done on every request is 24 MB/s of garbage.

You're saying is not possible in ASP.NET to schedule concurrent operations?

Obviously its dotnet so if you wanted to spin up a background thread or split of multiple tasks there's nothing stopping you; so you can go quite happily go "off piste"; have background activities pruning a cache, posting stats etc

For the request pipeline specifically, its a pluggable where each component is called, then it calls the next item in the pipeline; so it has access to the before and after state. I believe the expectation is that outside that context the component shouldn't be trying to mess with the request or holding onto references to it.

You could kick something off on the way in and then finish it on the way out; or have multiple components at different parts of the pipeline communicate with each other; by altering the request; for example authorisation and authentication do this. But essentially when the request has been sent out across the network that's when its complete and nothing further can be done with it.

@stephentoub
Copy link
Member

The CTS isn't particularly big in the scheme of things; its the multiplication of its size that the main problem;

How many objects / how much memory is currently allocated per request? What percentage is the CTS? I of course completely understand the goal of minimizing allocations, but the numbers you're citing are difficult to reason about without context.

And do you have a proposal for how to actually make it all work, assuming we did decide to expose such a reuse operation? It would need to be bulletproof with regards to the previously handed out token not being able to poll, register with, etc. the reused CTS, and it would need to not add to the size of the CTS in a meaningful way for scenarios that do create lots of instances in cases where the CTS object is one of the larger contributors to allocations, such as creating a cancelable task.

@benaadams
Copy link
Member Author

How many objects / how much memory is currently allocated per request? What percentage is the CTS?

The in "aspnet/dev" allocations as a ratio look like this (well higher as its currently using
_requestAbortCts = CancellationTokenSource.CreateLinkedTokenSource(_disconnectCts.Token);)

allocations

There are PRs hopefully to take the last two lines to more or less zero; at which point it becomes 36% of this part of pipeline. The other allocator is the input header's value strings (the names/keys don't allocate) - but I believe that's also being looked at.

The streams I'm not sure about; they mostly exist to support the stopped/aborted state for the current request so you can't continue writing to it after; and for replacement, filters, default, encyption etc Though they could probably be combined.

@benaadams
Copy link
Member Author

And do you have a proposal for how to actually make it all work

Not entirely; though a derived CTS might be safer; or multiple derived types, timed, single use, etc and I like your versioning idea; but that would probably change contracts; and I take your point.

where the CTS object is one of the larger contributors to allocations, such as creating a cancelable task.

Needs more thought; would be good if that could also be reduced; obviously increasing the load here would be very bad; as once you've tried async you can't go back... but there is a lot going on in CTS; and though I use them a lot; other than plumming them in I don't think I've ever knowingly cancelled one.

@stephentoub
Copy link
Member

Thanks for pushing on this, @benaadams. It's impressive to see how much you and others have been able to drive down those allocations, and it's good to understand the costs here in perspective.

I will take a look at CTS today to see if any easy wins jump out at me.

@benaadams
Copy link
Member Author

I've been wondering on this and the Registrations should be disposed; so if the common use case was to pass them though to other library code; for example a sql exec task; then that library should be disposing of the registrations; and hopefully if the user was actually registering things - they should know enough to dispose them - at which point the same CTS could be used for the whole connection?

Though I don't know how the aspnet team would feel about it as it is potential bleed though and a memory leak. On the other hand it is a case of not disposing of your disposables....

/cc @davidfowl @halter73 @Tratcher

@Tratcher
Copy link
Member

Since this will be exposed to arbitrary 3rd party components we can't require them to always un-register, it would be un-enforceable and lead to memory leaks. Really it's a per-request token that you are trying to re-use for the next request on the same connection because you know it didn't fire yet and everyone should be done with it.

@stephentoub
Copy link
Member

@benaadams, I just reread what you wrote about sizes, and noticed that you said "of this part of pipeline". I don't know enough about the system to understand what this part of the pipeline is and whether that's actually used on its own, or whether it's never used on its own and this is just a piece of a larger pie that's always used. Can you clarify? How does this impact the entire request/response for example, assuming that's the pipeline being discussed?

@benaadams
Copy link
Member Author

@stephentoub its most of all; was just expressing it wasn't everything so there are extra allocations in the request input for example with header strings which are the highest allocators. I did an experiment in this PR aspnet/KestrelHttpServer#411 which also gives the breakdown of the top 5 types in current aspnet/dev

@benaadams
Copy link
Member Author

Obviously when it comes to user code all bets are off as that will undoubtedly be the highest allocator; but the load the library presents is the maximal possible optimization level they can get to - so its about it having the lightest load possible.

@benaadams
Copy link
Member Author

@stephentoub looking at the whole app/pipeline after a few of the non-conflicting PRs have been merged across 4k requests

cts-allocs

Items 1,2,3 have stuff in the works, taking this CTS to the second most allocated type; from its current 5th place.

@benaadams
Copy link
Member Author

And the previous second of code for allocations now looks like for 4k requests

request-bytes-allocs

@stephentoub
Copy link
Member

How about a different approach, one that lazily-allocates the CTS only when it's actually needed:
stephentoub/KestrelHttpServer@c7d7f0e

I don't know what perf test you're running, so I've not evaluated the perf impact of this and whether it provides the desired help or not. If it does, I can submit this as a PR once your changes go in, as I've branched off of your PR at aspnet/KestrelHttpServer#408.

@benaadams
Copy link
Member Author

Heh, that's a really good idea! Can't see the wood for the trees...

Merged it into the current PR, thank you!

@stephentoub Should this issue be closed?

@stephentoub
Copy link
Member

Glad it was helpful. Yes, let's close this. We can still explore ways to reduce the size of a CTS further (it's now 64 bytes on 64-bit), but I don't see a good way to safely enable the kind of reuse you're describing, and with my commit, it sounds like the targeted use case no longer exists, or at least is significantly diminished.

@benaadams
Copy link
Member Author

Re-run the tests and this completely clears up the allocations; thank you again.

@benaadams
Copy link
Member Author

And I'm getting good sustained throughput

thoughput

  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    11.50ms   20.49ms 405.61ms   93.01%
    Req/Sec    44.57k     8.62k  101.79k    93.77%
  170312239 requests in 2.00m, 20.94GB read
Requests/sec: 1418228.35
Transfer/sec:    178.53MB

1.4M requests per second

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 30, 2020
@roji
Copy link
Member

roji commented Feb 13, 2020

Does it make sense to revisit this (along the lines of ManualResetValueTaskSourceCore)? My specific use case is waiting on a Channel with a timeout, which only seems to be possible via a cancellation token, and I'd like to avoid incurring an allocation each time. The timeout must be applied on every read so I can't see laziness being a solution here (or have I miss some other option?)

@stephentoub
Copy link
Member

stephentoub commented Feb 13, 2020

I'd like to avoid incurring an allocation each time

Can you not just reset the timeout and reuse the same CTS as long as it hasn't timed out / been canceled? An extra allocation when things actually time out should be much less impactful.

@roji
Copy link
Member

roji commented Feb 13, 2020

@stephentoub yeah, that's what I'm doing - but the timeout here is possibly a frequent event, meaning the allocation would still be frequent. Am also wondering more in general, now that we allow reset for ValueTask, if there's a fundamental reason not to allow the same for cancellation tokens (via a different advanced type, with the understanding that it's the user's responsibility to make sure everything is properly managed).

@stephentoub
Copy link
Member

if there's a fundamental reason not to allow the same for cancellation tokens (via a different advanced type, with the understanding that it's the user's responsibility to make sure everything is properly managed

You mean introducing an entirely new type that's not CancellationToken?

@roji
Copy link
Member

roji commented Feb 13, 2020

I was imagining a new type that's not CancellationTokenSource, but that would be able to manage CancellationToken - but I may have a bad mental model of everything here. Otherwise an advanced, appropriately-named method on CancellationTokenSource?

@stephentoub
Copy link
Member

CancellationToken has very clear semantics: once IsCancellationRequested is true, it'll never be false. And implementations rely on that. That was not the case with ValueTask<T>. We introduced IValueTaskSource<T> so soon after the original type that we felt it was acceptable to retcon the behavioral changes... and even with that short gap we've seen it result in fallout we're still dealing with. CancellationToken has been around for way, way longer.

@roji
Copy link
Member

roji commented Feb 13, 2020

Thanks for explaining. A new type that's not CancellationToken indeed seems like a non-starter.

@davidfowl
Copy link
Member

😢 This makes me sad.

Brar added a commit to Brar/npgsql that referenced this issue Oct 5, 2020
Since resetting a timeout on a CancellationTokenSource is a pretty
common thing to do and that'll probably never be supported out of the
box (dotnet/runtime#4694), put the uglyness of
resetting the timeout in one place so that we don't have to repeat it
all the time.
Brar added a commit to npgsql/npgsql that referenced this issue Oct 7, 2020
…3215)

Add a wrapper for CancellationTokenSource with a resettable timeout

Since resetting a timeout on a CancellationTokenSource is a pretty
common thing to do and that'll probably never be supported out of the
box (dotnet/runtime#4694), put the uglyness of
resetting the timeout in one place so that we don't have to repeat it
all the time.
@ghost ghost locked as resolved and limited conversation to collaborators Jan 4, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants