Skip to content

.Net Core 2.0 + 2.2 High Memory Usage On Startup #10515

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
AdamRiddick opened this issue May 24, 2019 · 22 comments
Closed

.Net Core 2.0 + 2.2 High Memory Usage On Startup #10515

AdamRiddick opened this issue May 24, 2019 · 22 comments
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions
Milestone

Comments

@AdamRiddick
Copy link

AdamRiddick commented May 24, 2019

Describe the issue

When spinning up an .Net Core Application after an application pool recycle in IIS, the process seems to go on a few memory grabbing runs but doesn't appear to be using it - This doesn't look like a memory leak as the memory is always reclaimed.

For example, the memory usage will cycle upwards to around 3-4GB, back down to about 400mb, then up and down between these a handful more times before settling down to ~200mb. The application starts as normal through this process, and can be accessed and used as normal.

Normal usage averages at around 200mb.

We've managed to take a dump at it's peak;

image

To Reproduce

  • .Net Core 2.0
  • .Net Core Runtime 2.0.7
  • Windows Server 2016
  • Workstation GC Mode

We haven't been able to reproduce this outside of our production server environments.

Is there a way we can determine what is causing this high memory usage during startup?

Edit
I've also been able to reproduce this with;

  • .Net Core 2.2
  • .Net Core Runtime 2.2.4
  • Windows Server 2016
  • Workstation GC Mode
@AdamRiddick AdamRiddick changed the title .Net Core 2.0 High Memory Usage On Startup .Net Core 2.0 + 2.2 High Memory Usage On Startup May 24, 2019
@analogrelay
Copy link
Contributor

We've managed to take a dump at it's peak;

Would you be willing to share the dump itself privately with us so we can examine it further?

@AdamRiddick
Copy link
Author

Yeah we can do that. How’s best to get that to you, and do you have a preferred zip format?

@analogrelay
Copy link
Contributor

analogrelay commented May 25, 2019

You can send it to me at [my github username]@microsoft.com, though if it's large (and since it's a memory dump I'd imagine it is :)) you'll probably need to find a file sharing service (OneDrive, DropBox, Azure Blob, etc.) to host it. I should be fine with most formats, as long as 7-zip can open it ;).

@AdamRiddick
Copy link
Author

Thanks, I've send a share link over on OneDrive :)

@AdamRiddick
Copy link
Author

I'm not sure if its related but during general operation we are seeing some occurrences ofOutOfMemoryException being thrown on servers with multiple instances and there isn't a common stack trace among them.

@analogrelay analogrelay added this to the Discussions milestone May 28, 2019
@AdamRiddick
Copy link
Author

@anurse
We've been looking at this further and we currently believe this is related to #2737.

Using the recommended workaround on that issue, to switch the ServiceProviderMode to Runtime appears to have resolved this issue in our Staging environment.

As there is no documentation on ServiceProviderMode (that I can find ...) can you offer some guidance on the difference between the two modes?

I can see that the default Dynamic mode is compiled whereas runtime is resolved at request, am I correct in thinking we should only be expecting a small performance hit by switching mode until a fix is released?

@analogrelay
Copy link
Contributor

You'd have to measure in your scenario to know for sure. Here's a brief summary of the modes:

  • Runtime means that every time a service needs to be created, we basically use Activator.CreateInstance to do so. For singletons, this has minimal impact since they are created once per app. For transient and scoped services, reflection is much slower than directly calling new, so it has a noticible impact
  • Dynamic means that the first time a service needs to be created, we compile IL into a new DynamicMethod that will actually use new (or rather than newobj IL opcode ;)) to create an instance of the type. We then store that delegate and use it to construct instances each time. It's still a little slower than directly calling new but it's much faster than constructing the instance using reflection each time.

So I would expect some performance impact from switching modes.

@AdamRiddick
Copy link
Author

Hi Andrew,

I can confirm this was the route cause of the high CPU on startup & the OutOfMemoryException's we were seeing throughout the day when the application was starting up again.

We are now averaging a slightly higher CPU usage, but thankfully it's nothing Drastic.

We'll keep an eye on #2737 relating to this - Thanks.

@analogrelay
Copy link
Contributor

Using the Dynamic mode was the root cause? Do you have a large number of DI services? I believe you can query the count of the IServiceCollection at the end of your ConfigureServices to check this.

The fact that switching dramatically reduces CPU/Memory load indicates a very large number of DI services, all of which are used at startup, meaning we have to compile a lot of these dynamic methods.

@gregpakes
Copy link

gregpakes commented Jun 18, 2019

@anurse

I’m going to reply on behalf of @AdamRiddick (same company). I believe Adam is on leave.

No we do not have a large number of DI services. The application is quite small. It is a custom version of Identity Server 4.

The more we look at it it looks like a bug/issue with the dynamic mode.

Did you manage to look at the dump Adam sent? That shows the issue quite well.

@analogrelay
Copy link
Contributor

analogrelay commented Jun 18, 2019

Interesting. I did look at the dump and responded indicating that it seemed to be missing data (my bad, this message didn't get through). I can try to take another look.

@analogrelay
Copy link
Contributor

Yeah, I double checked the dump and it looks like it may be after the memory spike is cleaned up. For example, I see the total GC heap size reported as only about ~130MB:

0:000> !EEHeap -gc
Number of GC Heaps: 1
generation 0 starts at 0x0000016886f93a60
generation 1 starts at 0x0000016886aeea88
generation 2 starts at 0x0000016880001000
ephemeral segment allocation context: none
         segment             begin         allocated              size
0000016880000000  0000016880001000  00000168873e45d0  0x73e35d0(121517520)
Large object heap starts at 0x0000016890001000
         segment             begin         allocated              size
0000016890000000  0000016890001000  00000168915498b0  0x15488b0(22317232)
Total Size:              Size: 0x892be80 (143834752) bytes.
------------------------------
GC Heap Size:            Size: 0x892be80 (143834752) bytes.

I do see 4GB of Private Bytes recorded in the dump (the MEM_PRIVATE category in your results above), but it appears the dump occurred after a garbage collection but before the memory was returned to the OS so I'm not really able to see what is allocating the large amounts of memory.

Did you find more relevant data in the dump? That might help point me in the right direction. Is it possible for you to share the application code at all?

@gregpakes
Copy link

@anurse

Ah ok - I'll go through the dump again and try and locate what I found interesting. But we saw 15000 items in a collection that I believe was related to the container and dynamic mode. I will dig it up.

If we connect privately, we can share application code.

I am in Sydney, Adam is in UK. So hopefully there should be crossover where we can jump on a call or something.

@gregpakes
Copy link

@anurse

Here is what I found in the dump file. To be honest, now that I look at it, I really don't know whether this is normal or not, but I thought I would enclose it anyway.

image

@analogrelay
Copy link
Contributor

Feel free to send me an email at [my github username]@microsoft.com. I'm happy to do what I can to help. We've got quite a bit going on with the upcoming 3.0 release, but I'll take a look where I can. If you have a time-sensitive need for support, Microsoft Support might be more appropriate.

Feel free to share additional data with me there. Probably the code I'm most interested in is where you register your services.

The TrueReadOnlyCollection<Expression> does come from the System.Linq.Expressions code which is what we use to generate DI factory methods, so it does seem somewhat relevant. Thanks for pointing me in this direction.

@gregpakes
Copy link

@anurse

Email sent :-)

@gregpakes
Copy link

gregpakes commented Jun 28, 2019

I just wanted to provide an update here.

@anurse has been very helpful over email, but I am still a little stuck without knowing the ins and outs of the framework. So hopefully someone else can point me in the right direction.

  • When we use the default ServiceProviderMode (Dynamic), we see two memory spikes on the first load of the razor views. NB. Not on application start
  • When we set the ServiceProviderMode to Runtime, we do not see the memory spikes. Standard memory tools such as dotMemory and Ants Memory Profiler have not been helpful as the memory appears to be unmanaged so these tools do not really show it.

Here is a standard memory trace from both dotMemory and ANTS.

ants

dotmemory

The two spikes are very visible.

The most success I have had is using DebugDiag which said the following:

\clrjit.dll is responsible for 797.62 MBytes worth of outstanding allocations.  The following are the top 2 memory consuming functions:

clrjit!ArenaAllocator::allocateNewPage+93:  797.62 MBytes worth of outstanding allocations.

The report is here:

dotnet.exe__PID__23136__Date__06_27_2019__Time_02_46_41PM__22__Leak Dump - Virtual Bytes_MultipleRules.mht.zip

@anurse has said that this finding is in line with his thoughts that the dynamic code-gen process (Dynamic mode) is responsible for this. He said that in the memory dump, there were 16000 instances of the linq query to generate one particular class (DBContext).

My questions are:

  1. Why would there be 16000 instances of the linq to generate a single class? I don't know where to start looking into this.
  2. Why would clrjit.dll be responsible for so many allocations?

@analogrelay
Copy link
Contributor

2. Why would clrjit.dll be responsible for so many allocations?

This would make sense if there are a lot of code-generation paths going on.

  1. Why would there be 16000 instances of the linq to generate a single class? I don't know where to start looking into this.

This is likely the root cause of 2. I'm not sure where to start yet either but it's something we can continue to dive in to (cc @davidfowl)

@davidfowl
Copy link
Member

I wonder if something is wrong with the EF configuration causing it to do weirdness (like making a bunch of internal service providers).
cc @ajcvickers @roji

@ajcvickers
Copy link
Contributor

@davidfowl What does the code that registers EF in D.I. look like?

@analogrelay
Copy link
Contributor

@AdamRiddick and @gregpakes shared some of the code with me offline, I can loop you both in on the thread.

@ghost
Copy link

ghost commented Nov 12, 2020

Thank you for contacting us. Due to a lack of activity on this discussion issue we're closing it in an effort to keep our backlog clean. If you believe there is a concern related to the ASP.NET Core framework, which hasn't been addressed yet, please file a new issue.

This issue will be locked after 30 more days of inactivity. If you still wish to discuss this subject after then, please create a new issue!

@ghost ghost closed this as completed Nov 12, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 12, 2020
@amcasey amcasey added area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions and removed area-runtime labels Aug 24, 2023
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions
Projects
None yet
Development

No branches or pull requests

7 participants