Skip to content

Running coverlet on a sln results in fatal exception #572

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
queen-of-code opened this issue Sep 27, 2019 · 14 comments
Closed

Running coverlet on a sln results in fatal exception #572

queen-of-code opened this issue Sep 27, 2019 · 14 comments
Labels
needs more info More details are needed

Comments

@queen-of-code
Copy link

This is a new regression in dotnet core 3 - we did not encounter this issue using coverlet collectors in 2.2.401. I have something like 40 test csproj files in mycode.sln.

When I run dotnet test mycode.sln --collect:"XPlat Code Coverage"

I inevitably at some point in the run will get


[22:13:31][Step 6/10] Data collector 'XPlat code coverage' message: Coverlet.Collector.Utilities.CoverletDataCollectorException: CoverletCoverageDataCollector: Failed to get coverage result
[22:13:31][Step 6/10]  ---> System.IO.IOException: The process cannot access the file '/opt/teamcity/temp/buildTmp/MP_2ccfd88a-3ed0-4316-9bdc-3213e6db0213' because it is being used by another process.
[22:13:31][Step 6/10]    at System.IO.FileStream.Init(FileMode mode, FileShare share, String originalPath)
[22:13:31][Step 6/10]    at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
[22:13:31][Step 6/10]    at Coverlet.Core.Coverage.CalculateCoverage()
[22:13:31][Step 6/10]    at Coverlet.Core.Coverage.GetCoverageResult()
[22:13:31][Step 6/10]    at Coverlet.Collector.DataCollection.CoverageWrapper.GetCoverageResult(Coverage coverage)
[22:13:31][Step 6/10]    at Coverlet.Collector.DataCollection.CoverageManager.GetCoverageResult()
[22:13:31][Step 6/10]    --- End of inner exception stack trace ---
[22:13:31][Step 6/10]    at Coverlet.Collector.DataCollection.CoverageManager.GetCoverageResult()
[22:13:31][Step 6/10]    at Coverlet.Collector.DataCollection.CoverageManager.GetCoverageReport()
[22:13:31][Step 6/10]    at Coverlet.Collector.DataCollection.CoverletCoverageCollector.OnSessionEnd(Object sender, SessionEndEventArgs e).

Pretty clearly it's contending over the same tmp file for coverage info. FWIW, this is much more likely to happen when I am running directly from the command line vs using a .runsettings file - though I believe it has happened in either case.

@MarcoRossignoli MarcoRossignoli added the duplicate This issue or pull request already exists label Sep 28, 2019
@MarcoRossignoli
Copy link
Collaborator

MarcoRossignoli commented Sep 28, 2019

Run coverage on sln is not supported at the moment #357
If you're sharing modules(dll) and tests overlap there is a chance that different process try to read/write to same hit file at the same time...and it's randomic.
It's likely a different run sequence due to new version(of runtime)...can you try with #491 (comment) or run tests in sequence through csproj, only to understand if the issue repro also with serialized tests.
I'll to explain my doubt
TL;DR;
Coverlet instrumet dll injecting a static class. That class has a built-in hit file name(one tmp file for every dll). So if for some reason the "same" dll/hit filename is used in different "test session" there is a chance that when on test ends and we're collecting result(file read) another test session ends and try to write result to the same file.

@MarcoRossignoli
Copy link
Collaborator

MarcoRossignoli commented Sep 28, 2019

Ah if you can repro and attach would be great!I never repro this cases on my local.

@queen-of-code
Copy link
Author

It's absolutely possible that two test sessions writing in parallel have a shared DLL between them that they are reporting coverage on. I'll try with -m:1 and report back.

@queen-of-code
Copy link
Author

queen-of-code commented Oct 3, 2019

Looks like doing m:1 has NOT fixed it, with the added downside that it took our test run went from 7 minutes to closer to 23 minutes. If the naming on that temporary file is any indication, then it's a large common base-library DLL used by nearly all of the application packages in some way or another. So it's likely that it's contending quite a lot.

@MarcoRossignoli
Copy link
Collaborator

with the added downside that it took our test run went from 7 minutes to closer to 23 minutes.

This is expected...m1 block msbuild to run on only one node.

Looks like doing m:1 has NOT fixed it

Can you try to run coverage for every test project?i.e.

dotnet test mytest\mytest.csproj --collect:"XPlat Code Coverage"
dotnet test mytest2\mytest2.csproj --collect:"XPlat Code Coverage"
dotnet test mytest3\mytest3.csproj --collect:"XPlat Code Coverage"

I want to understand if there is a workaround to avoid dll sharing, I mean every test rebuild and use a local copy of lib and run in custom process, if works at the end we can merge result https://github.com/tonerdo/coverlet/blob/master/Documentation/Examples.md

I'm aware that is not a great solution and need some pipeline updates.

@MarcoRossignoli
Copy link
Collaborator

MarcoRossignoli commented Oct 3, 2019

@queen-of-code I'm trying to repro your issue...I don't know if it's possible but are you able to generate a repro(with emtpy project) but with the structure that lead to your issue?

@MarcoRossignoli
Copy link
Collaborator

MarcoRossignoli commented Oct 3, 2019

@queen-of-code I did some check and I could try to give an alfa version of collectors package to test my idea.
If I provide a nuget package are you able to test?Do you have a private nuget server or I could give you address to my own public test nuget server to use.

@queen-of-code
Copy link
Author

If you give me a test package on a public nuget server, I can test with it no problem.

@MarcoRossignoli
Copy link
Collaborator

MarcoRossignoli commented Oct 9, 2019

Try with this nuget.config on root sln

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <clear />
    <add key="coverlet-test" value="https://nugetservercoverlet.azurewebsites.net/nuget/" /> <-- NB IT'S UNOFFICIAL USE ONLY FOR THIS TEST
    <add key="nuget" value="https://api.nuget.org/v3/index.json" />
  </packageSources>
</configuration>

Directory.Build.targets file on root sln folder

<Project>
  <ItemGroup>
    <PackageReference Update="coverlet.collector" Version="1.1.9-g42c952b491" />
  </ItemGroup>
</Project>

runsettings(bug vstest plat workaround) #576

<?xml version="1.0" encoding="utf-8" ?>
<RunSettings>
  <DataCollectionRunSettings>
    <DataCollectors>
      <DataCollector friendlyName="XPlat code coverage">
        <Configuration>
          <Format>cobertura</Format>
        </Configuration>
      </DataCollector>
    </DataCollectors>
  </DataCollectionRunSettings>
  <InProcDataCollectionRunSettings>
    <InProcDataCollectors>
      <InProcDataCollector assemblyQualifiedName="Coverlet.Collector.DataCollection.CoverletInProcDataCollector, coverlet.collector, Version=1.1.0.0, Culture=neutral, PublicKeyToken=null"
                     friendlyName="XPlat Code Coverage"
                     enabled="True"
                     codebase="coverlet.collector.dll" />
    </InProcDataCollectors>
  </InProcDataCollectionRunSettings>
</RunSettings>

cspoj sample

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>netcoreapp3.0</TargetFramework>

    <IsPackable>false</IsPackable>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.NET.Test.Sdk" Version="16.2.0" />
    <PackageReference Include="xunit" Version="2.4.0" />
    <PackageReference Include="xunit.runner.visualstudio" Version="2.4.0" />
    <PackageReference Include="coverlet.collector" />
  </ItemGroup>

  <ItemGroup>
    <ProjectReference Include="..\lib1\lib1.csproj" />
    <ProjectReference Include="..\lib2\lib2.csproj" />
    <ProjectReference Include="..\lib3\lib3.csproj" />
  </ItemGroup>

</Project>

Enable also logs for
collectors https://github.com/tonerdo/coverlet/blob/master/Documentation/Troubleshooting.md#collectors-integration
and trackers https://github.com/tonerdo/coverlet/blob/master/Documentation/Troubleshooting.md#enable-injected-tracker-log
to collect data

@MarcoRossignoli
Copy link
Collaborator

@queen-of-code any news?

@reverseark
Copy link

Just to report we hit exactly the same issue and search engine leads me here. Version of collector package is 1.2.0. It is not an easy repro though, probably once every 20-30 runs.

I read through this thread. The MaxCpuCount not respected by dotnet test issue has been well discussed here.

I also saw the description about what is happening behind the scene with. It seems this kind of issue will happen by nature, unless this design is changed "one tmp file for every dll".

Thanks!

@MarcoRossignoli
Copy link
Collaborator

MarcoRossignoli commented Jan 10, 2020

I think that issue is referred to fact that max cpu count is ignored if used inside runsettings, that is the artifact that vstest load and use to run tests, -m:1 options must be passed to msbuild so out of runsettings, you can try our sample and see that in one case test are parallel and in the other case is sequential https://github.com/tonerdo/coverlet/blob/master/Documentation/Examples/MSBuild/MergeWith/HowTo.md

Sequential

λ dotnet test /p:CollectCoverage=true  /p:CoverletOutput=../CoverageResults/ /p:MergeWith="../CoverageResults/coverage.json" /p:CoverletOutputFormat=\"opencover,json\" -m:1
or simply
λ dotnet test -m:1

Parallel

λ dotnet test /p:CollectCoverage=true  /p:CoverletOutput=../CoverageResults/ /p:MergeWith="../CoverageResults/coverage.json" /p:CoverletOutputFormat=\"opencover,json\"
or simply
λ dotnet test

@MarcoRossignoli
Copy link
Collaborator

MarcoRossignoli commented Jan 10, 2020

@reverseark can you try the same sample of #572 (comment) and use a tmp patched version of coverlet?
My suspect is that in some case the issue with big solution with a lot of projects there is a race between process host closing and out of process collectors that read file, I did a tmp patch #572 (comment) and loaded in my personal nuget tmp feed server.

TL;DR;

I try to explain better:

Hits file(file where we put coverage data a simple array of int every position is a line or a branch and we count how many time we hit that position) is flushed to disk in 2 cases

  1. AppDomain.CurrentDomain.ProcessExit
  2. AppDomain.CurrentDomain.DomainUnload (because we support also .NET Framework that can have more than one app domain)

However there is an issue with vstest https://github.com/tonerdo/coverlet/blob/master/Documentation/KnowIssues.md#1-vstest-stops-process-execution-earlydotnet-test that after some time(after test ends) kill host process and can happen in the middle of process exit.

This lead us to integrate with vstest with collector where there is a protocol where the collector signal if his works is end and so vstest doesn't kill the process in the middle.

By the way the host is subscribed yet to process exit(we don't have a custom code for "loaded by collector") so it's possible that out of process collectors are reading hit file and at the same time host test process close and re-flush hit file, and this is randomic and related to the overall condition.

When you run dotnet test... the workflow is:

  1. .net core msbuild start a process for every csproj in solution(parallel or sequential if you use -m:1
  2. the ball pass to vstest that start a dotnet vstest.console.dll https://github.com/microsoft/vstest/blob/master/src/Microsoft.TestPlatform.Build/Tasks/VSTestTask.cs#L20
  3. vstest plat int turn start one testhost process where tests run and another process(dotnet ...) that will host "collectors" that are a way to get info/metrics on tests. This is an extension point and coverlet uses a collectors extension to instrument and get result that pass to vstest to write to disk.
    Collectors can be out-of-process and in-process, out-of-process one run in collectors process(custom new process) and in-process one run inside testhost.exe(another process) so they communicate through RPC(tcp socket).
    We use in-proc to flush file and avoid premature killing of vstest defaults(issue above).
    The report creation is done by out-of-process collector in collectors process when tests end, but testhost before of after will close re-flushing hit file, if these two events overlaps(rare) there a race and in-process collector(testhost.exe) try to write hits and out-of-process collectors(the other process) tries to read that file at same time to get coverage results.

The tmp fix remove the flush of file on testhost.exe process exit.

@MarcoRossignoli MarcoRossignoli added needs more info More details are needed and removed duplicate This issue or pull request already exists labels Jan 10, 2020
@MarcoRossignoli
Copy link
Collaborator

Close for stale conversation.

Feel free to re-open if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs more info More details are needed
Projects
None yet
Development

No branches or pull requests

3 participants