Coreclr test builds fail magically after ~12 hours after doing a scorch of the repo #5382

Closed
ramarag opened this issue Mar 16, 2016 · 15 comments
Assignees
Labels
area-Infrastructure-coreclr test-bug Problem in test source code (most likely)
Milestone

Comments

@ramarag
Member

ramarag commented Mar 16, 2016

Sequence of events:

git fetch --all
git merge upstream/master
git clean -dxf
build skipnative skipmscorlib
tests\runtest.cmd
everything works fine
make test changes
build skipnative skipmscorlib
tests\runtest.cmd
everything works fine

wait for ~12 hours
do a
build skipnative skipmscorlib
see the error:

BUILDTEST: Starting the Managed Tests Build
BUILDTEST: Using environment: "C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\Tools\VsDevCmd.bat"
BUILDTEST: Invoking msbuild
[15:42:22.52] Restoring all packages...
EXEC : error : Package dependencies must specify a version range. [E:\git\coreclr\tests\build.proj]
E:\git\coreclr\tests\build.proj(25,5): error MSB3073: The command "(set CORE_ROOT=) & "E:\git\coreclr\tests\..\Tools\dotnetcli/bin/dotnet.exe" restore --packages "E:\git\coreclr\tests\..\packages" --source https://dotnet.myget.org/F/dotnet-core/api/v3/index.json --source https://www.myget.org/F/nugetbuild/api/v3/index.json --source https://api.nuget.org/v3/index.json "E:\git\coreclr\tests\src"" exited with code 1.

EXEC : error : Package dependencies must specify a version range. [E:\git\coreclr\tests\build.proj]
E:\git\coreclr\tests\build.proj(25,5): error MSB3073: The command "(set CORE_ROOT=) & "E:\git\coreclr\tests\..\Tools\dotnetcli/bin/dotnet.exe" restore --packages "E:\git\coreclr\tests\..\packages" --source https://dotnet.myget.org/F/dotnet-core/api/v3/index.json --source https://www.myget.org/F/nugetbuild/api/v3/index.json --source https://api.nuget.org/v3/index.json "E:\git\coreclr\tests\src"" exited with code 1.

@dagood
Member

dagood commented Mar 16, 2016

(@ramarag and I talked on Lync before we created this issue, so I'm adding some more info.)

Another dev got this message, with a bit more detail:

BUILDTEST: Starting the Managed Tests Build
BUILDTEST: Using environment: "<C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\Tools\VsDevCmd.bat>"
BUILDTEST: Invoking msbuild
  [15:30:12.65] Restoring all packages...
EXEC : error : Error reading 'D:\fxkit\coreclr\tests\src\Common\test_runtime\project.json' at line 3 column 44 : Package dependencies must specify a version range. [D:\fxkit\coreclr\tests\build.proj]
EXEC : error : Package dependencies must specify a version range. [D:\fxkit\coreclr\tests\build.proj]
D:\fxkit\coreclr\tests\build.proj(25,5): error MSB3073: The command "(set CORE_ROOT=) & "D:\fxkit\coreclr\tests\..\Tools\dotnetcli/bin/dotnet.exe" restore --packages "D:\fxkit\coreclr\tests\..\packages" --source https://dotnet.myget.org/F/dotnet-core/api/v3/index.json --source https://api.nuget.org/v3/index.json "D:\fxkit\coreclr\tests\src"" exited with code 1. 

D:\fxkit\coreclr\tests\src\Common\test_runtime\project.json:

{
  "dependencies": {
    "Microsoft.NETCore.Runtime.CoreCLR": "",
    "Microsoft.NETCore.TestHost": "1.0.0-rc2-23816",
  },
  "frameworks": {
    "dnxcore50": {}
  }
}

This file is generated by tests/runtest.proj#L369, so it seems that somehow the nupkg doesn't exist in $(CORE_ROOT)\.nuget\** after these events.

I'm not sure when exactly that folder is updated. @rahku Do you know what the story is behind this .nuget folder?

Maybe a solution could be as simple as providing a default value if no nupkg is detected.
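
As a rough illustration of that idea (not the actual MSBuild logic in runtest.proj), the Python sketch below looks for a locally built runtime package under a directory and falls back to a caller-supplied default when none is found. The function name `find_runtime_version` and the assumption that the package file is named `Microsoft.NETCore.Runtime.CoreCLR.<version>.nupkg` are hypothetical.

```python
import re
from pathlib import Path

PACKAGE_ID = "Microsoft.NETCore.Runtime.CoreCLR"
# Assumed file-name convention: <package id>.<version>.nupkg
_NUPKG_RE = re.compile(
    re.escape(PACKAGE_ID) + r"\.(\d[\w.\-]*)\.nupkg$", re.IGNORECASE
)


def find_runtime_version(search_root, default=None):
    """Return the version of the first matching nupkg under search_root.

    Falls back to `default` when the directory or the package is missing,
    instead of silently producing an empty string.
    """
    root = Path(search_root)
    if root.is_dir():
        # Walk the whole tree, mirroring the recursive "**" wildcard.
        for path in sorted(root.rglob("*.nupkg")):
            match = _NUPKG_RE.search(path.name)
            if match:
                return match.group(1)
    return default
```

Whether a default version is better than failing outright is a judgment call: a default keeps the restore step from choking on an empty string, but it may hide the fact that the product was never built.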

@rahku
Contributor

rahku commented Mar 16, 2016

The comment above the line you linked explains why test_runtime.csproj exists:
https://github.com/dotnet/coreclr/blob/master/tests/runtest.proj#L363

buildtests.cmd requires that the product be built before it can start. In addition, it sets core_root to the product bin directory. $(CORE_ROOT)\.nuget\pkg points to the coreclr runtime package that you just built. I'm not sure why the product bin folder would be deleted on its own.

@rahku
Contributor

rahku commented Mar 16, 2016

I was wrong: buildtests.cmd does not actually set core_root, so yes, a restore of that json during buildtest will cause problems. It still puzzles me why this does not fail more often.

@rahku
Contributor

rahku commented Mar 17, 2016

I will assign this to myself.

@rahku rahku assigned rahku and unassigned dagood Mar 17, 2016
@rahku
Contributor

rahku commented Mar 17, 2016

@dagood do you know why this would not repro every time?

@dagood
Member

dagood commented Mar 17, 2016

Ah, thanks, the comment wasn't clear to me before but it seems clear now.

Thinking about this again, I'm not sure how runtests (which calls runtest.proj, from my understanding) is being called. I would think that with the steps in the description, the project.json would never be generated at all, let alone with an invalid version. Is there some call to runtest.proj from buildtests (or build) that I'm not aware of?

As far as I'm aware all that buildtests will do is restore existing project.json files, so it seems like this could be a sequencing issue where the first build works, runtests creates an invalid project.json, then the next buildtests tries to restore that file and fails. I still don't know how to explain why runtests (or something that calls runtest.proj) would happen though.

Some more detail from the conversation on IM is that repeated builds work, but stop working after some amount of time has passed--that might rule out simple sequences.

@ramarag Are you certain those commands alone will repro? You never ran runtests?

@rahku
Contributor

rahku commented Mar 17, 2016

Running tests in coreclr is a two-step process:

  1. build.cmd (builds both product and tests but does not actually execute tests)
  2. runtest.cmd (executes the tests that were built in step 1).

So runtest needs to be invoked manually; it is not executed as part of any other script.

I tried several permutations but I am still not able to repro the problem. The error says "Package dependencies must specify a version range". It does not say that it is unable to find a package, which is what I would expect, since the source paths would be missing that version of the coreclr runtime package.

@dagood
Member

dagood commented Mar 17, 2016

Yeah, that matches what I was thinking of the flow--the really strange bit to me then is that D:\fxkit\coreclr\tests\src\Common\test_runtime\project.json exists even though the file generation happens during runtest. (I also haven't repro'd this, but I haven't waited the 12 hours the repro steps suggest.)

@ramarag
Member Author

ramarag commented Mar 17, 2016

Sorry, I have updated the repro steps. I did run runtest immediately after the first build.

@dagood
Member

dagood commented Mar 22, 2016

Ok, that makes more sense, but I still can't think of anything other than wild guesses.

I tried to repro overnight, and I wasn't able to. The last build skipnative skipmscorlib worked fine for me.

A few questions about the repository state that would be good to know:

build skipnative skipmscorlib
tests\runtest.cmd
everything works fine
make test changes
build skipnative skipmscorlib
<====== Here, what are the contents of the generated project.json ?
tests\runtest.cmd
<====== And here
everything works fine

wait for ~12 hours
do a
build skipnative skipmscorlib

@rahku
Contributor

rahku commented Mar 22, 2016

This is what I would expect the contents of project.json to be at both points above:

{
  "dependencies": {
    "Microsoft.NETCore.Runtime.CoreCLR": "1.0.2-rc3-00001",
    "Microsoft.NETCore.TestHost": "1.0.0-rc2-23816",
  },
  "frameworks": {
    "dnxcore50": {}
  }
}

@rahku
Contributor

rahku commented Mar 22, 2016

On looking more closely at the project.json file that you pasted above I see the difference at
"Microsoft.NETCore.Runtime.CoreCLR": "",

This is not expected. I expect the version number of the just-built runtime package instead of an empty string. It seems that either the runtime package did not get built or the version extraction logic failed. Maybe the fix would be to error out early in the project.json generation logic if the runtime package is missing.
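
The early-failure idea could look roughly like the sketch below. It is written in Python purely for illustration; the real generation step lives in MSBuild (tests/runtest.proj), and the function name `render_test_runtime_project_json` is hypothetical. The TestHost version is the one from the project.json pasted earlier in this thread.

```python
import json


def render_test_runtime_project_json(coreclr_version,
                                     testhost_version="1.0.0-rc2-23816"):
    """Build the project.json text, refusing to emit an empty version."""
    if not coreclr_version or not coreclr_version.strip():
        # Fail at generation time rather than at the next package restore.
        raise ValueError(
            "CoreCLR runtime package version is empty; "
            "was the product built before generating project.json?"
        )
    document = {
        "dependencies": {
            "Microsoft.NETCore.Runtime.CoreCLR": coreclr_version.strip(),
            "Microsoft.NETCore.TestHost": testhost_version,
        },
        "frameworks": {"dnxcore50": {}},
    }
    return json.dumps(document, indent=2)
```

Failing here would surface the missing package at the moment the file is written, instead of leaving an invalid project.json on disk for a later build to trip over.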

@ramarag
Member Author

ramarag commented Mar 22, 2016

The sequence of events is:

build skipnative skipmscorlib
tests\runtest.cmd
<====== Here project.json is generated, and it is never modified afterwards
everything works fine
make test changes
build skipnative skipmscorlib
tests\runtest.cmd
everything works fine

wait for ~12 hours
do a
build skipnative skipmscorlib
package restore fails

@ramarag
Member Author

ramarag commented May 8, 2016

The workaround for now is to delete tests\src\Common\test_runtime\project.lock.json

@dagood
Member

dagood commented May 9, 2016

The cause of this should be fixed by dotnet/coreclr#4764, but removing tests\src\Common\test_runtime\project.json manually will still be necessary to fix any active repros.
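
To check whether a working tree still has a stale generated file, a small script along these lines could flag dependency entries whose version is an empty string. This is only a sketch: `find_empty_versions` is a hypothetical helper, and it uses a regular expression rather than a JSON parser because the generated file shown earlier in this thread contains a trailing comma that strict parsers reject.

```python
import re

# Matches entries such as:  "Some.Package": ""
_EMPTY_VERSION_RE = re.compile(r'"([^"\\]+)"\s*:\s*""')


def find_empty_versions(project_json_text):
    """Return the names of keys whose value is an empty string."""
    return [m.group(1) for m in _EMPTY_VERSION_RE.finditer(project_json_text)]
```

Run against the contents of tests\src\Common\test_runtime\project.json, a non-empty result would suggest the file is one of the "active repros" and needs to be removed by hand.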

@dagood dagood closed this as completed May 9, 2016
@msftgits msftgits transferred this issue from dotnet/coreclr Jan 30, 2020
@msftgits msftgits added this to the 1.0.0-rtm milestone Jan 30, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Jan 2, 2021

4 participants