Add a separate test pass to run flaky tests that doesn't fail the build #8486

Merged: 11 commits, Mar 19, 2019
12 changes: 12 additions & 0 deletions .azure/pipelines/ci.yml
@@ -300,13 +300,17 @@ jobs:
    - powershell: "& ./.azure/pipelines/tools/SetupTestEnvironment.ps1 Setup signalrclienttests.exe"
      displayName: Start AppVerifier
    afterBuild:
    - powershell: "& ./build.ps1 -CI -NoBuild -Test /p:RunFlakyTests=true"
      displayName: Run Flaky Tests
      continueOnError: true
    - powershell: "& ./.azure/pipelines/tools/SetupTestEnvironment.ps1 Shutdown signalrclienttests.exe"
      displayName: Stop AppVerifier
      condition: always()
    artifacts:
    - name: Windows_Test_Logs
      path: artifacts/logs/
      publishOnError: true

- template: jobs/default-build.yml
  parameters:
    condition: ne(variables['SkipTests'], 'true')
@@ -318,6 +322,10 @@ jobs:
    beforeBuild:
    - bash: "./eng/scripts/install-nginx-mac.sh"
      displayName: Installing Nginx
    afterBuild:
    - bash: ./build.sh --no-build --ci --test -p:RunFlakyTests=true
      displayName: Run Flaky Tests
      continueOnError: true
    artifacts:
    - name: MacOS_Test_Logs
      path: artifacts/logs/
@@ -333,6 +341,10 @@ jobs:
    beforeBuild:
    - bash: "./eng/scripts/install-nginx-linux.sh"
      displayName: Installing Nginx
    afterBuild:
    - bash: ./build.sh --no-build --ci --test -p:RunFlakyTests=true
      displayName: Run Flaky Tests
      continueOnError: true
    artifacts:
    - name: Linux_Test_Logs
      path: artifacts/logs/
2 changes: 2 additions & 0 deletions .azure/pipelines/jobs/default-build.yml
@@ -174,6 +174,8 @@ jobs:
      testRunner: vstest
      testResultsFiles: '**/artifacts/**/*.trx'
      mergeTestResults: true
      buildConfiguration: $(BuildConfiguration)
      buildPlatform: $(AgentOsName)
  - ${{ each artifact in parameters.artifacts }}:
    - task: PublishBuildArtifacts@1
      displayName: Upload artifacts from ${{ artifact.path }}
7 changes: 7 additions & 0 deletions eng/FlakyTests.AfterArcade.props
@@ -0,0 +1,7 @@
<Project>
  <!-- Override where xUnit logs and results go if we're doing the flaky run -->
  <PropertyGroup Condition="'$(RunFlakyTests)' == 'true'">
    <ArtifactsLogDir>$(ArtifactsDir)log\$(Configuration)\Flaky\</ArtifactsLogDir>
    <ArtifactsTestResultsDir>$(ArtifactsDir)TestResults\$(Configuration)\Flaky\</ArtifactsTestResultsDir>
  </PropertyGroup>
</Project>
18 changes: 18 additions & 0 deletions eng/FlakyTests.BeforeArcade.props
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
<Project>
<!-- Local Dev Flakiness -->
<PropertyGroup>
<_FlakyRunAdditionalArgs>-trait "Flaky:All=true"</_FlakyRunAdditionalArgs>
<_NonFlakyRunAdditionalArgs>-notrait "Flaky:All=true"</_NonFlakyRunAdditionalArgs>
</PropertyGroup>

<!-- Azure Pipelines Flakiness -->
<PropertyGroup Condition="'$(AGENT_OS)' != ''">
<_FlakyRunAdditionalArgs>$(_FlakyRunAdditionalArgs) -trait "Flaky:AzP:All=true" -trait "Flaky:AzP:OS:$(AGENT_OS)=true"</_FlakyRunAdditionalArgs>
<_NonFlakyRunAdditionalArgs>$(_NonFlakyRunAdditionalArgs) -notrait "Flaky:AzP:All=true" -notrait "Flaky:AzP:OS:$(AGENT_OS)=true"</_NonFlakyRunAdditionalArgs>
</PropertyGroup>

<PropertyGroup>
<TestRunnerAdditionalArguments Condition="'$(RunFlakyTests)' == ''">$(_NonFlakyRunAdditionalArgs)</TestRunnerAdditionalArguments>
<TestRunnerAdditionalArguments Condition="'$(RunFlakyTests)' == 'true'">$(_FlakyRunAdditionalArgs)</TestRunnerAdditionalArguments>
</PropertyGroup>
</Project>
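
For readers less used to MSBuild conditions, the selection logic in eng/FlakyTests.BeforeArcade.props can be paraphrased in bash. This is only an illustrative sketch and not part of the PR: the function name `runner_args` and its two positional parameters are hypothetical, while the trait names are copied from the props file.

```shell
#!/usr/bin/env bash
# Hypothetical paraphrase of eng/FlakyTests.BeforeArcade.props; not part of the PR.
# Prints the extra test runner arguments for a given RunFlakyTests value and agent OS.
runner_args() {
    local run_flaky="$1"   # mirrors $(RunFlakyTests): "" or "true"
    local agent_os="$2"    # mirrors $(AGENT_OS): empty outside Azure Pipelines
    local flag
    case "$run_flaky" in
        "")   flag="-notrait" ;;   # normal pass: exclude known-flaky tests
        true) flag="-trait" ;;     # flaky pass: select only known-flaky tests
        *)    return 0 ;;          # neither condition matches: no extra arguments
    esac
    local args="$flag \"Flaky:All=true\""
    if [ -n "$agent_os" ]; then
        # On Azure Pipelines, also cover the AzP-wide and per-OS traits.
        args="$args $flag \"Flaky:AzP:All=true\" $flag \"Flaky:AzP:OS:$agent_os=true\""
    fi
    printf '%s\n' "$args"
}
```

Calling `runner_args "" ""` corresponds to a local non-flaky run, and `runner_args true Linux` to the flaky pass on a Linux agent. As in the props file, the same trait list drives both passes and only the `-trait` / `-notrait` switch differs.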
34 changes: 30 additions & 4 deletions eng/helix/vstest/runtests.cmd
@@ -1,3 +1,6 @@
REM Disable "!Foo!" expansions because they break the filter syntax
setlocal disableextensions

Contributor Author:
cc @HaoK

Even with the planned change to ignore the exit code from Helix altogether, I'd like to get this change in, in some form, so that when we do start paying attention to the exit code, known-flaky tests don't start causing failures.

Contributor Author:
(Still working on Unix scripts btw)

Member:
Yup, ignoring the exit code is hopefully just a short/medium term thing that gives us some time to get flaky in place

Member:
Maybe consider passing the filters in to each script (assuming they are the same in the sh version).

Contributor Author:
Yeah, they would be the same. I'll look at that. There can be some shell escaping problems with the syntax so it might be tricky, but I'll give it a shot.


set target=%1
set sdkVersion=%2
set runtimeVersion=%3
@@ -18,12 +21,35 @@ set HELIX=%helixQueue%

%DOTNET_ROOT%\dotnet vstest %target% -lt >discovered.txt
find /c "Exception thrown" discovered.txt
-if %errorlevel% equ 0 (
-    echo Exception thrown during test discovery.
-    type discovered.txt
REM "ERRORLEVEL is not %ERRORLEVEL%" https://blogs.msdn.microsoft.com/oldnewthing/20080926-00/?p=20743/
if not errorlevel 1 (
    echo Exception thrown during test discovery. 1>&2
    type discovered.txt 1>&2
    exit 1
)

-%DOTNET_ROOT%\dotnet vstest %target% --logger:trx --logger:console;verbosity=normal
set exit_code=0

REM Run non-flaky tests first
REM We need to specify all possible Flaky filters that apply to this environment, because the flaky attribute
REM only puts the explicit filter traits the user provided in

Contributor:
"provided in" what?

Contributor Author:
shhhhhh

REM Filter syntax: https://github.com/Microsoft/vstest-docs/blob/master/docs/filter.md
set NONFLAKY_FILTER="Flaky:All!=true&Flaky:Helix:All!=true&Flaky:Helix:Queue:All!=true&Flaky:Helix:Queue:%HELIX%!=true"
echo Running non-flaky tests.
%DOTNET_ROOT%\dotnet vstest %target% --logger:trx --logger:console;verbosity=normal --TestCaseFilter:%NONFLAKY_FILTER%
if errorlevel 1 (
    echo Failure in non-flaky test 1>&2
    set exit_code=1
    REM DO NOT EXIT
)

set FLAKY_FILTER="Flaky:All=true|Flaky:Helix:All=true|Flaky:Helix:Queue:All=true|Flaky:Helix:Queue:%HELIX%=true"
echo Running known-flaky tests.
%DOTNET_ROOT%\dotnet vstest %target% --logger:trx --logger:console;verbosity=normal --TestCaseFilter:%FLAKY_FILTER%
if errorlevel 1 (
    echo Failure in flaky test 1>&2
    REM DO NOT EXIT and DO NOT SET EXIT_CODE to 1

Contributor:
Suggest including similar comments in runtests.sh

)

exit %exit_code%
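
The two filter strings above are negations of each other: the flaky filter ORs `=true` tests over a list of traits, and the non-flaky filter ANDs `!=true` tests over the same list. A rough bash sketch of deriving both from a single list, so they cannot drift apart, might look like the following. `build_filter` is a hypothetical helper that does not appear in the PR; the trait names are the ones used in the scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the PR: derives both vstest filter strings
# from one list of trait names.
build_filter() {
    local mode="$1"    # "flaky" or "nonflaky"
    local queue="$2"   # Helix queue name, as in $HELIX
    local op sep
    if [ "$mode" = "flaky" ]; then
        op="="; sep="|"      # any flaky trait being set selects the test
    else
        op="!="; sep="&"     # every flaky trait must be absent
    fi
    local filter="" trait
    for trait in "Flaky:All" "Flaky:Helix:All" "Flaky:Helix:Queue:All" "Flaky:Helix:Queue:$queue"; do
        # Prepend the separator only when the filter already has a clause.
        filter="${filter:+$filter$sep}$trait${op}true"
    done
    printf '%s\n' "$filter"
}
```

The result would be passed as `--TestCaseFilter:"$(build_filter nonflaky "$HELIX")"`. Keeping the expansion inside double quotes matters here, because `|` and `&` are shell operators when left unquoted.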

23 changes: 22 additions & 1 deletion eng/helix/vstest/runtests.sh
@@ -67,4 +67,25 @@ if grep -q "Exception thrown" discovered.txt; then
    exit 1
fi

-$DOTNET_ROOT/dotnet vstest $1 --logger:trx
# Run non-flaky tests first
# We need to specify all possible Flaky filters that apply to this environment, because the flaky attribute
# only puts the explicit filter traits the user provided in the flaky attribute
# Filter syntax: https://github.com/Microsoft/vstest-docs/blob/master/docs/filter.md
NONFLAKY_FILTER="Flaky:All!=true&Flaky:Helix:All!=true&Flaky:Helix:Queue:All!=true&Flaky:Helix:Queue:$HELIX!=true"
echo "Running non-flaky tests."
$DOTNET_ROOT/dotnet vstest $1 --logger:trx --TestCaseFilter:"$NONFLAKY_FILTER"
nonflaky_exitcode=$?
if [ $nonflaky_exitcode != 0 ]; then
    echo "Non-flaky tests failed!" 1>&2
    # DO NOT EXIT
fi

FLAKY_FILTER="Flaky:All=true|Flaky:Helix:All=true|Flaky:Helix:Queue:All=true|Flaky:Helix:Queue:$HELIX=true"
echo "Running known-flaky tests."
$DOTNET_ROOT/dotnet vstest $1 --logger:trx --TestCaseFilter:"$FLAKY_FILTER"
if [ $? != 0 ]; then
    echo "Flaky tests failed!" 1>&2
    # DO NOT EXIT
fi

exit $nonflaky_exitcode
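
The exit-code handling shared by both scripts (run both passes unconditionally, report both, but return only the non-flaky result) can be isolated in a small sketch. `run_passes` and its two arguments are hypothetical stand-ins for the real `dotnet vstest` invocations; each argument is assumed to be a single command or function name with no arguments.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the two-pass pattern; not part of the PR.
run_passes() {
    local required_cmd="$1"   # pass whose failure must fail the job
    local flaky_cmd="$2"      # pass whose failure is only reported

    local required_exit=0
    # Record the failure instead of exiting so the flaky pass still runs.
    "$required_cmd" || required_exit=$?
    if [ "$required_exit" -ne 0 ]; then
        echo "Non-flaky tests failed!" 1>&2
    fi

    # The flaky pass is reported on stderr but never affects the return value.
    if ! "$flaky_cmd"; then
        echo "Flaky tests failed!" 1>&2
    fi

    return "$required_exit"
}
```

For example, `run_passes true false` returns 0 even though the second command fails, while `run_passes false true` returns 1.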
4 changes: 2 additions & 2 deletions korebuild-lock.txt
@@ -1,2 +1,2 @@
-version:3.0.0-build-20190306.2
-commithash:18c06e0b774622c87560e6f21b97e099307fd10c
version:3.0.0-build-20190314.2
commithash:e3a8a2aae198f1ef26309714ccba6835be2437c3