
Windows CI began to fail on Oct 21 #2775


Closed
AkihiroSuda opened this issue Oct 21, 2024 · 7 comments · Fixed by #3280

@AkihiroSuda
Member

AkihiroSuda commented Oct 21, 2024

#2769 passed CI, but its merge commit and later commits are failing:

https://github.com/lima-vm/lima/actions/runs/11429806278/job/31800191430

[…]
time="2024-10-21T01:14:55Z" level=info msg="SSH Local Port: 22"
time="2024-10-21T01:14:55Z" level=info msg="[hostagent] Waiting for the essential requirement 1 of 2: \"ssh\""
time="2024-10-21T01:15:05Z" level=info msg="[hostagent] Waiting for the essential requirement 1 of 2: \"ssh\""
time="2024-10-21T01:15:15Z" level=info msg="[hostagent] Waiting for the essential requirement 1 of 2: \"ssh\""
time="2024-10-21T01:24:43Z" level=fatal msg="did not receive an event with the \"running\" status"

Something seems to have changed between https://github.com/actions/runner-images/releases/tag/win22%2F20241006.1 and https://github.com/actions/runner-images/releases/tag/win22%2F20241015.1
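As a rough illustration of what the log excerpt above shows, the sketch below parses the logrus-style lines and measures how long the hostagent waited for "ssh" before the fatal error. The function names and the regular expression are assumptions made for this example, not part of Lima.

```python
import re
from datetime import datetime

# Matches logrus-style text lines: time="..." level=... msg="..."
# The msg group tolerates backslash-escaped quotes such as \"ssh\".
LINE_RE = re.compile(r'time="([^"]+)" level=(\w+) msg="((?:[^"\\]|\\.)*)"')

# Log lines copied from the failing CI job quoted in this issue.
SAMPLE_LOG = r'''
time="2024-10-21T01:14:55Z" level=info msg="SSH Local Port: 22"
time="2024-10-21T01:14:55Z" level=info msg="[hostagent] Waiting for the essential requirement 1 of 2: \"ssh\""
time="2024-10-21T01:15:05Z" level=info msg="[hostagent] Waiting for the essential requirement 1 of 2: \"ssh\""
time="2024-10-21T01:15:15Z" level=info msg="[hostagent] Waiting for the essential requirement 1 of 2: \"ssh\""
time="2024-10-21T01:24:43Z" level=fatal msg="did not receive an event with the \"running\" status"
'''


def parse_log(text):
    """Return a list of (timestamp, level, message) tuples."""
    entries = []
    for ts, level, msg in LINE_RE.findall(text):
        when = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")
        entries.append((when, level, msg))
    return entries


def seconds_waiting_for_ssh(text):
    """Seconds from the first "ssh" wait message to the first fatal line, or None."""
    entries = parse_log(text)
    start = next(
        (t for t, _, m in entries if "Waiting for the essential requirement" in m),
        None,
    )
    end = next((t for t, level, _ in entries if level == "fatal"), None)
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


print(seconds_waiting_for_ssh(SAMPLE_LOG))  # 588.0
```

For the quoted job this comes to just under ten minutes between the first wait message and the fatal line.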

@jandubois
Member

In https://github.com/lima-vm/lima/actions/runs/11445753347/job/31843450765?pr=2778 I see:

System has not been booted with systemd as init system (PID 1). Can't operate.

@jandubois
Member

@pendo324 Do you have any idea what may be causing the Windows tests to fail now?

I can't find anything that seems related in actions/runner-images@fcc4cdb or actions/runner-images@09ff567

The errors look like systemd is no longer enabled in your distro, but there has been no change to the distro.

I'm at a loss as to what might be causing this.

@pendo324
Contributor

Thanks for pinging me, taking a look now

@arixmkii
Contributor

arixmkii commented Feb 1, 2025

I'm doing some experiments with Windows support. I managed to replicate this CI attempt in my rebuild workflow, and it ran successfully on a default GH runner. Logs are available at https://github.com/arixmkii/qcw/actions/runs/13090314041/job/36526224725

The biggest difference in my setup is that I have to use the latest preview WSL build from https://github.com/microsoft/WSL/releases

@arixmkii
Contributor

@jandubois I debugged this. There is actually a related change in the commits you linked: the Git version bump. The Windows setup uses OpenSSH from the Git distribution.

The script "user session is ready for ssh" hangs indefinitely on Git 2.47 and newer releases. The same is true for the latest msys2 OpenSSH. I downgraded Git on my system and managed to run a WSL2 machine.

I also managed to run almost all integration tests with this WSL2 machine when using OpenSSH inside an Alpine companion distro in WSL2 (not using any of the Windows tools): https://github.com/arixmkii/qcw/actions/runs/13474629971/job/37652601743

Conclusion: the machine didn't break. The Windows tooling has some sort of issue/regression, which might or might not get fixed.

I tried to create an isolated reproducer using the same script, running cat script sh | <openssh command from lima> in parallel with the hanging one, and was not able to reproduce it outside of Lima.
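A reproducer along these lines could be sketched as follows: feed a script to a command on stdin and report a hang when it does not finish within a timeout. This is only an illustrative sketch. The helper name and the stand-in commands are assumptions, and the real OpenSSH command line used by Lima is not reproduced here.

```python
import subprocess
import sys


def run_script_over_stdin(command, script, timeout_s):
    """Feed `script` to `command` on stdin; return "ok", "failed", or "hang"."""
    try:
        result = subprocess.run(
            command,
            input=script.encode(),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return "hang"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    script = "#!/bin/sh\ntrue\n"
    # Stand-in commands (hypothetical): replace them with the actual ssh
    # invocation to test a real setup.
    reads_stdin = [sys.executable, "-c", "import sys; sys.stdin.read()"]
    never_returns = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_script_over_stdin(reads_stdin, script, timeout_s=10))
    print(run_script_over_stdin(never_returns, script, timeout_s=1))
```

A timeout turns "hangs indefinitely" into a result a CI job can report, which makes it easier to compare tool versions side by side.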

@mook-as
Contributor

mook-as commented Feb 26, 2025

FWIW, git-for-windows/git#5199 may be relevant. Note that Git for Windows picked up the fixes, but as far as I know they are not in upstream cygwin/msys2 yet.

actions/runner-images@fcc4cdb, linked above, shows that Git for Windows was updated to 2.47.0.windows.1, which would be an affected version. The runner image now lists 2.47.1.windows.2, which should be fixed (so some runs may be succeeding).
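A small check along these lines could flag the installed Git for Windows build before running the tests. It is a minimal sketch that uses only the two versions named in this thread and reports any other build as unknown. The function name and the labels it returns are assumptions for illustration.

```python
import re

# Versions taken from this thread; any other build is reported as unknown.
KNOWN_AFFECTED = (2, 47, 0, 1)  # 2.47.0.windows.1
KNOWN_FIXED = (2, 47, 1, 2)     # 2.47.1.windows.2

# Git for Windows prints e.g. "git version 2.47.1.windows.2".
VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)\.windows\.(\d+)")


def classify_git_for_windows(version_output):
    """Classify the output of `git --version` against the versions above."""
    match = VERSION_RE.search(version_output)
    if match is None:
        return "not-git-for-windows"
    version = tuple(int(part) for part in match.groups())
    if version == KNOWN_AFFECTED:
        return "affected"
    if version == KNOWN_FIXED:
        return "fixed"
    return "unknown"


print(classify_git_for_windows("git version 2.47.0.windows.1"))  # affected
print(classify_git_for_windows("git version 2.47.1.windows.2"))  # fixed
```

Builds between or after these two are deliberately not classified, since the thread does not say which of them carry the fix.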

@arixmkii
Contributor

@mook-as Thank you! I tested with the updated runner (I'm using Server 2025, but that should not behave any differently here) with Git version 2.47.1.windows.2, and it passed the tests: https://github.com/arixmkii/qcw/actions/runs/13552437877/job/37879648090
