Update CI VM images for newer packages #284

Merged: 1 commit, Jul 26, 2023

Member

cevich commented Jun 27, 2023

Ref:
containers/podman#18612 (comment)

Also:

  • Switch to the distro. version of passt since development has
    cooled down. It's also now available in both F37 and 38. Note, for
    the F38 images, it will still grab it from updates-testing.

  • Implement a few fixes to cope with the dnf to dnf5 update
    when switching from F38 to rawhide.

  • De-duplicate & force use of DEBIAN_FRONTEND=noninteractive by
    including it into the $SUDO variable.

  • Add debugging to base_images/debian_base-setup.sh to help verify
    DEBIAN_FRONTEND=noninteractive is set.

  • Move python packages out of Fedora rawhide due to a broken
    dependency:
    nothing provides (python3.12dist(astroid) <= 2.17~~dev0 with python3.12dist(astroid) >= 2.15.2) needed by python3-pylint-2.17.2-2.fc39.noarch

  • Fix the name of the debian kernel headers package to not mention the
    currently booted kernel version. This fixes an issue where an older
    kernel is in use and doesn't match currently available headers package.
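
The `$SUDO` change in the third bullet can be sketched roughly as below. This is only an illustration, not the repository's code: the actual value of `$SUDO` is not shown in this thread, and `ENV_PREFIX` and `show_frontend` are hypothetical names made up for the example.

```shell
# Sketch only: the real $SUDO definition in the image scripts may differ.
# Putting `env DEBIAN_FRONTEND=noninteractive` into the prefix means no
# individual apt-get call can forget to set it, and the variable survives
# sudo's environment reset because env sets it after sudo has run.
ENV_PREFIX="env DEBIAN_FRONTEND=noninteractive"
if [ "$(id -u)" -eq 0 ]; then
    SUDO="$ENV_PREFIX"          # already root, no sudo needed (assumption)
else
    SUDO="sudo $ENV_PREFIX"
fi

# Hypothetical debugging helper: print the value a child process sees.
# It uses only the env part of the prefix so the check needs no privileges.
show_frontend() {
    $ENV_PREFIX sh -c 'echo "DEBIAN_FRONTEND=$DEBIAN_FRONTEND"'
}

show_frontend
```

With a prefix like this, a call such as `$SUDO apt-get -y upgrade` picks up the setting without repeating it at every call site.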

Member Author

cevich commented Jun 27, 2023

Re: rawhide build error: @lsm5 do you think it's the --allowerasing that's causing this? Could I fix it by adding --exclude=dnf to that distro-sync command?

Member

lsm5 commented Jun 28, 2023

rawhide now has dnf5 which has slightly different usage. But --allowerasing should work. Let me try it out in CI and get back to you.

Member

lsm5 commented Jun 28, 2023

@cevich isn't Rerun with Terminal on the failed job supposed to drop me in a fedora env? Right now it's giving me a centos stream shell.

Member Author

cevich commented Jun 28, 2023

Right now it's giving me a centos stream shell.

Actually that's expected unfortunately. The image-build happens via Packer, which runs in a CentOS VM. This is because Packer needs to interface directly with the GCE APIs. Best bet is to use hack/get_ci_vm.sh somewhere with one of the "build" tasks. Both podman and buildah are broken at the moment...maybe try it in skopeo (should have F38 images).

Member Author

cevich commented Jun 28, 2023

rawhide now has dnf5

Just had a thought. I wonder if this is simply a "We don't test upgrades to rawhide" problem. I could just nuke the (presumed) /etc/dnf/protected.d/dnf.conf file. Let me try that and see if it allows the distro-sync to move forward...

Member Author

cevich commented Jun 29, 2023

nuke the (presumed) /etc/dnf/protected.d/dnf.conf file

Argh, didn't work. @lsm5 do you know if it's even possible to update/upgrade from dnf to dnf5?

force-push: Wild-guess: Try --exclude=dnf in the distro-sync. Also removed the -qq in case it's hiding any important/relevant messages.

Member

lsm5 commented Jun 30, 2023

nuke the (presumed) /etc/dnf/protected.d/dnf.conf file

Argh, didn't work. @lsm5 do you know if it's even possible to update/upgrade from dnf to dnf5?

force-push: Wild-guess: Try --exclude=dnf in the distro-sync. Also removed the -qq in case it's hiding any important/relevant messages.

My current workstation is rawhide. I don't remember any issues while upgrading from dnf to dnf5 when I did f38 -> rawhide. Let me try it in ci_vm

Member

lsm5 commented Jun 30, 2023

@cevich how should I be running get_ci_vm here? I see there's a get_ci_vm dir with a bunch of scripts.

Member Author

cevich commented Jun 30, 2023

My current workstation is rawhide. I don't remember any issues while upgrading from dnf to dnf5 when I did f38 -> rawhide.

Okay, I think that's helpful. So either the problem here is unique to here, or maybe some new upgrade problem crept in. You're probably right, some manual tinkering is probably required to figure this out.

I see there's a get_ci_vm dir with a bunch of scripts.

This repo. produces the get_ci_vm container that the script uses. So you can't use it here directly. There are a few debugging make targets, but I don't think that's what you want. They're all geared toward debugging the build process, not the output VMs.

I'd suggest using get_ci_vm from another repo, like podman or buildah, if you need a quick VM to play on. There are also container images (F38 and F37), but fair warning: they're HUGE.

Anyway, (unless you're curious) don't kill yourself over this. I can go hands-on just as easily, and see if I can figure it out. I'll ping you if I get stuck again.

Member

lsm5 commented Jun 30, 2023

Anyway, (unless you're curious) don't kill yourself over this. I can go hands-on just as easily, and see if I can figure it out.
I'll ping you if I get stuck again.

SGTM :)

Member Author

cevich commented Jun 30, 2023

Force-push: Experiments showed that upgrading dnf* to rawhide as a distinct step prevents conflict during distro-sync.

Member

lsm5 commented Jun 30, 2023

Force-push: Experiments showed that upgrading dnf* to rawhide as a distinct step prevents conflict during distro-sync.

hmm, didn't notice it during the local upgrade sometime last week, but won't be surprised if things changed in that short span. Glad that it works though :D

Member Author

cevich commented Jul 3, 2023

Glad that it works though

Ugg, well, it worked in a container. It still fails here (same error). I'm going to see about using some duct-tape to convince get_ci_vm.sh to boot up a base image somewhere...

Member

lsm5 commented Jul 3, 2023

@cevich maybe you could start directly from a rawhide image? Fedora makes those available too

Member Author

cevich commented Jul 3, 2023

maybe you could start directly from a rawhide image

Thanks for the suggestion. I'll keep that as "Plan B" since it probably involves a lot more work.

I think I figured out a workaround: Remove yum and dnf packages using rpm, after installing dnf5 (I tried using dnf5 to remove dnf - it gets mad).


For future ref. / in case it helps. I used hack/get_ci_vm.sh validate in the c/image repo (because CI is simple there). Just had to edit .cirrus.yml to use IMAGE_SUFFIX: "b<blah>", and change the task to use image_name: ${FEDORA_CACHE_IMAGE_NAME} (instead of debian).
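
For anyone retracing the workaround, the order of operations described above might look like the sketch below. The exact commands are not shown in this thread, so every command line here is an assumption, in particular `--nodeps` and `--releasever=rawhide`. The `run` and `switch_to_dnf5` helpers are hypothetical, and the script defaults to a dry run that only prints the commands.

```shell
# Sketch only: command lines are guesses at the workaround, not copied
# from the PR. DRY_RUN=1 (the default) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

switch_to_dnf5() {
    # 1. Install dnf5 while the old dnf is still usable.
    run dnf install -y dnf5
    # 2. Remove the old tools with rpm, since dnf5 refuses to remove dnf.
    #    --nodeps is an assumption; it may not be needed.
    run rpm -e --nodeps yum dnf
    # 3. Sync the rest of the system to rawhide using dnf5.
    run dnf5 distro-sync -y --releasever=rawhide --allowerasing
}

switch_to_dnf5
```

Keeping the dry-run default makes it easy to review the sequence before trying it on a throwaway VM.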

Member Author

cevich commented Jul 3, 2023

Yay progress! Now the rawhide image build is failing with:

    rawhide: Problem: conflicting requests
    rawhide:   - package crun-wasm-1.8.5-1.fc39.x86_64 requires wasm-library, but none of the providers can be installed
    rawhide:   - nothing provides libfmt.so.9()(64bit) needed by wasmedge-0.12.1-1.fc39.x86_64
    rawhide:   - nothing provides libfmt.so.9()(64bit) needed by wasmedge-rt-0.12.1-1.fc39.x86_64
    rawhide:     exit(1)

Eed! Poking around, I see there was a build a month ago for this version. I clicked around on stuff (I don't know what I'm doing) and found https://koji.fedoraproject.org/koji/taskinfo?taskID=102699029 (for the -2 release) which is failing. But looking at the build log, it's barfing on some python thing not libfmt. @lsm5 help please 😕

@flouthoc do you remember if/what we need wasm for in our CI images? I grepped down the podman test directory but didn't come up with anything. i.e. Are we actually testing with wasm (somewhere) or was it added as a "nice to have" someday?

Member

lsm5 commented Jul 3, 2023

err yes, that's a problem I hit in containers/crun#1000 as well. I need to check with the wasmedge maintainers. The wasmedge package might need a rebuild.

Member

lsm5 commented Jul 3, 2023

err yes, that's a problem I hit in containers/crun#1000 as well. I need to check with the wasmedge maintainers. The wasmedge package might need a rebuild.

Fedora automation is pretty neat: https://bugzilla.redhat.com/show_bug.cgi?id=2219457

Member

lsm5 commented Jul 3, 2023

Fedora automation is pretty neat: https://bugzilla.redhat.com/show_bug.cgi?id=2219457

@hydai @dm4 could you please take a look at this bugzilla? wasmedge is currently not installable on rawhide. I think it would need a rebuild.

Member Author

cevich commented Jul 3, 2023

It seems like a dependency problem. I found the fmt package is where libfmt comes from. In rawhide, fmt gives libfmt.so.10 which I think means it's incompatible? Ugg, this may not be easily/quickly fixable. Maybe I can skip that package for rawhide CI VM images, or maybe it's not needed at all (@flouthoc would know).
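
To illustrate why libfmt.so.10 cannot satisfy a package built against libfmt.so.9: the number after `.so.` is the soname major version, and a requirement on one major version is not met by a library providing another. Below is a small sketch of that comparison. `soname_major` and `soname_compatible` are hypothetical helpers written for this note, not part of rpm or dnf, which match the capability strings directly.

```shell
# Extract the major version from an RPM-style soname capability such as
# "libfmt.so.9()(64bit)". Prints nothing if there is no ".so.N" part.
soname_major() {
    printf '%s\n' "$1" | sed -n 's/^.*\.so\.\([0-9][0-9]*\).*$/\1/p'
}

# Succeed only when both capabilities name the same library and major version.
soname_compatible() {
    required_name=${1%%.so.*}
    provided_name=${2%%.so.*}
    [ "$required_name" = "$provided_name" ] &&
        [ "$(soname_major "$1")" = "$(soname_major "$2")" ]
}

# wasmedge requires .so.9, rawhide's fmt provides .so.10.
if soname_compatible "libfmt.so.9()(64bit)" "libfmt.so.10()(64bit)"; then
    echo "compatible"
else
    echo "incompatible: the requiring package needs a rebuild"
fi
```

This is why the usual fix is a rebuild of the dependent package against the new library rather than any change on the consumer side.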

cevich force-pushed the update_images branch 2 times, most recently from 1bd41fc to 24c5074 (July 3, 2023 19:05)

Member Author

cevich commented Jul 3, 2023

Force-push: Added crun-wasm to the list for only non-rawhide images. If that works, and we don't need it (in rawhide) for podman testing that will remove the time-pressure for fixing the package.
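
The idea of listing crun-wasm only for non-rawhide images can be sketched as follows. This is a simplified illustration: `package_list` is a hypothetical helper, its release argument is an assumed identifier, and the two-package base list stands in for the much longer real one.

```shell
# Hypothetical helper: print the package list for a given image flavor.
# $1 is an assumed release identifier, e.g. "rawhide", "38" or "37".
package_list() {
    release="$1"
    # Illustrative base list only; the real list is much longer.
    pkgs="crun passt"
    if [ "$release" != "rawhide" ]; then
        # crun-wasm pulls in wasmedge, which is not installable on rawhide
        # while it still requires libfmt.so.9.
        pkgs="$pkgs crun-wasm"
    fi
    echo "$pkgs"
}

package_list rawhide
package_list 38
```

Once the wasmedge package is rebuilt, dropping the condition is a one-line change.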

flouthoc commented Jul 4, 2023

@flouthoc do you remember if/what we need wasm for in our CI images? I grepped down the podman test directory but didn't come up with anything. i.e. Are we actually testing with wasm (somewhere) or was it added as a "nice to have" someday?

@cevich It was needed for this issue containers/podman#16501, but I don't think it's a hard requirement. If it's easy then it's good, but we should not block on it afaics.

It seems like a dependency problem. I found the fmt package is where libfmt comes from. In rawhide, fmt gives libfmt.so.10 which I think means it's incompatible

Not sure if crun-wasm needs this directly. @cevich which package needs this?

hydai commented Jul 4, 2023

Hi @lsm5
The libfmt issue should be resolved with 0.12.1-3.fc39. I found the Bugzilla issue is also closed. Please let us (@hydai / @dm4) know if there is anything we can help. Thanks.

Member Author

cevich commented Jul 5, 2023

The libfmt issue should be resolved with 0.12.1-3.fc39.

Thanks for the quick response @hydai, I'll let you know if it breaks.

Not sure if crun-wasm needs this directly

Dunno, it's set up as a dependency though, so I'm keen to trust the package maintainer.

but I don't think it's a hard requirement. If it's easy then it's good, but we should not block on it afaics.

Thanks for the feedback @flouthoc, PRs linked to that issue seem pretty low-level (i.e. not just some tests). I'll leave it as-is for now, since there's a packaging fix.

cevich force-pushed the update_images branch 5 times, most recently from 33b69fb to 938917a (July 6, 2023 16:18)

Member Author

cevich commented Jul 6, 2023

@lsm5 ping - the joy continues 😠 This is a new error as of this morning:

    rawhide:     + sudo dnf install -y <lots of packages>
    ...cut...
    rawhide:   - nothing provides (python3.12dist(astroid) <= 2.17~~dev0 with python3.12dist(astroid) >= 2.15.2) needed by python3-pylint-2.17.2-2.fc39.noarch
    rawhide:     exit(1)

How did you find that automatic rawhide BZ before? (I can't seem to find anything on the new error.)

Member

lsm5 commented Jul 6, 2023

@lsm5 ping - the joy continues 😠 This is a new error as of this morning:

    rawhide:     + sudo dnf install -y <lots of packages>
    ...cut...
    rawhide:   - nothing provides (python3.12dist(astroid) <= 2.17~~dev0 with python3.12dist(astroid) >= 2.15.2) needed by python3-pylint-2.17.2-2.fc39.noarch
    rawhide:     exit(1)

do you know what package actually requires this? Maybe we can just get rid of it?

EDIT: I see it's pylint. Hmm, probably not something we can avoid. Can you try with --best --allowerasing enabled?

How did you find that automatic rawhide BZ before? (I can't seem to find anything on the new error.)

I simply searched for wasmedge in bugzilla and the BZ was there. I can't find anything yet for astroid, maybe the bz account only runs periodically.

Member

lsm5 commented Jul 6, 2023

we can also --exclude=pylint,astroid,foo,bar

Member Author

cevich commented Jul 6, 2023

do you know what package actually requires this? Maybe we can just get rid of it?
EDIT: I see it's pylint. Hmm, probably not something we can avoid.

Yeah, it's needed for a few repos' CI. I'll try --best --allowerasing, hopefully that'll make a difference.

Member Author

cevich commented Jul 6, 2023

no love @lsm5 same error. I think maybe we don't need many/most python things in rawhide. Trying with them only in Fedora/prior-Fedora... 🤞

github-actions bot commented Jul 6, 2023

Cirrus CI build successful. Found built image names and IDs:

Stage  Image Name                 IMAGE_SUFFIX
base   debian                     b20230706t200047z-f38f37d13
base   fedora                     b20230706t200047z-f38f37d13
base   fedora-aws                 b20230706t200047z-f38f37d13
base   fedora-aws-arm64           b20230706t200047z-f38f37d13
base   image-builder              b20230706t200047z-f38f37d13
base   prior-fedora               b20230706t200047z-f38f37d13
cache  build-push                 c20230706t200047z-f38f37d13
cache  debian                     c20230706t200047z-f38f37d13
cache  fedora                     c20230706t200047z-f38f37d13
cache  fedora-aws                 c20230706t200047z-f38f37d13
cache  fedora-netavark            c20230706t200047z-f38f37d13
cache  fedora-netavark-aws-arm64  c20230706t200047z-f38f37d13
cache  fedora-podman-aws-arm64    c20230706t200047z-f38f37d13
cache  fedora-podman-py           c20230706t200047z-f38f37d13
cache  prior-fedora               c20230706t200047z-f38f37d13
cache  rawhide                    c20230706t200047z-f38f37d13
cache  win-server-wsl             c20230706t200047z-f38f37d13

Member

lsm5 commented Jul 7, 2023

@cevich I heard in the fedora-ci channel that rawhide testing is broken due to dnf5. So, maybe you wanna put this on hold for a bit? But, I see all checks have passed, so your call :)

Member Author

cevich commented Jul 7, 2023

But, I see all checks have passed, so your call

I'm even more torn since I'm headed out on PTO for two weeks. At the same time, these updates in podman are VERY long overdue 😢

Perhaps I won't tag the images, and instead just try them out only in podman...

Member

lsm5 commented Jul 7, 2023

But, I see all checks have passed, so your call

I'm even more torn since I'm headed out on PTO for two weeks. At the same time, these updates in podman are VERY long overdue 😢

Perhaps I won't tag the images, and instead just try them out only in podman...

@cevich SGTM.

Member Author

cevich commented Jul 7, 2023

...podman results are improved. I'm going to leave this PR open as a reminder to myself to follow up when I return from PTO. Then we can decide if c20230706t200047z-f38f37d13 should be tagged as a version (causing renovate to open deployment PRs everywhere).

Member Author

cevich commented Jul 26, 2023

Looks like I need to build updated Debian images, merging this and starting a new PR.

cevich merged commit d72aaa3 into containers:main on Jul 26, 2023