/etc/resolv.conf is not mounted with the correct permissions when the host has a umask 0077 #3704
Comments
Tried a bunch of CNI versions (from 1.4 to 1.6); they all exhibit the same issue.
Same results with containerd 1.7 vs. containerd 2.0, and with nerdctl 1.7 vs. nerdctl 2.0, so it is not a regression. Also tested on different hosts - same problem. Either I hosed something on these boxes wrt networking, or there has always been an issue on arm with bridge (seems unlikely...)?
Is this reproducible with the ARM instance of GHA?
I have not tried yet. Will try on GHA though.
Maybe it is related to subnetting for these arm64 boxes. It is still mindblowing that it works for alpine and not debian, but then it would not be the first time there would be something really weird in glibc wrt dns resolution.
It works fine on the CI. This has to do with something specific wrt networking on these boxes.
This is nuts. Networking actually works just fine inside the debian container (dig / curl are happy), just NOT for apt.
And again, things work just fine with docker for the same images, on the same machine - which is even more baffling.
Possible culprits would be ipv6 (nerdctl seems to enable ipv6 on the iface by default while docker does not) - or CNI doing something with UDP packets (in the specific hardware context of these boxes) that apt (or glibc) does not like.
I give up on this. No time on my side to deep-dive debug apt / glibc (or CNI for that matter). If anyone else hits this and has ideas on how to debug this further, tag me.
Looks like I just can't let this go (though cannot continue further tonight). Here is where things are:
Standard resolution works just fine - tested for golang apps, using
The message in apt comes from: https://salsa.debian.org/apt-team/apt/-/blob/main/methods/connect.cc?ref_type=heads#L407-430
But then at that point, it is already "too late": with docker, the domain getting resolved is debian.map.fastlydns.net - while with containerd / nerdctl, it is still deb.debian.org.
That is likely because the earlier SRV lookup failed in
Which points to https://salsa.debian.org/apt-team/apt/-/blob/main/apt-pkg/contrib/srvrec.cc?ref_type=heads#L36 as the problem area.
So, intuition here is that there is a problem with SRV lookups - Either:
OMFG
^^^^^^ 🤦🤦🤦🤦🤦🤦🤦🤦 We obviously have a permission problem mounting /etc/resolv.conf with aggressive umasks on the host...
The reason it affects ONLY apt-get is likely that apt-get drops root privileges, so it can no longer read a file that only root may read.
This also affects /etc/hosts, possibly others.
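For anyone wanting to check whether they are hitting the same thing, here is a minimal Go sketch to run inside the container. It only looks at the "other" read bit of the two files mentioned above; the helper name `worldReadable` is made up for this example, and the check is an approximation (it ignores ownership, group membership and ACLs).

```go
package main

import (
	"fmt"
	"os"
)

// worldReadable reports whether the "other" read bit (0o004) is set.
// Hypothetical helper for illustration; it ignores owner, group and ACLs.
func worldReadable(mode os.FileMode) bool {
	return mode.Perm()&0o004 != 0
}

func main() {
	// The two files named in this thread; others may be affected as well.
	for _, path := range []string{"/etc/resolv.conf", "/etc/hosts"} {
		info, err := os.Stat(path)
		if err != nil {
			fmt.Printf("%s: %v\n", path, err)
			continue
		}
		fmt.Printf("%s: mode %04o, readable by others: %t\n",
			path, uint32(info.Mode().Perm()), worldReadable(info.Mode()))
	}
}
```

If either file reports `readable by others: false`, any process that has dropped root inside the container will likely fail to resolve names, while root processes keep working.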
WriteFile sets permissions before umask is applied. For people using aggressive umasks (0077), /etc/resolv.conf will end up unreadable for non-root processes. See containerd#3704 Signed-off-by: apostasie <[email protected]>
WriteFile uses syscall.Open, so permissions are modified by umask, if set. For people using aggressive umasks (0077), /etc/resolv.conf will end up unreadable for non-root processes. See containerd#3704 Signed-off-by: apostasie <[email protected]>
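To illustrate the mechanism described in these commit messages, here is a small Go sketch (Unix only). It is not the actual nerdctl patch: `writeFileExactPerm` and `demo` are names invented for this example, the nameserver address is a placeholder, and the approach shown (an explicit chmod after the write) is just one way to end up with the requested mode regardless of umask.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// writeFileExactPerm writes data, then chmods the file so the final mode
// equals perm. chmod(2) is not subject to the process umask, unlike open(2).
func writeFileExactPerm(path string, data []byte, perm os.FileMode) error {
	if err := os.WriteFile(path, data, perm); err != nil {
		return err
	}
	return os.Chmod(path, perm)
}

// permOf returns the permission bits of the file at path.
func permOf(path string) (os.FileMode, error) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Mode().Perm(), nil
}

// demo writes the same content twice under umask 0077 and returns the
// resulting modes: first with plain os.WriteFile, then with the chmod variant.
func demo() (os.FileMode, os.FileMode, error) {
	old := syscall.Umask(0o077) // simulate the aggressive host umask
	defer syscall.Umask(old)

	dir, err := os.MkdirTemp("", "umask-demo")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)

	data := []byte("nameserver 192.0.2.1\n") // placeholder content
	plain := filepath.Join(dir, "plain")
	fixed := filepath.Join(dir, "fixed")

	if err := os.WriteFile(plain, data, 0o644); err != nil {
		return 0, 0, err
	}
	if err := writeFileExactPerm(fixed, data, 0o644); err != nil {
		return 0, 0, err
	}
	masked, err := permOf(plain)
	if err != nil {
		return 0, 0, err
	}
	exact, err := permOf(fixed)
	if err != nil {
		return 0, 0, err
	}
	return masked, exact, nil
}

func main() {
	masked, exact, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("os.WriteFile alone:  %04o\n", uint32(masked))
	fmt.Printf("with explicit chmod: %04o\n", uint32(exact))
}
```

Under umask 0077 the requested 0644 is masked down to 0600 for the plain write, which is exactly what makes the file unreadable for non-root processes; the chmod variant keeps 0644.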
Description
This is quite confounding.
Will fail
(same with ubuntu)
BUT
Works just fine.
Furthermore:
- net host works just fine - this definitely has to do with bridge
- setting the DNS server explicitly (sudo nerdctl run --dns 1.1.1.1 --rm -ti debian bash) does NOT fix the problem

Since this is working with alpine, my intuition is to blame glibc.
@AkihiroSuda does this problem sound familiar in any way?
Any pointer on how to debug this?
Steps to reproduce the issue
No response
Describe the results you received and expected
na
What version of nerdctl are you using?
Host is:
Are you using a variant of nerdctl? (e.g., Rancher Desktop)
None
Host information
No response