Description
I have done the following
- I have searched the existing issues
- If possible, I've reproduced the issue using the 'main' branch of this project
Steps to reproduce
- Create a network and start a container with --publish port forwarding:
    container network create mynet --subnet 192.168.100.0/24
    container create --name mynginx --net mynet --publish 127.0.0.1:8000:8000 docker.io/library/nginx:alpine
    container start mynginx
- Verify the container is reachable from the host via the gateway IP (192.168.100.1) and via the published port (127.0.0.1:8000). Both work.
- Restart the Mac (full system restart, not just sleep/wake).
- After login, verify the container daemon is running (container system status).
- Attempt to reach the container via the gateway IP or published port.
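The last step can be scripted so that a hang shows up as a bounded timeout rather than a stuck terminal. This is a minimal sketch, not part of the project: the function name probe_published_port is hypothetical, the default URL is the published port from the steps above, and the 5-second limit is an arbitrary choice. It relies only on curl's documented exit codes (7 for a failed connection, 28 for a timeout).

```shell
# Probe the published port and classify the result.
# Prints one of: ok, refused, timeout, error:<curl exit code>.
# "ok" only means an HTTP response arrived; the status code is not checked.
probe_published_port() {
  local url="${1:-http://127.0.0.1:8000/}"   # published port from the repro steps
  local limit="${2:-5}"                      # seconds; arbitrary default
  local rc=0

  # --max-time bounds the whole transfer, so a forwarder that accepts the
  # TCP connection but never answers ends in exit code 28, not a hang.
  curl --silent --output /dev/null --max-time "$limit" "$url" || rc=$?

  case "$rc" in
    0)  echo "ok" ;;
    7)  echo "refused" ;;
    28) echo "timeout" ;;
    *)  echo "error:$rc" ;;
  esac
}
```

Given the behavior described below, running probe_published_port after the restart would be expected to print timeout rather than refused, since the host-side socket still accepts connections.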
Current behavior
After a system restart, the container reports as "running" via container list, but:
- The gateway IP (e.g. 192.168.100.1) is no longer reachable from inside the container. Requests from the container to host services through the gateway hang indefinitely.
- Published ports on the host still accept TCP connections (the NIO socket forwarder binds successfully), but requests hang because the forwarder cannot route traffic to the container's vmnet IP.
- container stop hangs indefinitely (related to #576, "[Bug]: Unable to stop a container when it's frozen."). The graceful shutdown path sends SIGTERM, waits, sends SIGKILL, then calls lc.stop(), but the communication with the VM appears to be broken, so none of these complete.
- container exec also hangs.
- The only recovery is to kill the container's underlying process via kill -9 <pid>, then container rm, then recreate.
This is consistently reproducible across system restarts. It does not require VPN changes (#1307), though that issue may share the same root cause (vmnet interface state not surviving system state changes).
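For reference, the manual recovery described above can be wrapped in a small helper. This is a sketch of the workaround only, under stated assumptions: the function name force_recreate is hypothetical, the PID of the container's underlying process must be found by hand and passed in (this report does not describe a supported way to look it up), and the create arguments are copied from the reproduction steps.

```shell
# Workaround sketch: kill the wedged container's process, remove the
# container, then recreate and start it with the arguments from the repro.
# Usage: force_recreate <pid> [name]
force_recreate() {
  local pid="$1"
  local name="${2:-mynginx}"   # container name from the repro steps

  # container stop hangs in this state, so go straight to SIGKILL.
  kill -9 "$pid" || return 1

  # Remove the now-dead container, then recreate it on the existing network.
  container rm "$name" || return 1
  container create --name "$name" --net mynet \
    --publish 127.0.0.1:8000:8000 docker.io/library/nginx:alpine || return 1
  container start "$name"
}
```

The helper stops at the first failing step so that a failed container rm does not lead to a create attempt against a name that is still in use.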
Expected behavior
Containers should either remain network-reachable after a system restart, or the runtime should detect the broken state and either recover automatically or transition the container to a stopped/error state so that container stop and container rm work without hanging.
Environment
- OS: macOS 26.3.2 (Tahoe)
- Hardware: Apple Silicon (M-series)
- Container: container CLI version 0.10.0
Analysis
Looking at the source code, the vmnet network is created once during ReservedVmnetNetwork.start() via vmnet_network_create() and the resulting vmnet_network_ref is stored in a Mutex<State>. There is no mechanism to detect that the underlying vmnet interface has become invalid after a system restart, and no reconnection or re-creation logic.
The SandboxService.stop() path calls gracefulStopContainer() which relies on communicating with the VM via the container agent. When the VM's network is in a broken state, lc.wait() and lc.stop() appear to block indefinitely, making the container unrecoverable through normal commands.
A potential approach could be to detect stale container/network state on daemon startup after a system restart and either re-establish vmnet interfaces or clean up containers that are in an unrecoverable state.
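Until something like this exists in the daemon, the broken state can at least be detected from the outside by probing the container under a deadline. The following is a rough sketch, not the runtime's own logic: run_with_deadline and probe_container are hypothetical names, the 5-second default is arbitrary, and the probe assumes the image ships a true binary. The 124 status mirrors the exit status GNU timeout uses when a command times out.

```shell
# Run a command, giving up after <seconds> instead of blocking forever.
# Returns the command's exit status, or 124 if the deadline expired.
run_with_deadline() {
  local seconds="$1"
  shift
  local marker rc=0
  marker=$(mktemp) || return 125

  "$@" &
  local cmd_pid=$!

  # Watchdog: once the deadline passes, record that fact, then kill the command.
  ( sleep "$seconds"; echo expired > "$marker"; kill -KILL "$cmd_pid" ) >/dev/null 2>&1 &
  local watchdog_pid=$!

  wait "$cmd_pid" 2>/dev/null || rc=$?

  # The command is done (or was killed); the watchdog is no longer needed.
  kill "$watchdog_pid" 2>/dev/null || true
  wait "$watchdog_pid" 2>/dev/null || true

  # A non-empty marker means the watchdog fired before the command finished.
  if [ -s "$marker" ]; then rc=124; fi
  rm -f "$marker"
  return "$rc"
}

# Classify a container by running a harmless exec under a deadline.
# Prints: responsive, wedged, or error:<exit status>.
probe_container() {
  local name="$1"
  local deadline="${2:-5}"   # seconds; arbitrary default
  local rc=0
  run_with_deadline "$deadline" container exec "$name" -- true >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0)   echo "responsive" ;;
    124) echo "wedged" ;;
    *)   echo "error:$rc" ;;
  esac
}
```

For the container in this report, probe_container mynginx would be expected to print wedged after a restart, since container exec never returns. Note that this only kills the CLI client on timeout; it does not unblock lc.wait() or lc.stop() inside the service.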
Relevant log output
# Container appears running but is unreachable
$ container list
ID IMAGE STATUS NETWORKS
mynginx docker.io/library/nginx:alpine running mynet
# Attempting to stop hangs (must be killed with timeout)
$ timeout 10 container stop mynginx
# (no output, killed by timeout)
# Gateway IP unreachable from container context
$ container exec mynginx -- wget -q -O - --timeout=5 http://192.168.100.1:9090/
# (hangs indefinitely)
Code of Conduct
- I agree to follow this project's Code of Conduct