[Bug]: Container network becomes unreachable after macOS sleep/wake cycle #1321

@adiled

Description

I have done the following

  • I have searched the existing issues
  • If possible, I've reproduced the issue using the 'main' branch of this project

Steps to reproduce

  1. Create a network and start a container with --publish port forwarding:
    container network create mynet --subnet 192.168.100.0/24
    container create --name mynginx --net mynet --publish 127.0.0.1:8000:8000 docker.io/library/nginx:alpine
    container start mynginx
    
  2. Verify the container is reachable from the host via the gateway IP (192.168.100.1) and via the published port (127.0.0.1:8000). Both work.
  3. Restart the Mac (full system restart, not just sleep/wake).
  4. After login, verify the container daemon is running (container system status).
  5. Attempt to reach the container via the gateway IP or published port.
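
Steps 2 and 5 can be run as the same bounded check, so that a hung request shows up as a timeout instead of blocking the terminal. A minimal sketch, assuming curl is available on the host; probe_http is a hypothetical helper name, not something the container CLI provides.

```shell
#!/usr/bin/env bash

# probe_http URL [TIMEOUT_SECONDS]   (hypothetical helper)
# Exits 0 if URL answers with a non-error HTTP status within the timeout.
# Exits 64 on a usage error, otherwise with curl's own exit code
# (28 is curl's code for "operation timed out").
probe_http() {
  local url=${1:-} timeout=${2:-5}
  if [ -z "$url" ]; then
    echo "usage: probe_http URL [TIMEOUT_SECONDS]" >&2
    return 64
  fi
  # --fail turns HTTP errors (>= 400) into a non-zero exit code;
  # --max-time bounds the whole request, including a stalled response.
  curl --silent --fail --output /dev/null --max-time "$timeout" "$url"
}
```

For example, run probe_http http://127.0.0.1:8000/ 5 before and after the restart. An exit code of 28 after the restart would be consistent with the published port accepting the connection but never answering.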

Current behavior

After a system restart, the container reports as "running" via container list, but:

  • The gateway IP (e.g. 192.168.100.1) is no longer reachable from inside the container. Requests from the container to host services through the gateway hang indefinitely.
  • Published ports on the host still accept TCP connections (the NIO socket forwarder binds successfully), but requests hang because the forwarder cannot route traffic to the container's vmnet IP.
  • container stop hangs indefinitely (related to #576, "[Bug]: Unable to stop a container when it's frozen"). The graceful shutdown path sends SIGTERM, waits, sends SIGKILL, then calls lc.stop(), but communication with the VM appears to be broken, so none of these steps completes.
  • container exec also hangs.
  • The only recovery is to kill the container's underlying process via kill -9 <pid>, then container rm, then recreate.

This is consistently reproducible across system restarts. It does not require VPN changes (#1307), though that issue may share the same root cause (vmnet interface state not surviving system state changes).
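
The manual recovery above can be scripted so that a hung container stop never blocks the shell. This is a minimal sketch, not part of the container CLI: run_with_deadline and stop_or_kill are hypothetical helper names, and the PID of the container's underlying process is passed in by hand because this report does not establish how to look it up. Polling keeps the helper free of any dependency on an external timeout utility.

```shell
#!/usr/bin/env bash

# run_with_deadline SECONDS COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND in the background and polls it once per second. Returns the
# command's own exit status, or 124 if it had to be killed at the deadline.
run_with_deadline() {
  local deadline=$1; shift
  "$@" &
  local pid=$! waited=0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$deadline" ]; then
      kill -KILL "$pid" 2>/dev/null
      wait "$pid" 2>/dev/null
      return 124
    fi
    sleep 1
    waited=$((waited + 1))
  done
  wait "$pid"
}

# stop_or_kill NAME RUNTIME_PID [SECONDS]   (hypothetical helper)
# Tries a normal stop first. Only if that stop times out does it fall back
# to the manual recovery from this report: kill -9 the underlying process,
# then remove the container. Any other stop failure is returned unchanged.
stop_or_kill() {
  local name=$1 runtime_pid=$2 deadline=${3:-10} rc=0
  run_with_deadline "$deadline" container stop "$name" || rc=$?
  if [ "$rc" -ne 124 ]; then
    return "$rc"
  fi
  kill -KILL "$runtime_pid" && container rm "$name"
}
```

A call such as stop_or_kill mynginx <pid> 10 then either stops the container normally or, after ten seconds, applies the kill -9 and container rm steps. The container still has to be recreated afterwards.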

Expected behavior

Containers should remain network-reachable after a system restart. Failing that, the runtime should detect the broken state and either recover automatically or transition the container to a stopped/error state, so that container stop and container rm work without hanging.

Environment

- OS: macOS 26.3.2 (Tahoe)
- Hardware: Apple Silicon (M-series)
- Container: container CLI version 0.10.0

Analysis

In the source code, the vmnet network is created once, during ReservedVmnetNetwork.start(), via vmnet_network_create(), and the resulting vmnet_network_ref is stored in a Mutex<State>. There is no mechanism to detect that the underlying vmnet interface has become invalid after a system restart, and no reconnection or re-creation logic.

The SandboxService.stop() path calls gracefulStopContainer() which relies on communicating with the VM via the container agent. When the VM's network is in a broken state, lc.wait() and lc.stop() appear to block indefinitely, making the container unrecoverable through normal commands.

A potential approach could be to detect stale container/network state on daemon startup after a system restart and either re-establish vmnet interfaces or clean up containers that are in an unrecoverable state.
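
As an illustration of the detection half of that idea, the check can be approximated from outside the daemon: a container is treated as stale when it is reported as running but a bounded probe against it fails. This is only a sketch under stated assumptions. container_status and classify are hypothetical helper names, the column layout (ID, IMAGE, STATUS, NETWORKS) is taken from the container list output quoted in this report, and the probe is whatever bounded command the caller supplies.

```shell
#!/usr/bin/env bash

# container_status ID   (hypothetical helper)
# Reads `container list` output on stdin and prints the STATUS column of the
# row whose first column equals ID. Prints nothing if there is no such row.
# Assumes the whitespace-separated layout shown in this report.
container_status() {
  awk -v id="$1" 'NR > 1 && $1 == id { print $3 }'
}

# classify REPORTED_STATUS PROBE_COMMAND [ARGS...]   (hypothetical helper)
# Prints one of: not-running, healthy, stale.
# "stale" means the runtime reports running but the probe command failed.
classify() {
  local reported=$1; shift
  if [ "$reported" != "running" ]; then
    echo "not-running"
  elif "$@" >/dev/null 2>&1; then
    echo "healthy"
  else
    echo "stale"
  fi
}
```

One way to combine the two is classify "$(container list | container_status mynginx)" curl --silent --fail --max-time 5 http://127.0.0.1:8000/. The probe must carry its own timeout, since a plain TCP connect to the published port still succeeds in the broken state.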

Relevant log output

# Container appears running but is unreachable
$ container list
ID           IMAGE                              STATUS    NETWORKS
mynginx      docker.io/library/nginx:alpine     running   mynet

# Attempting to stop hangs (must be killed with timeout)
$ timeout 10 container stop mynginx
# (no output, killed by timeout)

# Gateway IP unreachable from container context
$ container exec mynginx -- wget -q -O - --timeout=5 http://192.168.100.1:9090/
# (hangs indefinitely)

Code of Conduct

  • I agree to follow this project's Code of Conduct
