Regression in 4.21.0.0 KVM agent: Storage VLAN traffic uses br<iface>-<vlan> instead of configured bridge (e.g. cloudbr2) #11634

@catalinpan

problem

After upgrading KVM hosts from 4.20.1.0 to 4.21.0.0, the KVM agent no longer respects the pre-created Linux bridge defined by the traffic label for Storage traffic when a VLAN is configured. The same problem occurs on a clean install of 4.21.0.0.

Instead, the agent automatically creates and uses a new bridge named br<iface>-<vlan> (e.g. brbond0-33).
On 4.20.1.0 with the exact same host configuration, the agent attaches Storage VLAN NICs to the expected cloudbr2 bridge.

This regression might have been introduced by PR #11245, “kvm, ui: fix interface when using vlan subnet for storage traffic type”.

versions

CloudStack 4.21.0.0 (KVM agent), upgraded from CloudStack 4.20.1.0
Zone type: Core
Network type: Basic
Hypervisor: KVM x86_64
Hosts: Ubuntu 24.04, systemd-networkd + Netplan

Bonded uplinks (bond0) with VLANs, bridges cloudbr0/1/2

Netplan details

network:
  version: 2
  renderer: networkd

  ethernets:
    eno1: {}
    eno2: {}

  bonds:
    bond0:
      interfaces: [eno1, eno2]
      dhcp4: no
      parameters:
        mode: 802.3ad
        mii-monitor-interval: 100
        transmit-hash-policy: layer3+4 # tested layer2 also

  vlans:
    bond0.56:
      id: 56
      link: bond0
    bond0.33:
      id: 33
      link: bond0

  bridges:
    cloudbr0:
      interfaces: [bond0]
      addresses: [10.16.10.10/24]
      parameters: { stp: false, forward-delay: 5 }
      routes:
# ..... REDACTED..........

    cloudbr1:
      interfaces: [bond0.56]
      addresses: [10.15.10.10/16]
      parameters: { stp: false, forward-delay: 5 }
      routes:
# ..... REDACTED..........

    cloudbr2:
      interfaces: [bond0.33]
      addresses: [10.13.10.10/16]
      dhcp4: no
      parameters: { stp: false, forward-delay: 5 }

The steps to reproduce the bug

  • Define a Storage network in CloudStack with VLAN (e.g. VLAN 33).
  • On the KVM host, create a bridge named cloudbr2 bound to the correct tagged interface (e.g. bond0.33).
  • Set the traffic label for Storage to cloudbr2.
  • Deploy a System VM or guest VM requiring Storage.
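
Before deploying, it can help to confirm that the pre-created bridge really exists and has the tagged interface as a member. The sketch below reads the standard Linux sysfs layout (/sys/class/net/<bridge>/brif/). The function name bridge_ports and the sysfs_root parameter are illustrative, not part of CloudStack.

```python
from pathlib import Path


def bridge_ports(bridge, sysfs_root="/sys/class/net"):
    """Return the sorted member ports of a Linux bridge.

    Returns None when `bridge` does not exist or is not a bridge
    (only bridge devices have a `brif` directory in sysfs).
    """
    brif = Path(sysfs_root) / bridge / "brif"
    if not brif.is_dir():
        return None
    # Each entry under brif/ is named after one enslaved interface.
    return sorted(port.name for port in brif.iterdir())
```

With the Netplan configuration above, bridge_ports("cloudbr2") would be expected to include bond0.33 on the host.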

Observed (4.21.0.0 agent):

  • KVM agent creates and uses a new bridge brbond0-33.
  • VM XML shows NIC source brbond0-33.
  • The pre-created cloudbr2 is ignored.
  • secondarystoragevm remains stuck in the Connecting state and never reaches the management server; the connection to the storage network also fails.

Expected (and actual on 4.20.1.0 agent):

  • Agent attaches Storage NIC to the configured cloudbr2 bridge.
  • No new bridges are auto-created.
  • secondarystoragevm works as expected, the storage connection works, and the agent also connects to the management server.
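
To tell the observed and expected cases apart, the NIC source bridges can be read from the VM's libvirt domain XML (for example the output of virsh dumpxml). This is a minimal sketch: the helper name nic_bridges and the SAMPLE document are made up for illustration and only mimic the brbond0-33 case described above.

```python
import xml.etree.ElementTree as ET

# Illustrative domain XML fragment, not taken from a real host.
SAMPLE = """
<domain type='kvm'>
  <devices>
    <interface type='bridge'><source bridge='cloudbr0'/></interface>
    <interface type='bridge'><source bridge='brbond0-33'/></interface>
  </devices>
</domain>
"""


def nic_bridges(domain_xml):
    """Return the source bridge of every bridge-type NIC, in document order."""
    root = ET.fromstring(domain_xml)
    sources = root.findall("./devices/interface[@type='bridge']/source")
    return [src.get("bridge") for src in sources if src.get("bridge")]


def unexpected_bridges(domain_xml, expected):
    """Return the NIC bridges that are not in the set of pre-created bridges."""
    return [b for b in nic_bridges(domain_xml) if b not in expected]
```

For the sample, unexpected_bridges(SAMPLE, {"cloudbr0", "cloudbr1", "cloudbr2"}) flags brbond0-33, which matches the behaviour observed on 4.21.0.0.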

What to do about it?

Restore pre-4.21 behavior:

  • For Storage traffic with VLAN, if the traffic label is a bridge, use that bridge directly.
  • Do not unconditionally create br<iface>-<vlan> for Storage VLANs.
  • Alternatively, make this behaviour configurable to avoid breaking existing automated setups and keep the default as it used to be in 4.20.1.0.
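
The requested selection rule can be sketched as follows. This is not the agent's actual code: the function name, its parameters, and the prefer_label switch are hypothetical, and only the br<iface>-<vlan> naming is taken from the behaviour reported above.

```python
def pick_storage_bridge(traffic_label, phys_iface, vlan_id,
                        existing_bridges, prefer_label=True):
    """Choose the bridge for a Storage NIC on a VLAN.

    prefer_label=True models the requested (pre-4.21) behaviour:
    if the traffic label names an existing bridge, use it directly.
    prefer_label=False models the behaviour reported for 4.21.0.0:
    always derive a per-VLAN bridge name.
    """
    if prefer_label and traffic_label in existing_bridges:
        return traffic_label
    # Fallback: per-VLAN bridge named br<iface>-<vlan>.
    return f"br{phys_iface}-{vlan_id}"
```

With bridges cloudbr0, cloudbr1 and cloudbr2 present, the default returns cloudbr2 for the Storage label, while prefer_label=False yields brbond0-33.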
