SONiC KVM Testbed

Version
1.0.0

Overview

Deploy and manage a SONiC sonic-mgmt KVM virtual testbed with cEOS neighbors for running pytest-based network tests.

SONiC KVM Virtual Testbed

Deploy a local sonic-mgmt KVM testbed with cEOS neighbors on a single machine.

Architecture

text
Host Machine (KVM + Docker)
├── vlab-XX (KVM VM running sonic-vs) — DUT
├── ceos_vmsX-Y_VMZZ (Docker) — cEOS neighbor(s)
├── ptf_vmsX-Y (Docker) — PTF test traffic generator
└── sonic-mgmt (Docker) — Ansible + pytest framework

Management network: br1 bridge, 10.250.0.0/24 (host at .1).

Supported Topologies

| Testbed Name | Topo | DUT | VM Base | Neighbors (raw → converged) |
|---|---|---|---|---|
| vms-kvm-t0 | t0 | vlab-01 | VM0100 | 4 → 1 cEOS |
| vms-kvm-t1-lag | t1-lag | vlab-03 | VM0104 | 24 → 2 cEOS |

Use use_converged_peers: true in vtestbed.yaml to reduce cEOS containers via multi-VRF convergence (requires PR #22399 in master branch).

Prerequisites

  • Ubuntu 20.04/22.04/24.04, KVM enabled (kvm-ok)
  • 30 GB+ RAM for a single topology, or 20 GB+ with reduced VM memory
  • Docker installed, user in docker, kvm, libvirt groups
  • Built sonic-vs.img.gz from sonic-buildimage
  • cEOS image file (e.g., cEOS64-lab-4.32.5M.tar.xz)
  • sshpass installed on host
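
The list above can be checked before starting. The following is a minimal sketch, not part of sonic-mgmt: the function names (in_group_list, kb_to_gib, preflight) are hypothetical, and the 20 GiB floor is taken from the reduced-memory figure in the list. It only reads host state and changes nothing.

```bash
#!/usr/bin/env bash
# Preflight sketch: report which prerequisites from the list are missing.

# Return 0 if the wanted group appears in a space-separated group list.
in_group_list() {
  local wanted="$1" list="$2" g
  for g in $list; do
    [ "$g" = "$wanted" ] && return 0
  done
  return 1
}

# Convert a MemTotal value in kB to whole GiB (integer division).
kb_to_gib() {
  echo $(( $1 / 1024 / 1024 ))
}

# Print one line per missing prerequisite; return non-zero if any is missing.
preflight() {
  local missing=0 groups_now mem_kb tool g
  groups_now="$(id -nG)"
  for g in docker kvm libvirt; do
    in_group_list "$g" "$groups_now" || { echo "MISSING group: $g"; missing=1; }
  done
  for tool in docker sshpass kvm-ok; do
    command -v "$tool" >/dev/null 2>&1 || { echo "MISSING tool: $tool"; missing=1; }
  done
  mem_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
  if [ "$(kb_to_gib "${mem_kb:-0}")" -lt 20 ]; then
    echo "LOW memory: under 20 GiB"
    missing=1
  fi
  return "$missing"
}
```

Run it as `preflight && echo "prerequisites look OK"`. Note that MemTotal is slightly below the installed RAM, so a host close to the threshold may be flagged.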

Deploy Procedure

1. Initial Setup (one-time)

bash
# Clone repo
git clone https://github.com/sonic-net/sonic-mgmt.git ~/sonic-mgmt
cd ~/sonic-mgmt && git checkout master  # PR #22399 needed for auto-convergence

# Prepare images
mkdir -p ~/veos-vm/images ~/sonic-vm/images
gunzip -k sonic-vs.img.gz
cp sonic-vs.img ~/veos-vm/images/ && cp sonic-vs.img ~/sonic-vm/images/

# Import cEOS (docker import, NOT docker load)
xz -d cEOS64-lab-4.32.5M.tar.xz
docker import cEOS64-lab-4.32.5M.tar ceosimage:4.32.5M

# Management bridge
cd ~/sonic-mgmt/ansible && sudo ./setup-management-network.sh

# debian:jessie dependency
docker pull publicmirror.azurecr.io/debian:jessie
docker tag publicmirror.azurecr.io/debian:jessie debian:jessie

# sonic-mgmt container (setup-container.sh lives in the repo root, not in ansible/)
cd ~/sonic-mgmt && ./setup-container.sh -n sonic-mgmt -d /data

# Create vault password file
echo "abc" > ~/sonic-mgmt/ansible/password.txt
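
Before moving on, it may help to confirm that the images from the setup step ended up where the later steps expect them. This is a sketch under the assumption that the paths and the ceosimage:4.32.5M tag are exactly those used above; missing_images and ceos_image_present are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Print every expected sonic-vs image path that is missing under a home directory.
missing_images() {
  local home_dir="$1" f
  for f in veos-vm/images/sonic-vs.img sonic-vm/images/sonic-vs.img; do
    [ -f "$home_dir/$f" ] || echo "$home_dir/$f"
  done
}

# Return 0 if the imported cEOS image is known to the local Docker daemon.
ceos_image_present() {
  docker image inspect "ceosimage:4.32.5M" >/dev/null 2>&1
}
```

For example, `missing_images "$HOME"` prints nothing when both copies exist, and `ceos_image_present || echo "cEOS image not imported"` flags a skipped docker import.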

2. Configure Credentials and Settings

See references/credentials.md for all config files.

Critical files (these reset on git operations — automate fixes in a script):

| File | Key Setting | Why |
|---|---|---|
| group_vars/vm_host/creds.yml | vm_host_user: <your_user> | Host SSH access |
| group_vars/all/creds.yml | sonic_login: "<dut_user>" | DUT SSH user (matches sonic-vs build user) |
| group_vars/all/ceos.yml | skip_ceos_image_downloading: true | Use local cEOS image |
| group_vars/vm_host/main.yml | max_fp_num: 127 | Default 4 is too low for T0/T1 |
| veos_vtb | ansible_user: <your_user> | Inventory host user |
| veos | Comment out STR-ACS-SERV-01 | Avoid dual-host conflict |
| vars/docker_registry.yml | Remove :443 from host | :443 causes docker pull to hang |
| vtestbed.yaml | use_converged_peers: true | Enable multi-VRF convergence |

Create a fix script to re-apply all settings. Run it before EVERY testbed operation.
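
The source does not include the fix script itself, so the following is only a sketch of what fix-configs.sh could look like. It assumes it runs from sonic-mgmt/ansible, that each setting sits on its own top-level "key: value" line, and that GNU sed is available; set_yaml_key and apply_fixes are hypothetical names. The veos_vtb, veos, and vtestbed.yaml edits are left out because their structure is nested and varies between checkouts.

```bash
#!/usr/bin/env bash
# fix-configs.sh sketch: re-apply local overrides that get reverted.

# Set "key: value" in a YAML file, replacing an existing top-level line
# or appending one. Values containing "|" or "&" are not handled.
set_yaml_key() {
  local file="$1" key="$2" value="$3"
  if grep -q "^${key}:" "$file"; then
    sed -i "s|^${key}:.*|${key}: ${value}|" "$file"
  else
    printf '%s: %s\n' "$key" "$value" >> "$file"
  fi
}

# Apply the flat-key settings from the table for a host user and a DUT user.
apply_fixes() {
  local user="$1" dut_user="$2"
  set_yaml_key group_vars/vm_host/creds.yml vm_host_user "$user"
  set_yaml_key group_vars/all/creds.yml sonic_login "\"$dut_user\""
  set_yaml_key group_vars/all/ceos.yml skip_ceos_image_downloading true
  set_yaml_key group_vars/vm_host/main.yml max_fp_num 127
  # Strip a trailing :443 from the registry host line.
  sed -i 's|^\(docker_registry_host:.*\):443[[:space:]]*$|\1|' vars/docker_registry.yml
}
```

A call such as `apply_fixes "$USER" <dut_user>` at the end of the script would re-apply everything in one pass. Because each edit is a replace-or-append, running it repeatedly leaves the files unchanged after the first run.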

3. Deploy Topology

bash
# Fix configs + remove stale .bak
bash fix-configs.sh
rm -f vars/topo_<TOPO>.yml.bak

# Inside sonic-mgmt container:
./testbed-cli.sh -t vtestbed.yaml add-topo <TESTBED_NAME> password.txt

Duration: ~15-20 minutes (VM boot + cEOS startup).

4. Post-Deploy DUT Setup

After add-topo, the DUT only has the build user's account. The multi_passwd_ssh plugin expects an admin account, so create one:

bash
# SSH to DUT as build user
ssh <build_user>@<DUT_IP>

# Create admin user
sudo useradd -m -s /bin/bash -G sudo,docker admin
echo 'admin:password' | sudo chpasswd
sudo bash -c "echo 'admin ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/admin"

# Fix docker socket
sudo chmod 666 /var/run/docker.sock

5. Deploy Minigraph

bash
# Fix configs + remove .bak AGAIN (they revert!)
bash fix-configs.sh
rm -f vars/topo_<TOPO>.yml.bak

./testbed-cli.sh -t vtestbed.yaml deploy-mg <TESTBED_NAME> veos_vtb password.txt

Duration: ~5-10 minutes.

6. Verify

bash
# Check containers
docker ps | grep -E "ceos|ptf"

# Check BGP (use admin after deploy-mg)
sshpass -p password ssh admin@<DUT_IP> "show ip bgp summary"

Expected BGP state with converged peers:

  • T0: ARISTA01T1 Established (6400 prefixes), ARISTA02-04T1 Active (normal — VRF peers without physical port-channels)
  • T1-LAG: 17/24 sessions up (all T0 + 1 T2 spine; remaining T2 spines Active)
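
To compare a live DUT against these expectations, the summary output can be reduced to an established/total count. This sketch assumes the usual SONiC table layout, with the neighbor address in column 1 and State/PfxRcd in column 10, where a numeric value means Established. count_bgp_sessions is a hypothetical helper, and it only matches IPv4 neighbors.

```bash
#!/usr/bin/env bash
# Count BGP sessions from "show ip bgp summary" text on stdin.
# Prints "<established>/<total>" for IPv4 neighbor rows.
count_bgp_sessions() {
  awk '
    $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
      total++
      # A numeric State/PfxRcd column is a prefix count, i.e. Established.
      if ($10 ~ /^[0-9]+$/) up++
    }
    END { printf "%d/%d\n", up, total }
  '
}
```

Usage: `sshpass -p password ssh admin@<DUT_IP> "show ip bgp summary" | count_bgp_sessions`. If the column layout differs on a given build, the field number in the awk script needs adjusting.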

7. Run Tests

bash
cd /data/sonic-mgmt/tests
./run_tests.sh -n <TESTBED_NAME> -d <DUT_NAME> -c <test_path> \
  -f vtestbed.yaml -i ../ansible/veos_vtb
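
As a concrete illustration of filling in the placeholders, the sketch below maps each testbed name to its DUT using the Supported Topologies table and prints the resulting command. dut_for_testbed and run_tests_cmd are hypothetical helpers, and bgp/test_bgp_fact.py is just one example test path from the sonic-mgmt tests tree.

```bash
#!/usr/bin/env bash
# Map a testbed name to its DUT, per the Supported Topologies table.
dut_for_testbed() {
  case "$1" in
    vms-kvm-t0)     echo "vlab-01" ;;
    vms-kvm-t1-lag) echo "vlab-03" ;;
    *) echo "unknown testbed: $1" >&2; return 1 ;;
  esac
}

# Print the run_tests.sh command line for a testbed and test path.
run_tests_cmd() {
  local testbed="$1" test_path="$2" dut
  dut="$(dut_for_testbed "$testbed")" || return 1
  echo "./run_tests.sh -n $testbed -d $dut -c $test_path -f vtestbed.yaml -i ../ansible/veos_vtb"
}
```

For example, `run_tests_cmd vms-kvm-t0 bgp/test_bgp_fact.py` prints the full command for the T0 testbed, ready to paste into the sonic-mgmt container.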

Teardown

bash
bash fix-configs.sh
rm -f vars/topo_<TOPO>.yml.bak
./testbed-cli.sh -t vtestbed.yaml remove-topo <TESTBED_NAME> password.txt

Duration: ~12-15 minutes.
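
The add-topo, deploy-mg, and remove-topo steps all share the same preamble, so one wrapper can keep it from being forgotten. This is a sketch, assuming fix-configs.sh and testbed-cli.sh are both reachable from the current directory (for example, the ansible directory as seen inside the sonic-mgmt container). testbed_args, run_testbed_op, and the DRY_RUN switch are hypothetical.

```bash
#!/usr/bin/env bash
# Build the testbed-cli.sh argument list for one operation.
# deploy-mg takes an extra inventory argument (veos_vtb); the others do not.
testbed_args() {
  local op="$1" testbed="$2"
  case "$op" in
    add-topo|remove-topo) echo "-t vtestbed.yaml $op $testbed password.txt" ;;
    deploy-mg)            echo "-t vtestbed.yaml $op $testbed veos_vtb password.txt" ;;
    *) echo "unknown operation: $op" >&2; return 1 ;;
  esac
}

# Re-apply configs, drop the stale topo backup, then run the operation.
# Set DRY_RUN=1 to print the command instead of executing anything.
run_testbed_op() {
  local op="$1" testbed="$2" topo="$3" args
  args="$(testbed_args "$op" "$testbed")" || return 1
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "./testbed-cli.sh $args"
    return 0
  fi
  bash fix-configs.sh || return 1
  rm -f "vars/topo_${topo}.yml.bak"
  # Word splitting of $args is intended here.
  ./testbed-cli.sh $args
}
```

For example, `run_testbed_op remove-topo vms-kvm-t0 t0` performs the teardown above, and prefixing it with `DRY_RUN=1` shows the command first.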

Critical Gotchas

  • Config files revert during git and testbed operations — run fix script before EVERY command
  • Remove .bak files before add-topo — stale backups cause KeyError in converger
  • docker import for cEOS (not docker load)
  • :443 in docker_registry_host silently hangs docker pulls
  • max_fp_num: 4 is too low — set to 127
  • br1 bridge is not persistent across reboots — add netplan config
  • Non-admin builds: sonic-vs uses the build machine's username, not admin
  • use_converged_peers: true requires master branch (PR #22399) for auto-convergence
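
Until a persistent bridge configuration is in place, a quick check after each reboot can catch the missing br1 before a deploy fails. This sketch does not replace the netplan config mentioned above. br1_exists and ensure_br1 are hypothetical helpers, and IP_CMD is an assumed override hook that lets the check be exercised without a real bridge.

```bash
#!/usr/bin/env bash
# Return 0 if the br1 bridge exists on this host.
br1_exists() {
  "${IP_CMD:-ip}" link show br1 >/dev/null 2>&1
}

# Report the bridge state; return non-zero when br1 is missing.
ensure_br1() {
  if br1_exists; then
    echo "br1 present"
  else
    echo "br1 missing: re-run sudo ./setup-management-network.sh from ~/sonic-mgmt/ansible"
    return 1
  fi
}
```

Running `ensure_br1` before add-topo gives an early, readable failure if the bridge has not been recreated.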

Troubleshooting

See references/troubleshooting.md for detailed diagnosis of common failures.

Installation

bash
openclaw install sonic-kvm-testbed

Tags

#devops_and-cloud

Quick Info

Category Development
Model Claude 3.5
Complexity One-Click
Author yxieca
Last Updated 3/10/2026