DVS Migration Recovery Guide

Issue: ESXi Host Lost Network Connectivity During DVS Migration

Current Status

esxi-nuc-01.markalston.net (192.168.10.8): Network connectivity lost during orphaned DVS removal

Root Cause: The migration script attempted to move the management network from the orphaned DVS (vc01-dvs) to a standard vSwitch, but the process was interrupted, leaving the host in an inconsistent network state.

Recovery Options

Option 1: Console Access

If you have physical or IPMI/iLO console access to esxi-nuc-01:

  1. Access the ESXi console:
    • Physical console (monitor + keyboard)
    • IPMI/iLO/BMC remote console
    • VMware vSphere Remote Console (if available)
  2. Log in as root at the ESXi Shell and run these commands:
# Remove any conflicting network configuration
esxcli network ip interface remove -i vmk0 2>/dev/null || true
esxcli network ip interface remove -i vmk1 2>/dev/null || true

# Ensure vSwitch0 exists
esxcli network vswitch standard add -v vSwitch0 2>/dev/null || true

# Add vmnic0 to vSwitch0
esxcli network vswitch standard uplink add -v vSwitch0 -u vmnic0 2>/dev/null || true

# Create Management Network port group
esxcli network vswitch standard portgroup add -v vSwitch0 -p "Management Network" 2>/dev/null || true

# Recreate management interface
esxcli network ip interface add -i vmk0 -p "Management Network"
esxcli network ip interface ipv4 set -i vmk0 -I 192.168.10.8 -N 255.255.255.0 -t static
esxcli network ip interface tag add -i vmk0 -t Management

# Set default gateway
esxcli network ip route ipv4 add -g 192.168.10.1 -n default

# Remove orphaned DVS (correct command)
esxcfg-vswitch --delete --dvswitch vc01-dvs 2>/dev/null || true

# Restart management agents (hostd and vpxa)
/etc/init.d/hostd restart
/etc/init.d/vpxa restart
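
The same sequence applies to any host with a different address. The sketch below is a hypothetical helper (the name print_recovery_commands is not part of the original scripts) that only prints the core commands for a given IP, netmask, and gateway so they can be reviewed and then typed at the console. It assumes the same vSwitch0, vmnic0, and "Management Network" names used above and leaves out the cleanup and agent-restart steps.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints (does not run) the core management-network
# recovery commands for one host. Assumes vSwitch0 / vmnic0 / vmk0 as above.
print_recovery_commands() {
    local ip="$1" netmask="$2" gateway="$3"
    if [[ -z "$ip" || -z "$netmask" || -z "$gateway" ]]; then
        echo "Usage: print_recovery_commands <ip> <netmask> <gateway>" >&2
        return 1
    fi
    # Unquoted heredoc delimiter so the three parameters are expanded
    cat <<EOF
esxcli network vswitch standard add -v vSwitch0
esxcli network vswitch standard uplink add -v vSwitch0 -u vmnic0
esxcli network vswitch standard portgroup add -v vSwitch0 -p "Management Network"
esxcli network ip interface add -i vmk0 -p "Management Network"
esxcli network ip interface ipv4 set -i vmk0 -I ${ip} -N ${netmask} -t static
esxcli network ip interface tag add -i vmk0 -t Management
esxcli network ip route ipv4 add -g ${gateway} -n default
EOF
}
```

For esxi-nuc-01 this would be called as: print_recovery_commands 192.168.10.8 255.255.255.0 192.168.10.1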

Option 2: DCUI (Direct Console User Interface)

  1. Access DCUI (press F2 at console)
  2. Login with root credentials
  3. Navigate to: Configure Management Network
  4. Select: Network Adapters
  5. Select vmnic0 and confirm
  6. Configure IP: Static IP 192.168.10.8, netmask 255.255.255.0
  7. Set Gateway: 192.168.10.1
  8. Apply changes and restart management network

Option 3: Network Reset (Last Resort)

If the above options don’t work:

  1. Access console
  2. Navigate to DCUI (F2)
  3. Select: Restart Management Network
  4. If that fails, go to: Network Restore Options
  5. Select: Restore Network Settings (this resets the management network to defaults, so reconfigure the IP afterwards as in Option 2)

Prevention for Remaining Hosts

Before migrating esxi-nuc-02 and esxi-nuc-03, create a safer migration script:

#!/usr/bin/env bash
# safe-dvs-migration.sh - Safer approach for remaining hosts

HOST="$1"      # hostname that resolves to the current management IP (vmk0)
MGMT_IP="$2"   # management IP to keep on vmk0
TEMP_IP="$3"   # unused IP in 192.168.10.0/24 for the temporary vmk1
if [[ -z "$HOST" || -z "$MGMT_IP" || -z "$TEMP_IP" ]]; then
    echo "Usage: $0 <hostname> <management-ip> <temporary-ip>"
    exit 1
fi

# First, ensure we have console access instructions ready
echo "Ensure you have console access to $HOST before proceeding!"
read -r -p "Continue? (y/N): " confirm
if [[ "$confirm" != "y" ]]; then
    exit 0
fi

# Create standard switch and management network first (non-disruptive)
ssh "root@$HOST" "esxcli network vswitch standard add -v vSwitch0" 2>/dev/null || true
ssh "root@$HOST" "esxcli network vswitch standard portgroup add -v vSwitch0 -p 'Management Network'" 2>/dev/null || true

# Add uplink to standard switch (brief sharing with DVS)
ssh "root@$HOST" "esxcli network vswitch standard uplink add -v vSwitch0 -u vmnic0"

# Create temporary management interface
ssh "root@$HOST" "esxcli network ip interface add -i vmk1 -p 'Management Network'"
ssh "root@$HOST" "esxcli network ip interface ipv4 set -i vmk1 -I $TEMP_IP -N 255.255.255.0 -t static"

# Test connectivity before proceeding
if ping -c 2 "$TEMP_IP"; then
    echo "Temporary interface working, completing migration..."
    # Recreate vmk0 in one session over the temporary interface: removing vmk0
    # drops any session that arrived through it. Assumes SSH answers on vmk1.
    ssh "root@$TEMP_IP" "esxcli network ip interface remove -i vmk0 && \
        esxcli network ip interface add -i vmk0 -p 'Management Network' && \
        esxcli network ip interface ipv4 set -i vmk0 -I $MGMT_IP -N 255.255.255.0 -t static && \
        esxcli network ip interface tag add -i vmk0 -t Management"
    # Clean up over the restored management interface
    ssh "root@$MGMT_IP" "esxcli network ip interface remove -i vmk1"
    # DVS removal uses esxcfg-vswitch, not esxcli (see Lessons Learned)
    ssh "root@$MGMT_IP" "esxcfg-vswitch --delete --dvswitch vc01-dvs"
else
    echo "Temporary interface failed, rolling back..."
    ssh "root@$HOST" "esxcli network ip interface remove -i vmk1" 2>/dev/null || true
    ssh "root@$HOST" "esxcli network vswitch standard uplink remove -v vSwitch0 -u vmnic0" 2>/dev/null || true
fi

Current Infrastructure Status

  • esxi-nuc-01.markalston.net: ❌ Network connectivity lost - requires console recovery
  • esxi-nuc-02.markalston.net: ❌ Network connectivity lost - requires console recovery
  • esxi-nuc-03.markalston.net: ✅ Online - has orphaned DVS
  • macpro.markalston.net: ✅ Online - hosting vCenter
  • vcsa.markalston.net: ✅ Online - vCenter Server

Next Steps

  1. Recover esxi-nuc-01 using console access (Option 1 above)
  2. Test connectivity to esxi-nuc-01 before proceeding
  3. Use safer migration approach for esxi-nuc-02 and esxi-nuc-03
  4. Continue with traditional cluster setup once all DVS issues are resolved
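
The connectivity test in step 2 can be run as a retry loop from a workstation. The following is a minimal sketch, not one of the project scripts: the function names wait_for_host and probe_host are made up for illustration, and the probe is a single ping that can be swapped for any other check (for example an SSH connection test).

```shell
#!/usr/bin/env bash
# probe_host <host>: one reachability check. A single ping here; replace the
# body with any other check that returns 0 on success.
probe_host() {
    ping -c 1 "$1" >/dev/null 2>&1
}

# wait_for_host <host> [attempts] [delay-seconds]
# Returns 0 as soon as probe_host succeeds, 1 after all attempts fail.
wait_for_host() {
    local host="$1" attempts="${2:-10}" delay="${3:-5}" i
    for ((i = 1; i <= attempts; i++)); do
        if probe_host "$host"; then
            echo "$host reachable (attempt $i)"
            return 0
        fi
        echo "$host not reachable yet (attempt $i/$attempts)" >&2
        # Pause between attempts, but not after the last one
        if ((i < attempts)); then
            sleep "$delay"
        fi
    done
    return 1
}
```

Example: wait_for_host esxi-nuc-01.markalston.net && echo "Safe to continue with the next host"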

Lessons Learned

  • CRITICAL: Single NIC hosts with management on DVS cannot be safely migrated via CLI
  • Direct manipulation of DVS uplinks (esxcfg-vswitch -Q/-U commands) immediately breaks management network
  • The correct ESXi command for DVS removal is esxcfg-vswitch --delete --dvswitch <name>, NOT esxcli
  • Orphaned DVS removal requires console access when management network is on the DVS
  • Always have console access available when modifying management networks
  • Test temporary interfaces thoroughly before removing original ones
  • Consider using vCenter GUI for DVS removal when available (safest method)
  • Back up the network configuration before major changes
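
One lightweight way to act on the last point is to save the host's current network listings before touching anything. The sketch below is an illustration only: snapshot_network_state and the ./network-backup default directory are hypothetical names, and the saved files are a reference for manual recovery, not something that can be restored automatically. It uses only read-only esxcli listing commands over SSH.

```shell
#!/usr/bin/env bash
# snapshot_network_state <host> [output-dir]
# Saves read-only esxcli listings from the host into text files and prints
# the directory they were written to.
snapshot_network_state() {
    local host="$1" outdir="${2:-./network-backup}" item name
    if [[ -z "$host" ]]; then
        echo "Usage: snapshot_network_state <host> [output-dir]" >&2
        return 1
    fi
    mkdir -p "$outdir/$host" || return 1
    # Each entry is "<file name>:<esxcli subcommand>"
    local -a items=(
        "vmk-interfaces:network ip interface list"
        "vmk-ipv4:network ip interface ipv4 get"
        "routes:network ip route ipv4 list"
        "standard-switches:network vswitch standard list"
        "dvs-switches:network vswitch dvs vmware list"
        "nics:network nic list"
    )
    for item in "${items[@]}"; do
        name="${item%%:*}"
        # </dev/null keeps ssh from consuming the caller's stdin
        ssh "root@$host" "esxcli ${item#*:}" </dev/null \
            > "$outdir/$host/$name.txt" || return 1
    done
    echo "$outdir/$host"
}
```

Example: snapshot_network_state esxi-nuc-03.markalston.net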

Critical Discovery: Single NIC + DVS Management = Console Required

The Intel NUCs have only one physical NIC (vmnic0) and the management interface (vmk0) is on the orphaned DVS. Any attempt to manipulate the DVS uplinks or remove interfaces directly breaks network connectivity immediately.
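
A pre-flight check can flag this situation before a migration starts. The sketch below counts physical NICs from the output of esxcli network nic list; the function names are hypothetical, and it assumes each NIC row in that listing begins with its vmnic name.

```shell
#!/usr/bin/env bash
# count_physical_nics: reads `esxcli network nic list` output on stdin and
# prints how many rows start with "vmnic<number>".
count_physical_nics() {
    # grep -c still prints 0 when nothing matches; || true hides its exit code
    grep -c '^vmnic[0-9]' || true
}

# is_single_nic_host: succeeds when exactly one vmnic row is found on stdin.
is_single_nic_host() {
    [[ "$(count_physical_nics)" -eq 1 ]]
}
```

Example: ssh root@esxi-nuc-03.markalston.net "esxcli network nic list" | is_single_nic_host && echo "Single NIC: arrange console access first"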

Files Created During This Issue

  • scripts/migrate-management-network.sh - Original migration script
  • scripts/migrate-single-host.sh - Single-host focused script (caused the issue)
  • docs/troubleshooting/dvs-migration-recovery.md - This recovery guide

The network migration will need to be completed manually via console access for esxi-nuc-01.


This project is for educational and home lab purposes.