hyper2kvm is a multi-stage pipeline that transforms VMs from any hypervisor into a KVM-ready format.
Think of it as an assembly line: each stage does one job, then passes its result to the next stage.
```mermaid
graph LR
    FETCH[1. FETCH] --> FLATTEN[2. FLATTEN]
    FLATTEN --> INSPECT[3. INSPECT]
    INSPECT --> PLAN[4. PLAN]
    PLAN --> FIX[5. FIX]
    FIX --> CONVERT[6. CONVERT]
    CONVERT --> VALIDATE[7. VALIDATE]
    classDef stage fill:#4CAF50,stroke:#2E7D32,color:#fff
    class FETCH,FLATTEN,INSPECT,PLAN,FIX,CONVERT,VALIDATE stage
```
Stages:

- ✓ Sequential: stages run in strict order
- ✓ Deterministic: same input → same output
- ✓ Isolated: each stage has clear inputs/outputs
- ✓ Recoverable: can resume from a checkpoint
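The assembly-line model above can be sketched as a simple stage runner. This is an illustrative sketch, not the tool's actual code: the stage names are real, but the implementations are stubs that just record their work.

```python
def fetch(steps):    return steps + ["fetched"]     # stub: acquire source disks
def flatten(steps):  return steps + ["flattened"]   # stub: collapse snapshot chain
def inspect(steps):  return steps + ["inspected"]   # stub: detect guest OS
def plan(steps):     return steps + ["planned"]     # stub: decide needed fixes
def fix(steps):      return steps + ["fixed"]       # stub: apply offline fixes
def convert(steps):  return steps + ["converted"]   # stub: transform disk format
def validate(steps): return steps + ["validated"]   # stub: boot test

STAGES = [fetch, flatten, inspect, plan, fix, convert, validate]

def run_pipeline(artifact):
    """Run every stage in strict order; each stage's output feeds the next."""
    for stage in STAGES:
        artifact = stage(artifact)
    return artifact
```

Because each stage is a plain function of the previous stage's output, the pipeline stays deterministic and easy to checkpoint.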
### Stage 1: FETCH

**Purpose:** Acquire source VM disks

**Sources:**

- vSphere
- Azure
- Local files

**Output:** Raw disk files on local storage
```mermaid
graph LR
    S1[vSphere] --> FETCH[FETCH]
    S2[Azure] --> FETCH
    S3[Local] --> FETCH
    FETCH --> O[Local Disk]
    classDef source fill:#FFF3E0,stroke:#F57C00
    classDef stage fill:#4CAF50,stroke:#2E7D32,color:#fff
    classDef output fill:#E3F2FD,stroke:#1565C0
    class S1,S2,S3 source
    class FETCH stage
    class O output
```
### Stage 2: FLATTEN

**Purpose:** Collapse snapshot chains into single files

**Problem solved:** Snapshots leave a disk split across a base image plus delta files; downstream stages need one self-contained disk.

**Process:** Merge the base image and each delta, in order, into a single flat image.

**Output:** Single flat disk file
```mermaid
graph TD
    I1[base.vmdk] --> FLATTEN[FLATTEN]
    I2[delta1.vmdk] --> FLATTEN
    I3[delta2.vmdk] --> FLATTEN
    FLATTEN --> O[flat.vmdk]
    classDef input fill:#FFEBEE,stroke:#C62828
    classDef stage fill:#4CAF50,stroke:#2E7D32,color:#fff
    classDef output fill:#E3F2FD,stroke:#1565C0
    class I1,I2,I3 input
    class FLATTEN stage
    class O output
```
### Stage 3: INSPECT

**Purpose:** Detect guest OS and configuration

**Technology:** libguestfs (offline disk mounting)

**Detects:**

- OS type, distribution, and version
- Firmware and bootloader
- Init system and network manager

**Output:** `GuestIdentity` object
```python
GuestIdentity(
    os_type="linux",
    os_distro="rhel",
    os_version="9.3",
    firmware="uefi",
    bootloader="grub2",
    init_system="systemd",
    network_manager="NetworkManager"
)
```
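One way to model this record is an immutable dataclass. This is a sketch using the field names from the example above; the project's actual class definition may differ.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GuestIdentity:
    """Immutable record of what INSPECT detected about the guest."""
    os_type: str          # "linux" or "windows"
    os_distro: str        # e.g. "rhel"
    os_version: str       # e.g. "9.3"
    firmware: str         # "bios" or "uefi"
    bootloader: str       # e.g. "grub2"
    init_system: str      # e.g. "systemd"
    network_manager: str  # e.g. "NetworkManager"

ident = GuestIdentity("linux", "rhel", "9.3", "uefi",
                      "grub2", "systemd", "NetworkManager")
```

Freezing the dataclass keeps the inspection result read-only, which fits the pipeline's deterministic, stage-isolated design.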
### Stage 4: PLAN

**Purpose:** Decide which fixes are needed

**Planning logic:**

```
IF Windows:
    - Need VirtIO driver injection
    - Need registry modifications
ELSE IF Linux:
    - Check fstab for /dev/disk/by-path
    - Check initramfs for virtio modules
    - Check network config for MAC pinning
```

**Output:** Fix plan (list of required operations)
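The planning logic above translates directly into code. A minimal sketch, assuming the fix plan is a list of operation names; the function and operation names here are illustrative, not the tool's actual API:

```python
def build_fix_plan(identity: dict) -> list:
    """Turn an inspection result into an ordered list of fix operations."""
    plan = []
    if identity["os_type"] == "windows":
        plan += ["inject_virtio_drivers", "modify_registry"]
    elif identity["os_type"] == "linux":
        plan += [
            "check_fstab_by_path",       # fstab may reference /dev/disk/by-path
            "check_initramfs_virtio",    # initramfs may lack virtio modules
            "check_network_mac_pinning", # configs may pin the old MAC address
        ]
    return plan
```

Keeping PLAN as a pure function of the `GuestIdentity` means the plan can be logged, reviewed, or replayed before FIX touches the disk.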
### Stage 5: FIX

**Purpose:** Apply offline fixes to ensure the VM boots on KVM

The critical stage: this is what makes hyper2kvm unique!
```mermaid
graph TD
    FIX[FIX Stage] --> FS[Filesystem]
    FIX --> BL[Bootloader]
    FIX --> NET[Network]
    FIX --> WIN[Windows]
    FS --> F1[fstab: UUID]
    BL --> F2[initramfs: virtio]
    BL --> F3[GRUB: regen]
    NET --> F4[Clean MAC]
    WIN --> F5[VirtIO drivers]
    classDef stage fill:#4CAF50,stroke:#2E7D32,color:#fff
    classDef subsystem fill:#9C27B0,stroke:#6A1B9A,color:#fff
    classDef fix fill:#FF9800,stroke:#E65100,color:#fff
    class FIX stage
    class FS,BL,NET,WIN subsystem
    class F1,F2,F3,F4,F5 fix
```
**Fix details:**

- `/etc/fstab`: `/dev/disk/by-path` → UUID

**Execution modes:**

| Mode | When | How |
|---|---|---|
| Offline | Default | Mount disk with libguestfs, modify files directly |
| Live | Opt-in | SSH to running guest, execute commands |
Offline is safer → No runtime dependencies, works on broken VMs
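The fstab rewrite listed above is, at its core, a text transformation. A minimal sketch, assuming a mapping from by-path device names to discovered UUIDs is already available; the function name and sample values are illustrative:

```python
import re

def rewrite_fstab(fstab_text: str, path_to_uuid: dict) -> str:
    """Replace fragile /dev/disk/by-path specs with stable UUID= entries."""
    out = []
    for line in fstab_text.splitlines():
        m = re.match(r"(/dev/disk/by-path/\S+)(\s+.*)", line)
        if m and m.group(1) in path_to_uuid:
            # Keep mount point and options; swap only the device spec.
            line = f"UUID={path_to_uuid[m.group(1)]}{m.group(2)}"
        out.append(line)
    return "\n".join(out)

fstab = "/dev/disk/by-path/pci-0000:00:10.0-scsi-0:0:0:0-part1 / xfs defaults 0 1"
fixed = rewrite_fstab(
    fstab,
    {"/dev/disk/by-path/pci-0000:00:10.0-scsi-0:0:0:0-part1": "abcd-1234"},
)
```

By-path names encode the source hypervisor's bus topology, which changes under KVM; UUIDs survive the move, which is why this rewrite matters.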
### Stage 6: CONVERT

**Purpose:** Transform disk format

**Technology:** qemu-img

**Conversions:** e.g. VMDK → qcow2

**Options:** Output format selection (`-O`), compression (`-c`)

**Output:** KVM-native disk format
```bash
# Example conversion: VMDK in, compressed qcow2 out
# -f: input format, -O: output format, -c: compress
qemu-img convert \
    -f vmdk \
    -O qcow2 \
    -c \
    input.vmdk output.qcow2
```
### Stage 7: VALIDATE

**Purpose:** Verify the VM boots on KVM

**Tests:** Boot the converted disk under KVM and check for a successful start

**Output:** Pass/Fail + boot log
Multiple disks can flow through the pipeline independently:

```
Disk1: FETCH → FLATTEN → INSPECT → FIX → CONVERT → VALIDATE
Disk2: FETCH → FLATTEN → INSPECT → FIX → CONVERT → VALIDATE
```

or chained, with one disk's run feeding into the next:

```
Disk1: FETCH → FLATTEN → INSPECT → FIX → CONVERT
                                            ↓
Disk2: FETCH → FLATTEN → INSPECT → FIX → CONVERT → VALIDATE
```
```mermaid
graph TD
    START[Source VMDK] --> FETCH[FETCH]
    FETCH --> F1[Downloaded]
    F1 --> FLATTEN[FLATTEN]
    FLATTEN --> F2[Flattened]
    F2 --> INSPECT[INSPECT]
    INSPECT --> F3[OS Detected]
    F3 --> PLAN[PLAN]
    PLAN --> F4[Fix Plan]
    F4 --> FIX[FIX]
    FIX --> F5[Patched]
    F5 --> CONVERT[CONVERT]
    CONVERT --> F6[qcow2 Output]
    F6 --> VALIDATE[VALIDATE]
    VALIDATE --> END[Boot Success]
    classDef stage fill:#4CAF50,stroke:#2E7D32,color:#fff
    classDef data fill:#E3F2FD,stroke:#1565C0
    class FETCH,FLATTEN,INSPECT,PLAN,FIX,CONVERT,VALIDATE stage
    class START,F1,F2,F3,F4,F5,F6,END data
```
What happens: `rhel9.vmdk` from vSphere moves through each stage in order, ending as a validated qcow2.

The Orchestrator coordinates the pipeline:
```mermaid
graph TD
    USER[User] --> ORCH[Orchestrator]
    ORCH --> DISC[Disk Discovery]
    DISC --> PROC[Disk Processor]
    PROC --> P1[Worker 1]
    PROC --> P2[Worker 2]
    P1 --> RES[Results]
    P2 --> RES
    RES --> ORCH
    ORCH --> OUT[Output]
    classDef orchestrator fill:#9C27B0,stroke:#6A1B9A,color:#fff
    classDef worker fill:#FF9800,stroke:#E65100,color:#fff
    classDef result fill:#4CAF50,stroke:#2E7D32,color:#fff
    class ORCH,DISC,RES orchestrator
    class PROC,P1,P2 worker
    class OUT result
```
Key components:

1. `DiskDiscovery`
2. `DiskProcessor`
3. `VsphereExporter` (optional)
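The worker fan-out in the diagram above can be sketched with a thread pool. This is a sketch, not the tool's actual classes: `process_disk` stands in for the full per-disk stage sequence.

```python
from concurrent.futures import ThreadPoolExecutor

def process_disk(disk_path: str) -> dict:
    """Stub for one disk's FETCH → ... → VALIDATE run."""
    return {"disk": disk_path, "status": "ok"}

def run_all(disk_paths, max_workers=2):
    """DiskProcessor sketch: fan disks out to workers, gather results in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with the disk list.
        return list(pool.map(process_disk, disk_paths))

results = run_all(["disk1.vmdk", "disk2.vmdk"])
```

Because each disk's run is isolated, workers never share state, and a failure in one disk does not poison the others.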
The pipeline supports resuming after a failure:
```mermaid
graph LR
    S1[Stage 1: Done] --> S2[Stage 2: Done]
    S2 --> S3[Stage 3: Failed]
    S3 --> CHK[Checkpoint]
    CHK --> RESUME[Resume]
    RESUME --> S3B[Stage 3: Retry]
    S3B --> S4[Stage 4]
    classDef done fill:#4CAF50,stroke:#2E7D32,color:#fff
    classDef failed fill:#F44336,stroke:#C62828,color:#fff
    classDef checkpoint fill:#FF9800,stroke:#E65100,color:#fff
    class S1,S2,S4 done
    class S3 failed
    class CHK,RESUME,S3B checkpoint
```
Checkpoint file example:

```json
{
    "completed_stages": ["fetch", "flatten"],
    "current_stage": "inspect",
    "resume_from": "/tmp/rhel9-flat.vmdk",
    "timestamp": "2024-01-29T10:30:00Z"
}
```
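Resuming from such a checkpoint reduces to set arithmetic over the stage list. A minimal sketch, assuming the checkpoint format shown above; the function name is illustrative:

```python
PIPELINE = ["fetch", "flatten", "inspect", "plan", "fix", "convert", "validate"]

def stages_to_run(checkpoint: dict) -> list:
    """List the stages still pending, in pipeline order, given a checkpoint."""
    done = set(checkpoint["completed_stages"])
    return [stage for stage in PIPELINE if stage not in done]

ckpt = {
    "completed_stages": ["fetch", "flatten"],
    "current_stage": "inspect",
    "resume_from": "/tmp/rhel9-flat.vmdk",
}
```

Deriving the remaining work from `completed_stages` (rather than trusting `current_stage` alone) keeps resume idempotent even if the checkpoint was written mid-stage.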
**CLI mode:** User Command → Pipeline Execution → Exit

```mermaid
graph LR
    C1[CLI: Command] --> C2[Pipeline] --> C3[Exit]
    classDef cli fill:#2196F3,stroke:#1565C0,color:#fff
    class C1,C2,C3 cli
```
**Daemon mode:** Watch Queue → Detect File → Run Pipeline → Archive → Loop

```mermaid
graph LR
    D1[Daemon: Watch] --> D2[Detect]
    D2 --> D3[Pipeline]
    D3 --> D4[Archive]
    D4 --> D1
    classDef daemon fill:#4CAF50,stroke:#2E7D32,color:#fff
    class D1,D2,D3,D4 daemon
```
Each stage has clear responsibilities:

- FETCH: Sources → Local disks
- FLATTEN: Multi-file → Single file
- INSPECT: Disk → Metadata
- FIX: Broken configs → KVM configs
- CONVERT: Any format → KVM format
Pipeline order is sacred:

- ✓ ALLOWED: Skip an entire stage (if not needed)
- ✗ FORBIDDEN: Reorder stages (e.g. FIX before INSPECT)
Same inputs → same outputs: `rhel9.vmdk` always produces an identical qcow2.
Each stage validates:

- INSPECT: Can't detect OS? → FAIL (don't proceed to FIX)
- FIX: Can't mount disk? → FAIL (don't proceed to CONVERT)
- Add new sources = new FETCH implementation
- Add new OS = new FIX rules
- Add new format = new CONVERT handler
Test each stage independently:

```python
def test_flatten_stage():
    # flatten() is the stage entry point; is_single_file() is a test helper.
    source = "multi-extent.vmdk"
    result = flatten(source)
    assert is_single_file(result)
```
Pipeline failures are easy to locate:

```
FETCH:   ✓ Success
FLATTEN: ✓ Success
INSPECT: ✓ Success
FIX:     ✗ Failed at fstab rewrite
```
New pipeline stage = plug it in:

```
FETCH → FLATTEN → INSPECT → [NEW STAGE] → FIX → CONVERT
```
hyper2kvm = a 7-stage pipeline that makes any VM boot on KVM.

```
Source VM → FETCH → FLATTEN → INSPECT → PLAN → FIX → CONVERT → VALIDATE → KVM-ready VM
```

Each stage does one thing well. Stages run in strict order. The pipeline is deterministic and recoverable.

Result: reliable, repeatable VM migrations.