hyper2kvm

ARCHITECTURE.md — hyper2kvm Internal Architecture


Purpose

This document describes hyper2kvm’s module-level architecture, execution flow, and core architectural principles.

It’s designed for:

hyper2kvm is laser-focused on fixing “successful” conversions that fail at boot, lose network connectivity, or exhibit instability post-migration. This architecture document explains how the modular design achieves reliability through:


The Canonical Pipeline

At the heart of every migration is this invariant flow:

FETCH → FLATTEN → INSPECT → PLAN → FIX → CONVERT → VALIDATE / TEST

Not every command executes every stage, but the order is sacred. Stages can be skipped, but never reordered or interleaved.
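
The ordering rule can be stated as a small check. This is a minimal sketch, not hyper2kvm's actual code: the `Stage` enum and `validate_stage_sequence` helper are hypothetical names, and the sketch only encodes "stages may be skipped, never reordered".

```python
from __future__ import annotations

from enum import IntEnum


class Stage(IntEnum):
    """Canonical pipeline stages, numbered in their fixed order (hypothetical type)."""
    FETCH = 1
    FLATTEN = 2
    INSPECT = 3
    PLAN = 4
    FIX = 5
    CONVERT = 6
    VALIDATE = 7


def validate_stage_sequence(stages: list[Stage]) -> None:
    """Raise ValueError unless the stages are strictly increasing.

    Gaps are allowed (a stage may be skipped); repeats and backward
    steps are rejected (no reordering or interleaving).
    """
    for earlier, later in zip(stages, stages[1:]):
        if later <= earlier:
            raise ValueError(f"{later.name} cannot run after {earlier.name}")
```

A command that only inspects and converts would pass `[Stage.FETCH, Stage.INSPECT, Stage.CONVERT]`; a sequence such as `[Stage.FIX, Stage.INSPECT]` is rejected.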

Pipeline Stages Explained

FETCH

Acquire source disks and metadata from any source:

Key principle: Source-agnostic acquisition with unified interface.

FLATTEN

Transform complex disk chains into single-image files:

Output: Clean, single-file disk images ready for inspection.

INSPECT

Offline deep-dive using libguestfs to extract ground truth:

Philosophy: Derive facts, never guess. Inspection over assumption.
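
For illustration, offline inspection with the libguestfs Python bindings could look roughly like this. The function names `inspect_image` and `mount_order` are made up for this sketch and are not taken from hyper2kvm; the `guestfs` calls are the standard inspection API.

```python
from __future__ import annotations


def mount_order(mountpoints: dict[str, str]) -> list[tuple[str, str]]:
    """Return (mountpoint, device) pairs, shortest path first, so that
    "/" is mounted before "/boot" and "/boot" before "/boot/efi"."""
    return sorted(mountpoints.items(), key=lambda item: len(item[0]))


def inspect_image(path: str) -> list[dict]:
    """Collect basic facts about each OS root found in a disk image."""
    import guestfs  # libguestfs Python bindings, imported lazily

    g = guestfs.GuestFS(python_return_dict=True)
    try:
        g.add_drive_opts(path, readonly=1)  # read-only: inspection must not modify
        g.launch()
        facts = []
        for root in g.inspect_os():
            facts.append({
                "root": root,
                "type": g.inspect_get_type(root),      # e.g. "linux" or "windows"
                "distro": g.inspect_get_distro(root),
                "mounts": mount_order(g.inspect_get_mountpoints(root)),
            })
        return facts
    finally:
        g.close()
```

Everything later stages need is read from the image itself, so nothing has to be inferred from file names or source metadata.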

PLAN

Strategic planning before execution:

Value: Plan smart, execute once. No trial-and-error.

FIX

Apply deterministic patches to ensure bootability:

Guarantee: Idempotent operations that tolerate re-runs.

CONVERT

Image format transformation via qemu-img:

Integration: Optional direct export pre/post-processing hooks.
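
As a rough sketch of what a qemu-img wrapper has to assemble, the helper below builds a `qemu-img convert` command line. The function names and defaults are assumptions for illustration, not the interface of converters/qemu/converter.py; the flags (`-p`, `-f`, `-O`, `-c`) are standard qemu-img options.

```python
from __future__ import annotations

import subprocess


def build_convert_command(
    src: str,
    dst: str,
    src_format: str | None = None,
    out_format: str = "qcow2",
    compress: bool = False,
) -> list[str]:
    """Build the argv for `qemu-img convert` without running it."""
    cmd = ["qemu-img", "convert", "-p"]      # -p: show progress
    if src_format:
        cmd += ["-f", src_format]            # otherwise qemu-img probes the format
    cmd += ["-O", out_format]
    if compress:
        cmd.append("-c")                     # only some output formats support this
    cmd += [src, dst]
    return cmd


def convert(src: str, dst: str, **options) -> None:
    """Run the conversion; raises CalledProcessError if qemu-img fails."""
    subprocess.run(build_convert_command(src, dst, **options), check=True)
```

Keeping command construction separate from execution makes the wrapper easy to unit test without qemu-img installed.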

VALIDATE / TEST

Ruthless verification:

Motto: Does it boot? Does it network? Does it survive? Prove it.


Pipeline Flow Diagram

flowchart LR
    FETCH[FETCH<br/>Acquire Disks] --> FLATTEN[FLATTEN<br/>Collapse Chains]
    FLATTEN --> INSPECT[INSPECT<br/>Extract Facts]
    INSPECT --> PLAN[PLAN<br/>Strategy]
    PLAN --> FIX[FIX<br/>Apply Patches]
    FIX --> CONVERT[CONVERT<br/>Format Transform]
    CONVERT --> VALIDATE[VALIDATE/TEST<br/>Boot Tests]

    style FETCH fill:#4CAF50,stroke:#2E7D32,color:#fff
    style FLATTEN fill:#2196F3,stroke:#1565C0,color:#fff
    style INSPECT fill:#FF9800,stroke:#E65100,color:#fff
    style PLAN fill:#9C27B0,stroke:#6A1B9A,color:#fff
    style FIX fill:#F44336,stroke:#C62828,color:#fff
    style CONVERT fill:#00BCD4,stroke:#006064,color:#fff
    style VALIDATE fill:#8BC34A,stroke:#558B2F,color:#fff

Key Invariants:

Repository Structure (Authoritative)

This reflects the actual codebase structure as of the latest refactor:

hyper2kvm/
├── __init__.py                       # Package root
├── __main__.py                       # Entry point (python -m hyper2kvm)
│
├── cli/                              # Command-line interface layer
│   ├── __init__.py
│   ├── argument_parser.py            # Main argument parser (legacy entry)
│   ├── help_texts.py                 # User-facing help documentation
│   └── args/                         # Refactored argument parsing (modular)
│       ├── __init__.py
│       ├── builder.py                # Argument builder pattern
│       ├── groups.py                 # Argument group definitions
│       ├── helpers.py                # Parsing utilities
│       ├── parser.py                 # Core parser logic
│       └── validators.py             # Argument validation rules
│
├── config/                           # Configuration management
│   ├── __init__.py
│   ├── config_loader.py              # YAML config loading and merging
│   └── systemd_template.py          # Systemd unit templates for guest injection
│
├── core/                             # Foundational utilities and infrastructure
│   ├── __init__.py
│   ├── cred.py                       # Credential handling (secure storage)
│   ├── exceptions.py                 # Custom exception hierarchy
│   ├── file_ops.py                   # File operation utilities
│   ├── guest_identity.py             # Guest OS identity detection
│   ├── guest_utils.py                # Guest-specific utilities
│   ├── list_utils.py                 # List manipulation helpers
│   ├── logger.py                     # Structured logging (rich console)
│   ├── logging_utils.py              # Logging configuration helpers
│   ├── optional_imports.py           # Graceful optional dependency handling
│   ├── recovery_manager.py           # Crash recovery and checkpointing
│   ├── retry.py                      # Retry logic with exponential backoff
│   ├── sanity_checker.py             # Pre-flight sanity checks
│   ├── utils.py                      # General-purpose utilities
│   ├── validation_suite.py           # Validation test suites
│   └── xml_utils.py                  # XML parsing and generation utilities
│
├── converters/                       # Disk transformation engines
│   ├── __init__.py
│   ├── disk_resizer.py               # Disk resizing operations
│   ├── fetch.py                      # Unified disk fetching interface
│   ├── flatten.py                    # Snapshot chain flattening
│   ├── extractors/                   # Archive/container extractors
│   │   ├── __init__.py
│   │   ├── ami.py                    # AWS AMI tarball extractor
│   │   ├── ovf.py                    # OVF/OVA unpacker
│   │   ├── raw.py                    # RAW/tarball extractor with security checks
│   │   └── vhd.py                    # VHD/VHDX handler (Azure/Hyper-V)
│   └── qemu/                         # QEMU image operations
│       ├── __init__.py
│       └── converter.py              # qemu-img wrapper (convert, resize, info)
│
├── fixers/                           # Guest OS repair and modification layer
│   ├── __init__.py
│   ├── base_fixer.py                 # Base class defining fixer interface
│   ├── cloud_init_injector.py        # Cloud-init metadata injection
│   ├── network_fixer.py              # Top-level network fixer coordinator
│   ├── offline_fixer.py              # Top-level offline fixer coordinator
│   ├── report_writer.py              # Migration report generation
│   │
│   ├── bootloader/                   # Bootloader fixing subsystem
│   │   ├── __init__.py
│   │   ├── fixer.py                  # Bootloader fixer orchestration
│   │   └── grub.py                   # GRUB/GRUB2 specific fixes
│   │
│   ├── filesystem/                   # Filesystem fixing subsystem
│   │   ├── __init__.py
│   │   ├── fixer.py                  # Filesystem fixer orchestration
│   │   └── fstab.py                  # /etc/fstab rewriting (UUID conversion)
│   │
│   ├── live/                         # Live (SSH-based) fixing subsystem
│   │   ├── __init__.py
│   │   ├── fixer.py                  # Live SSH fixer
│   │   └── grub_fixer.py             # Live GRUB regeneration via SSH
│   │
│   ├── network/                      # Network fixing subsystem
│   │   ├── __init__.py
│   │   ├── backend.py                # Network backend abstraction
│   │   ├── core.py                   # Core network fixing logic
│   │   ├── discovery.py              # Network interface discovery
│   │   ├── model.py                  # Network configuration models
│   │   ├── topology.py               # Network topology analysis
│   │   └── validation.py             # Network config validation
│   │
│   ├── offline/                      # Offline (libguestfs) fixing subsystem
│   │   ├── __init__.py
│   │   ├── config_rewriter.py        # System config file rewriting
│   │   ├── mount.py                  # Guest filesystem mounting
│   │   ├── spec_converter.py         # Spec file format conversions
│   │   ├── validation.py             # Offline fix validation
│   │   └── vmware_tools_remover.py   # Offline VMware Tools purge
│   │
│   └── windows/                      # Windows-specific fixing subsystem
│       ├── __init__.py
│       ├── fixer.py                  # Main Windows fixer orchestrator
│       ├── network_fixer.py          # Windows network fixing
│       ├── registry_core.py          # Registry manipulation core
│       ├── registry/                 # Windows Registry subsystem
│       │   ├── __init__.py
│       │   ├── encoding.py           # Registry value encoding/decoding
│       │   ├── firstboot.py          # First-boot registry tweaks
│       │   ├── io.py                 # Registry file I/O (hivex wrapper)
│       │   ├── mount.py              # Registry hive mounting
│       │   ├── software.py           # HKLM\Software modifications
│       │   └── system.py             # HKLM\System modifications
│       └── virtio/                   # Windows VirtIO driver injection
│           ├── __init__.py
│           ├── config.py             # VirtIO configuration
│           ├── core.py               # Core VirtIO injection logic
│           ├── detection.py          # VirtIO ISO detection
│           ├── discovery.py          # Driver discovery in VirtIO ISO
│           ├── install.py            # Driver installation to registry
│           ├── paths.py              # VirtIO ISO path resolution
│           └── utils.py              # VirtIO utilities
│
├── libvirt/                          # LibVirt integration layer
│   ├── domain_emitter.py             # Generic domain XML emitter
│   ├── libvirt_utils.py              # LibVirt utility functions
│   ├── linux_domain.py               # Linux-specific domain XML generation
│   └── windows_domain.py             # Windows-specific domain XML generation
│
├── modes/                            # Specialized operational modes
│   ├── __init__.py
│   ├── inventory_mode.py             # Read-only VM/disk inventory scanning
│   └── plan_mode.py                  # Dry-run planning mode (what-if)
│
├── orchestrator/                     # Pipeline orchestration layer
│   ├── __init__.py
│   ├── README.md                     # Refactoring documentation
│   ├── orchestrator.py               # Main pipeline coordinator (refactored)
│   ├── disk_discovery.py             # Input disk discovery logic
│   ├── disk_processor.py             # Disk processing pipeline executor
│   └── vsphere_exporter.py           # vSphere VM export orchestration
│
├── ssh/                              # SSH/SCP transport layer
│   ├── __init__.py
│   ├── ssh_client.py                 # Paramiko-based SSH client
│   └── ssh_config.py                 # SSH connection configuration
│
├── testers/                          # Post-migration validation layer
│   ├── __init__.py
│   ├── libvirt_tester.py             # LibVirt domain boot testing
│   └── qemu_tester.py                # Direct QEMU boot testing
│
└── vmware/                           # VMware ecosystem integration
    ├── __init__.py
    ├── clients/                      # VMware API clients
    │   ├── __init__.py
    │   ├── client.py                 # pyvmomi SmartConnect wrapper
    │   ├── extensions.py             # vSphere API extensions
    │   └── nfc_lease.py              # NFC lease management for exports
    │
    ├── transports/                   # Data-plane transport implementations
    │   ├── __init__.py
    │   ├── govc_common.py            # govc CLI wrapper utilities
    │   ├── govc_export.py            # govc export operations
    │   ├── http_client.py            # HTTP datastore download client
    │   ├── http_progress.py          # HTTP download progress tracking
    │   ├── ovftool_client.py         # VMware ovftool wrapper
    │   ├── ovftool_loader.py         # ovftool dynamic loader
    │   ├── vddk_client.py            # VDDK high-speed transfer client
    │   └── vddk_loader.py            # VDDK dynamic library loader
    │
    ├── utils/                        # VMware utilities
    │   ├── __init__.py
    │   ├── datastore.py              # Datastore path parsing
    │   ├── utils.py                  # General VMware utilities
    │   └── vmdk_parser.py            # VMDK descriptor file parser
    │
    └── vsphere/                      # vSphere control-plane operations
        ├── __init__.py
        ├── command.py                # vSphere command abstraction
        ├── errors.py                 # vSphere error handling
        ├── govc.py                   # govc-specific operations
        └── mode.py                   # vSphere operational modes

Total: 27 directories, 117+ Python modules

Orchestrator Architecture (Refactored)

The orchestrator was refactored from a single 1,197-line monolithic class into 4 focused components, each under 310 lines and following the Single Responsibility Principle.

Component Breakdown

1. Orchestrator (orchestrator/orchestrator.py)

Responsibility: Main pipeline coordinator

Key Methods:

Philosophy: Coordinate, don’t implement. Delegate to specialists.

2. DiskDiscovery (orchestrator/disk_discovery.py)

Responsibility: Input disk detection and preparation

Supported Sources:

Output: List of discovered disk paths + optional temp directory

3. DiskProcessor (orchestrator/disk_processor.py)

Responsibility: Per-disk processing pipeline

Pipeline Stages:

  1. Flatten (optional snapshot collapse)
  2. Offline fixes (libguestfs modifications)
  3. Convert to output format (qemu-img)
  4. Validation (sanity checks)

Features:

4. VsphereExporter (orchestrator/vsphere_exporter.py)

Responsibility: vSphere VM export orchestration

Export Modes:

Features:

Refactoring Benefits

| Aspect | Before (Monolithic) | After (Refactored) |
|---|---|---|
| Lines of Code | 1,197 lines, 50+ methods | 4 files, each < 310 lines |
| Testability | Difficult to test in isolation | Each component independently testable |
| Maintainability | All concerns mixed | Single Responsibility Principle |
| Reusability | Tightly coupled | Components usable independently |
| Debugging | Hard to isolate failures | Clear component boundaries |

Control-Plane vs Data-Plane (VMware)

VMware integration enforces strict separation between what to do (control) and how to move bytes (data).

Control-Plane: Inventory, Planning, Orchestration

Purpose: Answer “what exists, where, and what’s the plan?”

Never touches bulk data - keeps operations lean, fast, and auditable.

Implementation 1: govc (Primary)

Tool: VMware’s official CLI (govc)

Capabilities:

Why govc:

Integration: vmware/vsphere/govc.py + vmware/vsphere/command.py

Implementation 2: pyvmomi / pyVim (Fallback)

Library: VMware’s official Python SDK

Use Cases:

Integration: vmware/clients/client.py - SmartConnect wrapper with retry logic

Details:

CLI Glue Layer

Modules: vmware/vsphere/mode.py + vmware/vsphere/command.py

Function: Translate user commands (vsphere inventory, vsphere plan) into pure metadata operations. No data hauling.


Data-Plane: Byte Movement

Purpose: Answer “how do we safely move disk data?”

No inventory logic - pure transport layer.
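
One way to keep the two planes apart is to let them share only a small metadata record. The sketch below is illustrative: `DiskRef`, `Transport`, and `run_export` are hypothetical names, not hyper2kvm classes.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class DiskRef:
    """Metadata only: what the control-plane hands to the data-plane."""
    datastore: str
    path: str
    size_bytes: int


class Transport(Protocol):
    """Data-plane contract: move one disk's bytes and know nothing else."""

    def download(self, disk: DiskRef, dest_dir: str) -> str:
        """Copy the disk into dest_dir and return the local file path."""
        ...


def run_export(disks: list[DiskRef], transport: Transport, dest_dir: str) -> list[str]:
    """Feed control-plane results to whichever transport was selected."""
    return [transport.download(disk, dest_dir) for disk in disks]
```

Under this split, swapping VDDK for HTTP or SSH means supplying a different `Transport`; the inventory and planning code does not change.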

Transport 1: VDDK (Highest Performance)

Library: VMware Virtual Disk Development Kit

Module: vmware/transports/vddk_client.py

Features:

When to Use: Large VMs, bandwidth-constrained environments

Transport 2: ovftool (Official VMware Export)

Tool: VMware OVF Tool

Module: vmware/transports/ovftool_client.py

Features:

When to Use: Need OVF compatibility, vendor-specific flags

Transport 3: HTTP /folder (Datastore Downloads)

Protocol: HTTPS datastore browsing

Module: vmware/transports/http_client.py

Features:

When to Use: Simple downloads, no VDDK available

Transport 4: SSH/SCP (Universal Fallback)

Protocol: SSH with SCP/SFTP

Module: ssh/ssh_client.py

Features:

When to Use: API access unavailable, ESXi direct access

Transport 5: govc export (CLI-Based)

Tool: govc export.ovf / export.ova

Module: vmware/transports/govc_export.py

Features:

When to Use: Lightweight exports, scripting


Fixer Subsystems (Deep Dive)

Offline Fixing (Default Strategy)

Module: fixers/offline/

Philosophy: Modify disk images without booting. No runtime dependencies.

Technology: libguestfs (QEMU + kernel appliance)

Advantages:

Subsystems:

1. Filesystem Fixing (fixers/filesystem/)

2. Bootloader Fixing (fixers/bootloader/)

3. Config Rewriting (fixers/offline/config_rewriter.py)

4. VMware Tools Removal (fixers/offline/vmware_tools_remover.py)


Live Fixing (Opt-In Strategy)

Module: fixers/live/

Philosophy: Execute fixes on running Linux guests via SSH.

Use Cases:

Safety:


Windows Fixing (Hermetically Sealed)

Module: fixers/windows/

Principle: Windows logic never leaks into Linux fixers. Complete isolation.

Registry Subsystem (fixers/windows/registry/)

Purpose: Modify Windows Registry offline (no Windows boot required)

Technology: hivex (libguestfs registry manipulation)

Operations:

Modules:

VirtIO Subsystem (fixers/windows/virtio/)

Purpose: Inject VirtIO drivers for KVM compatibility

Challenge: Windows won’t boot from a VirtIO disk on KVM without the VirtIO storage driver, but the driver can’t be installed the normal way without booting.

Solution: Offline registry modification to pre-install drivers.

Workflow:

  1. Detection (detection.py) - Locate VirtIO ISO (local/remote)
  2. Discovery (discovery.py) - Extract drivers matching guest OS version
  3. Installation (install.py) - Add driver registry entries
  4. Configuration (config.py) - Configure driver load order

Drivers Injected:


Network Fixing (Cross-Platform)

Module: fixers/network/

Architecture: Modular backend system supporting multiple network managers.

Backends Supported:

Components:

Discovery (discovery.py)

Topology (topology.py)

Core (core.py)

Validation (validation.py)

Backend (backend.py)

Fixes Applied:


LibVirt Integration

Module: libvirt/

Purpose: Generate libvirt domain XML for migrated VMs

Components:

Domain Emitter (domain_emitter.py)

Generic XML generation framework

Linux Domain (linux_domain.py)

Linux-specific domain XML:

Windows Domain (windows_domain.py)

Windows-specific domain XML:

Output: Ready-to-import libvirt XML (virsh define domain.xml)
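
A minimal sketch of domain XML generation using only the standard library. `build_domain_xml` and its defaults (q35 machine type, the `default` libvirt network, 2048 MiB of memory) are assumptions for illustration; the real emitters in libvirt/ presumably cover firmware, CPU model, and more devices.

```python
import xml.etree.ElementTree as ET


def build_domain_xml(name: str, disk_path: str, memory_mib: int = 2048, vcpus: int = 2) -> str:
    """Return a small KVM domain definition with a VirtIO disk and NIC."""
    domain = ET.Element("domain", type="kvm")
    ET.SubElement(domain, "name").text = name
    ET.SubElement(domain, "memory", unit="MiB").text = str(memory_mib)
    ET.SubElement(domain, "vcpu").text = str(vcpus)

    os_el = ET.SubElement(domain, "os")
    ET.SubElement(os_el, "type", arch="x86_64", machine="q35").text = "hvm"

    devices = ET.SubElement(domain, "devices")
    disk = ET.SubElement(devices, "disk", type="file", device="disk")
    ET.SubElement(disk, "driver", name="qemu", type="qcow2")
    ET.SubElement(disk, "source", file=disk_path)
    ET.SubElement(disk, "target", dev="vda", bus="virtio")  # paravirtualized disk

    nic = ET.SubElement(devices, "interface", type="network")
    ET.SubElement(nic, "source", network="default")
    ET.SubElement(nic, "model", type="virtio")              # paravirtualized NIC

    return ET.tostring(domain, encoding="unicode")
```

Writing the returned string to domain.xml gives a file that can be passed to `virsh define`.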


Core Utilities

Module: core/

The foundational layer providing infrastructure for all other modules.

Essential Utilities

Guest Identity (guest_identity.py)

Recovery Manager (recovery_manager.py)

Retry Logic (retry.py)
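
Retry with exponential backoff can be sketched as a decorator. This is a generic illustration of the technique, not the contents of core/retry.py; the parameter names and the injectable `sleep` argument are assumptions.

```python
import functools
import time


def retry(max_attempts=3, base_delay=0.5, exceptions=(Exception,), sleep=time.sleep):
    """Retry the wrapped call, doubling the delay after each failure."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
                    sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator
```

Passing `sleep` as a parameter lets tests record the delays instead of actually waiting.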

Validation Suite (validation_suite.py)

File Operations (file_ops.py)

Logging (logger.py, logging_utils.py)


VMCraft - VM Manipulation Engine

Module: core/vmcraft/

Version: v9.0

VMCraft is hyper2kvm’s pure Python disk image manipulation platform, serving as the primary VM inspection and modification engine.

Architecture

VMCraft consists of 57 specialized modules organized into focused categories:

core/vmcraft/
├── main.py                    # Orchestrator
├── Core Infrastructure (4 modules)
│   ├── nbd.py                 # NBD device management
│   ├── storage.py             # LVM/LUKS/RAID/ZFS
│   ├── mount.py               # Filesystem mounting
│   └── file_ops.py            # File operations (70+ methods)
├── OS Detection (3 modules)
│   ├── inspection.py          # Orchestration
│   ├── linux_detection.py     # 15+ Linux distros
│   └── windows_detection.py   # 20+ Windows versions
├── Windows Support (6 modules)
│   ├── windows_registry.py    # Registry operations
│   ├── windows_drivers.py     # Driver injection
│   ├── windows_users.py       # User management
│   ├── windows_services.py    # Service control
│   ├── windows_applications.py # App detection
│   └── scheduled_tasks.py     # Task Scheduler
├── Linux Support (1 module)
│   └── linux_services.py      # Systemd/init services
├── Enterprise Intelligence (5 modules)
│   ├── ml_analyzer.py         # AI/ML analytics
│   ├── cloud_optimizer.py     # Cloud migration
│   ├── disaster_recovery.py   # DR planning
│   ├── audit_trail.py         # Compliance logging
│   └── resource_orchestrator.py # Auto-scaling
└── Operational Tools (5 modules)
    ├── backup.py              # Backup/restore
    ├── security.py            # Security auditing
    ├── optimization.py        # Disk optimization
    ├── advanced_analysis.py   # Forensics
    └── export.py              # VM export

Key Capabilities

Core Operations:

Cross-Platform:

Enterprise Intelligence (v9.0):

Integration with Pipeline

VMCraft integrates into the migration pipeline at these stages:

  1. INSPECT: OS detection and filesystem analysis
  2. FIX: Offline file modifications, registry edits, driver injection
  3. VALIDATE: Pre-migration verification

Performance

| Operation | Time | Notes |
|---|---|---|
| Launch | ~1.9s | NBD + storage |
| OS Inspection | ~0.3s | Linux/Windows detection |
| File Read | <50ms | Direct filesystem access |
| Registry Read | ~150ms | Windows offline registry |

Usage in hyper2kvm

from hyper2kvm.core.vmcraft import VMCraft

with VMCraft() as g:
    g.add_drive_opts("/path/to/disk.vmdk", readonly=False)
    g.launch()

    # OS detection
    roots = g.inspect_os()

    # File operations
    g.write("/etc/motd", "Migrated to KVM\n")

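    # Windows registry (only meaningful for Windows guests)
    g.win_registry_write("SOFTWARE", r"Microsoft\...", "Key", "Value")

    # AI/ML analytics (v9.0); `metrics` is assumed to be collected by the caller
    anomalies = g.ml_detect_anomalies(metrics, "cpu")

    # Cloud optimization (v9.0); `system_info` is likewise supplied by the caller
    readiness = g.cloud_analyze_readiness(system_info)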

Documentation: See VMCraft Platform Guide for complete reference (307+ methods).


Key Architectural Invariants

These principles are non-negotiable. Violating them leads to unreliable migrations.

1. Offline is the Default Truth

Unless explicitly marked live, all fixers assume:

Runtime dependencies belong in fixers/live/.

2. Inspection Over Assumption

Never guess. Always derive facts from:

Code must handle “unexpected but valid” configurations gracefully.

3. /dev/disk/by-path is Radioactive

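Guests installed on VMware often reference disks by /dev/disk/by-path. Those names encode the virtual PCI/SCSI topology, which changes under KVM, so the references stop resolving after migration.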

All fixer code must:

This is the #1 cause of boot failures if missed.
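
A sketch of the kind of rewrite this invariant calls for, applied to /etc/fstab. It assumes the by-path to UUID mapping (`uuid_for`) was already derived during inspection; the function name is hypothetical and the real logic lives in fixers/filesystem/fstab.py.

```python
from __future__ import annotations

BY_PATH_PREFIX = "/dev/disk/by-path/"


def rewrite_fstab_by_path(fstab_text: str, uuid_for: dict[str, str]) -> str:
    """Replace by-path device fields with UUID= references.

    Comments, blank lines, and entries without a known UUID are left
    untouched, so running the function twice gives the same result.
    """
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith(BY_PATH_PREFIX):
            uuid = uuid_for.get(fields[0])
            if uuid:
                # Replace only the device field, preserving the original spacing.
                line = line.replace(fields[0], f"UUID={uuid}", 1)
        out.append(line)
    trailing_newline = "\n" if fstab_text.endswith("\n") else ""
    return "\n".join(out) + trailing_newline
```

The same substitution idea applies to bootloader configuration and other files that name a root device.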

4. Windows Logic is Hermetically Sealed

Windows-specific code lives exclusively in fixers/windows/.

Linux fixers:

Cross-contamination is forbidden.

5. Control-Plane and Data-Plane Never Mix

Control-plane (vmware/vsphere/, vmware/clients/):

Data-plane (vmware/transports/):

No module should perform both. Separation ensures:

6. Idempotent, Best-Effort Behavior

Fixers should:

Only critical failures (unbootable guest) should halt the pipeline.
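
The two halves of this invariant can be sketched separately: an idempotent edit, and a runner that downgrades non-critical failures to warnings. `ensure_line`, `run_fixes`, and `CriticalFixError` are illustrative names, not hyper2kvm's API.

```python
from __future__ import annotations

from typing import Callable


class CriticalFixError(Exception):
    """A failure that would leave the guest unbootable (hypothetical type)."""


def ensure_line(text: str, line: str) -> tuple[str, bool]:
    """Append `line` unless it is already present. Returns (text, changed)."""
    if line in text.splitlines():
        return text, False                 # re-run: nothing to do
    if text and not text.endswith("\n"):
        text += "\n"
    return text + line + "\n", True


def run_fixes(fixes: list[tuple[str, Callable[[], None]]]) -> list[str]:
    """Run every fix; collect non-critical failures instead of aborting."""
    warnings = []
    for name, fix in fixes:
        try:
            fix()
        except CriticalFixError:
            raise                          # only critical failures halt the pipeline
        except Exception as exc:
            warnings.append(f"{name}: {exc}")
    return warnings
```

Returning a `changed` flag also lets a report distinguish "already correct" from "fixed on this run".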


Module Ownership and Responsibilities

cli/

Owns: User-facing command-line interface, argument parsing, help text.

Does NOT own: Business logic, execution.

config/

Owns: Configuration file loading (YAML), merging, defaults.

Does NOT own: Configuration validation (done in core/sanity_checker.py).

core/

Owns: Cross-cutting concerns (logging, errors, retries, recovery, validation).

Does NOT own: Domain-specific logic.

converters/

Owns: Format conversions (VMDK→qcow2), extractions (OVA, AMI, VHD), disk operations.

Does NOT own: Guest OS modifications (that’s fixers/).

fixers/

Owns: Guest OS modifications (offline and live), bootloader fixes, network cleanup, Windows drivers.

Does NOT own: Disk format conversions (that’s converters/).

libvirt/

Owns: LibVirt domain XML generation.

Does NOT own: QEMU execution (that’s testers/qemu_tester.py).

modes/

Owns: Read-only operational modes (inventory, planning).

Does NOT own: Write operations (migrations).

orchestrator/

Owns: Pipeline coordination, stage ordering, component delegation.

Does NOT own: Stage implementation (delegates to specialists).

ssh/

Owns: SSH/SCP transport, remote command execution.

Does NOT own: What commands to execute (that’s fixers/live/).

testers/

Owns: Post-migration validation (boot tests, network tests).

Does NOT own: Migration itself.

vmware/

Owns: VMware-specific integrations (vSphere API, VDDK, govc).

Does NOT own: Generic disk operations (that’s converters/).


Why This Architecture Works

Predictability

Reliability

Maintainability

Extensibility

Debuggability


Adding New Features

Where Does My Feature Go?

1. New Disk Source (e.g., Azure Blob, S3)

Location: converters/extractors/azure.py or converters/fetch.py

Hook: Register in orchestrator/disk_discovery.py

2. New Fix (e.g., SELinux relabeling)

Location: fixers/offline/selinux_fixer.py or extend fixers/offline/config_rewriter.py

Hook: Call from orchestrator/disk_processor.py

3. New Network Manager (e.g., wicked for SUSE)

Location: fixers/network/backend.py (add backend class)

Hook: Auto-detected via backend discovery

4. New Validation Test (e.g., storage performance)

Location: testers/storage_tester.py

Hook: Call from orchestrator/orchestrator.py:_run_tests()

5. New VMware Transport (e.g., NBD direct)

Location: vmware/transports/nbd_client.py

Hook: Register in vmware/transports/__init__.py
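
To make the "auto-detected via backend discovery" hook from item 3 concrete, here is one possible registration pattern. It is a sketch under assumptions: the registry, decorator, and detection rules below are hypothetical and may differ from what fixers/network/backend.py does.

```python
from __future__ import annotations

from typing import Callable

# name -> predicate that decides, from paths found in the guest, whether it applies
BACKENDS: dict[str, Callable[[set[str]], bool]] = {}


def register_backend(name: str):
    """Decorator that adds a detection function to the registry."""
    def decorator(detect: Callable[[set[str]], bool]):
        BACKENDS[name] = detect
        return detect
    return decorator


@register_backend("networkmanager")
def _detect_networkmanager(paths: set[str]) -> bool:
    return "/etc/NetworkManager" in paths


@register_backend("wicked")
def _detect_wicked(paths: set[str]) -> bool:
    return "/etc/wicked" in paths


def detect_backends(existing_paths: set[str]) -> list[str]:
    """Return the names of all registered backends that match the guest."""
    return [name for name, detect in BACKENDS.items() if detect(existing_paths)]
```

With a registry like this, adding a network manager is one decorated function, and the discovery code stays unchanged.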

Feature Addition Checklist

  1. Identify module boundary (don’t violate separation of concerns)
  2. Check for existing extension point (don’t duplicate)
  3. Write unit tests (isolated component tests)
  4. Update this ARCHITECTURE.md (document new component)
  5. Add integration test (end-to-end validation)
  6. Update user documentation (if user-visible feature)

Performance Considerations

Parallel Processing

Disk Processing

Module: orchestrator/disk_processor.py

Option: args.parallel_processing = True

Implementation: ThreadPoolExecutor (multiple disks processed concurrently)

When to Use: Multi-disk VMs (e.g., VM with OS disk + data disks)
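
A minimal sketch of per-disk parallelism with `ThreadPoolExecutor`. The `process_disks` helper and its parameters are assumptions for illustration; `process_one` stands in for whatever per-disk work the processor performs.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def process_disks(
    disks: list[str],
    process_one: Callable[[str], str],
    parallel: bool = False,
    max_workers: int = 4,
) -> dict[str, str]:
    """Process each disk and return {source: result}, optionally in parallel."""
    if not parallel or len(disks) < 2:
        return {disk: process_one(disk) for disk in disks}

    results: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(process_one, disk): disk for disk in disks}
        for future in as_completed(futures):
            # result() re-raises any exception from the worker thread
            results[futures[future]] = future.result()
    return results
```

Threads are a reasonable fit when the per-disk work is dominated by I/O or by external processes such as qemu-img, since the Python threads then spend most of their time waiting.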

I/O Optimization

VDDK (VMware)

Benefit: 3-5x faster than HTTP downloads

Trade-off: Requires VDDK installation, complex setup

Compression

Benefit: Smaller output files, faster network transfers

Trade-off: CPU overhead during conversion

Recommendation: Use compression for network transfers, skip for local migrations.


Testing Strategy

Unit Tests

Location: tests/unit/

Coverage:

Technology: pytest, pytest-mock

Integration Tests

Location: tests/integration/

Coverage:

Security Tests

Runs: GitHub Actions (Bandit, pip-audit)

Focus:


Future Architecture Directions

Plugin System

Allow third-party fixers, transports, and validators without modifying core code.

Design:

Cloud-Native Integration

Direct export to cloud providers without intermediate storage.

Candidates:

Module: converters/cloud/ (new)

Advanced Recovery

Transactional migrations with automatic rollback on failure.

Design:

Module: Enhanced core/recovery_manager.py

Metrics and Telemetry

Real-time progress tracking and performance metrics.

Design:

Module: core/metrics.py (new)


Glossary

libguestfs: Library for accessing and modifying virtual machine disk images offline.

VDDK: VMware Virtual Disk Development Kit - high-performance API for disk access.

govc: VMware’s official CLI for vSphere operations.

pyvmomi: VMware’s official Python SDK for vSphere SOAP API.

VirtIO: Paravirtualized I/O drivers for KVM (storage, network, RNG, balloon).

hivex: Library for reading and writing Windows Registry hive files.

NBD: Network Block Device - protocol for accessing block devices over a network.

CBT: Changed Block Tracking - VMware feature for incremental backups.

MoRef: Managed Object Reference - vSphere API identifier for objects.

NFC: Network File Copy - VMware protocol for efficient VM export.


Code Examples

Example 1: Basic Pipeline Usage

from hyper2kvm.orchestrator.disk_processor import DiskProcessor
from hyper2kvm.core.guest_identity import GuestIdentity

# Initialize processor
processor = DiskProcessor()

# Process a VMDK
result = processor.process_disk(
    source_path='/data/vm.vmdk',
    output_path='/data/vm.qcow2',
    flatten=True,
    compress=True
)

# Inspect guest OS
identity = GuestIdentity.from_disk('/data/vm.qcow2')
print(f"OS: {identity.os_family}")
print(f"Firmware: {identity.firmware_type}")

Example 2: Custom Fixer

from hyper2kvm.fixers.offline_fixer import OfflineFixer

# Create fixer instance
fixer = OfflineFixer('/data/vm.qcow2')

# Apply specific fixes
fixer.fix_fstab(use_uuid=True)
fixer.fix_grub(regenerate=True)
fixer.fix_network(clean_mac=True)

# Verify fixes
fixer.validate()

Example 3: vSphere Integration

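import asyncio

from hyper2kvm.vmware.clients.client import VMwareClient


async def main() -> None:
    # Connect to vCenter (placeholder credentials; do not hard-code real ones)
    client = VMwareClient(
        host='vcenter.example.com',
        username='administrator@vsphere.local',
        password='password',
    )

    # Export VM; `await` is only valid inside a coroutine
    await client.async_export_vm(
        vm_name='production-web',
        output_dir='/data/exports',
        export_mode='export',
    )


asyncio.run(main())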

Contributing

When proposing architectural changes:

  1. Open an issue first (discuss design before implementation)
  2. Follow existing patterns (don’t introduce new paradigms without justification)
  3. Respect module boundaries (don’t mix concerns)
  4. Add tests (unit + integration)
  5. Update documentation (this file + module docstrings)
  6. Keep classes focused (under 300 lines when possible)

Summary

hyper2kvm’s architecture achieves reliable, repeatable VM migrations through:

  1. Deterministic pipeline (FETCH → FLATTEN → INSPECT → PLAN → FIX → CONVERT → VALIDATE / TEST)
  2. Offline-first fixing (libguestfs, no runtime dependencies)
  3. Strict separation (control-plane vs data-plane, offline vs live, Windows vs Linux)
  4. Modular components (Single Responsibility Principle)
  5. Inspection over assumption (derive facts, never guess)
  6. Idempotent operations (safe to retry)

The result: Migrations that “just work” - boring, predictable, and successful.

Boring migrations are successful migrations.