Status: ✅ Completed
Date: 2026-01-23
Export resumption with checkpoint-based recovery is now implemented for all providers. This feature allows exports to resume from where they left off after network failures, interruptions, or cancellations.
The feature uses JSON-based checkpoint files to track export state:
vSphere:

```go
opts := vsphere.ExportOptions{
    Format:               "ova",
    OutputPath:           "/backups",
    EnableCheckpoints:    true,             // Enable checkpoint support
    ResumeFromCheckpoint: true,             // Resume if checkpoint exists
    CheckpointInterval:   30 * time.Second, // Save every 30s (0 = after each file)
    // CheckpointPath: "",                  // Auto-generate path (optional override)
}
result, err := client.ExportOVF(ctx, "vm-path", opts)
```
AWS:

```go
opts := aws.ExportOptions{
    Format:               "vmdk",
    OutputPath:           "/exports",
    S3Bucket:             "my-backups",
    EnableCheckpoints:    true,
    ResumeFromCheckpoint: true,
    CheckpointInterval:   60 * time.Second, // Save every minute
}
result, err := client.ExportInstanceWithOptions(ctx, instanceID, opts)
```
Azure:

```go
opts := azure.ExportOptions{
    Format:               "vhd",
    OutputPath:           "/exports",
    EnableCheckpoints:    true,
    ResumeFromCheckpoint: true,
    // CheckpointInterval: 0 = save after each file (default)
}
result, err := client.ExportDiskWithOptions(ctx, diskName, opts)
```
GCP:

```go
opts := gcp.ExportOptions{
    Format:               "vmdk",
    OutputPath:           "/exports",
    GCSBucket:            "my-exports",
    EnableCheckpoints:    true,
    ResumeFromCheckpoint: true,
}
result, err := client.ExportDiskWithOptions(ctx, diskName, opts)
```
Hyper-V:

```go
opts := hyperv.ExportOptions{
    Format:               "vhdx",
    OutputPath:           "/exports",
    ExportType:           "vhd-only",
    EnableCheckpoints:    true,
    ResumeFromCheckpoint: true,
}
result, err := client.ExportVMWithOptions(ctx, vmName, opts)
```
Network Interruption:

1. Export starts: 10 files, 100 GB total
2. Files 1-5 complete: 50 GB downloaded
3. Network failure ❌
4. Resume export: files 6-10 remaining (50 GB)

User Cancels Export:

1. Export starts: 20 files
2. Files 1-12 complete
3. User presses Ctrl+C
4. Later: resume completes files 13-20

Cloud Provider Egress Limits (Large VMs):

1. Day 1: download 100 GB (limit hit, cancel)
2. Day 2: resume, download the next 100 GB
3. Day 3: complete the remaining 50 GB

Total: 250 GB in manageable chunks
Default Path:

```
{outputDir}/.{vmName}.checkpoint
```

Example:

```
/backups/.web-server-01.checkpoint
```

Custom Path:

```go
opts.CheckpointPath = "/custom/path/my-checkpoint.json"
```
```json
{
  "version": "1.0",
  "vm_name": "web-server-01",
  "provider": "vsphere",
  "export_format": "ova",
  "output_path": "/backups",
  "created_at": "2026-01-23T10:00:00Z",
  "updated_at": "2026-01-23T10:15:30Z",
  "files": [
    {
      "path": "disk-0.vmdk",
      "url": "https://vcenter/nfc/...",
      "total_size": 53687091200,
      "downloaded_size": 53687091200,
      "checksum": "a1b2c3d4...",
      "status": "completed",
      "last_modified": "2026-01-23T10:10:00Z",
      "retry_count": 0
    },
    {
      "path": "disk-1.vmdk",
      "url": "https://vcenter/nfc/...",
      "total_size": 21474836480,
      "downloaded_size": 0,
      "checksum": "",
      "status": "pending",
      "last_modified": "2026-01-23T10:10:00Z",
      "retry_count": 0
    }
  ],
  "metadata": {}
}
```
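For reference, the schema above maps naturally onto Go structs. The following is an illustrative sketch only — the type, field, and method names (`ExportCheckpoint`, `FileCheckpoint`, `GetProgress`) are assumptions mirroring the JSON, not the project's actual definitions:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// FileCheckpoint mirrors one entry in the "files" array.
type FileCheckpoint struct {
	Path           string    `json:"path"`
	URL            string    `json:"url"`
	TotalSize      int64     `json:"total_size"`
	DownloadedSize int64     `json:"downloaded_size"`
	Checksum       string    `json:"checksum"`
	Status         string    `json:"status"`
	LastModified   time.Time `json:"last_modified"`
	RetryCount     int       `json:"retry_count"`
}

// ExportCheckpoint mirrors the top-level checkpoint document.
type ExportCheckpoint struct {
	Version      string            `json:"version"`
	VMName       string            `json:"vm_name"`
	Provider     string            `json:"provider"`
	ExportFormat string            `json:"export_format"`
	OutputPath   string            `json:"output_path"`
	CreatedAt    time.Time         `json:"created_at"`
	UpdatedAt    time.Time         `json:"updated_at"`
	Files        []FileCheckpoint  `json:"files"`
	Metadata     map[string]string `json:"metadata"`
}

// GetProgress returns downloaded bytes / total bytes across all files.
func (c *ExportCheckpoint) GetProgress() float64 {
	var total, done int64
	for _, f := range c.Files {
		total += f.TotalSize
		done += f.DownloadedSize
	}
	if total == 0 {
		return 0
	}
	return float64(done) / float64(total)
}

func main() {
	cp := ExportCheckpoint{
		Version:  "1.0",
		VMName:   "web-server-01",
		Provider: "vsphere",
		Files: []FileCheckpoint{
			{Path: "disk-0.vmdk", TotalSize: 100, DownloadedSize: 100, Status: "completed"},
			{Path: "disk-1.vmdk", TotalSize: 100, DownloadedSize: 0, Status: "pending"},
		},
	}
	// Round-trip through JSON to show the tags line up with the schema.
	b, _ := json.Marshal(&cp)
	var back ExportCheckpoint
	_ = json.Unmarshal(b, &back)
	fmt.Printf("progress=%.2f files=%d\n", back.GetProgress(), len(back.Files)) // prints "progress=0.50 files=2"
}
```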
| Status | Description |
|---|---|
| `pending` | File queued for download |
| `downloading` | Download in progress |
| `completed` | Download finished successfully |
| `failed` | Download failed (will retry on resume) |
EnableCheckpoints:

Type: `bool`
Default: `false`

Enables checkpoint-based resumption.

```go
opts.EnableCheckpoints = true
```
ResumeFromCheckpoint:

Type: `bool`
Default: `false`

Resume from an existing checkpoint if one is found.

Important: `EnableCheckpoints` must also be true.

```go
opts.EnableCheckpoints = true
opts.ResumeFromCheckpoint = true
```
CheckpointInterval:

Type: `time.Duration`
Default: `0` (save after each file)

How often to save the checkpoint during export.

Options:

```go
// Save after each file completes (most resilient, small overhead)
opts.CheckpointInterval = 0

// Save every 30 seconds (balanced)
opts.CheckpointInterval = 30 * time.Second

// Save every 5 minutes (minimal overhead, less frequent saves)
opts.CheckpointInterval = 5 * time.Minute
```
Trade-offs:
| Interval | Overhead | Recovery Granularity |
|---|---|---|
| 0 (per file) | Low | Best - resume at exact file |
| 30 seconds | Very low | Good - max 30s of re-download |
| 5 minutes | Minimal | Fair - may re-download one file |
CheckpointPath:

Type: `string`
Default: `""` (auto-generate)

Custom checkpoint file path.

Auto-generated:

```go
opts.CheckpointPath = "" // Uses: {outputDir}/.{vmName}.checkpoint
```

Custom:

```go
opts.CheckpointPath = "/custom/checkpoints/export-2026-01-23.json"
```
```
Start export
     |
     v
EnableCheckpoints? ──NO──> Normal export (no checkpoints)
     |
    YES
     v
ResumeFromCheckpoint? ──NO──> Create new checkpoint
     |
    YES
     v
Checkpoint exists? ──NO──> Create new checkpoint
     |
    YES
     v
Load checkpoint ──FAIL──> Warn + create new checkpoint
     |
    OK
     v
Resume export (skip completed files)
```
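The decision flow condenses into a small function. A sketch only — the name `resumeAction` and the returned strings are illustrative, not the actual implementation:

```go
package main

import "fmt"

// resumeAction encodes the checkpoint decision flow: checkpoints off,
// resume off or no checkpoint on disk, load failure, or a clean resume.
func resumeAction(enable, resume, exists, loadOK bool) string {
	if !enable {
		return "normal export (no checkpoints)"
	}
	if !resume || !exists {
		return "create new checkpoint"
	}
	if !loadOK {
		return "warn + create new checkpoint"
	}
	return "resume export (skip completed files)"
}

func main() {
	fmt.Println(resumeAction(true, true, true, true))   // clean resume
	fmt.Println(resumeAction(true, true, true, false))  // corrupt checkpoint: warn + start fresh
	fmt.Println(resumeAction(false, true, true, true))  // checkpoints disabled entirely
}
```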
For each file marked “completed” in the checkpoint:

1. Check whether the file exists on disk
2. Compare the file size with the expected size
3. If they match: skip the download
4. If they don't: re-download the file

Example:

```
disk-0.vmdk: checkpoint says 50 GB completed
  -> File exists: 50 GB ✓
  -> Skip download

disk-1.vmdk: checkpoint says 20 GB completed
  -> File exists: 15 GB ✗ (incomplete)
  -> Re-download file

disk-2.vmdk: checkpoint says pending
  -> Download file
```
First Run:

```bash
$ hyperexport vsphere export \
    --vm /DC/vm/web-server \
    --output /backups \
    --format ova \
    --enable-checkpoints \
    --resume

# Downloads 5/10 files, then network fails
# Checkpoint saved: 5 files completed
```

Second Run (Resume):

```bash
$ hyperexport vsphere export \
    --vm /DC/vm/web-server \
    --output /backups \
    --format ova \
    --enable-checkpoints \
    --resume

# Loads checkpoint
# Skips 5 completed files
# Downloads remaining 5 files
# Deletes checkpoint on success
```
Check Checkpoint Status:

```go
checkpointPath := common.GetCheckpointPath("/backups", "web-server-01")
if common.CheckpointExists(checkpointPath) {
    checkpoint, err := common.LoadCheckpoint(checkpointPath)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("Progress: %.1f%%\n", checkpoint.GetProgress()*100)
    fmt.Printf("Files: %d\n", len(checkpoint.Files))
    for _, file := range checkpoint.Files {
        fmt.Printf("  %s: %s (%d/%d bytes)\n",
            file.Path, file.Status, file.DownloadedSize, file.TotalSize)
    }
}
```
Delete Stale Checkpoint:

```go
checkpointPath := common.GetCheckpointPath("/backups", "web-server-01")
err := common.DeleteCheckpoint(checkpointPath)
```
Frequent Saves (Small Files):

```go
// For many small files, save after each
opts.CheckpointInterval = 0
```

Balanced (Mixed Sizes):

```go
// For mixed file sizes, save every 30s
opts.CheckpointInterval = 30 * time.Second
```

Infrequent Saves (Large Files):

```go
// For few large files, save every 5 minutes
opts.CheckpointInterval = 5 * time.Minute
```
The TUI shows resume status:

```
╭─────────────────────────────────────────╮
│ 🚀 Export Progress                      │
├─────────────────────────────────────────┤
│ Total: 1   ✓ 0   ⏳ 1                   │
│                                         │
│ ⬇ web-server-01                         │
│ ████████████░░░░░░░░░░░░░░ 50%          │
│ 5.0 GB / 10.0 GB • 50 MB/s • 1m         │
│ File 6/10: disk-5.vmdk                  │
│ [Resumed from checkpoint]               │
╰─────────────────────────────────────────╯
```
Checkpoint Created:

```
INFO checkpoint created vm=web-server-01 path=/backups/.web-server-01.checkpoint
```

Resuming:

```
INFO resuming from checkpoint progress=0.5 vm=web-server-01
INFO skipping already completed file file=disk-0.vmdk
INFO skipping already completed file file=disk-1.vmdk
...
```

Checkpoint Saved:

```
DEBUG checkpoint saved progress=0.75
```

Checkpoint Deleted:

```
INFO checkpoint deleted after successful export
```
Symptom: “Failed to load checkpoint, starting fresh”

Causes:

- Checkpoint file is corrupted or was edited by hand
- Checkpoint was written by an incompatible version (see the `version` field)

Solution:

```bash
rm /backups/.web-server-01.checkpoint
```
Symptom: Completed file is re-downloaded

Causes:

- File on disk is smaller than the size recorded in the checkpoint (incomplete download)
- File was modified or deleted after the checkpoint was saved

Expected Behavior: file validation failed, so re-downloading is correct.
Symptom: No checkpoint file created

Causes:

- `EnableCheckpoints = false`
- Output directory is not writable
- Disk is full

Solution:

```go
// Verify checkpoints are enabled
opts.EnableCheckpoints = true

// Also check:
// - output directory permissions
// - available disk space
```
Symptom: Export starts from the beginning

Causes:

- `ResumeFromCheckpoint = false`
- Checkpoint path doesn't match (different output directory or VM name)

Solution:

```go
// Enable resume
opts.EnableCheckpoints = true
opts.ResumeFromCheckpoint = true

// Verify the checkpoint path matches:
// use the same output directory and VM name as the original export
```
```
1. Export Start
   ├─> EnableCheckpoints? → Create checkpoint
   └─> ResumeFromCheckpoint? → Load checkpoint

2. Download Loop
   ├─> For each file:
   │     ├─> Check if completed in checkpoint
   │     ├─> Skip if completed and valid
   │     ├─> Download if pending/failed
   │     └─> Update checkpoint on completion
   └─> Save checkpoint (per interval)

3. Export Complete
   └─> Delete checkpoint file
```
All checkpoint operations are protected by a mutex:

```go
var checkpointMux sync.Mutex

// Before checkpoint access
checkpointMux.Lock()
checkpoint.UpdateFileProgress(path, size, "completed")
checkpointMux.Unlock()
```
Checkpoint saves are atomic:

1. Write to a temporary file: `.checkpoint.tmp`
2. Rename it to the final file: `.checkpoint` (rename is atomic)
3. If the rename fails, remove the temporary file

This ensures the checkpoint is never corrupted mid-write.
Overhead:

With `CheckpointInterval = 0`: one small JSON write after each completed file (low overhead, best recovery granularity).

With `CheckpointInterval = 30s`: at most one JSON write every 30 seconds (very low overhead, up to 30s of re-download on resume).
Checkpoint files contain:

- VM name, provider, and output path
- Per-file download URLs (e.g. vSphere NFC URLs)
- File sizes, checksums, and statuses
Default Permissions:

```
0644 (rw-r--r--)
```

For Sensitive Environments:

```go
// After creating the checkpoint
os.Chmod(checkpointPath, 0600) // rw-------
```
Successful Export: the checkpoint file is deleted automatically.

Failed Export: the checkpoint file is left in place so the export can be resumed.

Cleanup Script:

```bash
# Remove checkpoints older than 7 days
find /backups -name ".*.checkpoint" -mtime +7 -delete
```
Partial-File Resume:

Resume currently skips completed files and restarts incomplete ones from zero. It could instead resume partial files with HTTP Range headers:

```
// Resume partial file download with HTTP Range headers
Range: bytes=50000000-
```
Checksum Validation:

Add integrity checks on resume:

```go
type FileCheckpoint struct {
    // ...
    PartialChecksum string // SHA-256 of downloaded bytes
}
```
Checkpoint Compression:

Compress large checkpoint files:

```go
opts.CompressCheckpoint = true // Use gzip
```
Remote Checkpoint Storage:

Store checkpoints in cloud storage:

```go
opts.CheckpointBackend = "s3://bucket/checkpoints/"
```
Shared Checkpoints:

Share a checkpoint across multiple export workers:

```go
// Distributed checkpoint for parallel exports
opts.SharedCheckpoint = true
```
✅ Export resumption is production-ready for all providers.

Key Features:

- JSON checkpoint files tracking per-file download state
- Atomic, mutex-protected checkpoint saves
- Supported by all five providers (vSphere, AWS, Azure, GCP, Hyper-V)

Use Cases:

- Recovering from network failures
- Resuming user-cancelled exports
- Spreading large downloads across egress-limited days

Best Practices:

- Use `CheckpointInterval = 0` for maximum resilience

Next: Advanced Features (Manifest Generation, Auto-Conversion, etc.)