
VMware s3 backup

This tool enables full and incremental backups of an instance on VMware vCenter/ESXi directly to S3, with no separate convert-and-upload steps.

This repo is mostly stolen from migratekit; I am upgrading it to use an S3 client. Because that entails so many changes, I decided to split it into a separate repository.

Installing VDDK

To use the command-line tool, or to use this repo as a package, you must first download and configure the VDDK package:

  1. Download the VMware Virtual Disk Development Kit (VDDK) 8.0.2 for Linux
  2. Extract the contents of the tarball to /usr/lib64/vmware-vix-disklib/ on your system

After extraction, you should see the VDDK libraries under /usr/lib64/vmware-vix-disklib/.

CLI usage

To use the CLI provided by this repo you must:

  1. Have a VMware vCenter or ESXi host
  2. Have S3 credentials
  3. Have installed VDDK as described in the previous step

List VMs:

$ go run cmd/cli/main.go list-vms --detailed
┌─────────────────┬────┬───────────────────────────────────┬───────────┬───────────┬───────┬────────────────────┬───────────┬─────────────┐
│      NAME       │ ID │               PATH                │  STATUS   │ MEMORY GB │ CPUS  │       DISKS        │ SNAPSHOTS │ CBT ENABLED │
├─────────────────┼────┼───────────────────────────────────┼───────────┼───────────┼───────┼────────────────────┼───────────┼─────────────┤
│ Debian-Router   │ 2  │ /ha-datacenter/vm/Debian-Router   │ poweredOn │ 2         │ 2     │ [{Hard-disk-1 16}] │ 0         │ true        │
│ Debian-Target02 │ 4  │ /ha-datacenter/vm/Debian-Target02 │ poweredOn │ 2         │ 2     │ [{Hard-disk-1 20}] │ 0         │ false       │
│ Debian-Migrate  │ 5  │ /ha-datacenter/vm/Debian-Migrate  │ poweredOn │ 2         │ 2     │ [{Hard-disk-1 8}]  │ 0         │ false       │
└─────────────────┴────┴───────────────────────────────────┴───────────┴───────────┴───────┴────────────────────┴───────────┴─────────────┘

Enable CBT:

$ go run cmd/cli/main.go vm enable-cbt Debian-Target02
time=2025-10-13T15:48:19.525+03:30 level=INFO msg="CBT enabled successfully"
$ go run cmd/cli/main.go list-vms --detailed          
┌─────────────────┬────┬───────────────────────────────────┬───────────┬───────────┬───────┬────────────────────┬───────────┬─────────────┐
│      NAME       │ ID │               PATH                │  STATUS   │ MEMORY GB │ CPUS  │       DISKS        │ SNAPSHOTS │ CBT ENABLED │
├─────────────────┼────┼───────────────────────────────────┼───────────┼───────────┼───────┼────────────────────┼───────────┼─────────────┤
│ Debian-Router   │ 2  │ /ha-datacenter/vm/Debian-Router   │ poweredOn │ 2         │ 2     │ [{Hard-disk-1 16}] │ 0         │ true        │
│ Debian-Target02 │ 4  │ /ha-datacenter/vm/Debian-Target02 │ poweredOn │ 2         │ 2     │ [{Hard-disk-1 20}] │ 0         │ true        │
│ Debian-Migrate  │ 5  │ /ha-datacenter/vm/Debian-Migrate  │ poweredOn │ 2         │ 2     │ [{Hard-disk-1 8}]  │ 0         │ false       │
└─────────────────┴────┴───────────────────────────────────┴───────────┴───────────┴───────┴────────────────────┴───────────┴─────────────┘

List existing backups:

$ go run cmd/cli/main.go list-backups
┌─────────────────────────┬─────────────────┬────────────────┬────────────────────────────────────────┐
│       OBJECT KEY        │     VM KEY      │     DISKS      │             ROOT DISK KEY              │
├─────────────────────────┼─────────────────┼────────────────┼────────────────────────────────────────┤
│ vm-data-Debian-Target02 │ Debian-Target02 │ [0xc000336000] │ vm-data-Debian-Target02/disk-data-2000 │
└─────────────────────────┴─────────────────┴────────────────┴────────────────────────────────────────┘

Start a backup cycle

$ go run cmd/cli/main.go start-cycle Debian-Target02

Download an existing backup to disk

$ go run cmd/cli/main.go download-backup Debian-Target02 2000 ./disk2000.raw

To quickly verify on Ubuntu that the downloaded image contains the correct data:

$ sudo mkdir -p /mnt/disk-data
$ LOOP_DEVICE=$(sudo losetup -f --show ./disk2000.raw)
$ sudo kpartx -a $LOOP_DEVICE
$ sudo mount /dev/mapper/$(basename $LOOP_DEVICE)p1 /mnt/disk-data

After verifying, unmount and detach:

$ sudo umount /mnt/disk-data
$ sudo kpartx -d $LOOP_DEVICE
$ sudo losetup -d $LOOP_DEVICE

Upload from local to VMware

You can upload a disk from local storage back to VMware. Note that the exported disk data is in raw format, so you must convert it to VMDK first. Then:

$ go run cmd/cli/main.go restore-disk --data-store-name DS1 --local-path ./disk2000.vmdk --remote-path RestoredImages

How does this work?

For an incremental backup, we query VMware for the changed areas on a disk, using the change ID recorded during the last incremental backup. VMware responds with the offset and length of each changed area, and with nbdkit and the VDDK plugin we read exactly that many bytes from that offset. Then comes the S3 dilemma. Initially I kept a single S3 object per disk, multipart-uploading the changed areas and multipart-copying the unchanged areas, which had multiple problems:

  1. If a changed area is smaller than 5 MB (the S3 multipart minimum part size), I am forced to over-read and include unchanged data in the part
  2. If an unchanged area is smaller than 5 MB, I cannot use multipart copy and must append it to a changed area, again over-reading
  3. I have no control over compression on S3, because the backup is a single object built from differently sized areas, so no compression is possible
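To make problem 1 concrete, here is a quick arithmetic sketch of the wasted read for a small changed area (5 MiB is S3's documented minimum multipart part size):

```shell
# Over-read needed to pad a changed area up to S3's 5 MiB minimum part size.
MIN_PART=$((5 * 1024 * 1024))     # 5 MiB
changed_len=$((1 * 1024 * 1024))  # a 1 MiB changed area reported by CBT
overread=$((changed_len < MIN_PART ? MIN_PART - changed_len : 0))
echo "$overread"                  # 4194304 unchanged bytes read just to fill the part
```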

So the final approach for saving a backup on S3 ditches multipart upload entirely: each chunk is stored as its own object, for example vm-<key>/disk-<key>/full/00004, holding 64 MB of data. Keeping each chunk in its own file lets us compress it with zstd, and a manifest file tracks the state of the backup.
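The chunk layout above can be sketched as follows; chunk_key is a hypothetical helper, not part of the repo's CLI, shown only to illustrate how a byte offset maps to an object key:

```shell
# Map a disk byte offset to its backing S3 object key, one object per 64 MiB chunk.
chunk_key() {
  vm_key=$1; disk_key=$2; offset=$3
  chunk=$((offset / (64 * 1024 * 1024)))
  printf 'vm-%s/disk-%s/full/%05d\n' "$vm_key" "$disk_key" "$chunk"
}

chunk_key Debian-Target02 2000 $((256 * 1024 * 1024))   # prints vm-Debian-Target02/disk-2000/full/00004
```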

NBD Server

To simplify reading from existing backups, you can use the Go nbdkit plugin at cmd/nbdkit-plugin/. Build it and run the nbdkit server:

$ CGO_ENABLED=1 go build -o nbd.so --buildmode=c-shared cmd/nbdkit-plugin/*
$ nbdkit -f -v ./nbd.so \
    --threads=2 \
    s3-url="s3-url" \
    s3-secret-key="s3-secret-key" \
    s3-access-key="s3-access-key" \
    s3-region="tehran" \
    s3-bucket-name="test-vmware-backup-s3" \
    vm-key="Debian-Target02" \
    disk-key="2000" \
    -P /tmp/s3-nbd.pid \
    -i 127.0.0.1 \
    -p 10809

This exposes an NBD server on a local port, which you can read from, mount, or even feed to tools like virt-v2v. For example:

$ sudo modprobe nbd max_part=8
$ sudo qemu-nbd -d /dev/nbd0
$ sudo qemu-nbd -c /dev/nbd0 -r nbd:localhost:10809
$ qemu-img info nbd:localhost:10809
$ sudo fdisk -l /dev/nbd0
$ sudo mkdir /mnt/nbd/
$ sudo mount -r -t ext4 -o noload /dev/nbd0p1 /mnt/nbd/
$ ls -l /mnt/nbd/