Since version 4.x, VMware vSphere has provided a native API for image-level backup of running VMs: VMware vSphere Storage APIs – Data Protection (formerly known as VMware vStorage APIs for Data Protection, or VADP).
These APIs rely on VMware snapshots to create a readable base VMDK disk that a compatible backup solution can easily copy, using either a full transfer or an incremental transfer based on the Changed Block Tracking (CBT) feature.
A backup product using VMware vSphere Storage APIs – Data Protection can back up vSphere virtual machines from a central backup server or virtual machine, without requiring backup agents and without running the backup processing inside each guest virtual machine on the ESX host. This offloads backup processing from ESX hosts and reduces costs by allowing each ESX host to run more virtual machines.
But these APIs are not always bug-free, and data corruption can occur in specific builds and in specific cases, almost always in connection with CBT-based backups.
Here is a brief table summarizing some of the versions and builds affected by possible issues:
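To make the difference between the two transfer modes concrete, here is a minimal sketch in plain Python. It does not call any VMware API: the disk is modeled as a byte string, and the list of changed extents (which a real backup product would obtain from CBT, for example through the vSphere API's `QueryChangedDiskAreas` call) is passed in by hand. The names `full_copy`, `incremental_copy`, and `Extent` are hypothetical and only serve the illustration.

```python
from typing import Iterable, Tuple

# An extent is (offset, length) in bytes. In this sketch the extents are
# supplied by the caller; a real product would get them from CBT.
Extent = Tuple[int, int]


def full_copy(source: bytes) -> bytearray:
    """Full transfer: copy every byte of the (simulated) base disk."""
    return bytearray(source)


def incremental_copy(source: bytes, target: bytearray,
                     changed: Iterable[Extent]) -> int:
    """Incremental transfer: copy only the extents reported as changed.

    Returns the number of bytes copied. The backup is only as correct
    as the extent list it is given.
    """
    copied = 0
    for offset, length in changed:
        end = offset + length
        if offset < 0 or length < 0 or end > len(source):
            raise ValueError(f"extent ({offset}, {length}) is outside the disk")
        if end > len(target):
            # The source disk grew: pad the backup up to the new extent end.
            target.extend(bytes(end - len(target)))
        target[offset:end] = source[offset:end]
        copied += length
    return copied


# First backup: full transfer.
disk_v1 = b"AAAABBBBCCCC"
backup = full_copy(disk_v1)

# Second backup: only the changed middle block is transferred.
disk_v2 = b"AAAAXXXXCCCC"
copied = incremental_copy(disk_v2, backup, [(4, 4)])
```

After the second call, `backup` matches `disk_v2` even though only 4 of the 12 bytes were transferred. The same sketch also shows why a wrong extent list is dangerous: if an extent that really changed is missing from `changed`, the backup silently keeps the old data for that range.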
vSphere version | Solution | Issue |
8.0.2 | Update to ESXi 8.0 Update 2b, build 23305546 | Possible CBT corruption after a virtual disk hot extend KB 95965 |
6.7 | Follow KB instruction | The virtual disk is either corrupted or not a supported format when committing or creating snapshots KB 2013520 |
6.5/6.7 | Upgrade to VMware VDDK 6.7 or later version | CBT reports larger area of changed blocks than expected if guest OS performed unmap on a disk KB 59905 |
6.0.0 | Update and follow KB instructions | VMware vSphere 6 CBT issue KB 2114076 KB 2136854 |
6.x | Apply the latest updates | When users revert to a snapshot, CBT is automatically disabled and re-enabled. If the snapshot is not a memory snapshot, enabling CBT fails and CBT stays disabled after reverting. KB 71155 |
4.x/5.x | Apply patches | Possible backup data corruption after extending virtual machine VMDK file with Changed Block Tracking (CBT) enabled KB 2090639 |
As you can see, one of these conditions can still occur on a recent version of vSphere (without the latest patch):
- If you hot-extend a VM disk on vSphere 8.0.2, CBT may contain incorrect information. A fix was released a few days after the bug appeared: ESXi 8.0 Update 2b, build 23305546, solves the problem.
Funnily enough, a similar bug was introduced in the old 4.x and 5.x versions.
So the safer solution is not to hot-resize disks at all, and to resize VMDKs only on powered-off VMs.
Follow KB 95965 to learn more!
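As a quick illustration of the rule above, the following sketch flags a host as potentially exposed to the hot-extend issue when it runs 8.0.2 with a build number below 23305546, the fixed build quoted in this article. The function name is hypothetical, and the sketch assumes that a lower build number on the 8.0.2 line means the fix is missing; how you collect the version and build from your hosts is up to your own tooling.

```python
# Build of ESXi 8.0 Update 2b, as quoted in this article.
FIXED_BUILD_80U2B = 23305546


def hot_extend_cbt_risk(version: str, build: int) -> bool:
    """Return True if the host looks exposed to the KB 95965 issue.

    Assumption (not taken from the KB): any 8.0.2 host whose build
    number is lower than the fixed build does not contain the fix.
    """
    parts = version.strip().split(".")
    try:
        numeric = tuple(int(p) for p in parts[:3])
    except ValueError:
        return False  # unparsable version string: make no claim
    return numeric == (8, 0, 2) and build < FIXED_BUILD_80U2B


# Illustrative inputs only; the first build number is made up.
print(hot_extend_cbt_risk("8.0.2", 23305545))
print(hot_extend_cbt_risk("8.0.2", 23305546))
```

A check like this only tells you whether the patch is present; resizing disks on powered-off VMs remains the conservative option on any version.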