If you are running VMware vSphere 5.5 or 6.x, you may run into an issue with VM snapshots that can cause VMs to report guest data inconsistencies!

This is not funny at all: it is a potentially critical problem that you must plan for, considering that all VM-native backup products rely on VM snapshots!

It is also more serious than the cosmetic snapshot “issue” in VMware vSphere 6.0 when using the vSphere C# Client.

This issue is still open for vSphere 5.5 (and will probably remain unfixed, considering that v5.5 has reached End of General Support), 6.0 and 6.5.

If you try to download vSphere, you will notice this warning on the download page:

Important Information about SESparse snapshots on vSphere 5.5 and Above.

If you are running vSphere 5.5 or above, please review this important information regarding SEsparse snapshots. For more information, see KB 59216.

What happens? VMware has identified an issue in SEsparse VM snapshots that can cause data inconsistencies. This issue occurs when a VM is running on an SEsparse snapshot and experiences a burst of non-contiguous write IO in a very short period of time.

But what is SEsparse format? SEsparse is a snapshot format introduced in vSphere 5.5 for large disks, and is the preferred format for all snapshots in vSphere 6.5 and above with VMFS-6.

When you take a snapshot, the state of the virtual disk is preserved, which prevents the guest operating system from writing to it. A delta or child disk is created. The delta represents the difference between the current state of the VM disk and the state that existed when you took the previous snapshot.

On the VMFS datastore, the delta disk is a sparse disk. Sparse disks use the copy-on-write mechanism, in which the virtual disk contains no data, until the data is copied there by a write operation. This optimization saves storage space.
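To make the sparse delta disk idea more concrete, here is a minimal Python sketch. It is only a toy model and not how ESXi implements VMFSsparse or SEsparse: the class names `BaseDisk` and `DeltaDisk` are hypothetical, disks are modelled as lists of fixed-size blocks, and every write is assumed to cover a whole block.

```python
class BaseDisk:
    """Parent disk frozen by the snapshot (toy model: a list of blocks)."""

    def __init__(self, blocks):
        self._blocks = list(blocks)

    def read(self, index):
        return self._blocks[index]

    def __len__(self):
        return len(self._blocks)


class DeltaDisk:
    """Sparse child disk: holds only the blocks written after the snapshot."""

    def __init__(self, parent):
        self.parent = parent
        self._written = {}  # block index -> data; empty right after the snapshot

    def write(self, index, data):
        if not 0 <= index < len(self.parent):
            raise IndexError("block index out of range")
        # The parent is never modified; new data lands in the delta only.
        self._written[index] = data

    def read(self, index):
        # Blocks written since the snapshot come from the delta,
        # everything else falls through to the parent disk.
        if index in self._written:
            return self._written[index]
        return self.parent.read(index)

    def allocated_blocks(self):
        """Number of blocks the delta actually stores."""
        return len(self._written)


base = BaseDisk([b"a", b"b", b"c"])
delta = DeltaDisk(base)
delta.write(1, b"X")
print(delta.read(0), delta.read(1), base.read(1), delta.allocated_blocks())
```

In this model the delta consumes space only for blocks that were rewritten after the snapshot, which is the storage saving described above.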

Depending on the type of your datastore, delta disks use two different sparse formats:

  • VMFSsparse – VMFS5 uses the VMFSsparse format for virtual disks smaller than 2 TB.
    VMFSsparse is implemented on top of VMFS. The VMFSsparse layer processes I/Os issued to a snapshot VM. Technically, VMFSsparse is a redo-log that starts empty, immediately after a VM snapshot is taken. The redo-log expands to the size of its base vmdk, when the entire vmdk is rewritten with new data after the VM snapshotting. This redo-log is a file in the VMFS datastore. Upon snapshot creation, the base vmdk attached to the VM is changed to the newly created sparse vmdk.
  • SEsparse – SEsparse is the default format for all delta disks on VMFS6 datastores. On VMFS5, SEsparse is used for virtual disks of 2 TB and larger.
    SEsparse is a format similar to VMFSsparse with some enhancements. This format is space efficient and supports the space reclamation technique. With space reclamation, blocks that the guest OS deletes are marked. The system sends commands to the SEsparse layer in the hypervisor to unmap those blocks. The unmapping helps to reclaim space allocated by SEsparse once the guest operating system has deleted that data. For more information about space reclamation, see Storage Space Reclamation.

So basically the affected systems are:

  • VMFS-5 or NFS Datastores: VMs with snapshots and virtual disks of 2 TB or larger. On VMFS-5 and NFS, the SEsparse format is used for virtual disks that are 2 TB or larger.
  • VMFS-6 Datastores: VMs with snapshots. SEsparse is the default format for all snapshots on VMFS-6 datastores.
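The two rules above can be sketched as a small Python check, for example to reason about an exported inventory. This is only an illustration: the function names, the datastore type strings and the choice of binary units for 2 TB are assumptions, and no VMware API is involved.

```python
TWO_TB = 2 * 1024**4  # 2 TB in bytes (binary units are an assumption)


def uses_sesparse(datastore_type, disk_size_bytes):
    """Return True if a snapshot of this disk would use the SEsparse format."""
    ds = datastore_type.strip().upper().replace("-", "")
    if ds == "VMFS6":
        # SEsparse is the default for all snapshots on VMFS-6.
        return True
    if ds in ("VMFS5", "NFS"):
        # On VMFS-5 and NFS, only disks of 2 TB or larger use SEsparse.
        return disk_size_bytes >= TWO_TB
    raise ValueError(f"unknown datastore type: {datastore_type!r}")


def is_affected(datastore_type, disk_size_bytes, has_snapshot):
    """A disk is exposed only while the VM actually runs on a snapshot."""
    return has_snapshot and uses_sesparse(datastore_type, disk_size_bytes)


print(is_affected("VMFS-6", 40 * 1024**3, has_snapshot=True))
print(is_affected("VMFS-5", 40 * 1024**3, has_snapshot=True))
```

Note that a small disk on VMFS-6 is flagged while the same disk on VMFS-5 is not, which is why VMFS-6 environments have the larger exposure.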

This issue can cause different problems:

  • Applications such as databases may report block-level data inconsistency.
  • Guest operating systems may report file system metadata inconsistencies.
  • The VM fails to boot when it is running from an SEsparse snapshot.

At the moment, only vSphere 6.7 has a simple resolution: upgrade to VMware vSphere 6.7 Update 1.

For previous versions, this issue can be prevented by disabling “IO coalescing” for SEsparse. See VMware KB 59216 (Virtual Machines running on an SEsparse snapshot may report guest data inconsistencies) on how to apply this host mitigation.
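As a quick summary of the options described in this post, the version-to-action mapping could be written as the following Python sketch. The function name `remediation` and the returned strings are hypothetical, and the mapping only reflects what is stated here; always check KB 59216 for the current status of your exact build.

```python
def remediation(vsphere_version):
    """Map a vSphere version string such as '6.5' to the action described above."""
    parts = vsphere_version.strip().split(".")
    major_minor = (int(parts[0]), int(parts[1]))
    if major_minor == (6, 7):
        return "upgrade to vSphere 6.7 Update 1"
    if major_minor in ((5, 5), (6, 0), (6, 5)):
        return "disable IO coalescing for SEsparse (see KB 59216)"
    # Versions not discussed in this post are deliberately not guessed at.
    raise ValueError(f"version not covered here: {vsphere_version!r}")


for version in ("5.5", "6.0", "6.5", "6.7"):
    print(version, "->", remediation(version))
```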

Note that this is not the first issue with the SEsparse format. Here are some examples of previous issues:

  • VMware KB 2073390 (View Desktops may become unresponsive while powering off when using SEsparse disks in ESXi 5.5) – Solved in vSphere 5.5 Update 1
  • VMware KB 2150962 (Virtual machines become unresponsive or fails when running on SEsparse virtual disk format) – Solved in VMware ESXi 6.0 Update 3a and VMware ESXi 6.5 Patch 01

Virtualization, Cloud and Storage Architect. Tech Field delegate. VMUG IT Co-Founder and board member. VMware VMTN Moderator and vExpert 2010-24. Dell TechCenter Rockstar 2014-15. Microsoft MVP 2014-16. Veeam Vanguard 2015-23. Nutanix NTC 2014-20. Several certifications including: VCDX-DCV, VCP-DCV/DT/Cloud, VCAP-DCA/DCD/CIA/CID/DTA/DTD, MCSA, MCSE, MCITP, CCA, NPP.