Some weeks ago I ran into an issue in a VMware vSphere 5.1 environment: every vMotion operation stopped at 14% with some strange errors. There are several VMware KB articles related to this symptom, each pointing to a different cause.
The error message was simple: Timed out waiting for migration start request. So the closest KB article was: vMotion fails at 14% with the error: Timed out waiting for migration start request (2068817).
But none of the possible solutions were applicable to my case.
Due to poor planning on the customer side, the VMkernel vMotion interfaces were in the same LAN as the VMkernel management interfaces… So I temporarily moved the vMotion interface to a new IP in a new network, but that did not change the problem. And a host reboot was not possible (at least not during production hours).
So I investigated the assigned IPs further and found that the vMotion IP of the host where vMotion was stuck was already in use by another system.
When it detects a duplicate IP, ESXi seems to softly disable both this interface AND the vMotion function. Unfortunately, building a new VMkernel vMotion interface, removing the existing one, or disabling and re-enabling it does not change the fact that vMotion was disabled at host level. Disabling and re-enabling vMotion at host level (using the advanced setting Migrate / Migrate.enabled) does not fix the issue either, probably because this setting only takes effect after a host reboot.
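A check like this can be partly scripted. The following is only a minimal sketch, assuming you have already collected "IP MAC" pairs for the vMotion network (for example from ARP replies seen on an admin workstation); the function name find_duplicate_ips and the input format are my own assumptions, not something ESXi provides.

```shell
# Hypothetical helper: report every IP that is claimed by more than one MAC.
# Input (stdin): one "IP MAC" pair per line, however you collected them.
# Output: "IP mac1,mac2,..." for each conflicting IP, sorted by IP string.
find_duplicate_ips() {
    awk '
        NF >= 2 {
            mac = tolower($2)
            key = $1 SUBSEP mac
            # Count each distinct (IP, MAC) combination only once.
            if (!(key in seen)) {
                seen[key] = 1
                count[$1]++
                macs[$1] = (macs[$1] == "" ? "" : macs[$1] ",") mac
            }
        }
        END {
            for (ip in count)
                if (count[ip] > 1)
                    print ip, macs[ip]
        }
    ' | sort
}

# Usage (arp_pairs.txt is a placeholder file name):
#   find_duplicate_ips < arp_pairs.txt
```

Any IP printed by this function is answered by at least two different MAC addresses, which is the situation described above.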
But a host reboot was not an option because several VMs were running on it.
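For reference, the Migrate.enabled value mentioned above can also be read from the ESXi shell. This is only a sketch: I am assuming the esxcli advanced settings namespace with the option path /Migrate/Enabled, and that its output contains an "Int Value:" line; the helper name migrate_enabled_value is made up for this example.

```shell
# Hypothetical helper: extract the current integer value from the text printed by
#   esxcli system settings advanced list -o /Migrate/Enabled
# Assumes the output contains a line of the form "   Int Value: 1".
migrate_enabled_value() {
    # Anchor at the start of the line so "Default Int Value:" is not matched.
    awk -F': *' '/^[ \t]*Int Value:/ { print $2; exit }'
}

# Usage on the host (not executed here):
#   esxcli system settings advanced list -o /Migrate/Enabled | migrate_enabled_value
```

This is purely diagnostic: as described above, toggling the setting did not bring vMotion back in my case.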
So I found a simple way to re-enable the vMotion feature without a reboot. Using the VMkernel modules it is possible to remove and re-enable the feature.
The module is “migrate” and it is usually loaded if you have vMotion enabled, but in this case it was no longer loaded:
~ # esxcfg-module -l | grep migrate
This module does not have any parameters:
~ # esxcfg-module -i migrate
esxcfg-module module information
input file: /usr/lib/vmware/vmkmod/migrate
License: VMware
Version:
Name-space: esx@nover
Required name-spaces:
com.vmware.vmkapi@v2_0_0_0
vmkernel@nover
Parameters:
So, if you cannot load it with the -e option:
~ # esxcfg-module -e migrate
You need to force the operation with -f:
~ # esxcfg-module -f migrate
Module migrate loaded successfully
~ # esxcfg-module -l | grep migr
migrate 0 320
After this, vMotion was working again.
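The manual steps above can be wrapped in a small script. This is a rough sketch, not a tested procedure: it uses only the esxcfg-module options shown in this post (-l, -e, -f), and the function names module_in_listing and ensure_migrate_loaded are hypothetical. It also assumes the module name is the first column of the listing, as in the output above.

```shell
# Succeeds if the module named $1 appears as the first column of an
# `esxcfg-module -l` listing read from stdin.
module_in_listing() {
    grep -Eq "^[[:space:]]*$1[[:space:]]"
}

# Hypothetical wrapper: load "migrate" only when it is not already listed.
ensure_migrate_loaded() {
    if esxcfg-module -l | module_in_listing migrate; then
        echo "migrate is already loaded"
        return 0
    fi
    # Same sequence as in the post: try -e first, then force with -f.
    esxcfg-module -e migrate
    esxcfg-module -f migrate
    # Report success only if the module now shows up in the listing.
    esxcfg-module -l | module_in_listing migrate
}

# Usage on the affected host (not executed here):
#   ensure_migrate_loaded && echo "vMotion module is back"
```

The check is kept separate from the load so that the script does nothing on hosts where the module is still present.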