After upgrading a Dell PowerEdge R710 to VMware vSphere 5.5, I ran into a strange, occasional issue: the host became completely disconnected from vCenter (with the ESXi/ESX host’s status shown as Not Responding or Disconnected in vCenter Server), and there was no way to reconnect it, even after restarting the management services.

Looking at the ESXi console, errors like this one stood out: Bootbank cannot be found at path ‘/bootbank’.

The only temporary solution was to power off the VMs and restart the host, but the issue could come back at random.


A Google search turned up one possibly similar case: VMware ESXi 4.x and 5.x lose connectivity to Hypervisor – IBM BladeCenter HX5.

Of course this was not an IBM host, but the case is interesting: it relates to a Permanent Device Loss (PDL) condition that happens on the boot device when it is an Embedded USB Hypervisor. The issue can be identified by the same error message: the VMware ESXi kernel logs an error similar to “Bootbank cannot be found at path ‘/bootbank’”.

It also shows up in the VMware vSphere Client or VMware vCenter logs as a ‘Configuration Issue’ alert, due to ‘Lost connectivity to the device mpx.vmhba32:C0:T0:L0’ when ‘backing the boot file system /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0’.
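
The two symptoms described above can be searched for in a saved copy of the host logs. The following is only a minimal sketch: the two message fragments are taken from the text above, while the function name `find_symptoms` and the log location mentioned in the usage note are assumptions for illustration.

```python
# Message fragments quoted in the article; matching is a plain substring test.
SYMPTOMS = (
    "Bootbank cannot be found at path",
    "Lost connectivity to the device",
)


def find_symptoms(lines):
    """Return (line_number, symptom, text) for each line containing a symptom."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for symptom in SYMPTOMS:
            if symptom in line:
                hits.append((number, symptom, line.rstrip()))
                break  # report each line once, even if both fragments match
    return hits
```

To use it, pass any iterable of log lines, for example an open file object for a copy of the kernel log (on ESXi 5.x this is usually /var/log/vmkernel.log, but check the path on your own host).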

In my case the real root cause was the one described in KB 1017297 (ESXi/ESX host appears as Not Responding in vCenter Server due to CD/DVD-ROM drive firmware issues), which explains a possible incompatibility with the CD/DVD-ROM drive firmware.

To find out whether your CD drive has a bad firmware:

  • run the command esxcfg-scsidevs -l;
  • depending on your CD/DVD-ROM drive’s model or revision, you will see output similar to:
    mpx.vmhba1:C0:T0:L0
    Device Type: CD-ROM
    Size: 0 MB
    Display Name: Local TEAC CD-ROM (mpx.vmhba1:C0:T0:L0)
    Plugin: NMP
    Console Device: /dev/sr0
    Devfs Path: /vmfs/devices/genscsi/mpx.vmhba1:C0:T0:L0
    Vendor: TEAC      Model: DV-28E-V           Revis: C.AB
    SCSI Level: 5  Is Pseudo: false Status: on
    Is RDM Capable: false Is Removable: true
    Is Local: true
    Other Names:
    vml.0005000000766d686261313a303a30
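
With more than one host, it can help to extract the vendor, model and revision of each CD-ROM device from that output and compare them side by side. The sketch below is a rough parser that assumes the output has the same layout as the sample above (an unindented device name followed by indented fields). The names `find_cdrom_drives` and `SAMPLE` are hypothetical, and the list of affected firmware revisions must still be checked against KB 1017297, because it is not encoded here.

```python
import re

# Sample taken from the esxcfg-scsidevs -l output shown above.
SAMPLE = """\
mpx.vmhba1:C0:T0:L0
    Device Type: CD-ROM
    Size: 0 MB
    Display Name: Local TEAC CD-ROM (mpx.vmhba1:C0:T0:L0)
    Plugin: NMP
    Console Device: /dev/sr0
    Devfs Path: /vmfs/devices/genscsi/mpx.vmhba1:C0:T0:L0
    Vendor: TEAC      Model: DV-28E-V           Revis: C.AB
    SCSI Level: 5  Is Pseudo: false Status: on
    Is RDM Capable: false Is Removable: true
    Is Local: true
    Other Names:
    vml.0005000000766d686261313a303a30
"""

# Vendor, model and revision share a single line in the listing.
VENDOR_LINE = re.compile(r"Vendor:\s*(.*?)\s+Model:\s*(.*?)\s+Revis:\s*(\S*)")


def find_cdrom_drives(listing):
    """Return one dict per device whose 'Device Type' is CD-ROM."""
    devices = []
    current = None
    for line in listing.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line starts a new device record.
            current = {"device": line.strip(), "type": None}
            devices.append(current)
            continue
        if current is None:
            continue  # indented text before any device name: ignore it
        text = line.strip()
        if text.startswith("Device Type:"):
            current["type"] = text.split(":", 1)[1].strip()
            continue
        match = VENDOR_LINE.search(text)
        if match:
            current["vendor"], current["model"], current["revision"] = match.groups()
    return [d for d in devices if d["type"] == "CD-ROM"]
```

Calling `find_cdrom_drives` on the output captured from each host gives a small list of dictionaries that is easy to compare, which makes differences between supposedly identical servers visible.
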
To work around this issue, you can use one of these options:
  • Upgrade the CD/DVD-ROM drive firmware to the latest version available
  • Replace the CD/DVD-ROM drive with a different model
  • Disable the CD/DVD-ROM drive within the BIOS of the ESXi/ESX host

In my case all the hardware firmware was already up to date, so the first option was not applicable. The second was possible, but the simplest was just to disable the CD drive at the BIOS level and rely on the virtual CD feature of the iDRAC in the future.

Funnily enough, the other two (supposedly identical) hosts were not affected, simply because their CD drives were not exactly the same. The lesson learned is clear: the HCL always matters, even for minor devices, and don’t assume that hosts are really identical, even if they were configured the same way at purchase.

Virtualization, Cloud and Storage Architect. Tech Field delegate. VMUG IT Co-Founder and board member. VMware VMTN Moderator and vExpert 2010-24. Dell TechCenter Rockstar 2014-15. Microsoft MVP 2014-16. Veeam Vanguard 2015-23. Nutanix NTC 2014-20. Several certifications including: VCDX-DCV, VCP-DCV/DT/Cloud, VCAP-DCA/DCD/CIA/CID/DTA/DTD, MCSA, MCSE, MCITP, CCA, NPP.