Avoid Breaking the SimpliVity OmniStack Virtual Controller when Upgrading SimpliVity Firmware

We are upgrading an existing SimpliVity cluster to the latest version of SimpliVity, a hyper-converged infrastructure (HCI) solution, going from build 3.7.1 to 3.7.6.  Before you upgrade the OmniStack Virtual Controllers (OVCs), you must upgrade the firmware on the ESXi host.  This installation uses HPE DL380 G9 servers, which use LSI controllers connected to the SSD drives.  We needed to install 790-0001107-LSI-g.iso (LSI Firmware Update) and SVTSP-2018_0828.01.iso (DL380 G9 Firmware Update).  According to HPE's instructions, you install the LSI firmware first and the DL380 G9 firmware second.  Both firmware updates have issues.  Here's a summary:

  1. LSI Firmware Update. 
    1. Disable HA on the Cluster.
    2. Disable any VM Affinity Rules on the Cluster.
    3. vMotion all of the VMs off the host to a different host, but do not migrate the OVC.
    4. Shut down the OVC.
      1. Click on Home, SimpliVity Federation, Hosts.
      2. Right-click on the OVC you want to shut down.
      3. Select All HPE SimpliVity Actions, Shut Down Virtual Controller.
    5. After the OVC has shut down, put the host in maintenance mode.
    6. Use ILO to attach the 790-0001107-LSI-g.iso to the host.
    7. Use ILO to connect to the console of the ESXi host.
    8. Reboot the host and press F11 to boot the attached ISO image.
    9. At the prompt, run sudo su -
    10. Run ./update_fw
    11. Depending on how many drives you have this can take 30 to 40 minutes.
    12. Wait until you are prompted to reboot the server. 
    13. Shut down the server.  DO NOT reboot it!
    14. Disconnect the power cords from the server and wait at least five minutes.  This resets the LSI controller after the firmware flash.  If you do not disconnect the power cords, the LSI controller will report errors when the host boots back up and the OVC will not start properly!  Do NOT skip this step or you will be very sad.
  2. DL380 G9 Firmware Update.
    1. Power up the host.
    2. Using ILO, attach the SVTSP-2018_0828.01.iso file to the host.
    3. Press F11 during the boot process and boot the ISO image.  Make sure to select the Automatic Upgrade.  If you do not, the firmware flash will not properly complete.
    4. The ILO firmware is upgraded last, which will cause the remote console to disconnect.
    5. The server will automatically shut down. 
    6. Disconnect the power (again) from the server.  If you do not, the ILO will incorrectly detect a failed power supply.  Disconnecting the power resets the power supply and clears the error.  The incorrect power supply error appears to be a byproduct of flashing the hardware.
    7. Power up the server.
  3. Take the ESXi host out of maintenance mode.
  4. Wait a few minutes and start the OVC.  Use the console to make sure the OVC properly starts up. 
  5. SSH into the OVC and use your vCenter Credentials to log in.
  6. Run status svtfs.  It should come back with start/running, process <process_id>.  If the status is not running, the OVC probably ran into an error with the LSI controller.  We suggest contacting HPE SimpliVity support and having them review the nostart semaphore to see why svtfs did not start properly.
  7. Run svt-federation-show.  This should show all of your OVCs Alive and connected.
  8. Run svt-vm-show.  Initially this will list all of your VMs with a Storage HA status of No.  As the OVC resynchronizes its storage the VMs should show a Storage HA status of Yes.
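
If you capture the output of status svtfs for each node (for example, while working through several hosts), the check from step 6 can be sketched in a few lines of Python.  This is only an illustration: the function name svtfs_pid and the sample strings are hypothetical, and the only format assumed is the start/running, process <process_id> text described above.

```python
import re

# Matches the "start/running, process <process_id>" text described in step 6.
# Anything before it on the line (such as a job name) is ignored.
_RUNNING = re.compile(r"\bstart/running, process (\d+)\b")


def svtfs_pid(status_output):
    """Return the process ID if the output reports start/running, else None."""
    match = _RUNNING.search(status_output)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Hypothetical sample strings, not captured from a real OVC.
    samples = [
        "svtfs start/running, process 1234",
        "svtfs stop/waiting",
    ]
    for line in samples:
        pid = svtfs_pid(line)
        if pid is None:
            print("svtfs is NOT running:", line)
        else:
            print("svtfs is running with process ID", pid)
```

A result of None is the cue to stop and contact support rather than continue to the next node.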

Congratulations!  You've successfully upgraded the firmware on your SimpliVity nodes.  The two critical items are to disconnect the power after you upgrade the LSI firmware and again after you upgrade the DL380 G9 firmware.  Note that DL380 G10s do NOT have the LSI issue because the G10 servers do not use an LSI controller.  Because of all of the issues we ran into, it took us seven hours to flash the first node and about four hours to flash the second node.  Hopefully this article will save you some time.  Happy flashing!
