Checklist for GPU-enabled VMs that won't start / out of resources

Updated 09/29/2021 01:05 PM

  • ensure the M60 board is properly powered

  • ensure the host has less than 1 TB of system RAM and uses the OEM-specified IOMMU settings

  • verify the GPUs are in graphics mode (class 300) with lspci -n | grep 10de

  • ensure the NVIDIA software matches: the vib and the guest o/s drivers must come from the same bundle

  • ensure the GPU mode switch utility is removed from vSphere

  • verify nvidia-smi reports all GPUs in the vSphere shell

  • check driver status with vmkload_mod -l | grep nvidia

  • ensure xorg is running: /etc/init.d/xorg status (see the KB article)

  • ensure all memory is reserved for each GPU-enabled VM, in the VM properties

  • start VMs one at a time, not several at once

  • ensure the VMs are not managed or set for live migration

  • verify single VMs first, not linked clones

  • start with a fresh VM image loaded with the proper version of the NVIDIA GRID guest o/s driver, rather than a former corporate image

  • ensure GPUs are not set to dedicated / pass-through / direct IO mode in vSphere if using vGPU modes

  • ensure the hypervisor is properly licensed for vGPU: Enterprise (CITRIX) or Enterprise Plus (vSphere)

  • ensure idle GPUs report all of their memory. If less is reported than expected, rerun the GPU mode switch utility from its ISO

  • ensure ECC mode is disabled on the GPUs
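
The graphics mode check in the list above can be made less error-prone with a small filter over the lspci -n output. The following is a minimal sketch, not an NVIDIA or VMware tool. The function name check_gpu_mode is hypothetical. It assumes that each lspci -n line carries the class code immediately before the vendor:device pair, and that class 0300 (VGA controller) corresponds to graphics mode while class 0302 (3D controller) corresponds to compute mode.

```shell
#!/bin/sh
# check_gpu_mode (hypothetical helper): reads `lspci -n` output on stdin and
# prints one line per NVIDIA (vendor 10de) function: <address> <class> <mode>.
# Assumption: the class code is the field right before the "10de:xxxx" pair,
# with a trailing colon, e.g. "03:00.0 0300: 10de:xxxx".
check_gpu_mode() {
    awk '
        / 10de:/ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^10de:/) {
                    class = $(i - 1)
                    sub(/:$/, "", class)          # drop the trailing colon
                    # Assumed mapping: 0300 = graphics mode, 0302 = compute mode
                    if (class == "0300")      mode = "graphics"
                    else if (class == "0302") mode = "compute"
                    else                      mode = "other"
                    print $1, class, mode
                    break
                }
            }
        }
    '
}
```

Run it as lspci -n | check_gpu_mode in the host shell. Any GPU reported as "compute" would be a candidate for the GPU mode switch utility. Lines reported as "other" are NVIDIA functions with a different class code and need a manual look.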
