Checklist for GPU-enabled VMs that won't start / out of resources

Answer ID 4376
Updated 01/30/2017 02:55 PM
- ensure the M60 board is properly powered

- ensure the host has less than 1TB of system RAM and uses the OEM-specified IOMMU settings

- verify the GPUs are in graphics mode (class 300, shown as 0300) with: lspci -n | grep 10de

- ensure the NVIDIA software matches: the vib and the guest OS drivers must come from the same bundle

- ensure the GPU mode switch utility has been removed from vSphere

- verify that nvidia-smi reports all GPUs in the vSphere shell

- check the driver status with: vmkload_mod -l | grep nvidia

- ensure xorg is running: /etc/init.d/xorg status (see the related KB article)

- ensure all memory is reserved for each GPU-enabled VM in the VM properties

- start one VM at a time, not several at once

- ensure the VMs are not managed or set for live migration

- verify single VMs first, not linked clones

- start with a fresh VM image loaded with the proper version of the NVIDIA GRID guest OS driver, rather than a former corporate image

- if using vGPU modes, ensure the GPUs are not set to dedicated / pass-through / direct I/O mode in vSphere

- ensure the hypervisor is properly licensed for vGPU: Enterprise (Citrix) or Enterprise Plus (vSphere)

- ensure idle GPUs report all of their memory. If less is reported than expected, rerun the GPU mode switch utility ISO

- ensure ECC mode is disabled on the GPUs
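
The host-side command checks in the list above (lspci, vmkload_mod, xorg status, nvidia-smi) could be run together from the vSphere shell. The following is only a minimal sketch, not an NVIDIA-supplied tool: the function names gpus_in_graphics_mode and run_host_checks are hypothetical, and it assumes that every lspci line for vendor 10de is a GPU and that graphics mode shows up as a class field of 0300 followed by a colon.

```shell
#!/bin/sh
# Hypothetical helper: reads `lspci -n` lines on stdin.
# Succeeds only if at least one NVIDIA (vendor 10de) device is present
# and every such device reports class 0300 (assumed to mean graphics mode).
gpus_in_graphics_mode() {
    awk '
        /10de:/ {
            found = 1
            # The class field is assumed to appear as "0300:" after a space or tab.
            if ($0 !~ /(^|[ \t])0300:/) bad = 1
        }
        END {
            if (found && !bad) exit 0
            exit 1
        }
    '
}

# Hypothetical wrapper: runs the checklist commands in order on the host.
run_host_checks() {
    if lspci -n | grep 10de | gpus_in_graphics_mode; then
        echo "OK: all NVIDIA devices report class 0300"
    else
        echo "CHECK: no NVIDIA device found, or a device is not in class 0300"
    fi

    if vmkload_mod -l | grep -q nvidia; then
        echo "OK: nvidia module is loaded"
    else
        echo "CHECK: nvidia module is not loaded"
    fi

    # Print the xorg service status as reported by its init script.
    /etc/init.d/xorg status

    # Confirm by eye that every GPU is listed and idle GPUs show full memory.
    nvidia-smi
}
```

Calling run_host_checks on the host prints one line per automated check followed by the raw xorg and nvidia-smi output. The remaining items (power, licensing, VM memory reservation, image choice) still need to be verified by hand.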
