Checklist for GPU-enabled VMs that won't start / out of resources

Updated 01/30/2017 02:55 PM

- ensure the M60 board is properly powered

- ensure the host has less than 1 TB of system RAM and uses the OEM-specified IOMMU settings

- verify the GPUs are in graphics mode (PCI class 0300): lspci -n | grep 10de

- ensure the NVIDIA software matches: the host vib and the guest OS drivers must come from the same bundle

- ensure the GPU mode switch utility has been removed from vSphere

- verify that nvidia-smi reports all GPUs in the vSphere shell

- check the driver status: vmkload_mod -l | grep nvidia

- ensure Xorg is running: /etc/init.d/xorg status (see KB article)

- ensure all memory is reserved for each GPU-enabled VM, in the VM properties

- start one VM at a time, not several at once

- ensure the VMs are not managed or set for live migration

- verify single VMs first, not linked clones

- start with a fresh VM image loaded with the proper version of the NVIDIA GRID guest OS driver, as opposed to a former corporate image

- if using vGPU modes, ensure the GPUs are not set to dedicated / pass-through / direct I/O mode in vSphere

- ensure the hypervisor is properly licensed for vGPU: Enterprise (Citrix) or Enterprise Plus (vSphere)

- ensure idle GPUs report all of their memory. If less than expected is reported, rerun the GPU mode switch utility from its ISO

- ensure ECC mode is disabled on the GPUs
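
Two of the host-side checks above (PCI class and reported GPU memory) can be scripted. The following is only a sketch, not an NVIDIA-supplied tool. The function names classify_nvidia_modes and check_gpu_memory are hypothetical, the minimum memory value is something you supply for your board, and the mapping of class 0302 to compute mode is an assumption. Both functions read command output on stdin, and they assume an awk is available in the shell you run them from.

```shell
# Classify NVIDIA devices (vendor ID 10de) from `lspci -n` output on stdin.
# Prints "<address> graphics" for PCI class 0300, "<address> compute" for
# class 0302 (assumed to indicate compute mode), "<address> other:<class>"
# for anything else. Handles lines with or without a leading "Class" word.
classify_nvidia_modes() {
  awk '
    {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^10de:/) {          # vendor:device field for NVIDIA
          cls = $(i - 1)              # the class field precedes it
          sub(/:$/, "", cls)          # drop the trailing colon
          if (cls == "0300") mode = "graphics"
          else if (cls == "0302") mode = "compute"
          else mode = "other:" cls
          print $1, mode
          break
        }
      }
    }'
}

# Flag GPUs that report less memory than expected.
# stdin: lines of "index, memory.total" in MiB, for example from
#   nvidia-smi --query-gpu=index,memory.total --format=csv,noheader,nounits
# $1: minimum expected MiB per GPU (hypothetical threshold, set it yourself)
# Exit status is 1 if any GPU is below the threshold, 0 otherwise.
check_gpu_memory() {
  awk -F', *' -v min="$1" '
    $2 + 0 < min + 0 {
      print "GPU " $1 ": " $2 " MiB (below " min ")"
      bad = 1
    }
    END { exit bad }'
}
```

Possible usage on the host: lspci -n | classify_nvidia_modes, and nvidia-smi --query-gpu=index,memory.total --format=csv,noheader,nounits | check_gpu_memory 8000. Any device not listed as "graphics", or any GPU flagged as low on memory, points back to the mode switch utility items in the checklist.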
