Some users who already use GPU technologies choose to compare a vGPU profile with an "equivalent" physical GPU card on passthrough (called vDGA on VMware and GPU-passthrough on Citrix). For example, the vGPU K280Q profile on a K2 card is roughly equivalent in many specifications to a Quadro K5000 card.
The comparison is frequently made with intensive benchmarks such as CADalyst, which aim to maximize the load on the GPU and report a frame rate or similar measure, on the assumption that higher is better.
vGPU is a technology designed for shared use and for VDI/remoting. It includes a feature called the Frame Rate Limiter (FRL), which limits the frame rate to a nominal 60 fps. The limit is applied under fluctuating network and server loads, so in practice you may see frame rates slightly above it, for example 66-67 fps. The FRL can be disabled for benchmarking, but this is not advised: in production it prevents the VM from producing excess frames and bandwidth that consume resources and that remoting protocols could not handle. The FRL also helps to ensure that multiple users share the GPU "fairly".
As a result, a customer may achieve a very high frame rate on a physical or passthrough GPU but will see a maximum of around 60 fps on most vGPU profiles, by design. The FRL can be temporarily disabled by adding the configuration parameter pciPassthru0.cfg.frame_rate_limiter to the VM configuration settings and setting it appropriately; users wishing to do this should consult the appropriate documentation. Note, however, that NVIDIA does not validate vGPU with the FRL disabled, and that in a shared, networked environment disabling the FRL would degrade rather than improve performance by saturating the virtualization stack.
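For a benchmarking-only test VM, the parameter change described above could be scripted along the following lines. This is a minimal sketch, not a validated procedure: it only edits the text of a VMware-style configuration file held in a string, the helper names set_vmx_option and remove_vmx_option are hypothetical, and the value "0" for "disabled" is an assumption that should be confirmed against the hypervisor and vGPU documentation before use.

```python
# Sketch: add or remove the FRL parameter in the text of a .vmx-style file.
# Assumptions (not from the article): entries look like `key = "value"`,
# and "0" is the value that disables the limiter.

FRL_KEY = "pciPassthru0.cfg.frame_rate_limiter"


def set_vmx_option(vmx_text: str, key: str, value: str) -> str:
    """Return vmx_text with `key = "value"` set, replacing any existing entry."""
    new_line = f'{key} = "{value}"'
    out = []
    replaced = False
    for line in vmx_text.splitlines():
        name = line.split("=", 1)[0].strip()
        if name.lower() == key.lower():
            if not replaced:
                out.append(new_line)  # replace the first occurrence
                replaced = True
            # any further duplicates of the key are dropped
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


def remove_vmx_option(vmx_text: str, key: str) -> str:
    """Return vmx_text without any entry for `key` (restores the default FRL)."""
    kept = [
        line
        for line in vmx_text.splitlines()
        if line.split("=", 1)[0].strip().lower() != key.lower()
    ]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    sample = 'memSize = "8192"\n'
    print(set_vmx_option(sample, FRL_KEY, "0"))  # assumed "disabled" value
```

Removing the entry again after the benchmark run returns the VM to the default, limited behavior that NVIDIA validates.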
The FRL is set to 60 fps on most vGPU profiles but was set to 45 fps on a few legacy profiles such as the K100 and K200. The K100 and K200 profiles are no longer recommended for new deployments, having been superseded by the K120Q and K220Q profiles.
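When reading benchmark output, it can help to check whether a reported frame rate is simply sitting at the FRL cap rather than reflecting what the GPU could deliver. The sketch below encodes the caps stated above (60 fps on most profiles, 45 fps on K100 and K200). The function names and the 15% tolerance band are illustrative assumptions, not NVIDIA-defined values.

```python
# Sketch: flag benchmark results that are probably limited by the FRL.
# Caps come from the article; the tolerance is an arbitrary assumption.

LEGACY_FRL_CAP_FPS = {"K100": 45, "K200": 45}
DEFAULT_FRL_CAP_FPS = 60


def frl_cap(profile: str) -> int:
    """Return the nominal FRL cap in fps for a vGPU profile name."""
    return LEGACY_FRL_CAP_FPS.get(profile.upper(), DEFAULT_FRL_CAP_FPS)


def is_frl_bound(profile: str, measured_fps: float, tolerance: float = 0.15) -> bool:
    """True if measured_fps is at or near the cap, so the FRL is the likely limit.

    Readings above the cap (for example 66-67 fps against a 60 fps cap)
    also count as FRL-bound.
    """
    cap = frl_cap(profile)
    return measured_fps >= cap * (1.0 - tolerance)


if __name__ == "__main__":
    for profile, fps in [("K280Q", 66.5), ("K280Q", 30.0), ("K200", 44.0)]:
        print(profile, fps, "FRL-bound" if is_frl_bound(profile, fps) else "not FRL-bound")
```

A result flagged as FRL-bound says little about relative GPU capability, so it should not be compared directly with an uncapped passthrough score.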
Users comparing physical workstation behavior to vGPU should also consider a number of other factors:
· CUDA/OpenCL support. Only Mx-8Q vGPU profiles currently support CUDA/OpenCL, whereas CUDA/OpenCL is available on GPU-passthrough. Heavy CUDA/OpenCL benchmarks and workloads are therefore not appropriate for many vGPU profiles.
· Number of CPU cores available. Many 3D applications, particularly CAD applications, have large pockets of CPU-intensive code; some areas are multi-threaded, but large pockets remain mostly single-threaded. When comparing a physical workstation to a virtualized server, consider the number of vCPUs and the level of over-provisioning.
· CPU clock speed. This can be a very significant factor in the overall performance of CAD applications, so the clock speed should be noted and taken into account.
· Server CPU power settings. Many servers ship in power-saving mode rather than performance mode, so users must ensure that fan speeds, turbo capability and power settings are configured for maximum performance.
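The CPU-related factors in the list above can be recorded alongside each benchmark run so that like is compared with like. The following is a minimal sketch; the CpuSetup fields, the function names and the wording of the notes are all hypothetical, and the over-provisioning figure is simply total vCPUs allocated on the host divided by physical cores.

```python
# Sketch: note CPU-side differences between a workstation and a VM before
# comparing benchmark scores. All names here are illustrative assumptions.
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CpuSetup:
    cores: int               # physical cores (workstation) or vCPUs (VM)
    clock_ghz: float         # clock speed in GHz
    performance_mode: bool   # True if power settings favor performance


def overcommit_ratio(total_vcpus: int, physical_cores: int) -> float:
    """vCPUs allocated across all VMs on the host per physical core."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return total_vcpus / physical_cores


def cpu_caveats(workstation: CpuSetup, vm: CpuSetup, ratio: float) -> List[str]:
    """List the CPU-side reasons the VM may score lower than the workstation."""
    notes = []
    if vm.cores < workstation.cores:
        notes.append("VM has fewer vCPUs than the workstation has cores")
    if vm.clock_ghz < workstation.clock_ghz:
        notes.append("server CPU clock speed is lower")
    if not vm.performance_mode:
        notes.append("server is not in performance power mode")
    if ratio > 1.0:
        notes.append(f"host vCPUs are over-provisioned ({ratio:.1f}:1)")
    return notes


if __name__ == "__main__":
    ws = CpuSetup(cores=8, clock_ghz=3.6, performance_mode=True)
    vm = CpuSetup(cores=4, clock_ghz=2.4, performance_mode=False)
    for note in cpu_caveats(ws, vm, overcommit_ratio(64, 16)):
        print("-", note)
```

An empty list does not prove the two systems are equivalent; it only means none of these particular CPU-side differences were recorded.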
NVIDIA GRID vGPU cards including M10, M6, M60, K1 and K2