Monitoring NVIDIA GRID GPU and CPU metrics for Citrix XenServer with XenCenter – including whether the number of vCPUs per VM is correct

Answer ID 4117
Published 05/09/2016 02:35 PM
Updated 05/10/2016 10:35 AM


Symptoms or Errors

Many performance or scalability problems are caused by resource bottlenecks. Frequently it is not the GPU causing the issue but CPU contention, too low a CPU specification (applications such as Petrel benefit greatly from CPUs faster than 3.0GHz), RAM, IOPS, or bandwidth. Customers experiencing issues are encouraged to use their virtualization stack to monitor the GPU alongside other resources.

Resource limitations may result in issues such as slow sessions, freezing sessions, and sluggish mouse movements (the latter is often caused by networking/bandwidth limitations).

This article focuses on monitoring CPU alongside GPU resources.


XenServer Monitoring

Citrix XenServer exposes a wide range of metrics which can be accessed from a command prompt in the hypervisor or from within the XenCenter management console. Many metrics are off by default to avoid unnecessary system load where they are not normally needed. There is a very detailed guide on which metrics are available, how to configure thresholds for alerts, and how to trigger email alerts within Chapter 9 of the XenServer Administrator's Guide. Always consult the version of the guide pertaining to the version of XenServer you are using, e.g. for XS6.5 the Citrix XenServer® 6.5 Administrator's Guide.
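
As an illustration, data sources can be listed, enabled and read from the XenServer command line in dom0. A minimal sketch, assuming a host name of <hostname> and using a P-state counter as the example data source:

    # List every data source known to this host, with units and whether it is enabled
    xe host-data-source-list host=<hostname>

    # Start recording a data source that is off by default
    xe host-data-source-record data-source=cpu0-P0 host=<hostname>

    # Read back the current value of a recorded data source
    xe host-data-source-query data-source=cpu0-P0 host=<hostname>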


The metrics for monitoring GPU usage, though, are not documented in the administrator's guide, as this is a set of metrics currently associated with the NVIDIA vGPU feature rather than available for any GPU vendor. The guide does, however, contain all the information on how to add graphs for metrics such as those for GPUs and how to set up alerts. I’ve blogged about the availability of these metrics:

For NVIDIA vGPU the main metrics of interest are:



Type  Name                                     Units       Description                                                                                                          Enabled by default?  Condition for existence
Host  gpu_memory_free_<pci-bus-id>             Bytes       Unallocated framebuffer memory                                                                                       No                   A supported GPU is installed on the host
Host  gpu_memory_used_<pci-bus-id>             Bytes       Allocated framebuffer memory                                                                                         No                   A supported GPU is installed on the host
Host  gpu_power_usage_<pci-bus-id>             mW          Power usage of this GPU                                                                                              No                   A supported GPU is installed on the host
Host  gpu_temperature_<pci-bus-id>             °C          Temperature of this GPU                                                                                              No                   A supported GPU is installed on the host
Host  gpu_utilisation_compute_<pci-bus-id>     (fraction)  Proportion of time over the past sample period during which one or more kernels was executing on this GPU           No                   A supported GPU is installed on the host
Host  gpu_utilisation_memory_io_<pci-bus-id>   (fraction)  Proportion of time over the past sample period during which global (device) memory was being read or written on this GPU  No             A supported GPU is installed on the host
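
These GPU data sources are off by default, so they must be enabled before any data is recorded. A minimal sketch from the dom0 command line; the exact data-source name, including the <pci-bus-id> suffix, should be taken from the host-data-source-list output rather than typed from memory:

    # Find the GPU data sources on this host, including their <pci-bus-id> suffix
    xe host-data-source-list host=<hostname> | grep gpu

    # Enable recording of GPU compute utilisation for one specific GPU
    xe host-data-source-record data-source=gpu_utilisation_compute_<pci-bus-id> host=<hostname>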


Note: GPU metrics appear in XenCenter for GPU pass-through too, but because of the nature of PCIe pass-through the hypervisor has no access to the actual data (pass-through means only the VM can see/access the GPU), so these graphs and metrics will read zero.
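
For vGPU (as opposed to pass-through) the NVIDIA driver runs in dom0, so the standard nvidia-smi tool can be used there as a cross-check on what XenCenter reports. A sketch, assuming the GRID vGPU Manager is installed on the host:

    # Run in dom0 on a host with the NVIDIA GRID vGPU Manager installed
    nvidia-smi                      # one-page summary of each physical GPU
    nvidia-smi -q -d UTILIZATION    # detailed utilisation readout per GPU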


If you are troubleshooting a performance issue it is important to identify which resource is the bottleneck; often it is not the GPU. Metrics that are particularly worth checking include:


·        Those pertaining to CPU usage on the Host




Type  Name                Description                                                                     Condition for existence  XenCenter name
Host  cpu<cpu>-C<cstate>  Time CPU <cpu> spent in C-state <cstate> in milliseconds. Enabled by default.  C-state exists on CPU    CPU <cpu> C-state <cstate>
Host  cpu<cpu>-P<pstate>  Time CPU <cpu> spent in P-state <pstate> in milliseconds. Enabled by default.  P-state exists on CPU    CPU <cpu> P-state <pstate>
Host  cpu<cpu>            Utilisation of physical CPU <cpu> (fraction). Enabled by default.              CPU <cpu> exists         CPU <cpu>
Host  cpu_avg             Mean utilisation of physical CPUs (fraction). Enabled by default.              None                     Average CPU

C-state and P-state information is particularly insightful in the context of bursty applications (as CAD applications often are), where peak vs. average usage can differ greatly. Many servers are shipped in power-saving mode rather than configured for maximum performance; this needs to be changed in the BIOS to allow the hypervisor, and hence the application, to use the full range of P/C-states. I wrote a guide to C/P-states a long time ago: I’m not sure whether the information is still correct with respect to the XenServer commands to optimally configure a system, but the monitoring instructions should be correct.
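
To verify which C/P-states the hypervisor can actually reach after changing the BIOS setting, the Xen xenpm tool in dom0 is one option. A sketch; xenpm ships with Xen, but the output format can vary between XenServer versions:

    # Show the P-states (frequencies) and cpufreq governor exposed to Xen
    xenpm get-cpufreq-para

    # Show the C-states and the time each CPU has spent in them
    xenpm get-cpuidle-states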


Many CAD/3D applications can be highly single-threaded and benefit from using turbo mode; Catia is one application that has often behaved this way. The highest P-state (P0) is traditionally used to indicate whether turbo is in use, but if you are using XenCenter you must note the convention that, when turbo is available, P0 is turbo mode and P1 is the highest non-turbo mode. There is also a convention of labelling turbo mode with a frequency 1MHz above the normal maximum frequency, which means XenCenter does not reflect the true frequency of the turbo mode, and users may misread this as turbo not occurring. E.g. on a 3400MHz Intel system, P0 will be logged as 3401MHz, while the highest non-turbo mode is P1 at 3400MHz.


·        Those pertaining to CPU usage on the VM




Type  Name                         Description                                                              Condition for existence  XenCenter name
VM    cpu<cpu>                     Utilisation of vCPU <cpu> (fraction). Enabled by default.                vCPU <cpu> exists        CPU <cpu>
VM    memory                       Memory currently allocated to VM (Bytes). Enabled by default.            None                     Total Memory
VM    memory_target                Target of VM balloon driver (Bytes). Enabled by default.                 None                     Memory target
VM    memory_internal_free         Memory used as reported by the guest agent (KiB). Enabled by default.    None                     Free Memory
VM    runstate_fullrun             Fraction of time that all VCPUs are running.                             None                     VCPUs full run
VM    runstate_full_contention     Fraction of time that all VCPUs are runnable (i.e. waiting for CPU).     None                     VCPUs full contention
VM    runstate_concurrency_hazard  Fraction of time that some VCPUs are running and some are runnable.      None                     VCPUs concurrency hazard
VM    runstate_blocked             Fraction of time that all VCPUs are blocked or offline.                  None                     VCPUs idle
VM    runstate_partial_run         Fraction of time that some VCPUs are running and some are blocked.       None                     VCPUs partial run
VM    runstate_partial_contention  Fraction of time that some VCPUs are runnable and some are blocked.      None                     VCPUs partial contention


o   The VM runstate_ metrics allow you to assess vCPU contention. This is especially worth monitoring if you are overprovisioning vCPUs. The background to understand this and details of how to do this can be found within this blog:


o   If you are interested in measuring vCPU overprovisioning from the point of view of the host, you can use the host’s cpu_avg metric and check whether it is too close to 1.0 (rather than 0.8, i.e. 80%). If you are interested in measuring it from the point of view of a specific VM, you can use the VM’s runstate_* metrics, especially the ones measuring runnable time, which should be less than about 0.01. These metrics can be investigated via the command line or XenCenter, as sketched below.
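
A minimal command-line sketch of both checks; the VM name is a placeholder, and the runstate data source may need to be enabled first with vm-data-source-record:

    # Host view: mean physical CPU utilisation; worry if this sits close to 1.0
    xe host-data-source-query data-source=cpu_avg host=<hostname>

    # VM view: fraction of time all vCPUs were runnable (waiting for a pCPU);
    # sustained values above roughly 0.01 suggest vCPU contention
    xe vm-data-source-record vm=<vm-name> data-source=runstate_full_contention
    xe vm-data-source-query vm=<vm-name> data-source=runstate_full_contention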


XenServer metrics are stored by an RRD (Round Robin Database) mechanism, which limits stored data by degrading the granularity of historical data. E.g. the last 10 minutes of data can be accessed at the 5s sample interval at which it was collected; older data is binned into larger samples and so becomes increasingly averaged. This means graphs in XenCenter become smoother over time and data on short-lived events is lost. Each archive in the database samples its particular metric at a specified granularity:

o   Every 5 seconds for the duration of 10 minutes

o   Every minute for the past two hours

o   Every hour for the past week

o   Every day for the past year
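
The RRDs themselves can also be retrieved over HTTP for offline analysis or for feeding third-party monitoring tools. A sketch using the rrd_updates handler XenServer exposes; the credentials, host address and timestamps are placeholders:

    # Fetch all host and VM metric updates since a given Unix timestamp (XML format)
    curl -u root:<password> "http://<xenserver-host>/rrd_updates?start=<epoch-seconds>&host=true"

    # Fetch the full round robin database for a single VM by UUID
    curl -u root:<password> "http://<xenserver-host>/vm_rrd?uuid=<vm-uuid>"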


XenCenter contains a very generic interface to metric data, which means any available metric can be graphed and plotted. Once you know the names of the GPU metrics, the guide shows how to add them to XenCenter graphs.


Exercise: Adding P-state graphs to XenCenter

Find the section “Configuring Performance Graphs” within the XenServer Administrator's Guide and follow the steps:

To Add A New Graph

1.      On the Performance tab, click Actions and then New Graph. The New Graph dialog box will be displayed.

2.      In the Name field, enter a name for the graph.

3.      From the list of Datasources, select the check boxes for the datasources you want to include in the graph, i.e. those with the format CPU <cpu> P-state <pstate>:

a.      Add all available P-states for the first CPU

b.      What C-states are available?

4.      Click Save.

5.      Now view the graph:

a.      Is turbo-boost in use, can you tell? (hover over the graph)
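
As a command-line cross-check for this exercise: the cpu<cpu>-P<pstate> data sources count time spent in each P-state, so a P0 counter that increases between two readings indicates turbo is actually being entered. A sketch with placeholder names, assuming the data source has been enabled as described earlier:

    # Time CPU 0 has spent in P0 (turbo, where available), in milliseconds
    xe host-data-source-query data-source=cpu0-P0 host=<hostname>
    sleep 30
    # If this second reading is higher, CPU 0 entered turbo during the interval
    xe host-data-source-query data-source=cpu0-P0 host=<hostname>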


Exercise: Check whether vCPU contention is occurring using XenCenter

o   Hint: you may need to add a graph for a certain runstate_ metric

o   Hint: you may also need to check a CPU metric, which one?


Checking your GPU configuration

The XenServer CLI (Command Line Interface) offers many commands to probe your XenServer environment. Again, these are documented in the Administrator's Guide, in an appendix sub-section titled “GPU Commands”. The CLI has good, if esoteric, tab completion.

Exercise: Check which vGPU types are in use on each pGPU (physical GPU) in the system


o   Use xe pgpu-list to get a list of the pGPUs in the system, then

o   use the output from this as input to an xe command that reads each pGPU's resident-VGPUs parameter, to find out which vGPUs have been configured on each pGPU (see the sketch below).
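
A sketch of one way to chain these commands; the UUIDs are placeholders, and resident-VGPUs is a read-only parameter on each pGPU object:

    # List all physical GPUs in the system with their UUIDs
    xe pgpu-list

    # Show the UUIDs of the vGPUs currently resident on one pGPU
    xe pgpu-param-get uuid=<pgpu-uuid> param-name=resident-VGPUs

    # Inspect one of the returned vGPUs, e.g. its type and the VM it belongs to
    xe vgpu-param-list uuid=<vgpu-uuid>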


·       Questions and Answers regarding Citrix XenServer and XenCenter can be found on the Citrix Support forums, which are a good place to ask further questions: 

·       NVIDIA third-party video on why you should not measure GPU consumption for vGPU within a VM:

·       NVIDIA Answer ID 4108: Monitoring the framebuffer for NVIDIA GRID vGPU and GPU-passthrough

·       Third-party blog on limitations in NVIDIA GRID GPU monitoring:

·       You cannot measure GPU usage from the hypervisor, either from the command line or in XenCenter, when using GPU pass-through. In XenCenter the GPU metrics will read zero. You can read more here (third-party blog):


Applicable products

NVIDIA GRID GPUs used for vGPU and GPU-passthrough including K1, K2, M6, M60

Citrix XenServer and XenCenter


