Useful nvidia-smi Queries

Updated 09/29/2021 10:18 AM

What are useful nvidia-smi queries for troubleshooting?

VBIOS Version

Query the VBIOS version of each device:

$ nvidia-smi --query-gpu=gpu_name,gpu_bus_id,vbios_version --format=csv
name, pci.bus_id, vbios_version
GRID K2, 0000:87:00.0, 80.04.D4.00.07
GRID K2, 0000:88:00.0, 80.04.D4.00.08

Query GPU metrics for host-side logging

This query is useful for monitoring hypervisor-side GPU metrics, and works on both ESXi and XenServer.

$ nvidia-smi --query-gpu=timestamp,name,pci.bus_id,driver_version,pstate,temperature.gpu,utilization.gpu,utilization.memory,memory.used --format=csv -l 5

When adding additional parameters to a query, ensure that no spaces are added between the query options.

You can get a complete list of the query arguments by issuing: nvidia-smi --help-query-gpu

nvidia-smi Usage for logging

Short-term logging

Add the option "-f <filename>" to redirect the output to a file

Prepend "timeout <seconds>" to run the query for <seconds> and then stop logging. (GNU coreutils timeout takes the duration as its first argument; the older "-t <seconds>" form is BusyBox syntax.)

Ensure that your query granularity is appropriately sized for the required use.
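Combining the options above, a short-term logging run might look like the following sketch; the field list, 5-second interval, 60-second duration, and log filename are illustrative choices, not prescribed by this article:

```shell
#!/bin/sh
# Sketch only: log GPU metrics every 5 seconds for 60 seconds, then stop.
# Requires GNU coreutils `timeout` and the NVIDIA driver to be installed.
# Field list, interval, duration, and filename are illustrative.
timeout 60 nvidia-smi \
    --query-gpu=timestamp,name,temperature.gpu,utilization.gpu,memory.used \
    --format=csv -l 5 -f gpu-short.log
```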

Long-term logging

Create a shell script to automate the creation of the log file with timestamp data added to the filename and query parameters

Add a custom cron job to /var/spool/cron/crontabs to call the script at the intervals required.
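The two long-term logging steps above can be sketched as a small script; the log directory, query fields, and script path are illustrative assumptions, not part of the original article:

```shell
#!/bin/sh
# Sketch of a long-term logging script. LOGDIR and the query fields are
# illustrative assumptions; adjust them for your environment.
LOGDIR=${LOGDIR:-/var/log/gpu}
mkdir -p "$LOGDIR"
# Timestamped filename, e.g. gpu-metrics-20210929-101800.csv
LOGFILE="$LOGDIR/gpu-metrics-$(date +%Y%m%d-%H%M%S).csv"
nvidia-smi \
    --query-gpu=timestamp,name,pci.bus_id,temperature.gpu,utilization.gpu,memory.used \
    --format=csv > "$LOGFILE"
```

A crontab entry such as `*/5 * * * * /usr/local/bin/gpu-log.sh` (path assumed) would then capture a snapshot every five minutes.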

Additional low level commands used for clocks and power

Enable Persistence Mode

Any settings below for clocks and power get reset between program runs unless you enable persistence mode (PM) for the driver.

Also note that the nvidia-smi command runs much faster if persistence mode is enabled.

nvidia-smi -pm 1 makes clock, power, and other settings persist across program runs and driver invocations.


