Guide to Checking GPU on Ubuntu: How to Use and Configure nvidia-smi

1. Introduction

When utilizing a GPU on Ubuntu, it is crucial to accurately monitor its status. This is especially important for tasks such as deep learning and graphic rendering, where understanding GPU usage and driver versions is essential. This article explains how to use nvidia-smi, an NVIDIA GPU management tool, and provides a guide on checking GPU status on Ubuntu.

2. Checking GPU Information with nvidia-smi

nvidia-smi is a command-line tool that allows you to monitor NVIDIA GPU usage, memory consumption, and other details. It is particularly useful for real-time monitoring of GPU activity and retrieving detailed usage information.

Basic Usage

The following command displays real-time GPU usage and memory consumption:

nvidia-smi --query-gpu=timestamp,name,utilization.gpu,utilization.memory,memory.used,memory.free --format=csv -l 1

This command outputs the timestamp, GPU name, GPU utilization, memory utilization, used memory, and free memory. The -l option sets the update interval in seconds (one second in this example).

Output Format and File Logging

Running nvidia-smi without options displays a summary table, but with --query-gpu and --format=csv you can output just the fields you need in CSV format, which is easier to process. To save the output to a file, specify the output path with the -f option.

nvidia-smi --query-gpu=timestamp,name,utilization.gpu,utilization.memory,memory.used,memory.free --format=csv -l 1 -f /path/to/output.csv

This method allows you to log GPU usage for later analysis.
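
For later analysis, such a log can be read with a few lines of Python. The snippet below is a minimal sketch that assumes the log was produced with --format=csv as shown above and reuses the example output path:

import csv

# Read the CSV log written by nvidia-smi (example path from above)
with open("/path/to/output.csv", newline="") as f:
    reader = csv.reader(f, skipinitialspace=True)
    header = next(reader)  # e.g. timestamp, name, utilization.gpu [%], ...
    for row in reader:
        print(dict(zip(header, row)))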

3. Retrieving Process Information with nvidia-smi

Using nvidia-smi, you can retrieve information about the processes currently utilizing the GPU. This helps identify which processes are consuming GPU resources and to what extent.

Getting Process Information

Run the following command to check the PID and memory usage of processes using the GPU:

nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv,noheader

This command returns a list of currently running GPU processes along with their memory usage.
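
If you prefer to collect this information from a script, the snippet below is a minimal Python sketch that wraps the same query with subprocess (it assumes nvidia-smi is on the PATH and adds nounits so the memory column is a plain number):

import csv
import subprocess

# Run the same process query as above and parse its CSV output
result = subprocess.run(
    ["nvidia-smi", "--query-compute-apps=pid,process_name,used_memory",
     "--format=csv,noheader,nounits"],
    capture_output=True, text=True, check=True,
)
rows = csv.reader(result.stdout.strip().splitlines(), skipinitialspace=True)
for pid, name, used_mib in rows:
    print(f"PID {pid}: {name} uses {used_mib} MiB")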

nvidia-smi pmon Subcommand

The nvidia-smi tool includes a subcommand called pmon, which provides more detailed information about GPU processes.

nvidia-smi pmon -d 10 -s u -o DT

This command prints GPU process statistics at the specified interval. The -d option sets the update interval in seconds, -s u selects utilization metrics, and -o DT prefixes each line with the date and time.

4. Installing and Verifying NVIDIA Drivers

To use an NVIDIA GPU on Ubuntu, you must install the appropriate NVIDIA driver. Below are the steps for installing and verifying the driver.

Installing the Driver

First, install an NVIDIA driver with apt. Version 510 is used here as an example; the ubuntu-drivers devices command lists the driver recommended for your hardware:

sudo apt install nvidia-driver-510

Once the installation is complete, restart your system.

Verifying the Installation

After rebooting, check if the driver is installed correctly using the following command:

nvidia-smi

If the command displays a table showing the driver version and the highest CUDA version supported by that driver, the installation was successful.

5. Verifying GPU Operation with TensorFlow

To confirm that the GPU is functioning correctly, you can use TensorFlow, a machine learning framework, for testing.

Installing Anaconda and TensorFlow

First, run the Anaconda installer to set up a Python environment, then install a GPU-enabled build of TensorFlow:

# Run the Anaconda installer (adjust the filename to the version you downloaded)
bash ./Anaconda3-2022.05-Linux-x86_64.sh
# Update conda itself and the installed packages
conda update -n base conda
conda update anaconda
conda update -y --all
# Install a GPU-enabled TensorFlow build into the environment
conda install tensorflow-gpu==2.4.1

Checking GPU Recognition with TensorFlow

Next, verify that TensorFlow recognizes the GPU by running the following Python code (for example, in a Python interpreter started from the Anaconda environment):

from tensorflow.python.client import device_lib
device_lib.list_local_devices()

If the GPU device appears in the list, TensorFlow has successfully detected the GPU.
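
As an additional check, TensorFlow 2.x (including 2.4.1) also provides tf.config.list_physical_devices, which returns the GPUs visible to TensorFlow:

import tensorflow as tf

# Prints the physical GPU devices TensorFlow can see (an empty list means no GPU)
print(tf.config.list_physical_devices("GPU"))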

6. Monitoring GPU Usage and Logging

Using nvidia-smi, you can monitor GPU usage in real time and log the data. This helps track GPU utilization over long periods and optimize performance.

Setting Up Regular Monitoring

To set up periodic monitoring, use the -l option to specify the update interval in seconds and the -f option to write the output to a log file. Note that writing to a path under /var/log typically requires root privileges.

nvidia-smi --query-gpu=timestamp,name,utilization.gpu,utilization.memory,memory.used,memory.free --format=csv -l 1 -f /var/log/gpu.log

Programmatic Control with Python Bindings

The NVIDIA Management Library (NVML), which nvidia-smi is built on, also has official Python bindings (the nvidia-ml-py package, imported as pynvml). These allow you to retrieve GPU information programmatically and build customized monitoring and control.
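
The snippet below is a minimal sketch that reads utilization and memory figures for the first GPU via these bindings (it assumes the nvidia-ml-py package is installed, for example with pip install nvidia-ml-py):

import pynvml

# Initialize NVML and query the first GPU
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
util = pynvml.nvmlDeviceGetUtilizationRates(handle)
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
print(f"GPU utilization: {util.gpu}%")
print(f"Memory used: {mem.used / 1024**2:.0f} MiB of {mem.total / 1024**2:.0f} MiB")
pynvml.nvmlShutdown()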

7. Conclusion

nvidia-smi is a powerful tool for monitoring and managing NVIDIA GPU usage on Ubuntu. This article covered its basic usage, retrieving process information, installing drivers, and verifying GPU operation with TensorFlow. Utilize these methods to maximize GPU performance and optimize your system.