Restoring Compatibility between Docker and GPUs

Overview

If you use Docker with the CircleCI GPU executor, you may see the following error:

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
ERRO[0407] error waiting for container: context canceled 

This error occurs because `nvidia-container-toolkit` was removed when CircleCI switched to images that make multiple CUDA versions available at runtime.

Solution

Step 1: Add a step to your config.yml that installs `nvidia-container-toolkit` and restarts Docker

      - run: 
          name: Install nvidia-container-toolkit and Restart Docker
          command: |
            sudo apt-get update
            sudo apt-get install -y nvidia-container-toolkit
            sudo systemctl restart docker
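
If some of your images already ship the toolkit, the install can be guarded so it only runs when the package is missing. The following is a minimal sketch, not an official CircleCI recommendation: it assumes a Debian/Ubuntu-based machine image where `dpkg -s` reports whether the package is installed, and the step name is illustrative.

```yaml
      - run:
          name: Install nvidia-container-toolkit if missing
          command: |
            # Assumption: dpkg is available (Debian/Ubuntu-based image).
            # dpkg -s exits non-zero when the package is not installed.
            if ! dpkg -s nvidia-container-toolkit >/dev/null 2>&1; then
              sudo apt-get update
              sudo apt-get install -y nvidia-container-toolkit
              # Restart so the Docker daemon picks up the newly installed toolkit.
              sudo systemctl restart docker
            fi
```

With this guard, Docker is restarted only when the toolkit was actually installed in the current job.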

Step 2: Verify Compatibility

To confirm that Docker can access the GPU, run a test container from the `nvidia/cuda:11.4.3-base-ubuntu20.04` image. The container runs `nvidia-smi`, which prints GPU information.

      - run:
          name: Test GPU Docker
          command: docker run --gpus all nvidia/cuda:11.4.3-base-ubuntu20.04 nvidia-smi

If `nvidia-container-toolkit` is working correctly and Docker can use the GPU, the step output shows the GPU information reported by `nvidia-smi`.
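
For context, here is a sketch of how the install and test steps might fit together in a complete config.yml. The job name, workflow name, machine image tag (`linux-cuda-12:default`), and resource class (`gpu.nvidia.small`) are assumptions for illustration and do not come from this article; substitute the GPU image and resource class available on your plan.

```yaml
version: 2.1

jobs:
  gpu-docker-check:                    # hypothetical job name
    machine:
      image: linux-cuda-12:default     # assumed GPU machine image tag
    resource_class: gpu.nvidia.small   # assumed GPU resource class
    steps:
      # Step 1: install the toolkit and restart Docker.
      - run:
          name: Install nvidia-container-toolkit and restart Docker
          command: |
            sudo apt-get update
            sudo apt-get install -y nvidia-container-toolkit
            sudo systemctl restart docker
      # Step 2: verify that a container can see the GPU.
      - run:
          name: Test GPU Docker
          command: docker run --gpus all nvidia/cuda:11.4.3-base-ubuntu20.04 nvidia-smi

workflows:
  gpu-check:                           # hypothetical workflow name
    jobs:
      - gpu-docker-check
```

The install step comes first so that the Docker daemon has been restarted before any step uses `--gpus`.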
