I’ve been into local LLMs lately, and on a whim, I decided to try “speaker diarization & subtitle creation with WhisperX.” However, when I tried to run it on my Ubuntu 24.04 environment with an RTX 4080, I got an error that libcudnn_cnn_infer.so.8 was missing. After some research, it seems that the combination of CUDA 11.8 + cuDNN 8 cannot be installed directly on Ubuntu 24.04.
After trying various things, I found that creating an environment based on Ubuntu 22.04 with Docker was the smoothest solution. I’m leaving this as a memo of the setup procedure.
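To confirm that the library really is missing on the host, a quick check (just a diagnostic, not part of the fix):

```bash
# Empty output means the cuDNN 8 library the error complains about isn't installed
ldconfig -p | grep libcudnn_cnn_infer
```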
Installing the NVIDIA Container Toolkit
To use the GPU from Docker, first set up the NVIDIA Container Toolkit. The procedure follows the official documentation. For reference, here are the commands:
```bash
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update

export NVIDIA_CONTAINER_TOOLKIT_VERSION=1.17.8-1
sudo apt-get install -y \
  nvidia-container-toolkit=${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
  nvidia-container-toolkit-base=${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
  libnvidia-container-tools=${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
  libnvidia-container1=${NVIDIA_CONTAINER_TOOLKIT_VERSION}
```
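The official setup also has you register the NVIDIA runtime with Docker and restart the daemon; without this step, --gpus all may not work:

```bash
# Register the NVIDIA runtime with Docker, then restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```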
Getting the Base Image

Get the latest image of Ubuntu 22.04 (Jammy).
```bash
docker pull ubuntu:jammy-20250730
```
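Before writing the Dockerfile, it's worth checking that containers can actually see the GPU. The toolkit mounts the host driver (including nvidia-smi) into the container:

```bash
# If the toolkit is set up correctly, this prints the host's GPU table
docker run --rm --gpus all ubuntu:jammy-20250730 nvidia-smi
```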
Creating the Dockerfile

Create the following Dockerfile. It installs CUDA 11.8 + cuDNN 8, then WhisperX and its dependencies. CUDA 12 is installed as well, since its cuBLAS library is also required.
```dockerfile
FROM ubuntu:jammy-20250730

# Refresh package lists and install build tools and runtime dependencies
RUN apt update && apt install -y --no-install-recommends \
    build-essential \
    software-properties-common \
    wget \
    gnupg \
    git \
    ffmpeg \
    python3-pip \
    ca-certificates

# Add the NVIDIA CUDA repository & install CUDA 11.8 + cuDNN 8 + CUDA 12.3
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb && \
    dpkg -i cuda-keyring_1.0-1_all.deb && \
    apt update && \
    apt -y install cuda-11-8 libcudnn8 libcudnn8-dev cuda-12-3

# Put both CUDA toolkits on the search paths
ENV PATH="/usr/local/cuda-11.8/bin:${PATH}"
ENV LD_LIBRARY_PATH="/usr/local/cuda-11.8/lib64:/usr/local/cuda-12.3/lib64:${LD_LIBRARY_PATH}"

# Install PyTorch (cu118 build), WhisperX & pyannote.audio
RUN pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 && \
    pip install whisperx && \
    pip install "pyannote.audio"

# Default command when the container starts
CMD ["/bin/bash"]
```
Building the Image

```bash
docker build -t ubuntu:whisperx .
```
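To verify the image before running WhisperX, check that nvcc is on the PATH and that PyTorch can see the GPU:

```bash
# nvcc comes from the CUDA 11.8 toolkit installed in the image
docker run --rm --gpus all ubuntu:whisperx nvcc --version

# Should print "True" if the cu118 build of PyTorch can use the GPU
docker run --rm --gpus all ubuntu:whisperx python3 -c "import torch; print(torch.cuda.is_available())"
```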
Running WhisperX

Prepare an audio file on the host and run it from the container.
Example: /home/me/Documents/makesrt/audio.wav
```bash
docker run \
  -v /home/me/Documents/makesrt:/root/makesrt \
  --gpus all \
  --rm \
  -w /root/makesrt \
  ubuntu:whisperx \
  whisperx "audio.wav" \
  --compute_type "float16" \
  --device "cuda" \
  --language "en" \
  --diarize \
  --hf_token <HUGGINGFACE_TOKEN> \
  --output_dir "./output" \
  --output_format "all"
```

Option Explanations
- -v: Share a directory between the host and the container
- -w: Set the working directory inside the container
- --gpus all: Use all GPUs
- --diarize: Enable speaker diarization (requires a Hugging Face token; the pyannote models it downloads are gated, so you also need to accept their user conditions on the Hugging Face Hub)
- --compute_type "float16": Use half precision for speed on GPUs like the RTX 4080
This runs WhisperX on the GPU, which is much faster than running on the CPU.
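Since --output_dir points inside the shared volume, the results land directly on the host. With --output_format "all", the output directory should contain subtitles and transcripts in several formats:

```bash
ls /home/me/Documents/makesrt/output
# Expect files like: audio.srt  audio.vtt  audio.txt  audio.tsv  audio.json
```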
Summary
When trying to run WhisperX on Ubuntu 24.04, you will run into version constraints with CUDA/cuDNN.
I think there are various ways to do this, but this time I solved it by creating a Docker image based on Ubuntu 22.04.
- Does not pollute the local environment
- Can fully utilize the GPU
- Highly reproducible in other environments
Conclusion: Docker is the answer.