[docs] add nvidia-container-toolkit to Linux Docker setup instructions (#8557)

2025-12-29 02:00:36 +08:00 · 2025-07-06 21:37:08 +10:00
parent c6290db118
commit 1b549e3199
1 changed files with 111 additions and 0 deletions
--- a/docker/docker-cuda/README.md
+++ b/docker/docker-cuda/README.md
@@ -0,0 +1,111 @@
+# Docker Setup for NVIDIA GPUs
+
+This directory contains Docker configuration files for running LLaMA Factory with NVIDIA GPU support.
+
+## Prerequisites
+
+### Linux-specific Requirements
+
+Before running the Docker container with GPU support, you need to install the following packages:
+
+1. **Docker**: The container runtime
+   ```bash
+   # Ubuntu/Debian
+   sudo apt-get update
+   sudo apt-get install docker.io
+   
+   # Or install Docker Engine from the official repository:
+   # https://docs.docker.com/engine/install/
+   ```
+
+2. **Docker Compose** (if using the docker-compose method):
+   ```bash
+   # Ubuntu/Debian
+   sudo apt-get install docker-compose
+   
+   # Or install the latest version:
+   # https://docs.docker.com/compose/install/
+   ```
+
+3. **NVIDIA Container Toolkit** (required for GPU support):
+   ```bash
+   # Add the NVIDIA GPG key and repository
+   distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
+   curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
+   curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
+   
+   # Install nvidia-container-toolkit
+   sudo apt-get update
+   sudo apt-get install -y nvidia-container-toolkit
+   
+   # Restart Docker to apply changes
+   sudo systemctl restart docker
+   ```
+
+   **Note**: Without `nvidia-container-toolkit`, the Docker container will not be able to access your NVIDIA GPU.
+
+### Verify GPU Access
+
+After installation, verify that Docker can access your GPU:
+
+```bash
+sudo docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
+```
+
+If successful, you should see your GPU information displayed.
+
+## Usage
+
+### Using Docker Compose (Recommended)
+
+```bash
+cd docker/docker-cuda/
+docker compose up -d
+docker compose exec llamafactory bash
+```
+
+### Using Docker Run
+
+```bash
+# Build the image
+docker build -f ./docker/docker-cuda/Dockerfile \
+    --build-arg PIP_INDEX=https://pypi.org/simple \
+    --build-arg EXTRAS=metrics \
+    -t llamafactory:latest .
+
+# Run the container
+docker run -dit --ipc=host --gpus=all \
+    -p 7860:7860 \
+    -p 8000:8000 \
+    --name llamafactory \
+    llamafactory:latest
+
+# Enter the container
+docker exec -it llamafactory bash
+```
+
+## Troubleshooting
+
+### GPU Not Detected
+
+If your GPU is not detected inside the container:
+
+1. Ensure `nvidia-container-toolkit` is installed
+2. Check that the Docker daemon has been restarted after installation
+3. Verify your NVIDIA drivers are properly installed: `nvidia-smi`
+4. Check Docker GPU support: `docker run --rm --gpus all ubuntu nvidia-smi`
+
+### Permission Denied
+
+If you get permission errors, ensure your user is in the docker group:
+
+```bash
+sudo usermod -aG docker $USER
+# Log out and back in for changes to take effect
+```
+
+## Additional Notes
+
+- The default image is built on Ubuntu 22.04 (x86_64), CUDA 12.4, Python 3.11, PyTorch 2.6.0, and Flash-attn 2.7.4
+- For different CUDA versions, you may need to adjust the base image in the Dockerfile
+- Make sure your NVIDIA driver version is compatible with the CUDA version used in the Docker image