diff --git a/README.md b/README.md index 9462964c..a20b848b 100644 --- a/README.md +++ b/README.md @@ -383,11 +383,6 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh | torch-npu | 2.1.0 | 2.1.0.post3 | | deepspeed | 0.13.2 | 0.13.2 | -Docker image: - -- 32GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html) -- 64GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html) - Remember to use `ASCEND_RT_VISIBLE_DEVICES` instead of `CUDA_VISIBLE_DEVICES` to specify the device to use. If you cannot infer model on NPU devices, try setting `do_sample: false` in the configurations. @@ -424,17 +419,33 @@ llamafactory-cli webui ### Build Docker -#### Use Docker +For CUDA users: ```bash -docker build -f ./Dockerfile \ +docker-compose -f ./docker/docker-cuda/docker-compose.yml up -d +docker-compose exec llamafactory bash +``` + +For Ascend NPU users: + +```bash +docker-compose -f ./docker/docker-npu/docker-compose.yml up -d +docker-compose exec llamafactory bash +``` + +
+<details><summary>Build without Docker Compose</summary>
+
+For CUDA users:
+
+```bash
+docker build -f ./docker/docker-cuda/Dockerfile \
    --build-arg INSTALL_BNB=false \
    --build-arg INSTALL_VLLM=false \
    --build-arg INSTALL_DEEPSPEED=false \
    --build-arg PIP_INDEX=https://pypi.org/simple \
    -t llamafactory:latest .

-docker run -it --gpus=all \
+docker run -dit --gpus=all \
    -v ./hf_cache:/root/.cache/huggingface/ \
    -v ./data:/app/data \
    -v ./output:/app/output \
@@ -443,15 +454,43 @@ docker run -it --gpus=all \
    --shm-size 16G \
    --name llamafactory \
    llamafactory:latest
+
+docker exec -it llamafactory bash
```

-#### Use Docker Compose
+For Ascend NPU users:

```bash
-docker-compose up -d
-docker-compose exec llamafactory bash
+# Change the docker image according to your environment
+docker build -f ./docker/docker-npu/Dockerfile \
+    --build-arg INSTALL_DEEPSPEED=false \
+    --build-arg PIP_INDEX=https://pypi.org/simple \
+    -t llamafactory:latest .
+
+# Change `device` according to your resources
+docker run -dit \
+    -v ./hf_cache:/root/.cache/huggingface/ \
+    -v ./data:/app/data \
+    -v ./output:/app/output \
+    -v /usr/local/dcmi:/usr/local/dcmi \
+    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
+    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
+    -v /etc/ascend_install.info:/etc/ascend_install.info \
+    -p 7860:7860 \
+    -p 8000:8000 \
+    --device /dev/davinci0 \
+    --device /dev/davinci_manager \
+    --device /dev/devmm_svm \
+    --device /dev/hisi_hdc \
+    --shm-size 16G \
+    --name llamafactory \
+    llamafactory:latest
+
+docker exec -it llamafactory bash
```
+</details>
+
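[Editor's note] After either `docker run` block above, the container runs detached (`-dit`) and `docker exec` drops you into it. A quick sanity check at this point is cheap insurance; the sketch below assumes the `llamafactory` container name used above and that `nvidia-smi` is available in the CUDA base image — verify against your own build:

```bash
# Verify the GPUs requested with --gpus=all are visible inside the container
docker exec -it llamafactory nvidia-smi

# Verify the editable install succeeded and the CLI entry point exists
docker exec -it llamafactory llamafactory-cli version
```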
<details><summary>Details about volume</summary>

- hf_cache: Utilize Hugging Face cache on the host machine. Reassignable if a cache already exists in a different directory.

diff --git a/README_zh.md b/README_zh.md
index 8b77e91e..3bed0846 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -360,7 +360,7 @@ pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/downl
昇腾 NPU 用户指南 -在昇腾 NPU 设备上安装 LLaMA Factory 时,需要指定额外依赖项,使用 `pip install -e '.[torch-npu,metrics]'` 命令安装。此外,还需要安装 **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**,安装方法请参考[安装教程](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC2alpha002/quickstart/quickstart/quickstart_18_0004.html)或使用以下命令: +在昇腾 NPU 设备上安装 LLaMA Factory 时,需要指定额外依赖项,使用 `pip install -e ".[torch-npu,metrics]"` 命令安装。此外,还需要安装 **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**,安装方法请参考[安装教程](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC2alpha002/quickstart/quickstart/quickstart_18_0004.html)或使用以下命令: ```bash # 请替换 URL 为 CANN 版本和设备型号对应的 URL @@ -383,11 +383,6 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh | torch-npu | 2.1.0 | 2.1.0.post3 | | deepspeed | 0.13.2 | 0.13.2 | -Docker 镜像: - -- 32GB:[下载地址](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html) -- 64GB:[下载地址](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html) - 请使用 `ASCEND_RT_VISIBLE_DEVICES` 而非 `CUDA_VISIBLE_DEVICES` 来指定运算设备。 如果遇到无法正常推理的情况,请尝试设置 `do_sample: false`。 @@ -424,17 +419,33 @@ llamafactory-cli webui ### 构建 Docker -#### 使用 Docker +CUDA 用户: ```bash -docker build -f ./Dockerfile \ +docker-compose -f ./docker/docker-cuda/docker-compose.yml up -d +docker-compose exec llamafactory bash +``` + +昇腾 NPU 用户: + +```bash +docker-compose -f ./docker/docker-npu/docker-compose.yml up -d +docker-compose exec llamafactory bash +``` + +
+<details><summary>不使用 Docker Compose 构建</summary>
+
+CUDA 用户:
+
+```bash
+docker build -f ./docker/docker-cuda/Dockerfile \
    --build-arg INSTALL_BNB=false \
    --build-arg INSTALL_VLLM=false \
    --build-arg INSTALL_DEEPSPEED=false \
    --build-arg PIP_INDEX=https://pypi.org/simple \
    -t llamafactory:latest .

-docker run -it --gpus=all \
+docker run -dit --gpus=all \
    -v ./hf_cache:/root/.cache/huggingface/ \
    -v ./data:/app/data \
    -v ./output:/app/output \
@@ -443,15 +454,43 @@ docker run -it --gpus=all \
    --shm-size 16G \
    --name llamafactory \
    llamafactory:latest
+
+docker exec -it llamafactory bash
```

-#### 使用 Docker Compose
+昇腾 NPU 用户:

```bash
-docker-compose up -d
-docker-compose exec llamafactory bash
+# 根据您的环境选择镜像
+docker build -f ./docker/docker-npu/Dockerfile \
+    --build-arg INSTALL_DEEPSPEED=false \
+    --build-arg PIP_INDEX=https://pypi.org/simple \
+    -t llamafactory:latest .
+
+# 根据您的资源更改 `device`
+docker run -dit \
+    -v ./hf_cache:/root/.cache/huggingface/ \
+    -v ./data:/app/data \
+    -v ./output:/app/output \
+    -v /usr/local/dcmi:/usr/local/dcmi \
+    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
+    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
+    -v /etc/ascend_install.info:/etc/ascend_install.info \
+    -p 7860:7860 \
+    -p 8000:8000 \
+    --device /dev/davinci0 \
+    --device /dev/davinci_manager \
+    --device /dev/devmm_svm \
+    --device /dev/hisi_hdc \
+    --shm-size 16G \
+    --name llamafactory \
+    llamafactory:latest
+
+docker exec -it llamafactory bash
```
+</details>
+
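[Editor's note] The NPU container has no `--gpus` flag to rely on; the accelerators come from the `--device` mappings and the host tooling from the bind mounts. A similar sanity check for the NPU side, sketched on the assumption that the mounts and device mappings shown above succeeded:

```bash
# npu-smi is bind-mounted from the host at /usr/local/bin/npu-smi
docker exec -it llamafactory npu-smi info

# The mapped accelerator should appear as a device node
docker exec -it llamafactory ls -l /dev/davinci0
```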
<details><summary>数据卷详情</summary>

- hf_cache:使用宿主机的 Hugging Face 缓存文件夹,允许更改为新的目录。

diff --git a/Dockerfile b/docker/docker-cuda/Dockerfile
similarity index 90%
rename from Dockerfile
rename to docker/docker-cuda/Dockerfile
index 61d58005..2d20bfe4 100644
--- a/Dockerfile
+++ b/docker/docker-cuda/Dockerfile
@@ -12,13 +12,14 @@ ARG PIP_INDEX=https://pypi.org/simple
WORKDIR /app

# Install the requirements
-COPY requirements.txt /app/
+COPY requirements.txt /app
RUN pip config set global.index-url $PIP_INDEX
+RUN pip config set global.extra-index-url $PIP_INDEX
RUN python -m pip install --upgrade pip
RUN python -m pip install -r requirements.txt

# Copy the rest of the application into the image
-COPY . /app/
+COPY . /app

# Install the LLaMA Factory
RUN EXTRA_PACKAGES="metrics"; \
@@ -38,10 +39,9 @@ RUN EXTRA_PACKAGES="metrics"; \
VOLUME [ "/root/.cache/huggingface/", "/app/data", "/app/output" ]

# Expose port 7860 for the LLaMA Board
+ENV GRADIO_SERVER_PORT 7860
EXPOSE 7860

# Expose port 8000 for the API service
+ENV API_PORT 8000
EXPOSE 8000
-
-# Launch LLaMA Board
-CMD [ "llamafactory-cli", "webui" ]
diff --git a/docker-compose.yml b/docker/docker-cuda/docker-compose.yml
similarity index 89%
rename from docker-compose.yml
rename to docker/docker-cuda/docker-compose.yml
index c5dc34e9..04d6531a 100644
--- a/docker-compose.yml
+++ b/docker/docker-cuda/docker-compose.yml
@@ -1,8 +1,8 @@
services:
  llamafactory:
    build:
-      dockerfile: Dockerfile
-      context: .
+      dockerfile: ./docker/docker-cuda/Dockerfile
+      context: ../..
    args:
      INSTALL_BNB: false
      INSTALL_VLLM: false
diff --git a/docker/docker-npu/Dockerfile b/docker/docker-npu/Dockerfile
new file mode 100644
index 00000000..0fdd4472
--- /dev/null
+++ b/docker/docker-npu/Dockerfile
@@ -0,0 +1,41 @@
+# Use the Ubuntu 22.04 image with CANN 8.0.rc1
+# More versions can be found at https://hub.docker.com/r/cosdt/cann/tags
+FROM cosdt/cann:8.0.rc1-910b-ubuntu22.04
+
+ENV DEBIAN_FRONTEND=noninteractive
+
+# Define installation arguments
+ARG INSTALL_DEEPSPEED=false
+ARG PIP_INDEX=https://pypi.org/simple
+
+# Set the working directory
+WORKDIR /app
+
+# Install the requirements
+COPY requirements.txt /app
+RUN pip config set global.index-url $PIP_INDEX
+RUN pip config set global.extra-index-url $PIP_INDEX
+RUN python -m pip install --upgrade pip
+RUN python -m pip install -r requirements.txt
+
+# Copy the rest of the application into the image
+COPY . /app
+
+# Install the LLaMA Factory
+RUN EXTRA_PACKAGES="torch-npu,metrics"; \
+    if [ "$INSTALL_DEEPSPEED" = "true" ]; then \
+        EXTRA_PACKAGES="${EXTRA_PACKAGES},deepspeed"; \
+    fi; \
+    pip install -e .[$EXTRA_PACKAGES] && \
+    pip uninstall -y transformer-engine flash-attn
+
+# Set up volumes
+VOLUME [ "/root/.cache/huggingface/", "/app/data", "/app/output" ]
+
+# Expose port 7860 for the LLaMA Board
+ENV GRADIO_SERVER_PORT 7860
+EXPOSE 7860
+
+# Expose port 8000 for the API service
+ENV API_PORT 8000
+EXPOSE 8000
diff --git a/docker/docker-npu/docker-compose.yml b/docker/docker-npu/docker-compose.yml
new file mode 100644
index 00000000..7fff6e73
--- /dev/null
+++ b/docker/docker-npu/docker-compose.yml
@@ -0,0 +1,30 @@
+services:
+  llamafactory:
+    build:
+      dockerfile: ./docker/docker-npu/Dockerfile
+      context: ../..
+      args:
+        INSTALL_DEEPSPEED: false
+        PIP_INDEX: https://pypi.org/simple
+    container_name: llamafactory
+    volumes:
+      - ./hf_cache:/root/.cache/huggingface/
+      - ./data:/app/data
+      - ./output:/app/output
+      - /usr/local/dcmi:/usr/local/dcmi
+      - /usr/local/bin/npu-smi:/usr/local/bin/npu-smi
+      - /usr/local/Ascend/driver:/usr/local/Ascend/driver
+      - /etc/ascend_install.info:/etc/ascend_install.info
+    ports:
+      - "7860:7860"
+      - "8000:8000"
+    ipc: host
+    tty: true
+    stdin_open: true
+    command: bash
+    devices:
+      - /dev/davinci0
+      - /dev/davinci_manager
+      - /dev/devmm_svm
+      - /dev/hisi_hdc
+    restart: unless-stopped
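[Editor's note] Both compose files set `context: ../..`, so the build context is the repository root while relative host paths such as `./hf_cache` resolve next to the compose file itself. One way to use the new NPU stack, sketched under the assumption that the file layout added by this PR is in place:

```bash
# From the repository root
cd docker/docker-npu
docker-compose up -d                      # builds with context ../.. (the repo root)
docker-compose exec llamafactory bash     # port 7860 serves LLaMA Board, 8000 the API

# Tear down when finished; the relative mounts under this directory persist
docker-compose down
```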