diff --git a/README.md b/README.md
index 9462964c..a20b848b 100644
--- a/README.md
+++ b/README.md
@@ -383,11 +383,6 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh
| torch-npu | 2.1.0 | 2.1.0.post3 |
| deepspeed | 0.13.2 | 0.13.2 |
-Docker image:
-
-- 32GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html)
-- 64GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html)
-
Remember to use `ASCEND_RT_VISIBLE_DEVICES` instead of `CUDA_VISIBLE_DEVICES` to specify the devices to use.
If you cannot run inference on NPU devices, try setting `do_sample: false` in the configuration.
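For example, a minimal sketch of pinning a run to specific NPUs (the YAML path below is only illustrative):

```bash
# Train on NPUs 0 and 1 only
ASCEND_RT_VISIBLE_DEVICES=0,1 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```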
@@ -424,17 +419,33 @@ llamafactory-cli webui
### Build Docker
-#### Use Docker
+For CUDA users:
```bash
-docker build -f ./Dockerfile \
+docker-compose -f ./docker/docker-cuda/docker-compose.yml up -d
+docker-compose exec llamafactory bash
+```
+
+For Ascend NPU users:
+
+```bash
+docker-compose -f ./docker/docker-npu/docker-compose.yml up -d
+docker-compose exec llamafactory bash
+```
+
+#### Build without Docker Compose
+
+For CUDA users:
+
+```bash
+docker build -f ./docker/docker-cuda/Dockerfile \
--build-arg INSTALL_BNB=false \
--build-arg INSTALL_VLLM=false \
--build-arg INSTALL_DEEPSPEED=false \
--build-arg PIP_INDEX=https://pypi.org/simple \
-t llamafactory:latest .
-docker run -it --gpus=all \
+docker run -dit --gpus=all \
-v ./hf_cache:/root/.cache/huggingface/ \
-v ./data:/app/data \
-v ./output:/app/output \
@@ -443,15 +454,43 @@ docker run -it --gpus=all \
--shm-size 16G \
--name llamafactory \
llamafactory:latest
+
+docker exec -it llamafactory bash
```
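After the container is up, it may help to confirm that the GPUs are actually visible inside it; a minimal check, assuming the NVIDIA container runtime is configured on the host:

```bash
docker exec -it llamafactory nvidia-smi
```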
-#### Use Docker Compose
+For Ascend NPU users:
```bash
-docker-compose up -d
-docker-compose exec llamafactory bash
+# Change the Docker image according to your environment
+docker build -f ./docker/docker-npu/Dockerfile \
+ --build-arg INSTALL_DEEPSPEED=false \
+ --build-arg PIP_INDEX=https://pypi.org/simple \
+ -t llamafactory:latest .
+
+# Adjust the `--device` entries to match your available NPUs
+docker run -dit \
+ -v ./hf_cache:/root/.cache/huggingface/ \
+ -v ./data:/app/data \
+ -v ./output:/app/output \
+ -v /usr/local/dcmi:/usr/local/dcmi \
+ -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
+ -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
+ -v /etc/ascend_install.info:/etc/ascend_install.info \
+ -p 7860:7860 \
+ -p 8000:8000 \
+ --device /dev/davinci0 \
+ --device /dev/davinci_manager \
+ --device /dev/devmm_svm \
+ --device /dev/hisi_hdc \
+ --shm-size 16G \
+ --name llamafactory \
+ llamafactory:latest
+
+docker exec -it llamafactory bash
```
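Because the host's `npu-smi` binary is bind-mounted into the container, you can verify that the NPUs are visible with a quick check (assuming the driver mounts above succeeded):

```bash
docker exec -it llamafactory npu-smi info
```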
+
Details about the volumes:
- hf_cache: Uses the Hugging Face cache on the host machine. It can be remapped if a cache already exists in a different directory, as shown in the sketch below.
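For example, to reuse a cache that already lives elsewhere on the host, point the bind mount at that directory instead; a sketch, where the host path is hypothetical:

```bash
docker run -dit --gpus=all \
    -v /data/shared_hf_cache:/root/.cache/huggingface/ \
    --name llamafactory \
    llamafactory:latest
```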
diff --git a/README_zh.md b/README_zh.md
index 8b77e91e..3bed0846 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -360,7 +360,7 @@ pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/downl
Guide for Ascend NPU users
-To install LLaMA Factory on Ascend NPU devices, specify the extra dependencies and install with `pip install -e '.[torch-npu,metrics]'`. You also need to install the **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**; follow the [installation tutorial](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC2alpha002/quickstart/quickstart/quickstart_18_0004.html) or use the following commands:
+To install LLaMA Factory on Ascend NPU devices, specify the extra dependencies and install with `pip install -e ".[torch-npu,metrics]"`. You also need to install the **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**; follow the [installation tutorial](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC2alpha002/quickstart/quickstart/quickstart_18_0004.html) or use the following commands:
```bash
# Replace the URL with the one matching your CANN version and device model
@@ -383,11 +383,6 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh
| torch-npu | 2.1.0 | 2.1.0.post3 |
| deepspeed | 0.13.2 | 0.13.2 |
-Docker image:
-
-- 32GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html)
-- 64GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html)
-
Use `ASCEND_RT_VISIBLE_DEVICES` instead of `CUDA_VISIBLE_DEVICES` to specify the devices to use.
If inference does not work properly, try setting `do_sample: false`.
@@ -424,17 +419,33 @@ llamafactory-cli webui
### Build Docker
-#### Use Docker
+For CUDA users:
```bash
-docker build -f ./Dockerfile \
+docker-compose -f ./docker/docker-cuda/docker-compose.yml up -d
+docker-compose exec llamafactory bash
+```
+
+For Ascend NPU users:
+
+```bash
+docker-compose -f ./docker/docker-npu/docker-compose.yml up -d
+docker-compose exec llamafactory bash
+```
+
+#### Build without Docker Compose
+
+For CUDA users:
+
+```bash
+docker build -f ./docker/docker-cuda/Dockerfile \
--build-arg INSTALL_BNB=false \
--build-arg INSTALL_VLLM=false \
--build-arg INSTALL_DEEPSPEED=false \
--build-arg PIP_INDEX=https://pypi.org/simple \
-t llamafactory:latest .
-docker run -it --gpus=all \
+docker run -dit --gpus=all \
-v ./hf_cache:/root/.cache/huggingface/ \
-v ./data:/app/data \
-v ./output:/app/output \
@@ -443,15 +454,43 @@ docker run -it --gpus=all \
--shm-size 16G \
--name llamafactory \
llamafactory:latest
+
+docker exec -it llamafactory bash
```
-#### Use Docker Compose
+For Ascend NPU users:
```bash
-docker-compose up -d
-docker-compose exec llamafactory bash
+# Change the Docker image according to your environment
+docker build -f ./docker/docker-npu/Dockerfile \
+ --build-arg INSTALL_DEEPSPEED=false \
+ --build-arg PIP_INDEX=https://pypi.org/simple \
+ -t llamafactory:latest .
+
+# Adjust the `--device` entries to match your available NPUs
+docker run -dit \
+ -v ./hf_cache:/root/.cache/huggingface/ \
+ -v ./data:/app/data \
+ -v ./output:/app/output \
+ -v /usr/local/dcmi:/usr/local/dcmi \
+ -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
+ -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
+ -v /etc/ascend_install.info:/etc/ascend_install.info \
+ -p 7860:7860 \
+ -p 8000:8000 \
+ --device /dev/davinci0 \
+ --device /dev/davinci_manager \
+ --device /dev/devmm_svm \
+ --device /dev/hisi_hdc \
+ --shm-size 16G \
+ --name llamafactory \
+ llamafactory:latest
+
+docker exec -it llamafactory bash
```
+
Details about the volumes:
- hf_cache: Uses the Hugging Face cache on the host machine. It can be remapped to a new directory.
diff --git a/Dockerfile b/docker/docker-cuda/Dockerfile
similarity index 90%
rename from Dockerfile
rename to docker/docker-cuda/Dockerfile
index 61d58005..2d20bfe4 100644
--- a/Dockerfile
+++ b/docker/docker-cuda/Dockerfile
@@ -12,13 +12,14 @@ ARG PIP_INDEX=https://pypi.org/simple
WORKDIR /app
# Install the requirements
-COPY requirements.txt /app/
+COPY requirements.txt /app
RUN pip config set global.index-url $PIP_INDEX
+RUN pip config set global.extra-index-url $PIP_INDEX
RUN python -m pip install --upgrade pip
RUN python -m pip install -r requirements.txt
# Copy the rest of the application into the image
-COPY . /app/
+COPY . /app
# Install the LLaMA Factory
RUN EXTRA_PACKAGES="metrics"; \
@@ -38,10 +39,9 @@ RUN EXTRA_PACKAGES="metrics"; \
VOLUME [ "/root/.cache/huggingface/", "/app/data", "/app/output" ]
# Expose port 7860 for the LLaMA Board
+ENV GRADIO_SERVER_PORT=7860
EXPOSE 7860
# Expose port 8000 for the API service
+ENV API_PORT=8000
EXPOSE 8000
-
-# Launch LLaMA Board
-CMD [ "llamafactory-cli", "webui" ]
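+
+# No default CMD: the container starts without a service, so attach with
+# `docker exec` and launch LLaMA Board manually, e.g. `llamafactory-cli webui`.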
diff --git a/docker-compose.yml b/docker/docker-cuda/docker-compose.yml
similarity index 89%
rename from docker-compose.yml
rename to docker/docker-cuda/docker-compose.yml
index c5dc34e9..04d6531a 100644
--- a/docker-compose.yml
+++ b/docker/docker-cuda/docker-compose.yml
@@ -1,8 +1,8 @@
services:
llamafactory:
build:
- dockerfile: Dockerfile
- context: .
+ dockerfile: ./docker/docker-cuda/Dockerfile
+ context: ../..
args:
INSTALL_BNB: false
INSTALL_VLLM: false
diff --git a/docker/docker-npu/Dockerfile b/docker/docker-npu/Dockerfile
new file mode 100644
index 00000000..0fdd4472
--- /dev/null
+++ b/docker/docker-npu/Dockerfile
@@ -0,0 +1,41 @@
+# Use the Ubuntu 22.04 image with CANN 8.0.rc1
+# More versions can be found at https://hub.docker.com/r/cosdt/cann/tags
+FROM cosdt/cann:8.0.rc1-910b-ubuntu22.04
+
+ENV DEBIAN_FRONTEND=noninteractive
+
+# Define installation arguments
+ARG INSTALL_DEEPSPEED=false
+ARG PIP_INDEX=https://pypi.org/simple
+
+# Set the working directory
+WORKDIR /app
+
+# Install the requirements
+COPY requirements.txt /app
+RUN pip config set global.index-url $PIP_INDEX
+RUN pip config set global.extra-index-url $PIP_INDEX
+RUN python -m pip install --upgrade pip
+RUN python -m pip install -r requirements.txt
+
+# Copy the rest of the application into the image
+COPY . /app
+
+# Install the LLaMA Factory
+RUN EXTRA_PACKAGES="torch-npu,metrics"; \
+ if [ "$INSTALL_DEEPSPEED" = "true" ]; then \
+ EXTRA_PACKAGES="${EXTRA_PACKAGES},deepspeed"; \
+ fi; \
+    pip install -e ".[$EXTRA_PACKAGES]" && \
+ pip uninstall -y transformer-engine flash-attn
+
+# Set up volumes
+VOLUME [ "/root/.cache/huggingface/", "/app/data", "/app/output" ]
+
+# Expose port 7860 for the LLaMA Board
+ENV GRADIO_SERVER_PORT=7860
+EXPOSE 7860
+
+# Expose port 8000 for the API service
+ENV API_PORT=8000
+EXPOSE 8000
diff --git a/docker/docker-npu/docker-compose.yml b/docker/docker-npu/docker-compose.yml
new file mode 100644
index 00000000..7fff6e73
--- /dev/null
+++ b/docker/docker-npu/docker-compose.yml
@@ -0,0 +1,30 @@
+services:
+ llamafactory:
+ build:
+ dockerfile: ./docker/docker-npu/Dockerfile
+ context: ../..
+ args:
+ INSTALL_DEEPSPEED: false
+ PIP_INDEX: https://pypi.org/simple
+ container_name: llamafactory
+ volumes:
+ - ./hf_cache:/root/.cache/huggingface/
+ - ./data:/app/data
+ - ./output:/app/output
+ - /usr/local/dcmi:/usr/local/dcmi
+ - /usr/local/bin/npu-smi:/usr/local/bin/npu-smi
+ - /usr/local/Ascend/driver:/usr/local/Ascend/driver
+ - /etc/ascend_install.info:/etc/ascend_install.info
+ ports:
+ - "7860:7860"
+ - "8000:8000"
+ ipc: host
+ tty: true
+ stdin_open: true
+ command: bash
+ devices:
+ - /dev/davinci0
+ - /dev/davinci_manager
+ - /dev/devmm_svm
+ - /dev/hisi_hdc
+ restart: unless-stopped