This commit is contained in:
hiyouga
2025-12-27 07:35:18 +08:00
parent a1b1931b4a
commit 66e6aa8f37
18 changed files with 66 additions and 72 deletions


@@ -2,23 +2,26 @@
 check_dirs := scripts src tests tests_v1
+RUN := $(shell command -v uv >/dev/null 2>&1 && echo "uv run" || echo "")
+BUILD := $(shell command -v uv >/dev/null 2>&1 && echo "uv build" || echo "python -m build")
 build:
-	uv build
+	$(BUILD)
 commit:
-	uv run pre-commit install
-	uv run pre-commit run --all-files
+	$(RUN) pre-commit install
+	$(RUN) pre-commit run --all-files
 license:
-	uv run python tests/check_license.py $(check_dirs)
+	$(RUN) python3 tests/check_license.py $(check_dirs)
 quality:
-	uv run ruff check $(check_dirs)
-	uv run ruff format --check $(check_dirs)
+	$(RUN) ruff check $(check_dirs)
+	$(RUN) ruff format --check $(check_dirs)
 style:
-	uv run ruff check $(check_dirs) --fix
-	uv run ruff format $(check_dirs)
+	$(RUN) ruff check $(check_dirs) --fix
+	$(RUN) ruff format $(check_dirs)
 test:
-	WANDB_DISABLED=true uv run pytest -vv --import-mode=importlib tests/ tests_v1/
+	WANDB_DISABLED=true $(RUN) pytest -vv --import-mode=importlib tests/ tests_v1/
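
The `RUN` and `BUILD` variables in this hunk fall back to plain commands when `uv` is not found on the `PATH`. The same selection logic can be sketched in Python. The helper names `resolve_commands` and `with_prefix` are hypothetical and do not exist in the repository; only the command strings come from the Makefile.

```python
import shutil
from typing import Callable, Optional, Tuple


def resolve_commands(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Tuple[str, str]:
    """Return (run_prefix, build_command), mirroring the Makefile's RUN and BUILD."""
    has_uv = which("uv") is not None
    run_prefix = "uv run" if has_uv else ""
    build_command = "uv build" if has_uv else "python -m build"
    return run_prefix, build_command


def with_prefix(prefix: str, command: str) -> str:
    """Prepend the run prefix only when it is non-empty."""
    return f"{prefix} {command}".strip()
```

Passing `which` as a parameter keeps the lookup replaceable, so both branches can be exercised without installing or removing `uv`.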


@@ -538,13 +538,7 @@ Please refer to [build docker](#build-docker) to build the image yourself.
 Create an isolated Python environment with [uv](https://github.com/astral-sh/uv):
 ```bash
-uv sync --extra torch --extra metrics --prerelease=allow
-```
-Run LLaMA-Factory in the isolated environment:
-```bash
-uv run --prerelease=allow llamafactory-cli train examples/train_lora/llama3_lora_pretrain.yaml
+uv run llamafactory-cli webui
 ```
 </details>
@@ -581,7 +575,7 @@ To enable FlashAttention-2 on the Windows platform, please use the script from [
 <details><summary>For Ascend NPU users</summary>
-To install LLaMA Factory on Ascend NPU devices, please upgrade Python to version 3.10 or higher: `pip install -e "."`. Additionally, you need to install the **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**. Please follow the [installation tutorial](https://www.hiascend.com/document/detail/en/CANNCommunityEdition/600alphaX/softwareinstall/instg/atlasdeploy_03_0031.html) or use the following commands:
+To install LLaMA Factory on Ascend NPU devices, please upgrade Python to version 3.10 or higher: `pip install -e . torch-npu==2.7.1`. Additionally, you need to install the **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**. Please follow the [installation tutorial](https://www.hiascend.com/document/detail/en/CANNCommunityEdition/600alphaX/softwareinstall/instg/atlasdeploy_03_0031.html) or use the following commands:
 ```bash
 # replace the url according to your CANN version and devices
@@ -600,8 +594,8 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh
 | Requirement | Minimum | Recommend |
 | ------------ | ------- | -------------- |
 | CANN | 8.0.RC1 | 8.0.0.alpha002 |
-| torch | 2.1.0 | 2.4.0 |
-| torch-npu | 2.1.0 | 2.4.0.post2 |
+| torch | 2.1.0 | 2.7.1 |
+| torch-npu | 2.1.0 | 2.7.1 |
 | deepspeed | 0.13.2 | 0.13.2 |
 | vllm-ascend | - | 0.7.3 |


@@ -519,7 +519,9 @@ cd LLaMA-Factory
 pip install -e ".[torch,metrics]" --no-build-isolation
 ```
-可选的额外依赖项:torch、torch-npu、metrics、deepspeed、liger-kernel、bitsandbytes、hqq、eetq、gptq、aqlm、vllm、sglang、galore、apollo、badam、adam-mini、qwen、minicpm_v、openmind、swanlab、dev
+可选的额外依赖项:`metrics`、`deepspeed`。使用 `pip install -e ".[metrics,deepspeed]"` 安装。
+其他可选依赖项请参考 `examples/requirements/` 目录下的文件。
 #### 从镜像安装
@@ -538,13 +540,7 @@ docker run -it --rm --gpus=all --ipc=host hiyouga/llamafactory:latest
 使用 [uv](https://github.com/astral-sh/uv) 创建隔离的 Python 环境:
 ```bash
-uv sync --extra torch --extra metrics --prerelease=allow
-```
-在环境中运行 LLaMA-Factory
-```bash
-uv run --prerelease=allow llamafactory-cli train examples/train_lora/llama3_lora_pretrain.yaml
+uv run llamafactory-cli webui
 ```
 </details>
@@ -581,7 +577,7 @@ pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/downl
 <details><summary>昇腾 NPU 用户指南</summary>
-在昇腾 NPU 设备上安装 LLaMA Factory 时,请升级 Python 到 3.10 及以上,并需要指定额外依赖项,使用 `pip install -e ".[torch-npu,metrics]"` 命令安装。此外,还需要安装 **[Ascend CANN Toolkit 与 Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**,安装方法请参考[安装教程](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC2alpha002/quickstart/quickstart/quickstart_18_0004.html)或使用以下命令:
+在昇腾 NPU 设备上安装 LLaMA Factory 时,请升级 Python 到 3.10 及以上,并需要指定额外依赖项,使用 `pip install -e . torch-npu==2.7.1` 命令安装。此外,还需要安装 **[Ascend CANN Toolkit 与 Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**,安装方法请参考[安装教程](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC2alpha002/quickstart/quickstart/quickstart_18_0004.html)或使用以下命令:
 ```bash
 # 请替换 URL 为 CANN 版本和设备型号对应的 URL
@@ -600,8 +596,8 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh
 | 依赖项 | 至少 | 推荐 |
 | ------------ | ------- | -------------- |
 | CANN | 8.0.RC1 | 8.0.0.alpha002 |
-| torch | 2.1.0 | 2.4.0 |
-| torch-npu | 2.1.0 | 2.4.0.post2 |
+| torch | 2.1.0 | 2.7.1 |
+| torch-npu | 2.1.0 | 2.7.1 |
 | deepspeed | 0.13.2 | 0.13.2 |
 | vllm-ascend | - | 0.7.3 |


@@ -32,7 +32,7 @@ RUN pip config set global.index-url "${PIP_INDEX}" && \
 COPY . /app
 # Install LLaMA Factory
-RUN pip install --no-cache-dir -e "." --no-build-isolation
+RUN pip install --no-cache-dir -e ".[metrics,deepspeed]" --no-build-isolation
 # Rebuild flash attention
 RUN if [ "${INSTALL_FLASHATTN}" == "true" ]; then \


@@ -60,7 +60,7 @@ WORKDIR /app
 COPY . /app
 # Install LLaMA Factory
-RUN pip install --no-cache-dir -e "." --no-build-isolation
+RUN pip install --no-cache-dir -e ".[metrics]" --no-build-isolation
 RUN pip install "git+https://github.com/alibaba/roll.git#subdirectory=mcore_adapter"


@@ -5,7 +5,6 @@ services:
       context: ../..
       args:
         PIP_INDEX: https://pypi.org/simple
-        EXTRAS: metrics
     container_name: llamafactory
     ports:
       - "7860:7860"


@@ -37,7 +37,7 @@ RUN pip uninstall -y torch torchvision torchaudio && \
 COPY . /app
 # Install LLaMA Factory
-RUN pip install --no-cache-dir -e "." --no-build-isolation
+RUN pip install --no-cache-dir -e ".[metrics,deepspeed]" --no-build-isolation
 # Set up volumes
 # VOLUME [ "/root/.cache/huggingface", "/app/shared_data", "/app/output" ]


@@ -5,7 +5,6 @@ services:
       context: ../..
       args:
         PIP_INDEX: https://pypi.org/simple
-        EXTRAS: torch-npu,metrics
     container_name: llamafactory-a2
     image: llamafactory:npu-a2
     volumes:


@@ -37,7 +37,7 @@ RUN pip uninstall -y torch torchvision torchaudio && \
 COPY . /app
 # Install LLaMA Factory
-RUN pip install --no-cache-dir -e "." --no-build-isolation
+RUN pip install --no-cache-dir -e ".[metrics,deepspeed]" --no-build-isolation
 # Rebuild flash attention
 RUN if [ "${INSTALL_FLASHATTN}" == "true" ]; then \


@@ -5,7 +5,6 @@ services:
       context: ../..
       args:
         PIP_INDEX: https://pypi.org/simple
-        EXTRAS: metrics
     container_name: llamafactory
     ports:
       - "7860:7860"


@@ -1,4 +0,0 @@
-pre-commit
-ruff
-pytest
-build


@@ -38,50 +38,48 @@ classifiers = [
 ]
 dependencies = [
     # core deps
-    "torch>=2.4.0",
-    "torchvision>=0.19.0",
     "transformers>=4.49.0,<=4.56.2,!=4.52.0; python_version < '3.10'",
     "transformers>=4.49.0,<=4.57.1,!=4.52.0,!=4.57.0; python_version >= '3.10'",
     "datasets>=2.16.0,<=4.0.0",
     "accelerate>=1.3.0,<=1.11.0",
     "peft>=0.14.0,<=0.17.1",
     "trl>=0.8.6,<=0.9.6",
-    "torchdata",
+    "torchdata>=0.10.0,<=0.11.0",
+    # torch
+    "torch>=2.0.0",
+    "torchvision>=0.15.0",
     # gui
-    "gradio>=4.38.0,<=5.45.0",
+    "gradio>=4.38.0,<=6.2.0",
     "matplotlib>=3.7.0",
     "tyro<0.9.0",
     # ops
     "einops",
-    "numpy<2.0.0",
-    "pandas>=2.0.0",
+    "numpy",
+    "pandas",
     "scipy",
     # model and tokenizer
     "sentencepiece",
     "tiktoken",
-    "modelscope>=1.14.0",
+    "modelscope",
     "hf-transfer",
-    "safetensors<=0.5.3",
+    "safetensors",
     # python
     "fire",
     "omegaconf",
     "packaging",
     "protobuf",
     "pyyaml",
-    "pydantic<=2.10.6",
+    "pydantic",
     # api
     "uvicorn",
     "fastapi",
     "sse-starlette",
     # media
     "av",
-    "librosa",
-    # yanked
-    "propcache!=0.4.0"
+    "librosa"
 ]
 [project.optional-dependencies]
-dev = ["pre-commit", "ruff", "pytest", "build"]
 metrics = ["nltk", "jieba", "rouge-chinese"]
 deepspeed = ["deepspeed>=0.10.0,<=0.16.9"]


@@ -500,13 +500,17 @@ class ErnieVLPlugin(BasePlugin):
             while IMAGE_PLACEHOLDER in content:
                 image_seqlen = image_grid_thw[image_idx].prod() // merge_length if self.expand_mm_tokens else 1
                 content = content.replace(
-                    IMAGE_PLACEHOLDER, f"Picture {image_idx + 1}:<|IMAGE_START|>{image_token * image_seqlen}<|IMAGE_END|>", 1
+                    IMAGE_PLACEHOLDER,
+                    f"Picture {image_idx + 1}:<|IMAGE_START|>{image_token * image_seqlen}<|IMAGE_END|>",
+                    1,
                 )
                 image_idx += 1
             while VIDEO_PLACEHOLDER in content:
                 video_seqlen = video_grid_thw[video_idx].prod() // merge_length if self.expand_mm_tokens else 1
                 content = content.replace(
-                    VIDEO_PLACEHOLDER, f"Video {video_idx + 1}:<|VIDEO_START|>{video_token * video_seqlen}<|VIDEO_END|>", 1
+                    VIDEO_PLACEHOLDER,
+                    f"Video {video_idx + 1}:<|VIDEO_START|>{video_token * video_seqlen}<|VIDEO_END|>",
+                    1,
                 )
                 video_idx += 1
             message["content"] = content
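
This hunk only re-wraps the `content.replace(...)` calls, so behavior should be unchanged. For readers unfamiliar with the loop, here is a minimal standalone sketch of the image branch. The function name `expand_image_placeholders`, the placeholder value `"<image>"`, and the token `"<|IMG|>"` are assumptions chosen for illustration, not values taken from the repository.

```python
from typing import List

IMAGE_PLACEHOLDER = "<image>"  # assumed placeholder text, for illustration only


def expand_image_placeholders(
    content: str, seqlens: List[int], image_token: str = "<|IMG|>"
) -> str:
    """Replace each placeholder, left to right, with a numbered image span."""
    image_idx = 0
    while IMAGE_PLACEHOLDER in content:
        # count=1 replaces only the first remaining placeholder, so each image
        # gets its own index and its own token count.
        content = content.replace(
            IMAGE_PLACEHOLDER,
            f"Picture {image_idx + 1}:<|IMAGE_START|>{image_token * seqlens[image_idx]}<|IMAGE_END|>",
            1,
        )
        image_idx += 1
    return content
```

The loop terminates only if the replacement text does not itself contain the placeholder, which holds for the assumed values above.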


@@ -332,3 +332,7 @@ def fix_proxy(ipv6_enabled: bool = False) -> None:
     if ipv6_enabled:
         os.environ.pop("http_proxy", None)
         os.environ.pop("HTTP_PROXY", None)
+        os.environ.pop("https_proxy", None)
+        os.environ.pop("HTTPS_PROXY", None)
+        os.environ.pop("all_proxy", None)
+        os.environ.pop("ALL_PROXY", None)
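
The added lines extend the proxy cleanup from the HTTP variables to the HTTPS and catch-all ones. A minimal sketch of just the part visible in this hunk follows. The name `clear_proxy_env` and the `PROXY_VARS` tuple are hypothetical, and the rest of `fix_proxy` is not shown in the diff, so it is not reproduced here.

```python
import os
from typing import List

# The six variable names popped after this commit, in both cases.
PROXY_VARS = (
    "http_proxy",
    "HTTP_PROXY",
    "https_proxy",
    "HTTPS_PROXY",
    "all_proxy",
    "ALL_PROXY",
)


def clear_proxy_env(ipv6_enabled: bool = False) -> List[str]:
    """Remove proxy variables when IPv6 is enabled; return the names removed."""
    removed = []
    if ipv6_enabled:
        for name in PROXY_VARS:
            # pop with a default never raises if the variable is unset
            if os.environ.pop(name, None) is not None:
                removed.append(name)
    return removed
```

Returning the removed names is an addition for testability; the original calls discard the popped values.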


@@ -15,7 +15,6 @@
 import os
 from typing import TYPE_CHECKING, Any, Optional, TypedDict
-import torch
 from transformers import (
     AutoConfig,
     AutoModelForCausalLM,
@@ -158,6 +157,7 @@ def load_model(
     if model is None and not lazy_load:
         init_kwargs["config"] = config
         init_kwargs["pretrained_model_name_or_path"] = model_args.model_name_or_path
+        init_kwargs["torch_dtype"] = "auto"
         if model_args.mixture_of_depths == "load":
             model = load_mod_pretrained_model(**init_kwargs)


@@ -156,16 +156,13 @@ def patch_config(
     # deepspeed zero3 is not compatible with low_cpu_mem_usage
     init_kwargs["low_cpu_mem_usage"] = model_args.low_cpu_mem_usage and (not is_deepspeed_zero3_enabled())
-    # do not cast data type of the model deepspeed zero3 without qlora
-    if not (is_deepspeed_zero3_enabled() and model_args.quantization_bit is None):
-        init_kwargs["torch_dtype"] = "auto"
-        if init_kwargs["low_cpu_mem_usage"] and not is_fsdp_enabled():  # fsdp does not need device map
-            if "device_map" not in init_kwargs and model_args.device_map:
-                init_kwargs["device_map"] = model_args.device_map  # device map requires low_cpu_mem_usage=True
-            if init_kwargs.get("device_map", None) == "auto":
-                init_kwargs["offload_folder"] = model_args.offload_folder
+    # fsdp/deepspeed zero3 does not need device map
+    if not (is_deepspeed_zero3_enabled() or is_fsdp_enabled()) and init_kwargs["low_cpu_mem_usage"]:
+        if "device_map" not in init_kwargs and model_args.device_map:
+            init_kwargs["device_map"] = model_args.device_map  # device map requires low_cpu_mem_usage=True
+        if init_kwargs.get("device_map", None) == "auto":
+            init_kwargs["offload_folder"] = model_args.offload_folder
 def patch_model(
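
The post-change branch in `patch_config` can be read as a pure function of a few flags. The sketch below assumes that both inner `if` statements sit directly under the new outer condition, since indentation is not preserved in this view. The function name `apply_device_map` and its boolean parameters are hypothetical stand-ins for `is_deepspeed_zero3_enabled()`, `is_fsdp_enabled()`, and the `model_args` fields.

```python
from typing import Any, Dict, Optional


def apply_device_map(
    init_kwargs: Dict[str, Any],
    device_map: Optional[str],
    offload_folder: str,
    zero3_enabled: bool,
    fsdp_enabled: bool,
) -> Dict[str, Any]:
    """Set device_map (and offload_folder) only outside ZeRO-3/FSDP."""
    # Assumed nesting: both inner checks run only when the outer one passes.
    if not (zero3_enabled or fsdp_enabled) and init_kwargs["low_cpu_mem_usage"]:
        if "device_map" not in init_kwargs and device_map:
            init_kwargs["device_map"] = device_map
        if init_kwargs.get("device_map", None) == "auto":
            init_kwargs["offload_folder"] = offload_folder
    return init_kwargs
```

Under these assumptions, enabling FSDP or ZeRO-3, or turning off `low_cpu_mem_usage`, leaves `init_kwargs` untouched.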


@@ -84,7 +84,7 @@ def load_reference_model(
         model: AutoModelForCausalLMWithValueHead = AutoModelForCausalLMWithValueHead.from_pretrained(
             model_path, torch_dtype=torch.float16, device_map="auto"
         )
         return model
     model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, device_map="auto")


@@ -35,35 +35,40 @@ LOCALES = {
             "value": (
                 "<h3><center>Visit <a href='https://github.com/hiyouga/LLaMA-Factory' target='_blank'>"
                 "GitHub Page</a> <a href='https://llamafactory.readthedocs.io/en/latest/' target='_blank'>"
-                "Documentation</a></center></h3>"
+                "Documentation</a> <a href='https://blog.llamafactory.net/en/' target='_blank'>"
+                "Blog</a></center></h3>"
             ),
         },
         "ru": {
             "value": (
                 "<h3><center>Посетить <a href='https://github.com/hiyouga/LLaMA-Factory' target='_blank'>"
                 "страницу GitHub</a> <a href='https://llamafactory.readthedocs.io/en/latest/' target='_blank'>"
-                "Документацию</a></center></h3>"
+                "Документацию</a> <a href='https://blog.llamafactory.net/en/' target='_blank'>"
+                "Блог</a></center></h3>"
             ),
         },
         "zh": {
             "value": (
                 "<h3><center>访问 <a href='https://github.com/hiyouga/LLaMA-Factory' target='_blank'>"
                 "GitHub 主页</a> <a href='https://llamafactory.readthedocs.io/zh-cn/latest/' target='_blank'>"
-                "官方文档</a></center></h3>"
+                "官方文档</a> <a href='https://blog.llamafactory.net/' target='_blank'>"
+                "博客</a></center></h3>"
             ),
         },
         "ko": {
             "value": (
                 "<h3><center><a href='https://github.com/hiyouga/LLaMA-Factory' target='_blank'>"
                 "GitHub 페이지</a> <a href='https://llamafactory.readthedocs.io/en/latest/' target='_blank'>"
-                "공식 문서</a>를 방문하세요.</center></h3>"
+                "공식 문서</a> <a href='https://blog.llamafactory.net/en/' target='_blank'>"
+                "블로그</a>를 방문하세요.</center></h3>"
             ),
         },
         "ja": {
             "value": (
                 "<h3><center><a href='https://github.com/hiyouga/LLaMA-Factory' target='_blank'>"
                 "GitHub ページ</a> <a href='https://llamafactory.readthedocs.io/en/latest/' target='_blank'>"
-                "ドキュメント</a>にアクセスする</center></h3>"
+                "ドキュメント</a> <a href='https://blog.llamafactory.net/en/' target='_blank'>"
+                "ブログ</a>にアクセスする</center></h3>"
             ),
         },
     },