3 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Copilot | a1b1931b4a | [breaking] migrate from setuptools to uv (#9673). Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> and hiyouga <16256802+hiyouga@users.noreply.github.com> | 2025-12-26 22:47:23 +08:00 |
| Xunpeng Xiao | 3c17f2722c | [model] Update ernie_vl to adapt new version (#9665) | 2025-12-26 19:57:49 +08:00 |
| Copilot | a882e2d5fc | [assets] Add GitHub Copilot instructions for repository (#9675). Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> and hiyouga <16256802+hiyouga@users.noreply.github.com> | 2025-12-26 17:32:48 +08:00 |
37 changed files with 382 additions and 266 deletions

.github/copilot-instructions.md (new file)

@@ -0,0 +1,180 @@
# GitHub Copilot Instructions for LLaMA Factory
## Project Overview
LLaMA Factory is an efficient fine-tuning framework for 100+ large language models (LLMs). It provides:
- Support for various models: LLaMA, LLaVA, Mistral, Qwen, DeepSeek, Yi, Gemma, ChatGLM, Phi, etc.
- Multiple training methods: pre-training, supervised fine-tuning, reward modeling, PPO, DPO, KTO, ORPO
- Scalable resources: 16-bit full-tuning, freeze-tuning, LoRA and QLoRA variants
- Advanced algorithms: GaLore, BAdam, APOLLO, Adam-mini, Muon, OFT, DoRA, etc.
- Web UI (LLaMA Board) and CLI interfaces
### Architecture Versions
LLaMA Factory has two parallel architectures that can be switched via the `USE_V1` environment variable:
**v0 (default)** - File hierarchy:
- `api`, `webui` → `chat`, `eval`, `train` → `data`, `model` → `hparams` → `extras`
**v1** - File hierarchy:
- `trainers` → `core` → `accelerator`, `plugins`, `config` → `utils`
Set `USE_V1=1` to enable v1 architecture.
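A minimal sketch of how that switch behaves at runtime (illustrative only; this is not the project's actual dispatch code):
```python
import os

# USE_V1 selects between the two parallel package layouts described above.
if os.getenv("USE_V1", "0").lower() in ("1", "true", "y"):
    # v1 layout under src/llamafactory/v1/: trainers, core, accelerator, plugins, config, utils
    print("v1 architecture enabled")
else:
    # v0 layout (default): api, webui, chat, eval, train, data, model, hparams, extras
    print("v0 architecture (default)")
```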
## Code Structure
### v0 Architecture (Default)
- `src/llamafactory/` - Main package directory
- `api/` - OpenAI-style API implementation
- `chat/` - Chat interface implementation
- `cli.py` - Command-line interface
- `data/` - Data processing and dataset handling
- `eval/` - Model evaluation utilities
- `extras/` - Additional utilities and helpers
- `hparams/` - Hyperparameter definitions
- `model/` - Model loading, patching, and utilities
- `train/` - Training pipeline implementation
- `webui/` - Gradio-based web interface
- `src/train.py` - Training entry script (delegates to `llamafactory.train.tuner`)
- `src/webui.py` - Web UI entry script (delegates to `llamafactory.webui.interface`)
- `src/api.py` - API server entry script (delegates to `llamafactory.api.app`)
- `tests/` - Test suite
- `examples/` - Example configurations for various training scenarios
- `data/` - Dataset definitions and examples
### v1 Architecture (USE_V1=1)
- `src/llamafactory/v1/` - Version 1 package directory
- `trainers/` - Training implementations
- `core/` - Core training utilities
- `accelerator/` - Acceleration and distributed training
- `plugins/` - Pluggable components (model, data, sampler, trainer)
- `config/` - Configuration management
- `utils/` - Utility functions
## Development Practices
### Code Style
- Follow the [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html)
- Use ruff for linting and formatting
- Line length: 119 characters
- Indentation: 4 spaces
- Quote style: double quotes
- Use Google-style docstrings for documentation
### Import Organization
- Known first-party: `llamafactory`
- Known third-party: `accelerate`, `datasets`, `gradio`, `numpy`, `peft`, `torch`, `transformers`, `trl`
- Use 2 blank lines after imports
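An illustrative module skeleton that follows the rules above (the helper and module names are made up for the example; only the style settings themselves come from the project configuration):
```python
"""Example module illustrating the import order and docstring style (names are hypothetical)."""

from typing import Optional

import torch

from llamafactory.extras import logging


logger = logging.get_logger(__name__)


def count_trainable_params(model: torch.nn.Module, dtype: Optional[torch.dtype] = None) -> int:
    """Count trainable parameters, optionally restricted to one dtype.

    Args:
        model: The model to inspect.
        dtype: If given, only count parameters of this dtype.

    Returns:
        The number of parameters with ``requires_grad`` set to ``True``.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    return sum(p.numel() for p in params if dtype is None or p.dtype == dtype)
```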
### Quality Checks
Before committing code, run:
```bash
make style # Auto-fix style issues
make quality # Check code quality
make test # Run test suite
```
Or use the combined command:
```bash
make commit # Run pre-commit hooks
```
### Testing
- Use pytest for testing
- Tests are located in `tests/` and `tests_v1/` directories
- Run tests with: `make test` (which runs `WANDB_DISABLED=true pytest -vv --import-mode=importlib tests/ tests_v1/`)
- Disable wandb during testing to avoid external dependencies
- **Note**: Training configurations require GPU machines, so training is typically not tested end-to-end. Use `make test` to validate file-level functionality.
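A hypothetical test sketch showing how a GPU-dependent check can be kept out of CPU-only runs (the file layout, marker condition, and environment variable are illustrative, not project conventions):
```python
"""Hypothetical example of a test module under tests/ (names are illustrative only)."""

import os

import pytest


requires_gpu = pytest.mark.skipif(
    os.getenv("RUN_GPU_TESTS", "0") != "1",  # assumed opt-in switch, not a project-defined variable
    reason="end-to-end training tests need a GPU machine",
)


def test_config_parsing_is_cheap():
    # File-level functionality like this runs under `make test` on CPU.
    assert {"stage": "sft"}["stage"] == "sft"


@requires_gpu
def test_training_end_to_end():
    # Placeholder for a full training run; only executed when a GPU is available.
    ...
```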
### Building
Build the package with:
```bash
pip3 install build && python3 -m build
```
### License
- All source files must include the Apache 2.0 license header
- Check license headers with: `make license`
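For reference, the header looks like this (matching the one used in existing source files, for example the former `setup.py`):
```python
# Copyright 2025 the LlamaFactory team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
```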
## Common Patterns
### Configuration Files
- Training configurations are typically YAML or JSON files in `examples/` directory
- Hyperparameters are defined using dataclasses in `src/llamafactory/hparams/`
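A sketch of that dataclass pattern (field names here are illustrative; the real argument classes live in `src/llamafactory/hparams/`):
```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ExampleModelArguments:
    """Illustrative sketch of a hyperparameter group (not one of the real classes)."""

    model_name_or_path: Optional[str] = field(
        default=None,
        metadata={"help": "Path to the model directory or its identifier on a model hub."},
    )
    quantization_bit: Optional[int] = field(
        default=None,
        metadata={"help": "Number of bits used to quantize the model, if any."},
    )
```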
### Model Support
- New model support is added through model patches in `src/llamafactory/model/`
- Visual models use the visual utilities in `src/llamafactory/model/model_utils/visual.py`
- Quantization support is in `src/llamafactory/model/model_utils/quantization.py`
### Data Processing
- Dataset definitions are in `data/dataset_info.json`
- Data templates and processors are in `src/llamafactory/data/`
### Training
- Training pipelines are in `src/llamafactory/train/`
- Support for different training methods: SFT, DPO, PPO, RM, PT, KTO, ORPO
## Key Dependencies
- Python >= 3.9.0
- PyTorch and transformers for model handling
- datasets for data processing
- peft for parameter-efficient fine-tuning
- accelerate for distributed training
- gradio for web UI
- trl for reinforcement learning
- Optional: vllm/sglang for inference, flash-attention-2, unsloth, liger-kernel
## Entry Points
- **CLI Training**: `llamafactory-cli train --config examples/train_lora/llama3_lora_sft.yaml`
- **Web UI**: `llamafactory-cli webui` or `python src/webui.py`
- **API Server**: `llamafactory-cli api` or `python src/api.py`
- **Chat Interface**: `llamafactory-cli chat --model_name_or_path MODEL_PATH`
## Environment Setup
For development:
```bash
pip install -e ".[dev]"
```
## Important Notes
- The project supports multiple backends: default PyTorch, vLLM, SGLang
- Megatron-core training is supported via mcore_adapter
- SwanLab and W&B are supported for experiment tracking
- Docker support is available with pre-built images
- Day-0/Day-1 support for latest cutting-edge models
- Multi-modal support for vision and audio understanding tasks
## Contribution Guidelines
1. Fork the repository
2. Create a development branch
3. Set up development environment with `pip install -e ".[dev]"`
4. Make changes following the style guide
5. Run quality checks: `make style && make quality`
6. Run tests: `make test`
7. Submit a pull request
## Common Commands
- `make style` - Format code
- `make quality` - Run linters
- `make test` - Run tests
- `make commit` - Install and run pre-commit hooks
- `make license` - Check license headers

View File

@@ -7,7 +7,7 @@ on:
      - "main"
    paths:
      - "**/*.py"
-      - "requirements.txt"
+      - "pyproject.toml"
      - "docker/**"
      - ".github/workflows/*.yml"
  pull_request:
@@ -15,7 +15,7 @@ on:
      - "main"
    paths:
      - "**/*.py"
-      - "requirements.txt"
+      - "pyproject.toml"
      - "docker/**"
      - ".github/workflows/*.yml"
  release:
@@ -64,7 +64,7 @@ jobs:
        id: version
        run: |
          if [ "${{ github.event_name }}" = "release" ]; then
-            echo "tag=$(python setup.py --version)" >> "$GITHUB_OUTPUT"
+            echo "tag=$(grep -oP 'VERSION = "\K[^"]+' src/llamafactory/extras/env.py)" >> "$GITHUB_OUTPUT"
          else
            echo "tag=latest" >> "$GITHUB_OUTPUT"
          fi
@@ -93,8 +93,6 @@ jobs:
        with:
          context: .
          file: ./docker/docker-cuda/Dockerfile
-          build-args: |
-            EXTRAS=metrics,deepspeed,liger-kernel
          push: ${{ github.event_name != 'pull_request' }}
          tags: |
            docker.io/hiyouga/llamafactory:${{ steps.version.outputs.tag }}

View File

@@ -7,7 +7,7 @@ on:
      - "main"
    paths:
      - "**/*.py"
-      - "requirements.txt"
+      - "pyproject.toml"
      - "Makefile"
      - ".github/workflows/*.yml"
  pull_request:
@@ -15,7 +15,7 @@ on:
      - "main"
    paths:
      - "**/*.py"
-      - "requirements.txt"
+      - "pyproject.toml"
      - "Makefile"
      - ".github/workflows/*.yml"
@@ -68,16 +68,19 @@ jobs:
        with:
          python-version: ${{ matrix.python }}
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
      - name: Install dependencies
        run: |
-          python -m pip install --upgrade pip
-          python -m pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
-          python -m pip install ".[dev]"
+          uv pip install --system torch torchvision --index-url https://download.pytorch.org/whl/cpu
+          uv pip install --system -e "."
+          uv pip install --system -r examples/requirements/dev.txt
      - name: Install transformers
        if: ${{ matrix.transformers }}
        run: |
-          python -m pip install "transformers==${{ matrix.transformers }}"
+          uv pip install --system "transformers==${{ matrix.transformers }}"
      - name: Cache files
        id: hf-hub-cache

View File

@@ -7,7 +7,7 @@ on:
      - "main"
    paths:
      - "**/*.py"
-      - "requirements.txt"
+      - "pyproject.toml"
      - "Makefile"
      - ".github/workflows/*.yml"
  pull_request:
@@ -15,7 +15,7 @@ on:
      - "main"
    paths:
      - "**/*.py"
-      - "requirements.txt"
+      - "pyproject.toml"
      - "Makefile"
      - ".github/workflows/*.yml"
@@ -48,10 +48,14 @@ jobs:
      - name: Checkout
        uses: actions/checkout@v4
+      - name: Install uv
+        run: |
+          curl -LsSf https://astral.sh/uv/install.sh | sh
      - name: Install dependencies
        run: |
-          python -m pip install --upgrade pip
-          python -m pip install ".[torch-npu,dev]" torch-npu==${{matrix.pytorch_npu}}
+          uv pip install --system -e "." torch-npu==${{matrix.pytorch_npu}}
+          uv pip install --system -r examples/requirements/dev.txt
      - name: Install node
        run: |

View File

@@ -1 +1 @@
-include LICENSE requirements.txt
+include LICENSE

View File

@@ -1,24 +1,24 @@
.PHONY: build commit license quality style test

-check_dirs := scripts src tests tests_v1 setup.py
+check_dirs := scripts src tests tests_v1

build:
-    pip3 install build && python3 -m build
+    uv build

commit:
-    pre-commit install
-    pre-commit run --all-files
+    uv run pre-commit install
+    uv run pre-commit run --all-files

license:
-    python3 tests/check_license.py $(check_dirs)
+    uv run python tests/check_license.py $(check_dirs)

quality:
-    ruff check $(check_dirs)
-    ruff format --check $(check_dirs)
+    uv run ruff check $(check_dirs)
+    uv run ruff format --check $(check_dirs)

style:
-    ruff check $(check_dirs) --fix
-    ruff format $(check_dirs)
+    uv run ruff check $(check_dirs) --fix
+    uv run ruff format $(check_dirs)

test:
-    WANDB_DISABLED=true pytest -vv --import-mode=importlib tests/ tests_v1/
+    WANDB_DISABLED=true uv run pytest -vv --import-mode=importlib tests/ tests_v1/

View File

@@ -514,10 +514,12 @@ huggingface-cli login
```bash
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
-pip install -e ".[torch,metrics]" --no-build-isolation
+pip install -e "." --no-build-isolation
```

-Extra dependencies available: torch, torch-npu, metrics, deepspeed, liger-kernel, bitsandbytes, hqq, eetq, gptq, aqlm, vllm, sglang, galore, apollo, badam, adam-mini, qwen, minicpm_v, openmind, swanlab, dev
+Optional dependencies available: `metrics`, `deepspeed`. Install with: `pip install -e ".[metrics,deepspeed]"`
+
+Additional dependencies for specific features are available in `examples/requirements/`.

#### Install from Docker Image
@@ -579,7 +581,7 @@ To enable FlashAttention-2 on the Windows platform, please use the script from [
<details><summary>For Ascend NPU users</summary>

-To install LLaMA Factory on Ascend NPU devices, please upgrade Python to version 3.10 or higher and specify extra dependencies: `pip install -e ".[torch-npu,metrics]"`. Additionally, you need to install the **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**. Please follow the [installation tutorial](https://www.hiascend.com/document/detail/en/CANNCommunityEdition/600alphaX/softwareinstall/instg/atlasdeploy_03_0031.html) or use the following commands:
+To install LLaMA Factory on Ascend NPU devices, please upgrade Python to version 3.10 or higher: `pip install -e "."`. Additionally, you need to install the **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**. Please follow the [installation tutorial](https://www.hiascend.com/document/detail/en/CANNCommunityEdition/600alphaX/softwareinstall/instg/atlasdeploy_03_0031.html) or use the following commands:

```bash
# replace the url according to your CANN version and devices
@@ -714,7 +716,6 @@ For CUDA users:
```bash
docker build -f ./docker/docker-cuda/Dockerfile \
    --build-arg PIP_INDEX=https://pypi.org/simple \
-    --build-arg EXTRAS=metrics \
    -t llamafactory:latest .

docker run -dit --ipc=host --gpus=all \
@@ -731,7 +732,6 @@ For Ascend NPU users:
```bash
docker build -f ./docker/docker-npu/Dockerfile \
    --build-arg PIP_INDEX=https://pypi.org/simple \
-    --build-arg EXTRAS=torch-npu,metrics \
    -t llamafactory:latest .

docker run -dit --ipc=host \
@@ -756,7 +756,6 @@ For AMD ROCm users:
```bash
docker build -f ./docker/docker-rocm/Dockerfile \
    --build-arg PIP_INDEX=https://pypi.org/simple \
-    --build-arg EXTRAS=metrics \
    -t llamafactory:latest .

docker run -dit --ipc=host \

View File

@@ -4,7 +4,6 @@ FROM ${BASE_IMAGE}
# Installation arguments
ARG PIP_INDEX=https://pypi.org/simple
-ARG EXTRAS=metrics
ARG INSTALL_FLASHATTN=false
ARG HTTP_PROXY=""
@@ -27,17 +26,13 @@ WORKDIR /app
# Change pip source
RUN pip config set global.index-url "${PIP_INDEX}" && \
    pip config set global.extra-index-url "${PIP_INDEX}" && \
-    pip install --no-cache-dir --upgrade pip packaging wheel setuptools
+    pip install --no-cache-dir --upgrade pip packaging wheel setuptools "hatchling>=1.18.0" editables

-# Install the requirements
-COPY requirements.txt /app
-RUN pip install --no-cache-dir -r requirements.txt
-
-# Copy the rest of the application into the image
+# Copy the application into the image
COPY . /app

# Install LLaMA Factory
-RUN pip install --no-cache-dir -e ".[${EXTRAS}]" --no-build-isolation
+RUN pip install --no-cache-dir -e "." --no-build-isolation

# Rebuild flash attention
RUN if [ "${INSTALL_FLASHATTN}" == "true" ]; then \

View File

@@ -8,7 +8,7 @@ ENV PYPI_MIRROR=https://mirrors.aliyun.com/pypi/simple/
ENV PYPI_TRUSTED_HOST=mirrors.aliyun.com
ENV APT_MIRROR=https://mirrors.tuna.tsinghua.edu.cn/ubuntu/

-RUN pip install --upgrade pip setuptools wheel --trusted-host ${PYPI_TRUSTED_HOST} --index-url ${PYPI_MIRROR}
+RUN pip install --upgrade pip setuptools wheel "hatchling>=1.18.0" editables --trusted-host ${PYPI_TRUSTED_HOST} --index-url ${PYPI_MIRROR}

RUN pip uninstall -y torch torchvision torch-tensorrt \
    flash_attn transformer-engine \
@@ -56,14 +56,14 @@ ENV JAVA_HOME /usr/lib/jvm/java-21-openjdk-amd64
# pip install LLaMA-Factory
WORKDIR /app

-COPY requirements.txt /app/
-RUN pip install --no-cache-dir -r requirements.txt
+# Copy the application into the image
+COPY . /app
+
+# Install LLaMA Factory
+RUN pip install --no-cache-dir -e "." --no-build-isolation

RUN pip install "git+https://github.com/alibaba/roll.git#subdirectory=mcore_adapter"

-COPY . /app/
-RUN pip install -e ".[metrics]" --no-build-isolation
-
# Expose port 7860 for LLaMA Board
ENV GRADIO_SERVER_PORT=7860
EXPOSE 7860

View File

@@ -5,7 +5,6 @@ FROM ${BASE_IMAGE}
# Installation arguments
ARG PIP_INDEX=https://pypi.org/simple
-ARG EXTRAS=torch-npu,metrics
ARG HTTP_PROXY=""
ARG PYTORCH_INDEX=https://download.pytorch.org/whl/cpu
@@ -28,21 +27,17 @@ WORKDIR /app
# Change pip source
RUN pip config set global.index-url "${PIP_INDEX}" && \
    pip config set global.extra-index-url "${PIP_INDEX}" && \
-    pip install --no-cache-dir --upgrade pip packaging wheel setuptools
+    pip install --no-cache-dir --upgrade pip packaging wheel setuptools "hatchling>=1.18.0" editables

# Install torch-npu
RUN pip uninstall -y torch torchvision torchaudio && \
    pip install --no-cache-dir "torch==2.7.1" "torch-npu==2.7.1" "torchvision==0.22.1" --index-url "${PYTORCH_INDEX}"

-# Install the requirements
-COPY requirements.txt /app
-RUN pip install --no-cache-dir -r requirements.txt
-
-# Copy the rest of the application into the image
+# Copy the application into the image
COPY . /app

# Install LLaMA Factory
-RUN pip install --no-cache-dir -e ".[${EXTRAS}]" --no-build-isolation
+RUN pip install --no-cache-dir -e "." --no-build-isolation

# Set up volumes
# VOLUME [ "/root/.cache/huggingface", "/app/shared_data", "/app/output" ]

View File

@@ -4,7 +4,6 @@ FROM ${BASE_IMAGE}
# Installation arguments
ARG PIP_INDEX=https://pypi.org/simple
-ARG EXTRAS=metrics
ARG INSTALL_FLASHATTN=false
ARG HTTP_PROXY=""
ARG PYTORCH_INDEX=https://download.pytorch.org/whl/rocm6.3
@@ -28,21 +27,17 @@ WORKDIR /app
# Change pip source
RUN pip config set global.index-url "${PIP_INDEX}" && \
    pip config set global.extra-index-url "${PIP_INDEX}" && \
-    pip install --no-cache-dir --upgrade pip packaging wheel setuptools
+    pip install --no-cache-dir --upgrade pip packaging wheel setuptools "hatchling>=1.18.0" editables

# Reinstall pytorch rocm
RUN pip uninstall -y torch torchvision torchaudio && \
    pip install --no-cache-dir --pre torch torchvision torchaudio --index-url "${PYTORCH_INDEX}"

-# Install the requirements
-COPY requirements.txt /app
-RUN pip install --no-cache-dir -r requirements.txt
-
-# Copy the rest of the application into the image
+# Copy the application into the image
COPY . /app

# Install LLaMA Factory
-RUN pip install --no-cache-dir -e ".[${EXTRAS}]" --no-build-isolation
+RUN pip install --no-cache-dir -e "." --no-build-isolation

# Rebuild flash attention
RUN if [ "${INSTALL_FLASHATTN}" == "true" ]; then \

View File

@@ -0,0 +1 @@
adam-mini

View File

@@ -0,0 +1 @@
apollo-torch

View File

@@ -0,0 +1 @@
aqlm[gpu]>=1.1.0

View File

@@ -0,0 +1 @@
badam>=1.2.1

View File

@@ -0,0 +1 @@
bitsandbytes>=0.39.0

View File

@@ -0,0 +1,4 @@
pre-commit
ruff
pytest
build

View File

@@ -0,0 +1 @@
eetq

View File

@@ -0,0 +1,2 @@
transformer_engine[pytorch]>=2.0.0
accelerate>=1.10.0

View File

@@ -0,0 +1,2 @@
torchao>=0.8.0
accelerate>=1.10.0

View File

@@ -0,0 +1 @@
galore-torch

View File

@@ -0,0 +1,2 @@
optimum>=1.24.0
gptqmodel>=2.0.0

View File

@@ -0,0 +1 @@
hqq

View File

@@ -0,0 +1 @@
liger-kernel>=0.5.5

View File

@@ -0,0 +1,8 @@
soundfile
torchvision
torchaudio
vector_quantize_pytorch
vocos
msgpack
referencing
jsonschema_specifications

View File

@@ -0,0 +1 @@
openmind

View File

@@ -0,0 +1,2 @@
sglang[srt]>=0.4.5
transformers==4.51.1

View File

@@ -0,0 +1 @@
swanlab

View File

@@ -0,0 +1 @@
vllm>=0.4.3,<=0.11.0

View File

@@ -1,22 +1,104 @@
[build-system]
-requires = ["setuptools>=61.0"]
-build-backend = "setuptools.build_meta"
+requires = ["hatchling"]
+build-backend = "hatchling.build"

[project]
name = "llamafactory"
+dynamic = ["version"]
+description = "Unified Efficient Fine-Tuning of 100+ LLMs"
+readme = "README.md"
+license = "Apache-2.0"
requires-python = ">=3.9.0"
-dynamic = [
-    "version",
-    "dependencies",
-    "optional-dependencies",
-    "scripts",
-    "authors",
-    "description",
-    "readme",
-    "license",
-    "keywords",
-    "classifiers"
+authors = [
+    { name = "hiyouga", email = "hiyouga@buaa.edu.cn" }
]
+keywords = [
+    "AI",
+    "LLM",
+    "GPT",
+    "ChatGPT",
+    "Llama",
+    "Transformer",
+    "DeepSeek",
+    "Pytorch"
+]
+classifiers = [
+    "Development Status :: 4 - Beta",
+    "Intended Audience :: Developers",
+    "Intended Audience :: Education",
+    "Intended Audience :: Science/Research",
+    "License :: OSI Approved :: Apache Software License",
+    "Operating System :: OS Independent",
+    "Programming Language :: Python :: 3",
+    "Programming Language :: Python :: 3.9",
+    "Programming Language :: Python :: 3.10",
+    "Programming Language :: Python :: 3.11",
+    "Programming Language :: Python :: 3.12",
+    "Topic :: Scientific/Engineering :: Artificial Intelligence"
+]
+dependencies = [
+    # core deps
+    "transformers>=4.49.0,<=4.56.2,!=4.52.0; python_version < '3.10'",
+    "transformers>=4.49.0,<=4.57.1,!=4.52.0,!=4.57.0; python_version >= '3.10'",
+    "datasets>=2.16.0,<=4.0.0",
+    "accelerate>=1.3.0,<=1.11.0",
+    "peft>=0.14.0,<=0.17.1",
+    "trl>=0.8.6,<=0.9.6",
+    "torchdata",
+    # torch
+    "torch>=2.0.0",
+    "torchvision>=0.15.0",
+    # gui
+    "gradio>=4.38.0,<=5.45.0",
+    "matplotlib>=3.7.0",
+    "tyro<0.9.0",
+    # ops
+    "einops",
+    "numpy<2.0.0",
+    "pandas>=2.0.0",
+    "scipy",
+    # model and tokenizer
+    "sentencepiece",
+    "tiktoken",
+    "modelscope>=1.14.0",
+    "hf-transfer",
+    "safetensors<=0.5.3",
+    # python
+    "fire",
+    "omegaconf",
+    "packaging",
+    "protobuf",
+    "pyyaml",
+    "pydantic<=2.10.6",
+    # api
+    "uvicorn",
+    "fastapi",
+    "sse-starlette",
+    # media
+    "av",
+    "librosa",
+    # yanked
+    "propcache!=0.4.0"
+]
+
+[project.optional-dependencies]
+metrics = ["nltk", "jieba", "rouge-chinese"]
+deepspeed = ["deepspeed>=0.10.0,<=0.16.9"]
+
+[project.scripts]
+llamafactory-cli = "llamafactory.cli:main"
+lmf = "llamafactory.cli:main"
+
+[project.urls]
+Homepage = "https://github.com/hiyouga/LLaMA-Factory"
+Repository = "https://github.com/hiyouga/LLaMA-Factory"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src/llamafactory"]
+
+[tool.hatch.version]
+path = "src/llamafactory/extras/env.py"
+pattern = "VERSION = \"(?P<version>[^\"]+)\""

[tool.ruff]
target-version = "py39"
@@ -73,23 +155,3 @@ indent-style = "space"
docstring-code-format = true
skip-magic-trailing-comma = false
line-ending = "auto"
-
-[tool.uv]
-conflicts = [
-    [
-        { extra = "torch-npu" },
-        { extra = "aqlm" },
-    ],
-    [
-        { extra = "torch-npu" },
-        { extra = "vllm" },
-    ],
-    [
-        { extra = "torch-npu" },
-        { extra = "sglang" },
-    ],
-    [
-        { extra = "vllm" },
-        { extra = "sglang" },
-    ],
-]

View File

@@ -1,39 +0,0 @@
# core deps
transformers>=4.49.0,<=4.56.2,!=4.52.0; python_version < '3.10'
transformers>=4.49.0,<=4.57.1,!=4.52.0,!=4.57.0; python_version >= '3.10'
datasets>=2.16.0,<=4.0.0
accelerate>=1.3.0,<=1.11.0
peft>=0.14.0,<=0.17.1
trl>=0.8.6,<=0.9.6
torchdata
# gui
gradio>=4.38.0,<=5.45.0
matplotlib>=3.7.0
tyro<0.9.0
# ops
einops
numpy<2.0.0
pandas>=2.0.0
scipy
# model and tokenizer
sentencepiece
tiktoken
modelscope>=1.14.0
hf-transfer
safetensors<=0.5.3
# python
fire
omegaconf
packaging
protobuf
pyyaml
pydantic<=2.10.6
# api
uvicorn
fastapi
sse-starlette
# media
av
librosa
# yanked
propcache!=0.4.0

setup.py (deleted)

@@ -1,116 +0,0 @@
# Copyright 2025 the LlamaFactory team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import re
from setuptools import find_packages, setup
def get_version() -> str:
with open(os.path.join("src", "llamafactory", "extras", "env.py"), encoding="utf-8") as f:
file_content = f.read()
pattern = r"{}\W*=\W*\"([^\"]+)\"".format("VERSION")
(version,) = re.findall(pattern, file_content)
return version
def get_requires() -> list[str]:
with open("requirements.txt", encoding="utf-8") as f:
file_content = f.read()
lines = [line.strip() for line in file_content.strip().split("\n") if not line.startswith("#")]
return lines
def get_console_scripts() -> list[str]:
console_scripts = ["llamafactory-cli = llamafactory.cli:main"]
if os.getenv("ENABLE_SHORT_CONSOLE", "1").lower() in ["true", "y", "1"]:
console_scripts.append("lmf = llamafactory.cli:main")
return console_scripts
extra_require = {
"torch": ["torch>=2.0.0", "torchvision>=0.15.0"],
"torch-npu": ["torch==2.7.1", "torch-npu==2.7.1", "torchvision==0.22.1", "decorator"],
"metrics": ["nltk", "jieba", "rouge-chinese"],
"deepspeed": ["deepspeed>=0.10.0,<=0.16.9"],
"liger-kernel": ["liger-kernel>=0.5.5"],
"bitsandbytes": ["bitsandbytes>=0.39.0"],
"hqq": ["hqq"],
"eetq": ["eetq"],
"gptq": ["optimum>=1.24.0", "gptqmodel>=2.0.0"],
"aqlm": ["aqlm[gpu]>=1.1.0"],
"vllm": ["vllm>=0.4.3,<=0.11.0"],
"sglang": ["sglang[srt]>=0.4.5", "transformers==4.51.1"],
"galore": ["galore-torch"],
"apollo": ["apollo-torch"],
"badam": ["badam>=1.2.1"],
"adam-mini": ["adam-mini"],
"minicpm_v": [
"soundfile",
"torchvision",
"torchaudio",
"vector_quantize_pytorch",
"vocos",
"msgpack",
"referencing",
"jsonschema_specifications",
],
"openmind": ["openmind"],
"swanlab": ["swanlab"],
"fp8": ["torchao>=0.8.0", "accelerate>=1.10.0"],
"fp8-te": ["transformer_engine[pytorch]>=2.0.0", "accelerate>=1.10.0"],
"fp8-all": ["torchao>=0.8.0", "transformer_engine[pytorch]>=2.0.0", "accelerate>=1.10.0"],
"dev": ["pre-commit", "ruff", "pytest", "build"],
}
def main():
setup(
name="llamafactory",
version=get_version(),
author="hiyouga",
author_email="hiyouga@buaa.edu.cn",
description="Unified Efficient Fine-Tuning of 100+ LLMs",
long_description=open("README.md", encoding="utf-8").read(),
long_description_content_type="text/markdown",
keywords=["AI", "LLM", "GPT", "ChatGPT", "Llama", "Transformer", "DeepSeek", "Pytorch"],
license="Apache 2.0 License",
url="https://github.com/hiyouga/LLaMA-Factory",
package_dir={"": "src"},
packages=find_packages("src"),
python_requires=">=3.9.0",
install_requires=get_requires(),
extras_require=extra_require,
entry_points={"console_scripts": get_console_scripts()},
classifiers=[
"Development Status :: 4 - Beta",
"Intended Audience :: Developers",
"Intended Audience :: Education",
"Intended Audience :: Science/Research",
"License :: OSI Approved :: Apache Software License",
"Operating System :: OS Independent",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Topic :: Scientific/Engineering :: Artificial Intelligence",
],
)
if __name__ == "__main__":
main()

View File

@@ -480,21 +480,35 @@ class ErnieVLPlugin(BasePlugin):
        self._validate_input(processor, images, videos, audios)
        self._validate_messages(messages, images, videos, audios)
        messages = deepcopy(messages)
+        image_processor: BaseImageProcessor = getattr(processor, "image_processor")
+        merge_length: int = getattr(image_processor, "merge_size") ** 2
+        if self.expand_mm_tokens:
+            mm_inputs = self._get_mm_inputs(images, videos, audios, processor)
+            image_grid_thw = mm_inputs.get("image_grid_thw", [])
+            video_grid_thw = mm_inputs.get("video_grid_thw", [])
+        else:
+            image_grid_thw = [None] * len(images)
+            video_grid_thw = [None] * len(videos)
+
        image_idx, video_idx = 0, 0
        for message in messages:
            content = message["content"]
-            image_token = self.image_token or "<|image@placeholder|>"
-            video_token = self.video_token or "<|video@placeholder|>"
+            image_token = self.image_token or "<|IMAGE_PLACEHOLDER|>"
+            video_token = self.video_token or "<|VIDEO_PLACEHOLDER|>"
            while IMAGE_PLACEHOLDER in content:
+                image_seqlen = image_grid_thw[image_idx].prod() // merge_length if self.expand_mm_tokens else 1
+                content = content.replace(
+                    IMAGE_PLACEHOLDER, f"Picture {image_idx + 1}:<|IMAGE_START|>{image_token * image_seqlen}<|IMAGE_END|>", 1
+                )
                image_idx += 1
-                content = content.replace(
-                    IMAGE_PLACEHOLDER, f"Picture {image_idx}:<|IMAGE_START|>{image_token}<|IMAGE_END|>", 1
-                )
            while VIDEO_PLACEHOLDER in content:
-                video_idx += 1
+                video_seqlen = video_grid_thw[video_idx].prod() // merge_length if self.expand_mm_tokens else 1
                content = content.replace(
-                    VIDEO_PLACEHOLDER, f"Video {video_idx}:<|VIDEO_START|>{video_token}<|VIDEO_END|>", 1
+                    VIDEO_PLACEHOLDER, f"Video {video_idx + 1}:<|VIDEO_START|>{video_token * video_seqlen}<|VIDEO_END|>", 1
                )
+                video_idx += 1
            message["content"] = content
        return messages

View File

@@ -981,7 +981,7 @@ register_template(
    replace_eos=True,
    replace_jinja_template=True,
    template_class=ReasoningTemplate,
-    mm_plugin=get_mm_plugin(name="ernie_vl", image_token="<|image@placeholder|>", video_token="<|video@placeholder|>"),
+    mm_plugin=get_mm_plugin(name="ernie_vl", image_token="<|IMAGE_PLACEHOLDER|>", video_token="<|VIDEO_PLACEHOLDER|>"),
)

View File

@@ -205,10 +205,6 @@ def load_model(
    if not is_trainable:
        model.requires_grad_(False)
-        for param in model.parameters():
-            if param.data.dtype == torch.float32 and model_args.compute_dtype != torch.float32:
-                param.data = param.data.to(model_args.compute_dtype)
        model.eval()
    else:
        model.train()

View File

@@ -158,7 +158,7 @@ def patch_config(
    # do not cast data type of the model deepspeed zero3 without qlora
    if not (is_deepspeed_zero3_enabled() and model_args.quantization_bit is None):
-        init_kwargs["torch_dtype"] = model_args.compute_dtype
+        init_kwargs["torch_dtype"] = "auto"

    if init_kwargs["low_cpu_mem_usage"] and not is_fsdp_enabled():  # fsdp does not need device map
        if "device_map" not in init_kwargs and model_args.device_map:

View File

@@ -84,9 +84,7 @@ def load_reference_model(
        model: AutoModelForCausalLMWithValueHead = AutoModelForCausalLMWithValueHead.from_pretrained(
            model_path, torch_dtype=torch.float16, device_map="auto"
        )
-        if not is_trainable:
-            model.v_head = model.v_head.to(torch.float16)
        return model

    model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, device_map="auto")