CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Commands

# Code style (auto-fix)
make style

# Code quality check (no modifications)
make quality

# Run all tests
make test

# Run a single test file
WANDB_DISABLED=true pytest -vv --import-mode=importlib tests/path/to/test_file.py

# Run tests matching a pattern
WANDB_DISABLED=true pytest -vv --import-mode=importlib tests/ -k "test_name"

# License header check
make license

# Build package
make build

The project uses uv as the preferred package manager. Commands automatically use uv run / uvx if uv is available.

Architecture

LLaMA-Factory has two parallel architectures, selected by the USE_V1 environment variable:

  • v0 (default): api, webui > chat, eval, train > data, model > hparams > extras
  • v1 (experimental, USE_V1=1): trainers > core > accelerator, plugins, config > utils

Most active development happens in v0. The v1 architecture lives in src/llamafactory/v1/.

Entry Points

The CLI entry point is llamafactory-cli (alias lmf), defined in src/llamafactory/cli.py:main(), which dispatches to launcher.py based on USE_V1.

Available subcommands: train, chat, api, export, webchat, webui, env, version, help.
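
The USE_V1 dispatch can be sketched as follows; this is an illustrative model of the behavior described above, not the actual code in cli.py (the dispatch function and its return format are hypothetical):

```python
import os

def dispatch(argv: list[str]) -> str:
    """Pick the v0 or v1 code path based on USE_V1, mirroring the CLI's branching."""
    arch = "v1" if os.environ.get("USE_V1", "0") == "1" else "v0"
    subcommand = argv[0] if argv else "help"
    return f"{arch}:{subcommand}"

os.environ.pop("USE_V1", None)  # default path: v0
print(dispatch(["train"]))      # v0:train
os.environ["USE_V1"] = "1"
print(dispatch(["chat"]))       # v1:chat
```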

Training Flow (v0)

run_exp() [tuner.py]
  → read_args() → parse YAML/JSON config
  → get_train_args() → produces typed argument dataclasses
  → routes to: run_sft / run_dpo / run_ppo / run_rm / run_pt / run_kto
  → optional: export_model()

Training is invoked with a YAML config: llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
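
A minimal LoRA SFT config in the style of the shipped examples might look like the following; field names follow the public example configs, but treat this as an illustrative sketch rather than an exact copy of that file:

```yaml
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all

### dataset
dataset: identity
template: llama3
cutoff_len: 2048

### output
output_dir: saves/llama3-8b/lora/sft
per_device_train_batch_size: 1
learning_rate: 1.0e-4
num_train_epochs: 3.0
```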

Configuration System

All training parameters are supplied as YAML/JSON config files. Argument parsing in src/llamafactory/hparams/parser.py produces four typed dataclasses:

  • ModelArguments — model/tokenizer selection, quantization
  • DataArguments — datasets, templates, preprocessing
  • FinetuningArguments — LoRA rank/target, training method (sft/dpo/ppo/rm/pt/kto)
  • TrainingArguments — extends HuggingFace's TrainingArguments

Key Modules

| Module | Purpose |
| --- | --- |
| src/llamafactory/model/loader.py | Loads model + tokenizer; applies quantization, LoRA, patches |
| src/llamafactory/model/patcher.py | Model-specific compatibility patches |
| src/llamafactory/data/template.py | Prompt templates; TEMPLATES dict maps model family → format |
| src/llamafactory/data/mm_plugin.py | Multi-modal (image/video/audio) data handling |
| src/llamafactory/data/processor/ | Per-stage data processors (supervised, pairwise, pretrain, etc.) |
| src/llamafactory/train/sft/ | SFT trainer; other stages follow the same structure |
| src/llamafactory/chat/ | Inference engines: hf_engine, vllm_engine, sglang_engine, kt_engine |
| src/llamafactory/extras/constants.py | Enums and constants used across the project |

Adding Support for a New Model

  1. Add a prompt template to src/llamafactory/data/template.py in the TEMPLATES dict
  2. Add any necessary model patches in src/llamafactory/model/patcher.py
  3. Add multi-modal support in src/llamafactory/data/mm_plugin.py if needed
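
Step 1 amounts to registering an entry in the TEMPLATES dict. The sketch below illustrates the registry idea only; the Template fields and register_template helper here are hypothetical stand-ins for the real API in data/template.py:

```python
from dataclasses import dataclass

@dataclass
class Template:
    user_prefix: str
    assistant_prefix: str
    stop_word: str

TEMPLATES: dict[str, Template] = {}

def register_template(name: str, template: Template) -> None:
    """Map a model-family name to its prompt format."""
    TEMPLATES[name] = template

register_template(
    "mychat",
    Template(user_prefix="<|user|>", assistant_prefix="<|assistant|>", stop_word="<|end|>"),
)
print("mychat" in TEMPLATES)  # True
```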

Distributed Training

Multi-GPU training automatically uses torchrun. Additional backends:

  • Ray: Optional Ray cluster support
  • HyperParallel FSDP2: src/llamafactory/train/hyper_parallel/
  • Megatron-core: src/llamafactory/train/mca/

Testing

  • tests/ — v0 tests; tests_v1/ — v1 tests
  • Most training tests require GPU hardware
  • pytest markers: @pytest.mark.slow, @pytest.mark.runs_on(['cuda'])
  • Always set WANDB_DISABLED=true when running tests

Code Style

  • Ruff for linting and formatting (line length 119, Google-style docstrings)
  • Python 3.11+ syntax
  • Double quotes for strings
  • All new files must include Apache 2.0 license header (checked by make license)