Mirror of https://github.com/hiyouga/LLaMA-Factory.git, synced 2026-04-07 04:46:01 +08:00
[model] gemma4 (#10346)
This commit adds `.ai/CLAUDE.md` (new file, 105 lines).
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands

```bash
# Code style (auto-fix)
make style

# Code quality check (no modifications)
make quality

# Run all tests
make test

# Run a single test file
WANDB_DISABLED=true pytest -vv --import-mode=importlib tests/path/to/test_file.py

# Run tests matching a pattern
WANDB_DISABLED=true pytest -vv --import-mode=importlib tests/ -k "test_name"

# License header check
make license

# Build package
make build
```
The project uses `uv` as the preferred package manager. Commands automatically use `uv run` / `uvx` if `uv` is available.
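That tool selection can be modeled roughly like this (an illustrative sketch, not the actual Makefile logic):

```python
import shutil

def wrap_command(cmd: str) -> str:
    # Prefer `uv run` when uv is installed; otherwise fall back to the
    # plain command. (Illustrative model of the Makefile's behavior.)
    if shutil.which("uv"):
        return f"uv run {cmd}"
    return cmd

print(wrap_command("pytest -vv"))
```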

## Architecture

LlamaFactory has two parallel architectures controlled by the `USE_V1` environment variable:

- **v0 (default):** `api, webui > chat, eval, train > data, model > hparams > extras`
- **v1 (experimental, `USE_V1=1`):** `trainers > core > accelerator, plugins, config > utils`

Most active development happens in v0. The v1 architecture lives in `src/llamafactory/v1/`.
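The `USE_V1` switch can be modeled like this (a minimal sketch of the dispatch behavior, not the actual code in `cli.py`):

```python
def select_launcher(env: dict[str, str]) -> str:
    # Route to the experimental v1 code path only when USE_V1 is set
    # to a truthy value; default to v0 otherwise. (Illustrative only;
    # the real accepted values may differ.)
    if env.get("USE_V1", "0").lower() in ("1", "true", "y"):
        return "v1"
    return "v0"

print(select_launcher({}))               # v0
print(select_launcher({"USE_V1": "1"}))  # v1
```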
### Entry Points

CLI entry point is `llamafactory-cli` / `lmf` → `src/llamafactory/cli.py:main()`, which dispatches to `launcher.py` based on `USE_V1`.

Available subcommands: `train`, `chat`, `api`, `export`, `webchat`, `webui`, `env`, `version`, `help`.

### Training Flow (v0)

```
run_exp() [tuner.py]
  → read_args() → parse YAML/JSON config
  → get_train_args() → produces typed argument dataclasses
  → routes to: run_sft / run_dpo / run_ppo / run_rm / run_pt / run_kto
  → optional: export_model()
```
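The routing step above can be sketched as a simple stage → runner lookup (the real `run_exp()` dispatches on parsed dataclasses, so this is illustrative only):

```python
# Hypothetical model of the stage routing inside run_exp() (tuner.py).
RUNNERS = {
    "sft": "run_sft",
    "dpo": "run_dpo",
    "ppo": "run_ppo",
    "rm": "run_rm",
    "pt": "run_pt",
    "kto": "run_kto",
}

def route(stage: str) -> str:
    # Unknown stages fail fast rather than falling through silently.
    if stage not in RUNNERS:
        raise ValueError(f"unknown training stage: {stage}")
    return RUNNERS[stage]

print(route("sft"))  # run_sft
```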

Training is invoked with a YAML config: `llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml`

### Configuration System

All training parameters are specified in YAML/JSON config files. Argument parsing in `src/llamafactory/hparams/parser.py` produces four typed dataclasses:

- `ModelArguments` — model/tokenizer selection, quantization
- `DataArguments` — datasets, templates, preprocessing
- `FinetuningArguments` — LoRA rank/target, training method (sft/dpo/ppo/rm/pt/kto)
- `TrainingArguments` — extends HuggingFace's `TrainingArguments`
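A config file mixes fields that feed these dataclasses. The sketch below is illustrative; the key names are assumptions modeled on the example configs under `examples/`, so check an actual file such as `examples/train_lora/llama3_lora_sft.yaml` for the real keys:

```yaml
# Hypothetical SFT config sketch; keys shown are assumptions.
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct  # → ModelArguments
dataset: identity                                        # → DataArguments
template: llama3                                         # → DataArguments
stage: sft                                               # → FinetuningArguments
finetuning_type: lora                                    # → FinetuningArguments
output_dir: saves/llama3-8b/lora/sft                     # → TrainingArguments
learning_rate: 1.0e-4                                    # → TrainingArguments
```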
### Key Modules

| Module | Purpose |
|--------|---------|
| `src/llamafactory/model/loader.py` | Loads model + tokenizer; applies quantization, LoRA, patches |
| `src/llamafactory/model/patcher.py` | Model-specific compatibility patches |
| `src/llamafactory/data/template.py` | Prompt templates; `TEMPLATES` dict maps model family → format |
| `src/llamafactory/data/mm_plugin.py` | Multi-modal (image/video/audio) data handling |
| `src/llamafactory/data/processor/` | Per-stage data processors (supervised, pairwise, pretrain, etc.) |
| `src/llamafactory/train/sft/` | SFT trainer; other stages follow the same structure |
| `src/llamafactory/chat/` | Inference engines: `hf_engine`, `vllm_engine`, `sglang_engine`, `kt_engine` |
| `src/llamafactory/extras/constants.py` | Enums and constants used across the project |

### Adding Support for a New Model

1. Add a prompt template to `src/llamafactory/data/template.py` in the `TEMPLATES` dict
2. Add any necessary model patches in `src/llamafactory/model/patcher.py`
3. Add multi-modal support in `src/llamafactory/data/mm_plugin.py` if needed
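Step 1 can be sketched with a toy model of the `TEMPLATES` registry (the class and function names here are illustrative stand-ins, not the actual dataclasses and helpers in `template.py`):

```python
from dataclasses import dataclass

@dataclass
class ChatTemplate:
    """Toy stand-in for the real template dataclass (fields differ)."""
    name: str
    user_format: str       # how a user turn is wrapped
    assistant_format: str  # how an assistant turn is wrapped

# Illustrative stand-in for the TEMPLATES dict in data/template.py.
TEMPLATES: dict[str, ChatTemplate] = {}

def register_template(name: str, user_format: str, assistant_format: str) -> None:
    TEMPLATES[name] = ChatTemplate(name, user_format, assistant_format)

# Register a prompt template for a hypothetical new model family.
register_template(
    name="mymodel",
    user_format="<|user|>{content}<|end|>",
    assistant_format="<|assistant|>{content}<|end|>",
)

print(TEMPLATES["mymodel"].user_format.format(content="Hello"))
# <|user|>Hello<|end|>
```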
### Distributed Training

Multi-GPU training automatically uses `torchrun`. Additional backends:

- **Ray:** optional Ray cluster support
- **HyperParallel FSDP2:** `src/llamafactory/train/hyper_parallel/`
- **Megatron-core:** `src/llamafactory/train/mca/`

### Testing

- `tests/` — v0 tests; `tests_v1/` — v1 tests
- Most training tests require GPU hardware
- pytest markers: `@pytest.mark.slow`, `@pytest.mark.runs_on(['cuda'])`
- Always set `WANDB_DISABLED=true` when running tests

### Code Style

- Ruff for linting and formatting (line length 119, Google-style docstrings)
- Python 3.11+ syntax
- Double quotes for strings
- All new files must include the Apache 2.0 license header (checked by `make license`)
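A `pyproject.toml` fragment consistent with these rules might look like the following (an assumption for illustration — the repository's actual Ruff configuration is authoritative):

```toml
[tool.ruff]
line-length = 119

[tool.ruff.format]
quote-style = "double"

[tool.ruff.lint.pydocstyle]
convention = "google"
```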