[inference] fix stop token for object detection (#6624)

* fix stop token
* update minicpm data pipeline
* fix npu qlora examples

Former-commit-id: e3e2c8c689
2026-03-07 20:26:00 +08:00 · 2025-01-13 21:34:20 +08:00
parent 089c7d5e51
commit d8cba9464f
15 changed files with 101 additions and 45 deletions
--- a/examples/README.md
+++ b/examples/README.md
@@ -109,6 +109,12 @@ USE_RAY=1 llamafactory-cli train examples/train_full/llama3_lora_sft_ray.yaml
 llamafactory-cli train examples/train_qlora/llama3_lora_sft_otfq.yaml
 ```

+#### Supervised Fine-Tuning with 4-bit Bitsandbytes Quantization on Ascend NPU
+
+```bash
+llamafactory-cli train examples/train_qlora/llama3_lora_sft_bnb_npu.yaml
+```
+
 #### Supervised Fine-Tuning with 4/8-bit GPTQ Quantization

 ```bash
--- a/examples/README_zh.md
+++ b/examples/README_zh.md
@@ -109,6 +109,12 @@ USE_RAY=1 llamafactory-cli train examples/train_full/llama3_lora_sft_ray.yaml
 llamafactory-cli train examples/train_qlora/llama3_lora_sft_otfq.yaml
 ```

+#### 在 NPU 上基于 4 比特 Bitsandbytes 量化进行指令监督微调
+
+```bash
+llamafactory-cli train examples/train_qlora/llama3_lora_sft_bnb_npu.yaml
+```
+
 #### 基于 4/8 比特 GPTQ 量化进行指令监督微调

 ```bash
--- a/examples/train_qlora/llama3_lora_sft_otfq_npu.yaml
+++ b/examples/train_qlora/llama3_lora_sft_otfq_npu.yaml
@@ -1,7 +1,7 @@
 ### model
 model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
 quantization_bit: 4
-quantization_method: bitsandbytes  # choices: [bitsandbytes (4/8), hqq (2/3/4/5/6/8), eetq (8)]
+quantization_method: bitsandbytes
 double_quantization: false
 trust_remote_code: true