Compare commits

1289 Commits
v0.8.0 ... main

Author SHA1 Message Date
Yaowei Zheng
4ba7de0434
[model] Qwen3-30B-A3B-Thinking-2507 (#8788) 2025-07-30 23:05:12 +08:00
Yaowei Zheng
ea8a2d60d0
[deps] upgrade vllm to 0.10.0 (#8787) 2025-07-30 22:26:38 +08:00
Yaowei Zheng
ae0ef374a3
[assets] update readme (#8784) 2025-07-30 17:57:17 +08:00
Yaowei Zheng
edd112f35c
[model] add qwen3 2507 model (#8783) 2025-07-30 17:19:19 +08:00
Kingsley
7218d4aa96
[model] support keye-vl-8b (#8776) 2025-07-29 21:24:08 +08:00
Kingsley
4380b7b35e
[model] update glm4.5 (#8770) 2025-07-29 19:57:29 +08:00
Yaowei Zheng
3307ff1d4a
[model] add qwen3 2507 models (#8750) 2025-07-25 20:21:47 +08:00
Kingsley
2aadc90c2d
[model] add glm4moe (#8689) 2025-07-25 19:53:45 +08:00
Yaowei Zheng
2353e16e20
[assets] update readme (#8739) 2025-07-24 23:13:24 +08:00
Yaowei Zheng
6812f5e1f5
[assets] update wechat (#8731) 2025-07-24 02:35:47 +08:00
Steven sun
2077875622
[model] support granite4 (#8680) 2025-07-21 14:15:36 +08:00
Yaowei Zheng
678b7d69d2
[assets] update wechat (#8685) 2025-07-18 19:13:26 +08:00
Yaowei Zheng
f00742b078
[assets] update wechat (#8637) 2025-07-14 21:24:28 +08:00
Kingsley
fdb70c04e0
[data] fix gemma3n mmplugin (#8627) 2025-07-14 13:32:57 +08:00
Kingsley
95ed6c45cd
[model] add Devstral-Small-2507 (#8614) 2025-07-11 18:59:53 +08:00
Yaowei Zheng
cf1087d409
[webui] fix elems (#8587) 2025-07-08 21:16:42 +08:00
Kingsley
766884fa5c
[data] support glm4.1v video training (#8571) 2025-07-08 16:29:44 +08:00
Yaowei Zheng
6a8d88826e
[webui] fix abort finish (#8569) 2025-07-07 23:07:46 +08:00
Yaowei Zheng
043103e1c9
[webui] support other hub (#8567) 2025-07-07 22:18:48 +08:00
Yaowei Zheng
5817583630
[deps] bump transformers to 4.49.0 (#8564) 2025-07-07 20:31:50 +08:00
Yaowei Zheng
62bd2c8047
[assets] update wechat (#8565) 2025-07-07 20:29:20 +08:00
Redwood-Digital
1b549e3199
[docs] add nvidia-container-toolkit to Linux Docker setup instructions (#8557) 2025-07-06 19:37:08 +08:00
Vivek Iyer
c6290db118
Revert "[model] add lora dropout to unsloth" - requested feature already exists (#8554)
Co-authored-by: viyer <vivek_iyer2@apple.com>
2025-07-05 11:25:31 +08:00
wjunLu
d30cbcdfa5
[ci] Add workflow for building NPU image (#8546)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2025-07-04 20:56:59 +08:00
Vivek Iyer
62c6943699
[model] add lora dropout to unsloth (#8548)
Co-authored-by: viyer <vivek_iyer2@apple.com>
2025-07-04 14:56:36 +08:00
Ze-Yi LIN
8e7727f4ee
[tracking] fix swanlab hparams (#8532)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2025-07-02 22:08:44 +08:00
Yaowei Zheng
e117e3c2b7
[assets] update issue template (#8530) 2025-07-02 19:29:58 +08:00
Yaowei Zheng
dcd75e7063
[assets] update readme (#8529) 2025-07-02 17:42:27 +08:00
Kingsley
4465e4347e
[assets] update readme (#8519) 2025-07-02 15:38:38 +08:00
Kingsley
c5a08291f4
[model] add gemma3n (#8509) 2025-07-01 22:37:24 +08:00
Yaowei Zheng
544b7dc2ed
[assets] update wechat (#8517) 2025-07-01 21:23:48 +08:00
Injae Ryou
ac6c93df1f
[parser] update config loading to use OmegaConf #7793 (#8505) 2025-07-01 21:05:13 +08:00
Kingsley
0b188ca00c
[model] add GLM-4.1V (#8462) 2025-06-30 01:09:41 +08:00
Liu Jiajun
0a004904bd
[data] fix gemma2 eos token (#8480)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2025-06-27 18:19:15 +08:00
Yaowei Zheng
bb7bf51554
Merge commit from fork 2025-06-26 13:55:42 +08:00
Yaowei Zheng
7242caf0ff
[assets] update readme (#8461) 2025-06-25 22:15:03 +08:00
Yaowei Zheng
ed57b7ba2a
[webui] upgrade webui and fix api (#8460) 2025-06-25 21:59:58 +08:00
Yaowei Zheng
b10333dafb
[model] do not force load processor (#8457) 2025-06-25 19:43:00 +08:00
Yaowei Zheng
6b46c8b689
[assets] update wechat (#8458) 2025-06-25 19:42:28 +08:00
Yaowei Zheng
be27eae175
[data] fix audio reader (#8448) 2025-06-24 20:53:20 +08:00
Yaowei Zheng
31b0787e12
[misc] fix ci (#8441) 2025-06-24 16:37:32 +08:00
Kingsley
fffa43be86
[model] Add mistral-small 3.2 & kimi-dev (#8433) 2025-06-24 14:59:47 +08:00
Yaowei Zheng
8ed085e403
[model] add kimi vl 2506 (#8432) 2025-06-23 17:56:48 +08:00
Vivek Iyer
1221533542
[model] unsloth resume from checkpoint bug (#8423)
Co-authored-by: viyer <vivek_iyer2@apple.com>
2025-06-23 16:43:54 +08:00
codingma
8a3bddc7fa
[assets] update wechat 2025-06-23 14:42:52 +08:00
Yaowei Zheng
3a119ed5a2
[assets] update wechat (#8414) 2025-06-20 03:38:43 +08:00
Dhia Eddine Rhaiem
0d7d0ea972
[model] add support for Falcon H1 (#8403) 2025-06-18 16:51:23 +08:00
Yaowei Zheng
0e1fea71d2
[assets] update readme (#8396) 2025-06-17 16:15:20 +08:00
Yaowei Zheng
ec04d7b89c
[misc] set dev version (#8389) 2025-06-17 01:39:25 +08:00
Yaowei Zheng
cabc9207be
[ci] add docker version (#8390) 2025-06-17 01:37:47 +08:00
Yaowei Zheng
f3d144f001
[model] fix vlm utils (#8388) 2025-06-17 01:08:49 +08:00
Yaowei Zheng
af2f75e688
[data] fix qwen2vl pos ids (#8387) 2025-06-17 00:48:54 +08:00
Yaowei Zheng
9f2f12b0fe
[version] release v0.9.3 (#8386) 2025-06-16 19:21:32 +08:00
Yaowei Zheng
bb84c3c83e
[assets] update wechat (#8385) 2025-06-16 18:23:22 +08:00
Aman Gupta
1cfe42916d
[trainer] Add LD-DPO objective (#8362) 2025-06-12 16:10:38 +08:00
Yaowei Zheng
5ed62a29c5
[misc] tiny fixes (#8348) 2025-06-10 15:30:58 +08:00
阿丹(adan)
d39d3106cb
[model] support MiniCPM4 (#8314) 2025-06-10 14:38:39 +08:00
Kingsley
8ffe7daa8d
[model] support Mistral3.1 small 2503 (#8335) 2025-06-09 10:37:42 +08:00
Chenhao Zhang
8fa55db1ec
[assets] Add awesome works used LLaMA-Factory (#8333) 2025-06-09 10:21:17 +08:00
Yaowei Zheng
d8a5050cfb
[assets] update wechat (#8328) 2025-06-07 09:00:07 +08:00
Yaowei Zheng
7ecc2d46ca
[model] fix model generate (#8327) 2025-06-07 08:47:50 +08:00
Vivek Iyer
d325a1a7c7
[model] pushing FFT with unsloth (#8325)
Co-authored-by: viyer <vivek_iyer2@apple.com>
2025-06-07 08:20:58 +08:00
Yaowei Zheng
239ced076c
[data] fix empty template (#8312) 2025-06-06 13:50:50 +08:00
Yaowei Zheng
f5f356649b
[setup] fix uv (#8311) 2025-06-06 11:54:15 +08:00
Yaowei Zheng
13fd43617c
[assets] update readme (#8303) 2025-06-05 23:23:15 +08:00
Kingsley
fcd8662306
[tests] add visual model save test (#8248)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2025-06-05 20:38:01 +08:00
Yaowei Zheng
cecba57b3e
[assets] fix npu docker (#8298) 2025-06-05 19:09:20 +08:00
Butui Hu
83688b0b4d
[launcher] Add elastic and fault-tolerant training support (#8286)
Signed-off-by: Butui Hu <hot123tea123@gmail.com>
2025-06-05 16:40:03 +08:00
Kingsley
5308424705
[script] add Script description for qwen_omni_merge (#8293) 2025-06-05 13:22:01 +08:00
Yaowei Zheng
81c4d9bee6
[assets] update docker files (#8291) 2025-06-04 23:30:46 +08:00
Yaowei Zheng
ee676d29f6
[assets] update readme (#8288) 2025-06-04 17:46:12 +08:00
Yaowei Zheng
e3d5e0fa28
[assets] add icon (#8276) 2025-06-03 20:36:21 +08:00
Kingsley
3425bc6e71
[data] support nested images input for videos (#8264) 2025-06-03 20:26:29 +08:00
Ze-Yi LIN
6cc247e815
[tracking] swanlab add llamafactory tag (#8258) 2025-06-03 18:42:29 +08:00
Yaowei Zheng
16a3f8a71b
[assets] update wechat (#8270) 2025-06-03 17:33:57 +08:00
Kingsley
65aa86ed39
[model] add MIMO_VL (#8249) 2025-06-01 03:54:54 +08:00
Yaowei Zheng
fba9c9d9b9
[deps] upgrade transformers to 4.52.4 (#8245) 2025-05-31 16:51:40 +08:00
Akshat Sehgal
21829b5eaf
[model] add smollm2 support (#8220) 2025-05-31 16:29:01 +08:00
hoshi-hiyouga
762c2d776f
[assets] update readme (#8235) 2025-05-30 16:52:12 +08:00
Kingsley
e31afdfd4f
[scripts] specify model class for qwen_omni merge (#8227) 2025-05-30 14:20:12 +08:00
hoshi-hiyouga
be02003d16
[model] add deepseek 0528 models (#8215) 2025-05-29 21:37:07 +08:00
hoshi-hiyouga
a4048b7bb6
[assets] fix docker images (#8203) 2025-05-28 22:26:05 +08:00
yzoaim
73b12baaaf
[workflow] auto push docker images (#8181)
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-05-28 20:21:15 +08:00
hoshi-hiyouga
55d37dc472
[assets] update Dockerfile (#8201) 2025-05-28 20:20:59 +08:00
hoshi-hiyouga
54ffd06026
[webui] fix skip args (#8195) 2025-05-28 18:11:07 +08:00
Muqi Li
00c4988f89
[assets] fix incorrect user_tag in dataset_info.json to prevent skipped entries (#8197) 2025-05-28 18:01:37 +08:00
Youngwoo Kim
57c6e23247
[data] Reading files from cloud is broken (#8182) (#8183) 2025-05-28 15:50:44 +08:00
hoshi-hiyouga
07f792148e
[assets] fix docker image (#8180) 2025-05-27 19:01:31 +08:00
hoshi-hiyouga
e542f95710
[data] fix shared file system (#8179) 2025-05-27 18:36:03 +08:00
hoshi-hiyouga
4ecf4daeb2
[webui] add extra args to export (#8178) 2025-05-27 18:25:31 +08:00
hoshi-hiyouga
519ac92803
[assets] update docker files (#8176) 2025-05-27 18:15:23 +08:00
hoshi-hiyouga
16e1a5097e
[webui] add infer extra args (#8167) 2025-05-27 12:04:00 +08:00
hoshi-hiyouga
09436c1f45
[webui] fix input args (#8162) 2025-05-27 02:05:54 +08:00
hoshi-hiyouga
f3a1dc8483
[model] add smollm2 and medgemma (#8161) 2025-05-26 23:19:58 +08:00
hoshi-hiyouga
dc8cca11b3
[deps] upgrade transformers (#8159) 2025-05-26 22:03:58 +08:00
Akshat Sehgal
e6f45d696b
feat: add smollm support (#8050) 2025-05-26 19:47:54 +08:00
wangzhan
c477ae6405
[api] support repetition_penalty and align presence_penalty with OpenAI Client (#7958) 2025-05-26 18:45:11 +08:00
hoshi-hiyouga
52dead8775
[assets] update wechat (#8156) 2025-05-26 18:28:02 +08:00
Kingsley
a9211a730e
[data] fix internvl plugin when using PIL images (#8129) 2025-05-22 01:32:59 +08:00
hoshi-hiyouga
763fbc294b
[misc] update data readme (#8128) 2025-05-21 22:41:18 +08:00
hoshi-hiyouga
b0c8ba73e0
[deps] update to transformers 4.52 (#8125) 2025-05-21 05:16:18 +08:00
hoshi-hiyouga
b3b2c9f1ee
[data] llama3 multi tool support (#8124) 2025-05-21 02:01:12 +08:00
hoshi-hiyouga
f96c085857
[assets] update readme (#8110) 2025-05-20 02:44:18 +08:00
hoshi-hiyouga
b83a38eb98
[data] qwen3 fixes (#8109) 2025-05-20 02:00:30 +08:00
hoshi-hiyouga
f3fd67a9bb
[model] switch to gptqmodel (#8108) 2025-05-19 22:25:40 +08:00
piamo
a6f3adf930
[model] update rope kwargs for yarn (#8101) 2025-05-19 20:07:54 +08:00
hoshi-hiyouga
ed2f89efaf
[doc] add no build isolation (#8103) 2025-05-19 19:25:13 +08:00
Ma, Xiaochen
16e26236eb
[trainer] fix KeyError at end of pretrain (#8099) 2025-05-19 18:01:26 +08:00
Biao Wang
89a0d10c18
[misc] fix cli (#8095)
Co-authored-by: wangbiao11 <wangbiao11@baidu.com>
2025-05-19 17:59:39 +08:00
Saiya
820ed764c4
[infer] support lora adapter for SGLang backend (#8067) 2025-05-16 23:33:47 +08:00
Kingsley
66f719dd96
[data] add forward compatibility for video_utils in Transformers 4.52.0 (#8077) 2025-05-16 17:41:04 +08:00
Eric Tang
130bfaf8e3
[data] support loading folder from remote (#8078) 2025-05-16 15:35:38 +08:00
Shawn Tao
e8a18c17e9
[infer] Modify vllm_infer.py to batch preprocess to avoid too much files opened error (#8051)
Co-authored-by: Kingsley <82590017+Kuangdd01@users.noreply.github.com>
2025-05-15 10:54:35 +08:00
hoshi-hiyouga
2b23c0a7a1
[assets] update wechat (#8057) 2025-05-14 18:01:48 +08:00
hoshi-hiyouga
ab2c05115b
[assets] update windows installation (#8042) 2025-05-13 17:01:56 +08:00
hoshi-hiyouga
8d472c20cb
[model] add seed coder and qwen3 quant models (#8039) 2025-05-13 15:59:55 +08:00
hoshi-hiyouga
845af89ea4
[data] fix kimi vl template (#8015) 2025-05-11 20:45:19 +08:00
Kingsley
cef3a0b2e2
[scripts] add video params for vllm infer (#7992) 2025-05-09 21:16:52 +08:00
yunhao-tech
865ac07491
[data] Avoid repetitive tool description warp (#8000)
Co-authored-by: chenyunhao <chenyunhao@wps.cn>
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-05-09 21:16:37 +08:00
tpoisonooo
f584db50cf
[docs] add GraphGen (#7974) 2025-05-07 12:23:11 +02:00
hoshi-hiyouga
97e0a4cb5c
[misc] update liger kernel patch (#7966) 2025-05-06 20:32:16 +02:00
hoshi-hiyouga
c6bcca4c83
[example] update examples (#7964) 2025-05-06 17:24:25 +02:00
Kingsley
5ee9eb64d8
[model] add mimo7b (#7946) 2025-05-06 17:10:30 +02:00
hoshi-hiyouga
937447bd8a
[misc] fix qwen2 omni (#7962) 2025-05-06 15:39:13 +02:00
hoshi-hiyouga
52f25651a2
[model] add qwen2 omni 3b (#7945) 2025-05-03 16:36:51 +08:00
Eric Chen
75d7c35fdf
[assets] Warp Support README Update (#7887) 2025-05-02 00:08:48 +08:00
hoshi-hiyouga
6a584b4092
[hparam] add enable think argument (#7928) 2025-04-30 17:21:30 +08:00
hoshi-hiyouga
41ec928683
[data] fix base plugin (#7924) 2025-04-30 16:28:05 +08:00
hoshi-hiyouga
d8295cd601
[data] optimize qwen3 loss computation (#7923) 2025-04-30 16:18:00 +08:00
hoshi-hiyouga
a8430f4244
[misc] fix uv (#7913) 2025-04-30 07:45:03 +08:00
hoshi-hiyouga
072bfe29d3
[data] add eval_on_each_dataset arg (#7912) 2025-04-30 06:56:43 +08:00
hoshi-hiyouga
c5b1d07e7c
[data] replace eos token for base models (#7911) 2025-04-30 06:52:28 +08:00
hoshi-hiyouga
77c569e071
[data] improve mm plugin (#7910) 2025-04-30 06:34:28 +08:00
hoshi-hiyouga
ae392e054c
[model] add qwen3 (#7885) 2025-04-29 09:34:05 +08:00
Kingsley
369474451d
[data] fix qwen2.5 omni template (#7883) 2025-04-29 00:58:23 +08:00
hoshi-hiyouga
1f338deb87
[model] fix dsv3 leaf node (#7879) 2025-04-28 18:11:09 +08:00
hoshi-hiyouga
00b5c05946
[data] fix qwen2 omni plugin (#7875) 2025-04-28 14:22:41 +08:00
zhaop-l
1bd319d16c
[trainer] make projector trainable in freeze training (#7872)
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-28 13:19:37 +08:00
hoshi-hiyouga
fcca3b0b0d
[data] fix minicpmo vllm infer (#7870) 2025-04-28 01:59:53 +08:00
Kingsley
035e98035c
fix attn patch for kimivl (#7867) 2025-04-27 23:12:28 +08:00
Eric Tang
b4407e4b0b
[ray] add storage filesystem to ray config (#7854) 2025-04-27 22:12:40 +08:00
hoshi-hiyouga
036a76e9cb
[assets] update wechat (#7840) 2025-04-24 16:31:05 +08:00
hoshi-hiyouga
4fbdc65fcb
[model] fix vit gradient checkpointing (#7830) 2025-04-23 22:48:48 +08:00
hoshi-hiyouga
2989d39239
Merge commit from fork 2025-04-23 16:38:27 +08:00
hoshi-hiyouga
1344416378
[model] fix moe zero3 (#7826) 2025-04-23 15:30:49 +08:00
Kingsley
1dd67eb042
[data] fix internvl plugin (#7817) 2025-04-23 00:58:22 +08:00
hoshi-hiyouga
2b7d564e3b
[assets] update model readme (#7804) 2025-04-22 16:43:56 +08:00
Kingsley
d43013f14a
[model] add arch check for InternVL (#7803) 2025-04-22 16:38:05 +08:00
Kingsley
c91165a5a6
[misc] update internvl constants (#7801) 2025-04-22 15:53:08 +08:00
hoshi-hiyouga
7f3c31f6f4
[trainer] support early stop (#7797) 2025-04-22 01:59:33 +08:00
hoshi-hiyouga
92101f34a1
[data] improve mmplugin (#7795) 2025-04-22 01:25:33 +08:00
hoshi-hiyouga
a62cba3d05
[example] add bash usage (#7794) 2025-04-22 00:25:51 +08:00
Juanxi Tian
d128382d3c
[trainer] Add Muon Optimizer (#7749)
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-21 23:38:37 +08:00
hoshi-hiyouga
278df4308d
[parser] support omegaconf (#7793) 2025-04-21 23:30:30 +08:00
Changrui Chen
81768df04c
[data] Fix wrong position ids with packed attention masks (#7754)
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-21 23:19:36 +08:00
flashJd
1302ca39f6
[misc] fix new tokens adding (#7253)
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-21 23:19:02 +08:00
ddddng
b8cddbc7d7
[model] fix gemma3 export (#7786)
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-21 23:07:11 +08:00
Sachin Beldona
ec7257e70f
[misc] fix bug in constant (#7765)
Co-authored-by: Sachin Beldona <sbeldona@cs.cmu.edu>
2025-04-21 23:06:31 +08:00
hoshi-hiyouga
a4455e3021
[assets] update wechat (#7792) 2025-04-21 21:29:42 +08:00
hoshi-hiyouga
610f164c69
[trainer] fix pt loss (#7748)
* fix pt loss

* robust

* fix

* test
2025-04-17 03:15:35 +08:00
hoshi-hiyouga
0a0cfeb782
[breaking] bump transformers to 4.45.0 & improve ci (#7746)
* update ci

* fix

* fix

* fix

* fix

* fix
2025-04-17 02:36:48 +08:00
hoshi-hiyouga
4831552856
[infer] set env for vllm ascend (#7745) 2025-04-17 01:08:55 +08:00
Kingsley
125513fa5c
[model] support intern-VL 2.5-3 series (#7258)
* add internvl and rebase

* fix for internvl2&3

* remove lines

* fix video_inputs & lint

* nit

* add constants

* remove lines

* fix

* fix error

* pass ci

* pass ci

* skip internvl & nit
2025-04-17 00:31:30 +08:00
ENg-122
8543400584
[misc] improve entrypoint (#7345)
* Purely a cleanup of the entrypoint code, since there were too many if/else branches

* Update cli.py

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-16 21:48:23 +08:00
leo-pony
e1fdd6e2f8
[infer] support vllm-ascend (#7739) 2025-04-16 20:06:47 +08:00
codingma
d07983dceb
[assets] wechat (#7740) 2025-04-16 18:02:01 +08:00
hoshi-hiyouga
9b94211045
[api] fix chat messages (#7732) 2025-04-15 16:39:08 +08:00
hoshi-hiyouga
0fe5631f9b
[deps] upgrade vllm (#7728) 2025-04-15 14:57:40 +08:00
Joe Schoonover
b5d667cebf
[docker] patch docker-rocm (#7725)
* Update Dockerfile

* Fix typo

* Fix syntax for /bin/sh conditional

* Add build args to docker-compose

* Change shell to /bin/bash

This is required for "==" syntax in conditional string comparison
2025-04-15 13:36:39 +08:00
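The shell change in this patch is easy to trip over: `==` inside `[ ]` is a bash extension, while POSIX `/bin/sh` only guarantees `=`. A minimal illustration of the portable spelling (not the Dockerfile itself):

```shell
# '[ "$x" == "$y" ]' may fail under dash/sh; the portable form uses '='.
TARGET=rocm
if [ "$TARGET" = "rocm" ]; then
  echo "building ROCm image"
fi
```

Switching the script's shell to `/bin/bash`, as the commit does, is the other way out when `==` is already in use.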
hoshi-hiyouga
ac8c6fdd3a
[assets] update model readme (#7724) 2025-04-15 00:41:09 +08:00
Kingsley
df8752e8ee
[model] Support Kimi_VL thinking/instruct (#7719)
* add kimi_vl

* patch config

* check version

* Update mm_plugin.py

* Update mm_plugin.py

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-15 00:21:58 +08:00
hoshi-hiyouga
3a13d2cdb1
[misc] fix env vars (#7715) 2025-04-14 16:04:04 +08:00
hoshi-hiyouga
3ef36d0057
[misc] upgrade cli (#7714) 2025-04-14 15:41:22 +08:00
hoshi-hiyouga
1fd4d14fbb
[deps] upgrade transformers (#7704) 2025-04-13 18:11:34 +08:00
Yuxuan Zhang
481ecbf9c5
[model] add GLM-4-0414 (#7695)
* Update README_zh.md

* update
2025-04-13 17:10:45 +08:00
hoshi-hiyouga
60a84f664b
[deps] fix uv conflicts (#7686)
* fix #7678

* Update setup.py

* Update tests.yml

* Update publish.yml

* Update Makefile
2025-04-11 18:02:24 +08:00
hoshi-hiyouga
11bcafd06a
[assets] update wechat (#7674) 2025-04-10 20:10:46 +08:00
Eric Tang
6c53471de2
[data] support for specifying a dataset in cloud storage (#7567)
* add support for loading datasets from s3/gcs

* add comments to readme

* run linter and address comments

* add option to pass in kwargs to ray init (i.e. runtime env)

* address comment

* revert mixed up changes
2025-04-10 11:31:35 +08:00
Eric Tang
39c1e29ed7
[ray] allow for specifying ray.init kwargs (i.e. runtime_env) (#7647)
* ray init kwargs

* Update trainer_utils.py

* fix ray args

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-10 11:31:05 +08:00
Dain Kim
ee840b4e01
[bugfix] enable_gemma_liger_kernel (#7660)
- The `enable_liger_kernel` function for the Gemma model series was not executed due to the existing `if` statement in the code.
- Changed the line to an `elif` statement so that the `apply_liger_kernel` function is executed properly.

resolved: #7628
2025-04-10 11:27:30 +08:00
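The bug report above describes a classic dispatch mistake: a branch written as a bare `if` after an `if`/`elif` chain never fires under the intended condition. A minimal sketch of the fixed pattern, with illustrative names rather than the actual `apply_liger_kernel` code:

```python
def enable_kernels(model_type: str) -> list:
    """Return the kernel patches applied for a model type (illustrative only)."""
    applied = []
    if model_type in ("llama", "mistral"):
        applied.append("liger-llama-patch")
    elif model_type.startswith("gemma"):
        # Before the fix this was a separate bare `if`, so the Gemma patch
        # was skipped; `elif` keeps the dispatch in one exclusive chain.
        applied.append("liger-gemma-patch")
    return applied
```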
jilongW
3bdc7e1e6c
[misc] fix cuda warn on intel GPU (#7655) 2025-04-09 21:37:54 +08:00
hoshi-hiyouga
34fdabe005
[data] add coig-p dataset (#7657) 2025-04-09 21:18:25 +08:00
hoshi-hiyouga
24cb890432
[assets] update readme (#7654) 2025-04-09 18:27:38 +08:00
hoshi-hiyouga
39876b85fc
[assets] update readme (#7644) 2025-04-09 01:06:06 +08:00
Kingsley
7d8bee96fc
[data] Fix bugs of use_audio_in_video in Qwen2.5 Omni (#7638)
* cache _mm_inputs

* nit

* support for use_audio_in_video

* remove cache

* fix data

* Update mllm_video_audio_demo.json
2025-04-08 18:40:10 +08:00
Shawn Tao
8f5f4cc559
[trainer] fix key error (#7635) 2025-04-08 18:39:50 +08:00
Adarsh Shirawalmath
8ee26642f3
[sglang] support transformers 4.51.0 (#7639) 2025-04-08 18:39:23 +08:00
hoshi-hiyouga
5817cda37e
[misc] fix packing and eval plot (#7623) 2025-04-07 18:20:57 +08:00
hoshi-hiyouga
7e0cdb1a76
[assets] update readme (#7612) 2025-04-06 13:58:49 +08:00
hoshi-hiyouga
6c200fd218
[model] add llama4 (#7611) 2025-04-06 13:42:31 +08:00
hoshi-hiyouga
61b24c3827
[assets] update wechat (#7594) 2025-04-03 17:45:26 +08:00
Kingsley
32cb086be1
[data] fix qwen2.5 omni plugin (#7578)
* specific entry

* Update mm_plugin.py

* fix fps cal

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-02 23:58:39 +08:00
Kingsley
80f8d037d0
[data] fix qwen2.5 omni plugin (#7573)
* align key with qwen2vl

* nit && change scripts
2025-04-02 21:28:52 +08:00
gechengze
11997593be
[trainer] fix batch processing in PPO trainer (#7576) 2025-04-02 21:17:48 +08:00
hoshi-hiyouga
903db09822
[infer] vllm video/audio inference (#7566) 2025-04-02 02:27:04 +08:00
hoshi-hiyouga
aaf2e6ba2a
[model] fix kv cache (#7564) 2025-04-01 23:07:46 +08:00
Yu Shi Jie
9deece1d50
[model] fix use_cache patching for gemma3 multimodal (#7500) 2025-04-01 16:06:48 +08:00
Ritesh Goru
f06a74ad4e
[data] specify position_ids in PackedSupervisedDatasetProcessor for neat_packing (#7318)
* use position_ids for neat_packing with fa2

* revert fa2 changes
2025-04-01 16:03:13 +08:00
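The idea behind this change: when several short samples are packed into one training row, FlashAttention-2 can be given `position_ids` that restart at 0 for each packed segment, instead of relying on a full attention-mask rewrite. A small sketch of that restart logic (not the project's exact processor code):

```python
def packed_position_ids(seq_lens):
    """Position ids for a packed row: counting restarts per packed sample."""
    pos = []
    for n in seq_lens:
        pos.extend(range(n))  # each segment gets positions 0..n-1
    return pos
```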
taoharry
6faa6fb53d
[webui] fix launch with proxy (#7332) 2025-04-01 15:52:56 +08:00
Billy Cao
5d1cc863a4
[data] shard the dataset to allow multiprocessing when streaming is enabled (#7530)
* Shard the dataset when streaming to allow multiprocessing

* Allow user to not set dataset_shards to ensure backward compatibility
2025-04-01 15:36:23 +08:00
Hao
6d6e0f44fc
[trainer] new kto mismatch pair creation strategy (#7509) 2025-04-01 15:21:53 +08:00
hoshi-hiyouga
2d421c57bf
[data] fix qwen2.5 omni collator (#7553) 2025-04-01 00:15:12 +08:00
Kingsley
185c76f6ad
[model] add Qwen2.5-Omni model (#7537)
* preserve image_sizes

* preserve image_sizes

* init plugin

* support audio-text2text lora

* nit

* support image/video-text2text, audio-text2text

* remove args

* remove lines

* add docs && nit

* remove some comments

* fix && add merge part script

* add license
2025-03-31 20:39:35 +08:00
hoshi-hiyouga
468eea6f6d
[deps] pin pydantic to 2.10.6 (#7546) 2025-03-31 14:42:28 +08:00
hoshi-hiyouga
49436e93e6
[assets] update wechat (#7523) 2025-03-28 17:44:36 +08:00
Kingsley
b00cb2ed42
[data] fix pixtral plugin (#7505)
* preserve `image_sizes`

* add comments
2025-03-27 17:06:40 +08:00
Xu-pixel
f547334604
[3rdparty] support swanlab lark notification (#7481) 2025-03-27 01:52:01 +08:00
Kdump
01166841cf
[trainer] fix wsd scheduler (#7304)
* [trainer] Warmup_stable_decay supports setting the number of stable and decay steps according to the warmup_ratio

* Update trainer_utils.py

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-03-26 15:25:02 +08:00
hoshi-hiyouga
59e12bffe8
[model] add qwen2vl 32b & upgrade peft (#7469)
* add qwen2vl 32b

* fix ci

* upgrade peft to 0.15

* fix ci

* fix ci
2025-03-25 12:15:58 +08:00
GuoCoder
b6d8749bf3
[model] fix lora on quant models (#7456)
Co-authored-by: root <root@ai>
2025-03-25 11:59:46 +08:00
Xiaosu Zhu
bc9ada9db7
[misc] update liger-kernel's monkey patch (#7453)
* Update liger_kernel.py

* Update setup.py
2025-03-25 11:58:52 +08:00
AbdelKarim ELJANDOUBI
b6dc7e01e2
[misc] enable liger kernel for gemma3 text and paligemma (#7466)
* add gemma3 text

* add paligemma (1,2 and 2 mix)
2025-03-25 09:27:43 +08:00
Kenny Lam
59a56f7226
[misc] enable liger kernel for gemma3 (#7462) 2025-03-24 19:09:59 +08:00
hoshi-hiyouga
9abee9cd1a
[assets] update wechat (#7455) 2025-03-24 14:53:10 +08:00
hoshi-hiyouga
833edc7c73
[assets] fix gemma3 readme (#7449) 2025-03-24 10:31:25 +08:00
hoshi-hiyouga
42e090d38b
[trainer] fix vlm loss for transformers 4.49 (#7448) 2025-03-24 10:24:05 +08:00
rumichi
747e02d60d
[docker] upgrade to torch 2.6 (#7442) 2025-03-23 21:18:08 +08:00
hoshi-hiyouga
c841e92116
[misc] fix ci (#7441)
* fix ci

* improve ci
2025-03-23 21:09:35 +08:00
hoshi-hiyouga
fbf49e2500
[misc] fix license (#7440) 2025-03-23 19:31:56 +08:00
SnowFox4004
7d4dc25c23
[scripts] support compute score on vllm's predictions (#7419)
* enable manual bleu&rouge eval by adding `scripts/eval_bleu_rouge.py`

* added libraries check

* update: use the datasets library's multiprocessing to speed up processing

* update:
- use fire.Fire
- reformat the code

* Update eval_bleu_rouge.py: correctly uses fire

Deleted the code of using sys.argv

* Update eval_bleu_rouge.py

---------

Co-authored-by: SnowFox4004 <manba@out>
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-03-23 19:21:01 +08:00
hoshi-hiyouga
b1b78daf06
[deps] upgrade transformers to 4.50.0 (#7437)
* upgrade transformers

* fix hf cache

* fix dpo trainer
2025-03-23 17:44:27 +08:00
hoshi-hiyouga
dfbe1391e9
[deps] upgrade vllm to 0.8 (#7436) 2025-03-23 14:32:22 +08:00
Guo, Quan
ebc989ad4a
[misc] fix sglang deps (#7432)
* feat: Add transformer version requirement for sglang

* feat: add srt to sglang which is required for running sglang

Other options are srt_hip, srt_xpu, srt_npu, srt_hpu, srt_cpu, for different computation architectures.
2025-03-23 14:07:10 +08:00
Eric Tang
d8a5571be7
[3rdparty] fix redundant process group destroy for ray (#7395)
* fix redundant process group destroy for ray

* Update tuner.py

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-03-21 10:56:47 +08:00
hoshi-hiyouga
555b71a1cb
[version] fix minicpmo (#7378) 2025-03-20 16:59:31 +08:00
hoshi-hiyouga
4a5d0f0ba7
[assets] update wechat (#7361) 2025-03-18 21:31:09 +08:00
hoshi-hiyouga
c518146e62
[misc] set dev version (#7351) 2025-03-18 00:10:53 +08:00
hoshi-hiyouga
1d2131e5cb
[data] fix template (#7349) 2025-03-17 23:45:20 +08:00
hoshi-hiyouga
48a6584fb1
[assets] update videos (#7340)
* Update README.md

* Update README_zh.md
2025-03-17 15:48:02 +08:00
Hertz
a71e685021
[model] support hunyuan 7b (#7317)
* [Model]supported tencent-hunyuan model

* [Model]supported tencent-hunyuan model(fix)

* [Model]supported tencent-hunyuan model(fix)
2025-03-15 20:55:24 +08:00
Qiaolin Yu
30038d9ce7
[inference] support sglang backend (#7278)
* Mimic SGLang offline Engine

* Add more tests and args

* Pass all current tests

* Clean Code

* fix sample_params

* clean code

* Fix Stream Chat

* change sglang from engine mode to server mode

* fix

* Fix Review Issues

* Use SGLang Built-In Utilities

* Fix test SGLang

* Some Doc Issue

* fix sglang engine

* add readme

---------

Co-authored-by: Jin Pan <jpan236@wisc.edu>
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
2025-03-15 04:37:58 +08:00
hoshi-hiyouga
ef5f1c1def
[data] gemma3 plugin pan and scan (#7294)
* gemma3 pan and scan

* add test case

* fix test
2025-03-13 23:29:23 +08:00
Victor Nogueira
3dff4ecca8
[dataset] fix ultrachat_200k dataset (#7259)
The `HuggingFaceH4/ultrachat_200k` dataset doesn't contain the default "train" split. The correct split is "train_sft".
2025-03-13 20:20:18 +08:00
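The fix above boils down to pointing the loader at a split that actually exists. In the style of LLaMA-Factory's `dataset_info.json`, the entry would look roughly like this (field names follow the project's data README; treat it as a sketch):

```json
{
  "ultrachat_200k": {
    "hf_hub_url": "HuggingFaceH4/ultrachat_200k",
    "split": "train_sft"
  }
}
```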
hoshi-hiyouga
0dbce72fb8
[assets] update wechat (#7288) 2025-03-13 18:48:59 +08:00
hoshi-hiyouga
e9b427d535
[assets] update video (#7287) 2025-03-13 18:45:47 +08:00
Ritesh Goru
d7d79f7e06
[data] efficient 4d_attention_mask creation in neat_packing (#7272) 2025-03-13 03:31:12 +08:00
hoshi-hiyouga
9ccfb97a2c
[misc] update format (#7277) 2025-03-13 02:53:08 +08:00
hoshi-hiyouga
165d3ed084
[model] support gemma3 (#7273) 2025-03-13 01:35:23 +08:00
hoshi-hiyouga
142fd7e755
[misc] upgrade deps (#7257) 2025-03-12 00:33:47 +08:00
hoshi-hiyouga
7c1640ed5f
[misc] upgrade format to py39 (#7256) 2025-03-12 00:08:41 +08:00
hoshi-hiyouga
cdafa8a15e
[ci] update workflow (#7255) 2025-03-11 22:57:49 +08:00
hoshi-hiyouga
b256ca86f0
[core] release v0.9.2 (#7254) 2025-03-11 22:42:23 +08:00
hoshi-hiyouga
7a7071e504 Merge pull request #7242 from hiyouga/hiyouga/release
[release] release v0.9.2

Former-commit-id: 6b25268990bf225d84e29d4067595cf720fa12d8
2025-03-11 15:28:45 +08:00
hoshi-hiyouga
847ae972d0 Merge pull request #7247 from hiyouga/hiyouga/commit
[misc] support print commit info

Former-commit-id: 0f7ec4f8529a5d7ea2153b881335821038307bb7
2025-03-11 15:28:04 +08:00
hoshi-hiyouga
1c634d9c53 Merge pull request #7244 from hiyouga/hiyouga/token
[data] avoid exit after saving preprocessed data

Former-commit-id: dcbf01b0035062fa14187e5bdbb925080d349501
2025-03-11 15:17:15 +08:00
hiyouga
99b71768a0 support commit info
Former-commit-id: af752b1c271739e6614773e6898566240d0c7043
2025-03-11 15:13:59 +08:00
hiyouga
37b844d929 remove exit in preprocess
Former-commit-id: 1a800f9993d28d80d4587a08c20f5a69722436b5
2025-03-11 15:08:25 +08:00
hiyouga
f5810a6e47 release v0.9.2
Former-commit-id: aaad96359398c50bfe4a864859039a99b9f3a3a7
2025-03-11 14:49:13 +08:00
hoshi-hiyouga
317d0855d2 [infer] fix vllm args (#7235)
Former-commit-id: ef7af457fc44b1e8cad0c78717848617f98364f0
2025-03-11 01:15:35 +08:00
Ze-Yi LIN
0a43bc1960 [tracking] add swanlab_logdir param (#7219)
* feat: add swanlab_logdir param

* fix

Former-commit-id: a1e76af3d9cf64a6c016bb2333fc815fd4be73cf
2025-03-11 00:53:07 +08:00
hoshi-hiyouga
5a29f49fb1 [config] update args (#7231)
Former-commit-id: ed8b12e3cbdaa85f5bde619081b86717a1f3c5fa
2025-03-10 23:04:43 +08:00
hoshi-hiyouga
4e68828e46 [config] fix export max len (#7230)
Former-commit-id: 728c2f68190129e88ac375b453c60ed0d6bebbd8
2025-03-10 16:46:08 +08:00
hoshi-hiyouga
9a0044ef5e [assets] update wechat (#7229)
Former-commit-id: ae4cbe8fbc9b7e3ea122440be49c8cba52e03248
2025-03-10 15:39:06 +08:00
hoshi-hiyouga
d412301d08 [data] update mm demo data (#7211)
Former-commit-id: 1774882f5a73760e104e08dfa76fe592b1d876a1
2025-03-07 20:07:15 +08:00
hoshi-hiyouga
5a0fd22c05 [assets] update readme (#7209)
Former-commit-id: cdf8fc647819523eca8139ba62a510c3078b694d
2025-03-07 17:27:49 +08:00
hoshi-hiyouga
df63f05b47 [data] fix loader (#7207)
* fix dataloader

* add test case

* fix type

* fix ci

* fix ci

* fix ci

* disable overwrite cache in ci

Former-commit-id: 8c3f9f6747110107cbbb3695637482e45084dbc1
2025-03-07 17:20:46 +08:00
hoshi-hiyouga
98ea0e8109 [misc] fix ds config (#7205)
Former-commit-id: db113f690eb818894e3bc2bfc69cdfd27045d808
2025-03-07 15:21:28 +08:00
ZhangChuanhui
33b4c33279 [data] fix function formatter (#7201)
Co-authored-by: zhangchuanhui <zhangchal@digitalchina.com>
Former-commit-id: 194e3bddb25fa0bcc6d8349ce682b537a07a9a6a
2025-03-07 15:17:23 +08:00
hoshi-hiyouga
113cc3d920 [misc] fix cli (#7204)
Former-commit-id: bd1722355944a3618f44a4ac3e8177c34b97dc75
2025-03-07 15:01:18 +08:00
hoshi-hiyouga
b6c0e8608e [script] fix vllm version (#7193)
Former-commit-id: 313355759dc906d3612364dc6c8f6344afdedb97
2025-03-06 17:14:17 +08:00
hoshi-hiyouga
eba31ae313 [webui] support escape html (#7190)
Former-commit-id: abb23f767351098a926202ea4edc94d9e9a4681c
2025-03-06 16:52:21 +08:00
hoshi-hiyouga
e7556b591e [deps] upgrade vllm (#7183)
Former-commit-id: d739fddb10f51f422c12b1e7d5d7208309bf6c91
2025-03-06 15:25:08 +08:00
hoshi-hiyouga
2b21c749c1 [data] fix mm template (#7181)
Former-commit-id: be66df1f0211cd2d90eac3ab407dced653c9e443
2025-03-06 15:18:32 +08:00
hoshi-hiyouga
002f58ef8e [model] add QwQ 32b (#7179)
Former-commit-id: 64a6fb9b5056166265abc5acbddffb64cd8b5256
2025-03-06 11:58:36 +08:00
Ze-Yi LIN
c67d2b9327 [trainer] fix swanlab callback (#7176)
Former-commit-id: 8ad03258e16309158368384e2a0a707845536133
2025-03-06 00:33:37 +08:00
hoshi-hiyouga
6e58115f98 [trainer] update config (#7174)
Former-commit-id: b4b89b4ff3bc03aa388569e253d62580755a77a5
2025-03-05 23:32:54 +08:00
sirui.li
8dddffa340 [data] fix qwen2audio plugin (#7166)
* Update pairwise.py

[data]Repair multimodal model dpo training

* Update pairwise.py

[data]repair multimodal model dpo training using deepcopy

* Update pairwise.py

* Update mm_plugin.py

Former-commit-id: dff4130969bac9cb1abe66fd5dfada8c757c716f
2025-03-05 18:03:36 +08:00
hoshi-hiyouga
e1d574a784 [assets] update wechat (#7161)
Former-commit-id: 0c403ea15ba9e9878de3ea117c353b5a75192637
2025-03-05 14:11:10 +08:00
hoshi-hiyouga
caef0a8937 [data] use bicubic resampler (#7143)
Former-commit-id: bc298c60b7d3fdc4d116a79b535d7e9b11f4aa65
2025-03-04 00:17:06 +08:00
hoshi-hiyouga
392533e139 [webui] fix webui (#7142)
Former-commit-id: 17ba2d5082bcd6b4cdd5e50286776d256cc934a4
2025-03-04 00:01:49 +08:00
rabbit
299cd03785 [data] bailing template (#7117)
* add bailing template

* add bailing template

* add bailing template

---------

Co-authored-by: chengshiwen.csw@antgroup.com <chengshiwen.csw@antgroup.com>
Former-commit-id: 049ddf48afaa9f12d3e46d7ec63858607329e853
2025-03-03 15:33:22 +08:00
hoshi-hiyouga
ee1b580328 [inference] fix hf_engine (#7120)
Former-commit-id: 1036311826a61fed2346a261c8a060c355778318
2025-03-01 05:22:49 +08:00
hoshi-hiyouga
54a090079c [assets] update wechat (#7106)
Former-commit-id: d1863bbbaa7748e74eab34fc5a56de1102933530
2025-02-28 12:01:04 +08:00
Ze-Yi LIN
210cdb9557 [webui] display swanlab exp link (#7089)
* webui add swanlab link

* change callback name

* update

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 891c4875039e8e3b7d0de025ee61c4ff003ff0c4
2025-02-27 19:40:54 +08:00
leo-pony
e86cb8a4fa [npu] update cann base image and torch 2.4 (#7061)
* Update base npu container image version: the Python version required for Hugging Face Transformers is >= python3.10

* Fix the bug: arg type of INSTALL_DEEPSPEED should be a string now.

* Update Ascend CANN, CANN-Kernel and corresponding torch and torch-npu version

* Upgrade torch-npu needs packages' version: torch==2.1.0 and torch-npu==2.4.0.post2

Former-commit-id: acc52e0fe75d59c2581ef45346ebfbffe45f762b
2025-02-25 23:32:01 +08:00
hoshi-hiyouga
f4aa0a146c [misc] fix project toml (#7067)
Former-commit-id: 96fd510e6a03eae7a1f41772e1d6b784df6d5d2e
2025-02-25 23:22:48 +08:00
JieShen
96636c3729 [script] add seed args (#7058)
* add seed args

* add seed args

* update seed

Former-commit-id: e8266fe5635470e84f9d39f43e53cc49f962c2e9
2025-02-25 19:44:57 +08:00
Kingsley
81947f1d2c [model] add paligemma2-mix series (#7060)
Former-commit-id: 19861d5170bdcdf8c1c5d72289b29bff4b0d4c2c
2025-02-25 18:51:16 +08:00
hoshi-hiyouga
dca5fe14c2 [data] fix mllama (#7053)
* fix mllama

* fix test

Former-commit-id: 76314e6ad1ecaa44fcae4375dd0abf4ebaf1f924
2025-02-24 22:05:38 +08:00
hoshi-hiyouga
ca78ba964d [model] add models (#7054)
* add qwen25vl awq models

* add moonlight

Former-commit-id: ec1a1bc1184d13188029e19c1d4e7de68707aaf6
2025-02-24 22:05:13 +08:00
hoshi-hiyouga
9359ee18ad [assets] update readme (#7051)
Former-commit-id: fe6dd92c84cff34326f6f0ff0b648e43c372fe96
2025-02-24 20:45:06 +08:00
hoshi-hiyouga
15f3087b96 [assets] update wechat (#7019)
Former-commit-id: 1481af5dc9bc99807ae0ee5a438bf0a279cafb66
2025-02-20 20:32:33 +08:00
Zhangchi Feng
1fcedf9af6 [data] fix MiniCPMV plugin (#6998)
* fix template

* fix bug in messages processing

Former-commit-id: cde479e47a51beb60ab555cdee083c1cdba0ead6
2025-02-19 19:36:04 +08:00
hoshi-hiyouga
b0bbacaacb [webui] update css (#6985)
Former-commit-id: 302ecb00fef56d1ccc9203cb46f242841fefab47
2025-02-18 18:27:57 +08:00
hoshi-hiyouga
beb1a9f9d9 [data] add r1 distill dataset (#6983)
Former-commit-id: 2591a3fa8b37fed8011fb66b266ef15e18404756
2025-02-18 17:25:09 +08:00
hoshi-hiyouga
3fbd4848e8 [version] support transformers 449 (#6982)
* support transformers 449

* fix mm plugin

Former-commit-id: b00b290c07beb560a5af857ce64f4ce424831a2c
2025-02-18 17:05:40 +08:00
hoshi-hiyouga
184c5d0882 [misc] fix script (#6977)
Former-commit-id: cc8c7e762b9c873ef79529152465bbed9231053c
2025-02-18 17:00:46 +08:00
hoshi-hiyouga
1f4a0b11ba [data] update vlm args (#6976)
Former-commit-id: 3da2cc2710c9b13ab450815a92fff14b03251984
2025-02-18 02:12:51 +08:00
hoshi-hiyouga
b1d31ff0f9 [data] add min resolution option (#6975)
Former-commit-id: 7faecc0301709326efa21e7a3fdb75fe0a9635c2
2025-02-18 01:40:46 +08:00
hoshi-hiyouga
a8c9d5663d [data] fix predict dataset (#6972)
Former-commit-id: bdb581c4a82d02458766e73c87b7a92ea31796ec
2025-02-17 20:29:40 +08:00
hoshi-hiyouga
475a355b82 [assets] update wechat (#6963)
Former-commit-id: ad0c6c891651c2779d07d89dc2e345aa211f5c51
2025-02-17 15:23:17 +08:00
Zhangchi Feng
3dc938268c [data] fix minicpmo template (#6946)
Former-commit-id: 2faf8aeff897765df44707d5a42157dfdd6b9038
2025-02-15 00:37:41 +08:00
Eric Tang
e55ec42d3c [ray] specify ray storage path (#6920)
Former-commit-id: 6edd4992d700fec56800a638f1cac0f87990c581
2025-02-14 21:55:41 +08:00
hoshi-hiyouga
2baf8bf03d [misc] fix lora regex (#6944)
* fix lora regex

* fix

Former-commit-id: 1ada3ae5a3a14057341540c6d6ba985adf95f348
2025-02-14 21:38:43 +08:00
hoshi-hiyouga
13e1b7ee2b [misc] fix grad ckpt (#6931)
Former-commit-id: c31c63b41109e616997757ec2da6e0ab89ed3b6e
2025-02-13 23:27:51 +08:00
hoshi-hiyouga
cd493b91de [model] add liger kernel to qwen2_5 vl (#6930)
* add liger kernel to qwen2_5 vl

* fix patch

* fix patch

Former-commit-id: 797043d29cb85a8f90fabf48976908037f07000e
2025-02-13 23:05:54 +08:00
Billy Cao
48173b606c [trainer] fix gen_kwarg to eval during training (#5451)
* Correctly pass gen_kwarg to eval during model runs

* fix

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 11eac71c13cd432322b69ae74a3b8fa17af31bc4
2025-02-13 02:35:06 +08:00
SrWYG
0ad9f7f058 [data] evaluate on each dataset (#5522)
* [Update] loader.py: evaluate will run separate evaluations on each dataset.

`If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation`

Seq2SeqTrainer supports eval_dataset as a Dict.

* fix format

* fix

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 1e35967ae159038a66f3203dd0e6ec51eea9208f
2025-02-13 02:19:03 +08:00
Noah
1adb46875f [data] improve error handling (#6128)
* sync from upstream

* update

* update

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 4c7bfebcf1ed90800f5b0de4cf67b3036cb9dc13
2025-02-13 01:39:41 +08:00
hoshi-hiyouga
9b852ebe25 [misc] update readme (#6918)
Former-commit-id: 8956c93d9b05ee929fde2da0eb2365684c01f563
2025-02-13 01:01:41 +08:00
hoshi-hiyouga
07aa7b71a3 [misc] update readme (#6917)
Former-commit-id: 499ea45d1f1ea7704ee82f58c35af123a6c2632b
2025-02-13 00:58:10 +08:00
hoshi-hiyouga
1679930e00 [breaking change] refactor data pipeline (#6901)
* refactor data

* rename file

Former-commit-id: 617c8ab467d32be5f7d5c94fa89c0e3d7d1963bc
2025-02-13 00:39:20 +08:00
Eric Tang
d50e04b805 [misc] support for launching LLaMA-Factory with uv run (#6907)
* yay

* uv with ray temporary commit

* remove ray specific code for now

* cleanup

Former-commit-id: f8a206125d924230740f9f98c72bf70fd77bdc32
2025-02-13 00:38:44 +08:00
Eric Tang
e515fe62de [example] fix path to ray example (#6906)
Former-commit-id: ee5fe216dc73a458431e1b689787a8ddabf8de17
2025-02-13 00:29:32 +08:00
hoshi-hiyouga
036fb0d561 [misc] fix grad ckpt func (#6916)
Former-commit-id: e34c3c06da706f80c74c20800f19110e9ad6b82a
2025-02-13 00:17:18 +08:00
marko1616
bae934dea3 [trainer] fix llama3.2 vision kto train (#6904)
Former-commit-id: b7fd1e9c00c77a4c2a0f2f347767d22bd47213f1
2025-02-12 19:09:14 +08:00
hoshi-hiyouga
2e2f6bea07 [data] feat: auto template (#6905)
* support auto template

* add unittest

Former-commit-id: 2f8b6847f5e199d770e91346dfe205c4b9f1fbb7
2025-02-12 00:22:53 +08:00
hoshi-hiyouga
1b02183da9 [misc] update readme (#6903)
Former-commit-id: 18179a3823b6222973e7c8157d21ed88fdb8c550
2025-02-11 22:51:26 +08:00
hoshi-hiyouga
197aa3baf4 [data] fix ollama template (#6902)
* fix ollama template

* add meta info

* use half precision

Former-commit-id: e1a7c1242cd1e0a1ca9ee7d04377a53872488126
2025-02-11 22:43:09 +08:00
hoshi-hiyouga
c6be9e242c [misc] support export ollama modelfile (#6899)
* support export ollama modelfile

* update config

* add system and num ctx

Former-commit-id: 9184a6e0ed7ff5f632c848f861bfa448c4cd06fc
2025-02-11 19:52:25 +08:00
hoshi-hiyouga
2e954d8fd2 [data] refactor template (#6896)
Former-commit-id: d1b8aa3835f6e3b2e63cf06e6cadbe760d46f9aa
2025-02-11 17:59:25 +08:00
codingma
fafa3add84 support ollama modelfile export (#4686)
Former-commit-id: 7f354b80bc9d1336e8bea5374925a7c2ec7a315c
2025-02-11 17:52:24 +08:00
hoshi-hiyouga
593acca556 [data] refactor mm plugin (#6895)
* refactor plugin

* lint

Former-commit-id: aca63bfcca02ecd95b57cd8949a50e26a913f716
2025-02-11 16:34:49 +08:00
HJ
188f22d8a7 [data] fix qwen_2_5_vl video processing (#6868)
* fix qwen_2_5_vl video processing

* Update mm_plugin.py

* Update mm_plugin.py

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 9153a7bd832cdae84b63a4d7d1f2b12239e84b61
2025-02-11 16:14:50 +08:00
hoshi-hiyouga
703bb9cc18 [assets] update wechat (#6892)
Former-commit-id: fc5d47401f61a5c95d7a4ab7f696450cfa7b70e3
2025-02-11 13:56:26 +08:00
Zhangchi Feng
5433b318bb [data] fix minicpmv plugin (#6890)
* fix template name

* tiny fix

* support minicpm-o-2.6

* support inference of minicpmv

* update readme

* support dpo of minicpmv

* update init audio

* update init audio

* [model] fix image process in minicpmo

* fix no mm inputs

Former-commit-id: 764627645abcd353f9130d5dd8c584810b0e0b1b
2025-02-11 13:30:44 +08:00
HJ
fe4f4e9758 [data] fix: sharegpt converter (#6879)
* fix-sharegpt-format

* fix

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 0fb44cb3a5499c8da79e73004adc9d16f792b4b3
2025-02-10 21:59:12 +08:00
hoshi-hiyouga
1bb3d17d9e [data] fix mllama collator (#6874)
Former-commit-id: b68199db274a53d5916179e1aaf9722fd94fa2dc
2025-02-09 22:42:25 +08:00
hoshi-hiyouga
b93333685b [test] align test cases (#6865)
* align test cases

* fix function formatter

Former-commit-id: f6f3f8d0fc79de6bbad0bf892fc2f6c98c27eb8e
2025-02-09 01:03:49 +08:00
hoshi-hiyouga
fcd0f0480d [dataset] add openthought (#6866)
Former-commit-id: 1356f9d8400efaccf677d0b36aaf32a146a09833
2025-02-09 00:53:01 +08:00
hoshi-hiyouga
ff6658ad27 [deps] upgrade vllm (#6857)
Former-commit-id: 5f38bcaba921dbdee27b4be4709fcec06fa37c9e
2025-02-08 15:02:28 +08:00
hoshi-hiyouga
28037c7834 fix qwen2vl plugin (#6855)
Former-commit-id: 40048ab77a8b25a91a844800f0f1e880b84548cd
2025-02-08 10:59:10 +08:00
hoshi-hiyouga
f70208e1c0 [misc] allow extra args (#6831)
Former-commit-id: 74ade3a176cad753971aaad681fea6ff8df40914
2025-02-06 12:38:08 +08:00
hoshi-hiyouga
7aa9767dc2 [assets] update wechat (#6830)
Former-commit-id: 6dad5369681beff089832f3eb2bc329f57a79a08
2025-02-06 12:02:05 +08:00
Zhangchi Feng
01915eaf40 [model] support audio (#6701)
* support qwen2_audio

* improve code

* lint

* fix

* fix

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 24c78429489809873a1269a735ea5421340b32a2
2025-02-05 04:59:09 +08:00
Yueqi Song
e665e1fed5 [data] allow thought in function call (#6797)
* Update template.py

* Update template.py

* use formatter

* fix regex

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: a5e943f7bcea6e5840da8570055bf3079a49ae8c
2025-02-05 02:26:23 +08:00
hoshi-hiyouga
1fee69f874 [misc] update license year & fix llama pro (#6814)
* fix llamapro script

* change year

Former-commit-id: e2dc5b952aa22835d5220ba624f44676138b65ac
2025-02-05 01:53:33 +08:00
Yueqi Song
8504bde893 [data] fix qwen tool template (#6796)
* Update tool_utils.py

* fix unittest

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: dd6b7d203eedbf09458c64654e8d97fec85f08d7
2025-02-05 00:02:00 +08:00
Zhangchi Feng
85f22d01bf [data] fix minicpmv plugin (#6801)
* fix template name

* tiny fix

* support minicpm-o-2.6

* support inference of minicpmv

* update readme

* support dpo of minicpmv

* update init audio

* update init audio

* [model] fix image process in minicpmo

Former-commit-id: ab9bd068efee861452407cdda08ef014d5ce23d5
2025-02-04 21:20:15 +08:00
hoshi-hiyouga
822d5d362c [assets] update wechat (#6810)
Former-commit-id: 069a477d1633d65270b5e9c3fbb263735cef8cea
2025-02-04 21:17:40 +08:00
neavo
32163e7ce0 [readme] update flash attention installation instruction on win platform (#6788)
* Update README_zh.md

* Update README.md

Former-commit-id: a417bcf8d9d70057d12f9152e4091c090133e025
2025-02-01 12:43:29 +08:00
hoshi-hiyouga
454140d912 [misc] update workflows (#6787)
Former-commit-id: b5fda21288183a5a5eb68cc267fda4270159595b
2025-02-01 04:54:42 +08:00
hoshi-hiyouga
445d643ef3 [model] add mistral small models (#6786)
Former-commit-id: 94803d8133fbbadff6d224cb6695feb5434fd4fd
2025-02-01 04:31:38 +08:00
hoshi-hiyouga
e8c1979b79 [model] add qwen2.5 vl models (#6779)
Former-commit-id: 999c7c8fe0caf6b837a1bdc2c6a24fafec327cd8
2025-01-31 03:00:29 +08:00
hoshi-hiyouga
f6779b0e0c [breaking] support transformers 4.48 (#6628)
Former-commit-id: 15357cdad953bba1f2d294819f56b9746ed1b891
2025-01-31 01:36:33 +08:00
hoshi-hiyouga
245de012ca [webui] improve webui & reasoning mode (#6778)
Former-commit-id: 45e68b9f092879dda55023ebbcd8cf4660e3045a
2025-01-31 00:09:21 +08:00
codingma
f143360ee6 [assets] update wechat (#6771)
Former-commit-id: 4fb6059f486e96287764114e1563fd4235b9f1d8
2025-01-29 12:31:24 +08:00
qvlehao
f5350b103b [model] add deepseek-R1 & show think process (#6767)
Former-commit-id: 28417f862a1947a24663150ca55f421198b6d8eb
2025-01-29 12:16:26 +08:00
yinpu
aa7c07caf0 fix: avoid redundant normalization in DPO's SFT loss calculation (#6722)
Former-commit-id: 0f45982bac6b65533a94054ea5f792cb0f9e5a1f
2025-01-21 13:38:02 +08:00
engchina
324f07613a [webui] support ja (#6698)
* add support for japanese language

* add support for japanese language

---------

Co-authored-by: engchina <atjapan2015@gmail.com>
Former-commit-id: de9bc3fefa4fcb5db7d04589b16282a078c62cb2
2025-01-20 19:46:38 +08:00
hoshi-hiyouga
0c59483368 [assets] update wechat (#6710)
Former-commit-id: 3962645ac0f7e9f2dbc503e5fbc9434dd959316c
2025-01-20 16:29:24 +08:00
hoshi-hiyouga
1efe525df7 [model] support yarn (#6693)
Former-commit-id: 1f47b6186c267de86cbdbd47ba2adbf1f9db7f39
2025-01-18 13:56:09 +08:00
hoshi-hiyouga
ee0b3b1e1a [assets] update wechat (#6692)
Former-commit-id: 17b470630df08786e958192287edc8199280a4b2
2025-01-18 12:35:03 +08:00
hoshi-hiyouga
f87c788154 [misc] update mm plugin (#6691)
Former-commit-id: c0caa7afc60ed3015fe6c263ba3566202ba934f1
2025-01-17 23:04:26 +08:00
hoshi-hiyouga
bbf334f823 disable valset by default (#6690)
Former-commit-id: 77bbf659053e1b205974eb6df69998fee0305d26
2025-01-17 21:09:30 +08:00
hoshi-hiyouga
770433fa33 [webui] upgrade to gradio 5 (#6688)
Former-commit-id: 4d0f662dbe227ab0da11a1e109f7a2c5ab8f70b9
2025-01-17 20:15:42 +08:00
hoshi-hiyouga
788accb601 fix qwen2 moe (#6684)
Former-commit-id: 7bf09abf1c4d971cda33daed933c75f391e79294
2025-01-17 13:46:09 +08:00
Zhangchi Feng
555f17c1ee [data] Fix minicpmv/o dpo training (#6657)
* fix template name

* tiny fix

* support minicpm-o-2.6

* support inference of minicpmv

* update readme

* support dpo of minicpmv

Former-commit-id: 027942789bf3a28b2506a5730c05c8392ef5c885
2025-01-15 17:30:37 +08:00
steveepreston
8895cf1152 Update val_size english description (#6653)
* Update `val_size` Description in locales.py

* Update `val_size` Description in data_args.py

* Remove extra space in data_args.py

Former-commit-id: 76675b654e243c14b260adbfe04f619e4f2bf177
2025-01-15 16:00:20 +08:00
hoshi-hiyouga
320e40d873 update readme (#6648)
Former-commit-id: 563be2286a756fcd5d41b351beb8e1aa4e95842b
2025-01-15 11:06:19 +08:00
hoshi-hiyouga
9ef85f8fc4 [optim] clean apollo (#6645)
* clean apollo code

* update readme

Former-commit-id: 7a04021d0461caea2c7b82169839340b7f51f463
2025-01-15 01:42:50 +08:00
zhuHQ
763f9b9df0 [optim] add support to APOLLO (#6617)
Former-commit-id: d9189f9f0b23ff6929044919208e0e813ca95b1c
2025-01-15 00:24:56 +08:00
Zhangchi Feng
57043fb4e6 update readme of MiniCPM-o (#6642)
* fix template name

* tiny fix

* support minicpm-o-2.6

* support inference of minicpmv

* update readme

Former-commit-id: 9b7ba093c7e017ea18a4562550d5d2e82c4a0161
2025-01-14 21:22:35 +08:00
hoshi-hiyouga
91433d639c lint (#6641)
Former-commit-id: 1278c3e92eeb297e883aab89e2384c1df1d0e910
2025-01-14 18:40:07 +08:00
Haian Huang(深度眸)
864ee06243 Support InternLM3 Dense 8B Model (#6640)
* support internlm3

* update

* update

* update

* add hint

Former-commit-id: deacc00b1226ca3d53bf7bb1231cf276eaa8296b
2025-01-14 18:07:27 +08:00
Xiaosu Zhu
a52496cc09 Fix tokenizer max length (#6632)
Former-commit-id: 58d029f3212dba1808e63cc8875022f6d741bd63
2025-01-14 17:35:54 +08:00
Zhangchi Feng
ad119afc58 Support Inference of MiniCPM-V-2.6 and MiniCPM-o-2.6 (#6631)
* fix template name

* tiny fix

* support minicpm-o-2.6

* support inference of minicpmv

Former-commit-id: 158a127d340d5e4ca23263ffad042f861fd77deb
2025-01-14 17:34:58 +08:00
hoshi-hiyouga
8f73c75c16 [model] fix mllama any image (#6637)
* fix mllama any image

* reorder classes

Former-commit-id: 98189c8e4d70bf5f8ee83852a023ed27dfc96900
2025-01-14 16:47:58 +08:00
hoshi-hiyouga
5e699458e5 pin vllm version to 0.6.5 (#6629)
Former-commit-id: 1c7663d3049e00a9148c3e3c58204deca7a08c8d
2025-01-14 02:44:02 +08:00
Zhangchi Feng
201a495154 Support new features of MiniCPM-V (#6626)
* fix template name

* tiny fix

* support minicpm-o-2.6

Former-commit-id: c3fda5046d835ba4542d525b8d89cd12838e9f4c
2025-01-14 00:26:19 +08:00
hoshi-hiyouga
d8cba9464f [inference] fix stop token for object detection (#6624)
* fix stop token

* update minicpm data pipeline

* fix npu qlora examples

Former-commit-id: e3e2c8c689c54ebb2af264de808502e5a8ba0f2b
2025-01-13 21:34:20 +08:00
codingma
089c7d5e51 add nf4 qlora support on Ascend NPU (#6601)
* add nf4 qlora support on Ascend NPU

* add transformers version check

* add python>=3.10 requirement description for npu

* tiny fix

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 03de5ac912336190d6b3583f70b6340ab9cf9cdf
2025-01-13 19:43:36 +08:00
Zhangchi Feng
15bba15725 Fix template name of MiniCPM-V (#6620)
* fix template name

* tiny fix

Former-commit-id: 3077f20339158564009270edf79c8ef1b10e8b4a
2025-01-13 16:46:48 +08:00
hoshi-hiyouga
0b47c2a293 Merge pull request #6598 from BUAADreamer/minicpmv
[model] Support MiniCPM-V

Former-commit-id: 6eec50c74dcbcc325ad6258228e19c19b4a03538
2025-01-13 15:24:02 +08:00
fzc8578
313ce9a576 remove tests
Former-commit-id: a019cece8009b0ba8a6b5a309ed5abfe6cb88a75
2025-01-13 15:08:35 +08:00
fzc8578
ee87d318b8 fix tests
Former-commit-id: c2fa4cc7b114ac1a376882022e4b6ef75d288dca
2025-01-13 15:01:39 +08:00
fzc8578
4741eec2d1 fix style
Former-commit-id: 0cc7260a93bf7c65451e376245aa143f9237d7d8
2025-01-13 14:19:38 +08:00
fzc8578
d2afe0c63c fix system prompt and tests
Former-commit-id: cfaa8e4890ad99ec1fb90d9550503d734b5c30b7
2025-01-13 14:18:06 +08:00
fzc8578
bdded9d41a add some
Former-commit-id: 01e9cfd406dc21f387b4f2baa1d61195a841ccb5
2025-01-11 15:03:20 +08:00
fzc8578
8c79fe6a5a add cpm_o test
Former-commit-id: 10073319b4215be900744a28a61bd442e70143cc
2025-01-11 11:55:30 +08:00
fzc8578
63bb2b7235 add cpm_o test
Former-commit-id: c506f763dff1c1d2c85ac8fe6beb9f40ca4fcde9
2025-01-11 11:49:03 +08:00
fzc8578
e7f928adc4 fix format
Former-commit-id: 7b44f3127ef7e91a6bedca0311feb14974914ddf
2025-01-11 01:27:40 +08:00
fzc8578
62c12a133e add some
Former-commit-id: a650e114e907278ece188922467c2514de544eeb
2025-01-11 01:10:24 +08:00
fzc8578
08e8499a98 adapt to new mllm_param
Former-commit-id: 291384dea8a5c10f0358a30d124eaf85557548eb
2025-01-11 00:16:34 +08:00
Zhangchi Feng
d5b18ee4a6 Merge branch 'main' into minicpmv
Former-commit-id: ed0895a9c13b0ea8a5cace6b060f01d9771816ad
2025-01-11 00:01:36 +08:00
hoshi-hiyouga
93cc1f167b Merge pull request #6600 from hiyouga/hiyouga/refactor_mllm_param
[model] refactor mllm param logic

Former-commit-id: 382e932228d1bcfcdee0a25ee3f1977226f1c433
2025-01-10 23:53:37 +08:00
hiyouga
c89d17ab63 refactor mllm param logic
Former-commit-id: f6f630a1c96514053176abb12e35a06242e62abd
2025-01-10 15:45:48 +00:00
fzc8578
9213e48fa2 add minicpmv2.6
Former-commit-id: e45329e7456b647d5684b1f9428641ad18af92d1
2025-01-10 23:45:44 +08:00
fzc8578
0fb50f9c88 add some
Former-commit-id: 771cc802941cf1953b32e5102c817c6a3090b5ce
2025-01-10 23:29:06 +08:00
fzc8578
bcbe37ff52 add some
Former-commit-id: ae1f528df31194fe37a123ba1e5a4cd263a61602
2025-01-10 21:25:32 +08:00
fzc8578
994049380d fix some
Former-commit-id: 15bbcdf8d3265f4154d3937719da5e54a5963355
2025-01-10 20:55:52 +08:00
fzc8578
cc6a6f698f fix version
Former-commit-id: d09032049c1f24336a1899908bf47a98e77b3211
2025-01-10 20:31:04 +08:00
fzc8578
7138b43873 fix some
Former-commit-id: 2ee8ba2f390551af1b865cfa813f5c8b7bbb41c5
2025-01-10 20:27:06 +08:00
fzc8578
aeb4f82ef2 tiny fix
Former-commit-id: 84026be06e34239a828a0cc8b1706084afcfa4ea
2025-01-10 20:15:39 +08:00
Zhangchi Feng
f51ac40f0a Merge branch 'main' into minicpmv
Former-commit-id: fc045d7dd871985d621430b5662cba882188a59c
2025-01-10 20:12:07 +08:00
fzc8578
165fe8e219 add some
Former-commit-id: 096a6cb67a7dfd14a6e339d96baab78c12d36a87
2025-01-10 20:01:22 +08:00
hoshi-hiyouga
4243c618f0 Merge pull request #6597 from hiyouga/hiyouga/upd_wechat
[assets] update wechat

Former-commit-id: b308ddf0971606f0f8f39e26f5711852abad3e79
2025-01-10 18:41:47 +08:00
hiyouga
368d22f79a update wechat
Former-commit-id: 70ed03b288c1853f262e47b06e8601eaf49ccc1b
2025-01-10 10:40:25 +00:00
hoshi-hiyouga
b3561ae552 Merge pull request #6588 from hiyouga/hiyouga/upd_issue_temp
[gh] update issue template

Former-commit-id: 5ffd8ad192bb3932fbe230757d4bf1c907ca3aa4
2025-01-10 03:03:48 +08:00
hiyouga
b395540826 update issue template
Former-commit-id: aa8d0a223b0345e1f665b6703678c0ce526ff950
2025-01-09 18:58:53 +00:00
hoshi-hiyouga
a1b5644889 Merge pull request #6585 from hiyouga/hiyouga/add_phi4
[model] add phi4 model

Former-commit-id: 8b209cb49d9cc6058ce61c97bf2216f6371c5f7c
2025-01-10 02:39:17 +08:00
hiyouga
b471def13d improve template, add phi4 model
Former-commit-id: ae16ea755d581a5a288fb55f12481215f369b255
2025-01-09 18:27:54 +00:00
hoshi-hiyouga
b777fed171 Merge pull request #6564 from stephen-nju/fix_ray
Fix ray

Former-commit-id: 6b34b69fa688c4622489d3d5f33d847fb6b95528
2025-01-08 18:14:18 +08:00
hoshi-hiyouga
618ceda6e9 Merge pull request #6565 from hiyouga/hiyouga/improve_log
[misc] improve log

Former-commit-id: 18431527bac8da57d9a2fc014695e5891f7a3068
2025-01-08 18:08:21 +08:00
zhubin
014a7ea042 fix: get ray args when args not a dict
Former-commit-id: 9c4c84828b77acf48caf60726e4e7ef3e972118d
2025-01-08 10:06:02 +00:00
hiyouga
da542fad18 improve log
Former-commit-id: 47e17dd689840ca9b3c5f34448e5f80265336cca
2025-01-08 09:56:10 +00:00
hoshi-hiyouga
984b202f83 Merge pull request #6542 from erictang000/et/ray-integration
Ray Train integration with LLaMA-Factory

Former-commit-id: d23a98825bcb569bc51e21a3c2236eccd2f6d2fd
2025-01-08 11:46:03 +08:00
hiyouga
0c1ad5f3fb fix llamaboard with ray
Former-commit-id: c46675d5e56d175c27d705ef0068fb47dc89a872
2025-01-07 09:59:24 +00:00
hiyouga
b4174021d6 refactor ray integration, support save ckpt
Former-commit-id: d8cac6f54663e6cffeddf2c65e3da454e7b86a75
2025-01-07 09:39:10 +00:00
Eric Tang
bba52e258e run style check
Former-commit-id: 1e8e7be0a535e55888f58bbe2c38bc1c382e9012
2025-01-07 08:55:44 +00:00
Kourosh Hakhamaneshi
1217240918 drafting ray integration
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>

Former-commit-id: 163ddb680b6f84a4424a887a3b8a5d668044e87c
2025-01-07 08:55:44 +00:00
hoshi-hiyouga
a0bcac80c0 Merge pull request #6547 from hiyouga/hiyouga/fix_pixtral_dpo
[trainer] fix pixtral dpo

Former-commit-id: c973f32849b979a3ebb80caa01029b43fbb620ac
2025-01-07 14:38:55 +08:00
hiyouga
8c57169eb7 fix #6546
Former-commit-id: 870f23d7eaff1e32a73fee4eb972163c85ba7b67
2025-01-07 06:30:44 +00:00
fzc8578
b9eeaa9706 add some
Former-commit-id: 785cc70ff205f5962c3ca67f453589e4a471ba8c
2025-01-06 19:32:39 +08:00
hoshi-hiyouga
621d73e87c Merge pull request #6528 from hiyouga/hiyouga/upd_wechat
[assets] update wechat

Former-commit-id: b832ed9a60a5fd0bc7d9f975bb881a71e7d35245
2025-01-04 16:01:21 +08:00
hiyouga
a02a140840 update wechat
Former-commit-id: cd1433650653810f7934c65cb1de91052eb73dcf
2025-01-04 07:59:57 +00:00
Zhangchi Feng
a0188a430f Merge branch 'hiyouga:main' into minicpmv
Former-commit-id: ab87bd6b1398b379b1a7a95f01a6539743b9db2d
2025-01-04 11:20:33 +08:00
fzc8578
b5ef5059ee add some
Former-commit-id: 79c2d7090cbf364063ea3608814ab18aa27fdc87
2025-01-04 11:11:15 +08:00
hoshi-hiyouga
084d356c2c Merge pull request #6524 from hiyouga/hiyouga/upd_scripts
[misc] update scripts

Former-commit-id: e6d603ac374c04df354361f9617173afa8c1edae
2025-01-03 23:52:26 +08:00
hiyouga
20a9565e36 update scripts
Former-commit-id: dd44c65d7f60cb6f5d0e0d8ee5f4e7643defb89b
2025-01-03 10:50:32 +00:00
hoshi-hiyouga
85317bcbaf Merge pull request #6515 from hiyouga/hiyouga/misc
[misc] update model name

Former-commit-id: 51ef90ce0ace4a45f9c01ba7e674adf5e3c92baa
2025-01-02 20:20:02 +08:00
hiyouga
528fb4f799 update model name
Former-commit-id: 4b8add728729d8e2ce4c9a3dc6748357291d8e8b
2025-01-02 12:19:21 +00:00
hoshi-hiyouga
aa7ec44367 Merge pull request #6514 from hiyouga/hiyouga/add_project
[readme] add project

Former-commit-id: a766cad5d49f226eb61a550bc3d157870c1068cc
2025-01-02 20:16:15 +08:00
hoshi-hiyouga
b2ecb80729 Merge pull request #6513 from hiyouga/hiyouga/add_gpt2
[model] add gpt2 model

Former-commit-id: 29ddc6b77862f740570a00d3b8ea548ee1a2ce03
2025-01-02 20:15:55 +08:00
hiyouga
9a3afbd5d1 add project
Former-commit-id: b3e1137fbbdfa4cc081903983fea36acff7afd75
2025-01-02 12:15:41 +00:00
hiyouga
37c60c7d14 add gpt2 model
Former-commit-id: 67442bd497c75b0c5990d94a880e0e25474ae2fa
2025-01-02 12:07:38 +00:00
hoshi-hiyouga
b921dde749 Merge pull request #6512 from hiyouga/hiyouga/fix_gen_logic
[trainer] fix generate logic

Former-commit-id: 72d86ecc9e327933a0a2c893b8ffd2740c99be6b
2025-01-02 19:36:54 +08:00
hoshi-hiyouga
d195329185 Merge pull request #6462 from shibingli/main
Add ARG HTTP_PROXY in Dockerfile to support HTTP proxy during image building

Former-commit-id: 8741e5b3e87a392a3c9d50455e4916c3a938fb24
2025-01-02 19:34:17 +08:00
hiyouga
da8721a70e fix #6499
Former-commit-id: 1800f8c72dfa618c71c84a3a18ecdef4d82754f7
2025-01-02 11:28:54 +00:00
hoshi-hiyouga
f318dc9464 Merge pull request #6493 from hiyouga/hiyouga/upd_wechat
[assets] update wechat

Former-commit-id: f8e80d566f7666b6af00360df97065698a1d3a9f
2024-12-30 21:55:03 +08:00
hiyouga
01bbe66f41 update wechat
Former-commit-id: a400d896a18e317acdbd3c79282c81b50cc2c54d
2024-12-30 13:54:22 +00:00
hoshi-hiyouga
bb664d2fc5 Merge pull request #6492 from hiyouga/hiyouga/add_deepseek3
[model] add deepseek3 model

Former-commit-id: 2382a5f0317d768ba8f4931977f5caed6057b3c0
2024-12-30 21:50:13 +08:00
hiyouga
d0e729cd33 add deepseek3 model
Former-commit-id: e67b9dcc3ad0c003bc3afd7601ecd2adfbf9666b
2024-12-30 13:39:20 +00:00
hoshi-hiyouga
1178cb0e33 Merge pull request #5507 from piamo/main
Add deepseek-v2.5 template

Former-commit-id: 91467ed313802ac3950c2e11a7d0997a36bcbddd
2024-12-30 21:08:25 +08:00
hoshi-hiyouga
089f824cd1 Merge pull request #6483 from hiyouga/hiyouga/fix_paligemma_infer
[model] update vllm & fix paligemma dtype

Former-commit-id: 40805b0cc0cff478703f68067a330ba307bb5809
2024-12-30 16:34:32 +08:00
hiyouga
813f5919a3 fix #6482
Former-commit-id: 6f5bb3b8e5b6eb7fdfd7b0ca8eba789ab741a7b6
2024-12-30 06:03:07 +00:00
hoshi-hiyouga
951d845af2 Merge pull request #6465 from hiyouga/hiyouga/fix_eval_loss
[trainer] fix eval loss

Former-commit-id: b55890291b0049dd90ef4d1d0bf0ba1efb1e4f0a
2024-12-28 01:02:56 +08:00
hiyouga
3bcb4633ca fix #6448
Former-commit-id: 27198679829fb766c7eef468ae4311fdced695a2
2024-12-27 16:54:39 +00:00
shibingli@yeah.net
c76c33ddb1 Add ARG HTTP_PROXY in Dockerfile to support HTTP proxy during image building.
Former-commit-id: f1d76786e094562f6f095a0b56c9c6cd32e2fa5e
2024-12-27 18:31:14 +08:00
shibingli@yeah.net
a37ef0eaae Add ARG HTTP_PROXY in Dockerfile to support HTTP proxy during image building. This commit introduces an ARG parameter named HTTP_PROXY in the Dockerfile. This addition allows for the configuration of an HTTP proxy, facilitating image building in environments with network restrictions.
Former-commit-id: a3a49b1ea477313c979a1649ee6a7f843fe36469
2024-12-27 18:17:17 +08:00
hoshi-hiyouga
377dfe5665 Merge pull request #6457 from youkaichao/module-run
[misc] enable module run

Former-commit-id: f68074d87bcc915a49a8765b3ebb32d935aa5445
2024-12-26 23:41:37 +08:00
youkaichao
f6d5dd6f10 Update cli.py
Former-commit-id: c39d81cd1d108d832746e100ac890b2d4ecaa60e
2024-12-26 23:22:09 +08:00
hoshi-hiyouga
a36f9d923e Merge pull request #6443 from hiyouga/hiyouga/add_qvq
[model] add qvq

Former-commit-id: cd56f88ff2c5c3edc381f3807f466621cee86b67
2024-12-25 15:53:19 +08:00
hiyouga
c83b74ab9e add qvq #6439
Former-commit-id: ee0e400f417f648cd15cf48144df76e4809cc615
2024-12-25 07:52:41 +00:00
hoshi-hiyouga
c5780f5eaa Merge pull request #6430 from hiyouga/hiyouga/upd_wechat
[assets] update wechat

Former-commit-id: cbd494ddaf692faf83d4825fe4b4595430b111f5
2024-12-24 16:13:20 +08:00
hiyouga
4cd1d05429 update wechat
Former-commit-id: 83202c9027222b83c949d1fe1bff1317f5715015
2024-12-24 08:12:53 +00:00
hoshi-hiyouga
459219a260 Merge pull request #6426 from hiyouga/hiyouga/update_readme
[assets] update readme

Former-commit-id: b9f73fc5caf5753bd5b96de5383eaf80cd958e3d
2024-12-23 22:17:19 +08:00
hiyouga
353259f03f update readme
Former-commit-id: 8fd38d273e5bc3b28a4741b230010fece87e7070
2024-12-23 14:08:59 +00:00
hoshi-hiyouga
8265d6a228 Merge pull request #5922 from Tuyohai/main
support granite3 models

Former-commit-id: c23a4d0658323434c386716c25855711202e37a9
2024-12-23 16:46:02 +08:00
hoshi-hiyouga
c0418062c0 Merge pull request #6418 from hiyouga/hiyouga/add_report
[trainer] add custom args to experimental logger

Former-commit-id: d58746eca203d97ec57abbc312ecf4c00b5d5535
2024-12-22 05:47:55 +08:00
hiyouga
47c2d91933 support report custom args
Former-commit-id: 5111cac6f8e7b77ef1ca1ff967734cfe1d6785f4
2024-12-21 21:42:45 +00:00
hiyouga
f07bad7144 fix paligemma infer
Former-commit-id: 84cd1188ac03c165e1a626db297936c2458627d6
2024-12-21 20:24:32 +00:00
hoshi-hiyouga
9d437a5f4f Merge pull request #6416 from Zeyi-Lin/main
docs: use swanlab
Former-commit-id: a2ad0738a22f71af453a7f266c350ff7662bf67c
2024-12-22 04:08:26 +08:00
ZeYi Lin
1c1d6bea43 docs: use swanlab
Former-commit-id: 744ef8c2688efad82028e22683e6c9d874af6823
2024-12-21 20:59:25 +08:00
hoshi-hiyouga
547f76e56e Merge pull request #6401 from Zeyi-Lin/hiyouga/swanlab
feat: add swanlab for experiment tracking and visualization.
Former-commit-id: 947e22a4a30d8eb7b612da53bbf538ead7dd27b7
2024-12-21 14:09:33 +08:00
ZeYi Lin
67d4757c35 fix: project blank
Former-commit-id: 82e5d75014ffe5fbe762711adecf59c94ab29f59
2024-12-20 18:26:02 +08:00
ZeYi Lin
cc703b58f5 fix: by hiyouga suggestion
Former-commit-id: 3a7ea2048a41eafc41fdca944e142f5a0f35a5b3
2024-12-20 16:43:03 +08:00
ZeYi Lin
8f786ee938 feat: ui improve
Former-commit-id: 5f6dafd70e962b8fe9a294d555133002135f80df
2024-12-20 11:03:02 +08:00
ZeYi Lin
03dba638e6 fix: text
Former-commit-id: 0a52962db365e7456c858a8e58c19313f19d1e09
2024-12-19 21:26:02 +08:00
ZeYi Lin
dd22454fc5 fix: bugs
Former-commit-id: d0eb64d5e3472a166c9adac4cb4ba06bdd663e46
2024-12-19 21:08:16 +08:00
hoshi-hiyouga
904f18b4a2 Merge pull request #6395 from hiyouga/hiyouga/fix_genkwargs
[generate] fix generate kwargs

Former-commit-id: c6e3c14a93490df6e7f8099e09d62d602a667542
2024-12-19 20:24:17 +08:00
ZeYi Lin
b512a06c3d docs: config framework
Former-commit-id: 7eb49e5ffaea59d8a2756ae7ff55bd57b9077f4b
2024-12-19 20:22:36 +08:00
ZeYi Lin
c31933ef9e fix: string
Former-commit-id: 330691962960fdd2053236e43a919e8f15e2bf27
2024-12-19 20:18:59 +08:00
hiyouga
8524dcaa4a fix #6391
Former-commit-id: d4c1fda1ad19e73484d8d51d81e490cdb8781955
2024-12-19 12:16:38 +00:00
ZeYi Lin
53103f55b6 feat: optimize frontend
Former-commit-id: 8c2df41b937f491f7ebf593b20c65a19738c7642
2024-12-19 19:04:19 +08:00
ZeYi Lin
cc5cde734b feat: swanlab params
Former-commit-id: d5cf87990e5bea920ecd1561def09fa17cf328b1
2024-12-19 18:47:27 +08:00
hoshi-hiyouga
af9ef037dd Merge pull request #6388 from hiyouga/hiyouga/shuffle_control
[trainer] support disable shuffling

Former-commit-id: ffbb4dbdb09ba799af1800c78b2e9d669bccd24b
2024-12-19 17:00:12 +08:00
hiyouga
95d3c2620b support disable shuffling
Former-commit-id: c7cedc7569973a2879c689637b2923e8b26f1a81
2024-12-19 08:53:21 +00:00
hiyouga
1a48340680 add swanlab
Former-commit-id: 96f8f103e58a8ff307b0ce36c967de04f452434a
2024-12-19 07:12:31 +00:00
hoshi-hiyouga
d6ce1045f7 Merge pull request #6384 from hiyouga/hiyouga/fix_webui
[webui] fix webui args

Former-commit-id: 6ccd64ecd9089d8694decde3186836bd8e5df511
2024-12-19 14:57:52 +08:00
hiyouga
92a0d08e27 fix webui
Former-commit-id: 369cca8110e6923ad9978b6b93928a3bcb5c6f30
2024-12-19 06:48:03 +00:00
hoshi-hiyouga
910884065e Merge pull request #6379 from hiyouga/hiyouga/add_paligemma2
[model] add paligemma2

Former-commit-id: 933647e6806428a608c2f0fd90b8ea1ea84cdd89
2024-12-18 17:03:11 +08:00
hiyouga
433d116080 add paligemma2
Former-commit-id: d3509050dc4d3105a6e62acc9a1ba481269279a2
2024-12-18 08:57:26 +00:00
hoshi-hiyouga
d43080b534 Merge pull request #6313 from ge-xing/main
support telechat2 model

Former-commit-id: 015f2137887bb9f27fcb0d6cc67ef729aad4031e
2024-12-18 16:16:17 +08:00
hoshi-hiyouga
5f0dd86c15 Merge pull request #6369 from hiyouga/hiyouga/template
[template] support qwen2 tool template

Former-commit-id: af336275021cd6aee3fe9f67b9ac9bcd1276de7c
2024-12-18 04:23:49 +08:00
hiyouga
a421113466 support qwen tool format
Former-commit-id: 98795854e3fda7b0c0bc209b3e2496b0036e154e
2024-12-17 20:12:06 +00:00
hiyouga
acd62fddb8 change default replace jinja to false
Former-commit-id: bcc413cf64cbee068e2f19475ce7919c65284489
2024-12-17 19:27:10 +00:00
hoshi-hiyouga
d8f6569be1 Merge pull request #5473 from AlongWY/mistral
Support Mistral format tools

Former-commit-id: 2fad3792d98f4181ae23e861c3d050fe1bcd8e4e
2024-12-18 03:23:24 +08:00
ylfeng
857d23b324 Support Mistral format tools
Former-commit-id: 115924af47496daa747a018952b6a32ccbd9cecb
2024-12-17 19:13:26 +00:00
hoshi-hiyouga
ad00c793ce Merge pull request #6368 from hiyouga/hiyouga/fix_llama_template
[template] fix llama3 tool template

Former-commit-id: 8974a0a185daf7744b4d3a0b2776f9bd72e24426
2024-12-18 01:10:48 +08:00
hiyouga
f6a2bfc0e8 fix llama3 tool template
Former-commit-id: df5655f61cb847dc2d9eb7b34266b20343ff90d6
2024-12-17 17:05:10 +00:00
hoshi-hiyouga
1cc24ed206 Merge pull request #6367 from hiyouga/hiyouga/add_model
[model&template] add llama3.3 & support llama3 tool prompt

Former-commit-id: e12c80ace8b59a9556ee40f5b810f233f9b8174a
2024-12-18 00:13:28 +08:00
hiyouga
a935933bed support llama3 tool prompt
Former-commit-id: b24ae55ebf548db904a9fe1876192024d8a96108
2024-12-17 15:52:37 +00:00
hoshi-hiyouga
09419dfbab Merge pull request #5819 from yafshar/remote_code
Add trust_remote_code Parameter and Set Default to False

Former-commit-id: 2a832e489b7d88dba3cc8b07ca582cb22ec9a9f1
2024-12-17 21:10:24 +08:00
Yaser Afshar
76ebd62ac1 Add missing key to init_kwargs
Former-commit-id: 1c8ad22a5f167bf4e1c845e273583e5cb3a0214e
2024-12-17 12:34:05 +00:00
Yaser Afshar
fe4546a7bb Add trust_remote_code parameter and remove True
- Introduced a new model parameter `trust_remote_code`
- Set the default value of `trust_remote_code` to `False`
  to enhance security


Former-commit-id: 09437763267bc7081159a6878cee9652a2b1ddac
2024-12-17 12:25:12 +00:00
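The `trust_remote_code` change described above can be sketched as a user-facing flag threaded into a shared kwargs dict; the dataclass and helper names here are illustrative, not the repository's real argument classes:

```python
from dataclasses import dataclass

@dataclass
class ModelArguments:
    model_name_or_path: str
    # Defaults to False so a downloaded repo cannot execute arbitrary
    # code unless the user opts in explicitly.
    trust_remote_code: bool = False

def build_init_kwargs(args: ModelArguments) -> dict:
    # The flag is forwarded to every from_pretrained-style call
    # (config, tokenizer, model) through one shared kwargs dict.
    return {"trust_remote_code": args.trust_remote_code}
```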
zhaohu xing
cfb4c42ae4 support telechat2 model
Former-commit-id: 04f19ed0f36e691d89ccb7ac19bae70c59640aaa
2024-12-17 12:15:33 +00:00
hoshi-hiyouga
fc18db6290 Merge pull request #6364 from hiyouga/hiyouga/control_reenterent_gc
[model] support non-reentrant GC

Former-commit-id: a665ad6178516faf8aaa628d3b2c672ad831d7b7
2024-12-17 19:58:36 +08:00
hiyouga
64bac4bc7e support non-reentrant GC &amp; fix #6358
Former-commit-id: f319da6937964c5068864f7815efbf3847bd4ad8
2024-12-17 11:41:59 +00:00
hoshi-hiyouga
002c7d2867 Merge pull request #6363 from hiyouga/hiyouga/control_skip_eos
[infer] support control eos

Former-commit-id: 69738283071567eeecb492ac38ce1df3b994d975
2024-12-17 19:35:40 +08:00
hiyouga
a94a1eac67 support control eos, fix #6345
Former-commit-id: eda76de32bab103c650f246327d214539ae6f291
2024-12-17 10:42:05 +00:00
hoshi-hiyouga
a8a990a9a7 Merge pull request #6362 from hiyouga/hiyouga/mllm_packing
[model] generalized packing

Former-commit-id: 9708a39179d7872ff2039086fcadb021265974cc
2024-12-17 18:41:48 +08:00
hiyouga
bff1b94583 generalized packing & fix #6343
Former-commit-id: 2d107d3aefd5af61163056634c8b91fe3cb3e77c
2024-12-17 10:26:19 +00:00
hoshi-hiyouga
4caf043cf8 Merge pull request #6359 from hiyouga/hiyouga/fix_qwen2vl_infer
[model] fix qwen2vl infer

Former-commit-id: 81815f053f9eef23fa4906cc47496806cfc1735c
2024-12-17 18:15:23 +08:00
hiyouga
50ca43c3fb fix #6348
Former-commit-id: 142191e4664cb1b920aff2f51d1bac6180f2c24b
2024-12-17 10:06:46 +00:00
hoshi-hiyouga
0f49e9cb07 Merge pull request #6334 from hiyouga/hiyouga/add_examples
[assets] update wechat and examples

Former-commit-id: e2fbd07096877f76d6bdd1b0c4b2d42aa10c346f
2024-12-15 01:37:01 +08:00
hiyouga
ba901bc000 update assets
Former-commit-id: 7059055e89cc3f8557581d5d5b728a82ab5b0899
2024-12-14 17:36:03 +00:00
hiyouga
6f1e450739 fix mrope
Former-commit-id: 2811814fc42fb214b3e8be1055f9f57ffd0ffb12
2024-12-12 15:08:17 +00:00
hoshi-hiyouga
93d1cba06e Merge pull request #6253 from hiyouga/hiyouga/qwen2vl_mm_proj
[model] support qwen2vl train proj only

Former-commit-id: bcb4fb353ed810477b7813c6904b33c51efb7f81
2024-12-05 20:25:33 +08:00
hiyouga
cf8cad8e7e support qwen2vl train proj only
Former-commit-id: 99c62660c67fd2567c094bb49ec7d108e5670730
2024-12-05 10:37:42 +00:00
hoshi-hiyouga
255260cfcb Merge pull request #6251 from hiyouga/hiyouga/vllm_qwen2vl_infer
[infer] support qwen2vl vllm infer

Former-commit-id: 561a8e56d95e31dc9a36da4b369af219d7319688
2024-12-05 18:26:19 +08:00
hiyouga
88b06a0c7f support qwen2vl vllm infer
Former-commit-id: 207f8b069ca35a28de4588b4962e7254f451c52c
2024-12-05 10:17:26 +00:00
hoshi-hiyouga
7f8c59144e Merge pull request #6246 from hiyouga/hiyouga/update_examples
[examples] update examples

Former-commit-id: 967a6c12a72b97104bd262b0f2ed945a8b0cd81d
2024-12-05 16:49:30 +08:00
hiyouga
90fb5605c1 update examples
Former-commit-id: e5584dc7ba356264034c14ff7d8efcd3d50fa9a3
2024-12-05 08:48:25 +00:00
hoshi-hiyouga
9f9ad6435d Merge pull request #6242 from hiyouga/hiyouga/fix_script
[script] fix scripts

Former-commit-id: c42890bb2b287ac1fccb369a63a6495e7008b245
2024-12-05 11:54:46 +08:00
hiyouga
819f487c8f fix scripts
Former-commit-id: eb3e147d198a3ecb02c65f7733cec7cd9d3814a3
2024-12-05 03:47:32 +00:00
hoshi-hiyouga
9bbeba6323 Merge pull request #6160 from village-way/pr_dataloader
fix:tokenized_path not None and load_from_disk return Dataset Trigger…
Former-commit-id: cf298468309cd923d830dcaf7a1aa837519faf1e
2024-12-04 22:18:19 +08:00
hoshi-hiyouga
92940817e7 lint
Former-commit-id: 6a5074e46695378b76d58aac8ad7768b6b034b9c
2024-12-04 22:08:27 +08:00
hoshi-hiyouga
68614f6bc1 Merge pull request #6238 from hiyouga/hiyouga/vllm_batchinfer
[infer] feat: support batch infer in vllm

Former-commit-id: 8328bd8fbbb0699bde8269ba13d78e400668dca7
2024-12-04 21:59:13 +08:00
hiyouga
235cdcacee support batch infer in vllm
Former-commit-id: 1324d158f954d777f1fbf09f46149c372704b388
2024-12-04 13:50:00 +00:00
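Batch inference as added above boils down to slicing the prompt list so the engine's generate call runs once per batch rather than once per prompt; `chunked` is a minimal illustrative helper, not the script's actual code:

```python
def chunked(prompts: list, batch_size: int):
    # Yield fixed-size slices of the prompt list; each slice becomes
    # one batched call into the inference engine.
    for i in range(0, len(prompts), batch_size):
        yield prompts[i:i + batch_size]
```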
hoshi-hiyouga
b2c67a989a Merge pull request #6190 from JieShenAI/main
add vllm_infer script

Former-commit-id: dc7835500297322c85ba1d69fb12e743a5cd7bfd
2024-12-04 21:19:23 +08:00
hoshi-hiyouga
ed4c4bab49 Merge pull request #6170 from hykilpikonna/main
[+] Show the hostname in webui title

Former-commit-id: 263cb82bdb3dfb0c213018f25022686a3dab617e
2024-12-04 18:07:29 +08:00
hoshi-hiyouga
1804e8a491 Merge pull request #6233 from hiyouga/hiyouga/vlm_zero3
[data] fix vlm zero3 training

Former-commit-id: 187402203bb0aeada3c97f65fa9d178ff490c261
2024-12-04 17:51:10 +08:00
hiyouga
0ef1dc4dd5 fix vlm zero3 training
Former-commit-id: dbb9e5b70efab37ed057b2d5822b9d0d23e99fb1
2024-12-04 09:40:39 +00:00
hoshi-hiyouga
b34c3bb796 Merge pull request #6224 from hiyouga/hiyouga-patch-1
[assets] chore: update wechat

Former-commit-id: 7965e9840c18c71028c1a3a04c404e9fae196c0d
2024-12-03 21:25:38 +08:00
hoshi-hiyouga
aa5535c622 update wechat
Former-commit-id: 722a396b69ce340609175b0bc1cf118e5593a749
2024-12-03 20:48:48 +08:00
JieShen
d4bf81b36a add async call api
Former-commit-id: 4c61368600a6648ec20753a41536ad3c7986703b
2024-12-01 22:18:05 +08:00
JieShen
99265c7d2f add vllm_infer script
Former-commit-id: 961e8c2d2e5505de14702cf8609d54b4f3a23b1e
2024-11-29 14:22:20 +08:00
Azalea
0efa34c9ef [U] Compute hostname differently
Former-commit-id: 6554cdeedb8ab669ce500be404dab8d42a90779b
2024-11-28 22:23:41 -05:00
hoshi-hiyouga
f4729904f2 Merge pull request #6175 from hiyouga/hiyouga/add_qwq
[model] add QwQ

Former-commit-id: f2b2a37f0819bb344998a9059c02b97ca5a5ae74
2024-11-28 17:01:53 +08:00
hiyouga
1c3d86cd65 add qwq
Former-commit-id: 68a612115aebba51695d22be4397c16c86f3b40a
2024-11-28 08:50:57 +00:00
Azalea
f5e6e25a1b [+] Show the hostname
Former-commit-id: dfb953b1adb79ed9f886ecac6851a559617aaa5a
2024-11-28 12:25:02 +08:00
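Showing the hostname in the WebUI title, as the commit above does, can be sketched in a few lines; the function name and title format are assumptions for illustration:

```python
import socket

def webui_title(base: str = "LLaMA Board") -> str:
    # Suffix the host name so browser tabs pointing at different
    # machines are distinguishable.
    return f"{base} ({socket.gethostname()})"
```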
wangdepeng
ae09c6c214 fix: when tokenized_path is not None and load_from_disk returns a Dataset, loading gets stuck
Former-commit-id: 4424d4de8aca0e4d3b92672584978f3cc3fc33da
2024-11-27 16:44:42 +08:00
hoshi-hiyouga
265a5821de Merge pull request #6156 from hiyouga/hiyouga/add_o1
[data&model] add marco-o1, skywork-o1 and openo1

Former-commit-id: 86f41513c04424f8f4ccfeca78d28a4cc7ba8728
2024-11-27 14:36:01 +08:00
hiyouga
9822cb7bac fix dataset
Former-commit-id: 046b6fb118e3ea75062c6a759720a1759639e93c
2024-11-27 06:27:44 +00:00
hiyouga
d51d96d594 add skywork o1
Former-commit-id: ec9ff8caa2637965d41937cce7de4e4d51d054eb
2024-11-27 05:51:59 +00:00
hiyouga
09a3a59c88 Merge remote-tracking branch 'origin/main' into hiyouga/add_o1
Former-commit-id: b7c7f3066f7c33fc6e52d9d1d57b831966e53c7f
2024-11-27 05:36:41 +00:00
hoshi-hiyouga
dfa4e927dd Merge pull request #6157 from hiyouga/hiyouga/fix_ci
[ci] pin tokenizers version

Former-commit-id: 14d0d92bf38ac09f0b7f7f9484345310e32fbbe6
2024-11-27 13:33:04 +08:00
hiyouga
61320965aa pin tokenizers version
Former-commit-id: b7d4cf2caf2d02f7d16bab3f0ec8bf0108b7be75
2024-11-27 05:24:58 +00:00
hiyouga
ab3782b0fa add marco-o1 and openo1 dataset
Former-commit-id: 17afb7d4103499a9a090a6624896cfa123e9e1d6
2024-11-27 04:20:23 +00:00
hoshi-hiyouga
6cd90efb82 Merge pull request #6152 from hiyouga/hiyouga/add_num_proc_in_data_load
[data] add num_proc in load_dataset

Former-commit-id: b26c490ac3a0a8a6342f940eb6ccb7b8b6d78f93
2024-11-27 00:16:15 +08:00
hoshi-hiyouga
08ca40876a Merge pull request #6151 from hiyouga/hiyouga/fix_mllama
[model] fix mllama cross mask

Former-commit-id: 88f087c8b9cb22fa4f4e4f867ea3d71dd8606a98
2024-11-27 00:07:54 +08:00
hiyouga
358708ee97 fix #6149
Former-commit-id: 362d579ce83e63007e6f89f264d06d2698671cc6
2024-11-26 16:03:02 +00:00
hiyouga
006022cadd fix mllama cross_mask
Former-commit-id: 598c22e43f3f10a335933339cc612744c4835eb0
2024-11-26 15:56:58 +00:00
hoshi-hiyouga
e0325b1123 Merge pull request #6141 from hiyouga/hiyouga-patch-1
[misc] chore: lint

Former-commit-id: 00031b1a66ade1c2665ce7a069a756cccbcb07f1
2024-11-25 23:02:11 +08:00
hoshi-hiyouga
118ffe50e3 lint
Former-commit-id: da9e4ddd26ebd6e7eb266aa0bef7505465a6b119
2024-11-25 22:55:56 +08:00
hoshi-hiyouga
a6aeb98af6 Merge pull request #6140 from hiyouga/hiyouga/fix_mllama
[data] fix mllama plugin

Former-commit-id: 3924a3d6e9d761dd51eca92afed7f299be71e42d
2024-11-25 22:32:07 +08:00
hoshi-hiyouga
c0ffe68745 fix #6139
Former-commit-id: d87e16cf5c46dadbfcda7b8ac8edfef6a012f97f
2024-11-25 22:22:06 +08:00
hoshi-hiyouga
1a8c26a7d9 Merge pull request #6138 from hiyouga/hiyouga/update_data
[data] update dataset info

Former-commit-id: 3a1402a4ed1fb958cd38903d016b9a87dc85d4cd
2024-11-25 21:47:23 +08:00
hoshi-hiyouga
4f1d5b6396 update dataset
Former-commit-id: 5214d3ea06ac73f1179ca9574d7c7030c92b5ee1
2024-11-25 21:47:04 +08:00
hoshi-hiyouga
697657271f Merge pull request #6137 from hiyouga/hiyouga/fix_mllama
[model] fix mllama hidden_size

Former-commit-id: 2b7157dc1dfe154146012270e046f311d3b4e4df
2024-11-25 20:17:33 +08:00
hoshi-hiyouga
a489f10986 fix visual patch
Former-commit-id: 75b586c31acf47d1bd28e04566ffd1d954e45596
2024-11-25 20:06:06 +08:00
hoshi-hiyouga
f48d07cd6c fix #6136
Former-commit-id: 0516e556a71a22b8767b17734adb94eb127e7e6f
2024-11-25 19:43:42 +08:00
hoshi-hiyouga
f1a1fb675b Merge pull request #6127 from hiyouga/hiyouga/dev_version
[misc] set dev version

Former-commit-id: 44125da5a57a51d41e67522d32d9c15910d6ae9a
2024-11-25 01:42:29 +08:00
hiyouga
f8f3638bff set dev version
Former-commit-id: b0ccc2ee86b74548c55e15da525bb9f22db8188a
2024-11-25 01:36:49 +08:00
hoshi-hiyouga
1176cd6779 Merge pull request #6124 from hiyouga/hiyouga/release
[release] release v0.9.1

Former-commit-id: 18daf10eda815e29ffd7bc0f2423f0c6d12b7b31
2024-11-25 00:20:02 +08:00
hoshi-hiyouga
4165c4414d Merge pull request #6126 from hiyouga/hiyouga/fix_vllm
[inference] fix vllm

Former-commit-id: 07059a7ca4bc9d79465639b1c22d8ccde5e81244
2024-11-25 00:19:54 +08:00
hoshi-hiyouga
7f74e54bee Merge pull request #6010 from XYZliang/fix-#4316
Increase shm_size to 16GB in docker-compose.yml

Former-commit-id: 8e9f4617f293077b959acc553d26a93c91c2fe6f
2024-11-25 00:16:42 +08:00
hoshi-hiyouga
3f3eeedca0 Merge pull request #6125 from hiyouga/hiyouga/fix_cli
[cli] remove shell=True in cli

Former-commit-id: 57953c8ff64ee91a8525abb0aa1d4a597ffe6ab3
2024-11-25 00:07:35 +08:00
hiyouga
65699c29d4 fix vllm
Former-commit-id: 13ee1f5cec815590c5d290f0aca264e6d16ddd5d
2024-11-25 00:07:24 +08:00
hiyouga
a04538e70b fix cli
Former-commit-id: 8792d78c82e83f319c584079d7ebfa2dccbb346a
2024-11-24 23:56:21 +08:00
hiyouga
708116a5f3 release v0.9.1
Former-commit-id: d622f8fdec744ca98035913eac13e1340949a72a
2024-11-24 23:48:41 +08:00
hoshi-hiyouga
5a7ff02806 Merge pull request #6123 from hiyouga/hiyouga/fix_qwen2vl_vllm
[inference] fix qwen2vl vllm infer

Former-commit-id: 0ce173e2a41f070051cbb7d50180a2f3b7cae89d
2024-11-24 23:42:11 +08:00
hiyouga
dcc67ac1a5 fix qwen2vl vllm infer
Former-commit-id: fa50fc470e46010214f0e509ef75a6c167caac7f
2024-11-24 23:27:24 +08:00
hoshi-hiyouga
7ed5a712f8 Merge pull request #6121 from hiyouga/hiyouga/readme
[readme] update readme

Former-commit-id: f2bfa80d55e661e2263cfffd2a16109442d47e78
2024-11-24 03:28:09 +08:00
hiyouga
7eaafe08bc update readme
Former-commit-id: a89ad72d039d03836f98625eaf438f332368a823
2024-11-23 19:27:18 +00:00
hoshi-hiyouga
503327b5bb Merge pull request #6120 from hiyouga/hiyouga/fix_ci
[test] fix ci

Former-commit-id: 5f310d92791c1a7ef0b1373b098323eacf653857
2024-11-24 03:21:11 +08:00
hiyouga
4bba121f43 fix ci
Former-commit-id: b52c38350d6092e8158484ca01da19ed016236fc
2024-11-23 19:13:32 +00:00
hoshi-hiyouga
8dff6f630c Merge pull request #5555 from marko1616/feat/llama3.2vl
Support llama3.2 vision

Former-commit-id: e68ef89600e85b1f067ca6cc70459e9a7ac77b8a
2024-11-24 02:49:07 +08:00
hiyouga
253752ccca add forbidden modules
Former-commit-id: df477370dc67315effac1a8f48068c4e9c4067a5
2024-11-23 18:34:15 +00:00
hiyouga
e99031daa4 fix inputs
Former-commit-id: 446441fdb020b5a102480251cb8536dd8b3f8f99
2024-11-23 18:26:02 +00:00
marko1616
23fc0c863e Linter.
Former-commit-id: b1e43e56db4d898c0d6d27b703b82a1eb9af70c2
2024-11-23 16:09:04 +00:00
marko1616
3295519099 Tiny fix.
Former-commit-id: 8372c5e3771c42f225d7bd80a758af920f80e893
2024-11-23 16:09:01 +00:00
marko1616
20faaf3418 Support llama3.2vl.
Former-commit-id: 3f2c056253c651e8e614c787e2045f4232e82666
2024-11-23 16:07:35 +00:00
hoshi-hiyouga
24419dd3f1 Merge commit from fork
[patch] Patch remote OS command injection vulnerability

Former-commit-id: b3aa80d54a67da45e9e237e349486fb9c162b2ac
2024-11-21 22:39:44 +08:00
hoshi-hiyouga
1eaaa85980 do not split save_cmd ret value
Former-commit-id: d20b97e7e9514e68c33d0428512fd21b548c1131
2024-11-21 22:30:23 +08:00
superboy-zjc
e272f87849 [patch] Patch remote OS command injection vulnerability
Former-commit-id: aa6a174d6822340022433c5ba38182b4932adecb
2024-11-21 01:52:12 -05:00
hoshi-hiyouga
4d49dc0559 Merge pull request #6098 from hiyouga/hiyouga-patch-2
update wechat

Former-commit-id: c8f199881aa954025f9b022f0f6b4fa7136466ad
2024-11-21 14:26:03 +08:00
hoshi-hiyouga
527ea7297b update wechat
Former-commit-id: acf491fc3a4e3d82608299dbd72271ea65045338
2024-11-21 14:25:33 +08:00
hoshi-hiyouga
302e4e22bf Merge pull request #6078 from wtmlon/support-efficient-tokens-calculation
support effective tokens calculation on sft/dpo

Former-commit-id: bd639a137e6f46e1a0005cc91572f5f1ec894f74
2024-11-20 13:43:15 +08:00
hoshi-hiyouga
2ea2bc7c71 Merge pull request #6083 from hiyouga/hiyouga-patch
[asset] update wechat

Former-commit-id: fdcc78b639596ecae135d2ae726ff35a40b6f113
2024-11-20 11:46:54 +08:00
hiyouga
505edd4da0 update wechat
Former-commit-id: 2f959c73b59302f3b7cc8a07eae06326d95e7e1d
2024-11-20 10:57:30 +08:00
Ting
e27a0c3d53 code refactor
Former-commit-id: 40627c601efc9f144a227dded8c6b40babff4e8b
2024-11-19 20:33:18 +08:00
Ting
32656bc50d update
Former-commit-id: f566ecc8d1f04615351acbe4f8480b75b2daed42
2024-11-19 19:12:10 +08:00
Ting
bf2b8df540 update
Former-commit-id: ef6e14550dd76810285cee9c268590d1d9423e54
2024-11-19 19:10:07 +08:00
Ting
7ad5b5c088 support efficient tokens calculation on sft/dpo
Former-commit-id: b9f00286d8a017ed9fd2876986da3b4d7034ef07
2024-11-19 17:15:47 +08:00
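The effective-token calculation referenced above typically counts only label positions that contribute to the loss; `IGNORE_INDEX` and the helper below are an assumption-labeled sketch, not the PR's exact implementation:

```python
IGNORE_INDEX = -100  # assumption: masked label positions use this sentinel

def count_effective_tokens(label_batches) -> int:
    # Exclude padding/prompt tokens masked with IGNORE_INDEX, so the
    # metric reflects tokens that actually drive the SFT/DPO loss.
    return sum(
        1 for labels in label_batches for t in labels if t != IGNORE_INDEX
    )
```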
hoshi-hiyouga
a2a9936e1f Merge pull request #6065 from hiyouga/hiyouga-patch-1
[misc] fix dep package version

Former-commit-id: 9c0f6556eeeef4f24b589c15aebfd68fd6f998c6
2024-11-18 21:13:59 +08:00
hoshi-hiyouga
618a8e6c9f fix #6061
Former-commit-id: 4ac5b97011225b1fd5fa741c1335948d721489ac
2024-11-18 20:56:44 +08:00
hoshi-hiyouga
acd70faf17 Merge pull request #6052 from hiyouga/hiyouga-patch-1
[trainer] fix DPO metrics

Former-commit-id: 45f32916ce3e0f1d242b91bbf9dbce2c0200f82d
2024-11-16 16:20:12 +08:00
hoshi-hiyouga
9815d1712c fix #6050
Former-commit-id: dc828218726704ff0453a2d13535663ac6ad7833
2024-11-16 16:11:16 +08:00
hoshi-hiyouga
e3a0640659 Merge pull request #6046 from hiyouga/hiyouga/add_code_model
[model] add qwen-coder and opencoder

Former-commit-id: 6c0847899d8b4f36bc362500f6ecb1c9cc14d7ea
2024-11-15 21:58:03 +08:00
hiyouga
d4e0010027 add qwen-coder and opencoder
Former-commit-id: 431ac4892cdddba802a02b285031a797e278d0eb
2024-11-15 21:48:38 +08:00
codingma
d6b9a2024b Merge pull request #6022 from codemayq/main
update wechat

Former-commit-id: 8e5aad3ffa129070a705bd7709f690d7cb6a1978
2024-11-14 10:03:46 +08:00
codemayq
625ae6f456 update wechat
Former-commit-id: fc1aa8f45c46205ad23e827c493a55ca9f0f06c9
2024-11-14 10:02:06 +08:00
XYZliang
1c1e48a570 Increase shm_size to 16GB in docker-compose.yml to optimize shared memory allocation for large-scale model fine-tuning tasks.
This pull request increases the shm_size parameter in docker-compose.yml to 16GB. The goal is to enhance the LLaMA-Factory framework’s performance for large model fine-tuning tasks by providing sufficient shared memory for efficient data loading and parallel processing.

This PR also addresses the issues discussed in [this comment](https://github.com/hiyouga/LLaMA-Factory/issues/4316#issuecomment-2466270708) regarding the shared-memory limit error.


Former-commit-id: 64414905a3728abf3c51968177ffc42cfc653310
2024-11-13 10:13:59 +08:00
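The compose change described above amounts to a one-line service setting; the service name below is illustrative:

```yaml
services:
  llamafactory:
    # Raise the shared memory available to the container; PyTorch
    # DataLoader workers exchange tensors through /dev/shm.
    shm_size: "16gb"
```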
hoshi-hiyouga
8ae2056b59 Merge pull request #5990 from hiyouga/hiyouga/dev_vllm
[generate] fix vllm config args

Former-commit-id: 3eebae892be74e6d9d4e7d1f6f1fae00080957f9
2024-11-11 14:10:35 +08:00
hoshi-hiyouga
9c394f11ef fix #5988
Former-commit-id: 8d70edf39bce7cb7be272f0bd2820306adf093ad
2024-11-11 13:57:14 +08:00
hoshi-hiyouga
662093b38c Merge pull request #5984 from hiyouga/hiyouga/wechat
[readme] update wechat

Former-commit-id: 2176224f4b672f623624e906e983dfd892f9f903
2024-11-10 22:08:55 +08:00
hiyouga
e928f7fc4c update wechat
Former-commit-id: f2a44e1a2ad9c7029100dc76dd3abca47de734c4
2024-11-10 22:08:10 +08:00
hoshi-hiyouga
82344629af Merge pull request #5982 from hiyouga/hiyouga/vllm_args
[args] add vllm config

Former-commit-id: 1ca6b1582fe5743434cba855e4e7fbec3810ab1f
2024-11-10 21:37:18 +08:00
hiyouga
0d18cca0db add vllm config
Former-commit-id: 58ab4579dc81a1dcea2bf5938ba3f3116cecfc76
2024-11-10 21:28:18 +08:00
hoshi-hiyouga
0386fa6a4f Merge pull request #5973 from JJJJerry/fix_vllm_generate
fix VllmEngine: replace the `inputs` parameter with `prompt`

Former-commit-id: 40a2fcc02dad2f1633c117d457651b221b8c5ae0
2024-11-10 21:04:38 +08:00
hoshi-hiyouga
d7476aa55b Update vllm_engine.py
Former-commit-id: a543bc478dfb3472d5d81e2f81922d5ab3b90373
2024-11-10 20:57:00 +08:00
JJJJerry
d996bea1c7 fix VllmEngine: replace the `inputs` parameter with `prompt`
Former-commit-id: 1d04078bb5d460f0cb3d26798bd2f0e43c8a0467
2024-11-09 11:45:59 +08:00
hoshi-hiyouga
1e00de38cb Merge pull request #5971 from hiyouga/hiyouga/fix_webui
[webui] fix extra args

Former-commit-id: adc5849ce75cc8eda800c73c59b1f44d3108e992
2024-11-09 00:25:24 +08:00
hiyouga
d7e35ddcaa fix extra args
Former-commit-id: eb294fed1319e97657b54b9aa479d87734c711dc
2024-11-09 00:24:27 +08:00
hoshi-hiyouga
162f7028fc Merge pull request #5970 from hiyouga/hiyouga/fix_beam
[generation] fix vllm v0.6.3

Former-commit-id: 39e330196d8e2774ac43d6f37ccabc0a07efd970
2024-11-08 23:58:15 +08:00
hiyouga
97f4451912 fix #5966
Former-commit-id: 8f3a32286ebcfb3234e3981db2292dd165b1568d
2024-11-08 23:49:16 +08:00
hoshi-hiyouga
265875fffd Merge pull request #5929 from hiyouga/hiyouga/dev_wechat
[readme] update wechat

Former-commit-id: 707ff5a3703ead7b1d374d928f92ed8a128573f0
2024-11-04 16:44:31 +08:00
hiyouga
bf2959c175 update wechat
Former-commit-id: 882635cac3419193c78202fa5fb854c97acea9ed
2024-11-04 08:43:21 +00:00
hoshi-hiyouga
2e6dd72539 Merge pull request #5927 from hiyouga/hiyouga/dev_fixmmchat
[fix] chat engines

Former-commit-id: bcbd1bebb0c2c1f9bfbe920c62008383cb044dba
2024-11-04 16:36:23 +08:00
hiyouga
1598e5d355 add image input type
Former-commit-id: ffa39ba3db0dbfd375cdf20b9f3cbecd359be1a1
2024-11-04 08:27:20 +00:00
hiyouga
2360d63ebc fix chat engines
Former-commit-id: 8c88065c38d0e7dd60e06e27c7f921f7931471ee
2024-11-04 08:18:12 +00:00
hoshi-hiyouga
d68af5b04a Merge pull request #5926 from hiyouga/hiyouga/dev_deps
[version] update datasets version

Former-commit-id: 22ee197a40566cec092f957dd81458bf00caf9fa
2024-11-04 16:04:00 +08:00
hiyouga
3730fc046f update datasets version
Former-commit-id: c5fae465ec8cbc30f9e91e6c32b88e74c805874a
2024-11-04 07:52:26 +00:00
steven
7f7ee0a660 support granite3 models
Former-commit-id: 6eefb4d7d25879db42cefae8332ca9db88bff851
2024-11-04 10:35:03 +08:00
hoshi-hiyouga
e3fb3c313c Merge pull request #5914 from hiyouga/hiyouga/dev_read
[misc] update readme

Former-commit-id: 04c10d2e80b7f7e516eba67fea420498a1238bb5
2024-11-02 21:44:10 +08:00
hoshi-hiyouga
9d463b611c Merge pull request #5475 from menibrief/main
Fix phi-3-small issues 

Former-commit-id: 9a42c5f76f4fff3203ea0ceac08ea5043e1519d6
2024-11-02 21:31:34 +08:00
hiyouga
feb547aa99 fix phi3 template
Former-commit-id: 2e843d989e946725e2dc096d6540531eaaf5b782
2024-11-02 21:31:23 +08:00
hiyouga
f05685c7cf update readme
Former-commit-id: e7ed5091e1f8fb35e458f368558ceac71c6983b4
2024-11-02 21:28:04 +08:00
hoshi-hiyouga
8c2b7aa1ab update template
Former-commit-id: 478cbb1aa72f218df37b5a4686db2248ad2605dd
2024-11-02 21:21:22 +08:00
hoshi-hiyouga
d99e164cad Merge branch 'main' into main
Former-commit-id: 5f14910910154ba569435e7e68acbd6c30f79e80
2024-11-02 21:20:27 +08:00
hoshi-hiyouga
7dbb338df7 Merge pull request #5913 from hiyouga/hiyouga/dev_metrics
[train] support gather DPO metrics, fix return output

Former-commit-id: 344ff76d26a42c859f31cd03765b1b613ffe6bfa
2024-11-02 21:13:43 +08:00
hiyouga
604f4005c9 fix webchat
Former-commit-id: ba10d54bf93c54ab8f691ea53bff850e3595a1a6
2024-11-02 21:04:18 +08:00
hiyouga
c2766af6f4 fix dpo metrics
Former-commit-id: 4270f7dfb9a12471c91f6c03dce7ca6fd88566e1
2024-11-02 20:59:01 +08:00
hoshi-hiyouga
e7b11e4fdb Merge pull request #5880 from sd3ntato/make-image-parametric
make base image parametric.

Former-commit-id: 07e5088851b3a175ad9cfc83268159eac4334031
2024-11-02 20:26:14 +08:00
hoshi-hiyouga
5ad86fc71d Update Dockerfile
Former-commit-id: 5e8a619d1d20048ad912a1f6d9ccdae5740811bc
2024-11-02 20:20:26 +08:00
hoshi-hiyouga
6f79974e8b Merge pull request #5910 from Cuiyn/index
Support Index series models.

Former-commit-id: c58cc22d06eb1a466ad92601ceb74c9bae6abb51
2024-11-02 20:16:54 +08:00
Cuiyn
51177c933a fix: rename to Index-1.9B-Charater-Chat and Index-1.9B-Chat-32K
Former-commit-id: ecca9db66ba1f0fe903a9e02eca317c0bbb28a5f
2024-11-02 20:04:14 +08:00
hoshi-hiyouga
0f53217bbc Merge pull request #5912 from hiyouga/hiyouga/dev_logging
[misc] support rank0 logger

Former-commit-id: 83535bbe8bf50d9653265437d379fcdd8c82b989
2024-11-02 18:48:41 +08:00
hiyouga
e83cb17f97 support rank0 logger
Former-commit-id: c38aa29336f286266553da4909a7267d7ef21f37
2024-11-02 18:31:04 +08:00
Cuiyn
7806bde8ad Add support for Index
Former-commit-id: a15a69ab4417c6f3273c874cf7ee2c34a5a64141
2024-11-02 13:45:27 +08:00
hoshi-hiyouga
4b2c47fcae Merge pull request #5909 from hiyouga/hiyouga/dev2
[data] support auto convert for single image, add image_dir argument

Former-commit-id: bd08b8c441c47076faa03cc1efde21b22f14f058
2024-11-02 13:43:04 +08:00
hoshi-hiyouga
33e8bfc3ae Merge pull request #5907 from hiyouga/hiyouga/dev
[data] fix template replace behavior

Former-commit-id: 8408339d8388a610a4e5fe303086d7536d1e72d5
2024-11-02 13:42:53 +08:00
hiyouga
ac677205c9 fix #5904
Former-commit-id: bfe1abd7afe4595135b568783753d064cb6e0b28
2024-11-02 13:08:15 +08:00
hiyouga
7fa46a24df fix #5883
Former-commit-id: 24da9f59b0bf4874506bbf1ec214f3d5ca43d943
2024-11-02 13:06:34 +08:00
hoshi-hiyouga
7555c54c9f Merge pull request #5906 from hiyouga/dev
[test] update tests

Former-commit-id: 3ef7d590b13301e1c5531ec5942f4a65dc44d47e
2024-11-02 12:50:43 +08:00
hiyouga
3f7c874594 update tests
Former-commit-id: 93d3b8f43faf4a81b809d2f7d897e39bdb5475c3
2024-11-02 12:41:44 +08:00
hoshi-hiyouga
25093c2d82 Merge pull request #5895 from hiyouga/dev
[inference] support multiple images

Former-commit-id: 0a55e60693ab15d92fbe3d7d536408e26228ab82
2024-11-01 16:52:55 +08:00
hiyouga
2eba98e152 add examples
Former-commit-id: e824b715ad4bf885241b245b12d75563adab2e26
2024-11-01 08:41:54 +00:00
hiyouga
8ecc12ee2a support multiimage inference
Former-commit-id: e80a4819274d46ac9e85db7469dc59d7c4e323c7
2024-11-01 07:25:20 +00:00
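Multi-image inference, as supported above, needs the attached images to line up with the placeholders in the prompt; the placeholder token and validation helper are illustrative assumptions:

```python
IMAGE_PLACEHOLDER = "<image>"  # assumption: the template's image token

def validate_images(prompt: str, images: list) -> None:
    # Require exactly one attached image per placeholder so the
    # multimodal plugin can splice features in prompt order.
    expected = prompt.count(IMAGE_PLACEHOLDER)
    if expected != len(images):
        raise ValueError(f"expected {expected} images, got {len(images)}")
```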
Valerio Mariani
1ab3ea21ce make base image parametric.
default `BASE_IMAGE` is nvcr.io/nvidia/pytorch:24.02-py3 for backward compatibility


Former-commit-id: d5b2def7c029d6f808a97cb062340810e0cd075d
2024-10-30 21:53:32 +01:00
hoshi-hiyouga
9108df2b97 Merge pull request #5873 from hiyouga/dev
[misc] update readme

Former-commit-id: 641d0dab08d96a93c34657742213d8994d9ed476
2024-10-30 17:14:44 +08:00
hiyouga
7487bd7b1f update readme
Former-commit-id: 2417b70a620ec3bba7581c1a444e09c2440a58a0
2024-10-30 09:14:01 +00:00
hoshi-hiyouga
efda735f32 Merge pull request #5871 from hiyouga/dev
[loss&ui] fix incorrect loss of vlms, add extra args to ui

Former-commit-id: b2d4b9a7a870aba92ed6e74b7805d606ddc0edbc
2024-10-30 17:13:17 +08:00
hiyouga
584ce3a105 fix incorrect loss value for vlms
Former-commit-id: 30567a1487727473950104718e626ff660f10cbb
2024-10-30 08:56:46 +00:00
hiyouga
1b02915d19 tiny fix
Former-commit-id: 0c22da4f1cc710b471f6d511d50ce878521173ca
2024-10-30 08:56:29 +00:00
hiyouga
aba4268607 support extra args in llamaboard
Former-commit-id: 86e5affc0d5f5682a20b28d27f80e0015b36fdd9
2024-10-30 08:55:54 +00:00
hoshi-hiyouga
5142faca8f Merge pull request #5581 from Kuangdd01/pixtral-patch
[WIP] Support Pixtral-12B

Former-commit-id: 9009a467e621a17ad9fa25bb30fb9ac9ee15df97
2024-10-29 22:29:10 +08:00
hoshi-hiyouga
15786539d7 fix bug
Former-commit-id: bb0afda8fbb7d3556c1742bccf0b33ee3bf0f0b6
2024-10-29 22:19:04 +08:00
hoshi-hiyouga
49054329d0 Update mm_plugin.py
Former-commit-id: 049c554aee25cf1e29bee88dfb21381b3a4a2947
2024-10-29 22:16:22 +08:00
hoshi-hiyouga
54961946ac Update template.py
Former-commit-id: c3239ba6b24481b092e86ce94a6337fba18c25c0
2024-10-29 22:11:21 +08:00
hoshi-hiyouga
2179b91acb Update visual.py
Former-commit-id: 0baa7735f64cbef9bd90e1db485c120b4c1c88bd
2024-10-29 22:10:29 +08:00
hoshi-hiyouga
fb8f35558a Update collator.py
Former-commit-id: f745c4b28f532c7084d4b8522c972e735729ecee
2024-10-29 22:03:42 +08:00
hoshi-hiyouga
90cd3538de Update hf_engine.py
Former-commit-id: 6e212fdab5f48c955db250ecfc197b89f8856e4b
2024-10-29 22:00:59 +08:00
hoshi-hiyouga
eca50b89a2 Update README_zh.md
Former-commit-id: 8fa20bf4272666e0ed9bcbfee8e4fe66801ef10c
2024-10-29 21:58:03 +08:00
hoshi-hiyouga
2876b429bc Update README.md
Former-commit-id: 1b57df074ab4deb29749086ccb10b459eebf5143
2024-10-29 21:57:28 +08:00
hoshi-hiyouga
acfff4319a Merge pull request #5801 from NLPJCL/main
Projects using LLaMA Factory: RAG-Retrieval uses LLaMA-Factory as the fine-tuning framework for generation-based reranker tasks.

Former-commit-id: 4ff6e564f8a03abda7a853744d14abe7451991c8
2024-10-29 21:20:16 +08:00
hoshi-hiyouga
b86b869187 Update README_zh.md
Former-commit-id: 08d9a03c30b7aebf74bef7f59e6aea229af2aeb3
2024-10-29 21:19:17 +08:00
hoshi-hiyouga
233556d1c7 Update README.md
Former-commit-id: a76478c127bc98749079fbc7e5aacd6e60648f37
2024-10-29 21:18:15 +08:00
hoshi-hiyouga
4f10d3e28c Merge pull request #5857 from hiyouga/dev
[train] fix saving processor

Former-commit-id: a7c79d8f6a22b93dc52f3b6ba7145cb27d54f30c
2024-10-29 21:12:04 +08:00
hiyouga
13c7e873e0 fix #5749
Former-commit-id: 23dbe9a09999fe0f9eb2902a40e33b36db4ca584
2024-10-29 13:02:13 +00:00
Kingsley
3053a806e9 Merge branch 'hiyouga:main' into pixtral-patch
Former-commit-id: 67f59579d79e97689a4b3cba7101a423c30dab2b
2024-10-29 21:01:25 +08:00
hoshi-hiyouga
6cd0d7da29 Merge pull request #5852 from hiyouga/dev
[misc] several important updates

Former-commit-id: 935adaeed948ef0d6df407bc432ae93ad46d0d97
2024-10-29 20:30:02 +08:00
hiyouga
d183966a5d fix pissa
Former-commit-id: 51e5f962474739bbf396782afdaa68743636fe90
2024-10-29 12:18:45 +00:00
hiyouga
825ea1c72d fix #5747
Former-commit-id: ae045c884f8ac2aa0ea27592e0757b7bca2dba13
2024-10-29 10:47:04 +00:00
hiyouga
0d8aa6e6ef use pre-commit
Former-commit-id: 21db8ed2f4a0eba203754a92ce0741538e8ee709
2024-10-29 09:07:46 +00:00
hiyouga
163cf2ba5c update requires
Former-commit-id: 77666bd2278a3cfe5b567f4fe285b0f93871d166
2024-10-29 16:10:07 +08:00
hiyouga
6989b8c341 update wechat
Former-commit-id: 8723922923383b7f3eb56f5d6d54f4a02b589a5a
2024-10-29 15:46:10 +08:00
grok
c24d477bdb Update README_zh.md
Former-commit-id: 6fcabb334920c3145c7820fee4cd84809585f50f
2024-10-23 23:50:56 +08:00
grok
3e3969784f Update README.md
update english readme

Former-commit-id: 7627ef09088ecbc234c08c0cb4743cbaee576b76
2024-10-23 23:49:47 +08:00
grok
823d7f5c81 Update README_zh.md
Former-commit-id: 18a7f3ff76aa8aae66dd18db49ed3cd13345d5c9
2024-10-23 23:36:14 +08:00
hoshi-hiyouga
f03b20b267 fix #5797
Former-commit-id: b4c7dd3ac5615ccb52d7627db635d33336e51951
2024-10-23 20:49:44 +08:00
KUANGDD
9d1f079ca5 rm comment
Former-commit-id: c1d17b0f013f40b48322cb13979fe3726dc124a6
2024-10-23 15:50:59 +08:00
KUANGDD
b9c6fcfe98 rm useless code
Former-commit-id: f5c1cdfaab939fb970393452081950a4b6e3604d
2024-10-23 15:38:11 +08:00
KUANGDD
9079967ecf Merge branch 'pixtral-patch' of https://github.com/Kuangdd01/LLaMA-Factory-X into pixtral-patch
Former-commit-id: eb35e76d502cff9316e8c7bdb312502d9272bdf0
2024-10-23 15:32:50 +08:00
KUANGDD
006b708b57 rm import torch
Former-commit-id: a8571844569416b59d7e9c5fcc4f9a4809d8700c
2024-10-23 15:32:33 +08:00
Kingsley
a7a5a5671f Merge branch 'hiyouga:main' into pixtral-patch
Former-commit-id: 0d3106e9fad565fbe56b8de57dd6ea373944eb99
2024-10-23 15:30:03 +08:00
KUANGDD
00545ebbe5 Merge branch 'pixtral-patch' of https://github.com/Kuangdd01/LLaMA-Factory-X into pixtral-patch
Former-commit-id: 341a79fb96c65b35985d2982988431fd0eea5277
2024-10-23 15:28:19 +08:00
KUANGDD
62cbcb646a modify style & little change
Former-commit-id: 9d6143e36a12e0f295139d057aeb1843535435cf
2024-10-23 15:24:07 +08:00
hoshi-hiyouga
bdb77bc85a fix test
Former-commit-id: 74a79cc0599b047a691c427d16344a824b21e0f3
2024-10-22 12:35:36 +08:00
hoshi-hiyouga
b7b30fc961 fix #5768
Former-commit-id: d155b7008c7c99a91fd8d12cf910318318138920
2024-10-22 11:06:22 +08:00
hiyouga
63f0f9cf5d update wechat
Former-commit-id: ed9b99b80d9c23cfca8964ccfce280775ae55fcb
2024-10-17 21:08:28 +08:00
hoshi-hiyouga
8fb211ad0e Update misc.py
Former-commit-id: 769fbb6349006effa261f40fc055f670fae3e98d
2024-10-17 19:48:51 +08:00
hoshi-hiyouga
af50c03879 Update loader.py
Former-commit-id: 93b9067dfc44d3a8570c39831539a8aa3e27e2db
2024-10-17 19:48:12 +08:00
hoshi-hiyouga
6fbf77aa54 Update README_zh.md
Former-commit-id: 110e4c548dac4a1838d069d312d9f27af90cb1e3
2024-10-17 19:47:33 +08:00
hoshi-hiyouga
79433fb6a6 Update README.md
Former-commit-id: 1fea87183561559f140f8de9b869e893ff8a3378
2024-10-17 19:46:36 +08:00
BUAADreamer
16d4149c25 tiny fix [skip ci]
Former-commit-id: 1724a58b478d7960ed07e864620249091b242e34
2024-10-16 15:55:30 +08:00
KUANGDD
a24f94a36c remove useless code
Former-commit-id: 9b2642a2b53d3392e95061ed0f2c8dc10580c9e8
2024-10-16 01:14:51 +08:00
KUANGDD
9c4941a1ea fix bug for webui infer
Former-commit-id: 7ea29bbfe03550ac59ff9cb01a4bc41c95ac3adf
2024-10-16 01:09:33 +08:00
Kingsley
bcb40fddc0 Merge branch 'pixtral-patch' of https://github.com/Kuangdd01/LLaMA-Factory-X into pixtral-patch
Former-commit-id: 3d51caa053823a5ca793b818c7f464662e5b5f39
2024-10-15 17:09:56 +08:00
Kingsley
ae869639dd add extra test for pixtral mm_input
Former-commit-id: 0fc949783dec2d038dc3d1bf52051c256b69ac20
2024-10-15 17:09:24 +08:00
BUAADreamer
66ee9f0489 tiny fix [skip ci]
Former-commit-id: 58b97197b8aea2820c1b7eb338753dba6b22f3d7
2024-10-15 13:53:33 +08:00
Kingsley
9a9716c228 Merge branch 'hiyouga:main' into pixtral-patch
Former-commit-id: 8f3bff456be302d4394892566af7c0e10fe6d4d6
2024-10-15 13:41:10 +08:00
BUAADreamer
a3f37777c1 fix some
Former-commit-id: 25641af04c98e902ff024c8fa7b4c2c36ed797de
2024-10-15 13:30:41 +08:00
KUANGDD
5e440a467d plugin test & check
Former-commit-id: 2df2be1c47aded0132b5cc86acd3926dca585bc1
2024-10-15 12:12:46 +08:00
hiyouga
7ba5488569 fix #5705
Former-commit-id: 3df39f37e1b7988720f36729830889a4c05399b1
2024-10-15 10:10:16 +08:00
KUANGDD
df722bf18e required transformers version
Former-commit-id: 9f44598b92e72cf8dd923eb229f4637ab9287948
2024-10-14 21:11:09 +08:00
Kingsley
2c5f912e16 remove bs condition
Former-commit-id: 962b9730a7a2940a0d4e5c76d1fe41d0fef76547
2024-10-14 16:55:59 +08:00
Kingsley
8ea1c5c69e Merge branch 'hiyouga:main' into pixtral-patch
Former-commit-id: 95330893c5cd290430a0a2a4e4afa87afab2eb88
2024-10-13 17:42:02 +08:00
hiyouga
92de726102 fix #5668
Former-commit-id: 40ceba500bab7452b8671a9fbcd14bbf4a8f6f37
2024-10-12 01:24:43 +08:00
hiyouga
e90a1199da tiny fix
Former-commit-id: 3af57795dda5d236200bad4aa3f2e29ae8930fe2
2024-10-11 23:51:54 +08:00
hoshi-hiyouga
012f4fef6b Merge pull request #5665 from johnnynunez/main
vllm 0.6.3

Former-commit-id: 228dd1739e98a8ea0270c40edff1f30591c30059
2024-10-11 23:45:58 +08:00
hoshi-hiyouga
e7e8d006cc Merge pull request #5642 from huniu20/main
[hub support] add modelers hub support

Former-commit-id: bc80ea9a51400dea8680087edde3489eb7df9a0a
2024-10-11 23:45:17 +08:00
huniu20
5a3280ebee bugs fixed
Former-commit-id: 843b5d85e98b312e5d41ce62ec10e199011beb8c
2024-10-11 19:56:13 +08:00
Johnny
27be1e2122 Update parser.py
Former-commit-id: e5849cdcce109e15547edcf9a692e7c13d625e5a
2024-10-11 12:29:33 +02:00
Johnny
39929bda5a Update setup.py
Former-commit-id: cb56673675a9d2ba359dd6d9a561084b9bad7bfd
2024-10-11 12:29:09 +02:00
huniu20
e8e98bb125 add om_hub_token argument
Former-commit-id: 7b91be33c9cd8473453716f0c4c6dec924304efc
2024-10-10 17:16:46 +08:00
huniu20
132c1f1b0f 1. add model and dataset info to support webui
Former-commit-id: 0f669f221a31622ec7a53d0baab5da6a7891f9b6
2024-10-10 16:46:34 +08:00
huniu20
26e897e861 1. add modelers hub support
Former-commit-id: 24ebe187e360753666b768685a0dcc78054bb702
2024-10-09 17:21:37 +08:00
Kingsley
5523a6fd2c Merge branch 'hiyouga:main' into pixtral-patch
Former-commit-id: 93a441a6b746e9a933dad8c45553fb5b68bf2b34
2024-10-08 21:04:08 +08:00
hiyouga
4464a6ff5b tiny fix
Former-commit-id: 451d271718a8026056d0f7d7b8ab333391d24ad4
2024-10-08 17:48:56 +08:00
hoshi-hiyouga
a95fe78ae2 Merge pull request #5546 from chengchengpei/cpei/refactor
1. log exceptions in detail; 2. check processor is None before calling it

Former-commit-id: f8a2cc61054cec215d658159dc2d6fbacf6624e9
2024-10-08 17:46:54 +08:00
hoshi-hiyouga
c639e52c6b Merge branch 'main' into cpei/refactor
Former-commit-id: 2c6262c3cd57906806e53aecd4e5a4423399c1e0
2024-10-08 17:31:17 +08:00
hoshi-hiyouga
d650e461f9 Merge pull request #5615 from johnnynunez/patch-1
Update setup.py (Compatible with Jetson)

Former-commit-id: 89ba4513ae5786c396df9653c3bc8dfdc5e716af
2024-10-07 16:50:34 +08:00
hoshi-hiyouga
b855d3421e Update parser.py
Former-commit-id: 1ce0b42b1e30cb5419c91702a499f23d52db43ee
2024-10-07 16:27:23 +08:00
hoshi-hiyouga
ef87942a42 Update setup.py
Former-commit-id: c2e947de28841c1125991c4ebd60c13f763bf027
2024-10-07 16:26:50 +08:00
Johnny
059c2ffbea Update parser.py
Former-commit-id: 4e638777ebcbf7dea22011361fb341bafe6ba9d9
2024-10-07 10:17:45 +02:00
Johnny
93f14bf121 Update setup.py
Former-commit-id: 6b78d02175c3e03f7304aed64b9fa21e0a71dd0d
2024-10-07 10:16:53 +02:00
codemayq
20cf39ef63 Merge remote-tracking branch 'origin/main'
Former-commit-id: 283017ec52b35b4158b512c12badb13897e20a0a
2024-10-07 12:06:46 +08:00
codemayq
139811bd80 update wechat
Former-commit-id: c9edc4b4e93171c0b99656e7384d95a99c272dd0
2024-10-07 12:06:02 +08:00
hiyouga
74653597f1 update readme
Former-commit-id: 1a7483c1a5fb49dba660f21beb45784ebd829c92
2024-10-07 11:31:18 +08:00
Johnny
9a6045eee6 Update parser.py
Former-commit-id: 6c1aef55604649a956fe928d89280626923815b8
2024-10-06 20:34:19 +02:00
Johnny
332614579c Update setup.py
Former-commit-id: e57f607cb4a6a59b695483b760a0b7e3b762695a
2024-10-06 08:53:55 +02:00
hiyouga
56132983cf fix #5611
Former-commit-id: a45f3f5461e2936b9e119eda2ef4d8c7a4131740
2024-10-06 10:34:55 +08:00
hiyouga
2b3173e5d2 fix #5611
Former-commit-id: eb2db2af8ec061205d7370baa8f15bdaf34ed31c
2024-10-06 10:33:11 +08:00
hiyouga
4bae540cd6 update wechat
Former-commit-id: a9a9ba85740b3d34b2e467ea66bc6eae7561e4ba
2024-10-06 10:13:05 +08:00
hiyouga
e51a6f6367 update wechat
Former-commit-id: 020ef628f9beab3feac82dd65fd8057a0e731bbb
2024-10-01 08:20:56 +08:00
Kingsley
f3ac97a749 Merge branch 'hiyouga:main' into pixtral-patch
Former-commit-id: e53f47c0b3de491d4d9b31c995f9cea100f98896
2024-10-01 00:52:31 +08:00
Kingsley
dd2d1c3154 unfactor md
Former-commit-id: c668568bc73914ba071a4121c4fec1ee7f2ab76c
2024-09-30 23:36:16 +08:00
hiyouga
4df090ff48 fix #5542
Former-commit-id: fe7ffccdb9a45b31e20ab7e88282a75b45504a97
2024-09-30 23:28:55 +08:00
Kingsley
6729ed2c7e sync with former
Former-commit-id: 9ddb84052e3cc72e21a92b8103caa179a35859c4
2024-09-30 20:27:05 +08:00
Kingsley
ececd68f9a register model fix
Former-commit-id: 15d555c8c523ac2252d55614773073c57db0e025
2024-09-30 20:04:47 +08:00
Kingsley
94ce8f561f fix some errors due to inconsistency of model cards
Former-commit-id: 2166b9bc6ba35760ff85b63620af9fa0213a4c78
2024-09-30 19:58:34 +08:00
Kingsley
4f85098088 Merge branch 'hiyouga:main' into pixtral-patch
Former-commit-id: 9ac0fde3f29cfd98e08c53a0e52bf472240ae2e7
2024-09-30 19:33:29 +08:00
hiyouga
3720618c63 add patch processor func
Former-commit-id: 45841bb646afa9d0bc2ea4b6b7b107daa67d90f0
2024-09-30 17:07:43 +08:00
hiyouga
b664bcf307 lint
Former-commit-id: cee3dc484be4bdf31ffe1d8b9c60604b84ed6a00
2024-09-30 17:00:33 +08:00
hoshi-hiyouga
a3f99f123a Merge pull request #5585 from shing100/main
Support EXAONE3.0 Model

Former-commit-id: ef84aaceec87f9fd0f3b5fc0305f1f1e39c1cd60
2024-09-30 16:56:08 +08:00
hoshi-hiyouga
85ed108fa6 Update constants.py
Former-commit-id: 4d7bb692348717d89bd5c682890e98bce5118743
2024-09-30 16:47:52 +08:00
hoshi-hiyouga
54e749d1cf Update template.py
Former-commit-id: 02957fc9c3cd04ef5140f093642b94e74ce86b61
2024-09-30 16:39:48 +08:00
Zhangchi Feng
69e801d456 Merge branch 'main' into pixtral-patch
Former-commit-id: 26f45829b453ff1a0c76f6c1ddaba893d48f821e
2024-09-30 12:37:03 +08:00
shing100
b596102bd5 fix chat template Exaone3.0
Former-commit-id: 2964b20eb59687e24a901926c0da1547d7c046ca
2024-09-30 09:44:21 +09:00
Geun, Lim
5f1209bee6 Update README_zh.md
Former-commit-id: 0dfe9f763b09e00457700111e5d7d0d8b9a81e32
2024-09-30 09:25:02 +09:00
Geun, Lim
73b684c7b0 Update README.md
Former-commit-id: d06440e0e71f02f852a7eac5ea6961c8da2a80f4
2024-09-30 09:24:44 +09:00
shing100
b5eb939ce3 update docs Support model Exaone3.0
Former-commit-id: 826675ff14ddfd291e93c91373df9da43d1388a6
2024-09-30 09:19:27 +09:00
shing100
0a633f8098 add Exaone3.0 template
Former-commit-id: 3a9569647f5dbb1dcd9ef6e5cfc39ec8f9b41e89
2024-09-30 09:18:25 +09:00
hoshi-hiyouga
8d20d6c95c Merge pull request #5574 from BUAADreamer/main
support llava-next(video)/video-llava

Former-commit-id: c7b334e698b776f83ee8e9c99c144d2a4dd6d6a0
2024-09-30 00:22:43 +08:00
hoshi-hiyouga
1f1b8e825d Update common.py
Former-commit-id: 2c17d91bb7ae58346c020c46cb7ffabad4deff4f
2024-09-29 23:58:09 +08:00
hoshi-hiyouga
f051bff1e6 Update README_zh.md
Former-commit-id: e472f355f2b0beba9329137ac625059bc36a671e
2024-09-29 23:56:32 +08:00
hoshi-hiyouga
853c95bb89 Update README.md
Former-commit-id: 63148e72710f1c03cf527f2c06830f725fd035f7
2024-09-29 23:55:55 +08:00
hoshi-hiyouga
4269b4b49a Update README.md
Former-commit-id: 2d37fa1255f289b076ff0fabec6d50077a86f6e4
2024-09-29 23:55:21 +08:00
hoshi-hiyouga
6e4d5d9b2a Update constants.py
Former-commit-id: b257b91cd0a71448af520baa8e864920333da848
2024-09-29 23:45:34 +08:00
hoshi-hiyouga
81cf3bff08 Update test_mm_plugin.py
Former-commit-id: 8b50ce516af715a16c53568fb3420d6441aca1ae
2024-09-29 22:59:47 +08:00
BUAADreamer
f1b16236a4 fix template
Former-commit-id: 01ca056965bb63c5e62ec53ffdb51a062bf64371
2024-09-29 22:56:36 +08:00
BUAADreamer
1c69eea995 fix template
Former-commit-id: 96bec6817a43c58e2d4b294dd52fee4cce10b02d
2024-09-29 22:55:45 +08:00
BUAADreamer
b37bb592ec fix constants
Former-commit-id: bec1cb8d55d01ac8b70b9bacd92a227b48cf8336
2024-09-29 22:40:43 +08:00
BUAADreamer
362788cb09 Merge branch 'main' of https://github.com/BUAADreamer/LLaMA-Factory
Former-commit-id: 671824d1a1f41ee025afaad1d654da0d70909376
2024-09-29 22:00:35 +08:00
BUAADreamer
87ab7fc01c fix constants
Former-commit-id: 485fc047169afd027ee65d05e3c5c08b371b6c4d
2024-09-29 22:00:01 +08:00
hoshi-hiyouga
15dbd4893e Update requirements.txt
Former-commit-id: 905b7c03ae074bd958afdab6d79e45b30cec5271
2024-09-29 21:51:23 +08:00
BUAADreamer
ddec40ac16 fix style
Former-commit-id: 23916d57c1d22653739dbf913d3e427fcb978a15
2024-09-29 21:39:37 +08:00
Zhangchi Feng
8e164f3594 Merge branch 'main' into main
Former-commit-id: 83abf86657ea38968e953e1dc4a2e8c34471b06a
2024-09-29 21:32:54 +08:00
BUAADreamer
1b71afb277 add more llava-next series template
Former-commit-id: 65a8923f5a7d20d34fabf4f81746fe9b7bc8c84a
2024-09-29 21:29:29 +08:00
BUAADreamer
1ca0ccb4a6 Merge branch 'main' of https://github.com/BUAADreamer/LLaMA-Factory
Former-commit-id: 87c8a7e75932d109fd677b5c58ca20206d734cea
2024-09-29 20:55:23 +08:00
BUAADreamer
8231359bbb fix readme_zh
Former-commit-id: 45b01df6a6103d36bbd16204327ac88ea0a02971
2024-09-29 20:55:18 +08:00
hoshi-hiyouga
ec793d16de Update mm_plugin.py
Former-commit-id: ffaea305fc405c9892aa0c9712d98185d9241e69
2024-09-29 20:54:04 +08:00
hoshi-hiyouga
ce0c73c032 Update mm_plugin.py
Former-commit-id: 0257a67cb266dcaee8bfb358d88ef2be2403a2f7
2024-09-29 20:53:34 +08:00
BUAADreamer
ee3fe4226d fix readme_zh
Former-commit-id: 1a757c5ab243c03e2f9c0dcd884e34d752885689
2024-09-29 20:46:47 +08:00
BUAADreamer
534dc58363 fix readme
Former-commit-id: bf0bcbc5ec4ca0182ade283ea9f37012f224f519
2024-09-29 20:45:02 +08:00
BUAADreamer
66213043ac tiny fix
Former-commit-id: 7397827aec55eecad1e70878453387fda0db62b5
2024-09-29 20:38:46 +08:00
BUAADreamer
8ee588248e fix style
Former-commit-id: 6ddea0f3d3ef568378470ce967a0e8d02eeac5dd
2024-09-29 20:30:57 +08:00
BUAADreamer
96d51325ad fix tests
Former-commit-id: 97d1536ee1f85b0a7a84fc6cb390fc502287a74c
2024-09-29 18:00:45 +08:00
BUAADreamer
1e2ea34419 fix some
Former-commit-id: d5c69400cd27cdf0667290f3863a3aab47143eb3
2024-09-29 17:55:40 +08:00
hoshi-hiyouga
e265082db8 update readme
Former-commit-id: a2bd6944cd85fdca83407c1cb354f61e57e2ac78
2024-09-29 05:02:44 +00:00
hoshi-hiyouga
786e013375 Merge pull request #5580 from amrear/main
made a small change to a warning about fa2 for gemma2 models.

Former-commit-id: 5ec161d6bdbb3ecc0ca0b10ea24ac7270618263d
2024-09-29 12:45:03 +08:00
BUAADreamer
615edf937e fix some params of visual regularize
Former-commit-id: 1d09d592d3beacca6a17bb3e5a9728150741e551
2024-09-29 12:38:25 +08:00
hoshi-hiyouga
5df765e376 Update attention.py
Former-commit-id: fe7057a8a3eb111cdaf8349b6ac077d898bf4935
2024-09-29 10:47:41 +08:00
Kingsley
a2452d0b1c Tiny fix
Former-commit-id: 8f13a3627d06a6f0a9b4e35443a415958d9ad1c9
2024-09-29 00:00:23 +08:00
Amirreza A
6ae0e27c8b made a small change to a warning about fa2 for gemma2 models.
Former-commit-id: 94ee105526d817e59bfd91f7bd4161d7cb2fd216
2024-09-28 19:03:36 +03:30
Kingsley
fd79cf8551 tiny fix
Former-commit-id: 3d3cc6705d4575f7f20bf4da2b7dab60b337006b
2024-09-28 22:50:53 +08:00
Kingsley
66e473d519 remove some unnecessary if conditions
Former-commit-id: de06e2678e2168586614242f65939c5772e78774
2024-09-28 02:14:06 +08:00
BUAADreamer
0e33902f61 fix some
Former-commit-id: 7f3f81009e3728fe25b9c063491ee71acc498c35
2024-09-28 01:15:33 +08:00
BUAADreamer
7ab42cb582 modify some style
Former-commit-id: 1b21793543f4a03252e81a86e06ec91fd2a51fa5
2024-09-28 01:07:38 +08:00
BUAADreamer
3501257780 add tests
Former-commit-id: f3be3d21e7efcb2f596ee96f5a97bb53438f2d89
2024-09-28 00:59:14 +08:00
BUAADreamer
5aa1e847d9 add llava-next/llava-next-video/video-llava
Former-commit-id: 6642cd501d55a1657678428ef2aa0c9b99b7e83f
2024-09-28 00:57:03 +08:00
Zhangchi Feng
c576b7ca32 Merge branch 'hiyouga:main' into main
Former-commit-id: 900631755b28692bb150a8cf39354af4e2e986c9
2024-09-27 18:14:39 +08:00
Kingsley
b76116bb6c add pixtral template
Former-commit-id: 7b3336dd97e06a11ec52433ef36980aefdbb45ba
2024-09-26 17:14:51 +08:00
Kingsley
35e44143fd Merge branches 'pixtral-patch' and 'pixtral-patch' of https://github.com/Kuangdd01/LLaMA-Factory-X into pixtral-patch
Former-commit-id: 5e64b0c37165a50296036a6e09e09193fb2ad644
2024-09-26 12:18:25 +08:00
Kingsley
c436d6ea0b add pixtral template
Former-commit-id: 86f5a9be548ef02ce334bba35a529c70e8b3ad7f
2024-09-26 12:11:58 +08:00
Chengcheng Pei
bc7197dcfc address comments
Former-commit-id: 6bfd8442671eec4c0fb63fd34667a44760fd84e2
2024-09-25 21:07:51 -07:00
hoshi-hiyouga
5b128e6b0e Merge pull request #5547 from marko1616/chore/llama3.2
Chore: Support llama3.2.
Former-commit-id: 6b62550af1acda93246e05b37061f2ad7db58e55
2024-09-26 11:38:34 +08:00
hoshi-hiyouga
a73988141b add modelscope models
Former-commit-id: 8e5d12c2c4b687dc0d2c5bc25a916ba9f6ce67c9
2024-09-26 11:22:48 +08:00
marko1616
b70da07977 Chore: Support llama3.2.
Former-commit-id: 885a0b77ab83bf001d7175e2ba440f7928fa4731
2024-09-25 16:08:44 -04:00
Chengcheng Pei
e80c98367e 1. log exceptions in detail; 2. check processor is None before calling it.
Former-commit-id: 573e3183e644e8da61a409d96b9adcfacbfc3a7a
2024-09-25 12:59:48 -07:00
hiyouga
fb75821793 update wechat
Former-commit-id: 46e2c1ea9f0b897a8ebe215cead12ff86f24d984
2024-09-25 23:20:55 +08:00
hiyouga
33fc7bec85 fix ci
Former-commit-id: b8e616183cb0252da6efbeac3372b78098b2d6bd
2024-09-25 23:14:17 +08:00
hoshi-hiyouga
eb00df98be Merge pull request #5533 from StrangeBytesOrg/add-docker-args
Add additional install options to Dockerfiles

Former-commit-id: e7f6a9a925bece63dcdca7ce1129b6342c64a52a
2024-09-25 23:04:57 +08:00
hiyouga
f30e0a75c4 optionally replace jinja template
Former-commit-id: ba52103ba7f8f0c856691c4a9a706a6e19e73c1e
2024-09-25 23:02:02 +08:00
hiyouga
52a6667da6 update readme
Former-commit-id: 5eb871cbf492597bf69e5bf8840ec577dc8c0bae
2024-09-25 20:13:04 +08:00
hiyouga
3bcfd73898 update readme
Former-commit-id: 76487764706268a31a3964ac0cbf7799ce5fea30
2024-09-25 19:39:52 +08:00
hiyouga
4aa0493e26 update wechat
Former-commit-id: e4b830a65d20e1cb234c700200889ddd070b0a3b
2024-09-25 19:23:32 +08:00
StrangeBytesDev
d1e766898c Add additional install options to Dockerfiles
Former-commit-id: 237e302b5ccf029dec0c7f859c40f122bf92e1c5
2024-09-24 16:54:46 -07:00
huangpan.foo
fefe2aa0e4 Add deepseek-v2.5 template
Former-commit-id: ee059c393baecc4a4dfd44171d24cab45c73c962
2024-09-21 19:33:30 +08:00
hoshi-hiyouga
944ae8780c Merge pull request #5483 from whybeyoung/main
fix: raise a clear error and stop training when a function_call value in a function call dataset is invalid JSON.
Former-commit-id: af8dfb58fd4bd123a76fa8f393cf928995ec91b9
2024-09-19 17:01:52 +08:00
hoshi-hiyouga
1ccc2d198e flat string
Former-commit-id: f2301305637aebc026895a7d4837e5def2c23159
2024-09-19 16:43:42 +08:00
hoshi-hiyouga
e4d26efea9 lint
Former-commit-id: af7f547ecbcae11761aa934f8b25dd5ea8c35516
2024-09-19 16:21:43 +08:00
hoshi-hiyouga
89138b8ab8 fix bug
Former-commit-id: 98b464d2dd4ca303d24ae2bb31989f4458b18146
2024-09-19 16:21:21 +08:00
hoshi-hiyouga
545ca7db99 improve error message
Former-commit-id: 36611d5c098d444454f0e41a15664f14c2975547
2024-09-19 16:06:00 +08:00
ybyang
daa1309466 fix: raise a clear error and stop training when a function_call value in a function call dataset is invalid JSON.
Former-commit-id: 953e1a0fb226e63b797cab0007c53d3ebc6456bb
2024-09-19 15:00:10 +08:00
menibrief
4ed2b629a5 fix phi-small template
Former-commit-id: 9c1bbfac2173450c56ded2cf343e3a37ba369b7b
2024-09-18 23:52:30 +03:00
menibrief
c9cd388630 Update README.md
update readme to phi-small template

Former-commit-id: 5519a046a24c8403d094c758f044b00f94a7bbbd
2024-09-18 23:51:36 +03:00
hoshi-hiyouga
27da57b179 tiny fix
Former-commit-id: ed7d3739153fb9a100fa5c3d24bbdf2c797faed0
2024-09-19 02:20:24 +08:00
hoshi-hiyouga
f2dca55ae9 Update README_zh.md
Former-commit-id: 485d6f877268b67cab32b1d3b3fb7823491bd1d2
2024-09-19 02:17:59 +08:00
hoshi-hiyouga
47e4b3724b Update README.md
Former-commit-id: cb2cebcd72721286d763adb49dfade57aed9dc68
2024-09-19 02:16:16 +08:00
hoshi-hiyouga
2d19ba04e3 fix webui
Former-commit-id: f0b930d94a31ed46a605b3e8e3d989fdc344e61b
2024-09-19 02:13:39 +08:00
hoshi-hiyouga
56058e2e84 add qwen2.5 models
Former-commit-id: 92ef62f5025475606e533947b7d9c3cae9bfcdbf
2024-09-19 02:07:54 +08:00
hoshi-hiyouga
90610a9093 Merge pull request #5438 from aliencaocao/patch-1
Add qwen_vl to liger kernel supported list

Former-commit-id: 1a3e6545b2e1d2dab01d2a257130a47da62e747a
2024-09-16 13:40:02 +08:00
Billy Cao
38e955d4a9 Add qwen_vl to liger kernel supported list
Former-commit-id: 7a2958a44f3b99cddb91f7b67fa0dd1c26c1a991
2024-09-14 19:28:20 +08:00
hiyouga
2a5e52db51 Merge branch 'main' of https://github.com/hiyouga/LLaMA-Factory
Former-commit-id: e22ac05fd7a581a0615ef03f514a54f7d7674594
2024-09-14 13:42:47 +08:00
hiyouga
c4e7809ad5 update wechat
Former-commit-id: 9ea2df99e67bde9340f8a9a3e9e00e4e3fd5d841
2024-09-14 13:42:31 +08:00
hoshi-hiyouga
df9f30fdf8 Merge pull request #5427 from HardAndHeavy/update-rocm
Update the ROCm version to 6.2

Former-commit-id: a76617728a1fecefb959a2f25d1ffbff22a56dc1
2024-09-13 10:25:47 +08:00
HardAndHeavy
a39dc400ed update the ROCm version to 6.2
Former-commit-id: b1f9ccb29f6899406253cdde1f3546d6a7353669
2024-09-12 23:46:33 +03:00
hiyouga
5456ec5fe1 update ci
https://github.com/huggingface/transformers/pull/33436

Former-commit-id: ac06b78d615a4a7ab51f7a54be3c22435909cdd3
2024-09-11 20:44:42 +08:00
hiyouga
d2f8bcb890 set dev version
Former-commit-id: 0ded76578450f71dfe6570fbba7caaa65c004f03
2024-09-11 18:56:37 +08:00
hiyouga
acfe212a63 remove windows in ci
Former-commit-id: ba62d3257f6c3c0c4a08b421e0f7eba4b052902f
2024-09-11 18:14:39 +08:00
hiyouga
42b6e1c6c1 fix ci
Former-commit-id: ffac6541a98aa0f5abbb040f7df89952f7188d0c
2024-09-11 18:01:09 +08:00
hiyouga
009500bc6d fix #5411
Former-commit-id: c7e51ff187658eb472c2b234f75d8934c6f7c782
2024-09-11 17:36:42 +08:00
BUAADreamer
75585d01f0 try to pass test
Former-commit-id: 0042a144d532aecd0159956d6a275eb9fbfa305d
2024-09-10 13:29:09 +08:00
BUAADreamer
0be477292b try to pass test
Former-commit-id: e387216d49cf261c82d77df22424e6360dbcfc2a
2024-09-10 13:25:30 +08:00
Zhangchi Feng
4b6606832c Merge branch 'hiyouga:main' into main
Former-commit-id: 4643089a7dc6a88c391663131333f35b5da5015b
2024-09-10 13:20:24 +08:00
BUAADreamer
16c7326bc5 try to pass test
Former-commit-id: 7b4ba0efb658422fd29dca63bac1e9cee8e82af8
2024-09-10 13:12:51 +08:00
BUAADreamer
677d57b7c7 try to pass test
Former-commit-id: b0cff7d9096b4b54927ce665fde58b7f34b11439
2024-09-10 12:56:12 +08:00
BUAADreamer
5c53cf3244 resolve conflict
Former-commit-id: 96decf82b872a4ec06450b4440336d38475d1d02
2024-09-10 12:39:17 +08:00
BUAADreamer
f00f4ae9b6 support llava-next(video)
Former-commit-id: 31259e7e0caa9ff6449b4abcee0554e211167178
2024-09-10 12:31:53 +08:00
hiyouga
38505ae9e1 update accelerate ver for schedule_free optimizers
Former-commit-id: bdde35fd2e4a919c1d63ebfc9a0ea8ba0c97e14c
2024-09-09 22:51:08 +08:00
hiyouga
2bb1ee3292 fix mm plugin
Former-commit-id: 6dc2b00fa2bcbdbad71cb0d96e14d75a7db932e2
2024-09-09 22:41:28 +08:00
hiyouga
c52eeb70e7 fix qwen2vl preprocess
Former-commit-id: c93795ae14b1d5a0a3440d18f8197fd53cd013da
2024-09-09 22:33:33 +08:00
hiyouga
3aefdad4ec release v0.9.0 (real)
Former-commit-id: 90d6df622252c6fad985f68b97771c979357e2fc
2024-09-09 01:00:25 +08:00
hiyouga
561ae4d1af fix constants
Former-commit-id: 653fe70acbe44853fa0ad073a9b8391d75ef6c2a
2024-09-08 23:52:30 +08:00
hiyouga
fb9280a0a7 release v0.9.0
Former-commit-id: 54b5c4b8195d23bd9dcc1921af9910d5bdd181fd
2024-09-08 23:43:35 +08:00
hiyouga
0229263fbe tiny fix
Former-commit-id: c9b3870adb60a2aca8cfd82c1a8b8044319bacbc
2024-09-08 23:18:08 +08:00
hiyouga
ec6b85d8f9 update wechat
Former-commit-id: 60114179eb699a7d3906e085f917957c7056d320
2024-09-08 14:18:12 +08:00
hiyouga
dc64166d13 update scripts
Former-commit-id: f2aa02c070b1442bfb58a075f22af25386192bfa
2024-09-08 14:17:41 +08:00
hiyouga
78cf256067 support vllm 0.6.0
Former-commit-id: b6681d7198acf4acbebfe271dd22095e236bc430
2024-09-08 02:26:20 +08:00
hiyouga
f6f58ebef0 fix test case
Former-commit-id: b332908ab4aad392e39f0b8661d100f096d8a6ec
2024-09-08 01:50:51 +08:00
hiyouga
945841503e add test case
Former-commit-id: 52a06efaf8af26d16137ba9095f1fd81e8f61983
2024-09-08 01:40:49 +08:00
hiyouga
0daee7cb39 support activation offloading via unsloth gc
Former-commit-id: fb72a3adb0916232cc9ac9f0c725c02d07b9354c
2024-09-08 01:22:19 +08:00
hiyouga
7ccb86b215 add docstrings, refactor logger
Former-commit-id: 54c69059379d77dc9046c144cbe2d0253de3a4da
2024-09-08 00:56:56 +08:00
hoshi-hiyouga
857d5b9d0a Merge pull request #5388 from yzoaim/cal_mfu_update
update cal_mfu.py

Former-commit-id: 8eac1b929ff7e42f34d443030f9582692ec753db
2024-09-08 00:49:28 +08:00
hoshi-hiyouga
4d8b782268 fix
Former-commit-id: 842c5455b0e7c4b8d307af950304d4ed52bbb3e8
2024-09-08 00:41:45 +08:00
hoshi-hiyouga
4f28e0e5d2 Update cal_mfu.py
Former-commit-id: e71133cab14eed2c071cfd4ce044ae9905447427
2024-09-08 00:39:48 +08:00
-.-
ab1775cd95 update cal_mfu.py
Former-commit-id: 66ec36522c9bf8dfffc1065202362801875a104d
2024-09-07 23:21:35 +08:00
hoshi-hiyouga
4ee9efbd98 fix #5384
Former-commit-id: 36665f3001647b8411ba0e256d1e64eb157abfaf
2024-09-07 01:21:14 +08:00
hiyouga
995491594d tiny fix
Former-commit-id: 76f2e5950483c669a15a961f0554442b6eb5c4a6
2024-09-05 23:41:16 +08:00
hiyouga
52d3c42265 fix ci
Former-commit-id: eb5aa9adce7c01d453d45d2c901e530584e46eb6
2024-09-05 22:39:47 +08:00
hiyouga
5585713182 fix ci
Former-commit-id: 2f939b708f7f183f87aca67aa076db33a3c8a610
2024-09-05 22:27:48 +08:00
hiyouga
3aa6a3e45b add e2e tests
Former-commit-id: 94d5b1bd8f49dabeb9e3c53d634cfb3c06b0241d
2024-09-05 21:52:28 +08:00
hoshi-hiyouga
de277a8ab8 Merge pull request #5372 from LDLINGLINGLING/main
Add support for MiniCPM3.0

Former-commit-id: 12743562639ccc6eb0caf170e7123d9844e2b4a6
2024-09-05 21:35:42 +08:00
liudan
1797fe50a4 revise code to follow the code style guidelines
Former-commit-id: 3d3fbaaff98da327e10bdebb4aedbdf1ec9565e8
2024-09-05 20:17:55 +08:00
hoshi-hiyouga
f6014742fa fix #5366
Former-commit-id: e9bda48c6d7bde135df6456513708a997ada916c
2024-09-05 18:08:09 +08:00
hiyouga
dec6ff046b update data readme
Former-commit-id: 70e36ff2f4b500d987160f3a57d5fb3d4d2007d5
2024-09-05 04:44:49 +08:00
hiyouga
c4d7d76358 update data readme
Former-commit-id: 6055fe02deb3585b4330a7902bf8821dd41ea5cb
2024-09-05 04:25:27 +08:00
hiyouga
4fccc65579 support Yi-Coder models
Former-commit-id: 359ef8bb0ebb8ccf9651ac2b737c5a705dab6bad
2024-09-05 03:12:24 +08:00
hiyouga
666013d09d fix ci
Former-commit-id: 1173f7fc1dbdcf814650bfdf854ade5212fc4738
2024-09-05 03:02:59 +08:00
hiyouga
9f36534b49 fix ci
Former-commit-id: b6a3fdd056d77dbe692053bc22a8923e24ed2256
2024-09-05 02:49:22 +08:00
hiyouga
cc02fb6180 update ci
Former-commit-id: 315284f6272f2510f28b3267015aeb7336721bce
2024-09-05 02:26:10 +08:00
hoshi-hiyouga
ce77a89d8c Merge pull request #5365 from hiyouga/video_finetuning
Support Qwen2-VL Fine-Tuning on Video Datasets

Former-commit-id: 46b1765d0374bfc93d6a3af8669af1c2307814a7
2024-09-05 02:24:58 +08:00
hiyouga
ac33d2f4da tiny fix
Former-commit-id: c122b9f8657d1ca3032b1b6a6cf9cc61f11aaa82
2024-09-05 02:16:49 +08:00
hiyouga
9df7a26e6b video datasets
Former-commit-id: 8cafc7b055a854f483ad1c67f3d487ffd34b5f89
2024-09-05 02:04:17 +08:00
liudan
09cff03026 add support for MiniCPM3.0
Former-commit-id: d7ba97be484bf781d6fe80252ea29eb505b261bb
2024-09-04 23:10:05 +08:00
hiyouga
60d770e4b1 fix test
Former-commit-id: c5e1736470c4c54de5a89ea1c58154b7fed2405d
2024-09-04 22:38:26 +08:00
hiyouga
d5ea05cfff update get template
Former-commit-id: dabad5570bf4a6b1044c963d8f27717030f373ef
2024-09-04 22:36:20 +08:00
hoshi-hiyouga
1dfd1aaf82 Merge pull request #5323 from naem1023/feat/add-dataset-map-batch-size-argument
Add batch size of map function in the preprocessed dataset

Former-commit-id: 8f441c2b3a5bb84dec2c037a541084c0201726c6
2024-09-04 22:09:36 +08:00
hoshi-hiyouga
8ac74c8ccb fix #5228
Former-commit-id: 44d6947e554cd61cff23c297248fff32a5f554da
2024-09-04 19:10:30 +08:00
hiyouga
53aeacae81 fix #5252
Former-commit-id: ebddce5d4a2ebb40ce71462f81701e274eb3c29f
2024-09-04 03:17:54 +08:00
hiyouga
af8c4b4e20 add vl_feedback dataset
Former-commit-id: 57497135bf0a956af9c6893177ee97504b9f34ac
2024-09-04 03:13:03 +08:00
hiyouga
a3d47818b7 fix #5344
Former-commit-id: d41d43a7c37cd10e34c9f399d1a346ffaee641c3
2024-09-04 03:06:06 +08:00
hoshi-hiyouga
a65b5061af Merge pull request #5346 from hiyouga/lazy_image
[exp] Lazyload for multimodal inputs

Former-commit-id: ce7ed6eb99f075e4a3337a86fd6a245bc98c479e
2024-09-04 03:00:53 +08:00
hiyouga
22deca0e9e lazy image load
Former-commit-id: 47ea97fb1ba77de2e8a561904aa8fdc27c3f5025
2024-09-04 02:27:08 +08:00
hiyouga
5ef58eb655 fix #5334
Former-commit-id: 59d2b31e968677263f005f57ae8a56fc758307a7
2024-09-03 19:09:42 +08:00
hiyouga
f837ae8cb5 fix #5338
Former-commit-id: 69d0acacc38dc90a97ce18447f865f525fff086a
2024-09-03 17:45:17 +08:00
hiyouga
982585e375 lint
Former-commit-id: 22959bcdd3b124a642e2acaadc050e36d0520f52
2024-09-03 00:46:25 +08:00
hiyouga
6e98872622 fix #5324
Former-commit-id: a61c8c4890962f3847b19eff31b170cd7f54316c
2024-09-02 23:56:21 +08:00
naem1023
46695e42cc feat: add batch size of map function in the preprocessed dataset
Former-commit-id: 209313eeeab8d1a7c320bd9aa90a5f4656082b7c
2024-09-02 13:52:47 +09:00
hoshi-hiyouga
5af92971bc fix trainer predict
Former-commit-id: 99fd9637bdc25f41fd1abc8a162f1069cb9060d4
2024-09-02 10:15:29 +08:00
hoshi-hiyouga
5c9972a2d5 remove .cpu()
Former-commit-id: a6c6750e8af5bc1ece1dfe6111d3e484fd19ee75
2024-09-02 10:10:53 +08:00
hiyouga
b2a5f49a24 fix mm inference
Former-commit-id: 60fc6b926ead923dbb487b595ed4aa4cbfd94805
2024-09-02 01:47:40 +08:00
hiyouga
f13e974930 tiny fix
Former-commit-id: 3a6f19f017539a2be31e348845a8ed8712275dcb
2024-09-02 01:33:22 +08:00
hiyouga
859823eb23 add image num check
Former-commit-id: ce8c5a264742161101da2094d0fc31c7fa7fb641
2024-09-02 01:31:36 +08:00
hiyouga
549adc888b add pokemon dataset
Former-commit-id: 194064fdae0226dd22522586c9d47c5866a71a8e
2024-09-02 01:02:25 +08:00
hiyouga
65425aeb63 update readme
Former-commit-id: a8f8a2ad8a8c026de528c9bcfb23b65d004bfc96
2024-09-01 23:32:39 +08:00
hiyouga
fd2a99ee5c update wechat
Former-commit-id: 753c5fb36c8b3f9a4488626b0a69396ffcfbbd37
2024-09-01 23:30:57 +08:00
hoshi-hiyouga
69fbecf1fb Merge pull request #5317 from ByronHsu/patch-1
Add liger kernel link

Former-commit-id: 6f9e455af481790aaf16af620ee36f76293e93d0
2024-09-01 23:30:12 +08:00
hiyouga
bfdcc6bacf add rlhf-v dataset
Former-commit-id: 8e49940746c1a6ff910f07dbefbec14af9d0f3c6
2024-09-01 22:57:41 +08:00
hiyouga
236f97b35c tiny fix
Former-commit-id: 55027282cdaa59a470ac89bfb3860504ba9075ff
2024-09-01 21:15:44 +08:00
hiyouga
413a206652 fix bug
Former-commit-id: 64cb947c60398dfdfc2877f898147b0240089ea3
2024-09-01 21:07:49 +08:00
hiyouga
cb776752f6 fix mixed mm inputs and rlhf-v
Former-commit-id: 9967ccb3aef3ca557ad6eafb78c6c99866857008
2024-09-01 20:52:47 +08:00
Byron Hsu
7c0d1a5ff1 Add liger kernel link
Former-commit-id: b8a9cb554efc4c2dedacb48833c5152d2cd2fec5
2024-08-30 17:16:16 -07:00
hiyouga
34dc36462c fix ci (temp)
Former-commit-id: b5063b41449d1f81d2031980d6480a285e359ebf
2024-08-31 02:03:56 +08:00
hiyouga
09a2ecebc4 add test mm plugin
Former-commit-id: a2a8c0b92c49fb1ee65de271aec651e011dcabc4
2024-08-31 01:53:38 +08:00
hiyouga
f31e7e0dfc remove visual_inputs, fix qlora
Former-commit-id: a025c3df61db154bef13033518903bbf846f4fc8
2024-08-31 00:24:51 +08:00
hiyouga
51a0016873 optimize predict vram
Former-commit-id: a244f143f48a01910ce1cd56c0855ef11d62a72a
2024-08-30 23:08:45 +08:00
hiyouga
c883542583 add examples
Former-commit-id: e08045a9468986edf1e84001e6043db0ee2e5265
2024-08-30 21:43:19 +08:00
hiyouga
92c398166d tiny fix
Former-commit-id: bee1bd43b946501690d70e4980205f9d82404296
2024-08-30 03:21:50 +08:00
hiyouga
913ee05e74 fix #5307
Former-commit-id: 8b588c7224e631a96c09f3460effc5dbfca6ec14
2024-08-30 02:45:40 +08:00
hiyouga
a83756b5e9 refactor mm training
Former-commit-id: 3382317e32f88ed377d3e7759bdeaf0f2559d22a
2024-08-30 02:14:31 +08:00
hoshi-hiyouga
98b0c7530c Merge pull request #5290 from simonJJJ/qwen2_vl
support qwen2-vl

Former-commit-id: 727e1848401d306274fb60ba78f66fed577b7b55
2024-08-30 02:10:36 +08:00
hoshi-hiyouga
5730243179 fix bug
Former-commit-id: a8f22d8895c07ce37e8cb34b7014f3e9bcdbf6a3
2024-08-30 02:05:26 +08:00
hiyouga
0e4ee9d9a3 update liger kernel
Former-commit-id: a7dd7d325e68c92c7470c1e9ef83a7c8abcbc616
2024-08-29 20:46:08 +08:00
hiyouga
f153ee13be fix #5292
Former-commit-id: aa1afdc75614868172bd2f9c052647b8f226d3f2
2024-08-29 20:37:47 +08:00
hiyouga
21d3976eea fix #5295
Former-commit-id: ad72f3e06593f124d661d61774def336511716e0
2024-08-29 20:30:18 +08:00
hiyouga
1494fa1f18 fix #5305
Former-commit-id: 364b757e306f7a154359a2bc8245a839f39c4fab
2024-08-29 20:16:01 +08:00
simonJJJ
d9fd8f26cc update
Former-commit-id: 734e019cc1c120c23305d6f6bc1fb5b8e0d21b30
2024-08-28 20:22:46 +08:00
simonJJJ
8a09b1e732 initial-commit
Former-commit-id: aeb85f200bd824748008dae6047c2607dfcdf174
2024-08-28 16:51:35 +08:00
hiyouga
efd60f0306 update wechat
Former-commit-id: 0f5a0f64f798eccd1fe76a74cf537002135e7e85
2024-08-27 12:55:23 +08:00
hiyouga
c2df70e925 add extra requires
Former-commit-id: d14edd350ddf268cfeea0f4e9e3c43f38516b848
2024-08-27 12:52:12 +08:00
hiyouga
7b5834b2dd tiny fix
Former-commit-id: f6ae4e75ddaeb4ac4a527f0141ac5b1afefde10e
2024-08-27 12:49:32 +08:00
hoshi-hiyouga
7cbfd64289 Merge pull request #5237 from marko1616/patch-1
Fix mllm api

Former-commit-id: dbe886ae5cf613f537d70c352cb758a524a14279
2024-08-27 12:24:43 +08:00
marko1616
388c3e4e47 ruff pass.
Former-commit-id: df8d5b6985bb209e841347ee51fc6787d9bcd069
2024-08-27 11:30:16 +08:00
marko1616
53a2b16c38 Update chat.py
Former-commit-id: 1545684c3f176a95554afa64f313f00b53a32a77
2024-08-27 11:27:56 +08:00
hiyouga
c765292093 support liger kernel
Former-commit-id: 72bc8f01111ad69b92a647b54b4af988515d9c34
2024-08-27 11:20:14 +08:00
marko1616
f3778f65f8 Force re check.
Former-commit-id: 3a285217107f3042a4d6b7a60eedcdd07f73627d
2024-08-23 14:43:18 +08:00
marko1616
95500687dd Update chat.py
Former-commit-id: 8eb20929217feed91097f60460802f6c6ba7b92b
2024-08-22 12:24:34 +08:00
marko1616
7f6d267e98 Update chat.py
Former-commit-id: a4f1de9d82743bfd35a6dbc1027242533b22ed5f
2024-08-22 12:14:34 +08:00
hoshi-hiyouga
e9800212fa Merge pull request #5230 from MengqingCao/image
[NPU] Update npu base image

Former-commit-id: 36039b0fe01c17ae30dba60e247d7ba8a1beb20a
2024-08-21 22:13:07 +08:00
hiyouga
f78735528d update wechat
Former-commit-id: 8907150c1ba56ef406838b2a0993f1f8fc6b95b5
2024-08-21 22:07:34 +08:00
MengqingCao
a620fa8be3 update npu base image
Former-commit-id: b3f4acd1b4636d15afdfde61d77af521fff39b19
2024-08-21 09:12:38 +00:00
hiyouga
daebca2368 tiny fix
Former-commit-id: c8b4c7fee5398654683b713ad5c03b5daf13218a
2024-08-20 00:10:52 +08:00
hoshi-hiyouga
a7604a95c1 Merge pull request #5156 from YeQiuO/main
fix Llama-template's system prompt bug

Former-commit-id: 15be2963477e7d3a9fe4330d7701d457dc49b583
2024-08-20 00:09:03 +08:00
hoshi-hiyouga
103132aa99 Update template.py
Former-commit-id: ec72eeca521ba4ec71f0c52de9eec49da2cf0feb
2024-08-20 00:03:33 +08:00
hoshi-hiyouga
033dd1f1dc Merge pull request #5163 from liu-zichen/fix_ppo_optim
fix lr not change

Former-commit-id: da335d42c30df2c2898516e290c0059250d57b63
2024-08-19 23:56:24 +08:00
hoshi-hiyouga
5e19a604a6 Merge pull request #5185 from chenhuiyu/feature/add-sailorllm-template
Add SailorLLM template

Former-commit-id: f59c9bef31429a0775e71b0ee01f84f0d33c3149
2024-08-19 23:51:49 +08:00
hoshi-hiyouga
5582674f06 Merge pull request #5188 from Zxilly/main
fix: report correct device count for intel xpu
Former-commit-id: d39f4a62d3c5a3bbbf39d1eb4b92439acedae18e
2024-08-19 23:51:39 +08:00
hoshi-hiyouga
57b51c8617 Merge pull request #5193 from Ricardo-L-C/main
_is_bf16_available judgment supports npu

Former-commit-id: 5d5bfc83e646ae41701ce208386b56297efd85f0
2024-08-19 23:40:59 +08:00
hoshi-hiyouga
a921505f59 Update template.py
Former-commit-id: 5f3300ec5de564df23c94ebd9662c86708f37ddb
2024-08-19 23:40:16 +08:00
hiyouga
fc1aefa4b1 update readme
Former-commit-id: 3804ddec9e4227c02f0e0d43b7dd240cf15716a8
2024-08-19 23:32:04 +08:00
Ricardo
a9312387bc _is_bf16_available judgment supports npu
Former-commit-id: 384ab8db84eef7d1f6a7643c15c565a7d4906a5c
2024-08-16 02:58:22 +00:00
Zxilly
41a8387195 fix: report correct device count for intel xpu
Former-commit-id: dc36fcc3de721bdd28edd4eed36677e59a7614be
2024-08-15 08:30:43 +00:00
Huiyu Chen
66a7f4f128 Add SailorLLM template
Former-commit-id: 2502833a7755d653e8492cb7f1215dc0105b6ee0
2024-08-15 15:10:14 +08:00
liu-zichen
8a7ab8ab21 fix lr not change
Former-commit-id: ddee718b31a5bf3cb39c5adf3f8e0be8fddf9dbb
2024-08-13 16:33:34 +08:00
codingma
753cb0f9b6 add tutorial and doc links
Former-commit-id: 625a0e32c47aeb72a6fe9c3536914996912e89d4
2024-08-13 16:13:10 +08:00
codingma
919472435f update wechat.jpg
Former-commit-id: 5b9d99ebc617f9830552e15897a347ff9d5bdeb7
2024-08-13 16:12:36 +08:00
“Wzw”
3e159a0a83 fix Llama-template's system prompt bug
Former-commit-id: bcbbf4506300fc132e68a39a9a6dfa5e61497c8b
2024-08-12 19:22:12 +08:00
hiyouga
684d621edc update readme
Former-commit-id: c93d55bfb084fd91436b99dba5a79aa16432e136
2024-08-10 10:17:35 +08:00
hiyouga
a0f1cc7445 update readme
Former-commit-id: 576a894f7734711a5b11ae764f42fa8d00427d4a
2024-08-09 20:46:02 +08:00
hiyouga
bea270042b add magpie ultra dataset
Former-commit-id: c75b5b83c4982a6da1512ad6f9cc4d98cc761094
2024-08-09 20:28:55 +08:00
hiyouga
a8add5c04b add qwen2 math models
Former-commit-id: dc770efb14bd6e18421511912fbb959a3cf9f78d
2024-08-09 20:20:35 +08:00
hiyouga
f8c11bd540 update examples
Former-commit-id: 0a690ada6f9f791e7d013eb89799975b12212ed0
2024-08-09 20:13:46 +08:00
hiyouga
5eacd17090 add adam_mini to readme
Former-commit-id: e2a28f51c635d64ff9de65a37087d89356bdedcc
2024-08-09 20:02:03 +08:00
hoshi-hiyouga
792da85866 Merge pull request #5095 from relic-yuexi/feat-optimizer
Feat optimizer

Former-commit-id: ef482394f0e2820ee8a245f8a6b050a32591b40a
2024-08-09 19:51:33 +08:00
hiyouga
25b9cfa163 update scripts
Former-commit-id: 86f7099fa3fadd9c5a2059361ab5a5e1dbf5b1a2
2024-08-09 19:16:23 +08:00
hiyouga
b5146facff follow #5115
Former-commit-id: c87023d539875cd8e622d40212a5627c9c182fb8
2024-08-09 18:03:00 +08:00
hoshi-hiyouga
397e4daa5d Merge pull request #5115 from YeQiuO/main
fix: `Train on the last turn only` truncate bug
Former-commit-id: 51542cb15fea785d445ecf80bbad0364ebc0cb77
2024-08-09 17:58:27 +08:00
hoshi-hiyouga
fae264f4b9 Merge pull request #5072 from relic-yuexi/main
fix the deepseekcoder template to avoid repeat problem

Former-commit-id: 984961c5507e009b1b41e338e796724578df7021
2024-08-09 16:35:21 +08:00
hoshi-hiyouga
54f57fb354 Update template.py
Former-commit-id: 4f62e1cb243d996d1764c3c86ca234847ad2c022
2024-08-09 16:27:42 +08:00
“Wzw”
13e5fff97a mask_history args verify valid
Former-commit-id: 2fa1e0b2add60142c178e5e21ebaad7132fa5b00
2024-08-08 10:12:01 +08:00
“Wzw”
0bd25c3a6b fix mask_history tiny bug
Former-commit-id: b5ca86cc07d38cf342e351aab16cce4319245792
2024-08-08 10:09:33 +08:00
codingma
0404b17718 Merge pull request #5109 from codemayq/fix-example
fix eval_dataset in example

Former-commit-id: 18e455c232d8c342d68195b118cceacd004ec368
2024-08-07 18:30:05 +08:00
codingma
421d4f91c2 update wechat.jpg
Former-commit-id: 9a48f7e9572a34d671ea22c8a68fce15bc9fb05d
2024-08-07 18:29:48 +08:00
codingma
17c73b44da fix eval_dataset in example
Former-commit-id: 823e7c122bd1ee223b8323aa264a5ead576b2fe3
2024-08-07 18:24:19 +08:00
moontidef
44f7c4dd56 feat: add support for adammini
Former-commit-id: 82bc15dc795f95768b81c25eaaabdc613da30cd8
2024-08-07 10:08:22 +08:00
moontidef
b0d32b2041 fix: rename optimzer to optimizer
Former-commit-id: 40908a36fae3393715f75156867c11e6373fabad
2024-08-07 10:05:01 +08:00
moontidef
1cc927b536 Merge branch 'hiyouga:main' into main
Former-commit-id: 55f32dfbf9fd94d6a4315d87caf034fbdfc3c914
2024-08-06 00:18:45 +08:00
moontidef
733cb9087b fix: fix the deepseekcoder template to avoid repeat problem
Former-commit-id: b82ecbedd0fecd85195217916cba3c21998bd10b
2024-08-05 23:55:45 +08:00
hiyouga
20013e130b fix #5048
Former-commit-id: b7ca6c8dc14f689d0df16684a6121cc0ec24f8ba
2024-08-05 23:48:19 +08:00
hoshi-hiyouga
e703d80536 Merge pull request #5037 from codemayq/feature-gemma-2-2b
support gemma-2-2b

Former-commit-id: c2921b996069f81278ff72049e36f3a422393ca2
2024-08-05 23:27:37 +08:00
codingma
7125b6cf70 support gemma-2-2b
Former-commit-id: dc09d454f285b8584d9017349a9cee3b44eadb72
2024-08-01 13:45:48 +08:00
codingma
a18c2b124e update wechat.jpg
Former-commit-id: 1c05b847b266ea4a5631407a2424f03726b5fc35
2024-08-01 09:51:47 +08:00
codingma
e5aea29108 update wechat_npu.jpg
Former-commit-id: 3885949a9d379916c5cf67c218797e9825e4be78
2024-07-30 13:45:47 +08:00
hoshi-hiyouga
e678c1ccb2 Merge pull request #5010 from Eruly/main
Add Korean web UI (llamafactory-cli webui)

Former-commit-id: cd420c193858547c72c8e08e7c051aa432d4e93d
2024-07-30 01:55:54 +08:00
hoshi-hiyouga
8fd71e1aa3 Merge pull request #4996 from LDLINGLINGLING/main
Added MiniCPM to the supported models list on the homepage; MiniCPM's official GitHub also added a link to LLama_factory

Former-commit-id: 06e17eb4623fa43954b83c6d33b454c2fbd08e46
2024-07-30 01:55:30 +08:00
hoshi-hiyouga
b33d668e17 Update README_zh.md
Former-commit-id: 3a49c76b65e458c0dc71fbdc810f7e50fe6293c9
2024-07-30 01:55:13 +08:00
hoshi-hiyouga
2f72383969 Update README.md
Former-commit-id: 9e409eadb0d43b90f2df6b458182b591831cf3e9
2024-07-30 01:53:19 +08:00
hoshi-hiyouga
f510c2d279 Update README.md
Former-commit-id: 8d5a41f2cdc15707ec6e0373b86463e962c31b7a
2024-07-30 01:52:35 +08:00
hoshi-hiyouga
f49ebf1af9 Merge pull request #4995 from codemayq/fix-pissa
fix pissa callback

Former-commit-id: daa62db06fb27e5fd736983265e23974c203c9e9
2024-07-30 01:47:25 +08:00
eruly
ccc0825008 Add Korean web UI (llamafactory-cli webui)
Former-commit-id: 371009e5220c2db7f07087b7cfc679829ba53f7d
2024-07-29 13:47:13 +00:00
liudan
3c3a5c09dc Added MiniCPM to the supported models list on the homepage; MiniCPM's official GitHub also added a link to LLama_factory
Former-commit-id: b9ed9d45cc2bb82ab042c282ddb3e5e97b554541
2024-07-29 10:58:28 +08:00
codingma
8132725f2e fix pissa save
Former-commit-id: 2c1ca9f7425b84e158fef527fd6e13297c8253c6
2024-07-29 10:44:34 +08:00
hiyouga
884b0bbb4f tiny fix
Former-commit-id: 668654b5adae3f897d5291b81410226e1304eff9
2024-07-26 11:51:00 +08:00
hoshi-hiyouga
0ac7824822 Merge pull request #4892 from piamo/main
update deepseek template

Former-commit-id: 8a2846cfe1266bedde84da175805dac78439c20e
2024-07-26 11:49:34 +08:00
hoshi-hiyouga
29d9a9827a Merge pull request #4950 from liuwwang/main
fix: Repair the issue where quantization failed after merging the adapter.
Former-commit-id: 9839c6d1f6fd69287afeb3b6528571cf2bedf91a
2024-07-26 11:48:56 +08:00
hoshi-hiyouga
e2720c11b1 Merge pull request #4970 from HardAndHeavy/add-rocm
Add ROCm support

Former-commit-id: b8896b9b8bf025fd150e8bdeecf3b4355dc958aa
2024-07-26 11:41:23 +08:00
hoshi-hiyouga
ab477e1650 Merge pull request #4961 from khazic/main
Added the reference address for TRL PPO details.

Former-commit-id: 3c424cf69a10846b92a5f969e333e401b691dcb3
2024-07-26 11:32:29 +08:00
hoshi-hiyouga
ca3dac9fb3 Update README_zh.md
Former-commit-id: 77e7bfee7967319da6b5cc72e88d9f6cafe065b2
2024-07-26 11:30:57 +08:00
hoshi-hiyouga
d4e84b9a11 Update README.md
Former-commit-id: 1186ad53d43dace9dec335331dbe246f1c5a729b
2024-07-26 11:29:28 +08:00
hoshi-hiyouga
f38decfbaf Update README.md
Former-commit-id: f97beca23a1c79df38769b8dd40c9b19d4e5ef5c
2024-07-26 11:29:09 +08:00
codemayq
4aaeba0f78 update wechat.jpg
Former-commit-id: 024c49d4e01dc79cfff191f1ecea5a49d3d61710
2024-07-26 10:01:10 +08:00
HardAndHeavy
27f42f6319 Add ROCm support
Former-commit-id: c8e18a669adc775f17555cbf06a5ceef6c0d6235
2024-07-25 21:29:28 +03:00
khazic
ed5c75bd64 Added the reference address for TRL PPO details.
Former-commit-id: ceba96f9ed121bb75b8e802d9b758871a94046f1
2024-07-25 09:03:21 +08:00
hiyouga
bc36e36658 fix #4959
Former-commit-id: 77cff78863918656662b41d259b68669b7cc2237
2024-07-24 23:44:00 +08:00
hiyouga
d984942c82 update webui
Former-commit-id: 30f8149d11d4e19c1fe87d06f53504b7eaf4c1cf
2024-07-24 21:11:51 +08:00
hoshi-hiyouga
422771589f Update README_zh.md
Former-commit-id: 71d3e60713e1e99dd82d50aba69458fafed73089
2024-07-24 21:08:42 +08:00
hoshi-hiyouga
4e429f2e05 Update README.md
Former-commit-id: 5626bdc56d5cfb71a6c7c9629e69810dcba22594
2024-07-24 21:07:14 +08:00
hiyouga
0aad78694a tiny fix
Former-commit-id: ace1d44857368dbbc868e4bba2d7ff17329921f5
2024-07-24 18:33:39 +08:00
hiyouga
d63beb7a24 fix #4928
Former-commit-id: 091010492b2afc8ad25359323aba760523a4c5af
2024-07-24 17:00:29 +08:00
hiyouga
019c6dad84 fix #4925
Former-commit-id: 935b22d93ed461e829cd1e6201de1fea697f4d1e
2024-07-24 16:56:58 +08:00
hiyouga
fae881b854 fix #4944
Former-commit-id: 1bbd49faaef438f49cb5340166cb13faee8fb854
2024-07-24 16:42:51 +08:00
hiyouga
91e54d458f add mistral nemo model
Former-commit-id: 1550fe7331370ad39e8ed69c1b060ead902a77e4
2024-07-24 16:25:53 +08:00
hiyouga
e0875f82b3 add llama3.1
Former-commit-id: 26533c0604ef765170f93986bc06f3066c5e28ee
2024-07-24 16:20:11 +08:00
Liuww
ed05486b98 fix: Repair the issue where quantization failed after merging the adapter.
Former-commit-id: f91a9a250afa0c2e1554b2c81c1860d7968dc816
2024-07-24 14:31:29 +08:00
hiyouga
d4440c07b6 Update wechat_npu.jpg
Former-commit-id: bb0a37dc067e4385290644f165e3634dcbd88894
2024-07-22 21:17:22 +08:00
hiyouga
1cddf80a97 tiny fix
Former-commit-id: 5665062ca0bfb166cd8f2e896e2b0970037373f6
2024-07-22 21:10:15 +08:00
hoshi-hiyouga
37c6a0c6dc fix #4917
Former-commit-id: 26082fc6c90e6a399ae5b44f2c3df8019afc7766
2024-07-22 11:28:31 +08:00
hiyouga
dbe26e7cdf tiny fix
Former-commit-id: c333e2f49d23243489fa16622f43f768bcfd45e2
2024-07-22 00:06:03 +08:00
hiyouga
8d06679a3f fix flashattn + packing
Former-commit-id: 4135e69406c09f3892276e4fee7e5757a52a42d6
2024-07-21 17:07:45 +08:00
hiyouga
81fd5097cc update wechat
Former-commit-id: ad71296a7cc04d0c72721de18c52d8d0c2d9ad83
2024-07-20 22:00:44 +08:00
huangpan.foo
ee4c3f32d1 update deepseek template
Former-commit-id: 44e48e2b82929888b0880c00519102da4eb38ca8
2024-07-19 15:02:54 +08:00
hiyouga
726e7046db set dev version
Former-commit-id: 88c7fc159999511e6e19fff3d37147a6a7064335
2024-07-19 02:01:46 +08:00
hiyouga
542658c986 update parser
Former-commit-id: 8f6995081cbdbb2424da586a443e5220a8990faa
2024-07-19 01:36:39 +08:00
hiyouga
f5cfea56bd release v0.8.3
Former-commit-id: bbd5a644230d633f507c72929e8819c07ae38bba
2024-07-19 01:21:18 +08:00
hiyouga
488f392c81 fix test
Former-commit-id: cdb0f34f10f5407e83766a8c5c9c5aff6af2a84e
2024-07-19 01:17:37 +08:00
hiyouga
d0891f05fa fix unittest
Former-commit-id: e80006795fe6344ea98b61f9a8db16356498c7cb
2024-07-19 01:10:30 +08:00
hiyouga
e1e01d7efd add unittest
Former-commit-id: 608de799a21f37319bf31c04c0aa50c4542ec757
2024-07-19 01:06:27 +08:00
hiyouga
34f16cc635 follow #4878 fix #4684
Former-commit-id: 779aae83d253de0a86201ff87543b5d695e28d23
2024-07-18 22:06:12 +08:00
hoshi-hiyouga
c8e77c11d1 Merge pull request #4878 from ly863/main
Train the last turing conversation.

Former-commit-id: 2516763d698db480f86e397f437fe9c89a7423e0
2024-07-18 22:03:41 +08:00
Shiyu Zhang
249adacc4d Only train on the last turn of the conversation
Former-commit-id: 1e7b396ff2489055574fd3365425d26360d73897
2024-07-18 15:30:25 +08:00
hiyouga
3c7b10b1fa fix metrics #4786
Former-commit-id: beec77a0898a39d94f41c23920415f5b4873a23a
2024-07-17 00:47:00 +08:00
hiyouga
e90fae61f4 support batch_eval_metrics, fix #4826
Former-commit-id: d774b94f124923829b2eae428e25199d503ebfcb
2024-07-17 00:33:00 +08:00
hiyouga
ca40e42b3c tiny fix
Former-commit-id: bda302fbfbdb114dee7782d405732600d2d73279
2024-07-15 23:09:50 +08:00
hoshi-hiyouga
9d8e0f0837 Merge pull request #4822 from codemayq/test-ci
add github action check to ignore some test cases

Former-commit-id: f2aaebdbde45018e0d08db853ddb975bc2547e14
2024-07-15 23:07:55 +08:00
hoshi-hiyouga
d63c3be556 Update test_template.py
Former-commit-id: 10289eab158c2de13bdde2d5a7c644195e516201
2024-07-15 23:04:39 +08:00
hoshi-hiyouga
c790997fbc Update test_template.py
Former-commit-id: da990f76b881382b9ae6fcfef78dc7df8034947b
2024-07-15 23:00:27 +08:00
hoshi-hiyouga
b19e2f84b6 Merge pull request #4821 from codemayq/feature-eval-split
add "split" as suffix in eval task name

Former-commit-id: 38bc411d42108e7859ec6a031f63c5188246f18c
2024-07-15 22:59:44 +08:00
hoshi-hiyouga
df1f0a1258 Update llama3_lora_eval.yaml
Former-commit-id: 91ba083f37146da0ee321a75701ae79c527ecb51
2024-07-15 22:55:12 +08:00
hoshi-hiyouga
c0493daa60 Update test_template.py
Former-commit-id: 33420bab81cfc7db1dfd3cd4200b8681bf7de9e6
2024-07-15 22:55:05 +08:00
hoshi-hiyouga
f67ddc05d6 Update test_template.py
Former-commit-id: 52a4256ad9aa866f6e1bcbe2c663c0f8ecfdb604
2024-07-15 22:52:25 +08:00
hiyouga
84e6715423 fix #4820
Former-commit-id: fd8cc490084aba9b5155eaaaf26129efd2871fa3
2024-07-15 22:32:07 +08:00
hiyouga
ae0f4ba2d3 update wechat
Former-commit-id: b0aa321a4ad6f146b5d30de20c07feef59f9d0d0
2024-07-15 22:02:52 +08:00
codingma
7146b9457c add IN_GITHUB_ACTIONS
Former-commit-id: 32c3afdfa18dc9560009b324fdf4f2987fc739a9
2024-07-15 10:28:07 +08:00
codingma
76046dfda8 1. change the task name format
2. delete split param in data_args.py


Former-commit-id: 645211dc01b5d4db3ccd0e3dce03a53860eded26
2024-07-15 09:55:33 +08:00
hiyouga
22859b8734 allow computing rouge in training
Former-commit-id: 99ab7a8c1c966232faa11b6a42b9740d9a20ace3
2024-07-15 01:16:26 +08:00
hiyouga
14bc7b0551 fix up
Former-commit-id: 29ebcd75d55f70f2891632eba187b643cc3a9e51
2024-07-15 01:04:56 +08:00
hoshi-hiyouga
2b22a7da48 Merge pull request #4691 from codemayq/feature-suppot-eval-dataset
add eval dataset support

Former-commit-id: 15b399a82f45b08fc07d2957884fb7821eba9fd9
2024-07-15 01:00:34 +08:00
hoshi-hiyouga
788dc1c679 Update data_args.py
Former-commit-id: cba673f491c5d97aba62aea03f310bd54fb3fe28
2024-07-15 00:56:03 +08:00
hoshi-hiyouga
30a3c6e886 Update preprocess.py
Former-commit-id: df52fb05b1b08887288bbaab7c612b7ac27c2290
2024-07-15 00:55:36 +08:00
hoshi-hiyouga
eed7cbb453 Update parser.py
Former-commit-id: 84e4047f8a1f78256be65f3f7bddce358ed9e882
2024-07-15 00:55:21 +08:00
hoshi-hiyouga
5633c0ab1e Update data_utils.py
Former-commit-id: 97a0e291c79f145950b54a11d03d81ada4784d22
2024-07-15 00:54:34 +08:00
hoshi-hiyouga
2e9c9471da Update loader.py
Former-commit-id: a5b809516e7de1d6d5f4583089fee3028d0db01d
2024-07-15 00:50:06 +08:00
hiyouga
a9a652eb6f update test template
Former-commit-id: a4ae3ab4ab8e3c6ad9feba4c185e3b592eda3f09
2024-07-15 00:49:34 +08:00
hoshi-hiyouga
140b512426 Update parser.py
Former-commit-id: 3d39d74003c4ca36f9c9b77f622d366383b0af7e
2024-07-14 23:04:34 +08:00
hoshi-hiyouga
ddbd848e49 Update README.md
Former-commit-id: 9d64507bd5d47f096e81c90bfb347690afaaec2b
2024-07-14 21:27:04 +08:00
hiyouga
2564269826 add gemma test
Former-commit-id: f1d8d29bc3ba1b41a72a24834a16a3d125d56461
2024-07-14 18:01:45 +08:00
hiyouga
380c7741f5 fix test
Former-commit-id: 173921419dc24506527cf506bfbfc01541fd2b66
2024-07-14 15:44:30 +08:00
hiyouga
dfd2d912cd fix #4699
slow tokenizer for yi models


Former-commit-id: 88a20ba7972c533d650967a118d612471fe2b2e8
2024-07-14 15:34:22 +08:00
hiyouga
12e0e5d0d7 tiny fix
Former-commit-id: d3c01552e0f978f150902175f096f6e3bfb64363
2024-07-14 10:56:45 +08:00
hiyouga
0b26011181 fix gemma2 attention
Former-commit-id: 2f6af73da28c4f8321b625fd09ddec8bd4977b08
2024-07-13 23:33:45 +08:00
hiyouga
fb387ae1c3 update workflows
Former-commit-id: 7b19e99ed73faea99c35d49a611187ba1a5c8c73
2024-07-13 22:31:15 +08:00
hoshi-hiyouga
e83e36e475 Merge pull request #4781 from hzhaoy/fix-dockerfile-cuda
Fix cuda Dockerfile

Former-commit-id: 5da54deb50d50385197a5aa9743e6695c066088a
2024-07-13 22:25:32 +08:00
hiyouga
3baf1d1f7e fix #4792
Former-commit-id: 6b48308ef9be34d072f3e6bb2444e186a38c2779
2024-07-13 22:07:58 +08:00
hoshi-hiyouga
f616fd61bd Merge pull request #4804 from codemayq/fix-examples
tiny fix of examples

Former-commit-id: 32699a82a6dc3c0104c0884b763a623e521b9a5f
2024-07-13 20:49:13 +08:00
hoshi-hiyouga
f9a4d96194 Update llava1_5.yaml
Former-commit-id: f618b80fa2173003b7e960dd20cecbede3db41c9
2024-07-13 20:30:06 +08:00
codingma
1ccc6153c7 1. fix output_dir in llama3_lora_pretrain.yaml
2. add llava1_5.yaml for inference


Former-commit-id: 982a1cdd24dfa51535af3e49c7ea80fddc95b0ee
2024-07-13 13:16:22 +08:00
hzhaoy
955e01c038 tiny fix
Former-commit-id: 8bab99c5829a80752e461cf65a9124fdea609676
2024-07-12 00:28:44 +08:00
hzhaoy
93ba3bd5b0 fix #4780
Former-commit-id: 642c6d666f3bd00fcdea45c65a6394bcae9c2080
2024-07-12 00:25:48 +08:00
hzhaoy
b3e4793ded fix #4779
Former-commit-id: a8bf1abf0fd39f84748c94ac3ba39eaa53137529
2024-07-12 00:15:15 +08:00
codemayq
0fa59c9b4c update wechat_npu.jpg
Former-commit-id: 67040f149c0b3fbae443ba656ed0dcab0ebaf730
2024-07-11 20:03:39 +08:00
hoshi-hiyouga
f85187b4dd Merge pull request #4700 from marko1616/patch-1
Fix Windows command preview

Former-commit-id: 555194e15026c444b2bd1c09f521950cbff86c21
2024-07-10 13:51:50 +08:00
hoshi-hiyouga
2528487847 Merge pull request #4746 from yzoaim/fix
fix src/llamafactory/train/callbacks.py

Former-commit-id: 40c3b88b68b205e4124a9704d73500e3c404364d
2024-07-10 13:32:49 +08:00
hoshi-hiyouga
4edd7c3529 Update callbacks.py
Former-commit-id: 39cd89ce17220dc50c8331299ae5af230fe40cc9
2024-07-10 13:32:20 +08:00
-.-
973aac3203 fix src/llamafactory/train/callbacks.py
Former-commit-id: cff89a2e8907f3fe89406006105cb6494e2ee993
2024-07-10 12:05:51 +08:00
hiyouga
a9ce54d143 fix #4731
Former-commit-id: 51942acee84cdb20002f8fdccf6be8c7fe9bd0d3
2024-07-10 11:32:36 +08:00
hiyouga
d7130ec635 fix ppo trainer
Former-commit-id: fb0c40011689b3ae84cc3b258bf3c66af3e1e430
2024-07-10 11:05:45 +08:00
hiyouga
aa15ca1719 fix #4742
Former-commit-id: 2f09520c0d5039a5a8be310ab668272cb4dc1bd3
2024-07-09 23:24:24 +08:00
hiyouga
7e9d51fb95 Update wechat.jpg
Former-commit-id: 86b1594823f3e7d61c61981d53f353a9724ea9c4
2024-07-09 09:25:11 +08:00
hoshi-hiyouga
553e517f0f Merge pull request #4706 from T-Atlas/main
chore: Update vllm_engine.py to support vllm version >= 0.5.1
Former-commit-id: 563a27dab7e66d9454c6a09404c354d9fca06908
2024-07-07 15:50:38 +08:00
hoshi-hiyouga
7483e187c6 Update packages.py
Former-commit-id: f84b007ebbb9fa63f797b4bd1c487372877bbc65
2024-07-07 15:48:29 +08:00
Lian Junhong
7ca84e0a09 chore: Update vllm_engine.py to support vllm version >= 0.5.1
Former-commit-id: 322663bf90ce7b99ca5b0b43ff9dbd95eb36ff6b
2024-07-07 15:08:12 +08:00
hiyouga
f3c105f088 fix #4705
Former-commit-id: a15782cb9f3ee64ba1f5fc2a3da20ac6c6ef0aa0
2024-07-07 13:10:06 +08:00
marko1616
c8205c5163 Update utils.py
In Windows, a multiline command should look like:
command --arg1 xxx `
--arg2 xxx `

Former-commit-id: e0562521bbd7cf6b3b90f8c87e52690931f736bd
2024-07-06 20:40:13 +08:00
hiyouga
7fcffb860d add codegeex4, internlm2.5
Former-commit-id: 53b1002fb74123095e7466c75b941a31a7cfba4d
2024-07-06 16:16:47 +08:00
hiyouga
d97bb11821 update pissa example
Former-commit-id: c9bb0757ecfa90ba456e2ef7b38e64dbb809265d
2024-07-06 15:47:32 +08:00
codingma
74f0d02eb8 1. add custom eval dataset support
2. merge load dataset and split dataset function


Former-commit-id: 76f3bbcfc0e11aa41f8f5cbebc60b77b987f7901
2024-07-05 15:52:10 +08:00
hiyouga
8379a39776 fix processors
Former-commit-id: 9f33f1edf544807e498f60881f30b00149fe570f
2024-07-05 08:33:22 +08:00
hiyouga
9aa3403687 fix #4683
Former-commit-id: e43809bced009323b3bac9accdd3baa3a2836fdb
2024-07-05 00:58:05 +08:00
hiyouga
956e555310 fix #4674
Former-commit-id: ed232311e857865da2f493d3ead9a9ffa44953e9
2024-07-05 00:41:03 +08:00
hiyouga
c1262dbf94 Merge branch 'main' of https://github.com/hiyouga/LLaMA-Factory
Former-commit-id: 226a9e563f15ad125856db371871e6f4a3d3eef0
2024-07-04 14:23:37 +08:00
hiyouga
e17f12fcad fix #4677
Former-commit-id: 1e27e8c776acadf312804a6d9a243955427e9978
2024-07-04 14:22:07 +08:00
hoshi-hiyouga
d08456c0ce Merge pull request #4673 from hzhaoy/main
tiny fix

Former-commit-id: 07d96d497ca807cad1a6941ec27b019fc6769e06
2024-07-04 10:40:41 +08:00
hzhaoy
6d892dbc23 tiny fix
Former-commit-id: 738df477485de3633049651a9f1d498adf95a3d5
2024-07-04 10:20:28 +08:00
hiyouga
aa14a625e4 update tests
Former-commit-id: 636bb9c1e65e72c3a27049dacb3200234d1c2782
2024-07-04 04:00:12 +08:00
hiyouga
d7657d772d tiny fix
Former-commit-id: 0c699de39de06eac96af67e8dd4fc4c53335b17e
2024-07-04 03:47:05 +08:00
hiyouga
cbb93a2b47 tiny fix
Former-commit-id: 44747cebd28d0b800196f032e18d2f4ff51ee5b3
2024-07-04 03:02:23 +08:00
hiyouga
4987aa32ba fix data map for packing
Former-commit-id: b5d101e1bf435731e6b8e5aed8727ddfb021e4f0
2024-07-04 03:01:31 +08:00
hiyouga
c15210a312 update wechat
Former-commit-id: b03e4a74bab17d7fdce36c48123126f502c3f98b
2024-07-04 01:55:05 +08:00
hiyouga
7b3c1f29ff fix packing for eager/sdpa attn
Former-commit-id: 6fd6aa4530f81a2ed306eeb2a5167607288b62c6
2024-07-04 01:52:43 +08:00
hoshi-hiyouga
a38ff842d0 Merge pull request #4224 from chuan298/main
Implement efficient packing without cross-contamination attention

Former-commit-id: 87d9b2d00513c163335d3f2e2bb3cb3299cecdaa
2024-07-04 01:18:54 +08:00
hiyouga
bfdaadcc40 update packing
Former-commit-id: cce7083024bed4c7429ddc8288d1c9190fde29f5
2024-07-04 01:10:55 +08:00
hoshi-hiyouga
51c75985b8 Update packing.py
Former-commit-id: a36e8f2dd50e0f1c589457a7e785fdbc905d561d
2024-07-03 23:36:01 +08:00
hiyouga
13cec0cc2f update func name
Former-commit-id: c346f79f99db5296000e4d22a65e53c26e85b344
2024-07-03 23:29:33 +08:00
hiyouga
e671ed520b update arg name
Former-commit-id: 8a6a7b9c8a876da9c16e5ada7df461eb8cabee21
2024-07-03 23:23:24 +08:00
hiyouga
ff6fc666c1 update hparams
Former-commit-id: 575a02a23d9b41d00ca6291d8a40b5bdb3cbeeec
2024-07-03 23:18:58 +08:00
hiyouga
b254df2d34 update ui
Former-commit-id: 7f770f6895f1e2e0b8e4f0b49088bfae096f6d3c
2024-07-03 23:13:49 +08:00
hiyouga
28c8e083f4 test
Former-commit-id: a4a1ddbcb987422cd04125ff3f36f8c739061b5c
2024-07-03 23:05:39 +08:00
hiyouga
e5c89890b1 update scripts
Former-commit-id: 1e0c860c8c5ae8958d7105acafdac5d253a585f9
2024-07-03 20:07:44 +08:00
hiyouga
3595d98b4c fix #4609
unwrap_model_for_generation(reward_model) is necessary for zero3 training


Former-commit-id: 8845e94f917b503bbee0604d7290efea7260a30c
2024-07-03 19:45:51 +08:00
hiyouga
0d438e5cf4 update readme
Former-commit-id: 87346c094631b054ca975694416df324d2031c9a
2024-07-03 19:39:05 +08:00
hoshi-hiyouga
34bec52cc4 Merge pull request #4662 from wzh1994/wzh/readme
Add `LazyLLM` to `Projects using LLaMA Factory` in `README.md`

Former-commit-id: 3449c3531f09f0ad45afe765bd4bb8f5d338fe75
2024-07-03 15:51:02 +08:00
wangzhihong
84f8113bb1 Update README_zh.md
Former-commit-id: 6f8f53f879faf991c494ee9655a47f905fd11867
2024-07-03 14:59:09 +08:00
wangzhihong
3881f4eb58 add LazyLLM to Projects using LLaMA Factory in README.md
Former-commit-id: 22da47ba27dc9c15887d21d47c456fb26fc81f5b
2024-07-03 11:12:20 +08:00
hiyouga
104151d558 tiny fix
Former-commit-id: 8b1172b91085125a83a4150943873141c8bbd8bc
2024-07-03 02:31:50 +08:00
hiyouga
c9e9beee4e tiny fix
Former-commit-id: 71cdf8956e1640a1f3e5f6a4b86d28db70e72041
2024-07-02 23:06:13 +08:00
hiyouga
ea2d3f6c18 remove rlhf support for chatglm2&3
Former-commit-id: 821bb6660e57c29ebf6ac482e78dd2efb8d72437
2024-07-02 23:03:17 +08:00
hiyouga
4828bed837 upcast logits
Former-commit-id: c13ae2df19ed4cdc849bef55d04225e1a98c19b5
2024-07-02 22:32:05 +08:00
hiyouga
cc31014002 improve rlhf
Former-commit-id: c47ab6c07287fb260ea49b8b7af46bdd416f88f7
2024-07-02 22:23:08 +08:00
ancv
7f42932957 move efficient_packing from data_args to model_args
Former-commit-id: e8e13b09423dd08a31a3bde8f85833c6e5d43ee5
2024-07-02 18:37:55 +07:00
hiyouga
1e0a2f722b Update bug-report.yml
Former-commit-id: 9dcff3a5b5ae627832c0eac256ab234179ad3b98
2024-07-02 19:18:56 +08:00
hiyouga
591a4cf42a Update bug-report.yml
Former-commit-id: c81687963a16450b9176825198a06e59e1a0dce6
2024-07-02 19:16:12 +08:00
hoshi-hiyouga
773d857eb8 Merge pull request #4651 from hzhaoy/add-telechat-1b
Add TeleChat-1B

Former-commit-id: 4e4b3cc9057659bebe2bd8fb1629dde484996744
2024-07-02 17:56:43 +08:00
hzhaoy
28e787116b add TeleChat-1B
Former-commit-id: 57b7c00430bcfc83afd11547ceead041e8edfd8d
2024-07-02 17:49:04 +08:00
hiyouga
08296f4092 fix ppo callbacks
Former-commit-id: 4c296001c4b77b814e4bd6cb4049a279718cb775
2024-07-02 17:34:56 +08:00
hoshi-hiyouga
2452f57cd7 Merge branch 'main' into main
Former-commit-id: e8e6af26514272e29a50649b38182beb4db4ebfa
2024-07-01 21:01:09 +08:00
hiyouga
48a299f8ae Update wechat_npu.jpg
Former-commit-id: 33f2ddb8b68d11370a1e113330a4e10c39fba744
2024-07-01 16:28:54 +08:00
hiyouga
2cf03017a0 tiny fix
Former-commit-id: 73280b7dc7f8b3210bb08dfc3cf34760190f585a
2024-07-01 05:43:17 +08:00
hiyouga
de4de5b5ab tiny fix
Former-commit-id: 8c41a0aa6db8bf31200c83b14819d474927268a1
2024-07-01 03:55:20 +08:00
hiyouga
54e786346e add eval acc
Former-commit-id: 1856a08e87b150fa4bffcb0af703ed84d848e24b
2024-07-01 03:51:20 +08:00
hiyouga
a475d808f2 Update label_issue.yml
Former-commit-id: fc2c15d71321838aad05ff27323c0199840b3956
2024-07-01 01:29:09 +08:00
hiyouga
ca7b65439d fix #4402 #4617
Deprecate reserved_label_len arg


Former-commit-id: 1771251ce3f6887b301dac10f3de7a253c5e5884
2024-07-01 01:19:27 +08:00
hiyouga
768093c789 update readme
Former-commit-id: d4e2af1fa422caeb1a2daff7cb9af17073cab13c
2024-07-01 00:22:52 +08:00
hiyouga
bbc37b2880 fix #4398 #4592
Former-commit-id: d74244d56858d837044e5c9cea57a1b3c2ca0214
2024-06-30 21:28:51 +08:00
hiyouga
4b1ab6c83d update npu docker
Former-commit-id: 93e6fbb37ddb60ed356d0acc720463a8cd1a8baa
2024-06-30 21:05:31 +08:00
hiyouga
2b006beab1 loose gemma2 attention
Former-commit-id: 2f4b89ace15b7a4d2adf16eeba9feb7de9e25d43
2024-06-29 01:42:14 +08:00
hiyouga
c3792dae9f update readme
Former-commit-id: 0e0d69b77c36a6110f43b0c760e9b86e2f5ee267
2024-06-28 06:55:19 +08:00
hiyouga
87e60f8bac bf16 by default, gemma2 attns
Gemma2 finetuning cannot work until merging https://github.com/huggingface/transformers/pull/31674


Former-commit-id: 4d35e218b1d60ff24b368ff5bc608be9c85411de
2024-06-28 06:00:26 +08:00
hiyouga
b0acd27114 increase pissa_iter for stability
Former-commit-id: 64f4337daca4c914d86a7181dd582508688383cd
2024-06-28 03:18:54 +08:00
hiyouga
cba994b9dc fix docker flashattn
Former-commit-id: e3141f5f1b435d12c71d8b1fc6ade6e69deead71
2024-06-28 01:28:59 +08:00
hiyouga
d3b7c489f2 add Gemma2 models
Former-commit-id: 6f63050e1b61742d5f7e48bdc62c46748031d7cb
2024-06-28 01:26:50 +08:00
hiyouga
2105cf6000 update examples
Former-commit-id: 2f78b5d62a34ea4d157bbe91a253859d25c8a7fe
2024-06-28 01:17:07 +08:00
hiyouga
835f0578c2 refactor pissa, improve llamaboard
Former-commit-id: 8baf3b22b0fb9624807d809832f097301982d192
2024-06-28 01:04:24 +08:00
hoshi-hiyouga
1dad756cff Merge pull request #4580 from hzhaoy/bugfix-deepspeed-pissa
Fix bug when using pissa method with deepspeed

Former-commit-id: ef38daa0a49567bc4577136ac99e307cf535cf79
2024-06-28 00:46:51 +08:00
hiyouga
a294ef2fae fix #4549
Former-commit-id: 8ed6b367e26490acab5d2d7b32f0d5dad449d26a
2024-06-28 00:41:58 +08:00
hiyouga
5e60c29971 fix docker file
Former-commit-id: 0f421055da37d7f9b3313954fd008c433b3100f5
2024-06-27 20:29:16 +08:00
hiyouga
7c488cea57 tiny fix
Former-commit-id: e44a4f07f09bbee55c10ccee91dd858256c36054
2024-06-27 20:14:48 +08:00
hoshi-hiyouga
ecbbed94d9 Merge pull request #4590 from injet-zhou/main
Exit the process with the subprocess's return code when utilizing the CLI

Former-commit-id: 33ef6f4ec2984a686b6158c4f84af354349221aa
2024-06-27 20:09:36 +08:00
hoshi-hiyouga
37d3adb1f8 Merge pull request #4461 from hzhaoy/feature/support-flash-attn
support flash-attn in Dockerfile

Former-commit-id: 64b131dcfa381045cba6b77ab9e0dbf6a3934e03
2024-06-27 20:05:26 +08:00
hoshi-hiyouga
7ae015f8bf Merge pull request #4561 from hashstone/fix-docker-npu
fix torch-npu dependency

Former-commit-id: a6bf74c0c9ea07db23ee693ba884eed097a6bd0f
2024-06-27 19:58:16 +08:00
hoshi-hiyouga
ffe7cda93d Update Dockerfile
Former-commit-id: 5fcd33fd4f08f5e46c1df541f188d13faefa7c8e
2024-06-27 19:57:40 +08:00
hoshi-hiyouga
030dda5861 Update Dockerfile
Former-commit-id: 9624af1df3e4b260b37ad10e2a9be8e69ab066df
2024-06-27 19:51:25 +08:00
hoshi-hiyouga
506f6d79e6 Update setup.py
Former-commit-id: 569f03ca1abf844f6893745c2cbabf056f21d1cb
2024-06-27 19:38:15 +08:00
hoshi-hiyouga
30687babe3 Update README_zh.md
Former-commit-id: fc7a08e16a00db0e2f8bb2b1966b3d363434ddc6
2024-06-27 19:17:52 +08:00
hoshi-hiyouga
65bf205968 Update README.md
Former-commit-id: 06536c44d472fa2b380ebdb3f51865edb028ac5d
2024-06-27 19:17:35 +08:00
hoshi-hiyouga
e384ef7d57 Update setup.py
Former-commit-id: 03389207f473d43360bd938972da02c92e80322f
2024-06-27 19:16:46 +08:00
hiyouga
6a518d386a Update wechat.jpg
Former-commit-id: 719e2f1fef030b65267855d952dacb1f50a65050
2024-06-27 18:46:47 +08:00
faddddeout
9a2e77a5d8 Exit the process with the subprocess's return code when utilizing the CLI
Former-commit-id: f6b62f00707baf582bc168c15886732ac5218524
2024-06-27 09:58:00 +00:00
fanjunliang
9e93ff9479 support docker-npu-[amd64|arm64] build
Former-commit-id: bdda0827b35cbb6005f10aa661fe6f3940b598d8
2024-06-27 15:25:12 +08:00
hzhaoy
e1751f6398 fix #4579
Former-commit-id: 677c86594e4ea904fde0a557852daf54636b06ae
2024-06-27 13:49:57 +08:00
hiyouga
58607ec1b0 add quant checks
Former-commit-id: 96a5044394bff75ca8ef17bd7d07d4da66f797f0
2024-06-27 01:12:25 +08:00
hiyouga
6e03536dca update examples
Former-commit-id: d417e63f928048036717d51eba42be843c51dff2
2024-06-27 00:53:33 +08:00
hiyouga
f2425cb4ed tiny fix
Former-commit-id: f17c9dfd848ff8ee33fb4db70e40bca15153e40f
2024-06-27 00:46:41 +08:00
hiyouga
da80d41c78 tiny fix
Former-commit-id: 29c710da3a8984c5f6522a79368344544319ebce
2024-06-27 00:36:04 +08:00
hiyouga
d2d9fa4abb support HQQ/EETQ #4113
Former-commit-id: ad144c2265cdee0d23014dbb3d017ea257cb26ed
2024-06-27 00:29:42 +08:00
hzhaoy
c662c2e56f add flash-attn installation flag in Dockerfile
Former-commit-id: e19491b0f0446f2fb2154cf14e0b2fbba5b54808
2024-06-27 00:13:30 +08:00
hiyouga
6b2733ce12 improve autogptq integration
Former-commit-id: addca926de42f91366185a47eb8e777ed44a8e77
2024-06-26 22:11:44 +08:00
hiyouga
28e613efd0 fix #4458
Former-commit-id: 8d6cd69ac43afd4bd7c14bd02b0061455827ac9e
2024-06-26 19:52:35 +08:00
hiyouga
654116c0b1 fix #4556
Former-commit-id: 59e0b4f616736ede37cc37a13346b547f5a2d4e7
2024-06-26 19:43:16 +08:00
fanjunliang
e7bd3ab6c3 fix torch-npu dependency
Former-commit-id: 8096f94a7db3e357c52906c83cad9f3c360bad7e
2024-06-26 18:21:42 +08:00
hoshi-hiyouga
2300fb616b Merge pull request #4544 from MengqingCao/npu
fix docker-compose path

Former-commit-id: 82d0b46bc9f9f52ed1cadd697726cf4089bc73ca
2024-06-26 10:19:24 +08:00
MengqingCao
7c7d6614d8 fix docker-compose path
Former-commit-id: 106647a99d36c4a624511edf651cf5f7624c4c9a
2024-06-26 02:15:00 +00:00
hzhaoy
08a221443c support flash-attn in Dockerfile
Former-commit-id: c88b1be9f3dfaf5fe65448dea20fc697b4f257bd
2024-06-25 15:13:07 +08:00
hiyouga
1a79dd23ff fix #4456
Former-commit-id: 50b44d3c6da7c9cb24a60fedec171fb1de3e764a
2024-06-25 14:34:13 +08:00
hiyouga
f3f25ae3b7 lint
Former-commit-id: 555ca8d780a1fbaf42e73450f5eb33048329d921
2024-06-25 02:55:50 +08:00
hiyouga
80effa2993 fix test case
Former-commit-id: c244af0dc3478532de02271667e7af4ad8f54228
2024-06-25 02:51:49 +08:00
hiyouga
0ae1302e41 fix #4432
Former-commit-id: 1e9d0aa1e45fac52614e79a9fe87e8f1d3757333
2024-06-25 02:34:04 +08:00
hiyouga
ad0304e147 fix #4379
Former-commit-id: cc016461e63a570142b56d50a5d11e55a96ab8db
2024-06-25 02:31:44 +08:00
hiyouga
a225b5a70c tiny fix about badam
Former-commit-id: 095fab58d3692607c9e78747b4218ae1abcf5aaf
2024-06-25 01:54:53 +08:00
hiyouga
dafc9268bc fix #4419
Former-commit-id: efb81b25ecd5cb9f4cfda8f2da8b159e4ab26a90
2024-06-25 01:51:29 +08:00
hoshi-hiyouga
fe6ef6400c Merge pull request #4352 from Ledzy/main
[Enhancement] Support ZeRO-3 when using BAdam

Former-commit-id: d0f953bf5bdbfd49acc82ff055bd54889241761a
2024-06-25 01:49:13 +08:00
hiyouga
d519c2fde5 tiny fix
Former-commit-id: 41086059b12ecb7827eb390294e315068ff9c2e6
2024-06-25 01:15:19 +08:00
hoshi-hiyouga
ab1fbbc3ec Merge pull request #4355 from MengqingCao/npu
Add docker-npu

Former-commit-id: d0e60599026e274d52b0c49806f246e712143708
2024-06-25 01:07:43 +08:00
hoshi-hiyouga
678884f97c Update README_zh.md
Former-commit-id: ec95f942d1f36dee9facb687ae4168e7c3c4d3f5
2024-06-25 01:06:59 +08:00
hoshi-hiyouga
cbc23fc299 Update README.md
Former-commit-id: 5dc8fa647e9af2c6d666c9559553c05d1c4860b3
2024-06-25 01:03:38 +08:00
hoshi-hiyouga
af5b2b9299 Update docker-compose.yml
Former-commit-id: 721acd87688b3edd41d2e1890971eb387afc1b6e
2024-06-25 00:54:28 +08:00
hoshi-hiyouga
6cd45e95f7 Update Dockerfile
Former-commit-id: 3af936a76d726c73d14ff6cf8c3088bda5b904c3
2024-06-25 00:50:34 +08:00
hoshi-hiyouga
62e63d74ec Update docker-compose.yml
Former-commit-id: 15608d055827e41b44fb8a381b1fe5a1230afcb0
2024-06-25 00:46:47 +08:00
hoshi-hiyouga
cfa2dbefcb Update Dockerfile
Former-commit-id: fce146ab681a5682d3f96254e3863f70aca8a4af
2024-06-25 00:46:08 +08:00
hoshi-hiyouga
f84bce3638 Update Dockerfile
Former-commit-id: dcc2e24f5cd441a242829f2fb090506db507babd
2024-06-24 23:41:35 +08:00
hoshi-hiyouga
37a079a072 Merge pull request #4409 from kno10/patch-2
Print help if no arguments given

Former-commit-id: 3bed18c6449bca9f060973fa15f848e4ceb988f4
2024-06-24 23:21:31 +08:00
hoshi-hiyouga
60937ccf32 Update cli.py
Former-commit-id: acb61f7ab719202a217d8f42850a9c3797e0c5b1
2024-06-24 23:21:10 +08:00
hoshi-hiyouga
709bbc1d92 Merge pull request #4417 from mMrBun/main
Add tool_format parameter to rewrite templates for different function call formats.

Former-commit-id: def6d280db3a9fe468b05503bcd9929c83c6c19b
2024-06-24 23:17:55 +08:00
hoshi-hiyouga
18863245df Update test_formatter.py
Former-commit-id: 672152d2ce6b49d7668c70100d877a1c34c08eae
2024-06-24 23:14:36 +08:00
hoshi-hiyouga
b7f5cfde6e Update template.py
Former-commit-id: 1240bd57d8a21540c636a6da839e6b3112d1395a
2024-06-24 23:12:59 +08:00
hoshi-hiyouga
673f27a59e Update loader.py
Former-commit-id: dddfd516ee66e9937e21f05300832aab45034b12
2024-06-24 23:06:18 +08:00
hiyouga
47651a94a3 fix #4410
Former-commit-id: fca893d73c3d7bbb87a816522f2e1568d3e9c612
2024-06-24 22:34:31 +08:00
hoshi-hiyouga
f3a2dda567 Merge pull request #4445 from MengqingCao/label
auto-label npu issue

Former-commit-id: e0014db7d2771a1f32f2b7a9f3e577e34390e818
2024-06-24 22:02:05 +08:00
hoshi-hiyouga
78baa8a509 Update label_issue.yml
Former-commit-id: 80d1910a93213b18c94e122b7e48cba0a20fd4e6
2024-06-24 22:01:23 +08:00
hoshi-hiyouga
1a0758b0a1 Update label_issue.yml
Former-commit-id: aa60cd8910e5ad2ede574e892522f34e43f59f2f
2024-06-24 21:59:39 +08:00
hoshi-hiyouga
fe407e8de6 Merge pull request #4446 from stceum/bug-fix
Bug Fix: `off` is parsed as `False` in yaml file

Former-commit-id: cc452c32c7f882c7f024a9d229352918a9eaa925
2024-06-24 21:41:28 +08:00
hoshi-hiyouga
e74fcdf7b1 Update parser.py
Former-commit-id: e90c424f55b17e4971f8b9d85b6aeac89bb6b98e
2024-06-24 21:37:42 +08:00
hoshi-hiyouga
a9f10a9abd Update test_attention.py
Former-commit-id: a9b3d91952dd5a51ff97fbb40a2dd88885d380b8
2024-06-24 21:35:34 +08:00
stceum
9aa640f27b Bug Fix: off is parsed as False in yaml file, changed to disabled to avoid this.
Former-commit-id: 3ed063f281d1c2563df1b9eb3800543208c9dc16
2024-06-24 20:39:31 +08:00
MengqingCao
f923989a6e auto-label npu issue
Former-commit-id: 90c74ff2519229eb1e509d62426ca3b5e83ef03e
2024-06-24 12:27:00 +00:00
MengqingCao
3b499948a5 update docker files
1. add docker-npu (Dockerfile and docker-compose.yml)
2. move cuda docker to docker-cuda and tiny changes to adapt to the new path


Former-commit-id: d7207e8ad10c7df6dcb1f5e59ff8eb06f9d77e67
2024-06-24 10:57:36 +00:00
hiyouga
a1df18c5df update readme
Former-commit-id: 4ea84a833399ca434f23bdc100c0851d5b53e05b
2024-06-24 18:29:04 +08:00
hiyouga
7be502c5c5 update readme
Former-commit-id: e507e60638b2e8c66f24805b3b28f6b9f98f5924
2024-06-24 18:22:12 +08:00
codemayq
bb9f48590f update wechat
Former-commit-id: 5b897e7c35618330d4f527a66c3da81a48402e9b
2024-06-22 11:57:39 +08:00
mMrBun
c0e005e2ea Add tool_format to overwrite tool formatter template
Former-commit-id: 20e2e6fdcb0cd1771906be035745a2d9fcd3e138
2024-06-22 02:13:23 +08:00
hiyouga
98abb5c900 remove dup template
Former-commit-id: db9a1912e3551394039cc57b4913f03e8f9aa29d
2024-06-22 01:31:32 +08:00
hiyouga
ccc9a895a6 fix api
Former-commit-id: 3ce44dda991e9cf6cbce1fbe22ef1927893b6c4d
2024-06-22 00:00:38 +08:00
Erich Schubert
cf23a279fd Print help if no arguments given
Former-commit-id: 7d70ba7fb86844d0fa307889c6d671939db1cce4
2024-06-21 09:14:21 +02:00
ancv
5319447aa5 move configure_packing to llamafactory.model.patcher and fix constants
Former-commit-id: 770f75dc8363bfa284a72159ff8ad25ec9abe4e0
2024-06-21 00:45:06 +07:00
hiyouga
0844750bb9 tiny fix
Former-commit-id: 8d4f5093cfcccfe9df173b4c4f7ec0125aecf198
2024-06-20 22:56:05 +08:00
hoshi-hiyouga
7d3b21684c Merge pull request #4382 from MengqingCao/bugfix
upper bound numpy version to <2.0

Former-commit-id: a459624474f8c0aa8977e977190c0ca6b42221ac
2024-06-20 10:19:37 +08:00
MengqingCao
cd563116ca update dependencies
Former-commit-id: 7d4a29303350711558566d10d02230ed85ee1b69
2024-06-20 02:09:47 +00:00
hiyouga
6ea4680334 improve llamaboard
Former-commit-id: f22d8f9ca49681e1ca52c9bdedfe3b5c5a3409b8
2024-06-19 23:46:03 +08:00
hiyouga
029c343537 fix llamaboard abort
Former-commit-id: 3f84411b5d904dbbc5cddeda44884417fe68f689
2024-06-19 23:22:28 +08:00
hiyouga
030b4811c7 update patcher
Former-commit-id: 3b040e8e0f78dbb6bc1409a1b2b788e1affc7458
2024-06-19 21:27:00 +08:00
hiyouga
80e9f8e000 set dev version
Former-commit-id: 42e69a3c634ccae792bd8ffb4642061ee475e836
2024-06-19 21:08:16 +08:00
hiyouga
fded2306dc Update publish.yml
Former-commit-id: 87e330fee50a7dd361c317b94ce19b17eda55d6f
2024-06-19 20:46:33 +08:00
hiyouga
9c1b04cd11 release v0.8.2
Former-commit-id: 71327ba85a3a1bb2d2d20c86951c6c7c0ba98829
2024-06-19 20:42:09 +08:00
hiyouga
3d72b1a856 fix jinja template
Former-commit-id: 2b596fb55ff689d2e488d9a9bbab98f70f356c3c
2024-06-19 20:03:50 +08:00
hiyouga
7735456561 fix templates
Former-commit-id: 4cff6a4ad55b24bf57db6be5cf817180c1ea5626
2024-06-19 17:44:05 +08:00
codingma
53b48eb052 update wechat_npu.jpg
Former-commit-id: c48cbc371dea4eed1a76bdda4346e4346db05287
2024-06-19 14:02:24 +08:00
Jonery
c779899f7b Cleaner integration.
Former-commit-id: 5c2ff1b749a265dd3c979189ec491d8ac911a6f6
2024-06-19 12:29:40 +08:00
hiyouga
c9557241f6 fix bug
Former-commit-id: 6d2bf216ac3a48450e861148ce664dad717fd019
2024-06-19 03:49:23 +08:00
hiyouga
e73a235a38 use prefix to replace force system
Former-commit-id: 4f22eae8f405de918237d406e5e9847592925565
2024-06-19 03:39:52 +08:00
hiyouga
bccc852f76 fix tool formatter, allow parallel function #4362
Former-commit-id: cd75b1fe9d91fb52a9ae6de7435302ff06b4d933
2024-06-19 03:23:51 +08:00
hoshi-hiyouga
6db02615d4 Merge pull request #4173 from mMrBun/main
Implemented the tool_formatter and tool_extractor for glm4 and Qwen2 tool_format

Former-commit-id: c0ca42566c6aeccd8d384377510690eafef10995
2024-06-19 03:18:55 +08:00
hiyouga
89564e90d7 update data
Former-commit-id: 9ab0401948d02d029134aa669c378e2ad80fb9fb
2024-06-19 02:48:43 +08:00
hiyouga
9e5988717d tiny fix
Former-commit-id: 344b9a36b2e0b60ee61fba171b35a391e3517fed
2024-06-18 23:32:18 +08:00
hoshi-hiyouga
9055e66643 Merge pull request #4314 from EliMCosta/patch-2
Fix Dockerfile

Former-commit-id: 89a50dbfdeef4c51bce32275348d15313256039f
2024-06-18 23:30:59 +08:00
hoshi-hiyouga
9b30635ff0 Merge pull request #4309 from EliMCosta/patch-1
Add Magpie and Webinstruct dataset samples

Former-commit-id: 10316dd8ca812382ddbaad0b8fce67d9b000df34
2024-06-18 23:30:19 +08:00
hiyouga
e3bf22f61b add deepseek coder v2 #4346
Former-commit-id: a233fbc258d38c62d78b9d1eaf034720361795e6
2024-06-18 22:53:54 +08:00
hiyouga
5156114981 fix #4357
Former-commit-id: 4bd77d8563aa85230af65caf901214247e214bed
2024-06-18 22:42:45 +08:00
hoshi-hiyouga
b596addd1f Merge pull request #4334 from zzxzz12345/bugfix/add-pandas-versions
Update requirements.txt

Former-commit-id: 078040babdff12b96878e96629d8f657c8e4c3b5
2024-06-18 22:30:35 +08:00
hoshi-hiyouga
09c34e5b6c Update requirements.txt
Former-commit-id: e8c518c08a1235f83f66f83d6f8a6fcad8c598df
2024-06-18 22:27:24 +08:00
hiyouga
15a5eb6647 fix #4335
Former-commit-id: c96264bc477d65276557e9059cac7c550c4835a8
2024-06-18 22:08:56 +08:00
Jonery
bc1c082bc2 add example
Former-commit-id: 97c523516093961983037922e7fc84e4010d5fec
2024-06-18 13:50:26 +08:00
Jonery
c2734108e7 fix typo
Former-commit-id: 8f7c78b64138602406af748b0e15948ebbd2dcb5
2024-06-18 12:39:26 +08:00
Jonery
3a5eacb4cf Support distributed BAdam.
Former-commit-id: 0f72aac8c9227e33ad20d2b1641b1c9faae16a5f
2024-06-18 12:27:47 +08:00
hiyouga
19bf21efba lint
Former-commit-id: 24c160df3d575843e5ad5f1b47246d04430a79f0
2024-06-17 22:35:56 +08:00
hiyouga
3d85217464 update chat engine #4335
Former-commit-id: 7857c0990b37d37e58691257eea13fe5e244aaa0
2024-06-17 19:07:17 +08:00
hiyouga
9e0ec3831f update readme
Former-commit-id: fcb2e8e7b7b79915af24c4e3264b579b3649ea90
2024-06-17 18:47:24 +08:00
Jonery
5d59f6562a Merge remote-tracking branch 'upstream/main'
Former-commit-id: ea1f3ba5e030504e07053484f50f4cbdb37808bc
2024-06-17 18:44:51 +08:00
Jonery
67df86201a update gitignore
Former-commit-id: b2fc9cc15f2fc3c9536681260beedd3a224ef40f
2024-06-17 18:29:36 +08:00
Jonery
756566342d adapt for badam with ds zero3
Former-commit-id: 33b437277846d4f0b64c13a0bc892ef4f345a21e
2024-06-17 18:18:10 +08:00
hiyouga
7ef169ed39 fix #4326
Former-commit-id: e2665e71c7428014d46d91542b01a58c1064d05a
2024-06-17 18:17:48 +08:00
hiyouga
0a40ee5444 Update wechat.jpg
Former-commit-id: 72471ee046a69acea25fd4d67679ea942d00eb8f
2024-06-17 17:49:03 +08:00
hiyouga
004f289074 tiny fix
Former-commit-id: 2bf2863a58c93206f271de17d7dfcbcd6375cd73
2024-06-17 17:47:25 +08:00
胡翀
8ab2d707e5 Update requirements.txt
add pandas version requirements

Former-commit-id: 12869c3ede9bf11bc0fbdfa7af559808551563be
2024-06-17 16:45:57 +08:00
Eli Costa
191dac26c0 Fix Dockerfile
Adds the commands to correctly execute LLama-Factory servers

Former-commit-id: df12621daed8037496dbb6a57e653659a209062b
2024-06-16 19:16:23 -03:00
Eli Costa
d7459853d8 Update README_zh.md
Fix details tag in datasets menus

Former-commit-id: 3ec57ac239a4f469bbae013ec8760307fb190189
2024-06-16 11:34:31 -03:00
Eli Costa
ee30db72a3 Update README_zh.md
Add Magpie and WebInstruct to README

Former-commit-id: 82d5c5c1e8dda61523dee4be351c18731e4a5b9c
2024-06-16 11:22:06 -03:00
Eli Costa
26e942b0ad Update README.md
Add Magpie and Webinstruct to README

Former-commit-id: 103664203cf5a8562b5b000676ce95a6da2b7698
2024-06-16 11:19:25 -03:00
Eli Costa
6bbb8b4cd8 Add Magpie and Webinstruct dataset samples
Adds two dataset samples with claimed superior performance: Magpie (from Allen AI) and WebInstruct (from TIGER-Lab).

Former-commit-id: 74e49cca957d0bacd2c1d688e995a7370bef69f7
2024-06-15 19:31:56 -03:00
ancv
988231026a update packing with sdpa and eager attention mode
Former-commit-id: 238f5c3d99809c6ae2571b59bdce8d8ea3c700b9
2024-06-16 02:25:47 +07:00
hoshi-hiyouga
06bbc29614 Update parser.py
Former-commit-id: 29c1f31baa442e35714b18b7e51896274a828cae
2024-06-16 02:57:00 +08:00
hiyouga
5769cc8d06 update pr template
Former-commit-id: 0a2ec5fe2049a6d6bc3955c7e4794bd6115ccbcd
2024-06-16 01:43:43 +08:00
hoshi-hiyouga
0ddf7bd28a Merge pull request #4307 from hiyouga/pissa
Support pissa

Former-commit-id: b7b5892a34ed329630db9b79bd5453f15440b4c3
2024-06-16 01:41:50 +08:00
hiyouga
ce4a27a5f7 fix tol
Former-commit-id: 46093b5786611d99adf1fd3d42926a728fc629f8
2024-06-16 01:38:44 +08:00
hiyouga
a8655f5c08 Update tests.yml
Former-commit-id: 7f3c19e3ab616f93ffc8fee23aa15ca0b19a9683
2024-06-16 01:22:23 +08:00
hiyouga
2d2c78d66c increase tol
Former-commit-id: de43bee0b004c7e90811100474b3113590d0f130
2024-06-16 01:21:06 +08:00
hiyouga
f25b8626bf support pissa
Former-commit-id: 8c1046d78ac6c8f9429b73617e35e1eccb35138f
2024-06-16 01:08:12 +08:00
hiyouga
c0c6b8075a tiny fix
Former-commit-id: 38b6b0f52edeb8ba45aa03b415b3c0c1b0e0c1e4
2024-06-16 01:06:41 +08:00
ancv
9d9f8c6531 remove some unused params
Former-commit-id: 04315c3d92ecc25537e45d5807cb38bc290dcb16
2024-06-15 23:00:55 +07:00
hiyouga
96b82ccd4d use fixture
Former-commit-id: 80a9e6bf94cf14fa63e6b6cdf7e1ce13722c8b5e
2024-06-15 20:06:17 +08:00
hiyouga
8053929b20 add tests
Former-commit-id: 1b834f50be64ae9b5123da0e6f528cfbd5167477
2024-06-15 19:51:20 +08:00
hiyouga
f0d6e63f55 add minicpm #4227
Former-commit-id: 572d8bbfdd73c1a00b432f0d0411f46fad6aa1a6
2024-06-15 17:58:52 +08:00
hiyouga
2946153cea add license
Former-commit-id: d87108daa68bd40174b262be1ca65fe6e1b7ab56
2024-06-15 17:54:33 +08:00
hiyouga
4dcd124dbd update readme
Former-commit-id: acd84ce5350ef985e3712a40442c6f7a54d08d40
2024-06-15 05:13:16 +08:00
hiyouga
19ea51e460 fix #4271
Former-commit-id: f1aa6a411a55efea1064ea74b936ac8a3d856f8c
2024-06-15 05:11:33 +08:00
hiyouga
fcbfa70c19 disable DP
Former-commit-id: d519b4d76d39b21a21b1d2f6f7ce6b3af9525d03
2024-06-15 04:57:19 +08:00
hiyouga
ba9aa7e2aa fix #4292
Former-commit-id: 9092f963dba6f2e0eeade53b5b35e6b770f8348e
2024-06-15 04:47:13 +08:00
hiyouga
ab66ae8cd2 fix #4295
Former-commit-id: 78589cf90c6e12e612f269b1c771f19f3dad83d2
2024-06-15 04:34:55 +08:00
hiyouga
a3f4925c2c add test cases
Former-commit-id: b27269bd2b52fb9d43cde8a8b7f293099b0127a2
2024-06-15 04:05:54 +08:00
hiyouga
d4ce280fbc Update README.md
Former-commit-id: 2d43b8bb49057e14a9f79146acdcc0cfa94bcc5a
2024-06-13 16:02:21 +08:00
hiyouga
f81a839197 update examples
Former-commit-id: 892e561c28cbbab1ee38a8022ddd9b397c873563
2024-06-13 03:26:10 +08:00
hiyouga
27777c8e68 add quant check in webui export tab
Former-commit-id: c94e6c9411cc1165ea8180d377611a3ae47956e6
2024-06-13 03:19:18 +08:00
hiyouga
4c40171c55 Update llama3_full_sft_ds3.yaml
Former-commit-id: a19cdd39feafa91fdacae4818393f82e9724c53f
2024-06-13 03:16:20 +08:00
hiyouga
0926d81053 update examples
Former-commit-id: b6e008c152421db668c971b0828cbee6a80b16bc
2024-06-13 03:15:06 +08:00
hiyouga
8fccaf20c5 fix #4221
Former-commit-id: 6baafd4eb3147ad9f7d2952b8eb27c5486940f36
2024-06-13 02:48:21 +08:00
hiyouga
c51f5c2a0a update wechat
Former-commit-id: 9419f96609d791397f02b411966cd1ef779f2f97
2024-06-13 02:31:45 +08:00
hiyouga
81ed4d8abf fix #4209
DeepSpeed ZeRO3 has inflight param error when calling model.eval()


Former-commit-id: cf9f2d6c42b5a37038c9eededbb767eae6a3f67d
2024-06-13 02:25:50 +08:00
hiyouga
833aa324c2 clean code
Former-commit-id: 2ed8270112755971e3f2dfd2f29c5939b077330a
2024-06-13 01:58:16 +08:00
hoshi-hiyouga
7366647b43 Merge pull request #4246 from hzhaoy/adapt-vllm-v0.5.0
adapt vllm==0.5.0

Former-commit-id: 1f23f25226f43bb2b2116474b9c6ab2d72e7e260
2024-06-13 01:54:02 +08:00
hiyouga
e89d1b1ec3 add neo-sft dataset
Former-commit-id: c7a5620ccc72b7574255ea764693ccb866c48263
2024-06-13 01:00:56 +08:00
hiyouga
99ce085415 fix lint
Former-commit-id: 713fde4259233af645bade7790211064a07a2a6f
2024-06-13 00:48:44 +08:00
hiyouga
b2b0b96051 fix docker compose usage
Former-commit-id: 947a34f53b74e4cd2b964941cf1580bcabde2228
2024-06-13 00:07:48 +08:00
hzhaoy
65f2ba3802 adapt vllm==0.5.0
Former-commit-id: 8fb6366ebec432c1333c80eb51007460ac5896bd
2024-06-12 18:29:03 +08:00
hiyouga
77e4dc255f update readme
Former-commit-id: 2ce2e5bc478f6ffcafe8e6451b1fef4e8994694c
2024-06-12 17:39:12 +08:00
hiyouga
f8497921fe fix #4242
Former-commit-id: 577de2fa078ee62d61bca24b75fa59590e2ef66b
2024-06-12 16:50:11 +08:00
hoshi-hiyouga
bc5e97295e Merge pull request #4234 from kimdwkimdw/patch-1
Support vllm==0.5.0

Former-commit-id: 656b2bbdafa0208c28801d41ac4887518d3f0ce5
2024-06-12 16:39:09 +08:00
Arthur Kim
bebca6d01c Support vllm==0.5.0
Former-commit-id: d65a3f7cb6dcd34e9a3ff9e30dffc52a7671d50b
2024-06-12 16:49:12 +09:00
ancv
045eb155a2 implement efficient packing without cross-contamination attention
Former-commit-id: b2c367bc61c2778dc359613dca496d9e134c2743
2024-06-12 11:56:01 +07:00
codingma
c7e021a837 update wechat_npu.jpg
Former-commit-id: 557891debb8a64b73eea012f99780a7b76424cd5
2024-06-12 10:39:05 +08:00
hoshi-hiyouga
fecb9c9a76 Merge pull request #4204 from dignfei/main
fixbug: llama3 should use <|end_of_text|> to mark the end of text during continued pre-training

Former-commit-id: 9049aab911fa3fbe9cd463ae3805f81b484da052
2024-06-11 17:06:10 +08:00
hoshi-hiyouga
bf3de9bfe8 Update pretrain.py
Former-commit-id: 0c292332374fb96c3fc753abde42d070a0c1dca8
2024-06-11 17:02:14 +08:00
hiyouga
8c574eb3cb fix deepspeed version
Former-commit-id: cca6f351081903ca3b5f79f10accc1bbbae0ee61
2024-06-11 16:52:36 +08:00
d
da39715085 After extensive continued pre-training and comparison experiments, found this bug: the tokenizer.eos_token that llama3 uses during pre-training is '<|end_of_text|>', and that is what must be appended after each sample here, not '<|eot_id|>'; otherwise severe performance degradation easily follows
Former-commit-id: 6979f3f8480755604d8aea8164f6418126e094c5
2024-06-11 16:23:40 +08:00
hiyouga
6c9cc199ef Update bug-report.yml
Former-commit-id: 53b74361d3d76c602f80854320efc555164afe44
2024-06-11 15:40:21 +08:00
hiyouga
5834651c4a fix #4198
Former-commit-id: 89f2bd8c8c035181927bd530a7ffc733407d674c
2024-06-11 15:38:38 +08:00
hiyouga
53de7f7cc3 tiny fix
Former-commit-id: 90e14a960d1437a16d35dbabbb8aa50714583d3a
2024-06-11 12:48:53 +08:00
hoshi-hiyouga
4ab902d558 Merge pull request #4191 from iamthebot/al--add_manifest_for_reqs
Add MANIFEST.in so requirements.txt is present in sdist

Former-commit-id: 796699f8672e45d53ca2e6eb635b6dee2113d3ef
2024-06-11 10:41:15 +08:00
Alfredo Luque
fb20b8dcc8 add manifest so requirements.txt in sdist
Former-commit-id: 5f7b3b3ff6a0965bb5036868f7ca5612780d6766
2024-06-11 00:07:06 +00:00
hiyouga
e3baa5aa08 tiny fix
Former-commit-id: 3f24337a8a995b145b1e8075bc23878eaa363844
2024-06-11 01:04:16 +08:00
hiyouga
d6632fefc9 set dev version
Former-commit-id: 91e62a098fd997d0d1d12baef64d089aabc01fba
2024-06-11 00:50:53 +08:00
hiyouga
75e1bbf128 release v0.8.1
Former-commit-id: 2b6ebd6b51133cf114d6f0e8605ad2bb26aa6d65
2024-06-11 00:44:26 +08:00
hiyouga
2f164c2c41 fix #4160
The split heads should be concatenated in dim=2


Former-commit-id: a793e8456b664ea0b48f0ba162999f18d06b4c2f
2024-06-11 00:37:17 +08:00
hiyouga
d984776d35 fix #4145
Fix the docker image


Former-commit-id: 949e9908ad634874cf5449ee9904745c7acda611
2024-06-11 00:19:17 +08:00
hiyouga
144544cd37 update evaluator
Former-commit-id: 0012762b04d4082a3a7d98889cc3024adc92fc44
2024-06-10 23:56:00 +08:00
hiyouga
3b244a69dc fix #2666
Former-commit-id: c907d816670975daa900898660d3503708b7fc37
2024-06-10 21:24:15 +08:00
hoshi-hiyouga
55b40b3d62 Merge pull request #4167 from yzoaim/branch
fix README

Former-commit-id: ef4afdaf0e718ef7c316bbee4c618a1f92dadaa6
2024-06-10 16:24:33 +08:00
mMrBun
b6d63b3324 Optimize the handling of QWEN2 in scenarios involving multiple tool calls.
Former-commit-id: 950e360ca00c29febadc14d5995de7d57b5c43a7
2024-06-10 02:00:14 +08:00
mMrBun
3f11ab800f Removed unnecessary comments.
Former-commit-id: 6ed0b0c800d416379acf8395aa852c188c107eb9
2024-06-09 18:25:22 +08:00
mMrBun
daf472994d Merge branch 'hiyouga:main' into main
Former-commit-id: 0f2609ce19492f0bab9b4880ded228b5513e5907
2024-06-09 18:17:24 +08:00
mMrBun
18a86ea104 Implemented the tool_formatter and tool_extractor for glm4 tool_format
Former-commit-id: cb1cbcb293917e960cad8f0eac7a11a122ab644a
2024-06-09 18:16:15 +08:00
-.-
b187450340 fix README
Former-commit-id: 483cdd9b6ad42bc43a97df8ce867e3a9ef9bf5bc
2024-06-08 23:51:56 +08:00
hiyouga
e4b9c80104 add pr ci
Former-commit-id: b61d25cd705db821664f87adf37d2ed56c68130f
2024-06-08 21:25:35 +08:00
hiyouga
0c44309e15 Update tests.yml
Former-commit-id: 25c635ef2849fd173617b6a8d8d3f5d4f800a893
2024-06-08 21:15:36 +08:00
hiyouga
1c31809652 update git workflows
Former-commit-id: 4c4f950f3936edbce84796c9a5e11b61ea07864e
2024-06-08 21:11:32 +08:00
hiyouga
1a261add61 fix llamafactory-cli env
Former-commit-id: 972ec9c668de1a9b6d872187dbc0c1d94f6fec6b
2024-06-08 07:15:45 +08:00
hiyouga
de3400a521 set dev version
Former-commit-id: 3ac11e77cccf686e0da499bd152997133b49a265
2024-06-08 06:46:09 +08:00
302 changed files with 26008 additions and 8798 deletions


@@ -3,11 +3,13 @@
.github
.venv
cache
data
docker
saves
hf_cache
ms_cache
om_cache
shared_data
output
examples
.dockerignore
.gitattributes
.gitignore
Dockerfile

.env.local Normal file

@@ -0,0 +1,42 @@
# Note: a .env file is not actually supported; this file is for reference only
# api
API_HOST=
API_PORT=
API_KEY=
API_MODEL_NAME=
API_VERBOSE=
FASTAPI_ROOT_PATH=
MAX_CONCURRENT=
# general
DISABLE_VERSION_CHECK=
FORCE_CHECK_IMPORTS=
ALLOW_EXTRA_ARGS=
LLAMAFACTORY_VERBOSITY=
USE_MODELSCOPE_HUB=
USE_OPENMIND_HUB=
USE_RAY=
RECORD_VRAM=
OPTIM_TORCH=
NPU_JIT_COMPILE=
# torchrun
FORCE_TORCHRUN=
MASTER_ADDR=
MASTER_PORT=
NNODES=
NODE_RANK=
NPROC_PER_NODE=
# wandb
WANDB_DISABLED=
WANDB_PROJECT=
WANDB_API_KEY=
# gradio ui
GRADIO_SHARE=
GRADIO_SERVER_NAME=
GRADIO_SERVER_PORT=
GRADIO_ROOT_PATH=
GRADIO_IPV6=
# setup
ENABLE_SHORT_CONSOLE=
# reserved (do not use)
LLAMABOARD_ENABLED=
LLAMABOARD_WORKDIR=
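As an aside, since the note at the top of the file says .env itself is not loaded, these settings only take effect through the process environment. A minimal sketch (the values here are illustrative examples, not recommendations):

```shell
# Set a couple of the variables listed above for the current process;
# the port number and hub flag are made-up example values.
export GRADIO_SERVER_PORT=7861
export USE_MODELSCOPE_HUB=1
echo "$GRADIO_SERVER_PORT $USE_MODELSCOPE_HUB"   # 7861 1
```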


@@ -19,3 +19,49 @@ There are several ways you can contribute to LLaMA Factory:
### Style guide
LLaMA Factory follows the [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html), check it for details.
### Create a Pull Request
1. Fork the [repository](https://github.com/hiyouga/LLaMA-Factory) by clicking on the [Fork](https://github.com/hiyouga/LLaMA-Factory/fork) button on the repository's page. This creates a copy of the code under your GitHub user account.
2. Clone your fork to your local disk, and add the base repository as a remote:
```bash
git clone git@github.com:[username]/LLaMA-Factory.git
cd LLaMA-Factory
git remote add upstream https://github.com/hiyouga/LLaMA-Factory.git
```
3. Create a new branch to hold your development changes:
```bash
git checkout -b dev_your_branch
```
4. Set up a development environment by running the following command in a virtual environment:
```bash
pip install -e ".[dev]"
```
If LLaMA Factory was already installed in the virtual environment, remove it with `pip uninstall llamafactory` before reinstalling it in editable mode with the -e flag.
5. Check code before commit:
```bash
make commit
make style && make quality
make test
```
6. Submit changes:
```bash
git add .
git commit -m "commit message"
git fetch upstream
git rebase upstream/main
git push -u origin dev_your_branch
```
7. Create a merge request from your branch `dev_your_branch` at [origin repo](https://github.com/hiyouga/LLaMA-Factory).

.github/ISSUE_TEMPLATE/1-bug-report.yml vendored Normal file

@@ -0,0 +1,61 @@
name: "\U0001F41B Bug / help"
description: Create a report to help us improve the LLaMA Factory
labels: ["bug", "pending"]
body:
- type: markdown
attributes:
value: |
Issues included in **[FAQs](https://github.com/hiyouga/LLaMA-Factory/issues/4614)** or those with **insufficient** information may be closed without a response.
已经包含在 **[常见问题](https://github.com/hiyouga/LLaMA-Factory/issues/4614)** 内或提供信息**不完整**的 issues 可能不会被回复。
- type: markdown
attributes:
value: |
Please do not create issues that are not related to framework bugs under this category, use **[Discussions](https://github.com/hiyouga/LLaMA-Factory/discussions/categories/q-a)** instead.
请勿在此分类下创建和框架 bug 无关的 issues训练问题求助请使用 **[讨论区](https://github.com/hiyouga/LLaMA-Factory/discussions/categories/q-a)**。
- type: checkboxes
id: reminder
attributes:
label: Reminder
description: |
Please ensure you have read the above rules carefully and searched the existing issues (including FAQs).
请确保您已经认真阅读了上述规则并且搜索过现有的 issues包括常见问题
options:
- label: I have read the above rules and searched the existing issues.
required: true
- type: textarea
id: system-info
validations:
required: true
attributes:
label: System Info
description: |
Please share your system info with us. You can run the command **llamafactory-cli env** and copy-paste its output below.
请提供您的系统信息。您可以在命令行运行 **llamafactory-cli env** 并将其输出复制到该文本框中。
placeholder: llamafactory version, platform, python version, ...
- type: textarea
id: reproduction
validations:
required: true
attributes:
label: Reproduction
description: |
Please provide entry arguments, error messages and stack traces that reproduces the problem.
请提供入口参数,错误日志以及异常堆栈以便于我们复现问题。
value: |
```text
Put your message here.
```
- type: textarea
id: others
validations:
required: false
attributes:
label: Others


@@ -0,0 +1,41 @@
name: "\U0001F680 Feature request"
description: Submit a request for a new feature
labels: ["enhancement", "pending"]
body:
- type: markdown
attributes:
value: |
Please do not create issues that are not related to new features under this category.
请勿在此分类下创建和新特性无关的 issues。
- type: checkboxes
id: reminder
attributes:
label: Reminder
description: |
Please ensure you have read the above rules carefully and searched the existing issues.
请确保您已经认真阅读了上述规则并且搜索过现有的 issues。
options:
- label: I have read the above rules and searched the existing issues.
required: true
- type: textarea
id: description
validations:
required: true
attributes:
label: Description
description: |
A clear and concise description of the feature proposal.
请详细描述您希望加入的新功能特性。
- type: textarea
id: contribution
validations:
required: false
attributes:
label: Pull Request
description: |
Have you already created the relevant PR and submitted the code?
您是否已经创建了相关 PR 并提交了代码?


@@ -1,58 +0,0 @@
name: "\U0001F41B Bug / Help"
description: Create a report to help us improve the LLaMA Factory
body:
- type: checkboxes
id: reminder
attributes:
label: Reminder
description: |
Please ensure you have read the README carefully and searched the existing issues.
请确保您已经认真阅读了 README 并且搜索过现有的 Issue。
options:
- label: I have read the README and searched the existing issues.
required: true
- type: textarea
id: system-info
validations:
required: true
attributes:
label: System Info
description: |
Please share your system info with us. You can run the command **llamafactory-cli env** and copy-paste its output below.
请提供您的系统信息。您可以在命令行运行 **llamafactory-cli env** 并将其输出复制到该文本框中。
placeholder: llamafactory version, platform, python version, ...
- type: textarea
id: reproduction
validations:
required: true
attributes:
label: Reproduction
description: |
Please provide code snippets, error messages and stack traces that reproduces the problem.
请提供运行参数,错误信息以及异常堆栈以便于我们复现该问题。
Remember to use Markdown tags to correctly format your code.
请合理使用 Markdown 标签来格式化您的文本。
placeholder: |
llamafactory-cli train ...
- type: textarea
id: expected-behavior
validations:
required: false
attributes:
label: Expected behavior
description: |
Please provide a clear and concise description of what you would expect to happen.
请提供您原本的目的,即这段代码的期望行为。
- type: textarea
id: others
validations:
required: false
attributes:
label: Others

.github/ISSUE_TEMPLATE/config.yml vendored Normal file

@@ -0,0 +1,8 @@
blank_issues_enabled: false
contact_links:
- name: 📚 FAQs | 常见问题
url: https://github.com/hiyouga/LLaMA-Factory/issues/4614
about: Reading in advance is recommended | 建议提前阅读
- name: Discussions | 讨论区
url: https://github.com/hiyouga/LLaMA-Factory/discussions
about: Please ask fine-tuning questions here | 请在这里讨论训练问题


@@ -5,3 +5,4 @@ Fixes # (issue)
## Before submitting
- [ ] Did you read the [contributor guideline](https://github.com/hiyouga/LLaMA-Factory/blob/main/.github/CONTRIBUTING.md)?
- [ ] Did you write any new necessary tests?

.github/workflows/docker.yml vendored Normal file

@@ -0,0 +1,108 @@
name: docker
on:
workflow_dispatch:
push:
branches:
- "main"
paths:
- "**/*.py"
- "requirements.txt"
- "docker/**"
- ".github/workflows/*.yml"
pull_request:
branches:
- "main"
paths:
- "**/*.py"
- "requirements.txt"
- "docker/**"
- ".github/workflows/*.yml"
jobs:
build:
strategy:
fail-fast: false
matrix:
device:
- "cuda"
- "npu"
runs-on: ubuntu-latest
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ matrix.device }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
environment:
name: docker
url: https://hub.docker.com/r/hiyouga/llamafactory
steps:
- name: Free up disk space
uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1
with:
tool-cache: true
docker-images: false
- name: Checkout
uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.9"
- name: Get llamafactory version
id: version
run: |
echo "tag=$(python setup.py --version | sed 's/\.dev0//')" >> "$GITHUB_OUTPUT"
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Docker Hub
if: ${{ github.event_name != 'pull_request' }}
uses: docker/login-action@v3
with:
username: ${{ vars.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Login to Quay
if: ${{ github.event_name != 'pull_request' && matrix.device == 'npu' }}
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ vars.QUAY_ASCEND_USERNAME }}
password: ${{ secrets.QUAY_ASCEND_TOKEN }}
- name: Build and push Docker image (CUDA)
if: ${{ matrix.device == 'cuda' }}
uses: docker/build-push-action@v6
with:
context: .
file: ./docker/docker-cuda/Dockerfile
build-args: |
EXTRAS=metrics,deepspeed,liger-kernel
push: ${{ github.event_name != 'pull_request' }}
tags: |
docker.io/hiyouga/llamafactory:latest
docker.io/hiyouga/llamafactory:${{ steps.version.outputs.tag }}
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Build and push Docker image (NPU)
if: ${{ matrix.device == 'npu' }}
uses: docker/build-push-action@v6
with:
context: .
platforms: linux/amd64,linux/arm64
file: ./docker/docker-npu/Dockerfile
push: ${{ github.event_name != 'pull_request' }}
tags: |
docker.io/hiyouga/llamafactory:latest-npu-a2
docker.io/hiyouga/llamafactory:${{ steps.version.outputs.tag }}-npu-a2
quay.io/ascend/llamafactory:latest-npu-a2
quay.io/ascend/llamafactory:${{ steps.version.outputs.tag }}-npu-a2
cache-from: type=gha
cache-to: type=gha,mode=max
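The tag-derivation step in this workflow (`python setup.py --version` piped through `sed`) can be sanity-checked locally. A minimal sketch with an assumed example version string, since the real value comes from setup.py:

```shell
# Mimic the workflow's tag step: strip a trailing ".dev0" from the
# package version ("0.8.3.dev0" is a made-up example value).
version="0.8.3.dev0"
tag=$(echo "$version" | sed 's/\.dev0//')
echo "$tag"   # 0.8.3
```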

.github/workflows/label_issue.yml vendored Normal file

@@ -0,0 +1,32 @@
name: label_issue
on:
issues:
types:
- opened
jobs:
label_issue:
runs-on: ubuntu-latest
permissions:
issues: write
steps:
- env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
ISSUE_URL: ${{ github.event.issue.html_url }}
ISSUE_TITLE: ${{ github.event.issue.title }}
run: |
LABEL=""
NPU_KEYWORDS=(npu huawei ascend 华为 昇腾 910)
ISSUE_TITLE_LOWER=$(echo "$ISSUE_TITLE" | tr '[:upper:]' '[:lower:]')
for KEYWORD in "${NPU_KEYWORDS[@]}"; do
if [[ $ISSUE_TITLE_LOWER == *$KEYWORD* ]] && [[ $ISSUE_TITLE_LOWER != *input* ]]; then
LABEL="npu"
break
fi
done
if [ -n "$LABEL" ]; then
gh issue edit "$ISSUE_URL" --add-label "$LABEL"
fi

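The labeling heuristic above can be exercised outside Actions. A minimal sketch, assuming bash and a hypothetical issue title; the `*input*` guard presumably exists because the word "input" contains the substring "npu" and would otherwise trigger a false match:

```shell
ISSUE_TITLE="Training fails on Ascend 910B"   # hypothetical issue title
NPU_KEYWORDS=(npu huawei ascend 910)          # subset of the workflow's keyword list
ISSUE_TITLE_LOWER="$(echo "$ISSUE_TITLE" | tr '[:upper:]' '[:lower:]')"
LABEL=""
for KEYWORD in "${NPU_KEYWORDS[@]}"; do
  # Match the keyword as a substring, but skip titles containing "input",
  # which would otherwise match the "npu" keyword spuriously.
  if [[ $ISSUE_TITLE_LOWER == *"$KEYWORD"* ]] && [[ $ISSUE_TITLE_LOWER != *input* ]]; then
    LABEL="npu"
    break
  fi
done
echo "$LABEL"   # prints npu
```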
.github/workflows/publish.yml
@@ -0,0 +1,36 @@
name: publish
on:
workflow_dispatch:
release:
types:
- published
jobs:
publish:
name: Upload release to PyPI
runs-on: ubuntu-latest
environment:
name: release
url: https://pypi.org/p/llamafactory
permissions:
id-token: write
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.9"
- name: Build package
run: |
make build
- name: Publish package
uses: pypa/gh-action-pypi-publish@release/v1

@@ -1,39 +1,99 @@
name: tests
on:
workflow_dispatch:
push:
branches:
- main
- "main"
paths:
- "**.py"
- "**/*.py"
- "requirements.txt"
- ".github/workflows/*.yml"
pull_request:
branches:
- main
- "main"
paths:
- "**.py"
- "**/*.py"
- "requirements.txt"
- ".github/workflows/*.yml"
jobs:
tests:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
python:
- "3.9"
- "3.10"
- "3.11"
- "3.12"
os:
- "ubuntu-latest"
- "windows-latest"
- "macos-13"
transformers:
- null
include: # test backward compatibility
- python: "3.9"
os: "ubuntu-latest"
transformers: "4.49.0"
- python: "3.9"
os: "ubuntu-latest"
transformers: "4.51.0"
runs-on: ${{ matrix.os }}
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ matrix.os }}-${{ matrix.python }}-${{ matrix.transformers }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
OS_NAME: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- name: Checkout
uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.8"
python-version: ${{ matrix.python }}
cache: "pip"
cache-dependency-path: "setup.py"
cache-dependency-path: "**/requirements*.txt"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install .[torch,dev]
python -m pip install ".[torch,dev]"
- name: Install transformers
if: ${{ matrix.transformers }}
run: |
python -m pip install "transformers==${{ matrix.transformers }}"
- name: Cache files
id: hf-hub-cache
uses: actions/cache@v4
with:
path: ${{ runner.temp }}/huggingface
key: huggingface-${{ matrix.os }}-${{ matrix.python }}-${{ matrix.transformers }}-${{ hashFiles('tests/version.txt') }}
- name: Check quality
run: |
make style && make quality
- name: Check license
run: |
make license
- name: Check build
run: |
make build
- name: Test with pytest
run: |
make test
env:
HF_HOME: ${{ runner.temp }}/huggingface
HF_HUB_OFFLINE: "${{ steps.hf-hub-cache.outputs.cache-hit == 'true' && '1' || '0' }}"

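The `HF_HUB_OFFLINE` value uses GitHub's `&& / ||` expression idiom as a ternary: `'1'` when the cache step reported a hit, `'0'` otherwise. A shell sketch of the same decision, with an assumed sample value standing in for the cache-hit output:

```shell
cache_hit="true"   # stand-in for steps.hf-hub-cache.outputs.cache-hit
if [ "$cache_hit" = "true" ]; then
  HF_HUB_OFFLINE=1   # cache restored: tests can run without reaching the Hub
else
  HF_HUB_OFFLINE=0
fi
echo "$HF_HUB_OFFLINE"   # prints 1
```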
.gitignore
@@ -159,7 +159,21 @@ cython_debug/
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/
# vscode
.vscode/
# uv
uv.lock
# custom .gitignore
user.config
saves/
hf_cache/
ms_cache/
om_cache/
cache/
config/
saves/
output/
wandb/
swanlog/
generated_predictions.jsonl
predictions_score.json

.pre-commit-config.yaml
@@ -0,0 +1,28 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: check-ast
- id: check-added-large-files
args: ['--maxkb=25000']
- id: check-merge-conflict
- id: check-yaml
- id: debug-statements
- id: end-of-file-fixer
- id: trailing-whitespace
args: [--markdown-linebreak-ext=md]
- id: no-commit-to-branch
args: ['--branch', 'main']
- repo: https://github.com/asottile/pyupgrade
rev: v3.17.0
hooks:
- id: pyupgrade
args: [--py38-plus]
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.6.9
hooks:
- id: ruff
args: [--fix]
- id: ruff-format

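Among the hooks above, `end-of-file-fixer` rewrites any file whose last byte is not a newline. A rough sketch of the check it performs, using a throwaway temp file as an assumed input:

```shell
f="$(mktemp)"                        # hypothetical file to normalize
printf 'last line' > "$f"            # deliberately missing its final newline
if [ -n "$(tail -c 1 "$f")" ]; then  # non-empty capture => last byte is not a newline
  printf '\n' >> "$f"                # append one, as the hook would
fi
[ -z "$(tail -c 1 "$f")" ] && echo "file ends with a newline"
rm -f "$f"
```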
@@ -12,12 +12,16 @@ authors:
given-names: "Yanhan"
- family-names: "Luo"
given-names: "Zheyan"
- family-names: "Feng"
given-names: "Zhangchi"
- family-names: "Ma"
given-names: "Yongqiang"
title: "LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models"
url: "https://arxiv.org/abs/2403.13372"
preferred-citation:
type: article
type: conference-paper
conference:
name: "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)"
authors:
- family-names: "Zheng"
given-names: "Yaowei"
@@ -29,9 +33,12 @@ preferred-citation:
given-names: "Yanhan"
- family-names: "Luo"
given-names: "Zheyan"
- family-names: "Feng"
given-names: "Zhangchi"
- family-names: "Ma"
given-names: "Yongqiang"
journal: "arXiv preprint arXiv:2403.13372"
title: "LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models"
url: "https://arxiv.org/abs/2403.13372"
year: 2024
publisher: "Association for Computational Linguistics"
address: "Bangkok, Thailand"

@@ -1,14 +0,0 @@
FROM nvcr.io/nvidia/pytorch:24.01-py3
WORKDIR /app
COPY requirements.txt /app/
RUN pip install -r requirements.txt
COPY . /app/
RUN pip install -e .[metrics,bitsandbytes,qwen]
VOLUME [ "/root/.cache/huggingface/", "/app/data", "/app/output" ]
EXPOSE 7860
CMD [ "llamafactory-cli", "webui" ]

MANIFEST.in
@@ -0,0 +1 @@
include LICENSE requirements.txt

@@ -1,6 +1,16 @@
.PHONY: quality style test
.PHONY: build commit license quality style test
check_dirs := scripts src tests
check_dirs := scripts src tests setup.py
build:
pip3 install build && python3 -m build
commit:
pre-commit install
pre-commit run --all-files
license:
python3 tests/check_license.py $(check_dirs)
quality:
ruff check $(check_dirs)
@@ -11,4 +21,4 @@ style:
ruff format $(check_dirs)
test:
pytest tests/
CUDA_VISIBLE_DEVICES= WANDB_DISABLED=true pytest -vv tests/

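The new `test` target prefixes pytest with two environment settings: an empty (but set) `CUDA_VISIBLE_DEVICES` hides every GPU so CI tests run on CPU, and `WANDB_DISABLED=true` keeps Weights & Biases from starting runs. A quick sanity check of the empty-variable behavior (assumes `python3` is on PATH):

```shell
# An empty CUDA_VISIBLE_DEVICES is still *set*, which frameworks read as "no devices".
out="$(env CUDA_VISIBLE_DEVICES= python3 -c 'import os; print(os.environ.get("CUDA_VISIBLE_DEVICES") == "")')"
echo "$out"   # prints True
```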
README.md
@@ -1,44 +1,86 @@
![# LLaMA Factory](assets/logo.png)
[![GitHub Repo stars](https://img.shields.io/github/stars/hiyouga/LLaMA-Factory?style=social)](https://github.com/hiyouga/LLaMA-Factory/stargazers)
[![GitHub Code License](https://img.shields.io/github/license/hiyouga/LLaMA-Factory)](LICENSE)
[![GitHub last commit](https://img.shields.io/github/last-commit/hiyouga/LLaMA-Factory)](https://github.com/hiyouga/LLaMA-Factory/commits/main)
[![GitHub contributors](https://img.shields.io/github/contributors/hiyouga/LLaMA-Factory?color=orange)](https://github.com/hiyouga/LLaMA-Factory/graphs/contributors)
[![GitHub workflow](https://github.com/hiyouga/LLaMA-Factory/actions/workflows/tests.yml/badge.svg)](https://github.com/hiyouga/LLaMA-Factory/actions/workflows/tests.yml)
[![PyPI](https://img.shields.io/pypi/v/llamafactory)](https://pypi.org/project/llamafactory/)
[![Citation](https://img.shields.io/badge/citation-44-green)](#projects-using-llama-factory)
[![GitHub pull request](https://img.shields.io/badge/PRs-welcome-blue)](https://github.com/hiyouga/LLaMA-Factory/pulls)
[![Discord](https://dcbadge.vercel.app/api/server/rKfvV9r9FK?compact=true&style=flat)](https://discord.gg/rKfvV9r9FK)
[![Citation](https://img.shields.io/badge/citation-730-green)](https://scholar.google.com/scholar?cites=12620864006390196564)
[![Docker Pulls](https://img.shields.io/docker/pulls/hiyouga/llamafactory)](https://hub.docker.com/r/hiyouga/llamafactory/tags)
[![Twitter](https://img.shields.io/twitter/follow/llamafactory_ai)](https://twitter.com/llamafactory_ai)
[![Discord](https://dcbadge.vercel.app/api/server/rKfvV9r9FK?compact=true&style=flat)](https://discord.gg/rKfvV9r9FK)
[![GitCode](https://gitcode.com/zhengyaowei/LLaMA-Factory/star/badge.svg)](https://gitcode.com/zhengyaowei/LLaMA-Factory)
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing)
[![Open in DSW](https://gallery.pai-ml.com/assets/open-in-dsw.svg)](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory)
[![Spaces](https://img.shields.io/badge/🤗-Open%20in%20Spaces-blue)](https://huggingface.co/spaces/hiyouga/LLaMA-Board)
[![Studios](https://img.shields.io/badge/ModelScope-Open%20in%20Studios-blue)](https://modelscope.cn/studios/hiyouga/LLaMA-Board)
[![Open in Alaya](assets/alaya_new.svg)](https://docs.alayanew.com/docs/documents/newActivities/llamafactory/?utm_source=LLaMA-Factory)
[![Open in Spaces](https://img.shields.io/badge/🤗-Open%20in%20Spaces-blue)](https://huggingface.co/spaces/hiyouga/LLaMA-Board)
[![Open in Studios](https://img.shields.io/badge/ModelScope-Open%20in%20Studios-blue)](https://modelscope.cn/studios/hiyouga/LLaMA-Board)
[![Open in Novita](https://img.shields.io/badge/Novita-Deploy%20Template-blue)](https://novita.ai/templates-library/105981?sharer=88115474-394e-4bda-968e-b88e123d0c47)
[![GitHub Trend](https://trendshift.io/api/badge/repositories/4535)](https://trendshift.io/repositories/4535)
### Used by [Amazon](https://aws.amazon.com/cn/blogs/machine-learning/how-apoidea-group-enhances-visual-information-extraction-from-banking-documents-with-multimodal-models-using-llama-factory-on-amazon-sagemaker-hyperpod/), [NVIDIA](https://developer.nvidia.com/rtx/ai-toolkit), [Aliyun](https://help.aliyun.com/zh/pai/use-cases/fine-tune-a-llama-3-model-with-llama-factory), etc.
👋 Join our [WeChat](assets/wechat.jpg).
<div align="center" markdown="1">
### Supporters ❤️
<a href="https://warp.dev/llama-factory">
<img alt="Warp sponsorship" width="400" src="https://github.com/user-attachments/assets/ab8dd143-b0fd-4904-bdc5-dd7ecac94eae">
</a>
#### [Warp, the agentic terminal for developers](https://warp.dev/llama-factory)
[Available for macOS, Linux, & Windows](https://warp.dev/llama-factory)
----
### Easily fine-tune 100+ large language models with zero-code [CLI](#quickstart) and [Web UI](#fine-tuning-with-llama-board-gui-powered-by-gradio)
![GitHub Trend](https://trendshift.io/api/badge/repositories/4535)
</div>
👋 Join our [WeChat group](assets/wechat.jpg), [NPU user group](assets/wechat_npu.jpg) or [Alaya NeW user group](assets/wechat_alaya.png).
\[ English | [中文](README_zh.md) \]
**Fine-tuning a large language model can be as easy as...**
https://github.com/hiyouga/LLaMA-Factory/assets/16256802/9840a653-7e9c-41c8-ae89-7ace5698baf6
https://github.com/user-attachments/assets/3991a3a8-4276-4d30-9cab-4cb0c4b9b99e
Choose your path:
- **Colab**: https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing
- **PAI-DSW**: https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory
- **Documentation (WIP)**: https://llamafactory.readthedocs.io/en/latest/
- **Documentation (AMD GPU)**: https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/fine_tune/llama_factory_llama3.html
- **Colab (free)**: https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing
- **Local machine**: Please refer to [usage](#getting-started)
- **PAI-DSW (free trial)**: https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory
- **Alaya NeW (cloud GPU deal)**: https://docs.alayanew.com/docs/documents/useGuide/LLaMAFactory/mutiple/?utm_source=LLaMA-Factory
> [!NOTE]
> Apart from the links above, all other websites are unauthorized third-party websites. Please use them with caution.
## Table of Contents
- [Features](#features)
- [Benchmark](#benchmark)
- [Blogs](#blogs)
- [Changelog](#changelog)
- [Supported Models](#supported-models)
- [Supported Training Approaches](#supported-training-approaches)
- [Provided Datasets](#provided-datasets)
- [Requirement](#requirement)
- [Getting Started](#getting-started)
- [Installation](#installation)
- [Data Preparation](#data-preparation)
- [Quickstart](#quickstart)
- [Fine-Tuning with LLaMA Board GUI](#fine-tuning-with-llama-board-gui-powered-by-gradio)
- [Build Docker](#build-docker)
- [Deploy with OpenAI-style API and vLLM](#deploy-with-openai-style-api-and-vllm)
- [Download from ModelScope Hub](#download-from-modelscope-hub)
- [Download from Modelers Hub](#download-from-modelers-hub)
- [Use W&B Logger](#use-wb-logger)
- [Use SwanLab Logger](#use-swanlab-logger)
- [Projects using LLaMA Factory](#projects-using-llama-factory)
- [License](#license)
- [Citation](#citation)
@@ -46,40 +88,100 @@ Choose your path:
## Features
- **Various models**: LLaMA, LLaVA, Mistral, Mixtral-MoE, Qwen, Yi, Gemma, Baichuan, ChatGLM, Phi, etc.
- **Various models**: LLaMA, LLaVA, Mistral, Mixtral-MoE, Qwen, Qwen2-VL, DeepSeek, Yi, Gemma, ChatGLM, Phi, etc.
- **Integrated methods**: (Continuous) pre-training, (multimodal) supervised fine-tuning, reward modeling, PPO, DPO, KTO, ORPO, etc.
- **Scalable resources**: 32-bit full-tuning, 16-bit freeze-tuning, 16-bit LoRA and 2/4/8-bit QLoRA via AQLM/AWQ/GPTQ/LLM.int8.
- **Advanced algorithms**: GaLore, BAdam, DoRA, LongLoRA, LLaMA Pro, Mixture-of-Depths, LoRA+, LoftQ and Agent tuning.
- **Practical tricks**: FlashAttention-2, Unsloth, RoPE scaling, NEFTune and rsLoRA.
- **Experiment monitors**: LlamaBoard, TensorBoard, Wandb, MLflow, etc.
- **Faster inference**: OpenAI-style API, Gradio UI and CLI with vLLM worker.
- **Scalable resources**: 16-bit full-tuning, freeze-tuning, LoRA and 2/3/4/5/6/8-bit QLoRA via AQLM/AWQ/GPTQ/LLM.int8/HQQ/EETQ.
- **Advanced algorithms**: [GaLore](https://github.com/jiaweizzhao/GaLore), [BAdam](https://github.com/Ledzy/BAdam), [APOLLO](https://github.com/zhuhanqing/APOLLO), [Adam-mini](https://github.com/zyushun/Adam-mini), [Muon](https://github.com/KellerJordan/Muon), DoRA, LongLoRA, LLaMA Pro, Mixture-of-Depths, LoRA+, LoftQ and PiSSA.
- **Practical tricks**: [FlashAttention-2](https://github.com/Dao-AILab/flash-attention), [Unsloth](https://github.com/unslothai/unsloth), [Liger Kernel](https://github.com/linkedin/Liger-Kernel), RoPE scaling, NEFTune and rsLoRA.
- **Wide tasks**: Multi-turn dialogue, tool using, image understanding, visual grounding, video recognition, audio understanding, etc.
- **Experiment monitors**: LlamaBoard, TensorBoard, Wandb, MLflow, [SwanLab](https://github.com/SwanHubX/SwanLab), etc.
- **Faster inference**: OpenAI-style API, Gradio UI and CLI with [vLLM worker](https://github.com/vllm-project/vllm) or [SGLang worker](https://github.com/sgl-project/sglang).
## Benchmark
### Day-N Support for Fine-Tuning Cutting-Edge Models
Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/ptuning), LLaMA Factory's LoRA tuning offers up to **3.7 times faster** training speed with a better Rouge score on the advertising text generation task. By leveraging the 4-bit quantization technique, LLaMA Factory's QLoRA further improves GPU memory efficiency.
| Support Date | Model Name |
| ------------ | -------------------------------------------------------------------- |
| Day 0 | Qwen3 / Qwen2.5-VL / Gemma 3 / GLM-4.1V / InternLM 3 / MiniCPM-o-2.6 |
| Day 1 | Llama 3 / GLM-4 / Mistral Small / PaliGemma2 / Llama 4 |
![benchmark](assets/benchmark.svg)
## Blogs
<details><summary>Definitions</summary>
- [Fine-tune Llama3.1-70B for Medical Diagnosis using LLaMA-Factory](https://docs.alayanew.com/docs/documents/bestPractice/bigModel/llama70B/) (Chinese)
- [A One-Stop Code-Free Model Reinforcement Learning and Deployment Platform based on LLaMA-Factory and EasyR1](https://aws.amazon.com/cn/blogs/china/building-llm-model-hub-based-on-llamafactory-and-easyr1/) (Chinese)
- [How Apoidea Group enhances visual information extraction from banking documents with multimodal models using LLaMA-Factory on Amazon SageMaker HyperPod](https://aws.amazon.com/cn/blogs/machine-learning/how-apoidea-group-enhances-visual-information-extraction-from-banking-documents-with-multimodal-models-using-llama-factory-on-amazon-sagemaker-hyperpod/) (English)
- [Easy Dataset × LLaMA Factory: Enabling LLMs to Efficiently Learn Domain Knowledge](https://buaa-act.feishu.cn/wiki/GVzlwYcRFiR8OLkHbL6cQpYin7g) (English)
- **Training Speed**: the number of training samples processed per second during the training. (bs=4, cutoff_len=1024)
- **Rouge Score**: Rouge-2 score on the development set of the [advertising text generation](https://aclanthology.org/D19-1321.pdf) task. (bs=4, cutoff_len=1024)
- **GPU Memory**: Peak GPU memory usage in 4-bit quantized training. (bs=1, cutoff_len=1024)
- We adopt `pre_seq_len=128` for ChatGLM's P-Tuning and `lora_rank=32` for LLaMA Factory's LoRA tuning.
<details><summary>All Blogs</summary>
- [Fine-tune Qwen2.5-VL for Autonomous Driving using LLaMA-Factory](https://docs.alayanew.com/docs/documents/useGuide/LLaMAFactory/mutiple/?utm_source=LLaMA-Factory) (Chinese)
- [LLaMA Factory: Fine-tuning the DeepSeek-R1-Distill-Qwen-7B Model for News Classifier](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_deepseek_r1_distill_7b) (Chinese)
- [A One-Stop Code-Free Model Fine-Tuning \& Deployment Platform based on SageMaker and LLaMA-Factory](https://aws.amazon.com/cn/blogs/china/a-one-stop-code-free-model-fine-tuning-deployment-platform-based-on-sagemaker-and-llama-factory/) (Chinese)
- [LLaMA Factory Multi-Modal Fine-Tuning Practice: Fine-Tuning Qwen2-VL for Personal Tourist Guide](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_qwen2vl) (Chinese)
- [LLaMA Factory: Fine-tuning Llama3 for Role-Playing](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory) (Chinese)
</details>
## Changelog
[24/06/07] We supported fine-tuning the **[Qwen-2](https://qwenlm.github.io/blog/qwen2/)** series models.
[25/07/02] We supported fine-tuning the **[GLM-4.1V-9B-Thinking](https://github.com/THUDM/GLM-4.1V-Thinking)** model. Please install transformers from the **main** branch to use it.
[24/06/05] We supported fine-tuning the **[GLM-4-9B/GLM-4-9B-Chat](https://github.com/THUDM/GLM-4)** models.
[25/04/28] We supported fine-tuning the **[Qwen3](https://qwenlm.github.io/blog/qwen3/)** model family.
[24/05/26] We supported **[SimPO](https://arxiv.org/abs/2405.14734)** algorithm for preference learning. See [examples](examples/README.md) for usage.
[25/04/21] We supported the **[Muon](https://github.com/KellerJordan/Muon)** optimizer. See [examples](examples/README.md) for usage. Thanks to [@tianshijing](https://github.com/tianshijing)'s PR.
[25/04/16] We supported fine-tuning the **[InternVL3](https://huggingface.co/OpenGVLab/InternVL3-8B)** model. See [PR #7258](https://github.com/hiyouga/LLaMA-Factory/pull/7258) to get started.
[25/04/14] We supported fine-tuning the **[GLM-Z1](https://huggingface.co/THUDM/GLM-Z1-9B-0414)** and **[Kimi-VL](https://huggingface.co/moonshotai/Kimi-VL-A3B-Instruct)** models.
[25/04/06] We supported fine-tuning the **[Llama 4](https://ai.meta.com/blog/llama-4-multimodal-intelligence/)** model. See [PR #7611](https://github.com/hiyouga/LLaMA-Factory/pull/7611) to get started.
<details><summary>Full Changelog</summary>
[24/05/20] We supported fine-tuning the **PaliGemma** series models. Note that the PaliGemma models are pre-trained models, so you need to fine-tune them with the `gemma` template for chat completion.
[25/03/31] We supported fine-tuning the **[Qwen2.5 Omni](https://qwenlm.github.io/blog/qwen2.5-omni/)** model. See [PR #7537](https://github.com/hiyouga/LLaMA-Factory/pull/7537) to get started.
[25/03/15] We supported **[SGLang](https://github.com/sgl-project/sglang)** as inference backend. Try `infer_backend: sglang` to accelerate inference.
[25/03/12] We supported fine-tuning the **[Gemma 3](https://huggingface.co/blog/gemma3)** model.
[25/02/24] Announcing **[EasyR1](https://github.com/hiyouga/EasyR1)**, an efficient, scalable and multi-modality RL training framework for GRPO training.
[25/02/11] We supported saving the **[Ollama](https://github.com/ollama/ollama)** modelfile when exporting the model checkpoints. See [examples](examples/README.md) for usage.
[25/02/05] We supported fine-tuning the **[Qwen2-Audio](https://huggingface.co/Qwen/Qwen2-Audio-7B-Instruct)** and **[MiniCPM-o-2.6](https://huggingface.co/openbmb/MiniCPM-o-2_6)** models on audio understanding tasks.
[25/01/31] We supported fine-tuning the **[DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1)** and **[Qwen2.5-VL](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct)** models.
[25/01/15] We supported the **[APOLLO](https://arxiv.org/abs/2412.05270)** optimizer. See [examples](examples/README.md) for usage.
[25/01/14] We supported fine-tuning the **[MiniCPM-o-2.6](https://huggingface.co/openbmb/MiniCPM-o-2_6)** and **[MiniCPM-V-2.6](https://huggingface.co/openbmb/MiniCPM-V-2_6)** models. Thanks to [@BUAADreamer](https://github.com/BUAADreamer)'s PR.
[25/01/14] We supported fine-tuning the **[InternLM 3](https://huggingface.co/collections/internlm/)** models. Thanks to [@hhaAndroid](https://github.com/hhaAndroid)'s PR.
[25/01/10] We supported fine-tuning the **[Phi-4](https://huggingface.co/microsoft/phi-4)** model.
[24/12/21] We supported using **[SwanLab](https://github.com/SwanHubX/SwanLab)** for experiment tracking and visualization. See [this section](#use-swanlab-logger) for details.
[24/11/27] We supported fine-tuning the **[Skywork-o1](https://huggingface.co/Skywork/Skywork-o1-Open-Llama-3.1-8B)** model and the **[OpenO1](https://huggingface.co/datasets/O1-OPEN/OpenO1-SFT)** dataset.
[24/10/09] We supported downloading pre-trained models and datasets from the **[Modelers Hub](https://modelers.cn/models)**. See [this tutorial](#download-from-modelers-hub) for usage.
[24/09/19] We supported fine-tuning the **[Qwen2.5](https://qwenlm.github.io/blog/qwen2.5/)** models.
[24/08/30] We supported fine-tuning the **[Qwen2-VL](https://qwenlm.github.io/blog/qwen2-vl/)** models. Thanks to [@simonJJJ](https://github.com/simonJJJ)'s PR.
[24/08/27] We supported **[Liger Kernel](https://github.com/linkedin/Liger-Kernel)**. Try `enable_liger_kernel: true` for efficient training.
[24/08/09] We supported the **[Adam-mini](https://github.com/zyushun/Adam-mini)** optimizer. See [examples](examples/README.md) for usage. Thanks to [@relic-yuexi](https://github.com/relic-yuexi)'s PR.
[24/07/04] We supported [contamination-free packed training](https://github.com/MeetKai/functionary/tree/main/functionary/train/packing). Use `neat_packing: true` to activate it. Thanks to [@chuan298](https://github.com/chuan298)'s PR.
[24/06/16] We supported **[PiSSA](https://arxiv.org/abs/2404.02948)** algorithm. See [examples](examples/README.md) for usage.
[24/06/07] We supported fine-tuning the **[Qwen2](https://qwenlm.github.io/blog/qwen2/)** and **[GLM-4](https://github.com/THUDM/GLM-4)** models.
[24/05/26] We supported **[SimPO](https://arxiv.org/abs/2405.14734)** algorithm for preference learning. See [examples](examples/README.md) for usage.
[24/05/20] We supported fine-tuning the **PaliGemma** series models. Note that the PaliGemma models are pre-trained models, so you need to fine-tune them with the `paligemma` template for chat completion.
[24/05/18] We supported **[KTO](https://arxiv.org/abs/2402.01306)** algorithm for preference learning. See [examples](examples/README.md) for usage.
@@ -91,7 +193,7 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/
[24/04/21] We supported **[Mixture-of-Depths](https://arxiv.org/abs/2404.02258)** according to [AstraMindAI's implementation](https://github.com/astramind-ai/Mixture-of-depths). See [examples](examples/README.md) for usage.
[24/04/16] We supported **[BAdam](https://arxiv.org/abs/2404.02827)**. See [examples](examples/README.md) for usage.
[24/04/16] We supported **[BAdam](https://arxiv.org/abs/2404.02827)** optimizer. See [examples](examples/README.md) for usage.
[24/04/16] We supported **[unsloth](https://github.com/unslothai/unsloth)**'s long-sequence training (Llama-2-7B-56k within 24GB). It achieves **117%** speed and **50%** memory compared with FlashAttention-2, more benchmarks can be found in [this page](https://github.com/hiyouga/LLaMA-Factory/wiki/Performance-comparison).
@@ -103,7 +205,7 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/
[24/03/13] We supported **[LoRA+](https://arxiv.org/abs/2402.12354)**. See [examples](examples/README.md) for usage.
[24/03/07] We supported gradient low-rank projection (**[GaLore](https://arxiv.org/abs/2403.03507)**) algorithm. See [examples](examples/README.md) for usage.
[24/03/07] We supported **[GaLore](https://arxiv.org/abs/2403.03507)** optimizer. See [examples](examples/README.md) for usage.
[24/03/07] We integrated **[vLLM](https://github.com/vllm-project/vllm)** for faster and concurrent inference. Try `infer_backend: vllm` to enjoy **270%** inference speed.
@@ -119,7 +221,7 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/
[23/12/12] We supported fine-tuning the latest MoE model **[Mixtral 8x7B](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1)** in our framework. See hardware requirement [here](#hardware-requirement).
[23/12/01] We supported downloading pre-trained models and datasets from the **[ModelScope Hub](https://modelscope.cn/models)** for Chinese mainland users. See [this tutorial](#download-from-modelscope-hub) for usage.
[23/12/01] We supported downloading pre-trained models and datasets from the **[ModelScope Hub](https://modelscope.cn/models)**. See [this tutorial](#download-from-modelscope-hub) for usage.
[23/10/21] We supported **[NEFTune](https://arxiv.org/abs/2310.05914)** trick for fine-tuning. Try `neftune_noise_alpha: 5` argument to activate NEFTune.
@@ -149,42 +251,78 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/
</details>
> [!TIP]
> If you cannot use the latest feature, please pull the latest code and install LLaMA-Factory again.
## Supported Models
| Model | Model size | Template |
| -------------------------------------------------------- | -------------------------------- | --------- |
| [Baichuan2](https://huggingface.co/baichuan-inc) | 7B/13B | baichuan2 |
| [BLOOM](https://huggingface.co/bigscience) | 560M/1.1B/1.7B/3B/7.1B/176B | - |
| [BLOOMZ](https://huggingface.co/bigscience) | 560M/1.1B/1.7B/3B/7.1B/176B | - |
| [ChatGLM3](https://huggingface.co/THUDM) | 6B | chatglm3 |
| [Command-R](https://huggingface.co/CohereForAI) | 35B/104B | cohere |
| [DeepSeek (MoE)](https://huggingface.co/deepseek-ai) | 7B/16B/67B/236B | deepseek |
| [Falcon](https://huggingface.co/tiiuae) | 7B/11B/40B/180B | falcon |
| [Gemma/CodeGemma](https://huggingface.co/google) | 2B/7B | gemma |
| [GLM4](https://huggingface.co/THUDM) | 9B | glm4 |
| [InternLM2](https://huggingface.co/internlm) | 7B/20B | intern2 |
| [LLaMA](https://github.com/facebookresearch/llama) | 7B/13B/33B/65B | - |
| [LLaMA-2](https://huggingface.co/meta-llama) | 7B/13B/70B | llama2 |
| [LLaMA-3](https://huggingface.co/meta-llama) | 8B/70B | llama3 |
| [LLaVA-1.5](https://huggingface.co/llava-hf) | 7B/13B | vicuna |
| [Mistral/Mixtral](https://huggingface.co/mistralai) | 7B/8x7B/8x22B | mistral |
| [OLMo](https://huggingface.co/allenai) | 1B/7B | - |
| [PaliGemma](https://huggingface.co/google) | 3B | gemma |
| [Phi-1.5/2](https://huggingface.co/microsoft) | 1.3B/2.7B | - |
| [Phi-3](https://huggingface.co/microsoft) | 4B/7B/14B | phi |
| [Qwen](https://huggingface.co/Qwen) | 1.8B/7B/14B/72B | qwen |
| [Qwen1.5 (Code/MoE)](https://huggingface.co/Qwen) | 0.5B/1.8B/4B/7B/14B/32B/72B/110B | qwen |
| [Qwen2 (MoE)](https://huggingface.co/Qwen) | 0.5B/1.5B/7B/57B/72B | qwen |
| [StarCoder2](https://huggingface.co/bigcode) | 3B/7B/15B | - |
| [XVERSE](https://huggingface.co/xverse) | 7B/13B/65B | xverse |
| [Yi (1/1.5)](https://huggingface.co/01-ai) | 6B/9B/34B | yi |
| [Yi-VL](https://huggingface.co/01-ai) | 6B/34B | yi_vl |
| [Yuan](https://huggingface.co/IEITYuan) | 2B/51B/102B | yuan |
| Model | Model size | Template |
| ----------------------------------------------------------------- | -------------------------------- | ------------------- |
| [Baichuan 2](https://huggingface.co/baichuan-inc) | 7B/13B | baichuan2 |
| [BLOOM/BLOOMZ](https://huggingface.co/bigscience) | 560M/1.1B/1.7B/3B/7.1B/176B | - |
| [ChatGLM3](https://huggingface.co/THUDM) | 6B | chatglm3 |
| [Command R](https://huggingface.co/CohereForAI) | 35B/104B | cohere |
| [DeepSeek (Code/MoE)](https://huggingface.co/deepseek-ai) | 7B/16B/67B/236B | deepseek |
| [DeepSeek 2.5/3](https://huggingface.co/deepseek-ai) | 236B/671B | deepseek3 |
| [DeepSeek R1 (Distill)](https://huggingface.co/deepseek-ai) | 1.5B/7B/8B/14B/32B/70B/671B | deepseekr1 |
| [Falcon](https://huggingface.co/tiiuae) | 7B/11B/40B/180B | falcon |
| [Falcon-H1](https://huggingface.co/tiiuae) | 0.5B/1.5B/3B/7B/34B | falcon_h1 |
| [Gemma/Gemma 2/CodeGemma](https://huggingface.co/google) | 2B/7B/9B/27B | gemma/gemma2 |
| [Gemma 3/Gemma 3n](https://huggingface.co/google) | 1B/4B/6B/8B/12B/27B | gemma3/gemma3n |
| [GLM-4/GLM-4-0414/GLM-Z1](https://huggingface.co/zai-org) | 9B/32B | glm4/glmz1 |
| [GLM-4.1V](https://huggingface.co/zai-org)* | 9B | glm4v |
| [GLM-4.5](https://huggingface.co/zai-org)* | 106B/355B | glm4_moe |
| [GPT-2](https://huggingface.co/openai-community) | 0.1B/0.4B/0.8B/1.5B | - |
| [Granite 3.0-3.3](https://huggingface.co/ibm-granite) | 1B/2B/3B/8B | granite3 |
| [Granite 4](https://huggingface.co/ibm-granite) | 7B | granite4 |
| [Hunyuan](https://huggingface.co/tencent/) | 7B | hunyuan |
| [Index](https://huggingface.co/IndexTeam) | 1.9B | index |
| [InternLM 2-3](https://huggingface.co/internlm) | 7B/8B/20B | intern2 |
| [InternVL 2.5-3](https://huggingface.co/OpenGVLab) | 1B/2B/8B/14B/38B/78B | intern_vl |
| [Kimi-VL](https://huggingface.co/moonshotai) | 16B | kimi_vl |
| [Llama](https://github.com/facebookresearch/llama) | 7B/13B/33B/65B | - |
| [Llama 2](https://huggingface.co/meta-llama) | 7B/13B/70B | llama2 |
| [Llama 3-3.3](https://huggingface.co/meta-llama) | 1B/3B/8B/70B | llama3 |
| [Llama 4](https://huggingface.co/meta-llama) | 109B/402B | llama4 |
| [Llama 3.2 Vision](https://huggingface.co/meta-llama) | 11B/90B | mllama |
| [LLaVA-1.5](https://huggingface.co/llava-hf) | 7B/13B | llava |
| [LLaVA-NeXT](https://huggingface.co/llava-hf) | 7B/8B/13B/34B/72B/110B | llava_next |
| [LLaVA-NeXT-Video](https://huggingface.co/llava-hf) | 7B/34B | llava_next_video |
| [MiMo](https://huggingface.co/XiaomiMiMo) | 7B | mimo |
| [MiniCPM](https://huggingface.co/openbmb) | 0.5B/1B/2B/4B/8B | cpm/cpm3/cpm4 |
| [MiniCPM-o-2.6/MiniCPM-V-2.6](https://huggingface.co/openbmb) | 8B | minicpm_o/minicpm_v |
| [Ministral/Mistral-Nemo](https://huggingface.co/mistralai) | 8B/12B | ministral |
| [Mistral/Mixtral](https://huggingface.co/mistralai) | 7B/8x7B/8x22B | mistral |
| [Mistral Small](https://huggingface.co/mistralai) | 24B | mistral_small |
| [OLMo](https://huggingface.co/allenai) | 1B/7B | - |
| [PaliGemma/PaliGemma2](https://huggingface.co/google) | 3B/10B/28B | paligemma |
| [Phi-1.5/Phi-2](https://huggingface.co/microsoft) | 1.3B/2.7B | - |
| [Phi-3/Phi-3.5](https://huggingface.co/microsoft) | 4B/14B | phi |
| [Phi-3-small](https://huggingface.co/microsoft) | 7B | phi_small |
| [Phi-4](https://huggingface.co/microsoft) | 14B | phi4 |
| [Pixtral](https://huggingface.co/mistralai) | 12B | pixtral |
| [Qwen (1-2.5) (Code/Math/MoE/QwQ)](https://huggingface.co/Qwen) | 0.5B/1.5B/3B/7B/14B/32B/72B/110B | qwen |
| [Qwen3 (MoE)](https://huggingface.co/Qwen) | 0.6B/1.7B/4B/8B/14B/32B/235B | qwen3 |
| [Qwen2-Audio](https://huggingface.co/Qwen) | 7B | qwen2_audio |
| [Qwen2.5-Omni](https://huggingface.co/Qwen) | 3B/7B | qwen2_omni |
| [Qwen2-VL/Qwen2.5-VL/QVQ](https://huggingface.co/Qwen) | 2B/3B/7B/32B/72B | qwen2_vl |
| [Seed Coder](https://huggingface.co/ByteDance-Seed) | 8B | seed_coder |
| [Skywork o1](https://huggingface.co/Skywork) | 8B | skywork_o1 |
| [StarCoder 2](https://huggingface.co/bigcode) | 3B/7B/15B | - |
| [TeleChat2](https://huggingface.co/Tele-AI) | 3B/7B/35B/115B | telechat2 |
| [XVERSE](https://huggingface.co/xverse) | 7B/13B/65B | xverse |
| [Yi/Yi-1.5 (Code)](https://huggingface.co/01-ai) | 1.5B/6B/9B/34B | yi |
| [Yi-VL](https://huggingface.co/01-ai) | 6B/34B | yi_vl |
| [Yuan 2](https://huggingface.co/IEITYuan) | 2B/51B/102B | yuan |
> [!NOTE]
> For the "base" models, the `template` argument can be chosen from `default`, `alpaca`, `vicuna` etc. But make sure to use the **corresponding template** for the "instruct/chat" models.
>
> Remember to use the **SAME** template in training and inference.
>
> \*: You should install `transformers` from the main branch and set `DISABLE_VERSION_CHECK=1` to skip the version check.
>
> \*\*: You need to install a specific version of `transformers` to use the corresponding model.
Please refer to [constants.py](src/llamafactory/extras/constants.py) for a full list of the models we support.
You can also add a custom chat template to [template.py](src/llamafactory/data/template.py).
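To make the template pairing concrete, a minimal LoRA SFT config for an instruct model could look like this (an illustrative fragment; the key names follow the yaml files under `examples/`):

```yaml
### model: pair the instruct model with its matching chat template
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
template: llama3

### method
stage: sft
finetuning_type: lora
```

Using the `default` or `alpaca` template with an instruct model (or mismatching templates between training and inference) usually degrades outputs silently.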
| ORPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| SimPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
> [!TIP]
> The implementation details of PPO can be found in [this blog](https://newfacade.github.io/notes-on-reinforcement-learning/17-ppo-trl.html).
## Provided Datasets
<details><summary>Pre-training datasets</summary>
- [Cosmopedia (en)](https://huggingface.co/datasets/HuggingFaceTB/cosmopedia)
- [STEM (zh)](https://huggingface.co/datasets/hfl/stem_zh_instruction)
- [Ruozhiba (zh)](https://huggingface.co/datasets/hfl/ruozhiba_gpt4_turbo)
- [Neo-sft (zh)](https://huggingface.co/datasets/m-a-p/neo_sft_phase2)
- [Magpie-Pro-300K-Filtered (en)](https://huggingface.co/datasets/Magpie-Align/Magpie-Pro-300K-Filtered)
- [Magpie-ultra-v0.1 (en)](https://huggingface.co/datasets/argilla/magpie-ultra-v0.1)
- [WebInstructSub (en)](https://huggingface.co/datasets/TIGER-Lab/WebInstructSub)
- [OpenO1-SFT (en&zh)](https://huggingface.co/datasets/O1-OPEN/OpenO1-SFT)
- [Open-Thoughts (en)](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k)
- [Open-R1-Math (en)](https://huggingface.co/datasets/open-r1/OpenR1-Math-220k)
- [Chinese-DeepSeek-R1-Distill (zh)](https://huggingface.co/datasets/Congliu/Chinese-DeepSeek-R1-Distill-data-110k-SFT)
- [LLaVA mixed (en&zh)](https://huggingface.co/datasets/BUAADreamer/llava-en-zh-300k)
- [Pokemon-gpt4o-captions (en&zh)](https://huggingface.co/datasets/jugg1024/pokemon-gpt4o-captions)
- [Open Assistant (de)](https://huggingface.co/datasets/mayflowergmbh/oasst_de)
- [Dolly 15k (de)](https://huggingface.co/datasets/mayflowergmbh/dolly-15k_de)
- [Alpaca GPT4 (de)](https://huggingface.co/datasets/mayflowergmbh/alpaca-gpt4_de)
- [DPO mixed (en&zh)](https://huggingface.co/datasets/hiyouga/DPO-En-Zh-20k)
- [UltraFeedback (en)](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized)
- [COIG-P (zh)](https://huggingface.co/datasets/m-a-p/COIG-P)
- [RLHF-V (en)](https://huggingface.co/datasets/openbmb/RLHF-V-Dataset)
- [VLFeedback (en)](https://huggingface.co/datasets/Zhihui/VLFeedback)
- [RLAIF-V (en)](https://huggingface.co/datasets/openbmb/RLAIF-V-Dataset)
- [Orca DPO Pairs (en)](https://huggingface.co/datasets/Intel/orca_dpo_pairs)
- [HH-RLHF (en)](https://huggingface.co/datasets/Anthropic/hh-rlhf)
- [Nectar (en)](https://huggingface.co/datasets/berkeley-nest/Nectar)
| Mandatory | Minimum | Recommend |
| ------------ | ------- | --------- |
| python | 3.9 | 3.10 |
| torch | 2.0.0 | 2.6.0 |
| torchvision | 0.15.0 | 0.21.0 |
| transformers | 4.49.0 | 4.50.0 |
| datasets | 2.16.0 | 3.2.0 |
| accelerate | 0.34.0 | 1.2.1 |
| peft | 0.14.0 | 0.15.1 |
| trl | 0.8.6 | 0.9.6 |
| Optional | Minimum | Recommend |
| ------------ | ------- | --------- |
| CUDA | 11.6 | 12.2 |
| deepspeed | 0.10.0 | 0.16.4 |
| bitsandbytes | 0.39.0 | 0.43.1 |
| vllm | 0.4.3 | 0.8.2 |
| flash-attn | 2.5.6 | 2.7.2 |
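To compare an existing environment against these tables, a small stdlib-only check can be handy (a sketch; the names are the PyPI distribution names):

```python
import importlib.metadata as md

def check_versions(packages=("torch", "transformers", "datasets", "accelerate", "peft", "trl")):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = md.version(pkg)
        except md.PackageNotFoundError:
            versions[pkg] = None
    return versions

print(check_versions())
```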
### Hardware Requirement
\* *estimated*
| Method | Bits | 7B | 14B | 30B | 70B | `x`B |
| ------------------------------- | ---- | ----- | ----- | ----- | ------ | ------- |
| Full (`bf16` or `fp16`) | 32 | 120GB | 240GB | 600GB | 1200GB | `18x`GB |
| Full (`pure_bf16`) | 16 | 60GB | 120GB | 300GB | 600GB | `8x`GB |
| Freeze/LoRA/GaLore/APOLLO/BAdam | 16 | 16GB | 32GB | 64GB | 160GB | `2x`GB |
| QLoRA | 8 | 10GB | 20GB | 40GB | 80GB | `x`GB |
| QLoRA | 4 | 6GB | 12GB | 24GB | 48GB | `x/2`GB |
| QLoRA | 2 | 4GB | 8GB | 16GB | 24GB | `x/4`GB |
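The `x`B column encodes a rule of thumb that can be written down directly (a rough sketch only; the fixed-size columns above deviate from it slightly, and real usage also depends on sequence length, batch size, and optimizer states):

```python
def estimate_gpu_memory_gb(params_b: float, method: str) -> float:
    """Rough GPU memory estimate in GB for an x-billion-parameter model,
    following the scaling rules in the table above (illustrative only)."""
    factors = {
        "full_amp": 18.0,     # Full (bf16/fp16 AMP): ~18x GB
        "full_bf16": 8.0,     # Full (pure_bf16): ~8x GB
        "lora": 2.0,          # Freeze/LoRA/GaLore/APOLLO/BAdam: ~2x GB
        "qlora_8bit": 1.0,    # QLoRA 8-bit: ~x GB
        "qlora_4bit": 0.5,    # QLoRA 4-bit: ~x/2 GB
        "qlora_2bit": 0.25,   # QLoRA 2-bit: ~x/4 GB
    }
    return params_b * factors[method]

print(estimate_gpu_memory_gb(70, "lora"))  # ~140 GB for LoRA on a 70B model
```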
## Getting Started
### Installation
> [!IMPORTANT]
> Installation is mandatory.
#### Install from Source
```bash
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]" --no-build-isolation
```
Extra dependencies available: torch, torch-npu, metrics, deepspeed, liger-kernel, bitsandbytes, hqq, eetq, gptq, aqlm, vllm, sglang, galore, apollo, badam, adam-mini, qwen, minicpm_v, openmind, swanlab, dev
> [!TIP]
> Use `pip install --no-deps -e .` to resolve package conflicts.
#### Install from Docker Image
```bash
docker run -it --rm --gpus=all --ipc=host hiyouga/llamafactory:latest
```
This image is built on Ubuntu 22.04 (x86\_64), CUDA 12.4, Python 3.11, PyTorch 2.6.0, and Flash-attn 2.7.4.
Find the pre-built images: https://hub.docker.com/r/hiyouga/llamafactory/tags
Please refer to [build docker](#build-docker) to build the image yourself.
<details><summary>Setting up a virtual environment with <b>uv</b></summary>
Create an isolated Python environment with [uv](https://github.com/astral-sh/uv):
```bash
uv sync --extra torch --extra metrics --prerelease=allow
```
Run LLaMA-Factory in the isolated environment:
```bash
uv run --prerelease=allow llamafactory-cli train examples/train_lora/llama3_lora_pretrain.yaml
```
</details>
<details><summary>For Windows users</summary>
#### Install PyTorch
You need to manually install the GPU version of PyTorch on the Windows platform. Please refer to the [official website](https://pytorch.org/get-started/locally/) and the following command to install PyTorch with CUDA support:
```bash
pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
python -c "import torch; print(torch.cuda.is_available())"
```
If you see `True` then you have successfully installed PyTorch with CUDA support.
Try `dataloader_num_workers: 0` if you encounter a `Can't pickle local object` error.
#### Install BitsAndBytes
If you want to enable the quantized LoRA (QLoRA) on the Windows platform, you need to install a pre-built version of `bitsandbytes` library, which supports CUDA 11.1 to 12.2, please select the appropriate [release version](https://github.com/jllllll/bitsandbytes-windows-webui/releases/tag/wheels) based on your CUDA version.
```bash
pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl
```
#### Install Flash Attention-2
To enable FlashAttention-2 on the Windows platform, please use the script from [flash-attention-windows-wheel](https://huggingface.co/lldacing/flash-attention-windows-wheel) to compile and install it by yourself.
</details>
<details><summary>For Ascend NPU users</summary>
Join [NPU user group](assets/wechat_npu.jpg).
To install LLaMA Factory on Ascend NPU devices, please upgrade Python to version 3.10 or higher and specify extra dependencies: `pip install -e ".[torch-npu,metrics]"`. Additionally, you need to install the **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**. Please follow the [installation tutorial](https://www.hiascend.com/document/detail/en/CANNCommunityEdition/600alphaX/softwareinstall/instg/atlasdeploy_03_0031.html) or use the following commands:
```bash
# replace the url according to your CANN version and devices
# install CANN Toolkit
wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/Milan-ASL/Milan-ASL%20V100R001C20SPC702/Ascend-cann-toolkit_8.0.0.alpha002_linux-"$(uname -i)".run
bash Ascend-cann-toolkit_8.0.0.alpha002_linux-"$(uname -i)".run --install
# install CANN Kernels
wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/Milan-ASL/Milan-ASL%20V100R001C20SPC702/Ascend-cann-kernels-910b_8.0.0.alpha002_linux-"$(uname -i)".run
bash Ascend-cann-kernels-910b_8.0.0.alpha002_linux-"$(uname -i)".run --install
# set env variables
source /usr/local/Ascend/ascend-toolkit/set_env.sh
```
| Requirement | Minimum | Recommend |
| ------------ | ------- | -------------- |
| CANN | 8.0.RC1 | 8.0.0.alpha002 |
| torch | 2.1.0 | 2.4.0 |
| torch-npu | 2.1.0 | 2.4.0.post2 |
| deepspeed | 0.13.2 | 0.13.2 |
| vllm-ascend | - | 0.7.3 |
Remember to use `ASCEND_RT_VISIBLE_DEVICES` instead of `CUDA_VISIBLE_DEVICES` to specify the device to use.
If you cannot run inference on NPU devices, try setting `do_sample: false` in the configurations.
Download the pre-built Docker images: [32GB](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html) | [64GB](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html)
#### Install BitsAndBytes
To use QLoRA based on bitsandbytes on Ascend NPU, please follow these 3 steps:
1. Manually compile bitsandbytes: Refer to [the installation documentation](https://huggingface.co/docs/bitsandbytes/installation?backend=Ascend+NPU&platform=Ascend+NPU) for the NPU version of bitsandbytes to complete the compilation and installation. The compilation requires a cmake version of at least 3.22.1 and a g++ version of at least 12.x.
```bash
# Install bitsandbytes from source
# Clone the bitsandbytes repo; the Ascend NPU backend currently lives on the multi-backend-refactor branch
git clone -b multi-backend-refactor https://github.com/bitsandbytes-foundation/bitsandbytes.git
cd bitsandbytes/
# Install dependencies
pip install -r requirements-dev.txt
# Install the compilation toolchain (the command varies by OS; this one is for Debian/Ubuntu)
apt-get install -y build-essential cmake
# Compile & install
cmake -DCOMPUTE_BACKEND=npu -S .
make
pip install .
```
2. Install transformers from the main branch.
```bash
git clone -b main https://github.com/huggingface/transformers.git
cd transformers
pip install .
```
3. Set `double_quantization: false` in the configuration. You can refer to the [example](examples/train_qlora/llama3_lora_sft_bnb_npu.yaml).
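Putting step 3 together with 4-bit quantization, the relevant yaml fragment would look like the following (illustrative; see the linked example for a complete file):

```yaml
quantization_bit: 4
double_quantization: false
```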
</details>
### Data Preparation
Please refer to [data/README.md](data/README.md) for details on the dataset file format. You can use datasets on the HuggingFace / ModelScope / Modelers hub, load datasets from local disk, or specify a path to s3/gcs cloud storage.
> [!NOTE]
> Please update `data/dataset_info.json` to use your custom dataset.
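For reference, a `dataset_info.json` entry for a local alpaca-format file might look like this (an illustrative sketch; `my_dataset.json` is a hypothetical file, and the exact fields are documented in [data/README.md](data/README.md)):

```json
{
  "my_dataset": {
    "file_name": "my_dataset.json",
    "formatting": "alpaca",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
}
```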
You can also use **[Easy Dataset](https://github.com/ConardLi/easy-dataset)**, **[DataFlow](https://github.com/OpenDCAI/DataFlow)** and **[GraphGen](https://github.com/open-sciencelab/GraphGen)** to create synthetic data for fine-tuning.
### Quickstart
Use the following 3 commands to run LoRA **fine-tuning**, **inference** and **merging** of the Llama3-8B-Instruct model, respectively.
```bash
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
```
See [examples/README.md](examples/README.md) for advanced usage (including distributed training).
> [!TIP]
> Use `llamafactory-cli help` to show help information.
>
> Read [FAQs](https://github.com/hiyouga/LLaMA-Factory/issues/4614) first if you encounter any problems.
### Fine-Tuning with LLaMA Board GUI (powered by [Gradio](https://github.com/gradio-app/gradio))
```bash
llamafactory-cli webui
```
### Build Docker
For CUDA users:
```bash
cd docker/docker-cuda/
docker compose up -d
docker compose exec llamafactory bash
```
For Ascend NPU users:
```bash
cd docker/docker-npu/
docker compose up -d
docker compose exec llamafactory bash
```
For AMD ROCm users:
```bash
cd docker/docker-rocm/
docker compose up -d
docker compose exec llamafactory bash
```
<details><summary>Build without Docker Compose</summary>
For CUDA users:
```bash
docker build -f ./docker/docker-cuda/Dockerfile \
--build-arg PIP_INDEX=https://pypi.org/simple \
--build-arg EXTRAS=metrics \
-t llamafactory:latest .
docker run -dit --ipc=host --gpus=all \
-p 7860:7860 \
-p 8000:8000 \
--name llamafactory \
llamafactory:latest
docker exec -it llamafactory bash
```
For Ascend NPU users:
```bash
docker build -f ./docker/docker-npu/Dockerfile \
--build-arg PIP_INDEX=https://pypi.org/simple \
--build-arg EXTRAS=torch-npu,metrics \
-t llamafactory:latest .
docker run -dit --ipc=host \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-p 7860:7860 \
-p 8000:8000 \
--device /dev/davinci0 \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
--name llamafactory \
llamafactory:latest
docker exec -it llamafactory bash
```
For AMD ROCm users:
```bash
docker build -f ./docker/docker-rocm/Dockerfile \
--build-arg PIP_INDEX=https://pypi.org/simple \
--build-arg EXTRAS=metrics \
-t llamafactory:latest .
docker run -dit --ipc=host \
-p 7860:7860 \
-p 8000:8000 \
--device /dev/kfd \
--device /dev/dri \
--name llamafactory \
llamafactory:latest
docker exec -it llamafactory bash
```
</details>
<details><summary>Use Docker volumes</summary>
```bash
docker build -f ./Dockerfile -t llama-factory:latest .
docker run --gpus=all \
-v ./hf_cache:/root/.cache/huggingface/ \
-v ./data:/app/data \
-v ./output:/app/output \
-p 7860:7860 \
--shm-size 16G \
--name llama_factory \
-d llama-factory:latest
```
You can uncomment `VOLUME [ "/root/.cache/huggingface", "/app/shared_data", "/app/output" ]` in the Dockerfile to use data volumes.
When running the container, use the `-v ./hf_cache:/root/.cache/huggingface` argument to mount a local directory into the container. The following data volumes are available.
- `hf_cache`: Utilize Hugging Face cache on the host machine.
- `shared_data`: The directory to store datasets on the host machine.
- `output`: Set export dir to this location so that the merged result can be accessed directly on the host machine.
</details>
### Deploy with OpenAI-style API and vLLM
```bash
API_PORT=8000 llamafactory-cli api examples/inference/llama3.yaml infer_backend=vllm vllm_enforce_eager=true
```
> [!TIP]
> Visit [this page](https://platform.openai.com/docs/api-reference/chat/create) for the API documentation.
>
> Examples: [Image understanding](scripts/api_example/test_image.py) | [Function calling](scripts/api_example/test_toolcall.py)
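Any OpenAI-compatible client can then talk to the endpoint. As a dependency-free sketch (the `llama3` model name and the local URL are assumptions matching the config above), the request can be built with the standard library:

```python
import json
import urllib.request

def build_chat_request(base_url: str, messages: list) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request (construction only, no I/O)."""
    payload = {"model": "llama3", "messages": messages}
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("http://localhost:8000", [{"role": "user", "content": "Hello!"}])
print(req.full_url)  # http://localhost:8000/v1/chat/completions
```

Sending the request with `urllib.request.urlopen(req)` (or any OpenAI SDK pointed at the same base URL) returns the usual chat completion JSON.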
### Download from ModelScope Hub
```bash
export USE_MODELSCOPE_HUB=1 # `set USE_MODELSCOPE_HUB=1` for Windows
```
Train the model by specifying a model ID of the ModelScope Hub as the `model_name_or_path`. You can find a full list of model IDs at [ModelScope Hub](https://modelscope.cn/models), e.g., `LLM-Research/Meta-Llama-3-8B-Instruct`.
### Download from Modelers Hub
You can also use Modelers Hub to download models and datasets.
```bash
export USE_OPENMIND_HUB=1 # `set USE_OPENMIND_HUB=1` for Windows
```
Train the model by specifying a model ID of the Modelers Hub as the `model_name_or_path`. You can find a full list of model IDs at [Modelers Hub](https://modelers.cn/models), e.g., `TeleAI/TeleChat-7B-pt`.
### Use W&B Logger
To use [Weights & Biases](https://wandb.ai) for logging experimental results, you need to add the following arguments to yaml files.
```yaml
report_to: wandb
run_name: test_run # optional
```
Set `WANDB_API_KEY` to [your key](https://wandb.ai/authorize) when launching training tasks to log in with your W&B account.
### Use SwanLab Logger
To use [SwanLab](https://github.com/SwanHubX/SwanLab) for logging experimental results, you need to add the following arguments to yaml files.
```yaml
use_swanlab: true
swanlab_run_name: test_run # optional
```
When launching training tasks, you can log in to SwanLab in three ways:
1. Add `swanlab_api_key=<your_api_key>` to the yaml file, setting it to your [API key](https://swanlab.cn/settings).
2. Set the environment variable `SWANLAB_API_KEY` to your [API key](https://swanlab.cn/settings).
3. Use the `swanlab login` command to complete the login.
## Projects using LLaMA Factory
If you have a project that should be incorporated, please contact us via email or create a pull request.
1. Wang et al. UbiPhysio: Support Daily Functioning, Fitness, and Rehabilitation with Action Understanding and Feedback in Natural Language. 2023. [[arxiv]](https://arxiv.org/abs/2308.10526)
1. Luceri et al. Leveraging Large Language Models to Detect Influence Campaigns in Social Media. 2023. [[arxiv]](https://arxiv.org/abs/2311.07816)
1. Zhang et al. Alleviating Hallucinations of Large Language Models through Induced Hallucinations. 2023. [[arxiv]](https://arxiv.org/abs/2312.15710)
1. Wang et al. Know Your Needs Better: Towards Structured Understanding of Marketer Demands with Analogical Reasoning Augmented LLMs. KDD 2024. [[arxiv]](https://arxiv.org/abs/2401.04319)
1. Wang et al. CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning. ACL 2024. [[arxiv]](https://arxiv.org/abs/2401.07286)
1. Choi et al. FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs. 2024. [[arxiv]](https://arxiv.org/abs/2402.05904)
1. Zhang et al. AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts. 2024. [[arxiv]](https://arxiv.org/abs/2402.07625)
1. Lyu et al. KnowTuning: Knowledge-aware Fine-tuning for Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2402.11176)
1. Yang et al. LaCo: Large Language Model Pruning via Layer Collapse. 2024. [[arxiv]](https://arxiv.org/abs/2402.11187)
1. Bhardwaj et al. Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic. 2024. [[arxiv]](https://arxiv.org/abs/2402.11746)
1. Yang et al. Enhancing Empathetic Response Generation by Augmenting LLMs with Small-scale Empathetic Models. 2024. [[arxiv]](https://arxiv.org/abs/2402.11801)
1. Yi et al. Generation Meets Verification: Accelerating Large Language Model Inference with Smart Parallel Auto-Correct Decoding. ACL 2024 Findings. [[arxiv]](https://arxiv.org/abs/2402.11809)
1. Cao et al. Head-wise Shareable Attention for Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2402.11819)
1. Zhang et al. Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages. 2024. [[arxiv]](https://arxiv.org/abs/2402.12204)
1. Kim et al. Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2402.14714)
1. Yu et al. KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models. ACL 2024. [[arxiv]](https://arxiv.org/abs/2402.15043)
1. Huang et al. Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning. 2024. [[arxiv]](https://arxiv.org/abs/2403.02333)
1. Duan et al. Negating Negatives: Alignment without Human Positive Samples via Distributional Dispreference Optimization. 2024. [[arxiv]](https://arxiv.org/abs/2403.03419)
1. Xie and Schwertfeger. Empowering Robotics with Large Language Models: osmAG Map Comprehension with LLMs. 2024. [[arxiv]](https://arxiv.org/abs/2403.08228)
1. Wu et al. Large Language Models are Parallel Multilingual Learners. 2024. [[arxiv]](https://arxiv.org/abs/2403.09073)
1. Zhang et al. EDT: Improving Large Language Models' Generation by Entropy-based Dynamic Temperature Sampling. 2024. [[arxiv]](https://arxiv.org/abs/2403.14541)
1. Weller et al. FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions. 2024. [[arxiv]](https://arxiv.org/abs/2403.15246)
1. Hongbin Na. CBT-LLM: A Chinese Large Language Model for Cognitive Behavioral Therapy-based Mental Health Question Answering. COLING 2024. [[arxiv]](https://arxiv.org/abs/2403.16008)
1. Zan et al. CodeS: Natural Language to Code Repository via Multi-Layer Sketch. 2024. [[arxiv]](https://arxiv.org/abs/2403.16443)
1. Liu et al. Extensive Self-Contrast Enables Feedback-Free Language Model Alignment. 2024. [[arxiv]](https://arxiv.org/abs/2404.00604)
1. Luo et al. BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2404.02827)
1. Du et al. Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model. 2024. [[arxiv]](https://arxiv.org/abs/2404.04167)
1. Ma et al. Parameter Efficient Quasi-Orthogonal Fine-Tuning via Givens Rotation. ICML 2024. [[arxiv]](https://arxiv.org/abs/2404.04316)
1. Liu et al. Dynamic Generation of Personalities with Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2404.07084)
1. Shang et al. How Far Have We Gone in Stripped Binary Code Understanding Using Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2404.09836)
1. Huang et al. LLMTune: Accelerate Database Knob Tuning with Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2404.11581)
1. Deng et al. Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extraction. 2024. [[arxiv]](https://arxiv.org/abs/2404.14215)
1. Acikgoz et al. Hippocrates: An Open-Source Framework for Advancing Large Language Models in Healthcare. 2024. [[arxiv]](https://arxiv.org/abs/2404.16621)
1. Zhang et al. Small Language Models Need Strong Verifiers to Self-Correct Reasoning. ACL 2024 Findings. [[arxiv]](https://arxiv.org/abs/2404.17140)
1. Zhou et al. FREB-TQA: A Fine-Grained Robustness Evaluation Benchmark for Table Question Answering. NAACL 2024. [[arxiv]](https://arxiv.org/abs/2404.18585)
1. Xu et al. Large Language Models for Cyber Security: A Systematic Literature Review. 2024. [[arxiv]](https://arxiv.org/abs/2405.04760)
1. Dammu et al. "They are uncultured": Unveiling Covert Harms and Social Threats in LLM Generated Conversations. 2024. [[arxiv]](https://arxiv.org/abs/2405.05378)
1. Yi et al. A safety realignment framework via subspace-oriented model fusion for large language models. 2024. [[arxiv]](https://arxiv.org/abs/2405.09055)
1. Lou et al. SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling. 2024. [[arxiv]](https://arxiv.org/abs/2405.12739)
1. Zhang et al. Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners. 2024. [[arxiv]](https://arxiv.org/abs/2405.13816)
1. Zhang et al. TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2405.20215)
1. Zihong Chen. Sentence Segmentation and Sentence Punctuation Based on XunziALLM. 2024. [[paper]](https://aclanthology.org/2024.lt4hala-1.30)
1. Gao et al. The Best of Both Worlds: Toward an Honest and Helpful Large Language Model. 2024. [[arxiv]](https://arxiv.org/abs/2406.00380)
1. Wang and Song. MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset. 2024. [[arxiv]](https://arxiv.org/abs/2406.02106)
1. Hu et al. Computational Limits of Low-Rank Adaptation (LoRA) for Transformer-Based Models. 2024. [[arxiv]](https://arxiv.org/abs/2406.03136)
1. Ge et al. Time Sensitive Knowledge Editing through Efficient Finetuning. ACL 2024. [[arxiv]](https://arxiv.org/abs/2406.04496)
1. Tan et al. Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions. 2024. [[arxiv]](https://arxiv.org/abs/2406.05688)
1. Song et al. Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters. 2024. [[arxiv]](https://arxiv.org/abs/2406.05955)
1. Gu et al. RWKV-CLIP: A Robust Vision-Language Representation Learner. 2024. [[arxiv]](https://arxiv.org/abs/2406.06973)
1. Chen et al. Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees. 2024. [[arxiv]](https://arxiv.org/abs/2406.07115)
1. Zhu et al. Are Large Language Models Good Statisticians?. 2024. [[arxiv]](https://arxiv.org/abs/2406.07815)
1. Li et al. Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning. 2024. [[arxiv]](https://arxiv.org/abs/2406.10099)
1. Ding et al. IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce. 2024. [[arxiv]](https://arxiv.org/abs/2406.10173)
1. He et al. COMMUNITY-CROSS-INSTRUCT: Unsupervised Instruction Generation for Aligning Large Language Models to Online Communities. 2024. [[arxiv]](https://arxiv.org/abs/2406.12074)
1. Lin et al. FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving. 2024. [[arxiv]](https://arxiv.org/abs/2406.14408)
1. Treutlein et al. Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data. 2024. [[arxiv]](https://arxiv.org/abs/2406.14546)
1. Feng et al. SS-Bench: A Benchmark for Social Story Generation and Evaluation. 2024. [[arxiv]](https://arxiv.org/abs/2406.15695)
1. Feng et al. Self-Constructed Context Decompilation with Fined-grained Alignment Enhancement. 2024. [[arxiv]](https://arxiv.org/abs/2406.17233)
1. Liu et al. Large Language Models for Cuffless Blood Pressure Measurement From Wearable Biosignals. 2024. [[arxiv]](https://arxiv.org/abs/2406.18069)
1. Iyer et al. Exploring Very Low-Resource Translation with LLMs: The University of Edinburgh's Submission to AmericasNLP 2024 Translation Task. AmericasNLP 2024. [[paper]](https://aclanthology.org/2024.americasnlp-1.25)
1. Li et al. Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring. 2024. [[arxiv]](https://arxiv.org/abs/2406.19949)
1. Yang et al. Financial Knowledge Large Language Model. 2024. [[arxiv]](https://arxiv.org/abs/2407.00365)
1. Lin et al. DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging. 2024. [[arxiv]](https://arxiv.org/abs/2407.01470)
1. Bako et al. Evaluating the Semantic Profiling Abilities of LLMs for Natural Language Utterances in Data Visualization. 2024. [[arxiv]](https://arxiv.org/abs/2407.06129)
1. Huang et al. RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization. 2024. [[arxiv]](https://arxiv.org/abs/2407.08044)
1. Jiang et al. LLM-Collaboration on Automatic Science Journalism for the General Audience. 2024. [[arxiv]](https://arxiv.org/abs/2407.09756)
1. Inouye et al. Applied Auto-tuning on LoRA Hyperparameters. 2024. [[paper]](https://scholarcommons.scu.edu/cseng_senior/272/)
1. Qi et al. Research on Tibetan Tourism Viewpoints information generation system based on LLM. 2024. [[arxiv]](https://arxiv.org/abs/2407.13561)
1. Xu et al. Course-Correction: Safety Alignment Using Synthetic Preferences. 2024. [[arxiv]](https://arxiv.org/abs/2407.16637)
1. Sun et al. LAMBDA: A Large Model Based Data Agent. 2024. [[arxiv]](https://arxiv.org/abs/2407.17535)
1. Zhu et al. CollectiveSFT: Scaling Large Language Models for Chinese Medical Benchmark with Collective Instructions in Healthcare. 2024. [[arxiv]](https://arxiv.org/abs/2407.19705)
1. Yu et al. Correcting Negative Bias in Large Language Models through Negative Attention Score Alignment. 2024. [[arxiv]](https://arxiv.org/abs/2408.00137)
1. Xie et al. The Power of Personalized Datasets: Advancing Chinese Composition Writing for Elementary School through Targeted Model Fine-Tuning. IALP 2024. [[paper]](https://www.asianlp.sg/conferences/ialp2024/proceedings/papers/IALP2024_P055.pdf)
1. Liu et al. Instruct-Code-Llama: Improving Capabilities of Language Model in Competition Level Code Generation by Online Judge Feedback. ICIC 2024. [[paper]](https://link.springer.com/chapter/10.1007/978-981-97-5669-8_11)
1. Wang et al. Cybernetic Sentinels: Unveiling the Impact of Safety Data Selection on Model Security in Supervised Fine-Tuning. ICIC 2024. [[paper]](https://link.springer.com/chapter/10.1007/978-981-97-5669-8_23)
1. Xia et al. Understanding the Performance and Estimating the Cost of LLM Fine-Tuning. 2024. [[arxiv]](https://arxiv.org/abs/2408.04693)
1. Zeng et al. Perceive, Reflect, and Plan: Designing LLM Agent for Goal-Directed City Navigation without Instructions. 2024. [[arxiv]](https://arxiv.org/abs/2408.04168)
1. Xia et al. Using Pre-trained Language Model for Accurate ESG Prediction. FinNLP 2024. [[paper]](https://aclanthology.org/2024.finnlp-2.1/)
1. Liang et al. I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm. 2024. [[arxiv]](https://arxiv.org/abs/2408.08072)
1. Bai et al. Aligning Large Language Model with Direct Multi-Preference Optimization for Recommendation. CIKM 2024. [[paper]](https://dl.acm.org/doi/10.1145/3627673.3679611)
1. Zhang et al. CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling. ACL 2024. [[paper]](https://aclanthology.org/2024.findings-acl.830.pdf)
1. **[StarWhisper](https://github.com/Yu-Yang-Li/StarWhisper)**: A large language model for Astronomy, based on ChatGLM2-6B and Qwen-14B.
1. **[DISC-LawLLM](https://github.com/FudanDISC/DISC-LawLLM)**: A large language model specialized in Chinese legal domain, based on Baichuan-13B, is capable of retrieving and reasoning on legal knowledge.
1. **[Sunsimiao](https://github.com/X-D-Lab/Sunsimiao)**: A large language model specialized in Chinese medical domain, based on Baichuan-7B and ChatGLM-6B.
1. **[CareGPT](https://github.com/WangRongsheng/CareGPT)**: A series of large language models for Chinese medical domain, based on LLaMA2-7B and Baichuan-13B.
1. **[MachineMindset](https://github.com/PKU-YuanGroup/Machine-Mindset/)**: A series of MBTI Personality large language models, capable of giving any LLM 16 different personality types based on different datasets and training methods.
1. **[Luminia-13B-v3](https://huggingface.co/Nekochu/Luminia-13B-v3)**: A large language model specialized in generating metadata for Stable Diffusion. [[demo]](https://huggingface.co/spaces/Nekochu/Luminia-13B_SD_Prompt)
1. **[Chinese-LLaVA-Med](https://github.com/BUAADreamer/Chinese-LLaVA-Med)**: A multimodal large language model specialized in Chinese medical domain, based on LLaVA-1.5-7B.
1. **[AutoRE](https://github.com/THUDM/AutoRE)**: A document-level relation extraction system based on large language models.
1. **[NVIDIA RTX AI Toolkit](https://github.com/NVIDIA/RTX-AI-Toolkit)**: SDKs for fine-tuning LLMs on Windows PC for NVIDIA RTX.
1. **[LazyLLM](https://github.com/LazyAGI/LazyLLM)**: An easy and lazy way for building multi-agent LLMs applications and supports model fine-tuning via LLaMA Factory.
1. **[RAG-Retrieval](https://github.com/NLPJCL/RAG-Retrieval)**: A full pipeline for RAG retrieval model fine-tuning, inference, and distillation. [[blog]](https://zhuanlan.zhihu.com/p/987727357)
1. **[360-LLaMA-Factory](https://github.com/Qihoo360/360-LLaMA-Factory)**: A modified library that supports long sequence SFT & DPO using ring attention.
1. **[Sky-T1](https://novasky-ai.github.io/posts/sky-t1/)**: An o1-like model fine-tuned by NovaSky AI with very small cost.
1. **[WeClone](https://github.com/xming521/WeClone)**: One-stop solution for creating your digital avatar from chat logs.
1. **[EmoLLM](https://github.com/SmartFlowAI/EmoLLM)**: A project about large language models (LLMs) and mental health.
</details>
## License
This repository is licensed under the [Apache-2.0 License](LICENSE).
Please follow the model licenses to use the corresponding model weights: [Baichuan 2](https://huggingface.co/baichuan-inc/Baichuan2-7B-Base/blob/main/Community%20License%20for%20Baichuan%202%20Model.pdf) / [BLOOM](https://huggingface.co/spaces/bigscience/license) / [ChatGLM3](https://github.com/THUDM/ChatGLM3/blob/main/MODEL_LICENSE) / [Command R](https://cohere.com/c4ai-cc-by-nc-license) / [DeepSeek](https://github.com/deepseek-ai/DeepSeek-LLM/blob/main/LICENSE-MODEL) / [Falcon](https://huggingface.co/tiiuae/falcon-180B/blob/main/LICENSE.txt) / [Gemma](https://ai.google.dev/gemma/terms) / [GLM-4](https://huggingface.co/THUDM/glm-4-9b/blob/main/LICENSE) / [GPT-2](https://github.com/openai/gpt-2/blob/master/LICENSE) / [Granite](LICENSE) / [Index](https://huggingface.co/IndexTeam/Index-1.9B/blob/main/LICENSE) / [InternLM](https://github.com/InternLM/InternLM#license) / [Llama](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md) / [Llama 2](https://ai.meta.com/llama/license/) / [Llama 3](https://llama.meta.com/llama3/license/) / [Llama 4](https://github.com/meta-llama/llama-models/blob/main/models/llama4/LICENSE) / [MiniCPM](https://github.com/OpenBMB/MiniCPM/blob/main/MiniCPM%20Model%20License.md) / [Mistral/Mixtral/Pixtral](LICENSE) / [OLMo](LICENSE) / [Phi-1.5/Phi-2](https://huggingface.co/microsoft/phi-1_5/resolve/main/Research%20License.docx) / [Phi-3/Phi-4](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/blob/main/LICENSE) / [Qwen](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT) / [Skywork](https://huggingface.co/Skywork/Skywork-13B-base/blob/main/Skywork%20Community%20License.pdf) / [StarCoder 2](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement) / [TeleChat2](https://huggingface.co/Tele-AI/telechat-7B/blob/main/TeleChat%E6%A8%A1%E5%9E%8B%E7%A4%BE%E5%8C%BA%E8%AE%B8%E5%8F%AF%E5%8D%8F%E8%AE%AE.pdf) / [XVERSE](https://github.com/xverse-ai/XVERSE-13B/blob/main/MODEL_LICENSE.pdf) / [Yi](https://huggingface.co/01-ai/Yi-6B/blob/main/LICENSE) / [Yi-1.5](LICENSE) / [Yuan 2](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/LICENSE-Yuan)
## Citation
If this work is helpful, please kindly cite as:
```bibtex
@inproceedings{zheng2024llamafactory,
  title={LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models},
  author={Yaowei Zheng and Richong Zhang and Junhao Zhang and Yanhan Ye and Zheyan Luo and Zhangchi Feng and Yongqiang Ma},
  booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)},
  address={Bangkok, Thailand},
  publisher={Association for Computational Linguistics},
  year={2024},
  url={http://arxiv.org/abs/2403.13372}
}
```

![# LLaMA Factory](assets/logo.png)
[![GitHub Repo stars](https://img.shields.io/github/stars/hiyouga/LLaMA-Factory?style=social)](https://github.com/hiyouga/LLaMA-Factory/stargazers)
[![GitHub Code License](https://img.shields.io/github/license/hiyouga/LLaMA-Factory)](LICENSE)
[![GitHub last commit](https://img.shields.io/github/last-commit/hiyouga/LLaMA-Factory)](https://github.com/hiyouga/LLaMA-Factory/commits/main)
[![GitHub contributors](https://img.shields.io/github/contributors/hiyouga/LLaMA-Factory?color=orange)](https://github.com/hiyouga/LLaMA-Factory/graphs/contributors)
[![GitHub workflow](https://github.com/hiyouga/LLaMA-Factory/actions/workflows/tests.yml/badge.svg)](https://github.com/hiyouga/LLaMA-Factory/actions/workflows/tests.yml)
[![PyPI](https://img.shields.io/pypi/v/llamafactory)](https://pypi.org/project/llamafactory/)
[![GitHub pull request](https://img.shields.io/badge/PRs-welcome-blue)](https://github.com/hiyouga/LLaMA-Factory/pulls)
[![Citation](https://img.shields.io/badge/citation-730-green)](https://scholar.google.com/scholar?cites=12620864006390196564)
[![Docker Pulls](https://img.shields.io/docker/pulls/hiyouga/llamafactory)](https://hub.docker.com/r/hiyouga/llamafactory/tags)
[![Twitter](https://img.shields.io/twitter/follow/llamafactory_ai)](https://twitter.com/llamafactory_ai)
[![Discord](https://dcbadge.vercel.app/api/server/rKfvV9r9FK?compact=true&style=flat)](https://discord.gg/rKfvV9r9FK)
[![GitCode](https://gitcode.com/zhengyaowei/LLaMA-Factory/star/badge.svg)](https://gitcode.com/zhengyaowei/LLaMA-Factory)
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1d5KQtbemerlSDSxZIfAaWXhKr30QypiK?usp=sharing)
[![Open in DSW](https://gallery.pai-ml.com/assets/open-in-dsw.svg)](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory)
[![Open in Alaya](assets/alaya_new.svg)](https://docs.alayanew.com/docs/documents/newActivities/llamafactory/?utm_source=LLaMA-Factory)
[![Open in Spaces](https://img.shields.io/badge/🤗-Open%20in%20Spaces-blue)](https://huggingface.co/spaces/hiyouga/LLaMA-Board)
[![Open in Studios](https://img.shields.io/badge/ModelScope-Open%20in%20Studios-blue)](https://modelscope.cn/studios/hiyouga/LLaMA-Board)
[![Open in Novita](https://img.shields.io/badge/Novita-Deploy%20Template-blue)](https://novita.ai/templates-library/105981?sharer=88115474-394e-4bda-968e-b88e123d0c47)
### Used by [Amazon](https://aws.amazon.com/cn/blogs/china/a-one-stop-code-free-model-fine-tuning-deployment-platform-based-on-sagemaker-and-llama-factory/), [NVIDIA](https://developer.nvidia.cn/rtx/ai-toolkit), [Aliyun](https://help.aliyun.com/zh/pai/use-cases/fine-tune-a-llama-3-model-with-llama-factory), etc.
<div align="center" markdown="1">
### Sponsors ❤️
<a href="https://warp.dev/llama-factory">
<img alt="Warp sponsorship" width="400" src="https://github.com/user-attachments/assets/ab8dd143-b0fd-4904-bdc5-dd7ecac94eae">
</a>
#### [Warp, the intelligent terminal for developers](https://warp.dev/llama-factory)
[Available for MacOS, Linux and Windows](https://warp.dev/llama-factory)
----
### Easily fine-tune 100+ large language models with zero-code [CLI](#quickstart) and [Web UI](#fine-tuning-with-llama-board-gui-powered-by-gradio)
![GitHub Trend](https://trendshift.io/api/badge/repositories/4535)
</div>
👋 Join our [WeChat group](assets/wechat.jpg), [NPU user group](assets/wechat_npu.jpg) or [Alaya NeW compute discount group](assets/wechat_alaya.png).
\[ [English](README.md) | 中文 \]
**Fine-tuning a large language model can be easy as...**
https://github.com/user-attachments/assets/43b700c6-a178-41db-b1f8-8190a5d3fcfc
Choose your path:
- **Beginner tutorial**: https://zhuanlan.zhihu.com/p/695287607 (Chinese)
- **Fine-tuning video tutorial**: https://www.bilibili.com/video/BV1djgRzxEts/ (Chinese)
- **Documentation**: https://llamafactory.readthedocs.io/zh-cn/latest/
- **Documentation (Ascend NPU)**: https://ascend.github.io/docs/sources/llamafactory/
- **Colab (free)**: https://colab.research.google.com/drive/1d5KQtbemerlSDSxZIfAaWXhKr30QypiK?usp=sharing
- **Local machine**: please refer to [usage](#getting-started)
- **PAI-DSW (free trial)**: https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory
- **Alaya NeW (cloud GPU deal)**: https://docs.alayanew.com/docs/documents/useGuide/LLaMAFactory/mutiple/?utm_source=LLaMA-Factory
> [!NOTE]
> Websites other than the links above are unauthorized third-party sites. Please use them with caution.
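For the local-machine path, the typical zero-code workflow boils down to three CLI calls. This is a sketch assuming a source install; the YAML files referenced here are the Llama-3 LoRA examples shipped under `examples/`:

```bash
# Fine-tune, chat with, and merge the Llama-3 LoRA example
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
```

Any hyperparameter can also be overridden on the command line in `key=value` form instead of editing the YAML file.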
## Table of Contents
- [Features](#features)
- [Benchmark](#benchmark)
- [Blogs](#blogs)
- [Changelog](#changelog)
- [Supported Models](#supported-models)
- [Supported Training Approaches](#supported-training-approaches)
- [Provided Datasets](#provided-datasets)
- [Requirement](#requirement)
- [Getting Started](#getting-started)
  - [Installation](#installation)
  - [Data Preparation](#data-preparation)
  - [Quickstart](#quickstart)
  - [Fine-Tuning with LLaMA Board GUI](#fine-tuning-with-llama-board-gui-powered-by-gradio)
  - [Build Docker](#build-docker)
  - [Deploy with OpenAI-style API and vLLM](#deploy-with-openai-style-api-and-vllm)
  - [Download from ModelScope Hub](#download-from-modelscope-hub)
  - [Download from Modelers Hub](#download-from-modelers-hub)
  - [Use W&B Logger](#use-wb-logger)
  - [Use SwanLab Logger](#use-swanlab-logger)
- [Projects using LLaMA Factory](#projects-using-llama-factory)
- [License](#license)
- [Citation](#citation)
## Features
- **Various models**: LLaMA, LLaVA, Mistral, Mixtral-MoE, Qwen, Qwen2-VL, DeepSeek, Yi, Gemma, ChatGLM, Phi, etc.
- **Integrated methods**: (Continuous) pre-training, (multimodal) supervised fine-tuning, reward modeling, PPO, DPO, KTO, ORPO, etc.
- **Scalable resources**: 16-bit full-tuning, freeze-tuning, LoRA and 2/3/4/5/6/8-bit QLoRA via AQLM/AWQ/GPTQ/LLM.int8/HQQ/EETQ.
- **Advanced algorithms**: [GaLore](https://github.com/jiaweizzhao/GaLore), [BAdam](https://github.com/Ledzy/BAdam), [APOLLO](https://github.com/zhuhanqing/APOLLO), [Adam-mini](https://github.com/zyushun/Adam-mini), [Muon](https://github.com/KellerJordan/Muon), DoRA, LongLoRA, LLaMA Pro, Mixture-of-Depths, LoRA+, LoftQ and PiSSA.
- **Practical tricks**: [FlashAttention-2](https://github.com/Dao-AILab/flash-attention), [Unsloth](https://github.com/unslothai/unsloth), [Liger Kernel](https://github.com/linkedin/Liger-Kernel), RoPE scaling, NEFTune and rsLoRA.
- **Wide tasks**: Multi-turn dialogue, tool calling, image understanding, visual grounding, video recognition, audio understanding, etc.
- **Experiment monitors**: LlamaBoard, TensorBoard, Wandb, MLflow, [SwanLab](https://github.com/SwanHubX/SwanLab), etc.
- **Faster inference**: OpenAI-style API, browser UI and CLI powered by [vLLM](https://github.com/vllm-project/vllm) or [SGLang](https://github.com/sgl-project/sglang).
## Benchmark
### Day-N Support for Fine-Tuning Cutting-Edge Models
| Support Date | Model Name                                                           |
| ------------ | -------------------------------------------------------------------- |
| Day 0 | Qwen3 / Qwen2.5-VL / Gemma 3 / GLM-4.1V / InternLM 3 / MiniCPM-o-2.6 |
| Day 1 | Llama 3 / GLM-4 / Mistral Small / PaliGemma2 / Llama 4 |
## Blogs
- [Fine-tune Llama3.1-70B for medical diagnosis using LLaMA-Factory](https://docs.alayanew.com/docs/documents/bestPractice/bigModel/llama70B/) (Chinese)
- [Build a one-stop, code-free LLM reinforcement learning and deployment platform LLM Model Hub based on LLaMA-Factory and EasyR1](https://aws.amazon.com/cn/blogs/china/building-llm-model-hub-based-on-llamafactory-and-easyr1/) (Chinese)
- [How Apoidea Group enhances visual information extraction from banking documents with multimodal models using LLaMA-Factory on Amazon SageMaker HyperPod](https://aws.amazon.com/cn/blogs/machine-learning/how-apoidea-group-enhances-visual-information-extraction-from-banking-documents-with-multimodal-models-using-llama-factory-on-amazon-sagemaker-hyperpod/) (English)
- [Easy Dataset × LLaMA Factory: Enabling LLMs to efficiently learn domain knowledge](https://buaa-act.feishu.cn/wiki/KY9xwTGs1iqHrRkjXBwcZP9WnL9) (Chinese)
<details><summary>All blogs</summary>
- [Fine-tune Qwen2.5-VL for autonomous driving scenarios using LLaMA-Factory](https://docs.alayanew.com/docs/documents/useGuide/LLaMAFactory/mutiple/?utm_source=LLaMA-Factory) (Chinese)
- [LLaMA Factory: Fine-tune the DeepSeek-R1-Distill-Qwen-7B model to build a news headline classifier](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_deepseek_r1_distill_7b) (Chinese)
- [Build a one-stop, code-free model fine-tuning and deployment platform Model Hub based on Amazon SageMaker and LLaMA-Factory](https://aws.amazon.com/cn/blogs/china/a-one-stop-code-free-model-fine-tuning-deployment-platform-based-on-sagemaker-and-llama-factory/) (Chinese)
- [LLaMA Factory multimodal fine-tuning practice: fine-tune Qwen2-VL to build a cultural tourism model](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_qwen2vl) (Chinese)
- [LLaMA Factory: Fine-tune the Llama3 model for role-playing](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory) (Chinese)
</details>
## Changelog
[25/07/02] We supported fine-tuning the **[GLM-4.1V-9B-Thinking](https://github.com/THUDM/GLM-4.1V-Thinking)** model. Please install transformers from the main branch to use it.
[25/04/28] We supported fine-tuning the **[Qwen3](https://qwenlm.github.io/blog/qwen3/)** model family.
[25/04/21] We supported the **[Muon](https://github.com/KellerJordan/Muon)** optimizer. See [examples](examples/README_zh.md) for usage. Thank [@tianshijing](https://github.com/tianshijing)'s PR.
[25/04/16] We supported fine-tuning the **[InternVL3](https://huggingface.co/OpenGVLab/InternVL3-8B)** model. See [PR #7258](https://github.com/hiyouga/LLaMA-Factory/pull/7258) to get started.
[25/04/14] We supported fine-tuning the **[GLM-Z1](https://huggingface.co/THUDM/GLM-Z1-9B-0414)** and **[Kimi-VL](https://huggingface.co/moonshotai/Kimi-VL-A3B-Instruct)** models.
[25/04/06] We supported fine-tuning the **[Llama 4](https://ai.meta.com/blog/llama-4-multimodal-intelligence/)** model. See [PR #7611](https://github.com/hiyouga/LLaMA-Factory/pull/7611) to get started.
<details><summary>Full changelog</summary>
[25/03/31] We supported fine-tuning the **[Qwen2.5 Omni](https://qwenlm.github.io/blog/qwen2.5-omni/)** model. See [PR #7537](https://github.com/hiyouga/LLaMA-Factory/pull/7537) to get started.
[25/03/15] We supported the **[SGLang](https://github.com/sgl-project/sglang)** inference backend. Use `infer_backend: sglang` to enable it.
[25/03/12] We supported fine-tuning the **[Gemma 3](https://huggingface.co/blog/gemma3)** model.
[25/02/24] We announced the open-sourcing of **[EasyR1](https://github.com/hiyouga/EasyR1)**, an efficient and scalable multimodal reinforcement learning framework that supports efficient GRPO training.
[25/02/11] We supported saving **[Ollama](https://github.com/ollama/ollama)** modelfiles when exporting models. See [examples](examples/README_zh.md) for usage.
[25/02/05] We supported fine-tuning the **[Qwen2-Audio](https://huggingface.co/Qwen/Qwen2-Audio-7B-Instruct)** and **[MiniCPM-o-2.6](https://huggingface.co/openbmb/MiniCPM-o-2_6)** models on audio understanding tasks.
[25/01/31] We supported fine-tuning the **[DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1)** and **[Qwen2.5-VL](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct)** models.
[25/01/15] We supported the **[APOLLO](https://arxiv.org/abs/2412.05270)** optimizer. See [examples](examples/README_zh.md) for usage.
[25/01/14] We supported fine-tuning the **[MiniCPM-o-2.6](https://huggingface.co/openbmb/MiniCPM-o-2_6)** and **[MiniCPM-V-2.6](https://huggingface.co/openbmb/MiniCPM-V-2_6)** models. Thank [@BUAADreamer](https://github.com/BUAADreamer)'s PR.
[25/01/14] We supported fine-tuning the **[InternLM 3](https://huggingface.co/collections/internlm/)** models. Thank [@hhaAndroid](https://github.com/hhaAndroid)'s PR.
[25/01/10] We supported fine-tuning the **[Phi-4](https://huggingface.co/microsoft/phi-4)** model.
[24/12/21] We supported using **[SwanLab](https://github.com/SwanHubX/SwanLab)** for experiment tracking and visualization. See [this section](#use-swanlab-logger) for details.
[24/11/27] We supported fine-tuning the **[Skywork-o1](https://huggingface.co/Skywork/Skywork-o1-Open-Llama-3.1-8B)** model and the **[OpenO1](https://huggingface.co/datasets/O1-OPEN/OpenO1-SFT)** dataset.
[24/10/09] We supported downloading pre-trained models and datasets from the **[Modelers Hub](https://modelers.cn/models)**. See [this tutorial](#download-from-modelers-hub) for usage.
[24/09/19] We supported fine-tuning the **[Qwen2.5](https://qwenlm.github.io/blog/qwen2.5/)** models.
[24/08/30] We supported fine-tuning the **[Qwen2-VL](https://qwenlm.github.io/blog/qwen2-vl/)** models. Thank [@simonJJJ](https://github.com/simonJJJ)'s PR.
[24/08/27] We supported **[Liger Kernel](https://github.com/linkedin/Liger-Kernel)**. Use `enable_liger_kernel: true` to accelerate training.
[24/08/09] We supported the **[Adam-mini](https://github.com/zyushun/Adam-mini)** optimizer. See [examples](examples/README_zh.md) for usage. Thank [@relic-yuexi](https://github.com/relic-yuexi)'s PR.
[24/07/04] We supported [contamination-free packed training](https://github.com/MeetKai/functionary/tree/main/functionary/train/packing). Use `neat_packing: true` to enable it. Thank [@chuan298](https://github.com/chuan298)'s PR.
[24/06/16] We supported the **[PiSSA](https://arxiv.org/abs/2404.02948)** algorithm. See [examples](examples/README_zh.md) for usage.
[24/06/07] We supported fine-tuning the **[Qwen2](https://qwenlm.github.io/blog/qwen2/)** and **[GLM-4](https://github.com/THUDM/GLM-4)** models.
[24/05/26] We supported the **[SimPO](https://arxiv.org/abs/2405.14734)** preference alignment algorithm. See [examples](examples/README_zh.md) for usage.
[24/05/20] We supported fine-tuning the **PaliGemma** series models. Note that PaliGemma models are pre-trained; you need to fine-tune them with the `paligemma` template to obtain chat capability.
[24/05/18] We supported the **[KTO](https://arxiv.org/abs/2402.01306)** preference alignment algorithm. See [examples](examples/README_zh.md) for usage.
[24/04/21] We supported **[Mixture-of-Depths](https://arxiv.org/abs/2404.02258)** training based on [AstraMindAI's implementation](https://github.com/astramind-ai/Mixture-of-depths). See [examples](examples/README_zh.md) for usage.
[24/04/16] We supported the **[BAdam](https://arxiv.org/abs/2404.02827)** optimizer. See [examples](examples/README_zh.md) for usage.
[24/04/16] We supported **[unsloth](https://github.com/unslothai/unsloth)**'s long-sequence training (Llama-2-7B-56k within 24GB). It achieves **117%** speed and **50%** memory savings compared with FlashAttention-2. More benchmarks can be found on [this page](https://github.com/hiyouga/LLaMA-Factory/wiki/Performance-comparison).
[24/03/13] We supported **[LoRA+](https://arxiv.org/abs/2402.12354)**. See [examples](examples/README_zh.md) for usage.
[24/03/07] We supported the **[GaLore](https://arxiv.org/abs/2403.03507)** optimizer. See [examples](examples/README_zh.md) for usage.
[24/03/07] We integrated **[vLLM](https://github.com/vllm-project/vllm)** for fast concurrent inference. Use `infer_backend: vllm` to enjoy **270%** inference speed.
</details>
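Several of the entries above toggle a feature through a single YAML key. As a hedged illustration (the keys `infer_backend`, `template`, and `model_name_or_path` come from the entries and sections above; the model name is a placeholder), an inference config enabling the vLLM backend might look like:

```yaml
# Inference config sketch; infer_backend accepts huggingface (default), vllm or sglang.
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
template: llama3
infer_backend: vllm   # switch to sglang to use the SGLang worker instead
```

The same pattern applies to training-time switches such as `enable_liger_kernel: true` or `neat_packing: true`.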
> [!TIP]
> If you cannot use the latest feature, please pull the latest code and install LLaMA-Factory again.
## Supported Models
| Model                                                              | Model size                       | Template            |
| ----------------------------------------------------------------- | -------------------------------- | ------------------- |
| [Baichuan 2](https://huggingface.co/baichuan-inc) | 7B/13B | baichuan2 |
| [BLOOM/BLOOMZ](https://huggingface.co/bigscience) | 560M/1.1B/1.7B/3B/7.1B/176B | - |
| [ChatGLM3](https://huggingface.co/THUDM) | 6B | chatglm3 |
| [Command R](https://huggingface.co/CohereForAI) | 35B/104B | cohere |
| [DeepSeek (Code/MoE)](https://huggingface.co/deepseek-ai) | 7B/16B/67B/236B | deepseek |
| [DeepSeek 2.5/3](https://huggingface.co/deepseek-ai) | 236B/671B | deepseek3 |
| [DeepSeek R1 (Distill)](https://huggingface.co/deepseek-ai) | 1.5B/7B/8B/14B/32B/70B/671B | deepseekr1 |
| [Falcon](https://huggingface.co/tiiuae) | 7B/11B/40B/180B | falcon |
| [Falcon-H1](https://huggingface.co/tiiuae) | 0.5B/1.5B/3B/7B/34B | falcon_h1 |
| [Gemma/Gemma 2/CodeGemma](https://huggingface.co/google) | 2B/7B/9B/27B | gemma/gemma2 |
| [Gemma 3/Gemma 3n](https://huggingface.co/google) | 1B/4B/6B/8B/12B/27B | gemma3/gemma3n |
| [GLM-4/GLM-4-0414/GLM-Z1](https://huggingface.co/zai-org) | 9B/32B | glm4/glmz1 |
| [GLM-4.1V](https://huggingface.co/zai-org)* | 9B | glm4v |
| [GLM-4.5](https://huggingface.co/zai-org)* | 106B/355B | glm4_moe |
| [GPT-2](https://huggingface.co/openai-community) | 0.1B/0.4B/0.8B/1.5B | - |
| [Granite 3.0-3.3](https://huggingface.co/ibm-granite) | 1B/2B/3B/8B | granite3 |
| [Granite 4](https://huggingface.co/ibm-granite) | 7B | granite4 |
| [Hunyuan](https://huggingface.co/tencent/) | 7B | hunyuan |
| [Index](https://huggingface.co/IndexTeam) | 1.9B | index |
| [InternLM 2-3](https://huggingface.co/internlm) | 7B/8B/20B | intern2 |
| [InternVL 2.5-3](https://huggingface.co/OpenGVLab) | 1B/2B/8B/14B/38B/78B | intern_vl |
| [Kimi-VL](https://huggingface.co/moonshotai) | 16B | kimi_vl |
| [Llama](https://github.com/facebookresearch/llama) | 7B/13B/33B/65B | - |
| [Llama 2](https://huggingface.co/meta-llama) | 7B/13B/70B | llama2 |
| [Llama 3-3.3](https://huggingface.co/meta-llama) | 1B/3B/8B/70B | llama3 |
| [Llama 4](https://huggingface.co/meta-llama) | 109B/402B | llama4 |
| [Llama 3.2 Vision](https://huggingface.co/meta-llama) | 11B/90B | mllama |
| [LLaVA-1.5](https://huggingface.co/llava-hf) | 7B/13B | llava |
| [LLaVA-NeXT](https://huggingface.co/llava-hf) | 7B/8B/13B/34B/72B/110B | llava_next |
| [LLaVA-NeXT-Video](https://huggingface.co/llava-hf) | 7B/34B | llava_next_video |
| [MiMo](https://huggingface.co/XiaomiMiMo) | 7B | mimo |
| [MiniCPM](https://huggingface.co/openbmb) | 0.5B/1B/2B/4B/8B | cpm/cpm3/cpm4 |
| [MiniCPM-o-2.6/MiniCPM-V-2.6](https://huggingface.co/openbmb) | 8B | minicpm_o/minicpm_v |
| [Ministral/Mistral-Nemo](https://huggingface.co/mistralai) | 8B/12B | ministral |
| [Mistral/Mixtral](https://huggingface.co/mistralai) | 7B/8x7B/8x22B | mistral |
| [Mistral Small](https://huggingface.co/mistralai) | 24B | mistral_small |
| [OLMo](https://huggingface.co/allenai) | 1B/7B | - |
| [PaliGemma/PaliGemma2](https://huggingface.co/google) | 3B/10B/28B | paligemma |
| [Phi-1.5/Phi-2](https://huggingface.co/microsoft) | 1.3B/2.7B | - |
| [Phi-3/Phi-3.5](https://huggingface.co/microsoft) | 4B/14B | phi |
| [Phi-3-small](https://huggingface.co/microsoft) | 7B | phi_small |
| [Phi-4](https://huggingface.co/microsoft) | 14B | phi4 |
| [Pixtral](https://huggingface.co/mistralai) | 12B | pixtral |
| [Qwen (1-2.5) (Code/Math/MoE/QwQ)](https://huggingface.co/Qwen) | 0.5B/1.5B/3B/7B/14B/32B/72B/110B | qwen |
| [Qwen3 (MoE)](https://huggingface.co/Qwen) | 0.6B/1.7B/4B/8B/14B/32B/235B | qwen3 |
| [Qwen2-Audio](https://huggingface.co/Qwen) | 7B | qwen2_audio |
| [Qwen2.5-Omni](https://huggingface.co/Qwen) | 3B/7B | qwen2_omni |
| [Qwen2-VL/Qwen2.5-VL/QVQ](https://huggingface.co/Qwen) | 2B/3B/7B/32B/72B | qwen2_vl |
| [Seed Coder](https://huggingface.co/ByteDance-Seed) | 8B | seed_coder |
| [Skywork o1](https://huggingface.co/Skywork) | 8B | skywork_o1 |
| [StarCoder 2](https://huggingface.co/bigcode) | 3B/7B/15B | - |
| [TeleChat2](https://huggingface.co/Tele-AI) | 3B/7B/35B/115B | telechat2 |
| [XVERSE](https://huggingface.co/xverse) | 7B/13B/65B | xverse |
| [Yi/Yi-1.5 (Code)](https://huggingface.co/01-ai) | 1.5B/6B/9B/34B | yi |
| [Yi-VL](https://huggingface.co/01-ai) | 6B/34B | yi_vl |
| [Yuan 2](https://huggingface.co/IEITYuan) | 2B/51B/102B | yuan |
> [!NOTE]
> For the "base" models, the `template` argument can be chosen from `default`, `alpaca`, `vicuna`, etc. But make sure to use the **corresponding template** for the "instruct/chat" models.
>
> Remember to use the **SAME** template in training and inference.
>
> \*: You should install `transformers` from the main branch and use `DISABLE_VERSION_CHECK=1` to skip the version check.
>
> \*\*: You need to install a specific version of `transformers` to use the corresponding model.
Please refer to [constants.py](src/llamafactory/extras/constants.py) for a full list of supported models.
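As a sketch of how the `template` argument from the table is supplied in practice, here is a minimal LoRA SFT config fragment (the dataset and output paths are illustrative; field names follow the example configs shipped with the project):

```yaml
# Minimal LoRA SFT sketch: the template must match the chat model being tuned.
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
template: llama3          # use default/alpaca/vicuna etc. only for base models
stage: sft
finetuning_type: lora
dataset: identity         # illustrative dataset name
output_dir: saves/llama3-8b/lora/sft
```

Keeping the same `template` at inference time is what the note above means by using the **SAME** template in training and inference.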
## 训练方法
| 方法 | 全参数训练 | 部分参数训练 | LoRA | QLoRA |
| --------------------- | ------------------ | ------------------ | ------------------ | ------------------ |
| 预训练 | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| 指令监督微调 | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| 奖励模型训练 | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| ORPO 训练 | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| SimPO 训练 | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
> [!TIP]
> 有关 PPO 的实现细节,请参考[此博客](https://newfacade.github.io/notes-on-reinforcement-learning/17-ppo-trl.html)。
## 数据集
<details><summary>预训练数据集</summary>
- [Cosmopedia (en)](https://huggingface.co/datasets/HuggingFaceTB/cosmopedia)
- [STEM (zh)](https://huggingface.co/datasets/hfl/stem_zh_instruction)
- [Ruozhiba (zh)](https://huggingface.co/datasets/hfl/ruozhiba_gpt4_turbo)
- [Neo-sft (zh)](https://huggingface.co/datasets/m-a-p/neo_sft_phase2)
- [Magpie-Pro-300K-Filtered (en)](https://huggingface.co/datasets/Magpie-Align/Magpie-Pro-300K-Filtered)
- [Magpie-ultra-v0.1 (en)](https://huggingface.co/datasets/argilla/magpie-ultra-v0.1)
- [WebInstructSub (en)](https://huggingface.co/datasets/TIGER-Lab/WebInstructSub)
- [OpenO1-SFT (en&zh)](https://huggingface.co/datasets/O1-OPEN/OpenO1-SFT)
- [Open-Thoughts (en)](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k)
- [Open-R1-Math (en)](https://huggingface.co/datasets/open-r1/OpenR1-Math-220k)
- [Chinese-DeepSeek-R1-Distill (zh)](https://huggingface.co/datasets/Congliu/Chinese-DeepSeek-R1-Distill-data-110k-SFT)
- [LLaVA mixed (en&zh)](https://huggingface.co/datasets/BUAADreamer/llava-en-zh-300k)
- [Pokemon-gpt4o-captions (en&zh)](https://huggingface.co/datasets/jugg1024/pokemon-gpt4o-captions)
- [Open Assistant (de)](https://huggingface.co/datasets/mayflowergmbh/oasst_de)
- [Dolly 15k (de)](https://huggingface.co/datasets/mayflowergmbh/dolly-15k_de)
- [Alpaca GPT4 (de)](https://huggingface.co/datasets/mayflowergmbh/alpaca-gpt4_de)
- [DPO mixed (en&zh)](https://huggingface.co/datasets/hiyouga/DPO-En-Zh-20k)
- [UltraFeedback (en)](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized)
- [COIG-P (zh)](https://huggingface.co/datasets/m-a-p/COIG-P)
- [RLHF-V (en)](https://huggingface.co/datasets/openbmb/RLHF-V-Dataset)
- [VLFeedback (en)](https://huggingface.co/datasets/Zhihui/VLFeedback)
- [RLAIF-V (en)](https://huggingface.co/datasets/openbmb/RLAIF-V-Dataset)
- [Orca DPO Pairs (en)](https://huggingface.co/datasets/Intel/orca_dpo_pairs)
- [HH-RLHF (en)](https://huggingface.co/datasets/Anthropic/hh-rlhf)
- [Nectar (en)](https://huggingface.co/datasets/berkeley-nest/Nectar)
## 软硬件依赖
| 必需项 | 至少 | 推荐 |
| ------------ | ------- | --------- |
| python | 3.9 | 3.10 |
| torch | 2.0.0 | 2.6.0 |
| torchvision | 0.15.0 | 0.21.0 |
| transformers | 4.49.0 | 4.50.0 |
| datasets | 2.16.0 | 3.2.0 |
| accelerate | 0.34.0 | 1.2.1 |
| peft | 0.14.0 | 0.15.1 |
| trl | 0.8.6 | 0.9.6 |
| 可选项 | 至少 | 推荐 |
| ------------ | ------- | --------- |
| CUDA | 11.6 | 12.2 |
| deepspeed | 0.10.0 | 0.16.4 |
| bitsandbytes | 0.39.0 | 0.43.1 |
| vllm | 0.4.3 | 0.8.2 |
| flash-attn | 2.5.6 | 2.7.2 |
### 硬件依赖
\* *估算值*
| 方法 | 精度 | 7B | 14B | 30B | 70B | `x`B |
| ------------------------------- | ---- | ----- | ----- | ----- | ------ | ------- |
| Full (`bf16` or `fp16`) | 32 | 120GB | 240GB | 600GB | 1200GB | `18x`GB |
| Full (`pure_bf16`) | 16 | 60GB | 120GB | 300GB | 600GB | `8x`GB |
| Freeze/LoRA/GaLore/APOLLO/BAdam | 16 | 16GB | 32GB | 64GB | 160GB | `2x`GB |
| QLoRA | 8 | 10GB | 20GB | 40GB | 80GB | `x`GB |
| QLoRA | 4 | 6GB | 12GB | 24GB | 48GB | `x/2`GB |
| QLoRA | 2 | 4GB | 8GB | 16GB | 24GB | `x/4`GB |
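表中末列给出了按参数量 `x`(单位:十亿参数)估算显存的粗略公式(与前几列的实测估算略有出入)。下面用一小段 Python 演示该经验公式,仅为示意,实际占用还取决于序列长度、批大小、优化器等因素:

```python
def estimate_vram_gb(params_b: float, method: str) -> float:
    """按上表末列的经验公式粗略估算所需显存(GB),仅供参考。"""
    factors = {
        "full_amp": 18.0,        # Full(bf16/fp16 混合精度)≈ 18x GB
        "full_pure_bf16": 8.0,   # Full(pure_bf16)≈ 8x GB
        "adapter_16bit": 2.0,    # Freeze/LoRA/GaLore/APOLLO/BAdam ≈ 2x GB
        "qlora_8bit": 1.0,       # QLoRA 8-bit ≈ x GB
        "qlora_4bit": 0.5,       # QLoRA 4-bit ≈ x/2 GB
        "qlora_2bit": 0.25,      # QLoRA 2-bit ≈ x/4 GB
    }
    return params_b * factors[method]

print(estimate_vram_gb(7, "qlora_4bit"))  # 3.5
```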
## 如何使用
> [!IMPORTANT]
> 此步骤为必需。
#### 从源码安装
```bash
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]" --no-build-isolation
```
可选的额外依赖项:torch、torch-npu、metrics、deepspeed、liger-kernel、bitsandbytes、hqq、eetq、gptq、aqlm、vllm、sglang、galore、apollo、badam、adam-mini、qwen、minicpm_v、openmind、swanlab、dev
> [!TIP]
> 遇到包冲突时,可使用 `pip install --no-deps -e .` 解决。
#### 从镜像安装
```bash
docker run -it --rm --gpus=all --ipc=host hiyouga/llamafactory:latest
```
该镜像基于 Ubuntu 22.04(x86\_64)、CUDA 12.4、Python 3.11、PyTorch 2.6.0 和 Flash-attn 2.7.4 构建。
查看全部镜像:https://hub.docker.com/r/hiyouga/llamafactory/tags
请参阅[构建 Docker](#构建-docker) 来重新构建镜像。
<details><summary>使用 <b>uv</b> 构建虚拟环境</summary>
使用 [uv](https://github.com/astral-sh/uv) 创建隔离的 Python 环境:
```bash
uv sync --extra torch --extra metrics --prerelease=allow
```
在环境中运行 LLaMA-Factory:
```bash
uv run --prerelease=allow llamafactory-cli train examples/train_lora/llama3_lora_pretrain.yaml
```
</details>
<details><summary>Windows 用户指南</summary>
#### 安装 PyTorch
Windows 平台需要额外手动安装 GPU 版本的 PyTorch 依赖包,您可以参考[官方网站](https://pytorch.org/get-started/locally/)和以下命令安装并测试 PyTorch 是否正确安装。
```bash
pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
python -c "import torch; print(torch.cuda.is_available())"
```
如果看到 `True` 则说明安装成功。
若遇到类似 `Can't pickle local object` 的报错,请设置 `dataloader_num_workers: 0`
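该规避方法对应训练 yaml 中的这一行设置(示意):

```yaml
### 规避 Windows 下 `Can't pickle local object` 报错(示意)
dataloader_num_workers: 0
```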
#### 安装 BitsAndBytes
如果要在 Windows 平台上开启量化 LoRA(QLoRA),需要安装预编译的 `bitsandbytes` 库,支持 CUDA 11.1 到 12.2,请根据您的 CUDA 版本情况选择适合的[发布版本](https://github.com/jllllll/bitsandbytes-windows-webui/releases/tag/wheels)。
```bash
pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl
```
#### 安装 Flash Attention-2
如果要在 Windows 平台上开启 FlashAttention-2,请使用 [flash-attention-windows-wheel](https://huggingface.co/lldacing/flash-attention-windows-wheel) 中的脚本自行编译与安装。
</details>
<details><summary>昇腾 NPU 用户指南</summary>
加入 [NPU 用户群](assets/wechat_npu.jpg)。
在昇腾 NPU 设备上安装 LLaMA Factory 时,请将 Python 升级至 3.10 及以上,并使用 `pip install -e ".[torch-npu,metrics]"` 命令指定额外依赖项进行安装。此外,还需要安装 **[Ascend CANN Toolkit 与 Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**,安装方法请参考[安装教程](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC2alpha002/quickstart/quickstart/quickstart_18_0004.html)或使用以下命令:
```bash
# 请替换 URL 为 CANN 版本和设备型号对应的 URL
bash Ascend-cann-kernels-910b_8.0.RC1.alpha001_linux.run --install
source /usr/local/Ascend/ascend-toolkit/set_env.sh
```
| 依赖项 | 至少 | 推荐 |
| ------------ | ------- | -------------- |
| CANN | 8.0.RC1 | 8.0.0.alpha002 |
| torch | 2.1.0 | 2.4.0 |
| torch-npu | 2.1.0 | 2.4.0.post2 |
| deepspeed | 0.13.2 | 0.13.2 |
| vllm-ascend | - | 0.7.3 |
请使用 `ASCEND_RT_VISIBLE_DEVICES` 而非 `CUDA_VISIBLE_DEVICES` 来指定运算设备。
如果遇到无法正常推理的情况,请尝试设置 `do_sample: false`
下载预构建 Docker 镜像:[32GB](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html) | [64GB](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html)
#### 安装 BitsAndBytes
如果要在 Ascend NPU 上进行基于 bitsandbytes 的 QLoRA 量化微调,请执行如下步骤:
1. 手动编译 bitsandbytes:请参考[安装文档](https://huggingface.co/docs/bitsandbytes/installation?backend=Ascend+NPU&platform=Ascend+NPU)完成 NPU 版的 bitsandbytes 安装,编译要求环境 cmake 版本不低于 3.22.1,g++ 版本不低于 12.x。
```bash
# 从源码安装 bitsandbytes
# 克隆 bitsandbytes 仓库, Ascend NPU 目前在 multi-backend-refactor 中支持
git clone -b multi-backend-refactor https://github.com/bitsandbytes-foundation/bitsandbytes.git
cd bitsandbytes/
# 安装依赖
pip install -r requirements-dev.txt
# 安装编译工具依赖,该步骤在不同系统上命令有所不同,供参考
apt-get install -y build-essential cmake
# 编译 & 安装
cmake -DCOMPUTE_BACKEND=npu -S .
make
pip install .
```
2. 安装 transformers 的 main 分支版本。
```bash
git clone -b main https://github.com/huggingface/transformers.git
cd transformers
pip install .
```
3. 在训练参数中设置 `double_quantization: false`,可参考[示例](examples/train_qlora/llama3_lora_sft_bnb_npu.yaml)。
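对应训练 yaml 中的关键量化参数大致如下(节选示意,完整配置请以仓库内示例文件为准):

```yaml
### NPU 上 bitsandbytes QLoRA 的关键参数(节选示意)
quantization_bit: 4
double_quantization: false
```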
</details>
### 数据准备
关于数据集文件的格式,请参考 [data/README_zh.md](data/README_zh.md) 的内容。你可以使用 HuggingFace / ModelScope / Modelers 上的数据集或加载本地数据集。
> [!NOTE]
> 使用自定义数据集时,请更新 `data/dataset_info.json` 文件。
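一个最小的 `dataset_info.json` 条目形如下例(数据集名与文件名均为假设值,各字段含义详见 [data/README_zh.md](data/README_zh.md)):

```json
{
  "my_dataset": {
    "file_name": "my_dataset.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
}
```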
您也可以使用 **[Easy Dataset](https://github.com/ConardLi/easy-dataset)**、**[DataFlow](https://github.com/OpenDCAI/DataFlow)** 和 **[GraphGen](https://github.com/open-sciencelab/GraphGen)** 构建用于微调的合成数据。
### 快速开始
下面三行命令分别对 Llama3-8B-Instruct 模型进行 LoRA **微调**、**推理**和**合并**。
```bash
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
```
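以微调所用的 `examples/train_lora/llama3_lora_sft.yaml` 为例,其核心字段大致如下(节选示意,并非完整文件,具体以仓库内文件为准):

```yaml
### LoRA SFT 配置的核心字段(节选示意)
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
dataset: identity,alpaca_en_demo
template: llama3
output_dir: saves/llama3-8b/lora/sft
```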
高级用法请参考 [examples/README_zh.md](examples/README_zh.md)(包括多 GPU 微调)。
> [!TIP]
> 使用 `llamafactory-cli help` 显示帮助信息。
>
> 遇到报错请先看[常见问题](https://github.com/hiyouga/LLaMA-Factory/issues/4614)。
### LLaMA Board 可视化微调(由 [Gradio](https://github.com/gradio-app/gradio) 驱动)
#### 使用本地环境
```bash
llamafactory-cli webui
```
### 构建 Docker
CUDA 用户:
```bash
cd docker/docker-cuda/
docker compose up -d
docker compose exec llamafactory bash
```
昇腾 NPU 用户:
```bash
cd docker/docker-npu/
docker compose up -d
docker compose exec llamafactory bash
```
AMD ROCm 用户:
```bash
cd docker/docker-rocm/
docker compose up -d
docker compose exec llamafactory bash
```
<details><summary>不使用 Docker Compose 构建</summary>
CUDA 用户:
```bash
docker build -f ./docker/docker-cuda/Dockerfile \
--build-arg PIP_INDEX=https://pypi.org/simple \
--build-arg EXTRAS=metrics \
-t llamafactory:latest .
docker run -dit --ipc=host --gpus=all \
-p 7860:7860 \
-p 8000:8000 \
--name llamafactory \
llamafactory:latest
docker exec -it llamafactory bash
```
昇腾 NPU 用户:
```bash
docker build -f ./docker/docker-npu/Dockerfile \
--build-arg PIP_INDEX=https://pypi.org/simple \
--build-arg EXTRAS=torch-npu,metrics \
-t llamafactory:latest .
docker run -dit --ipc=host \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-p 7860:7860 \
-p 8000:8000 \
--device /dev/davinci0 \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
--name llamafactory \
llamafactory:latest
docker exec -it llamafactory bash
```
AMD ROCm 用户:
```bash
docker build -f ./docker/docker-rocm/Dockerfile \
--build-arg PIP_INDEX=https://pypi.org/simple \
--build-arg EXTRAS=metrics \
-t llamafactory:latest .
docker run -dit --ipc=host \
-p 7860:7860 \
-p 8000:8000 \
--device /dev/kfd \
--device /dev/dri \
--name llamafactory \
llamafactory:latest
docker exec -it llamafactory bash
```
</details>
<details><summary>使用数据卷</summary>
您可以通过移除 Dockerfile 中 `VOLUME [ "/root/.cache/huggingface", "/app/shared_data", "/app/output" ]` 的注释来使用数据卷。
在启动容器(`docker run`)时使用参数 `-v ./hf_cache:/root/.cache/huggingface` 来挂载数据卷。各个数据卷的含义如下。
- `hf_cache`:使用宿主机的 Hugging Face 缓存文件夹。
- `shared_data`:宿主机中存放数据集的文件夹路径。
- `output`:将导出目录设置为该路径后,即可在宿主机中访问导出后的模型。
</details>
### 利用 vLLM 部署 OpenAI API
```bash
API_PORT=8000 llamafactory-cli api examples/inference/llama3.yaml infer_backend=vllm vllm_enforce_eager=true
```
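服务启动后即可按 OpenAI Chat Completions 的请求格式调用。下面的 Python 片段仅演示请求体的构造(`model` 名为占位值,实际请求需发送到 `http://localhost:8000/v1/chat/completions`):

```python
import json

def build_chat_request(prompt: str, model: str = "default") -> str:
    """构造 OpenAI 兼容 /v1/chat/completions 接口的请求体(model 为占位值)。"""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return json.dumps(payload, ensure_ascii=False)

print(build_chat_request("你好"))
```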
> [!TIP]
> API 文档请查阅[这里](https://platform.openai.com/docs/api-reference/chat/create)。
>
> 示例:[图像理解](scripts/api_example/test_image.py) | [工具调用](scripts/api_example/test_toolcall.py)
### 从魔搭社区下载
```bash
export USE_MODELSCOPE_HUB=1 # Windows 使用 `set USE_MODELSCOPE_HUB=1`
```
将 `model_name_or_path` 设置为模型 ID 来加载对应的模型。在[魔搭社区](https://modelscope.cn/models)查看所有可用的模型,例如 `LLM-Research/Meta-Llama-3-8B-Instruct`。
### 从魔乐社区下载
您也可以通过下述方法,使用魔乐社区下载数据集和模型。
```bash
export USE_OPENMIND_HUB=1 # Windows 使用 `set USE_OPENMIND_HUB=1`
```
将 `model_name_or_path` 设置为模型 ID 来加载对应的模型。在[魔乐社区](https://modelers.cn/models)查看所有可用的模型,例如 `TeleAI/TeleChat-7B-pt`。
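例如,加载上述模型的 yaml 片段可写作(示意;基座模型可使用 `default` 模板,对话模型请按上文模型表选择对应模板):

```yaml
### 从魔乐社区加载 TeleChat-7B-pt(示意)
model_name_or_path: TeleAI/TeleChat-7B-pt
template: default  # 基座模型示例;对话模型请改用对应模板
```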
### 使用 W&B 面板
若要使用 [Weights & Biases](https://wandb.ai) 记录实验数据,请在 yaml 文件中添加下面的参数。
```yaml
report_to: wandb
run_name: test_run # 可选
```
在启动训练任务时,将 `WANDB_API_KEY` 设置为[密钥](https://wandb.ai/authorize)来登录 W&B 账户。
### 使用 SwanLab 面板
若要使用 [SwanLab](https://github.com/SwanHubX/SwanLab) 记录实验数据,请在 yaml 文件中添加下面的参数。
```yaml
use_swanlab: true
swanlab_run_name: test_run # 可选
```
在启动训练任务时,登录 SwanLab 账户有以下三种方式:
方式一:在 yaml 文件中添加 `swanlab_api_key=<your_api_key>` ,并设置为你的 [API 密钥](https://swanlab.cn/settings)。
方式二:将环境变量 `SWANLAB_API_KEY` 设置为你的 [API 密钥](https://swanlab.cn/settings)。
方式三:启动前使用 `swanlab login` 命令完成登录。
## 使用了 LLaMA Factory 的项目
如果您有项目希望添加至下述列表,请通过邮件联系或者创建一个 PR。
1. Wang et al. UbiPhysio: Support Daily Functioning, Fitness, and Rehabilitation with Action Understanding and Feedback in Natural Language. 2023. [[arxiv]](https://arxiv.org/abs/2308.10526)
1. Luceri et al. Leveraging Large Language Models to Detect Influence Campaigns in Social Media. 2023. [[arxiv]](https://arxiv.org/abs/2311.07816)
1. Zhang et al. Alleviating Hallucinations of Large Language Models through Induced Hallucinations. 2023. [[arxiv]](https://arxiv.org/abs/2312.15710)
1. Wang et al. Know Your Needs Better: Towards Structured Understanding of Marketer Demands with Analogical Reasoning Augmented LLMs. KDD 2024. [[arxiv]](https://arxiv.org/abs/2401.04319)
1. Wang et al. CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning. ACL 2024. [[arxiv]](https://arxiv.org/abs/2401.07286)
1. Choi et al. FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs. 2024. [[arxiv]](https://arxiv.org/abs/2402.05904)
1. Zhang et al. AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts. 2024. [[arxiv]](https://arxiv.org/abs/2402.07625)
1. Lyu et al. KnowTuning: Knowledge-aware Fine-tuning for Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2402.11176)
1. Yang et al. LaCo: Large Language Model Pruning via Layer Collapse. 2024. [[arxiv]](https://arxiv.org/abs/2402.11187)
1. Bhardwaj et al. Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic. 2024. [[arxiv]](https://arxiv.org/abs/2402.11746)
1. Yang et al. Enhancing Empathetic Response Generation by Augmenting LLMs with Small-scale Empathetic Models. 2024. [[arxiv]](https://arxiv.org/abs/2402.11801)
1. Yi et al. Generation Meets Verification: Accelerating Large Language Model Inference with Smart Parallel Auto-Correct Decoding. ACL 2024 Findings. [[arxiv]](https://arxiv.org/abs/2402.11809)
1. Cao et al. Head-wise Shareable Attention for Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2402.11819)
1. Zhang et al. Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages. 2024. [[arxiv]](https://arxiv.org/abs/2402.12204)
1. Kim et al. Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2402.14714)
1. Yu et al. KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models. ACL 2024. [[arxiv]](https://arxiv.org/abs/2402.15043)
1. Huang et al. Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning. 2024. [[arxiv]](https://arxiv.org/abs/2403.02333)
1. Duan et al. Negating Negatives: Alignment without Human Positive Samples via Distributional Dispreference Optimization. 2024. [[arxiv]](https://arxiv.org/abs/2403.03419)
1. Xie and Schwertfeger. Empowering Robotics with Large Language Models: osmAG Map Comprehension with LLMs. 2024. [[arxiv]](https://arxiv.org/abs/2403.08228)
1. Wu et al. Large Language Models are Parallel Multilingual Learners. 2024. [[arxiv]](https://arxiv.org/abs/2403.09073)
1. Zhang et al. EDT: Improving Large Language Models' Generation by Entropy-based Dynamic Temperature Sampling. 2024. [[arxiv]](https://arxiv.org/abs/2403.14541)
1. Weller et al. FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions. 2024. [[arxiv]](https://arxiv.org/abs/2403.15246)
1. Hongbin Na. CBT-LLM: A Chinese Large Language Model for Cognitive Behavioral Therapy-based Mental Health Question Answering. COLING 2024. [[arxiv]](https://arxiv.org/abs/2403.16008)
1. Zan et al. CodeS: Natural Language to Code Repository via Multi-Layer Sketch. 2024. [[arxiv]](https://arxiv.org/abs/2403.16443)
1. Liu et al. Extensive Self-Contrast Enables Feedback-Free Language Model Alignment. 2024. [[arxiv]](https://arxiv.org/abs/2404.00604)
1. Luo et al. BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2404.02827)
1. Du et al. Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model. 2024. [[arxiv]](https://arxiv.org/abs/2404.04167)
1. Ma et al. Parameter Efficient Quasi-Orthogonal Fine-Tuning via Givens Rotation. ICML 2024. [[arxiv]](https://arxiv.org/abs/2404.04316)
1. Liu et al. Dynamic Generation of Personalities with Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2404.07084)
1. Shang et al. How Far Have We Gone in Stripped Binary Code Understanding Using Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2404.09836)
1. Huang et al. LLMTune: Accelerate Database Knob Tuning with Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2404.11581)
1. Deng et al. Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extraction. 2024. [[arxiv]](https://arxiv.org/abs/2404.14215)
1. Acikgoz et al. Hippocrates: An Open-Source Framework for Advancing Large Language Models in Healthcare. 2024. [[arxiv]](https://arxiv.org/abs/2404.16621)
1. Zhang et al. Small Language Models Need Strong Verifiers to Self-Correct Reasoning. ACL 2024 Findings. [[arxiv]](https://arxiv.org/abs/2404.17140)
1. Zhou et al. FREB-TQA: A Fine-Grained Robustness Evaluation Benchmark for Table Question Answering. NAACL 2024. [[arxiv]](https://arxiv.org/abs/2404.18585)
1. Xu et al. Large Language Models for Cyber Security: A Systematic Literature Review. 2024. [[arxiv]](https://arxiv.org/abs/2405.04760)
1. Dammu et al. "They are uncultured": Unveiling Covert Harms and Social Threats in LLM Generated Conversations. 2024. [[arxiv]](https://arxiv.org/abs/2405.05378)
1. Yi et al. A safety realignment framework via subspace-oriented model fusion for large language models. 2024. [[arxiv]](https://arxiv.org/abs/2405.09055)
1. Lou et al. SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling. 2024. [[arxiv]](https://arxiv.org/abs/2405.12739)
1. Zhang et al. Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners. 2024. [[arxiv]](https://arxiv.org/abs/2405.13816)
1. Zhang et al. TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models. 2024. [[arxiv]](https://arxiv.org/abs/2405.20215)
1. Zihong Chen. Sentence Segmentation and Sentence Punctuation Based on XunziALLM. 2024. [[paper]](https://aclanthology.org/2024.lt4hala-1.30)
1. Gao et al. The Best of Both Worlds: Toward an Honest and Helpful Large Language Model. 2024. [[arxiv]](https://arxiv.org/abs/2406.00380)
1. Wang and Song. MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset. 2024. [[arxiv]](https://arxiv.org/abs/2406.02106)
1. Hu et al. Computational Limits of Low-Rank Adaptation (LoRA) for Transformer-Based Models. 2024. [[arxiv]](https://arxiv.org/abs/2406.03136)
1. Ge et al. Time Sensitive Knowledge Editing through Efficient Finetuning. ACL 2024. [[arxiv]](https://arxiv.org/abs/2406.04496)
1. Tan et al. Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions. 2024. [[arxiv]](https://arxiv.org/abs/2406.05688)
1. Song et al. Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters. 2024. [[arxiv]](https://arxiv.org/abs/2406.05955)
1. Gu et al. RWKV-CLIP: A Robust Vision-Language Representation Learner. 2024. [[arxiv]](https://arxiv.org/abs/2406.06973)
1. Chen et al. Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees. 2024. [[arxiv]](https://arxiv.org/abs/2406.07115)
1. Zhu et al. Are Large Language Models Good Statisticians?. 2024. [[arxiv]](https://arxiv.org/abs/2406.07815)
1. Li et al. Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning. 2024. [[arxiv]](https://arxiv.org/abs/2406.10099)
1. Ding et al. IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce. 2024. [[arxiv]](https://arxiv.org/abs/2406.10173)
1. He et al. COMMUNITY-CROSS-INSTRUCT: Unsupervised Instruction Generation for Aligning Large Language Models to Online Communities. 2024. [[arxiv]](https://arxiv.org/abs/2406.12074)
1. Lin et al. FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving. 2024. [[arxiv]](https://arxiv.org/abs/2406.14408)
1. Treutlein et al. Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data. 2024. [[arxiv]](https://arxiv.org/abs/2406.14546)
1. Feng et al. SS-Bench: A Benchmark for Social Story Generation and Evaluation. 2024. [[arxiv]](https://arxiv.org/abs/2406.15695)
1. Feng et al. Self-Constructed Context Decompilation with Fined-grained Alignment Enhancement. 2024. [[arxiv]](https://arxiv.org/abs/2406.17233)
1. Liu et al. Large Language Models for Cuffless Blood Pressure Measurement From Wearable Biosignals. 2024. [[arxiv]](https://arxiv.org/abs/2406.18069)
1. Iyer et al. Exploring Very Low-Resource Translation with LLMs: The University of Edinburgh's Submission to AmericasNLP 2024 Translation Task. AmericasNLP 2024. [[paper]](https://aclanthology.org/2024.americasnlp-1.25)
1. Li et al. Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring. 2024. [[arxiv]](https://arxiv.org/abs/2406.19949)
1. Yang et al. Financial Knowledge Large Language Model. 2024. [[arxiv]](https://arxiv.org/abs/2407.00365)
1. Lin et al. DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging. 2024. [[arxiv]](https://arxiv.org/abs/2407.01470)
1. Bako et al. Evaluating the Semantic Profiling Abilities of LLMs for Natural Language Utterances in Data Visualization. 2024. [[arxiv]](https://arxiv.org/abs/2407.06129)
1. Huang et al. RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization. 2024. [[arxiv]](https://arxiv.org/abs/2407.08044)
1. Jiang et al. LLM-Collaboration on Automatic Science Journalism for the General Audience. 2024. [[arxiv]](https://arxiv.org/abs/2407.09756)
1. Inouye et al. Applied Auto-tuning on LoRA Hyperparameters. 2024. [[paper]](https://scholarcommons.scu.edu/cseng_senior/272/)
1. Qi et al. Research on Tibetan Tourism Viewpoints information generation system based on LLM. 2024. [[arxiv]](https://arxiv.org/abs/2407.13561)
1. Xu et al. Course-Correction: Safety Alignment Using Synthetic Preferences. 2024. [[arxiv]](https://arxiv.org/abs/2407.16637)
1. Sun et al. LAMBDA: A Large Model Based Data Agent. 2024. [[arxiv]](https://arxiv.org/abs/2407.17535)
1. Zhu et al. CollectiveSFT: Scaling Large Language Models for Chinese Medical Benchmark with Collective Instructions in Healthcare. 2024. [[arxiv]](https://arxiv.org/abs/2407.19705)
1. Yu et al. Correcting Negative Bias in Large Language Models through Negative Attention Score Alignment. 2024. [[arxiv]](https://arxiv.org/abs/2408.00137)
1. Xie et al. The Power of Personalized Datasets: Advancing Chinese Composition Writing for Elementary School through Targeted Model Fine-Tuning. IALP 2024. [[paper]](https://www.asianlp.sg/conferences/ialp2024/proceedings/papers/IALP2024_P055.pdf)
1. Liu et al. Instruct-Code-Llama: Improving Capabilities of Language Model in Competition Level Code Generation by Online Judge Feedback. ICIC 2024. [[paper]](https://link.springer.com/chapter/10.1007/978-981-97-5669-8_11)
1. Wang et al. Cybernetic Sentinels: Unveiling the Impact of Safety Data Selection on Model Security in Supervised Fine-Tuning. ICIC 2024. [[paper]](https://link.springer.com/chapter/10.1007/978-981-97-5669-8_23)
1. Xia et al. Understanding the Performance and Estimating the Cost of LLM Fine-Tuning. 2024. [[arxiv]](https://arxiv.org/abs/2408.04693)
1. Zeng et al. Perceive, Reflect, and Plan: Designing LLM Agent for Goal-Directed City Navigation without Instructions. 2024. [[arxiv]](https://arxiv.org/abs/2408.04168)
1. Xia et al. Using Pre-trained Language Model for Accurate ESG Prediction. FinNLP 2024. [[paper]](https://aclanthology.org/2024.finnlp-2.1/)
1. Liang et al. I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm. 2024. [[arxiv]](https://arxiv.org/abs/2408.08072)
1. Bai et al. Aligning Large Language Model with Direct Multi-Preference Optimization for Recommendation. CIKM 2024. [[paper]](https://dl.acm.org/doi/10.1145/3627673.3679611)
1. **[StarWhisper](https://github.com/Yu-Yang-Li/StarWhisper)**: 天文大模型 StarWhisper基于 ChatGLM2-6B 和 Qwen-14B 在天文数据上微调而得。
1. **[DISC-LawLLM](https://github.com/FudanDISC/DISC-LawLLM)**: 中文法律领域大模型 DISC-LawLLM基于 Baichuan-13B 微调而得,具有法律推理和知识检索能力。
1. **[Sunsimiao](https://github.com/X-D-Lab/Sunsimiao)**: 孙思邈中文医疗大模型 Sunsimiao,基于 Baichuan-7B 和 ChatGLM-6B 在中文医疗数据上微调而得。
1. **[CareGPT](https://github.com/WangRongsheng/CareGPT)**: 医疗大模型项目 CareGPT基于 LLaMA2-7B 和 Baichuan-13B 在中文医疗数据上微调而得。
1. **[MachineMindset](https://github.com/PKU-YuanGroup/Machine-Mindset/)**MBTI性格大模型项目根据数据集与训练方式让任意 LLM 拥有 16 个不同的性格类型。
1. **[Luminia-13B-v3](https://huggingface.co/Nekochu/Luminia-13B-v3)**:一个用于生成 Stable Diffusion 提示词的大型语言模型。[[demo]](https://huggingface.co/spaces/Nekochu/Luminia-13B_SD_Prompt)
1. **[Chinese-LLaVA-Med](https://github.com/BUAADreamer/Chinese-LLaVA-Med)**:中文多模态医学大模型,基于 LLaVA-1.5-7B 在中文多模态医疗数据上微调而得。
1. **[AutoRE](https://github.com/THUDM/AutoRE)**:基于大语言模型的文档级关系抽取系统。
1. **[NVIDIA RTX AI Toolkit](https://github.com/NVIDIA/RTX-AI-Toolkit)**:在 Windows 主机上利用英伟达 RTX 设备进行大型语言模型微调的开发包。
1. **[LazyLLM](https://github.com/LazyAGI/LazyLLM)**:一个低代码构建多 Agent 大模型应用的开发工具,支持基于 LLaMA Factory 的模型微调。
1. **[RAG-Retrieval](https://github.com/NLPJCL/RAG-Retrieval)**:一个全链路 RAG 检索模型微调、推理和蒸馏代码库。[[blog]](https://zhuanlan.zhihu.com/p/987727357)
1. **[360-LLaMA-Factory](https://github.com/Qihoo360/360-LLaMA-Factory)**:一个魔改后的代码库,通过 Ring Attention 支持长序列的 SFT 和 DPO 训练。
1. **[Sky-T1](https://novasky-ai.github.io/posts/sky-t1/)**:由 NovaSky AI 微调的低成本类 o1 长推理模型。
1. **[WeClone](https://github.com/xming521/WeClone)**:从聊天记录创造数字分身的一站式解决方案。
</details>
本仓库的代码依照 [Apache-2.0](LICENSE) 协议开源。
使用模型权重时,请遵循对应的模型协议:[Baichuan 2](https://huggingface.co/baichuan-inc/Baichuan2-7B-Base/blob/main/Community%20License%20for%20Baichuan%202%20Model.pdf) / [BLOOM](https://huggingface.co/spaces/bigscience/license) / [ChatGLM3](https://github.com/THUDM/ChatGLM3/blob/main/MODEL_LICENSE) / [Command R](https://cohere.com/c4ai-cc-by-nc-license) / [DeepSeek](https://github.com/deepseek-ai/DeepSeek-LLM/blob/main/LICENSE-MODEL) / [Falcon](https://huggingface.co/tiiuae/falcon-180B/blob/main/LICENSE.txt) / [Gemma](https://ai.google.dev/gemma/terms) / [GLM-4](https://huggingface.co/THUDM/glm-4-9b/blob/main/LICENSE) / [GPT-2](https://github.com/openai/gpt-2/blob/master/LICENSE) / [Granite](LICENSE) / [Index](https://huggingface.co/IndexTeam/Index-1.9B/blob/main/LICENSE) / [InternLM](https://github.com/InternLM/InternLM#license) / [Llama](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md) / [Llama 2](https://ai.meta.com/llama/license/) / [Llama 3](https://llama.meta.com/llama3/license/) / [Llama 4](https://github.com/meta-llama/llama-models/blob/main/models/llama4/LICENSE) / [MiniCPM](https://github.com/OpenBMB/MiniCPM/blob/main/MiniCPM%20Model%20License.md) / [Mistral/Mixtral/Pixtral](LICENSE) / [OLMo](LICENSE) / [Phi-1.5/Phi-2](https://huggingface.co/microsoft/phi-1_5/resolve/main/Research%20License.docx) / [Phi-3/Phi-4](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/blob/main/LICENSE) / [Qwen](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT) / [Skywork](https://huggingface.co/Skywork/Skywork-13B-base/blob/main/Skywork%20Community%20License.pdf) / [StarCoder 2](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement) / [TeleChat2](https://huggingface.co/Tele-AI/telechat-7B/blob/main/TeleChat%E6%A8%A1%E5%9E%8B%E7%A4%BE%E5%8C%BA%E8%AE%B8%E5%8F%AF%E5%8D%8F%E8%AE%AE.pdf) / [XVERSE](https://github.com/xverse-ai/XVERSE-13B/blob/main/MODEL_LICENSE.pdf) / [Yi](https://huggingface.co/01-ai/Yi-6B/blob/main/LICENSE) / 
[Yi-1.5](LICENSE) / [Yuan 2](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/LICENSE-Yuan)
## 引用
如果您觉得此项目有帮助,请考虑以下列格式引用
```bibtex
@article{zheng2024llamafactory,
title={LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models},
author={Yaowei Zheng and Richong Zhang and Junhao Zhang and Yanhan Ye and Zheyan Luo and Yongqiang Ma},
journal={arXiv preprint arXiv:2403.13372},
@inproceedings{zheng2024llamafactory,
title={LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models},
author={Yaowei Zheng and Richong Zhang and Junhao Zhang and Yanhan Ye and Zheyan Luo and Zhangchi Feng and Yongqiang Ma},
booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)},
address={Bangkok, Thailand},
publisher={Association for Computational Linguistics},
year={2024},
url={http://arxiv.org/abs/2403.13372}
}

View File

@ -1,18 +1,22 @@
The [dataset_info.json](dataset_info.json) contains all available datasets. If you are using a custom dataset, please **make sure** to add a *dataset description* in `dataset_info.json` and specify `dataset: dataset_name` before training to use it.
Currently we support datasets in **alpaca** and **sharegpt** format.
The `dataset_info.json` file should be put in the `dataset_dir` directory. You can change `dataset_dir` to use another directory. The default value is `./data`.
Currently we support datasets in **alpaca** and **sharegpt** format. Allowed file types include json, jsonl, csv, parquet, arrow.
```json
"dataset_name": {
"hf_hub_url": "the name of the dataset repository on the Hugging Face hub. (if specified, ignore script_url and file_name)",
"ms_hub_url": "the name of the dataset repository on the Model Scope hub. (if specified, ignore script_url and file_name)",
"script_url": "the name of the directory containing a dataset loading script. (if specified, ignore file_name)",
"hf_hub_url": "the name of the dataset repository on the Hugging Face hub. (if specified, ignore script_url, file_name and cloud_file_name)",
"ms_hub_url": "the name of the dataset repository on the Model Scope hub. (if specified, ignore script_url, file_name and cloud_file_name)",
"script_url": "the name of the directory containing a dataset loading script. (if specified, ignore file_name and cloud_file_name)",
"cloud_file_name": "the name of the dataset file in s3/gcs cloud storage. (if specified, ignore file_name)",
"file_name": "the name of the dataset folder or dataset file in this directory. (required if above are not specified)",
"formatting": "the format of the dataset. (optional, default: alpaca, can be chosen from {alpaca, sharegpt})",
"ranking": "whether the dataset is a preference dataset or not. (default: False)",
"subset": "the name of the subset. (optional, default: None)",
"split": "the name of dataset split to be used. (optional, default: train)",
"folder": "the name of the folder of the dataset repository on the Hugging Face hub. (optional, default: None)",
"num_samples": "the number of samples in the dataset used for training. (optional, default: None)",
"num_samples": "the number of samples in the dataset to be used. (optional, default: None)",
"columns (optional)": {
"prompt": "the column name in the dataset containing the prompts. (default: instruction)",
"query": "the column name in the dataset containing the queries. (default: input)",
@ -22,6 +26,8 @@ Currently we support datasets in **alpaca** and **sharegpt** format.
"system": "the column name in the dataset containing the system prompts. (default: None)",
"tools": "the column name in the dataset containing the tool description. (default: None)",
"images": "the column name in the dataset containing the image inputs. (default: None)",
"videos": "the column name in the dataset containing the videos inputs. (default: None)",
"audios": "the column name in the dataset containing the audios inputs. (default: None)",
"chosen": "the column name in the dataset containing the chosen answers. (default: None)",
"rejected": "the column name in the dataset containing the rejected answers. (default: None)",
"kto_tag": "the column name in the dataset containing the kto tags. (default: None)"
@ -44,7 +50,9 @@ Currently we support datasets in **alpaca** and **sharegpt** format.
* [Example dataset](alpaca_en_demo.json)
In supervised fine-tuning, the `instruction` column will be concatenated with the `input` column and used as the human prompt, then the human prompt would be `instruction\ninput`. The `output` column represents the model response.
In supervised fine-tuning, the `instruction` column will be concatenated with the `input` column and used as the user prompt, i.e., the user prompt is `instruction\ninput`. The `output` column represents the model response.
For reasoning models, if the dataset contains chain-of-thought (CoT), the CoT needs to be placed in the model responses, such as `<think>cot</think>output`.
The `system` column will be used as the system prompt if specified.
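The concatenation rule above can be sketched in a few lines. Note that `build_user_prompt` is a hypothetical helper for illustration, not part of LLaMA-Factory's API:

```python
def build_user_prompt(example: dict) -> str:
    """Join the alpaca-style `instruction` and `input` columns into one user prompt."""
    instruction = example["instruction"]
    query = example.get("input", "")
    # With a non-empty input, the user prompt becomes `instruction\ninput`.
    return f"{instruction}\n{query}" if query else instruction


print(build_user_prompt({"instruction": "Summarize the text.", "input": "LLaMA-Factory unifies fine-tuning."}))
```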
@ -53,13 +61,13 @@ The `history` column is a list consisting of string tuples representing prompt-r
```json
[
{
"instruction": "human instruction (required)",
"input": "human input (optional)",
"instruction": "user instruction (required)",
"input": "user input (optional)",
"output": "model response (required)",
"system": "system prompt (optional)",
"history": [
["human instruction in the first round (optional)", "model response in the first round (optional)"],
["human instruction in the second round (optional)", "model response in the second round (optional)"]
["user instruction in the first round (optional)", "model response in the first round (optional)"],
["user instruction in the second round (optional)", "model response in the second round (optional)"]
]
}
]
@ -80,9 +88,14 @@ Regarding the above dataset, the *dataset description* in `dataset_info.json` sh
}
```
> [!TIP]
> If the model has reasoning capabilities (e.g., Qwen3) but the dataset does not contain chain-of-thought (CoT), LLaMA-Factory will automatically add an empty CoT to the data. When `enable_thinking` is `True` (slow thinking, the default), the empty CoT is added to the model response and included in the loss computation; otherwise (fast thinking), it is added to the user prompt and excluded from the loss computation. Please keep the `enable_thinking` parameter consistent between training and inference.
>
> If you want to train data containing CoT with slow thinking and data without CoT with fast thinking, you can set `enable_thinking` to `None`. However, this feature is relatively complicated and should be used with caution.
### Pre-training Dataset
- [Example dataset](c4_demo.json)
- [Example dataset](c4_demo.jsonl)
In pre-training, only the `text` column will be used for model learning.
@ -106,15 +119,15 @@ Regarding the above dataset, the *dataset description* in `dataset_info.json` sh
### Preference Dataset
Preference datasets are used for reward modeling, DPO training and ORPO training.
Preference datasets are used for reward modeling, DPO training, ORPO training, and SimPO training.
It requires a better response in `chosen` column and a worse response in `rejected` column.
```json
[
{
"instruction": "human instruction (required)",
"input": "human input (optional)",
"instruction": "user instruction (required)",
"input": "user input (optional)",
"chosen": "chosen answer (required)",
"rejected": "rejected answer (required)"
}
@ -138,67 +151,19 @@ Regarding the above dataset, the *dataset description* in `dataset_info.json` sh
### KTO Dataset
- [Example dataset](kto_en_demo.json)
An additional column `kto_tag` is required. Please refer to the [sharegpt](#sharegpt-format) format for details.
KTO datasets require an extra `kto_tag` column containing boolean human feedback.
### Multimodal Image Dataset
```json
[
{
"instruction": "human instruction (required)",
"input": "human input (optional)",
"output": "model response (required)",
"kto_tag": "human feedback [true/false] (required)"
}
]
```
An additional column `images` is required. Please refer to the [sharegpt](#sharegpt-format) format for details.
Regarding the above dataset, the *dataset description* in `dataset_info.json` should be:
### Multimodal Video Dataset
```json
"dataset_name": {
"file_name": "data.json",
"columns": {
"prompt": "instruction",
"query": "input",
"response": "output",
"kto_tag": "kto_tag"
}
}
```
An additional column `videos` is required. Please refer to the [sharegpt](#sharegpt-format) format for details.
### Multimodal Dataset
### Multimodal Audio Dataset
- [Example dataset](mllm_demo.json)
Multimodal datasets require a `images` column containing the paths to the input images. Currently we only support one image.
```json
[
{
"instruction": "human instruction (required)",
"input": "human input (optional)",
"output": "model response (required)",
"images": [
"image path (required)"
]
}
]
```
Regarding the above dataset, the *dataset description* in `dataset_info.json` should be:
```json
"dataset_name": {
"file_name": "data.json",
"columns": {
"prompt": "instruction",
"query": "input",
"response": "output",
"images": "images"
}
}
```
An additional column `audios` is required. Please refer to the [sharegpt](#sharegpt-format) format for details.
## Sharegpt Format
@ -208,7 +173,7 @@ Regarding the above dataset, the *dataset description* in `dataset_info.json` sh
Compared to the alpaca format, the sharegpt format allows the datasets to have **more roles**, such as human, gpt, observation and function. They are presented as a list of objects in the `conversations` column.
Note that the human and observation should appear in odd positions, while gpt and function should appear in even positions.
Note that the human and observation turns should appear at odd positions, while the gpt and function turns should appear at even positions. The gpt and function turns will be learned by the model.
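The position rule above can be checked with a short sketch. `roles_are_ordered` is a hypothetical validator, and the exact role-name set is an assumption based on the examples in this document:

```python
def roles_are_ordered(conversations: list[dict]) -> bool:
    """Check that human/observation turns sit at odd 1-based positions and
    gpt/function_call turns at even positions, as the sharegpt format requires."""
    user_roles = {"human", "observation"}        # assumed user-side role names
    model_roles = {"gpt", "function_call"}       # assumed model-side role names
    for position, turn in enumerate(conversations, start=1):
        allowed = user_roles if position % 2 == 1 else model_roles
        if turn["from"] not in allowed:
            return False
    return True
```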
```json
[
@ -216,7 +181,7 @@ Note that the human and observation should appear in odd positions, while gpt an
"conversations": [
{
"from": "human",
"value": "human instruction"
"value": "user instruction"
},
{
"from": "function_call",
@ -251,6 +216,10 @@ Regarding the above dataset, the *dataset description* in `dataset_info.json` sh
}
```
### Pre-training Dataset
Not yet supported, please use the [alpaca](#alpaca-format) format.
### Preference Dataset
- [Example dataset](dpo_en_demo.json)
@ -263,7 +232,7 @@ Preference datasets in sharegpt format also require a better message in `chosen`
"conversations": [
{
"from": "human",
"value": "human instruction"
"value": "user instruction"
},
{
"from": "gpt",
@ -271,7 +240,7 @@ Preference datasets in sharegpt format also require a better message in `chosen`
},
{
"from": "human",
"value": "human instruction"
"value": "user instruction"
}
],
"chosen": {
@ -301,6 +270,166 @@ Regarding the above dataset, the *dataset description* in `dataset_info.json` sh
}
```
### KTO Dataset
- [Example dataset](kto_en_demo.json)
KTO datasets require an extra `kto_tag` column containing boolean human feedback.
```json
[
{
"conversations": [
{
"from": "human",
"value": "user instruction"
},
{
"from": "gpt",
"value": "model response"
}
],
"kto_tag": "human feedback [true/false] (required)"
}
]
```
Regarding the above dataset, the *dataset description* in `dataset_info.json` should be:
```json
"dataset_name": {
"file_name": "data.json",
"formatting": "sharegpt",
"columns": {
"messages": "conversations",
"kto_tag": "kto_tag"
}
}
```
### Multimodal Image Dataset
- [Example dataset](mllm_demo.json)
Multimodal image datasets require an `images` column containing the paths to the input images.
The number of images should be identical to the number of `<image>` tokens in the conversations.
```json
[
{
"conversations": [
{
"from": "human",
"value": "<image>user instruction"
},
{
"from": "gpt",
"value": "model response"
}
],
"images": [
"image path (required)"
]
}
]
```
Regarding the above dataset, the *dataset description* in `dataset_info.json` should be:
```json
"dataset_name": {
"file_name": "data.json",
"formatting": "sharegpt",
"columns": {
"messages": "conversations",
"images": "images"
}
}
```
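The count-matching rule can be verified before training with a small sketch; `image_tokens_match` is a hypothetical helper, not a LLaMA-Factory function:

```python
def image_tokens_match(example: dict) -> bool:
    """Count <image> tokens across every turn and compare with the length of the images column."""
    token_count = sum(turn["value"].count("<image>") for turn in example["conversations"])
    return token_count == len(example.get("images", []))
```

The same check applies to the `<video>`/`videos` and `<audio>`/`audios` pairs by swapping the token and column names.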
### Multimodal Video Dataset
- [Example dataset](mllm_video_demo.json)
Multimodal video datasets require a `videos` column containing the paths to the input videos.
The number of videos should be identical to the number of `<video>` tokens in the conversations.
```json
[
{
"conversations": [
{
"from": "human",
"value": "<video>user instruction"
},
{
"from": "gpt",
"value": "model response"
}
],
"videos": [
"video path (required)"
]
}
]
```
Regarding the above dataset, the *dataset description* in `dataset_info.json` should be:
```json
"dataset_name": {
"file_name": "data.json",
"formatting": "sharegpt",
"columns": {
"messages": "conversations",
"videos": "videos"
}
}
```
### Multimodal Audio Dataset
- [Example dataset](mllm_audio_demo.json)
Multimodal audio datasets require an `audios` column containing the paths to the input audios.
The number of audios should be identical to the number of `<audio>` tokens in the conversations.
```json
[
{
"conversations": [
{
"from": "human",
"value": "<audio>user instruction"
},
{
"from": "gpt",
"value": "model response"
}
],
"audios": [
"audio path (required)"
]
}
]
```
Regarding the above dataset, the *dataset description* in `dataset_info.json` should be:
```json
"dataset_name": {
"file_name": "data.json",
"formatting": "sharegpt",
"columns": {
"messages": "conversations",
"audios": "audios"
}
}
```
### OpenAI Format
The openai format is simply a special case of the sharegpt format, where the first message may be a system prompt.
@ -315,7 +444,7 @@ The openai format is simply a special case of the sharegpt format, where the fir
},
{
"role": "user",
"content": "human instruction"
"content": "user instruction"
},
{
"role": "assistant",
@ -344,7 +473,3 @@ Regarding the above dataset, the *dataset description* in `dataset_info.json` sh
}
}
```
The KTO datasets and multimodal datasets in sharegpt format are similar to the alpaca format.
Pre-training datasets are **incompatible** with the sharegpt format.
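Since the openai format only renames keys and roles relative to sharegpt, a conversion can be sketched as below; `sharegpt_to_openai` and its role map are illustrative assumptions, not part of the library:

```python
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}


def sharegpt_to_openai(conversations: list[dict]) -> list[dict]:
    """Rename sharegpt keys (from/value) to openai keys (role/content)."""
    return [{"role": ROLE_MAP[turn["from"]], "content": turn["value"]} for turn in conversations]
```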

View File

@ -1,6 +1,8 @@
[dataset_info.json](dataset_info.json) 包含了所有可用的数据集。如果您希望使用自定义数据集,请**务必**在 `dataset_info.json` 文件中添加*数据集描述*,并通过修改 `dataset: 数据集名称` 配置来使用数据集。
目前我们支持 **alpaca** 格式和 **sharegpt** 格式的数据集。
其中 `dataset_info.json` 文件应放置在 `dataset_dir` 目录下。您可以通过修改 `dataset_dir` 参数来使用其他目录。默认值为 `./data`
目前我们支持 **alpaca** 格式和 **sharegpt** 格式的数据集。允许的文件类型包括 json、jsonl、csv、parquet 和 arrow。
```json
"数据集名称": {
@ -11,8 +13,9 @@
"formatting": "数据集格式可选默认alpaca可以为 alpaca 或 sharegpt",
"ranking": "是否为偏好数据集可选默认False",
"subset": "数据集子集的名称可选默认None",
"split": "所使用的数据集切分可选默认train",
"folder": "Hugging Face 仓库的文件夹名称可选默认None",
"num_samples": "该数据集中用于训练的样本数量。可选默认None",
"num_samples": "该数据集所使用的样本数量。可选默认None",
"columns可选": {
"prompt": "数据集代表提示词的表头名称默认instruction",
"query": "数据集代表请求的表头名称默认input",
@ -22,6 +25,8 @@
"system": "数据集代表系统提示的表头名称默认None",
"tools": "数据集代表工具描述的表头名称默认None",
"images": "数据集代表图像输入的表头名称默认None",
"videos": "数据集代表视频输入的表头名称默认None",
"audios": "数据集代表音频输入的表头名称默认None",
"chosen": "数据集代表更优回答的表头名称默认None",
"rejected": "数据集代表更差回答的表头名称默认None",
"kto_tag": "数据集代表 KTO 标签的表头名称默认None"
@ -44,7 +49,9 @@
- [样例数据集](alpaca_zh_demo.json)
在指令监督微调时,`instruction` 列对应的内容会与 `input` 列对应的内容拼接后作为人类指令,即人类指令为 `instruction\ninput`。而 `output` 列对应的内容为模型回答。
在指令监督微调时,`instruction` 列对应的内容会与 `input` 列对应的内容拼接后作为提示词,即提示词为 `instruction\ninput`。而 `output` 列对应的内容为模型回答。
对于推理类模型的微调,如果数据集包含思维链,则需要把思维链放在模型回答中,例如 `<think>cot</think>output`
如果指定,`system` 列对应的内容将被作为系统提示词。
@ -53,8 +60,8 @@
```json
[
{
"instruction": "人类指令(必填)",
"input": "人类输入(选填)",
"instruction": "用户指令(必填)",
"input": "用户输入(选填)",
"output": "模型回答(必填)",
"system": "系统提示词(选填)",
"history": [
@ -80,9 +87,14 @@
}
```
> [!TIP]
> 如果模型本身具备推理能力(如 Qwen3而数据集不包含思维链LLaMA-Factory 会自动为数据添加空思维链。当 `enable_thinking``True` 时(慢思考,默认),空思维链会添加到模型回答中并且计算损失,否则会添加到用户指令中并且不计算损失(快思考)。请在训练和推理时保持 `enable_thinking` 参数一致。
>
> 如果您希望训练包含思维链的数据时使用慢思考,训练不包含思维链的数据时使用快思考,可以设置 `enable_thinking``None`。但该功能较为复杂,请谨慎使用。
### 预训练数据集
- [样例数据集](c4_demo.json)
- [样例数据集](c4_demo.jsonl)
在预训练时,只有 `text` 列中的内容会用于模型学习。
@ -106,15 +118,15 @@
### 偏好数据集
偏好数据集用于奖励模型训练、DPO 训练和 ORPO 训练。
偏好数据集用于奖励模型训练、DPO 训练、ORPO 训练和 SimPO 训练。
它需要在 `chosen` 列中提供更优的回答,并在 `rejected` 列中提供更差的回答。
```json
[
{
"instruction": "人类指令(必填)",
"input": "人类输入(选填)",
"instruction": "用户指令(必填)",
"input": "用户输入(选填)",
"chosen": "优质回答(必填)",
"rejected": "劣质回答(必填)"
}
@ -138,67 +150,19 @@
### KTO 数据集
- [样例数据集](kto_en_demo.json)
KTO 数据集需要提供额外的 `kto_tag` 列。详情请参阅 [sharegpt](#sharegpt-格式)。
KTO 数据集需要额外添加一个 `kto_tag` 列,包含 bool 类型的人类反馈。
### 多模态图像数据集
```json
[
{
"instruction": "人类指令(必填)",
"input": "人类输入(选填)",
"output": "模型回答(必填)",
"kto_tag": "人类反馈 [true/false](必填)"
}
]
```
多模态图像数据集需要提供额外的 `images` 列。详情请参阅 [sharegpt](#sharegpt-格式)。
对于上述格式的数据,`dataset_info.json` 中的*数据集描述*应为:
### 多模态视频数据集
```json
"数据集名称": {
"file_name": "data.json",
"columns": {
"prompt": "instruction",
"query": "input",
"response": "output",
"kto_tag": "kto_tag"
}
}
```
多模态视频数据集需要提供额外的 `videos` 列。详情请参阅 [sharegpt](#sharegpt-格式)。
### 多模态数据集
### 多模态音频数据集
- [样例数据集](mllm_demo.json)
多模态数据集需要额外添加一个 `images` 列,包含输入图像的路径。目前我们仅支持单张图像输入。
```json
[
{
"instruction": "人类指令(必填)",
"input": "人类输入(选填)",
"output": "模型回答(必填)",
"images": [
"图像路径(必填)"
]
}
]
```
对于上述格式的数据,`dataset_info.json` 中的*数据集描述*应为:
```json
"数据集名称": {
"file_name": "data.json",
"columns": {
"prompt": "instruction",
"query": "input",
"response": "output",
"images": "images"
}
}
```
多模态音频数据集需要提供额外的 `audios` 列。详情请参阅 [sharegpt](#sharegpt-格式)。
## Sharegpt 格式
@ -208,7 +172,7 @@ KTO 数据集需要额外添加一个 `kto_tag` 列,包含 bool 类型的人
相比 alpaca 格式的数据集sharegpt 格式支持**更多的角色种类**,例如 human、gpt、observation、function 等等。它们构成一个对象列表呈现在 `conversations` 列中。
注意其中 human 和 observation 必须出现在奇数位置gpt 和 function 必须出现在偶数位置。
注意其中 human 和 observation 必须出现在奇数位置gpt 和 function 必须出现在偶数位置。默认所有的 gpt 和 function 会被用于学习。
```json
[
@ -216,7 +180,7 @@ KTO 数据集需要额外添加一个 `kto_tag` 列,包含 bool 类型的人
"conversations": [
{
"from": "human",
"value": "人类指令"
"value": "用户指令"
},
{
"from": "function_call",
@ -251,6 +215,10 @@ KTO 数据集需要额外添加一个 `kto_tag` 列,包含 bool 类型的人
}
```
### 预训练数据集
尚不支持,请使用 [alpaca](#alpaca-格式) 格式。
### 偏好数据集
- [样例数据集](dpo_zh_demo.json)
@ -263,7 +231,7 @@ Sharegpt 格式的偏好数据集同样需要在 `chosen` 列中提供更优的
"conversations": [
{
"from": "human",
"value": "人类指令"
"value": "用户指令"
},
{
"from": "gpt",
@ -271,7 +239,7 @@ Sharegpt 格式的偏好数据集同样需要在 `chosen` 列中提供更优的
},
{
"from": "human",
"value": "人类指令"
"value": "用户指令"
}
],
"chosen": {
@ -301,6 +269,170 @@ Sharegpt 格式的偏好数据集同样需要在 `chosen` 列中提供更优的
}
```
### KTO 数据集
- [样例数据集](kto_en_demo.json)
KTO 数据集需要额外添加一个 `kto_tag` 列,包含 bool 类型的人类反馈。
```json
[
{
"conversations": [
{
"from": "human",
"value": "用户指令"
},
{
"from": "gpt",
"value": "模型回答"
}
],
"kto_tag": "人类反馈 [true/false](必填)"
}
]
```
对于上述格式的数据,`dataset_info.json` 中的*数据集描述*应为:
```json
"数据集名称": {
"file_name": "data.json",
"formatting": "sharegpt",
"columns": {
"messages": "conversations",
"kto_tag": "kto_tag"
}
}
```
### 多模态图像数据集
- [样例数据集](mllm_demo.json)
多模态图像数据集需要额外添加一个 `images` 列,包含输入图像的路径。
注意图片的数量必须与文本中所有 `<image>` 标记的数量严格一致。
```json
[
{
"conversations": [
{
"from": "human",
"value": "<image><image>用户指令"
},
{
"from": "gpt",
"value": "模型回答"
}
],
"images": [
"图像路径(必填)",
"图像路径(必填)"
]
}
]
```
对于上述格式的数据,`dataset_info.json` 中的*数据集描述*应为:
```json
"数据集名称": {
"file_name": "data.json",
"formatting": "sharegpt",
"columns": {
"messages": "conversations",
"images": "images"
}
}
```
### 多模态视频数据集
- [样例数据集](mllm_video_demo.json)
多模态视频数据集需要额外添加一个 `videos` 列,包含输入视频的路径。
注意视频的数量必须与文本中所有 `<video>` 标记的数量严格一致。
```json
[
{
"conversations": [
{
"from": "human",
"value": "<video><video>用户指令"
},
{
"from": "gpt",
"value": "模型回答"
}
],
"videos": [
"视频路径(必填)",
"视频路径(必填)"
]
}
]
```
对于上述格式的数据,`dataset_info.json` 中的*数据集描述*应为:
```json
"数据集名称": {
"file_name": "data.json",
"formatting": "sharegpt",
"columns": {
"messages": "conversations",
"videos": "videos"
}
}
```
### 多模态音频数据集
- [样例数据集](mllm_audio_demo.json)
多模态音频数据集需要额外添加一个 `audios` 列,包含输入音频的路径。
注意音频的数量必须与文本中所有 `<audio>` 标记的数量严格一致。
```json
[
{
"conversations": [
{
"from": "human",
"value": "<audio><audio>用户指令"
},
{
"from": "gpt",
"value": "模型回答"
}
],
"audios": [
"音频路径(必填)",
"音频路径(必填)"
]
}
]
```
对于上述格式的数据,`dataset_info.json` 中的*数据集描述*应为:
```json
"数据集名称": {
"file_name": "data.json",
"formatting": "sharegpt",
"columns": {
"messages": "conversations",
"audios": "audios"
}
}
```
### OpenAI 格式
OpenAI 格式仅仅是 sharegpt 格式的一种特殊情况,其中第一条消息可能是系统提示词。
@ -315,7 +447,7 @@ OpenAI 格式仅仅是 sharegpt 格式的一种特殊情况,其中第一条消
},
{
"role": "user",
"content": "人类指令"
"content": "用户指令"
},
{
"role": "assistant",
@ -344,7 +476,3 @@ OpenAI 格式仅仅是 sharegpt 格式的一种特殊情况,其中第一条消
}
}
```
Sharegpt 格式中的 KTO 数据集和多模态数据集与 alpaca 格式的类似。
预训练数据集**不支持** sharegpt 格式。

View File

@ -4154,11 +4154,6 @@
"input": "The beauty of friendship",
"output": "In life we all embark on journeys grand,\nAnd wander through the trials of our days;\nAt times we'll come across a barren land,\nWhere hardships test and challenge in new ways.\n\nBut through the toil, we're never all alone,\nFor friendship shines as brilliant as the sun;\nWith bonds that strengthen, deepen and have grown,\nA beauty that cannot be rivaled, none.\n\nA friend who's there with hands to lift us high,\nThrough laughter and the joy, through tears and pain,\nA beacon when the road ahead's unclear,\nA treasure more than riches one could gain.\n\nOh friendship, pure and ever shining bright,\nA beauty that brings everlasting light."
},
{
"instruction": "The news and other sources have been claiming that Democracy is losing ground around the world. Give an example of a recent event to back this claim up.",
"input": "",
"output": "One recent example that supports this claim is the events in Hong Kong. In 2019 and 2020, the Chinese government imposed a national security law to crack down on pro-democracy protests in Hong Kong, effectively curtailing the freedoms and autonomy that the region once enjoyed. This move by Beijing to assert greater control over Hong Kong has led to concerns by human rights advocates and international leaders that democracy is being undermined in the territory. There is fear that Hong Kongs outspoken media, independent judiciary, and civil liberties look increasingly vulnerable as Beijing tightens its grip on the city. Additionally, there have been arrests of multiple leading pro-Democracy figures such as Joshua Wong, Agnes Chow and Jimmy Lai in addition to many others who have become political prisoners."
},
{
"instruction": "Redact the following sentence to remove all personal information: \"John Smith from Birmingham works in a factory.\"",
"input": "",
@ -4999,4 +4994,4 @@
"input": "Time waits for no one.",
"output": "No one can stop time from moving forward."
}
]
]

View File

@ -4999,4 +4999,4 @@
"input": "",
"output": "安第斯山脉位于南美洲横跨七个国家包括委内瑞拉哥伦比亚厄瓜多尔秘鲁玻利维亚智利和阿根廷。安第斯山脉是世界上最长的山脉之一全长约7,000千米4,350英里其山脉沿着南美洲西海岸蜿蜒延伸平均海拔约为4,000米13,000英尺。在其南部安第斯山脉宽度达到700千米430英里在其北部宽度约为500千米310英里。"
}
]
]

View File

@ -1,3 +1,18 @@
# Copyright 2025 the LlamaFactory team.
# Copyright 2020 The HuggingFace Datasets Authors and the current dataset script contributor.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import json
import os
@ -10,16 +25,16 @@ _DESCRIPTION = "BELLE multiturn chat dataset."
_CITATION = """\
@article{belle2023exploring,
title={Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases},
title={Exploring the Impact of Instruction Data Scaling on Large Language Models},
author={Yunjie Ji, Yong Deng, Yan Gong, Yiping Peng, Qiang Niu, Lei Zhang, Baochang Ma, Xiangang Li},
journal={arXiv preprint arXiv:2303.14742},
year={2023}
}
"""
_HOMEPAGE = "{}/datasets/BelleGroup/multiturn_chat_0.8M".format(_HF_ENDPOINT)
_HOMEPAGE = f"{_HF_ENDPOINT}/datasets/BelleGroup/multiturn_chat_0.8M"
_LICENSE = "gpl-3.0"
_URL = "{}/datasets/BelleGroup/multiturn_chat_0.8M/resolve/main/multiturn_chat_0.8M.json".format(_HF_ENDPOINT)
_URL = f"{_HF_ENDPOINT}/datasets/BelleGroup/multiturn_chat_0.8M/resolve/main/multiturn_chat_0.8M.json"
class BelleMultiturn(datasets.GeneratorBasedBuilder):
@ -38,7 +53,7 @@ class BelleMultiturn(datasets.GeneratorBasedBuilder):
return [datasets.SplitGenerator(name=datasets.Split.TRAIN, gen_kwargs={"filepath": file_path})]
def _generate_examples(self, filepath: str):
with open(filepath, "r", encoding="utf-8") as f:
with open(filepath, encoding="utf-8") as f:
for key, row in enumerate(f):
data = json.loads(row)
conversations = []


View File

@ -38,9 +38,53 @@
"assistant_tag": "assistant"
}
},
"mllm_audio_demo": {
"file_name": "mllm_audio_demo.json",
"formatting": "sharegpt",
"columns": {
"messages": "messages",
"audios": "audios"
},
"tags": {
"role_tag": "role",
"content_tag": "content",
"user_tag": "user",
"assistant_tag": "assistant"
}
},
"mllm_video_demo": {
"file_name": "mllm_video_demo.json",
"formatting": "sharegpt",
"columns": {
"messages": "messages",
"videos": "videos"
},
"tags": {
"role_tag": "role",
"content_tag": "content",
"user_tag": "user",
"assistant_tag": "assistant"
}
},
"mllm_video_audio_demo": {
"file_name": "mllm_video_audio_demo.json",
"formatting": "sharegpt",
"columns": {
"messages": "messages",
"videos": "videos",
"audios": "audios"
},
"tags": {
"role_tag": "role",
"content_tag": "content",
"user_tag": "user",
"assistant_tag": "assistant"
}
},
"alpaca_en": {
"hf_hub_url": "llamafactory/alpaca_en",
"ms_hub_url": "llamafactory/alpaca_en"
"ms_hub_url": "llamafactory/alpaca_en",
"om_hub_url": "HaM/alpaca_en"
},
"alpaca_zh": {
"hf_hub_url": "llamafactory/alpaca_zh",
@ -52,7 +96,8 @@
},
"alpaca_gpt4_zh": {
"hf_hub_url": "llamafactory/alpaca_gpt4_zh",
"ms_hub_url": "llamafactory/alpaca_gpt4_zh"
"ms_hub_url": "llamafactory/alpaca_gpt4_zh",
"om_hub_url": "State_Cloud/alpaca-gpt4-data-zh"
},
"glaive_toolcall_en": {
"hf_hub_url": "llamafactory/glaive_toolcall_en",
@ -172,9 +217,19 @@
"deepctrl": {
"ms_hub_url": "deepctrl/deepctrl-sft-data"
},
"adgen": {
"adgen_train": {
"hf_hub_url": "HasturOfficial/adgen",
"ms_hub_url": "AI-ModelScope/adgen",
"split": "train",
"columns": {
"prompt": "content",
"response": "summary"
}
},
"adgen_eval": {
"hf_hub_url": "HasturOfficial/adgen",
"ms_hub_url": "AI-ModelScope/adgen",
"split": "validation",
"columns": {
"prompt": "content",
"response": "summary"
@ -192,6 +247,7 @@
"ultrachat_200k": {
"hf_hub_url": "HuggingFaceH4/ultrachat_200k",
"ms_hub_url": "AI-ModelScope/ultrachat_200k",
"split": "train_sft",
"formatting": "sharegpt",
"columns": {
"messages": "messages"
@ -218,7 +274,7 @@
"tags": {
"role_tag": "role",
"content_tag": "content",
"user_tag": "human",
"user_tag": "user",
"assistant_tag": "assistant"
}
},
@ -248,6 +304,68 @@
"ruozhiba_gpt4": {
"hf_hub_url": "hfl/ruozhiba_gpt4_turbo"
},
"neo_sft": {
"hf_hub_url": "m-a-p/neo_sft_phase2",
"formatting": "sharegpt"
},
"magpie_pro_300k": {
"hf_hub_url": "Magpie-Align/Magpie-Pro-300K-Filtered",
"formatting": "sharegpt"
},
"magpie_ultra": {
"hf_hub_url": "argilla/magpie-ultra-v0.1",
"columns": {
"prompt": "instruction",
"response": "response"
}
},
"web_instruct": {
"hf_hub_url": "TIGER-Lab/WebInstructSub",
"columns": {
"prompt": "question",
"response": "answer"
}
},
"openo1_sft": {
"hf_hub_url": "llamafactory/OpenO1-SFT",
"ms_hub_url": "llamafactory/OpenO1-SFT",
"columns": {
"prompt": "prompt",
"response": "response"
}
},
"open_thoughts": {
"hf_hub_url": "llamafactory/OpenThoughts-114k",
"formatting": "sharegpt",
"columns": {
"messages": "messages"
},
"tags": {
"role_tag": "role",
"content_tag": "content",
"user_tag": "user",
"assistant_tag": "assistant",
"system_tag": "system"
}
},
"open_r1_math": {
"hf_hub_url": "llamafactory/OpenR1-Math-94k",
"formatting": "sharegpt",
"columns": {
"messages": "messages"
},
"tags": {
"role_tag": "role",
"content_tag": "content",
"user_tag": "user",
"assistant_tag": "assistant",
"system_tag": "system"
}
},
"chinese_r1_distill": {
"hf_hub_url": "Congliu/Chinese-DeepSeek-R1-Distill-data-110k-SFT",
"ms_hub_url": "liucong/Chinese-DeepSeek-R1-Distill-data-110k-SFT"
},
"llava_1k_en": {
"hf_hub_url": "BUAADreamer/llava-en-zh-2k",
"subset": "en",
@ -308,6 +426,14 @@
"assistant_tag": "assistant"
}
},
"pokemon_cap": {
"hf_hub_url": "llamafactory/pokemon-gpt4o-captions",
"formatting": "sharegpt",
"columns": {
"messages": "conversations",
"images": "images"
}
},
"mllm_pt_demo": {
"hf_hub_url": "BUAADreamer/mllm_pt_demo",
"formatting": "sharegpt",
@ -370,7 +496,7 @@
}
},
"dpo_mix_en": {
"hf_hub_url": "hiyouga/DPO-En-Zh-20k",
"hf_hub_url": "llamafactory/DPO-En-Zh-20k",
"subset": "en",
"ranking": true,
"formatting": "sharegpt",
@ -381,7 +507,7 @@
}
},
"dpo_mix_zh": {
"hf_hub_url": "hiyouga/DPO-En-Zh-20k",
"hf_hub_url": "llamafactory/DPO-En-Zh-20k",
"subset": "zh",
"ranking": true,
"formatting": "sharegpt",
@ -401,6 +527,48 @@
"rejected": "rejected"
}
},
"coig_p": {
"hf_hub_url": "m-a-p/COIG-P",
"ranking": true,
"formatting": "sharegpt",
"columns": {
"messages": "conversations",
"chosen": "chosen",
"rejected": "rejected"
}
},
"rlhf_v": {
"hf_hub_url": "llamafactory/RLHF-V",
"ranking": true,
"formatting": "sharegpt",
"columns": {
"messages": "conversations",
"chosen": "chosen",
"rejected": "rejected",
"images": "images"
}
},
"vlfeedback": {
"hf_hub_url": "Zhihui/VLFeedback",
"ranking": true,
"formatting": "sharegpt",
"columns": {
"messages": "conversations",
"chosen": "chosen",
"rejected": "rejected",
"images": "images"
}
},
"rlaif_v": {
"hf_hub_url": "openbmb/RLAIF-V-Dataset",
"ranking": true,
"columns": {
"prompt": "question",
"chosen": "chosen",
"rejected": "rejected",
"images": "image"
}
},
"orca_pairs": {
"hf_hub_url": "Intel/orca_dpo_pairs",
"ranking": true,
@@ -474,7 +642,7 @@
}
},
"c4_demo": {
"file_name": "c4_demo.json",
"file_name": "c4_demo.jsonl",
"columns": {
"prompt": "text"
}
@@ -520,13 +688,13 @@
"prompt": "text"
}
},
"fileweb": {
"fineweb": {
"hf_hub_url": "HuggingFaceFW/fineweb",
"columns": {
"prompt": "text"
}
},
"fileweb_edu": {
"fineweb_edu": {
"hf_hub_url": "HuggingFaceFW/fineweb-edu",
"columns": {
"prompt": "text"
@@ -547,4 +715,4 @@
},
"folder": "python"
}
}
}
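Each entry above follows the same `dataset_info.json` schema: `hf_hub_url`/`ms_hub_url`/`file_name` locate the data, `formatting` selects `alpaca` or `sharegpt` parsing, `columns` remaps source fields to canonical keys, and `tags` renames role/content keys. A rough sketch of how such a `columns` mapping could be applied to a raw record (illustrative only — `remap_example` and the default mapping are hypothetical, not the actual LLaMA-Factory loader):

```python
# Hypothetical sketch of applying a dataset_info.json "columns" mapping.
# The default canonical->source mapping below is an assumption for illustration.
DEFAULT_COLUMNS = {"prompt": "instruction", "response": "output"}

def remap_example(raw: dict, entry: dict) -> dict:
    """Rename source fields to the canonical keys a trainer would expect."""
    columns = {**DEFAULT_COLUMNS, **entry.get("columns", {})}
    return {canonical: raw[source] for canonical, source in columns.items() if source in raw}

# e.g. the "web_instruct" entry maps question -> prompt and answer -> response
entry = {"hf_hub_url": "TIGER-Lab/WebInstructSub",
         "columns": {"prompt": "question", "response": "answer"}}
raw = {"question": "What is 2+2?", "answer": "4"}
print(remap_example(raw, entry))  # {'prompt': 'What is 2+2?', 'response': '4'}
```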


@@ -7223,4 +7223,4 @@
"value": "Abstraction is a principle in object-oriented programming that refers to the process of focusing on the essential features of an object or concept without emphasizing the details that are not important for its functionality. It enables programmers to create classes that represent the objects they want to work with, and specifies only the behavior of those objects, leaving out unnecessary details. Abstraction helps to make the code more maintainable, modular, and scalable. It also improves the productivity of developers by reducing the amount of code they need to write."
}
}
]
]


@@ -5055,4 +5055,4 @@
"value": "C. 参与讨论"
}
}
]
]


@@ -1,4 +1,41 @@
[
{
"conversations": [
{
"from": "human",
"value": "Hi, I have some ingredients and I want to cook something. Can you help me find a recipe?"
},
{
"from": "gpt",
"value": "Of course! I can help you with that. Please tell me what ingredients you have."
},
{
"from": "human",
"value": "I have chicken, bell peppers, and rice."
},
{
"from": "function_call",
"value": "{\"name\": \"search_recipes\", \"arguments\": {\"ingredients\": [\"chicken\", \"bell peppers\", \"rice\"]}}"
},
{
"from": "observation",
"value": "{\"recipes\": [{\"name\": \"Chicken and Bell Pepper Stir Fry\", \"ingredients\": [\"chicken\", \"bell peppers\", \"rice\"], \"instructions\": \"Cut the chicken into small pieces. Slice the bell peppers. Cook the rice. Stir fry the chicken and bell peppers. Serve over rice.\"}, {\"name\": \"Chicken and Rice Casserole\", \"ingredients\": [\"chicken\", \"bell peppers\", \"rice\"], \"instructions\": \"Cook the chicken and rice separately. Mix them together with the bell peppers in a casserole dish. Bake until golden brown.\"}]}"
},
{
"from": "gpt",
"value": "I found two recipes for you. The first one is \"Chicken and Bell Pepper Stir Fry\". The instructions are: Cut the chicken into small pieces. Slice the bell peppers. Cook the rice. Stir fry the chicken and bell peppers. Serve over rice. The second one is \"Chicken and Rice Casserole\". The instructions are: Cook the chicken and rice separately. Mix them together with the bell peppers in a casserole dish. Bake until golden brown. Which one would you like to try?"
},
{
"from": "human",
"value": "They both sound delicious, but I think I'll try the stir fry. Can you order the ingredients for me?"
},
{
"from": "gpt",
"value": "I'm sorry, but as an AI, I don't have the capability to perform external tasks such as ordering ingredients. However, I can help you find more recipes or provide cooking tips if you need."
}
],
"tools": "[{\"name\": \"search_recipes\", \"description\": \"Search for recipes based on ingredients\", \"parameters\": {\"type\": \"object\", \"properties\": {\"ingredients\": {\"type\": \"array\", \"items\": {\"type\": \"string\"}, \"description\": \"The ingredients to search for\"}}, \"required\": [\"ingredients\"]}}]"
},
{
"conversations": [
{
@@ -155,43 +192,6 @@
],
"tools": "[]"
},
{
"conversations": [
{
"from": "human",
"value": "Hi, I have some ingredients and I want to cook something. Can you help me find a recipe?"
},
{
"from": "gpt",
"value": "Of course! I can help you with that. Please tell me what ingredients you have."
},
{
"from": "human",
"value": "I have chicken, bell peppers, and rice."
},
{
"from": "function_call",
"value": "{\"name\": \"search_recipes\", \"arguments\": {\"ingredients\": [\"chicken\", \"bell peppers\", \"rice\"]}}"
},
{
"from": "observation",
"value": "{\"recipes\": [{\"name\": \"Chicken and Bell Pepper Stir Fry\", \"ingredients\": [\"chicken\", \"bell peppers\", \"rice\"], \"instructions\": \"Cut the chicken into small pieces. Slice the bell peppers. Cook the rice. Stir fry the chicken and bell peppers. Serve over rice.\"}, {\"name\": \"Chicken and Rice Casserole\", \"ingredients\": [\"chicken\", \"bell peppers\", \"rice\"], \"instructions\": \"Cook the chicken and rice separately. Mix them together with the bell peppers in a casserole dish. Bake until golden brown.\"}]}"
},
{
"from": "gpt",
"value": "I found two recipes for you. The first one is \"Chicken and Bell Pepper Stir Fry\". The instructions are: Cut the chicken into small pieces. Slice the bell peppers. Cook the rice. Stir fry the chicken and bell peppers. Serve over rice. The second one is \"Chicken and Rice Casserole\". The instructions are: Cook the chicken and rice separately. Mix them together with the bell peppers in a casserole dish. Bake until golden brown. Which one would you like to try?"
},
{
"from": "human",
"value": "They both sound delicious, but I think I'll try the stir fry. Can you order the ingredients for me?"
},
{
"from": "gpt",
"value": "I'm sorry, but as an AI, I don't have the capability to perform external tasks such as ordering ingredients. However, I can help you find more recipes or provide cooking tips if you need."
}
],
"tools": "[{\"name\": \"search_recipes\", \"description\": \"Search for recipes based on ingredients\", \"parameters\": {\"type\": \"object\", \"properties\": {\"ingredients\": {\"type\": \"array\", \"items\": {\"type\": \"string\"}, \"description\": \"The ingredients to search for\"}}, \"required\": [\"ingredients\"]}}]"
},
{
"conversations": [
{
@@ -9155,4 +9155,4 @@
],
"tools": "[]"
}
]
]
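In the conversations above, tool use is encoded with two extra sharegpt roles: `function_call` carries a JSON-serialized call, `observation` carries the tool's JSON result, and the available tools themselves are stored as a JSON string under `tools`. A minimal sketch of decoding such turns (the `decode_turn` helper and its output shape are assumptions for illustration, not the project's parser):

```python
import json

def decode_turn(turn: dict) -> dict:
    """Decode one sharegpt-style turn; JSON-encoded roles are parsed."""
    if turn["from"] == "function_call":
        call = json.loads(turn["value"])
        return {"role": "tool_call", "name": call["name"], "arguments": call["arguments"]}
    if turn["from"] == "observation":
        # the tool result is itself a JSON document
        return {"role": "tool_result", "content": json.loads(turn["value"])}
    return {"role": turn["from"], "content": turn["value"]}

turn = {"from": "function_call",
        "value": '{"name": "search_recipes", "arguments": {"ingredients": ["chicken"]}}'}
decoded = decode_turn(turn)
print(decoded["name"])  # search_recipes
```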


@@ -1,25 +1,4 @@
[
{
"conversations": [
{
"from": "human",
"value": "假设你有一个需要随机数的Java程序范围在0到10之间。你可以使用什么代码片段来生成这样的数字\n\n下面是一个示例代码片段它使用Math.random()方法来生成一个在0到1之间的随机数然后缩放并四舍五入结果得到一个在0到10之间的随机数\n\n```java\nint randomNumber = (int) (Math.random() * 11);\n```\n\n在Java中是否有更有效的方法来生成随机数你会如何修改代码以生成具有不同范围或分布的随机数"
},
{
"from": "gpt",
"value": "您提供的代码片段是Java中生成随机数的一种常见且高效的方法。然而如果您需要对随机数生成有更多的控制您可以使用java.util.Random类它提供了更多生成随机数的选项。\n\n要生成一个在不同范围内的随机数您可以修改代码以调整缩放因子。例如要生成一个在1到10之间的随机数您可以使用以下代码\n\nint randomNumber = (int) (Math.random() * 10) + 1;\n\n要生成具有不同分布的随机数您可以使用各种技术如正态分布、指数分布或自定义分布。生成正态分布的一种方法是使用Box-Muller变换。这里有一个代码示例它生成具有正态分布的随机数\n\nRandom random = new Random();\ndouble u1 = random.nextDouble();\ndouble u2 = random.nextDouble();\ndouble z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);\ndouble mean = 5;\ndouble stdDev = 2;\nint randomNumber = (int) (z * stdDev + mean);\n\n这段代码生成一个具有均值5和标准差2的正态分布随机数。您可以调整均值和标准差来生成具有不同分布的随机数。"
},
{
"from": "human",
"value": "你能解释一下算法分析中的大O记法概念吗"
},
{
"from": "gpt",
"value": "是的大O表示法是一种描述算法性能随输入数据大小增加的方法。它用于估计算法在最坏情况下的时间复杂度或空间复杂度。大O中的“O”代表“数量级”这意味着随着输入大小的增加算法所需的操作数以一定的速率增长。\n\n例如如果我们有一个需要n步才能解决大小为n的问题的算法那么时间复杂度可以表示为O(n)这意味着操作数随着输入大小线性增长。如果我们有一个需要n^2步才能解决大小为n的问题的算法那么时间复杂度可以表示为O(n^2),这意味着操作数随着输入大小二次方增长。\n\n还有其他几种大O表示法如O(log n)、O(n log n)、O(2^n)等它们代表了随着输入大小增加的不同操作数增长速率。通过分析算法的大O表示法我们可以比较不同算法的性能并选择给定问题最有效的算法。\n\n您是否有关于大O表示法或算法分析的特定问题"
}
],
"tools": "[]"
},
{
"conversations": [
{
@@ -41,6 +20,27 @@
],
"tools": "[{\"name\": \"generate_invoice\", \"description\": \"生成发票\", \"parameters\": {\"type\": \"object\", \"properties\": {\"customer_name\": {\"type\": \"string\", \"description\": \"客户名称\"}, \"items\": {\"type\": \"array\", \"items\": {\"type\": \"object\", \"properties\": {\"name\": {\"type\": \"string\", \"description\": \"The item name\"}, \"quantity\": {\"type\": \"integer\", \"description\": \"The quantity of the item\"}, \"price\": {\"type\": \"number\", \"description\": \"The price per unit\"}}, \"required\": [\"name\", \"quantity\", \"price\"]}}}, \"required\": [\"customer_name\", \"items\"]}}, {\"name\": \"generate_password\", \"description\": \"生成随机密码\", \"parameters\": {\"type\": \"object\", \"properties\": {\"length\": {\"type\": \"integer\", \"description\": \"密码的长度\"}}, \"required\": [\"length\"]}}]"
},
{
"conversations": [
{
"from": "human",
"value": "假设你有一个需要随机数的Java程序范围在0到10之间。你可以使用什么代码片段来生成这样的数字\n\n下面是一个示例代码片段它使用Math.random()方法来生成一个在0到1之间的随机数然后缩放并四舍五入结果得到一个在0到10之间的随机数\n\n```java\nint randomNumber = (int) (Math.random() * 11);\n```\n\n在Java中是否有更有效的方法来生成随机数你会如何修改代码以生成具有不同范围或分布的随机数"
},
{
"from": "gpt",
"value": "您提供的代码片段是Java中生成随机数的一种常见且高效的方法。然而如果您需要对随机数生成有更多的控制您可以使用java.util.Random类它提供了更多生成随机数的选项。\n\n要生成一个在不同范围内的随机数您可以修改代码以调整缩放因子。例如要生成一个在1到10之间的随机数您可以使用以下代码\n\nint randomNumber = (int) (Math.random() * 10) + 1;\n\n要生成具有不同分布的随机数您可以使用各种技术如正态分布、指数分布或自定义分布。生成正态分布的一种方法是使用Box-Muller变换。这里有一个代码示例它生成具有正态分布的随机数\n\nRandom random = new Random();\ndouble u1 = random.nextDouble();\ndouble u2 = random.nextDouble();\ndouble z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);\ndouble mean = 5;\ndouble stdDev = 2;\nint randomNumber = (int) (z * stdDev + mean);\n\n这段代码生成一个具有均值5和标准差2的正态分布随机数。您可以调整均值和标准差来生成具有不同分布的随机数。"
},
{
"from": "human",
"value": "你能解释一下算法分析中的大O记法概念吗"
},
{
"from": "gpt",
"value": "是的大O表示法是一种描述算法性能随输入数据大小增加的方法。它用于估计算法在最坏情况下的时间复杂度或空间复杂度。大O中的“O”代表“数量级”这意味着随着输入大小的增加算法所需的操作数以一定的速率增长。\n\n例如如果我们有一个需要n步才能解决大小为n的问题的算法那么时间复杂度可以表示为O(n)这意味着操作数随着输入大小线性增长。如果我们有一个需要n^2步才能解决大小为n的问题的算法那么时间复杂度可以表示为O(n^2),这意味着操作数随着输入大小二次方增长。\n\n还有其他几种大O表示法如O(log n)、O(n log n)、O(2^n)等它们代表了随着输入大小增加的不同操作数增长速率。通过分析算法的大O表示法我们可以比较不同算法的性能并选择给定问题最有效的算法。\n\n您是否有关于大O表示法或算法分析的特定问题"
}
],
"tools": "[]"
},
{
"conversations": [
{
@@ -9019,4 +9019,4 @@
],
"tools": "[]"
}
]
]


@@ -1,6 +1,20 @@
# Copyright 2025 the LlamaFactory team.
# Copyright 2020 The HuggingFace Datasets Authors and the current dataset script contributor.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import json
import os
from typing import List
import datasets
@@ -8,9 +22,9 @@ import datasets
_HF_ENDPOINT = os.getenv("HF_ENDPOINT", "https://huggingface.co")
_DESCRIPTION = "Human preference data about helpfulness and harmlessness."
_CITATION = ""
_HOMEPAGE = "{}/datasets/Anthropic/hh-rlhf".format(_HF_ENDPOINT)
_HOMEPAGE = f"{_HF_ENDPOINT}/datasets/Anthropic/hh-rlhf"
_LICENSE = "mit"
_URL = "{}/datasets/Anthropic/hh-rlhf/resolve/main/".format(_HF_ENDPOINT)
_URL = f"{_HF_ENDPOINT}/datasets/Anthropic/hh-rlhf/resolve/main/"
_URLS = {
"train": [
_URL + "harmless-base/train.jsonl.gz",
@@ -50,10 +64,10 @@ class HhRlhfEn(datasets.GeneratorBasedBuilder):
datasets.SplitGenerator(name=datasets.Split.TEST, gen_kwargs={"filepaths": file_path["test"]}),
]
def _generate_examples(self, filepaths: List[str]):
def _generate_examples(self, filepaths: list[str]):
key = 0
for filepath in filepaths:
with open(filepath, "r", encoding="utf-8") as f:
with open(filepath, encoding="utf-8") as f:
for row in f:
data = json.loads(row)
chosen = data["chosen"]


@@ -454,4 +454,4 @@
"input": "",
"output": "抱歉,我不是 OpenAI 开发的 ChatGPT我是 {{author}} 开发的 {{name}},旨在为用户提供智能化的回答和帮助。"
}
]
]

File diff suppressed because one or more lines are too long

data/mllm_audio_demo.json Normal file (47 lines)

@@ -0,0 +1,47 @@
[
{
"messages": [
{
"content": "<audio>What's that sound?",
"role": "user"
},
{
"content": "It is the sound of glass shattering.",
"role": "assistant"
}
],
"audios": [
"mllm_demo_data/1.mp3"
]
},
{
"messages": [
{
"content": "<audio>What can you hear?",
"role": "user"
},
{
"content": "A woman is coughing.",
"role": "assistant"
}
],
"audios": [
"mllm_demo_data/2.wav"
]
},
{
"messages": [
{
"content": "<audio>What does the person say?",
"role": "user"
},
{
"content": "Mister Quiller is the apostle of the middle classes and we are glad to welcome his gospel.",
"role": "assistant"
}
],
"audios": [
"mllm_demo_data/3.flac"
]
}
]


@@ -2,7 +2,7 @@
{
"messages": [
{
"content": "Who are they?",
"content": "<image>Who are they?",
"role": "user"
},
{
@@ -10,7 +10,7 @@
"role": "assistant"
},
{
"content": "What are they doing?",
"content": "What are they doing?<image>",
"role": "user"
},
{
@@ -19,13 +19,14 @@
}
],
"images": [
"mllm_demo_data/1.jpg",
"mllm_demo_data/1.jpg"
]
},
{
"messages": [
{
"content": "Who is he?",
"content": "<image>Who is he?",
"role": "user"
},
{
@@ -48,7 +49,7 @@
{
"messages": [
{
"content": "Please describe this image",
"content": "<image>Please describe this image",
"role": "user"
},
{
@@ -71,7 +72,7 @@
{
"messages": [
{
"content": "他们是谁?",
"content": "<image>他们是谁?",
"role": "user"
},
{
@@ -79,7 +80,7 @@
"role": "assistant"
},
{
"content": "他们在做什么?",
"content": "他们在做什么?<image>",
"role": "user"
},
{
@@ -88,13 +89,14 @@
}
],
"images": [
"mllm_demo_data/1.jpg",
"mllm_demo_data/1.jpg"
]
},
{
"messages": [
{
"content": "他是谁?",
"content": "<image>他是谁?",
"role": "user"
},
{
@@ -117,7 +119,7 @@
{
"messages": [
{
"content": "请描述这张图片",
"content": "<image>请描述这张图片",
"role": "user"
},
{
@@ -137,4 +139,4 @@
"mllm_demo_data/3.jpg"
]
}
]
]
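The changes above make the `<image>` placeholder explicit inside each user message, so a multimodal plugin can align placeholders with the `images` list. A small consistency check one might run over such files (hypothetical validator, not the project's code):

```python
def check_image_tokens(sample: dict) -> bool:
    """True if <image> placeholders across messages match the images list length."""
    n_tokens = sum(m["content"].count("<image>") for m in sample["messages"])
    return n_tokens == len(sample.get("images", []))

sample = {
    "messages": [
        {"content": "<image>Who are they?", "role": "user"},
        {"content": "They're Kobe and Gigi.", "role": "assistant"},
    ],
    "images": ["mllm_demo_data/1.jpg"],
}
print(check_image_tokens(sample))  # True
```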

New binary files (contents not shown):
- data/mllm_demo_data/1.mp3
- data/mllm_demo_data/1.mp4
- data/mllm_demo_data/2.avi
- data/mllm_demo_data/2.wav
- data/mllm_demo_data/3.flac
- data/mllm_demo_data/3.mp4
- data/mllm_demo_data/4.mp3
- data/mllm_demo_data/4.mp4


@@ -0,0 +1,56 @@
[
{
"messages": [
{
"content": "<video><audio>What is the video describing?",
"role": "user"
},
{
"content": "A girl who is drawing a picture of a guitar and feel nervous.",
"role": "assistant"
}
],
"videos": [
"mllm_demo_data/4.mp4"
],
"audios": [
"mllm_demo_data/4.mp3"
]
},
{
"messages": [
{
"content": "<video><audio>What does this girl say?",
"role": "user"
},
{
"content": "She says: 'Hello! Take a look at what am I drawing!'",
"role": "assistant"
}
],
"videos": [
"mllm_demo_data/4.mp4"
],
"audios": [
"mllm_demo_data/4.mp3"
]
},
{
"messages": [
{
"content": "<video><audio>What is this girl drawing with?",
"role": "user"
},
{
"content": "She is drawing with an iPad.",
"role": "assistant"
}
],
"videos": [
"mllm_demo_data/4.mp4"
],
"audios": [
"mllm_demo_data/4.mp3"
]
}
]

data/mllm_video_demo.json Normal file (47 lines)

@@ -0,0 +1,47 @@
[
{
"messages": [
{
"content": "<video>Why is this video funny?",
"role": "user"
},
{
"content": "Because a baby is reading, and he is so cute!",
"role": "assistant"
}
],
"videos": [
"mllm_demo_data/1.mp4"
]
},
{
"messages": [
{
"content": "<video>What is she doing?",
"role": "user"
},
{
"content": "She is cooking.",
"role": "assistant"
}
],
"videos": [
"mllm_demo_data/2.avi"
]
},
{
"messages": [
{
"content": "<video>What's in the video?",
"role": "user"
},
{
"content": "A baby is playing in the living room.",
"role": "assistant"
}
],
"videos": [
"mllm_demo_data/3.mp4"
]
}
]


@@ -1,6 +1,20 @@
# Copyright 2025 the LlamaFactory team.
# Copyright 2020 The HuggingFace Datasets Authors and the current dataset script contributor.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import json
import os
from typing import List
import datasets
@@ -11,7 +25,7 @@ _DESCRIPTION = "UltraChat: Large-scale, Informative, and Diverse Multi-round Dia
_CITATION = """\
@misc{UltraChat,
author = {Ding, Ning and Chen, Yulin and Xu, Bokai and Hu, Shengding and Qin, Yujia and Liu, Zhiyuan and Sun, Maosong and Zhou, Bowen},
author = {Ding, Ning and Chen, Yulin and Xu, Bokai and Hu, Shengding and others},
title = {UltraChat: A Large-scale Auto-generated Multi-round Dialogue Data},
year = {2023},
publisher = {GitHub},
@@ -20,9 +34,9 @@ _CITATION = """\
}
"""
_HOMEPAGE = "{}/datasets/stingning/ultrachat".format(_HF_ENDPOINT)
_HOMEPAGE = f"{_HF_ENDPOINT}/datasets/stingning/ultrachat"
_LICENSE = "cc-by-nc-4.0"
_BASE_DATA_URL = "{}/datasets/stingning/ultrachat/resolve/main/train_{{idx}}.jsonl".format(_HF_ENDPOINT)
_BASE_DATA_URL = f"{_HF_ENDPOINT}/datasets/stingning/ultrachat/resolve/main/train_{{idx}}.jsonl"
class UltraChat(datasets.GeneratorBasedBuilder):
@@ -40,16 +54,16 @@ class UltraChat(datasets.GeneratorBasedBuilder):
file_paths = [dl_manager.download(_BASE_DATA_URL.format(idx=idx)) for idx in range(10)] # multiple shards
return [datasets.SplitGenerator(name=datasets.Split.TRAIN, gen_kwargs={"filepaths": file_paths})]
def _generate_examples(self, filepaths: List[str]):
def _generate_examples(self, filepaths: list[str]):
for filepath in filepaths:
with open(filepath, "r", encoding="utf-8") as f:
with open(filepath, encoding="utf-8") as f:
for row in f:
try:
data = json.loads(row)
except Exception:
continue
key: int = data["id"]
content: List[str] = data["data"]
content: list[str] = data["data"]
if len(content) % 2 == 1:
content.pop(-1)
if len(content) < 2:
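The truncation logic above (drop an odd trailing utterance, skip conversations shorter than one full exchange) can be sketched standalone; this is illustrative and mirrors the script's intent rather than reproducing it:

```python
def to_exchanges(content: list[str]) -> list[tuple[str, str]]:
    """Pair alternating human/assistant utterances; drop an unpaired final turn."""
    if len(content) % 2 == 1:
        content = content[:-1]  # drop the odd trailing utterance
    if len(content) < 2:
        return []  # skip conversations without a full exchange
    return [(content[i], content[i + 1]) for i in range(0, len(content), 2)]

print(to_exchanges(["hi", "hello", "how are you?"]))  # [('hi', 'hello')]
```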

File diff suppressed because one or more lines are too long


@@ -1,23 +0,0 @@
version: '3.8'
services:
llama-factory:
build:
dockerfile: Dockerfile
context: .
container_name: llama_factory
volumes:
- ./hf_cache:/root/.cache/huggingface/
- ./data:/app/data
- ./output:/app/output
ports:
- "7860:7860"
ipc: host
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: "all"
capabilities: [gpu]
restart: unless-stopped


@@ -0,0 +1,66 @@
# https://hub.docker.com/r/hiyouga/pytorch/tags
ARG BASE_IMAGE=hiyouga/pytorch:th2.6.0-cu124-flashattn2.7.4-cxx11abi0-devel
FROM ${BASE_IMAGE}
# Installation arguments
ARG PIP_INDEX=https://pypi.org/simple
ARG EXTRAS=metrics
ARG INSTALL_FLASHATTN=false
ARG HTTP_PROXY=""
# Define environments
ENV MAX_JOBS=16
ENV FLASH_ATTENTION_FORCE_BUILD=TRUE
ENV VLLM_WORKER_MULTIPROC_METHOD=spawn
ENV DEBIAN_FRONTEND=noninteractive
ENV NODE_OPTIONS=""
ENV PIP_ROOT_USER_ACTION=ignore
ENV http_proxy="${HTTP_PROXY}"
ENV https_proxy="${HTTP_PROXY}"
# Use Bash instead of default /bin/sh
SHELL ["/bin/bash", "-c"]
# Set the working directory
WORKDIR /app
# Change pip source
RUN pip config set global.index-url "${PIP_INDEX}" && \
pip config set global.extra-index-url "${PIP_INDEX}" && \
pip install --no-cache-dir --upgrade pip packaging wheel setuptools
# Install the requirements
COPY requirements.txt /app
RUN pip install --no-cache-dir -r requirements.txt
# Copy the rest of the application into the image
COPY . /app
# Install LLaMA Factory
RUN pip install --no-cache-dir -e ".[${EXTRAS}]" --no-build-isolation
# Rebuild flash attention
RUN if [ "${INSTALL_FLASHATTN}" == "true" ]; then \
pip uninstall -y ninja && \
pip install --no-cache-dir ninja && \
pip install --no-cache-dir flash-attn --no-build-isolation; \
fi
# Set up volumes
# VOLUME [ "/root/.cache/huggingface", "/app/shared_data", "/app/output" ]
# Expose port 7860 for LLaMA Board
ENV GRADIO_SERVER_PORT=7860
EXPOSE 7860
# Expose port 8000 for API service
ENV API_PORT=8000
EXPOSE 8000
# unset proxy
ENV http_proxy=
ENV https_proxy=
# Reset pip config
RUN pip config unset global.index-url && \
pip config unset global.extra-index-url


@@ -0,0 +1,55 @@
# Start from the pytorch official image (ubuntu-22.04 + cuda-12.4.1 + python-3.11)
# https://hub.docker.com/r/pytorch/pytorch/tags
FROM pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel
# Define environments
ENV MAX_JOBS=16
ENV VLLM_WORKER_MULTIPROC_METHOD=spawn
ENV DEBIAN_FRONTEND=noninteractive
ENV NODE_OPTIONS=""
ENV PIP_ROOT_USER_ACTION=ignore
# Define installation arguments
ARG APT_SOURCE=https://mirrors.tuna.tsinghua.edu.cn/ubuntu/
ARG PIP_INDEX=https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
# Set apt source
RUN cp /etc/apt/sources.list /etc/apt/sources.list.bak && \
{ \
echo "deb ${APT_SOURCE} jammy main restricted universe multiverse"; \
echo "deb ${APT_SOURCE} jammy-updates main restricted universe multiverse"; \
echo "deb ${APT_SOURCE} jammy-backports main restricted universe multiverse"; \
echo "deb ${APT_SOURCE} jammy-security main restricted universe multiverse"; \
} > /etc/apt/sources.list
# Install systemctl and wget
RUN apt-get update && \
apt-get install -y -o Dpkg::Options::="--force-confdef" systemd wget && \
apt-get clean
# Install git and vim
RUN apt-get update && \
apt-get install -y git vim && \
apt-get clean
# Install gcc and g++
RUN apt-get update && \
apt-get install -y gcc g++ && \
apt-get clean
# Change pip source
RUN pip config set global.index-url "${PIP_INDEX}" && \
pip config set global.extra-index-url "${PIP_INDEX}" && \
pip install --no-cache-dir --upgrade pip packaging wheel setuptools
# Install flash-attn-2.7.4.post1 (cxx11abi=False)
RUN wget -nv https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.4.post1/flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp311-cp311-linux_x86_64.whl && \
pip install --no-cache-dir flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp311-cp311-linux_x86_64.whl
# Install flashinfer-0.2.2.post1+cu124 (cxx11abi=False)
RUN wget -nv https://github.com/flashinfer-ai/flashinfer/releases/download/v0.2.2.post1/flashinfer_python-0.2.2.post1+cu124torch2.6-cp38-abi3-linux_x86_64.whl && \
pip install --no-cache-dir flashinfer_python-0.2.2.post1+cu124torch2.6-cp38-abi3-linux_x86_64.whl
# Reset pip config
RUN pip config unset global.index-url && \
pip config unset global.extra-index-url


@@ -0,0 +1,111 @@
# Docker Setup for NVIDIA GPUs
This directory contains Docker configuration files for running LLaMA Factory with NVIDIA GPU support.
## Prerequisites
### Linux-specific Requirements
Before running the Docker container with GPU support, you need to install the following packages:
1. **Docker**: The container runtime
```bash
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install docker.io
# Or install Docker Engine from the official repository:
# https://docs.docker.com/engine/install/
```
2. **Docker Compose** (if using the docker-compose method):
```bash
# Ubuntu/Debian
sudo apt-get install docker-compose
# Or install the latest version:
# https://docs.docker.com/compose/install/
```
3. **NVIDIA Container Toolkit** (required for GPU support):
```bash
# Add the NVIDIA GPG key and repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
# Install nvidia-container-toolkit
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
# Restart Docker to apply changes
sudo systemctl restart docker
```
**Note**: Without `nvidia-container-toolkit`, the Docker container will not be able to access your NVIDIA GPU.
### Verify GPU Access
After installation, verify that Docker can access your GPU:
```bash
sudo docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
```
If successful, you should see your GPU information displayed.
## Usage
### Using Docker Compose (Recommended)
```bash
cd docker/docker-cuda/
docker compose up -d
docker compose exec llamafactory bash
```
### Using Docker Run
```bash
# Build the image
docker build -f ./docker/docker-cuda/Dockerfile \
--build-arg PIP_INDEX=https://pypi.org/simple \
--build-arg EXTRAS=metrics \
-t llamafactory:latest .
# Run the container
docker run -dit --ipc=host --gpus=all \
-p 7860:7860 \
-p 8000:8000 \
--name llamafactory \
llamafactory:latest
# Enter the container
docker exec -it llamafactory bash
```
## Troubleshooting
### GPU Not Detected
If your GPU is not detected inside the container:
1. Ensure `nvidia-container-toolkit` is installed
2. Check that the Docker daemon has been restarted after installation
3. Verify your NVIDIA drivers are properly installed: `nvidia-smi`
4. Check Docker GPU support: `docker run --rm --gpus all ubuntu nvidia-smi`
### Permission Denied
If you get permission errors, ensure your user is in the docker group:
```bash
sudo usermod -aG docker $USER
# Log out and back in for changes to take effect
```
## Additional Notes
- The default image is built on Ubuntu 22.04 (x86_64), CUDA 12.4, Python 3.11, PyTorch 2.6.0, and Flash-attn 2.7.4
- For different CUDA versions, you may need to adjust the base image in the Dockerfile
- Make sure your NVIDIA driver version is compatible with the CUDA version used in the Docker image
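When you need different ports or a GPU subset, the `docker run` invocation above can be assembled programmatically. The helper below is only an illustration: the flag values come from this README, while the function itself is hypothetical.

```python
def docker_run_cmd(image="llamafactory:latest", name="llamafactory",
                   ports=(7860, 8000), gpus="all"):
    """Assemble the docker run command shown in this README."""
    parts = ["docker", "run", "-dit", "--ipc=host", f"--gpus={gpus}"]
    for p in ports:
        parts += ["-p", f"{p}:{p}"]  # publish LLaMA Board / API ports
    parts += ["--name", name, image]
    return " ".join(parts)

print(docker_run_cmd())
# docker run -dit --ipc=host --gpus=all -p 7860:7860 -p 8000:8000 --name llamafactory llamafactory:latest
```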


@@ -0,0 +1,25 @@
services:
llamafactory:
build:
dockerfile: ./docker/docker-cuda/Dockerfile
context: ../..
args:
PIP_INDEX: https://pypi.org/simple
EXTRAS: metrics
container_name: llamafactory
ports:
- "7860:7860"
- "8000:8000"
ipc: host
tty: true
# shm_size: "16gb" # ipc: host is set
stdin_open: true
command: bash
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: "all"
capabilities: [ gpu ]
restart: unless-stopped


@@ -0,0 +1,63 @@
# https://hub.docker.com/r/ascendai/cann/tags
ARG BASE_IMAGE=ascendai/cann:8.1.rc1-910b-ubuntu22.04-py3.11
FROM ${BASE_IMAGE}
# Installation arguments
ARG PIP_INDEX=https://pypi.org/simple
ARG EXTRAS=torch-npu,metrics
ARG HTTP_PROXY=""
ARG PYTORCH_INDEX=https://download.pytorch.org/whl/cpu
# Define environments
ENV MAX_JOBS=16
ENV FLASH_ATTENTION_FORCE_BUILD=TRUE
ENV VLLM_WORKER_MULTIPROC_METHOD=spawn
ENV DEBIAN_FRONTEND=noninteractive
ENV NODE_OPTIONS=""
ENV PIP_ROOT_USER_ACTION=ignore
ENV http_proxy="${HTTP_PROXY}"
ENV https_proxy="${HTTP_PROXY}"
# Use Bash instead of default /bin/sh
SHELL ["/bin/bash", "-c"]
# Set the working directory
WORKDIR /app
# Change pip source
RUN pip config set global.index-url "${PIP_INDEX}" && \
pip config set global.extra-index-url "${PIP_INDEX}" && \
pip install --no-cache-dir --upgrade pip packaging wheel setuptools
# Install torch-npu
RUN pip uninstall -y torch torchvision torchaudio && \
pip install --no-cache-dir "torch-npu==2.5.1" "torchvision==0.20.1" --index-url "${PYTORCH_INDEX}"
# Install the requirements
COPY requirements.txt /app
RUN pip install --no-cache-dir -r requirements.txt
# Copy the rest of the application into the image
COPY . /app
# Install LLaMA Factory
RUN pip install --no-cache-dir -e ".[${EXTRAS}]" --no-build-isolation
# Set up volumes
# VOLUME [ "/root/.cache/huggingface", "/app/shared_data", "/app/output" ]
# Expose port 7860 for LLaMA Board
ENV GRADIO_SERVER_PORT=7860
EXPOSE 7860
# Expose port 8000 for API service
ENV API_PORT=8000
EXPOSE 8000
# unset proxy
ENV http_proxy=
ENV https_proxy=
# Reset pip config
RUN pip config unset global.index-url && \
pip config unset global.extra-index-url


@@ -0,0 +1,28 @@
services:
llamafactory:
build:
dockerfile: ./docker/docker-npu/Dockerfile
context: ../..
args:
PIP_INDEX: https://pypi.org/simple
EXTRAS: torch-npu,metrics
container_name: llamafactory
volumes:
- /usr/local/dcmi:/usr/local/dcmi
- /usr/local/bin/npu-smi:/usr/local/bin/npu-smi
- /usr/local/Ascend/driver:/usr/local/Ascend/driver
- /etc/ascend_install.info:/etc/ascend_install.info
ports:
- "7860:7860"
- "8000:8000"
ipc: host
tty: true
# shm_size: "16gb" # ipc: host is set
stdin_open: true
command: bash
devices:
- /dev/davinci0
- /dev/davinci_manager
- /dev/devmm_svm
- /dev/hisi_hdc
restart: unless-stopped


@@ -0,0 +1,71 @@
# https://hub.docker.com/r/rocm/pytorch/tags
ARG BASE_IMAGE=rocm/pytorch:rocm6.4.1_ubuntu22.04_py3.10_pytorch_release_2.6.0
FROM ${BASE_IMAGE}
# Installation arguments
ARG PIP_INDEX=https://pypi.org/simple
ARG EXTRAS=metrics
ARG INSTALL_FLASHATTN=false
ARG HTTP_PROXY=""
ARG PYTORCH_INDEX=https://download.pytorch.org/whl/rocm6.3
# Define environments
ENV MAX_JOBS=16
ENV FLASH_ATTENTION_FORCE_BUILD=TRUE
ENV VLLM_WORKER_MULTIPROC_METHOD=spawn
ENV DEBIAN_FRONTEND=noninteractive
ENV NODE_OPTIONS=""
ENV PIP_ROOT_USER_ACTION=ignore
ENV http_proxy="${HTTP_PROXY}"
ENV https_proxy="${HTTP_PROXY}"
# Use Bash instead of default /bin/sh
SHELL ["/bin/bash", "-c"]
# Set the working directory
WORKDIR /app
# Change pip source
RUN pip config set global.index-url "${PIP_INDEX}" && \
pip config set global.extra-index-url "${PIP_INDEX}" && \
pip install --no-cache-dir --upgrade pip packaging wheel setuptools
# Reinstall pytorch rocm
RUN pip uninstall -y torch torchvision torchaudio && \
pip install --no-cache-dir --pre torch torchvision torchaudio --index-url "${PYTORCH_INDEX}"
# Install the requirements
COPY requirements.txt /app
RUN pip install --no-cache-dir -r requirements.txt
# Copy the rest of the application into the image
COPY . /app
# Install LLaMA Factory
RUN pip install --no-cache-dir -e ".[${EXTRAS}]" --no-build-isolation
# Rebuild flash attention
RUN if [ "${INSTALL_FLASHATTN}" == "true" ]; then \
pip uninstall -y ninja && \
pip install --no-cache-dir ninja && \
pip install --no-cache-dir flash-attn --no-build-isolation; \
fi
# Set up volumes
# VOLUME [ "/root/.cache/huggingface", "/app/shared_data", "/app/output" ]
# Expose port 7860 for LLaMA Board
ENV GRADIO_SERVER_PORT=7860
EXPOSE 7860
# Expose port 8000 for API service
ENV API_PORT=8000
EXPOSE 8000
# unset proxy
ENV http_proxy=
ENV https_proxy=
# Reset pip config
RUN pip config unset global.index-url && \
pip config unset global.extra-index-url


@@ -0,0 +1,21 @@
services:
llamafactory:
build:
dockerfile: ./docker/docker-rocm/Dockerfile
context: ../..
args:
PIP_INDEX: https://pypi.org/simple
EXTRAS: metrics
container_name: llamafactory
ports:
- "7860:7860"
- "8000:8000"
ipc: host
tty: true
# shm_size: "16gb" # ipc: host is set
stdin_open: true
command: bash
devices:
- /dev/kfd:/dev/kfd
- /dev/dri:/dev/dri
restart: unless-stopped


@@ -1,3 +1,4 @@
# Copyright 2025 the LlamaFactory team.
# Copyright 2020 The HuggingFace Datasets Authors and the current dataset script contributor.
#
# Licensed under the Apache License, Version 2.0 (the "License");
@@ -11,6 +12,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import datasets
@@ -20,14 +22,15 @@ import pandas as pd
_CITATION = """\
@article{huang2023ceval,
title={C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models},
author={Huang, Yuzhen and Bai, Yuzhuo and Zhu, Zhihao and Zhang, Junlei and Zhang, Jinghan and Su, Tangjun and Liu, Junteng and Lv, Chuancheng and Zhang, Yikai and Lei, Jiayi and Fu, Yao and Sun, Maosong and He, Junxian},
author={Huang, Yuzhen and Bai, Yuzhuo and Zhu, Zhihao and others},
journal={arXiv preprint arXiv:2305.08322},
year={2023}
}
"""
_DESCRIPTION = """\
C-Eval is a comprehensive Chinese evaluation suite for foundation models. It consists of 13948 multi-choice questions spanning 52 diverse disciplines and four difficulty levels.
C-Eval is a comprehensive Chinese evaluation suite for foundation models.
It consists of 13948 multi-choice questions spanning 52 diverse disciplines and four difficulty levels.
"""
_HOMEPAGE = "https://cevalbenchmark.com"


@ -207,4 +207,4 @@
"name": "兽医学",
"category": "STEM"
}
}
}


@ -1,3 +1,4 @@
# Copyright 2025 the LlamaFactory team.
# Copyright 2020 The HuggingFace Datasets Authors and the current dataset script contributor.
#
# Licensed under the Apache License, Version 2.0 (the "License");
@ -11,6 +12,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import datasets
@ -20,14 +22,15 @@ import pandas as pd
_CITATION = """\
@article{li2023cmmlu,
title={CMMLU: Measuring massive multitask language understanding in Chinese},
author={Haonan Li and Yixuan Zhang and Fajri Koto and Yifei Yang and Hai Zhao and Yeyun Gong and Nan Duan and Timothy Baldwin},
author={Haonan Li and Yixuan Zhang and Fajri Koto and Yifei Yang and others},
journal={arXiv preprint arXiv:2306.09212},
year={2023}
}
"""
_DESCRIPTION = """\
CMMLU is a comprehensive Chinese assessment suite specifically designed to evaluate the advanced knowledge and reasoning abilities of LLMs within the Chinese language and cultural context.
CMMLU is a comprehensive Chinese assessment suite specifically designed to evaluate the advanced knowledge
and reasoning abilities of LLMs within the Chinese language and cultural context.
"""
_HOMEPAGE = "https://github.com/haonan-li/CMMLU"


@ -267,4 +267,4 @@
"name": "世界宗教",
"category": "Humanities"
}
}
}


@ -227,4 +227,4 @@
"name": "world religions",
"category": "Humanities"
}
}
}


@ -1,3 +1,4 @@
# Copyright 2025 the LlamaFactory team.
# Copyright 2020 The HuggingFace Datasets Authors and the current dataset script contributor.
#
# Licensed under the Apache License, Version 2.0 (the "License");
@ -11,6 +12,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import datasets
@ -20,14 +22,15 @@ import pandas as pd
_CITATION = """\
@article{hendryckstest2021,
title={Measuring Massive Multitask Language Understanding},
author={Dan Hendrycks and Collin Burns and Steven Basart and Andy Zou and Mantas Mazeika and Dawn Song and Jacob Steinhardt},
author={Dan Hendrycks and Collin Burns and others},
journal={Proceedings of the International Conference on Learning Representations (ICLR)},
year={2021}
}
"""
_DESCRIPTION = """\
Measuring Massive Multitask Language Understanding by Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt (ICLR 2021).
Measuring Massive Multitask Language Understanding by Dan Hendrycks, Collin Burns, Steven Basart,
Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt (ICLR 2021).
"""
_HOMEPAGE = "https://github.com/hendrycks/test"
@ -157,5 +160,4 @@ class MMLU(datasets.GeneratorBasedBuilder):
df = pd.read_csv(filepath, header=None)
df.columns = ["question", "A", "B", "C", "D", "answer"]
for i, instance in enumerate(df.to_dict(orient="records")):
yield i, instance
yield from enumerate(df.to_dict(orient="records"))
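The `yield from enumerate(...)` rewrite above is behavior-preserving. A minimal standalone check (using a plain list of dicts in place of the actual `df.to_dict(orient="records")` rows):

```python
# `records` stands in for df.to_dict(orient="records"); the real data comes from a CSV.
records = [
    {"question": "1+1=?", "A": "1", "B": "2", "C": "3", "D": "4", "answer": "B"},
    {"question": "2+2=?", "A": "2", "B": "3", "C": "4", "D": "5", "answer": "C"},
]

def gen_old(records):
    # the explicit loop removed by the diff
    for i, instance in enumerate(records):
        yield i, instance

def gen_new(records):
    # the delegating form added by the diff
    yield from enumerate(records)

assert list(gen_old(records)) == list(gen_new(records))
```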


@ -4,59 +4,85 @@ Make sure to execute these commands in the `LLaMA-Factory` directory.
## Table of Contents
- [LoRA Fine-Tuning on A Single GPU](#lora-fine-tuning-on-a-single-gpu)
- [QLoRA Fine-Tuning on a Single GPU](#qlora-fine-tuning-on-a-single-gpu)
- [LoRA Fine-Tuning on Multiple GPUs](#lora-fine-tuning-on-multiple-gpus)
- [LoRA Fine-Tuning on Multiple NPUs](#lora-fine-tuning-on-multiple-npus)
- [Full-Parameter Fine-Tuning on Multiple GPUs](#full-parameter-fine-tuning-on-multiple-gpus)
- [LoRA Fine-Tuning](#lora-fine-tuning)
- [QLoRA Fine-Tuning](#qlora-fine-tuning)
- [Full-Parameter Fine-Tuning](#full-parameter-fine-tuning)
- [Merging LoRA Adapters and Quantization](#merging-lora-adapters-and-quantization)
- [Inferring LoRA Fine-Tuned Models](#inferring-lora-fine-tuned-models)
- [Extras](#extras)
Use `CUDA_VISIBLE_DEVICES` (GPU) or `ASCEND_RT_VISIBLE_DEVICES` (NPU) to choose computing devices.
By default, LLaMA-Factory uses all visible computing devices.
Basic usage:
```bash
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```
Advanced usage:
```bash
CUDA_VISIBLE_DEVICES=0,1 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml \
learning_rate=1e-5 \
logging_steps=1
```
```bash
bash examples/train_lora/llama3_lora_sft.sh
```
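The `key=value` pairs in the advanced usage above override entries from the YAML file. A rough sketch of how such overrides can be merged into a config dict (a hypothetical helper, not LLaMA-Factory's actual parser):

```python
import ast

def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Merge CLI-style key=value overrides into a config dict (sketch)."""
    merged = dict(config)
    for item in overrides:
        key, _, raw = item.partition("=")
        try:
            value = ast.literal_eval(raw)  # numbers, bools, lists, ...
        except (ValueError, SyntaxError):
            value = raw                    # fall back to a plain string
        merged[key] = value
    return merged

config = {"learning_rate": 1.0e-4, "logging_steps": 10}
config = apply_overrides(config, ["learning_rate=1e-5", "logging_steps=1"])
```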
## Examples
### LoRA Fine-Tuning on A Single GPU
### LoRA Fine-Tuning
#### (Continuous) Pre-Training
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_pretrain.yaml
llamafactory-cli train examples/train_lora/llama3_lora_pretrain.yaml
```
#### Supervised Fine-Tuning
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_sft.yaml
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```
#### Multimodal Supervised Fine-Tuning
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llava1_5_lora_sft.yaml
```
#### Reward Modeling
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_reward.yaml
```
#### PPO Training
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_ppo.yaml
llamafactory-cli train examples/train_lora/qwen2_5vl_lora_sft.yaml
```
#### DPO/ORPO/SimPO Training
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_dpo.yaml
llamafactory-cli train examples/train_lora/llama3_lora_dpo.yaml
```
#### Multimodal DPO/ORPO/SimPO Training
```bash
llamafactory-cli train examples/train_lora/qwen2_5vl_lora_dpo.yaml
```
#### Reward Modeling
```bash
llamafactory-cli train examples/train_lora/llama3_lora_reward.yaml
```
#### PPO Training
```bash
llamafactory-cli train examples/train_lora/llama3_lora_ppo.yaml
```
#### KTO Training
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_kto.yaml
llamafactory-cli train examples/train_lora/llama3_lora_kto.yaml
```
#### Preprocess Dataset
@ -64,95 +90,93 @@ CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lo
It is useful for large datasets; use `tokenized_path` in the config to load the preprocessed dataset.
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_preprocess.yaml
llamafactory-cli train examples/train_lora/llama3_preprocess.yaml
```
#### Evaluating on MMLU/CMMLU/C-Eval Benchmarks
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli eval examples/lora_single_gpu/llama3_lora_eval.yaml
```
#### Batch Predicting and Computing BLEU and ROUGE Scores
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_predict.yaml
```
### QLoRA Fine-Tuning on a Single GPU
#### Supervised Fine-Tuning with 4/8-bit Bitsandbytes Quantization (Recommended)
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_bitsandbytes.yaml
```
#### Supervised Fine-Tuning with 4/8-bit GPTQ Quantization
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_gptq.yaml
```
#### Supervised Fine-Tuning with 4-bit AWQ Quantization
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_awq.yaml
```
#### Supervised Fine-Tuning with 2-bit AQLM Quantization
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_aqlm.yaml
```
### LoRA Fine-Tuning on Multiple GPUs
#### Supervised Fine-Tuning on Single Node
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train examples/lora_multi_gpu/llama3_lora_sft.yaml
llamafactory-cli eval examples/train_lora/llama3_lora_eval.yaml
```
#### Supervised Fine-Tuning on Multiple Nodes
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 NNODES=2 RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/lora_multi_gpu/llama3_lora_sft.yaml
CUDA_VISIBLE_DEVICES=0,1,2,3 NNODES=2 RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/lora_multi_gpu/llama3_lora_sft.yaml
FORCE_TORCHRUN=1 NNODES=2 NODE_RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
FORCE_TORCHRUN=1 NNODES=2 NODE_RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```
#### Supervised Fine-Tuning with DeepSpeed ZeRO-3 (Weight Sharding)
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train examples/lora_multi_gpu/llama3_lora_sft_ds.yaml
FORCE_TORCHRUN=1 llamafactory-cli train examples/train_lora/llama3_lora_sft_ds3.yaml
```
### LoRA Fine-Tuning on Multiple NPUs
#### Supervised Fine-Tuning with DeepSpeed ZeRO-0
#### Supervised Fine-Tuning with Ray on 4 GPUs
```bash
ASCEND_RT_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train examples/lora_multi_npu/llama3_lora_sft_ds.yaml
USE_RAY=1 llamafactory-cli train examples/train_lora/llama3_lora_sft_ray.yaml
```
### Full-Parameter Fine-Tuning on Multiple GPUs
### QLoRA Fine-Tuning
#### Supervised Fine-Tuning with 4/8-bit Bitsandbytes/HQQ/EETQ Quantization (Recommended)
```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_otfq.yaml
```
#### Supervised Fine-Tuning with 4-bit Bitsandbytes Quantization on Ascend NPU
```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_bnb_npu.yaml
```
#### Supervised Fine-Tuning with 4/8-bit GPTQ Quantization
```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_gptq.yaml
```
#### Supervised Fine-Tuning with 4-bit AWQ Quantization
```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_awq.yaml
```
#### Supervised Fine-Tuning with 2-bit AQLM Quantization
```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_aqlm.yaml
```
### Full-Parameter Fine-Tuning
#### Supervised Fine-Tuning on Single Node
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train examples/full_multi_gpu/llama3_full_sft.yaml
FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/llama3_full_sft.yaml
```
#### Supervised Fine-Tuning on Multiple Nodes
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 NNODES=2 RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/full_multi_gpu/llama3_full_sft.yaml
CUDA_VISIBLE_DEVICES=0,1,2,3 NNODES=2 RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/full_multi_gpu/llama3_full_sft.yaml
FORCE_TORCHRUN=1 NNODES=2 NODE_RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_full/llama3_full_sft.yaml
FORCE_TORCHRUN=1 NNODES=2 NODE_RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_full/llama3_full_sft.yaml
```
#### Batch Predicting and Computing BLEU and ROUGE Scores
### Elastic and Fault-Tolerant Supervised Fine-Tuning on Multiple Nodes
To launch an elastic job that tolerates up to `MAX_RESTARTS` restarts on failure, run the following command on at least `MIN_NNODES` and at most `MAX_NNODES` nodes. `RDZV_ID` should be set to a unique job ID shared by all nodes participating in the job. See also [torchrun](https://docs.pytorch.org/docs/stable/elastic/run.html).
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train examples/full_multi_gpu/llama3_full_predict.yaml
FORCE_TORCHRUN=1 MIN_NNODES=1 MAX_NNODES=3 MAX_RESTARTS=3 RDZV_ID=llamafactory MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_full/llama3_full_sft.yaml
```
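These environment variables correspond, approximately, to torchrun's elastic-launch flags; a small sketch of the assumed mapping (see the torchrun docs linked above for the authoritative flag names):

```python
def torchrun_elastic_args(env: dict) -> list[str]:
    """Map the env vars above to approximate torchrun elastic flags (sketch)."""
    return [
        "torchrun",
        f"--nnodes={env['MIN_NNODES']}:{env['MAX_NNODES']}",  # elastic node range
        f"--max-restarts={env['MAX_RESTARTS']}",              # tolerated failures
        "--rdzv-backend=c10d",
        f"--rdzv-id={env['RDZV_ID']}",                        # unique job id
        f"--rdzv-endpoint={env['MASTER_ADDR']}:{env['MASTER_PORT']}",
    ]

args = torchrun_elastic_args({
    "MIN_NNODES": "1", "MAX_NNODES": "3", "MAX_RESTARTS": "3",
    "RDZV_ID": "llamafactory", "MASTER_ADDR": "192.168.0.1", "MASTER_PORT": "29500",
})
```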
#### Multimodal Supervised Fine-Tuning
```bash
FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/qwen2_5vl_full_sft.yaml
```
### Merging LoRA Adapters and Quantization
@ -162,35 +186,46 @@ CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train examples/full_multi_gpu/llam
Note: DO NOT use quantized model or `quantization_bit` when merging LoRA adapters.
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
```
#### Quantizing Model using AutoGPTQ
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli export examples/merge_lora/llama3_gptq.yaml
llamafactory-cli export examples/merge_lora/llama3_gptq.yaml
```
### Save Ollama modelfile
```bash
llamafactory-cli export examples/merge_lora/llama3_full_sft.yaml
```
### Inferring LoRA Fine-Tuned Models
Use `CUDA_VISIBLE_DEVICES=0,1` to run inference on multiple devices.
#### Evaluation using vLLM's Multi-GPU Inference
#### Use CLI
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
```
python scripts/vllm_infer.py --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct --template llama3 --dataset alpaca_en_demo
python scripts/eval_bleu_rouge.py generated_predictions.jsonl
```
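As a rough illustration of what the n-gram overlap metrics above measure, here is a toy clipped unigram-precision (BLEU-1 building block) sketch, not the actual `eval_bleu_rouge.py`:

```python
from collections import Counter

def unigram_precision(prediction: str, reference: str) -> float:
    """Clipped unigram precision: matched prediction tokens / prediction length."""
    pred, ref = Counter(prediction.split()), Counter(reference.split())
    overlap = sum(min(count, ref[tok]) for tok, count in pred.items())
    return overlap / max(sum(pred.values()), 1)

score = unigram_precision("the cat sat on the mat", "the cat is on the mat")
```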
#### Use Web UI
#### Use CLI ChatBox
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli webchat examples/inference/llama3_lora_sft.yaml
llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
```
#### Use Web UI ChatBox
```bash
llamafactory-cli webchat examples/inference/llama3_lora_sft.yaml
```
#### Launch OpenAI-style API
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli api examples/inference/llama3_lora_sft.yaml
llamafactory-cli api examples/inference/llama3_lora_sft.yaml
```
### Extras
@ -198,36 +233,60 @@ CUDA_VISIBLE_DEVICES=0 llamafactory-cli api examples/inference/llama3_lora_sft.y
#### Full-Parameter Fine-Tuning using GaLore
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/galore/llama3_full_sft.yaml
llamafactory-cli train examples/extras/galore/llama3_full_sft.yaml
```
#### Full-Parameter Fine-Tuning using APOLLO
```bash
llamafactory-cli train examples/extras/apollo/llama3_full_sft.yaml
```
#### Full-Parameter Fine-Tuning using BAdam
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/badam/llama3_full_sft.yaml
llamafactory-cli train examples/extras/badam/llama3_full_sft.yaml
```
#### Full-Parameter Fine-Tuning using Adam-mini
```bash
llamafactory-cli train examples/extras/adam_mini/qwen2_full_sft.yaml
```
#### Full-Parameter Fine-Tuning using Muon
```bash
llamafactory-cli train examples/extras/muon/qwen2_full_sft.yaml
```
#### LoRA+ Fine-Tuning
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/loraplus/llama3_lora_sft.yaml
llamafactory-cli train examples/extras/loraplus/llama3_lora_sft.yaml
```
#### PiSSA Fine-Tuning
```bash
llamafactory-cli train examples/extras/pissa/llama3_lora_sft.yaml
```
#### Mixture-of-Depths Fine-Tuning
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/mod/llama3_full_sft.yaml
llamafactory-cli train examples/extras/mod/llama3_full_sft.yaml
```
#### LLaMA-Pro Fine-Tuning
```bash
bash examples/extras/llama_pro/expand.sh
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/llama_pro/llama3_freeze_sft.yaml
llamafactory-cli train examples/extras/llama_pro/llama3_freeze_sft.yaml
```
#### FSDP+QLoRA Fine-Tuning
```bash
bash examples/extras/fsdp_qlora/single_node.sh
bash examples/extras/fsdp_qlora/train.sh
```


@ -4,59 +4,85 @@
## 目录
- [单 GPU LoRA 微调](#单-gpu-lora-微调)
- [单 GPU QLoRA 微调](#单-gpu-qlora-微调)
- [多 GPU LoRA 微调](#多-gpu-lora-微调)
- [多 NPU LoRA 微调](#多-npu-lora-微调)
- [多 GPU 全参数微调](#多-gpu-全参数微调)
- [LoRA 微调](#lora-微调)
- [QLoRA 微调](#qlora-微调)
- [全参数微调](#全参数微调)
- [合并 LoRA 适配器与模型量化](#合并-lora-适配器与模型量化)
- [推理 LoRA 模型](#推理-lora-模型)
- [杂项](#杂项)
使用 `CUDA_VISIBLE_DEVICES`GPU`ASCEND_RT_VISIBLE_DEVICES`NPU选择计算设备。
LLaMA-Factory 默认使用所有可见的计算设备。
基础用法:
```bash
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```
高级用法:
```bash
CUDA_VISIBLE_DEVICES=0,1 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml \
learning_rate=1e-5 \
logging_steps=1
```
```bash
bash examples/train_lora/llama3_lora_sft.sh
```
## 示例
### 单 GPU LoRA 微调
### LoRA 微调
#### (增量)预训练
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_pretrain.yaml
llamafactory-cli train examples/train_lora/llama3_lora_pretrain.yaml
```
#### 指令监督微调
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_sft.yaml
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```
#### 多模态指令监督微调
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llava1_5_lora_sft.yaml
```
#### 奖励模型训练
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_reward.yaml
```
#### PPO 训练
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_ppo.yaml
llamafactory-cli train examples/train_lora/qwen2_5vl_lora_sft.yaml
```
#### DPO/ORPO/SimPO 训练
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_dpo.yaml
llamafactory-cli train examples/train_lora/llama3_lora_dpo.yaml
```
#### 多模态 DPO/ORPO/SimPO 训练
```bash
llamafactory-cli train examples/train_lora/qwen2_5vl_lora_dpo.yaml
```
#### 奖励模型训练
```bash
llamafactory-cli train examples/train_lora/llama3_lora_reward.yaml
```
#### PPO 训练
```bash
llamafactory-cli train examples/train_lora/llama3_lora_ppo.yaml
```
#### KTO 训练
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_kto.yaml
llamafactory-cli train examples/train_lora/llama3_lora_kto.yaml
```
#### 预处理数据集
@ -64,95 +90,93 @@ CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lo
对大数据集很有帮助:在配置中使用 `tokenized_path` 以加载预处理后的数据集。
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_preprocess.yaml
llamafactory-cli train examples/train_lora/llama3_preprocess.yaml
```
#### 在 MMLU/CMMLU/C-Eval 上评估
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli eval examples/lora_single_gpu/llama3_lora_eval.yaml
llamafactory-cli eval examples/train_lora/llama3_lora_eval.yaml
```
#### 批量预测并计算 BLEU 和 ROUGE 分数
#### 多机指令监督微调
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_predict.yaml
FORCE_TORCHRUN=1 NNODES=2 NODE_RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
FORCE_TORCHRUN=1 NNODES=2 NODE_RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```
### 单 GPU QLoRA 微调
### 支持弹性和容错的多机指令监督微调
#### 基于 4/8 比特 Bitsandbytes 量化进行指令监督微调(推荐)
要启动一个支持弹性节点和容错的多机指令微调,在每个节点上执行以下命令。弹性节点数量范围为 `MIN_NNODES:MAX_NNODES`,每个节点最多允许因为错误重启 `MAX_RESTARTS` 次。`RDZV_ID` 应设置为一个唯一的作业 IDIDtorchrunhttps://docs.pytorch.org/docs/stable/elastic/run.html。
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_bitsandbytes.yaml
```
#### 基于 4/8 比特 GPTQ 量化进行指令监督微调
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_gptq.yaml
```
#### 基于 4 比特 AWQ 量化进行指令监督微调
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_awq.yaml
```
#### 基于 2 比特 AQLM 量化进行指令监督微调
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_aqlm.yaml
```
### 多 GPU LoRA 微调
#### 在单机上进行指令监督微调
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train examples/lora_multi_gpu/llama3_lora_sft.yaml
```
#### 在多机上进行指令监督微调
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 NNODES=2 RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/lora_multi_gpu/llama3_lora_sft.yaml
CUDA_VISIBLE_DEVICES=0,1,2,3 NNODES=2 RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/lora_multi_gpu/llama3_lora_sft.yaml
FORCE_TORCHRUN=1 MIN_NNODES=1 MAX_NNODES=3 MAX_RESTARTS=3 RDZV_ID=llamafactory MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_full/llama3_full_sft.yaml
```
#### 使用 DeepSpeed ZeRO-3 平均分配显存
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train examples/lora_multi_gpu/llama3_lora_sft_ds.yaml
FORCE_TORCHRUN=1 llamafactory-cli train examples/train_lora/llama3_lora_sft_ds3.yaml
```
### 多 NPU LoRA 微调
#### 使用 DeepSpeed ZeRO-0 进行指令监督微调
#### 使用 Ray 在 4 张 GPU 上微调
```bash
ASCEND_RT_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train examples/lora_multi_npu/llama3_lora_sft_ds.yaml
USE_RAY=1 llamafactory-cli train examples/train_lora/llama3_lora_sft_ray.yaml
```
### 多 GPU 全参数微调
### QLoRA 微调
#### 基于 4/8 比特 Bitsandbytes/HQQ/EETQ 量化进行指令监督微调(推荐)
```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_otfq.yaml
```
#### 在 NPU 上基于 4 比特 Bitsandbytes 量化进行指令监督微调
```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_bnb_npu.yaml
```
#### 基于 4/8 比特 GPTQ 量化进行指令监督微调
```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_gptq.yaml
```
#### 基于 4 比特 AWQ 量化进行指令监督微调
```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_awq.yaml
```
#### 基于 2 比特 AQLM 量化进行指令监督微调
```bash
llamafactory-cli train examples/train_qlora/llama3_lora_sft_aqlm.yaml
```
### 全参数微调
#### 在单机上进行指令监督微调
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train examples/full_multi_gpu/llama3_full_sft.yaml
FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/llama3_full_sft.yaml
```
#### 在多机上进行指令监督微调
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 NNODES=2 RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/full_multi_gpu/llama3_full_sft.yaml
CUDA_VISIBLE_DEVICES=0,1,2,3 NNODES=2 RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/full_multi_gpu/llama3_full_sft.yaml
FORCE_TORCHRUN=1 NNODES=2 NODE_RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_full/llama3_full_sft.yaml
FORCE_TORCHRUN=1 NNODES=2 NODE_RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_full/llama3_full_sft.yaml
```
#### 批量预测并计算 BLEU 和 ROUGE 分数
#### 多模态指令监督微调
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train examples/full_multi_gpu/llama3_full_predict.yaml
FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/qwen2_5vl_full_sft.yaml
```
### 合并 LoRA 适配器与模型量化
@ -162,35 +186,46 @@ CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train examples/full_multi_gpu/llam
注:请勿使用量化后的模型或 `quantization_bit` 参数来合并 LoRA 适配器。
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
```
#### 使用 AutoGPTQ 量化模型
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli export examples/merge_lora/llama3_gptq.yaml
llamafactory-cli export examples/merge_lora/llama3_gptq.yaml
```
### 保存 Ollama 配置文件
```bash
llamafactory-cli export examples/merge_lora/llama3_full_sft.yaml
```
### 推理 LoRA 模型
使用 `CUDA_VISIBLE_DEVICES=0,1` 进行多卡推理。
#### 使用 vLLM 多卡推理评估
#### 使用命令行接口
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
```
python scripts/vllm_infer.py --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct --template llama3 --dataset alpaca_en_demo
python scripts/eval_bleu_rouge.py generated_predictions.jsonl
```
#### 使用浏览器界面
#### 使用命令行对话框
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli webchat examples/inference/llama3_lora_sft.yaml
llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
```
#### 使用浏览器对话框
```bash
llamafactory-cli webchat examples/inference/llama3_lora_sft.yaml
```
#### 启动 OpenAI 风格 API
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli api examples/inference/llama3_lora_sft.yaml
llamafactory-cli api examples/inference/llama3_lora_sft.yaml
```
### 杂项
@ -198,36 +233,60 @@ CUDA_VISIBLE_DEVICES=0 llamafactory-cli api examples/inference/llama3_lora_sft.y
#### 使用 GaLore 进行全参数训练
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/galore/llama3_full_sft.yaml
llamafactory-cli train examples/extras/galore/llama3_full_sft.yaml
```
#### 使用 APOLLO 进行全参数训练
```bash
llamafactory-cli train examples/extras/apollo/llama3_full_sft.yaml
```
#### 使用 BAdam 进行全参数训练
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/badam/llama3_full_sft.yaml
llamafactory-cli train examples/extras/badam/llama3_full_sft.yaml
```
#### 使用 Adam-mini 进行全参数训练
```bash
llamafactory-cli train examples/extras/adam_mini/qwen2_full_sft.yaml
```
#### 使用 Muon 进行全参数训练
```bash
llamafactory-cli train examples/extras/muon/qwen2_full_sft.yaml
```
#### LoRA+ 微调
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/loraplus/llama3_lora_sft.yaml
llamafactory-cli train examples/extras/loraplus/llama3_lora_sft.yaml
```
#### PiSSA 微调
```bash
llamafactory-cli train examples/extras/pissa/llama3_lora_sft.yaml
```
#### 深度混合微调
```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/mod/llama3_full_sft.yaml
llamafactory-cli train examples/extras/mod/llama3_full_sft.yaml
```
#### LLaMA-Pro 微调
```bash
bash examples/extras/llama_pro/expand.sh
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/llama_pro/llama3_freeze_sft.yaml
llamafactory-cli train examples/extras/llama_pro/llama3_freeze_sft.yaml
```
#### FSDP+QLoRA 微调
```bash
bash examples/extras/fsdp_qlora/single_node.sh
bash examples/extras/fsdp_qlora/train.sh
```


@ -7,16 +7,16 @@ fsdp_config:
fsdp_backward_prefetch: BACKWARD_PRE
fsdp_forward_prefetch: false
fsdp_cpu_ram_efficient_loading: true
fsdp_offload_params: true # offload may affect training speed
fsdp_offload_params: false
fsdp_sharding_strategy: FULL_SHARD
fsdp_state_dict_type: FULL_STATE_DICT
fsdp_sync_module_states: true
fsdp_use_orig_params: true
machine_rank: 0
main_training_function: main
mixed_precision: fp16 # or bf16
num_machines: 1 # the number of nodes
num_processes: 2 # the number of GPUs in all nodes
mixed_precision: bf16 # or fp16
num_machines: 1 # the number of nodes
num_processes: 2 # the number of GPUs in all nodes
rdzv_backend: static
same_network: true
tpu_env: []


@ -0,0 +1,25 @@
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: FSDP
downcast_bf16: 'no'
fsdp_config:
fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
fsdp_backward_prefetch: BACKWARD_PRE
fsdp_forward_prefetch: false
fsdp_cpu_ram_efficient_loading: true
fsdp_offload_params: true # offload may affect training speed
fsdp_sharding_strategy: FULL_SHARD
fsdp_state_dict_type: FULL_STATE_DICT
fsdp_sync_module_states: true
fsdp_use_orig_params: true
machine_rank: 0
main_training_function: main
mixed_precision: bf16 # or fp16
num_machines: 1 # the number of nodes
num_processes: 2 # the number of GPUs in all nodes
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false


@ -19,10 +19,10 @@
"stage": 0,
"allgather_partitions": true,
"allgather_bucket_size": 5e8,
"overlap_comm": true,
"overlap_comm": false,
"reduce_scatter": true,
"reduce_bucket_size": 5e8,
"contiguous_gradients": true,
"round_robin_gradients": true
}
}
}


@ -19,10 +19,10 @@
"stage": 2,
"allgather_partitions": true,
"allgather_bucket_size": 5e8,
"overlap_comm": true,
"overlap_comm": false,
"reduce_scatter": true,
"reduce_bucket_size": 5e8,
"contiguous_gradients": true,
"round_robin_gradients": true
}
}
}


@ -23,10 +23,10 @@
},
"allgather_partitions": true,
"allgather_bucket_size": 5e8,
"overlap_comm": true,
"overlap_comm": false,
"reduce_scatter": true,
"reduce_bucket_size": 5e8,
"contiguous_gradients": true,
"round_robin_gradients": true
}
}
}


@ -17,7 +17,7 @@
},
"zero_optimization": {
"stage": 3,
"overlap_comm": true,
"overlap_comm": false,
"contiguous_gradients": true,
"sub_group_size": 1e9,
"reduce_bucket_size": "auto",
@ -27,4 +27,4 @@
"stage3_max_reuse_distance": 1e9,
"stage3_gather_16bit_weights_on_model_save": true
}
}
}


@ -25,7 +25,7 @@
"device": "cpu",
"pin_memory": true
},
"overlap_comm": true,
"overlap_comm": false,
"contiguous_gradients": true,
"sub_group_size": 1e9,
"reduce_bucket_size": "auto",
@ -35,4 +35,4 @@
"stage3_max_reuse_distance": 1e9,
"stage3_gather_16bit_weights_on_model_save": true
}
}
}


@ -0,0 +1,43 @@
### model
model_name_or_path: Qwen/Qwen2-1.5B-Instruct
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: full
use_adam_mini: true
### dataset
dataset: identity,alpaca_en_demo
template: qwen
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/qwen2-1_5b/full/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
### eval
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500


@ -0,0 +1,48 @@
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: full
use_apollo: true
apollo_layerwise: true # choices: [true, false], use false for DDP training
apollo_target: all
apollo_rank: 128
apollo_scale: 32.0
apollo_scale_type: channel
### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/llama3-8b/full/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 1 # use 1 for layerwise apollo
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
pure_bf16: true
ddp_timeout: 180000000
### eval
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500


@ -1,22 +1,26 @@
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: full
use_badam: true
badam_mode: layer
badam_switch_mode: ascending
badam_switch_interval: 50
badam_verbose: 2
# deepspeed: examples/deepspeed/ds_z3_config.json
### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/llama3-8b/full/sft
@ -24,18 +28,19 @@ logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
pure_bf16: true
### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500


@ -1,23 +1,23 @@
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
quantization_bit: 4
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_target: all
### ddp
ddp_timeout: 180000000
### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/llama3-8b/lora/sft
@ -25,6 +25,8 @@ logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### train
per_device_train_batch_size: 1
@@ -33,10 +35,11 @@ learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
fp16: true
bf16: true
ddp_timeout: 180000000
### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500

View File

@@ -1,23 +1,25 @@
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: full
use_galore: true
galore_layerwise: true
galore_target: mlp,self_attn
galore_layerwise: true # choices: [true, false], use false for DDP training
galore_target: all
galore_rank: 128
galore_scale: 2.0
### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/llama3-8b/full/sft
@@ -25,18 +27,21 @@ logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 1
learning_rate: 1.0e-4
gradient_accumulation_steps: 1 # use 1 for layerwise galore
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
pure_bf16: true
ddp_timeout: 180000000
### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500
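The GaLore hunk pins `gradient_accumulation_steps: 1` for layerwise mode. A plausible intuition (my reading, not the GaLore source): in layerwise mode each layer's update fires inside its own backward hook and the gradient is discarded immediately, so there is nothing left to accumulate across micro-batches. A toy sketch of that update-in-the-hook pattern:

```python
# Toy parameter whose update happens inside the backward hook itself,
# mimicking why per-micro-batch accumulation cannot apply. Illustrative only.
class LayerwiseParam:
    def __init__(self, value):
        self.value = value

    def backward_hook(self, grad, lr=0.1):
        self.value -= lr * grad  # update applied per micro-batch...
        return None              # ...and the gradient is dropped, not stored

p = LayerwiseParam(1.0)
p.backward_hook(2.0)  # value becomes 1.0 - 0.1 * 2.0 = 0.8
assert abs(p.value - 0.8) < 1e-9
```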

View File

@@ -2,5 +2,5 @@
python scripts/llama_pro.py \
--model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
--output_dir models/llama3-8b-instruct-pro \
--output_dir models/llama3-8b-pro \
--num_expand 8
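Conceptually, `scripts/llama_pro.py` performs block expansion: with `--num_expand 8` on a 32-layer model, an identity-initialized copy is inserted after every 32 // 8 = 4 original layers. The sketch below illustrates the layer layout only; the function name and tuple encoding are invented for illustration and are not the script's internals:

```python
# Hypothetical sketch of LLaMA Pro block expansion on layer indices.
def expand_layers(num_layers, num_expand):
    stride = num_layers // num_expand
    expanded = []
    for i in range(num_layers):
        expanded.append(("orig", i))
        if (i + 1) % stride == 0:
            # Duplicated block; in LLaMA Pro its output projection is
            # zero-initialized so the expanded model starts as an identity.
            expanded.append(("copy", i))
    return expanded

layers = expand_layers(32, 8)
assert len(layers) == 40           # 32 original + 8 copied blocks
assert layers[4] == ("copy", 3)    # first copy lands after layer 3
```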

View File

@@ -1,5 +1,6 @@
### model
model_name_or_path: models/llama3-8b-instruct-pro
model_name_or_path: models/llama3-8b-pro
trust_remote_code: true
### method
stage: sft
@@ -12,17 +13,20 @@ use_llama_pro: true
### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/llama3-8b-instruct-pro/freeze/sft
output_dir: saves/llama3-8b-pro/freeze/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### train
per_device_train_batch_size: 1
@@ -31,10 +35,11 @@ learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
fp16: true
bf16: true
ddp_timeout: 180000000
### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500

View File

@@ -1,20 +1,23 @@
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_target: all
loraplus_lr_ratio: 16.0
### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/llama3-8b/lora/sft
@@ -22,6 +25,8 @@ logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### train
per_device_train_batch_size: 1
@@ -30,10 +35,11 @@ learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
fp16: true
bf16: true
ddp_timeout: 180000000
### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500

View File

@@ -1,5 +1,6 @@
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
trust_remote_code: true
### method
stage: sft
@@ -10,10 +11,11 @@ mixture_of_depths: convert
### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/llama3-8b-mod/full/sft
@@ -21,19 +23,22 @@ logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
optim: paged_adamw_8bit
learning_rate: 1.0e-4
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
pure_bf16: true
ddp_timeout: 180000000
### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500

View File

@@ -0,0 +1,43 @@
### model
model_name_or_path: Qwen/Qwen2-1.5B-Instruct
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: full
use_muon: true
### dataset
dataset: identity,alpaca_en_demo
template: qwen
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/qwen2-1_5b/full/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
### eval
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500

View File

@@ -1,6 +1,10 @@
# The batch generation can be SLOW using this config.
# For faster inference, we recommend to use `scripts/vllm_infer.py`.
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
adapter_name_or_path: saves/llama3-8b/lora/sft
trust_remote_code: true
### method
stage: sft
@@ -8,17 +12,20 @@ do_predict: true
finetuning_type: lora
### dataset
dataset: identity,alpaca_en_demo
eval_dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
cutoff_len: 2048
max_samples: 50
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/llama3-8b/lora/predict
overwrite_output_dir: true
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### eval
per_device_eval_batch_size: 1
predict_with_generate: true
ddp_timeout: 180000000

View File

@@ -0,0 +1,5 @@
#!/bin/bash
python scripts/pissa_init.py \
--model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
--output_dir models/llama3-8b-pissa
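The idea behind `scripts/pissa_init.py` is that PiSSA initializes the adapter from the principal singular components of a weight matrix and leaves the residual in the frozen base. The sketch below is heavily simplified: it uses a diagonal matrix (represented by its diagonal entries) so the SVD is trivial, and the function name is invented for illustration:

```python
# Toy PiSSA-style split: top-rank singular components seed the trainable
# adapter; the remainder stays in the frozen base weight. Illustrative only.
def pissa_split(diag_weights, rank):
    order = sorted(range(len(diag_weights)), key=lambda i: -abs(diag_weights[i]))
    principal = {i: diag_weights[i] for i in order[:rank]}  # -> adapter init
    residual = {i: diag_weights[i] for i in order[rank:]}   # -> frozen base
    return principal, residual

principal, residual = pissa_split([0.1, 3.0, 0.5, 2.0], rank=2)
assert principal == {1: 3.0, 3: 2.0}   # two largest singular values
assert residual == {2: 0.5, 0: 0.1}
```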

View File

@@ -1,19 +1,25 @@
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_target: all
pissa_init: true
pissa_iter: 16
pissa_convert: true
### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/llama3-8b/lora/sft
@@ -21,6 +27,8 @@ logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### train
per_device_train_batch_size: 1
@@ -29,10 +37,11 @@ learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
fp16: true
bf16: true
ddp_timeout: 180000000
### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500
# val_size: 0.1
# per_device_eval_batch_size: 1
# eval_strategy: steps
# eval_steps: 500

View File

@@ -1,23 +0,0 @@
### model
model_name_or_path: saves/llama3-8b/full/sft
### method
stage: sft
do_predict: true
finetuning_type: full
### dataset
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
max_samples: 50
overwrite_cache: true
preprocessing_num_workers: 16
### output
output_dir: saves/llama3-8b/full/predict
overwrite_output_dir: true
### eval
per_device_eval_batch_size: 1
predict_with_generate: true

View File

@@ -1,2 +1,4 @@
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
template: llama3
infer_backend: huggingface # choices: [huggingface, vllm, sglang]
trust_remote_code: true

View File

@@ -0,0 +1,4 @@
model_name_or_path: saves/llama3-8b/full/sft
template: llama3
infer_backend: huggingface # choices: [huggingface, vllm, sglang]
trust_remote_code: true

View File

@@ -1,4 +1,5 @@
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
adapter_name_or_path: saves/llama3-8b/lora/sft
template: llama3
finetuning_type: lora
infer_backend: huggingface # choices: [huggingface, vllm, sglang]
trust_remote_code: true

View File

@@ -1,4 +0,0 @@
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
template: llama3
infer_backend: vllm
vllm_enforce_eager: true

View File

@@ -0,0 +1,4 @@
model_name_or_path: Qwen/Qwen2.5-VL-7B-Instruct
template: qwen2_vl
infer_backend: huggingface # choices: [huggingface, vllm, sglang]
trust_remote_code: true

View File

@@ -1,39 +0,0 @@
### model
model_name_or_path: llava-hf/llava-1.5-7b-hf
visual_inputs: true
### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
### dataset
dataset: mllm_demo
template: vicuna
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
### output
output_dir: saves/llava1_5-7b/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
fp16: true
### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500

View File

@@ -0,0 +1,10 @@
### model
model_name_or_path: saves/llama3-8b/full/sft
template: llama3
trust_remote_code: true
### export
export_dir: output/llama3_full_sft
export_size: 5
export_device: cpu # choices: [cpu, auto]
export_legacy_format: false
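`export_size: 5` in the export config above caps each exported checkpoint shard at roughly 5 GB. The helper below is a toy illustration of the resulting shard count under that assumption (actual sharding is delegated to the underlying `save_pretrained` machinery, and the function name is invented):

```python
# Toy shard-count estimate for a max shard size in GB. Illustrative only.
def shard_count(total_gb, max_shard_gb=5):
    return -(-total_gb // max_shard_gb)  # ceiling division on integers

assert shard_count(16) == 4  # a ~16 GB checkpoint -> 4 shards of <= 5 GB
assert shard_count(5) == 1
```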

Some files were not shown because too many files have changed in this diff.