2883 Commits

Author SHA1 Message Date
hoshi-hiyouga
98ea0e8109 [misc] fix ds config (#7205)
Former-commit-id: db113f690e
2025-03-07 15:21:28 +08:00
ZhangChuanhui
33b4c33279 [data] fix function formatter (#7201)
Co-authored-by: zhangchuanhui <zhangchal@digitalchina.com>
Former-commit-id: 194e3bddb2
2025-03-07 15:17:23 +08:00
hoshi-hiyouga
113cc3d920 [misc] fix cli (#7204)
Former-commit-id: bd17223559
2025-03-07 15:01:18 +08:00
hoshi-hiyouga
b6c0e8608e [script] fix vllm version (#7193)
Former-commit-id: 313355759d
2025-03-06 17:14:17 +08:00
hoshi-hiyouga
eba31ae313 [webui] support escape html (#7190)
Former-commit-id: abb23f7673
2025-03-06 16:52:21 +08:00
hoshi-hiyouga
e7556b591e [deps] upgrade vllm (#7183)
Former-commit-id: d739fddb10
2025-03-06 15:25:08 +08:00
hoshi-hiyouga
2b21c749c1 [data] fix mm template (#7181)
Former-commit-id: be66df1f02
2025-03-06 15:18:32 +08:00
hoshi-hiyouga
002f58ef8e [model] add QwQ 32b (#7179)
Former-commit-id: 64a6fb9b50
2025-03-06 11:58:36 +08:00
Ze-Yi LIN
c67d2b9327 [trainer] fix swanlab callback (#7176)
Former-commit-id: 8ad03258e1
2025-03-06 00:33:37 +08:00
hoshi-hiyouga
6e58115f98 [trainer] update config (#7174)
Former-commit-id: b4b89b4ff3
2025-03-05 23:32:54 +08:00
sirui.li
8dddffa340 [data] fix qwen2audio plugin (#7166)
* Update pairwise.py

[data] Repair multimodal model DPO training

* Update pairwise.py

[data] Repair multimodal model DPO training using deepcopy

* Update pairwise.py

* Update mm_plugin.py

Former-commit-id: dff4130969
2025-03-05 18:03:36 +08:00
hoshi-hiyouga
e1d574a784 [assets] update wechat (#7161)
Former-commit-id: 0c403ea15b
2025-03-05 14:11:10 +08:00
hoshi-hiyouga
caef0a8937 [data] use bicubic resampler (#7143)
Former-commit-id: bc298c60b7
2025-03-04 00:17:06 +08:00
hoshi-hiyouga
392533e139 [webui] fix webui (#7142)
Former-commit-id: 17ba2d5082
2025-03-04 00:01:49 +08:00
rabbit
299cd03785 [data] bailing template (#7117)
* add bailing template

* add bailing template

* add bailing template

---------

Co-authored-by: chengshiwen.csw@antgroup.com <chengshiwen.csw@antgroup.com>
Former-commit-id: 049ddf48af
2025-03-03 15:33:22 +08:00
hoshi-hiyouga
ee1b580328 [inference] fix hf_engine (#7120)
Former-commit-id: 1036311826
2025-03-01 05:22:49 +08:00
hoshi-hiyouga
54a090079c [assets] update wechat (#7106)
Former-commit-id: d1863bbbaa
2025-02-28 12:01:04 +08:00
Ze-Yi LIN
210cdb9557 [webui] display swanlab exp link (#7089)
* webui add swanlab link

* change callback name

* update

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 891c487503
2025-02-27 19:40:54 +08:00
leo-pony
e86cb8a4fa [npu] update cann base image and torch 2.4 (#7061)
* Update base npu container image version: Hugging Face Transformers requires Python >= 3.10

* Fix the bug: the arg type of INSTALL_DEEPSPEED should be a string now.

* Update Ascend CANN, CANN-Kernel and corresponding torch and torch-npu version

* Upgrade the package versions torch-npu needs: torch==2.1.0 and torch-npu==2.4.0.post2

Former-commit-id: acc52e0fe7
2025-02-25 23:32:01 +08:00
hoshi-hiyouga
f4aa0a146c [misc] fix project toml (#7067)
Former-commit-id: 96fd510e6a
2025-02-25 23:22:48 +08:00
JieShen
96636c3729 [script] add seed args (#7058)
* add seed args

* add seed args

* update seed

Former-commit-id: e8266fe563
2025-02-25 19:44:57 +08:00
Kingsley
81947f1d2c [model] add paligemma2-mix series (#7060)
Former-commit-id: 19861d5170
2025-02-25 18:51:16 +08:00
hoshi-hiyouga
dca5fe14c2 [data] fix mllama (#7053)
* fix mllama

* fix test

Former-commit-id: 76314e6ad1
2025-02-24 22:05:38 +08:00
hoshi-hiyouga
ca78ba964d [model] add models (#7054)
* add qwen25vl awq models

* add moonlight

Former-commit-id: ec1a1bc118
2025-02-24 22:05:13 +08:00
hoshi-hiyouga
9359ee18ad [assets] update readme (#7051)
Former-commit-id: fe6dd92c84
2025-02-24 20:45:06 +08:00
hoshi-hiyouga
15f3087b96 [assets] update wechat (#7019)
Former-commit-id: 1481af5dc9
2025-02-20 20:32:33 +08:00
Zhangchi Feng
1fcedf9af6 [data] fix MiniCPMV plugin (#6998)
* fix template

* fix bug in messages processing

Former-commit-id: cde479e47a
2025-02-19 19:36:04 +08:00
hoshi-hiyouga
b0bbacaacb [webui] update css (#6985)
Former-commit-id: 302ecb00fe
2025-02-18 18:27:57 +08:00
hoshi-hiyouga
beb1a9f9d9 [data] add r1 distill dataset (#6983)
Former-commit-id: 2591a3fa8b
2025-02-18 17:25:09 +08:00
hoshi-hiyouga
3fbd4848e8 [version] support transformers 449 (#6982)
* support transformers 449

* fix mm plugin

Former-commit-id: b00b290c07
2025-02-18 17:05:40 +08:00
hoshi-hiyouga
184c5d0882 [misc] fix script (#6977)
Former-commit-id: cc8c7e762b
2025-02-18 17:00:46 +08:00
hoshi-hiyouga
1f4a0b11ba [data] update vlm args (#6976)
Former-commit-id: 3da2cc2710
2025-02-18 02:12:51 +08:00
hoshi-hiyouga
b1d31ff0f9 [data] add min resolution option (#6975)
Former-commit-id: 7faecc0301
2025-02-18 01:40:46 +08:00
hoshi-hiyouga
a8c9d5663d [data] fix predict dataset (#6972)
Former-commit-id: bdb581c4a8
2025-02-17 20:29:40 +08:00
hoshi-hiyouga
475a355b82 [assets] update wechat (#6963)
Former-commit-id: ad0c6c8916
2025-02-17 15:23:17 +08:00
Zhangchi Feng
3dc938268c [data] fix minicpmo template (#6946)
Former-commit-id: 2faf8aeff8
2025-02-15 00:37:41 +08:00
Eric Tang
e55ec42d3c [ray] specify ray storage path (#6920)
Former-commit-id: 6edd4992d7
2025-02-14 21:55:41 +08:00
hoshi-hiyouga
2baf8bf03d [misc] fix lora regex (#6944)
* fix lora regex

* fix

Former-commit-id: 1ada3ae5a3
2025-02-14 21:38:43 +08:00
hoshi-hiyouga
13e1b7ee2b [misc] fix grad ckpt (#6931)
Former-commit-id: c31c63b411
2025-02-13 23:27:51 +08:00
hoshi-hiyouga
cd493b91de [model] add liger kernel to qwen2_5 vl (#6930)
* add liger kernel to qwen2_5 vl

* fix patch

* fix patch

Former-commit-id: 797043d29c
2025-02-13 23:05:54 +08:00
Billy Cao
48173b606c [trainer] fix gen_kwarg to eval during training (#5451)
* Correctly pass gen_kwarg to eval during model runs

* fix

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 11eac71c13
2025-02-13 02:35:06 +08:00
SrWYG
0ad9f7f058 [data] evaluate on each dataset (#5522)
* [Update] loader.py: evaluate will run separate evaluations on each dataset.

`If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation`

Seq2SeqTrainer now supports eval_dataset as a Dict.

* fix format

* fix

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 1e35967ae1
2025-02-13 02:19:03 +08:00
Noah
1adb46875f [data] improve error handling (#6128)
* sync from upstream

* update

* update

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 4c7bfebcf1
2025-02-13 01:39:41 +08:00
hoshi-hiyouga
9b852ebe25 [misc] update readme (#6918)
Former-commit-id: 8956c93d9b
2025-02-13 01:01:41 +08:00
hoshi-hiyouga
07aa7b71a3 [misc] update readme (#6917)
Former-commit-id: 499ea45d1f
2025-02-13 00:58:10 +08:00
hoshi-hiyouga
1679930e00 [breaking change] refactor data pipeline (#6901)
* refactor data

* rename file

Former-commit-id: 617c8ab467
2025-02-13 00:39:20 +08:00
Eric Tang
d50e04b805 [misc] support for launching LLaMA-Factory with uv run (#6907)
* yay

* uv with ray temporary commit

* remove ray specific code for now

* cleanup

Former-commit-id: f8a206125d
2025-02-13 00:38:44 +08:00
Eric Tang
e515fe62de [example] fix path to ray example (#6906)
Former-commit-id: ee5fe216dc
2025-02-13 00:29:32 +08:00
hoshi-hiyouga
036fb0d561 [misc] fix grad ckpt func (#6916)
Former-commit-id: e34c3c06da
2025-02-13 00:17:18 +08:00
marko1616
bae934dea3 [trainer] fix llama3.2 vision kto train (#6904)
Former-commit-id: b7fd1e9c00
2025-02-12 19:09:14 +08:00