Commit Graph

2888 Commits

Author SHA1 Message Date
hoshi-hiyouga
4e68828e46 [config] fix export max len (#7230)
Former-commit-id: 728c2f6819
2025-03-10 16:46:08 +08:00
hoshi-hiyouga
9a0044ef5e [assets] update wechat (#7229)
Former-commit-id: ae4cbe8fbc
2025-03-10 15:39:06 +08:00
hoshi-hiyouga
d412301d08 [data] update mm demo data (#7211)
Former-commit-id: 1774882f5a
2025-03-07 20:07:15 +08:00
hoshi-hiyouga
5a0fd22c05 [assets] update readme (#7209)
Former-commit-id: cdf8fc6478
2025-03-07 17:27:49 +08:00
hoshi-hiyouga
df63f05b47 [data] fix loader (#7207)
* fix dataloader

* add test case

* fix type

* fix ci

* fix ci

* fix ci

* disable overwrite cache in ci

Former-commit-id: 8c3f9f6747
2025-03-07 17:20:46 +08:00
hoshi-hiyouga
98ea0e8109 [misc] fix ds config (#7205)
Former-commit-id: db113f690e
2025-03-07 15:21:28 +08:00
ZhangChuanhui
33b4c33279 [data] fix function formatter (#7201)
Co-authored-by: zhangchuanhui <zhangchal@digitalchina.com>
Former-commit-id: 194e3bddb2
2025-03-07 15:17:23 +08:00
hoshi-hiyouga
113cc3d920 [misc] fix cli (#7204)
Former-commit-id: bd17223559
2025-03-07 15:01:18 +08:00
hoshi-hiyouga
b6c0e8608e [script] fix vllm version (#7193)
Former-commit-id: 313355759d
2025-03-06 17:14:17 +08:00
hoshi-hiyouga
eba31ae313 [webui] support escape html (#7190)
Former-commit-id: abb23f7673
2025-03-06 16:52:21 +08:00
hoshi-hiyouga
e7556b591e [deps] upgrade vllm (#7183)
Former-commit-id: d739fddb10
2025-03-06 15:25:08 +08:00
hoshi-hiyouga
2b21c749c1 [data] fix mm template (#7181)
Former-commit-id: be66df1f02
2025-03-06 15:18:32 +08:00
hoshi-hiyouga
002f58ef8e [model] add QwQ 32b (#7179)
Former-commit-id: 64a6fb9b50
2025-03-06 11:58:36 +08:00
Ze-Yi LIN
c67d2b9327 [trainer] fix swanlab callback (#7176)
Former-commit-id: 8ad03258e1
2025-03-06 00:33:37 +08:00
hoshi-hiyouga
6e58115f98 [trainer] update config (#7174)
Former-commit-id: b4b89b4ff3
2025-03-05 23:32:54 +08:00
sirui.li
8dddffa340 [data] fix qwen2audio plugin (#7166)
* Update pairwise.py

[data] Repair multimodal model dpo training

* Update pairwise.py

[data] Repair multimodal model dpo training using deepcopy

* Update pairwise.py

* Update mm_plugin.py

Former-commit-id: dff4130969
2025-03-05 18:03:36 +08:00
hoshi-hiyouga
e1d574a784 [assets] update wechat (#7161)
Former-commit-id: 0c403ea15b
2025-03-05 14:11:10 +08:00
hoshi-hiyouga
caef0a8937 [data] use bicubic resampler (#7143)
Former-commit-id: bc298c60b7
2025-03-04 00:17:06 +08:00
hoshi-hiyouga
392533e139 [webui] fix webui (#7142)
Former-commit-id: 17ba2d5082
2025-03-04 00:01:49 +08:00
rabbit
299cd03785 [data] bailing template (#7117)
* add bailing template

* add bailing template

* add bailing template

---------

Co-authored-by: chengshiwen.csw@antgroup.com <chengshiwen.csw@antgroup.com>
Former-commit-id: 049ddf48af
2025-03-03 15:33:22 +08:00
hoshi-hiyouga
ee1b580328 [inference] fix hf_engine (#7120)
Former-commit-id: 1036311826
2025-03-01 05:22:49 +08:00
hoshi-hiyouga
54a090079c [assets] update wechat (#7106)
Former-commit-id: d1863bbbaa
2025-02-28 12:01:04 +08:00
Ze-Yi LIN
210cdb9557 [webui] display swanlab exp link (#7089)
* webui add swanlab link

* change callback name

* update

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 891c487503
2025-02-27 19:40:54 +08:00
leo-pony
e86cb8a4fa [npu] update cann base image and torch 2.4 (#7061)
* Update base npu container image version: the Python version required for Hugging Face Transformers is >= python3.10

* Fix the bug: the arg type of INSTALL_DEEPSPEED should be string now.

* Update Ascend CANN, CANN-Kernel and corresponding torch and torch-npu version

* Upgrade the package versions torch-npu needs: torch==2.1.0 and torch-npu==2.4.0.post2

Former-commit-id: acc52e0fe7
2025-02-25 23:32:01 +08:00
hoshi-hiyouga
f4aa0a146c [misc] fix project toml (#7067)
Former-commit-id: 96fd510e6a
2025-02-25 23:22:48 +08:00
JieShen
96636c3729 [script] add seed args (#7058)
* add seed args

* add seed args

* update seed

Former-commit-id: e8266fe563
2025-02-25 19:44:57 +08:00
Kingsley
81947f1d2c [model] add paligemma2-mix series (#7060)
Former-commit-id: 19861d5170
2025-02-25 18:51:16 +08:00
hoshi-hiyouga
dca5fe14c2 [data] fix mllama (#7053)
* fix mllama

* fix test

Former-commit-id: 76314e6ad1
2025-02-24 22:05:38 +08:00
hoshi-hiyouga
ca78ba964d [model] add models (#7054)
* add qwen25vl awq models

* add moonlight

Former-commit-id: ec1a1bc118
2025-02-24 22:05:13 +08:00
hoshi-hiyouga
9359ee18ad [assets] update readme (#7051)
Former-commit-id: fe6dd92c84
2025-02-24 20:45:06 +08:00
hoshi-hiyouga
15f3087b96 [assets] update wechat (#7019)
Former-commit-id: 1481af5dc9
2025-02-20 20:32:33 +08:00
Zhangchi Feng
1fcedf9af6 [data] fix MiniCPMV plugin (#6998)
* fix template

* fix bug in messages processing

Former-commit-id: cde479e47a
2025-02-19 19:36:04 +08:00
hoshi-hiyouga
b0bbacaacb [webui] update css (#6985)
Former-commit-id: 302ecb00fe
2025-02-18 18:27:57 +08:00
hoshi-hiyouga
beb1a9f9d9 [data] add r1 distill dataset (#6983)
Former-commit-id: 2591a3fa8b
2025-02-18 17:25:09 +08:00
hoshi-hiyouga
3fbd4848e8 [version] support transformers 449 (#6982)
* support transformers 449

* fix mm plugin

Former-commit-id: b00b290c07
2025-02-18 17:05:40 +08:00
hoshi-hiyouga
184c5d0882 [misc] fix script (#6977)
Former-commit-id: cc8c7e762b
2025-02-18 17:00:46 +08:00
hoshi-hiyouga
1f4a0b11ba [data] update vlm args (#6976)
Former-commit-id: 3da2cc2710
2025-02-18 02:12:51 +08:00
hoshi-hiyouga
b1d31ff0f9 [data] add min resolution option (#6975)
Former-commit-id: 7faecc0301
2025-02-18 01:40:46 +08:00
hoshi-hiyouga
a8c9d5663d [data] fix predict dataset (#6972)
Former-commit-id: bdb581c4a8
2025-02-17 20:29:40 +08:00
hoshi-hiyouga
475a355b82 [assets] update wechat (#6963)
Former-commit-id: ad0c6c8916
2025-02-17 15:23:17 +08:00
Zhangchi Feng
3dc938268c [data] fix minicpmo template (#6946)
Former-commit-id: 2faf8aeff8
2025-02-15 00:37:41 +08:00
Eric Tang
e55ec42d3c [ray] specify ray storage path (#6920)
Former-commit-id: 6edd4992d7
2025-02-14 21:55:41 +08:00
hoshi-hiyouga
2baf8bf03d [misc] fix lora regex (#6944)
* fix lora regex

* fix

Former-commit-id: 1ada3ae5a3
2025-02-14 21:38:43 +08:00
hoshi-hiyouga
13e1b7ee2b [misc] fix grad ckpt (#6931)
Former-commit-id: c31c63b411
2025-02-13 23:27:51 +08:00
hoshi-hiyouga
cd493b91de [model] add liger kernel to qwen2_5 vl (#6930)
* add liger kernel to qwen2_5 vl

* fix patch

* fix patch

Former-commit-id: 797043d29c
2025-02-13 23:05:54 +08:00
Billy Cao
48173b606c [trainer] fix gen_kwarg to eval during training (#5451)
* Correctly pass gen_kwarg to eval during model runs

* fix

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 11eac71c13
2025-02-13 02:35:06 +08:00
SrWYG
0ad9f7f058 [data] evaluate on each dataset (#5522)
* [Update] loader.py: evaluate will run separate evaluations on each dataset.

`If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation`

Seq2SeqTrainer supports eval_dataset as a Dict.

* fix format

* fix

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 1e35967ae1
2025-02-13 02:19:03 +08:00
Noah
1adb46875f [data] improve error handling (#6128)
* sync from upstream

* update

* update

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 4c7bfebcf1
2025-02-13 01:39:41 +08:00
hoshi-hiyouga
9b852ebe25 [misc] update readme (#6918)
Former-commit-id: 8956c93d9b
2025-02-13 01:01:41 +08:00
hoshi-hiyouga
07aa7b71a3 [misc] update readme (#6917)
Former-commit-id: 499ea45d1f
2025-02-13 00:58:10 +08:00