hoshi-hiyouga
b4b89b4ff3
[trainer] update config ( #7174 )
2025-03-05 23:32:54 +08:00
sirui.li
dff4130969
[data] fix qwen2audio plugin ( #7166 )
...
* Update pairwise.py
[data] Repair multimodal model DPO training
* Update pairwise.py
[data] Repair multimodal model DPO training using deepcopy
* Update pairwise.py
* Update mm_plugin.py
2025-03-05 18:03:36 +08:00
hoshi-hiyouga
0c403ea15b
[assets] update wechat ( #7161 )
2025-03-05 14:11:10 +08:00
hoshi-hiyouga
bc298c60b7
[data] use bicubic resampler ( #7143 )
2025-03-04 00:17:06 +08:00
hoshi-hiyouga
17ba2d5082
[webui] fix webui ( #7142 )
2025-03-04 00:01:49 +08:00
rabbit
049ddf48af
[data] bailing template ( #7117 )
...
* add bailing template
* add bailing template
* add bailing template
---------
Co-authored-by: chengshiwen.csw@antgroup.com <chengshiwen.csw@antgroup.com>
2025-03-03 15:33:22 +08:00
hoshi-hiyouga
1036311826
[inference] fix hf_engine ( #7120 )
2025-03-01 05:22:49 +08:00
hoshi-hiyouga
d1863bbbaa
[assets] update wechat ( #7106 )
2025-02-28 12:01:04 +08:00
Ze-Yi LIN
891c487503
[webui] display swanlab exp link ( #7089 )
...
* webui add swanlab link
* change callback name
* update
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
2025-02-27 19:40:54 +08:00
leo-pony
acc52e0fe7
[npu] update cann base image and torch 2.4 ( #7061 )
...
* Update base NPU container image version: Hugging Face Transformers requires Python >= 3.10
* Fix bug: the INSTALL_DEEPSPEED arg must now be a string
* Update Ascend CANN, CANN-Kernel, and the corresponding torch and torch-npu versions
* Upgrade the package versions torch-npu needs: torch==2.1.0 and torch-npu==2.4.0.post2
2025-02-25 23:32:01 +08:00
hoshi-hiyouga
96fd510e6a
[misc] fix project toml ( #7067 )
2025-02-25 23:22:48 +08:00
JieShen
e8266fe563
[script] add seed args ( #7058 )
...
* add seed args
* add seed args
* update seed
2025-02-25 19:44:57 +08:00
Kingsley
19861d5170
[model] add paligemma2-mix series ( #7060 )
2025-02-25 18:51:16 +08:00
hoshi-hiyouga
76314e6ad1
[data] fix mllama ( #7053 )
...
* fix mllama
* fix test
2025-02-24 22:05:38 +08:00
hoshi-hiyouga
ec1a1bc118
[model] add models ( #7054 )
...
* add qwen25vl awq models
* add moonlight
2025-02-24 22:05:13 +08:00
hoshi-hiyouga
fe6dd92c84
[assets] update readme ( #7051 )
2025-02-24 20:45:06 +08:00
hoshi-hiyouga
1481af5dc9
[assets] update wechat ( #7019 )
2025-02-20 20:32:33 +08:00
Zhangchi Feng
cde479e47a
[data] fix MiniCPMV plugin ( #6998 )
...
* fix template
* fix bug in messages processing
2025-02-19 19:36:04 +08:00
hoshi-hiyouga
302ecb00fe
[webui] update css ( #6985 )
2025-02-18 18:27:57 +08:00
hoshi-hiyouga
2591a3fa8b
[data] add r1 distill dataset ( #6983 )
2025-02-18 17:25:09 +08:00
hoshi-hiyouga
b00b290c07
[version] support transformers 449 ( #6982 )
...
* support transformers 449
* fix mm plugin
2025-02-18 17:05:40 +08:00
hoshi-hiyouga
cc8c7e762b
[misc] fix script ( #6977 )
2025-02-18 17:00:46 +08:00
hoshi-hiyouga
3da2cc2710
[data] update vlm args ( #6976 )
2025-02-18 02:12:51 +08:00
hoshi-hiyouga
7faecc0301
[data] add min resolution option ( #6975 )
2025-02-18 01:40:46 +08:00
hoshi-hiyouga
bdb581c4a8
[data] fix predict dataset ( #6972 )
2025-02-17 20:29:40 +08:00
hoshi-hiyouga
ad0c6c8916
[assets] update wechat ( #6963 )
2025-02-17 15:23:17 +08:00
Zhangchi Feng
2faf8aeff8
[data] fix minicpmo template ( #6946 )
2025-02-15 00:37:41 +08:00
Eric Tang
6edd4992d7
[ray] specify ray storage path ( #6920 )
2025-02-14 21:55:41 +08:00
hoshi-hiyouga
1ada3ae5a3
[misc] fix lora regex ( #6944 )
...
* fix lora regex
* fix
2025-02-14 21:38:43 +08:00
hoshi-hiyouga
c31c63b411
[misc] fix grad ckpt ( #6931 )
2025-02-13 23:27:51 +08:00
hoshi-hiyouga
797043d29c
[model] add liger kernel to qwen2_5 vl ( #6930 )
...
* add liger kernel to qwen2_5 vl
* fix patch
* fix patch
2025-02-13 23:05:54 +08:00
Billy Cao
11eac71c13
[trainer] fix gen_kwarg to eval during training ( #5451 )
...
* Correctly pass gen_kwarg to eval during model runs
* fix
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
2025-02-13 02:35:06 +08:00
SrWYG
1e35967ae1
[data] evaluate on each dataset ( #5522 )
...
* [Update] loader.py: evaluate will run separate evaluations on each dataset.
`If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation`
Seq2SeqTrainer supports eval_dataset as a Dict.
* fix format
* fix
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
2025-02-13 02:19:03 +08:00
Noah
4c7bfebcf1
[data] improve error handling ( #6128 )
...
* sync from upstream
* update
* update
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
2025-02-13 01:39:41 +08:00
hoshi-hiyouga
8956c93d9b
[misc] update readme ( #6918 )
2025-02-13 01:01:41 +08:00
hoshi-hiyouga
499ea45d1f
[misc] update readme ( #6917 )
2025-02-13 00:58:10 +08:00
hoshi-hiyouga
617c8ab467
[breaking change] refactor data pipeline ( #6901 )
...
* refactor data
* rename file
2025-02-13 00:39:20 +08:00
Eric Tang
f8a206125d
[misc] support for launching LLaMA-Factory with uv run ( #6907 )
...
* yay
* uv with ray temporary commit
* remove ray specific code for now
* cleanup
2025-02-13 00:38:44 +08:00
Eric Tang
ee5fe216dc
[example] fix path to ray example ( #6906 )
2025-02-13 00:29:32 +08:00
hoshi-hiyouga
e34c3c06da
[misc] fix grad ckpt func ( #6916 )
2025-02-13 00:17:18 +08:00
marko1616
b7fd1e9c00
[trainer] fix llama3.2 vision kto train ( #6904 )
2025-02-12 19:09:14 +08:00
hoshi-hiyouga
2f8b6847f5
[data] feat: auto template ( #6905 )
...
* support auto template
* add unittest
2025-02-12 00:22:53 +08:00
hoshi-hiyouga
18179a3823
[misc] update readme ( #6903 )
2025-02-11 22:51:26 +08:00
hoshi-hiyouga
e1a7c1242c
[data] fix ollama template ( #6902 )
...
* fix ollama template
* add meta info
* use half precision
2025-02-11 22:43:09 +08:00
hoshi-hiyouga
9184a6e0ed
[misc] support export ollama modelfile ( #6899 )
...
* support export ollama modelfile
* update config
* add system and num ctx
2025-02-11 19:52:25 +08:00
hoshi-hiyouga
d1b8aa3835
[data] refactor template ( #6896 )
2025-02-11 17:59:25 +08:00
codingma
7f354b80bc
support ollama modelfile export ( #4686 )
2025-02-11 17:52:24 +08:00
hoshi-hiyouga
aca63bfcca
[data] refactor mm plugin ( #6895 )
...
* refactor plugin
* lint
2025-02-11 16:34:49 +08:00
HJ
9153a7bd83
[data] fix qwen_2_5_vl video processing ( #6868 )
...
* fix qwen_2_5_vl video processing
* Update mm_plugin.py
* Update mm_plugin.py
---------
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-02-11 16:14:50 +08:00
hoshi-hiyouga
fc5d47401f
[assets] update wechat ( #6892 )
2025-02-11 13:56:26 +08:00