hoshi-hiyouga
002f58ef8e
[model] add QwQ 32b ( #7179 )
...
Former-commit-id: 64a6fb9b5056166265abc5acbddffb64cd8b5256
2025-03-06 11:58:36 +08:00
Ze-Yi LIN
c67d2b9327
[trainer] fix swanlab callback ( #7176 )
...
Former-commit-id: 8ad03258e16309158368384e2a0a707845536133
2025-03-06 00:33:37 +08:00
hoshi-hiyouga
6e58115f98
[trainer] update config ( #7174 )
...
Former-commit-id: b4b89b4ff3bc03aa388569e253d62580755a77a5
2025-03-05 23:32:54 +08:00
sirui.li
8dddffa340
[data] fix qwen2audio plugin ( #7166 )
...
* Update pairwise.py
[data] Repair multimodal model DPO training
* Update pairwise.py
[data] Repair multimodal model DPO training using deepcopy
* Update pairwise.py
* Update mm_plugin.py
Former-commit-id: dff4130969bac9cb1abe66fd5dfada8c757c716f
2025-03-05 18:03:36 +08:00
hoshi-hiyouga
caef0a8937
[data] use bicubic resampler ( #7143 )
...
Former-commit-id: bc298c60b7d3fdc4d116a79b535d7e9b11f4aa65
2025-03-04 00:17:06 +08:00
hoshi-hiyouga
392533e139
[webui] fix webui ( #7142 )
...
Former-commit-id: 17ba2d5082bcd6b4cdd5e50286776d256cc934a4
2025-03-04 00:01:49 +08:00
rabbit
299cd03785
[data] bailing template ( #7117 )
...
* add bailing template
* add bailing template
* add bailing template
---------
Co-authored-by: chengshiwen.csw@antgroup.com <chengshiwen.csw@antgroup.com>
Former-commit-id: 049ddf48afaa9f12d3e46d7ec63858607329e853
2025-03-03 15:33:22 +08:00
hoshi-hiyouga
ee1b580328
[inference] fix hf_engine ( #7120 )
...
Former-commit-id: 1036311826a61fed2346a261c8a060c355778318
2025-03-01 05:22:49 +08:00
Ze-Yi LIN
210cdb9557
[webui] display swanlab exp link ( #7089 )
...
* webui add swanlab link
* change callback name
* update
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 891c4875039e8e3b7d0de025ee61c4ff003ff0c4
2025-02-27 19:40:54 +08:00
hoshi-hiyouga
f4aa0a146c
[misc] fix project toml ( #7067 )
...
Former-commit-id: 96fd510e6a03eae7a1f41772e1d6b784df6d5d2e
2025-02-25 23:22:48 +08:00
Kingsley
81947f1d2c
[model] add paligemma2-mix series ( #7060 )
...
Former-commit-id: 19861d5170bdcdf8c1c5d72289b29bff4b0d4c2c
2025-02-25 18:51:16 +08:00
hoshi-hiyouga
dca5fe14c2
[data] fix mllama ( #7053 )
...
* fix mllama
* fix test
Former-commit-id: 76314e6ad1ecaa44fcae4375dd0abf4ebaf1f924
2025-02-24 22:05:38 +08:00
hoshi-hiyouga
ca78ba964d
[model] add models ( #7054 )
...
* add qwen25vl awq models
* add moonlight
Former-commit-id: ec1a1bc1184d13188029e19c1d4e7de68707aaf6
2025-02-24 22:05:13 +08:00
Zhangchi Feng
1fcedf9af6
[data] fix MiniCPMV plugin ( #6998 )
...
* fix template
* fix bug in messages processing
Former-commit-id: cde479e47a51beb60ab555cdee083c1cdba0ead6
2025-02-19 19:36:04 +08:00
hoshi-hiyouga
b0bbacaacb
[webui] update css ( #6985 )
...
Former-commit-id: 302ecb00fef56d1ccc9203cb46f242841fefab47
2025-02-18 18:27:57 +08:00
hoshi-hiyouga
3fbd4848e8
[version] support transformers 449 ( #6982 )
...
* support transformers 449
* fix mm plugin
Former-commit-id: b00b290c07beb560a5af857ce64f4ce424831a2c
2025-02-18 17:05:40 +08:00
hoshi-hiyouga
184c5d0882
[misc] fix script ( #6977 )
...
Former-commit-id: cc8c7e762b9c873ef79529152465bbed9231053c
2025-02-18 17:00:46 +08:00
hoshi-hiyouga
1f4a0b11ba
[data] update vlm args ( #6976 )
...
Former-commit-id: 3da2cc2710c9b13ab450815a92fff14b03251984
2025-02-18 02:12:51 +08:00
hoshi-hiyouga
b1d31ff0f9
[data] add min resolution option ( #6975 )
...
Former-commit-id: 7faecc0301709326efa21e7a3fdb75fe0a9635c2
2025-02-18 01:40:46 +08:00
hoshi-hiyouga
a8c9d5663d
[data] fix predict dataset ( #6972 )
...
Former-commit-id: bdb581c4a82d02458766e73c87b7a92ea31796ec
2025-02-17 20:29:40 +08:00
Zhangchi Feng
3dc938268c
[data] fix minicpmo template ( #6946 )
...
Former-commit-id: 2faf8aeff897765df44707d5a42157dfdd6b9038
2025-02-15 00:37:41 +08:00
Eric Tang
e55ec42d3c
[ray] specify ray storage path ( #6920 )
...
Former-commit-id: 6edd4992d700fec56800a638f1cac0f87990c581
2025-02-14 21:55:41 +08:00
hoshi-hiyouga
2baf8bf03d
[misc] fix lora regex ( #6944 )
...
* fix lora regex
* fix
Former-commit-id: 1ada3ae5a3a14057341540c6d6ba985adf95f348
2025-02-14 21:38:43 +08:00
hoshi-hiyouga
13e1b7ee2b
[misc] fix grad ckpt ( #6931 )
...
Former-commit-id: c31c63b41109e616997757ec2da6e0ab89ed3b6e
2025-02-13 23:27:51 +08:00
hoshi-hiyouga
cd493b91de
[model] add liger kernel to qwen2_5 vl ( #6930 )
...
* add liger kernel to qwen2_5 vl
* fix patch
* fix patch
Former-commit-id: 797043d29cb85a8f90fabf48976908037f07000e
2025-02-13 23:05:54 +08:00
Billy Cao
48173b606c
[trainer] fix gen_kwarg to eval during training ( #5451 )
...
* Correctly pass gen_kwarg to eval during model runs
* fix
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 11eac71c13cd432322b69ae74a3b8fa17af31bc4
2025-02-13 02:35:06 +08:00
SrWYG
0ad9f7f058
[data] evaluate on each dataset ( #5522 )
...
* [Update] loader.py: evaluate will run separate evaluations on each dataset.
`If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation`
Seq2SeqTrainer now supports eval_dataset as a Dict.
* fix format
* fix
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 1e35967ae159038a66f3203dd0e6ec51eea9208f
2025-02-13 02:19:03 +08:00
Noah
1adb46875f
[data] improve error handling ( #6128 )
...
* sync from upstream
* update
* update
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 4c7bfebcf1ed90800f5b0de4cf67b3036cb9dc13
2025-02-13 01:39:41 +08:00
hoshi-hiyouga
1679930e00
[breaking change] refactor data pipeline ( #6901 )
...
* refactor data
* rename file
Former-commit-id: 617c8ab467d32be5f7d5c94fa89c0e3d7d1963bc
2025-02-13 00:39:20 +08:00
hoshi-hiyouga
036fb0d561
[misc] fix grad ckpt func ( #6916 )
...
Former-commit-id: e34c3c06da706f80c74c20800f19110e9ad6b82a
2025-02-13 00:17:18 +08:00
marko1616
bae934dea3
[trainer] fix llama3.2 vision kto train ( #6904 )
...
Former-commit-id: b7fd1e9c00c77a4c2a0f2f347767d22bd47213f1
2025-02-12 19:09:14 +08:00
hoshi-hiyouga
2e2f6bea07
[data] feat: auto template ( #6905 )
...
* support auto template
* add unittest
Former-commit-id: 2f8b6847f5e199d770e91346dfe205c4b9f1fbb7
2025-02-12 00:22:53 +08:00
hoshi-hiyouga
197aa3baf4
[data] fix ollama template ( #6902 )
...
* fix ollama template
* add meta info
* use half precision
Former-commit-id: e1a7c1242cd1e0a1ca9ee7d04377a53872488126
2025-02-11 22:43:09 +08:00
hoshi-hiyouga
c6be9e242c
[misc] support export ollama modelfile ( #6899 )
...
* support export ollama modelfile
* update config
* add system and num ctx
Former-commit-id: 9184a6e0ed7ff5f632c848f861bfa448c4cd06fc
2025-02-11 19:52:25 +08:00
hoshi-hiyouga
2e954d8fd2
[data] refactor template ( #6896 )
...
Former-commit-id: d1b8aa3835f6e3b2e63cf06e6cadbe760d46f9aa
2025-02-11 17:59:25 +08:00
hoshi-hiyouga
593acca556
[data] refactor mm plugin ( #6895 )
...
* refactor plugin
* lint
Former-commit-id: aca63bfcca02ecd95b57cd8949a50e26a913f716
2025-02-11 16:34:49 +08:00
HJ
188f22d8a7
[data] fix qwen_2_5_vl video processing ( #6868 )
...
* fix qwen_2_5_vl video processing
* Update mm_plugin.py
* Update mm_plugin.py
---------
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 9153a7bd832cdae84b63a4d7d1f2b12239e84b61
2025-02-11 16:14:50 +08:00
Zhangchi Feng
5433b318bb
[data] fix minicpmv plugin ( #6890 )
...
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
* update readme
* support dpo of minicpmv
* update init audio
* update init audio
* [model] fix image processing in minicpmo
* fix no mm inputs
Former-commit-id: 764627645abcd353f9130d5dd8c584810b0e0b1b
2025-02-11 13:30:44 +08:00
HJ
fe4f4e9758
[data] fix: sharegpt converter ( #6879 )
...
* fix-sharegpt-format
* fix
---------
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 0fb44cb3a5499c8da79e73004adc9d16f792b4b3
2025-02-10 21:59:12 +08:00
hoshi-hiyouga
1bb3d17d9e
[data] fix mllama collator ( #6874 )
...
Former-commit-id: b68199db274a53d5916179e1aaf9722fd94fa2dc
2025-02-09 22:42:25 +08:00
hoshi-hiyouga
b93333685b
[test] align test cases ( #6865 )
...
* align test cases
* fix function formatter
Former-commit-id: f6f3f8d0fc79de6bbad0bf892fc2f6c98c27eb8e
2025-02-09 01:03:49 +08:00
hoshi-hiyouga
fcd0f0480d
[dataset] add openthought ( #6866 )
...
Former-commit-id: 1356f9d8400efaccf677d0b36aaf32a146a09833
2025-02-09 00:53:01 +08:00
hoshi-hiyouga
ff6658ad27
[deps] upgrade vllm ( #6857 )
...
Former-commit-id: 5f38bcaba921dbdee27b4be4709fcec06fa37c9e
2025-02-08 15:02:28 +08:00
hoshi-hiyouga
28037c7834
fix qwen2vl plugin ( #6855 )
...
Former-commit-id: 40048ab77a8b25a91a844800f0f1e880b84548cd
2025-02-08 10:59:10 +08:00
hoshi-hiyouga
f70208e1c0
[misc] allow extra args ( #6831 )
...
Former-commit-id: 74ade3a176cad753971aaad681fea6ff8df40914
2025-02-06 12:38:08 +08:00
Zhangchi Feng
01915eaf40
[model] support audio ( #6701 )
...
* support qwen2_audio
* improve code
* lint
* fix
* fix
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 24c78429489809873a1269a735ea5421340b32a2
2025-02-05 04:59:09 +08:00
Yueqi Song
e665e1fed5
[data] allow thought in function call ( #6797 )
...
* Update template.py
* Update template.py
* use formatter
* fix regex
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: a5e943f7bcea6e5840da8570055bf3079a49ae8c
2025-02-05 02:26:23 +08:00
hoshi-hiyouga
1fee69f874
[misc] update license year & fix llama pro ( #6814 )
...
* fix llamapro script
* change year
Former-commit-id: e2dc5b952aa22835d5220ba624f44676138b65ac
2025-02-05 01:53:33 +08:00
Yueqi Song
8504bde893
[data] fix qwen tool template ( #6796 )
...
* Update tool_utils.py
* fix unittest
---------
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: dd6b7d203eedbf09458c64654e8d97fec85f08d7
2025-02-05 00:02:00 +08:00
Zhangchi Feng
85f22d01bf
[data] fix minicpmv plugin ( #6801 )
...
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
* update readme
* support dpo of minicpmv
* update init audio
* update init audio
* [model] fix image processing in minicpmo
Former-commit-id: ab9bd068efee861452407cdda08ef014d5ce23d5
2025-02-04 21:20:15 +08:00