hoshi-hiyouga
2baf8bf03d
[misc] fix lora regex ( #6944 )
...
* fix lora regex
* fix
Former-commit-id: 1ada3ae5a3a14057341540c6d6ba985adf95f348
2025-02-14 21:38:43 +08:00
hoshi-hiyouga
13e1b7ee2b
[misc] fix grad ckpt ( #6931 )
...
Former-commit-id: c31c63b41109e616997757ec2da6e0ab89ed3b6e
2025-02-13 23:27:51 +08:00
hoshi-hiyouga
cd493b91de
[model] add liger kernel to qwen2_5 vl ( #6930 )
...
* add liger kernel to qwen2_5 vl
* fix patch
* fix patch
Former-commit-id: 797043d29cb85a8f90fabf48976908037f07000e
2025-02-13 23:05:54 +08:00
Billy Cao
48173b606c
[trainer] fix gen_kwarg to eval during training ( #5451 )
...
* Correctly pass gen_kwarg to eval during model runs
* fix
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 11eac71c13cd432322b69ae74a3b8fa17af31bc4
2025-02-13 02:35:06 +08:00
SrWYG
0ad9f7f058
[data] evaluate on each dataset ( #5522 )
...
* [Update] loader.py , evaluate will run separate evaluations on each dataset.
`If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation`
seq2seqtrainner support eval_dataset as Dict.
* fix format
* fix
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 1e35967ae159038a66f3203dd0e6ec51eea9208f
2025-02-13 02:19:03 +08:00
Noah
1adb46875f
[data] improve error handling ( #6128 )
...
* sync from upstream
* update
* update
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 4c7bfebcf1ed90800f5b0de4cf67b3036cb9dc13
2025-02-13 01:39:41 +08:00
hoshi-hiyouga
1679930e00
[breaking change] refactor data pipeline ( #6901 )
...
* refactor data
* rename file
Former-commit-id: 617c8ab467d32be5f7d5c94fa89c0e3d7d1963bc
2025-02-13 00:39:20 +08:00
hoshi-hiyouga
036fb0d561
[misc] fix grad ckpt func ( #6916 )
...
Former-commit-id: e34c3c06da706f80c74c20800f19110e9ad6b82a
2025-02-13 00:17:18 +08:00
marko1616
bae934dea3
[trainer] fix llama3.2 vision kto train ( #6904 )
...
Former-commit-id: b7fd1e9c00c77a4c2a0f2f347767d22bd47213f1
2025-02-12 19:09:14 +08:00
hoshi-hiyouga
2e2f6bea07
[data] feat: auto template ( #6905 )
...
* support auto template
* add unittest
Former-commit-id: 2f8b6847f5e199d770e91346dfe205c4b9f1fbb7
2025-02-12 00:22:53 +08:00
hoshi-hiyouga
197aa3baf4
[data] fix ollama template ( #6902 )
...
* fix ollama template
* add meta info
* use half precision
Former-commit-id: e1a7c1242cd1e0a1ca9ee7d04377a53872488126
2025-02-11 22:43:09 +08:00
hoshi-hiyouga
c6be9e242c
[misc] support export ollama modelfile ( #6899 )
...
* support export ollama modelfile
* update config
* add system and num ctx
Former-commit-id: 9184a6e0ed7ff5f632c848f861bfa448c4cd06fc
2025-02-11 19:52:25 +08:00
hoshi-hiyouga
2e954d8fd2
[data] refactor template ( #6896 )
...
Former-commit-id: d1b8aa3835f6e3b2e63cf06e6cadbe760d46f9aa
2025-02-11 17:59:25 +08:00
hoshi-hiyouga
593acca556
[data] refactor mm plugin ( #6895 )
...
* refactor plugin
* lint
Former-commit-id: aca63bfcca02ecd95b57cd8949a50e26a913f716
2025-02-11 16:34:49 +08:00
HJ
188f22d8a7
[data] fix qwen_2_5_vl video processing ( #6868 )
...
* fix qwen_2_5_vl video processing
* Update mm_plugin.py
* Update mm_plugin.py
---------
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 9153a7bd832cdae84b63a4d7d1f2b12239e84b61
2025-02-11 16:14:50 +08:00
Zhangchi Feng
5433b318bb
[da'ta] fix minicpmv plugin ( #6890 )
...
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
* update readme
* support dpo of minicpmv
* update init audio
* update init audio
* [model]fix image process in minicpmo
* fix no mm inputs
Former-commit-id: 764627645abcd353f9130d5dd8c584810b0e0b1b
2025-02-11 13:30:44 +08:00
HJ
fe4f4e9758
[data] fix: sharegpt converter ( #6879 )
...
* fix-sharegpt-format
* fix
---------
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 0fb44cb3a5499c8da79e73004adc9d16f792b4b3
2025-02-10 21:59:12 +08:00
hoshi-hiyouga
1bb3d17d9e
[data] fix mllama collator ( #6874 )
...
Former-commit-id: b68199db274a53d5916179e1aaf9722fd94fa2dc
2025-02-09 22:42:25 +08:00
hoshi-hiyouga
b93333685b
[test] align test cases ( #6865 )
...
* align test cases
* fix function formatter
Former-commit-id: f6f3f8d0fc79de6bbad0bf892fc2f6c98c27eb8e
2025-02-09 01:03:49 +08:00
hoshi-hiyouga
fcd0f0480d
[dataset] add openthought ( #6866 )
...
Former-commit-id: 1356f9d8400efaccf677d0b36aaf32a146a09833
2025-02-09 00:53:01 +08:00
hoshi-hiyouga
ff6658ad27
[deps] upgrade vllm ( #6857 )
...
Former-commit-id: 5f38bcaba921dbdee27b4be4709fcec06fa37c9e
2025-02-08 15:02:28 +08:00
hoshi-hiyouga
28037c7834
fix qwen2vl plugin ( #6855 )
...
Former-commit-id: 40048ab77a8b25a91a844800f0f1e880b84548cd
2025-02-08 10:59:10 +08:00
hoshi-hiyouga
f70208e1c0
[misc] allow extra args ( #6831 )
...
Former-commit-id: 74ade3a176cad753971aaad681fea6ff8df40914
2025-02-06 12:38:08 +08:00
Zhangchi Feng
01915eaf40
[model] support audio ( #6701 )
...
* support qwen2_audio
* improve code
* lint
* fix
* fix
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 24c78429489809873a1269a735ea5421340b32a2
2025-02-05 04:59:09 +08:00
Yueqi Song
e665e1fed5
[data] allow thought in function call ( #6797 )
...
* Update template.py
* Update template.py
* use formatter
* fix regex
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: a5e943f7bcea6e5840da8570055bf3079a49ae8c
2025-02-05 02:26:23 +08:00
hoshi-hiyouga
1fee69f874
[misc] update license year & fix llama pro ( #6814 )
...
* fix llamapro script
* change year
Former-commit-id: e2dc5b952aa22835d5220ba624f44676138b65ac
2025-02-05 01:53:33 +08:00
Yueqi Song
8504bde893
[data] fix qwen tool template ( #6796 )
...
* Update tool_utils.py
* fix unittest
---------
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: dd6b7d203eedbf09458c64654e8d97fec85f08d7
2025-02-05 00:02:00 +08:00
Zhangchi Feng
85f22d01bf
[data] fix minicpmv plugin ( #6801 )
...
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
* update readme
* support dpo of minicpmv
* update init audio
* update init audio
* [model]fix image process in minicpmo
Former-commit-id: ab9bd068efee861452407cdda08ef014d5ce23d5
2025-02-04 21:20:15 +08:00
hoshi-hiyouga
445d643ef3
[model] add mistral small models ( #6786 )
...
Former-commit-id: 94803d8133fbbadff6d224cb6695feb5434fd4fd
2025-02-01 04:31:38 +08:00
hoshi-hiyouga
e8c1979b79
[model] add qwen2.5 vl models ( #6779 )
...
Former-commit-id: 999c7c8fe0caf6b837a1bdc2c6a24fafec327cd8
2025-01-31 03:00:29 +08:00
hoshi-hiyouga
f6779b0e0c
[breaking] support transformers 4.48 ( #6628 )
...
Former-commit-id: 15357cdad953bba1f2d294819f56b9746ed1b891
2025-01-31 01:36:33 +08:00
hoshi-hiyouga
245de012ca
[webui] improve webui & reasoning mode ( #6778 )
...
Former-commit-id: 45e68b9f092879dda55023ebbcd8cf4660e3045a
2025-01-31 00:09:21 +08:00
qvlehao
f5350b103b
[model] add deepseek-R1 & show think process ( #6767 )
...
Former-commit-id: 28417f862a1947a24663150ca55f421198b6d8eb
2025-01-29 12:16:26 +08:00
yinpu
aa7c07caf0
fix: avoid redundant normalization in DPO's SFT loss calculation ( #6722 )
...
Former-commit-id: 0f45982bac6b65533a94054ea5f792cb0f9e5a1f
2025-01-21 13:38:02 +08:00
engchina
324f07613a
[webui] support ja ( #6698 )
...
* add support for japanese language
* add support for japanese language
---------
Co-authored-by: engchina <atjapan2015@gmail.com>
Former-commit-id: de9bc3fefa4fcb5db7d04589b16282a078c62cb2
2025-01-20 19:46:38 +08:00
hoshi-hiyouga
1efe525df7
[model] support yarn ( #6693 )
...
Former-commit-id: 1f47b6186c267de86cbdbd47ba2adbf1f9db7f39
2025-01-18 13:56:09 +08:00
hoshi-hiyouga
f87c788154
[misc] update mm plugin ( #6691 )
...
Former-commit-id: c0caa7afc60ed3015fe6c263ba3566202ba934f1
2025-01-17 23:04:26 +08:00
hoshi-hiyouga
bbf334f823
disable valset by default ( #6690 )
...
Former-commit-id: 77bbf659053e1b205974eb6df69998fee0305d26
2025-01-17 21:09:30 +08:00
hoshi-hiyouga
770433fa33
[webui] upgrade to gradio 5 ( #6688 )
...
Former-commit-id: 4d0f662dbe227ab0da11a1e109f7a2c5ab8f70b9
2025-01-17 20:15:42 +08:00
hoshi-hiyouga
788accb601
fix qwen2 moe ( #6684 )
...
Former-commit-id: 7bf09abf1c4d971cda33daed933c75f391e79294
2025-01-17 13:46:09 +08:00
Zhangchi Feng
555f17c1ee
[data] Fix minicpmv/o dpo training ( #6657 )
...
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
* update readme
* support dpo of minicpmv
Former-commit-id: 027942789bf3a28b2506a5730c05c8392ef5c885
2025-01-15 17:30:37 +08:00
steveepreston
8895cf1152
Update val_size
english description ( #6653 )
...
* Update `val_size` Description in locales.py
* Update `val_size` Description in data_args.py
* Remove extra space in data_args.py
Former-commit-id: 76675b654e243c14b260adbfe04f619e4f2bf177
2025-01-15 16:00:20 +08:00
hoshi-hiyouga
9ef85f8fc4
[optim] clean apollo ( #6645 )
...
* clean apollo code
* update readme
Former-commit-id: 7a04021d0461caea2c7b82169839340b7f51f463
2025-01-15 01:42:50 +08:00
zhuHQ
763f9b9df0
[optim] add support to APOLLO ( #6617 )
...
Former-commit-id: d9189f9f0b23ff6929044919208e0e813ca95b1c
2025-01-15 00:24:56 +08:00
hoshi-hiyouga
91433d639c
lint ( #6641 )
...
Former-commit-id: 1278c3e92eeb297e883aab89e2384c1df1d0e910
2025-01-14 18:40:07 +08:00
Haian Huang(深度眸)
864ee06243
Support InternLM3 Dense 8B Model ( #6640 )
...
* support internlm3
* update
* update
* update
* add hint
Former-commit-id: deacc00b1226ca3d53bf7bb1231cf276eaa8296b
2025-01-14 18:07:27 +08:00
Xiaosu Zhu
a52496cc09
Fix tokenizer max length ( #6632 )
...
Former-commit-id: 58d029f3212dba1808e63cc8875022f6d741bd63
2025-01-14 17:35:54 +08:00
Zhangchi Feng
ad119afc58
Support Inference of MiniCPM-V-2.6 and MiniCPM-o-2.6 ( #6631 )
...
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
Former-commit-id: 158a127d340d5e4ca23263ffad042f861fd77deb
2025-01-14 17:34:58 +08:00
hoshi-hiyouga
8f73c75c16
[model] fix mllama any image ( #6637 )
...
* fix mllama any image
* reorder classes
Former-commit-id: 98189c8e4d70bf5f8ee83852a023ed27dfc96900
2025-01-14 16:47:58 +08:00
hoshi-hiyouga
5e699458e5
pin vllm version to 0.6.5 ( #6629 )
...
Former-commit-id: 1c7663d3049e00a9148c3e3c58204deca7a08c8d
2025-01-14 02:44:02 +08:00