63 Commits

Author SHA1 Message Date
Kingsley
0b188ca00c
[model] add GLM-4.1V (#8462) 2025-06-30 01:09:41 +08:00
Yaowei Zheng
af2f75e688
[data] fix qwen2vl pos ids (#8387) 2025-06-17 00:48:54 +08:00
Changrui Chen
81768df04c
[data] Fix wrong position ids with packed attention masks (#7754)
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-21 23:19:36 +08:00
hoshi-hiyouga
39876b85fc
[assets] update readme (#7644) 2025-04-09 01:06:06 +08:00
Kingsley
7d8bee96fc
[data] Fix bugs of use_audio_in_video in Qwen2.5 Omni (#7638)
* cache _mm_inputs

* nit

* support for use_audio_in_video

* remove cache

* fix data

* Update mllm_video_audio_demo.json
2025-04-08 18:40:10 +08:00
hoshi-hiyouga
5817cda37e
[misc] fix packing and eval plot (#7623) 2025-04-07 18:20:57 +08:00
Kingsley
32cb086be1
[data] fix qwen2.5 omni plugin (#7578)
* specific entry

* Update mm_plugin.py

* fix fps cal

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-02 23:58:39 +08:00
Kingsley
80f8d037d0
[data] fix qwen2.5 omni plugin (#7573)
* align key with qwen2vl

* nit && change scripts
2025-04-02 21:28:52 +08:00
hoshi-hiyouga
2d421c57bf
[data] fix qwen2.5 omni collator (#7553) 2025-04-01 00:15:12 +08:00
Kingsley
185c76f6ad
[model] add Qwen2.5-Omni model (#7537)
* preserve image_sizes

* preserve image_sizes

* init plugin

* support audio-text2text lora

* nit

* support image/video-text2text, audio-text2text

* remove args

* remove lines

* add docs && nit

* remove some comments

* fix && add merge part script

* add license
2025-03-31 20:39:35 +08:00
Ritesh Goru
d7d79f7e06
[data] efficient 4d_attention_mask creation in neat_packing (#7272) 2025-03-13 03:31:12 +08:00
hoshi-hiyouga
9ccfb97a2c
[misc] update format (#7277) 2025-03-13 02:53:08 +08:00
hoshi-hiyouga
7c1640ed5f
[misc] upgrade format to py39 (#7256) 2025-03-12 00:08:41 +08:00
hoshi-hiyouga
3fbd4848e8 [version] support transformers 449 (#6982)
* support transformers 449

* fix mm plugin

Former-commit-id: b00b290c07beb560a5af857ce64f4ce424831a2c
2025-02-18 17:05:40 +08:00
marko1616
bae934dea3 [trainer] fix llama3.2 vision kto train (#6904)
Former-commit-id: b7fd1e9c00c77a4c2a0f2f347767d22bd47213f1
2025-02-12 19:09:14 +08:00
Zhangchi Feng
5433b318bb [da'ta] fix minicpmv plugin (#6890)
* fix template name

* tiny fix

* support minicpm-o-2.6

* support inference of minicpmv

* update readme

* support dpo of minicpmv

* update init audio

* update init audio

* [model]fix image process in minicpmo

* fix no mm inputs

Former-commit-id: 764627645abcd353f9130d5dd8c584810b0e0b1b
2025-02-11 13:30:44 +08:00
hoshi-hiyouga
1bb3d17d9e [data] fix mllama collator (#6874)
Former-commit-id: b68199db274a53d5916179e1aaf9722fd94fa2dc
2025-02-09 22:42:25 +08:00
Zhangchi Feng
01915eaf40 [model] support audio (#6701)
* support qwen2_audio

* improve code

* lint

* fix

* fix

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 24c78429489809873a1269a735ea5421340b32a2
2025-02-05 04:59:09 +08:00
hoshi-hiyouga
e8c1979b79 [model] add qwen2.5 vl models (#6779)
Former-commit-id: 999c7c8fe0caf6b837a1bdc2c6a24fafec327cd8
2025-01-31 03:00:29 +08:00
Zhangchi Feng
555f17c1ee [data] Fix minicpmv/o dpo training (#6657)
* fix template name

* tiny fix

* support minicpm-o-2.6

* support inference of minicpmv

* update readme

* support dpo of minicpmv

Former-commit-id: 027942789bf3a28b2506a5730c05c8392ef5c885
2025-01-15 17:30:37 +08:00
Zhangchi Feng
201a495154 Support new features of MiniCPM-V (#6626)
* fix template name

* tiny fix

* support minicpm-o-2.6

Former-commit-id: c3fda5046d835ba4542d525b8d89cd12838e9f4c
2025-01-14 00:26:19 +08:00
hoshi-hiyouga
d8cba9464f [inference] fix stop token for object detection (#6624)
* fix stop token

* update minicpm data pipeline

* fix npu qlora examples

Former-commit-id: e3e2c8c689c54ebb2af264de808502e5a8ba0f2b
2025-01-13 21:34:20 +08:00
fzc8578
4741eec2d1 fix style
Former-commit-id: 0cc7260a93bf7c65451e376245aa143f9237d7d8
2025-01-13 14:19:38 +08:00
fzc8578
d2afe0c63c fix system prompt and tests
Former-commit-id: cfaa8e4890ad99ec1fb90d9550503d734b5c30b7
2025-01-13 14:18:06 +08:00
fzc8578
e7f928adc4 fix format
Former-commit-id: 7b44f3127ef7e91a6bedca0311feb14974914ddf
2025-01-11 01:27:40 +08:00
fzc8578
62c12a133e add some
Former-commit-id: a650e114e907278ece188922467c2514de544eeb
2025-01-11 01:10:24 +08:00
fzc8578
08e8499a98 adapt to new mllm_param
Former-commit-id: 291384dea8a5c10f0358a30d124eaf85557548eb
2025-01-11 00:16:34 +08:00
fzc8578
0fb50f9c88 add some
Former-commit-id: 771cc802941cf1953b32e5102c817c6a3090b5ce
2025-01-10 23:29:06 +08:00
fzc8578
bcbe37ff52 add some
Former-commit-id: ae1f528df31194fe37a123ba1e5a4cd263a61602
2025-01-10 21:25:32 +08:00
fzc8578
7138b43873 fix some
Former-commit-id: 2ee8ba2f390551af1b865cfa813f5c8b7bbb41c5
2025-01-10 20:27:06 +08:00
fzc8578
165fe8e219 add some
Former-commit-id: 096a6cb67a7dfd14a6e339d96baab78c12d36a87
2025-01-10 20:01:22 +08:00
fzc8578
b5ef5059ee add some
Former-commit-id: 79c2d7090cbf364063ea3608814ab18aa27fdc87
2025-01-04 11:11:15 +08:00
hiyouga
813f5919a3 fix #6482
Former-commit-id: 6f5bb3b8e5b6eb7fdfd7b0ca8eba789ab741a7b6
2024-12-30 06:03:07 +00:00
hiyouga
3bcb4633ca fix #6448
Former-commit-id: 27198679829fb766c7eef468ae4311fdced695a2
2024-12-27 16:54:39 +00:00
hiyouga
50ca43c3fb fix #6348
Former-commit-id: 142191e4664cb1b920aff2f51d1bac6180f2c24b
2024-12-17 10:06:46 +00:00
hiyouga
6f1e450739 fix mrope
Former-commit-id: 2811814fc42fb214b3e8be1055f9f57ffd0ffb12
2024-12-12 15:08:17 +00:00
hiyouga
819f487c8f fix scripts
Former-commit-id: eb3e147d198a3ecb02c65f7733cec7cd9d3814a3
2024-12-05 03:47:32 +00:00
hiyouga
0ef1dc4dd5 fix vlm zero3 training
Former-commit-id: dbb9e5b70efab37ed057b2d5822b9d0d23e99fb1
2024-12-04 09:40:39 +00:00
hiyouga
006022cadd fix mllama cross_mask
Former-commit-id: 598c22e43f3f10a335933339cc612744c4835eb0
2024-11-26 15:56:58 +00:00
hiyouga
e99031daa4 fix inputs
Former-commit-id: 446441fdb020b5a102480251cb8536dd8b3f8f99
2024-11-23 18:26:02 +00:00
hoshi-hiyouga
fb8f35558a Update collator.py
Former-commit-id: f745c4b28f532c7084d4b8522c972e735729ecee
2024-10-29 22:03:42 +08:00
Kingsley
3053a806e9 Merge branch 'hiyouga:main' into pixtral-patch
Former-commit-id: 67f59579d79e97689a4b3cba7101a423c30dab2b
2024-10-29 21:01:25 +08:00
hiyouga
0d8aa6e6ef use pre-commit
Former-commit-id: 21db8ed2f4a0eba203754a92ce0741538e8ee709
2024-10-29 09:07:46 +00:00
KUANGDD
62cbcb646a modify style & little change
Former-commit-id: 9d6143e36a12e0f295139d057aeb1843535435cf
2024-10-23 15:24:07 +08:00
hiyouga
995491594d tiny fix
Former-commit-id: 76f2e5950483c669a15a961f0554442b6eb5c4a6
2024-09-05 23:41:16 +08:00
hiyouga
9df7a26e6b video datasets
Former-commit-id: 8cafc7b055a854f483ad1c67f3d487ffd34b5f89
2024-09-05 02:04:17 +08:00
hiyouga
22deca0e9e lazy image load
Former-commit-id: 47ea97fb1ba77de2e8a561904aa8fdc27c3f5025
2024-09-04 02:27:08 +08:00
hiyouga
bfdcc6bacf add rlhf-v dataset
Former-commit-id: 8e49940746c1a6ff910f07dbefbec14af9d0f3c6
2024-09-01 22:57:41 +08:00
hiyouga
413a206652 fix bug
Former-commit-id: 64cb947c60398dfdfc2877f898147b0240089ea3
2024-09-01 21:07:49 +08:00
hiyouga
cb776752f6 fix mixed mm inputs and rlhf-v
Former-commit-id: 9967ccb3aef3ca557ad6eafb78c6c99866857008
2024-09-01 20:52:47 +08:00