60 Commits

Author SHA1 Message Date
hoshi-hiyouga
458b6b0aef [assets] update readme (#7644) 2025-04-09 01:06:06 +08:00
Kingsley
0935eff188 [data] Fix bugs of use_audio_in_video in Qwen2.5 Omni (#7638)
* cache _mm_inputs

* nit

* support for use_audio_in_video

* remove cache

* fix data

* Update mllm_video_audio_demo.json
2025-04-08 18:40:10 +08:00
hoshi-hiyouga
fb46193364 [misc] fix packing and eval plot (#7623) 2025-04-07 18:20:57 +08:00
Kingsley
6eb28bcacd [data] fix qwen2.5 omni plugin (#7578)
* specific entry

* Update mm_plugin.py

* fix fps cal

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-02 23:58:39 +08:00
Kingsley
ac9ba80128 [data] fix qwen2.5 omni plugin (#7573)
* align key with qwen2vl

* nit && change scripts
2025-04-02 21:28:52 +08:00
hoshi-hiyouga
56fa64035a [data] fix qwen2.5 omni collator (#7553) 2025-04-01 00:15:12 +08:00
Kingsley
1189aeb6c2 [model] add Qwen2.5-Omni model (#7537)
* preserve image_sizes

* preserve image_sizes

* init plugin

* support audio-text2text lora

* nit

* support image/video-text2text, audio-text2text

* remove args

* remove lines

* add docs && nit

* remove some comments

* fix && add merge part script

* add license
2025-03-31 20:39:35 +08:00
Ritesh Goru
f5b53249a4 [data] efficient 4d_attention_mask creation in neat_packing (#7272) 2025-03-13 03:31:12 +08:00
hoshi-hiyouga
1b1964714e [misc] update format (#7277) 2025-03-13 02:53:08 +08:00
hoshi-hiyouga
efa86e730c [misc] upgrade format to py39 (#7256) 2025-03-12 00:08:41 +08:00
hoshi-hiyouga
865b2b8b87 [version] support transformers 449 (#6982)
* support transformers 449

* fix mm plugin

Former-commit-id: e9118a9df0839d24f6ddff5a0b55ef101a1d3d22
2025-02-18 17:05:40 +08:00
marko1616
a23e16ae89 [trainer] fix llama3.2 vision kto train (#6904)
Former-commit-id: 1563e89adc8988fc6e4250634a3f1e385979b0e5
2025-02-12 19:09:14 +08:00
Zhangchi Feng
5cf98e2084 [da'ta] fix minicpmv plugin (#6890)
* fix template name

* tiny fix

* support minicpm-o-2.6

* support inference of minicpmv

* update readme

* support dpo of minicpmv

* update init audio

* update init audio

* [model]fix image process in minicpmo

* fix no mm inputs

Former-commit-id: cdd19ccd8cec460606b4545e886e932c1c5c5fe1
2025-02-11 13:30:44 +08:00
hoshi-hiyouga
4d30538c03 [data] fix mllama collator (#6874)
Former-commit-id: c694fa3d66651c6ce547fa72c8260c46a406126b
2025-02-09 22:42:25 +08:00
Zhangchi Feng
46a1786595 [model] support audio (#6701)
* support qwen2_audio

* improve code

* lint

* fix

* fix

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 5eacb5629e4d7733cd992a63747a1335f2c6a929
2025-02-05 04:59:09 +08:00
hoshi-hiyouga
1132aaa53c [model] add qwen2.5 vl models (#6779)
Former-commit-id: ed46fb4f6194c30060b908092464dded12e5787c
2025-01-31 03:00:29 +08:00
Zhangchi Feng
71527cdc55 [data] Fix minicpmv/o dpo training (#6657)
* fix template name

* tiny fix

* support minicpm-o-2.6

* support inference of minicpmv

* update readme

* support dpo of minicpmv

Former-commit-id: 8d9f47b98047f370637d1c96c2f3440dcc738ef3
2025-01-15 17:30:37 +08:00
Zhangchi Feng
429a027832 Support new features of MiniCPM-V (#6626)
* fix template name

* tiny fix

* support minicpm-o-2.6

Former-commit-id: 53034a61c7654358f46916cbc370910fb2aeff3b
2025-01-14 00:26:19 +08:00
hoshi-hiyouga
7ab274eb67 [inference] fix stop token for object detection (#6624)
* fix stop token

* update minicpm data pipeline

* fix npu qlora examples

Former-commit-id: 844919fadaa8a61dfae47020971ea80730b2346f
2025-01-13 21:34:20 +08:00
fzc8578
a8327e0cad fix style
Former-commit-id: 76a36d9acecbf36b6959a14caacfed1d32bcee41
2025-01-13 14:19:38 +08:00
fzc8578
9ab1d4fb3c fix system prompt and tests
Former-commit-id: 955efca677b299749f3d40d587ee310951537543
2025-01-13 14:18:06 +08:00
fzc8578
ac5244ee12 fix format
Former-commit-id: 964e18be5a824950164bc7232d35822a8b116d1a
2025-01-11 01:27:40 +08:00
fzc8578
8321458ac5 add some
Former-commit-id: 6233764d18f31365e9ba450408306fad55567ffc
2025-01-11 01:10:24 +08:00
fzc8578
7e3372c035 adapt to new mllm_param
Former-commit-id: 0775b71965863c2618c117726a1046a36d6d85b8
2025-01-11 00:16:34 +08:00
fzc8578
5ca9cf8753 add some
Former-commit-id: 58f50b8729083e9ea0fdcf07042b06261670ad57
2025-01-10 23:29:06 +08:00
fzc8578
d6591abcb0 add some
Former-commit-id: 3acd151a0f8efdd230c0b0980550795d204a69f7
2025-01-10 21:25:32 +08:00
fzc8578
3a8f989faa fix some
Former-commit-id: cd5a1a8b9c6eb59d6e95f79573f60ad8668f1942
2025-01-10 20:27:06 +08:00
fzc8578
f49b307e35 add some
Former-commit-id: fede563aeb716ba5d1e368fd3e1182e4e580d248
2025-01-10 20:01:22 +08:00
fzc8578
8f97891153 add some
Former-commit-id: 81176fe226da89eace89cb202bad68e73b7c2a02
2025-01-04 11:11:15 +08:00
hiyouga
92c6c384cf fix #6482
Former-commit-id: 8577f52b4152efe6cc7a8b5f6d37b4f9ba6684e7
2024-12-30 06:03:07 +00:00
hiyouga
c555a83ec9 fix #6448
Former-commit-id: 04f78e85af5af14b4c195936623e426a6a128af2
2024-12-27 16:54:39 +00:00
hiyouga
90b62807fb fix #6348
Former-commit-id: 83e552320909f4775377889f1512994b7e638a7e
2024-12-17 10:06:46 +00:00
hiyouga
b57560aa25 fix mrope
Former-commit-id: 55bee1d333549ca19858b3f5c1b7b86926e5fb09
2024-12-12 15:08:17 +00:00
hiyouga
7695787258 fix scripts
Former-commit-id: f94f55d20283298cb7d90d0573992a62df414a8f
2024-12-05 03:47:32 +00:00
hiyouga
effad6fbff fix vlm zero3 training
Former-commit-id: 86fe7fe71b51077310357b7b1895522258f9bc7a
2024-12-04 09:40:39 +00:00
hiyouga
9cb232aaad fix mllama cross_mask
Former-commit-id: c33967308bebd99489d28bd5a879525cf304c1f9
2024-11-26 15:56:58 +00:00
hiyouga
cc22c045a7 fix inputs
Former-commit-id: 7d535bb8cdf7e81edda81152e63c8cfe6c9dcc9f
2024-11-23 18:26:02 +00:00
hoshi-hiyouga
a78bbb6f13 Update collator.py
Former-commit-id: 941fa8a0d9c3a9106ad0af6e776db7e57f69548f
2024-10-29 22:03:42 +08:00
Kingsley
e62a1f4947 Merge branch 'hiyouga:main' into pixtral-patch
Former-commit-id: 438302edfdb66b6397266b8b17ac66f60a89300c
2024-10-29 21:01:25 +08:00
hiyouga
dbbfb5f5dc use pre-commit
Former-commit-id: 7cfede95df22a9ff236788f04159b6b16b8d04bb
2024-10-29 09:07:46 +00:00
KUANGDD
ff67d4117f modify style & little change
Former-commit-id: c988477d14dc656450d5fec31895781b7f9f7dce
2024-10-23 15:24:07 +08:00
hiyouga
2959f12c6e tiny fix
Former-commit-id: c0e9c0484dae6db93cef5048bad827ff22b1986a
2024-09-05 23:41:16 +08:00
hiyouga
d408d5cf32 video datasets
Former-commit-id: 33f28ce82d9e44d2615909250dc56d6a4a03cd99
2024-09-05 02:04:17 +08:00
hiyouga
54d4c3fca7 lazy image load
Former-commit-id: cdd733b575411e003bc5ffd6560dd8eff8aa09cf
2024-09-04 02:27:08 +08:00
hiyouga
04db03bdfd add rlhf-v dataset
Former-commit-id: 3fd18fc34a0c994a738504746abfd5548e002437
2024-09-01 22:57:41 +08:00
hiyouga
25c0ce0f88 fix bug
Former-commit-id: 6e19e56000dd18d5faf84ceabce8d7708ff21e4d
2024-09-01 21:07:49 +08:00
hiyouga
ad6c42ff0a fix mixed mm inputs and rlhf-v
Former-commit-id: 7c248fac20bf85d57a91132ce7a793c7f84e9218
2024-09-01 20:52:47 +08:00
hiyouga
f389b6a676 tiny fix
Former-commit-id: 830511a6d0216da99520aee8b3a753d347a71fa9
2024-08-30 03:21:50 +08:00
hiyouga
228f745235 refactor mm training
Former-commit-id: 179c0558699e287cbf38a2d73bff47e86d589c5a
2024-08-30 02:14:31 +08:00
simonJJJ
5e728ec221 initial-commit
Former-commit-id: b6a39847a10b417b09db4b5512dd835e9e4ce928
2024-08-28 16:51:35 +08:00