hoshi-hiyouga
e2dc5b952a
[misc] update license year & fix llama pro (#6814)
* fix llamapro script
* change year
2025-02-05 01:53:33 +08:00
Yueqi Song
dd6b7d203e
[data] fix qwen tool template (#6796)
* Update tool_utils.py
* fix unittest
---------
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-02-05 00:02:00 +08:00
Zhangchi Feng
ab9bd068ef
[data] fix minicpmv plugin (#6801)
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
* update readme
* support dpo of minicpmv
* update init audio
* update init audio
* [model] fix image process in minicpmo
2025-02-04 21:20:15 +08:00
hoshi-hiyouga
069a477d16
[assets] update wechat (#6810)
2025-02-04 21:17:40 +08:00
neavo
a417bcf8d9
[readme] update flash attention installation instruction on win platform (#6788)
* Update README_zh.md
* Update README.md
2025-02-01 12:43:29 +08:00
hoshi-hiyouga
b5fda21288
[misc] update workflows (#6787)
2025-02-01 04:54:42 +08:00
hoshi-hiyouga
94803d8133
[model] add mistral small models (#6786)
2025-02-01 04:31:38 +08:00
hoshi-hiyouga
999c7c8fe0
[model] add qwen2.5 vl models (#6779)
2025-01-31 03:00:29 +08:00
hoshi-hiyouga
15357cdad9
[breaking] support transformers 4.48 (#6628)
2025-01-31 01:36:33 +08:00
hoshi-hiyouga
45e68b9f09
[webui] improve webui & reasoning mode (#6778)
2025-01-31 00:09:21 +08:00
codingma
4fb6059f48
[assets] update wechat (#6771)
2025-01-29 12:31:24 +08:00
qvlehao
28417f862a
[model] add deepseek-R1 & show think process (#6767)
2025-01-29 12:16:26 +08:00
yinpu
0f45982bac
fix: avoid redundant normalization in DPO's SFT loss calculation (#6722)
2025-01-21 13:38:02 +08:00
engchina
de9bc3fefa
[webui] support ja (#6698)
* add support for japanese language
* add support for japanese language
---------
Co-authored-by: engchina <atjapan2015@gmail.com>
2025-01-20 19:46:38 +08:00
hoshi-hiyouga
3962645ac0
[assets] update wechat (#6710)
2025-01-20 16:29:24 +08:00
hoshi-hiyouga
1f47b6186c
[model] support yarn (#6693)
2025-01-18 13:56:09 +08:00
hoshi-hiyouga
17b470630d
[assets] update wechat (#6692)
2025-01-18 12:35:03 +08:00
hoshi-hiyouga
c0caa7afc6
[misc] update mm plugin (#6691)
2025-01-17 23:04:26 +08:00
hoshi-hiyouga
77bbf65905
disable valset by default (#6690)
2025-01-17 21:09:30 +08:00
hoshi-hiyouga
4d0f662dbe
[webui] upgrade to gradio 5 (#6688)
2025-01-17 20:15:42 +08:00
hoshi-hiyouga
7bf09abf1c
fix qwen2 moe (#6684)
2025-01-17 13:46:09 +08:00
Zhangchi Feng
027942789b
[data] Fix minicpmv/o dpo training (#6657)
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
* update readme
* support dpo of minicpmv
2025-01-15 17:30:37 +08:00
steveepreston
76675b654e
Update val_size english description (#6653)
* Update `val_size` Description in locales.py
* Update `val_size` Description in data_args.py
* Remove extra space in data_args.py
2025-01-15 16:00:20 +08:00
hoshi-hiyouga
563be2286a
update readme (#6648)
2025-01-15 11:06:19 +08:00
hoshi-hiyouga
7a04021d04
[optim] clean apollo (#6645)
* clean apollo code
* update readme
2025-01-15 01:42:50 +08:00
zhuHQ
d9189f9f0b
[optim] add support to APOLLO (#6617)
2025-01-15 00:24:56 +08:00
Zhangchi Feng
9b7ba093c7
update readme of MiniCPM-o (#6642)
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
* update readme
2025-01-14 21:22:35 +08:00
hoshi-hiyouga
1278c3e92e
lint (#6641)
2025-01-14 18:40:07 +08:00
Haian Huang(深度眸)
deacc00b12
Support InternLM3 Dense 8B Model (#6640)
* support internlm3
* update
* update
* update
* add hint
2025-01-14 18:07:27 +08:00
Xiaosu Zhu
58d029f321
Fix tokenizer max length (#6632)
2025-01-14 17:35:54 +08:00
Zhangchi Feng
158a127d34
Support Inference of MiniCPM-V-2.6 and MiniCPM-o-2.6 (#6631)
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
2025-01-14 17:34:58 +08:00
hoshi-hiyouga
98189c8e4d
[model] fix mllama any image (#6637)
* fix mllama any image
* reorder classes
2025-01-14 16:47:58 +08:00
hoshi-hiyouga
1c7663d304
pin vllm version to 0.6.5 (#6629)
2025-01-14 02:44:02 +08:00
Zhangchi Feng
c3fda5046d
Support new features of MiniCPM-V (#6626)
* fix template name
* tiny fix
* support minicpm-o-2.6
2025-01-14 00:26:19 +08:00
hoshi-hiyouga
e3e2c8c689
[inference] fix stop token for object detection (#6624)
* fix stop token
* update minicpm data pipeline
* fix npu qlora examples
2025-01-13 21:34:20 +08:00
codingma
03de5ac912
add nf4 qlora support on Ascend NPU (#6601)
* add nf4 qlora support on Ascend NPU
* add transformers version check
* add python>=3.10 requirement description for npu
* tiny fix
---------
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-01-13 19:43:36 +08:00
Zhangchi Feng
3077f20339
Fix template name of MiniCPM-V (#6620)
* fix template name
* tiny fix
2025-01-13 16:46:48 +08:00
hoshi-hiyouga
6eec50c74d
Merge pull request #6598 from BUAADreamer/minicpmv
[model] Support MiniCPM-V
2025-01-13 15:24:02 +08:00
fzc8578
a019cece80
remove tests
2025-01-13 15:08:35 +08:00
fzc8578
c2fa4cc7b1
fix tests
2025-01-13 15:01:39 +08:00
fzc8578
0cc7260a93
fix style
2025-01-13 14:19:38 +08:00
fzc8578
cfaa8e4890
fix system prompt and tests
2025-01-13 14:18:06 +08:00
fzc8578
01e9cfd406
add some
2025-01-11 15:03:20 +08:00
fzc8578
10073319b4
add cpm_o test
2025-01-11 11:55:30 +08:00
fzc8578
c506f763df
add cpm_o test
2025-01-11 11:49:03 +08:00
fzc8578
7b44f3127e
fix format
2025-01-11 01:27:40 +08:00
fzc8578
a650e114e9
add some
2025-01-11 01:10:24 +08:00
fzc8578
291384dea8
adapt to new mllm_param
2025-01-11 00:16:34 +08:00
Zhangchi Feng
ed0895a9c1
Merge branch 'main' into minicpmv
2025-01-11 00:01:36 +08:00
hoshi-hiyouga
382e932228
Merge pull request #6600 from hiyouga/hiyouga/refactor_mllm_param
[model] refactor mllm param logic
2025-01-10 23:53:37 +08:00