hoshi-hiyouga
9adc0a2c3f
[assets] update readme ( #7209 )
...
Former-commit-id: d1631b38dad9ba3d41aebbb00e3500eb79b9e8e9
2025-03-07 17:27:49 +08:00
hoshi-hiyouga
d2f845d70d
[deps] upgrade vllm ( #7183 )
...
Former-commit-id: 37678a3d64668c3b4a4bfefc054e3b9b40427c1a
2025-03-06 15:25:08 +08:00
hoshi-hiyouga
e62dae37fe
[assets] update wechat ( #7106 )
...
Former-commit-id: 0ea430060994631e9fdb18fbbca0dd565a04fd66
2025-02-28 12:01:04 +08:00
leo-pony
b9f84900ee
[npu] update cann base image and torch 2.4 ( #7061 )
...
* Update base npu container image version:The Python version required for Hugging Face Transformers is >= python3.10
* Fix the bug: arg type of INSTALL_DEEPSPEED shoud been string now.
* Update Ascend CANN, CANN-Kernel and corresponding torch and torch-npu version
* Upgrade torch-npu needs packages' version: torch==2.1.0 and torch-npu==2.4.0.post2
Former-commit-id: d6dafada58412b0c801e576ef4d8d96203f792af
2025-02-25 23:32:01 +08:00
hoshi-hiyouga
5f65558088
[misc] fix project toml ( #7067 )
...
Former-commit-id: 28a668ff4e0beebfe5387362f5518c1d9343666f
2025-02-25 23:22:48 +08:00
hoshi-hiyouga
ee46011b34
[assets] update readme ( #7051 )
...
Former-commit-id: c89a39bfc6a3f0aaa376cd1b221320f466aba617
2025-02-24 20:45:06 +08:00
hoshi-hiyouga
d55f420206
[assets] update wechat ( #7019 )
...
Former-commit-id: 3d102fe7e0bfc23db7d75f90ebaf53216c54cc85
2025-02-20 20:32:33 +08:00
hoshi-hiyouga
331f53381f
[data] add r1 distill dataset ( #6983 )
...
Former-commit-id: 1da5ee4edaa3896593b9cae488f0ac5917c3243e
2025-02-18 17:25:09 +08:00
hoshi-hiyouga
1d675a287d
[version] support transformers 449 ( #6982 )
...
* support transformers 449
* fix mm plugin
Former-commit-id: e9118a9df0839d24f6ddff5a0b55ef101a1d3d22
2025-02-18 17:05:40 +08:00
hoshi-hiyouga
8b8fdb3a85
[misc] update readme ( #6918 )
...
Former-commit-id: f5823479bd51c39db668b68056be749af09894d1
2025-02-13 01:01:41 +08:00
hoshi-hiyouga
290057069e
[misc] update readme ( #6917 )
...
Former-commit-id: 6bbed1d8c4189fb7bea40230e278c40bb5336fbd
2025-02-13 00:58:10 +08:00
hoshi-hiyouga
d58fcd094e
[misc] update readme ( #6903 )
...
Former-commit-id: 830d028939149d54bc91b6bda110dfa5de949483
2025-02-11 22:51:26 +08:00
hoshi-hiyouga
88eafd865b
[misc] support export ollama modelfile ( #6899 )
...
* support export ollama modelfile
* update config
* add system and num ctx
Former-commit-id: 8c2af7466f4015f300b51841db11bcd2505ebf20
2025-02-11 19:52:25 +08:00
Zhangchi Feng
2047eab723
[da'ta] fix minicpmv plugin ( #6890 )
...
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
* update readme
* support dpo of minicpmv
* update init audio
* update init audio
* [model]fix image process in minicpmo
* fix no mm inputs
Former-commit-id: cdd19ccd8cec460606b4545e886e932c1c5c5fe1
2025-02-11 13:30:44 +08:00
hoshi-hiyouga
94726bdc8d
[dataset] add openthought ( #6866 )
...
Former-commit-id: 20c748a4f108c0087f0d85377a4aa99126a0beb0
2025-02-09 00:53:01 +08:00
Zhangchi Feng
8f401e37f8
[model] support audio ( #6701 )
...
* support qwen2_audio
* improve code
* lint
* fix
* fix
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 5eacb5629e4d7733cd992a63747a1335f2c6a929
2025-02-05 04:59:09 +08:00
neavo
34746d6151
[readme] update flash attention installation instruction on win platform ( #6788 )
...
* Update README_zh.md
* Update README.md
Former-commit-id: e48d1327fb39cc95f8fbfc746494f67a79471893
2025-02-01 12:43:29 +08:00
hoshi-hiyouga
a28261a866
[model] add mistral small models ( #6786 )
...
Former-commit-id: e5e95c39bc4199fa89c67e34f9adaaa987058744
2025-02-01 04:31:38 +08:00
hoshi-hiyouga
800de98dc8
[model] add qwen2.5 vl models ( #6779 )
...
Former-commit-id: ed46fb4f6194c30060b908092464dded12e5787c
2025-01-31 03:00:29 +08:00
hoshi-hiyouga
222423bcef
[breaking] support transformers 4.48 ( #6628 )
...
Former-commit-id: f154ab175c513a4d7bb866bf2cffc34b77b50508
2025-01-31 01:36:33 +08:00
hoshi-hiyouga
e71737351f
[webui] improve webui & reasoning mode ( #6778 )
...
Former-commit-id: 3f17fc0d7163372e0446f1a38792ff761e99b739
2025-01-31 00:09:21 +08:00
qvlehao
4f298894da
[model] add deepseek-R1 & show think process ( #6767 )
...
Former-commit-id: 4dccb724af51208a001c96fefbdbf226be09e50c
2025-01-29 12:16:26 +08:00
hoshi-hiyouga
e4046bdd1f
[assets] update wechat ( #6692 )
...
Former-commit-id: 70dba5fab6f4c9225758cafb646113d8e80ac084
2025-01-18 12:35:03 +08:00
hoshi-hiyouga
ef994600db
update readme ( #6648 )
...
Former-commit-id: b47467276ab3174c50329b3c8b76823bc0a2249c
2025-01-15 11:06:19 +08:00
hoshi-hiyouga
7638f1070e
[optim] clean apollo ( #6645 )
...
* clean apollo code
* update readme
Former-commit-id: 38b8ec4a99189483124b54df9d6bc6b0d318855a
2025-01-15 01:42:50 +08:00
Zhangchi Feng
66184762e8
update readme of MiniCPM-o ( #6642 )
...
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
* update readme
Former-commit-id: 68604050ae2c98aeef5e9a6b4d2c11a4eb609bfa
2025-01-14 21:22:35 +08:00
hoshi-hiyouga
41a9e231cb
lint ( #6641 )
...
Former-commit-id: 79731ae13ecd17eb8646fb53162c81dddfef3b00
2025-01-14 18:40:07 +08:00
Haian Huang(深度眸)
1bb06e06df
Support InternLM3 Dense 8B Model ( #6640 )
...
* support internlm3
* update
* update
* update
* add hint
Former-commit-id: 24ab7ae0944c5f373e9cac60f0332e704824a057
2025-01-14 18:07:27 +08:00
Zhangchi Feng
ae32c148d1
Support new features of MiniCPM-V ( #6626 )
...
* fix template name
* tiny fix
* support minicpm-o-2.6
Former-commit-id: 53034a61c7654358f46916cbc370910fb2aeff3b
2025-01-14 00:26:19 +08:00
hoshi-hiyouga
2a05941b14
[inference] fix stop token for object detection ( #6624 )
...
* fix stop token
* update minicpm data pipeline
* fix npu qlora examples
Former-commit-id: 844919fadaa8a61dfae47020971ea80730b2346f
2025-01-13 21:34:20 +08:00
codingma
11c38b9173
add nf4 qlora support on Ascend NPU ( #6601 )
...
* add nf4 qlora support on Ascend NPU
* add transformers version check
* add python>=3.10 requirement description for npu
* tiny fix
---------
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 7912d1acac5f10dab22145fe729a90c57aad8d85
2025-01-13 19:43:36 +08:00
Zhangchi Feng
73c1c15b62
Fix template name of MiniCPM-V ( #6620 )
...
* fix template name
* tiny fix
Former-commit-id: 94dea52cef709a7e6f1cdc0b78e83e0422bd65d3
2025-01-13 16:46:48 +08:00
fzc8578
e577990eb2
add minicpmv2.6
...
Former-commit-id: 1ab0aea54b54066cad500b7969b86a0e952d396d
2025-01-10 23:45:44 +08:00
hiyouga
867980196e
improve template, add phi4 model
...
Former-commit-id: a785b6796e445a3adba45c5b6947166a2ff99871
2025-01-09 18:27:54 +00:00
hiyouga
8a5b4bdfd4
update model name
...
Former-commit-id: bf627d9f1ac117f040adbfd7630b5283f0db556a
2025-01-02 12:19:21 +00:00
hoshi-hiyouga
3bceef02ee
Merge pull request #6514 from hiyouga/hiyouga/add_project
...
[readme] add project
Former-commit-id: 0bd0c373183731302f1af9f33a1f8ff70ba743e2
2025-01-02 20:16:15 +08:00
hiyouga
18767fe026
add project
...
Former-commit-id: 3b7e745d271e36b4cfe8826820b23254e1debfe9
2025-01-02 12:15:41 +00:00
hiyouga
18a1a4b9da
add gpt2 model
...
Former-commit-id: 37d5e3639fcf5ae6e58cc435e0fa9dee0d6e4ead
2025-01-02 12:07:38 +00:00
hiyouga
b2e4f11602
add deepseek3 model
...
Former-commit-id: 611779d412f31e25b1ed38049050eee2da61dde5
2024-12-30 13:39:20 +00:00
hiyouga
3c55976a0e
add qvq #6439
...
Former-commit-id: 4dbfa142d899dd6e4d1a9d4db125765af5580a4f
2024-12-25 07:52:41 +00:00
hiyouga
a5346041bb
update readme
...
Former-commit-id: 1deda4750e0df6c46aeb33cf3f8b35baa537cc1d
2024-12-23 14:08:59 +00:00
hoshi-hiyouga
df42e438c1
Merge pull request #5922 from Tuyohai/main
...
support granite3 models
Former-commit-id: a9087bc0549f7f16e5b4c39e324043755b1618c8
2024-12-23 16:46:02 +08:00
hiyouga
a897d46049
support report custom args
...
Former-commit-id: d41254c40a1c5cacf9377096adb27efa9bdb79ea
2024-12-21 21:42:45 +00:00
ZeYi Lin
ec05c8cdb4
docs: use swanlab
...
Former-commit-id: 33509ea7bcd5f698a8393379bb3941c3c32f7fd6
2024-12-21 20:59:25 +08:00
hiyouga
8d2f8b0dd8
add paligemma2
...
Former-commit-id: dafbc31684cb2566ef23c79e171cdfd02d6d396b
2024-12-18 08:57:26 +00:00
hoshi-hiyouga
df42281256
Merge pull request #6313 from ge-xing/main
...
support telechat2 model
Former-commit-id: 282d0619b1047ba48f9bc3ac837d2ed40b7df307
2024-12-18 16:16:17 +08:00
hiyouga
53f0fff513
fix llama3 tool template
...
Former-commit-id: 63f28a594a44c011f2e6d418f22ddbfc445db163
2024-12-17 17:05:10 +00:00
zhaohu xing
584755be4b
support telechat2 model
...
Former-commit-id: 15a069d85c07842cd28d65845af93c3cf70ef1f4
2024-12-17 12:15:33 +00:00
hiyouga
c1768cfb14
support batch infer in vllm
...
Former-commit-id: 3ef5ed3b9a44eed2f7e3ff221dfc343d0a97c0b5
2024-12-04 13:50:00 +00:00
hiyouga
ed86f621a0
add qwq
...
Former-commit-id: acad977356a7f2e729eb6f2cb919a416b18f8add
2024-11-28 08:50:57 +00:00