leo-pony
acc52e0fe7
[npu] update cann base image and torch 2.4 ( #7061 )
...
* Update base NPU container image version: the Python version required for Hugging Face Transformers is >= 3.10
* Fix the bug: the arg type of INSTALL_DEEPSPEED should be a string now.
* Update Ascend CANN, CANN-Kernel and corresponding torch and torch-npu version
* Upgrade the package versions that torch-npu needs: torch==2.1.0 and torch-npu==2.4.0.post2
2025-02-25 23:32:01 +08:00
hoshi-hiyouga
96fd510e6a
[misc] fix project toml ( #7067 )
2025-02-25 23:22:48 +08:00
hoshi-hiyouga
fe6dd92c84
[assets] update readme ( #7051 )
2025-02-24 20:45:06 +08:00
hoshi-hiyouga
1481af5dc9
[assets] update wechat ( #7019 )
2025-02-20 20:32:33 +08:00
hoshi-hiyouga
2591a3fa8b
[data] add r1 distill dataset ( #6983 )
2025-02-18 17:25:09 +08:00
hoshi-hiyouga
b00b290c07
[version] support transformers 449 ( #6982 )
...
* support transformers 449
* fix mm plugin
2025-02-18 17:05:40 +08:00
hoshi-hiyouga
8956c93d9b
[misc] update readme ( #6918 )
2025-02-13 01:01:41 +08:00
hoshi-hiyouga
499ea45d1f
[misc] update readme ( #6917 )
2025-02-13 00:58:10 +08:00
hoshi-hiyouga
18179a3823
[misc] update readme ( #6903 )
2025-02-11 22:51:26 +08:00
hoshi-hiyouga
9184a6e0ed
[misc] support export ollama modelfile ( #6899 )
...
* support export ollama modelfile
* update config
* add system and num ctx
2025-02-11 19:52:25 +08:00
Zhangchi Feng
764627645a
[data] fix minicpmv plugin ( #6890 )
...
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
* update readme
* support dpo of minicpmv
* update init audio
* update init audio
* [model]fix image process in minicpmo
* fix no mm inputs
2025-02-11 13:30:44 +08:00
hoshi-hiyouga
1356f9d840
[dataset] add openthought ( #6866 )
2025-02-09 00:53:01 +08:00
Zhangchi Feng
24c7842948
[model] support audio ( #6701 )
...
* support qwen2_audio
* improve code
* lint
* fix
* fix
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
2025-02-05 04:59:09 +08:00
neavo
a417bcf8d9
[readme] update flash attention installation instruction on win platform ( #6788 )
...
* Update README_zh.md
* Update README.md
2025-02-01 12:43:29 +08:00
hoshi-hiyouga
94803d8133
[model] add mistral small models ( #6786 )
2025-02-01 04:31:38 +08:00
hoshi-hiyouga
999c7c8fe0
[model] add qwen2.5 vl models ( #6779 )
2025-01-31 03:00:29 +08:00
hoshi-hiyouga
15357cdad9
[breaking] support transformers 4.48 ( #6628 )
2025-01-31 01:36:33 +08:00
hoshi-hiyouga
45e68b9f09
[webui] improve webui & reasoning mode ( #6778 )
2025-01-31 00:09:21 +08:00
qvlehao
28417f862a
[model] add deepseek-R1 & show think process ( #6767 )
2025-01-29 12:16:26 +08:00
hoshi-hiyouga
17b470630d
[assets] update wechat ( #6692 )
2025-01-18 12:35:03 +08:00
hoshi-hiyouga
563be2286a
update readme ( #6648 )
2025-01-15 11:06:19 +08:00
hoshi-hiyouga
7a04021d04
[optim] clean apollo ( #6645 )
...
* clean apollo code
* update readme
2025-01-15 01:42:50 +08:00
Zhangchi Feng
9b7ba093c7
update readme of MiniCPM-o ( #6642 )
...
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
* update readme
2025-01-14 21:22:35 +08:00
hoshi-hiyouga
1278c3e92e
lint ( #6641 )
2025-01-14 18:40:07 +08:00
Haian Huang(深度眸)
deacc00b12
Support InternLM3 Dense 8B Model ( #6640 )
...
* support internlm3
* update
* update
* update
* add hint
2025-01-14 18:07:27 +08:00
Zhangchi Feng
c3fda5046d
Support new features of MiniCPM-V ( #6626 )
...
* fix template name
* tiny fix
* support minicpm-o-2.6
2025-01-14 00:26:19 +08:00
hoshi-hiyouga
e3e2c8c689
[inference] fix stop token for object detection ( #6624 )
...
* fix stop token
* update minicpm data pipeline
* fix npu qlora examples
2025-01-13 21:34:20 +08:00
codingma
03de5ac912
add nf4 qlora support on Ascend NPU ( #6601 )
...
* add nf4 qlora support on Ascend NPU
* add transformers version check
* add python>=3.10 requirement description for npu
* tiny fix
---------
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-01-13 19:43:36 +08:00
Zhangchi Feng
3077f20339
Fix template name of MiniCPM-V ( #6620 )
...
* fix template name
* tiny fix
2025-01-13 16:46:48 +08:00
fzc8578
e45329e745
add minicpmv2.6
2025-01-10 23:45:44 +08:00
hiyouga
ae16ea755d
improve template, add phi4 model
2025-01-09 18:27:54 +00:00
hiyouga
4b8add7287
update model name
2025-01-02 12:19:21 +00:00
hoshi-hiyouga
a766cad5d4
Merge pull request #6514 from hiyouga/hiyouga/add_project
...
[readme] add project
2025-01-02 20:16:15 +08:00
hiyouga
b3e1137fbb
add project
2025-01-02 12:15:41 +00:00
hiyouga
67442bd497
add gpt2 model
2025-01-02 12:07:38 +00:00
hiyouga
e67b9dcc3a
add deepseek3 model
2024-12-30 13:39:20 +00:00
hiyouga
ee0e400f41
add qvq #6439
2024-12-25 07:52:41 +00:00
hiyouga
8fd38d273e
update readme
2024-12-23 14:08:59 +00:00
hoshi-hiyouga
c23a4d0658
Merge pull request #5922 from Tuyohai/main
...
support granite3 models
2024-12-23 16:46:02 +08:00
hiyouga
5111cac6f8
support report custom args
2024-12-21 21:42:45 +00:00
ZeYi Lin
744ef8c268
docs: use swanlab
2024-12-21 20:59:25 +08:00
hiyouga
d3509050dc
add paligemma2
2024-12-18 08:57:26 +00:00
hoshi-hiyouga
015f213788
Merge pull request #6313 from ge-xing/main
...
support telechat2 model
2024-12-18 16:16:17 +08:00
hiyouga
df5655f61c
fix llama3 tool template
2024-12-17 17:05:10 +00:00
zhaohu xing
04f19ed0f3
support telechat2 model
2024-12-17 12:15:33 +00:00
hiyouga
1324d158f9
support batch infer in vllm
2024-12-04 13:50:00 +00:00
hiyouga
68a612115a
add qwq
2024-11-28 08:50:57 +00:00
hiyouga
ec9ff8caa2
add skywork o1
2024-11-27 05:51:59 +00:00
hiyouga
17afb7d410
add marco-o1 and openo1 dataset
2024-11-27 04:20:23 +00:00
hiyouga
a89ad72d03
update readme
2024-11-23 19:27:18 +00:00