h7878778h
09dedf144f
[npu] Redirect SDPA to torch_npu.npu_fusion_attention (opt-in, ZeRO-3 safe, no impact off NPU) ( #8972 )
2025-09-30 18:11:31 +08:00
Yaowei Zheng
260b5625c3
[assets] update wechat ( #9129 )
2025-09-14 03:05:08 +08:00
Kingsley
610a3f1094
[data] Fix qwen_2vl with valuehead ( #9078 )
2025-09-14 02:22:20 +08:00
Kingsley
e9f70daabe
[model] add gemma3n ( #8509 )
2025-07-01 22:37:24 +08:00
Yaowei Zheng
9acab4949d
[model] fix model generate ( #8327 )
2025-06-07 08:47:50 +08:00
hoshi-hiyouga
9ae17cd173
[deps] update to transformers 4.52 ( #8125 )
2025-05-21 05:16:18 +08:00
hoshi-hiyouga
45030ff803
[model] switch to gptqmodel ( #8108 )
2025-05-19 22:25:40 +08:00
Kingsley
fa0eb91f1f
[data] fix internvl plugin ( #7817 )
2025-04-23 00:58:22 +08:00
Kingsley
2a564c25d1
[model] add arch check for InternVL ( #7803 )
2025-04-22 16:38:05 +08:00
hoshi-hiyouga
b07628dea5
[example] add bash usage ( #7794 )
2025-04-22 00:25:51 +08:00
flashJd
0ac641326b
[misc] fix new tokens adding ( #7253 )
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-21 23:19:02 +08:00
hoshi-hiyouga
d222f63cb7
[infer] set env for vllm ascend ( #7745 )
2025-04-17 01:08:55 +08:00
Kingsley
2101399c94
[model] Support Kimi_VL thinking/instruct ( #7719 )
* add kimi_vl
* patch config
* check version
* Update mm_plugin.py
* Update mm_plugin.py
---------
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-15 00:21:58 +08:00
hoshi-hiyouga
3f91a95250
[misc] fix env vars ( #7715 )
2025-04-14 16:04:04 +08:00
hoshi-hiyouga
7c61b35106
[misc] upgrade cli ( #7714 )
2025-04-14 15:41:22 +08:00
Kingsley
349c56c51c
[data] Fix bugs of use_audio_in_video in Qwen2.5 Omni ( #7638 )
* cache _mm_inputs
* nit
* support for use_audio_in_video
* remove cache
* fix data
* Update mllm_video_audio_demo.json
2025-04-08 18:40:10 +08:00
hoshi-hiyouga
5e22597ff1
[infer] vllm video/audio inference ( #7566 )
2025-04-02 02:27:04 +08:00
hoshi-hiyouga
2bfcad2394
[model] fix kv cache ( #7564 )
2025-04-01 23:07:46 +08:00
Yu Shi Jie
a13b1bb49a
[model] fix use_cache patching for gemma3 multimodal ( #7500 )
2025-04-01 16:06:48 +08:00
hoshi-hiyouga
93e6184cbe
[data] gemma3 plugin pan and scan ( #7294 )
* gemma3 pan and scan
* add test case
* fix test
2025-03-13 23:29:23 +08:00
hoshi-hiyouga
4b9d8da5a4
[model] support gemma3 ( #7273 )
2025-03-13 01:35:23 +08:00
hoshi-hiyouga
264538cb26
[misc] upgrade format to py39 ( #7256 )
2025-03-12 00:08:41 +08:00
hoshi-hiyouga
f5cd17881e
[data] update vlm args ( #6976 )
Former-commit-id: c28e710636a0286d4b8a1d494529b25168a8f3ab
2025-02-18 02:12:51 +08:00
hoshi-hiyouga
c09b648934
[data] add min resolution option ( #6975 )
Former-commit-id: 76bd9a98a2fb00f1a1d881e6e1364c02fd36d327
2025-02-18 01:40:46 +08:00
hoshi-hiyouga
4d1791e905
[deps] upgrade vllm ( #6857 )
Former-commit-id: 4bd50f65a3d62528768561019fda2723d045c7fd
2025-02-08 15:02:28 +08:00
Zhangchi Feng
8f401e37f8
[model] support audio ( #6701 )
* support qwen2_audio
* improve code
* lint
* fix
* fix
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 5eacb5629e4d7733cd992a63747a1335f2c6a929
2025-02-05 04:59:09 +08:00
hoshi-hiyouga
c2022431aa
[misc] update license year & fix llama pro ( #6814 )
* fix llamapro script
* change year
Former-commit-id: d9ae594178796994d400a5f207d6499712816f89
2025-02-05 01:53:33 +08:00
Zhangchi Feng
cfb926fb84
[data] fix minicpmv plugin ( #6801 )
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
* update readme
* support dpo of minicpmv
* update init audio
* update init audio
* [model]fix image process in minicpmo
Former-commit-id: 8f704c8b6228ef50f828014f85dce67fda868660
2025-02-04 21:20:15 +08:00
hoshi-hiyouga
87d685b59f
[model] support yarn ( #6693 )
Former-commit-id: 8c412abc44a4c61b683465e36c6288580d980250
2025-01-18 13:56:09 +08:00
hoshi-hiyouga
41a9e231cb
lint ( #6641 )
Former-commit-id: 79731ae13ecd17eb8646fb53162c81dddfef3b00
2025-01-14 18:40:07 +08:00
Haian Huang(深度眸)
1bb06e06df
Support InternLM3 Dense 8B Model ( #6640 )
* support internlm3
* update
* update
* update
* add hint
Former-commit-id: 24ab7ae0944c5f373e9cac60f0332e704824a057
2025-01-14 18:07:27 +08:00
Zhangchi Feng
f7857c83e1
Support Inference of MiniCPM-V-2.6 and MiniCPM-o-2.6 ( #6631 )
* fix template name
* tiny fix
* support minicpm-o-2.6
* support inference of minicpmv
Former-commit-id: 7f3c64e853a7cdd49d02bf85e237611941ac7fa8
2025-01-14 17:34:58 +08:00
hiyouga
d670d62a66
generalized packing & fix #6343
Former-commit-id: 3b1e4194616cacd5c24f08b328e31a008bddcf29
2024-12-17 10:26:19 +00:00
hiyouga
5003820a6a
fix inputs
Former-commit-id: 7d535bb8cdf7e81edda81152e63c8cfe6c9dcc9f
2024-11-23 18:26:02 +00:00
hiyouga
093eda2ad6
support rank0 logger
Former-commit-id: 84528eabe560091bfd866b6a0ca864085af7529b
2024-11-02 18:31:04 +08:00
hiyouga
35862d19ec
fix #5611
Former-commit-id: 76c813d37c1d945a8bb6d3e4168e15fbe97c7a87
2024-10-06 10:33:11 +08:00
hiyouga
20ee1d2e19
fix #5542
Former-commit-id: cf28e7418c2eb07e86923a53ef832ef218e45af1
2024-09-30 23:28:55 +08:00
hiyouga
cf72aec098
add patch processor func
Former-commit-id: 0cd6327da6a044b4a62f203a662e5bb6068d9c29
2024-09-30 17:07:43 +08:00
hiyouga
2f6fc27c8b
remove visual_inputs, fix qlora
Former-commit-id: be30c01c4f1482520ece770bd54c6a4837c26f0a
2024-08-31 00:24:51 +08:00
hiyouga
206a8364d4
support liger kernel
Former-commit-id: 0f4e54abf6c5feb2329855a4047597ad5147720a
2024-08-27 11:20:14 +08:00
hiyouga
a041f4a111
tiny fix
Former-commit-id: bf6a2f032c598f969708c1c3db4875d6239c41a9
2024-07-22 21:10:15 +08:00
hoshi-hiyouga
cdf9dae53e
fix #4917
Former-commit-id: e26919aafd8436489d065789c9c25d72c8d05a6d
2024-07-22 11:28:31 +08:00
hiyouga
5ab997d484
fix gemma2 attention
Former-commit-id: aeafc68e169ae0ea5939cc81cb0cf89f0ca044b6
2024-07-13 23:33:45 +08:00
hiyouga
1408aa078d
update arg name
Former-commit-id: 1509ed550b2060f946ce20e3c5a9e5c49e86e3ab
2024-07-03 23:23:24 +08:00
ancv
20fdf177e8
move efficient_packing from data_args to model_args
Former-commit-id: 7b61659c707480bcf8c802c73e10d12ad5b9b965
2024-07-02 18:37:55 +07:00
hoshi-hiyouga
a715490c2a
Merge branch 'main' into main
Former-commit-id: 7be442f37d53a0c6324728fa1fa8e2c84d7f0fa5
2024-07-01 21:01:09 +08:00
hiyouga
fda2cf677b
bf16 by default, gemma2 attns
Gemma2 finetuning cannot work until merging https://github.com/huggingface/transformers/pull/31674
Former-commit-id: da66c32c7be0adc28d2185b23e9f62d56acb961c
2024-06-28 06:00:26 +08:00
hiyouga
a79e93f335
fix #4410
Former-commit-id: f49adc4ab5eade21d7a9e029212f17688ee9b0cf
2024-06-24 22:34:31 +08:00
ancv
6c185a2c57
move configure_packing to llamafactory.model.patcher and fix constants
Former-commit-id: 9c5e972c9c81957f2e9e30bf284ef1c076de9fd0
2024-06-21 00:45:06 +07:00
hiyouga
af2cb33bb2
tiny fix
Former-commit-id: 2d8d47f6126d68db1701ed18fc31310c6f14dd49
2024-06-20 22:56:05 +08:00