LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2026-06-18 13:18:57 +08:00

Author	SHA1	Message	Date
hoshi-hiyouga	4fbdc65fcb	[model] fix vit gradient checkpointing (#7830 )	2025-04-23 22:48:48 +08:00
hoshi-hiyouga	1344416378	[model] fix moe zero3 (#7826 )	2025-04-23 15:30:49 +08:00
Kingsley	1dd67eb042	[data] fix internvl plugin (#7817 )	2025-04-23 00:58:22 +08:00
Kingsley	d43013f14a	[model] add arch check for InternVL (#7803 )	2025-04-22 16:38:05 +08:00
hoshi-hiyouga	92101f34a1	[data] improve mmplugin (#7795 )	2025-04-22 01:25:33 +08:00
hoshi-hiyouga	a62cba3d05	[example] add bash usage (#7794 )	2025-04-22 00:25:51 +08:00
flashJd	1302ca39f6	[misc] fix new tokens adding (#7253 ) Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-21 23:19:02 +08:00
ddddng	b8cddbc7d7	[model] fix gemma3 export (#7786 ) Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-21 23:07:11 +08:00
Sachin Beldona	ec7257e70f	[misc] fix bug in constant (#7765 ) Co-authored-by: Sachin Beldona <sbeldona@cs.cmu.edu>	2025-04-21 23:06:31 +08:00
hoshi-hiyouga	610f164c69	[trainer] fix pt loss (#7748 ) * fix pt loss * robust * fix * test	2025-04-17 03:15:35 +08:00
hoshi-hiyouga	0a0cfeb782	[breaking] bump transformers to 4.45.0 & improve ci (#7746 ) * update ci * fix * fix * fix * fix * fix	2025-04-17 02:36:48 +08:00
hoshi-hiyouga	4831552856	[infer] set env for vllm ascend (#7745 )	2025-04-17 01:08:55 +08:00
Kingsley	125513fa5c	[model] support intern-VL 2.5-3 series (#7258 ) * add internvl and rebase * fix for internvl2&3 * remove lines * fix video_inputs & lint * nit * add constants * remove lines * fix * fix error * pass ci * pass ci * skip internvl & nit	2025-04-17 00:31:30 +08:00
Kingsley	df8752e8ee	[model] Support Kimi_VL thinking/instruct (#7719 ) * add kimi_vl * patch config * check version * Update mm_plugin.py * Update mm_plugin.py --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-15 00:21:58 +08:00
hoshi-hiyouga	3a13d2cdb1	[misc] fix env vars (#7715 )	2025-04-14 16:04:04 +08:00
hoshi-hiyouga	3ef36d0057	[misc] upgrade cli (#7714 )	2025-04-14 15:41:22 +08:00
Dain Kim	ee840b4e01	[bugfix] enable_gemma_liger_kernel (#7660 ) - The `enable_liger_kernel` function for the Gemma model series was not executed due to the existing `if` statement in the code. - Changed the line to an `elif` statement so that the `apply_liger_kernel` function is executed properly. resolved: #7628	2025-04-10 11:27:30 +08:00
Kingsley	7d8bee96fc	[data] Fix bugs of `use_audio_in_video` in Qwen2.5 Omni (#7638 ) * cache _mm_inputs * nit * support for use_audio_in_video * remove cache * fix data * Update mllm_video_audio_demo.json	2025-04-08 18:40:10 +08:00
hoshi-hiyouga	5817cda37e	[misc] fix packing and eval plot (#7623 )	2025-04-07 18:20:57 +08:00
hoshi-hiyouga	6c200fd218	[model] add llama4 (#7611 )	2025-04-06 13:42:31 +08:00
Kingsley	80f8d037d0	[data] fix qwen2.5 omni plugin (#7573 ) * align key with qwen2vl * nit && change scripts	2025-04-02 21:28:52 +08:00
hoshi-hiyouga	903db09822	[infer] vllm video/audio inference (#7566 )	2025-04-02 02:27:04 +08:00
hoshi-hiyouga	aaf2e6ba2a	[model] fix kv cache (#7564 )	2025-04-01 23:07:46 +08:00
Yu Shi Jie	9deece1d50	[model] fix use_cache patching for gemma3 multimodal (#7500 )	2025-04-01 16:06:48 +08:00
Kingsley	185c76f6ad	[model] add Qwen2.5-Omni model (#7537 ) * preserve image_sizes * preserve image_sizes * init plugin * support audio-text2text lora * nit * support image/video-text2text, audio-text2text * remove args * remove lines * add docs && nit * remove some comments * fix && add merge part script * add license	2025-03-31 20:39:35 +08:00
Xiaosu Zhu	bc9ada9db7	[misc] update liger-kernel's monkey patch (#7453 ) * Update liger_kernel.py * Update setup.py	2025-03-25 11:58:52 +08:00
AbdelKarim ELJANDOUBI	b6dc7e01e2	[misc] enable liger kernel for gemma3 text and paligemma (#7466 ) * add gemma3 text * add paligemma (1,2 and 2 mix)	2025-03-25 09:27:43 +08:00
Kenny Lam	59a56f7226	[misc] enable liger kernel for gemma3 (#7462 )	2025-03-24 19:09:59 +08:00
hoshi-hiyouga	ef5f1c1def	[data] gemma3 plugin pan and scan (#7294 ) * gemma3 pan and scan * add test case * fix test	2025-03-13 23:29:23 +08:00
hoshi-hiyouga	9ccfb97a2c	[misc] update format (#7277 )	2025-03-13 02:53:08 +08:00
hoshi-hiyouga	165d3ed084	[model] support gemma3 (#7273 )	2025-03-13 01:35:23 +08:00
hoshi-hiyouga	7c1640ed5f	[misc] upgrade format to py39 (#7256 )	2025-03-12 00:08:41 +08:00
hoshi-hiyouga	5a29f49fb1	[config] update args (#7231 ) Former-commit-id: `ed8b12e3cb`	2025-03-10 23:04:43 +08:00
hoshi-hiyouga	1f4a0b11ba	[data] update vlm args (#6976 ) Former-commit-id: `3da2cc2710`	2025-02-18 02:12:51 +08:00
hoshi-hiyouga	b1d31ff0f9	[data] add min resolution option (#6975 ) Former-commit-id: `7faecc0301`	2025-02-18 01:40:46 +08:00
hoshi-hiyouga	2baf8bf03d	[misc] fix lora regex (#6944 ) * fix lora regex * fix Former-commit-id: `1ada3ae5a3`	2025-02-14 21:38:43 +08:00
hoshi-hiyouga	13e1b7ee2b	[misc] fix grad ckpt (#6931 ) Former-commit-id: `c31c63b411`	2025-02-13 23:27:51 +08:00
hoshi-hiyouga	cd493b91de	[model] add liger kernel to qwen2_5 vl (#6930 ) * add liger kernel to qwen2_5 vl * fix patch * fix patch Former-commit-id: `797043d29c`	2025-02-13 23:05:54 +08:00
hoshi-hiyouga	036fb0d561	[misc] fix grad ckpt func (#6916 ) Former-commit-id: `e34c3c06da`	2025-02-13 00:17:18 +08:00
hoshi-hiyouga	ff6658ad27	[deps] upgrade vllm (#6857 ) Former-commit-id: `5f38bcaba9`	2025-02-08 15:02:28 +08:00
Zhangchi Feng	01915eaf40	[model] support audio (#6701 ) * support qwen2_audio * improve code * lint * fix * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `24c7842948`	2025-02-05 04:59:09 +08:00
hoshi-hiyouga	1fee69f874	[misc] update license year & fix llama pro (#6814 ) * fix llamapro script * change year Former-commit-id: `e2dc5b952a`	2025-02-05 01:53:33 +08:00
Zhangchi Feng	85f22d01bf	[data] fix minicpmv plugin (#6801 ) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv * update readme * support dpo of minicpmv * update init audio * update init audio * [model]fix image process in minicpmo Former-commit-id: `ab9bd068ef`	2025-02-04 21:20:15 +08:00
hoshi-hiyouga	445d643ef3	[model] add mistral small models (#6786 ) Former-commit-id: `94803d8133`	2025-02-01 04:31:38 +08:00
hoshi-hiyouga	e8c1979b79	[model] add qwen2.5 vl models (#6779 ) Former-commit-id: `999c7c8fe0`	2025-01-31 03:00:29 +08:00
hoshi-hiyouga	f6779b0e0c	[breaking] support transformers 4.48 (#6628 ) Former-commit-id: `15357cdad9`	2025-01-31 01:36:33 +08:00
hoshi-hiyouga	1efe525df7	[model] support yarn (#6693 ) Former-commit-id: `1f47b6186c`	2025-01-18 13:56:09 +08:00
hoshi-hiyouga	788accb601	fix qwen2 moe (#6684 ) Former-commit-id: `7bf09abf1c`	2025-01-17 13:46:09 +08:00
hoshi-hiyouga	9ef85f8fc4	[optim] clean apollo (#6645 ) * clean apollo code * update readme Former-commit-id: `7a04021d04`	2025-01-15 01:42:50 +08:00
zhuHQ	763f9b9df0	[optim] add support to APOLLO (#6617 ) Former-commit-id: `d9189f9f0b`	2025-01-15 00:24:56 +08:00

201 Commits