LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2025-12-29 18:20:35 +08:00

Author	SHA1	Message	Date
Kingsley	ae2a8100ba	[model] support Mistral3.1 small 2503 (#8335 )	2025-06-09 10:37:42 +08:00
Vivek Iyer	b441ecdbde	[model] pushing FFT with unsloth (#8325 ) Co-authored-by: viyer <vivek_iyer2@apple.com>	2025-06-07 08:20:58 +08:00
hoshi-hiyouga	abb581026f	[deps] update to transformers 4.52 (#8125 )	2025-05-21 05:16:18 +08:00
hoshi-hiyouga	8325087bb3	[model] switch to gptqmodel (#8108 )	2025-05-19 22:25:40 +08:00
piamo	8dc195e4ad	[model] update rope kwargs for yarn (#8101 )	2025-05-19 20:07:54 +08:00
hoshi-hiyouga	d25c556714	[misc] update liger kernel patch (#7966 )	2025-05-06 20:32:16 +02:00
hoshi-hiyouga	f7275e1ef5	[model] fix dsv3 leaf node (#7879 )	2025-04-28 18:11:09 +08:00
Kingsley	3e2460bb38	fix attn patch for kimivl (#7867 )	2025-04-27 23:12:28 +08:00
hoshi-hiyouga	95f92df771	[model] fix vit gradient checkpointing (#7830 )	2025-04-23 22:48:48 +08:00
hoshi-hiyouga	d5f35f8b6c	[model] fix moe zero3 (#7826 )	2025-04-23 15:30:49 +08:00
hoshi-hiyouga	cea9071ed1	[example] add bash usage (#7794 )	2025-04-22 00:25:51 +08:00
ddddng	0313fbd8b0	[model] fix gemma3 export (#7786 ) Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-21 23:07:11 +08:00
hoshi-hiyouga	8208cbf1dc	[trainer] fix pt loss (#7748 ) * fix pt loss * robust * fix * test	2025-04-17 03:15:35 +08:00
hoshi-hiyouga	a0818eae58	[breaking] bump transformers to 4.45.0 & improve ci (#7746 ) * update ci * fix * fix * fix * fix * fix	2025-04-17 02:36:48 +08:00
Kingsley	7a00670f70	[model] support intern-VL 2.5-3 series (#7258 ) * add internvl and rebase * fix for internvl2&3 * remove lines * fix video_inputs & lint * nit * add constants * remove lines * fix * fix error * pass ci * pass ci * skip internvl & nit	2025-04-17 00:31:30 +08:00
Kingsley	d1b695cd9f	[model] Support Kimi_VL thinking/instruct (#7719 ) * add kimi_vl * patch config * check version * Update mm_plugin.py * Update mm_plugin.py --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-15 00:21:58 +08:00
Dain Kim	e60249d597	[bugfix] enable_gemma_liger_kernel (#7660 ) - The `enable_liger_kernel` function for the Gemma model series was not executed due to the existing `if` statement in the code. - Changed the line to an `elif` statement so that the `apply_liger_kernel` function is executed properly. resolved: #7628	2025-04-10 11:27:30 +08:00
hoshi-hiyouga	fb46193364	[misc] fix packing and eval plot (#7623 )	2025-04-07 18:20:57 +08:00
hoshi-hiyouga	40fb24916f	[model] add llama4 (#7611 )	2025-04-06 13:42:31 +08:00
hoshi-hiyouga	be0289292d	[infer] vllm video/audio inference (#7566 )	2025-04-02 02:27:04 +08:00
hoshi-hiyouga	37d783149d	[model] fix kv cache (#7564 )	2025-04-01 23:07:46 +08:00
Kingsley	1189aeb6c2	[model] add Qwen2.5-Omni model (#7537 ) * preserve image_sizes * preserve image_sizes * init plugin * support audio-text2text lora * nit * support image/video-text2text, audio-text2text * remove args * remove lines * add docs && nit * remove some comments * fix && add merge part script * add license	2025-03-31 20:39:35 +08:00
Xiaosu Zhu	d38c402f63	[misc] update liger-kernel's monkey patch (#7453 ) * Update liger_kernel.py * Update setup.py	2025-03-25 11:58:52 +08:00
AbdelKarim ELJANDOUBI	ce089ef8f6	[misc] enable liger kernel for gemma3 text and paligemma (#7466 ) * add gemma3 text * add paligemma (1,2 and 2 mix)	2025-03-25 09:27:43 +08:00
Kenny Lam	cad8bde6b1	[misc] enable liger kernel for gemma3 (#7462 )	2025-03-24 19:09:59 +08:00
hoshi-hiyouga	1b1964714e	[misc] update format (#7277 )	2025-03-13 02:53:08 +08:00
hoshi-hiyouga	a54c859674	[model] support gemma3 (#7273 )	2025-03-13 01:35:23 +08:00
hoshi-hiyouga	efa86e730c	[misc] upgrade format to py39 (#7256 )	2025-03-12 00:08:41 +08:00
hoshi-hiyouga	c6331546a9	[config] update args (#7231 ) Former-commit-id: f71a901840811bf560df671ec63a146ff99140c6	2025-03-10 23:04:43 +08:00
hoshi-hiyouga	1cda37892e	[misc] fix lora regex (#6944 ) * fix lora regex * fix Former-commit-id: 1d0ecbaee1b72f1e03154ddd4fcc8b7876e01f89	2025-02-14 21:38:43 +08:00
hoshi-hiyouga	6ebe81e04d	[misc] fix grad ckpt (#6931 ) Former-commit-id: deae1fc9a0bea5c8b8be1564cf9c81c9c02a0b3a	2025-02-13 23:27:51 +08:00
hoshi-hiyouga	a9b4e229af	[model] add liger kernel to qwen2_5 vl (#6930 ) * add liger kernel to qwen2_5 vl * fix patch * fix patch Former-commit-id: 828776d155986166498dfc907194f64436571106	2025-02-13 23:05:54 +08:00
hoshi-hiyouga	8cbfa350fd	[misc] fix grad ckpt func (#6916 ) Former-commit-id: 35e069a52b3d7cfd9b0107574b09265eb2290f0b	2025-02-13 00:17:18 +08:00
hoshi-hiyouga	c322512037	[deps] upgrade vllm (#6857 ) Former-commit-id: 4bd50f65a3d62528768561019fda2723d045c7fd	2025-02-08 15:02:28 +08:00
Zhangchi Feng	46a1786595	[model] support audio (#6701 ) * support qwen2_audio * improve code * lint * fix * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: 5eacb5629e4d7733cd992a63747a1335f2c6a929	2025-02-05 04:59:09 +08:00
hoshi-hiyouga	40b6e9045d	[misc] update license year & fix llama pro (#6814 ) * fix llamapro script * change year Former-commit-id: d9ae594178796994d400a5f207d6499712816f89	2025-02-05 01:53:33 +08:00
hoshi-hiyouga	e335c548c1	[model] add mistral small models (#6786 ) Former-commit-id: e5e95c39bc4199fa89c67e34f9adaaa987058744	2025-02-01 04:31:38 +08:00
hoshi-hiyouga	1132aaa53c	[model] add qwen2.5 vl models (#6779 ) Former-commit-id: ed46fb4f6194c30060b908092464dded12e5787c	2025-01-31 03:00:29 +08:00
hoshi-hiyouga	46068b3324	[breaking] support transformers 4.48 (#6628 ) Former-commit-id: f154ab175c513a4d7bb866bf2cffc34b77b50508	2025-01-31 01:36:33 +08:00
hoshi-hiyouga	87db2a849a	[model] support yarn (#6693 ) Former-commit-id: 8c412abc44a4c61b683465e36c6288580d980250	2025-01-18 13:56:09 +08:00
hoshi-hiyouga	b2f6d001bf	fix qwen2 moe (#6684 ) Former-commit-id: ab624419fa0ab23ef7a331a0ec14e393328772b5	2025-01-17 13:46:09 +08:00
hoshi-hiyouga	33d420bbcc	[optim] clean apollo (#6645 ) * clean apollo code * update readme Former-commit-id: 38b8ec4a99189483124b54df9d6bc6b0d318855a	2025-01-15 01:42:50 +08:00
zhuHQ	9b29a431db	[optim] add support to APOLLO (#6617 ) Former-commit-id: 5a252e5a458457adbd19da3b68a3897ad2962824	2025-01-15 00:24:56 +08:00
Zhangchi Feng	068d44b509	Support Inference of MiniCPM-V-2.6 and MiniCPM-o-2.6 (#6631 ) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv Former-commit-id: 7f3c64e853a7cdd49d02bf85e237611941ac7fa8	2025-01-14 17:34:58 +08:00
Zhangchi Feng	429a027832	Support new features of MiniCPM-V (#6626 ) * fix template name * tiny fix * support minicpm-o-2.6 Former-commit-id: 53034a61c7654358f46916cbc370910fb2aeff3b	2025-01-14 00:26:19 +08:00
codingma	6def336d82	add nf4 qlora support on Ascend NPU (#6601 ) * add nf4 qlora support on Ascend NPU * add transformers version check * add python>=3.10 requirement description for npu * tiny fix --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: 7912d1acac5f10dab22145fe729a90c57aad8d85	2025-01-13 19:43:36 +08:00
fzc8578	7e3372c035	adapt to new mllm_param Former-commit-id: 0775b71965863c2618c117726a1046a36d6d85b8	2025-01-11 00:16:34 +08:00
Zhangchi Feng	a53daaf821	Merge branch 'main' into minicpmv Former-commit-id: 8a9c90759feda975faadc5858bd44b7ea116e7fb	2025-01-11 00:01:36 +08:00
hiyouga	e49c021e22	refactor mllm param logic Former-commit-id: b895c190945cf5d991cb4e4dea2ae73cc9c8d246	2025-01-10 15:45:48 +00:00
fzc8578	3a8f989faa	fix some Former-commit-id: cd5a1a8b9c6eb59d6e95f79573f60ad8668f1942	2025-01-10 20:27:06 +08:00


144 Commits