LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2026-06-19 13:48:55 +08:00

Author	SHA1	Message	Date
hoshi-hiyouga	3ae5da2a04	[model] fix dsv3 leaf node (#7879)	2025-04-28 18:11:09 +08:00
Kingsley	1157f4e246	fix attn patch for kimivl (#7867)	2025-04-27 23:12:28 +08:00
hoshi-hiyouga	2233b739fa	[model] fix vit gradient checkpointing (#7830)	2025-04-23 22:48:48 +08:00
hoshi-hiyouga	c1a7f2ebb2	[model] fix moe zero3 (#7826)	2025-04-23 15:30:49 +08:00
hoshi-hiyouga	b07628dea5	[example] add bash usage (#7794)	2025-04-22 00:25:51 +08:00
ddddng	c5ba9106ec	[model] fix gemma3 export (#7786) Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-21 23:07:11 +08:00
hoshi-hiyouga	39169986ef	[trainer] fix pt loss (#7748) * fix pt loss * robust * fix * test	2025-04-17 03:15:35 +08:00
hoshi-hiyouga	86ebb219d6	[breaking] bump transformers to 4.45.0 & improve ci (#7746) * update ci * fix * fix * fix * fix * fix	2025-04-17 02:36:48 +08:00
Kingsley	2e518f255f	[model] support intern-VL 2.5-3 series (#7258) * add internvl and rebase * fix for internvl2&3 * remove lines * fix video_inputs & lint * nit * add constants * remove lines * fix * fix error * pass ci * pass ci * skip internvl & nit	2025-04-17 00:31:30 +08:00
Kingsley	2101399c94	[model] Support Kimi_VL thinking/instruct (#7719) * add kimi_vl * patch config * check version * Update mm_plugin.py * Update mm_plugin.py --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-15 00:21:58 +08:00
Dain Kim	1c436c9f25	[bugfix] enable_gemma_liger_kernel (#7660) - The `enable_liger_kernel` function for the Gemma model series was not executed due to the existing `if` statement in the code. - Changed the line to an `elif` statement so that the `apply_liger_kernel` function is executed properly. resolved: #7628	2025-04-10 11:27:30 +08:00
hoshi-hiyouga	c3c0efbaa0	[misc] fix packing and eval plot (#7623)	2025-04-07 18:20:57 +08:00
hoshi-hiyouga	831e7f1cfd	[model] add llama4 (#7611)	2025-04-06 13:42:31 +08:00
hoshi-hiyouga	5e22597ff1	[infer] vllm video/audio inference (#7566)	2025-04-02 02:27:04 +08:00
hoshi-hiyouga	2bfcad2394	[model] fix kv cache (#7564)	2025-04-01 23:07:46 +08:00
Kingsley	7eed496336	[model] add Qwen2.5-Omni model (#7537) * preserve image_sizes * preserve image_sizes * init plugin * support audio-text2text lora * nit * support image/video-text2text, audio-text2text * remove args * remove lines * add docs && nit * remove some comments * fix && add merge part script * add license	2025-03-31 20:39:35 +08:00
Xiaosu Zhu	6b3b97c738	[misc] update liger-kernel's monkey patch (#7453) * Update liger_kernel.py * Update setup.py	2025-03-25 11:58:52 +08:00
AbdelKarim ELJANDOUBI	6d3748f727	[misc] enable liger kernel for gemma3 text and paligemma (#7466) * add gemma3 text * add paligemma (1,2 and 2 mix)	2025-03-25 09:27:43 +08:00
Kenny Lam	7c890170e3	[misc] enable liger kernel for gemma3 (#7462)	2025-03-24 19:09:59 +08:00
hoshi-hiyouga	650a9a9057	[misc] update format (#7277)	2025-03-13 02:53:08 +08:00
hoshi-hiyouga	4b9d8da5a4	[model] support gemma3 (#7273)	2025-03-13 01:35:23 +08:00
hoshi-hiyouga	264538cb26	[misc] upgrade format to py39 (#7256)	2025-03-12 00:08:41 +08:00
hoshi-hiyouga	71a1c1321a	[config] update args (#7231) Former-commit-id: f71a901840811bf560df671ec63a146ff99140c6	2025-03-10 23:04:43 +08:00
hoshi-hiyouga	a893505924	[misc] fix lora regex (#6944) * fix lora regex * fix Former-commit-id: 1d0ecbaee1b72f1e03154ddd4fcc8b7876e01f89	2025-02-14 21:38:43 +08:00
hoshi-hiyouga	ed25e051a9	[misc] fix grad ckpt (#6931) Former-commit-id: deae1fc9a0bea5c8b8be1564cf9c81c9c02a0b3a	2025-02-13 23:27:51 +08:00
hoshi-hiyouga	5e5fc337f9	[model] add liger kernel to qwen2_5 vl (#6930) * add liger kernel to qwen2_5 vl * fix patch * fix patch Former-commit-id: 828776d155986166498dfc907194f64436571106	2025-02-13 23:05:54 +08:00
hoshi-hiyouga	3a3f4072e5	[misc] fix grad ckpt func (#6916) Former-commit-id: 35e069a52b3d7cfd9b0107574b09265eb2290f0b	2025-02-13 00:17:18 +08:00
hoshi-hiyouga	4d1791e905	[deps] upgrade vllm (#6857) Former-commit-id: 4bd50f65a3d62528768561019fda2723d045c7fd	2025-02-08 15:02:28 +08:00
Zhangchi Feng	8f401e37f8	[model] support audio (#6701) * support qwen2_audio * improve code * lint * fix * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: 5eacb5629e4d7733cd992a63747a1335f2c6a929	2025-02-05 04:59:09 +08:00
hoshi-hiyouga	c2022431aa	[misc] update license year & fix llama pro (#6814) * fix llamapro script * change year Former-commit-id: d9ae594178796994d400a5f207d6499712816f89	2025-02-05 01:53:33 +08:00
hoshi-hiyouga	a28261a866	[model] add mistral small models (#6786) Former-commit-id: e5e95c39bc4199fa89c67e34f9adaaa987058744	2025-02-01 04:31:38 +08:00
hoshi-hiyouga	800de98dc8	[model] add qwen2.5 vl models (#6779) Former-commit-id: ed46fb4f6194c30060b908092464dded12e5787c	2025-01-31 03:00:29 +08:00
hoshi-hiyouga	222423bcef	[breaking] support transformers 4.48 (#6628) Former-commit-id: f154ab175c513a4d7bb866bf2cffc34b77b50508	2025-01-31 01:36:33 +08:00
hoshi-hiyouga	87d685b59f	[model] support yarn (#6693) Former-commit-id: 8c412abc44a4c61b683465e36c6288580d980250	2025-01-18 13:56:09 +08:00
hoshi-hiyouga	33525a34b6	fix qwen2 moe (#6684) Former-commit-id: ab624419fa0ab23ef7a331a0ec14e393328772b5	2025-01-17 13:46:09 +08:00
hoshi-hiyouga	7638f1070e	[optim] clean apollo (#6645) * clean apollo code * update readme Former-commit-id: 38b8ec4a99189483124b54df9d6bc6b0d318855a	2025-01-15 01:42:50 +08:00
zhuHQ	c2120432db	[optim] add support to APOLLO (#6617) Former-commit-id: 5a252e5a458457adbd19da3b68a3897ad2962824	2025-01-15 00:24:56 +08:00
Zhangchi Feng	f7857c83e1	Support Inference of MiniCPM-V-2.6 and MiniCPM-o-2.6 (#6631) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv Former-commit-id: 7f3c64e853a7cdd49d02bf85e237611941ac7fa8	2025-01-14 17:34:58 +08:00
Zhangchi Feng	ae32c148d1	Support new features of MiniCPM-V (#6626) * fix template name * tiny fix * support minicpm-o-2.6 Former-commit-id: 53034a61c7654358f46916cbc370910fb2aeff3b	2025-01-14 00:26:19 +08:00
codingma	11c38b9173	add nf4 qlora support on Ascend NPU (#6601) * add nf4 qlora support on Ascend NPU * add transformers version check * add python>=3.10 requirement description for npu * tiny fix --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: 7912d1acac5f10dab22145fe729a90c57aad8d85	2025-01-13 19:43:36 +08:00
fzc8578	9dc7b6c7ac	adapt to new mllm_param Former-commit-id: 0775b71965863c2618c117726a1046a36d6d85b8	2025-01-11 00:16:34 +08:00
Zhangchi Feng	627548bf7f	Merge branch 'main' into minicpmv Former-commit-id: 8a9c90759feda975faadc5858bd44b7ea116e7fb	2025-01-11 00:01:36 +08:00
hiyouga	dc65ecdf09	refactor mllm param logic Former-commit-id: b895c190945cf5d991cb4e4dea2ae73cc9c8d246	2025-01-10 15:45:48 +00:00
fzc8578	e63c2df0b1	fix some Former-commit-id: cd5a1a8b9c6eb59d6e95f79573f60ad8668f1942	2025-01-10 20:27:06 +08:00
fzc8578	25d4889789	tiny fix Former-commit-id: f088e580d3bacd0eecd0c3bf17e928eb49832ba1	2025-01-10 20:15:39 +08:00
Zhangchi Feng	8c0a721c4c	Merge branch 'main' into minicpmv Former-commit-id: d8840ae416660e23f1d615ffd404f519360151d9	2025-01-10 20:12:07 +08:00
fzc8578	9e972bc9ec	add some Former-commit-id: fede563aeb716ba5d1e368fd3e1182e4e580d248	2025-01-10 20:01:22 +08:00
hiyouga	647c51a772	imporve log Former-commit-id: a6abf375975ffea3d51e1b944c9855b5f62ffac8	2025-01-08 09:56:10 +00:00
fzc8578	8c2a712247	add some Former-commit-id: b4790c66c126567bd193de52a564e3ce11c94769	2025-01-06 19:32:39 +08:00
fzc8578	2c120aa0df	add some Former-commit-id: 81176fe226da89eace89cb202bad68e73b7c2a02	2025-01-04 11:11:15 +08:00

138 Commits