LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2026-02-26 07:45:59 +08:00

Author	SHA1	Message	Date
sunyi0505	991267fd3b	[v1] support quantization (#10161 )	2026-02-12 20:37:41 +08:00
浮梦	5c52afa30d	[v1] support deepspeed (#10181 )	2026-02-12 17:24:30 +08:00
Junyou Su	675ce8cc7f	[algo] add ASFT (#10174 )	2026-02-12 13:12:14 +08:00
jiaqiw09	ab073f4c13	[v1] add LoRA/Freeze support and merge workflow (#10157 )	2026-02-12 13:02:09 +08:00
浮梦	bf04ca6af8	[deps] adapt to transformers v5 (#10147 ) Co-authored-by: frozenleaves <frozen@Mac.local> Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>	2026-02-02 12:07:19 +08:00
浮梦	f9f11dcb97	[v1] support training with fsdp2 (#9773 ) Co-authored-by: frozenleaves <frozen@Mac.local> Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>	2026-01-25 19:41:58 +08:00
Peilin Li	03a70ba8dd	[fix] correct ktransformers example config paths and templates (#9732 )	2026-01-08 10:52:50 +08:00
Yaowei Zheng	4c1eb922e2	[misc] fix parser (#9730 )	2026-01-07 17:36:08 +08:00
yanglele	e944dc442c	[feature] add support for EAFT loss (#9720 ) Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-01-06 23:07:12 +08:00
jiaqiw09	81b8a50aa5	[deps] Update pyproject.toml and requirements (#9714 ) Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>	2026-01-04 19:52:16 +08:00
Yaowei Zheng	95ac3f2373	[release] Bye 2025 (#9702 )	2025-12-31 22:22:40 +08:00
Yaowei Zheng	55590f5ece	[misc] fix ci with uv (#9676 )	2025-12-27 01:39:13 +08:00
Copilot	a1b1931b4a	[breaking] migrate from setuptools to uv (#9673 ) Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: hiyouga <16256802+hiyouga@users.noreply.github.com>	2025-12-26 22:47:23 +08:00
xvxuopop	e8deda53a1	[example] add Qwen3 series examples (#9624 ) Co-authored-by: UsernameFull <tohowtodoit@gmail.com>	2025-12-18 21:27:00 +08:00
sunyi0505	a0179772ab	[example] add deepspeed autotp config and example (#9602 )	2025-12-15 15:15:26 +08:00
Yaowei Zheng	5d56817e2b	[misc] lint (#9593 ) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-09 18:00:35 +08:00
xvxuopop	109162dc56	[fix] fix the issue when using fsdp2 with gradient checkpointing. (#9541 ) Co-authored-by: jin-yongxu <jinyongxu@h-partners.com>	2025-12-06 16:04:51 +08:00
jiaqiw09	165f3f073a	[examples] add fsdp config for multiple nodes (#9575 ) Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>	2025-12-05 23:22:48 +08:00
xvxuopop	955396e8a5	[example] correct the parameter errors in the examples file. (#9543 )	2025-11-27 17:38:38 +08:00
xvxuopop	2c4fb3c97e	[v1] Support fused moe kernel for qwen3vlmoe model. (#9532 )	2025-11-27 02:13:33 +08:00
浮梦	2b6f16f261	[model] temporarily support npu fused options on v0, powered by v1 kernels (#9520 ) Co-authored-by: frozenleaves <frozen@Mac.local>	2025-11-27 02:08:36 +08:00
Peilin Li	887c562d60	[example] Add KTransformers Qwen3MoE example (#9511 ) Co-authored-by: unknown <xiongchenhui@hisense.ad> Co-authored-by: Kingsley <kingsleydodonow@gmail.com>	2025-11-19 00:53:28 +08:00
Yaowei Zheng	eaf963f67f	[model] update kt code (#9406 )	2025-11-05 15:27:22 +08:00
Peilin Li	934b3084ee	[train] KTransformers SFT as backend engine for LLaMA-Factory (#9400 ) Co-authored-by: jimmy128 <jimmy128@noreply.gitcode.com> Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>	2025-11-04 15:54:12 +08:00
Kingsley	13170577b2	[feat] support megatron-LM training by mcore_adapter (#9237 ) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>	2025-10-26 16:21:30 +08:00
Ximing Xing	c867e28093	[model] adds semantic initialization support for special tokens (#9267 ) Co-authored-by: ximingxing <ximingxing@tencent.com>	2025-10-14 17:00:48 +08:00
Yaowei Zheng	a0d44c650a	[misc] add data files (#9224 )	2025-10-02 14:02:07 +08:00
Yaowei Zheng	bcc2c1fd8f	[misc] move wechat out (#9223 )	2025-10-02 02:06:09 +08:00
Ben Feuer	1c44b60e3e	[feat] fp8 training (#8960 ) Co-authored-by: Benjamin Feuer <penfever@gmail.com> Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>	2025-10-01 14:32:53 +08:00
Zeju Qiu	003a2acb1a	[feature] adding orthogonal finetuning (OFT) to llama factory (#8623 ) Co-authored-by: Zeju <zqiu@g003.internal.cluster.is.localnet> Co-authored-by: Zeju <zqiu@login2.is.localnet> Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>	2025-08-18 18:22:47 +08:00
XLXW	1ada15981a	[feature] add support for dft loss (#8917 )	2025-08-15 23:29:57 +08:00
Yaowei Zheng	4dfad24902	[model] add gpt oss (#8826 )	2025-08-06 05:56:46 +08:00
Butui Hu	1a33d65a56	[launcher] Add elastic and fault-tolerant training support (#8286 ) Signed-off-by: Butui Hu <hot123tea123@gmail.com>	2025-06-05 16:40:03 +08:00
hoshi-hiyouga	aa9ed4db59	[example] update examples (#7964 )	2025-05-06 17:24:25 +02:00
hoshi-hiyouga	73198a6645	[misc] fix uv (#7913 )	2025-04-30 07:45:03 +08:00
hoshi-hiyouga	b07628dea5	[example] add bash usage (#7794 )	2025-04-22 00:25:51 +08:00
Juanxi Tian	12ada72ed4	[trainer] Add Muon Optimizer (#7749 ) Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-21 23:38:37 +08:00
hoshi-hiyouga	416853dd25	[parser] support omegaconf (#7793 )	2025-04-21 23:30:30 +08:00
hoshi-hiyouga	d222f63cb7	[infer] set env for vllm ascend (#7745 )	2025-04-17 01:08:55 +08:00
leo-pony	b9263ff5ac	[infer] support vllm-ascend (#7739 )	2025-04-16 20:06:47 +08:00
Eric Tang	bb8d79bae2	[ray] allow for specifying ray.init kwargs (i.e. runtime_env) (#7647 ) * ray init kwargs * Update trainer_utils.py * fix ray args --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-10 11:31:05 +08:00
hoshi-hiyouga	c3c0efbaa0	[misc] fix packing and eval plot (#7623 )	2025-04-07 18:20:57 +08:00
hoshi-hiyouga	5115dc8c7f	[assets] update readme (#7612 )	2025-04-06 13:58:49 +08:00
hoshi-hiyouga	2bfcad2394	[model] fix kv cache (#7564 )	2025-04-01 23:07:46 +08:00
Qiaolin Yu	a44a53ebec	[inference] support sglang backend (#7278 ) * Mimic SGLang offline Engine * Add more tests and args * Pass all current tests * Clean Code * fix sample_params * clean code * Fix Stream Chat * change sglang from engine mode to server mode * fix * Fix Review Issues * Use SGLang Built-In Utilities * Fix test SGLang * Some Doc Issue * fix sglang engine * add readme --------- Co-authored-by: Jin Pan <jpan236@wisc.edu> Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>	2025-03-15 04:37:58 +08:00
hoshi-hiyouga	71a1c1321a	[config] update args (#7231 ) Former-commit-id: f71a901840811bf560df671ec63a146ff99140c6	2025-03-10 23:04:43 +08:00
hoshi-hiyouga	82a2bac866	[misc] fix ds config (#7205 ) Former-commit-id: b478fa1d9de1858075769f86f57126fde92db813	2025-03-07 15:21:28 +08:00
hoshi-hiyouga	7b985f55db	[trainer] update config (#7174 ) Former-commit-id: 9f535d0e3c4ee3cd0f1b65218c2eee5d03f43c6f	2025-03-05 23:32:54 +08:00
hoshi-hiyouga	c1d5073bd3	[model] add models (#7054 ) * add qwen25vl awq models * add moonlight Former-commit-id: ae3be2970fea8a35907202a313ab767381c44916	2025-02-24 22:05:13 +08:00
hoshi-hiyouga	f5cd17881e	[data] update vlm args (#6976 ) Former-commit-id: c28e710636a0286d4b8a1d494529b25168a8f3ab	2025-02-18 02:12:51 +08:00

173 Commits