LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2026-06-18 21:28:55 +08:00

Author	SHA1	Message	Date
Shawn Tao	8f5f4cc559	[trainer] fix key error (#7635 )	2025-04-08 18:39:50 +08:00
hoshi-hiyouga	5817cda37e	[misc] fix packing and eval plot (#7623 )	2025-04-07 18:20:57 +08:00
hoshi-hiyouga	6c200fd218	[model] add llama4 (#7611 )	2025-04-06 13:42:31 +08:00
Kingsley	32cb086be1	[data] fix qwen2.5 omni plugin (#7578 ) * specific entry * Update mm_plugin.py * fix fps cal --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-02 23:58:39 +08:00
Kingsley	80f8d037d0	[data] fix qwen2.5 omni plugin (#7573 ) * align key with qwen2vl * nit && change scripts	2025-04-02 21:28:52 +08:00
gechengze	11997593be	[trainer] fix batch processing in PPO trainer (#7576 )	2025-04-02 21:17:48 +08:00
hoshi-hiyouga	903db09822	[infer] vllm video/audio inference (#7566 )	2025-04-02 02:27:04 +08:00
hoshi-hiyouga	aaf2e6ba2a	[model] fix kv cache (#7564 )	2025-04-01 23:07:46 +08:00
Yu Shi Jie	9deece1d50	[model] fix use_cache patching for gemma3 multimodal (#7500 )	2025-04-01 16:06:48 +08:00
Ritesh Goru	f06a74ad4e	[data] specify position_ids in PackedSupervisedDatasetProcessor for neat_packing (#7318 ) * use position_ids for neat_packing with fa2 * revert fa2 changes	2025-04-01 16:03:13 +08:00
taoharry	6faa6fb53d	[webui] fix launch with proxy (#7332 )	2025-04-01 15:52:56 +08:00
Billy Cao	5d1cc863a4	[data] shard the dataset to allow multiprocessing when streaming is enabled (#7530 ) * Shard the dataset when streaming to allow multiprocessing * Allow user to not set dataset_shards to ensure backward compatibility	2025-04-01 15:36:23 +08:00
Hao	6d6e0f44fc	[trainer] new kto mismatch pair creation strategy (#7509 )	2025-04-01 15:21:53 +08:00
hoshi-hiyouga	2d421c57bf	[data] fix qwen2.5 omni collator (#7553 )	2025-04-01 00:15:12 +08:00
Kingsley	185c76f6ad	[model] add Qwen2.5-Omni model (#7537 ) * preserve image_sizes * preserve image_sizes * init plugin * support audio-text2text lora * nit * support image/video-text2text, audio-text2text * remove args * remove lines * add docs && nit * remove some comments * fix && add merge part script * add license	2025-03-31 20:39:35 +08:00
Kingsley	b00cb2ed42	[data] fix pixtral plugin (#7505 ) * preserve `image_sizes` * add comments	2025-03-27 17:06:40 +08:00
Xu-pixel	f547334604	[3rdparty] support swanlab lark notification (#7481 )	2025-03-27 01:52:01 +08:00
Kdump	01166841cf	[trainer] fix wsd scheduler (#7304 ) * [trainer] Warmup_stable_decay supports setting the number of stable and decay steps according to the warmup_ratio ratio * Update trainer_utils.py --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-03-26 15:25:02 +08:00
hoshi-hiyouga	59e12bffe8	[model] add qwen2vl 32b & upgrade peft (#7469 ) * add qwen2vl 32b * fix ci * upgrade peft to 0.15 * fix ci * fix ci	2025-03-25 12:15:58 +08:00
GuoCoder	b6d8749bf3	[model] fix lora on quant models (#7456 ) Co-authored-by: root <root@ai>	2025-03-25 11:59:46 +08:00
Xiaosu Zhu	bc9ada9db7	[misc] update liger-kernel's monkey patch (#7453 ) * Update liger_kernel.py * Update setup.py	2025-03-25 11:58:52 +08:00
AbdelKarim ELJANDOUBI	b6dc7e01e2	[misc] enable liger kernel for gemma3 text and paligemma (#7466 ) * add gemma3 text * add paligemma (1,2 and 2 mix)	2025-03-25 09:27:43 +08:00
Kenny Lam	59a56f7226	[misc] enable liger kernel for gemma3 (#7462 )	2025-03-24 19:09:59 +08:00
hoshi-hiyouga	42e090d38b	[trainer] fix vlm loss for transformers 4.49 (#7448 )	2025-03-24 10:24:05 +08:00
hoshi-hiyouga	b1b78daf06	[deps] upgrade transformers to 4.50.0 (#7437 ) * upgrade transformers * fix hf cache * fix dpo trainer	2025-03-23 17:44:27 +08:00
hoshi-hiyouga	dfbe1391e9	[deps] upgrade vllm to 0.8 (#7436 )	2025-03-23 14:32:22 +08:00
Eric Tang	d8a5571be7	[3rdparty] fix redundant process group destroy for ray (#7395 ) * fix redundant process group destroy for ray * Update tuner.py --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-03-21 10:56:47 +08:00
hoshi-hiyouga	4a5d0f0ba7	[assets] update wechat (#7361 )	2025-03-18 21:31:09 +08:00
hoshi-hiyouga	c518146e62	[misc] set dev version (#7351 )	2025-03-18 00:10:53 +08:00
hoshi-hiyouga	1d2131e5cb	[data] fix template (#7349 )	2025-03-17 23:45:20 +08:00
Hertz	a71e685021	[model] support hunyuan 7b (#7317 ) * [Model]supported tencent-hunyuan model * [Model]supported tencent-hunyuan model(fix) * [Model]supported tencent-hunyuan model(fix)	2025-03-15 20:55:24 +08:00
Qiaolin Yu	30038d9ce7	[inference] support sglang backend (#7278 ) * Mimic SGLang offline Engine * Add more tests and args * Pass all current tests * Clean Code * fix sample_params * clean code * Fix Stream Chat * change sglang from engine mode to server mode * fix * Fix Review Issues * Use SGLang Built-In Utilities * Fix test SGLang * Some Doc Issue * fix sglang engine * add readme --------- Co-authored-by: Jin Pan <jpan236@wisc.edu> Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>	2025-03-15 04:37:58 +08:00
hoshi-hiyouga	ef5f1c1def	[data] gemma3 plugin pan and scan (#7294 ) * gemma3 pan and scan * add test case * fix test	2025-03-13 23:29:23 +08:00
Ritesh Goru	d7d79f7e06	[data] efficient 4d_attention_mask creation in neat_packing (#7272 )	2025-03-13 03:31:12 +08:00
hoshi-hiyouga	9ccfb97a2c	[misc] update format (#7277 )	2025-03-13 02:53:08 +08:00
hoshi-hiyouga	165d3ed084	[model] support gemma3 (#7273 )	2025-03-13 01:35:23 +08:00
hoshi-hiyouga	142fd7e755	[misc] upgrade deps (#7257 )	2025-03-12 00:33:47 +08:00
hoshi-hiyouga	7c1640ed5f	[misc] upgrade format to py39 (#7256 )	2025-03-12 00:08:41 +08:00
hoshi-hiyouga	7a7071e504	Merge pull request #7242 from hiyouga/hiyouga/release [release] release v0.9.2 Former-commit-id: 6b25268990bf225d84e29d4067595cf720fa12d8	2025-03-11 15:28:45 +08:00
hoshi-hiyouga	847ae972d0	Merge pull request #7247 from hiyouga/hiyouga/commit [misc] support print commit info Former-commit-id: 0f7ec4f8529a5d7ea2153b881335821038307bb7	2025-03-11 15:28:04 +08:00
hiyouga	99b71768a0	support commit info Former-commit-id: `af752b1c27`	2025-03-11 15:13:59 +08:00
hiyouga	37b844d929	remove exit in preprocess Former-commit-id: `1a800f9993`	2025-03-11 15:08:25 +08:00
hiyouga	f5810a6e47	release v0.9.2 Former-commit-id: `aaad963593`	2025-03-11 14:49:13 +08:00
hoshi-hiyouga	317d0855d2	[infer] fix vllm args (#7235 ) Former-commit-id: `ef7af457fc`	2025-03-11 01:15:35 +08:00
Ze-Yi LIN	0a43bc1960	[tracking] add swanlab_logdir param (#7219 ) * feat: add swanlab_logdir param * fix Former-commit-id: `a1e76af3d9`	2025-03-11 00:53:07 +08:00
hoshi-hiyouga	5a29f49fb1	[config] update args (#7231 ) Former-commit-id: `ed8b12e3cb`	2025-03-10 23:04:43 +08:00
hoshi-hiyouga	4e68828e46	[config] fix export max len (#7230 ) Former-commit-id: `728c2f6819`	2025-03-10 16:46:08 +08:00
hoshi-hiyouga	df63f05b47	[data] fix loader (#7207 ) * fix dataloader * add test case * fix type * fix ci * fix ci * fix ci * disable overwrite cache in ci Former-commit-id: `8c3f9f6747`	2025-03-07 17:20:46 +08:00
ZhangChuanhui	33b4c33279	[data] fix function formatter (#7201 ) Co-authored-by: zhangchuanhui <zhangchal@digitalchina.com> Former-commit-id: `194e3bddb2`	2025-03-07 15:17:23 +08:00
hoshi-hiyouga	113cc3d920	[misc] fix cli (#7204 ) Former-commit-id: `bd17223559`	2025-03-07 15:01:18 +08:00

1834 Commits