LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2026-06-17 20:58:54 +08:00

Author	SHA1	Message	Date
Eric Tang	6c53471de2	[data] support for specifying a dataset in cloud storage (#7567 ) * add support for loading datasets from s3/gcs * add comments to readme * run linter and address comments * add option to pass in kwargs to ray init (i.e. runtime env) * address comment * revert mixed up changes	2025-04-10 11:31:35 +08:00
Eric Tang	39c1e29ed7	[ray] allow for specifying ray.init kwargs (i.e. runtime_env) (#7647 ) * ray init kwargs * Update trainer_utils.py * fix ray args --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-10 11:31:05 +08:00
Dain Kim	ee840b4e01	[bugfix] enable_gemma_liger_kernel (#7660 ) - The `enable_liger_kernel` function for the Gemma model series was not executed due to the existing `if` statement in the code. - Changed the line to an `elif` statement so that the `apply_liger_kernel` function is executed properly. resolved: #7628	2025-04-10 11:27:30 +08:00
jilongW	3bdc7e1e6c	[misc] fix cuda warn on intel GPU (#7655 )	2025-04-09 21:37:54 +08:00
hoshi-hiyouga	34fdabe005	[data] add coig-p dataset (#7657 )	2025-04-09 21:18:25 +08:00
hoshi-hiyouga	39876b85fc	[assets] update readme (#7644 )	2025-04-09 01:06:06 +08:00
Kingsley	7d8bee96fc	[data] Fix bugs of `use_audio_in_video` in Qwen2.5 Omni (#7638 ) * cache _mm_inputs * nit * support for use_audio_in_video * remove cache * fix data * Update mllm_video_audio_demo.json	2025-04-08 18:40:10 +08:00
Shawn Tao	8f5f4cc559	[trainer] fix key error (#7635 )	2025-04-08 18:39:50 +08:00
hoshi-hiyouga	5817cda37e	[misc] fix packing and eval plot (#7623 )	2025-04-07 18:20:57 +08:00
hoshi-hiyouga	6c200fd218	[model] add llama4 (#7611 )	2025-04-06 13:42:31 +08:00
Kingsley	32cb086be1	[data] fix qwen2.5 omni plugin (#7578 ) * specific entry * Update mm_plugin.py * fix fps cal --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-02 23:58:39 +08:00
Kingsley	80f8d037d0	[data] fix qwen2.5 omni plugin (#7573 ) * align key with qwen2vl * nit && change scripts	2025-04-02 21:28:52 +08:00
gechengze	11997593be	[trainer] fix batch processing in PPO trainer (#7576 )	2025-04-02 21:17:48 +08:00
hoshi-hiyouga	903db09822	[infer] vllm video/audio inference (#7566 )	2025-04-02 02:27:04 +08:00
hoshi-hiyouga	aaf2e6ba2a	[model] fix kv cache (#7564 )	2025-04-01 23:07:46 +08:00
Yu Shi Jie	9deece1d50	[model] fix use_cache patching for gemma3 multimodal (#7500 )	2025-04-01 16:06:48 +08:00
Ritesh Goru	f06a74ad4e	[data] specify position_ids in PackedSupervisedDatasetProcessor for neat_packing (#7318 ) * use position_ids for neat_packing with fa2 * revert fa2 changes	2025-04-01 16:03:13 +08:00
taoharry	6faa6fb53d	[webui] fix launch with proxy (#7332 )	2025-04-01 15:52:56 +08:00
Billy Cao	5d1cc863a4	[data] shard the dataset to allow multiprocessing when streaming is enabled (#7530 ) * Shard the dataset when streaming to allow multiprocessing * Allow user to not set dataset_shards to ensure backward compatibility	2025-04-01 15:36:23 +08:00
Hao	6d6e0f44fc	[trainer] new kto mismatch pair creation strategy (#7509 )	2025-04-01 15:21:53 +08:00
hoshi-hiyouga	2d421c57bf	[data] fix qwen2.5 omni collator (#7553 )	2025-04-01 00:15:12 +08:00
Kingsley	185c76f6ad	[model] add Qwen2.5-Omni model (#7537 ) * preserve image_sizes * preserve image_sizes * init plugin * support audio-text2text lora * nit * support image/video-text2text, audio-text2text * remove args * remove lines * add docs && nit * remove some comments * fix && add merge part script * add license	2025-03-31 20:39:35 +08:00
Kingsley	b00cb2ed42	[data] fix pixtral plugin (#7505 ) * preserve `image_sizes` * add comments	2025-03-27 17:06:40 +08:00
Xu-pixel	f547334604	[3rdparty] support swanlab lark notification (#7481 )	2025-03-27 01:52:01 +08:00
Kdump	01166841cf	[trainer] fix wsd scheduler (#7304 ) * [trainer] Warmup_stable_decay supports setting the number of stable and decay steps according to the warmup_ratio ratio * Update trainer_utils.py --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-03-26 15:25:02 +08:00
hoshi-hiyouga	59e12bffe8	[model] add qwen2vl 32b & upgrade peft (#7469 ) * add qwen2vl 32b * fix ci * upgrade peft to 0.15 * fix ci * fix ci	2025-03-25 12:15:58 +08:00
GuoCoder	b6d8749bf3	[model] fix lora on quant models (#7456 ) Co-authored-by: root <root@ai>	2025-03-25 11:59:46 +08:00
Xiaosu Zhu	bc9ada9db7	[misc] update liger-kernel's monkey patch (#7453 ) * Update liger_kernel.py * Update setup.py	2025-03-25 11:58:52 +08:00
AbdelKarim ELJANDOUBI	b6dc7e01e2	[misc] enable liger kernel for gemma3 text and paligemma (#7466 ) * add gemma3 text * add paligemma (1,2 and 2 mix)	2025-03-25 09:27:43 +08:00
Kenny Lam	59a56f7226	[misc] enable liger kernel for gemma3 (#7462 )	2025-03-24 19:09:59 +08:00
hoshi-hiyouga	42e090d38b	[trainer] fix vlm loss for transformers 4.49 (#7448 )	2025-03-24 10:24:05 +08:00
hoshi-hiyouga	b1b78daf06	[deps] upgrade transformers to 4.50.0 (#7437 ) * upgrade transformers * fix hf cache * fix dpo trainer	2025-03-23 17:44:27 +08:00
hoshi-hiyouga	dfbe1391e9	[deps] upgrade vllm to 0.8 (#7436 )	2025-03-23 14:32:22 +08:00
Eric Tang	d8a5571be7	[3rdparty] fix redundant process group destroy for ray (#7395 ) * fix redundant process group destroy for ray * Update tuner.py --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-03-21 10:56:47 +08:00
hoshi-hiyouga	4a5d0f0ba7	[assets] update wechat (#7361 )	2025-03-18 21:31:09 +08:00
hoshi-hiyouga	c518146e62	[misc] set dev version (#7351 )	2025-03-18 00:10:53 +08:00
hoshi-hiyouga	1d2131e5cb	[data] fix template (#7349 )	2025-03-17 23:45:20 +08:00
Hertz	a71e685021	[model] support hunyuan 7b (#7317 ) * [Model]supported tencent-hunyuan model * [Model]supported tencent-hunyuan model(fix) * [Model]supported tencent-hunyuan model(fix)	2025-03-15 20:55:24 +08:00
Qiaolin Yu	30038d9ce7	[inference] support sglang backend (#7278 ) * Mimic SGLang offline Engine * Add more tests and args * Pass all current tests * Clean Code * fix sample_params * clean code * Fix Stream Chat * change sglang from engine mode to server mode * fix * Fix Review Issues * Use SGLang Built-In Utilities * Fix test SGLang * Some Doc Issue * fix sglang engine * add readme --------- Co-authored-by: Jin Pan <jpan236@wisc.edu> Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>	2025-03-15 04:37:58 +08:00
hoshi-hiyouga	ef5f1c1def	[data] gemma3 plugin pan and scan (#7294 ) * gemma3 pan and scan * add test case * fix test	2025-03-13 23:29:23 +08:00
Ritesh Goru	d7d79f7e06	[data] efficient 4d_attention_mask creation in neat_packing (#7272 )	2025-03-13 03:31:12 +08:00
hoshi-hiyouga	9ccfb97a2c	[misc] update format (#7277 )	2025-03-13 02:53:08 +08:00
hoshi-hiyouga	165d3ed084	[model] support gemma3 (#7273 )	2025-03-13 01:35:23 +08:00
hoshi-hiyouga	142fd7e755	[misc] upgrade deps (#7257 )	2025-03-12 00:33:47 +08:00
hoshi-hiyouga	7c1640ed5f	[misc] upgrade format to py39 (#7256 )	2025-03-12 00:08:41 +08:00
hoshi-hiyouga	7a7071e504	Merge pull request #7242 from hiyouga/hiyouga/release [release] release v0.9.2 Former-commit-id: 6b25268990bf225d84e29d4067595cf720fa12d8	2025-03-11 15:28:45 +08:00
hoshi-hiyouga	847ae972d0	Merge pull request #7247 from hiyouga/hiyouga/commit [misc] support print commit info Former-commit-id: 0f7ec4f8529a5d7ea2153b881335821038307bb7	2025-03-11 15:28:04 +08:00
hiyouga	99b71768a0	support commit info Former-commit-id: `af752b1c27`	2025-03-11 15:13:59 +08:00
hiyouga	37b844d929	remove exit in preprocess Former-commit-id: `1a800f9993`	2025-03-11 15:08:25 +08:00
hiyouga	f5810a6e47	release v0.9.2 Former-commit-id: `aaad963593`	2025-03-11 14:49:13 +08:00

1841 Commits