LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2026-03-07 12:15:59 +08:00

Author	SHA1	Message	Date
Qiaolin Yu	30038d9ce7	[inference] support sglang backend (#7278 ) * Mimic SGLang offline Engine * Add more tests and args * Pass all current tests * Clean Code * fix sample_params * clean code * Fix Stream Chat * change sglang from engine mode to server mode * fix * Fix Review Issues * Use SGLang Built-In Utilities * Fix test SGLang * Some Doc Issue * fix sglang engine * add readme --------- Co-authored-by: Jin Pan <jpan236@wisc.edu> Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>	2025-03-15 04:37:58 +08:00
hoshi-hiyouga	ef5f1c1def	[data] gemma3 plugin pan and scan (#7294 ) * gemma3 pan and scan * add test case * fix test	2025-03-13 23:29:23 +08:00
Ritesh Goru	d7d79f7e06	[data] efficient 4d_attention_mask creation in neat_packing (#7272 )	2025-03-13 03:31:12 +08:00
hoshi-hiyouga	9ccfb97a2c	[misc] update format (#7277 )	2025-03-13 02:53:08 +08:00
hoshi-hiyouga	165d3ed084	[model] support gemma3 (#7273 )	2025-03-13 01:35:23 +08:00
hoshi-hiyouga	142fd7e755	[misc] upgrade deps (#7257 )	2025-03-12 00:33:47 +08:00
hoshi-hiyouga	7c1640ed5f	[misc] upgrade format to py39 (#7256 )	2025-03-12 00:08:41 +08:00
hoshi-hiyouga	7a7071e504	Merge pull request #7242 from hiyouga/hiyouga/release [release] release v0.9.2 Former-commit-id: 6b25268990bf225d84e29d4067595cf720fa12d8	2025-03-11 15:28:45 +08:00
hoshi-hiyouga	847ae972d0	Merge pull request #7247 from hiyouga/hiyouga/commit [misc] support print commit info Former-commit-id: 0f7ec4f8529a5d7ea2153b881335821038307bb7	2025-03-11 15:28:04 +08:00
hiyouga	99b71768a0	support commit info Former-commit-id: `af752b1c27`	2025-03-11 15:13:59 +08:00
hiyouga	37b844d929	remove exit in preprocess Former-commit-id: `1a800f9993`	2025-03-11 15:08:25 +08:00
hiyouga	f5810a6e47	release v0.9.2 Former-commit-id: `aaad963593`	2025-03-11 14:49:13 +08:00
hoshi-hiyouga	317d0855d2	[infer] fix vllm args (#7235 ) Former-commit-id: `ef7af457fc`	2025-03-11 01:15:35 +08:00
Ze-Yi LIN	0a43bc1960	[tracking] add swanlab_logdir param (#7219 ) * feat: add swanlab_logdir param * fix Former-commit-id: `a1e76af3d9`	2025-03-11 00:53:07 +08:00
hoshi-hiyouga	5a29f49fb1	[config] update args (#7231 ) Former-commit-id: `ed8b12e3cb`	2025-03-10 23:04:43 +08:00
hoshi-hiyouga	4e68828e46	[config] fix export max len (#7230 ) Former-commit-id: `728c2f6819`	2025-03-10 16:46:08 +08:00
hoshi-hiyouga	df63f05b47	[data] fix loader (#7207 ) * fix dataloader * add test case * fix type * fix ci * fix ci * fix ci * disable overwrite cache in ci Former-commit-id: `8c3f9f6747`	2025-03-07 17:20:46 +08:00
ZhangChuanhui	33b4c33279	[data] fix function formatter (#7201 ) Co-authored-by: zhangchuanhui <zhangchal@digitalchina.com> Former-commit-id: `194e3bddb2`	2025-03-07 15:17:23 +08:00
hoshi-hiyouga	113cc3d920	[misc] fix cli (#7204 ) Former-commit-id: `bd17223559`	2025-03-07 15:01:18 +08:00
hoshi-hiyouga	eba31ae313	[webui] support escape html (#7190 ) Former-commit-id: `abb23f7673`	2025-03-06 16:52:21 +08:00
hoshi-hiyouga	e7556b591e	[deps] upgrade vllm (#7183 ) Former-commit-id: `d739fddb10`	2025-03-06 15:25:08 +08:00
hoshi-hiyouga	2b21c749c1	[data] fix mm template (#7181 ) Former-commit-id: `be66df1f02`	2025-03-06 15:18:32 +08:00
hoshi-hiyouga	002f58ef8e	[model] add QwQ 32b (#7179 ) Former-commit-id: `64a6fb9b50`	2025-03-06 11:58:36 +08:00
Ze-Yi LIN	c67d2b9327	[trainer] fix swanlab callback (#7176 ) Former-commit-id: `8ad03258e1`	2025-03-06 00:33:37 +08:00
hoshi-hiyouga	6e58115f98	[trainer] update config (#7174 ) Former-commit-id: `b4b89b4ff3`	2025-03-05 23:32:54 +08:00
sirui.li	8dddffa340	[data] fix qwen2audio plugin (#7166 ) * Update pairwise.py [data]Repair multimodal model dpo training * Update pairwise.py [data]repair multimodal model dpo training using deepcopy * Update pairwise.py * Update mm_plugin.py Former-commit-id: `dff4130969`	2025-03-05 18:03:36 +08:00
hoshi-hiyouga	caef0a8937	[data] use bicubic resampler (#7143 ) Former-commit-id: `bc298c60b7`	2025-03-04 00:17:06 +08:00
hoshi-hiyouga	392533e139	[webui] fix webui (#7142 ) Former-commit-id: `17ba2d5082`	2025-03-04 00:01:49 +08:00
rabbit	299cd03785	[data] bailing template (#7117 ) * add bailing template * add bailing template * add bailing template --------- Co-authored-by: chengshiwen.csw@antgroup.com <chengshiwen.csw@antgroup.com> Former-commit-id: `049ddf48af`	2025-03-03 15:33:22 +08:00
hoshi-hiyouga	ee1b580328	[inference] fix hf_engine (#7120 ) Former-commit-id: `1036311826`	2025-03-01 05:22:49 +08:00
Ze-Yi LIN	210cdb9557	[webui] display swanlab exp link (#7089 ) * webui add swanlab link * change callback name * update --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `891c487503`	2025-02-27 19:40:54 +08:00
hoshi-hiyouga	f4aa0a146c	[misc] fix project toml (#7067 ) Former-commit-id: `96fd510e6a`	2025-02-25 23:22:48 +08:00
Kingsley	81947f1d2c	[model] add paligemma2-mix series (#7060 ) Former-commit-id: `19861d5170`	2025-02-25 18:51:16 +08:00
hoshi-hiyouga	dca5fe14c2	[data] fix mllama (#7053 ) * fix mllama * fix test Former-commit-id: `76314e6ad1`	2025-02-24 22:05:38 +08:00
hoshi-hiyouga	ca78ba964d	[model] add models (#7054 ) * add qwen25vl awq models * add moonlight Former-commit-id: `ec1a1bc118`	2025-02-24 22:05:13 +08:00
Zhangchi Feng	1fcedf9af6	[data] fix MiniCPMV plugin (#6998 ) * fix template * fix bug in messages processing Former-commit-id: `cde479e47a`	2025-02-19 19:36:04 +08:00
hoshi-hiyouga	b0bbacaacb	[webui] update css (#6985 ) Former-commit-id: `302ecb00fe`	2025-02-18 18:27:57 +08:00
hoshi-hiyouga	3fbd4848e8	[version] support transformers 449 (#6982 ) * support transformers 449 * fix mm plugin Former-commit-id: `b00b290c07`	2025-02-18 17:05:40 +08:00
hoshi-hiyouga	184c5d0882	[misc] fix script (#6977 ) Former-commit-id: `cc8c7e762b`	2025-02-18 17:00:46 +08:00
hoshi-hiyouga	1f4a0b11ba	[data] update vlm args (#6976 ) Former-commit-id: `3da2cc2710`	2025-02-18 02:12:51 +08:00
hoshi-hiyouga	b1d31ff0f9	[data] add min resolution option (#6975 ) Former-commit-id: `7faecc0301`	2025-02-18 01:40:46 +08:00
hoshi-hiyouga	a8c9d5663d	[data] fix predict dataset (#6972 ) Former-commit-id: `bdb581c4a8`	2025-02-17 20:29:40 +08:00
Zhangchi Feng	3dc938268c	[data] fix minicpmo template (#6946 ) Former-commit-id: `2faf8aeff8`	2025-02-15 00:37:41 +08:00
Eric Tang	e55ec42d3c	[ray] specify ray storage path (#6920 ) Former-commit-id: `6edd4992d7`	2025-02-14 21:55:41 +08:00
hoshi-hiyouga	2baf8bf03d	[misc] fix lora regex (#6944 ) * fix lora regex * fix Former-commit-id: `1ada3ae5a3`	2025-02-14 21:38:43 +08:00
hoshi-hiyouga	13e1b7ee2b	[misc] fix grad ckpt (#6931 ) Former-commit-id: `c31c63b411`	2025-02-13 23:27:51 +08:00
hoshi-hiyouga	cd493b91de	[model] add liger kernel to qwen2_5 vl (#6930 ) * add liger kernel to qwen2_5 vl * fix patch * fix patch Former-commit-id: `797043d29c`	2025-02-13 23:05:54 +08:00
Billy Cao	48173b606c	[trainer] fix gen_kwarg to eval during training (#5451 ) * Correctly pass gen_kwarg to eval during model runs * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `11eac71c13`	2025-02-13 02:35:06 +08:00
SrWYG	0ad9f7f058	[data] evaluate on each dataset (#5522 ) * [Update] loader.py , evaluate will run separate evaluations on each dataset. `If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation` seq2seqtrainner support eval_dataset as Dict. * fix format * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `1e35967ae1`	2025-02-13 02:19:03 +08:00
Noah	1adb46875f	[data] improve error handling (#6128 ) * sync from upstream * update * update * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `4c7bfebcf1`	2025-02-13 01:39:41 +08:00

777 Commits