Commit Graph

2691 Commits

Author SHA1 Message Date
hoshi-hiyouga
ef7af457fc [infer] fix vllm args (#7235) 2025-03-11 01:15:35 +08:00
Ze-Yi LIN
a1e76af3d9 [tracking] add swanlab_logdir param (#7219)
* feat: add swanlab_logdir param

* fix
2025-03-11 00:53:07 +08:00
hoshi-hiyouga
ed8b12e3cb [config] update args (#7231) 2025-03-10 23:04:43 +08:00
hoshi-hiyouga
728c2f6819 [config] fix export max len (#7230) 2025-03-10 16:46:08 +08:00
hoshi-hiyouga
ae4cbe8fbc [assets] update wechat (#7229) 2025-03-10 15:39:06 +08:00
hoshi-hiyouga
1774882f5a [data] update mm demo data (#7211) 2025-03-07 20:07:15 +08:00
hoshi-hiyouga
cdf8fc6478 [assets] update readme (#7209) 2025-03-07 17:27:49 +08:00
hoshi-hiyouga
8c3f9f6747 [data] fix loader (#7207)
* fix dataloader

* add test case

* fix type

* fix ci

* fix ci

* fix ci

* disable overwrite cache in ci
2025-03-07 17:20:46 +08:00
hoshi-hiyouga
db113f690e [misc] fix ds config (#7205) 2025-03-07 15:21:28 +08:00
ZhangChuanhui
194e3bddb2 [data] fix function formatter (#7201)
Co-authored-by: zhangchuanhui <zhangchal@digitalchina.com>
2025-03-07 15:17:23 +08:00
hoshi-hiyouga
bd17223559 [misc] fix cli (#7204) 2025-03-07 15:01:18 +08:00
hoshi-hiyouga
313355759d [script] fix vllm version (#7193) 2025-03-06 17:14:17 +08:00
hoshi-hiyouga
abb23f7673 [webui] support escape html (#7190) 2025-03-06 16:52:21 +08:00
hoshi-hiyouga
d739fddb10 [deps] upgrade vllm (#7183) 2025-03-06 15:25:08 +08:00
hoshi-hiyouga
be66df1f02 [data] fix mm template (#7181) 2025-03-06 15:18:32 +08:00
hoshi-hiyouga
64a6fb9b50 [model] add QwQ 32b (#7179) 2025-03-06 11:58:36 +08:00
Ze-Yi LIN
8ad03258e1 [trainer] fix swanlab callback (#7176) 2025-03-06 00:33:37 +08:00
hoshi-hiyouga
b4b89b4ff3 [trainer] update config (#7174) 2025-03-05 23:32:54 +08:00
sirui.li
dff4130969 [data] fix qwen2audio plugin (#7166)
* Update pairwise.py

[data] Repair multimodal model DPO training

* Update pairwise.py

[data] Repair multimodal model DPO training using deepcopy

* Update pairwise.py

* Update mm_plugin.py
2025-03-05 18:03:36 +08:00
hoshi-hiyouga
0c403ea15b [assets] update wechat (#7161) 2025-03-05 14:11:10 +08:00
hoshi-hiyouga
bc298c60b7 [data] use bicubic resampler (#7143) 2025-03-04 00:17:06 +08:00
hoshi-hiyouga
17ba2d5082 [webui] fix webui (#7142) 2025-03-04 00:01:49 +08:00
rabbit
049ddf48af [data] bailing template (#7117)
* add bailing template

* add bailing template

* add bailing template

---------

Co-authored-by: chengshiwen.csw@antgroup.com <chengshiwen.csw@antgroup.com>
2025-03-03 15:33:22 +08:00
hoshi-hiyouga
1036311826 [inference] fix hf_engine (#7120) 2025-03-01 05:22:49 +08:00
hoshi-hiyouga
d1863bbbaa [assets] update wechat (#7106) 2025-02-28 12:01:04 +08:00
Ze-Yi LIN
891c487503 [webui] display swanlab exp link (#7089)
* webui add swanlab link

* change callback name

* update

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
2025-02-27 19:40:54 +08:00
leo-pony
acc52e0fe7 [npu] update cann base image and torch 2.4 (#7061)
* Update base NPU container image version: the Python version required by Hugging Face Transformers is >= 3.10

* Fix bug: the INSTALL_DEEPSPEED argument type should now be a string

* Update Ascend CANN, CANN kernels, and the corresponding torch and torch-npu versions

* Upgrading torch-npu requires the package versions torch==2.1.0 and torch-npu==2.4.0.post2
2025-02-25 23:32:01 +08:00
hoshi-hiyouga
96fd510e6a [misc] fix project toml (#7067) 2025-02-25 23:22:48 +08:00
JieShen
e8266fe563 [script] add seed args (#7058)
* add seed args

* add seed args

* update seed
2025-02-25 19:44:57 +08:00
Kingsley
19861d5170 [model] add paligemma2-mix series (#7060) 2025-02-25 18:51:16 +08:00
hoshi-hiyouga
76314e6ad1 [data] fix mllama (#7053)
* fix mllama

* fix test
2025-02-24 22:05:38 +08:00
hoshi-hiyouga
ec1a1bc118 [model] add models (#7054)
* add qwen25vl awq models

* add moonlight
2025-02-24 22:05:13 +08:00
hoshi-hiyouga
fe6dd92c84 [assets] update readme (#7051) 2025-02-24 20:45:06 +08:00
hoshi-hiyouga
1481af5dc9 [assets] update wechat (#7019) 2025-02-20 20:32:33 +08:00
Zhangchi Feng
cde479e47a [data] fix MiniCPMV plugin (#6998)
* fix template

* fix bug in messages processing
2025-02-19 19:36:04 +08:00
hoshi-hiyouga
302ecb00fe [webui] update css (#6985) 2025-02-18 18:27:57 +08:00
hoshi-hiyouga
2591a3fa8b [data] add r1 distill dataset (#6983) 2025-02-18 17:25:09 +08:00
hoshi-hiyouga
b00b290c07 [version] support transformers 4.49 (#6982)
* support transformers 4.49

* fix mm plugin
2025-02-18 17:05:40 +08:00
hoshi-hiyouga
cc8c7e762b [misc] fix script (#6977) 2025-02-18 17:00:46 +08:00
hoshi-hiyouga
3da2cc2710 [data] update vlm args (#6976) 2025-02-18 02:12:51 +08:00
hoshi-hiyouga
7faecc0301 [data] add min resolution option (#6975) 2025-02-18 01:40:46 +08:00
hoshi-hiyouga
bdb581c4a8 [data] fix predict dataset (#6972) 2025-02-17 20:29:40 +08:00
hoshi-hiyouga
ad0c6c8916 [assets] update wechat (#6963) 2025-02-17 15:23:17 +08:00
Zhangchi Feng
2faf8aeff8 [data] fix minicpmo template (#6946) 2025-02-15 00:37:41 +08:00
Eric Tang
6edd4992d7 [ray] specify ray storage path (#6920) 2025-02-14 21:55:41 +08:00
hoshi-hiyouga
1ada3ae5a3 [misc] fix lora regex (#6944)
* fix lora regex

* fix
2025-02-14 21:38:43 +08:00
hoshi-hiyouga
c31c63b411 [misc] fix grad ckpt (#6931) 2025-02-13 23:27:51 +08:00
hoshi-hiyouga
797043d29c [model] add liger kernel to qwen2_5 vl (#6930)
* add liger kernel to qwen2_5 vl

* fix patch

* fix patch
2025-02-13 23:05:54 +08:00
Billy Cao
11eac71c13 [trainer] fix passing gen_kwargs to eval during training (#5451)
* Correctly pass gen_kwargs to eval during model runs

* fix

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
2025-02-13 02:35:06 +08:00
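
A minimal, hypothetical sketch of what forwarding generation kwargs to evaluation looks like with Hugging Face's Seq2SeqTrainingArguments (illustrative only, not the repository's actual implementation; all values below are assumptions):

    from transformers import Seq2SeqTrainingArguments

    # With predict_with_generate=True, the trainer calls model.generate() during
    # evaluation and forwards the generation_* settings below as its gen_kwargs.
    args = Seq2SeqTrainingArguments(
        output_dir="outputs",        # hypothetical output directory
        predict_with_generate=True,
        generation_max_length=512,   # assumed value
        generation_num_beams=1,      # assumed value
        eval_strategy="steps",
        eval_steps=100,
    )
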
SrWYG
1e35967ae1 [data] evaluate on each dataset (#5522)
* [Update] loader.py: evaluate will run separate evaluations on each dataset.

`If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation.`

Seq2SeqTrainer supports eval_dataset as a Dict (see the sketch at the end of this log).

* fix format

* fix

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
2025-02-13 02:19:03 +08:00
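
A minimal sketch of the per-dataset evaluation described in the commit above, following the Hugging Face Trainer convention of passing eval_dataset as a dict (model, tokenizer, and the dataset variables are hypothetical placeholders):

    from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments

    # Passing eval_dataset as a dict makes the trainer run a separate evaluation
    # per key and prefix the metrics with it, e.g. eval_alpaca_loss.
    trainer = Seq2SeqTrainer(
        model=model,                  # placeholder model
        args=Seq2SeqTrainingArguments(output_dir="outputs", eval_strategy="steps"),
        train_dataset=train_data,     # placeholder training split
        eval_dataset={"alpaca": eval_alpaca, "sharegpt": eval_sharegpt},
        processing_class=tokenizer,   # placeholder tokenizer
    )
    trainer.train()
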