LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2025-09-10 07:02:48 +08:00

Author	SHA1	Message	Date
hoshi-hiyouga	92101f34a1	[data] improve mmplugin (#7795 )	2025-04-22 01:25:33 +08:00
Changrui Chen	81768df04c	[data] Fix wrong position ids with packed attention masks (#7754 ) Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-21 23:19:36 +08:00
hoshi-hiyouga	610f164c69	[trainer] fix pt loss (#7748 ) * fix pt loss * robust * fix * test	2025-04-17 03:15:35 +08:00
hoshi-hiyouga	0a0cfeb782	[breaking] bump transformers to 4.45.0 & improve ci (#7746 ) * update ci * fix * fix * fix * fix * fix	2025-04-17 02:36:48 +08:00
Kingsley	125513fa5c	[model] support intern-VL 2.5-3 series (#7258 ) * add internvl and rebase * fix for internvl2&3 * remove lines * fix video_inputs & lint * nit * add constants * remove lines * fix * fix error * pass ci * pass ci * skip internvl & nit	2025-04-17 00:31:30 +08:00
Kingsley	df8752e8ee	[model] Support Kimi_VL thinking/instruct (#7719 ) * add kimi_vl * patch config * check version * Update mm_plugin.py * Update mm_plugin.py --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-15 00:21:58 +08:00
Eric Tang	6c53471de2	[data] support for specifying a dataset in cloud storage (#7567 ) * add support for loading datasets from s3/gcs * add comments to readme * run linter and address comments * add option to pass in kwargs to ray init (i.e. runtime env) * address comment * revert mixed up changes	2025-04-10 11:31:35 +08:00
hoshi-hiyouga	39876b85fc	[assets] update readme (#7644 )	2025-04-09 01:06:06 +08:00
Kingsley	7d8bee96fc	[data] Fix bugs of `use_audio_in_video` in Qwen2.5 Omni (#7638 ) * cache _mm_inputs * nit * support for use_audio_in_video * remove cache * fix data * Update mllm_video_audio_demo.json	2025-04-08 18:40:10 +08:00
hoshi-hiyouga	5817cda37e	[misc] fix packing and eval plot (#7623 )	2025-04-07 18:20:57 +08:00
hoshi-hiyouga	6c200fd218	[model] add llama4 (#7611 )	2025-04-06 13:42:31 +08:00
Kingsley	32cb086be1	[data] fix qwen2.5 omni plugin (#7578 ) * specific entry * Update mm_plugin.py * fix fps cal --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-02 23:58:39 +08:00
Kingsley	80f8d037d0	[data] fix qwen2.5 omni plugin (#7573 ) * align key with qwen2vl * nit && change scripts	2025-04-02 21:28:52 +08:00
hoshi-hiyouga	903db09822	[infer] vllm video/audio inference (#7566 )	2025-04-02 02:27:04 +08:00
hoshi-hiyouga	aaf2e6ba2a	[model] fix kv cache (#7564 )	2025-04-01 23:07:46 +08:00
Ritesh Goru	f06a74ad4e	[data] specify position_ids in PackedSupervisedDatasetProcessor for neat_packing (#7318 ) * use position_ids for neat_packing with fa2 * revert fa2 changes	2025-04-01 16:03:13 +08:00
Billy Cao	5d1cc863a4	[data] shard the dataset to allow multiprocessing when streaming is enabled (#7530 ) * Shard the dataset when streaming to allow multiprocessing * Allow user to not set dataset_shards to ensure backward compatibility	2025-04-01 15:36:23 +08:00
Hao	6d6e0f44fc	[trainer] new kto mismatch pair creation strategy (#7509 )	2025-04-01 15:21:53 +08:00
hoshi-hiyouga	2d421c57bf	[data] fix qwen2.5 omni collator (#7553 )	2025-04-01 00:15:12 +08:00
Kingsley	185c76f6ad	[model] add Qwen2.5-Omni model (#7537 ) * preserve image_sizes * preserve image_sizes * init plugin * support audio-text2text lora * nit * support image/video-text2text, audio-text2text * remove args * remove lines * add docs && nit * remove some comments * fix && add merge part script * add license	2025-03-31 20:39:35 +08:00
Kingsley	b00cb2ed42	[data] fix pixtral plugin (#7505 ) * preserve `image_sizes` * add comments	2025-03-27 17:06:40 +08:00
hoshi-hiyouga	42e090d38b	[trainer] fix vlm loss for transformers 4.49 (#7448 )	2025-03-24 10:24:05 +08:00
hoshi-hiyouga	b1b78daf06	[deps] upgrade transformers to 4.50.0 (#7437 ) * upgrade transformers * fix hf cache * fix dpo trainer	2025-03-23 17:44:27 +08:00
hoshi-hiyouga	4a5d0f0ba7	[assets] update wechat (#7361 )	2025-03-18 21:31:09 +08:00
hoshi-hiyouga	1d2131e5cb	[data] fix template (#7349 )	2025-03-17 23:45:20 +08:00
Hertz	a71e685021	[model] support hunyuan 7b (#7317 ) * [Model]supported tencent-hunyuan model * [Model]supported tencent-hunyuan model(fix) * [Model]supported tencent-hunyuan model(fix)	2025-03-15 20:55:24 +08:00
hoshi-hiyouga	ef5f1c1def	[data] gemma3 plugin pan and scan (#7294 ) * gemma3 pan and scan * add test case * fix test	2025-03-13 23:29:23 +08:00
Ritesh Goru	d7d79f7e06	[data] efficient 4d_attention_mask creation in neat_packing (#7272 )	2025-03-13 03:31:12 +08:00
hoshi-hiyouga	9ccfb97a2c	[misc] update format (#7277 )	2025-03-13 02:53:08 +08:00
hoshi-hiyouga	165d3ed084	[model] support gemma3 (#7273 )	2025-03-13 01:35:23 +08:00
hoshi-hiyouga	7c1640ed5f	[misc] upgrade format to py39 (#7256 )	2025-03-12 00:08:41 +08:00
hiyouga	37b844d929	remove exit in preprocess Former-commit-id: 1a800f9993d28d80d4587a08c20f5a69722436b5	2025-03-11 15:08:25 +08:00
hoshi-hiyouga	df63f05b47	[data] fix loader (#7207 ) * fix dataloader * add test case * fix type * fix ci * fix ci * fix ci * disable overwrite cache in ci Former-commit-id: 8c3f9f6747110107cbbb3695637482e45084dbc1	2025-03-07 17:20:46 +08:00
ZhangChuanhui	33b4c33279	[data] fix function formatter (#7201 ) Co-authored-by: zhangchuanhui <zhangchal@digitalchina.com> Former-commit-id: 194e3bddb25fa0bcc6d8349ce682b537a07a9a6a	2025-03-07 15:17:23 +08:00
hoshi-hiyouga	2b21c749c1	[data] fix mm template (#7181 ) Former-commit-id: be66df1f0211cd2d90eac3ab407dced653c9e443	2025-03-06 15:18:32 +08:00
hoshi-hiyouga	6e58115f98	[trainer] update config (#7174 ) Former-commit-id: b4b89b4ff3bc03aa388569e253d62580755a77a5	2025-03-05 23:32:54 +08:00
sirui.li	8dddffa340	[data] fix qwen2audio plugin (#7166 ) * Update pairwise.py [data]Repair multimodal model dpo training * Update pairwise.py [data]repair multimodal model dpo training using deepcopy * Update pairwise.py * Update mm_plugin.py Former-commit-id: dff4130969bac9cb1abe66fd5dfada8c757c716f	2025-03-05 18:03:36 +08:00
hoshi-hiyouga	caef0a8937	[data] use bicubic resampler (#7143 ) Former-commit-id: bc298c60b7d3fdc4d116a79b535d7e9b11f4aa65	2025-03-04 00:17:06 +08:00
rabbit	299cd03785	[data] bailing template (#7117 ) * add bailing template * add bailing template * add bailing template --------- Co-authored-by: chengshiwen.csw@antgroup.com <chengshiwen.csw@antgroup.com> Former-commit-id: 049ddf48afaa9f12d3e46d7ec63858607329e853	2025-03-03 15:33:22 +08:00
hoshi-hiyouga	dca5fe14c2	[data] fix mllama (#7053 ) * fix mllama * fix test Former-commit-id: 76314e6ad1ecaa44fcae4375dd0abf4ebaf1f924	2025-02-24 22:05:38 +08:00
hoshi-hiyouga	ca78ba964d	[model] add models (#7054 ) * add qwen25vl awq models * add moonlight Former-commit-id: ec1a1bc1184d13188029e19c1d4e7de68707aaf6	2025-02-24 22:05:13 +08:00
Zhangchi Feng	1fcedf9af6	[data] fix MiniCPMV plugin (#6998 ) * fix template * fix bug in messages processing Former-commit-id: cde479e47a51beb60ab555cdee083c1cdba0ead6	2025-02-19 19:36:04 +08:00
hoshi-hiyouga	3fbd4848e8	[version] support transformers 449 (#6982 ) * support transformers 449 * fix mm plugin Former-commit-id: b00b290c07beb560a5af857ce64f4ce424831a2c	2025-02-18 17:05:40 +08:00
hoshi-hiyouga	184c5d0882	[misc] fix script (#6977 ) Former-commit-id: cc8c7e762b9c873ef79529152465bbed9231053c	2025-02-18 17:00:46 +08:00
hoshi-hiyouga	1f4a0b11ba	[data] update vlm args (#6976 ) Former-commit-id: 3da2cc2710c9b13ab450815a92fff14b03251984	2025-02-18 02:12:51 +08:00
hoshi-hiyouga	b1d31ff0f9	[data] add min resolution option (#6975 ) Former-commit-id: 7faecc0301709326efa21e7a3fdb75fe0a9635c2	2025-02-18 01:40:46 +08:00
hoshi-hiyouga	a8c9d5663d	[data] fix predict dataset (#6972 ) Former-commit-id: bdb581c4a82d02458766e73c87b7a92ea31796ec	2025-02-17 20:29:40 +08:00
Zhangchi Feng	3dc938268c	[data] fix minicpmo template (#6946 ) Former-commit-id: 2faf8aeff897765df44707d5a42157dfdd6b9038	2025-02-15 00:37:41 +08:00
hoshi-hiyouga	2baf8bf03d	[misc] fix lora regex (#6944 ) * fix lora regex * fix Former-commit-id: 1ada3ae5a3a14057341540c6d6ba985adf95f348	2025-02-14 21:38:43 +08:00
SrWYG	0ad9f7f058	[data] evaluate on each dataset (#5522 ) * [Update] loader.py , evaluate will run separate evaluations on each dataset. `If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation` seq2seqtrainner support eval_dataset as Dict. * fix format * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: 1e35967ae159038a66f3203dd0e6ec51eea9208f	2025-02-13 02:19:03 +08:00

350 Commits