LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2026-06-22 15:19:00 +08:00

Author	SHA1	Message	Date
Billy Cao	48173b606c	[trainer] fix gen_kwarg to eval during training (#5451 ) * Correctly pass gen_kwarg to eval during model runs * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `11eac71c13`	2025-02-13 02:35:06 +08:00
SrWYG	0ad9f7f058	[data] evaluate on each dataset (#5522 ) * [Update] loader.py, evaluate will run separate evaluations on each dataset. `If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation` seq2seqtrainer supports eval_dataset as Dict. * fix format * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `1e35967ae1`	2025-02-13 02:19:03 +08:00
Noah	1adb46875f	[data] improve error handling (#6128 ) * sync from upstream * update * update * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `4c7bfebcf1`	2025-02-13 01:39:41 +08:00
hoshi-hiyouga	1679930e00	[breaking change] refactor data pipeline (#6901 ) * refactor data * rename file Former-commit-id: `617c8ab467`	2025-02-13 00:39:20 +08:00
hoshi-hiyouga	036fb0d561	[misc] fix grad ckpt func (#6916 ) Former-commit-id: `e34c3c06da`	2025-02-13 00:17:18 +08:00
marko1616	bae934dea3	[trainer] fix llama3.2 vision kto train (#6904 ) Former-commit-id: `b7fd1e9c00`	2025-02-12 19:09:14 +08:00
hoshi-hiyouga	2e2f6bea07	[data] feat: auto template (#6905 ) * support auto template * add unittest Former-commit-id: `2f8b6847f5`	2025-02-12 00:22:53 +08:00
hoshi-hiyouga	197aa3baf4	[data] fix ollama template (#6902 ) * fix ollama template * add meta info * use half precision Former-commit-id: `e1a7c1242c`	2025-02-11 22:43:09 +08:00
hoshi-hiyouga	c6be9e242c	[misc] support export ollama modelfile (#6899 ) * support export ollama modelfile * update config * add system and num ctx Former-commit-id: `9184a6e0ed`	2025-02-11 19:52:25 +08:00
hoshi-hiyouga	2e954d8fd2	[data] refactor template (#6896 ) Former-commit-id: `d1b8aa3835`	2025-02-11 17:59:25 +08:00
hoshi-hiyouga	593acca556	[data] refactor mm plugin (#6895 ) * refactor plugin * lint Former-commit-id: `aca63bfcca`	2025-02-11 16:34:49 +08:00
HJ	188f22d8a7	[data] fix qwen_2_5_vl video processing (#6868 ) * fix qwen_2_5_vl video processing * Update mm_plugin.py * Update mm_plugin.py --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `9153a7bd83`	2025-02-11 16:14:50 +08:00
Zhangchi Feng	5433b318bb	[data] fix minicpmv plugin (#6890 ) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv * update readme * support dpo of minicpmv * update init audio * update init audio * [model]fix image process in minicpmo * fix no mm inputs Former-commit-id: `764627645a`	2025-02-11 13:30:44 +08:00
HJ	fe4f4e9758	[data] fix: sharegpt converter (#6879 ) * fix-sharegpt-format * fix --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `0fb44cb3a5`	2025-02-10 21:59:12 +08:00
hoshi-hiyouga	1bb3d17d9e	[data] fix mllama collator (#6874 ) Former-commit-id: `b68199db27`	2025-02-09 22:42:25 +08:00
hoshi-hiyouga	b93333685b	[test] align test cases (#6865 ) * align test cases * fix function formatter Former-commit-id: `f6f3f8d0fc`	2025-02-09 01:03:49 +08:00
hoshi-hiyouga	fcd0f0480d	[dataset] add openthought (#6866 ) Former-commit-id: `1356f9d840`	2025-02-09 00:53:01 +08:00
hoshi-hiyouga	ff6658ad27	[deps] upgrade vllm (#6857 ) Former-commit-id: `5f38bcaba9`	2025-02-08 15:02:28 +08:00
hoshi-hiyouga	28037c7834	fix qwen2vl plugin (#6855 ) Former-commit-id: `40048ab77a`	2025-02-08 10:59:10 +08:00
hoshi-hiyouga	f70208e1c0	[misc] allow extra args (#6831 ) Former-commit-id: `74ade3a176`	2025-02-06 12:38:08 +08:00
Zhangchi Feng	01915eaf40	[model] support audio (#6701 ) * support qwen2_audio * improve code * lint * fix * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `24c7842948`	2025-02-05 04:59:09 +08:00
Yueqi Song	e665e1fed5	[data] allow thought in function call (#6797 ) * Update template.py * Update template.py * use formatter * fix regex --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `a5e943f7bc`	2025-02-05 02:26:23 +08:00
hoshi-hiyouga	1fee69f874	[misc] update license year & fix llama pro (#6814 ) * fix llamapro script * change year Former-commit-id: `e2dc5b952a`	2025-02-05 01:53:33 +08:00
Yueqi Song	8504bde893	[data] fix qwen tool template (#6796 ) * Update tool_utils.py * fix unittest --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `dd6b7d203e`	2025-02-05 00:02:00 +08:00
Zhangchi Feng	85f22d01bf	[data] fix minicpmv plugin (#6801 ) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv * update readme * support dpo of minicpmv * update init audio * update init audio * [model]fix image process in minicpmo Former-commit-id: `ab9bd068ef`	2025-02-04 21:20:15 +08:00
hoshi-hiyouga	445d643ef3	[model] add mistral small models (#6786 ) Former-commit-id: `94803d8133`	2025-02-01 04:31:38 +08:00
hoshi-hiyouga	e8c1979b79	[model] add qwen2.5 vl models (#6779 ) Former-commit-id: `999c7c8fe0`	2025-01-31 03:00:29 +08:00
hoshi-hiyouga	f6779b0e0c	[breaking] support transformers 4.48 (#6628 ) Former-commit-id: `15357cdad9`	2025-01-31 01:36:33 +08:00
hoshi-hiyouga	245de012ca	[webui] improve webui & reasoning mode (#6778 ) Former-commit-id: `45e68b9f09`	2025-01-31 00:09:21 +08:00
qvlehao	f5350b103b	[model] add deepseek-R1 & show think process (#6767 ) Former-commit-id: `28417f862a`	2025-01-29 12:16:26 +08:00
yinpu	aa7c07caf0	fix: avoid redundant normalization in DPO's SFT loss calculation (#6722 ) Former-commit-id: `0f45982bac`	2025-01-21 13:38:02 +08:00
engchina	324f07613a	[webui] support ja (#6698 ) * add support for japanese language * add support for japanese language --------- Co-authored-by: engchina <atjapan2015@gmail.com> Former-commit-id: `de9bc3fefa`	2025-01-20 19:46:38 +08:00
hoshi-hiyouga	1efe525df7	[model] support yarn (#6693 ) Former-commit-id: `1f47b6186c`	2025-01-18 13:56:09 +08:00
hoshi-hiyouga	f87c788154	[misc] update mm plugin (#6691 ) Former-commit-id: `c0caa7afc6`	2025-01-17 23:04:26 +08:00
hoshi-hiyouga	bbf334f823	disable valset by default (#6690 ) Former-commit-id: `77bbf65905`	2025-01-17 21:09:30 +08:00
hoshi-hiyouga	770433fa33	[webui] upgrade to gradio 5 (#6688 ) Former-commit-id: `4d0f662dbe`	2025-01-17 20:15:42 +08:00
hoshi-hiyouga	788accb601	fix qwen2 moe (#6684 ) Former-commit-id: `7bf09abf1c`	2025-01-17 13:46:09 +08:00
Zhangchi Feng	555f17c1ee	[data] Fix minicpmv/o dpo training (#6657 ) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv * update readme * support dpo of minicpmv Former-commit-id: `027942789b`	2025-01-15 17:30:37 +08:00
steveepreston	8895cf1152	Update `val_size` english description (#6653 ) * Update `val_size` Description in locales.py * Update `val_size` Description in data_args.py * Remove extra space in data_args.py Former-commit-id: `76675b654e`	2025-01-15 16:00:20 +08:00
hoshi-hiyouga	9ef85f8fc4	[optim] clean apollo (#6645 ) * clean apollo code * update readme Former-commit-id: `7a04021d04`	2025-01-15 01:42:50 +08:00
zhuHQ	763f9b9df0	[optim] add support to APOLLO (#6617 ) Former-commit-id: `d9189f9f0b`	2025-01-15 00:24:56 +08:00
hoshi-hiyouga	91433d639c	lint (#6641 ) Former-commit-id: `1278c3e92e`	2025-01-14 18:40:07 +08:00
Haian Huang(深度眸)	864ee06243	Support InternLM3 Dense 8B Model (#6640 ) * support internlm3 * update * update * update * add hint Former-commit-id: `deacc00b12`	2025-01-14 18:07:27 +08:00
Xiaosu Zhu	a52496cc09	Fix tokenizer max length (#6632 ) Former-commit-id: `58d029f321`	2025-01-14 17:35:54 +08:00
Zhangchi Feng	ad119afc58	Support Inference of MiniCPM-V-2.6 and MiniCPM-o-2.6 (#6631 ) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv Former-commit-id: `158a127d34`	2025-01-14 17:34:58 +08:00
hoshi-hiyouga	8f73c75c16	[model] fix mllama any image (#6637 ) * fix mllama any image * reorder classes Former-commit-id: `98189c8e4d`	2025-01-14 16:47:58 +08:00
hoshi-hiyouga	5e699458e5	pin vllm version to 0.6.5 (#6629 ) Former-commit-id: `1c7663d304`	2025-01-14 02:44:02 +08:00
Zhangchi Feng	201a495154	Support new features of MiniCPM-V (#6626 ) * fix template name * tiny fix * support minicpm-o-2.6 Former-commit-id: `c3fda5046d`	2025-01-14 00:26:19 +08:00
hoshi-hiyouga	d8cba9464f	[inference] fix stop token for object detection (#6624 ) * fix stop token * update minicpm data pipeline * fix npu qlora examples Former-commit-id: `e3e2c8c689`	2025-01-13 21:34:20 +08:00
codingma	089c7d5e51	add nf4 qlora support on Ascend NPU (#6601 ) * add nf4 qlora support on Ascend NPU * add transformers version check * add python>=3.10 requirement description for npu * tiny fix --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: `03de5ac912`	2025-01-13 19:43:36 +08:00

830 Commits