LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2026-03-09 21:25:59 +08:00

Author	SHA1	Message	Date
hoshi-hiyouga	c31c63b411	[misc] fix grad ckpt (#6931 )	2025-02-13 23:27:51 +08:00
hoshi-hiyouga	797043d29c	[model] add liger kernel to qwen2_5 vl (#6930 ) * add liger kernel to qwen2_5 vl * fix patch * fix patch	2025-02-13 23:05:54 +08:00
Billy Cao	11eac71c13	[trainer] fix gen_kwarg to eval during training (#5451 ) * Correctly pass gen_kwarg to eval during model runs * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>	2025-02-13 02:35:06 +08:00
SrWYG	1e35967ae1	[data] evaluate on each dataset (#5522 ) * [Update] loader.py , evaluate will run separate evaluations on each dataset. `If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation` seq2seqtrainner support eval_dataset as Dict. * fix format * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>	2025-02-13 02:19:03 +08:00
Noah	4c7bfebcf1	[data] improve error handling (#6128 ) * sync from upstream * update * update * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>	2025-02-13 01:39:41 +08:00
hoshi-hiyouga	617c8ab467	[breaking change] refactor data pipeline (#6901 ) * refactor data * rename file	2025-02-13 00:39:20 +08:00
hoshi-hiyouga	e34c3c06da	[misc] fix grad ckpt func (#6916 )	2025-02-13 00:17:18 +08:00
marko1616	b7fd1e9c00	[trainer] fix llama3.2 vision kto train (#6904 )	2025-02-12 19:09:14 +08:00
hoshi-hiyouga	2f8b6847f5	[data] feat: auto template (#6905 ) * support auto template * add unittest	2025-02-12 00:22:53 +08:00
hoshi-hiyouga	e1a7c1242c	[data] fix ollama template (#6902 ) * fix ollama template * add meta info * use half precision	2025-02-11 22:43:09 +08:00
hoshi-hiyouga	9184a6e0ed	[misc] support export ollama modelfile (#6899 ) * support export ollama modelfile * update config * add system and num ctx	2025-02-11 19:52:25 +08:00
hoshi-hiyouga	d1b8aa3835	[data] refactor template (#6896 )	2025-02-11 17:59:25 +08:00
hoshi-hiyouga	aca63bfcca	[data] refactor mm plugin (#6895 ) * refactor plugin * lint	2025-02-11 16:34:49 +08:00
HJ	9153a7bd83	[data] fix qwen_2_5_vl video processing (#6868 ) * fix qwen_2_5_vl video processing * Update mm_plugin.py * Update mm_plugin.py --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-02-11 16:14:50 +08:00
Zhangchi Feng	764627645a	[data] fix minicpmv plugin (#6890 ) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv * update readme * support dpo of minicpmv * update init audio * update init audio * [model]fix image process in minicpmo * fix no mm inputs	2025-02-11 13:30:44 +08:00
HJ	0fb44cb3a5	[data] fix: sharegpt converter (#6879 ) * fix-sharegpt-format * fix --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-02-10 21:59:12 +08:00
hoshi-hiyouga	b68199db27	[data] fix mllama collator (#6874 )	2025-02-09 22:42:25 +08:00
hoshi-hiyouga	f6f3f8d0fc	[test] align test cases (#6865 ) * align test cases * fix function formatter	2025-02-09 01:03:49 +08:00
hoshi-hiyouga	1356f9d840	[dataset] add openthought (#6866 )	2025-02-09 00:53:01 +08:00
hoshi-hiyouga	5f38bcaba9	[deps] upgrade vllm (#6857 )	2025-02-08 15:02:28 +08:00
hoshi-hiyouga	40048ab77a	fix qwen2vl plugin (#6855 )	2025-02-08 10:59:10 +08:00
hoshi-hiyouga	74ade3a176	[misc] allow extra args (#6831 )	2025-02-06 12:38:08 +08:00
Zhangchi Feng	24c7842948	[model] support audio (#6701 ) * support qwen2_audio * improve code * lint * fix * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>	2025-02-05 04:59:09 +08:00
Yueqi Song	a5e943f7bc	[data] allow thought in function call (#6797 ) * Update template.py * Update template.py * use formatter * fix regex --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>	2025-02-05 02:26:23 +08:00
hoshi-hiyouga	e2dc5b952a	[misc] update license year & fix llama pro (#6814 ) * fix llamapro script * change year	2025-02-05 01:53:33 +08:00
Yueqi Song	dd6b7d203e	[data] fix qwen tool template (#6796 ) * Update tool_utils.py * fix unittest --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-02-05 00:02:00 +08:00
Zhangchi Feng	ab9bd068ef	[data] fix minicpmv plugin (#6801 ) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv * update readme * support dpo of minicpmv * update init audio * update init audio * [model]fix image process in minicpmo	2025-02-04 21:20:15 +08:00
hoshi-hiyouga	94803d8133	[model] add mistral small models (#6786 )	2025-02-01 04:31:38 +08:00
hoshi-hiyouga	999c7c8fe0	[model] add qwen2.5 vl models (#6779 )	2025-01-31 03:00:29 +08:00
hoshi-hiyouga	15357cdad9	[breaking] support transformers 4.48 (#6628 )	2025-01-31 01:36:33 +08:00
hoshi-hiyouga	45e68b9f09	[webui] improve webui & reasoning mode (#6778 )	2025-01-31 00:09:21 +08:00
qvlehao	28417f862a	[model] add deepseek-R1 & show think process (#6767 )	2025-01-29 12:16:26 +08:00
yinpu	0f45982bac	fix: avoid redundant normalization in DPO's SFT loss calculation (#6722 )	2025-01-21 13:38:02 +08:00
engchina	de9bc3fefa	[webui] support ja (#6698 ) * add support for japanese language * add support for japanese language --------- Co-authored-by: engchina <atjapan2015@gmail.com>	2025-01-20 19:46:38 +08:00
hoshi-hiyouga	1f47b6186c	[model] support yarn (#6693 )	2025-01-18 13:56:09 +08:00
hoshi-hiyouga	c0caa7afc6	[misc] update mm plugin (#6691 )	2025-01-17 23:04:26 +08:00
hoshi-hiyouga	77bbf65905	disable valset by default (#6690 )	2025-01-17 21:09:30 +08:00
hoshi-hiyouga	4d0f662dbe	[webui] upgrade to gradio 5 (#6688 )	2025-01-17 20:15:42 +08:00
hoshi-hiyouga	7bf09abf1c	fix qwen2 moe (#6684 )	2025-01-17 13:46:09 +08:00
Zhangchi Feng	027942789b	[data] Fix minicpmv/o dpo training (#6657 ) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv * update readme * support dpo of minicpmv	2025-01-15 17:30:37 +08:00
steveepreston	76675b654e	Update `val_size` english description (#6653 ) * Update `val_size` Description in locales.py * Update `val_size` Description in data_args.py * Remove extra space in data_args.py	2025-01-15 16:00:20 +08:00
hoshi-hiyouga	7a04021d04	[optim] clean apollo (#6645 ) * clean apollo code * update readme	2025-01-15 01:42:50 +08:00
zhuHQ	d9189f9f0b	[optim] add support to APOLLO (#6617 )	2025-01-15 00:24:56 +08:00
hoshi-hiyouga	1278c3e92e	lint (#6641 )	2025-01-14 18:40:07 +08:00
Haian Huang(深度眸)	deacc00b12	Support InternLM3 Dense 8B Model (#6640 ) * support internlm3 * update * update * update * add hint	2025-01-14 18:07:27 +08:00
Xiaosu Zhu	58d029f321	Fix tokenizer max length (#6632 )	2025-01-14 17:35:54 +08:00
Zhangchi Feng	158a127d34	Support Inference of MiniCPM-V-2.6 and MiniCPM-o-2.6 (#6631 ) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv	2025-01-14 17:34:58 +08:00
hoshi-hiyouga	98189c8e4d	[model] fix mllama any image (#6637 ) * fix mllama any image * reorder classes	2025-01-14 16:47:58 +08:00
hoshi-hiyouga	1c7663d304	pin vllm version to 0.6.5 (#6629 )	2025-01-14 02:44:02 +08:00
Zhangchi Feng	c3fda5046d	Support new features of MiniCPM-V (#6626 ) * fix template name * tiny fix * support minicpm-o-2.6	2025-01-14 00:26:19 +08:00

732 Commits