LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2026-03-04 10:46:00 +08:00

Author	SHA1	Message	Date
hoshi-hiyouga	5e5fc337f9	[model] add liger kernel to qwen2_5 vl (#6930 ) * add liger kernel to qwen2_5 vl * fix patch * fix patch Former-commit-id: 828776d155986166498dfc907194f64436571106	2025-02-13 23:05:54 +08:00
Billy Cao	58e9ca8aa0	[trainer] fix gen_kwarg to eval during training (#5451 ) * Correctly pass gen_kwarg to eval during model runs * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: 845d16122496311e08263610a6a922f82604de7b	2025-02-13 02:35:06 +08:00
SrWYG	a4c4b8496f	[data] evaluate on each dataset (#5522 ) * [Update] loader.py, evaluate will run separate evaluations on each dataset. `If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation` seq2seqtrainer supports eval_dataset as Dict. * fix format * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: cf00f78650a442c85678ce805e030d2b96cbecd7	2025-02-13 02:19:03 +08:00
Noah	38c9641777	[data] improve error handling (#6128 ) * sync from upstream * update * update * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: 1569e6096fec07da5583f1a3435b0d23ae09b5ba	2025-02-13 01:39:41 +08:00
hoshi-hiyouga	8b8fdb3a85	[misc] update readme (#6918 ) Former-commit-id: f5823479bd51c39db668b68056be749af09894d1	2025-02-13 01:01:41 +08:00
hoshi-hiyouga	290057069e	[misc] update readme (#6917 ) Former-commit-id: 6bbed1d8c4189fb7bea40230e278c40bb5336fbd	2025-02-13 00:58:10 +08:00
hoshi-hiyouga	46203856fc	[breaking change] refactor data pipeline (#6901 ) * refactor data * rename file Former-commit-id: 7a1a4ce6451cb782573d0bd9dd27a5e443e3a18b	2025-02-13 00:39:20 +08:00
Eric Tang	80b89978d9	[misc] support for launching LLaMA-Factory with `uv run` (#6907 ) * yay * uv with ray temporary commit * remove ray specific code for now * cleanup Former-commit-id: 1a9cab6de49e300bf9c747eefbb11d693592b477	2025-02-13 00:38:44 +08:00
Eric Tang	5a221d91f9	[example] fix path to ray example (#6906 ) Former-commit-id: e9bee3ef045d85051da04e6ad581a23a9e1a9551	2025-02-13 00:29:32 +08:00
hoshi-hiyouga	3a3f4072e5	[misc] fix grad ckpt func (#6916 ) Former-commit-id: 35e069a52b3d7cfd9b0107574b09265eb2290f0b	2025-02-13 00:17:18 +08:00
marko1616	0c0cdc26bc	[trainer] fix llama3.2 vision kto train (#6904 ) Former-commit-id: 1563e89adc8988fc6e4250634a3f1e385979b0e5	2025-02-12 19:09:14 +08:00
hoshi-hiyouga	2581cc844b	[data] feat: auto template (#6905 ) * support auto template * add unittest Former-commit-id: 0c6c9150db6414a5a05527ea486dce6633dff4b3	2025-02-12 00:22:53 +08:00
hoshi-hiyouga	d58fcd094e	[misc] update readme (#6903 ) Former-commit-id: 830d028939149d54bc91b6bda110dfa5de949483	2025-02-11 22:51:26 +08:00
hoshi-hiyouga	86063e27ea	[data] fix ollama template (#6902 ) * fix ollama template * add meta info * use half precision Former-commit-id: 1304bbea69d8c8ca57140017515dee7ae2ee6536	2025-02-11 22:43:09 +08:00
hoshi-hiyouga	88eafd865b	[misc] support export ollama modelfile (#6899 ) * support export ollama modelfile * update config * add system and num ctx Former-commit-id: 8c2af7466f4015f300b51841db11bcd2505ebf20	2025-02-11 19:52:25 +08:00
hoshi-hiyouga	3f7bd98bfa	[data] refactor template (#6896 ) Former-commit-id: f78d5a3eca947ed965ca2f6c87d60441b1a59867	2025-02-11 17:59:25 +08:00
codingma	b72c4bd118	support ollama modelfile export (#4686 ) Former-commit-id: 15cca102a7fc0d08b5d049cf264acc6fa576b104	2025-02-11 17:52:24 +08:00
hoshi-hiyouga	808ff89a2d	[data] refactor mm plugin (#6895 ) * refactor plugin * lint Former-commit-id: 1c8dcc3adca4a2e78f514f8bb70573dd1ca08746	2025-02-11 16:34:49 +08:00
HJ	6d7f1299bd	[data] fix qwen_2_5_vl video processing (#6868 ) * fix qwen_2_5_vl video processing * Update mm_plugin.py * Update mm_plugin.py --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: 35f326dabdc8e84036296d2e3de1c84c67b8def8	2025-02-11 16:14:50 +08:00
hoshi-hiyouga	0420a608ca	[assets] update wechat (#6892 ) Former-commit-id: 0b268cc903a583ae78cb7e63d2bdc4602d7220fc	2025-02-11 13:56:26 +08:00
Zhangchi Feng	2047eab723	[data] fix minicpmv plugin (#6890 ) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv * update readme * support dpo of minicpmv * update init audio * update init audio * [model]fix image process in minicpmo * fix no mm inputs Former-commit-id: cdd19ccd8cec460606b4545e886e932c1c5c5fe1	2025-02-11 13:30:44 +08:00
HJ	e11b40c344	[data] fix: sharegpt converter (#6879 ) * fix-sharegpt-format * fix --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: ae8f8151ff750839998b50446f127061f240d41a	2025-02-10 21:59:12 +08:00
hoshi-hiyouga	b869506a57	[data] fix mllama collator (#6874 ) Former-commit-id: c694fa3d66651c6ce547fa72c8260c46a406126b	2025-02-09 22:42:25 +08:00
hoshi-hiyouga	72d5b06b08	[test] align test cases (#6865 ) * align test cases * fix function formatter Former-commit-id: a68f5e22d0391c80a9a826dc83967255be572032	2025-02-09 01:03:49 +08:00
hoshi-hiyouga	94726bdc8d	[dataset] add openthought (#6866 ) Former-commit-id: 20c748a4f108c0087f0d85377a4aa99126a0beb0	2025-02-09 00:53:01 +08:00
hoshi-hiyouga	4d1791e905	[deps] upgrade vllm (#6857 ) Former-commit-id: 4bd50f65a3d62528768561019fda2723d045c7fd	2025-02-08 15:02:28 +08:00
hoshi-hiyouga	528e06ccaa	fix qwen2vl plugin (#6855 ) Former-commit-id: fd13b7138ab3f4da0a429a327b9d076bcb70b944	2025-02-08 10:59:10 +08:00
hoshi-hiyouga	fec641ec82	[misc] allow extra args (#6831 ) Former-commit-id: 0fd3a5295cb4e08a4e57e860e82103364c28fba8	2025-02-06 12:38:08 +08:00
Zhangchi Feng	8f401e37f8	[model] support audio (#6701 ) * support qwen2_audio * improve code * lint * fix * fix * fix --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: 5eacb5629e4d7733cd992a63747a1335f2c6a929	2025-02-05 04:59:09 +08:00
Yueqi Song	9feb78e7b4	[data] allow thought in function call (#6797 ) * Update template.py * Update template.py * use formatter * fix regex --------- Co-authored-by: hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: 3a31af6e920683ec074da93b1719e29f5d4cffd6	2025-02-05 02:26:23 +08:00
hoshi-hiyouga	c2022431aa	[misc] update license year & fix llama pro (#6814 ) * fix llamapro script * change year Former-commit-id: d9ae594178796994d400a5f207d6499712816f89	2025-02-05 01:53:33 +08:00
Yueqi Song	0817c24c04	[data] fix qwen tool template (#6796 ) * Update tool_utils.py * fix unittest --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn> Former-commit-id: 02bb78a792112f5151b3a96ddde2528823855288	2025-02-05 00:02:00 +08:00
Zhangchi Feng	cfb926fb84	[data] fix minicpmv plugin (#6801 ) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv * update readme * support dpo of minicpmv * update init audio * update init audio * [model]fix image process in minicpmo Former-commit-id: 8f704c8b6228ef50f828014f85dce67fda868660	2025-02-04 21:20:15 +08:00
neavo	34746d6151	[readme] update flash attention installation instruction on win platform (#6788 ) * Update README_zh.md * Update README.md Former-commit-id: e48d1327fb39cc95f8fbfc746494f67a79471893	2025-02-01 12:43:29 +08:00
hoshi-hiyouga	5bb447b118	[misc] update workflows (#6787 ) Former-commit-id: 15add6b250149e2aeabdc62d7dca69fc06054e01	2025-02-01 04:54:42 +08:00
hoshi-hiyouga	a28261a866	[model] add mistral small models (#6786 ) Former-commit-id: e5e95c39bc4199fa89c67e34f9adaaa987058744	2025-02-01 04:31:38 +08:00
hoshi-hiyouga	800de98dc8	[model] add qwen2.5 vl models (#6779 ) Former-commit-id: ed46fb4f6194c30060b908092464dded12e5787c	2025-01-31 03:00:29 +08:00
hoshi-hiyouga	222423bcef	[breaking] support transformers 4.48 (#6628 ) Former-commit-id: f154ab175c513a4d7bb866bf2cffc34b77b50508	2025-01-31 01:36:33 +08:00
hoshi-hiyouga	e71737351f	[webui] improve webui & reasoning mode (#6778 ) Former-commit-id: 3f17fc0d7163372e0446f1a38792ff761e99b739	2025-01-31 00:09:21 +08:00
qvlehao	4f298894da	[model] add deepseek-R1 & show think process (#6767 ) Former-commit-id: 4dccb724af51208a001c96fefbdbf226be09e50c	2025-01-29 12:16:26 +08:00
yinpu	a8fae3869d	fix: avoid redundant normalization in DPO's SFT loss calculation (#6722 ) Former-commit-id: 971a8ccbdacf130763d40c7ef82a711b2fc1292f	2025-01-21 13:38:02 +08:00
engchina	db9b977e4f	[webui] support ja (#6698 ) * add support for japanese language * add support for japanese language --------- Co-authored-by: engchina <atjapan2015@gmail.com> Former-commit-id: 88692e403f9b5085dd0c7c2b2c68656c5da50dd4	2025-01-20 19:46:38 +08:00
hoshi-hiyouga	87d685b59f	[model] support yarn (#6693 ) Former-commit-id: 8c412abc44a4c61b683465e36c6288580d980250	2025-01-18 13:56:09 +08:00
hoshi-hiyouga	e4046bdd1f	[assets] update wechat (#6692 ) Former-commit-id: 70dba5fab6f4c9225758cafb646113d8e80ac084	2025-01-18 12:35:03 +08:00
hoshi-hiyouga	5baa3add8c	[misc] update mm plugin (#6691 ) Former-commit-id: 00303338d6927b1fda58b23340a31a8fa009f706	2025-01-17 23:04:26 +08:00
hoshi-hiyouga	332f637592	disable valset by default (#6690 ) Former-commit-id: a1a94f364e33d1d73852f74eda4fa581e6b16533	2025-01-17 21:09:30 +08:00
hoshi-hiyouga	31daa6570b	[webui] upgrade to gradio 5 (#6688 ) Former-commit-id: 9df7721264ddef0008d7648e6ed173adef99bd74	2025-01-17 20:15:42 +08:00
hoshi-hiyouga	33525a34b6	fix qwen2 moe (#6684 ) Former-commit-id: ab624419fa0ab23ef7a331a0ec14e393328772b5	2025-01-17 13:46:09 +08:00
Zhangchi Feng	3607caa2ad	[data] Fix minicpmv/o dpo training (#6657 ) * fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv * update readme * support dpo of minicpmv Former-commit-id: 8d9f47b98047f370637d1c96c2f3440dcc738ef3	2025-01-15 17:30:37 +08:00
steveepreston	0fc2e19279	Update `val_size` english description (#6653 ) * Update `val_size` Description in locales.py * Update `val_size` Description in data_args.py * Remove extra space in data_args.py Former-commit-id: f1ba5158091446dce540dd796284037bdd724c38	2025-01-15 16:00:20 +08:00

2770 Commits