68 Commits

Author SHA1 Message Date
hiyouga
357a32d7a0 fix #3083
Former-commit-id: ff9a3f73961a362d0ddc22079f80a85465fffda8
2024-04-01 22:53:52 +08:00
hiyouga
8365522ce2 fix #3077
Former-commit-id: d0340391e8075cff0d84b3ef879c2101b66ca1dc
2024-04-01 21:35:18 +08:00
hiyouga
4e3ee3b703 support infer 4bit model on GPUs #3023
Former-commit-id: 950a9dab9055839990656b2b40956792b253573d
2024-04-01 17:34:04 +08:00
hiyouga
0c96919aa5 tiny fix
Former-commit-id: ba4a9b3c01e2f7467fbc5be268f47c0d003caa65
2024-03-31 00:10:29 +08:00
marko1616
e9060f37e4 fix blank line contains whitespace
Former-commit-id: 7bc3bcc64353d5a1d4870c6a9509b64cff710492
2024-03-30 23:46:55 +08:00
marko1616
fb6e653443 Fix Llama model save for full param train
Former-commit-id: ca17b5db4f97c3ec9fe2004877f150e8f51ab4b5
2024-03-30 23:45:04 +08:00
hiyouga
52eb06e2ee fix #2961
Former-commit-id: 616917bb3be7f71073b56ad8c7bc4e164b08b9b5
2024-03-26 17:26:14 +08:00
hiyouga
06019b7ee3 fix #2941
Former-commit-id: 3775ab52017f0b610ddd8199cccfb8c001eda507
2024-03-24 00:28:44 +08:00
hiyouga
b590e82d41 support fsdp + qlora
Former-commit-id: b894bf8e84be689db258021f0638e9ac939abcbc
2024-03-21 00:36:06 +08:00
hiyouga
6302cd94c8 fix #2346
Former-commit-id: c8888c499b0ac51e2fc86c16e8e91c79400a5993
2024-03-20 17:56:33 +08:00
hiyouga
2e81c03f41 fix patcher
Former-commit-id: 6a5ad99c8cbf6b7def0a130306d49e7d1eb4e5a5
2024-03-15 19:18:42 +08:00
S3Studio
bada9f71a7 Use official Nvidia base image
Note that the flash-attn library is installed in this image and the qwen model will use it automatically.
However, if the host machine's GPU is not compatible with the library, an exception will be raised during the training process as follows:
FlashAttention only supports Ampere GPUs or newer.
So if the --flash_attn flag is not set, an additional patch for the qwen model's config is necessary to set the default value of use_flash_attn from "auto" to False.
Former-commit-id: cd2f5717d676e1a5afd2f4e7a38402d2e55e7479
2024-03-15 08:59:13 +08:00
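The workaround described in the commit note above can be sketched roughly as follows. The `use_flash_attn` attribute name and its `"auto"` default are taken from the commit message; `patch_qwen_config` is a hypothetical helper for illustration, not the project's actual patcher:

```python
from types import SimpleNamespace

# Hypothetical sketch of the patch described above: Qwen (v1) configs
# default use_flash_attn to "auto", and on pre-Ampere GPUs flash-attn
# raises "FlashAttention only supports Ampere GPUs or newer" at train
# time, so the patch pins the flag to False instead.
def patch_qwen_config(config):
    """Disable flash attention if the config is still on its "auto" default."""
    if getattr(config, "use_flash_attn", None) == "auto":
        config.use_flash_attn = False
    return config

# Stand-in for a real Qwen config object:
cfg = SimpleNamespace(use_flash_attn="auto")
patch_qwen_config(cfg)
print(cfg.use_flash_attn)  # → False
```

A config that was explicitly set (e.g. `use_flash_attn=True` via `--flash_attn`) passes through unchanged; only the `"auto"` default is overridden.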
hiyouga
0d245f67ea fix export
Former-commit-id: c9b968b84c97c9a00fbb43194c3adc9354d74f3b
2024-03-14 18:17:01 +08:00
hiyouga
4ef67ed4dd improve lora+ impl.
Former-commit-id: 332bad25455a70ad9204e7dd384bb086d789aa39
2024-03-13 23:32:51 +08:00
hiyouga
33bef29828 fix kv cache
Former-commit-id: a9588e36e95bed896eea8d79ba7108447ff08f4b
2024-03-13 01:21:50 +08:00
hiyouga
89770c5a8e fix #2802
Former-commit-id: 1370db270d7ba1a20468abdb29193ce7534d1b4f
2024-03-12 17:08:34 +08:00
hiyouga
56565bdbd4 allow non-packing pretraining
Former-commit-id: 3fee5cc5a3db9ce874ad90f2500ec092d904bd4e
2024-03-09 22:21:46 +08:00
hiyouga
e16912b0c0 fix #2756 , patch #2746
Former-commit-id: 627d1c91e675f1d9ebf47bad123cbbf29821da4d
2024-03-09 02:01:26 +08:00
hoshi-hiyouga
5469111c65 Merge pull request #2746 from stephen-nju/main
fix deepspeed ppo RuntimeError

Former-commit-id: 656c653f0c628f9494b4d7ae12e60c8eeec1ea7a
2024-03-09 01:37:00 +08:00
hiyouga
1dd3f17f79 fix aqlm version
Former-commit-id: 05673f81f0295c76957f3247c62f95fda322a63e
2024-03-09 00:09:09 +08:00
stephen_zhu
9e8fe6403d update
Former-commit-id: 295f9ef2eff2e8b5d7a21d3da8dd3e6eb2a42006
2024-03-08 12:47:44 +08:00
stephen
eb1ad9f161 fix ppo runtime error
Former-commit-id: 14e2f221e3e720075e59065a3dc42aa4d993a8b6
2024-03-08 11:48:26 +08:00
hiyouga
a02d518edc fix #2735
Former-commit-id: 416f6333f66b6afd70a3a936d82593efca583235
2024-03-07 16:15:53 +08:00
hiyouga
e7bea6981e export use balanced gpu
Former-commit-id: 710487dc694489bf3dfe54f8d32df80ce46439e4
2024-03-06 16:33:14 +08:00
hiyouga
67bb861040 optimize aqlm training
Former-commit-id: 8b42660e4039b3d6475f502f397686ba6b140627
2024-03-05 18:35:41 +08:00
hiyouga
0604a84208 fix dora inference
Former-commit-id: 21b3597b0a05169afe51e1609b532787a65ca8ea
2024-03-05 11:51:41 +08:00
hiyouga
bfddb4b468 fix export on cpu device
Former-commit-id: e4722a9a627ea4e9a1341cc00a3108dd06a6b550
2024-03-04 17:35:09 +08:00
hiyouga
10845a2fe7 fix #2649
Former-commit-id: 1c850de660c671d92f0bc63f230d338b60b7c0bd
2024-03-01 13:02:41 +08:00
hiyouga
c5cbe2c6f9 fix #2642
Former-commit-id: d8435e7f1850532310e1bee069b45f38cd666e48
2024-02-29 18:32:54 +08:00
hiyouga
c5b9c285b4 tiny fix
Former-commit-id: 3b6e1132c4d203e6d5376cf97e81cc160697c822
2024-02-29 17:28:50 +08:00
hiyouga
443d85d80f release v0.5.3
Former-commit-id: f6bc89581b3cd129448da2defc23848de6f494ed
2024-02-29 00:34:19 +08:00
hiyouga
98a0c8e8bf support DoRA, AWQ, AQLM #2512
Former-commit-id: 6614cc1f08aa944db083e27e451bbdd733f7dd97
2024-02-28 19:53:28 +08:00
hiyouga
562b9d0167 support llama pro #2338 , add rslora
Former-commit-id: 40d659b7f30dd5a004703c176ec1f22dc864e505
2024-02-15 02:27:36 +08:00
hiyouga
3c35a04280 update gc kwargs
Former-commit-id: 0cb81c156bc8c21a4bbdd3289a491f78dfcaf730
2024-02-07 00:38:24 +08:00
hiyouga
1a5edb7144 fix #2438
Former-commit-id: 412d856eeada2abcea598fac0a8d35ae90cc9c01
2024-02-06 15:23:08 +08:00
hiyouga
434e858c4e fix #2376
Former-commit-id: 8e2cfa7cca21b7fd4538d72114e36f704bcc82fe
2024-02-03 23:14:31 +08:00
hiyouga
7f28b961a5 fix autoset attn impl, update data readme
Former-commit-id: 34a6e5f82baf45cc8dbb11f9f7ab4a480ab7ec5c
2024-01-31 11:58:07 +08:00
hiyouga
eafdae8e94 fix #2320
Former-commit-id: e0b0c4415aaf80e75f6dd4f3777a0616b0e60f84
2024-01-24 16:19:18 +08:00
ldwang
d1a35c3fb1 Add patch_mixtral_replace_moe_impl for full training of Mixtral using DeepSpeed ZeRO-3.
Signed-off-by: ldwang <ftgreat@gmail.com>

Former-commit-id: 5f50c02f0e425737cd80abdf8fde9e25abf13083
2024-01-24 15:25:31 +08:00
ldwang
0df7de7ab0 Add patch_mixtral_replace_moe_impl for full training of Mixtral using DeepSpeed ZeRO-3.
Signed-off-by: ldwang <ftgreat@gmail.com>

Former-commit-id: d1413dcec8a3b1d671f240b82a689c72b54d7b93
2024-01-24 14:43:16 +08:00
hiyouga
115e7af908 add hint
Former-commit-id: c540ef41bda61993b83ef8cfe3c84b1d169e984c
2024-01-22 23:32:01 +08:00
hoshi-hiyouga
e3dfc7de11 Update patcher.py
Former-commit-id: 33556cc6b0b65cc6db02e66f4f6e75112c33d966
2024-01-22 23:27:39 +08:00
A-Cepheus
fbaa6269d4 🐞 fix: typo
Former-commit-id: 57a3687ecd23237559aee0e8e811b782846f2415
2024-01-22 16:04:39 +08:00
A-Cepheus
ff4027e448 🐞 fix: typo, move MoE fix to patcher
Former-commit-id: 4ff28e99ff9b48df7150591c6bbd3723f22b7715
2024-01-22 16:01:58 +08:00
hiyouga
e4fe8c79ee fix #2268
Former-commit-id: 300ecf9b9d7fd99fbb68f3d086e3ad973c2f894e
2024-01-21 14:11:38 +08:00
hiyouga
c0e4eebf17 format style
Former-commit-id: 53b683531b83cd1d19de97c6565f16c1eca6f5e1
2024-01-20 20:15:56 +08:00
hiyouga
e5a751ded0 support longlora for main branch
Former-commit-id: f869501ad4c368df26534c41f62c6d63c6be17dd
2024-01-20 19:25:22 +08:00
hoshi-hiyouga
1ac46568ef Merge pull request #2201 from liu-zichen/token_embed_resize
support resize embed for zero3

Former-commit-id: c0d1b5e3aef70da6b115614bd1ed539a76d6547a
2024-01-20 17:45:38 +08:00
hiyouga
b3b8fc7492 add upcast_lmhead option
Former-commit-id: 7ef69a1697c11ff13e7503360e40ef36cfb1c345
2024-01-19 23:54:25 +08:00
hiyouga
c5a9f7f593 set use_reentrant=False
Former-commit-id: efa2e27d5ef6eaeb7baa7551c651ef10ab31400c
2024-01-19 23:29:54 +08:00