LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2025-10-16 08:38:09 +08:00

Author	SHA1	Message	Date
S3Studio	096869c7b6	Use official Nvidia base image. Note that the flash-attn library is installed in this image and the qwen model will use it automatically. However, if the host machine's GPU is not compatible with the library, an exception will be raised during the training process as follows: FlashAttention only supports Ampere GPUs or newer. So if the --flash_attn flag is not set, an additional patch for the qwen model's config is necessary to set the default value of use_flash_attn from "auto" to False. Former-commit-id: cd2f5717d676e1a5afd2f4e7a38402d2e55e7479	2024-03-15 08:59:13 +08:00
hiyouga	aabe90343e	fix export Former-commit-id: c9b968b84c97c9a00fbb43194c3adc9354d74f3b	2024-03-14 18:17:01 +08:00
hiyouga	46f99ff277	improve lora+ impl. Former-commit-id: 332bad25455a70ad9204e7dd384bb086d789aa39	2024-03-13 23:32:51 +08:00
hiyouga	9a784fb4f3	fix kv cache Former-commit-id: a9588e36e95bed896eea8d79ba7108447ff08f4b	2024-03-13 01:21:50 +08:00
hiyouga	6c1b4aec75	fix #2802 Former-commit-id: 1370db270d7ba1a20468abdb29193ce7534d1b4f	2024-03-12 17:08:34 +08:00
hiyouga	4881f4e631	allow non-packing pretraining Former-commit-id: 3fee5cc5a3db9ce874ad90f2500ec092d904bd4e	2024-03-09 22:21:46 +08:00
hiyouga	43b2ede0f8	fix #2756, patch #2746 Former-commit-id: 627d1c91e675f1d9ebf47bad123cbbf29821da4d	2024-03-09 02:01:26 +08:00
hoshi-hiyouga	2f095e2017	Merge pull request #2746 from stephen-nju/main fix deepspeed ppo RuntimeError Former-commit-id: 656c653f0c628f9494b4d7ae12e60c8eeec1ea7a	2024-03-09 01:37:00 +08:00
hiyouga	9b97b23ce7	fix aqlm version Former-commit-id: 05673f81f0295c76957f3247c62f95fda322a63e	2024-03-09 00:09:09 +08:00
stephen_zhu	940c00e7ae	update Former-commit-id: 295f9ef2eff2e8b5d7a21d3da8dd3e6eb2a42006	2024-03-08 12:47:44 +08:00
stephen	18cfd5f349	fix ppo runtime error Former-commit-id: 14e2f221e3e720075e59065a3dc42aa4d993a8b6	2024-03-08 11:48:26 +08:00
hiyouga	9a69cadab3	fix #2735 Former-commit-id: 416f6333f66b6afd70a3a936d82593efca583235	2024-03-07 16:15:53 +08:00
hiyouga	7578209735	export use balanced gpu Former-commit-id: 710487dc694489bf3dfe54f8d32df80ce46439e4	2024-03-06 16:33:14 +08:00
hiyouga	a10bead9b5	optimize aqlm training Former-commit-id: 8b42660e4039b3d6475f502f397686ba6b140627	2024-03-05 18:35:41 +08:00
hiyouga	3553e301dd	fix dora inference Former-commit-id: 21b3597b0a05169afe51e1609b532787a65ca8ea	2024-03-05 11:51:41 +08:00
hiyouga	2dca53962e	fix export on cpu device Former-commit-id: e4722a9a627ea4e9a1341cc00a3108dd06a6b550	2024-03-04 17:35:09 +08:00
hiyouga	59a9a5994e	fix #2649 Former-commit-id: 1c850de660c671d92f0bc63f230d338b60b7c0bd	2024-03-01 13:02:41 +08:00
hiyouga	88fddb879d	fix #2642 Former-commit-id: d8435e7f1850532310e1bee069b45f38cd666e48	2024-02-29 18:32:54 +08:00
hiyouga	30855b924a	tiny fix Former-commit-id: 3b6e1132c4d203e6d5376cf97e81cc160697c822	2024-02-29 17:28:50 +08:00
hiyouga	544e7a491b	release v0.5.3 Former-commit-id: f6bc89581b3cd129448da2defc23848de6f494ed	2024-02-29 00:34:19 +08:00
hiyouga	b392e6cfb9	support DoRA, AWQ, AQLM #2512 Former-commit-id: 6614cc1f08aa944db083e27e451bbdd733f7dd97	2024-02-28 19:53:28 +08:00
hiyouga	596b6828cb	support llama pro #2338, add rslora Former-commit-id: 40d659b7f30dd5a004703c176ec1f22dc864e505	2024-02-15 02:27:36 +08:00
hiyouga	f67f781fed	update gc kwargs Former-commit-id: 0cb81c156bc8c21a4bbdd3289a491f78dfcaf730	2024-02-07 00:38:24 +08:00
hiyouga	b564b97b7e	fix #2438 Former-commit-id: 412d856eeada2abcea598fac0a8d35ae90cc9c01	2024-02-06 15:23:08 +08:00
hiyouga	5fa52e87cb	fix #2376 Former-commit-id: 8e2cfa7cca21b7fd4538d72114e36f704bcc82fe	2024-02-03 23:14:31 +08:00
hiyouga	5b8712d061	fix autoset attn impl, update data readme Former-commit-id: 34a6e5f82baf45cc8dbb11f9f7ab4a480ab7ec5c	2024-01-31 11:58:07 +08:00
hiyouga	1ace676170	fix #2320 Former-commit-id: e0b0c4415aaf80e75f6dd4f3777a0616b0e60f84	2024-01-24 16:19:18 +08:00
ldwang	786a2f1103	Add patch_mixtral_replace_moe_impl for full training Mixtral using DeepSpeed Zero3. Signed-off-by: ldwang <ftgreat@gmail.com> Former-commit-id: 5f50c02f0e425737cd80abdf8fde9e25abf13083	2024-01-24 15:25:31 +08:00
ldwang	36ac14a566	Add patch_mixtral_replace_moe_impl for full training Mixtral using DeepSpeed Zero3. Signed-off-by: ldwang <ftgreat@gmail.com> Former-commit-id: d1413dcec8a3b1d671f240b82a689c72b54d7b93	2024-01-24 14:43:16 +08:00
hiyouga	7a048fc91d	add hint Former-commit-id: c540ef41bda61993b83ef8cfe3c84b1d169e984c	2024-01-22 23:32:01 +08:00
hoshi-hiyouga	b36c4b99cc	Update patcher.py Former-commit-id: 33556cc6b0b65cc6db02e66f4f6e75112c33d966	2024-01-22 23:27:39 +08:00
A-Cepheus	882a6a1d51	🐞 fix: typo Former-commit-id: 57a3687ecd23237559aee0e8e811b782846f2415	2024-01-22 16:04:39 +08:00
A-Cepheus	712ab4ae7a	🐞 fix: typo, move MoE fix to patcher Former-commit-id: 4ff28e99ff9b48df7150591c6bbd3723f22b7715	2024-01-22 16:01:58 +08:00
hiyouga	96531a0ef8	fix #2268 Former-commit-id: 300ecf9b9d7fd99fbb68f3d086e3ad973c2f894e	2024-01-21 14:11:38 +08:00
hiyouga	66e0e651b9	format style Former-commit-id: 53b683531b83cd1d19de97c6565f16c1eca6f5e1	2024-01-20 20:15:56 +08:00
hiyouga	80637fc06d	support longlora for main branch Former-commit-id: f869501ad4c368df26534c41f62c6d63c6be17dd	2024-01-20 19:25:22 +08:00
hoshi-hiyouga	8efc055511	Merge pull request #2201 from liu-zichen/token_embed_resize support resize embed for zero3 Former-commit-id: c0d1b5e3aef70da6b115614bd1ed539a76d6547a	2024-01-20 17:45:38 +08:00
hiyouga	be61bfda93	add upcast_lmhead option Former-commit-id: 7ef69a1697c11ff13e7503360e40ef36cfb1c345	2024-01-19 23:54:25 +08:00
hiyouga	1a39f529c0	set use_reentrant=False Former-commit-id: efa2e27d5ef6eaeb7baa7551c651ef10ab31400c	2024-01-19 23:29:54 +08:00
hiyouga	a423274fd9	support function calling Former-commit-id: 66533b3f65babf2429c92c0f8fafe4eff5e0ff63	2024-01-18 09:54:23 +08:00
liuzc	aa72a4349e	support resize embed for zero3 Former-commit-id: b5464f5699b13bb118ac57ebc40b3cf9eb030396	2024-01-16 15:16:20 +08:00
hiyouga	75fe1404b1	improve model export Former-commit-id: 31255147a566a23ce1a48402662d14af8ac267ab	2024-01-05 18:51:49 +08:00
hiyouga	b460c9372f	fix #2098 Former-commit-id: e62d9158cffbf1044396597ddaf15b1c0bc5f954	2024-01-05 17:11:26 +08:00
hiyouga	04ae80a52e	fix #2081 Former-commit-id: ec4b539b6c0be11e15d273025c414b694bbd6c9a	2024-01-04 23:19:08 +08:00
hiyouga	8c74851b70	fix dispatch Former-commit-id: deda82638716506dc690902c51276bb1eb0ddd5e	2024-01-03 16:33:16 +08:00
hiyouga	7168392a51	fix valuehead patch Former-commit-id: d9cb98362b58b28ae0ee207e7c07e75e5d810876	2024-01-03 16:19:23 +08:00
hiyouga	ccc5b324fe	fix rm server Former-commit-id: 81bc1638682a9fd01518f9f25250a6b584d2a9e6	2024-01-03 15:30:46 +08:00
hiyouga	c33fbea469	fix bug Former-commit-id: b06faa1be3f5aa5e0fa31aa31314c213c36c3442	2023-12-24 19:20:12 +08:00
hiyouga	921f593632	update loader Former-commit-id: 080d8eab858217ca58bffe719d5ffde7579c5bda	2023-12-24 19:10:23 +08:00
hiyouga	940403720a	update patcher Former-commit-id: d6d7b6670847ce4ea10353c5b126214542b45c2b	2023-12-23 15:24:27 +08:00


57 Commits