hiyouga
85c376fc1e
fix patcher
2024-03-15 19:18:42 +08:00
S3Studio
e75407febd
Use official Nvidia base image
...
Note that the flash-attn library is installed in this image, and the Qwen model will use it automatically.
However, if the host machine's GPU is not compatible with the library, an exception will be raised during training:
FlashAttention only supports Ampere GPUs or newer.
Therefore, if the --flash_attn flag is not set, an additional patch to the Qwen model's config is necessary to change the default value of use_flash_attn from "auto" to False.
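The patch described above can be sketched as follows. This is a minimal illustration, not the repository's actual patcher code; the config class and the `patch_qwen_config` helper are hypothetical stand-ins, while the `use_flash_attn` attribute and its `"auto"` default follow the Qwen convention mentioned in the message.

```python
class QwenConfigStub:
    """Hypothetical stand-in for the Qwen model config."""
    use_flash_attn = "auto"  # Qwen's default lets the model decide

def patch_qwen_config(config, flash_attn_enabled: bool) -> None:
    # Only override the "auto" default; respect an explicit user setting.
    # Without this, "auto" may pick FlashAttention on hosts whose GPUs
    # (pre-Ampere) cannot run it, crashing mid-training.
    if not flash_attn_enabled and getattr(config, "use_flash_attn", None) == "auto":
        config.use_flash_attn = False

config = QwenConfigStub()
patch_qwen_config(config, flash_attn_enabled=False)
```

With `flash_attn_enabled=False`, the "auto" default is forced to `False`, so the model falls back to standard attention instead of raising the "Ampere GPUs or newer" error.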
2024-03-15 08:59:13 +08:00
hiyouga
3b4a59bfb1
fix export
2024-03-14 18:17:01 +08:00
hiyouga
72367307df
improve lora+ impl.
2024-03-13 23:32:51 +08:00
hiyouga
b9f87cdc11
fix #2802
2024-03-13 12:33:45 +08:00
hiyouga
96ce76cd27
fix kv cache
2024-03-13 01:21:50 +08:00
hiyouga
8d8956bad5
fix #2802
2024-03-12 17:08:34 +08:00
hiyouga
18ffce36b5
fix #2732
2024-03-09 22:37:16 +08:00
hiyouga
bdb496644c
allow non-packing pretraining
2024-03-09 22:21:46 +08:00
hiyouga
e8dd38b7fd
fix #2756, patch #2746
2024-03-09 02:01:26 +08:00
hoshi-hiyouga
516d0ddc66
Merge pull request #2746 from stephen-nju/main
...
fix deepspeed ppo RuntimeError
2024-03-09 01:37:00 +08:00
hiyouga
10be2f0ecc
fix aqlm version
2024-03-09 00:09:09 +08:00
stephen_zhu
aa71571b77
update
2024-03-08 12:47:44 +08:00
stephen
cdb7f82869
fix ppo runtime error
2024-03-08 11:48:26 +08:00
hiyouga
33a4c24a8a
fix galore
2024-03-08 00:44:51 +08:00
hiyouga
d07ad5cc1c
support vllm
2024-03-07 20:26:31 +08:00
hiyouga
f74f804a71
fix #2735
2024-03-07 16:15:53 +08:00
hiyouga
3e84f430b1
export use balanced gpu
2024-03-06 16:33:14 +08:00
hiyouga
3016e65657
fix version checking
2024-03-06 14:51:51 +08:00
hiyouga
259af60d28
improve aqlm optim
2024-03-05 20:49:50 +08:00
hiyouga
d3d3dac707
optimize aqlm training
2024-03-05 18:35:41 +08:00
hiyouga
ddf352f861
fix dora inference
2024-03-05 11:51:41 +08:00
hiyouga
cda2ff8727
fix export on cpu device
2024-03-04 17:35:09 +08:00
hiyouga
4e5fae2fac
fix #2649
2024-03-01 13:02:41 +08:00
hiyouga
c0be617195
fix #2642
2024-02-29 18:32:54 +08:00
hiyouga
4a871e80e2
tiny fix
2024-02-29 17:28:50 +08:00
hiyouga
fa5ab21ebc
release v0.5.3
2024-02-29 00:34:19 +08:00
hiyouga
cfefacaa37
support DoRA, AWQ, AQLM #2512
2024-02-28 19:53:28 +08:00
hiyouga
9aeb404a94
support lora for llama pro
2024-02-21 02:17:22 +08:00
hiyouga
22acab8aff
fix #2481
2024-02-15 19:07:47 +08:00
hiyouga
7924ffc55d
support llama pro #2338, add rslora
2024-02-15 02:27:36 +08:00
younesbelkada
0ca0f08162
add v1 hf tags
2024-02-13 05:58:49 +00:00
hiyouga
91d09a01ac
add option to disable version check
2024-02-10 22:31:23 +08:00
hiyouga
0ae9a16b9d
update gc kwargs
2024-02-07 00:38:24 +08:00
hiyouga
ebf31b62eb
fix #2438
2024-02-06 15:23:08 +08:00
hiyouga
19d33ede13
fix #2420
2024-02-04 15:51:47 +08:00
hiyouga
38e63bfd28
bump up transformers version
2024-02-04 00:01:16 +08:00
hiyouga
6545c02790
add hint for freeze #2412
2024-02-03 23:38:56 +08:00
hiyouga
4ecadc3512
fix #2376
2024-02-03 23:14:31 +08:00
hiyouga
521ad76552
fix autoset attn impl, update data readme
2024-01-31 11:58:07 +08:00
hiyouga
2bc30763e9
fix #2320
2024-01-24 16:19:18 +08:00
ldwang
c284665425
Add patch_mixtral_replace_moe_impl for full training of Mixtral using DeepSpeed ZeRO-3.
...
Signed-off-by: ldwang <ftgreat@gmail.com>
2024-01-24 15:25:31 +08:00
ldwang
18923b1402
Add patch_mixtral_replace_moe_impl for full training of Mixtral using DeepSpeed ZeRO-3.
...
Signed-off-by: ldwang <ftgreat@gmail.com>
2024-01-24 14:43:16 +08:00
hiyouga
e4ba1deedf
add hint
2024-01-22 23:32:01 +08:00
hoshi-hiyouga
bdc9eff635
Update patcher.py
2024-01-22 23:27:39 +08:00
A-Cepheus
b06a31e76a
🐞 fix: typo
2024-01-22 16:04:39 +08:00
A-Cepheus
319a72b48d
🐞 fix: typo, move MoE fix to patcher
2024-01-22 16:01:58 +08:00
A-Cepheus
e1d5c98519
fix: ZeRO3 does not work with MoE models
2024-01-22 15:21:14 +08:00
hiyouga
e0a717aa3a
fix #2268
2024-01-21 14:11:38 +08:00
hiyouga
638234ceee
format style
2024-01-20 20:15:56 +08:00