LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2025-08-06 05:32:50 +08:00

Author	SHA1	Message	Date
hiyouga	829cf6458a	fix #3083 Former-commit-id: 4a6ca621c09d179561acc5957c8c911a4e44184c	2024-04-01 22:53:52 +08:00
hiyouga	34f1de0574	fix #3077 Former-commit-id: aee634cd20e6dfdfbe2fbb47ae57f62b2da2bf9a	2024-04-01 21:35:18 +08:00
hiyouga	b7468ea0a8	support infer 4bit model on GPUs #3023 Former-commit-id: eb259cc5738dfb383e4cc5d32579501c580e11b1	2024-04-01 17:34:04 +08:00
hiyouga	3cf35e57db	tiny fix Former-commit-id: 27776c34741ca0c58ed793bcdf1acd5e4a81fb39	2024-03-31 00:10:29 +08:00
marko1616	5721074af1	fix blank line contains whitespace Former-commit-id: d9a5134617d494ef13ba73f9c540123e89a8c29c	2024-03-30 23:46:55 +08:00
marko1616	67c05c2031	Fix Llama model save for full param train Former-commit-id: eb178eaff390a1dc342cc35ab8c7820d654f3717	2024-03-30 23:45:04 +08:00
hiyouga	89c400633a	update trainers Former-commit-id: 8c77b1091296e204dc3c8c1f157c288ca5b236bd	2024-03-28 18:16:27 +08:00
hiyouga	ec94e5e876	fix #2961 Former-commit-id: 511f6754026fbbf48bd481018015338a6a3ad92f	2024-03-26 17:26:14 +08:00
hiyouga	75829c8699	fix #2928 Former-commit-id: 7afbc85daee295cf38dcee9ded5afd87b2c4cfd1	2024-03-24 00:34:54 +08:00
hiyouga	58aa576ae5	fix #2941 Former-commit-id: a1c8c98c5fecfc0dd0ed1be33ee8dd2ade05b708	2024-03-24 00:28:44 +08:00
hiyouga	7999836fb6	support fsdp + qlora Former-commit-id: 84082251621e1470b3b5406a56d0a967780a1804	2024-03-21 00:36:06 +08:00
hiyouga	cf149bf43c	fix #2346 Former-commit-id: 7b8f5029018f0481f7da83cc5ee4408d95c9beb2	2024-03-20 17:56:33 +08:00
hiyouga	a5537f3ee8	fix patcher Former-commit-id: 85c376fc1e0bcc854ed6e70e6455a0b00b341655	2024-03-15 19:18:42 +08:00
S3Studio	46ef7416e6	Use official Nvidia base image. Note that the flash-attn library is installed in this image and the qwen model will use it automatically. However, if the host machine's GPU is not compatible with the library, an exception will be raised during the training process as follows: FlashAttention only supports Ampere GPUs or newer. So if the --flash_attn flag is not set, an additional patch for the qwen model's config is necessary to set the default value of use_flash_attn from "auto" to False. Former-commit-id: e75407febdec086f2bdca723a7f69a92b3b1d63f	2024-03-15 08:59:13 +08:00
hiyouga	2cf95d4efe	fix export Former-commit-id: 3b4a59bfb1866a270b9934a4a2303197ffdab531	2024-03-14 18:17:01 +08:00
hiyouga	8b8671817f	improve lora+ impl. Former-commit-id: 72367307dfadf936fb989ebe8bc9f0ff229fb933	2024-03-13 23:32:51 +08:00
hiyouga	8673abbe5e	fix #2802 Former-commit-id: b9f87cdc11b3fe712574b91455dc722b69c60c66	2024-03-13 12:33:45 +08:00
hiyouga	a74426df0f	fix kv cache Former-commit-id: 96ce76cd2753bc91c781ad13aa8f7a972abe815a	2024-03-13 01:21:50 +08:00
hiyouga	0b7e870b07	fix #2802 Former-commit-id: 8d8956bad542c0e1c0f7edbf4ffc22bb0f8788ae	2024-03-12 17:08:34 +08:00
hiyouga	276def1897	fix #2732 Former-commit-id: 18ffce36b5ee0809f2e2905c2fd44843a3725ea0	2024-03-09 22:37:16 +08:00
hiyouga	868444e124	allow non-packing pretraining Former-commit-id: bdb496644ce2c18806fc4fdae1fedcb3e5b5f808	2024-03-09 22:21:46 +08:00
hiyouga	c561b268ef	fix #2756 , patch #2746 Former-commit-id: e8dd38b7fdf8e172745d2538eb103895f2839c38	2024-03-09 02:01:26 +08:00
hoshi-hiyouga	36d65289d0	Merge pull request #2746 from stephen-nju/main fix deepspeed ppo RuntimeError Former-commit-id: 516d0ddc666c179616a2a610b1353728db57391e	2024-03-09 01:37:00 +08:00
hiyouga	398c261c7c	fix aqlm version Former-commit-id: 10be2f0eccc3963a985afcd24e5b8b8fc638b1c3	2024-03-09 00:09:09 +08:00
stephen_zhu	c69b9fbe58	update Former-commit-id: aa71571b773c5dc527b17219ec87828e4455b330	2024-03-08 12:47:44 +08:00
stephen	495b858606	fix ppo runtime error Former-commit-id: cdb7f82869b07d9d5d31b7b2aaf6b033bd00e32e	2024-03-08 11:48:26 +08:00
hiyouga	5b50458acf	fix galore Former-commit-id: 33a4c24a8a3c153bc62edf74b9246699a0ae3233	2024-03-08 00:44:51 +08:00
hiyouga	34533b2f35	support vllm Former-commit-id: d07ad5cc1cdbc13879afd84f653afdfee03a6933	2024-03-07 20:26:31 +08:00
hiyouga	37e40563f1	fix #2735 Former-commit-id: f74f804a715dfb16bf24a056bc95db6b102f9ed7	2024-03-07 16:15:53 +08:00
hiyouga	8b6c178249	export use balanced gpu Former-commit-id: 3e84f430b14a94e68f5815d8e412f0d74d28a04c	2024-03-06 16:33:14 +08:00
hiyouga	e887aface7	fix version checking Former-commit-id: 3016e6565708637c1d760f2cd5a67cbd8a5a6c26	2024-03-06 14:51:51 +08:00
hiyouga	9561809ce9	improve aqlm optim Former-commit-id: 259af60d28985b919911587716c24a3ac7f7de64	2024-03-05 20:49:50 +08:00
hiyouga	c776cdfc3e	optimize aqlm training Former-commit-id: d3d3dac7070eb9055bcdc91eaf53f5b3741c0bda	2024-03-05 18:35:41 +08:00
hiyouga	0f2250b831	fix dora inference Former-commit-id: ddf352f861e04e813cb8adeb4513964b4945081a	2024-03-05 11:51:41 +08:00
hiyouga	a62d17d009	fix export on cpu device Former-commit-id: cda2ff87272797a062c7addb1bf840ac46208dfd	2024-03-04 17:35:09 +08:00
hiyouga	d1e6e02461	fix #2649 Former-commit-id: 4e5fae2fac85227641bd16159cf296a32e0b18b4	2024-03-01 13:02:41 +08:00
hiyouga	3787d13816	fix #2642 Former-commit-id: c0be617195f43d972681dd59727857b1247eeb7e	2024-02-29 18:32:54 +08:00
hiyouga	1853b5c172	tiny fix Former-commit-id: 4a871e80e205466262534cdc710b0495954b153e	2024-02-29 17:28:50 +08:00
hiyouga	8e7d50dae4	release v0.5.3 Former-commit-id: fa5ab21ebc0ab738178c0c57578db3bda995ae06	2024-02-29 00:34:19 +08:00
hiyouga	5abbca70d3	support DoRA, AWQ, AQLM #2512 Former-commit-id: cfefacaa37453a15c55866d019887f24e886a577	2024-02-28 19:53:28 +08:00
hiyouga	0fcb931f18	support lora for llama pro Former-commit-id: 9aeb404a946795d6c4fa3cb45e3e96ffeec13646	2024-02-21 02:17:22 +08:00
hiyouga	62b78001b7	fix #2481 Former-commit-id: 22acab8aff8cadbba2a67e56af5701c0261ade49	2024-02-15 19:07:47 +08:00
hiyouga	96265ec154	support llama pro #2338 , add rslora Former-commit-id: 7924ffc55d98e33bfbfbca303e46c8f476435673	2024-02-15 02:27:36 +08:00
younesbelkada	6b98435a53	add v1 hf tags Former-commit-id: 0ca0f08162b18d326787939f12bb5ba07904fb4a	2024-02-13 05:58:49 +00:00
hiyouga	75adbfec79	add option to disable version check Former-commit-id: 91d09a01ac3b5da29d284b8d51cdfe4252b391e0	2024-02-10 22:31:23 +08:00
hiyouga	bbe5ff0570	update gc kwargs Former-commit-id: 0ae9a16b9d13bc1093662aa0b9bd990400ec2646	2024-02-07 00:38:24 +08:00
hiyouga	caeffc780d	fix #2438 Former-commit-id: ebf31b62eb1b75399cff7c7542c45ac72f6f41dd	2024-02-06 15:23:08 +08:00
hiyouga	f6b2bcfa16	fix #2420 Former-commit-id: 19d33ede137a687417bb8697f10e1749781d0d23	2024-02-04 15:51:47 +08:00
hiyouga	b1064d2f9b	bump up transformers version Former-commit-id: 38e63bfd28179fbf4b06adc0d127c358fd297e58	2024-02-04 00:01:16 +08:00
hiyouga	0fc8612b97	add hint for freeze #2412 Former-commit-id: 6545c02790e39395a87d664682ab73e0e3191099	2024-02-03 23:38:56 +08:00


198 Commits