LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2026-06-20 14:18:55 +08:00

Author	SHA1	Message	Date
hiyouga	3d483e0914	fix packages Former-commit-id: `8e04794b2d`	2024-03-17 22:32:03 +08:00
hiyouga	a5537f3ee8	fix patcher Former-commit-id: `85c376fc1e`	2024-03-15 19:18:42 +08:00
hoshi-hiyouga	30765baa91	Merge pull request #2849 from S3Studio/DockerizeSupport Improve Dockerize support Former-commit-id: `113cc04719`	2024-03-15 19:16:02 +08:00
hiyouga	06860e8f0f	fix export Former-commit-id: `6bc2c23b6d`	2024-03-15 15:06:30 +08:00
S3Studio	46ef7416e6	Use official Nvidia base image Note that the flash-attn library is installed in this image and the qwen model will use it automatically. However, if the host machine's GPU is not compatible with the library, an exception will be raised during the training process as follows: FlashAttention only supports Ampere GPUs or newer. So if the --flash_attn flag is not set, an additional patch for the qwen model's config is necessary to set the default value of use_flash_attn from "auto" to False. Former-commit-id: `e75407febd`	2024-03-15 08:59:13 +08:00
S3Studio	dcbc8168a8	improve Docker build and runtime parameters Modify installation method of extra python library. Utilize shared memory of the host machine to increase training performance. Former-commit-id: `6a5693d11d`	2024-03-15 08:57:46 +08:00
hiyouga	7ef49586be	tiny fix Former-commit-id: `6ebde4f23e`	2024-03-14 21:19:06 +08:00
hiyouga	2cf95d4efe	fix export Former-commit-id: `3b4a59bfb1`	2024-03-14 18:17:01 +08:00
hiyouga	edd28dbe2c	fix bug Former-commit-id: `8172530d54`	2024-03-13 23:55:31 +08:00
hiyouga	9ff7c99eb1	fix bug Former-commit-id: `714d936dfb`	2024-03-13 23:43:42 +08:00
hiyouga	8b8671817f	improve lora+ impl. Former-commit-id: `72367307df`	2024-03-13 23:32:51 +08:00
hoshi-hiyouga	4000de93ea	Merge pull request #2830 from qibaoyuan/lora_plus [FEATURE]: ADD LORA+ ALGORITHM Former-commit-id: `4e5e99af43`	2024-03-13 20:15:46 +08:00
齐保元	24c9277488	[FEATURE]: ADD LORA+ ALGORITHM Former-commit-id: `a0965cd62c`	2024-03-13 19:43:27 +08:00
hiyouga	634c44c51a	Update wechat.jpg Former-commit-id: `dfd451b722`	2024-03-13 19:03:00 +08:00
hiyouga	922bd8864b	fix #2817 Former-commit-id: `0b4a5bf509`	2024-03-13 12:42:03 +08:00
hiyouga	8673abbe5e	fix #2802 Former-commit-id: `b9f87cdc11`	2024-03-13 12:33:45 +08:00
hiyouga	a74426df0f	fix kv cache Former-commit-id: `96ce76cd27`	2024-03-13 01:21:50 +08:00
hiyouga	bbf272f96e	support QDoRA Former-commit-id: `19ef482649`	2024-03-12 22:12:42 +08:00
hiyouga	096c31bfb6	patch for gemma cpt Former-commit-id: `70a3052dd8`	2024-03-12 21:21:54 +08:00
hiyouga	c28818c39f	fix plot issues Former-commit-id: `60cc17f3a8`	2024-03-12 18:41:35 +08:00
hiyouga	14ed926a2d	support olmo Former-commit-id: `b3247d6a16`	2024-03-12 18:30:38 +08:00
hiyouga	0b7e870b07	fix #2802 Former-commit-id: `8d8956bad5`	2024-03-12 17:08:34 +08:00
hiyouga	b983de9f4f	fix #2803 Former-commit-id: `06c97083e1`	2024-03-12 16:57:39 +08:00
hiyouga	7124b71676	fix #2782 #2798 Former-commit-id: `07f9b754a7`	2024-03-12 15:53:29 +08:00
hoshi-hiyouga	52f14211e3	Merge pull request #2743 from S3Studio/DockerizeSupport Add dockerize support Former-commit-id: `c901aa63ff`	2024-03-12 00:05:49 +08:00
hiyouga	c88062347e	fix #2775 Former-commit-id: `e874c00906`	2024-03-11 00:42:54 +08:00
hiyouga	f776e738f8	tiny fix Former-commit-id: `352693e2dc`	2024-03-11 00:17:18 +08:00
hiyouga	566bfad930	update parser Former-commit-id: `be99799413`	2024-03-10 13:35:20 +08:00
hiyouga	4a4e4b4354	support layerwise galore Former-commit-id: `8664262cde`	2024-03-10 00:24:11 +08:00
hiyouga	276def1897	fix #2732 Former-commit-id: `18ffce36b5`	2024-03-09 22:37:16 +08:00
hiyouga	868444e124	allow non-packing pretraining Former-commit-id: `bdb496644c`	2024-03-09 22:21:46 +08:00
hiyouga	1173441661	fix #2766 Former-commit-id: `412c52e325`	2024-03-09 21:35:24 +08:00
hiyouga	8f6eb1383d	use default arg for freeze tuning Former-commit-id: `af0e370fb1`	2024-03-09 06:08:48 +08:00
hiyouga	17e50bcbb1	add GaLore results Former-commit-id: `818726e9bc`	2024-03-09 04:11:55 +08:00
hiyouga	5c00783697	update hardware requirements Former-commit-id: `393c2de27c`	2024-03-09 03:58:18 +08:00
hiyouga	eb363b04b9	update examples Former-commit-id: `4c00bcdcae`	2024-03-09 02:30:37 +08:00
hiyouga	c561b268ef	fix #2756 , patch #2746 Former-commit-id: `e8dd38b7fd`	2024-03-09 02:01:26 +08:00
hoshi-hiyouga	36d65289d0	Merge pull request #2746 from stephen-nju/main fix deepspeed ppo RuntimeError Former-commit-id: `516d0ddc66`	2024-03-09 01:37:00 +08:00
hiyouga	247aab9066	Update setup.py Former-commit-id: `74ff8664d7`	2024-03-09 00:14:48 +08:00
hiyouga	398c261c7c	fix aqlm version Former-commit-id: `10be2f0ecc`	2024-03-09 00:09:09 +08:00
hiyouga	ccec17f773	fix example params Former-commit-id: `8a45213440`	2024-03-08 20:41:43 +08:00
stephen_zhu	c69b9fbe58	update Former-commit-id: `aa71571b77`	2024-03-08 12:47:44 +08:00
stephen	495b858606	fix ppo runtime error Former-commit-id: `cdb7f82869`	2024-03-08 11:48:26 +08:00
S3Studio	de41334055	Add dockerize support Already tested with the Qwen:1.8B model and the alpaca_data_zh dataset. Some Python libraries were added to the Dockerfile in response to exception messages raised during testing. Former-commit-id: `3d911ae713`	2024-03-08 10:47:28 +08:00
hiyouga	b268215a0e	update readme Former-commit-id: `4a2cc60b94`	2024-03-08 03:06:21 +08:00
hiyouga	7443ac3116	fix chat engine, update webui Former-commit-id: `5d956e2a51`	2024-03-08 03:01:53 +08:00
hiyouga	0a0959facf	Update setup.py Former-commit-id: `5cd4947650`	2024-03-08 01:23:00 +08:00
hiyouga	2235020cc9	update galore args Former-commit-id: `0ac6b40a47`	2024-03-08 01:17:32 +08:00
hiyouga	5b50458acf	fix galore Former-commit-id: `33a4c24a8a`	2024-03-08 00:44:51 +08:00
hiyouga	f373290012	add Yi-9B model Former-commit-id: `57452a4aa1`	2024-03-07 23:11:57 +08:00

1024 Commits