hiyouga
84c3d509fa
fix #2936
...
Former-commit-id: 140ad4ad56
2024-03-24 00:43:21 +08:00
hiyouga
75829c8699
fix #2928
...
Former-commit-id: 7afbc85dae
2024-03-24 00:34:54 +08:00
hiyouga
58aa576ae5
fix #2941
...
Former-commit-id: a1c8c98c5f
2024-03-24 00:28:44 +08:00
hiyouga
7999836fb6
support fsdp + qlora
...
Former-commit-id: 8408225162
2024-03-21 00:36:06 +08:00
hiyouga
8717e98200
fix #2777 #2895
...
Former-commit-id: 9bec3c98a2
2024-03-20 17:59:45 +08:00
hiyouga
cf149bf43c
fix #2346
...
Former-commit-id: 7b8f502901
2024-03-20 17:56:33 +08:00
hiyouga
3d483e0914
fix packages
...
Former-commit-id: 8e04794b2d
2024-03-17 22:32:03 +08:00
hiyouga
a5537f3ee8
fix patcher
...
Former-commit-id: 85c376fc1e
2024-03-15 19:18:42 +08:00
hoshi-hiyouga
30765baa91
Merge pull request #2849 from S3Studio/DockerizeSupport
...
Improve Dockerize support
Former-commit-id: 113cc04719
2024-03-15 19:16:02 +08:00
hiyouga
06860e8f0f
fix export
...
Former-commit-id: 6bc2c23b6d
2024-03-15 15:06:30 +08:00
S3Studio
46ef7416e6
Use official Nvidia base image
...
Note that the flash-attn library is installed in this image and the qwen model will use it automatically.
However, if the host machine's GPU is not compatible with the library, an exception will be raised during the training process as follows:
FlashAttention only supports Ampere GPUs or newer.
So if the --flash_attn flag is not set, an additional patch for the qwen model's config is necessary to set the default value of use_flash_attn from "auto" to False.
Former-commit-id: e75407febd
2024-03-15 08:59:13 +08:00
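The workaround described in the commit body above can be sketched as follows. This is a minimal illustration only: `patch_qwen_config` and the `SimpleNamespace` stand-in for the model config are hypothetical, not the project's actual code; the real Qwen config is a Transformers config object, but the logic of flipping `use_flash_attn` from "auto" to False is the same.

```python
from types import SimpleNamespace

def patch_qwen_config(config, flash_attn_enabled: bool):
    """Force use_flash_attn off when flash-attn cannot run on this GPU.

    Qwen's config defaults use_flash_attn to "auto", which picks up the
    installed flash-attn library automatically. On pre-Ampere GPUs that
    raises "FlashAttention only supports Ampere GPUs or newer", so when
    the --flash_attn flag is not set we pin the value to False instead.
    """
    if not flash_attn_enabled and getattr(config, "use_flash_attn", "auto") == "auto":
        config.use_flash_attn = False
    return config

# Hypothetical config object standing in for the Qwen model config.
config = SimpleNamespace(use_flash_attn="auto")
patch_qwen_config(config, flash_attn_enabled=False)
print(config.use_flash_attn)  # False
```

Note that an explicit `use_flash_attn = True` set by the user is left untouched; only the "auto" default is overridden.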
hiyouga
7ef49586be
tiny fix
...
Former-commit-id: 6ebde4f23e
2024-03-14 21:19:06 +08:00
hiyouga
2cf95d4efe
fix export
...
Former-commit-id: 3b4a59bfb1
2024-03-14 18:17:01 +08:00
hiyouga
edd28dbe2c
fix bug
...
Former-commit-id: 8172530d54
2024-03-13 23:55:31 +08:00
hiyouga
9ff7c99eb1
fix bug
...
Former-commit-id: 714d936dfb
2024-03-13 23:43:42 +08:00
hiyouga
8b8671817f
improve lora+ impl.
...
Former-commit-id: 72367307df
2024-03-13 23:32:51 +08:00
齐保元
24c9277488
[FEATURE]: ADD LORA+ ALGORITHM
...
Former-commit-id: a0965cd62c
2024-03-13 19:43:27 +08:00
hiyouga
922bd8864b
fix #2817
...
Former-commit-id: 0b4a5bf509
2024-03-13 12:42:03 +08:00
hiyouga
8673abbe5e
fix #2802
...
Former-commit-id: b9f87cdc11
2024-03-13 12:33:45 +08:00
hiyouga
a74426df0f
fix kv cache
...
Former-commit-id: 96ce76cd27
2024-03-13 01:21:50 +08:00
hiyouga
bbf272f96e
support QDoRA
...
Former-commit-id: 19ef482649
2024-03-12 22:12:42 +08:00
hiyouga
096c31bfb6
patch for gemma cpt
...
Former-commit-id: 70a3052dd8
2024-03-12 21:21:54 +08:00
hiyouga
c28818c39f
fix plot issues
...
Former-commit-id: 60cc17f3a8
2024-03-12 18:41:35 +08:00
hiyouga
14ed926a2d
support olmo
...
Former-commit-id: b3247d6a16
2024-03-12 18:30:38 +08:00
hiyouga
0b7e870b07
fix #2802
...
Former-commit-id: 8d8956bad5
2024-03-12 17:08:34 +08:00
hiyouga
7124b71676
fix #2782 #2798
...
Former-commit-id: 07f9b754a7
2024-03-12 15:53:29 +08:00
hiyouga
c88062347e
fix #2775
...
Former-commit-id: e874c00906
2024-03-11 00:42:54 +08:00
hiyouga
f776e738f8
tiny fix
...
Former-commit-id: 352693e2dc
2024-03-11 00:17:18 +08:00
hiyouga
566bfad930
update parser
...
Former-commit-id: be99799413
2024-03-10 13:35:20 +08:00
hiyouga
4a4e4b4354
support layerwise galore
...
Former-commit-id: 8664262cde
2024-03-10 00:24:11 +08:00
hiyouga
276def1897
fix #2732
...
Former-commit-id: 18ffce36b5
2024-03-09 22:37:16 +08:00
hiyouga
868444e124
allow non-packing pretraining
...
Former-commit-id: bdb496644c
2024-03-09 22:21:46 +08:00
hiyouga
1173441661
fix #2766
...
Former-commit-id: 412c52e325
2024-03-09 21:35:24 +08:00
hiyouga
8f6eb1383d
use default arg for freeze tuning
...
Former-commit-id: af0e370fb1
2024-03-09 06:08:48 +08:00
hiyouga
5c00783697
update hardware requirements
...
Former-commit-id: 393c2de27c
2024-03-09 03:58:18 +08:00
hiyouga
c561b268ef
fix #2756 , patch #2746
...
Former-commit-id: e8dd38b7fd
2024-03-09 02:01:26 +08:00
hoshi-hiyouga
36d65289d0
Merge pull request #2746 from stephen-nju/main
...
fix deepspeed ppo RuntimeError
Former-commit-id: 516d0ddc66
2024-03-09 01:37:00 +08:00
hiyouga
398c261c7c
fix aqlm version
...
Former-commit-id: 10be2f0ecc
2024-03-09 00:09:09 +08:00
stephen_zhu
c69b9fbe58
update
...
Former-commit-id: aa71571b77
2024-03-08 12:47:44 +08:00
stephen
495b858606
fix ppo runtime error
...
Former-commit-id: cdb7f82869
2024-03-08 11:48:26 +08:00
hiyouga
7443ac3116
fix chat engine, update webui
...
Former-commit-id: 5d956e2a51
2024-03-08 03:01:53 +08:00
hiyouga
2235020cc9
update galore args
...
Former-commit-id: 0ac6b40a47
2024-03-08 01:17:32 +08:00
hiyouga
5b50458acf
fix galore
...
Former-commit-id: 33a4c24a8a
2024-03-08 00:44:51 +08:00
hiyouga
f373290012
add Yi-9B model
...
Former-commit-id: 57452a4aa1
2024-03-07 23:11:57 +08:00
hiyouga
2c010c72b8
support galore
...
Former-commit-id: 28f7862188
2024-03-07 22:41:36 +08:00
hiyouga
34533b2f35
support vllm
...
Former-commit-id: d07ad5cc1c
2024-03-07 20:26:31 +08:00
hiyouga
37e40563f1
fix #2735
...
Former-commit-id: f74f804a71
2024-03-07 16:15:53 +08:00
hoshi-hiyouga
90e66c8d94
Merge pull request #2730 from cx2333-gt/main
...
fix flash_attn in train_web
Former-commit-id: 2185855bdb
2024-03-07 14:37:18 +08:00
cx2333
013c12a135
revert choice name
...
Former-commit-id: 94b7a1b915
2024-03-07 14:28:55 +08:00
hiyouga
843d3f7a97
fix chatglm3 template
...
Former-commit-id: 921ee82267
2024-03-07 14:26:16 +08:00