Commit Graph

198 Commits

Author SHA1 Message Date
hiyouga
829cf6458a fix #3083
Former-commit-id: 4a6ca621c0
2024-04-01 22:53:52 +08:00
hiyouga
34f1de0574 fix #3077
Former-commit-id: aee634cd20
2024-04-01 21:35:18 +08:00
hiyouga
b7468ea0a8 support infer 4bit model on GPUs #3023
Former-commit-id: eb259cc573
2024-04-01 17:34:04 +08:00
hiyouga
3cf35e57db tiny fix
Former-commit-id: 27776c3474
2024-03-31 00:10:29 +08:00
marko1616
5721074af1 fix blank line contains whitespace
Former-commit-id: d9a5134617
2024-03-30 23:46:55 +08:00
marko1616
67c05c2031 Fix Llama model save for full param train
Former-commit-id: eb178eaff3
2024-03-30 23:45:04 +08:00
hiyouga
89c400633a update trainers
Former-commit-id: 8c77b10912
2024-03-28 18:16:27 +08:00
hiyouga
ec94e5e876 fix #2961
Former-commit-id: 511f675402
2024-03-26 17:26:14 +08:00
hiyouga
75829c8699 fix #2928
Former-commit-id: 7afbc85dae
2024-03-24 00:34:54 +08:00
hiyouga
58aa576ae5 fix #2941
Former-commit-id: a1c8c98c5f
2024-03-24 00:28:44 +08:00
hiyouga
7999836fb6 support fsdp + qlora
Former-commit-id: 8408225162
2024-03-21 00:36:06 +08:00
hiyouga
cf149bf43c fix #2346
Former-commit-id: 7b8f502901
2024-03-20 17:56:33 +08:00
hiyouga
a5537f3ee8 fix patcher
Former-commit-id: 85c376fc1e
2024-03-15 19:18:42 +08:00
S3Studio
46ef7416e6 Use official Nvidia base image
Note that the flash-attn library is installed in this image and the qwen model will use it automatically.
However, if the host machine's GPU is not compatible with the library, an exception will be raised during the training process as follows:
FlashAttention only supports Ampere GPUs or newer.
So if the --flash_attn flag is not set, an additional patch for the qwen model's config is necessary to set the default value of use_flash_attn from "auto" to False.


Former-commit-id: e75407febd
2024-03-15 08:59:13 +08:00
hiyouga
2cf95d4efe fix export
Former-commit-id: 3b4a59bfb1
2024-03-14 18:17:01 +08:00
hiyouga
8b8671817f improve lora+ impl.
Former-commit-id: 72367307df
2024-03-13 23:32:51 +08:00
hiyouga
8673abbe5e fix #2802
Former-commit-id: b9f87cdc11
2024-03-13 12:33:45 +08:00
hiyouga
a74426df0f fix kv cache
Former-commit-id: 96ce76cd27
2024-03-13 01:21:50 +08:00
hiyouga
0b7e870b07 fix #2802
Former-commit-id: 8d8956bad5
2024-03-12 17:08:34 +08:00
hiyouga
276def1897 fix #2732
Former-commit-id: 18ffce36b5
2024-03-09 22:37:16 +08:00
hiyouga
868444e124 allow non-packing pretraining
Former-commit-id: bdb496644c
2024-03-09 22:21:46 +08:00
hiyouga
c561b268ef fix #2756 , patch #2746
Former-commit-id: e8dd38b7fd
2024-03-09 02:01:26 +08:00
hoshi-hiyouga
36d65289d0 Merge pull request #2746 from stephen-nju/main
fix deepspeed ppo RuntimeError

Former-commit-id: 516d0ddc66
2024-03-09 01:37:00 +08:00
hiyouga
398c261c7c fix aqlm version
Former-commit-id: 10be2f0ecc
2024-03-09 00:09:09 +08:00
stephen_zhu
c69b9fbe58 update
Former-commit-id: aa71571b77
2024-03-08 12:47:44 +08:00
stephen
495b858606 fix ppo runtime error
Former-commit-id: cdb7f82869
2024-03-08 11:48:26 +08:00
hiyouga
5b50458acf fix galore
Former-commit-id: 33a4c24a8a
2024-03-08 00:44:51 +08:00
hiyouga
34533b2f35 support vllm
Former-commit-id: d07ad5cc1c
2024-03-07 20:26:31 +08:00
hiyouga
37e40563f1 fix #2735
Former-commit-id: f74f804a71
2024-03-07 16:15:53 +08:00
hiyouga
8b6c178249 export use balanced gpu
Former-commit-id: 3e84f430b1
2024-03-06 16:33:14 +08:00
hiyouga
e887aface7 fix version checking
Former-commit-id: 3016e65657
2024-03-06 14:51:51 +08:00
hiyouga
9561809ce9 improve aqlm optim
Former-commit-id: 259af60d28
2024-03-05 20:49:50 +08:00
hiyouga
c776cdfc3e optimize aqlm training
Former-commit-id: d3d3dac707
2024-03-05 18:35:41 +08:00
hiyouga
0f2250b831 fix dora inference
Former-commit-id: ddf352f861
2024-03-05 11:51:41 +08:00
hiyouga
a62d17d009 fix export on cpu device
Former-commit-id: cda2ff8727
2024-03-04 17:35:09 +08:00
hiyouga
d1e6e02461 fix #2649
Former-commit-id: 4e5fae2fac
2024-03-01 13:02:41 +08:00
hiyouga
3787d13816 fix #2642
Former-commit-id: c0be617195
2024-02-29 18:32:54 +08:00
hiyouga
1853b5c172 tiny fix
Former-commit-id: 4a871e80e2
2024-02-29 17:28:50 +08:00
hiyouga
8e7d50dae4 release v0.5.3
Former-commit-id: fa5ab21ebc
2024-02-29 00:34:19 +08:00
hiyouga
5abbca70d3 support DoRA, AWQ, AQLM #2512
Former-commit-id: cfefacaa37
2024-02-28 19:53:28 +08:00
hiyouga
0fcb931f18 support lora for llama pro
Former-commit-id: 9aeb404a94
2024-02-21 02:17:22 +08:00
hiyouga
62b78001b7 fix #2481
Former-commit-id: 22acab8aff
2024-02-15 19:07:47 +08:00
hiyouga
96265ec154 support llama pro #2338 , add rslora
Former-commit-id: 7924ffc55d
2024-02-15 02:27:36 +08:00
younesbelkada
6b98435a53 add v1 hf tags
Former-commit-id: 0ca0f08162
2024-02-13 05:58:49 +00:00
hiyouga
75adbfec79 add option to disable version check
Former-commit-id: 91d09a01ac
2024-02-10 22:31:23 +08:00
hiyouga
bbe5ff0570 update gc kwargs
Former-commit-id: 0ae9a16b9d
2024-02-07 00:38:24 +08:00
hiyouga
caeffc780d fix #2438
Former-commit-id: ebf31b62eb
2024-02-06 15:23:08 +08:00
hiyouga
f6b2bcfa16 fix #2420
Former-commit-id: 19d33ede13
2024-02-04 15:51:47 +08:00
hiyouga
b1064d2f9b bump up transformers version
Former-commit-id: 38e63bfd28
2024-02-04 00:01:16 +08:00
hiyouga
0fc8612b97 add hint for freeze #2412
Former-commit-id: 6545c02790
2024-02-03 23:38:56 +08:00
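
Note on commit 46ef7416e6 (Use official Nvidia base image): its message describes patching the qwen model's config so that use_flash_attn defaults to False instead of "auto" when the --flash_attn flag is not set. A minimal sketch of that idea follows. The function name patch_qwen_flash_attn, its flash_attn_requested parameter, and the SimpleNamespace stand-in for a loaded config are assumptions made for illustration; this is not the code from the commit.

```python
from types import SimpleNamespace


def patch_qwen_flash_attn(config, flash_attn_requested: bool) -> None:
    """Set use_flash_attn to False unless flash attention was requested.

    Hypothetical helper: `config` is any attribute-style object. Only a
    config whose `use_flash_attn` is still the "auto" default is changed.
    """
    if flash_attn_requested:
        return  # the user opted in, so leave the config untouched
    if getattr(config, "use_flash_attn", None) == "auto":
        config.use_flash_attn = False


# Stand-in for a loaded qwen config (assumption, not a real model config).
config = SimpleNamespace(model_type="qwen", use_flash_attn="auto")
patch_qwen_flash_attn(config, flash_attn_requested=False)
```

With the flag unset, the stand-in config ends up with use_flash_attn set to False, so the model would not try to use flash-attn on a GPU that the library does not support.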