齐保元 | a0965cd62c | [FEATURE]: ADD LORA+ ALGORITHM | 2024-03-13 19:43:27 +08:00 |
hiyouga | 0b4a5bf509 | fix #2817 | 2024-03-13 12:42:03 +08:00 |
hiyouga | b9f87cdc11 | fix #2802 | 2024-03-13 12:33:45 +08:00 |
hiyouga | 96ce76cd27 | fix kv cache | 2024-03-13 01:21:50 +08:00 |
hiyouga | 19ef482649 | support QDoRA | 2024-03-12 22:12:42 +08:00 |
hiyouga | 70a3052dd8 | patch for gemma cpt | 2024-03-12 21:21:54 +08:00 |
hiyouga | 60cc17f3a8 | fix plot issues | 2024-03-12 18:41:35 +08:00 |
hiyouga | b3247d6a16 | support olmo | 2024-03-12 18:30:38 +08:00 |
hiyouga | 8d8956bad5 | fix #2802 | 2024-03-12 17:08:34 +08:00 |
hiyouga | 07f9b754a7 | fix #2782 #2798 | 2024-03-12 15:53:29 +08:00 |
hiyouga | e874c00906 | fix #2775 | 2024-03-11 00:42:54 +08:00 |
hiyouga | 352693e2dc | tiny fix | 2024-03-11 00:17:18 +08:00 |
hiyouga | be99799413 | update parser | 2024-03-10 13:35:20 +08:00 |
hiyouga | 8664262cde | support layerwise galore | 2024-03-10 00:24:11 +08:00 |
hiyouga | 18ffce36b5 | fix #2732 | 2024-03-09 22:37:16 +08:00 |
hiyouga | bdb496644c | allow non-packing pretraining | 2024-03-09 22:21:46 +08:00 |
hiyouga | 412c52e325 | fix #2766 | 2024-03-09 21:35:24 +08:00 |
hiyouga | af0e370fb1 | use default arg for freeze tuning | 2024-03-09 06:08:48 +08:00 |
hiyouga | 393c2de27c | update hardware requirements | 2024-03-09 03:58:18 +08:00 |
hiyouga | e8dd38b7fd | fix #2756 , patch #2746 | 2024-03-09 02:01:26 +08:00 |
hoshi-hiyouga | 516d0ddc66 | Merge pull request #2746 from stephen-nju/main: fix deepspeed ppo RuntimeError | 2024-03-09 01:37:00 +08:00 |
hiyouga | 10be2f0ecc | fix aqlm version | 2024-03-09 00:09:09 +08:00 |
stephen_zhu | aa71571b77 | update | 2024-03-08 12:47:44 +08:00 |
stephen | cdb7f82869 | fix ppo runtime error | 2024-03-08 11:48:26 +08:00 |
hiyouga | 5d956e2a51 | fix chat engine, update webui | 2024-03-08 03:01:53 +08:00 |
hiyouga | 0ac6b40a47 | update galore args | 2024-03-08 01:17:32 +08:00 |
hiyouga | 33a4c24a8a | fix galore | 2024-03-08 00:44:51 +08:00 |
hiyouga | 57452a4aa1 | add Yi-9B model | 2024-03-07 23:11:57 +08:00 |
hiyouga | 28f7862188 | support galore | 2024-03-07 22:41:36 +08:00 |
hiyouga | d07ad5cc1c | support vllm | 2024-03-07 20:26:31 +08:00 |
hiyouga | f74f804a71 | fix #2735 | 2024-03-07 16:15:53 +08:00 |
hoshi-hiyouga | 2185855bdb | Merge pull request #2730 from cx2333-gt/main: fix flash_attn in train_web | 2024-03-07 14:37:18 +08:00 |
cx2333 | 94b7a1b915 | revert choice name | 2024-03-07 14:28:55 +08:00 |
hiyouga | 921ee82267 | fix chatglm3 template | 2024-03-07 14:26:16 +08:00 |
cx2333 | a8889498fa | fix flash_attn in train_web | 2024-03-07 10:13:55 +08:00 |
hiyouga | 0048a2021e | tiny fix | 2024-03-06 17:25:08 +08:00 |
hiyouga | 3e84f430b1 | export use balanced gpu | 2024-03-06 16:33:14 +08:00 |
hiyouga | 9658c63cd9 | fix add tokens | 2024-03-06 15:04:02 +08:00 |
hiyouga | 3016e65657 | fix version checking | 2024-03-06 14:51:51 +08:00 |
hiyouga | e0c47358f9 | fix arg dtype | 2024-03-05 20:53:30 +08:00 |
hiyouga | 259af60d28 | improve aqlm optim | 2024-03-05 20:49:50 +08:00 |
hiyouga | d3d3dac707 | optimize aqlm training | 2024-03-05 18:35:41 +08:00 |
hiyouga | ddf352f861 | fix dora inference | 2024-03-05 11:51:41 +08:00 |
hiyouga | e5edcf440f | fix export model | 2024-03-05 11:05:41 +08:00 |
hiyouga | 9e56eaf2d3 | auto set chat template | 2024-03-05 02:41:20 +08:00 |
hiyouga | 24a79bd50f | update readme | 2024-03-04 19:29:26 +08:00 |
hiyouga | cda2ff8727 | fix export on cpu device | 2024-03-04 17:35:09 +08:00 |
hiyouga | 9c10854b46 | fix sub-process error in thread | 2024-03-03 15:04:35 +08:00 |
hiyouga | 894d183214 | update readme, add starcoder2, cosmopedia | 2024-03-03 01:01:46 +08:00 |
hiyouga | 4e5fae2fac | fix #2649 | 2024-03-01 13:02:41 +08:00 |