hiyouga | adf2730d1d | fix #1567 (Former-commit-id: 8c01ffe8d277d49a413571e0669f460c8d0802bf) | 2023-11-20 18:46:36 +08:00
hiyouga | cfad41b901 | fix #1263 (Former-commit-id: faff5d32621f187ebd3124d7ade04e3fa437c53e) | 2023-11-19 16:05:18 +08:00
hiyouga | 6889f044fb | fix #1558 (Former-commit-id: 263b2b24c8a649b51fa5ae768a24e67def8e0e96) | 2023-11-19 14:15:47 +08:00
Yuchen Han | 6af7107938 | Update workflow.py (Former-commit-id: f70b7ffe6442217a222e0ef797c407f259a13886) | 2023-11-17 00:16:27 -08:00
hiyouga | f9d4e37b3c | fix bug in freeze tuning (Former-commit-id: f6b436a08421ca17d64abc51497f4aa43729a43b) | 2023-11-16 14:25:11 +08:00
hiyouga | de3a84ac59 | fix rlhf callback (Former-commit-id: f5485452d660caef56474cb7dc37abbe4f34599e) | 2023-11-16 03:26:19 +08:00
hiyouga | e017266b98 | fix bug in PPO training (Former-commit-id: 2e99f0e53ce6de0acbcab85dd50aef874e8c6336) | 2023-11-16 02:32:54 +08:00
hiyouga | f81a8a5e5c | fix import bug (Former-commit-id: 2356029cdd120d5f7bf630b80681ce8c53bff90d) | 2023-11-16 02:27:03 +08:00
hiyouga | 7a3a0144a5 | support full-parameter PPO (Former-commit-id: 4af967d69475e1c9fdf1a7983cd6b83bd431abff) | 2023-11-16 02:08:04 +08:00
hiyouga | 09a4474e7f | disentangle model from tuner and rename modules (Former-commit-id: 02cbf91e7e424f8379c1fed01b82a5f7a83b6947) | 2023-11-15 16:29:09 +08:00