13 Commits

Author SHA1 Message Date
hiyouga
4966bd7911 support GPTQ tuning #729 #1481 #1545, fix chatglm template #1453 #1480 #1569
Former-commit-id: 9ea9380145
2023-11-20 22:52:11 +08:00
hiyouga
f06c4c8f7a update ppo trainer
Former-commit-id: 5021062493
2023-11-20 21:39:15 +08:00
hoshi-hiyouga
d72f123851 Merge pull request #1553 from hannlp/hans
Change the default argument settings for PPO training
Former-commit-id: 48211e3799
2023-11-20 20:32:55 +08:00
hiyouga
682d81caa9 fix #1567
Former-commit-id: 99a3f06377
2023-11-20 18:46:36 +08:00
hiyouga
a53afb27eb fix #1263
Former-commit-id: 065bfaeed4
2023-11-19 16:05:18 +08:00
hiyouga
48d6d925f7 fix #1558
Former-commit-id: 1740131d63
2023-11-19 14:15:47 +08:00
Yuchen Han
a419122179 Update workflow.py
Former-commit-id: eeb5249d0b
2023-11-17 00:16:27 -08:00
hiyouga
0ed0b8f9c5 fix bug in freeze tuning
Former-commit-id: ff52b1779c
2023-11-16 14:25:11 +08:00
hiyouga
678052a7ef fix rlhf callback
Former-commit-id: 1817ffc86f
2023-11-16 03:26:19 +08:00
hiyouga
b71da932eb fix bug in PPO training
Former-commit-id: 856522a3df
2023-11-16 02:32:54 +08:00
hiyouga
eb5a852dd5 fix import bug
Former-commit-id: 35b91ea34c
2023-11-16 02:27:03 +08:00
hiyouga
f441932bd1 support full-parameter PPO
Former-commit-id: ce78303600
2023-11-16 02:08:04 +08:00
hiyouga
06a4820836 disentangle model from tuner and rename modules
Former-commit-id: 4736344eb1
2023-11-15 16:29:09 +08:00