6 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| hiyouga | 678052a7ef | fix rlhf callback (Former-commit-id: 1817ffc86fe3463ea91e9359c0e3611979a9d53e) | 2023-11-16 03:26:19 +08:00 |
| hiyouga | b71da932eb | fix bug in PPO training (Former-commit-id: 856522a3df4bb9ddfaaa137119eceb9574873950) | 2023-11-16 02:32:54 +08:00 |
| hiyouga | eb5a852dd5 | fix import bug (Former-commit-id: 35b91ea34caade45dd51813b94da5177b852aa4c) | 2023-11-16 02:27:03 +08:00 |
| hiyouga | f441932bd1 | support full-parameter PPO (Former-commit-id: ce783036001397a20b0b4c5da2fea6d0c03389d2) | 2023-11-16 02:08:04 +08:00 |
| hiyouga | e30290444a | support multiple modules in freeze training #1514 (Former-commit-id: 4907452d955367ebe987e6deae4fd4213628f2b2) | 2023-11-15 17:08:18 +08:00 |
| hiyouga | 06a4820836 | disentangle model from tuner and rename modules (Former-commit-id: 4736344eb1595ee023a50d49e8118f4eee46305f) | 2023-11-15 16:29:09 +08:00 |