14 Commits

| Author | SHA1 | Message | Former-commit-id | Date |
|---|---|---|---|---|
| hiyouga | 61960189b2 | fix #1789 | 4571068e1e00dc234c9131185fe0924c726add84 | 2024-01-09 18:31:27 +08:00 |
| hiyouga | d0946f08db | fix ppo trainer | 5431be42f9c43095d478f2250fac64ef189eb3ad | 2023-12-28 18:09:28 +08:00 |
| hiyouga | 8154b4bdf6 | fix #1742 | 870426ff70c060213ac283b10a9b1f4bf71679ef | 2023-12-16 20:50:45 +08:00 |
| hiyouga | 027caabbb6 | fix ppo trainer save logic | d3dccd0693ede18a99f04780f2fd6e3a89810405 | 2023-12-04 19:00:19 +08:00 |
| hiyouga | 6493558c3b | fix bug | 8b681ee273c28813c599d9d55b2a3540c8ac257d | 2023-12-03 21:40:40 +08:00 |
| hiyouga | 64eead3fb1 | ppo support rm server | 747db4017291b0eb91946f57011bb31659056037 | 2023-12-03 21:38:51 +08:00 |
| hiyouga | 3d291a82d3 | fix #1597 | 327d7f7efe1fefe4bf4646c07fc4917a42c13383 | 2023-11-30 21:47:06 +08:00 |
| hiyouga | ba6d290d0b | fix #1668 | 1585962eb7ed042890d4c56422aae749c669dda8 | 2023-11-30 21:02:00 +08:00 |
| hiyouga | ecfc7d1b50 | fix #1658 | 77d1b14fc2d9703d15bbd879f67df037db9fbb28 | 2023-11-28 20:57:24 +08:00 |
| hiyouga | f06c4c8f7a | update ppo trainer | 5021062493ed63ad1f6133cfb543e4e7f528d2cc | 2023-11-20 21:39:15 +08:00 |
| hiyouga | 682d81caa9 | fix #1567 | 99a3f06377d2886c4000ce7e3583b12ca965534d | 2023-11-20 18:46:36 +08:00 |
| hiyouga | 678052a7ef | fix rlhf callback | 1817ffc86fe3463ea91e9359c0e3611979a9d53e | 2023-11-16 03:26:19 +08:00 |
| hiyouga | f441932bd1 | support full-parameter PPO | ce783036001397a20b0b4c5da2fea6d0c03389d2 | 2023-11-16 02:08:04 +08:00 |
| hiyouga | 06a4820836 | disentangle model from tuner and rename modules | 4736344eb1595ee023a50d49e8118f4eee46305f | 2023-11-15 16:29:09 +08:00 |