8 Commits

Author SHA1 Message Date
hiyouga
a423274fd9 support function calling
Former-commit-id: 66533b3f65babf2429c92c0f8fafe4eff5e0ff63
2024-01-18 09:54:23 +08:00
hiyouga
f902b0d420 refactor adapter hparam
Former-commit-id: f82aece9ebd6df83a7a005cc7cbbcec07fa6e14d
2023-12-15 20:53:11 +08:00
hiyouga
60aea7521b ppo support rm server
Former-commit-id: 20b0edf16f5b42cb2c4a795674647afb68cb3a4a
2023-12-03 21:38:51 +08:00
hiyouga
29545d0e5e implement rm server #1543
Former-commit-id: 2e5bb6888c86079493456c2ddd525f8c52b9963e
2023-12-03 20:52:54 +08:00
hiyouga
6889f044fb fix #1558
Former-commit-id: 263b2b24c8a649b51fa5ae768a24e67def8e0e96
2023-11-19 14:15:47 +08:00
hiyouga
f9d4e37b3c fix bug in freeze tuning
Former-commit-id: f6b436a08421ca17d64abc51497f4aa43729a43b
2023-11-16 14:25:11 +08:00
hiyouga
e017266b98 fix bug in PPO training
Former-commit-id: 2e99f0e53ce6de0acbcab85dd50aef874e8c6336
2023-11-16 02:32:54 +08:00
hiyouga
f81a8a5e5c fix import bug
Former-commit-id: 2356029cdd120d5f7bf630b80681ce8c53bff90d
2023-11-16 02:27:03 +08:00