27 Commits

Author SHA1 Message Date
hiyouga
43b2ede0f8 fix #2756 , patch #2746
Former-commit-id: 627d1c91e675f1d9ebf47bad123cbbf29821da4d
2024-03-09 02:01:26 +08:00
hiyouga
1e6fb6c8aa support galore
Former-commit-id: b67a4a46a88d83bb2a3459b3317b66cda15e0171
2024-03-07 22:41:36 +08:00
hiyouga
59a9a5994e fix #2649
Former-commit-id: 1c850de660c671d92f0bc63f230d338b60b7c0bd
2024-03-01 13:02:41 +08:00
stephen
823f618cba update project_kwargs for ppo config
Former-commit-id: 14f106962fc0a87802ae9ecffff00d52f7f5f046
2024-02-21 13:47:38 +08:00
hiyouga
de0ebab464 fix #2189
Former-commit-id: b3d81b229d376671e1c12229aeb487b0d84f2548
2024-02-04 00:47:37 +08:00
hiyouga
66e0e651b9 format style
Former-commit-id: 53b683531b83cd1d19de97c6565f16c1eca6f5e1
2024-01-20 20:15:56 +08:00
hiyouga
1750218057 fix tests
Former-commit-id: 23f97bd437424ef43b2b84743d56acc5d1ca70d5
2024-01-20 19:58:04 +08:00
hiyouga
a423274fd9 support function calling
Former-commit-id: 66533b3f65babf2429c92c0f8fafe4eff5e0ff63
2024-01-18 09:54:23 +08:00
hiyouga
73cab9d9d4 fix #2161
Former-commit-id: 9acd5a2b678cd07f8e3b48eca76c4cbacb559e37
2024-01-11 17:04:13 +08:00
hiyouga
4d6669c268 fix #1789
Former-commit-id: d86455f685fa531e651333e00b4fe54d895cf2e4
2024-01-09 18:31:27 +08:00
hiyouga
53d7c5109f fix ppo trainer
Former-commit-id: ca5b5823b03822ef899405d233a82396be997f44
2023-12-28 18:09:28 +08:00
hiyouga
790a31404a fix #1742
Former-commit-id: efbb32afdcf0d6aa4ca26f54c95f76dbb84f77dc
2023-12-16 20:50:45 +08:00
hiyouga
7b7bfea37d fix ppo trainer save logic
Former-commit-id: 5e70c41e4e12a1109570b0ff56346fe212c028ed
2023-12-04 19:00:19 +08:00
hiyouga
09f165d442 fix bug
Former-commit-id: 2fd7a8fc3134af66193a5e8db8fea35025f82de9
2023-12-03 21:40:40 +08:00
hiyouga
60aea7521b ppo support rm server
Former-commit-id: 20b0edf16f5b42cb2c4a795674647afb68cb3a4a
2023-12-03 21:38:51 +08:00
hiyouga
29545d0e5e implement rm server #1543
Former-commit-id: 2e5bb6888c86079493456c2ddd525f8c52b9963e
2023-12-03 20:52:54 +08:00
hiyouga
99ceee840e fix #1597
Former-commit-id: d77a3a79a0e854803a57af8ac6a7246691f69f70
2023-11-30 21:47:06 +08:00
hiyouga
8ed68301e3 fix #1668
Former-commit-id: bccc71259e703ca1e1d88169e385a026c4efa92e
2023-11-30 21:02:00 +08:00
hiyouga
0e6f4f981e fix #1658
Former-commit-id: 3126687c4820c34daa6a2e9e3bf9065ad59e92dc
2023-11-28 20:57:24 +08:00
hiyouga
28258aecd2 update ppo trainer
Former-commit-id: caa525a5c6f228b9ad71387d1fe4f1c2ffa2479e
2023-11-20 21:39:15 +08:00
hoshi-hiyouga
e585950c54 Merge pull request #1553 from hannlp/hans
Change the default argument settings for PPO training

Former-commit-id: 1b64678fa4979485f67c3bb1420dfdff6fcbc6e7
2023-11-20 20:32:55 +08:00
hiyouga
adf2730d1d fix #1567
Former-commit-id: 8c01ffe8d277d49a413571e0669f460c8d0802bf
2023-11-20 18:46:36 +08:00
Yuchen Han
6af7107938 Update workflow.py
Former-commit-id: f70b7ffe6442217a222e0ef797c407f259a13886
2023-11-17 00:16:27 -08:00
hiyouga
de3a84ac59 fix rlhf callback
Former-commit-id: f5485452d660caef56474cb7dc37abbe4f34599e
2023-11-16 03:26:19 +08:00
hiyouga
f81a8a5e5c fix import bug
Former-commit-id: 2356029cdd120d5f7bf630b80681ce8c53bff90d
2023-11-16 02:27:03 +08:00
hiyouga
7a3a0144a5 support full-parameter PPO
Former-commit-id: 4af967d69475e1c9fdf1a7983cd6b83bd431abff
2023-11-16 02:08:04 +08:00
hiyouga
09a4474e7f disentangle model from tuner and rename modules
Former-commit-id: 02cbf91e7e424f8379c1fed01b82a5f7a83b6947
2023-11-15 16:29:09 +08:00