Commit Graph

33 Commits

Author SHA1 Message Date
hiyouga
cc249506a8 release v0.6.1
Former-commit-id: a59d823f554505b2e649e6e111b9dee8306d3ad8
2024-03-29 11:36:08 +08:00
hiyouga
5c8794ad1a fix #2982
Former-commit-id: e5e6a0c50c7a1c0052ed6b459450b9735ff2c9a1
2024-03-28 20:22:31 +08:00
hiyouga
27f5c967e4 update trainers
Former-commit-id: d0dd6eefed0b86895ed00a7cafb331e5193db645
2024-03-28 18:16:27 +08:00
hiyouga
2d12e88c23 fix #2777 #2895
Former-commit-id: 54d5f62d29456a8d9d0c0dd3d0bbfffe48935803
2024-03-20 17:59:45 +08:00
hiyouga
0aacc41252 support layerwise galore
Former-commit-id: d43a4da0947897d0be3f62fad3107754d4c89f2b
2024-03-10 00:24:11 +08:00
hiyouga
56565bdbd4 allow non-packing pretraining
Former-commit-id: 3fee5cc5a3db9ce874ad90f2500ec092d904bd4e
2024-03-09 22:21:46 +08:00
hiyouga
e16912b0c0 fix #2756 , patch #2746
Former-commit-id: 627d1c91e675f1d9ebf47bad123cbbf29821da4d
2024-03-09 02:01:26 +08:00
hiyouga
3f56782ffa support galore
Former-commit-id: b67a4a46a88d83bb2a3459b3317b66cda15e0171
2024-03-07 22:41:36 +08:00
hiyouga
10845a2fe7 fix #2649
Former-commit-id: 1c850de660c671d92f0bc63f230d338b60b7c0bd
2024-03-01 13:02:41 +08:00
stephen
8bceef81c0 update project_kwargs for ppo config
Former-commit-id: 14f106962fc0a87802ae9ecffff00d52f7f5f046
2024-02-21 13:47:38 +08:00
hiyouga
f14cadf12d fix #2189
Former-commit-id: b3d81b229d376671e1c12229aeb487b0d84f2548
2024-02-04 00:47:37 +08:00
hiyouga
c0e4eebf17 format style
Former-commit-id: 53b683531b83cd1d19de97c6565f16c1eca6f5e1
2024-01-20 20:15:56 +08:00
hiyouga
01c22ad7f8 fix tests
Former-commit-id: 23f97bd437424ef43b2b84743d56acc5d1ca70d5
2024-01-20 19:58:04 +08:00
hiyouga
a9fc7dbfa6 support function calling
Former-commit-id: 66533b3f65babf2429c92c0f8fafe4eff5e0ff63
2024-01-18 09:54:23 +08:00
hiyouga
ed2212f197 fix #2161
Former-commit-id: 9acd5a2b678cd07f8e3b48eca76c4cbacb559e37
2024-01-11 17:04:13 +08:00
hiyouga
9537bc7f3e fix #1789
Former-commit-id: d86455f685fa531e651333e00b4fe54d895cf2e4
2024-01-09 18:31:27 +08:00
hiyouga
bb4e62cbcd fix ppo trainer
Former-commit-id: ca5b5823b03822ef899405d233a82396be997f44
2023-12-28 18:09:28 +08:00
hiyouga
67e284b2e4 fix #1742
Former-commit-id: efbb32afdcf0d6aa4ca26f54c95f76dbb84f77dc
2023-12-16 20:50:45 +08:00
hiyouga
28ed4cb3f4 fix ppo trainer save logic
Former-commit-id: 5e70c41e4e12a1109570b0ff56346fe212c028ed
2023-12-04 19:00:19 +08:00
hiyouga
3c667217be fix bug
Former-commit-id: 2fd7a8fc3134af66193a5e8db8fea35025f82de9
2023-12-03 21:40:40 +08:00
hiyouga
3d200af1d2 ppo support rm server
Former-commit-id: 20b0edf16f5b42cb2c4a795674647afb68cb3a4a
2023-12-03 21:38:51 +08:00
hiyouga
74851e7033 implement rm server #1543
Former-commit-id: 2e5bb6888c86079493456c2ddd525f8c52b9963e
2023-12-03 20:52:54 +08:00
hiyouga
24dd0db807 fix #1597
Former-commit-id: d77a3a79a0e854803a57af8ac6a7246691f69f70
2023-11-30 21:47:06 +08:00
hiyouga
ca70d8393d fix #1668
Former-commit-id: bccc71259e703ca1e1d88169e385a026c4efa92e
2023-11-30 21:02:00 +08:00
hiyouga
d072f771d2 fix #1658
Former-commit-id: 3126687c4820c34daa6a2e9e3bf9065ad59e92dc
2023-11-28 20:57:24 +08:00
hiyouga
78e6ac0156 update ppo trainer
Former-commit-id: caa525a5c6f228b9ad71387d1fe4f1c2ffa2479e
2023-11-20 21:39:15 +08:00
hoshi-hiyouga
05fd97c637 Merge pull request #1553 from hannlp/hans
Change the default argument settings for PPO training

Former-commit-id: 1b64678fa4979485f67c3bb1420dfdff6fcbc6e7
2023-11-20 20:32:55 +08:00
hiyouga
4febd99b99 fix #1567
Former-commit-id: 8c01ffe8d277d49a413571e0669f460c8d0802bf
2023-11-20 18:46:36 +08:00
Yuchen Han
b6c80a4d43 Update workflow.py
Former-commit-id: f70b7ffe6442217a222e0ef797c407f259a13886
2023-11-17 00:16:27 -08:00
hiyouga
e83b0cbafb fix rlhf callback
Former-commit-id: f5485452d660caef56474cb7dc37abbe4f34599e
2023-11-16 03:26:19 +08:00
hiyouga
77b1ed4deb fix import bug
Former-commit-id: 2356029cdd120d5f7bf630b80681ce8c53bff90d
2023-11-16 02:27:03 +08:00
hiyouga
685d0c975a support full-parameter PPO
Former-commit-id: 4af967d69475e1c9fdf1a7983cd6b83bd431abff
2023-11-16 02:08:04 +08:00
hiyouga
5a206d54c9 disentangle model from tuner and rename modules
Former-commit-id: 02cbf91e7e424f8379c1fed01b82a5f7a83b6947
2023-11-15 16:29:09 +08:00