28 Commits

Author SHA1 Message Date
hiyouga
405df0f63d fix #761
Former-commit-id: b34797a845
2023-09-08 20:22:18 +08:00
hiyouga
9ed4bb63d4 change to right-padding, update reward score #803
Former-commit-id: 8ea32e4046
2023-09-08 20:04:31 +08:00
hiyouga
5030f05126 add deepspeed check in PPO training
Former-commit-id: ed1c2c5557
2023-09-07 19:12:40 +08:00
hiyouga
f74b980650 fix baichuan templates
Former-commit-id: 85b1f6632a
2023-09-07 18:54:14 +08:00
hiyouga
a4fd976048 refactor dataset_attr, add eos in pt, fix #757
Former-commit-id: a9d1fb72f7
2023-09-01 19:00:45 +08:00
hiyouga
570ccc3618 fix ppo trainer #551
Former-commit-id: 0676497104
2023-08-20 14:07:11 +08:00
hiyouga
9f1688924d tiny fix
Former-commit-id: d75e377b0f
2023-08-18 13:07:35 +08:00
hiyouga
b88f0b396c support ppo score norm (trl 0.5.1.dev required)
Former-commit-id: 53e33418d0
2023-08-18 12:02:42 +08:00
hiyouga
03edfd07e7 fix PPO trainer #551 , update readme
Former-commit-id: 9020524418
2023-08-18 11:43:10 +08:00
hiyouga
caf4a61e21 fix ChatGLM2 ppo #527 #528
Former-commit-id: 9f4c2adc9a
2023-08-18 00:34:59 +08:00
hiyouga
623a34b16f fix generation bug #532
Former-commit-id: be21fc83f9
2023-08-17 22:21:34 +08:00
hiyouga
048f99354f fix generation
Former-commit-id: d9e62711a3
2023-08-16 22:39:54 +08:00
hiyouga
a9ab8f71d7 fix ChatGLM RLHF
Former-commit-id: af6c011fcb
2023-08-15 11:19:20 +08:00
hiyouga
6c9b035c0e web UI integrating RLHF
Former-commit-id: ec94274ca1
2023-08-14 10:48:47 +08:00
hiyouga
abdfa26d06 support DPO training (2305.18290)
Former-commit-id: 3ec4351cfd
2023-08-11 03:02:53 +08:00
hiyouga
6404167ab7 support val set in streaming mode
Former-commit-id: d86ea314a1
2023-08-09 23:00:26 +08:00
hiyouga
4242897b78 modify code structure
Former-commit-id: 08f180e788
2023-08-02 23:17:36 +08:00
hiyouga
4b8e4398bc fix PPO trainer
Former-commit-id: 1d8a1878ea
2023-08-02 19:10:23 +08:00
hiyouga
569df8ccd6 update ppo trainer
Former-commit-id: b5ba87952a
2023-08-02 18:46:41 +08:00
hiyouga
ab739e72ea fix memory leak of PPO trainer
Former-commit-id: 286f7be346
2023-08-02 17:41:34 +08:00
hiyouga
c5ad96375e fix RM save model
Former-commit-id: ac88ce5233
2023-08-01 11:56:17 +08:00
hiyouga
e80b75b560 support streaming data, fix #284 #274 #268
Former-commit-id: 0411a4b3e1
2023-07-31 23:33:00 +08:00
hiyouga
18656a6316 fix API
Former-commit-id: 29af67b015
2023-07-19 00:01:14 +08:00
hiyouga
0b6f769971 update webUI, fix #179
Former-commit-id: 12d8a8633f
2023-07-18 15:35:17 +08:00
hiyouga
091805d38e release v0.1.0
Former-commit-id: f8193e8009
2023-07-18 00:18:25 +08:00
hiyouga
799524b37b fix #175
Former-commit-id: 85c2210452
2023-07-17 18:07:17 +08:00
hiyouga
70b5232f9a fix callback
Former-commit-id: 22d9a9c2af
2023-07-15 17:18:16 +08:00
hiyouga
a696148d6b modify code structure
Former-commit-id: f751376613
2023-07-15 16:54:28 +08:00