35 Commits

Author SHA1 Message Date
hiyouga 84af10cec9 update gradio, support multiple resp in api 2023-11-01 23:02:16 +08:00
hiyouga aff9363ce3 fix #1285 2023-10-26 16:34:52 +08:00
hiyouga 11bd271364 fix ppo args 2023-10-11 23:40:50 +08:00
hiyouga 2818af0b09 refactor model_dtype, fix PPO trainer 2023-10-11 23:16:01 +08:00
hiyouga b0b0138e1d fix #1026 2023-09-27 22:57:09 +08:00
hiyouga 338b8664ed fix #944 2023-09-21 19:51:02 +08:00
hiyouga 7ba57d5b14 fix ppo save model 2023-09-12 16:25:29 +08:00
hiyouga 0fbece85a7 update flashattn, fix ppo save model 2023-09-11 17:25:36 +08:00
hiyouga b218c271ed remove PeftTrainer 2023-09-10 22:23:23 +08:00
hiyouga a51b7c98ac fix lora target 2023-09-09 17:04:45 +08:00
hiyouga b34797a845 fix #761 2023-09-08 20:22:18 +08:00
hiyouga 8ea32e4046 change to right-padding, update reward score #803 2023-09-08 20:04:31 +08:00
hiyouga ed1c2c5557 add deepspeed check in PPO training 2023-09-07 19:12:40 +08:00
hiyouga 85b1f6632a fix baichuan templates 2023-09-07 18:54:14 +08:00
hiyouga a9d1fb72f7 refactor dataset_attr, add eos in pt, fix #757 2023-09-01 19:00:45 +08:00
hiyouga 0676497104 fix ppo trainer #551 2023-08-20 14:07:11 +08:00
hiyouga d75e377b0f tiny fix 2023-08-18 13:07:35 +08:00
hiyouga 9020524418 fix PPO trainer #551 , update readme 2023-08-18 11:43:10 +08:00
hiyouga 9f4c2adc9a fix ChatGLM2 ppo #527 #528 2023-08-18 00:34:59 +08:00
hiyouga be21fc83f9 fix generation bug #532 2023-08-17 22:21:34 +08:00
hiyouga d9e62711a3 fix generation 2023-08-16 22:39:54 +08:00
hiyouga af6c011fcb fix ChatGLM RLHF 2023-08-15 11:19:20 +08:00
hiyouga 3ec4351cfd support DPO training (2305.18290) 2023-08-11 03:02:53 +08:00
hiyouga 08f180e788 modify code structure 2023-08-02 23:17:36 +08:00
hiyouga 1d8a1878ea fix PPO trainer 2023-08-02 19:10:23 +08:00
hiyouga b5ba87952a update ppo trainer 2023-08-02 18:46:41 +08:00
hiyouga 286f7be346 fix memory leak of PPO trainer 2023-08-02 17:41:34 +08:00
hiyouga ac88ce5233 fix RM save model 2023-08-01 11:56:17 +08:00
hiyouga 0411a4b3e1 support streaming data, fix #284 #274 #268 2023-07-31 23:33:00 +08:00
hiyouga 29af67b015 fix API 2023-07-19 00:01:14 +08:00
hiyouga 12d8a8633f update webUI, fix #179 2023-07-18 15:35:17 +08:00
hiyouga f8193e8009 release v0.1.0 2023-07-18 00:18:25 +08:00
hiyouga 85c2210452 fix #175 2023-07-17 18:07:17 +08:00
hiyouga 22d9a9c2af fix callback 2023-07-15 17:18:16 +08:00
hiyouga f751376613 modity code structure 2023-07-15 16:54:28 +08:00