Commit Graph

150 Commits

Author SHA1 Message Date
hiyouga
1817ffc86f fix rlhf callback 2023-11-16 03:26:19 +08:00
hiyouga
856522a3df fix bug in PPO training 2023-11-16 02:32:54 +08:00
hiyouga
ce78303600 support full-parameter PPO 2023-11-16 02:08:04 +08:00
hiyouga
4907452d95 support multiple modules in freeze training #1514 2023-11-15 17:08:18 +08:00
hiyouga
d125ef5535 fix #1494 2023-11-14 18:07:20 +08:00
hiyouga
442aefb925 refactor evaluation, upgrade trl to 0.7.4 2023-11-13 22:20:35 +08:00
hiyouga
a0c31c68c4 add todo 2023-11-10 14:38:18 +08:00
hiyouga
415bca900e tiny fix 2023-11-09 17:20:49 +08:00
Yanqing
3684dffa14 Update finetuning_args.py: update the lora_target names for chatglm/falcon/bloom 2023-11-09 17:04:40 +08:00
hiyouga
01260d9754 fix ppo train and dpo eval 2023-11-07 22:48:51 +08:00
hiyouga
479d0af2dc delete file 2023-11-07 16:20:12 +08:00
hiyouga
b2a60905f3 upgrade peft, fix #1088 #1411 2023-11-07 16:13:36 +08:00
hiyouga
cc8ffa10d8 update data readme (zh) 2023-11-02 23:42:49 +08:00
hiyouga
a837172413 support sharegpt format, add datasets 2023-11-02 23:10:04 +08:00
hiyouga
dff128c7e3 fix #1356 2023-11-02 16:51:52 +08:00
hiyouga
083787dbfe fix #1325 2023-11-01 23:38:49 +08:00
hiyouga
3fe7df628d support dataset cache 2023-10-26 21:48:45 +08:00
hiyouga
7b4acf7265 reimplement neftune 2023-10-22 16:15:08 +08:00
anvie
57fb40aa04 add NEFTune optimization 2023-10-21 13:24:10 +07:00
hiyouga
b665e9e133 fix #1232 2023-10-20 23:28:52 +08:00
hiyouga
7a11a42dfd fix #1218 2023-10-19 16:17:41 +08:00
hiyouga
ea82f8a82a refactor export, fix #1190 2023-10-15 16:01:48 +08:00
hiyouga
af18b0dce7 fix #1184 2023-10-14 19:20:11 +08:00
hiyouga
11bd271364 fix ppo args 2023-10-11 23:40:50 +08:00
hiyouga
2818af0b09 refactor model_dtype, fix PPO trainer 2023-10-11 23:16:01 +08:00
hiyouga
84b7486885 fix layer norm dtype 2023-09-28 00:25:55 +08:00
hiyouga
620efe1d8d refactor finetuning Args 2023-09-27 22:28:06 +08:00
hiyouga
90375f600d support LongLoRA 2023-09-27 21:55:50 +08:00
hiyouga
338b8664ed fix #944 2023-09-21 19:51:02 +08:00
hiyouga
d8aa1404be support FlashAttention2 2023-09-10 20:43:56 +08:00
hiyouga
8ea32e4046 change to right-padding, update reward score #803 2023-09-08 20:04:31 +08:00
hiyouga
a9d1fb72f7 refactor dataset_attr, add eos in pt, fix #757 2023-09-01 19:00:45 +08:00
codemayq
ba94c8729d add stage in DatasetAttr 2023-08-23 20:54:53 +08:00
hiyouga
4318347d3f update template 2023-08-22 19:46:09 +08:00
hiyouga
53e33418d0 support ppo score norm (trl 0.5.1.dev required) 2023-08-18 12:02:42 +08:00
hiyouga
9020524418 fix PPO trainer #551, update readme 2023-08-18 11:43:10 +08:00
hiyouga
7407d9daa1 fix system prompt 2023-08-16 01:35:52 +08:00
hiyouga
fa940c17b8 support rope scaling, fix #475 #476 #478 2023-08-12 20:46:27 +08:00
hiyouga
a48cb0d474 Release v0.1.6 2023-08-11 23:25:57 +08:00
hiyouga
3ec4351cfd support DPO training (2305.18290) 2023-08-11 03:02:53 +08:00
jiongxuc
3e000c2b60 huggingface login for projects must login while running 2023-08-10 14:57:12 +08:00
hiyouga
d86ea314a1 support val set in streaming mode 2023-08-09 23:00:26 +08:00
hiyouga
5453b93db0 update args spec 2023-08-07 15:23:35 +08:00
hiyouga
69744c17e8 support interleave probs 2023-08-04 21:27:35 +08:00
hiyouga
87f8f830e2 support Qwen-7B, fix InternLM-7B inference 2023-08-03 15:53:32 +08:00
hiyouga
0411a4b3e1 support streaming data, fix #284 #274 #268 2023-07-31 23:33:00 +08:00
hiyouga
513e1f1ec9 Update data_args.py 2023-07-28 17:42:41 +08:00
hiyouga
8f7819fcaa fix #194 2023-07-19 17:07:33 +08:00
hiyouga
657cf0f55a create chat model 2023-07-15 19:26:20 +08:00
hiyouga
f751376613 modify code structure 2023-07-15 16:54:28 +08:00