Commit Graph

62 Commits

Author SHA1 Message Date
Mark Mueller
1d3598afa1 Slim Orca data parsing 2024-02-08 19:32:20 +01:00
Johann-Peter Hartmann
49c69ea4b9 WS fix 2024-02-06 20:13:04 +01:00
Johann-Peter Hartmann
1126563505 add ranking to dpo dataset 2024-02-06 20:12:36 +01:00
Johann-Peter Hartmann
870182c3a9 remove comma 2024-02-03 08:48:39 +01:00
Johann-Peter Hartmann
4e27950acb Merge branch 'hiyouga:main' into main 2024-01-31 14:05:52 +01:00
hiyouga
521ad76552 fix autoset attn impl, update data readme 2024-01-31 11:58:07 +08:00
Johann-Peter Hartmann
d9a8301ed4 Add support for german datasets 2024-01-30 10:18:01 +01:00
hiyouga
dbaaa4546e Update dataset_info.json 2024-01-23 00:10:32 +08:00
hiyouga
b2fb0eca56 fix #2282 and update tool prompt 2024-01-22 22:27:30 +08:00
hiyouga
486cc8d360 add array param format 2024-01-21 22:17:48 +08:00
hiyouga
487dee066f fix dataset 2024-01-18 12:59:30 +08:00
hiyouga
f1067d2b58 enable cutoff len 2024-01-18 12:25:42 +08:00
hiyouga
d9f1cae351 support function calling 2024-01-18 09:54:23 +08:00
hiyouga
5b93d545e2 tiny update 2023-12-25 18:29:34 +08:00
hiyouga
709ac8870a add models 2023-12-18 19:09:31 +08:00
hiyouga
71389be37c support autogptq in llama board #246 2023-12-16 16:31:30 +08:00
hiyouga
0a9c6e0146 support system column #1765 2023-12-12 19:45:59 +08:00
hiyouga
d5b2c57a35 fix modelscope data hub 2023-12-12 18:33:06 +08:00
hoshi-hiyouga
6382efec52 Merge branch 'main' into feat/support_ms 2023-12-12 17:55:32 +08:00
xingjun.wang
e80a989d49 modify guanaco 2023-12-12 15:00:37 +08:00
xingjun.wang
73b50a26b9 update dataset info 2023-12-12 14:53:59 +08:00
xingjun.wang
09533e95ed update args for MsDataset.load 2023-12-12 13:02:54 +08:00
xingjun.wang
fe4acc66b0 add new datasets 2023-12-12 12:44:15 +08:00
xingjun.wang
0ce18a3782 add open orca 2023-12-12 12:34:04 +08:00
hiyouga
28d5de7e78 fix #1784 2023-12-09 20:53:18 +08:00
yuze.zyz
e4cf2a75ca fix typo 2023-12-08 18:13:26 +08:00
yuze.zyz
9c2247d700 support ms dataset 2023-12-08 18:00:57 +08:00
hiyouga
bf6f6aeefe fix #1696 2023-12-01 15:34:50 +08:00
Marco
9468ee9012 Update dataset_info.json
Added the Nectar dataset, already preprocessed and divided into sft and rl, with a preprompt added to each instruction, since this has been seen to increase instruction following
2023-11-30 16:21:34 +01:00
hiyouga
7b1aa6f63c update dataset 2023-11-17 23:19:12 +08:00
hiyouga
ce78303600 support full-parameter PPO 2023-11-16 02:08:04 +08:00
hiyouga
386f590209 add template, modify datasets 2023-11-09 15:53:23 +08:00
hiyouga
2b5e33c338 update data readme 2023-11-03 00:15:23 +08:00
hiyouga
cc8ffa10d8 update data readme (zh) 2023-11-02 23:42:49 +08:00
hiyouga
a837172413 support sharegpt format, add datasets 2023-11-02 23:10:04 +08:00
hiyouga
026af87e7f add MathInstruct dataset 2023-09-13 22:30:14 +08:00
hiyouga
a9d1fb72f7 refactor dataset_attr, add eos in pt, fix #757 2023-09-01 19:00:45 +08:00
codemayq
604f85487b add ad gen dataset 2023-08-27 20:35:32 +08:00
codemayq
cece66d48a add readme for dataset 2023-08-23 19:55:45 +08:00
codemayq
c0e4d1e81b add dataset stage and filter dataset when stage chosen in webui 2023-08-23 18:54:23 +08:00
hiyouga
4318347d3f update template 2023-08-22 19:46:09 +08:00
Peter Pan
b0ca8fe634 add rm dataset explanation
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2023-08-22 01:33:59 -04:00
hiyouga
3ec4351cfd support DPO training (2305.18290) 2023-08-11 03:02:53 +08:00
hiyouga
b9cdff41bb restore from git lfs 2023-08-01 16:33:25 +08:00
hiyouga
82e793ddb4 use git lfs 2023-08-01 10:14:08 +08:00
hiyouga
f5c2ccdde4 update dataset 2023-07-26 17:05:12 +08:00
hiyouga
182b425043 update dataset 2023-07-23 20:01:43 +08:00
hiyouga
035c966d5c update readme, fix web ui postprocess 2023-07-22 14:29:22 +08:00
mrhan1993
9f0b57b370 Add Chinese README based on GLM Efficient Tuning; add server_port to web 2023-07-21 16:57:58 +08:00
hiyouga
7159bc54ed add datasets 2023-07-19 20:59:15 +08:00