101 Commits

Author SHA1 Message Date
hiyouga
7f12aedc08 enable cutoff len
Former-commit-id: f1067d2b58
2024-01-18 12:25:42 +08:00
hiyouga
4e3bfb799d support function calling
Former-commit-id: d9f1cae351
2024-01-18 09:54:23 +08:00
hiyouga
a52aafdbdc tiny update
Former-commit-id: 5b93d545e2
2023-12-25 18:29:34 +08:00
hiyouga
1af13cb737 add models
Former-commit-id: 709ac8870a
2023-12-18 19:09:31 +08:00
hiyouga
f0f9d253d8 support autogptq in llama board #246
Former-commit-id: 71389be37c
2023-12-16 16:31:30 +08:00
hiyouga
1a0bdd305c support system column #1765
Former-commit-id: 0a9c6e0146
2023-12-12 19:45:59 +08:00
hiyouga
cefc0b2f03 fix modelscope data hub
Former-commit-id: d5b2c57a35
2023-12-12 18:33:06 +08:00
hoshi-hiyouga
b67085e13a Merge branch 'main' into feat/support_ms
Former-commit-id: 6382efec52
2023-12-12 17:55:32 +08:00
xingjun.wang
e331e8c200 modify guanaco
Former-commit-id: e80a989d49
2023-12-12 15:00:37 +08:00
xingjun.wang
277790d868 update dataset info
Former-commit-id: 73b50a26b9
2023-12-12 14:53:59 +08:00
xingjun.wang
879209829e update args for MsDataset.load
Former-commit-id: 09533e95ed
2023-12-12 13:02:54 +08:00
xingjun.wang
9f17d36ccf add new datasets
Former-commit-id: fe4acc66b0
2023-12-12 12:44:15 +08:00
xingjun.wang
92fb73abd4 add open orca
Former-commit-id: 0ce18a3782
2023-12-12 12:34:04 +08:00
hiyouga
b641e9e97e fix #1784
Former-commit-id: 28d5de7e78
2023-12-09 20:53:18 +08:00
yuze.zyz
9c30cdb53d fix typo
Former-commit-id: e4cf2a75ca
2023-12-08 18:13:26 +08:00
yuze.zyz
c523613f0a support ms dataset
Former-commit-id: 9c2247d700
2023-12-08 18:00:57 +08:00
hiyouga
9a6b694e12 fix #1696
Former-commit-id: bf6f6aeefe
2023-12-01 15:34:50 +08:00
Marco
a26f68ba47 Update dataset_info.json
Added the Nectar dataset, already preprocessed and split into SFT and RL subsets. I added a preprompt to each instruction, since this has been seen to improve instruction following.

Former-commit-id: 9468ee9012
2023-11-30 16:21:34 +01:00
hiyouga
303956cbb9 update dataset
Former-commit-id: 7b1aa6f63c
2023-11-17 23:19:12 +08:00
hiyouga
f441932bd1 support full-parameter PPO
Former-commit-id: ce78303600
2023-11-16 02:08:04 +08:00
hiyouga
38755bced7 add template, modify datasets
Former-commit-id: 386f590209
2023-11-09 15:53:23 +08:00
hiyouga
b2bf10661b update data readme
Former-commit-id: 2b5e33c338
2023-11-03 00:15:23 +08:00
hiyouga
a9db89a025 update data readme (zh)
Former-commit-id: cc8ffa10d8
2023-11-02 23:42:49 +08:00
hiyouga
a1b0655457 support sharegpt format, add datasets
Former-commit-id: a837172413
2023-11-02 23:10:04 +08:00
hiyouga
1cd0ea1f13 add MathInstruct dataset
Former-commit-id: 026af87e7f
2023-09-13 22:30:14 +08:00
hiyouga
a4fd976048 refactor dataset_attr, add eos in pt, fix #757
Former-commit-id: a9d1fb72f7
2023-09-01 19:00:45 +08:00
codemayq
d9b9d9d1fe add ad gen dataset
Former-commit-id: 604f85487b
2023-08-27 20:35:32 +08:00
codemayq
b032dc4c4e add readme for dataset
Former-commit-id: cece66d48a
2023-08-23 19:55:45 +08:00
codemayq
4b29d9d2b0 add dataset stage and filter dataset when stage chosen in webui
Former-commit-id: c0e4d1e81b
2023-08-23 18:54:23 +08:00
hiyouga
802494e20a update template
Former-commit-id: 4318347d3f
2023-08-22 19:46:09 +08:00
Peter Pan
23443e9696 add rm dataset explanation
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>

Former-commit-id: b0ca8fe634
2023-08-22 01:33:59 -04:00
hiyouga
abdfa26d06 support DPO training (2305.18290)
Former-commit-id: 3ec4351cfd
2023-08-11 03:02:53 +08:00
hiyouga
4898a0a865 restore from git lfs
Former-commit-id: b9cdff41bb
2023-08-01 16:33:25 +08:00
hiyouga
3b8e33d91c use git lfs
Former-commit-id: 82e793ddb4
2023-08-01 10:14:08 +08:00
hiyouga
ba911f988d update dataset
Former-commit-id: f5c2ccdde4
2023-07-26 17:05:12 +08:00
hiyouga
d46c136c0e update dataset
Former-commit-id: 182b425043
2023-07-23 20:01:43 +08:00
hiyouga
261ca840d0 update readme, fix web ui postprocess
Former-commit-id: 035c966d5c
2023-07-22 14:29:22 +08:00
mrhan1993
cdd887908c Add Chinese README based on GLM Efficient Tuning; add server_port to the web UI
Former-commit-id: 9f0b57b370
2023-07-21 16:57:58 +08:00
hiyouga
fa47c99fa9 add datasets
Former-commit-id: 7159bc54ed
2023-07-19 20:59:15 +08:00
hiyouga
2b0fced03b fix Baichuan-13B
Former-commit-id: 08439d29b2
2023-07-13 23:08:45 +08:00
zxbsmk
3b15aacf02 Support for WebNovel dataset
Former-commit-id: 4955dc9eed
2023-07-12 17:29:47 +08:00
hiyouga
92070c3d7a add open assistant dataset
Former-commit-id: 3154fec979
2023-06-28 23:09:33 +08:00
hiyouga
9155401bf9 add belle multiturn dataset
Former-commit-id: 334d1a6d26
2023-06-16 20:01:16 +08:00
hiyouga
1fbda5d139 support RM metrics, add generating Args
Former-commit-id: cec6524d6b
2023-06-12 15:48:48 +08:00
BUAADreamer
465264f852 update json line file to .jsonl
Former-commit-id: e3b53a67c7
2023-06-11 18:59:19 +08:00
BUAADreamer
c4128832e5 add some
Former-commit-id: 676d910260
2023-06-11 18:55:53 +08:00
BUAADreamer
b1c6ee9cf5 add code for reading from multi files in one directory
Former-commit-id: a2af9df5a9
2023-06-10 16:27:30 +08:00
BUAADreamer
53727aee3e add code for reading from multi files in one directory
Former-commit-id: 3dd5f9a874
2023-06-10 15:53:47 +08:00
hiyouga
ddb456bbcb remove dummy code
Former-commit-id: a72492e649
2023-05-30 16:28:00 +08:00
hiyouga
4c7c96e656 add pre-training script
Former-commit-id: 8ff96509fa
2023-05-29 21:37:22 +08:00