xingjun.wang
b761416dc1
add open orca
...
Former-commit-id: 7994c809b385bdc2c19e1e5e6fa8680aa9f2b77d
2023-12-12 12:34:04 +08:00
yuze.zyz
0cacd5147a
fix typo
...
Former-commit-id: 29b07291e6b40e9f0a61632609465363291ae5c7
2023-12-08 18:13:26 +08:00
yuze.zyz
c2432b2e8d
support ms dataset
...
Former-commit-id: 98638b35dc24045ac17b9b01d08d3a02372acef3
2023-12-08 18:00:57 +08:00
hiyouga
1602fe2350
fix #1696
...
Former-commit-id: 722ae14a652af34d9b91f9459e613d7959ecaa7e
2023-12-01 15:34:50 +08:00
Marco
238379e64a
Update dataset_info.json
...
Added the Nectar dataset, already preprocessed and divided into sft and rl splits. I added a preprompt to each instruction, since this has been seen to increase instruction following.
Former-commit-id: 6336e247c1535f356194046607038245bc48464f
2023-11-30 16:21:34 +01:00
hiyouga
c7ab341fcd
update dataset
...
Former-commit-id: a310b22b446118d90dd73906847ed3d01a574b50
2023-11-17 23:19:12 +08:00
hiyouga
685d0c975a
support full-parameter PPO
...
Former-commit-id: 4af967d69475e1c9fdf1a7983cd6b83bd431abff
2023-11-16 02:08:04 +08:00
hiyouga
f697474e67
add template, modify datasets
...
Former-commit-id: 81e54beb4d0f792f4fd7f450643caaf10f2f0b7d
2023-11-09 15:53:23 +08:00
hiyouga
2659753600
update data readme
...
Former-commit-id: 6a65ef44ed58714c611da60b5af96b85352e8735
2023-11-03 00:15:23 +08:00
hiyouga
fb3d981496
update data readme (zh)
...
Former-commit-id: b32fb3a984c681732b82f6544d6c05a98c34cf4c
2023-11-02 23:42:49 +08:00
hiyouga
33c47f0ebe
support sharegpt format, add datasets
...
Former-commit-id: 202daf8987ccb7523be03ca535b572b5c9e65994
2023-11-02 23:10:04 +08:00
hiyouga
5daa358aab
add MathInstruct dataset
...
Former-commit-id: 3d1d4b47055739854cf9788a902607e1bbba3723
2023-09-13 22:30:14 +08:00
hiyouga
c5fcf5b3a5
refactor dataset_attr, add eos in pt, fix #757
...
Former-commit-id: 0feec9a830b917b36686b61938a66e842eccf930
2023-09-01 19:00:45 +08:00
codemayq
09f61befc8
add ad gen dataset
...
Former-commit-id: fcd0788aa4dda0cecc1420d369d371032a207810
2023-08-27 20:35:32 +08:00
codemayq
9ae6abb7b5
add readme for dataset
...
Former-commit-id: bdcb0ea40e726e4c5752f938b379ed9a18e7e1d0
2023-08-23 19:55:45 +08:00
codemayq
22cece8acb
add dataset stage and filter dataset when stage chosen in webui
...
Former-commit-id: 26e4136449a4df6028d834fd16a0f4a7c532759d
2023-08-23 18:54:23 +08:00
hiyouga
538c404cc0
update template
...
Former-commit-id: a95f3a4d62de1073a78125401cf4289ec0523156
2023-08-22 19:46:09 +08:00
Peter Pan
44777f77f8
add rm dataset explanation
...
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Former-commit-id: 1efb95025be6501f1b30b20e7c711d3590b5d1ee
2023-08-22 01:33:59 -04:00
hiyouga
7ada4f5f6f
support DPO training (2305.18290)
...
Former-commit-id: 6d98de148e4af63a7028dfaeb6cf86eb56a4488f
2023-08-11 03:02:53 +08:00
hiyouga
8f25642087
restore from git lfs
...
Former-commit-id: 0c734a37113b773ae7c0bc8b8d1af39b15bc0fb2
2023-08-01 16:33:25 +08:00
hiyouga
3dee98ebc6
use git lfs
...
Former-commit-id: 4886d0071751f68c5a2d926bd9fcee0c93337322
2023-08-01 10:14:08 +08:00
hiyouga
ed252565f9
update dataset
...
Former-commit-id: 4a044aabbd19c92a9ae93c1c30536f5086fd47f9
2023-07-26 17:05:12 +08:00
hiyouga
9802398c71
update dataset
...
Former-commit-id: 4fc2c3293d91d8464527ebd1ddabe572c8355616
2023-07-23 20:01:43 +08:00
hiyouga
1f8d45d37e
update readme, fix web ui postprocess
...
Former-commit-id: ba51ab3379100108f7b52a3c2444ccdd99e8a6ef
2023-07-22 14:29:22 +08:00
mrhan1993
097c15b657
Add Chinese README based on GLM Efficient Tuning; add server_port to web
...
Former-commit-id: 29e3acd23eafd891667d7a860ec544a5b05d3c33
2023-07-21 16:57:58 +08:00
hiyouga
e9de1951dd
add datasets
...
Former-commit-id: 02e4b47dea1b25905c61f2ace88bab112610f021
2023-07-19 20:59:15 +08:00
hiyouga
25182c4779
fix Baichuan-13B
...
Former-commit-id: 6d9d826b3246349454c68f4d13b862da4de986e2
2023-07-13 23:08:45 +08:00
zxbsmk
3f0ee46bd0
Support for WebNovel dataset
...
Former-commit-id: 655162f530784bc9374962c02d8b414872f83b2f
2023-07-12 17:29:47 +08:00
hiyouga
fe45c17b25
add open assistant dataset
...
Former-commit-id: 1694cf3078d04a14bce96da04b9d8c52176b1044
2023-06-28 23:09:33 +08:00
hiyouga
e52b201c13
add belle multiturn dataset
...
Former-commit-id: ac907ae1c37969df3cd09d4ab5f3f7f352eb259c
2023-06-16 20:01:16 +08:00
hiyouga
0da1b7d9ab
support RM metrics, add generating Args
...
Former-commit-id: c461c6190bc124e98dde7f3cf96a59ce40b26fb0
2023-06-12 15:48:48 +08:00
BUAADreamer
463e0762c4
update json line file to .jsonl
...
Former-commit-id: 85e7676c3c1422795a047ffa8587bd4063ad7511
2023-06-11 18:59:19 +08:00
BUAADreamer
ac00fcd114
add some
...
Former-commit-id: 6982a53ed1f6f9fa03e99623b98fff56bf00317e
2023-06-11 18:55:53 +08:00
BUAADreamer
a976cba730
add code for reading from multi files in one directory
...
Former-commit-id: 9b80cf08b9f0d4aee896b228fb76399e9a7c9d8b
2023-06-10 16:27:30 +08:00
BUAADreamer
2012cb5cbc
add code for reading from multi files in one directory
...
Former-commit-id: b7ebb83a96619e5111b0faa9da9d0feb8d9cdff0
2023-06-10 15:53:47 +08:00
hiyouga
f8d03f3aa9
remove dummy code
...
Former-commit-id: e6bc89d280945bbf48281107145c40a41d7cbd56
2023-05-30 16:28:00 +08:00
hiyouga
bb6f731461
add pre-training script
...
Former-commit-id: 935d58de2b3a2eadc4f0bed28c3ad7dee32e9fd5
2023-05-29 21:37:22 +08:00
hiyouga
54574f1dfa
Initial commit
...
Former-commit-id: 5ca8e1d63727e7bcb8cab16542c763c47e48184a
2023-05-28 18:09:04 +08:00