BUAADreamer
b6d78b2a64
merge data part to the text stream
...
Former-commit-id: c6dd89918feb25fe8c07857162421ad1706f791f
2024-04-25 19:19:59 +08:00
BUAADreamer
31bce63a10
add llava and instructblip
...
Former-commit-id: cfb485eddff0130422416b50c50e171fccc8103e
2024-04-25 00:22:43 +08:00
BUAADreamer
175b56bced
add multimodal LLM BLIP-2 and InstructBLIP
...
Former-commit-id: 4dcb11eab7bbeac866043d2a7c748b8d06fbd243
2024-04-23 18:45:43 +08:00
hiyouga
12290955d8
add dpo mix dataset
...
Former-commit-id: 6339edefff4eb23a4052fd273d1348f5ab59b47c
2024-04-20 01:31:38 +08:00
hiyouga
db42378f29
fix #3247
...
Former-commit-id: d1fb6c72b532bfd4ccd5b19f56708c8391fa53aa
2024-04-12 17:41:33 +08:00
li.yunhao
0330ba2ae6
fix pile dataset hf hub url
...
Former-commit-id: 9c2ef9cdf47e16985e07421b0bea414d161e2456
2024-03-30 16:06:10 +08:00
hiyouga
6646e18c02
add orca_dpo_pairs dataset
...
Former-commit-id: 3271af2afc90f10dcb101aeb9d7e4ef254d2dc0e
2024-03-20 20:09:06 +08:00
hiyouga
9ae1514a75
update readme, add starcoder2, cosmopedia
...
Former-commit-id: 894d183214417b10af64d6add7be082d63e8b1f3
2024-03-03 01:01:46 +08:00
hiyouga
3b16912235
update data
...
Former-commit-id: 32884523c577f329354decb4c916bf1f1bbc9dff
2024-03-02 19:37:18 +08:00
hiyouga
7e2d8b170a
fix #2533
...
Former-commit-id: 1630a4cb8f19a348014113421134b5730d52932f
2024-02-21 22:47:48 +08:00
hiyouga
62b78001b7
fix #2481
...
Former-commit-id: 22acab8aff8cadbba2a67e56af5701c0261ade49
2024-02-15 19:07:47 +08:00
hiyouga
db2051684b
improve aligner
...
Former-commit-id: 7d2dc83c5e2085da6273241269c9e9d7509ae51b
2024-02-10 16:39:19 +08:00
Mark Mueller
4bd7b8375e
Slim Orca data parsing
...
Former-commit-id: 1d3598afa10797ba0ce30d44f52e7994587c0ce8
2024-02-08 19:32:20 +01:00
Johann-Peter Hartmann
ace1770085
WS fix
...
Former-commit-id: 49c69ea4b97a2507819996dea41a755a29e35e79
2024-02-06 20:13:04 +01:00
Johann-Peter Hartmann
6ff4e9e62c
add ranking to dpo dataset
...
Former-commit-id: 1126563505924a2d7946fa3fad0d9d1756faf987
2024-02-06 20:12:36 +01:00
Johann-Peter Hartmann
77746ad86c
remove comma
...
Former-commit-id: 870182c3a9ce3168db5b40a45daebe33c3d6f0e1
2024-02-03 08:48:39 +01:00
Johann-Peter Hartmann
c264eb4793
Add support for German datasets
...
Former-commit-id: d9a8301ed46d821c3303b14966978e1165d12f2c
2024-01-30 10:18:01 +01:00
hiyouga
cd4d38e0cc
Update dataset_info.json
...
Former-commit-id: dbaaa4546ec681cfc84da015a67e2a9c79173e02
2024-01-23 00:10:32 +08:00
hiyouga
7f12aedc08
enable cutoff len
...
Former-commit-id: f1067d2b585cf24b3e48463692ac99a8222161c9
2024-01-18 12:25:42 +08:00
hiyouga
4e3bfb799d
support function calling
...
Former-commit-id: d9f1cae35150cce594a7abd96dd2beb811fa33f2
2024-01-18 09:54:23 +08:00
hiyouga
a52aafdbdc
tiny update
...
Former-commit-id: 5b93d545e2090d8d6db2cee3a047565f834e87f1
2023-12-25 18:29:34 +08:00
hiyouga
f0f9d253d8
support autogptq in llama board #246
...
Former-commit-id: 71389be37cb0f1a65db6e501e11ca14e615c1a24
2023-12-16 16:31:30 +08:00
hiyouga
1a0bdd305c
support system column #1765
...
Former-commit-id: 0a9c6e0146ebc71d5438c837463d6ab236e227c4
2023-12-12 19:45:59 +08:00
hiyouga
cefc0b2f03
fix modelscope data hub
...
Former-commit-id: d5b2c57a356539df9993e4774b856231eca8a6da
2023-12-12 18:33:06 +08:00
hoshi-hiyouga
b67085e13a
Merge branch 'main' into feat/support_ms
...
Former-commit-id: 6382efec52f6be3daa5db0bd280a96162009fca1
2023-12-12 17:55:32 +08:00
xingjun.wang
e331e8c200
modify guanaco
...
Former-commit-id: e80a989d49366bf08f62d212d329a90a02d8167e
2023-12-12 15:00:37 +08:00
xingjun.wang
277790d868
update dataset info
...
Former-commit-id: 73b50a26b9c6282f28df87338fa4057759c38f69
2023-12-12 14:53:59 +08:00
xingjun.wang
879209829e
update args for MsDataset.load
...
Former-commit-id: 09533e95edc5fa65a38b2f04c6d88506196021b3
2023-12-12 13:02:54 +08:00
xingjun.wang
9f17d36ccf
add new datasets
...
Former-commit-id: fe4acc66b0e2bd96c988315192beb161da2d51f8
2023-12-12 12:44:15 +08:00
xingjun.wang
92fb73abd4
add open orca
...
Former-commit-id: 0ce18a378255a1d075a38a364520ba7a1e56180f
2023-12-12 12:34:04 +08:00
hiyouga
b641e9e97e
fix #1784
...
Former-commit-id: 28d5de7e785f31b223a4646c9c1c770f43e187ec
2023-12-09 20:53:18 +08:00
yuze.zyz
9c30cdb53d
fix typo
...
Former-commit-id: e4cf2a75caac75cb6320350ba179b8e2dcd87366
2023-12-08 18:13:26 +08:00
yuze.zyz
c523613f0a
support ms dataset
...
Former-commit-id: 9c2247d700763f480d88a5dd46480cb32cfc174e
2023-12-08 18:00:57 +08:00
hiyouga
9a6b694e12
fix #1696
...
Former-commit-id: bf6f6aeefe65b4949633648b8711525c0029c001
2023-12-01 15:34:50 +08:00
Marco
a26f68ba47
Update dataset_info.json
...
Added the Nectar dataset, already preprocessed and split into SFT and RL parts. I added a preprompt to each instruction, since this has been seen to improve instruction following.
Former-commit-id: 9468ee9012bfe7124fc5cc2acebcfe03a6d0cdee
2023-11-30 16:21:34 +01:00
hiyouga
303956cbb9
update dataset
...
Former-commit-id: 7b1aa6f63c79c0d9cb5249fdb0d6a5f9a04f36bd
2023-11-17 23:19:12 +08:00
hiyouga
f441932bd1
support full-parameter PPO
...
Former-commit-id: ce783036001397a20b0b4c5da2fea6d0c03389d2
2023-11-16 02:08:04 +08:00
hiyouga
38755bced7
add template, modify datasets
...
Former-commit-id: 386f590209e466b51c17a7ac8cee55fc3ce928d7
2023-11-09 15:53:23 +08:00
hiyouga
a9db89a025
update data readme (zh)
...
Former-commit-id: cc8ffa10d877f5893f3940204e5bec6f3266559f
2023-11-02 23:42:49 +08:00
hiyouga
a1b0655457
support sharegpt format, add datasets
...
Former-commit-id: a8371724130db2fbd7273a480e2acb251e382aec
2023-11-02 23:10:04 +08:00
hiyouga
1cd0ea1f13
add MathInstruct dataset
...
Former-commit-id: 026af87e7fce091a0cda1afd6df3d6ab6189de9a
2023-09-13 22:30:14 +08:00
hiyouga
a4fd976048
refactor dataset_attr, add eos in pt, fix #757
...
Former-commit-id: a9d1fb72f791ae57a4d12f4e3a7e2abccf6a7077
2023-09-01 19:00:45 +08:00
codemayq
d9b9d9d1fe
add ad gen dataset
...
Former-commit-id: 604f85487b46b3eb01b68cb2cc6535b7cb5527a7
2023-08-27 20:35:32 +08:00
codemayq
4b29d9d2b0
add dataset stage and filter dataset when stage chosen in webui
...
Former-commit-id: c0e4d1e81b41c9a36291d8bee46d7d807c898c21
2023-08-23 18:54:23 +08:00
hiyouga
abdfa26d06
support DPO training (2305.18290)
...
Former-commit-id: 3ec4351cfdaf2aefcc7d13345e19d79874ed61d3
2023-08-11 03:02:53 +08:00
hiyouga
4898a0a865
restore from git lfs
...
Former-commit-id: b9cdff41bbb6084380606cde6c875c994b6b1868
2023-08-01 16:33:25 +08:00
hiyouga
3b8e33d91c
use git lfs
...
Former-commit-id: 82e793ddb42d4f1369516dde63dbe4fed28f2e1d
2023-08-01 10:14:08 +08:00
hiyouga
ba911f988d
update dataset
...
Former-commit-id: f5c2ccdde45bfa5648443a901b2ac397d532eceb
2023-07-26 17:05:12 +08:00
hiyouga
d46c136c0e
update dataset
...
Former-commit-id: 182b42504399d2755897b9737db1d36655a0fa50
2023-07-23 20:01:43 +08:00
hiyouga
fa47c99fa9
add datasets
...
Former-commit-id: 7159bc54ed0f1bba974662a87ba5039d9aacadee
2023-07-19 20:59:15 +08:00