125 Commits

Author SHA1 Message Date
hiyouga
89564e90d7 update data
Former-commit-id: 9ab0401948d02d029134aa669c378e2ad80fb9fb
2024-06-19 02:48:43 +08:00
hiyouga
9e5988717d tiny fix
Former-commit-id: 344b9a36b2e0b60ee61fba171b35a391e3517fed
2024-06-18 23:32:18 +08:00
Eli Costa
6bbb8b4cd8 Add Magpie and Webinstruct dataset samples
Adds samples of two datasets that are claimed to give superior performance: Magpie (from Allen AI) and Webinstruct (from TIGER-Lab).

Former-commit-id: 74e49cca957d0bacd2c1d688e995a7370bef69f7
2024-06-15 19:31:56 -03:00
hiyouga
e89d1b1ec3 add neo-sft dataset
Former-commit-id: c7a5620ccc72b7574255ea764693ccb866c48263
2024-06-13 01:00:56 +08:00
hiyouga
3547a26f86 add ultrafeedback and fineweb #4085 #4132
Former-commit-id: 12d79f89c5082eb29842b501e1cb88433a248ba3
2024-06-08 02:42:34 +08:00
hoshi-hiyouga
9b6bdf9449 Merge pull request #3829 from seanzhang-zhichen/add_dataset_sample_num
Add dataset sample num

Former-commit-id: 483eb47e5d670e23fb713b942f6890b8259f4363
2024-05-30 00:25:45 +08:00
hoshi-hiyouga
21e7979837 Update README_zh.md
Former-commit-id: c8ae7e0e6571c7ca2e526da3e8adda5f8c9948f1
2024-05-30 00:04:47 +08:00
hoshi-hiyouga
eb7ee82f16 Update README.md
Former-commit-id: 3761d7d5dd97ce2fe0098284e6d4821fc0d63d30
2024-05-30 00:04:26 +08:00
hiyouga
b88ecd71fd fix full/freeze tuning for mllm
Former-commit-id: 08564838bd02651668845ed74e2e60561e5b6d8c
2024-05-27 20:37:57 +08:00
BUAADreamer
f9ced0480e Merge branch 'main' of https://github.com/BUAADreamer/LLaMA-Factory
Former-commit-id: 576b0206c27f93ffe19e3b7e6df58a3cd2abbb1d
2024-05-27 20:11:23 +08:00
BUAADreamer
4a958ab909 Merge branch 'hiyouga:main' into main
Former-commit-id: e2022ce4e90b115fb8271ef0f6bf05e8f39c997f
2024-05-27 20:10:58 +08:00
BUAADreamer
ea78a629ba remove mllm_pt_demo.json
Former-commit-id: f665342a2752ffb5d715f134603d84e5228f55dc
2024-05-27 20:10:31 +08:00
hiyouga
db569a2d61 add llava 1k datasets
Former-commit-id: 08bd0440b52dbe2e6d28323900ca1a07751605f9
2024-05-27 19:57:33 +08:00
seanzhang-zhichen
9c8d79fbe3 Merge branch 'main' into add_dataset_sample_num
Former-commit-id: 27cb51f7f86f97ae231abfdcb0114ff245d7af9c
2024-05-24 15:57:47 +08:00
BUAADreamer
d8a27e40e2 Merge branch 'hiyouga:main' into main
Former-commit-id: 8d53ec2b5f37e7b43da8d3e787f68fc1bb15997a
2024-05-21 22:18:20 +08:00
hiyouga
a8480baa11 Update README_zh.md
Former-commit-id: 4d647ddba5934b4d9f594c472aa6b46865bb525a
2024-05-21 18:30:59 +08:00
BUAADreamer
071d674065 support pretraining of llava
Former-commit-id: 29a6d5bdb8610be8f796eed65eede9ba7b503527
2024-05-21 08:57:14 +08:00
hiyouga
7f6c37c68e fix #3818
Former-commit-id: 7262679666bf70816911ff2434c7c7ccbca26378
2024-05-20 21:43:19 +08:00
zhangzc
4b90f04c1f fix conflict
Former-commit-id: d956041640d9abc5e59919a227d27270fb513a7e
2024-05-20 17:10:01 +08:00
hiyouga
c53e626c9a update data readme
Former-commit-id: ca48f90f1eb9828300635bdaee6c10d6cc632d3d
2024-05-18 21:37:38 +08:00
hiyouga
68c07d3e1e update data readme
Former-commit-id: 18cbf8561d6c3fdceac47991ed16d35471823187
2024-05-18 21:15:20 +08:00
hiyouga
13d7b48efe improve KTO impl., replace datasets
Former-commit-id: c450ee87a35ff9235f9b695b0de2e042b2971178
2024-05-18 03:44:56 +08:00
enji.zhou
03956053b8 add kto
Former-commit-id: db1d5a4f51faae61fe18666057353747b01f5b8d
2024-05-17 13:09:17 +08:00
hiyouga
51e0f095a9 remove checksum and fix ui args
Former-commit-id: 58c522cd5cc4498a3fa8ed99424b5d63c9e56ccb
2024-05-12 01:10:30 +08:00
codingma
e017fb67d0 fix sha1 of glaive_toolcall dataset
Former-commit-id: d5520b6017df01e807fe3a913ee6654814359d5d
2024-05-09 16:33:45 +08:00
hiyouga
38c6ce9311 remove big file
Former-commit-id: 1ccbfe562dabe9a75df729c960e09d6a8bd6382c
2024-05-07 22:14:06 +08:00
hiyouga
175a7ea951 fix stop param
Former-commit-id: 09f3ef1de49f97001faa91ef3dc2bd16790f9717
2024-05-07 00:41:04 +08:00
hoshi-hiyouga
14c3c8cc8f Merge pull request #3588 from ZeyuTeng96/patch-1
update hf_hub_url for nectar_rm in dataset_info

Former-commit-id: d6ca7853faf083a7ff5c60feb940983d2577326d
2024-05-07 00:06:11 +08:00
hoshi-hiyouga
a13bdb9a2b Update dataset_info.json
Former-commit-id: c3910ab98ae11b52ff6e6d1faafd3e63256d908e
2024-05-07 00:05:45 +08:00
hiyouga
92cafef325 update example docs
Former-commit-id: f02f87c6fbd20adae105c83526baa23dba2042fd
2024-05-06 22:51:02 +08:00
ZeyuTeng96
96354ca55f update hf_hub_url for nectar_rm in dataset_info
Hi there,

I cannot find "mlinmg/RLAIF-Nectar" on hf; it seems to have been renamed to "AstraMindAI/RLAIF-Nectar", so I am opening a PR to update it.

See: https://huggingface.co/datasets/AstraMindAI/RLAIF-Nectar
Former-commit-id: 044af364425766ba23373ff21577bc4a9de18e39
2024-05-06 16:44:50 +08:00
hoshi-hiyouga
eea8a79e35 Update README_zh.md
Former-commit-id: d4d9180c401cb210654792d8052313e8db17fc51
2024-05-02 02:14:55 +08:00
hoshi-hiyouga
2186deceac Update README.md
Former-commit-id: b072ec9d1b18f7e9d5d2c9529eac55d29ca832c8
2024-05-02 02:13:46 +08:00
Lao
f15836c77a Update README_zh.md
Former-commit-id: ce17eccf451649728cf7b45312fd7f75d3a8a246
2024-04-28 23:31:37 +08:00
khazic
db316422a4 Upgrade the second sharegpt format
Former-commit-id: 288911fc7b1e12e53f3396c371cf4b4c7300b4bf
2024-04-28 14:30:05 +08:00
khazic
6f0b412265 added the second sharegpt format
Former-commit-id: d1ba32e4bb70489a9e6f5d3657988c9b7553a157
2024-04-28 14:27:45 +08:00
hiyouga
c9fce361fb update readme
Former-commit-id: 5ee04d418c2e66a292e7da6d393843fcf3b71dc1
2024-04-26 23:39:19 +08:00
hoshi-hiyouga
76f767d5b0 Merge pull request #3471 from BUAADreamer/main
add llava_150k en/zh mllm sft data

Former-commit-id: 8f9142022382d0eedce4356744a281b2ace3b703
2024-04-26 23:36:41 +08:00
hoshi-hiyouga
5ad1c3dd36 Update dataset_info.json
Former-commit-id: c29b257007a8de9735ecaf52afffa80fdcee6a24
2024-04-26 23:34:34 +08:00
BUAADreamer
044668af10 add llava_150k en/zh mllm sft data
Former-commit-id: a17787201082951ae39c3c10436be4c16346f16a
2024-04-26 23:18:58 +08:00
hiyouga
eb14501a52 release v0.7.0
Former-commit-id: 168f56683ae4909ae50edd4859032fad60149d00
2024-04-26 23:18:00 +08:00
hiyouga
d2df4c22ab support mllm hf inference
Former-commit-id: e057c8de486bfbc829240924f9238d6212c917f1
2024-04-26 05:34:58 +08:00
hoshi-hiyouga
3e832e53be Update dataset_info.json
Former-commit-id: f8c26e6a346ca0f18f3b05b6fc7413f3625fb220
2024-04-26 03:03:36 +08:00
hoshi-hiyouga
6275682325 Update mllm_demo.json
Former-commit-id: 5ef293387f8bded42364984f804fb8f665ef1f89
2024-04-26 02:58:45 +08:00
hoshi-hiyouga
82b61ccda6 Update and rename llava_instruct_example.json to mllm_demo.json
Former-commit-id: 7dcae3dba3dbda4953a0e3993e279cc8c21fc976
2024-04-26 02:57:54 +08:00
BUAADreamer
56028422e8 merge data part to the text stream
Former-commit-id: 42c90c8183a49cadb2c2abcc58f6ea27d325231d
2024-04-25 19:58:47 +08:00
BUAADreamer
b6d78b2a64 merge data part to the text stream
Former-commit-id: c6dd89918feb25fe8c07857162421ad1706f791f
2024-04-25 19:19:59 +08:00
BUAADreamer
31bce63a10 add llava and instructblip
Former-commit-id: cfb485eddff0130422416b50c50e171fccc8103e
2024-04-25 00:22:43 +08:00
BUAADreamer
175b56bced add multimodal LLM BLIP-2 and InstructBLIP
Former-commit-id: 4dcb11eab7bbeac866043d2a7c748b8d06fbd243
2024-04-23 18:45:43 +08:00
hiyouga
12290955d8 add dpo mix dataset
Former-commit-id: 6339edefff4eb23a4052fd273d1348f5ab59b47c
2024-04-20 01:31:38 +08:00