80 Commits

Author SHA1 Message Date
BUAADreamer
fb33f6e528 Merge branch 'main' of https://github.com/BUAADreamer/LLaMA-Factory
Former-commit-id: d544570ce88a7b784beeffa70ff718109696b1f5
2024-05-27 20:11:23 +08:00
BUAADreamer
5a581acac7 Merge branch 'hiyouga:main' into main
Former-commit-id: cc1b82bf49b060987392c455fdbfe125ad667ec5
2024-05-27 20:10:58 +08:00
BUAADreamer
136e64081f remove mllm_pt_demo.json
Former-commit-id: 5402589f021056f9c9e7b68421282039a508d5b9
2024-05-27 20:10:31 +08:00
hiyouga
3f8314d4e6 add llava 1k datasets
Former-commit-id: 345d3355752f4a4dc454696a39f1610fffbbf382
2024-05-27 19:57:33 +08:00
BUAADreamer
aaadaa18f6 support pretraining of llava
Former-commit-id: 6a4c8cf0a6a1674c693b9337f018ff8df7477f8f
2024-05-21 08:57:14 +08:00
hiyouga
d24969bb7e improve KTO impl., replace datasets
Former-commit-id: e56a57ddcf061de6e4acc8679f7dbf0b68364986
2024-05-18 03:44:56 +08:00
enji.zhou
d16a1d9ed0 add kto
Former-commit-id: ec51986cf70b0bdd79b8141e45916670fb97a08e
2024-05-17 13:09:17 +08:00
hiyouga
3e5a099187 remove checksum and fix ui args
Former-commit-id: 0cfdeb1d30efb63211434bc4656bceb59e666289
2024-05-12 01:10:30 +08:00
codingma
82e830f8e7 fix sha1 of glaive_toolcall dataset
Former-commit-id: 25649cd14899f41fe12c99af12619ddcd5a8ba88
2024-05-09 16:33:45 +08:00
hiyouga
c3dbaf6eba remove big file
Former-commit-id: 8a05242787f810ec25d1b33358257d2867c45497
2024-05-07 22:14:06 +08:00
hiyouga
a978b5dc4e fix stop param
Former-commit-id: f0a850c25211b72eddbb357c81679db9b0930d44
2024-05-07 00:41:04 +08:00
hoshi-hiyouga
34dcba1bc6 Merge pull request #3588 from ZeyuTeng96/patch-1
update hf_hub_url for nectar_rm in dataset_info

Former-commit-id: bcf2c749490d45e3b1363352cc30fd6f9ef29a19
2024-05-07 00:06:11 +08:00
hoshi-hiyouga
224f57f83b Update dataset_info.json
Former-commit-id: c55969c2350548a9b2eda5352b067df63ee98b20
2024-05-07 00:05:45 +08:00
hiyouga
6710e27429 update example docs
Former-commit-id: 102cd42768d9eb2cf1219309a25b41e26149067e
2024-05-06 22:51:02 +08:00
ZeyuTeng96
6041dda838 update hf_hub_url for nectar_rm in dataset_info
Hi there,

I cannot find "mlinmg/RLAIF-Nectar" on HF; it seems to have been moved to "AstraMindAI/RLAIF-Nectar", so I am opening a PR to update the URL.

See: https://huggingface.co/datasets/AstraMindAI/RLAIF-Nectar
Former-commit-id: 98ea76989f6ee9096edd0d353d8a001cdb6ccc5a
2024-05-06 16:44:50 +08:00
hiyouga
e626d15764 update readme
Former-commit-id: c9190fe36f511c3a5149d45c85a10b02a57fa88a
2024-04-26 23:39:19 +08:00
hoshi-hiyouga
94ae4c42e5 Merge pull request #3471 from BUAADreamer/main
add llava_150k en/zh mllm sft data

Former-commit-id: 991d843d56acd104ceff42f6d74d4e7acd5ccb01
2024-04-26 23:36:41 +08:00
hoshi-hiyouga
08460183a9 Update dataset_info.json
Former-commit-id: 7df511cfb76c833e8cc9be8cb45673395f54c32b
2024-04-26 23:34:34 +08:00
BUAADreamer
47c6d405dc add llava_150k en/zh mllm sft data
Former-commit-id: 62b3fb2f15e7e1c56da8011f0bf27cff35025863
2024-04-26 23:18:58 +08:00
hiyouga
190ae7b73d release v0.7.0
Former-commit-id: 45bb89cb4d26a6b3fb5360bc90ab950738fe4920
2024-04-26 23:18:00 +08:00
hiyouga
a635030931 support mllm hf inference
Former-commit-id: 2c7c01282acd7ddabbb17ce3246b8dae4bc4b8cf
2024-04-26 05:34:58 +08:00
hoshi-hiyouga
7b4a31ba22 Update dataset_info.json
Former-commit-id: b3e3749d49ba561929ed708650314e2c9b47c24d
2024-04-26 03:03:36 +08:00
BUAADreamer
0373a3f2a8 merge data part to the text stream
Former-commit-id: 80537d580119d9d5a06ab236a5284aaae2f83b5b
2024-04-25 19:58:47 +08:00
BUAADreamer
69fb4351f5 merge data part to the text stream
Former-commit-id: 7ee20286d9bcc2d5378bfd6bb02cd3648396d873
2024-04-25 19:19:59 +08:00
BUAADreamer
641c97ba74 add llava and instructblip
Former-commit-id: 142fb6f4541a1acfefe66ff2574dabde53b00c06
2024-04-25 00:22:43 +08:00
BUAADreamer
20e05970ab add multimodal LLM BLIP-2 and InstructBLIP
Former-commit-id: a730f89a972f1a9d37c718c716f199cb8d4903b2
2024-04-23 18:45:43 +08:00
hiyouga
bbf462a17e add dpo mix dataset
Former-commit-id: 6def3f8bfa51b2d9d73af112352ce07db972e4c9
2024-04-20 01:31:38 +08:00
hiyouga
b1ae554c83 fix #3247
Former-commit-id: bb67c66f80627805b585d157ba807c0ce378d3f2
2024-04-12 17:41:33 +08:00
li.yunhao
e6e3571232 fix pile dataset hf hub url
Former-commit-id: c06f71f74ee1b177617417d151185757fd4359f5
2024-03-30 16:06:10 +08:00
hiyouga
5f3f0c53f2 add orca_dpo_pairs dataset
Former-commit-id: af683aacbae462a2a37d76d37df583e217664bd5
2024-03-20 20:09:06 +08:00
hiyouga
28f3e60189 update readme, add starcoder2, cosmopedia
Former-commit-id: 1ae7c183640146bb9b06c98942985a1721d2b9c9
2024-03-03 01:01:46 +08:00
hiyouga
9d241d08ae update data
Former-commit-id: bd63af6ede3a103b75ef9c0875557d65e2c4c7f7
2024-03-02 19:37:18 +08:00
hiyouga
5ebd605149 fix #2533
Former-commit-id: 52a81299fcff0fa691e1d6f9a7e9ea9d19751b3a
2024-02-21 22:47:48 +08:00
hiyouga
a6ff18ab17 fix #2481
Former-commit-id: 2a4e3e4a26a2fad77ccc476be7d45434b8af4a55
2024-02-15 19:07:47 +08:00
hiyouga
3e8c3b506a improve aligner
Former-commit-id: cc7296b92e10c24967fc753393275b71d300683f
2024-02-10 16:39:19 +08:00
Mark Mueller
ac5d3811bd Slim Orca data parsing
Former-commit-id: f2d8efede7e20edafed0d5446eb64f2d419949b1
2024-02-08 19:32:20 +01:00
Johann-Peter Hartmann
5a23651531 WS fix
Former-commit-id: 131935346ac06738be5e7c7f54fe2eb7d3769d7a
2024-02-06 20:13:04 +01:00
Johann-Peter Hartmann
66e1781ee9 add ranking to dpo dataset
Former-commit-id: 6a844fb384dd9cac3fd6b845a6b414320c5eb766
2024-02-06 20:12:36 +01:00
Johann-Peter Hartmann
af258902c4 remove comma
Former-commit-id: 57a2f6d35da8cd10fad9859382bc1e983da56705
2024-02-03 08:48:39 +01:00
Johann-Peter Hartmann
912bb5cb03 Add support for german datasets
Former-commit-id: bbc038aa236952597e97d1ccf1ae2d64a16339b5
2024-01-30 10:18:01 +01:00
hiyouga
fa9939b2b2 Update dataset_info.json
Former-commit-id: 4fe04ac7fc464c0ed705281cd3860839c18d6fc0
2024-01-23 00:10:32 +08:00
hiyouga
54f406a26c enable cutoff len
Former-commit-id: e9513d300c338dfcae98eee7d057bfd00da2da0e
2024-01-18 12:25:42 +08:00
hiyouga
a9fc7dbfa6 support function calling
Former-commit-id: 66533b3f65babf2429c92c0f8fafe4eff5e0ff63
2024-01-18 09:54:23 +08:00
hiyouga
6298f4779c tiny update
Former-commit-id: 4417b8ee20b381c964f452f52081667dfa33cd7b
2023-12-25 18:29:34 +08:00
hiyouga
cedf58978e support autogptq in llama board #246
Former-commit-id: fea01226703d1534b5cf511bcb6a49e73bc86ce1
2023-12-16 16:31:30 +08:00
hiyouga
d9f621be13 support system column #1765
Former-commit-id: f425584a511c5e42bae8b3ba090eaa898b28adad
2023-12-12 19:45:59 +08:00
hiyouga
aa30233322 fix modelscope data hub
Former-commit-id: 5b63e8c22538a4788e4b6c8df50e6e6be93ceeac
2023-12-12 18:33:06 +08:00
hoshi-hiyouga
97d5fb3460 Merge branch 'main' into feat/support_ms
Former-commit-id: 698756dffb7d4e602b3e0cab66ef0a4befe7215c
2023-12-12 17:55:32 +08:00
xingjun.wang
d4d4efc9e6 modify guanaco
Former-commit-id: ed2746fcc29cd07d4fa796f35f8d67c72bf30be8
2023-12-12 15:00:37 +08:00
xingjun.wang
643fa8e685 update dataset info
Former-commit-id: c005716ebcef390cf219e45649778f91e1f6e959
2023-12-12 14:53:59 +08:00