Commit Graph

148 Commits

Author SHA1 Message Date
hoshi-hiyouga
beb1a9f9d9 [data] add r1 distill dataset (#6983)
Former-commit-id: 2591a3fa8b
2025-02-18 17:25:09 +08:00
hoshi-hiyouga
fcd0f0480d [dataset] add openthought (#6866)
Former-commit-id: 1356f9d840
2025-02-09 00:53:01 +08:00
Zhangchi Feng
01915eaf40 [model] support audio (#6701)
* support qwen2_audio

* improve code

* lint

* fix

* fix

* fix

---------

Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 24c7842948
2025-02-05 04:59:09 +08:00
hiyouga
9822cb7bac fix dataset
Former-commit-id: 046b6fb118
2024-11-27 06:27:44 +00:00
hiyouga
ab3782b0fa add marco-o1 and openo1 dataset
Former-commit-id: 17afb7d410
2024-11-27 04:20:23 +00:00
hoshi-hiyouga
4f1d5b6396 update dataset
Former-commit-id: 5214d3ea06
2024-11-25 21:47:04 +08:00
hiyouga
0d8aa6e6ef use pre-commit
Former-commit-id: 21db8ed2f4
2024-10-29 09:07:46 +00:00
huniu20
132c1f1b0f 1. add model and dataset info to support webui
Former-commit-id: 0f669f221a
2024-10-10 16:46:34 +08:00
hiyouga
7ccb86b215 add docstrings, refactor logger
Former-commit-id: 54c6905937
2024-09-08 00:56:56 +08:00
hiyouga
dec6ff046b update data readme
Former-commit-id: 70e36ff2f4
2024-09-05 04:44:49 +08:00
hiyouga
c4d7d76358 update data readme
Former-commit-id: 6055fe02de
2024-09-05 04:25:27 +08:00
hiyouga
9df7a26e6b video datasets
Former-commit-id: 8cafc7b055
2024-09-05 02:04:17 +08:00
hiyouga
af8c4b4e20 add vl_feedback dataset
Former-commit-id: 57497135bf
2024-09-04 03:13:03 +08:00
hiyouga
549adc888b add pokemon dataset
Former-commit-id: 194064fdae
2024-09-02 01:02:25 +08:00
hiyouga
bfdcc6bacf add rlhf-v dataset
Former-commit-id: 8e49940746
2024-09-01 22:57:41 +08:00
hiyouga
51a0016873 optimize predict vram
Former-commit-id: a244f143f4
2024-08-30 23:08:45 +08:00
hiyouga
a83756b5e9 refactor mm training
Former-commit-id: 3382317e32
2024-08-30 02:14:31 +08:00
simonJJJ
8a09b1e732 initial-commit
Former-commit-id: aeb85f200b
2024-08-28 16:51:35 +08:00
hiyouga
bea270042b add magpie ultra dataset
Former-commit-id: c75b5b83c4
2024-08-09 20:28:55 +08:00
hiyouga
e1e01d7efd add unittest
Former-commit-id: 608de799a2
2024-07-19 01:06:27 +08:00
hiyouga
14bc7b0551 fix up
Former-commit-id: 29ebcd75d5
2024-07-15 01:04:56 +08:00
hoshi-hiyouga
ddbd848e49 Update README.md
Former-commit-id: 9d64507bd5
2024-07-14 21:27:04 +08:00
codingma
74f0d02eb8 1. add custom eval dataset support
2. merge load dataset and split dataset function

Former-commit-id: 76f3bbcfc0
2024-07-05 15:52:10 +08:00
hiyouga
89564e90d7 update data
Former-commit-id: 9ab0401948
2024-06-19 02:48:43 +08:00
hiyouga
9e5988717d tiny fix
Former-commit-id: 344b9a36b2
2024-06-18 23:32:18 +08:00
Eli Costa
6bbb8b4cd8 Add Magpie and Webinstruct dataset samples
Adds two dataset samples claimed superior performance: Magpie (from Allen AI) and Webinstruct (from TIGER-Lab).

Former-commit-id: 74e49cca95
2024-06-15 19:31:56 -03:00
hiyouga
e89d1b1ec3 add neo-sft dataset
Former-commit-id: c7a5620ccc
2024-06-13 01:00:56 +08:00
hiyouga
3547a26f86 add ultrafeedback and fineweb #4085 #4132
Former-commit-id: 12d79f89c5
2024-06-08 02:42:34 +08:00
hoshi-hiyouga
9b6bdf9449 Merge pull request #3829 from seanzhang-zhichen/add_dataset_sample_num
Add dataset sample num

Former-commit-id: 483eb47e5d
2024-05-30 00:25:45 +08:00
hoshi-hiyouga
21e7979837 Update README_zh.md
Former-commit-id: c8ae7e0e65
2024-05-30 00:04:47 +08:00
hoshi-hiyouga
eb7ee82f16 Update README.md
Former-commit-id: 3761d7d5dd
2024-05-30 00:04:26 +08:00
hiyouga
b88ecd71fd fix full/freeze tuning for mllm
Former-commit-id: 08564838bd
2024-05-27 20:37:57 +08:00
BUAADreamer
f9ced0480e Merge branch 'main' of https://github.com/BUAADreamer/LLaMA-Factory
Former-commit-id: 576b0206c2
2024-05-27 20:11:23 +08:00
BUAADreamer
4a958ab909 Merge branch 'hiyouga:main' into main
Former-commit-id: e2022ce4e9
2024-05-27 20:10:58 +08:00
BUAADreamer
ea78a629ba remove mllm_pt_demo.json
Former-commit-id: f665342a27
2024-05-27 20:10:31 +08:00
hiyouga
db569a2d61 add llava 1k datasets
Former-commit-id: 08bd0440b5
2024-05-27 19:57:33 +08:00
seanzhang-zhichen
9c8d79fbe3 Merge branch 'main' into add_dataset_sample_num
Former-commit-id: 27cb51f7f8
2024-05-24 15:57:47 +08:00
BUAADreamer
d8a27e40e2 Merge branch 'hiyouga:main' into main
Former-commit-id: 8d53ec2b5f
2024-05-21 22:18:20 +08:00
hiyouga
a8480baa11 Update README_zh.md
Former-commit-id: 4d647ddba5
2024-05-21 18:30:59 +08:00
BUAADreamer
071d674065 support pretraining of llava
Former-commit-id: 29a6d5bdb8
2024-05-21 08:57:14 +08:00
hiyouga
7f6c37c68e fix #3818
Former-commit-id: 7262679666
2024-05-20 21:43:19 +08:00
zhangzc
4b90f04c1f fix conflict
Former-commit-id: d956041640
2024-05-20 17:10:01 +08:00
hiyouga
c53e626c9a update data readme
Former-commit-id: ca48f90f1e
2024-05-18 21:37:38 +08:00
hiyouga
68c07d3e1e update data readme
Former-commit-id: 18cbf8561d
2024-05-18 21:15:20 +08:00
hiyouga
13d7b48efe improve KTO impl., replace datasets
Former-commit-id: c450ee87a3
2024-05-18 03:44:56 +08:00
enji.zhou
03956053b8 add kto
Former-commit-id: db1d5a4f51
2024-05-17 13:09:17 +08:00
hiyouga
51e0f095a9 remove checksum and fix ui args
Former-commit-id: 58c522cd5c
2024-05-12 01:10:30 +08:00
codingma
e017fb67d0 fix sha1 of glaive_toolcall dataset
Former-commit-id: d5520b6017
2024-05-09 16:33:45 +08:00
hiyouga
38c6ce9311 remove big file
Former-commit-id: 1ccbfe562d
2024-05-07 22:14:06 +08:00
hiyouga
175a7ea951 fix stop param
Former-commit-id: 09f3ef1de4
2024-05-07 00:41:04 +08:00