Kingsley
0935eff188
[data] Fix bugs of use_audio_in_video in Qwen2.5 Omni (#7638)
* cache _mm_inputs
* nit
* support for use_audio_in_video
* remove cache
* fix data
* Update mllm_video_audio_demo.json
2025-04-08 18:40:10 +08:00
Victor Nogueira
0ecad4b178
[dataset] fix ultrachat_200k dataset (#7259)
The `HuggingFaceH4/ultrachat_200k` dataset doesn't contain the default "train" split. The correct split is "train_sft".
2025-03-13 20:20:18 +08:00
hoshi-hiyouga
1b1964714e
[misc] update format (#7277)
2025-03-13 02:53:08 +08:00
hoshi-hiyouga
efa86e730c
[misc] upgrade format to py39 (#7256)
2025-03-12 00:08:41 +08:00
hoshi-hiyouga
39ebcd222d
[data] update mm demo data (#7211)
Former-commit-id: a6070050bbdc96a95d0f972e427a143bda1eb663
2025-03-07 20:07:15 +08:00
hoshi-hiyouga
b50ca5cafa
[data] add r1 distill dataset (#6983)
Former-commit-id: 1da5ee4edaa3896593b9cae488f0ac5917c3243e
2025-02-18 17:25:09 +08:00
hoshi-hiyouga
9204641049
[dataset] add openthought (#6866)
Former-commit-id: 20c748a4f108c0087f0d85377a4aa99126a0beb0
2025-02-09 00:53:01 +08:00
Zhangchi Feng
46a1786595
[model] support audio (#6701)
* support qwen2_audio
* improve code
* lint
* fix
* fix
* fix
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 5eacb5629e4d7733cd992a63747a1335f2c6a929
2025-02-05 04:59:09 +08:00
hiyouga
c672520e37
fix dataset
Former-commit-id: d4a2d299414984a4043d30034c5c95e2d717a49e
2024-11-27 06:27:44 +00:00
hiyouga
7631b97e80
add marco-o1 and openo1 dataset
Former-commit-id: 51d49e075470951f109bcdde136203f972450c2e
2024-11-27 04:20:23 +00:00
hoshi-hiyouga
a27b418772
update dataset
Former-commit-id: 36233e127e3fd1d6b7c47baffe6a55830bcc0aad
2024-11-25 21:47:04 +08:00
hiyouga
dbbfb5f5dc
use pre-commit
Former-commit-id: 7cfede95df22a9ff236788f04159b6b16b8d04bb
2024-10-29 09:07:46 +00:00
huniu20
4affb39ca2
1. add model and dataset info to support webui
Former-commit-id: 92f6226f3fecbd9af744a7232dda2c68b2bb0d86
2024-10-10 16:46:34 +08:00
hiyouga
fb90faf19a
add docstrings, refactor logger
Former-commit-id: c34e489d71f8f539028543ccf8ee92cecedd6276
2024-09-08 00:56:56 +08:00
hiyouga
2ba224773e
update data readme
Former-commit-id: 0af5f054b7b8da8b39eb44b1dfa76050f0c45667
2024-09-05 04:44:49 +08:00
hiyouga
0a949bf8aa
update data readme
Former-commit-id: 81adb153b7d0b30e6cd50c9bf4ca1ccf17458611
2024-09-05 04:25:27 +08:00
hiyouga
d408d5cf32
video datasets
Former-commit-id: 33f28ce82d9e44d2615909250dc56d6a4a03cd99
2024-09-05 02:04:17 +08:00
hiyouga
13d59aecfb
add vl_feedback dataset
Former-commit-id: 6ff34ad2db383b5fbd51008bcc5eec880658811e
2024-09-04 03:13:03 +08:00
hiyouga
217d8f7199
add pokemon dataset
Former-commit-id: 06680158a0f0a1e3c542e77af92ac877fbe357c5
2024-09-02 01:02:25 +08:00
hiyouga
04db03bdfd
add rlhf-v dataset
Former-commit-id: 3fd18fc34a0c994a738504746abfd5548e002437
2024-09-01 22:57:41 +08:00
hiyouga
818e0b2cd0
optimize predict vram
Former-commit-id: a577e44eee351b3ed8011a33ae01cd713354ff97
2024-08-30 23:08:45 +08:00
hiyouga
228f745235
refactor mm training
Former-commit-id: 179c0558699e287cbf38a2d73bff47e86d589c5a
2024-08-30 02:14:31 +08:00
simonJJJ
5e728ec221
initial-commit
Former-commit-id: b6a39847a10b417b09db4b5512dd835e9e4ce928
2024-08-28 16:51:35 +08:00
hiyouga
45fc1dfbda
add magpie ultra dataset
Former-commit-id: 3317b24329b87e30f13a78936ac5554f211abf7a
2024-08-09 20:28:55 +08:00
hiyouga
a66ff6052b
add unittest
Former-commit-id: 8a1f0c5f922989e08a19c65de0b2c4afd2a5771f
2024-07-19 01:06:27 +08:00
hiyouga
8ce43766c6
fix up
Former-commit-id: 43a56cb331fae899ca35b0c312730d4ab79d0c42
2024-07-15 01:04:56 +08:00
hoshi-hiyouga
de1efaed1f
Update README.md
Former-commit-id: d9aa6a9437994ac29f3e7a0789ec286f091847d6
2024-07-14 21:27:04 +08:00
codingma
82e941ff61
1. add custom eval dataset support
2. merge load dataset and split dataset function
Former-commit-id: 963d97ba07e7efa3a4544c4d077283d9e112b3ad
2024-07-05 15:52:10 +08:00
hiyouga
8594d4fd53
update data
Former-commit-id: 5f396ea8555a5f0de7b55f5049890f15c25bbe51
2024-06-19 02:48:43 +08:00
hiyouga
33fe274468
tiny fix
Former-commit-id: bb750fa3dde03ec024ae75596ecd4b884cb126c6
2024-06-18 23:32:18 +08:00
Eli Costa
ef578c39a0
Add Magpie and Webinstruct dataset samples
Adds two dataset samples that are claimed to offer superior performance: Magpie (from Allen AI) and Webinstruct (from TIGER-Lab).
Former-commit-id: 12f4a2bc3172ecd5b6775061d59103f565ac9562
2024-06-15 19:31:56 -03:00
hiyouga
39e3d3fed6
add neo-sft dataset
Former-commit-id: 34863fa7cb641ceca92e3a2eec914126db537b62
2024-06-13 01:00:56 +08:00
hiyouga
d9aa226c08
add ultrafeedback and fineweb #4085 #4132
Former-commit-id: 968e4992e2f2a3ccba73e8668f1654ddc6eb0034
2024-06-08 02:42:34 +08:00
hoshi-hiyouga
8ff3e53457
Merge pull request #3829 from seanzhang-zhichen/add_dataset_sample_num
Add dataset sample num
Former-commit-id: ab38cf74ce48ea4f1800e077ca287f2eb9336135
2024-05-30 00:25:45 +08:00
hoshi-hiyouga
9256750add
Update README_zh.md
Former-commit-id: 3007d260ed45169583a74497a53b661337dd5f71
2024-05-30 00:04:47 +08:00
hoshi-hiyouga
04dce0079e
Update README.md
Former-commit-id: 65fb69e388c0a04c15ecd11441e567966f51fae5
2024-05-30 00:04:26 +08:00
hiyouga
a3dd6f887c
fix full/freeze tuning for mllm
Former-commit-id: df5860ddb593d5b82163a585d12160b41dbce0f3
2024-05-27 20:37:57 +08:00
BUAADreamer
fb33f6e528
Merge branch 'main' of https://github.com/BUAADreamer/LLaMA-Factory
Former-commit-id: d544570ce88a7b784beeffa70ff718109696b1f5
2024-05-27 20:11:23 +08:00
BUAADreamer
5a581acac7
Merge branch 'hiyouga:main' into main
Former-commit-id: cc1b82bf49b060987392c455fdbfe125ad667ec5
2024-05-27 20:10:58 +08:00
BUAADreamer
136e64081f
remove mllm_pt_demo.json
Former-commit-id: 5402589f021056f9c9e7b68421282039a508d5b9
2024-05-27 20:10:31 +08:00
hiyouga
3f8314d4e6
add llava 1k datasets
Former-commit-id: 345d3355752f4a4dc454696a39f1610fffbbf382
2024-05-27 19:57:33 +08:00
seanzhang-zhichen
fc6c31127a
Merge branch 'main' into add_dataset_sample_num
Former-commit-id: 26300127c45f24e63b91f1b0cc73e46c3a936a91
2024-05-24 15:57:47 +08:00
BUAADreamer
e23988ae9e
Merge branch 'hiyouga:main' into main
Former-commit-id: 4076f52c8ba7da4624a1fb3fa52a7170d1c3171e
2024-05-21 22:18:20 +08:00
hiyouga
cc7bdaa459
Update README_zh.md
Former-commit-id: 34c4ba6bf9bb89170446fb396aa06ae44d251de0
2024-05-21 18:30:59 +08:00
BUAADreamer
aaadaa18f6
support pretraining of llava
Former-commit-id: 6a4c8cf0a6a1674c693b9337f018ff8df7477f8f
2024-05-21 08:57:14 +08:00
hiyouga
d9c5d4ee64
fix #3818
Former-commit-id: 3f366e05a34be224f53c5bf8334e57ae5d316004
2024-05-20 21:43:19 +08:00
zhangzc
e84b72f806
fix conflict
Former-commit-id: 6922b23a748c2459147bf44b96d86daa89f2c96c
2024-05-20 17:10:01 +08:00
hiyouga
e088b3906f
update data readme
Former-commit-id: 22c7335b496e4a673383d5a1e4e60bf2cb4e35b3
2024-05-18 21:37:38 +08:00
hiyouga
d732b72f82
update data readme
Former-commit-id: beb864a9367943d3274cb6057423d1eb9aaf85c4
2024-05-18 21:15:20 +08:00
hiyouga
d24969bb7e
improve KTO impl., replace datasets
Former-commit-id: e56a57ddcf061de6e4acc8679f7dbf0b68364986
2024-05-18 03:44:56 +08:00