| Author | Commit | Message | Date |
|---|---|---|---|
| hiyouga | 7fbe8add8f | fix log level | 2024-04-24 23:42:59 +08:00 |
| hiyouga | 297fb8ead3 | support new special token #3420 | 2024-04-24 23:39:31 +08:00 |
| hiyouga | 73ff9c834b | fix bug | 2024-04-24 05:21:18 +08:00 |
| hiyouga | 8f44dce08a | fix bug | 2024-04-24 05:10:07 +08:00 |
| hiyouga | 667ce08b27 | remove redundant code | 2024-04-24 05:02:18 +08:00 |
| hiyouga | b1deb0a0b9 | support unsloth generate | 2024-04-24 04:46:53 +08:00 |
| hiyouga | aa2b79eb23 | refactor patcher | 2024-04-24 03:02:23 +08:00 |
| hiyouga | 07737a3d2d | reenable sdpa and fast tok by default | 2024-04-24 02:18:44 +08:00 |
| hiyouga | 707f0b1d5d | fix #3347 #3387 | 2024-04-24 01:30:16 +08:00 |
| hiyouga | a1d31ffc8c | fix #3365 | 2024-04-21 19:20:18 +08:00 |
| hiyouga | f58425ab45 | fix mod stuff | 2024-04-21 18:11:10 +08:00 |
| hoshi-hiyouga | d0273787be | Merge pull request #3338 from astramind-ai/main: Adding Mixture of Depth | 2024-04-21 18:05:52 +08:00 |
| hoshi-hiyouga | 1fa287fd63 | fix #3348 | 2024-04-20 10:34:09 +08:00 |
| Marco | 620add7b9f | Added Mixture of Depths | 2024-04-18 20:31:24 +02:00 |
| hiyouga | 942362d008 | fix #3324 | 2024-04-18 15:34:45 +08:00 |
| hiyouga | c9a477322d | fix #3316 | 2024-04-17 22:54:34 +08:00 |
| hoshi-hiyouga | 4d660c5ade | Merge pull request #3287 from Ledzy/badam: [Feature] Add BAdam algorithm | 2024-04-16 17:32:16 +08:00 |
| hoshi-hiyouga | 38a56706e0 | Update utils.py | 2024-04-16 17:29:30 +08:00 |
| hoshi-hiyouga | a950f3b81d | Update patcher.py | 2024-04-16 17:29:19 +08:00 |
| hoshi-hiyouga | 750cdf2e74 | Update adapter.py | 2024-04-16 17:28:12 +08:00 |
| Jonery | 7ecb61822b | resolve gradient checkpointing issue. | 2024-04-16 12:05:27 +08:00 |
| hiyouga | 7dc72fb58c | support unsloth 2024.4 | 2024-04-16 00:25:03 +08:00 |
| hiyouga | 6543f3d449 | add codegemma | 2024-04-16 00:11:15 +08:00 |
| hiyouga | e0dbac2845 | support cohere commandR #3184 | 2024-04-15 23:26:42 +08:00 |
| Jonery | 06c8908d3f | Feature BAdam | 2024-04-15 23:15:27 +08:00 |
| hiyouga | cce52351b5 | update examples | 2024-04-15 22:14:34 +08:00 |
| hoshi-hiyouga | 0e0942d388 | Merge pull request #3276 from liu-zichen/fix_mixtral: fix: turn on output_router_logits of mixtral | 2024-04-15 15:38:16 +08:00 |
| hiyouga | efc345c4b0 | fix #3273 | 2024-04-15 15:32:58 +08:00 |
| liuzc | 9f4fe62386 | fix: mixtral output_router_logits | 2024-04-15 12:11:49 +08:00 |
| hiyouga | 9d4c949461 | release v0.6.2 | 2024-04-11 20:08:51 +08:00 |
| hoshi-hiyouga | 98bc97d8d2 | Update adapter.py | 2024-04-10 00:57:51 +08:00 |
| hoshi-hiyouga | 2111b586b6 | Update adapter.py | 2024-04-10 00:57:30 +08:00 |
| Erich Schubert | b5eefe5c4c | Pass additional_target to unsloth. Fixes #3200 | 2024-04-09 17:53:40 +02:00 |
| hiyouga | 7f6c2486b8 | fix quant infer and qwen2moe | 2024-04-09 17:12:59 +08:00 |
| hiyouga | 148bda353f | fix resize vocab at inference #3022 | 2024-04-03 18:14:24 +08:00 |
| hiyouga | 92dab8a90b | simplify readme | 2024-04-02 20:07:43 +08:00 |
| hiyouga | b267aeb53f | add moe aux loss control #3085 | 2024-04-02 14:26:31 +08:00 |
| hiyouga | 9ddbe2866a | fix #3022 | 2024-04-02 13:58:39 +08:00 |
| hiyouga | 4a6ca621c0 | fix #3083 | 2024-04-01 22:53:52 +08:00 |
| hiyouga | aee634cd20 | fix #3077 | 2024-04-01 21:35:18 +08:00 |
| hiyouga | eb259cc573 | support infer 4bit model on GPUs #3023 | 2024-04-01 17:34:04 +08:00 |
| hiyouga | 27776c3474 | tiny fix | 2024-03-31 00:10:29 +08:00 |
| marko1616 | d9a5134617 | fix blank line contains whitespace | 2024-03-30 23:46:55 +08:00 |
| marko1616 | eb178eaff3 | Fix Llama model save for full param train | 2024-03-30 23:45:04 +08:00 |
| hiyouga | 8c77b10912 | update trainers | 2024-03-28 18:16:27 +08:00 |
| hiyouga | 511f675402 | fix #2961 | 2024-03-26 17:26:14 +08:00 |
| hiyouga | 7afbc85dae | fix #2928 | 2024-03-24 00:34:54 +08:00 |
| hiyouga | a1c8c98c5f | fix #2941 | 2024-03-24 00:28:44 +08:00 |
| hiyouga | 8408225162 | support fsdp + qlora | 2024-03-21 00:36:06 +08:00 |
| hiyouga | 7b8f502901 | fix #2346 | 2024-03-20 17:56:33 +08:00 |