| Author | Commit | Message | Date |
|---|---|---|---|
| hiyouga | 53b1002fb7 | add codegeex4, internlm2.5 | 2024-07-06 16:16:47 +08:00 |
| hiyouga | 6fd6aa4530 | fix packing for eager/sdpa attn | 2024-07-04 01:52:43 +08:00 |
| hoshi-hiyouga | 87d9b2d005 | Merge pull request #4224 from chuan298/main: Implement efficient packing without cross-contamination attention | 2024-07-04 01:18:54 +08:00 |
| hiyouga | cce7083024 | update packing | 2024-07-04 01:10:55 +08:00 |
| hiyouga | 8a6a7b9c8a | update arg name | 2024-07-03 23:23:24 +08:00 |
| hiyouga | c47ab6c072 | improve rlhf | 2024-07-02 22:23:08 +08:00 |
| hzhaoy | 57b7c00430 | add TeleChat-1B | 2024-07-02 17:49:04 +08:00 |
| hoshi-hiyouga | e8e6af2651 | Merge branch 'main' into main | 2024-07-01 21:01:09 +08:00 |
| hiyouga | 6f63050e1b | add Gemma2 models | 2024-06-28 01:26:50 +08:00 |
| hiyouga | e507e60638 | update readme | 2024-06-24 18:22:12 +08:00 |
| ancv | 770f75dc83 | move configure_packing to llamafactory.model.patcher and fix constants | 2024-06-21 00:45:06 +07:00 |
| hiyouga | a233fbc258 | add deepseek coder v2 #4346 | 2024-06-18 22:53:54 +08:00 |
| ancv | 238f5c3d99 | update packing with sdpa and eager attention mode | 2024-06-16 02:25:47 +07:00 |
| hiyouga | 572d8bbfdd | add minicpm #4227 | 2024-06-15 17:58:52 +08:00 |
| hiyouga | d87108daa6 | add license | 2024-06-15 17:54:33 +08:00 |
| hiyouga | 06e5d136a4 | add resume args in webui | 2024-06-08 00:22:16 +08:00 |
| hiyouga | 8e95648850 | add qwen2 models | 2024-06-07 00:22:57 +08:00 |
| hiyouga | cae4737907 | lora modules: all by default | 2024-06-06 03:53:28 +08:00 |
| hiyouga | c23cc63d3d | add codestral 22B | 2024-06-06 03:42:50 +08:00 |
| hiyouga | f48f5e646e | support glm-4 | 2024-06-05 15:16:38 +08:00 |
| hiyouga | 8070871732 | better llamaboard: easily resume from checkpoint; support full and freeze checkpoints; faster ui | 2024-05-29 23:55:38 +08:00 |
| hiyouga | 89ca832740 | update readme | 2024-05-29 18:39:11 +08:00 |
| hzhaoy | 0dd632fe9e | add TeleChat-12B/TeleChat-12B-v2 models | 2024-05-29 15:00:37 +08:00 |
| hiyouga | c1fdf81df6 | tiny fix | 2024-05-27 20:54:26 +08:00 |
| hoshi-hiyouga | 87ea0a8bcd | Merge pull request #3921 from gusye1234/main: Add openchat-3.6-8B support | 2024-05-27 20:52:37 +08:00 |
| Jianbai Ye | cff815391f | add openchat-3.6-8B support | 2024-05-27 20:42:08 +08:00 |
| hiyouga | e626e26446 | support Aya23 | 2024-05-27 20:23:24 +08:00 |
| hiyouga | efa4b196ca | add phi-3 7b/14b, mistral v0.3 models | 2024-05-27 18:20:16 +08:00 |
| hiyouga | 5581cb2e4e | update readme | 2024-05-27 18:14:02 +08:00 |
| hiyouga | cb63b32986 | support SimPO #3900 | 2024-05-26 23:46:33 +08:00 |
| hiyouga | 2a67457e39 | support paligemma | 2024-05-21 00:01:22 +08:00 |
| hiyouga | 542229abb3 | fix paligemma inference | 2024-05-20 23:36:43 +08:00 |
| hoshi-hiyouga | 33a354548e | Merge pull request #3785 from enji-zhou/feature/add_kto: add kto | 2024-05-18 03:07:18 +08:00 |
| hiyouga | 8af9817605 | add deepseek v2 lite model | 2024-05-17 13:25:36 +08:00 |
| enji.zhou | db1d5a4f51 | add kto | 2024-05-17 13:09:17 +08:00 |
| hiyouga | d77bed4091 | add falcon 11b | 2024-05-17 00:08:33 +08:00 |
| hiyouga | 308edbc426 | rename package | 2024-05-16 18:39:08 +08:00 |