hiyouga
|
7fcffb860d
|
add codegeex4, internlm2.5
Former-commit-id: 53b1002fb74123095e7466c75b941a31a7cfba4d
|
2024-07-06 16:16:47 +08:00 |
|
hiyouga
|
7b3c1f29ff
|
fix packing for eager/sdpa attn
Former-commit-id: 6fd6aa4530f81a2ed306eeb2a5167607288b62c6
|
2024-07-04 01:52:43 +08:00 |
|
hoshi-hiyouga
|
a38ff842d0
|
Merge pull request #4224 from chuan298/main
Implement efficient packing without cross-contamination attention
Former-commit-id: 87d9b2d00513c163335d3f2e2bb3cb3299cecdaa
|
2024-07-04 01:18:54 +08:00 |
|
hiyouga
|
bfdaadcc40
|
update packing
Former-commit-id: cce7083024bed4c7429ddc8288d1c9190fde29f5
|
2024-07-04 01:10:55 +08:00 |
|
hiyouga
|
e671ed520b
|
update arg name
Former-commit-id: 8a6a7b9c8a876da9c16e5ada7df461eb8cabee21
|
2024-07-03 23:23:24 +08:00 |
|
hiyouga
|
cc31014002
|
improve rlhf
Former-commit-id: c47ab6c07287fb260ea49b8b7af46bdd416f88f7
|
2024-07-02 22:23:08 +08:00 |
|
hzhaoy
|
28e787116b
|
add TeleChat-1B
Former-commit-id: 57b7c00430bcfc83afd11547ceead041e8edfd8d
|
2024-07-02 17:49:04 +08:00 |
|
hoshi-hiyouga
|
2452f57cd7
|
Merge branch 'main' into main
Former-commit-id: e8e6af26514272e29a50649b38182beb4db4ebfa
|
2024-07-01 21:01:09 +08:00 |
|
hiyouga
|
d3b7c489f2
|
add Gemma2 models
Former-commit-id: 6f63050e1b61742d5f7e48bdc62c46748031d7cb
|
2024-06-28 01:26:50 +08:00 |
|
hiyouga
|
7be502c5c5
|
update readme
Former-commit-id: e507e60638b2e8c66f24805b3b28f6b9f98f5924
|
2024-06-24 18:22:12 +08:00 |
|
ancv
|
5319447aa5
|
move configure_packing to llamafactory.model.patcher and fix constants
Former-commit-id: 770f75dc8363bfa284a72159ff8ad25ec9abe4e0
|
2024-06-21 00:45:06 +07:00 |
|
hiyouga
|
e3bf22f61b
|
add deepseek coder v2 #4346
Former-commit-id: a233fbc258d38c62d78b9d1eaf034720361795e6
|
2024-06-18 22:53:54 +08:00 |
|
ancv
|
988231026a
|
update packing with sdpa and eager attention mode
Former-commit-id: 238f5c3d99809c6ae2571b59bdce8d8ea3c700b9
|
2024-06-16 02:25:47 +07:00 |
|
hiyouga
|
f0d6e63f55
|
add minicpm #4227
Former-commit-id: 572d8bbfdd73c1a00b432f0d0411f46fad6aa1a6
|
2024-06-15 17:58:52 +08:00 |
|
hiyouga
|
2946153cea
|
add license
Former-commit-id: d87108daa68bd40174b262be1ca65fe6e1b7ab56
|
2024-06-15 17:54:33 +08:00 |
|
hiyouga
|
a8318723a4
|
add resume args in webui
Former-commit-id: 06e5d136a4916413d1c116e341ba7d5136d7748a
|
2024-06-08 00:22:16 +08:00 |
|
hiyouga
|
8a0263551d
|
add qwen2 models
Former-commit-id: 8e95648850fdd5075724359ffdb22beb48b75952
|
2024-06-07 00:22:57 +08:00 |
|
hiyouga
|
cceff9f520
|
lora modules: all by default
Former-commit-id: cae47379079ff811aa385c297481a27020a8da6b
|
2024-06-06 03:53:28 +08:00 |
|
hiyouga
|
679810a3d2
|
add codestral 22B
Former-commit-id: c23cc63d3d3c4fd8edd6c3b3ca1a2a32ec328d7d
|
2024-06-06 03:42:50 +08:00 |
|
hiyouga
|
94c37490d1
|
support glm-4
Former-commit-id: f48f5e646e2da9e02333d027033141b0e75dfcf8
|
2024-06-05 15:16:38 +08:00 |
|
hiyouga
|
820404946e
|
better llamaboard
* easily resume from checkpoint
* support full and freeze checkpoints
* faster ui
Former-commit-id: 80708717329b4552920dd4ce8cebc683e65d54c5
|
2024-05-29 23:55:38 +08:00 |
|
hiyouga
|
a71a6a05c3
|
update readme
Former-commit-id: 89ca832740731dfb121175aa5c16b13bd4944011
|
2024-05-29 18:39:11 +08:00 |
|
hzhaoy
|
ce1be3da4b
|
add TeleChat-12B/TeleChat-12B-v2 models
Former-commit-id: 0dd632fe9e5bbf08605d4b9c6887208b7a127317
|
2024-05-29 15:00:37 +08:00 |
|
hiyouga
|
0706dbf7e6
|
tiny fix
Former-commit-id: c1fdf81df6ade5da7be4eb66b715f0efd171d5aa
|
2024-05-27 20:54:26 +08:00 |
|
hoshi-hiyouga
|
ad3ca3f556
|
Merge pull request #3921 from gusye1234/main
Add openchat-3.6-8B support
Former-commit-id: 87ea0a8bcd8d76a9e916cc8da6905bc805bb18aa
|
2024-05-27 20:52:37 +08:00 |
|
Jianbai Ye
|
d2c1df7f3d
|
add openchat-3.6-8B support
Former-commit-id: cff815391fd15f30647e8694e08c47a514fd6eb2
|
2024-05-27 20:42:08 +08:00 |
|
hiyouga
|
fc5a6b5c4e
|
support Aya23
Former-commit-id: e626e264460d12b282099bfbb8e6679c31e85fc0
|
2024-05-27 20:23:24 +08:00 |
|
hiyouga
|
51a1097c64
|
add phi-3 7b/14b, mistral v0.3 models
Former-commit-id: efa4b196ca8053881bb9d15cfb571204bcb0bbda
|
2024-05-27 18:20:16 +08:00 |
|
hiyouga
|
df33548b39
|
update readme
Former-commit-id: 5581cb2e4e59f3f8109e2acd4611789f9e50bfca
|
2024-05-27 18:14:02 +08:00 |
|
hiyouga
|
4807c11db8
|
support SimPO #3900
Former-commit-id: cb63b32986c43f97994211ec34dc5928fc3bb9d7
|
2024-05-26 23:46:33 +08:00 |
|
hiyouga
|
cce3892f91
|
support paligemma
Former-commit-id: 2a67457e3944d5e528286cb7203857c13078c484
|
2024-05-21 00:01:22 +08:00 |
|
hiyouga
|
446c681b58
|
fix paligemma inference
Former-commit-id: 542229abb3aba2032d4c52a878c0fd35ba299691
|
2024-05-20 23:36:43 +08:00 |
|
hoshi-hiyouga
|
97469892c3
|
Merge pull request #3785 from enji-zhou/feature/add_kto
add kto
Former-commit-id: 33a354548e78a7f7f51d63f80974920827d30252
|
2024-05-18 03:07:18 +08:00 |
|
hiyouga
|
9af3dce3c8
|
add deepseek v2 lite model
Former-commit-id: 8af98176055b6fc28d16b03207b5abaa7de6104a
|
2024-05-17 13:25:36 +08:00 |
|
enji.zhou
|
03956053b8
|
add kto
Former-commit-id: db1d5a4f51faae61fe18666057353747b01f5b8d
|
2024-05-17 13:09:17 +08:00 |
|
hiyouga
|
22f71c152a
|
add falcon 11b
Former-commit-id: d77bed4091a6a8fea682b39d3261e1e93dfe093f
|
2024-05-17 00:08:33 +08:00 |
|
hiyouga
|
cae823ddf0
|
rename package
Former-commit-id: 308edbc4260d45907b4a9d3a45ec21d83e48aacb
|
2024-05-16 18:39:08 +08:00 |
|