hiyouga | 4f9ca28e11 | fix callback | Former-commit-id: 1e9401744c | 2023-10-15 04:59:44 +08:00
hiyouga | 3ae6229140 | implement webui resuming training | Former-commit-id: accde3cd39 | 2023-10-15 04:52:19 +08:00
hiyouga | c9d1cd108d | refactor model_dtype, fix PPO trainer | Former-commit-id: 2818af0b09 | 2023-10-11 23:16:01 +08:00
hiyouga | 141937ead6 | fix aquila template, repair sft packing mechanism | Former-commit-id: be420e4179 | 2023-10-10 18:49:55 +08:00
hiyouga | 180fd06e61 | fix flash shift short attention | Former-commit-id: 0a356bc897 | 2023-10-09 17:54:48 +08:00
hiyouga | b6e81a0307 | fix shift short attention | Former-commit-id: ab65c3063b | 2023-10-09 17:07:46 +08:00
hiyouga | d338ab3e19 | fix #1068 #1074 | Former-commit-id: d11a545463 | 2023-09-28 14:39:16 +08:00
hiyouga | f61a000e73 | tiny fix | Former-commit-id: 5d4118b096 | 2023-09-28 01:03:04 +08:00
hiyouga | 8a8ba08bf7 | tiny fix | Former-commit-id: d2ebd225db | 2023-09-28 01:02:11 +08:00
hiyouga | 755e3e49b4 | fix #1064 | Former-commit-id: c902236397 | 2023-09-28 00:53:29 +08:00
hiyouga | deb17942ab | fix layer norm dtype | Former-commit-id: 84b7486885 | 2023-09-28 00:25:55 +08:00
hiyouga | 108c31e1fc | support LongLoRA | Former-commit-id: 90375f600d | 2023-09-27 21:55:50 +08:00
hiyouga | 5ee1bdecdc | add MMLU and C-Eval script | Former-commit-id: 465ee8119a | 2023-09-23 00:34:17 +08:00
hiyouga | 48e7b600a8 | fix error info | Former-commit-id: 7e8655c8b5 | 2023-09-19 18:30:23 +08:00
hiyouga | 4e86462bad | fix #762 #814 | Former-commit-id: d4be857e23 | 2023-09-12 16:10:10 +08:00
hiyouga | 8ac7ec0b48 | tiny fix | Former-commit-id: 3b306478d4 | 2023-09-11 18:27:08 +08:00
hiyouga | 33bab0e7c1 | update flashattn, fix ppo save model | Former-commit-id: 0fbece85a7 | 2023-09-11 17:25:36 +08:00
hiyouga | 6a71361a54 | remove PeftTrainer | Former-commit-id: b218c271ed | 2023-09-10 22:23:23 +08:00
hiyouga | 8ab5566dc0 | support FlashAttention2 | Former-commit-id: d8aa1404be | 2023-09-10 20:43:56 +08:00
hiyouga | f865d0bd51 | fix lora target | Former-commit-id: a51b7c98ac | 2023-09-09 17:04:45 +08:00
hiyouga | c818a7ff60 | support lora target auto find | Former-commit-id: bca1a247bc | 2023-09-09 15:38:37 +08:00
hiyouga | 9ed4bb63d4 | change to right-padding, update reward score #803 | Former-commit-id: 8ea32e4046 | 2023-09-08 20:04:31 +08:00
hiyouga | 62941919e8 | fix chatglm template | Former-commit-id: 8aaaa132d4 | 2023-09-08 14:45:58 +08:00
hiyouga | f74b980650 | fix baichuan templates | Former-commit-id: 85b1f6632a | 2023-09-07 18:54:14 +08:00
hiyouga | 51f662860d | update baichuan2 template | Former-commit-id: 0531886e1f | 2023-09-06 21:43:06 +08:00
hiyouga | f9aee17f9d | add Baichuan2 models | Former-commit-id: 62ce65c628 | 2023-09-06 18:36:04 +08:00
hiyouga | a4fd976048 | refactor dataset_attr, add eos in pt, fix #757 | Former-commit-id: a9d1fb72f7 | 2023-09-01 19:00:45 +08:00
codemayq | ea74e5a81b | update llama2 template | Former-commit-id: 0bcc489c42 | 2023-08-30 16:23:56 +08:00
codemayq | 4b29d9d2b0 | add dataset stage and filter dataset when stage chosen in webui | Former-commit-id: c0e4d1e81b | 2023-08-23 18:54:23 +08:00
hiyouga | 802494e20a | update template | Former-commit-id: 4318347d3f | 2023-08-22 19:46:09 +08:00
hiyouga | e6f4eab4ab | fix #608 | Former-commit-id: 02d69b6fde | 2023-08-21 17:49:36 +08:00
hiyouga | d3bef03dc6 | fix baichuan template for training #597 #616 | Former-commit-id: 0a3f698425 | 2023-08-21 17:41:51 +08:00
hiyouga | caf4a61e21 | fix ChatGLM2 ppo #527 #528 | Former-commit-id: 9f4c2adc9a | 2023-08-18 00:34:59 +08:00
hiyouga | 623a34b16f | fix generation bug #532 | Former-commit-id: be21fc83f9 | 2023-08-17 22:21:34 +08:00
hiyouga | 3021a01b71 | fix baichuan and intern template | Former-commit-id: 892fd39373 | 2023-08-17 01:27:20 +08:00
hiyouga | edc15c62fa | fix system prompt | Former-commit-id: 7407d9daa1 | 2023-08-16 01:35:52 +08:00
hiyouga | 2ceaecfb42 | fix baichuan template #481 | Former-commit-id: 273135f595 | 2023-08-15 11:38:21 +08:00
hiyouga | d15fe288df | alert pad_token source | Former-commit-id: 80b4053602 | 2023-08-15 00:07:56 +08:00
hiyouga | 02a61b08b1 | update webui | Former-commit-id: 9d0f6214b6 | 2023-08-14 22:45:26 +08:00
codemayq | ee7da14f81 | add template match and stage in webui | Former-commit-id: 79c68e5527 | 2023-08-14 20:42:59 +08:00
hiyouga | 3f0a2d6adc | support rope scaling, fix #475 #476 #478 | Former-commit-id: fa940c17b8 | 2023-08-12 20:46:27 +08:00
codemayq | 3ba1b81105 | add sft script preview in webui | Former-commit-id: 6bc8e9866d | 2023-08-12 13:53:55 +08:00
hiyouga | 7bd4c59b7e | fix unusual output of 8bit models #278 #391 | Former-commit-id: dd51c24203 | 2023-08-12 00:25:29 +08:00
hiyouga | 79f4ba0d26 | Release v0.1.6 | Former-commit-id: a48cb0d474 | 2023-08-11 23:25:57 +08:00
hiyouga | 21bf79e72b | add defaults | Former-commit-id: d3844e97e3 | 2023-08-11 13:56:26 +08:00
hiyouga | eb26bfc2ba | fix stop word in baichuan template | Former-commit-id: d59f938959 | 2023-08-11 13:51:46 +08:00
hiyouga | f1485ab927 | fix baichuan template | Former-commit-id: 9c6dd10514 | 2023-08-11 13:45:47 +08:00
hiyouga | abdfa26d06 | support DPO training (2305.18290) | Former-commit-id: 3ec4351cfd | 2023-08-11 03:02:53 +08:00
hiyouga | 0dc9b41b16 | fix template | Former-commit-id: eb6e571cb7 | 2023-08-09 23:14:27 +08:00
hiyouga | ce9ffca0d9 | fix template | Former-commit-id: ac29f4d5f0 | 2023-08-09 23:10:20 +08:00