hiyouga
b983de9f4f
fix #2803
Former-commit-id: 06c97083e150d461631a4a9bebb03b33da760098
2024-03-12 16:57:39 +08:00
hiyouga
7124b71676
fix #2782 #2798
Former-commit-id: 07f9b754a7418b489e839bd674aa47094583a92d
2024-03-12 15:53:29 +08:00
hoshi-hiyouga
52f14211e3
Merge pull request #2743 from S3Studio/DockerizeSupport
Add dockerize support
Former-commit-id: c901aa63ff4fb6daea7f7da467782e8bf6224d4d
2024-03-12 00:05:49 +08:00
hiyouga
c88062347e
fix #2775
Former-commit-id: e874c00906c765b81c0e5ff9c7b3679557da8e0e
2024-03-11 00:42:54 +08:00
hiyouga
f776e738f8
tiny fix
Former-commit-id: 352693e2dcc8fc039b5d574e1a5709563929b0ce
2024-03-11 00:17:18 +08:00
hiyouga
566bfad930
update parser
Former-commit-id: be99799413e1ba37807a02838bf2d87fd966bf55
2024-03-10 13:35:20 +08:00
hiyouga
4a4e4b4354
support layerwise galore
Former-commit-id: 8664262cde3919e10eaecbd66e8c5d356856362e
2024-03-10 00:24:11 +08:00
hiyouga
276def1897
fix #2732
Former-commit-id: 18ffce36b5ee0809f2e2905c2fd44843a3725ea0
2024-03-09 22:37:16 +08:00
hiyouga
868444e124
allow non-packing pretraining
Former-commit-id: bdb496644ce2c18806fc4fdae1fedcb3e5b5f808
2024-03-09 22:21:46 +08:00
hiyouga
1173441661
fix #2766
Former-commit-id: 412c52e325660e8b871ffd59f5564f84f46a143f
2024-03-09 21:35:24 +08:00
hiyouga
8f6eb1383d
use default arg for freeze tuning
Former-commit-id: af0e370fb16f3e0cf2f4c8036301d5253d8249b9
2024-03-09 06:08:48 +08:00
hiyouga
17e50bcbb1
add GaLore results
Former-commit-id: 818726e9bcdedfbd330ea7a60e02ee5b03aed459
2024-03-09 04:11:55 +08:00
hiyouga
5c00783697
update hardware requirements
Former-commit-id: 393c2de27ce0a2dee793092843ec0afa54f49a6d
2024-03-09 03:58:18 +08:00
hiyouga
eb363b04b9
update examples
Former-commit-id: 4c00bcdcaeb675c9fdb3e977c27c3604d7895ae2
2024-03-09 02:30:37 +08:00
hiyouga
c561b268ef
fix #2756 , patch #2746
Former-commit-id: e8dd38b7fdf8e172745d2538eb103895f2839c38
2024-03-09 02:01:26 +08:00
hoshi-hiyouga
36d65289d0
Merge pull request #2746 from stephen-nju/main
fix deepspeed ppo RuntimeError
Former-commit-id: 516d0ddc666c179616a2a610b1353728db57391e
2024-03-09 01:37:00 +08:00
hiyouga
247aab9066
Update setup.py
Former-commit-id: 74ff8664d783f428227fb62e6a1313a73cbd337d
2024-03-09 00:14:48 +08:00
hiyouga
398c261c7c
fix aqlm version
Former-commit-id: 10be2f0eccc3963a985afcd24e5b8b8fc638b1c3
2024-03-09 00:09:09 +08:00
hiyouga
ccec17f773
fix example params
Former-commit-id: 8a45213440ffc960947dd69ecf3b092aa724bef3
2024-03-08 20:41:43 +08:00
stephen_zhu
c69b9fbe58
update
Former-commit-id: aa71571b773c5dc527b17219ec87828e4455b330
2024-03-08 12:47:44 +08:00
stephen
495b858606
fix ppo runtime error
Former-commit-id: cdb7f82869b07d9d5d31b7b2aaf6b033bd00e32e
2024-03-08 11:48:26 +08:00
S3Studio
de41334055
Add dockerize support
Already tested with the Qwen:1.8B model and the alpaca_data_zh dataset. Some Python libraries were added to the Dockerfile in response to exception messages displayed during the test procedure.
Former-commit-id: 3d911ae713b901d6680a9f9ac82569cc5878f820
2024-03-08 10:47:28 +08:00
hiyouga
b268215a0e
update readme
Former-commit-id: 4a2cc60b9440d245141e9317c35a0ac4c687dbdb
2024-03-08 03:06:21 +08:00
hiyouga
7443ac3116
fix chat engine, update webui
Former-commit-id: 5d956e2a5167201aecdfce2794c25d8a2d84e234
2024-03-08 03:01:53 +08:00
hiyouga
0a0959facf
Update setup.py
Former-commit-id: 5cd4947650403490419e5bddf2b1ac7e137edf8b
2024-03-08 01:23:00 +08:00
hiyouga
2235020cc9
update galore args
Former-commit-id: 0ac6b40a4772b61a3476bb74b976d24c408a2c35
2024-03-08 01:17:32 +08:00
hiyouga
5b50458acf
fix galore
Former-commit-id: 33a4c24a8a3c153bc62edf74b9246699a0ae3233
2024-03-08 00:44:51 +08:00
hiyouga
f373290012
add Yi-9B model
Former-commit-id: 57452a4aa1d37a047d659f002c1aaa6246f64178
2024-03-07 23:11:57 +08:00
hiyouga
cb2bf680c9
add galore examples
Former-commit-id: 7230e1177daf4d96a1205565ab9335085cc8f3a7
2024-03-07 22:53:45 +08:00
hiyouga
2c010c72b8
support galore
Former-commit-id: 28f78621883917425fabe49f5473778111012127
2024-03-07 22:41:36 +08:00
hiyouga
1af71f548c
update readme
Former-commit-id: 725f7cd70fce502728f785282f1c0d59f23ff434
2024-03-07 20:34:49 +08:00
hiyouga
583d956bda
tiny fix
Former-commit-id: 77211d984385247bf7f5f8edea34e9a080a3dc9f
2024-03-07 20:29:34 +08:00
hoshi-hiyouga
86ba1a5c5b
Merge pull request #2739 from hiyouga/dev-vllm
support vllm
Former-commit-id: a0dc721816a176af9568553ae1a5d8bf0f318825
2024-03-07 20:28:18 +08:00
hiyouga
34533b2f35
support vllm
Former-commit-id: d07ad5cc1cdbc13879afd84f653afdfee03a6933
2024-03-07 20:26:31 +08:00
hiyouga
37e40563f1
fix #2735
Former-commit-id: f74f804a715dfb16bf24a056bc95db6b102f9ed7
2024-03-07 16:15:53 +08:00
hoshi-hiyouga
90e66c8d94
Merge pull request #2730 from cx2333-gt/main
fix flash_attn in train_web
Former-commit-id: 2185855bdb7d4cb55f3af796e35fb1b0e8dce5e3
2024-03-07 14:37:18 +08:00
cx2333
013c12a135
revert choice name
Former-commit-id: 94b7a1b91588716e9e91f7aae7126ed924e55953
2024-03-07 14:28:55 +08:00
hiyouga
843d3f7a97
fix chatglm3 template
Former-commit-id: 921ee822679bd10fcc14d084d0845e6132e570dd
2024-03-07 14:26:16 +08:00
hiyouga
a5dcb4fcf4
Update wechat.jpg
Former-commit-id: 08d7dc06f25f2d6fd1f5ea262005f62c67edfb12
2024-03-07 13:14:10 +08:00
cx2333
22624e566e
fix flash_attn in train_web
Former-commit-id: a8889498fa4e9b6c7a82422ed5b1da3662b48d42
2024-03-07 10:13:55 +08:00
hiyouga
31c618f1f7
tiny fix
Former-commit-id: 0048a2021e94d068f7c6054df0b9569ae4912eb1
2024-03-06 17:25:08 +08:00
hiyouga
8b6c178249
export use balanced gpu
Former-commit-id: 3e84f430b14a94e68f5815d8e412f0d74d28a04c
2024-03-06 16:33:14 +08:00
hiyouga
8b21a60d9c
fix add tokens
Former-commit-id: 9658c63cd94d28bba730a19f73397580b9865d6b
2024-03-06 15:04:02 +08:00
hiyouga
e887aface7
fix version checking
Former-commit-id: 3016e6565708637c1d760f2cd5a67cbd8a5a6c26
2024-03-06 14:51:51 +08:00
hiyouga
8d386775f2
update examples
Former-commit-id: d1587c80de2e3191952a952116039b719d8613d4
2024-03-06 13:14:57 +08:00
hiyouga
af526c3a46
fix arg dtype
Former-commit-id: e0c47358f9d09ab64acbb5ebafc61b52b5b1f2af
2024-03-05 20:53:30 +08:00
hiyouga
9561809ce9
improve aqlm optim
Former-commit-id: 259af60d28985b919911587716c24a3ac7f7de64
2024-03-05 20:49:50 +08:00
hiyouga
c776cdfc3e
optimize aqlm training
Former-commit-id: d3d3dac7070eb9055bcdc91eaf53f5b3741c0bda
2024-03-05 18:35:41 +08:00
hiyouga
0f2250b831
fix dora inference
Former-commit-id: ddf352f861e04e813cb8adeb4513964b4945081a
2024-03-05 11:51:41 +08:00
hiyouga
768358b960
fix export model
Former-commit-id: e5edcf440f2c96b90b1186ada887873f19d3c152
2024-03-05 11:05:41 +08:00