Commit Graph

1511 Commits

Author SHA1 Message Date
hiyouga
634c44c51a Update wechat.jpg
Former-commit-id: dfd451b722
2024-03-13 19:03:00 +08:00
hiyouga
922bd8864b fix #2817
Former-commit-id: 0b4a5bf509
2024-03-13 12:42:03 +08:00
hiyouga
8673abbe5e fix #2802
Former-commit-id: b9f87cdc11
2024-03-13 12:33:45 +08:00
hiyouga
a74426df0f fix kv cache
Former-commit-id: 96ce76cd27
2024-03-13 01:21:50 +08:00
hiyouga
bbf272f96e support QDoRA
Former-commit-id: 19ef482649
2024-03-12 22:12:42 +08:00
hiyouga
096c31bfb6 patch for gemma cpt
Former-commit-id: 70a3052dd8
2024-03-12 21:21:54 +08:00
hiyouga
c28818c39f fix plot issues
Former-commit-id: 60cc17f3a8
2024-03-12 18:41:35 +08:00
hiyouga
14ed926a2d support olmo
Former-commit-id: b3247d6a16
2024-03-12 18:30:38 +08:00
hiyouga
0b7e870b07 fix #2802
Former-commit-id: 8d8956bad5
2024-03-12 17:08:34 +08:00
hiyouga
b983de9f4f fix #2803
Former-commit-id: 06c97083e1
2024-03-12 16:57:39 +08:00
hiyouga
7124b71676 fix #2782 #2798
Former-commit-id: 07f9b754a7
2024-03-12 15:53:29 +08:00
hoshi-hiyouga
52f14211e3 Merge pull request #2743 from S3Studio/DockerizeSupport
Add dockerize support

Former-commit-id: c901aa63ff
2024-03-12 00:05:49 +08:00
hiyouga
c88062347e fix #2775
Former-commit-id: e874c00906
2024-03-11 00:42:54 +08:00
hiyouga
f776e738f8 tiny fix
Former-commit-id: 352693e2dc
2024-03-11 00:17:18 +08:00
hiyouga
566bfad930 update parser
Former-commit-id: be99799413
2024-03-10 13:35:20 +08:00
hiyouga
4a4e4b4354 support layerwise galore
Former-commit-id: 8664262cde
2024-03-10 00:24:11 +08:00
hiyouga
276def1897 fix #2732
Former-commit-id: 18ffce36b5
2024-03-09 22:37:16 +08:00
hiyouga
868444e124 allow non-packing pretraining
Former-commit-id: bdb496644c
2024-03-09 22:21:46 +08:00
hiyouga
1173441661 fix #2766
Former-commit-id: 412c52e325
2024-03-09 21:35:24 +08:00
hiyouga
8f6eb1383d use default arg for freeze tuning
Former-commit-id: af0e370fb1
2024-03-09 06:08:48 +08:00
hiyouga
17e50bcbb1 add GaLore results
Former-commit-id: 818726e9bc
2024-03-09 04:11:55 +08:00
hiyouga
5c00783697 update hardware requirements
Former-commit-id: 393c2de27c
2024-03-09 03:58:18 +08:00
hiyouga
eb363b04b9 update examples
Former-commit-id: 4c00bcdcae
2024-03-09 02:30:37 +08:00
hiyouga
c561b268ef fix #2756 , patch #2746
Former-commit-id: e8dd38b7fd
2024-03-09 02:01:26 +08:00
hoshi-hiyouga
36d65289d0 Merge pull request #2746 from stephen-nju/main
fix deepspeed ppo RuntimeError

Former-commit-id: 516d0ddc66
2024-03-09 01:37:00 +08:00
hiyouga
247aab9066 Update setup.py
Former-commit-id: 74ff8664d7
2024-03-09 00:14:48 +08:00
hiyouga
398c261c7c fix aqlm version
Former-commit-id: 10be2f0ecc
2024-03-09 00:09:09 +08:00
hiyouga
ccec17f773 fix example params
Former-commit-id: 8a45213440
2024-03-08 20:41:43 +08:00
stephen_zhu
c69b9fbe58 update
Former-commit-id: aa71571b77
2024-03-08 12:47:44 +08:00
stephen
495b858606 fix ppo runtime error
Former-commit-id: cdb7f82869
2024-03-08 11:48:26 +08:00
S3Studio
de41334055 Add dockerize support
Tested with the Qwen:1.8B model and the alpaca_data_zh dataset. Some Python libraries were added to the Dockerfile in response to exceptions raised during testing.

Former-commit-id: 3d911ae713
2024-03-08 10:47:28 +08:00
hiyouga
b268215a0e update readme
Former-commit-id: 4a2cc60b94
2024-03-08 03:06:21 +08:00
hiyouga
7443ac3116 fix chat engine, update webui
Former-commit-id: 5d956e2a51
2024-03-08 03:01:53 +08:00
hiyouga
0a0959facf Update setup.py
Former-commit-id: 5cd4947650
2024-03-08 01:23:00 +08:00
hiyouga
2235020cc9 update galore args
Former-commit-id: 0ac6b40a47
2024-03-08 01:17:32 +08:00
hiyouga
5b50458acf fix galore
Former-commit-id: 33a4c24a8a
2024-03-08 00:44:51 +08:00
hiyouga
f373290012 add Yi-9B model
Former-commit-id: 57452a4aa1
2024-03-07 23:11:57 +08:00
hiyouga
cb2bf680c9 add galore examples
Former-commit-id: 7230e1177d
2024-03-07 22:53:45 +08:00
hiyouga
2c010c72b8 support galore
Former-commit-id: 28f7862188
2024-03-07 22:41:36 +08:00
hiyouga
1af71f548c update readme
Former-commit-id: 725f7cd70f
2024-03-07 20:34:49 +08:00
hiyouga
583d956bda tiny fix
Former-commit-id: 77211d9843
2024-03-07 20:29:34 +08:00
hoshi-hiyouga
86ba1a5c5b Merge pull request #2739 from hiyouga/dev-vllm
support vllm

Former-commit-id: a0dc721816
2024-03-07 20:28:18 +08:00
hiyouga
34533b2f35 support vllm
Former-commit-id: d07ad5cc1c
2024-03-07 20:26:31 +08:00
hiyouga
37e40563f1 fix #2735
Former-commit-id: f74f804a71
2024-03-07 16:15:53 +08:00
hoshi-hiyouga
90e66c8d94 Merge pull request #2730 from cx2333-gt/main
fix flash_attn in train_web

Former-commit-id: 2185855bdb
2024-03-07 14:37:18 +08:00
cx2333
013c12a135 revert choice name
Former-commit-id: 94b7a1b915
2024-03-07 14:28:55 +08:00
hiyouga
843d3f7a97 fix chatglm3 template
Former-commit-id: 921ee82267
2024-03-07 14:26:16 +08:00
hiyouga
a5dcb4fcf4 Update wechat.jpg
Former-commit-id: 08d7dc06f2
2024-03-07 13:14:10 +08:00
cx2333
22624e566e fix flash_attn in train_web
Former-commit-id: a8889498fa
2024-03-07 10:13:55 +08:00
hiyouga
31c618f1f7 tiny fix
Former-commit-id: 0048a2021e
2024-03-06 17:25:08 +08:00