hiyouga
e194efab10
fix patcher
...
Former-commit-id: 6a5ad99c8cbf6b7def0a130306d49e7d1eb4e5a5
2024-03-15 19:18:42 +08:00
hoshi-hiyouga
772fc2eac7
Merge pull request #2849 from S3Studio/DockerizeSupport
...
Improve Dockerize support
Former-commit-id: b63cba317266f5ba217de54fda77ec26a4df344d
2024-03-15 19:16:02 +08:00
hiyouga
ed020579dc
fix export
...
Former-commit-id: 4e996f194406d7eb27b2bde290a12c82c41219d0
2024-03-15 15:06:30 +08:00
S3Studio
096869c7b6
Use official Nvidia base image
...
Note that the flash-attn library is installed in this image and the qwen model will use it automatically.
However, if the the host machine's GPU is not compatible with the library, an exception will be raised during the training process as follows:
FlashAttention only supports Ampere GPUs or newer.
So if the --flash_attn flag is not set, an additional patch for the qwen model's config is necessary to set the default value of use_flash_attn from "auto" to False.
Former-commit-id: cd2f5717d676e1a5afd2f4e7a38402d2e55e7479
2024-03-15 08:59:13 +08:00
hiyouga
623ee1bd88
tiny fix
...
Former-commit-id: bf8123669be334338b4268d0a8f7703ff2cf6255
2024-03-14 21:19:06 +08:00
hiyouga
aabe90343e
fix export
...
Former-commit-id: c9b968b84c97c9a00fbb43194c3adc9354d74f3b
2024-03-14 18:17:01 +08:00
hiyouga
764cfb506d
fix bug
...
Former-commit-id: 38c618b797ec219c2c45de960c9cbe50ec524c94
2024-03-13 23:55:31 +08:00
hiyouga
249ad56075
fix bug
...
Former-commit-id: 47ee0276830adbed65bc111d5a83049e77ad360a
2024-03-13 23:43:42 +08:00
hiyouga
46f99ff277
improve lora+ impl.
...
Former-commit-id: 332bad25455a70ad9204e7dd384bb086d789aa39
2024-03-13 23:32:51 +08:00
齐保元
3c91e86268
[FEATURE]: ADD LORA+ ALGORITHM
...
Former-commit-id: c35b3c3b1e27171f8a703f88ede1dc8a84c80a56
2024-03-13 19:43:27 +08:00
hiyouga
42473ec150
fix #2817
...
Former-commit-id: f1c8b8127b3c1ac095176015af5ec92d37a11efe
2024-03-13 12:42:03 +08:00
hiyouga
6a4e4b9c5b
fix #2802
...
Former-commit-id: f4c56ccd785790c02f0d1275cd75958677a18690
2024-03-13 12:33:45 +08:00
hiyouga
9a784fb4f3
fix kv cache
...
Former-commit-id: a9588e36e95bed896eea8d79ba7108447ff08f4b
2024-03-13 01:21:50 +08:00
hiyouga
43fd80a1aa
support QDoRA
...
Former-commit-id: d8ad1c5ef08e733e52084de271aad762b1613129
2024-03-12 22:12:42 +08:00
hiyouga
e6ab1a57ea
patch for gemma cpt
...
Former-commit-id: fc0b19c62f52a90d78b63761dda3d8970a42f2da
2024-03-12 21:21:54 +08:00
hiyouga
282edb9161
fix plot issues
...
Former-commit-id: 01ae196b4916433da9aeec9c0b5c660c6b34464c
2024-03-12 18:41:35 +08:00
hiyouga
dff77004f2
support olmo
...
Former-commit-id: 2719510e8c6baa591c74458b773e4e47215e6052
2024-03-12 18:30:38 +08:00
hiyouga
6c1b4aec75
fix #2802
...
Former-commit-id: 1370db270d7ba1a20468abdb29193ce7534d1b4f
2024-03-12 17:08:34 +08:00
hiyouga
c9ed3fc3a4
fix #2782 #2798
...
Former-commit-id: eb3ab610610a0964bc8a1c9fa015805353f04c31
2024-03-12 15:53:29 +08:00
hiyouga
4f9a47c026
fix #2775
...
Former-commit-id: a5c7feb3e8089f4deff760b00a9f84425957c419
2024-03-11 00:42:54 +08:00
hiyouga
3fcb1c6d09
tiny fix
...
Former-commit-id: 1d22c87db2449c7d9915842b70fbd59ce9c2dd70
2024-03-11 00:17:18 +08:00
hiyouga
7c492864e9
update parser
...
Former-commit-id: d98258aa08d93494ad50d7786064e7fda15f6ca9
2024-03-10 13:35:20 +08:00
hiyouga
7ff8a064f3
support layerwise galore
...
Former-commit-id: d43a4da0947897d0be3f62fad3107754d4c89f2b
2024-03-10 00:24:11 +08:00
hiyouga
c635bbe465
fix #2732
...
Former-commit-id: bc39ad1d102b91d5417daa38b8a581e1e1ab2af9
2024-03-09 22:37:16 +08:00
hiyouga
4881f4e631
allow non-packing pretraining
...
Former-commit-id: 3fee5cc5a3db9ce874ad90f2500ec092d904bd4e
2024-03-09 22:21:46 +08:00
hiyouga
c631799f5d
fix #2766
...
Former-commit-id: a8cd556230c1d0bc4e090acc2276c035910ce6f6
2024-03-09 21:35:24 +08:00
hiyouga
48846676d8
use default arg for freeze tuning
...
Former-commit-id: a38fd7c8b39cb59fb61c26fdf80aaa6f2d0623b9
2024-03-09 06:08:48 +08:00
hiyouga
5d7d8bd55c
update hardware requirements
...
Former-commit-id: 604b3d10fc1448f702943114b66b97bded21e080
2024-03-09 03:58:18 +08:00
hiyouga
43b2ede0f8
fix #2756 , patch #2746
...
Former-commit-id: 627d1c91e675f1d9ebf47bad123cbbf29821da4d
2024-03-09 02:01:26 +08:00
hoshi-hiyouga
2f095e2017
Merge pull request #2746 from stephen-nju/main
...
fix deepspeed ppo RuntimeError
Former-commit-id: 656c653f0c628f9494b4d7ae12e60c8eeec1ea7a
2024-03-09 01:37:00 +08:00
hiyouga
9b97b23ce7
fix aqlm version
...
Former-commit-id: 05673f81f0295c76957f3247c62f95fda322a63e
2024-03-09 00:09:09 +08:00
stephen_zhu
940c00e7ae
update
...
Former-commit-id: 295f9ef2eff2e8b5d7a21d3da8dd3e6eb2a42006
2024-03-08 12:47:44 +08:00
stephen
18cfd5f349
fix ppo runtime error
...
Former-commit-id: 14e2f221e3e720075e59065a3dc42aa4d993a8b6
2024-03-08 11:48:26 +08:00
hiyouga
48d4364586
fix chat engine, update webui
...
Former-commit-id: 8b32dddd7d883bae07735796a517927c79d1c33b
2024-03-08 03:01:53 +08:00
hiyouga
3879d79b89
update galore args
...
Former-commit-id: c7479a7976f773feb36aab4fdb0500be53d83b6a
2024-03-08 01:17:32 +08:00
hiyouga
e416cecf62
fix galore
...
Former-commit-id: 62a3ceeef8f60caef43ccc7f971a0c9184e21296
2024-03-08 00:44:51 +08:00
hiyouga
81fcb80466
add Yi-9B model
...
Former-commit-id: bfcb0245b832242eefb84de6f70bd75544f3ceb7
2024-03-07 23:11:57 +08:00
hiyouga
1e6fb6c8aa
support galore
...
Former-commit-id: b67a4a46a88d83bb2a3459b3317b66cda15e0171
2024-03-07 22:41:36 +08:00
hiyouga
056d2d956a
support vllm
...
Former-commit-id: 889f6e910e654d8ec3922c2185042d737ffbf1c3
2024-03-07 20:26:31 +08:00
hiyouga
9a69cadab3
fix #2735
...
Former-commit-id: 416f6333f66b6afd70a3a936d82593efca583235
2024-03-07 16:15:53 +08:00
hoshi-hiyouga
3de642bffd
Merge pull request #2730 from cx2333-gt/main
...
fix flash_attn in train_web
Former-commit-id: eff0b774fc8e1a5a07a2554d611cb85bef439dec
2024-03-07 14:37:18 +08:00
cx2333
286b9d9849
revert choice name
...
Former-commit-id: 7832e68072219c7d1f562aee868812a4d655f4e0
2024-03-07 14:28:55 +08:00
hiyouga
cef1ede826
fix chatglm3 template
...
Former-commit-id: 9be0aa70fdd2e9ec208aa1850ace5c287efc8c3a
2024-03-07 14:26:16 +08:00
cx2333
5007566588
fix flash_attn in train_web
...
Former-commit-id: 5f340e362b0e91fec76c19c77c5705bba1db481a
2024-03-07 10:13:55 +08:00
hiyouga
e93fb3cc6c
tiny fix
...
Former-commit-id: c3145afa4164dd28888f17599a154f7dddbe9326
2024-03-06 17:25:08 +08:00
hiyouga
7578209735
export use balanced gpu
...
Former-commit-id: 710487dc694489bf3dfe54f8d32df80ce46439e4
2024-03-06 16:33:14 +08:00
hiyouga
67f02f75d0
fix add tokens
...
Former-commit-id: ff5353681a87d033903bf8cf6133c6bdb3fa9e5a
2024-03-06 15:04:02 +08:00
hiyouga
73d9dfc7ab
fix version checking
...
Former-commit-id: 5780da8d640609cca388f55983d0251e5547209a
2024-03-06 14:51:51 +08:00
hiyouga
3168abc0a1
fix arg dtype
...
Former-commit-id: 999ae05655815ac04ababddae55d9343f5d39f84
2024-03-05 20:53:30 +08:00
hiyouga
46ee267cfc
improve aqlm optim
...
Former-commit-id: 81be999b407e988c2f42764d827ac859d079ed3e
2024-03-05 20:49:50 +08:00