marko1616
5721074af1
fix blank line contains whitespace
Former-commit-id: d9a5134617d494ef13ba73f9c540123e89a8c29c
2024-03-30 23:46:55 +08:00
marko1616
67c05c2031
Fix Llama model save for full param train
Former-commit-id: eb178eaff390a1dc342cc35ab8c7820d654f3717
2024-03-30 23:45:04 +08:00
hiyouga
3bf6dde3a5
support save args in webui #2807 #3046
some ideas are borrowed from @marko1616
Former-commit-id: 7a086ed33388551e0f835adf61fac638f96ed188
2024-03-30 23:09:12 +08:00
hiyouga
808ad2071f
upgrade gradio to 4.21.0
Former-commit-id: 831c5321ac9b5ec91d9cc1dbcc87967dcdc486f5
2024-03-30 20:37:08 +08:00
hiyouga
fc066cad7f
release v0.6.1
Former-commit-id: ca793028c69433eae405009c5ebb790c6c2d40c4
2024-03-29 11:36:08 +08:00
hiyouga
e4f3d583df
fix #2982
Former-commit-id: 8d603f8820efd1617557f2bc5d9674143abe7c57
2024-03-28 20:22:31 +08:00
hiyouga
eac2a5b1d3
fix #3010
Former-commit-id: b19c14870d30c57fbea81e9cfa737d762922c54b
2024-03-28 18:31:17 +08:00
hiyouga
89c400633a
update trainers
Former-commit-id: 8c77b1091296e204dc3c8c1f157c288ca5b236bd
2024-03-28 18:16:27 +08:00
hoshi-hiyouga
ae9ad13f2a
fix ds optimizer
Former-commit-id: 3bcd41b639899e72bcabc51d59bac8967af19899
2024-03-26 23:39:56 +08:00
hiyouga
c311375b50
fix bug
Former-commit-id: 3164b4f11b72684c8aa2105037cb36c47b6acfd4
2024-03-26 17:30:12 +08:00
hiyouga
ec94e5e876
fix #2961
Former-commit-id: 511f6754026fbbf48bd481018015338a6a3ad92f
2024-03-26 17:26:14 +08:00
hiyouga
62312716d9
release v0.6.0 (real)
Former-commit-id: ba70aca8fb1275ed3b6af69a7e639303d7201da4
2024-03-25 23:37:48 +08:00
hiyouga
196a33cca4
tiny fix
Former-commit-id: 98a42cbdaa4a90dbe5edda1c412c17e628324f52
2024-03-25 23:28:52 +08:00
hiyouga
b18749fb01
add arg check
Former-commit-id: 1484f76a95bcf40e4c668d52fed68d68c9745a75
2024-03-25 22:42:58 +08:00
hiyouga
27151b8c65
release v0.6.0
Former-commit-id: 6f2b563f125fe51ee32753e58f902a4911ab757c
2024-03-25 22:38:56 +08:00
hiyouga
2d73831177
tiny fix
Former-commit-id: 558a538724db373319a6bba26c76943bac1b5063
2024-03-25 21:18:08 +08:00
marko1616
1d0e24549f
pass ruff check
Former-commit-id: c8f0d99704308ac1886b16e437dea601eb20658d
2024-03-24 16:12:10 +08:00
marko1616
a68101cbbb
fix Llama lora merge crash
Former-commit-id: 6f080fdba3f99145b7722964dd027179dc2eeb2b
2024-03-24 03:06:11 +08:00
marko1616
645c27e5e2
fix Llama lora merge crash
Former-commit-id: 51349ea1ccbf3e53b408037986abd850a0963468
2024-03-24 02:55:23 +08:00
marko1616
c083708433
fix Llama lora merge crash
Former-commit-id: c1e2c4ea45ad210e776a192e05e226b34d764135
2024-03-24 02:44:35 +08:00
hiyouga
84c3d509fa
fix #2936
Former-commit-id: 140ad4ad567de8817a14972175e668971bae6a0a
2024-03-24 00:43:21 +08:00
hiyouga
75829c8699
fix #2928
Former-commit-id: 7afbc85daee295cf38dcee9ded5afd87b2c4cfd1
2024-03-24 00:34:54 +08:00
hiyouga
58aa576ae5
fix #2941
Former-commit-id: a1c8c98c5fecfc0dd0ed1be33ee8dd2ade05b708
2024-03-24 00:28:44 +08:00
hiyouga
7999836fb6
support fsdp + qlora
Former-commit-id: 84082251621e1470b3b5406a56d0a967780a1804
2024-03-21 00:36:06 +08:00
hiyouga
8717e98200
fix #2777 #2895
Former-commit-id: 9bec3c98a22c91b1c28fda757db51eb780291641
2024-03-20 17:59:45 +08:00
hiyouga
cf149bf43c
fix #2346
Former-commit-id: 7b8f5029018f0481f7da83cc5ee4408d95c9beb2
2024-03-20 17:56:33 +08:00
hiyouga
3d483e0914
fix packages
Former-commit-id: 8e04794b2da067a4123b9d7091a54c5647f44244
2024-03-17 22:32:03 +08:00
hiyouga
a5537f3ee8
fix patcher
Former-commit-id: 85c376fc1e0bcc854ed6e70e6455a0b00b341655
2024-03-15 19:18:42 +08:00
hoshi-hiyouga
30765baa91
Merge pull request #2849 from S3Studio/DockerizeSupport
Improve Dockerize support
Former-commit-id: 113cc047198325b51dac50d8a7ea70396c51e0d9
2024-03-15 19:16:02 +08:00
hiyouga
06860e8f0f
fix export
Former-commit-id: 6bc2c23b6d26b52f54ac37fa6149e6eb3cc18ee6
2024-03-15 15:06:30 +08:00
S3Studio
46ef7416e6
Use official Nvidia base image
Note that the flash-attn library is installed in this image and the qwen model will use it automatically.
However, if the host machine's GPU is not compatible with the library, an exception will be raised during the training process as follows:
FlashAttention only supports Ampere GPUs or newer.
So if the --flash_attn flag is not set, an additional patch for the qwen model's config is necessary to set the default value of use_flash_attn from "auto" to False.
Former-commit-id: e75407febdec086f2bdca723a7f69a92b3b1d63f
2024-03-15 08:59:13 +08:00
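The config patch described in the note for commit 46ef7416e6 can be sketched roughly as follows. This is an illustrative sketch only, not the code from that commit: the function name `patch_qwen_flash_attn`, the `flash_attn_requested` parameter, and the use of `SimpleNamespace` as a stand-in for the model config are all assumptions. Only the `use_flash_attn` field and its "auto" default come from the commit message.

```python
from types import SimpleNamespace


def patch_qwen_flash_attn(config, flash_attn_requested):
    """Turn off flash attention on a qwen-style config unless it was requested.

    `config` is any object with `model_type` and `use_flash_attn` attributes
    (hypothetical stand-in for the real model config).
    """
    is_qwen = getattr(config, "model_type", None) == "qwen"
    still_auto = getattr(config, "use_flash_attn", None) == "auto"
    # Only override the "auto" default; an explicit True/False is left alone.
    if is_qwen and still_auto and not flash_attn_requested:
        config.use_flash_attn = False
    return config


# Stand-in config object; a real run would pass the loaded model config.
config = SimpleNamespace(model_type="qwen", use_flash_attn="auto")
patch_qwen_flash_attn(config, flash_attn_requested=False)
print(config.use_flash_attn)  # False
```

Overriding only the "auto" value keeps an explicit user setting intact, so the patch cannot silently disable flash attention on hardware where it was deliberately enabled.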
hiyouga
7ef49586be
tiny fix
Former-commit-id: 6ebde4f23e761b8a3e3ea6ca6dff249e657608a1
2024-03-14 21:19:06 +08:00
hiyouga
2cf95d4efe
fix export
Former-commit-id: 3b4a59bfb1866a270b9934a4a2303197ffdab531
2024-03-14 18:17:01 +08:00
hiyouga
edd28dbe2c
fix bug
Former-commit-id: 8172530d54fbd42a9dd3219f06378563d62424e0
2024-03-13 23:55:31 +08:00
hiyouga
9ff7c99eb1
fix bug
Former-commit-id: 714d936dfbe022c4f2cfa6ff643e3482a3f96012
2024-03-13 23:43:42 +08:00
hiyouga
8b8671817f
improve lora+ impl.
Former-commit-id: 72367307dfadf936fb989ebe8bc9f0ff229fb933
2024-03-13 23:32:51 +08:00
齐保元
24c9277488
[FEATURE]: ADD LORA+ ALGORITHM
Former-commit-id: a0965cd62c85545aa2364e244295df2963308354
2024-03-13 19:43:27 +08:00
hiyouga
922bd8864b
fix #2817
Former-commit-id: 0b4a5bf509a6fbf18337a29a6a498f33d0cbca76
2024-03-13 12:42:03 +08:00
hiyouga
8673abbe5e
fix #2802
Former-commit-id: b9f87cdc11b3fe712574b91455dc722b69c60c66
2024-03-13 12:33:45 +08:00
hiyouga
a74426df0f
fix kv cache
Former-commit-id: 96ce76cd2753bc91c781ad13aa8f7a972abe815a
2024-03-13 01:21:50 +08:00
hiyouga
bbf272f96e
support QDoRA
Former-commit-id: 19ef4826490b79e0c2aee20ad67430aa0e4724a7
2024-03-12 22:12:42 +08:00
hiyouga
096c31bfb6
patch for gemma cpt
Former-commit-id: 70a3052dd8a2d1322fa01ab19e369e465842d416
2024-03-12 21:21:54 +08:00
hiyouga
c28818c39f
fix plot issues
Former-commit-id: 60cc17f3a8b56c0b2ad76be7c10ca0b4e1738121
2024-03-12 18:41:35 +08:00
hiyouga
14ed926a2d
support olmo
Former-commit-id: b3247d6a1604f4cbeb0d7c163d0082ce91afb870
2024-03-12 18:30:38 +08:00
hiyouga
0b7e870b07
fix #2802
Former-commit-id: 8d8956bad542c0e1c0f7edbf4ffc22bb0f8788ae
2024-03-12 17:08:34 +08:00
hiyouga
7124b71676
fix #2782 #2798
Former-commit-id: 07f9b754a7418b489e839bd674aa47094583a92d
2024-03-12 15:53:29 +08:00
hiyouga
c88062347e
fix #2775
Former-commit-id: e874c00906c765b81c0e5ff9c7b3679557da8e0e
2024-03-11 00:42:54 +08:00
hiyouga
f776e738f8
tiny fix
Former-commit-id: 352693e2dcc8fc039b5d574e1a5709563929b0ce
2024-03-11 00:17:18 +08:00
hiyouga
566bfad930
update parser
Former-commit-id: be99799413e1ba37807a02838bf2d87fd966bf55
2024-03-10 13:35:20 +08:00
hiyouga
4a4e4b4354
support layerwise galore
Former-commit-id: 8664262cde3919e10eaecbd66e8c5d356856362e
2024-03-10 00:24:11 +08:00