1062 Commits

Author SHA1 Message Date
hiyouga
196a33cca4 tiny fix
Former-commit-id: 98a42cbdaa4a90dbe5edda1c412c17e628324f52
2024-03-25 23:28:52 +08:00
hiyouga
e90c3769e5 update readme
Former-commit-id: 7b3d8188f5e4416f326b1dc98ad941020461f67f
2024-03-25 23:06:13 +08:00
hoshi-hiyouga
94fb50c52a Merge pull request #2967 from Tsumugii24/main
Update README_zh.md

Former-commit-id: f633ac6646306f448bc77c1e261c98f15421df75
2024-03-25 23:02:22 +08:00
Tsumugii24
03c387c543 Update README.md
Former-commit-id: 1704599503a4c6921a8e78c2b4b940232ca1ba5d
2024-03-25 22:54:38 +08:00
Tsumugii24
ce932cd472 Update README_zh.md
Former-commit-id: 7aa77a3451d48214a683b46fa5da41d0fa30d961
2024-03-25 22:54:26 +08:00
hiyouga
b18749fb01 add arg check
Former-commit-id: 1484f76a95bcf40e4c668d52fed68d68c9745a75
2024-03-25 22:42:58 +08:00
hiyouga
27151b8c65 release v0.6.0
Former-commit-id: 6f2b563f125fe51ee32753e58f902a4911ab757c
2024-03-25 22:38:56 +08:00
Tsumugii24
ad09c13a22 Update README_zh.md
Former-commit-id: bb4ca1691a943aef220f03547bbd6acb1e29b31c
2024-03-25 22:31:03 +08:00
hoshi-hiyouga
195bda3432 Merge pull request #2963 from rkinas/patch-1
Update requirements.txt

Former-commit-id: f33a3dfadc54c193a8ae24557a5a28f2d2013ec5
2024-03-25 21:49:34 +08:00
Remek Kinas
f4ee888dfa Update requirements.txt
Former-commit-id: b02899bf89eb77a7b0b60526fa7bfb7d3a79bdf7
2024-03-25 14:30:58 +01:00
hiyouga
2d73831177 tiny fix
Former-commit-id: 558a538724db373319a6bba26c76943bac1b5063
2024-03-25 21:18:08 +08:00
hoshi-hiyouga
abbbdae903 Merge pull request #2945 from marko1616/bugfix/lora-model-merge
Fix a crash caused by generation config validation when merging LoRA weights into some models with transformers > 4.36.2

Former-commit-id: 49f9dbb4b168b0e9f72e2233271aca130fca55e7
2024-03-25 13:36:08 +08:00
marko1616
1d0e24549f pass ruff check
Former-commit-id: c8f0d99704308ac1886b16e437dea601eb20658d
2024-03-24 16:12:10 +08:00
marko1616
a68101cbbb fix Llama lora merge crash
Former-commit-id: 6f080fdba3f99145b7722964dd027179dc2eeb2b
2024-03-24 03:06:11 +08:00
marko1616
645c27e5e2 fix Llama lora merge crash
Former-commit-id: 51349ea1ccbf3e53b408037986abd850a0963468
2024-03-24 02:55:23 +08:00
marko1616
c083708433 fix Llama lora merge crash
Former-commit-id: c1e2c4ea45ad210e776a192e05e226b34d764135
2024-03-24 02:44:35 +08:00
hiyouga
84c3d509fa fix #2936
Former-commit-id: 140ad4ad567de8817a14972175e668971bae6a0a
2024-03-24 00:43:21 +08:00
hiyouga
75829c8699 fix #2928
Former-commit-id: 7afbc85daee295cf38dcee9ded5afd87b2c4cfd1
2024-03-24 00:34:54 +08:00
hiyouga
58aa576ae5 fix #2941
Former-commit-id: a1c8c98c5fecfc0dd0ed1be33ee8dd2ade05b708
2024-03-24 00:28:44 +08:00
hiyouga
c765b4c1ac Update wechat.jpg
Former-commit-id: 564d57aa233e3c9f9c1a64bccf95553a7e47acd3
2024-03-22 14:00:37 +08:00
hoshi-hiyouga
4e067329a3 Merge pull request #2919 from 0xez/main
Update README.md, fix the release date of the paper

Former-commit-id: ce261fdd64ae29bb00310eba010a5a5a7384f7d6
2024-03-22 12:12:24 +08:00
0xez
028a8bc532 Update README_zh.md, fix the release date of the paper
Former-commit-id: be0360303d2e7275e14586dc503a9581f80ce303
2024-03-22 10:41:17 +08:00
0xez
3f50d572ed Update README.md, fix the release date of the paper
Former-commit-id: 675ba41562d812f169c6b2775e57a3f38fc8deee
2024-03-21 22:14:48 +08:00
hiyouga
cfcea16416 move file
Former-commit-id: 96702620c4620cb98a362ac8ea6d3dd82e5e07e3
2024-03-21 17:05:17 +08:00
hiyouga
63c83f3802 add citation
Former-commit-id: 5eaa50fa01a7172408840255d18bcc0ab43a01fb
2024-03-21 17:04:10 +08:00
hiyouga
0684e315be paper release
Former-commit-id: 0581bfdbc7d6c764e63f8d54271da7663ca354d9
2024-03-21 13:49:17 +08:00
hiyouga
ada7e20eb4 update readme
Former-commit-id: bfe7a9128952bacef93d5478938d3e088bd0480d
2024-03-21 00:48:42 +08:00
hiyouga
7999836fb6 support fsdp + qlora
Former-commit-id: 84082251621e1470b3b5406a56d0a967780a1804
2024-03-21 00:36:06 +08:00
hiyouga
6646e18c02 add orca_dpo_pairs dataset
Former-commit-id: 3271af2afc90f10dcb101aeb9d7e4ef254d2dc0e
2024-03-20 20:09:06 +08:00
hoshi-hiyouga
e8cf2794cd Merge pull request #2905 from SirlyDreamer/main
Follow HF_ENDPOINT environment variable

Former-commit-id: b2dfbd728fec976235c68ff977e874ea4ac81bbb
2024-03-20 18:09:54 +08:00
hiyouga
8717e98200 fix #2777 #2895
Former-commit-id: 9bec3c98a22c91b1c28fda757db51eb780291641
2024-03-20 17:59:45 +08:00
hiyouga
cf149bf43c fix #2346
Former-commit-id: 7b8f5029018f0481f7da83cc5ee4408d95c9beb2
2024-03-20 17:56:33 +08:00
SirlyDreamer
78359638e3 Follow HF_ENDPOINT environment variable
Former-commit-id: e165965341a150f6faa2c072a9281ad99d7e5ce8
2024-03-20 08:31:30 +00:00
hoshi-hiyouga
a9d85cf3c6 Merge pull request #2903 from khazic/main
Updated README with new information

Former-commit-id: a77303570994c3a3a2a0c2faae7fc089cac05629
2024-03-20 16:13:44 +08:00
khazic
c7824c42ff Updated README with new information
Former-commit-id: 8d10fa71c2b4fa2f79ebb08d5e916c3e3f9d7fbe
2024-03-20 14:38:08 +08:00
khazic
13bf8b1f91 Updated README with new information
Former-commit-id: 0531dac30d5cbee56b73e06230cd0a62928ee9ca
2024-03-20 14:21:16 +08:00
刘一博
5b8725399e Updated README with new information
Former-commit-id: df9b4fb90a076c18f533da32beb7c42ae5b9ed22
2024-03-20 14:11:28 +08:00
hiyouga
7fbdbc2419 Update wechat.jpg
Former-commit-id: bea31b9b12fe18a692590a89d263f9bfbae29698
2024-03-18 16:48:32 +08:00
hiyouga
3d483e0914 fix packages
Former-commit-id: 8e04794b2da067a4123b9d7091a54c5647f44244
2024-03-17 22:32:03 +08:00
hiyouga
a5537f3ee8 fix patcher
Former-commit-id: 85c376fc1e0bcc854ed6e70e6455a0b00b341655
2024-03-15 19:18:42 +08:00
hoshi-hiyouga
30765baa91 Merge pull request #2849 from S3Studio/DockerizeSupport
Improve Dockerize support

Former-commit-id: 113cc047198325b51dac50d8a7ea70396c51e0d9
2024-03-15 19:16:02 +08:00
hiyouga
06860e8f0f fix export
Former-commit-id: 6bc2c23b6d26b52f54ac37fa6149e6eb3cc18ee6
2024-03-15 15:06:30 +08:00
S3Studio
46ef7416e6 Use official Nvidia base image
Note that the flash-attn library is installed in this image and the qwen model will use it automatically.
However, if the host machine's GPU is not compatible with the library, an exception will be raised during the training process as follows:
FlashAttention only supports Ampere GPUs or newer.
So if the --flash_attn flag is not set, an additional patch for the qwen model's config is necessary to set the default value of use_flash_attn from "auto" to False.

Former-commit-id: e75407febdec086f2bdca723a7f69a92b3b1d63f
2024-03-15 08:59:13 +08:00
S3Studio
dcbc8168a8 improve Docker build and runtime parameters
Modify the installation method of the extra Python libraries.
Utilize shared memory of the host machine to increase training performance.

Former-commit-id: 6a5693d11d065f6e75c8cdd8b5ed962eb520953c
2024-03-15 08:57:46 +08:00
hiyouga
7ef49586be tiny fix
Former-commit-id: 6ebde4f23e761b8a3e3ea6ca6dff249e657608a1
2024-03-14 21:19:06 +08:00
hiyouga
2cf95d4efe fix export
Former-commit-id: 3b4a59bfb1866a270b9934a4a2303197ffdab531
2024-03-14 18:17:01 +08:00
hiyouga
edd28dbe2c fix bug
Former-commit-id: 8172530d54fbd42a9dd3219f06378563d62424e0
2024-03-13 23:55:31 +08:00
hiyouga
9ff7c99eb1 fix bug
Former-commit-id: 714d936dfbe022c4f2cfa6ff643e3482a3f96012
2024-03-13 23:43:42 +08:00
hiyouga
8b8671817f improve lora+ impl.
Former-commit-id: 72367307dfadf936fb989ebe8bc9f0ff229fb933
2024-03-13 23:32:51 +08:00
hoshi-hiyouga
4000de93ea Merge pull request #2830 from qibaoyuan/lora_plus
[FEATURE]: ADD LORA+ ALGORITHM

Former-commit-id: 4e5e99af4320db661a4eebaabc3284f73815ae4e
2024-03-13 20:15:46 +08:00