Mirror of https://github.com/hiyouga/LLaMA-Factory.git, synced 2025-08-04 04:32:50 +08:00

Commit message: add readme

Former-commit-id: 5aa6751e52b5c2e06727c50e60218226b146b7bf
This commit is contained in: parent 63e12226a0, commit b2200409f5
README.md (29 lines changed)
@@ -51,6 +51,8 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/

## Changelog

[23/12/01] We supported **[ModelScope Hub](https://www.modelscope.cn/models)** to accelerate model downloading. Add the environment variable `USE_MODELSCOPE_HUB=1` to your command line, then you can use the model IDs of the ModelScope Hub.

[23/10/21] We supported the **[NEFTune](https://arxiv.org/abs/2310.05914)** trick for fine-tuning. Try the `--neft_alpha` argument to activate NEFTune, e.g., `--neft_alpha 5`.

[23/09/27] We supported **$S^2$-Attn** proposed by [LongLoRA](https://github.com/dvlab-research/LongLoRA) for the LLaMA models. Try the `--shift_attn` argument to enable shift short attention.
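The NEFTune entry above can be illustrated with a minimal sketch: NEFTune adds uniform noise to the input embeddings during fine-tuning, scaled by `alpha / sqrt(seq_len * hidden_dim)`. The function below is a hypothetical NumPy stand-in, assuming embeddings of shape `(seq_len, hidden_dim)`; the real implementation hooks the model's embedding layer forward pass.

```python
import math

import numpy as np


def neftune_noise(embeddings: np.ndarray, neft_alpha: float = 5.0, rng=None) -> np.ndarray:
    """Add NEFTune noise to a (seq_len, hidden_dim) embedding matrix.

    Noise is drawn uniformly from [-scale, scale] with
    scale = neft_alpha / sqrt(seq_len * hidden_dim), as in the NEFTune paper.
    """
    rng = rng or np.random.default_rng(0)
    seq_len, hidden_dim = embeddings.shape
    scale = neft_alpha / math.sqrt(seq_len * hidden_dim)
    noise = rng.uniform(-scale, scale, size=embeddings.shape)
    return embeddings + noise
```

A larger `--neft_alpha` injects stronger noise; the scaling keeps the perturbation magnitude roughly constant across sequence lengths and model widths.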
@@ -227,6 +229,33 @@ If you want to enable the quantized LoRA (QLoRA) on the Windows platform, you wi

```
pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.39.1-py3-none-win_amd64.whl
```

### Use ModelScope Models

If you have trouble downloading models from HuggingFace, we also support the ModelScope Hub. To use LLaMA-Factory together with ModelScope, please add an environment variable:

```shell
export USE_MODELSCOPE_HUB=1
```

> [!NOTE]
> Please use integers only: 0 or unset means the HuggingFace Hub is used; any other value means the ModelScope Hub is used.

Then you can use LLaMA-Factory with ModelScope model IDs:

```shell
python src/train_bash.py \
    --model_name_or_path ZhipuAI/chatglm3-6b \
    ... other arguments
# You can find all model IDs at: https://www.modelscope.cn/models
```

The web demo also supports ModelScope. After setting the environment variable, run:

```shell
CUDA_VISIBLE_DEVICES=0 python src/train_web.py
```

### Train on a single GPU

> [!IMPORTANT]
README_zh.md (29 lines changed)
@@ -51,6 +51,8 @@ https://github.com/hiyouga/LLaMA-Factory/assets/16256802/6ba60acc-e2e2-4bec-b846

## Changelog

[23/12/01] We supported the **[ModelScope Hub](https://www.modelscope.cn/models)** to accelerate model downloading. Add the environment variable `USE_MODELSCOPE_HUB=1` before the launch command to enable it.

[23/10/21] We supported the **[NEFTune](https://arxiv.org/abs/2310.05914)** training trick. Use the `--neft_alpha` argument to enable NEFTune, e.g., `--neft_alpha 5`.

[23/09/27] We supported **$S^2$-Attn** proposed by [LongLoRA](https://github.com/dvlab-research/LongLoRA) for the LLaMA models. Use the `--shift_attn` argument to enable it.
@@ -227,6 +229,33 @@ pip install -r requirements.txt

```
pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.39.1-py3-none-win_amd64.whl
```

### Use ModelScope Models

If you have trouble downloading models from HuggingFace, we also support the ModelScope Hub. Just add an environment variable:

```shell
export USE_MODELSCOPE_HUB=1
```

> [!NOTE]
> This environment variable accepts integers only: 0 or unset means HuggingFace is used; any other value means ModelScope is used.

Then you can specify ModelScope model IDs on the command line:

```shell
python src/train_bash.py \
    --model_name_or_path ZhipuAI/chatglm3-6b \
    ... other arguments
# You can find all available models at: https://www.modelscope.cn/models
```

The web demo also supports ModelScope. After setting the environment variable, run:

```shell
CUDA_VISIBLE_DEVICES=0 python src/train_web.py
```

### Train on a single GPU

> [!IMPORTANT]
```diff
@@ -43,7 +43,7 @@ def register_model_group(
     else:
         assert prefix == name.split("-")[0], "prefix should be identical."

-    if not os.environ.get('USE_MODELSCOPE_HUB', False):
+    if not int(os.environ.get('USE_MODELSCOPE_HUB', '0')):
         # If path is a string, we treat it as a huggingface model-id by default.
         SUPPORTED_MODELS[name] = path["hf"] if isinstance(path, dict) else path
     elif isinstance(path, dict) and "ms" in path:
```
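The change above swaps a truthy-string check for an integer parse. In Python every non-empty string is truthy, so the old check treated `USE_MODELSCOPE_HUB=0` as enabled; parsing with `int()` matches the README's "integers only" contract. A hypothetical helper illustrating the difference:

```python
import os


def use_modelscope() -> bool:
    # New behavior: only a nonzero integer value enables the ModelScope hub.
    return bool(int(os.environ.get("USE_MODELSCOPE_HUB", "0")))


os.environ["USE_MODELSCOPE_HUB"] = "0"
# Old check: the non-empty string "0" is truthy, so it wrongly reads as enabled.
assert bool(os.environ.get("USE_MODELSCOPE_HUB", False)) is True
# New check: parses the integer value, so "0" correctly disables the hub.
assert use_modelscope() is False

os.environ["USE_MODELSCOPE_HUB"] = "1"
assert use_modelscope() is True
```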
|
@ -235,7 +235,7 @@ def load_model_and_tokenizer(
|
|||||||
|
|
||||||
|
|
||||||
def try_download_model_from_ms(model_args):
|
def try_download_model_from_ms(model_args):
|
||||||
if os.environ.get('USE_MODELSCOPE_HUB', False) and not os.path.exists(model_args.model_name_or_path):
|
if int(os.environ.get('USE_MODELSCOPE_HUB', '0')) and not os.path.exists(model_args.model_name_or_path):
|
||||||
try:
|
try:
|
||||||
from modelscope import snapshot_download
|
from modelscope import snapshot_download
|
||||||
revision = model_args.model_revision
|
revision = model_args.model_revision
|
||||||
```diff
@@ -243,5 +243,5 @@ def try_download_model_from_ms(model_args):
                 revision = 'master'
             model_args.model_name_or_path = snapshot_download(model_args.model_name_or_path, revision)
         except ImportError as e:
-            raise ImportError(f'You are using `USE_MODELSCOPE_HUB=True` but you have no modelscope sdk installed. '
+            raise ImportError(f'You are using `USE_MODELSCOPE_HUB=1` but you have no modelscope sdk installed. '
                               f'Please install it by `pip install modelscope -U`') from e
```
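Putting the patched pieces together, the resolution logic can be sketched as a standalone function. Names are simplified from the diff: this sketch takes `model_name_or_path` directly instead of a `model_args` object, and defaults `revision` rather than reading it from the arguments.

```python
import os


def try_download_model_from_ms(model_name_or_path: str, revision: str = "master") -> str:
    """Resolve a ModelScope model ID to a local path when the hub is enabled.

    Mirrors the patched logic: download only when USE_MODELSCOPE_HUB parses to
    a nonzero integer and the path does not already exist locally; otherwise
    return the path unchanged so HuggingFace loading proceeds as usual.
    """
    if int(os.environ.get("USE_MODELSCOPE_HUB", "0")) and not os.path.exists(model_name_or_path):
        try:
            from modelscope import snapshot_download
        except ImportError as e:
            raise ImportError(
                "You are using `USE_MODELSCOPE_HUB=1` but you have no modelscope sdk installed. "
                "Please install it by `pip install modelscope -U`"
            ) from e
        return snapshot_download(model_name_or_path, revision)
    return model_name_or_path
```

With the hub disabled, the function is a no-op passthrough, so existing HuggingFace workflows are unaffected.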