patch modelscope

2026-03-07 20:26:00 +08:00 · 2023-12-01 22:53:15 +08:00
parent 3a64506031
commit bd42c229b0
7 changed files with 312 additions and 222 deletions
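
Reviewer note: the README hunk in this commit removes the note explaining how `USE_MODELSCOPE_HUB` is interpreted (unset or `0` selects the Hugging Face hub; any other integer selects the ModelScope Hub). Below is a minimal Python sketch of that rule. The helper name `use_modelscope` and its `environ` parameter are hypothetical, introduced only for illustration, and this is not necessarily how the project itself reads the variable.

```python
import os
from typing import Mapping, Optional


def use_modelscope(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the ModelScope Hub should be used (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # An unset variable is treated like "0", i.e. the Hugging Face hub.
    raw = env.get("USE_MODELSCOPE_HUB", "0")
    # Only integers are accepted; int() raises ValueError for anything else.
    return int(raw) != 0
```

Under this sketch, a value of `1` (or any other non-zero integer) selects ModelScope, while a non-integer value such as `true` raises `ValueError` rather than being silently accepted.
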
--- a/README.md
+++ b/README.md
@@ -44,17 +44,23 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/

 ![benchmark](assets/benchmark.svg)

+<details><summary>Definitions</summary>
+
 - **Training Speed**: the number of training samples processed per second during the training. (bs=4, cutoff_len=1024)
 - **Rouge Score**: Rouge-2 score on the development set of the [advertising text generation](https://aclanthology.org/D19-1321.pdf) task. (bs=4, cutoff_len=1024)
 - **GPU Memory**: Peak GPU memory usage in 4-bit quantized training. (bs=1, cutoff_len=1024)
 - We adopt `pre_seq_len=128` for ChatGLM's P-Tuning and `lora_rank=32` for LLaMA-Factory's LoRA tuning.

+</details>
+
 ## Changelog

-[23/12/01] We supported **[ModelScope Hub](https://www.modelscope.cn/models)** to accelerate model downloading. Add environment variable `USE_MODELSCOPE_HUB=1` to your command line, then you can use the model-id of ModelScope Hub.
+[23/12/01] We supported downloading pre-trained models from the **[ModelScope Hub](https://modelscope.cn/models)** for users in mainland China. See [this tutorial](#use-modelscope-models-optional) for usage.

 [23/10/21] We supported **[NEFTune](https://arxiv.org/abs/2310.05914)** trick for fine-tuning. Try `--neft_alpha` argument to activate NEFTune, e.g., `--neft_alpha 5`.

+<details><summary>Full Changelog</summary>
+
 [23/09/27] We supported **$S^2$-Attn** proposed by [LongLoRA](https://github.com/dvlab-research/LongLoRA) for the LLaMA models. Try `--shift_attn` argument to enable shift short attention.

 [23/09/23] We integrated MMLU, C-Eval and CMMLU benchmarks in this repo. See [this example](#evaluation) to evaluate your models.
@@ -79,6 +85,8 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/

 [23/06/03] We supported quantized training and inference (aka **[QLoRA](https://github.com/artidoro/qlora)**). Try `--quantization_bit 4/8` argument to work with quantized models.

+</details>
+
 ## Supported Models

 | Model                                                    | Model size                  | Default module    | Template  |
@@ -231,31 +239,26 @@ If you want to enable the quantized LoRA (QLoRA) on the Windows platform, you wi
 pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.39.1-py3-none-win_amd64.whl
 ```

-### Use ModelScope Models
+### Use ModelScope Models (optional)

-If you have trouble with downloading models from HuggingFace, we have supported ModelScope Hub. To use LLaMA-Factory together with ModelScope, please add a environment variable:
+If you have trouble downloading models from Hugging Face, you can use LLaMA-Factory together with ModelScope as follows.

-```shell
-export USE_MODELSCOPE_HUB=1
+```bash
+export USE_MODELSCOPE_HUB=1 # `set USE_MODELSCOPE_HUB=1` for Windows
 ```

-> [!NOTE]
->
-> Please use integers only. 0 or not set for using HuggingFace hub. Other values will be treated as use ModelScope hub.
+Then you can train the corresponding model by specifying a model ID from the ModelScope Hub (see the full list of model IDs at [ModelScope Hub](https://modelscope.cn/models)).

-Then you can use LLaMA-Factory with ModelScope model-ids:
-
-```shell
+```bash
 python src/train_bash.py \
-    --model_name_or_path ZhipuAI/chatglm3-6b \
-    ... other arguments
-# You can find all model ids in this link: https://www.modelscope.cn/models
+    --model_name_or_path modelscope/Llama-2-7b-ms \
+    ... # arguments (same as above)
 ```

-Web demo also supports ModelScope, after setting the environment variable please run with this command:
+LLaMA Board also supports using the models on the ModelScope Hub.

-```shell
-CUDA_VISIBLE_DEVICES=0 python src/train_web.py
+```bash
+CUDA_VISIBLE_DEVICES=0 USE_MODELSCOPE_HUB=1 python src/train_web.py
 ```

 ### Train on a single GPU