update readme

Former-commit-id: 34005252df4b015fd06a229b0be882ed64672cc1
hiyouga 2023-09-10 20:52:21 +08:00
parent 8ab5566dc0
commit 7a715aac55
2 changed files with 3 additions and 17 deletions

View File

@@ -65,7 +65,6 @@
| [ChatGLM2](https://github.com/THUDM/ChatGLM2-6B) | 6B | query_key_value | chatglm2 |

> **Note**
- >
> **Default module** is used for the `--lora_target` argument, you can use `--lora_target all` to specify all the available modules.
>
> For the "base" models, the `--template` argument can be chosen from `default`, `alpaca`, `vicuna` etc. But make sure to use the corresponding template for the "chat" models.
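For readers skimming this diff, a minimal fine-tuning invocation that exercises `--lora_target` and `--template` might look like the sketch below. It assumes the `src/train_bash.py` interface referenced later in this README; the model path, dataset name, output directory, and every flag not quoted in this excerpt are placeholders rather than documented defaults.

```bash
# Hypothetical sketch: LoRA fine-tuning of a "base" model with the default template.
# --lora_target all targets every available module; for a "chat" model, swap in
# the template that matches it. Flags not quoted in this excerpt are assumptions.
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path meta-llama/Llama-2-7b-hf \
    --dataset alpaca_gpt4_en \
    --template default \
    --finetuning_type lora \
    --lora_target all \
    --output_dir path_to_sft_checkpoint
```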
@@ -81,7 +80,6 @@
| DPO Training | :white_check_mark: | | :white_check_mark: | :white_check_mark: |

> **Note**
- >
> Use `--quantization_bit 4/8` argument to enable QLoRA.

## Provided Datasets
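As a sketch of the note above, QLoRA only adds the quantization flag to an otherwise ordinary LoRA run; apart from `--quantization_bit`, the flags below repeat the assumptions of the previous sketch.

```bash
# Hypothetical sketch: the same LoRA run with the base model loaded in 4-bit
# precision (QLoRA); use 8 instead of 4 for 8-bit quantization.
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path meta-llama/Llama-2-7b-hf \
    --dataset alpaca_gpt4_en \
    --template default \
    --finetuning_type lora \
    --lora_target all \
    --quantization_bit 4 \
    --output_dir path_to_qlora_checkpoint
```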
@@ -146,7 +144,6 @@ And **powerful GPUs**!
Please refer to `data/example_dataset` for checking the details about the format of dataset files. You can either use a single `.json` file or a [dataset loading script](https://huggingface.co/docs/datasets/dataset_script) with multiple files to create a custom dataset.

> **Note**
- >
> Please update `data/dataset_info.json` to use your custom dataset. About the format of this file, please refer to `data/README.md`.

### Dependence Installation (optional)
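To illustrate the custom-dataset note, the snippet below prints the kind of entry one might add to `data/dataset_info.json` for a single-file dataset. The dataset name and the `file_name` key are assumptions made for illustration; `data/README.md` remains the authoritative reference for the format.

```bash
# Hypothetical sketch: a minimal dataset_info.json entry for a custom dataset
# stored as data/my_dataset.json. Key names are assumptions; see data/README.md.
cat <<'EOF'
{
  "my_dataset": {
    "file_name": "my_dataset.json"
  }
}
EOF
```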
@@ -174,14 +171,12 @@ CUDA_VISIBLE_DEVICES=0 python src/train_web.py
We strongly recommend using the all-in-one Web UI for newcomers since it can also generate training scripts **automatically**.

> **Warning**
- >
> Currently the web UI only supports training on **a single GPU**.

### Train on a single GPU

> **Warning**
- >
- > If you want to train models on multiple GPUs, please refer to [#distributed-training](Distributed Training).
+ > If you want to train models on multiple GPUs, please refer to [Distributed Training](#distributed-training).

#### Pre-Training
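Because the web UI is limited to a single GPU, multi-GPU runs go through the command line. The sketch below assumes the project can be launched through Hugging Face `accelerate`, as the Distributed Training section (not shown in this excerpt) describes; the training arguments mirror the earlier hypothetical LoRA run.

```bash
# Hypothetical sketch: distributed training via Hugging Face accelerate.
# The config step and all training flags are assumptions for illustration.
accelerate config    # choose multi-GPU settings interactively
accelerate launch src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path meta-llama/Llama-2-7b-hf \
    --dataset alpaca_gpt4_en \
    --template default \
    --finetuning_type lora \
    --lora_target all \
    --output_dir path_to_sft_checkpoint
```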
@@ -397,7 +392,6 @@ python src/api_demo.py \
```

> **Note**
- >
> Visit `http://localhost:8000/docs` for API documentation.

### CLI Demo
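Once `src/api_demo.py` is running, the interactive documentation at `http://localhost:8000/docs` lists the actual routes. The request below is only a sketch that assumes an OpenAI-style chat completions endpoint; verify the route and payload against the docs page.

```bash
# Hypothetical sketch: query the local API demo. The /v1/chat/completions route
# and the JSON schema are assumptions; http://localhost:8000/docs is authoritative.
curl -s http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
```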
@@ -438,7 +432,6 @@ CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
```

> **Note**
- >
> We recommend using `--per_device_eval_batch_size=1` and `--max_target_length 128` at 4/8-bit evaluation.

### Predict
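A hedged sketch of a 4-bit evaluation run combining the two recommended arguments is shown below; apart from `--per_device_eval_batch_size`, `--max_target_length`, and `--quantization_bit`, every flag is an assumption about the surrounding `train_bash.py` interface.

```bash
# Hypothetical sketch: evaluate a 4-bit quantized model with the recommended
# batch size and target length. Paths, dataset, and other flags are placeholders.
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --do_eval \
    --model_name_or_path meta-llama/Llama-2-7b-hf \
    --dataset alpaca_gpt4_en \
    --template default \
    --finetuning_type lora \
    --checkpoint_dir path_to_sft_checkpoint \
    --quantization_bit 4 \
    --per_device_eval_batch_size 1 \
    --max_target_length 128 \
    --predict_with_generate \
    --output_dir path_to_eval_result
```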
@@ -490,7 +483,7 @@ If this work is helpful, please kindly cite as:
## Acknowledgement

- This repo is a sibling of [ChatGLM-Efficient-Tuning](https://github.com/hiyouga/ChatGLM-Efficient-Tuning). They share a similar code structure of efficient tuning on large language models.
+ This repo benefits from [PEFT](https://github.com/huggingface/peft), [QLoRA](https://github.com/artidoro/qlora) and [OpenChatKit](https://github.com/togethercomputer/OpenChatKit). Thanks for their wonderful works.

## Star History

View File

@@ -65,7 +65,6 @@
| [ChatGLM2](https://github.com/THUDM/ChatGLM2-6B) | 6B | query_key_value | chatglm2 |

> **Note**
- >
> The **default module** is used as the default value of the `--lora_target` argument; you can use `--lora_target all` to specify all available modules.
>
> For all "base" models, the `--template` argument can be any of `default`, `alpaca`, `vicuna`, etc. For "chat" models, make sure to use the corresponding template.
@@ -81,7 +80,6 @@
| DPO Training | :white_check_mark: | | :white_check_mark: | :white_check_mark: |

> **Note**
- >
> Use the `--quantization_bit 4/8` argument to enable QLoRA training.

## Datasets
@@ -146,7 +144,6 @@ huggingface-cli login
For the format of dataset files, please refer to the contents of the `data/example_dataset` folder. When building a custom dataset, you can use either a single `.json` file or a [dataset loading script](https://huggingface.co/docs/datasets/dataset_script) with multiple files.

> **Note**
- >
> When using a custom dataset, please update the `data/dataset_info.json` file; for its format, please refer to `data/README.md`.

### Environment Setup (optional)
@@ -174,13 +171,11 @@ CUDA_VISIBLE_DEVICES=0 python src/train_web.py
We strongly recommend that newcomers use the all-in-one web UI, since it can also generate the required command-line scripts **automatically**.

> **Warning**
- >
> Currently the web UI only supports **single-GPU training**.

### Single-GPU Training

> **Warning**
- >
> If you are training models on multiple GPUs, please refer to the [Multi-GPU Distributed Training](#多-gpu-分布式训练) section.

#### Pre-Training
@@ -396,7 +391,6 @@ python src/api_demo.py \
```

> **Note**
- >
> For API documentation, visit `http://localhost:8000/docs`.

### CLI Demo
@@ -437,7 +431,6 @@ CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
```

> **Note**
- >
> We recommend using `--per_device_eval_batch_size=1` and `--max_target_length 128` when evaluating quantized models.

### Predict
@@ -489,7 +482,7 @@ CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
## Acknowledgement

- This project is a sibling project of [ChatGLM-Efficient-Tuning](https://github.com/hiyouga/ChatGLM-Efficient-Tuning), adopting a similar code structure and training approach.
+ This project benefits from [PEFT](https://github.com/huggingface/peft), [QLoRA](https://github.com/artidoro/qlora) and [OpenChatKit](https://github.com/togethercomputer/OpenChatKit). Thanks to their authors for their contributions.

## Star History