From 943779eabc6f14e83297494bc6d67b3f4e222c6a Mon Sep 17 00:00:00 2001 From: hiyouga Date: Tue, 14 May 2024 23:55:49 +0800 Subject: [PATCH] update readme Former-commit-id: fc547ee591ef3cfc1bdbb8297a75a74f05c83c82 --- README.md | 26 +++++++++++++++++++++++--- README_zh.md | 24 ++++++++++++++++++++++-- 2 files changed, 45 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 90fcb295..a138d646 100644 --- a/README.md +++ b/README.md @@ -70,14 +70,16 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/ ## Changelog +[24/05/14] We supported training and inference on the Ascend NPU devices. Check [installation](#installation) section for details. + [24/05/13] We supported fine-tuning the **Yi-1.5** series models. [24/04/26] We supported fine-tuning the **LLaVA-1.5** multimodal LLMs. See [examples](examples/README.md) for usage. -[24/04/22] We provided a **[Colab notebook](https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing)** for fine-tuning the Llama-3 model on a free T4 GPU. Two Llama-3-derived models fine-tuned using LLaMA Factory are available at Hugging Face, check [Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat) and [Llama3-Chinese](https://huggingface.co/zhichen/Llama3-Chinese) for details. -
Full Changelog +[24/04/22] We provided a **[Colab notebook](https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing)** for fine-tuning the Llama-3 model on a free T4 GPU. Two Llama-3-derived models fine-tuned using LLaMA Factory are available at Hugging Face, check [Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat) and [Llama3-Chinese](https://huggingface.co/zhichen/Llama3-Chinese) for details. + [24/04/21] We supported **[Mixture-of-Depths](https://arxiv.org/abs/2404.02258)** according to [AstraMindAI's implementation](https://github.com/astramind-ai/Mixture-of-depths). See [examples](examples/README.md) for usage. [24/04/16] We supported **[BAdam](https://arxiv.org/abs/2404.02827)**. See [examples](examples/README.md) for usage. @@ -328,7 +330,7 @@ Extra dependencies available: torch, metrics, deepspeed, bitsandbytes, vllm, gal
For Windows users -If you want to enable the quantized LoRA (QLoRA) on the Windows platform, you will be required to install a pre-built version of `bitsandbytes` library, which supports CUDA 11.1 to 12.2, please select the appropriate [release version](https://github.com/jllllll/bitsandbytes-windows-webui/releases/tag/wheels) based on your CUDA version. +If you want to enable the quantized LoRA (QLoRA) on the Windows platform, you need to install a pre-built version of `bitsandbytes` library, which supports CUDA 11.1 to 12.2, please select the appropriate [release version](https://github.com/jllllll/bitsandbytes-windows-webui/releases/tag/wheels) based on your CUDA version. ```bash pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl @@ -338,6 +340,24 @@ To enable FlashAttention-2 on the Windows platform, you need to install the prec
+
For Ascend NPU users + +To utilize Ascend NPU devices for (distributed) training and inference, you need to install the **[torch-npu](https://gitee.com/ascend/pytorch)** package and the **[Ascend CANN Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**. + +| Requirement | Minimum | Recommend | +| ------------ | ------- | --------- | +| CANN | 8.0.RC1 | 8.0.RC1 | +| torch | 2.2.0 | 2.2.0 | +| torch-npu | 2.2.0 | 2.2.0 | +| deepspeed | 0.13.2 | 0.13.2 | + +> [!NOTE] +> Remember to use `ASCEND_RT_VISIBLE_DEVICES` instead of `CUDA_VISIBLE_DEVICES` to specify the device to use. +> +> If you cannot infer model on NPU devices, try setting `do_sample: false` in the configurations. + +
+ ### Data Preparation Please refer to [data/README.md](data/README.md) for checking the details about the format of dataset files. You can either use datasets on HuggingFace / ModelScope hub or load the dataset in local disk. diff --git a/README_zh.md b/README_zh.md index 1d15515e..a0373711 100644 --- a/README_zh.md +++ b/README_zh.md @@ -70,14 +70,16 @@ https://github.com/hiyouga/LLaMA-Factory/assets/16256802/ec36a9dd-37f4-4f72-81bd ## 更新日志 +[24/05/14] 我们支持了昇腾 NPU 设备的训练和推理。详情请查阅[安装](#安装-llama-factory)部分。 + [24/05/13] 我们支持了 Yi-1.5 系列模型的微调。 [24/04/26] 我们支持了多模态模型 **LLaVA-1.5** 的微调。详细用法请参照 [examples](examples/README_zh.md)。 -[24/04/22] 我们提供了在免费 T4 GPU 上微调 Llama-3 模型的 **[Colab 笔记本](https://colab.research.google.com/drive/1d5KQtbemerlSDSxZIfAaWXhKr30QypiK?usp=sharing)**。Hugging Face 社区公开了两个利用 LLaMA Factory 微调的 Llama-3 模型,详情请见 [Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat) 和 [Llama3-Chinese](https://huggingface.co/zhichen/Llama3-Chinese)。 -
展开日志 +[24/04/22] 我们提供了在免费 T4 GPU 上微调 Llama-3 模型的 **[Colab 笔记本](https://colab.research.google.com/drive/1d5KQtbemerlSDSxZIfAaWXhKr30QypiK?usp=sharing)**。Hugging Face 社区公开了两个利用 LLaMA Factory 微调的 Llama-3 模型,详情请见 [Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat) 和 [Llama3-Chinese](https://huggingface.co/zhichen/Llama3-Chinese)。 + [24/04/21] 我们基于 [AstraMindAI 的仓库](https://github.com/astramind-ai/Mixture-of-depths)支持了 **[混合深度训练](https://arxiv.org/abs/2404.02258)**。详细用法请参照 [examples](examples/README_zh.md)。 [24/04/16] 我们支持了 **[BAdam](https://arxiv.org/abs/2404.02827)**。详细用法请参照 [examples](examples/README_zh.md)。 @@ -338,6 +340,24 @@ pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/downl
+
昇腾 NPU 用户指南 + +如果使用昇腾 NPU 设备进行(分布式)训练或推理,需要安装 **[torch-npu](https://gitee.com/ascend/pytorch)** 库和 **[Ascend CANN Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**。 + +| 依赖项 | 至少 | 推荐 | +| ------------ | ------- | --------- | +| CANN | 8.0.RC1 | 8.0.RC1 | +| torch | 2.2.0 | 2.2.0 | +| torch-npu | 2.2.0 | 2.2.0 | +| deepspeed | 0.13.2 | 0.13.2 | + +> [!NOTE] +> 请记得使用 `ASCEND_RT_VISIBLE_DEVICES` 而非 `CUDA_VISIBLE_DEVICES` 来指定您使用的设备。 +> +> 如果遇到无法正常推理的情况,请尝试设置 `do_sample: false`。 + +
+ ### 数据准备 关于数据集文件的格式,请参考 [data/README_zh.md](data/README_zh.md) 的内容。你可以使用 HuggingFace / ModelScope 上的数据集或加载本地数据集。