From e52e0d9b075d747276e6b3524c3c0a722e1d27b4 Mon Sep 17 00:00:00 2001
From: codemayq
Date: Tue, 20 Feb 2024 11:28:25 +0800
Subject: [PATCH] 1. update the version of pre-built bitsandbytes library
 2. add pre-built flash-attn library

Former-commit-id: 2b76a300995a74398ee11d9274e5c0eb6ef53403
---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index c7b55d8b..3aef67c8 100644
--- a/README.md
+++ b/README.md
@@ -261,10 +261,10 @@ cd LLaMA-Factory
 pip install -r requirements.txt
 ```
 
-If you want to enable the quantized LoRA (QLoRA) on the Windows platform, you will be required to install a pre-built version of `bitsandbytes` library, which supports CUDA 11.1 to 12.1.
+If you want to enable the quantized LoRA (QLoRA) on the Windows platform, you need to install a pre-built version of the `bitsandbytes` library, which supports CUDA 11.1 to 12.2.
 
 ```bash
-pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.39.1-py3-none-win_amd64.whl
+pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.40.0-py3-none-win_amd64.whl
 ```
 
 To enable Flash Attention on the Windows platform, you need to install the precompiled `flash-attn` library, which supports CUDA 12.1 to 12.2. Please download the corresponding version from [flash-attention](https://github.com/bdashore3/flash-attention/releases) based on your requirements.
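
After applying the install steps the patch documents, a quick sanity check can confirm both wheels produced importable packages. The snippet below is a minimal illustrative sketch (it is not part of the patch itself); the package names `bitsandbytes` and `flash_attn` are taken from the README text above:

```python
# Illustrative post-install check: report whether the Windows wheels for
# bitsandbytes and flash-attn installed importable packages.
import importlib.util

for pkg in ("bitsandbytes", "flash_attn"):
    # find_spec returns None when the top-level package is not installed.
    status = "found" if importlib.util.find_spec(pkg) else "not installed"
    print(f"{pkg}: {status}")
```

Running it in the target environment prints one line per package; "not installed" for either name means the corresponding wheel did not install correctly.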