1. update the version of pre-built bitsandbytes library

2. add pre-built flash-attn library

Former-commit-id: 2b76a300995a74398ee11d9274e5c0eb6ef53403
2026-07-30 04:36:11 +08:00 · 2024-02-20 11:28:25 +08:00
parent eb2aa2c073
commit e52e0d9b07
1 changed files with 2 additions and 2 deletions
--- a/README.md
+++ b/README.md
@@ -261,10 +261,10 @@ cd LLaMA-Factory
 pip install -r requirements.txt
 ```

-If you want to enable the quantized LoRA (QLoRA) on the Windows platform, you will be required to install a pre-built version of `bitsandbytes` library, which supports CUDA 11.1 to 12.1.
+If you want to enable the quantized LoRA (QLoRA) on the Windows platform, you will be required to install a pre-built version of `bitsandbytes` library, which supports CUDA 11.1 to 12.2.

 ```bash
-pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.39.1-py3-none-win_amd64.whl
+pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.40.0-py3-none-win_amd64.whl
 ```

 To enable Flash Attention on the Windows platform, you need to install the precompiled `flash-attn` library, which supports CUDA 12.1 to 12.2. Please download the corresponding version from [flash-attention](https://github.com/bdashore3/flash-attention/releases) based on your requirements.
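
The two CUDA ranges quoted in this hunk differ (11.1 to 12.2 for the pre-built `bitsandbytes` wheel, 12.1 to 12.2 for the precompiled `flash-attn` wheel), so it may help to check the local CUDA version against both before installing. The following is a minimal sketch, not part of LLaMA-Factory: the names `SUPPORTED_CUDA`, `parse_version`, and `is_supported` are hypothetical, and the ranges are copied from the README text above on the assumption that both ends are inclusive.

```python
# Hypothetical helper: the ranges below are copied from the README text
# and are assumed to be inclusive on both ends.
SUPPORTED_CUDA = {
    "bitsandbytes": ("11.1", "12.2"),
    "flash-attn": ("12.1", "12.2"),
}


def parse_version(text):
    """Turn a 'major.minor' string such as '12.1' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split(".")[:2])


def is_supported(library, cuda_version):
    """Return True if cuda_version lies within the range listed for library."""
    low, high = SUPPORTED_CUDA[library]
    return parse_version(low) <= parse_version(cuda_version) <= parse_version(high)


if __name__ == "__main__":
    # Example: CUDA 11.8 falls inside the bitsandbytes range only.
    for name in SUPPORTED_CUDA:
        print(name, is_supported(name, "11.8"))
```

If PyTorch is already installed, the version string could be taken from `torch.version.cuda` and passed as `cuda_version`. Comparing integer tuples rather than raw strings avoids mis-ordering versions such as `12.10` and `12.2`.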