LLaMA-Factory

423A35C7/LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2026-07-28 19:56:13 +08:00

Commit Graph

Author	SHA1	Message	Date
S3Studio	096869c7b6	Use official Nvidia base image. Note that the flash-attn library is installed in this image and the Qwen model will use it automatically. However, if the host machine's GPU is not compatible with the library, the following exception is raised during training: "FlashAttention only supports Ampere GPUs or newer." So if the --flash_attn flag is not set, an additional patch to the Qwen model's config is necessary to change the default value of use_flash_attn from "auto" to False. Former-commit-id: cd2f5717d676e1a5afd2f4e7a38402d2e55e7479	2024-03-15 08:59:13 +08:00
S3Studio	c6873211e9	Improve Docker build and runtime parameters. Modify the installation method of the extra Python libraries. Use the host machine's shared memory to improve training performance. Former-commit-id: 97f9901c2f5c29a6ab517a1f8fa028b8e89edf4e	2024-03-15 08:57:46 +08:00
S3Studio	6169df1c52	Add dockerize support. Tested with the Qwen:1.8B model and the alpaca_data_zh dataset. Some Python libraries were added to the Dockerfile in response to exception messages raised during testing. Former-commit-id: 897e083bc28ccb15c46909b9d13fc03a674fb254	2024-03-08 10:47:28 +08:00
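
The flash-attn note in commit 096869c7b6 can be illustrated with a minimal sketch. This is not the code from the commit: the helper names supports_flash_attn and patch_qwen_config are hypothetical, and the config is modeled as a plain dict rather than the real Qwen config object. The only facts taken from the commit message are the use_flash_attn key, its "auto" default, and the rule that it should become False when --flash_attn is not set. The capability check relies on Ampere GPUs reporting CUDA compute capability 8.x.

```python
from typing import Any, Dict, Tuple

# NVIDIA Ampere GPUs report CUDA compute capability 8.x; older
# architectures (for example Turing at 7.5) report a lower major version.
AMPERE_MAJOR = 8


def supports_flash_attn(capability: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) compute capability is Ampere or newer.

    Hypothetical helper. With PyTorch installed, the tuple could come from
    torch.cuda.get_device_capability().
    """
    major, _minor = capability
    return major >= AMPERE_MAJOR


def patch_qwen_config(config: Dict[str, Any], flash_attn_flag: bool) -> Dict[str, Any]:
    """Return a copy of config with use_flash_attn forced off when needed.

    Hypothetical helper following the rule in the commit message: when the
    --flash_attn flag is not set, the "auto" default is replaced by False so
    that the model never tries to load flash-attn on an unsupported GPU.
    """
    patched = dict(config)  # shallow copy; the caller's dict is left untouched
    if not flash_attn_flag and patched.get("use_flash_attn", "auto") == "auto":
        patched["use_flash_attn"] = False
    return patched


if __name__ == "__main__":
    cfg = {"use_flash_attn": "auto"}
    print(patch_qwen_config(cfg, flash_attn_flag=False))
    print(supports_flash_attn((7, 5)))
```

Returning a patched copy instead of mutating the input keeps the sketch easy to test; a real patch would more likely set the attribute on the loaded config object before the model is instantiated.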

3 Commits