Note that the flash-attn library is installed in this image and the Qwen model will use it automatically.
However, if the host machine's GPU is not compatible with the library, an exception like the following is raised during training:

    FlashAttention only supports Ampere GPUs or newer.

So if the --flash_attn flag is not set, an additional patch to the Qwen model's config is needed to change the default value of use_flash_attn from "auto" to False.
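A minimal sketch of such a patch, assuming the model is loaded through transformers with trust_remote_code enabled (the checkpoint id "Qwen/Qwen-1_8B" is illustrative, and the flash_attn variable stands in for the --flash_attn flag):

    from transformers import AutoConfig, AutoModelForCausalLM

    flash_attn = False  # mirrors the absence of the --flash_attn flag

    config = AutoConfig.from_pretrained("Qwen/Qwen-1_8B", trust_remote_code=True)
    if getattr(config, "model_type", None) == "qwen":
        # Qwen's config ships with use_flash_attn="auto"; forcing it to False
        # avoids the exception above on pre-Ampere GPUs.
        setattr(config, "use_flash_attn", flash_attn)

    model = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen-1_8B", config=config, trust_remote_code=True
    )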
Former-commit-id: e75407febdec086f2bdca723a7f69a92b3b1d63f
Modify the installation method of the extra Python library.
Utilize the host machine's shared memory to improve training performance.
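For example, the container can be granted a larger /dev/shm at startup (the size and image name below are illustrative; --shm-size is a standard docker run option):

    # Enlarge /dev/shm so PyTorch DataLoader workers can exchange tensors
    # through shared memory; <image> is a placeholder for this image's tag.
    docker run --gpus all --shm-size=16g <image>

Alternatively, --ipc=host shares the host's entire shared-memory segment with the container.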
Former-commit-id: 6a5693d11d065f6e75c8cdd8b5ed962eb520953c
Already tested with the Qwen-1.8B model and the alpaca_data_zh dataset. Some Python libraries were added to the Dockerfile in response to exception messages raised during the test procedure.
Former-commit-id: 3d911ae713b901d6680a9f9ac82569cc5878f820