LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2026-06-18 05:08:54 +08:00

Author	SHA1	Message	Date
hiyouga	3271af2afc	add orca_dpo_pairs dataset	2024-03-20 20:09:06 +08:00
hoshi-hiyouga	b2dfbd728f	Merge pull request #2905 from SirlyDreamer/main Follow HF_ENDPOINT environment variable	2024-03-20 18:09:54 +08:00
hiyouga	9bec3c98a2	fix #2777 #2895	2024-03-20 17:59:45 +08:00
hiyouga	7b8f502901	fix #2346	2024-03-20 17:56:33 +08:00
SirlyDreamer	e165965341	Follow HF_ENDPOINT environment variable	2024-03-20 08:31:30 +00:00
hoshi-hiyouga	a773035709	Merge pull request #2903 from khazic/main Updated README with new information	2024-03-20 16:13:44 +08:00
khazic	8d10fa71c2	Updated README with new information	2024-03-20 14:38:08 +08:00
khazic	0531dac30d	Updated README with new information	2024-03-20 14:21:16 +08:00
刘一博	df9b4fb90a	Updated README with new information	2024-03-20 14:11:28 +08:00
hiyouga	bea31b9b12	Update wechat.jpg	2024-03-18 16:48:32 +08:00
hiyouga	8e04794b2d	fix packages	2024-03-17 22:32:03 +08:00
hiyouga	85c376fc1e	fix patcher	2024-03-15 19:18:42 +08:00
hoshi-hiyouga	113cc04719	Merge pull request #2849 from S3Studio/DockerizeSupport Improve Dockerize support	2024-03-15 19:16:02 +08:00
hiyouga	6bc2c23b6d	fix export	2024-03-15 15:06:30 +08:00
S3Studio	e75407febd	Use official Nvidia base image Note that the flash-attn library is installed in this image and the qwen model will use it automatically. However, if the host machine's GPU is not compatible with the library, an exception will be raised during the training process as follows: "FlashAttention only supports Ampere GPUs or newer." So if the --flash_attn flag is not set, an additional patch for the qwen model's config is necessary to set the default value of use_flash_attn from "auto" to False.	2024-03-15 08:59:13 +08:00
S3Studio	6a5693d11d	improve Docker build and runtime parameters Modify installation method of extra python library. Utilize shared memory of the host machine to increase training performance.	2024-03-15 08:57:46 +08:00
hiyouga	6ebde4f23e	tiny fix	2024-03-14 21:19:06 +08:00
hiyouga	3b4a59bfb1	fix export	2024-03-14 18:17:01 +08:00
hiyouga	8172530d54	fix bug	2024-03-13 23:55:31 +08:00
hiyouga	714d936dfb	fix bug	2024-03-13 23:43:42 +08:00
hiyouga	72367307df	improve lora+ impl.	2024-03-13 23:32:51 +08:00
hoshi-hiyouga	4e5e99af43	Merge pull request #2830 from qibaoyuan/lora_plus [FEATURE]: ADD LORA+ ALGORITHM	2024-03-13 20:15:46 +08:00
齐保元	a0965cd62c	[FEATURE]: ADD LORA+ ALGORITHM	2024-03-13 19:43:27 +08:00
hiyouga	dfd451b722	Update wechat.jpg	2024-03-13 19:03:00 +08:00
hiyouga	0b4a5bf509	fix #2817	2024-03-13 12:42:03 +08:00
hiyouga	b9f87cdc11	fix #2802	2024-03-13 12:33:45 +08:00
hiyouga	96ce76cd27	fix kv cache	2024-03-13 01:21:50 +08:00
hiyouga	19ef482649	support QDoRA	2024-03-12 22:12:42 +08:00
hiyouga	70a3052dd8	patch for gemma cpt	2024-03-12 21:21:54 +08:00
hiyouga	60cc17f3a8	fix plot issues	2024-03-12 18:41:35 +08:00
hiyouga	b3247d6a16	support olmo	2024-03-12 18:30:38 +08:00
hiyouga	8d8956bad5	fix #2802	2024-03-12 17:08:34 +08:00
hiyouga	06c97083e1	fix #2803	2024-03-12 16:57:39 +08:00
hiyouga	07f9b754a7	fix #2782 #2798	2024-03-12 15:53:29 +08:00
hoshi-hiyouga	c901aa63ff	Merge pull request #2743 from S3Studio/DockerizeSupport Add dockerize support	2024-03-12 00:05:49 +08:00
hiyouga	e874c00906	fix #2775	2024-03-11 00:42:54 +08:00
hiyouga	352693e2dc	tiny fix	2024-03-11 00:17:18 +08:00
hiyouga	be99799413	update parser	2024-03-10 13:35:20 +08:00
hiyouga	8664262cde	support layerwise galore	2024-03-10 00:24:11 +08:00
hiyouga	18ffce36b5	fix #2732	2024-03-09 22:37:16 +08:00
hiyouga	bdb496644c	allow non-packing pretraining	2024-03-09 22:21:46 +08:00
hiyouga	412c52e325	fix #2766	2024-03-09 21:35:24 +08:00
hiyouga	af0e370fb1	use default arg for freeze tuning	2024-03-09 06:08:48 +08:00
hiyouga	818726e9bc	add GaLore results	2024-03-09 04:11:55 +08:00
hiyouga	393c2de27c	update hardware requirements	2024-03-09 03:58:18 +08:00
hiyouga	4c00bcdcae	update examples	2024-03-09 02:30:37 +08:00
hiyouga	e8dd38b7fd	fix #2756 , patch #2746	2024-03-09 02:01:26 +08:00
hoshi-hiyouga	516d0ddc66	Merge pull request #2746 from stephen-nju/main fix deepspeed ppo RuntimeError	2024-03-09 01:37:00 +08:00
hiyouga	74ff8664d7	Update setup.py	2024-03-09 00:14:48 +08:00
hiyouga	10be2f0ecc	fix aqlm version	2024-03-09 00:09:09 +08:00

1034 Commits