Commit Graph

1069 Commits

Author SHA1 Message Date
zhangzc
449e2aa38e Support custom dataset sampling quantity 2024-03-27 14:22:50 +08:00
hoshi-hiyouga
3bcd41b639 fix ds optimizer 2024-03-26 23:39:56 +08:00
hiyouga
b29d5560f1 fix #2981 2024-03-26 17:53:04 +08:00
hiyouga
3164b4f11b fix bug 2024-03-26 17:30:12 +08:00
hiyouga
511f675402 fix #2961 2024-03-26 17:26:14 +08:00
hiyouga
7ea1a1f5b3 Update wechat.jpg 2024-03-26 16:24:42 +08:00
hiyouga
ba70aca8fb release v0.6.0 (real) 2024-03-25 23:37:48 +08:00
hiyouga
98a42cbdaa tiny fix 2024-03-25 23:28:52 +08:00
hiyouga
7b3d8188f5 update readme 2024-03-25 23:06:13 +08:00
hoshi-hiyouga
f633ac6646 Merge pull request #2967 from Tsumugii24/main
Update README_zh.md
2024-03-25 23:02:22 +08:00
Tsumugii24
1704599503 Update README.md 2024-03-25 22:54:38 +08:00
Tsumugii24
7aa77a3451 Update README_zh.md 2024-03-25 22:54:26 +08:00
hiyouga
1484f76a95 add arg check 2024-03-25 22:42:58 +08:00
hiyouga
6f2b563f12 release v0.6.0 2024-03-25 22:38:56 +08:00
Tsumugii24
bb4ca1691a Update README_zh.md 2024-03-25 22:31:03 +08:00
hoshi-hiyouga
f33a3dfadc Merge pull request #2963 from rkinas/patch-1
Update requirements.txt
2024-03-25 21:49:34 +08:00
Remek Kinas
b02899bf89 Update requirements.txt 2024-03-25 14:30:58 +01:00
hiyouga
558a538724 tiny fix 2024-03-25 21:18:08 +08:00
hoshi-hiyouga
49f9dbb4b1 Merge pull request #2945 from marko1616/bugfix/lora-model-merge
Fix the crash caused by generation config validation when merging LoRA adapters into some models on transformers > 4.36.2
2024-03-25 13:36:08 +08:00
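(Aside: a minimal sketch of the failure mode named in the PR title above, not the PR's actual patch. It assumes only the stock transformers/peft APIs; all model paths are placeholders. On transformers > 4.36.2, save_pretrained() validates the generation config, so sampling parameters left in the config while do_sample=False can raise a ValueError when exporting a merged model.)

```python
# Hypothetical illustration of the crash fixed in the PR above, not its
# actual diff. Newer transformers versions validate GenerationConfig on
# save, and sampling knobs such as temperature combined with
# do_sample=False can abort the export of a merged LoRA model.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("base-model")    # placeholder path
model = PeftModel.from_pretrained(base, "lora-adapter")      # placeholder path
model = model.merge_and_unload()                             # merge LoRA weights

gen_config = model.generation_config
if not gen_config.do_sample and (
    gen_config.temperature != 1.0 or gen_config.top_p != 1.0
):
    # Make the config self-consistent so validation passes on save.
    gen_config.do_sample = True

model.save_pretrained("merged-model")                        # placeholder path
```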
marko1616
c8f0d99704 pass ruff check 2024-03-24 16:12:10 +08:00
marko1616
6f080fdba3 fix Llama lora merge crash 2024-03-24 03:06:11 +08:00
marko1616
51349ea1cc fix Llama lora merge crash 2024-03-24 02:55:23 +08:00
marko1616
c1e2c4ea45 fix Llama lora merge crash 2024-03-24 02:44:35 +08:00
hiyouga
140ad4ad56 fix #2936 2024-03-24 00:43:21 +08:00
hiyouga
7afbc85dae fix #2928 2024-03-24 00:34:54 +08:00
hiyouga
a1c8c98c5f fix #2941 2024-03-24 00:28:44 +08:00
hiyouga
564d57aa23 Update wechat.jpg 2024-03-22 14:00:37 +08:00
hoshi-hiyouga
ce261fdd64 Merge pull request #2919 from 0xez/main
Update README.md, fix the release date of the paper
2024-03-22 12:12:24 +08:00
0xez
be0360303d Update README_zh.md, fix the release date of the paper 2024-03-22 10:41:17 +08:00
0xez
675ba41562 Update README.md, fix the release date of the paper 2024-03-21 22:14:48 +08:00
hiyouga
96702620c4 move file 2024-03-21 17:05:17 +08:00
hiyouga
5eaa50fa01 add citation 2024-03-21 17:04:10 +08:00
hiyouga
0581bfdbc7 paper release 2024-03-21 13:49:17 +08:00
hiyouga
bfe7a91289 update readme 2024-03-21 00:48:42 +08:00
hiyouga
8408225162 support fsdp + qlora 2024-03-21 00:36:06 +08:00
hiyouga
3271af2afc add orca_dpo_pairs dataset 2024-03-20 20:09:06 +08:00
hoshi-hiyouga
b2dfbd728f Merge pull request #2905 from SirlyDreamer/main
Follow HF_ENDPOINT environment variable
2024-03-20 18:09:54 +08:00
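(Aside: a guess at the mechanism behind the PR title above, not its actual diff. Hub mirrors are selected by pointing the HF_ENDPOINT environment variable at an alternative host, so download URLs should be derived from that variable rather than a hard-coded https://huggingface.co.)

```python
# Hypothetical sketch of honoring HF_ENDPOINT, not the PR's actual diff.
import os

# Fall back to the official hub when the variable is unset.
HF_ENDPOINT = os.environ.get("HF_ENDPOINT", "https://huggingface.co")

def dataset_file_url(repo_id: str, filename: str) -> str:
    # Build download URLs from the configured endpoint instead of a
    # hard-coded https://huggingface.co host.
    return f"{HF_ENDPOINT}/datasets/{repo_id}/resolve/main/{filename}"
```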
hiyouga
9bec3c98a2 fix #2777 #2895 2024-03-20 17:59:45 +08:00
hiyouga
7b8f502901 fix #2346 2024-03-20 17:56:33 +08:00
SirlyDreamer
e165965341 Follow HF_ENDPOINT environment variable 2024-03-20 08:31:30 +00:00
hoshi-hiyouga
a773035709 Merge pull request #2903 from khazic/main
Updated README with new information
2024-03-20 16:13:44 +08:00
khazic
8d10fa71c2 Updated README with new information 2024-03-20 14:38:08 +08:00
khazic
0531dac30d Updated README with new information 2024-03-20 14:21:16 +08:00
刘一博
df9b4fb90a Updated README with new information 2024-03-20 14:11:28 +08:00
hiyouga
bea31b9b12 Update wechat.jpg 2024-03-18 16:48:32 +08:00
hiyouga
8e04794b2d fix packages 2024-03-17 22:32:03 +08:00
hiyouga
85c376fc1e fix patcher 2024-03-15 19:18:42 +08:00
hoshi-hiyouga
113cc04719 Merge pull request #2849 from S3Studio/DockerizeSupport
Improve Dockerize support
2024-03-15 19:16:02 +08:00
hiyouga
6bc2c23b6d fix export 2024-03-15 15:06:30 +08:00
S3Studio
e75407febd Use official Nvidia base image
Note that the flash-attn library is installed in this image and the qwen model will use it automatically.
However, if the host machine's GPU is not compatible with the library, an exception will be raised during training:
FlashAttention only supports Ampere GPUs or newer.
So if the --flash_attn flag is not set, an additional patch to the qwen model's config is necessary to set the default value of use_flash_attn from "auto" to False.
2024-03-15 08:59:13 +08:00
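(Aside: a minimal sketch of the config patch suggested in the commit message above, not the repository's actual code. It assumes the qwen model's remote code exposes a use_flash_attn attribute defaulting to "auto", as the message states; the model path is a placeholder.)

```python
# Sketch of the suggested workaround. When --flash_attn is not passed but
# flash-attn is installed in the image, use_flash_attn="auto" lets the
# qwen model pick the library up automatically; on pre-Ampere GPUs
# training then fails with "FlashAttention only supports Ampere GPUs or
# newer", so the default is pinned to False instead.
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "Qwen/Qwen-7B",          # placeholder model path
    trust_remote_code=True,  # qwen's config/modeling code live in the repo
)

flash_attn_enabled = False   # i.e. the --flash_attn flag was not set

if not flash_attn_enabled and getattr(config, "use_flash_attn", None) == "auto":
    config.use_flash_attn = False  # disable instead of probing at runtime
```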