LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2025-12-16 11:50:35 +08:00

Author	SHA1	Message	Date
hiyouga	db9a1912e3	remove dup template	2024-06-22 01:31:32 +08:00
hiyouga	2b596fb55f	fix jinja template	2024-06-19 20:03:50 +08:00
hiyouga	4cff6a4ad5	fix templates	2024-06-19 17:44:05 +08:00
hiyouga	6d2bf216ac	fix bug	2024-06-19 03:49:23 +08:00
hiyouga	4f22eae8f4	use prefix to replace force system	2024-06-19 03:39:52 +08:00
hiyouga	cd75b1fe9d	fix tool formatter, allow parallel function #4362	2024-06-19 03:23:51 +08:00
hoshi-hiyouga	c0ca42566c	Merge pull request #4173 from mMrBun/main Implemented the tool_formatter and tool_extractor for glm4 and Qwen2 tool_format	2024-06-19 03:18:55 +08:00
hiyouga	38b6b0f52e	tiny fix	2024-06-16 01:06:41 +08:00
hiyouga	d87108daa6	add license	2024-06-15 17:54:33 +08:00
hiyouga	6baafd4eb3	fix #4221	2024-06-13 02:48:21 +08:00
hoshi-hiyouga	0c29233237	Update pretrain.py	2024-06-11 17:02:14 +08:00
d	6979f3f848	经过大量的增量预训练，进行对比试验，发现这个bug：llama3在预训练时使用的tokenizer.eos_toke是'<\|end_of_text\|>' ，这里在每条数据后面也得用这个，而不是'<\|eot_id\|>'，否则很容易导致严重的性能下降	2024-06-11 16:23:40 +08:00
mMrBun	950e360ca0	Optimize the handling of QWEN2 in scenarios involving multiple tool calls.	2024-06-10 02:00:14 +08:00
mMrBun	6ed0b0c800	Removed unnecessary comments.	2024-06-09 18:25:22 +08:00
mMrBun	0f2609ce19	Merge branch 'hiyouga:main' into main	2024-06-09 18:17:24 +08:00
mMrBun	cb1cbcb293	Implemented the tool_formatter and tool_extractor for glm4 tool_format	2024-06-09 18:16:15 +08:00
hiyouga	5aa4ce4756	release v0.8.0	2024-06-08 05:20:54 +08:00
hiyouga	ccc8b64cc2	update data processors	2024-06-07 04:15:40 +08:00
hoshi-hiyouga	181dbb0d05	Merge pull request #4009 from AlongWY/main supervised packing with greedy knapsack algorithm	2024-06-07 03:48:46 +08:00
hoshi-hiyouga	c09ad8bab3	Update supervised.py	2024-06-07 03:42:08 +08:00
hoshi-hiyouga	788e8232fc	Update supervised.py	2024-06-07 03:38:23 +08:00
hoshi-hiyouga	8cecade708	Update supervised.py	2024-06-07 03:38:04 +08:00
hiyouga	74f96efef9	rename files	2024-06-07 00:09:06 +08:00
hiyouga	149610c636	fix ppo dataset bug #4012	2024-06-06 19:03:20 +08:00
hiyouga	f48f5e646e	support glm-4	2024-06-05 15:16:38 +08:00
hiyouga	5a13b3baa6	tiny fix	2024-06-04 00:31:10 +08:00
hiyouga	a18acf2abe	fix #3992	2024-06-04 00:17:36 +08:00
hiyouga	49b1e88e3d	fix data loader hint	2024-06-03 18:28:27 +08:00
ylfeng	b47e317447	remove empty line	2024-05-31 21:43:08 +08:00
ylfeng	84aee57901	fix eos	2024-05-31 21:40:41 +08:00
ylfeng	f9db439cb7	supervised packing with greedy knapsack algorithm	2024-05-31 15:33:54 +08:00
hoshi-hiyouga	483eb47e5d	Merge pull request #3829 from seanzhang-zhichen/add_dataset_sample_num Add dataset sample num	2024-05-30 00:25:45 +08:00
hoshi-hiyouga	ca5dd7c6c1	Update loader.py	2024-05-30 00:20:20 +08:00
hoshi-hiyouga	f9a88b89ca	Update loader.py	2024-05-30 00:17:21 +08:00
hoshi-hiyouga	b55fb611c5	Update loader.py	2024-05-30 00:12:12 +08:00
hoshi-hiyouga	51dd454337	Update parser.py	2024-05-30 00:05:20 +08:00
hiyouga	d0aa36b8ad	fix cohere system	2024-05-29 20:58:23 +08:00
hiyouga	0930f58699	fix #3965	2024-05-29 20:55:51 +08:00
hiyouga	89ca832740	update readme	2024-05-29 18:39:11 +08:00
hzhaoy	0dd632fe9e	add TeleChat-12B/TeleChat-12B-v2 models	2024-05-29 15:00:37 +08:00
Yimi81	dc07413e7d	fix yi template	2024-05-27 13:11:25 +00:00
hiyouga	c1fdf81df6	tiny fix	2024-05-27 20:54:26 +08:00
hoshi-hiyouga	f1002b9f93	Update template.py	2024-05-27 20:51:56 +08:00
hoshi-hiyouga	122213a7a7	Update template.py	2024-05-27 20:51:26 +08:00
Jianbai Ye	cff815391f	add openchat-3.6-8B support	2024-05-27 20:42:08 +08:00
hiyouga	5581cb2e4e	update readme	2024-05-27 18:14:02 +08:00
seanzhang-zhichen	27cb51f7f8	Merge branch 'main' into add_dataset_sample_num	2024-05-24 15:57:47 +08:00
hiyouga	3a023bca2a	refactor data preprocessing, fix mllm rlhf	2024-05-24 04:08:25 +08:00
hiyouga	de0e67aff1	fix paligemma sft requires transformers>=4.41.1	2024-05-24 00:23:40 +08:00
hiyouga	7134fb02bb	fix paligemma sft	2024-05-21 20:03:09 +08:00

1 2

60 Commits