Commit Graph

60 Commits

Author SHA1 Message Date
hiyouga
db9a1912e3 remove dup template 2024-06-22 01:31:32 +08:00
hiyouga
2b596fb55f fix jinja template 2024-06-19 20:03:50 +08:00
hiyouga
4cff6a4ad5 fix templates 2024-06-19 17:44:05 +08:00
hiyouga
6d2bf216ac fix bug 2024-06-19 03:49:23 +08:00
hiyouga
4f22eae8f4 use prefix to replace force system 2024-06-19 03:39:52 +08:00
hiyouga
cd75b1fe9d fix tool formatter, allow parallel function #4362 2024-06-19 03:23:51 +08:00
hoshi-hiyouga
c0ca42566c Merge pull request #4173 from mMrBun/main
Implemented the tool_formatter and tool_extractor for glm4 and Qwen2 tool_format
2024-06-19 03:18:55 +08:00
hiyouga
38b6b0f52e tiny fix 2024-06-16 01:06:41 +08:00
hiyouga
d87108daa6 add license 2024-06-15 17:54:33 +08:00
hiyouga
6baafd4eb3 fix #4221 2024-06-13 02:48:21 +08:00
hoshi-hiyouga
0c29233237 Update pretrain.py 2024-06-11 17:02:14 +08:00
d
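6979f3f848 After extensive incremental pretraining and comparison experiments, found this bug: llama3 uses '<|end_of_text|>' as tokenizer.eos_token during pretraining, so each example here must also be followed by that token rather than '<|eot_id|>'; otherwise it can easily cause severe performance degradation 2024-06-11 16:23:40 +08:00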
mMrBun
950e360ca0 Optimize the handling of QWEN2 in scenarios involving multiple tool calls. 2024-06-10 02:00:14 +08:00
mMrBun
6ed0b0c800 Removed unnecessary comments. 2024-06-09 18:25:22 +08:00
mMrBun
0f2609ce19 Merge branch 'hiyouga:main' into main 2024-06-09 18:17:24 +08:00
mMrBun
cb1cbcb293 Implemented the tool_formatter and tool_extractor for glm4 tool_format 2024-06-09 18:16:15 +08:00
hiyouga
5aa4ce4756 release v0.8.0 2024-06-08 05:20:54 +08:00
hiyouga
ccc8b64cc2 update data processors 2024-06-07 04:15:40 +08:00
hoshi-hiyouga
181dbb0d05 Merge pull request #4009 from AlongWY/main
supervised packing with greedy knapsack algorithm
2024-06-07 03:48:46 +08:00
hoshi-hiyouga
c09ad8bab3 Update supervised.py 2024-06-07 03:42:08 +08:00
hoshi-hiyouga
788e8232fc Update supervised.py 2024-06-07 03:38:23 +08:00
hoshi-hiyouga
8cecade708 Update supervised.py 2024-06-07 03:38:04 +08:00
hiyouga
74f96efef9 rename files 2024-06-07 00:09:06 +08:00
hiyouga
149610c636 fix ppo dataset bug #4012 2024-06-06 19:03:20 +08:00
hiyouga
f48f5e646e support glm-4 2024-06-05 15:16:38 +08:00
hiyouga
5a13b3baa6 tiny fix 2024-06-04 00:31:10 +08:00
hiyouga
a18acf2abe fix #3992 2024-06-04 00:17:36 +08:00
hiyouga
49b1e88e3d fix data loader hint 2024-06-03 18:28:27 +08:00
ylfeng
b47e317447 remove empty line 2024-05-31 21:43:08 +08:00
ylfeng
84aee57901 fix eos 2024-05-31 21:40:41 +08:00
ylfeng
f9db439cb7 supervised packing with greedy knapsack algorithm 2024-05-31 15:33:54 +08:00
hoshi-hiyouga
483eb47e5d Merge pull request #3829 from seanzhang-zhichen/add_dataset_sample_num
Add dataset sample num
2024-05-30 00:25:45 +08:00
hoshi-hiyouga
ca5dd7c6c1 Update loader.py 2024-05-30 00:20:20 +08:00
hoshi-hiyouga
f9a88b89ca Update loader.py 2024-05-30 00:17:21 +08:00
hoshi-hiyouga
b55fb611c5 Update loader.py 2024-05-30 00:12:12 +08:00
hoshi-hiyouga
51dd454337 Update parser.py 2024-05-30 00:05:20 +08:00
hiyouga
d0aa36b8ad fix cohere system 2024-05-29 20:58:23 +08:00
hiyouga
0930f58699 fix #3965 2024-05-29 20:55:51 +08:00
hiyouga
89ca832740 update readme 2024-05-29 18:39:11 +08:00
hzhaoy
0dd632fe9e add TeleChat-12B/TeleChat-12B-v2 models 2024-05-29 15:00:37 +08:00
Yimi81
dc07413e7d fix yi template 2024-05-27 13:11:25 +00:00
hiyouga
c1fdf81df6 tiny fix 2024-05-27 20:54:26 +08:00
hoshi-hiyouga
f1002b9f93 Update template.py 2024-05-27 20:51:56 +08:00
hoshi-hiyouga
122213a7a7 Update template.py 2024-05-27 20:51:26 +08:00
Jianbai Ye
cff815391f add openchat-3.6-8B support 2024-05-27 20:42:08 +08:00
hiyouga
5581cb2e4e update readme 2024-05-27 18:14:02 +08:00
seanzhang-zhichen
27cb51f7f8 Merge branch 'main' into add_dataset_sample_num 2024-05-24 15:57:47 +08:00
hiyouga
3a023bca2a refactor data preprocessing, fix mllm rlhf 2024-05-24 04:08:25 +08:00
hiyouga
de0e67aff1 fix paligemma sft
requires transformers>=4.41.1
2024-05-24 00:23:40 +08:00
hiyouga
7134fb02bb fix paligemma sft 2024-05-21 20:03:09 +08:00