Commit Graph

92 Commits

Author SHA1 Message Date
hoshi-hiyouga
15b399a82f Merge pull request #4691 from codemayq/feature-suppot-eval-dataset
add eval dataset support
2024-07-15 01:00:34 +08:00
hoshi-hiyouga
df52fb05b1 Update preprocess.py 2024-07-15 00:55:36 +08:00
hoshi-hiyouga
84e4047f8a Update parser.py 2024-07-15 00:55:21 +08:00
hoshi-hiyouga
97a0e291c7 Update data_utils.py 2024-07-15 00:54:34 +08:00
hoshi-hiyouga
a5b809516e Update loader.py 2024-07-15 00:50:06 +08:00
hoshi-hiyouga
3d39d74003 Update parser.py 2024-07-14 23:04:34 +08:00
hiyouga
2f6af73da2 fix gemma2 attention 2024-07-13 23:33:45 +08:00
hiyouga
6b48308ef9 fix #4792 2024-07-13 22:07:58 +08:00
hiyouga
53b1002fb7 add codegeex4, internlm2.5 2024-07-06 16:16:47 +08:00
codingma
76f3bbcfc0 1. add custom eval dataset support
2. merge load dataset and split dataset function
2024-07-05 15:52:10 +08:00
hiyouga
9f33f1edf5 fix processors 2024-07-05 08:33:22 +08:00
hiyouga
e43809bced fix #4683 2024-07-05 00:58:05 +08:00
hzhaoy
738df47748 tiny fix 2024-07-04 10:20:28 +08:00
hiyouga
44747cebd2 tiny fix 2024-07-04 03:02:23 +08:00
hiyouga
b5d101e1bf fix data map for packing 2024-07-04 03:01:31 +08:00
hiyouga
6fd6aa4530 fix packing for eager/sdpa attn 2024-07-04 01:52:43 +08:00
hoshi-hiyouga
87d9b2d005 Merge pull request #4224 from chuan298/main
Implement efficient packing without cross-contamination attention
2024-07-04 01:18:54 +08:00
hiyouga
cce7083024 update packing 2024-07-04 01:10:55 +08:00
hiyouga
575a02a23d update hparams 2024-07-03 23:18:58 +08:00
hiyouga
c47ab6c072 improve rlhf 2024-07-02 22:23:08 +08:00
ancv
e8e13b0942 move efficient_packing from data_args to model_args 2024-07-02 18:37:55 +07:00
hoshi-hiyouga
e8e6af2651 Merge branch 'main' into main 2024-07-01 21:01:09 +08:00
hiyouga
1771251ce3 fix #4402 #4617
Deprecate reserved_label_len arg
2024-07-01 01:19:27 +08:00
hiyouga
59e0b4f616 fix #4556 2024-06-26 19:43:16 +08:00
hiyouga
41086059b1 tiny fix 2024-06-25 01:15:19 +08:00
hoshi-hiyouga
def6d280db Merge pull request #4417 from mMrBun/main
Add tool_format parameter to rewrite templates for different function call formats.
2024-06-24 23:17:55 +08:00
hoshi-hiyouga
1240bd57d8 Update template.py 2024-06-24 23:12:59 +08:00
hoshi-hiyouga
dddfd516ee Update loader.py 2024-06-24 23:06:18 +08:00
hiyouga
fca893d73c fix #4410 2024-06-24 22:34:31 +08:00
mMrBun
20e2e6fdcb Add tool_format to overwrite tool formatter template 2024-06-22 02:13:23 +08:00
hiyouga
db9a1912e3 remove dup template 2024-06-22 01:31:32 +08:00
hiyouga
2b596fb55f fix jinja template 2024-06-19 20:03:50 +08:00
hiyouga
4cff6a4ad5 fix templates 2024-06-19 17:44:05 +08:00
hiyouga
6d2bf216ac fix bug 2024-06-19 03:49:23 +08:00
hiyouga
4f22eae8f4 use prefix to replace force system 2024-06-19 03:39:52 +08:00
hiyouga
cd75b1fe9d fix tool formatter, allow parallel function #4362 2024-06-19 03:23:51 +08:00
hoshi-hiyouga
c0ca42566c Merge pull request #4173 from mMrBun/main
Implemented the tool_formatter and tool_extractor for glm4 and Qwen2 tool_format
2024-06-19 03:18:55 +08:00
hiyouga
38b6b0f52e tiny fix 2024-06-16 01:06:41 +08:00
ancv
04315c3d92 remove some unused params 2024-06-15 23:00:55 +07:00
hiyouga
d87108daa6 add license 2024-06-15 17:54:33 +08:00
hiyouga
6baafd4eb3 fix #4221 2024-06-13 02:48:21 +08:00
ancv
b2c367bc61 implement efficient packing without cross-contamination attention 2024-06-12 11:56:01 +07:00
hoshi-hiyouga
0c29233237 Update pretrain.py 2024-06-11 17:02:14 +08:00
d
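6979f3f848 After extensive continued pre-training and comparison experiments, found this bug: llama3 uses '<|end_of_text|>' as tokenizer.eos_token during pre-training, so that token, not '<|eot_id|>', must also be appended after each sample here; otherwise performance can easily degrade severely 2024-06-11 16:23:40 +08:00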
mMrBun
950e360ca0 Optimize the handling of QWEN2 in scenarios involving multiple tool calls. 2024-06-10 02:00:14 +08:00
mMrBun
6ed0b0c800 Removed unnecessary comments. 2024-06-09 18:25:22 +08:00
mMrBun
0f2609ce19 Merge branch 'hiyouga:main' into main 2024-06-09 18:17:24 +08:00
mMrBun
cb1cbcb293 Implemented the tool_formatter and tool_extractor for glm4 tool_format 2024-06-09 18:16:15 +08:00
hiyouga
5aa4ce4756 release v0.8.0 2024-06-08 05:20:54 +08:00
hiyouga
ccc8b64cc2 update data processors 2024-06-07 04:15:40 +08:00