hoshi-hiyouga
|
d337de668e
|
Update data_utils.py
Former-commit-id: 5c2a0e3b1d1afd2a9219d935d3421fffffc3a2c9
|
2024-07-15 00:54:34 +08:00 |
|
hoshi-hiyouga
|
ec372f91e9
|
Update loader.py
Former-commit-id: 860e3eb374947b72dcae88cab0a93ef561e3bfb3
|
2024-07-15 00:50:06 +08:00 |
|
hoshi-hiyouga
|
ee17741591
|
Update parser.py
Former-commit-id: b9760df588e64270a140d9111241c62c1cefe781
|
2024-07-14 23:04:34 +08:00 |
|
hiyouga
|
5ab997d484
|
fix gemma2 attention
Former-commit-id: aeafc68e169ae0ea5939cc81cb0cf89f0ca044b6
|
2024-07-13 23:33:45 +08:00 |
|
hiyouga
|
dfc7a7d5cd
|
fix #4792
Former-commit-id: d7547d6b9e4c660897e3ce0f4022e08686c172d5
|
2024-07-13 22:07:58 +08:00 |
|
hiyouga
|
0d6ec70c6f
|
add codegeex4, internlm2.5
Former-commit-id: 349a5fbc934ac289cad44b4e3eb16f458b94710c
|
2024-07-06 16:16:47 +08:00 |
|
codingma
|
5f2bd04799
|
1. add custom eval dataset support
2. merge load dataset and split dataset function
Former-commit-id: 963d97ba07e7efa3a4544c4d077283d9e112b3ad
|
2024-07-05 15:52:10 +08:00 |
|
hiyouga
|
9a1a5f9778
|
fix processors
Former-commit-id: 7215f3a8612b570cd322802d14db532927900117
|
2024-07-05 08:33:22 +08:00 |
|
hiyouga
|
edc8aefa59
|
fix #4683
Former-commit-id: cbff0ea0db6971f8ced503a2f0cb6bc43e7037ac
|
2024-07-05 00:58:05 +08:00 |
|
hzhaoy
|
c6f1bc65c0
|
tiny fix
Former-commit-id: 8f43ad988a4fd518a708fba53a173596ce2c59dd
|
2024-07-04 10:20:28 +08:00 |
|
hiyouga
|
0517d7bee5
|
tiny fix
Former-commit-id: 935703b46d2871ce1014832da067dfe4a50c0610
|
2024-07-04 03:02:23 +08:00 |
|
hiyouga
|
5bc0b9b31c
|
fix data map for packing
Former-commit-id: ee6f8f926f084a195b2dbbd074e041e6c62c6ef4
|
2024-07-04 03:01:31 +08:00 |
|
hiyouga
|
3d219b91b9
|
fix packing for eager/sdpa attn
Former-commit-id: 735a033ceb7f2da6da71d138ea091d8a665411a9
|
2024-07-04 01:52:43 +08:00 |
|
hoshi-hiyouga
|
a90c6306f8
|
Merge pull request #4224 from chuan298/main
Implement efficient packing without cross-contamination attention
Former-commit-id: ac382cc9fe4ec483658fd54f07f9a123788ce1b1
|
2024-07-04 01:18:54 +08:00 |
|
hiyouga
|
60558388ec
|
update packing
Former-commit-id: f3d9c31efa0e64317bdd5b4ed6f78653cf3b5ba4
|
2024-07-04 01:10:55 +08:00 |
|
hiyouga
|
5acaa476d6
|
update hparams
Former-commit-id: 1c4feac44192b1f540208837f5a530b0d3f5fb37
|
2024-07-03 23:18:58 +08:00 |
|
hiyouga
|
e6ba7ef3e6
|
improve rlhf
Former-commit-id: e441780e3db256ca09a442ea9254e7ce16898a07
|
2024-07-02 22:23:08 +08:00 |
|
ancv
|
20fdf177e8
|
move efficient_packing from data_args to model_args
Former-commit-id: 7b61659c707480bcf8c802c73e10d12ad5b9b965
|
2024-07-02 18:37:55 +07:00 |
|
hoshi-hiyouga
|
a715490c2a
|
Merge branch 'main' into main
Former-commit-id: 7be442f37d53a0c6324728fa1fa8e2c84d7f0fa5
|
2024-07-01 21:01:09 +08:00 |
|
hiyouga
|
67d2eb6b2a
|
fix #4402 #4617
Deprecate reserved_label_len arg
Former-commit-id: 4b6568984c0be4b31e7aa91b7c0d52b7f7b12b0b
|
2024-07-01 01:19:27 +08:00 |
|
hiyouga
|
cf2dc4c444
|
fix #4556
Former-commit-id: 81faa9a985c14e83e38f42aedd228edb676b0695
|
2024-06-26 19:43:16 +08:00 |
|
hiyouga
|
135bfbf7c1
|
tiny fix
Former-commit-id: bb57478366a70a0871af30ab31c890f471e27ff4
|
2024-06-25 01:15:19 +08:00 |
|
hoshi-hiyouga
|
5cfcb8262e
|
Merge pull request #4417 from mMrBun/main
Add tool_format parameter to rewrite templates for different function call formats.
Former-commit-id: 8d1460cad5bff5e4626fdd675046021e0a3d1947
|
2024-06-24 23:17:55 +08:00 |
|
hoshi-hiyouga
|
5d6cf55208
|
Update template.py
Former-commit-id: d53517bff6f8734221d7df9982f3bdd4d2eb2cab
|
2024-06-24 23:12:59 +08:00 |
|
hoshi-hiyouga
|
9a1ec19845
|
Update loader.py
Former-commit-id: afa59d61844595e6b615227e6bfdc0b16c8015dd
|
2024-06-24 23:06:18 +08:00 |
|
hiyouga
|
a79e93f335
|
fix #4410
Former-commit-id: f49adc4ab5eade21d7a9e029212f17688ee9b0cf
|
2024-06-24 22:34:31 +08:00 |
|
mMrBun
|
43a065bb07
|
Add tool_format to overwrite tool formatter template
Former-commit-id: af08971ca50443fd5597e5e4412a3aa17214502f
|
2024-06-22 02:13:23 +08:00 |
|
hiyouga
|
4513a2cc75
|
remove dup template
Former-commit-id: 5fec12203b24608af4d4993f44a657eb5a0348e5
|
2024-06-22 01:31:32 +08:00 |
|
hiyouga
|
c65f7e9bd5
|
fix jinja template
Former-commit-id: 0ebf2e2ee23918d28b0cbb20ba456732d6eedfbb
|
2024-06-19 20:03:50 +08:00 |
|
hiyouga
|
3e0fa4a8da
|
fix templates
Former-commit-id: 6f357d59b73309c5955683008632e7f320e7dcb1
|
2024-06-19 17:44:05 +08:00 |
|
hiyouga
|
235ed85b0f
|
fix bug
Former-commit-id: 412139eaa2fde98ba19e1257d21144382a59f0d6
|
2024-06-19 03:49:23 +08:00 |
|
hiyouga
|
1ca639a777
|
use prefix to replace force system
Former-commit-id: 731d9a964f1c3dbfb83825524d697831e691fb9d
|
2024-06-19 03:39:52 +08:00 |
|
hiyouga
|
e36a994fe6
|
fix tool formatter, allow parallel function #4362
Former-commit-id: b8f16c976db4ecec1cc8558851c8cbfb6a5b7e9c
|
2024-06-19 03:23:51 +08:00 |
|
hoshi-hiyouga
|
19ffcfea76
|
Merge pull request #4173 from mMrBun/main
Implemented the tool_formatter and tool_extractor for glm4 and Qwen2 tool_format
Former-commit-id: 36b02ceed40198ecd5d559ee4ebef9205442ded2
|
2024-06-19 03:18:55 +08:00 |
|
hiyouga
|
05f3a3c944
|
tiny fix
Former-commit-id: f7f440986b0ae3b38ea9f2da80789629d4f79ea1
|
2024-06-16 01:06:41 +08:00 |
|
ancv
|
f91fe10985
|
remove some unused params
Former-commit-id: fef8132c50505a5fb6a246bd024491bd31798a3c
|
2024-06-15 23:00:55 +07:00 |
|
hiyouga
|
bb88536166
|
add license
Former-commit-id: 69cfc98d7c81756a5ab6bf962240e393e449fef0
|
2024-06-15 17:54:33 +08:00 |
|
hiyouga
|
49b58fd6af
|
fix #4221
Former-commit-id: 05a3be4853b941909e7d193c31e8d62c8c5f879b
|
2024-06-13 02:48:21 +08:00 |
|
ancv
|
c7ab302c69
|
implement efficient packing without cross-contamination attention
Former-commit-id: a64a5305c0da5ef092d4cc26faf829bb44de65d1
|
2024-06-12 11:56:01 +07:00 |
|
hoshi-hiyouga
|
cc9717e2f2
|
Update pretrain.py
Former-commit-id: e2317b2a84149e39fddfd6366be3de23dfb71f82
|
2024-06-11 17:02:14 +08:00 |
|
d
|
77bf3d66c7
|
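After extensive incremental pre-training and comparison experiments, found this bug: the tokenizer.eos_token used by llama3 during pre-training is '<|end_of_text|>', so that token must also be appended after each sample here instead of '<|eot_id|>'; otherwise performance can easily degrade severely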
Former-commit-id: ef470561f742b16eaa0f99c4cadecd7c84ce6bd2
|
2024-06-11 16:23:40 +08:00 |
|
mMrBun
|
bc04ca464a
|
Optimize the handling of QWEN2 in scenarios involving multiple tool calls.
Former-commit-id: 48f870edc96ada40360f7e6e67cbf58805295b33
|
2024-06-10 02:00:14 +08:00 |
|
mMrBun
|
44829df762
|
Removed unnecessary comments.
Former-commit-id: 2b81252aa693871098931cd7873ef83ef4922ba5
|
2024-06-09 18:25:22 +08:00 |
|
mMrBun
|
94ddfa66c0
|
Merge branch 'hiyouga:main' into main
Former-commit-id: c25734d874a36222e0a540a2c994bbda73008b27
|
2024-06-09 18:17:24 +08:00 |
|
mMrBun
|
8db8ed5a41
|
Implemented the tool_formatter and tool_extractor for glm4 tool_format
Former-commit-id: db7fa4490ea7f6966418d2879c895cbc1763b16d
|
2024-06-09 18:16:15 +08:00 |
|
hiyouga
|
c0c387e4db
|
release v0.8.0
Former-commit-id: 004db680b9e3996ec511ee818df6c0c02bf13603
|
2024-06-08 05:20:54 +08:00 |
|
hiyouga
|
8c4c2e580c
|
update data processors
Former-commit-id: 04b138cbcb8b9a72e4bbda6c65843bb459e525e7
|
2024-06-07 04:15:40 +08:00 |
|
hoshi-hiyouga
|
07f33e7641
|
Merge pull request #4009 from AlongWY/main
supervised packing with greedy knapsack algorithm
Former-commit-id: 5ded166b39a75a98ded5733678f5a1eab7d4cc71
|
2024-06-07 03:48:46 +08:00 |
|
hoshi-hiyouga
|
1998c641af
|
Update supervised.py
Former-commit-id: 04b6c2a754e602e0b698cfe6c255c2f2486d8865
|
2024-06-07 03:42:08 +08:00 |
|
hoshi-hiyouga
|
be1e5f9d62
|
Update supervised.py
Former-commit-id: 49993c4f4e1f871a22ff0196afe60026b668a4dc
|
2024-06-07 03:38:23 +08:00 |
|