Commit Graph

652 Commits

Author SHA1 Message Date
hiyouga
870f23d7ea fix #6546 2025-01-07 06:30:44 +00:00
hiyouga
4b8add7287 update model name 2025-01-02 12:19:21 +00:00
hiyouga
67442bd497 add gpt2 model 2025-01-02 12:07:38 +00:00
hiyouga
1800f8c72d fix #6499 2025-01-02 11:28:54 +00:00
hiyouga
e67b9dcc3a add deepseek3 model 2024-12-30 13:39:20 +00:00
hoshi-hiyouga
91467ed313 Merge pull request #5507 from piamo/main
Add deepseek-v2.5 template
2024-12-30 21:08:25 +08:00
hiyouga
6f5bb3b8e5 fix #6482 2024-12-30 06:03:07 +00:00
hiyouga
2719867982 fix #6448 2024-12-27 16:54:39 +00:00
youkaichao
c39d81cd1d Update cli.py 2024-12-26 23:22:09 +08:00
hiyouga
ee0e400f41 add qvq #6439 2024-12-25 07:52:41 +00:00
hiyouga
8fd38d273e update readme 2024-12-23 14:08:59 +00:00
hoshi-hiyouga
c23a4d0658 Merge pull request #5922 from Tuyohai/main
support granite3 models
2024-12-23 16:46:02 +08:00
hiyouga
5111cac6f8 support report custom args 2024-12-21 21:42:45 +00:00
hiyouga
84cd1188ac fix paligemma infer 2024-12-21 20:24:32 +00:00
hoshi-hiyouga
947e22a4a3 Merge pull request #6401 from Zeyi-Lin/hiyouga/swanlab
feat: add swanlab for experiment tracking and visualization.
2024-12-21 14:09:33 +08:00
ZeYi Lin
82e5d75014 fix: project blank 2024-12-20 18:26:02 +08:00
ZeYi Lin
3a7ea2048a fix: by hiyouga suggestion 2024-12-20 16:43:03 +08:00
ZeYi Lin
5f6dafd70e feat: ui improve 2024-12-20 11:03:02 +08:00
ZeYi Lin
0a52962db3 fix: text 2024-12-19 21:26:02 +08:00
ZeYi Lin
d0eb64d5e3 fix: bugs 2024-12-19 21:08:16 +08:00
ZeYi Lin
7eb49e5ffa docs: config framework 2024-12-19 20:22:36 +08:00
ZeYi Lin
3306919629 fix: string 2024-12-19 20:18:59 +08:00
hiyouga
d4c1fda1ad fix #6391 2024-12-19 12:16:38 +00:00
ZeYi Lin
8c2df41b93 feat: optimize frontend 2024-12-19 19:04:19 +08:00
ZeYi Lin
d5cf87990e feat: swanlab params 2024-12-19 18:47:27 +08:00
hiyouga
c7cedc7569 support disable shuffling 2024-12-19 08:53:21 +00:00
hiyouga
96f8f103e5 add swanlab 2024-12-19 07:12:31 +00:00
hiyouga
369cca8110 fix webui 2024-12-19 06:48:03 +00:00
hiyouga
d3509050dc add paligemma2 2024-12-18 08:57:26 +00:00
hoshi-hiyouga
015f213788 Merge pull request #6313 from ge-xing/main
support telechat2 model
2024-12-18 16:16:17 +08:00
hiyouga
98795854e3 support qwen tool format 2024-12-17 20:12:06 +00:00
hiyouga
bcc413cf64 change default replace jinja to false 2024-12-17 19:27:10 +00:00
ylfeng
115924af47 Support Mistral format tools 2024-12-17 19:13:26 +00:00
hiyouga
df5655f61c fix llama3 tool template 2024-12-17 17:05:10 +00:00
hoshi-hiyouga
e12c80ace8 Merge pull request #6367 from hiyouga/hiyouga/add_model
[model&template] add llama3.3 & support llama3 tool prompt
2024-12-18 00:13:28 +08:00
hiyouga
b24ae55ebf support llama3 tool prompt 2024-12-17 15:52:37 +00:00
Yaser Afshar
1c8ad22a5f Add missing key to init_kwargs 2024-12-17 12:34:05 +00:00
Yaser Afshar
0943776326 Add trust_remote_code parameter and remove True
- Introduced a new model parameter `trust_remote_code`
- Set the default value of `trust_remote_code` to `False`
  to enhance security
2024-12-17 12:25:12 +00:00
zhaohu xing
04f19ed0f3 support telechat2 model 2024-12-17 12:15:33 +00:00
hoshi-hiyouga
a665ad6178 Merge pull request #6364 from hiyouga/hiyouga/control_reenterent_gc
[model] support non-reenterent-gc
2024-12-17 19:58:36 +08:00
hiyouga
f319da6937 support non-reenterent-gc & fix #6358 2024-12-17 11:41:59 +00:00
hoshi-hiyouga
6973828307 Merge pull request #6363 from hiyouga/hiyouga/control_skip_eos
[infer] support control eos
2024-12-17 19:35:40 +08:00
hiyouga
eda76de32b support control eos, fix #6345 2024-12-17 10:42:05 +00:00
hiyouga
2d107d3aef generalized packing & fix #6343 2024-12-17 10:26:19 +00:00
hiyouga
142191e466 fix #6348 2024-12-17 10:06:46 +00:00
hiyouga
2811814fc4 fix mrope 2024-12-12 15:08:17 +00:00
hiyouga
99c62660c6 support qwen2vl train proj only 2024-12-05 10:37:42 +00:00
hiyouga
207f8b069c support qwen2vl vllm infer 2024-12-05 10:17:26 +00:00
hiyouga
eb3e147d19 fix scripts 2024-12-05 03:47:32 +00:00
hoshi-hiyouga
cf29846830 Merge pull request #6160 from village-way/pr_dataloader
fix:tokenized_path not None and load_from_disk return Dataset Trigger…
2024-12-04 22:18:19 +08:00