2822 Commits

Author SHA1 Message Date
Yaowei Zheng
3ae15da9c0
[misc] lint code (#9395) 2025-11-03 22:08:59 +08:00
魅影
215580c77d
[data] fix mm plugin for qwen omni video training (#9388)
Co-authored-by: frozenleaves <frozen@Mac.local>
2025-11-03 11:44:27 +08:00
魅影
767b344fb4
[model] remove npu sdpa patch (#9368)
Co-authored-by: frozenleaves <frozen@Mac.local>
2025-10-30 16:26:35 +08:00
Kingsley
3057db15c3
[readme] upd mcore readme (#9352) 2025-10-27 21:23:31 +08:00
Kingsley
13170577b2
[feat] support megatron-LM training by mcore_adapter (#9237)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2025-10-26 16:21:30 +08:00
Xiaosu Zhu
129e918106
[data] Fix Qwen3VL plugin (#9297)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
Co-authored-by: kingsley <kingsleydodonow@gmail.com>
2025-10-26 16:07:04 +08:00
Yaowei Zheng
9c0d033a15
[model] add qwen3vl 2b & 32b (#9343) 2025-10-24 13:22:36 +08:00
Yaowei Zheng
2a822178de
[deps] fix yanked packages (#9333) 2025-10-22 20:54:51 +08:00
Kingsley
b842457ef4
[ci] revert mac os ci setup (#9316) 2025-10-21 18:26:12 +08:00
魅影
2c6aded5d4
[v1] kernel plugin (#9274)
Co-authored-by: frozenleaves <frozen@Mac.local>
2025-10-18 18:02:14 +08:00
Yaowei Zheng
d9d67ba62d
[misc] fix import error (#9299) 2025-10-17 17:46:27 +08:00
Yaowei Zheng
a442fa90ad
[misc] fix import error (#9296) 2025-10-17 10:54:30 +08:00
wyfdgg
8c341cbaae
[model] support hunyuan-mt model (#9284)
Co-authored-by: wyfdgg <liwenkun0812@163.com>
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2025-10-17 10:33:09 +08:00
Yaowei Zheng
47a7dc1698
[deps] upgrade vllm (#9293) 2025-10-16 23:20:26 +08:00
Yaowei Zheng
1037f63311
[model] add qwen3vl 4b + 8b (#9275) 2025-10-15 15:00:36 +08:00
Ximing Xing
c867e28093
[model] adds semantic initialization support for special tokens (#9267)
Co-authored-by: ximingxing <ximingxing@tencent.com>
2025-10-14 17:00:48 +08:00
Peter-Hamster
3dbca4b533
[data] add new reason tool calls demo data (#9249)
Co-authored-by: unknown <Peter Zeng>
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2025-10-13 17:16:47 +08:00
Yaowei Zheng
9d1acbc191
[ci] fix ci (#9265) 2025-10-13 16:24:40 +08:00
Yaowei Zheng
52e46e162e
[v1] add data converter (#9263) 2025-10-13 15:54:47 +08:00
Jiayi Mao
48974783da
[model]: add ernie4_5_moe support for DeepSpeed Zero3 training (#9262) 2025-10-13 13:13:31 +08:00
Yaowei Zheng
575e4099df
[misc] add qwen bench script (#9259)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-10-13 11:45:25 +08:00
Yaowei Zheng
9687b71d3a
[v1] init data plugins (#9248)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-10-09 22:36:48 +08:00
Yaowei Zheng
1c35db60d6
[v1] support read dataset (#9243) 2025-10-09 17:16:33 +08:00
Yaowei Zheng
10146029ba
[v1] add v1 launcher (#9236)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-10-07 22:34:48 +08:00
Wu Wenhao
95b7188090
Merge commit from fork
* fix lfi and ssrf

* move utils to common

---------

Co-authored-by: d3do <chamlinx@outlook.com>
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
2025-10-07 20:55:29 +08:00
Yaowei Zheng
d5bb4e6394
[assets] update readme (#9232) 2025-10-05 16:42:19 +08:00
Yaowei Zheng
3fe6f0febd
[ci] update docker workflow (#9231) 2025-10-05 02:50:55 +08:00
Yaowei Zheng
40d3691e9e
[misc] fix moe models (#9230) 2025-10-05 02:41:02 +08:00
Yaowei Zheng
af8437095a
[ci] Change macOS version (#9229) 2025-10-05 02:18:30 +08:00
codingma
2e2f92701f
[model] add qwen3-vl-30b (#9227) 2025-10-04 14:12:37 +08:00
Yaowei Zheng
7d60b840ef
[v1] support switch v1 backend (#9226) 2025-10-02 15:59:19 +08:00
Yaowei Zheng
1d96c62df2
[v1] add v1 folders (#9225) 2025-10-02 15:25:57 +08:00
Yaowei Zheng
a0d44c650a
[misc] add data files (#9224) 2025-10-02 14:02:07 +08:00
Yaowei Zheng
bcc2c1fd8f [misc] move wechat out (#9223) 2025-10-02 02:06:09 +08:00
Yaowei Zheng
7dd910f067 [misc] lint (#9221) 2025-10-01 22:58:58 +08:00
krli
d10d65e4ce [docker] update Dockerfile to set no_proxy and fix pydantic version (#8651) 2025-10-01 14:33:47 +08:00
Ben Feuer
1c44b60e3e [feat] fp8 training (#8960)
Co-authored-by: Benjamin Feuer <penfever@gmail.com>
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2025-10-01 14:32:53 +08:00
Yaowei Zheng
e2b1594d31 [data] fix reasoning template (#9219) 2025-09-30 18:11:45 +08:00
h7878778h
09dedf144f [npu] Redirect SDPA to torch_npu.npu_fusion_attention (opt-in, ZeRO-3 safe, no impact off NPU) (#8972) 2025-09-30 18:11:31 +08:00
魅影
a04d777d7f [cli] support lazy import (#9217)
Co-authored-by: frozenleaves <frozen@Mac.local>
2025-09-30 18:02:26 +08:00
Yaowei Zheng
6ffebe5ff7 [data] fix qwen omni plugin (#9204)
Co-authored-by: kingsley <kingsleydodonow@gmail.com>
2025-09-28 01:02:29 +08:00
xvxuopop
0761a4448f [model] add qwen3-vl/qwen3-omni (#9196)
Co-authored-by: kingsley <kingsleydodonow@gmail.com>
2025-09-27 01:21:47 +08:00
wangshaofei
abc3b1e1c4 [docs] update ling-v2 to the readme (#9188) 2025-09-24 15:23:21 +08:00
Hertz
344c760cc1 [model] supported ERNIE4.5 Text Models (#9165)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2025-09-22 11:48:26 +08:00
Yaowei Zheng
80fe3a172d [model] add dots ocr (#9176) 2025-09-21 23:34:19 +08:00
Yaowei Zheng
800934b507 [assets] update readme (#9143) 2025-09-16 17:04:19 +08:00
Yaowei Zheng
e2ba32598d [assets] update readme (#9137) 2025-09-15 23:45:57 +08:00
Yaowei Zheng
812720909e [model] add qwen3 next (#9130) 2025-09-14 03:16:25 +08:00
Yaowei Zheng
260b5625c3 [assets] update wechat (#9129) 2025-09-14 03:05:08 +08:00
Yaowei Zheng
52488ac974 [deps] upgrade transformers to 4.56.1 (#9128) 2025-09-14 02:26:39 +08:00