2111 Commits

Author SHA1 Message Date
Yaowei Zheng
8abb8fb533 [v1] use async streamer (#9741) 2026-01-09 16:12:07 +08:00
Yaowei Zheng
5cccaeec82 [model] clean obsolete models (#9736) 2026-01-09 16:12:07 +08:00
Jackey
5fb5d7ebd3 [model] support for microsoft's Phi-4-mini (#9734) 2026-01-09 12:24:45 +08:00
Vo Van Phuc
5cfd804b59 [refactor] rename lfm template to lfm2 and add LFM 2.5 to README (#9731) 2026-01-07 19:25:04 +08:00
Yaowei Zheng
4c1eb922e2 [misc] fix parser (#9730) 2026-01-07 17:36:08 +08:00
Vo Van Phuc
958fb523a2 [model] support LiquidAI's LFM2.5-VL vision-language model (#9729) 2026-01-07 17:20:29 +08:00
Vo Van Phuc
b4e051bea4 [model] support for LiquidAI's LFM2.5 (Liquid Foundation Models) (#9726) 2026-01-07 14:14:47 +08:00
Yaowei Zheng
d22de0d4bf [v1] add renderer ut (#9722) 2026-01-07 02:06:07 +08:00
Yaowei Zheng
ea0b4e2466 [v1] add cli sampler (#9721) 2026-01-06 23:31:27 +08:00
yanglele
e944dc442c [feature] add support for EAFT loss (#9720)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-06 23:07:12 +08:00
Xunpeng Xiao
68119e5522 [misc] Add a PyTorch version warning for Conv3D. (#9715) 2026-01-05 13:26:29 +08:00
Yaowei Zheng
f60a6e3d01 [v1] add init plugin (#9716) 2026-01-04 20:51:46 +08:00
Yaowei Zheng
8600530002 [misc] lint (#9710) 2026-01-04 13:47:56 +08:00
Hertz
9ae62c6fc0 [model] support Youtu-LLM-2B (#9707) 2026-01-04 13:17:57 +08:00
Xunpeng Xiao
0087bc253b [misc] Compatible with an empty architectures field in config.json (#9709) 2026-01-04 12:11:35 +08:00
Santosh Bhavani
355d5c5e5a [fix] fp8: add Transformer Engine backend support (#9705)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2026-01-01 10:18:02 +08:00
Yaowei Zheng
6fe6bd290b [misc] set dev version (#9703) 2025-12-31 23:41:40 +08:00
Yaowei Zheng
95ac3f2373 [release] Bye 2025 (#9702) 2025-12-31 22:22:40 +08:00
Username_Full
000526908a [core deps] upgrade TRL to be between 0.18 and 0.24 (#9617)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2025-12-31 20:54:27 +08:00
浮梦
16735b9e35 [v1] Refactor kernel plugin (#9669)
Co-authored-by: frozenleaves <frozen@Mac.local>
2025-12-31 18:26:48 +08:00
Kingsley
bb1ba31005 [misc] lint mca code (#9692) 2025-12-29 11:44:38 +08:00
Yaowei Zheng
3f0c3dc84d [assets] fix installation (#9687)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-28 19:29:28 +08:00
Hertz
c107cc22d0 [model] support MiniMax-M1&M2 series (#9680)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2025-12-28 19:02:05 +08:00
Copilot
eceec8ab69 [deps] goodbye python 3.9 (#9677)
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: hiyouga <16256802+hiyouga@users.noreply.github.com>
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
2025-12-27 02:50:44 +08:00
Yaowei Zheng
55590f5ece [misc] fix ci with uv (#9676) 2025-12-27 01:39:13 +08:00
Xunpeng Xiao
3c17f2722c [model] Update ernie_vl to adapt new version (#9665) 2025-12-26 19:57:49 +08:00
Yaowei Zheng
a754604c11 [misc] fix accelerator (#9661)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-25 02:11:04 +08:00
Xunpeng Xiao
6a2eafbae3 [feat] Models trained and inferred with Mxfp4 are dequantized by default (#9652)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2025-12-24 00:26:40 +08:00
Yaowei Zheng
84485406b7 [ci] disable pip cache for ci (#9654) 2025-12-23 18:37:40 +08:00
Kingsley
1c8a42d2f8 [v1&WIP] dataloader init (#9645) 2025-12-23 16:29:47 +08:00
thulyubh22
7901b2f32e [model] efficient tuning for gpt-oss (#9354)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-23 16:28:38 +08:00
Yaowei Zheng
6ef9854713 [misc] fix cache & pin transformers to 4.57.1 (#9638) 2025-12-22 00:20:55 +08:00
Hertz
4923f52a28 [model] support MiMo-V2-Flash model (#9637) 2025-12-21 14:38:18 +08:00
Yaowei Zheng
0894b4f37e [misc] lint (#9636) 2025-12-20 16:19:39 +08:00
ZIYI ZENG
b0d49e137f [misc] Support split eval_dataset when explict set "predict_with_generate" (#9604)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-20 01:46:00 +08:00
Xunpeng Xiao
ddd7dcc722 [data] Fix the video frame sampling issue #9620 (#9634) 2025-12-19 18:36:31 +08:00
浮梦
5204cd2bca [misc] add version check for moe (#9633) 2025-12-19 14:57:37 +08:00
Xunpeng Xiao
8c74dca76a [feat] Models trained and inferred with FP8 are dequantized by default (#9627) 2025-12-18 22:54:35 +08:00
mrhaoxx
a769fb94b9 [feat] support ktransformers for dpo (#9621)
Co-authored-by: poryfly <porykid@gmail.com>
2025-12-18 21:26:25 +08:00
mrhaoxx
964569751f [kt] refactor ktransformers integration (#9632) 2025-12-18 21:26:04 +08:00
Hertz
9fd4b094d4 [model] support VibeThinker models (#9616) 2025-12-16 21:50:46 +08:00
浮梦
18c21bce5a [test] add allreduce test on npu (#9619)
Co-authored-by: frozenleaves <frozen@Mac.local>
2025-12-16 21:33:30 +08:00
Yaowei Zheng
aeda079014 [v1] model loader (#9613) 2025-12-14 11:50:52 +08:00
Xunpeng Xiao
fdd24276ed [feat] support new function call value (#9610)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2025-12-14 00:20:33 +08:00
Yaowei Zheng
110d21713e [v1] add dp & mp mesh (#9611) 2025-12-13 01:44:28 +08:00
Yaowei Zheng
203069e11c [v1] add accelerator (#9607) 2025-12-12 19:22:06 +08:00
tangefly
4fd94141a4 [model] Add Ministral3 (#9582)
Co-authored-by: kingsley <kingsleydodonow@gmail.com>
2025-12-10 15:57:24 +08:00
Kingsley
22d6ac29d5 [model] Rename GLMV template (#9595) 2025-12-10 13:27:47 +08:00
DoubleWheat
cff4483392 [config] Fix RoPE scaling patch for resuming from a scaled model (#9588) 2025-12-09 20:37:37 +08:00
Yaowei Zheng
5d56817e2b [misc] lint (#9593)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-09 18:00:35 +08:00