2957 Commits

Author SHA1 Message Date
Shanay Mehta
aab9b400bb [model] Add DeepSpeed Z3 leaf module for Qwen3-Next (#10194)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 19:54:37 +08:00
P. Clawmogorov
50599c719b [misc] remove safe_serialization arg for transformers v5 compatibility (#10208)
Co-authored-by: P. Clawmogorov <262173731+Alm0stSurely@users.noreply.github.com>
2026-02-24 11:14:19 +08:00
Kingsley
a0f3ad0cee [mca] update supported models (#10196) 2026-02-20 22:02:49 +08:00
jiaqiw09
f80e15dbb4 [ci] fix ut huggingface hub 429 error when transformers>=5.0.0 (#10155) 2026-02-12 22:14:10 +08:00
sunyi0505
991267fd3b [v1] support quantization (#10161) 2026-02-12 20:37:41 +08:00
浮梦
5c52afa30d [v1] support deepspeed (#10181) 2026-02-12 17:24:30 +08:00
Junyou Su
675ce8cc7f [algo] add ASFT (#10174) 2026-02-12 13:12:14 +08:00
jiaqiw09
ab073f4c13 [v1] add LoRA/Freeze support and merge workflow (#10157) 2026-02-12 13:02:09 +08:00
Shanay Mehta
184304b5b4 [model] add liger kernel support for Qwen3-Next (#10176)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 21:47:48 +08:00
Xue Yadong
d3ebd5678d [model] support GLM-OCR SFT (#10183) 2026-02-10 21:41:01 +08:00
浮梦
1d5e8ebcd0 [v1] init commit for v1 docs (#10145)
Co-authored-by: frozenleaves <frozen@Mac.local>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: jiaqiw09 <jiaqiw960714@gmail.com>
Co-authored-by: jiaqiw09 <60021713+jiaqiw09@users.noreply.github.com>
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2026-02-09 19:43:55 +08:00
Shanay Mehta
ea644d04ec [model] support GLM-4.7-Flash SFT (#10173)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 10:40:44 +08:00
Username_Full
92fa3df4c4 [trainer] add dpo/kto fsdp fsdp2 support (#10127) 2026-02-04 23:27:12 +08:00
Hertz
8bedfafa4e [model] support MiniCPM-o-4.5 (#10163)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2026-02-04 23:21:27 +08:00
Yaowei Zheng
1a02717fa8 [assets] update readme (#10159) 2026-02-03 19:11:15 +08:00
ゆり
e7cb145f5d [logging] Fix race condition in LoggerHandler during multi-GPU training (#10156)
Co-authored-by: yurekami <yurekami@users.noreply.github.com>
2026-02-03 11:14:07 +08:00
Hertz
b53d7037c2 [model] support youtu-vl model (#10152) 2026-02-02 21:42:43 +08:00
浮梦
bf04ca6af8 [deps] adapt to transformers v5 (#10147)
Co-authored-by: frozenleaves <frozen@Mac.local>
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
2026-02-02 12:07:19 +08:00
xvxuopop
762b480131 [feature] support using ray.remote to start distributed training. (#10109) 2026-01-28 16:05:29 +08:00
Jewon Lee
9640f79ae5 [fix] add visual.pos_embed to Qwen3-VL visual model keys (#10139) 2026-01-27 16:33:01 +08:00
jiaqiw09
7ef19eea00 [v0] Fix reward model training safetensors saving (#10137) 2026-01-27 16:27:14 +08:00
浮梦
f9f11dcb97 [v1] support training with fsdp2 (#9773)
Co-authored-by: frozenleaves <frozen@Mac.local>
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2026-01-25 19:41:58 +08:00
Pádraic Slattery
641bfdd482 chore: Update outdated GitHub Actions versions (#10123) 2026-01-25 19:12:39 +08:00
Meng WANG
e70651ac58 [feat] support all_exhausted_without_replacement in datasets.interleave_datasets (#10112) 2026-01-20 15:54:07 +08:00
Kingsley
db2f794f7b [misc] update mcore related docker and mca supported models (#10114) 2026-01-19 14:55:16 +08:00
jiaqiw09
44eadbda1c [v1] fix kernel moe patch (#9867) 2026-01-17 09:24:54 +08:00
浮梦
9829ae0a77 [ci] using mp to run kernel test (#9754)
Co-authored-by: frozenleaves <frozen@Mac.local>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2026-01-13 19:43:59 +08:00
Yaowei Zheng
958b9c3468 [v1] add sft (#9752) 2026-01-12 03:15:01 +08:00
Hertz
4d3621e3d3 [model] fixed&added Hunyuan models (#9750) 2026-01-12 01:15:00 +08:00
Yaowei Zheng
a296723697 [v1] upgrade batching (#9751)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-12 00:21:36 +08:00
Hertz
15b87f3125 [model] support HY-MT model (#9746)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
2026-01-11 16:25:56 +08:00
Yaowei Zheng
9f73a6eb23 [deps] fix package (#9745) 2026-01-10 04:27:53 +08:00
Yaowei Zheng
b2effbd77c [v1] add batch generator (#9744) 2026-01-10 04:24:09 +08:00
Yaowei Zheng
d7d734d54c [misc] fix fp8 (#9742) 2026-01-09 16:17:26 +08:00
Yaowei Zheng
8abb8fb533 [v1] use async streamer (#9741) 2026-01-09 16:12:07 +08:00
Yaowei Zheng
766d5ae6ad [ci] fix workflow (#9738) 2026-01-09 16:12:07 +08:00
Yaowei Zheng
5cccaeec82 [model] clean obsolete models (#9736) 2026-01-09 16:12:07 +08:00
Jackey
5fb5d7ebd3 [model] support for microsoft's Phi-4-mini (#9734) 2026-01-09 12:24:45 +08:00
Peilin Li
03a70ba8dd [fix] correct ktransformers example config paths and templates (#9732) 2026-01-08 10:52:50 +08:00
Vo Van Phuc
5cfd804b59 [refactor] rename lfm template to lfm2 and add LFM 2.5 to README (#9731) 2026-01-07 19:25:04 +08:00
Yaowei Zheng
4c1eb922e2 [misc] fix parser (#9730) 2026-01-07 17:36:08 +08:00
Vo Van Phuc
958fb523a2 [model] support LiquidAI's LFM2.5-VL vision-language model (#9729) 2026-01-07 17:20:29 +08:00
Vo Van Phuc
b4e051bea4 [model] support for LiquidAI's LFM2.5 (Liquid Foundation Models) (#9726) 2026-01-07 14:14:47 +08:00
浮梦
d43e1007e8 [ci] improve cuda ci cache (#9725)
Co-authored-by: frozenleaves <frozen@Mac.local>
2026-01-07 12:34:40 +08:00
Xunpeng Xiao
f89d9367e5 [assets] update README.md (#9724) 2026-01-07 12:11:50 +08:00
Yaowei Zheng
d22de0d4bf [v1] add renderer ut (#9722) 2026-01-07 02:06:07 +08:00
Yaowei Zheng
ea0b4e2466 [v1] add cli sampler (#9721) 2026-01-06 23:31:27 +08:00
yanglele
e944dc442c [feature] add support for EAFT loss (#9720)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-06 23:07:12 +08:00
Xunpeng Xiao
68119e5522 [misc] Add a PyTorch version warning for Conv3D. (#9715) 2026-01-05 13:26:29 +08:00
Yaowei Zheng
f60a6e3d01 [v1] add init plugin (#9716) 2026-01-04 20:51:46 +08:00