Commit Graph

  • 4d8c8b2019 fix hiyouga/misc hiyouga 2025-12-27 08:49:43 +08:00
  • 6cdbfa980e fix hiyouga 2025-12-27 08:46:17 +08:00
  • 5eea54f888 fix hiyouga 2025-12-27 08:18:20 +08:00
  • f69c8efd27 fix hiyouga 2025-12-27 08:15:02 +08:00
  • e374b615f2 fix hiyouga 2025-12-27 08:08:57 +08:00
  • d7201fd6ae fix hiyouga 2025-12-27 08:06:42 +08:00
  • 42b436a4fc fix hiyouga 2025-12-27 08:05:25 +08:00
  • 5f3e5eb5f8 fix hiyouga 2025-12-27 07:53:52 +08:00
  • 6de0c3da9b fix hiyouga 2025-12-27 07:50:18 +08:00
  • 1622bad7d4 fix hiyouga 2025-12-27 07:47:43 +08:00
  • c439924e74 fix hiyouga 2025-12-27 07:38:10 +08:00
  • a24d8cc78c fix hiyouga 2025-12-27 07:36:30 +08:00
  • 66e6aa8f37 fixup hiyouga 2025-12-27 07:35:18 +08:00
  • a1b1931b4a [breaking] migrate from setuptools to uv (#9673) main Copilot 2025-12-26 22:47:23 +08:00
  • 3c17f2722c [model] Update ernie_vl to adapt new version (#9665) Xunpeng Xiao 2025-12-26 19:57:49 +08:00
  • a882e2d5fc [assets] Add GitHub Copilot instructions for repository (#9675) Copilot 2025-12-26 17:32:48 +08:00
  • a754604c11 [misc] fix accelerator (#9661) Yaowei Zheng 2025-12-25 02:11:04 +08:00
  • 6a2eafbae3 [feat] Models trained and inferred with Mxfp4 are dequantized by default (#9652) Xunpeng Xiao 2025-12-24 00:26:40 +08:00
  • 84485406b7 [ci] disable pip cache for ci (#9654) Yaowei Zheng 2025-12-23 18:37:40 +08:00
  • 1c8a42d2f8 [v1&WIP] dataloader init (#9645) Kingsley 2025-12-23 16:29:47 +08:00
  • 7901b2f32e [model] efficient tuning for gpt-oss (#9354) thulyubh22 2025-12-23 16:28:38 +08:00
  • 1f1f5a7d1b [ci] remove docker cache (#9640) Yaowei Zheng 2025-12-22 01:03:10 +08:00
  • 6ef9854713 [misc] fix cache & pin transformers to 4.57.1 (#9638) Yaowei Zheng 2025-12-22 00:20:55 +08:00
  • 4923f52a28 [model] support MiMo-V2-Flash model (#9637) Hertz 2025-12-21 14:38:18 +08:00
  • 0894b4f37e [misc] lint (#9636) Yaowei Zheng 2025-12-20 16:19:39 +08:00
  • b0d49e137f [misc] Support split eval_dataset when explicitly set "predict_with_generate" (#9604) ZIYI ZENG 2025-12-20 01:46:00 +08:00
  • ddd7dcc722 [data] Fix the video frame sampling issue #9620 (#9634) Xunpeng Xiao 2025-12-19 18:36:31 +08:00
  • 5204cd2bca [misc] add version check for moe (#9633) 浮梦 2025-12-19 14:57:37 +08:00
  • 8c74dca76a [feat] Models trained and inferred with FP8 are dequantized by default (#9627) Xunpeng Xiao 2025-12-18 22:54:35 +08:00
  • e8deda53a1 [example] add Qwen3 series examples (#9624) xvxuopop 2025-12-18 21:27:00 +08:00
  • a769fb94b9 [feat] support ktransformers for dpo (#9621) mrhaoxx 2025-12-18 21:26:25 +08:00
  • 964569751f [kt] refactor ktransformers integration (#9632) mrhaoxx 2025-12-18 21:26:04 +08:00
  • 9fd4b094d4 [model] support VibeThinker models (#9616) Hertz 2025-12-16 21:50:46 +08:00
  • 18c21bce5a [test] add allreduce test on npu (#9619) 浮梦 2025-12-16 21:33:30 +08:00
  • a0179772ab [example] add deepspeed autotp config and example (#9602) sunyi0505 2025-12-15 15:15:26 +08:00
  • aeda079014 [v1] model loader (#9613) Yaowei Zheng 2025-12-14 11:50:52 +08:00
  • fdd24276ed [feat] support new function call value (#9610) Xunpeng Xiao 2025-12-14 00:20:33 +08:00
  • 110d21713e [v1] add dp & mp mesh (#9611) Yaowei Zheng 2025-12-13 01:44:28 +08:00
  • 203069e11c [v1] add accelerator (#9607) Yaowei Zheng 2025-12-12 19:22:06 +08:00
  • 4fd94141a4 [model] Add Ministral3 (#9582) tangefly 2025-12-10 15:57:24 +08:00
  • 22d6ac29d5 [model] Rename GLMV template (#9595) Kingsley 2025-12-10 13:27:47 +08:00
  • cff4483392 [config] Fix RoPE scaling patch for resuming from a scaled model (#9588) DoubleWheat 2025-12-09 20:37:37 +08:00
  • 5d56817e2b [misc] lint (#9593) Yaowei Zheng 2025-12-09 18:00:35 +08:00
  • 1bbb461f76 [assets] update readme (#9587) Yaowei Zheng 2025-12-09 12:22:54 +08:00
  • c1f5f8fff6 [model] support GLM4.6v (#9586) Hertz 2025-12-09 11:06:42 +08:00
  • 5744f1ea94 [v1] add models & accelerator (#9579) Yaowei Zheng 2025-12-08 02:30:25 +08:00
  • 739954910a [deps] Update for Transformers v5 (#9569) tangefly 2025-12-08 01:13:32 +08:00
  • 109162dc56 [fix] fix the issue when using fsdp2 with gradient checkpointing. (#9541) xvxuopop 2025-12-06 16:04:51 +08:00
  • 165f3f073a [examples] add fsdp config for multiple nodes (#9575) jiaqiw09 2025-12-05 23:22:48 +08:00
  • efb13b7483 [V1] Refactor ascend MoE kernel patch logic & Support Qwen3-MoE (#9557) jiaqiw09 2025-12-02 00:22:03 +08:00
  • e43a972b25 [test] add npu test yaml and add ascend a3 docker file (#9547) Username_Full 2025-11-30 09:37:08 +08:00
  • 22be45c78c [misc] fix omni thinker load (#9552) Kingsley 2025-11-30 09:36:36 +08:00
  • d1f585f80a [test] update test cmd (#9544) 浮梦 2025-11-27 17:59:42 +08:00
  • 955396e8a5 [example] correct the parameter errors in the examples file. (#9543) xvxuopop 2025-11-27 17:38:38 +08:00
  • 231756a5bf [chat] fix the error when the vLLM version is greater than 0.10.0 (#9539) xvxuopop 2025-11-27 02:14:53 +08:00
  • 2c4fb3c97e [v1] Support fused moe kernel for qwen3vlmoe model. (#9532) xvxuopop 2025-11-27 02:13:33 +08:00
  • 2b6f16f261 [model] temporarily support npu fused options on v0, powered by v1 kernels (#9520) 浮梦 2025-11-27 02:08:36 +08:00
  • f17efde693 [v1] support automatic discovery of registered kernels. (#9509) 浮梦 2025-11-27 01:47:22 +08:00
  • 591fc9ed02 [model] support ERNIE-4.5-VL Models (#9521) Hertz 2025-11-24 16:48:06 +08:00
  • 3140c242f0 [assets] add README with KT+llamafactory (#9514) Peilin Li 2025-11-19 16:50:45 +08:00
  • 887c562d60 [example] Add KTransformers Qwen3MoE example (#9511) Peilin Li 2025-11-19 00:53:28 +08:00
  • 9779b1f361 [misc] fix typos in some files (#9505) Edge-Seven 2025-11-18 19:36:01 +07:00
  • 45f0437a14 [v1] Add support for ShareGPT format. (#9486) Yinlei Sun 2025-11-18 13:44:08 +08:00
  • d4e120423d [data] fix qwen3omni moe model (#9501) 浮梦 2025-11-18 13:43:22 +08:00
  • 10a446e373 [model] ktransformers qwen3 support (#9485) Pory 2025-11-13 20:09:44 +08:00
  • 0aa4a051af [test] support slow skip and device skip in Uts (#9484) jiaqiw09 2025-11-13 20:08:22 +08:00
  • 8173a88a26 [assets] update readme (#9477) Yaowei Zheng 2025-11-12 16:15:41 +08:00
  • fef86fa7fe [data] fix qwen3omni audio length calculation (#9467) Kingsley 2025-11-12 10:37:15 +08:00
  • 5afa851f71 [misc] Modify pip install command for huggingface_hub (#9463) taohongsheng 2025-11-10 23:04:00 +08:00
  • a711bce664 [data] add openai format (#9449) MyungHa Kwon 2025-11-06 21:10:20 +09:00
  • bd24350cbf [v1] add pair data converter (#9360) 魅影 2025-11-06 14:05:58 +08:00
  • bd30c0003b [train] fix denominator of ga in ksft loss (#9409) Peilin Li 2025-11-05 20:53:23 +08:00
  • 8edd2622ce [docker] update npu dockerfile (#9407) 魅影 2025-11-05 18:28:32 +08:00
  • eaf963f67f [model] update kt code (#9406) Yaowei Zheng 2025-11-05 15:27:22 +08:00
  • 56f45e826f [train] fix MPO re-weight (#9405) Kingsley 2025-11-04 21:10:41 +08:00
  • 14abb75126 [model] enable using FA in npu (#9397) 魅影 2025-11-04 19:32:30 +08:00
  • 5a9939050e [model] add deepstack_merger_list to Qwen3-VL vision_model_keys (#9399) 한송민 2025-11-04 20:27:34 +09:00
  • 934b3084ee [train] KTransformers SFT as backend engine for LLaMA-Factory (#9400) Peilin Li 2025-11-04 15:54:12 +08:00
  • 3ae15da9c0 [misc] lint code (#9395) Yaowei Zheng 2025-11-03 22:08:59 +08:00
  • 215580c77d [data] fix mm plugin for qwen omni video training (#9388) 魅影 2025-11-03 11:44:27 +08:00
  • 767b344fb4 [model] remove npu sdpa patch (#9368) 魅影 2025-10-30 16:26:35 +08:00
  • 3057db15c3 [readme] upd mcore readme (#9352) Kingsley 2025-10-27 21:23:31 +08:00
  • 13170577b2 [feat] support megatron-LM training by mcore_adapter (#9237) Kingsley 2025-10-26 16:21:30 +08:00
  • 129e918106 [data] Fix Qwen3VL plugin (#9297) Xiaosu Zhu 2025-10-26 16:07:04 +08:00
  • 9c0d033a15 [model] add qwen3vl 2b & 32b (#9343) Yaowei Zheng 2025-10-24 13:22:36 +08:00
  • 2a822178de [deps] fix yanked packages (#9333) Yaowei Zheng 2025-10-22 20:54:51 +08:00
  • b842457ef4 [ci] revert mac os ci setup (#9316) Kingsley 2025-10-21 18:26:12 +08:00
  • 2c6aded5d4 [v1] kernel plugin (#9274) 魅影 2025-10-18 18:02:14 +08:00
  • d9d67ba62d [misc] fix import error (#9299) Yaowei Zheng 2025-10-17 17:46:27 +08:00
  • a442fa90ad [misc] fix import error (#9296) Yaowei Zheng 2025-10-17 10:54:30 +08:00
  • 8c341cbaae [model] support hunyuan-mt model (#9284) wyfdgg 2025-10-17 10:33:09 +08:00
  • 47a7dc1698 [deps] upgrade vllm (#9293) Yaowei Zheng 2025-10-16 23:20:26 +08:00
  • 1037f63311 [model] add qwen3vl 4b + 8b (#9275) Yaowei Zheng 2025-10-15 15:00:36 +08:00
  • c867e28093 [model] adds semantic initialization support for special tokens (#9267) Ximing Xing 2025-10-14 17:00:48 +08:00
  • 3dbca4b533 [data] add new reason tool calls demo data (#9249) Peter-Hamster 2025-10-13 17:16:47 +08:00
  • 9d1acbc191 [ci] fix ci (#9265) Yaowei Zheng 2025-10-13 16:24:40 +08:00
  • 52e46e162e [v1] add data converter (#9263) Yaowei Zheng 2025-10-13 15:54:47 +08:00
  • 48974783da [model]: add ernie4_5_moe support for DeepSpeed Zero3 training (#9262) Jiayi Mao 2025-10-13 13:13:31 +08:00
  • 575e4099df [misc] add qwen bench script (#9259) Yaowei Zheng 2025-10-13 11:45:25 +08:00
  • 9687b71d3a [v1] init data plugins (#9248) Yaowei Zheng 2025-10-09 22:36:48 +08:00