Commit Graph

  • 7500e761d3 [misc] update internvl constants (#7801) Kingsley 2025-04-22 15:53:08 +08:00
  • fddcd43c88 [trainer] support early stop (#7797) hoshi-hiyouga 2025-04-22 01:59:33 +08:00
  • 0e4ce039ee [data] improve mmplugin (#7795) hoshi-hiyouga 2025-04-22 01:25:33 +08:00
  • b07628dea5 [example] add bash usage (#7794) hoshi-hiyouga 2025-04-22 00:25:51 +08:00
  • 12ada72ed4 [trainer] Add Muon Optimizer (#7749) Juanxi Tian 2025-04-21 23:38:37 +08:00
  • 416853dd25 [parser] support omegaconf (#7793) hoshi-hiyouga 2025-04-21 23:30:30 +08:00
  • bd7bc31c79 [data] Fix wrong position ids with packed attention masks (#7754) Changrui Chen 2025-04-21 16:19:36 +01:00
  • 0ac641326b [misc] fix new tokens adding (#7253) flashJd 2025-04-21 23:19:02 +08:00
  • c5ba9106ec [model] fix gemma3 export (#7786) ddddng 2025-04-21 23:07:11 +08:00
  • 3b2d3794a5 [misc] fix bug in constant (#7765) Sachin Beldona 2025-04-21 10:06:31 -05:00
  • b605c20768 [assets] update wechat (#7792) hoshi-hiyouga 2025-04-21 21:29:42 +08:00
  • 39169986ef [trainer] fix pt loss (#7748) hoshi-hiyouga 2025-04-17 03:15:35 +08:00
  • 86ebb219d6 [breaking] bump transformers to 4.45.0 & improve ci (#7746) hoshi-hiyouga 2025-04-17 02:36:48 +08:00
  • d222f63cb7 [infer] set env for vllm ascend (#7745) hoshi-hiyouga 2025-04-17 01:08:55 +08:00
  • 2e518f255f [model] support intern-VL 2.5-3 series (#7258) Kingsley 2025-04-17 00:31:30 +08:00
  • 8f88a4e6a4 [misc] improve entrypoint (#7345) ENg-122 2025-04-16 21:48:23 +08:00
  • b9263ff5ac [infer] support vllm-ascend (#7739) leo-pony 2025-04-16 20:06:47 +08:00
  • ee2ab093a7 [api] fix chat messages (#7732) hoshi-hiyouga 2025-04-15 16:39:08 +08:00
  • 3df021d4d7 [deps] upgrade vllm (#7728) hoshi-hiyouga 2025-04-15 14:57:40 +08:00
  • e252abf051 [docker] patch docker-rocm (#7725) Joe Schoonover 2025-04-15 01:36:39 -04:00
  • 1134baeedd [assets] update model readme (#7724) hoshi-hiyouga 2025-04-15 00:41:09 +08:00
  • 2101399c94 [model] Support Kimi_VL thinking/instruct (#7719) Kingsley 2025-04-15 00:21:58 +08:00
  • 3f91a95250 [misc] fix env vars (#7715) hoshi-hiyouga 2025-04-14 16:04:04 +08:00
  • 7c61b35106 [misc] upgrade cli (#7714) hoshi-hiyouga 2025-04-14 15:41:22 +08:00
  • f518bfba5b [deps] upgrade transformers (#7704) hoshi-hiyouga 2025-04-13 18:11:34 +08:00
  • 8162f94db5 [model] add GLM-4-0414 (#7695) Yuxuan Zhang 2025-04-13 17:10:45 +08:00
  • 1f0c52b73c [deps] fix uv conflicts (#7686) hoshi-hiyouga 2025-04-11 18:02:24 +08:00
  • a8caf09c7f [data] support for specifying a dataset in cloud storage (#7567) Eric Tang 2025-04-09 20:31:35 -07:00
  • bb8d79bae2 [ray] allow for specifying ray.init kwargs (i.e. runtime_env) (#7647) Eric Tang 2025-04-09 20:31:05 -07:00
  • 1c436c9f25 [bugfix] enable_gemma_liger_kernel (#7660) Dain Kim 2025-04-10 12:27:30 +09:00
  • 1b0934bccb [misc] fix cuda warn on intel GPU (#7655) jilongW 2025-04-09 21:37:54 +08:00
  • 4eec541857 [data] add coig-p dataset (#7657) hoshi-hiyouga 2025-04-09 21:18:25 +08:00
  • 89a4f9ec7f [assets] update readme (#7654) hoshi-hiyouga 2025-04-09 18:27:38 +08:00
  • 1abd71b551 [assets] update readme (#7644) hoshi-hiyouga 2025-04-09 01:06:06 +08:00
  • 349c56c51c [data] Fix bugs of use_audio_in_video in Qwen2.5 Omni (#7638) Kingsley 2025-04-08 18:40:10 +08:00
  • acb09fa3a3 [trainer] fix key error (#7635) Shawn Tao 2025-04-08 18:39:50 +08:00
  • f75b91077b [sglang] support transformers 4.51.0 (#7639) Adarsh Shirawalmath 2025-04-08 16:09:23 +05:30
  • c3c0efbaa0 [misc] fix packing and eval plot (#7623) hoshi-hiyouga 2025-04-07 18:20:57 +08:00
  • 5115dc8c7f [assets] update readme (#7612) hoshi-hiyouga 2025-04-06 13:58:49 +08:00
  • 831e7f1cfd [model] add llama4 (#7611) hoshi-hiyouga 2025-04-06 13:42:31 +08:00
  • d4cfa9507e [data] fix qwen2.5 omni plugin (#7578) Kingsley 2025-04-02 23:58:39 +08:00
  • d32c6c014d [data] fix qwen2.5 omni plugin (#7573) Kingsley 2025-04-02 21:28:52 +08:00
  • 7b9deb9410 [trainer] fix batch processing in PPO trainer (#7576) gechengze 2025-04-02 21:17:48 +08:00
  • 5e22597ff1 [infer] vllm video/audio inference (#7566) hoshi-hiyouga 2025-04-02 02:27:04 +08:00
  • 2bfcad2394 [model] fix kv cache (#7564) hoshi-hiyouga 2025-04-01 23:07:46 +08:00
  • a13b1bb49a [model] fix use_cache patching for gemma3 multimodal (#7500) Yu Shi Jie 2025-04-01 04:06:48 -04:00
  • d10467d178 [data] specify position_ids in PackedSupervisedDatasetProcessor for neat_packing (#7318) Ritesh Goru 2025-04-01 13:33:13 +05:30
  • aac70663fd [webui] fix launch with proxy (#7332) taoharry 2025-04-01 15:52:56 +08:00
  • 00409ff28a [data] shard the dataset to allow multiprocessing when streaming is enabled (#7530) Billy Cao 2025-04-01 15:36:23 +08:00
  • d70b3b4bc5 [trainer] new kto mismatch pair creation strategy (#7509) Hao 2025-04-01 15:21:53 +08:00
  • e76eba051d [data] fix qwen2.5 omni collator (#7553) hoshi-hiyouga 2025-04-01 00:15:12 +08:00
  • 7eed496336 [model] add Qwen2.5-Omni model (#7537) Kingsley 2025-03-31 20:39:35 +08:00
  • 0f8296626a [deps] pin pydantic to 2.10.6 (#7546) hoshi-hiyouga 2025-03-31 14:42:28 +08:00
  • 8da1d2fa71 [data] fix pixtral plugin (#7505) Kingsley 2025-03-27 17:06:40 +08:00
  • b578a7d5b6 [3rdparty] support swanlab lark notification (#7481) Xu-pixel 2025-03-27 01:52:01 +08:00
  • 24afceddb7 [trainer] fix wsd scheduler (#7304) Kdump 2025-03-26 15:25:02 +08:00
  • 0583d06676 [model] add qwen2vl 32b & upgrade peft (#7469) hoshi-hiyouga 2025-03-25 12:15:58 +08:00
  • ec6a261568 [model] fix lora on quant models (#7456) GuoCoder 2025-03-25 11:59:46 +08:00
  • 6b3b97c738 [misc] update liger-kernel's monkey patch (#7453) Xiaosu Zhu 2025-03-25 11:58:52 +08:00
  • 6d3748f727 [misc] enable liger kernel for gemma3 text and paligemma (#7466) AbdelKarim ELJANDOUBI 2025-03-25 02:27:43 +01:00
  • 7c890170e3 [misc] enable liger kernel for gemma3 (#7462) Kenny Lam 2025-03-24 11:09:59 +00:00
  • ca42c0c406 [assets] fix gemma3 readme (#7449) hoshi-hiyouga 2025-03-24 10:31:25 +08:00
  • 7203365b80 [trainer] fix vlm loss for transformers 4.49 (#7448) hoshi-hiyouga 2025-03-24 10:24:05 +08:00
  • 3612946dd9 [docker] upgrade to torch 2.6 (#7442) rumichi 2025-03-23 22:18:08 +09:00
  • 3aa4f32e9c [misc] fix ci (#7441) hoshi-hiyouga 2025-03-23 21:09:35 +08:00
  • 304796b803 [misc] fix license (#7440) hoshi-hiyouga 2025-03-23 19:31:56 +08:00
  • 7cfd6e4bb0 [scripts] support compute score on vllm's predictions (#7419) SnowFox4004 2025-03-23 19:21:01 +08:00
  • 05b19d6952 [deps] upgrade transformers to 4.50.0 (#7437) hoshi-hiyouga 2025-03-23 17:44:27 +08:00
  • 919415dba9 [deps] upgrade vllm to 0.8 (#7436) hoshi-hiyouga 2025-03-23 14:32:22 +08:00
  • a959c2a509 [misc] fix sglang deps (#7432) Guo, Quan 2025-03-23 14:07:10 +08:00
  • db0a08db6f [3rdparty] fix redundant process group destroy for ray (#7395) Eric Tang 2025-03-20 19:56:47 -07:00
  • a306f0f5a2 [version] fix minicpmo (#7378) hoshi-hiyouga 2025-03-20 16:59:31 +08:00
  • 63752fccf7 [assets] update wechat (#7361) hoshi-hiyouga 2025-03-18 21:31:09 +08:00
  • 1f9773395b [misc] set dev version (#7351) hoshi-hiyouga 2025-03-18 00:10:53 +08:00
  • 128b5b12b3 [data] fix template (#7349) hoshi-hiyouga 2025-03-17 23:45:20 +08:00
  • d5915a7dd7 [assets] update videos (#7340) hoshi-hiyouga 2025-03-17 15:48:02 +08:00
  • ec1154662b [model] support hunyuan 7b (#7317) Hertz 2025-03-15 20:55:24 +08:00
  • a44a53ebec [inference] support sglang backend (#7278) Qiaolin Yu 2025-03-14 16:37:58 -04:00
  • 93e6184cbe [data] gemma3 plugin pan and scan (#7294) hoshi-hiyouga 2025-03-13 23:29:23 +08:00
  • 0be0d7796a [assets] update video (#7287) hoshi-hiyouga 2025-03-13 18:45:47 +08:00
  • 480369a9f2 [data] efficient 4d_attention_mask creation in neat_packing (#7272) Ritesh Goru 2025-03-13 01:01:12 +05:30
  • 650a9a9057 [misc] update format (#7277) hoshi-hiyouga 2025-03-13 02:53:08 +08:00
  • 4b9d8da5a4 [model] support gemma3 (#7273) hoshi-hiyouga 2025-03-13 01:35:23 +08:00
  • e6159ad730 [misc] upgrade deps (#7257) hoshi-hiyouga 2025-03-12 00:33:47 +08:00
  • 264538cb26 [misc] upgrade format to py39 (#7256) hoshi-hiyouga 2025-03-12 00:08:41 +08:00
  • 5995800bce [ci] update workflow (#7255) hoshi-hiyouga 2025-03-11 22:57:49 +08:00
  • bf8b483186 [core] release v0.9.2 (#7254) hoshi-hiyouga 2025-03-11 22:42:23 +08:00
  • e2299e261b Merge pull request #7242 from hiyouga/hiyouga/release v0.9.2 hoshi-hiyouga 2025-03-11 15:28:45 +08:00
  • 8a44dce326 Merge pull request #7247 from hiyouga/hiyouga/commit hoshi-hiyouga 2025-03-11 15:28:04 +08:00
  • 6d9233833b Merge pull request #7244 from hiyouga/hiyouga/token hoshi-hiyouga 2025-03-11 15:17:15 +08:00
  • d019603835 support commit info hiyouga 2025-03-11 15:13:59 +08:00
  • 478e8194d9 remove exit in preprocess hiyouga 2025-03-11 15:06:17 +08:00
  • 1890d3dafe release v0.9.2 hiyouga 2025-03-11 14:48:22 +08:00
  • 522a3e8493 [infer] fix vllm args (#7235) hoshi-hiyouga 2025-03-11 01:15:35 +08:00
  • 18968405d0 [tracking] add swanlab_logdir param (#7219) Ze-Yi LIN 2025-03-11 00:53:07 +08:00
  • 71a1c1321a [config] update args (#7231) hoshi-hiyouga 2025-03-10 23:04:43 +08:00
  • cf58a6d860 [config] fix export max len (#7230) hoshi-hiyouga 2025-03-10 16:46:08 +08:00
  • 9adc0a2c3f [assets] update readme (#7209) hoshi-hiyouga 2025-03-07 17:27:49 +08:00
  • 16419b2834 [data] fix loader (#7207) hoshi-hiyouga 2025-03-07 17:20:46 +08:00
  • 82a2bac866 [misc] fix ds config (#7205) hoshi-hiyouga 2025-03-07 15:21:28 +08:00