Commit Graph

  • b0c8ba73e0
    [deps] update to transformers 4.52 (#8125) hoshi-hiyouga 2025-05-21 05:16:18 +08:00
  • b3b2c9f1ee
    [data] llama3 multi tool support (#8124) hoshi-hiyouga 2025-05-21 02:01:12 +08:00
  • f96c085857
    [assets] update readme (#8110) hoshi-hiyouga 2025-05-20 02:44:18 +08:00
  • b83a38eb98
    [data] qwen3 fixes (#8109) hoshi-hiyouga 2025-05-20 02:00:30 +08:00
  • f3fd67a9bb
    [model] switch to gptqmodel (#8108) hoshi-hiyouga 2025-05-19 22:25:40 +08:00
  • a6f3adf930
    [model] update rope kwargs for yarn (#8101) piamo 2025-05-19 20:07:54 +08:00
  • ed2f89efaf
    [doc] add no build isolation (#8103) hoshi-hiyouga 2025-05-19 19:25:13 +08:00
  • 16e26236eb
    [trainer] fix KeyError at end of pretrain (#8099) Ma, Xiaochen 2025-05-19 18:01:26 +08:00
  • 89a0d10c18
    [misc] fix cli (#8095) Biao Wang 2025-05-19 17:59:39 +08:00
  • 820ed764c4
    [infer] support lora adapter for SGLang backend (#8067) Saiya 2025-05-16 23:33:47 +08:00
  • 66f719dd96
    [data] add forward compatibility for video_utils in Transformers 4.52.0 (#8077) Kingsley 2025-05-16 17:41:04 +08:00
  • 130bfaf8e3
    [data] support loading folder from remote (#8078) Eric Tang 2025-05-16 00:35:38 -07:00
  • e8a18c17e9
    [infer] Modify vllm_infer.py to batch preprocess to avoid too much files opened error (#8051) Shawn Tao 2025-05-15 10:54:35 +08:00
  • 2b23c0a7a1
    [assets] update wechat (#8057) hoshi-hiyouga 2025-05-14 18:01:48 +08:00
  • ab2c05115b
    [assets] update windows installation (#8042) hoshi-hiyouga 2025-05-13 17:01:56 +08:00
  • 8d472c20cb
    [model] add seed coder and qwen3 quant models (#8039) hoshi-hiyouga 2025-05-13 15:59:55 +08:00
  • 845af89ea4
    [data] fix kimi vl template (#8015) hoshi-hiyouga 2025-05-11 20:45:19 +08:00
  • cef3a0b2e2
    [scripts] add video params for vllm infer (#7992) Kingsley 2025-05-09 21:16:52 +08:00
  • 865ac07491
    [data] Avoid repetitive tool description warp (#8000) yunhao-tech 2025-05-09 21:16:37 +08:00
  • f584db50cf
    [docs] add GraphGen (#7974) tpoisonooo 2025-05-07 18:23:11 +08:00
  • 97e0a4cb5c
    [misc] update liger kernel patch (#7966) hoshi-hiyouga 2025-05-06 20:32:16 +02:00
  • c6bcca4c83
    [example] update examples (#7964) hoshi-hiyouga 2025-05-06 17:24:25 +02:00
  • 5ee9eb64d8
    [model] add mimo7b (#7946) Kingsley 2025-05-06 23:10:30 +08:00
  • 937447bd8a
    [misc] fix qwen2 omni (#7962) hoshi-hiyouga 2025-05-06 15:39:13 +02:00
  • 52f25651a2
    [model] add qwen2 omni 3b (#7945) hoshi-hiyouga 2025-05-03 16:36:51 +08:00
  • 75d7c35fdf
    [assets] Warp Support README Update (#7887) Eric Chen 2025-05-01 12:08:48 -04:00
  • 6a584b4092
    [hparam] add enable think argument (#7928) hoshi-hiyouga 2025-04-30 17:21:30 +08:00
  • 41ec928683
    [data] fix base plugin (#7924) hoshi-hiyouga 2025-04-30 16:28:05 +08:00
  • d8295cd601
    [data] optimize qwen3 loss computation (#7923) hoshi-hiyouga 2025-04-30 16:18:00 +08:00
  • a8430f4244
    [misc] fix uv (#7913) hoshi-hiyouga 2025-04-30 07:45:03 +08:00
  • 072bfe29d3
    [data] add eval_on_each_dataset arg (#7912) hoshi-hiyouga 2025-04-30 06:56:43 +08:00
  • c5b1d07e7c
    [data] replace eos token for base models (#7911) hoshi-hiyouga 2025-04-30 06:52:28 +08:00
  • 77c569e071
    [data] improve mm plugin (#7910) hoshi-hiyouga 2025-04-30 06:34:28 +08:00
  • ae392e054c
    [model] add qwen3 (#7885) hoshi-hiyouga 2025-04-29 09:34:05 +08:00
  • 369474451d
    [data] fix qwen2.5 omni template (#7883) Kingsley 2025-04-29 00:58:23 +08:00
  • 1f338deb87
    [model] fix dsv3 leaf node (#7879) hoshi-hiyouga 2025-04-28 18:11:09 +08:00
  • 00b5c05946
    [data] fix qwen2 omni plugin (#7875) hoshi-hiyouga 2025-04-28 14:22:41 +08:00
  • 1bd319d16c
    [trainer] make projector trainable in freeze training (#7872) zhaop-l 2025-04-28 13:19:37 +08:00
  • fcca3b0b0d
    [data] fix minicpmo vllm infer (#7870) hoshi-hiyouga 2025-04-28 01:59:53 +08:00
  • 035e98035c
    fix attn patch for kimivl (#7867) Kingsley 2025-04-27 23:12:28 +08:00
  • b4407e4b0b
    [ray] add storage filesystem to ray config (#7854) Eric Tang 2025-04-27 07:12:40 -07:00
  • 036a76e9cb
    [assets] update wechat (#7840) hoshi-hiyouga 2025-04-24 16:31:05 +08:00
  • 4fbdc65fcb
    [model] fix vit gradient checkpointing (#7830) hoshi-hiyouga 2025-04-23 22:48:48 +08:00
  • 2989d39239
    Merge commit from fork hoshi-hiyouga 2025-04-23 16:38:27 +08:00
  • 1344416378
    [model] fix moe zero3 (#7826) hoshi-hiyouga 2025-04-23 15:30:49 +08:00
  • 1dd67eb042
    [data] fix internvl plugin (#7817) Kingsley 2025-04-23 00:58:22 +08:00
  • 2b7d564e3b
    [assets] update model readme (#7804) hoshi-hiyouga 2025-04-22 16:43:56 +08:00
  • d43013f14a
    [model] add arch check for InternVL (#7803) Kingsley 2025-04-22 16:38:05 +08:00
  • c91165a5a6
    [misc] update internvl constants (#7801) Kingsley 2025-04-22 15:53:08 +08:00
  • 7f3c31f6f4
    [trainer] support early stop (#7797) hoshi-hiyouga 2025-04-22 01:59:33 +08:00
  • 92101f34a1
    [data] improve mmplugin (#7795) hoshi-hiyouga 2025-04-22 01:25:33 +08:00
  • a62cba3d05
    [example] add bash usage (#7794) hoshi-hiyouga 2025-04-22 00:25:51 +08:00
  • d128382d3c
    [trainer] Add Muon Optimizer (#7749) Juanxi Tian 2025-04-21 23:38:37 +08:00
  • 278df4308d
    [parser] support omegaconf (#7793) hoshi-hiyouga 2025-04-21 23:30:30 +08:00
  • 81768df04c
    [data] Fix wrong position ids with packed attention masks (#7754) Changrui Chen 2025-04-21 16:19:36 +01:00
  • 1302ca39f6
    [misc] fix new tokens adding (#7253) flashJd 2025-04-21 23:19:02 +08:00
  • b8cddbc7d7
    [model] fix gemma3 export (#7786) ddddng 2025-04-21 23:07:11 +08:00
  • ec7257e70f
    [misc] fix bug in constant (#7765) Sachin Beldona 2025-04-21 10:06:31 -05:00
  • a4455e3021
    [assets] update wechat (#7792) hoshi-hiyouga 2025-04-21 21:29:42 +08:00
  • 610f164c69
    [trainer] fix pt loss (#7748) hoshi-hiyouga 2025-04-17 03:15:35 +08:00
  • 0a0cfeb782
    [breaking] bump transformers to 4.45.0 & improve ci (#7746) hoshi-hiyouga 2025-04-17 02:36:48 +08:00
  • 4831552856
    [infer] set env for vllm ascend (#7745) hoshi-hiyouga 2025-04-17 01:08:55 +08:00
  • 125513fa5c
    [model] support intern-VL 2.5-3 series (#7258) Kingsley 2025-04-17 00:31:30 +08:00
  • 8543400584
    [misc] improve entrypoint (#7345) ENg-122 2025-04-16 21:48:23 +08:00
  • e1fdd6e2f8
    [infer] support vllm-ascend (#7739) leo-pony 2025-04-16 20:06:47 +08:00
  • d07983dceb
    [assets] wechat (#7740) codingma 2025-04-16 18:02:01 +08:00
  • 9b94211045
    [api] fix chat messages (#7732) hoshi-hiyouga 2025-04-15 16:39:08 +08:00
  • 0fe5631f9b
    [deps] upgrade vllm (#7728) hoshi-hiyouga 2025-04-15 14:57:40 +08:00
  • b5d667cebf
    [docker] patch docker-rocm (#7725) Joe Schoonover 2025-04-15 01:36:39 -04:00
  • ac8c6fdd3a
    [assets] update model readme (#7724) hoshi-hiyouga 2025-04-15 00:41:09 +08:00
  • df8752e8ee
    [model] Support Kimi_VL thinking/instruct (#7719) Kingsley 2025-04-15 00:21:58 +08:00
  • 3a13d2cdb1
    [misc] fix env vars (#7715) hoshi-hiyouga 2025-04-14 16:04:04 +08:00
  • 3ef36d0057
    [misc] upgrade cli (#7714) hoshi-hiyouga 2025-04-14 15:41:22 +08:00
  • 1fd4d14fbb
    [deps] upgrade transformers (#7704) hoshi-hiyouga 2025-04-13 18:11:34 +08:00
  • 481ecbf9c5
    [model] add GLM-4-0414 (#7695) Yuxuan Zhang 2025-04-13 17:10:45 +08:00
  • 60a84f664b
    [deps] fix uv conflicts (#7686) hoshi-hiyouga 2025-04-11 18:02:24 +08:00
  • 11bcafd06a
    [assets] update wechat (#7674) hoshi-hiyouga 2025-04-10 20:10:46 +08:00
  • 6c53471de2
    [data] support for specifying a dataset in cloud storage (#7567) Eric Tang 2025-04-09 20:31:35 -07:00
  • 39c1e29ed7
    [ray] allow for specifying ray.init kwargs (i.e. runtime_env) (#7647) Eric Tang 2025-04-09 20:31:05 -07:00
  • ee840b4e01
    [bugfix] enable_gemma_liger_kernel (#7660) Dain Kim 2025-04-10 12:27:30 +09:00
  • 3bdc7e1e6c
    [misc] fix cuda warn on intel GPU (#7655) jilongW 2025-04-09 21:37:54 +08:00
  • 34fdabe005
    [data] add coig-p dataset (#7657) hoshi-hiyouga 2025-04-09 21:18:25 +08:00
  • 24cb890432
    [assets] update readme (#7654) hoshi-hiyouga 2025-04-09 18:27:38 +08:00
  • 39876b85fc
    [assets] update readme (#7644) hoshi-hiyouga 2025-04-09 01:06:06 +08:00
  • 7d8bee96fc
    [data] Fix bugs of use_audio_in_video in Qwen2.5 Omni (#7638) Kingsley 2025-04-08 18:40:10 +08:00
  • 8f5f4cc559
    [trainer] fix key error (#7635) Shawn Tao 2025-04-08 18:39:50 +08:00
  • 8ee26642f3
    [sglang] support transformers 4.51.0 (#7639) Adarsh Shirawalmath 2025-04-08 16:09:23 +05:30
  • 5817cda37e
    [misc] fix packing and eval plot (#7623) hoshi-hiyouga 2025-04-07 18:20:57 +08:00
  • 7e0cdb1a76
    [assets] update readme (#7612) hoshi-hiyouga 2025-04-06 13:58:49 +08:00
  • 6c200fd218
    [model] add llama4 (#7611) hoshi-hiyouga 2025-04-06 13:42:31 +08:00
  • 61b24c3827
    [assets] update wechat (#7594) hoshi-hiyouga 2025-04-03 17:45:26 +08:00
  • 32cb086be1
    [data] fix qwen2.5 omni plugin (#7578) Kingsley 2025-04-02 23:58:39 +08:00
  • 80f8d037d0
    [data] fix qwen2.5 omni plugin (#7573) Kingsley 2025-04-02 21:28:52 +08:00
  • 11997593be
    [trainer] fix batch processing in PPO trainer (#7576) gechengze 2025-04-02 21:17:48 +08:00
  • 903db09822
    [infer] vllm video/audio inference (#7566) hoshi-hiyouga 2025-04-02 02:27:04 +08:00
  • aaf2e6ba2a
    [model] fix kv cache (#7564) hoshi-hiyouga 2025-04-01 23:07:46 +08:00
  • 9deece1d50
    [model] fix use_cache patching for gemma3 multimodal (#7500) Yu Shi Jie 2025-04-01 04:06:48 -04:00
  • f06a74ad4e
    [data] specify position_ids in PackedSupervisedDatasetProcessor for neat_packing (#7318) Ritesh Goru 2025-04-01 13:33:13 +05:30
  • 6faa6fb53d
    [webui] fix launch with proxy (#7332) taoharry 2025-04-01 15:52:56 +08:00
  • 5d1cc863a4
    [data] shard the dataset to allow multiprocessing when streaming is enabled (#7530) Billy Cao 2025-04-01 15:36:23 +08:00