423A35C7

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-28 19:08:58 +08:00

50945ef850 [v1] fix device_mesh and sp for fsdp2 (#10429)

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-28 02:39:03 +08:00

2f0bef207a [export] handle NotImplementedError in export_model for transformers>=5.0 (fixes #10410) (#10438)

2092abc217 [npu] add Qwen3.5 support with Partial RoPE and Hybrid Attention (#10421)

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-27 18:29:08 +08:00

99464b3d03 [misc] code lint (#10439)

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-27 01:59:00 +08:00

9a0cfdccfa [v1] fix init on meta in transformers v5 (#10414)

c8890c32db [data] support discard history cot for multiturn (#10435)

79c8332e4c [train] add qwen35 patch for neat_packing (#10436)

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-24 08:38:56 +08:00

e0bc3c1971 [v1] fix epoch and steps (#10422)

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-22 20:48:58 +08:00

ecca167eb4 [model] support qwen3.6 models (#10415)

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-21 20:36:02 +08:00

28a6ea1cdc [v1] add deepspeed zero3 trigger for low memory usage weight loading (#10300)

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-21 12:26:03 +08:00

f5d739b132 [v1] fix device mesh and clip_grad_norm for ulysses cp (#10366)

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-21 04:16:00 +08:00

c4bbac49b2 [v1] support resume training from checkpoint (#10280)

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-20 20:06:04 +08:00

c5aecaf31d [data] fix SeedToolUtils.tool_extractor returns content when no tool calls found (#10408)

423A35C7 synced commits to main at 423A35C7/PrimitiveAnything from mirror 2026-04-17 19:05:59 +08:00

50586e5570 Update README.md

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-12 15:46:05 +08:00

436d26bc28 fix: projector lookup for gemma4 modules (#10382)

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-10 22:46:05 +08:00

c109c061e5 [model] set mm_projectors for omni models (#10378)

423A35C7 synced new reference dependabot/npm_and_yarn/demo/frontend/vite-6.4.2 to 423A35C7/sam2 from mirror 2026-04-07 21:06:05 +08:00

423A35C7 synced commits to dependabot/npm_and_yarn/demo/frontend/vite-6.4.2 at 423A35C7/sam2 from mirror 2026-04-07 21:06:05 +08:00

423A35C7 synced and deleted reference refs/tags/dependabot/npm_and_yarn/demo/frontend/vite-5.4.21 at 423A35C7/sam2 from mirror 2026-04-07 21:06:05 +08:00

423A35C7 synced new reference dependabot/npm_and_yarn/demo/frontend/picomatch-2.3.2 to 423A35C7/sam2 from mirror 2026-04-07 12:56:03 +08:00

423A35C7 synced commits to dependabot/npm_and_yarn/demo/frontend/picomatch-2.3.2 at 423A35C7/sam2 from mirror 2026-04-07 12:56:03 +08:00

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-06 20:36:05 +08:00

fa09c01c36 fix: gemma4 mm_token_type_ids padding (#10359)

423A35C7 synced commits to main at 423A35C7/LLaMA-Factory from mirror 2026-04-05 20:06:01 +08:00

eae6f0b541 [model] gemma4 (#10346)