LLaMA-Factory

423A35C7/LLaMA-Factory

mirror of https://github.com/hiyouga/LLaMA-Factory.git synced 2026-07-28 11:46:09 +08:00

Commit Graph

9ce6b663e9 [train] support megatron-bridge for PT/SFT training (#10645) main sunyi0505 2026-07-27 18:45:18 +08:00
2ebe7be611 [ci] pin ruff version and fix lint errors (#10681) Yaowei Zheng 2026-07-24 16:29:58 +08:00
3f77101580 [v1] refactor registry plugin structure and params (#10641) Jiaqi 2026-07-24 15:23:21 +08:00
19e9fe3ced [docker] improve NPU image build and distribution (#10664) xvxuopop 2026-07-24 15:22:01 +08:00
d0eaa10b0c [docs] update readme (#10678) Yaowei Zheng 2026-07-24 00:09:09 +08:00
a17afe5e1b [docs] update trend badge and promote PenguinHarness in readme (#10677) Yaowei Zheng 2026-07-23 23:52:05 +08:00
ef2d8f9da6 [v1] fix grad norm and lr log (#10640) HelloWorldBeginner 2026-07-17 22:50:13 +08:00
5f653cb96a [v1] add muon optimizer (#10618) HelloWorldBeginner 2026-07-17 21:44:28 +08:00
d1049d650a [docs] Add AMD GPU Cloud link (#10649) GaoYuYang 2026-07-15 17:52:24 +08:00
8489928769 [fix]update license check, update transformers (#10632) 浮梦 2026-07-13 17:30:36 +08:00
b61140db3e [v1] replace custom template system with apply_chat_template (#10598) 浮梦 2026-07-10 21:15:44 +08:00
ea31c43d80 [v1] improve getting started guide with comprehensive content (#10626) Hyacinth-of-Security 2026-07-08 19:55:23 +08:00
76a0391ddd [misc] fix ray initialization comment typo (#10628) Karunanidhi Mishra 2026-07-07 04:30:58 -05:00
445163ab5e [misc] fix typos in comments and help text (#10633) Karunanidhi Mishra 2026-07-07 04:30:34 -05:00
d58ec6a0bc [deps] exclude broken transformers release (#10634) Karunanidhi Mishra 2026-07-07 04:30:26 -05:00
5987a8dd68 [webui] add seed controls for reproducibility (#10629) Karunanidhi Mishra 2026-07-07 04:30:09 -05:00
a61cfa692a [readme] Revise bitsandbytes installation instructions in README (#10621) zhangzhengshan 2026-07-03 13:16:11 +08:00
7a83d28ce3 [readme] Revise bitsandbytes installation instructions (#10622) zhangzhengshan 2026-07-03 13:15:43 +08:00
c8a082e0e3 [fix] Fixes Qwen3-VL prompt expansion for multiple videos (#10518) luca-888 2026-07-02 11:06:42 +08:00
a48af5cc69 [data] clarify _nothink suffix warning for reasoning-only models (#10613) GSCSD1 2026-06-30 17:15:29 +08:00
c383c0d067 [model] add Qwen-AgentWorld-35B-A3B support (#10615) souljoy 2026-06-30 17:15:09 +08:00
50ff45176a [v1] set flash_attn to flash_attention_2 for ulysses CP example (#10616) HelloWorldBeginner 2026-06-30 16:56:12 +08:00
9c0b4b3835 [v1][feature] add dpo trainer (#10544) codingma 2026-06-26 15:32:10 +08:00
b7615dbdc9 [v1] Fix device mesh, fix lora for reward model and fix sp (#10555) jiaqiw09 2026-06-25 20:05:56 +08:00
666ee0ca78 [fix] redundant transformers check (#10602) Artyom Iudin 2026-06-24 11:34:59 +03:00
aca54c7f17 [model] add Hy-MT2-1.8B/7B support (#10605) souljoy 2026-06-24 16:34:53 +08:00
48aa9ef084 [docs] update supported models list for MiniCPM 4/5 (#10603) souljoy 2026-06-24 15:04:15 +08:00
c928c1cb21 [assets] update llamafactory sft skill guidance (#10600) GaoYuYang 2026-06-23 17:22:56 +08:00
c35b7d7f55 [assets] add llamafactory sft skills (#10597) GaoYuYang 2026-06-22 17:01:21 +08:00
802bcfe969 [feat] support HyperParallel Context Parallel feature (#10559) Chaoran Wei 2026-06-22 07:40:44 +08:00
8792f06161 [webui] Fix WebUI training hang from subprocess log pipe (#10584) summernight 2026-06-17 15:36:40 +08:00
8669a22e9c [fix] fix liger kernel patch for npu (#10583) jiaqiw09 2026-06-16 18:21:52 +08:00
897a44386c [docs] add DataFlow and DataFlex blog tutorials (#10582) Hao Liang 2026-06-16 14:20:36 +08:00
7a1e9630f2 [fix] update ascend doc link (#10572) jiaqiw09 2026-06-15 13:55:53 +08:00
cabe59a343 [model] add MiniCPM5-1B-Chat (#10558) souljoy 2026-06-10 16:18:27 +08:00
9ca4026efe [model] handle unsloth model loading fallback during checkpoint resume (#7156) (#10551) Co-Cl2 2026-06-09 01:01:01 +08:00
0b7aaf8f6a [fix] correctly place new token embeddings when embedding is padded (#10547) Ximing Xing 2026-06-05 10:47:51 +08:00
8a4f6a3da5 [model] add gemma-4-12B-it (#10549) codingma 2026-06-04 23:43:20 +08:00
409e8a477f [model] Patch GDN for NPU (#10504) A1waysBeenHere 2026-06-04 16:39:02 +08:00
053d43c0ac [feat] support HyperParallel PT training and activation optimization (#10370) Cui-yshoho 2026-06-02 22:39:32 +08:00
a98a1ef101 [docs] fix README citation typo (#10540) Zhao73 2026-06-01 21:04:53 +08:00
8ef7335b6a [misc] set dev version (#10533) Yaowei Zheng 2026-05-31 00:16:07 +08:00
7af909522a [version] release v0.9.5 (#10532) v0.9.5 Yaowei Zheng 2026-05-30 23:57:09 +08:00
e016d2480e [fix] Fix NPU FusedMoE and RMSNorm (#10512) xvxuopop 2026-05-30 21:42:54 +08:00
7d719182c9 [model] fix non-packing batch (bsz>1) for Qwen3.5 with flash attention (#10529) jiaqiw09 2026-05-30 21:41:41 +08:00
01398eb18d [v1] fix padding free with sp (#10513) jiaqiw09 2026-05-26 23:49:21 +08:00
8e68764b65 [v1] Implement dynamic padding-free strategy for batching (#10507) cxy 2026-05-25 20:40:21 +08:00
16ff5a23cb [fix] use getattr for profiler attrs to support MCA TrainingArguments (#10506) Copilot 2026-05-21 17:26:29 +08:00
bdcb92d035 [v1] Add FlashAttention selection and implement normal / padding-free / dynamic batching (#10469) jiaqiw09 2026-05-21 17:14:19 +08:00
7e20db5735 [v1] support liger_kernel (#10493) sunyi0505 2026-05-21 11:44:56 +08:00
2322bf1cc2 [v1] add cuda fused moe kernel, implementing with triton (#10481) 浮梦 2026-05-20 20:49:42 +08:00
368c48968f [callback] add torch profiler callback (#10463) 浮梦 2026-05-20 20:47:52 +08:00
8b5ea65770 [v1] support reward training stage (#10431) 浮梦 2026-05-20 20:46:52 +08:00
40e786d016 [data] add missing return statement in MiniCPM V Plugin (#10500) Dennis Huang 2026-05-20 01:50:00 +08:00
6b9df75ab9 [docker] update npu docker (#10479) xvxuopop 2026-05-13 20:56:43 +08:00
ca50f22c38 [fix] Fix MiniCPM-V-4.6 image preprocessing behavior (#10478) 马境远 2026-05-12 11:35:23 +08:00
53e77a9bfa [model] support MiniCPM-V-4.6 (#10472) 马境远 2026-05-08 18:14:34 +08:00
55bd4944b6 [fix] fix qwen3_6 template doc (#10470) 浮梦 2026-05-08 11:47:02 +08:00
7e09152275 fix(data/converter): handle None tool_calls in OpenAI-style messages (#10455) Tai An 2026-05-07 02:44:41 -07:00
1e503a982d [assets] correct typo in examples/README_zh.md (#10462) simulikeit 2026-05-07 00:42:01 +08:00
8752280dd7 [data] Optimize QwenVL video dataset preprocessing (#10404) luca-888 2026-05-03 18:36:56 +08:00
468723c5d9 [packing] fix GDN crash when meeting dummy image (#10453) Kingsley 2026-05-01 12:10:13 +08:00
887ee2b121 [refactor] Add KTransformers AMX MoE SFT support via Accelerate (#10430) Peilin Li 2026-05-01 01:47:58 +08:00
6b08b948c9 [misc] bump transformers version upperbound (#10446) Kingsley 2026-05-01 01:30:11 +08:00
f7f3bfcbd7 [model] support Hy3-Preview (#10432) Hertz 2026-04-29 23:21:13 +08:00
3475198d1e [fa2] fix IMA when train qwen3_5 (#10448) Kingsley 2026-04-29 20:20:55 +08:00
50945ef850 [v1] fix device_mesh and sp for fsdp2 (#10429) sunyi0505 2026-04-28 11:20:11 +08:00
2f0bef207a [export] handle NotImplementedError in export_model for transformers>=5.0 (fixes #10410) (#10438) Octopus 2026-04-27 23:36:23 +08:00
2092abc217 [npu] add Qwen3.5 support with Partial RoPE and Hybrid Attention (#10421) curnane-lab 2026-04-27 23:36:07 +08:00
99464b3d03 [misc] code lint (#10439) Kingsley 2026-04-27 14:07:31 +08:00
9a0cfdccfa [v1] fix init on meta in transformers v5 (#10414) jiaqiw09 2026-04-27 00:37:09 +08:00
c8890c32db [data] support discard history cot for multiturn (#10435) Kingsley 2026-04-27 00:32:44 +08:00
79c8332e4c [train] add qwen35 patch for neat_packing (#10436) Kingsley 2026-04-27 00:31:49 +08:00
e0bc3c1971 [v1] fix epoch and steps (#10422) jiaqiw09 2026-04-23 17:29:06 +08:00
ecca167eb4 [model] support qwen3.6 models (#10415) 浮梦 2026-04-22 19:44:01 +08:00
28a6ea1cdc [v1] add deepspeed zero3 trigger for low memory usage weight loading (#10300) jiaqiw09 2026-04-21 14:09:52 +08:00
f5d739b132 [v1] fix device mesh and clip_grad_norm for ulysses cp (#10366) sunyi0505 2026-04-21 10:54:54 +08:00
c4bbac49b2 [v1] support resume training from checkpoint (#10280) 浮梦 2026-04-20 20:28:08 +08:00
c5aecaf31d [data] fix SeedToolUtils.tool_extractor returns content when no tool calls found (#10408) Cocoon-Break 2026-04-20 12:22:55 +08:00
436d26bc28 fix: projector lookup for gemma4 modules (#10382) Kingsley 2026-04-12 08:32:14 +08:00
c109c061e5 [model] set mm_projectors for omni models (#10378) Kingsley 2026-04-10 18:12:57 +08:00
fa09c01c36 fix: gemma4 mm_token_type_ids padding (#10359) Kingsley 2026-04-06 13:14:45 +08:00
eae6f0b541 [model] gemma4 (#10346) Kingsley 2026-04-05 12:10:28 +08:00
acac63ef35 [data] fix qwen3vl timestamp (#10338) Kingsley 2026-04-01 22:40:12 +08:00
e5e8546493 [misc] fix moe (#10334) 浮梦 2026-03-31 23:04:45 +08:00
97433c53b6 [feat] support LlamaFactory SFT training by HyperParallel FSDP2 backend (#10289) Cui-yshoho 2026-03-30 10:47:20 +08:00
b5afabe3d2 [v1] support ulysses cp for fsdp2 (#10262) sunyi0505 2026-03-27 16:22:48 +08:00
df2e6edb7e [v1] add init on rank0 for fsdp2 (#10264) jiaqiw09 2026-03-27 14:54:03 +08:00
d02fcd3588 [ci] add nginx cache config for Ascend NPU CI environment (#10323) Goalina 2026-03-27 10:04:16 +08:00
c340aa2a33 [v1] add callbacks (#10255) jiaqiw09 2026-03-26 19:59:57 +08:00
1e536733c6 [data] fix mimo-v2 tool call (#10315) Hertz 2026-03-26 17:37:22 +08:00
97d479fa92 [model] support Qwen3.5 liger kernel (#10313) Yutong Wu 2026-03-24 18:25:33 +08:00
ffbff33af3 chore: mca workflow compatible with qwen-vl series (#10303) Kingsley 2026-03-22 02:28:52 +08:00
833f6027b1 [fix] fit neat_packing & mrope model packing (#10283) Kingsley 2026-03-20 16:50:11 +08:00
d91d8af89e [data] add SGSC zero-hallucination B2B dataset (NOO-Protocol) (#10284) robertglools 2026-03-20 15:49:03 +08:00
e67ab9e2f2 fix:MiniCPMVPlugin IndexError in process_messages when training with video (#10276) xxddccaa 2026-03-18 19:18:06 +08:00
2c4f121817 [fix] handle empty content list in system message (#10291) LincolnBurrows2017 2026-03-18 12:05:49 +08:00
487f8b8191 [v1] add qwen3 templates and fix rendering plugin. (#10212) xvxuopop 2026-03-18 11:30:50 +08:00
78cad1e332 [fix] unused keys in ray example (#10290) SnowCharm 2026-03-18 00:23:53 +08:00
70653026f5 [fix] make position_id_per_seconds configurable for Qwen2OmniPlugin (#10281) LincolnBurrows2017 2026-03-16 19:42:38 +08:00