142 Commits

Author SHA1 Message Date
hiyouga
72ba29d81a fix #4458
Former-commit-id: aab14b15268dbe74ded22549dbd3677474868cbb
2024-06-26 19:52:35 +08:00
hiyouga
0f82a55305 fix #4379
Former-commit-id: 96bedb4b6445a04ff8b97fb2aadace50b2f882df
2024-06-25 02:31:44 +08:00
hiyouga
9fd7a410bb tiny fix about badam
Former-commit-id: 03f49267c7406e36aee35639f86e6e0383897090
2024-06-25 01:54:53 +08:00
hoshi-hiyouga
bfb2ad7c79 Merge pull request #4352 from Ledzy/main
[Enhancement] Support ZeRO-3 when using BAdam

Former-commit-id: 0dc75275efa7d7540b472783a52ea6aeaa503c0b
2024-06-25 01:49:13 +08:00
hiyouga
4c89aca243 update readme
Former-commit-id: a1477208471039d3578980f929f1ca8c2a07aa96
2024-06-24 18:22:12 +08:00
hiyouga
3e0fa4a8da fix templates
Former-commit-id: 6f357d59b73309c5955683008632e7f320e7dcb1
2024-06-19 17:44:05 +08:00
Jonery
fa3150548e Cleaner integration.
Former-commit-id: 26d4b05d424bd71f570195dd433258caf6465d92
2024-06-19 12:29:40 +08:00
Jonery
12fcfc2b72 Support distributed BAdam.
Former-commit-id: bdcb986e37975911c190a74d3e60bb77aa2033bd
2024-06-18 12:27:47 +08:00
Jonery
95ae30f678 Merge remote-tracking branch 'upstream/main'
Former-commit-id: 37834a7e79473ccf50ad7f67745b97c274c326d9
2024-06-17 18:44:51 +08:00
Jonery
ba303fd1aa adapt for badam with ds zero3
Former-commit-id: fff2a020ec8713022bd8145f4a7168168ea07ca4
2024-06-17 18:18:10 +08:00
hiyouga
727943f078 fix tol
Former-commit-id: bdb54bcb477126687db789bd89f2df84e424a2a3
2024-06-16 01:38:44 +08:00
hiyouga
32f45c9e91 support pissa
Former-commit-id: ef8e45f2eaf466c54e9a671512a2974575677b08
2024-06-16 01:08:12 +08:00
hiyouga
05f3a3c944 tiny fix
Former-commit-id: f7f440986b0ae3b38ea9f2da80789629d4f79ea1
2024-06-16 01:06:41 +08:00
hiyouga
bb88536166 add license
Former-commit-id: 69cfc98d7c81756a5ab6bf962240e393e449fef0
2024-06-15 17:54:33 +08:00
hiyouga
a30931fe0f fix #4295
Former-commit-id: 08f657868f9d605b837c5d8c2946a25cc05c8735
2024-06-15 04:34:55 +08:00
hiyouga
3ff9b87012 add test cases
Former-commit-id: 731176ff34cdf0cbf6b41c40c69f4ceb54c2daf6
2024-06-15 04:05:54 +08:00
hiyouga
49b58fd6af fix #4221
Former-commit-id: 05a3be4853b941909e7d193c31e8d62c8c5f879b
2024-06-13 02:48:21 +08:00
hiyouga
103a507b39 fix #4209
DeepSpeed ZeRO3 has inflight param error when calling model.eval()


Former-commit-id: 4be013f18ea6a35b5a11db98db5f0670ffb41619
2024-06-13 02:25:50 +08:00
hiyouga
0a75224f62 clean code
Former-commit-id: f54cafd5c7f0383370d1a2f357834a61a97397ce
2024-06-13 01:58:16 +08:00
hiyouga
820b6e7b32 fix #4198
Former-commit-id: 945d2c6cc73542adf9272ebd9aa332ea2c1c7361
2024-06-11 15:38:38 +08:00
hiyouga
ba648fd003 tiny fix
Former-commit-id: 0621bcad1dfbe8ce2464f741d4256c5df2a8d1b6
2024-06-07 05:19:21 +08:00
hiyouga
b0e5a76f4c fix ppo trainer save zero3 model
accelerator.get_state_dict(ds_model) should be called at all ranks


Former-commit-id: 3a0f60f0aa072531e4ae5819ec00c8fa42aa0913
2024-06-07 05:14:19 +08:00
hiyouga
8692796c9b fix ppo in trl 0.8.6
Former-commit-id: 5e0d66a0d80b4bd4a8506e2317209d8fb9d25ff6
2024-06-07 04:48:29 +08:00
hiyouga
d0edcde4ea fix #4120
Former-commit-id: 2a44da678a5e360a9c0f9056397ac9e801329321
2024-06-07 04:18:05 +08:00
hiyouga
fcb134e144 rename files
Former-commit-id: e1a8431770fc36c0c9ee7fed4abbc3d7fdcc5efd
2024-06-07 00:09:06 +08:00
hiyouga
35b5117a59 fix ppo+zero3 #3108
Former-commit-id: 33a93cc29e3e57bf001515000c0a70c112573dea
2024-06-06 23:30:07 +08:00
hiyouga
ca95e98ca0 fix ppo dataset bug #4012
Former-commit-id: 7fc51b2e93698ae5e012566af8481f4d861c873d
2024-06-06 19:03:20 +08:00
hiyouga
d5559461c1 update trainers
Former-commit-id: b7f6c4a171293cf4f3e88f15a811f847342f84ee
2024-06-06 18:45:49 +08:00
hiyouga
556a4aa972 fix #4090
Former-commit-id: d9f15f30a8f4bc64778a5c96baeb6801700d7a2c
2024-06-06 00:50:32 +08:00
hiyouga
4c1f015eca remove gc warnings in DPO&KTO
Former-commit-id: b649bdcbafb464a638387429b770fe258b41f8af
2024-06-03 22:53:54 +08:00
hoshi-hiyouga
7754024e9b Update trainer.py
Former-commit-id: 8565d4b43db905374c328ae57c71fc226980d14f
2024-06-03 22:08:38 +08:00
enji.zhou
b4913569a8 fix KTO Trainer Sampler
Former-commit-id: 39eb1bfa272011554322e9bb2534f83b68282a70
2024-06-03 21:32:38 +08:00
Uminosachi
2696f614a7 Set scheduler_specific_kwargs to get_scheduler
Former-commit-id: f04e70dfab44480ef4c015c06470443237f69ba9
2024-05-31 13:45:39 +09:00
hiyouga
351b4efc6c 10x generate in ppo w/ zero3
https://github.com/huggingface/trl/pull/1483

Former-commit-id: 5dc43ba8b373d8803bc22d88b3d0d95ef8b9c7f8
2024-05-29 00:23:23 +08:00
hiyouga
9b551309de update dpo, kto trainer
Former-commit-id: 4a6cc3c7046f8b27d05ea53ef216bab6fa7ebfaf
2024-05-29 00:14:29 +08:00
hiyouga
9fed4a2ef4 clean kto trainer
Former-commit-id: 76402bd78cbd3a99a544f0ac019468b569b0e1d1
2024-05-28 21:43:26 +08:00
hiyouga
b0d9966663 support SimPO #3900
Former-commit-id: 6b954ce60155cf8334150b795cfc4bb63ca74c8b
2024-05-26 23:46:33 +08:00
hiyouga
bf59383783 refactor data preprocessing, fix mllm rlhf
Former-commit-id: 53ff2dd24f9121ea30c95063bb72e49a9b31e980
2024-05-24 04:08:25 +08:00
hiyouga
0aa072a155 improve data process logger
Former-commit-id: 33d0b012b56dbafc9fff87b821c2d1bf1409dbb5
2024-05-18 22:02:42 +08:00
hiyouga
2bff90719b improve KTO impl., replace datasets
Former-commit-id: e56a57ddcf061de6e4acc8679f7dbf0b68364986
2024-05-18 03:44:56 +08:00
enji.zhou
66b5634ebf add kto
Former-commit-id: ec51986cf70b0bdd79b8141e45916670fb97a08e
2024-05-17 13:09:17 +08:00
hiyouga
dfa686b617 rename package
Former-commit-id: a07ff0c083558cfe6f474d13027642d3052fee08
2024-05-16 18:39:08 +08:00