8 Commits

| Author | SHA1 | Message | Former-commit-id | Date |
| --- | --- | --- | --- | --- |
| hiyouga | b4c5a08d06 | fix yi vl vllm infer | 51d61fcc89a0acc6e17b97865e277845294c0bd3 | 2024-05-15 19:25:48 +08:00 |
| hiyouga | 226587fc4a | fix slow op in dpo/orpo trainer | 3010154adb43deb37fbb3a4648dccd27e848e9c3 | 2024-05-03 23:06:52 +08:00 |
| hiyouga | 0a94fab357 | support badam for all stages | e3d8fc75eb2cfc54efd35bfd9ad6c4ac5acc458c | 2024-04-16 17:44:48 +08:00 |
| hiyouga | 829cf6458a | fix #3083 | 4a6ca621c09d179561acc5957c8c911a4e44184c | 2024-04-01 22:53:52 +08:00 |
| hiyouga | bd52e2b404 | fix ORPO loss | 816d71414617590f95de89a49f38358e597ed121 | 2024-04-01 14:42:41 +08:00 |
| hiyouga | 69e1d39832 | fix IPO and ORPO loss | 5b9b40403d59982431a526e337f31d394f8b882b | 2024-04-01 14:37:53 +08:00 |
| hiyouga | b873dcb09d | use log1p in orpo loss (https://github.com/huggingface/trl/pull/1491) | 68aaa4904b8dfb6cc791fdcee613edc681a8a198 | 2024-03-31 19:27:08 +08:00 |
| hiyouga | 2f878bde11 | support ORPO | 17bf8a2c3a7bb5b83071c8659cfd8751e894e692 | 2024-03-31 18:29:50 +08:00 |