6 Commits

Author SHA1 Message Date
hiyouga
829cf6458a fix #3083
Former-commit-id: 4a6ca621c09d179561acc5957c8c911a4e44184c
2024-04-01 22:53:52 +08:00
hiyouga
bd52e2b404 fix ORPO loss
Former-commit-id: 816d71414617590f95de89a49f38358e597ed121
2024-04-01 14:42:41 +08:00
hiyouga
69e1d39832 fix IPO and ORPO loss
Former-commit-id: 5b9b40403d59982431a526e337f31d394f8b882b
2024-04-01 14:37:53 +08:00
hiyouga
e7ade84bba fix plots
Former-commit-id: 5907216a1cc7a75a43d681ede410c2fba7fb7b92
2024-03-31 19:43:48 +08:00
hiyouga
b873dcb09d use log1p in orpo loss
https://github.com/huggingface/trl/pull/1491

Former-commit-id: 68aaa4904b8dfb6cc791fdcee613edc681a8a198
2024-03-31 19:27:08 +08:00
hiyouga
2f878bde11 support ORPO
Former-commit-id: 17bf8a2c3a7bb5b83071c8659cfd8751e894e692
2024-03-31 18:29:50 +08:00