5 Commits

Author SHA1 Message Date
hiyouga
be0a807e8c fix ORPO loss
Former-commit-id: 5544ddde9087f00f9e20b78d0079f20c2f5d1604
2024-04-01 14:42:41 +08:00
hiyouga
52d402e2a9 fix IPO and ORPO loss
Former-commit-id: fc27955732aedbb12003faf19b760e2768b228f2
2024-04-01 14:37:53 +08:00
hiyouga
c5a46f9113 fix plots
Former-commit-id: 81355671296b84d438967463bb2a92934ff31aae
2024-03-31 19:43:48 +08:00
hiyouga
00e17a377c use log1p in orpo loss
https://github.com/huggingface/trl/pull/1491

Former-commit-id: 3b15d495264b00a4f8716bafea334778874963d7
2024-03-31 19:27:08 +08:00
hiyouga
d764cd8736 support ORPO
Former-commit-id: f44a4c27e2461cdaa1b16865f597a31033c0e6d9
2024-03-31 18:29:50 +08:00