8 Commits

Author SHA1 Message Date
hiyouga
8692796c9b fix ppo in trl 0.8.6
Former-commit-id: 5e0d66a0d80b4bd4a8506e2317209d8fb9d25ff6
2024-06-07 04:48:29 +08:00
hiyouga
fcb134e144 rename files
Former-commit-id: e1a8431770fc36c0c9ee7fed4abbc3d7fdcc5efd
2024-06-07 00:09:06 +08:00
hiyouga
35b5117a59 fix ppo+zero3 #3108
Former-commit-id: 33a93cc29e3e57bf001515000c0a70c112573dea
2024-06-06 23:30:07 +08:00
hiyouga
ca95e98ca0 fix ppo dataset bug #4012
Former-commit-id: 7fc51b2e93698ae5e012566af8481f4d861c873d
2024-06-06 19:03:20 +08:00
hiyouga
d5559461c1 update trainers
Former-commit-id: b7f6c4a171293cf4f3e88f15a811f847342f84ee
2024-06-06 18:45:49 +08:00
hiyouga
351b4efc6c 10x generate in ppo w/ zero3
https://github.com/huggingface/trl/pull/1483

Former-commit-id: 5dc43ba8b373d8803bc22d88b3d0d95ef8b9c7f8
2024-05-29 00:23:23 +08:00
hiyouga
2bff90719b improve KTO impl., replace datasets
Former-commit-id: e56a57ddcf061de6e4acc8679f7dbf0b68364986
2024-05-18 03:44:56 +08:00
hiyouga
dfa686b617 rename package
Former-commit-id: a07ff0c083558cfe6f474d13027642d3052fee08
2024-05-16 18:39:08 +08:00