29 Commits

Author SHA1 Message Date
codingma
74f0d02eb8 1. add custom eval dataset support
2. merge load dataset and split dataset function

Former-commit-id: 76f3bbcfc0
2024-07-05 15:52:10 +08:00
hiyouga
3595d98b4c fix #4609
unwrap_model_for_generation(reward_model) is necessary for zero3 training

Former-commit-id: 8845e94f91
2024-07-03 19:45:51 +08:00
hiyouga
104151d558 tiny fix
Former-commit-id: 8b1172b910
2024-07-03 02:31:50 +08:00
hiyouga
c9e9beee4e tiny fix
Former-commit-id: 71cdf8956e
2024-07-02 23:06:13 +08:00
hiyouga
ea2d3f6c18 remove rlhf support for chatglm2&3
Former-commit-id: 821bb6660e
2024-07-02 23:03:17 +08:00
hiyouga
4828bed837 upcast logits
Former-commit-id: c13ae2df19
2024-07-02 22:32:05 +08:00
hiyouga
cc31014002 improve rlhf
Former-commit-id: c47ab6c072
2024-07-02 22:23:08 +08:00
hiyouga
08296f4092 fix ppo callbacks
Former-commit-id: 4c296001c4
2024-07-02 17:34:56 +08:00
hiyouga
835f0578c2 refactor pissa, improve llamaboard
Former-commit-id: 8baf3b22b0
2024-06-28 01:04:24 +08:00
hiyouga
28e613efd0 fix #4458
Former-commit-id: 8d6cd69ac4
2024-06-26 19:52:35 +08:00
hiyouga
a225b5a70c tiny fix about badam
Former-commit-id: 095fab58d3
2024-06-25 01:54:53 +08:00
hoshi-hiyouga
fe6ef6400c Merge pull request #4352 from Ledzy/main
[Enhancement] Support ZeRO-3 when using BAdam

Former-commit-id: d0f953bf5b
2024-06-25 01:49:13 +08:00
hiyouga
7be502c5c5 update readme
Former-commit-id: e507e60638
2024-06-24 18:22:12 +08:00
Jonery
c779899f7b Cleaner integration.
Former-commit-id: 5c2ff1b749
2024-06-19 12:29:40 +08:00
Jonery
3a5eacb4cf Support distributed BAdam.
Former-commit-id: 0f72aac8c9
2024-06-18 12:27:47 +08:00
hiyouga
c0c6b8075a tiny fix
Former-commit-id: 38b6b0f52e
2024-06-16 01:06:41 +08:00
hiyouga
2946153cea add license
Former-commit-id: d87108daa6
2024-06-15 17:54:33 +08:00
hiyouga
a3f4925c2c add test cases
Former-commit-id: b27269bd2b
2024-06-15 04:05:54 +08:00
hiyouga
81ed4d8abf fix #4209
DeepSpeed ZeRO3 has inflight param error when calling model.eval()

Former-commit-id: cf9f2d6c42
2024-06-13 02:25:50 +08:00
hiyouga
ca9468ff04 tiny fix
Former-commit-id: f8d8690bf4
2024-06-07 05:19:21 +08:00
hiyouga
4f3c89a6eb fix ppo trainer save zero3 model
accelerator.get_state_dict(ds_model) should be called at all ranks

Former-commit-id: 4489d73ac7
2024-06-07 05:14:19 +08:00
hiyouga
f76d427332 fix ppo in trl 0.8.6
Former-commit-id: 2702d7e952
2024-06-07 04:48:29 +08:00
hiyouga
8da149ba40 rename files
Former-commit-id: 74f96efef9
2024-06-07 00:09:06 +08:00
hiyouga
368695483d fix ppo+zero3 #3108
Former-commit-id: 76c61905b2
2024-06-06 23:30:07 +08:00
hiyouga
e0aadd4b34 fix ppo dataset bug #4012
Former-commit-id: 149610c636
2024-06-06 19:03:20 +08:00
hiyouga
e898d8bbc4 update trainers
Former-commit-id: fad2591e31
2024-06-06 18:45:49 +08:00
hiyouga
468d0e7ed1 10x generate in ppo w/ zero3
https://github.com/huggingface/trl/pull/1483

Former-commit-id: 65cd8bdbdb
2024-05-29 00:23:23 +08:00
hiyouga
13d7b48efe improve KTO impl., replace datasets
Former-commit-id: c450ee87a3
2024-05-18 03:44:56 +08:00
hiyouga
cae823ddf0 rename package
Former-commit-id: 308edbc426
2024-05-16 18:39:08 +08:00