hiyouga
|
3305e66f8c
|
fix ppo save model
Former-commit-id: 300ca6d904524f46cb520056e1319a1e9a13d169
|
2023-09-12 16:25:29 +08:00 |
|
hiyouga
|
42e0b30476
|
update flashattn, fix ppo save model
Former-commit-id: 0b08bc3dac246d4aa3f89afb7172529dcad9c39f
|
2023-09-11 17:25:36 +08:00 |
|
hiyouga
|
a09a7b650d
|
remove PeftTrainer
Former-commit-id: cc0cff3e991f194732d278e627648e528118a719
|
2023-09-10 22:23:23 +08:00 |
|
hiyouga
|
f91c5f2638
|
fix lora target
Former-commit-id: d822e41e7ac7e310ee49e347fc45754284ce30b8
|
2023-09-09 17:04:45 +08:00 |
|
hiyouga
|
e70b3e8947
|
fix #761
Former-commit-id: be76f6cbe5143f781b6b39603b80392253b3080a
|
2023-09-08 20:22:18 +08:00 |
|
hiyouga
|
612d97db6f
|
change to right-padding, update reward score #803
Former-commit-id: baa90415bc8f5ebd423d001378b51c3a3a6c2ec7
|
2023-09-08 20:04:31 +08:00 |
|
hiyouga
|
b5acec34f7
|
add deepspeed check in PPO training
Former-commit-id: e203ec7f71f504ccbaa89c27d20b8a0d9fa53f7e
|
2023-09-07 19:12:40 +08:00 |
|
hiyouga
|
eae7b331d3
|
fix baichuan templates
Former-commit-id: f48a49e835b32f3991cfad8874c7b9c78953809f
|
2023-09-07 18:54:14 +08:00 |
|
hiyouga
|
e5b72c6a77
|
refactor dataset_attr, add eos in pt, fix #757
Former-commit-id: 0feec9a830b917b36686b61938a66e842eccf930
|
2023-09-01 19:00:45 +08:00 |
|
hiyouga
|
5549f35939
|
fix ppo trainer #551
Former-commit-id: 050a5447c191b8c50a0826a0f03bae499bff8b48
|
2023-08-20 14:07:11 +08:00 |
|
hiyouga
|
948124f55e
|
tiny fix
Former-commit-id: 0ee159654ac6339c162745b004e2152ba6fe3c81
|
2023-08-18 13:07:35 +08:00 |
|
hiyouga
|
be4d2822ea
|
fix PPO trainer #551, update readme
Former-commit-id: faead74849470cebae9e37cde5fab2a71b32aa43
|
2023-08-18 11:43:10 +08:00 |
|
hiyouga
|
04fa430c6c
|
fix ChatGLM2 ppo #527 #528
Former-commit-id: 60d6ad64d7c9f6445b0df8de0153c3a311974198
|
2023-08-18 00:34:59 +08:00 |
|
hiyouga
|
fa1893b59c
|
fix generation bug #532
Former-commit-id: c071121e67374e5f09798db57cfc8668617a36ae
|
2023-08-17 22:21:34 +08:00 |
|
hiyouga
|
7d04f8567b
|
fix generation
Former-commit-id: 66a0300d312ef91c24fcf80667fa3b0bb8e1a342
|
2023-08-16 22:39:54 +08:00 |
|
hiyouga
|
7c046edb7b
|
fix ChatGLM RLHF
Former-commit-id: 4e43e887e432ceb7e9287b4e309b63af3c3ba1bf
|
2023-08-15 11:19:20 +08:00 |
|
hiyouga
|
ca719a8697
|
support DPO training (2305.18290)
Former-commit-id: 6d98de148e4af63a7028dfaeb6cf86eb56a4488f
|
2023-08-11 03:02:53 +08:00 |
|
hiyouga
|
28a51b622b
|
modify code structure
Former-commit-id: 6369f9b1751e6f9bb709ba76a85f69cbe0823e5d
|
2023-08-02 23:17:36 +08:00 |
|
hiyouga
|
8bd1da7144
|
fix PPO trainer
Former-commit-id: 21982a7d4dd9b7c3a1145b481f02b9990e32dc00
|
2023-08-02 19:10:23 +08:00 |
|
hiyouga
|
e4d0b8ee6e
|
update ppo trainer
Former-commit-id: c27136a83e167465d3f825e40f10c7b9fcfbf97a
|
2023-08-02 18:46:41 +08:00 |
|
hiyouga
|
1dfb28b362
|
fix memory leak of PPO trainer
Former-commit-id: 38410894a5ebf0b043b55a6bd5cca3cd0a44b27d
|
2023-08-02 17:41:34 +08:00 |
|
hiyouga
|
8e26eb374e
|
fix RM save model
Former-commit-id: 8104cc2425431eb1cddccf3909855296116f922b
|
2023-08-01 11:56:17 +08:00 |
|
hiyouga
|
dd3f3e9749
|
support streaming data, fix #284 #274 #268
Former-commit-id: 819cc1353599e5fa45658bc56dd0dbe4b258b197
|
2023-07-31 23:33:00 +08:00 |
|
hiyouga
|
dc8283d3d7
|
fix API
Former-commit-id: 9b10c9a12e33ab897056ecc61d977d221c19141b
|
2023-07-19 00:01:14 +08:00 |
|
hiyouga
|
a864a7b395
|
update webUI, fix #179
Former-commit-id: f9074fed5e22585679661588befcf266a79009f2
|
2023-07-18 15:35:17 +08:00 |
|
hiyouga
|
eac7f97337
|
release v0.1.0
Former-commit-id: 63c8d3a17cb18f0d8a8e37bfa147daf5bdd28ea9
|
2023-07-18 00:18:25 +08:00 |
|
hiyouga
|
c08ff734a7
|
fix #175
Former-commit-id: fd557ebb5e3ef2ca330b4d97731af43f4a5a5fc5
|
2023-07-17 18:07:17 +08:00 |
|
hiyouga
|
a31a609377
|
fix callback
Former-commit-id: 065680cd2a410d7ceab10a4a76588df43e286117
|
2023-07-15 17:18:16 +08:00 |
|
hiyouga
|
6261fb362a
|
modify code structure
Former-commit-id: 0682ed357210897e0b67c4a6eb31a94b3eb929f1
|
2023-07-15 16:54:28 +08:00 |
|