hiyouga
|
2562376f84
|
fix ppo args
Former-commit-id: 11bd271364488d523d5117ec2ea26f39853175b7
|
2023-10-11 23:40:50 +08:00 |
|
hiyouga
|
c9d1cd108d
|
refactor model_dtype, fix PPO trainer
Former-commit-id: 2818af0b0967d7695f27658acac0b7e2c2728e5d
|
2023-10-11 23:16:01 +08:00 |
|
hiyouga
|
f66e6b91c7
|
fix #1026
Former-commit-id: b0b0138e1d841bf5bd2e08b2eea78057b33be69a
|
2023-09-27 22:57:09 +08:00 |
|
hiyouga
|
4581d09fa6
|
fix #944
Former-commit-id: 338b8664edea5ae65192ac657bb013581245ae15
|
2023-09-21 19:51:02 +08:00 |
|
hiyouga
|
c8780205bc
|
fix ppo save model
Former-commit-id: 7ba57d5b1469cd0de0bb391b915bedec97b20ebd
|
2023-09-12 16:25:29 +08:00 |
|
hiyouga
|
33bab0e7c1
|
update flashattn, fix ppo save model
Former-commit-id: 0fbece85a70222e5262a2295203de07ffe648fda
|
2023-09-11 17:25:36 +08:00 |
|
hiyouga
|
6a71361a54
|
remove PeftTrainer
Former-commit-id: b218c271edfb07006ddc34b1aca404088de6c528
|
2023-09-10 22:23:23 +08:00 |
|
hiyouga
|
f865d0bd51
|
fix lora target
Former-commit-id: a51b7c98acc599de5ed2eaeeebe7b184105722c5
|
2023-09-09 17:04:45 +08:00 |
|
hiyouga
|
405df0f63d
|
fix #761
Former-commit-id: b34797a845dec8f6daea59e3c353b8a8f8830100
|
2023-09-08 20:22:18 +08:00 |
|
hiyouga
|
9ed4bb63d4
|
change to right-padding, update reward score #803
Former-commit-id: 8ea32e4046d75ddfa9517669e9de9f48fea720c6
|
2023-09-08 20:04:31 +08:00 |
|
hiyouga
|
5030f05126
|
add deepspeed check in PPO training
Former-commit-id: ed1c2c5557bb2714c3341294f0ea86f6496d4b0c
|
2023-09-07 19:12:40 +08:00 |
|
hiyouga
|
f74b980650
|
fix baichuan templates
Former-commit-id: 85b1f6632a752029dabdaed87c58986deb3a6b1d
|
2023-09-07 18:54:14 +08:00 |
|
hiyouga
|
a4fd976048
|
refactor dataset_attr, add eos in pt, fix #757
Former-commit-id: a9d1fb72f791ae57a4d12f4e3a7e2abccf6a7077
|
2023-09-01 19:00:45 +08:00 |
|
hiyouga
|
570ccc3618
|
fix ppo trainer #551
Former-commit-id: 0676497104eccc8a737d27890eabf1ca8713c235
|
2023-08-20 14:07:11 +08:00 |
|
hiyouga
|
9f1688924d
|
tiny fix
Former-commit-id: d75e377b0f6f3fd7c034676b81ddef3aab1d6901
|
2023-08-18 13:07:35 +08:00 |
|
hiyouga
|
03edfd07e7
|
fix PPO trainer #551, update readme
Former-commit-id: 90205244186df558cd6b0000728d638348db3a10
|
2023-08-18 11:43:10 +08:00 |
|
hiyouga
|
caf4a61e21
|
fix ChatGLM2 ppo #527 #528
Former-commit-id: 9f4c2adc9a9ca8e458d3868805e077182e0d336a
|
2023-08-18 00:34:59 +08:00 |
|
hiyouga
|
623a34b16f
|
fix generation bug #532
Former-commit-id: be21fc83f9aed0af1e5a2f83f5d5eeb36f1d283c
|
2023-08-17 22:21:34 +08:00 |
|
hiyouga
|
048f99354f
|
fix generation
Former-commit-id: d9e62711a3349d7c6fd3512fb25c709bdfbb311a
|
2023-08-16 22:39:54 +08:00 |
|
hiyouga
|
a9ab8f71d7
|
fix ChatGLM RLHF
Former-commit-id: af6c011fcb8ea9e5cf2eb4699da33d8668df04b4
|
2023-08-15 11:19:20 +08:00 |
|
hiyouga
|
abdfa26d06
|
support DPO training (2305.18290)
Former-commit-id: 3ec4351cfdaf2aefcc7d13345e19d79874ed61d3
|
2023-08-11 03:02:53 +08:00 |
|
hiyouga
|
4242897b78
|
modify code structure
Former-commit-id: 08f180e78862cad902b6cdbbd8c86e39b5cacf8a
|
2023-08-02 23:17:36 +08:00 |
|
hiyouga
|
4b8e4398bc
|
fix PPO trainer
Former-commit-id: 1d8a1878ea053d1dbfc570eea868d2514ce75a51
|
2023-08-02 19:10:23 +08:00 |
|
hiyouga
|
569df8ccd6
|
update ppo trainer
Former-commit-id: b5ba87952ab02ed0720365ebd571e47e92e1cda6
|
2023-08-02 18:46:41 +08:00 |
|
hiyouga
|
ab739e72ea
|
fix memory leak of PPO trainer
Former-commit-id: 286f7be346dbea630da1642bbc9e98bcad3145b4
|
2023-08-02 17:41:34 +08:00 |
|
hiyouga
|
c5ad96375e
|
fix RM save model
Former-commit-id: ac88ce5233248dbf1c7943c5f1197e40ba52fde9
|
2023-08-01 11:56:17 +08:00 |
|
hiyouga
|
e80b75b560
|
support streaming data, fix #284 #274 #268
Former-commit-id: 0411a4b3e122e7907441bc7a64b004948741a620
|
2023-07-31 23:33:00 +08:00 |
|
hiyouga
|
18656a6316
|
fix API
Former-commit-id: 29af67b015ff92e5dd9bf2985ce7723dc036d989
|
2023-07-19 00:01:14 +08:00 |
|
hiyouga
|
0b6f769971
|
update webUI, fix #179
Former-commit-id: 12d8a8633f1d8db8eb72223f69c074d98af16e01
|
2023-07-18 15:35:17 +08:00 |
|
hiyouga
|
091805d38e
|
release v0.1.0
Former-commit-id: f8193e8009451cf569a28a10eb4bd88831844441
|
2023-07-18 00:18:25 +08:00 |
|
hiyouga
|
799524b37b
|
fix #175
Former-commit-id: 85c2210452cc45470c228f17b2b0df09b47e9575
|
2023-07-17 18:07:17 +08:00 |
|
hiyouga
|
70b5232f9a
|
fix callback
Former-commit-id: 22d9a9c2af6674eb832ae4aee80d679f19b7006f
|
2023-07-15 17:18:16 +08:00 |
|
hiyouga
|
a696148d6b
|
modify code structure
Former-commit-id: f75137661358f9070bc70c341dfa2cc5fd69cf94
|
2023-07-15 16:54:28 +08:00 |
|