hiyouga | 027caabbb6 | fix ppo trainer save logic (Former-commit-id: d3dccd0693ede18a99f04780f2fd6e3a89810405) | 2023-12-04 19:00:19 +08:00 |
| hiyouga | 6493558c3b | fix bug (Former-commit-id: 8b681ee273c28813c599d9d55b2a3540c8ac257d) | 2023-12-03 21:40:40 +08:00 |
| hiyouga | 64eead3fb1 | ppo support rm server (Former-commit-id: 747db4017291b0eb91946f57011bb31659056037) | 2023-12-03 21:38:51 +08:00 |
| hiyouga | 1cb390b9b2 | implement rm server #1543 (Former-commit-id: 7df4f3ab206fddb462f6ed865eaf04234fd72ed6) | 2023-12-03 20:52:54 +08:00 |
| hiyouga | 3d291a82d3 | fix #1597 (Former-commit-id: 327d7f7efe1fefe4bf4646c07fc4917a42c13383) | 2023-11-30 21:47:06 +08:00 |
| hiyouga | ba6d290d0b | fix #1668 (Former-commit-id: 1585962eb7ed042890d4c56422aae749c669dda8) | 2023-11-30 21:02:00 +08:00 |
| hiyouga | ecfc7d1b50 | fix #1658 (Former-commit-id: 77d1b14fc2d9703d15bbd879f67df037db9fbb28) | 2023-11-28 20:57:24 +08:00 |
| hiyouga | ae1048db6d | fix #1659 (Former-commit-id: 475a3fa0f4c09d4cfd55ec66271a6d3c9eb5f4d2) | 2023-11-28 20:52:28 +08:00 |
| hiyouga | b015ac35d8 | support export size setting (Former-commit-id: 859a6ea9425a09d7263f6436d05102df8129c248) | 2023-11-26 18:34:09 +08:00 |
| hiyouga | 4966bd7911 | support GPTQ tuning #729 #1481 #1545 , fix chatglm template #1453 #1480 #1569 (Former-commit-id: 9ea93801459b0d271d21a2d730c44abae9106c51) | 2023-11-20 22:52:11 +08:00 |
| hiyouga | f06c4c8f7a | update ppo trainer (Former-commit-id: 5021062493ed63ad1f6133cfb543e4e7f528d2cc) | 2023-11-20 21:39:15 +08:00 |
| hoshi-hiyouga | d72f123851 | Merge pull request #1553 from hannlp/hans: Change the default argument settings for PPO training (Former-commit-id: 48211e3799a16de946360930d3d92f5a40e9d12d) | 2023-11-20 20:32:55 +08:00 |
| hiyouga | 682d81caa9 | fix #1567 (Former-commit-id: 99a3f06377d2886c4000ce7e3583b12ca965534d) | 2023-11-20 18:46:36 +08:00 |
| hiyouga | a53afb27eb | fix #1263 (Former-commit-id: 065bfaeed490a4e03fb48a5adc0b8af4d835a767) | 2023-11-19 16:05:18 +08:00 |
| hiyouga | 48d6d925f7 | fix #1558 (Former-commit-id: 1740131d63d32aefc0370441baf4716ddb5ebcfe) | 2023-11-19 14:15:47 +08:00 |
| Yuchen Han | a419122179 | Update workflow.py (Former-commit-id: eeb5249d0b6ce0816e1fa47afc3a853c7b267cbf) | 2023-11-17 00:16:27 -08:00 |
| hiyouga | 0ed0b8f9c5 | fix bug in freeze tuning (Former-commit-id: ff52b1779c909819d0aef83d3f7ea663199cbe54) | 2023-11-16 14:25:11 +08:00 |
| hiyouga | 678052a7ef | fix rlhf callback (Former-commit-id: 1817ffc86fe3463ea91e9359c0e3611979a9d53e) | 2023-11-16 03:26:19 +08:00 |
| hiyouga | b71da932eb | fix bug in PPO training (Former-commit-id: 856522a3df4bb9ddfaaa137119eceb9574873950) | 2023-11-16 02:32:54 +08:00 |
| hiyouga | eb5a852dd5 | fix import bug (Former-commit-id: 35b91ea34caade45dd51813b94da5177b852aa4c) | 2023-11-16 02:27:03 +08:00 |
| hiyouga | f441932bd1 | support full-parameter PPO (Former-commit-id: ce783036001397a20b0b4c5da2fea6d0c03389d2) | 2023-11-16 02:08:04 +08:00 |
| hiyouga | 06a4820836 | disentangle model from tuner and rename modules (Former-commit-id: 4736344eb1595ee023a50d49e8118f4eee46305f) | 2023-11-15 16:29:09 +08:00 |
|