55 Commits

Author SHA1 Message Date
Juanxi Tian
d128382d3c [trainer] Add Muon Optimizer (#7749)
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-04-21 23:38:37 +08:00
hoshi-hiyouga
5817cda37e [misc] fix packing and eval plot (#7623)
2025-04-07 18:20:57 +08:00
hoshi-hiyouga
bbf334f823 disable valset by default (#6690)
Former-commit-id: 77bbf659053e1b205974eb6df69998fee0305d26
2025-01-17 21:09:30 +08:00
hoshi-hiyouga
9ef85f8fc4 [optim] clean apollo (#6645)
* clean apollo code

* update readme

Former-commit-id: 7a04021d0461caea2c7b82169839340b7f51f463
2025-01-15 01:42:50 +08:00
zhuHQ
763f9b9df0 [optim] add support to APOLLO (#6617)
Former-commit-id: d9189f9f0b23ff6929044919208e0e813ca95b1c
2025-01-15 00:24:56 +08:00
Yaser Afshar
76ebd62ac1 Add missing key to init_kwargs
Former-commit-id: 1c8ad22a5f167bf4e1c845e273583e5cb3a0214e
2024-12-17 12:34:05 +00:00
Yaser Afshar
fe4546a7bb Add trust_remote_code parameter and remove True
- Introduced a new model parameter `trust_remote_code`
- Set the default value of `trust_remote_code` to `False` to enhance security

Former-commit-id: 09437763267bc7081159a6878cee9652a2b1ddac
2024-12-17 12:25:12 +00:00
hiyouga
235cdcacee support batch infer in vllm
Former-commit-id: 1324d158f954d777f1fbf09f46149c372704b388
2024-12-04 13:50:00 +00:00
hiyouga
0d18cca0db add vllm config
Former-commit-id: 58ab4579dc81a1dcea2bf5938ba3f3116cecfc76
2024-11-10 21:28:18 +08:00
hiyouga
f8c11bd540 update examples
Former-commit-id: 0a690ada6f9f791e7d013eb89799975b12212ed0
2024-08-09 20:13:46 +08:00
hiyouga
5eacd17090 add adam_mini to readme
Former-commit-id: e2a28f51c635d64ff9de65a37087d89356bdedcc
2024-08-09 20:02:03 +08:00
hiyouga
25b9cfa163 update scripts
Former-commit-id: 86f7099fa3fadd9c5a2059361ab5a5e1dbf5b1a2
2024-08-09 19:16:23 +08:00
hiyouga
fae881b854 fix #4944
Former-commit-id: 1bbd49faaef438f49cb5340166cb13faee8fb854
2024-07-24 16:42:51 +08:00
hiyouga
d97bb11821 update pissa example
Former-commit-id: c9bb0757ecfa90ba456e2ef7b38e64dbb809265d
2024-07-06 15:47:32 +08:00
hiyouga
2105cf6000 update examples
Former-commit-id: 2f78b5d62a34ea4d157bbe91a253859d25c8a7fe
2024-06-28 01:17:07 +08:00
hiyouga
a225b5a70c tiny fix about badam
Former-commit-id: 095fab58d3692607c9e78747b4218ae1abcf5aaf
2024-06-25 01:54:53 +08:00
Jonery
bc1c082bc2 add example
Former-commit-id: 97c523516093961983037922e7fc84e4010d5fec
2024-06-18 13:50:26 +08:00
hiyouga
004f289074 tiny fix
Former-commit-id: 2bf2863a58c93206f271de17d7dfcbcd6375cd73
2024-06-17 17:47:25 +08:00
hiyouga
f25b8626bf support pissa
Former-commit-id: 8c1046d78ac6c8f9429b73617e35e1eccb35138f
2024-06-16 01:08:12 +08:00
hiyouga
0926d81053 update examples
Former-commit-id: b6e008c152421db668c971b0828cbee6a80b16bc
2024-06-13 03:15:06 +08:00
hiyouga
cceff9f520 lora modules: all by default
Former-commit-id: cae47379079ff811aa385c297481a27020a8da6b
2024-06-06 03:53:28 +08:00
hiyouga
00b3fb4d14 update train hparams
Former-commit-id: dc4a00dd63769dc02d898c8bad2c158e4e5c0447
2024-06-06 01:49:20 +08:00
hiyouga
0eff6a66d5 tiny fix
Former-commit-id: 5a13b3baa63225e7f79e024610722de0f87e0acc
2024-06-04 00:31:10 +08:00
hiyouga
e4ce59243b fix #4005 #4013
Former-commit-id: eed33862bc733361f3c28b3c95dc0eb4ea00884c
2024-06-03 19:12:29 +08:00
hiyouga
13d7b48efe improve KTO impl., replace datasets
Former-commit-id: c450ee87a35ff9235f9b695b0de2e042b2971178
2024-05-18 03:44:56 +08:00
hiyouga
947f0e9964 update badam example #3764
Former-commit-id: e5bba7cf1bd5317a2446b67ee5e0e245bb8b4ad4
2024-05-17 02:21:10 +08:00
hiyouga
dfff5119b4 update examples
Former-commit-id: ddec9e1b842d407790637e9b0b181f8b26926db9
2024-05-17 01:02:00 +08:00
hiyouga
6e6267f17c fix #3694
Former-commit-id: 2a67ab3925f0c17c4cb5e8c5a5e2cc6a9dc7d47e
2024-05-16 00:35:28 +08:00
hiyouga
3318b6e188 update examples
Former-commit-id: dae83f419919305cb23bb2b9da1277a1616179c5
2024-05-13 20:39:36 +08:00
hiyouga
92cafef325 update example docs
Former-commit-id: f02f87c6fbd20adae105c83526baa23dba2042fd
2024-05-06 22:51:02 +08:00
hiyouga
eb21a527a6 update docs
Former-commit-id: 34d33e22570338da709b8499830adb06b202095c
2024-05-06 21:47:00 +08:00
Oscar
c57a42164c Fix badam example outdated argument
Former-commit-id: eeb415f6fa81ca9093ad0419d1343bd5f780a688
2024-05-05 23:35:19 +08:00
hiyouga
289d1f3679 update webui and add CLIs
Former-commit-id: 245fe47ece22a4b7822449b126715aaa8ec25aba
2024-05-03 02:58:23 +08:00
hiyouga
d8deb0f99e update readme and examples
Former-commit-id: a1f1fac33b2a727b38e8ba52d68a224814d4848b
2024-04-22 00:37:32 +08:00
hiyouga
92e24a73cb remove extras
Former-commit-id: ddbd29d77702f7b82051d930e3eac1b47f5c6d35
2024-04-22 00:35:41 +08:00
hiyouga
9e45f82be7 fix bug in galore optimizer
Former-commit-id: 5c62881c5a59cfcc5a76d365263c8ad8c817ce49
2024-04-21 18:53:22 +08:00
hiyouga
ec81d45d27 fix mod stuff
Former-commit-id: f58425ab45727f7859583d4b9fda776715e27ff6
2024-04-21 18:11:10 +08:00
Marco
639297a5ef Added Mixture of Depths
Former-commit-id: 620add7b9f634de1a711f7b87b16050adf735e9b
2024-04-18 20:31:24 +02:00
hoshi-hiyouga
507ab397f5 Update sft.sh
Former-commit-id: 57dcd91e17833a0eeb8d99af92ac73c132a77648
2024-04-16 17:25:40 +08:00
Jonery
b3260c7456 resolve gradient checkpointing issue.
Former-commit-id: 7ecb61822b37f5d71060d696495830ff98edaa06
2024-04-16 12:05:27 +08:00
Jonery
025f329445 Feature BAdam
Former-commit-id: 06c8908d3fe48907ddb585c5fa15677fc5416f94
2024-04-15 23:15:27 +08:00
hiyouga
fb385b8c26 update examples
Former-commit-id: cce52351b54f70904f33902d9c17411134f9f6eb
2024-04-15 22:14:34 +08:00
hiyouga
e341fa59fe update examples
Former-commit-id: f22eaeb5bc5329146feb0cc5455fae8ce10380f8
2024-04-02 20:51:21 +08:00
hiyouga
9df316931b update examples
Former-commit-id: 31ffbde24dd2e30c3d06331ac4b47d966fc2a191
2024-04-02 20:41:49 +08:00
hiyouga
135c4e3512 update readme
Former-commit-id: 11a6c1bad65a86b0f3d9c5e5df84d246d7d368df
2024-04-02 20:37:37 +08:00
hiyouga
bf5ffeeae0 simplify readme
Former-commit-id: 92dab8a90bdd82a72a06559943467b56dde12c71
2024-04-02 20:07:43 +08:00
hiyouga
89c400633a update trainers
Former-commit-id: 8c77b1091296e204dc3c8c1f157c288ca5b236bd
2024-03-28 18:16:27 +08:00
hiyouga
8b8671817f improve lora+ impl.
Former-commit-id: 72367307dfadf936fb989ebe8bc9f0ff229fb933
2024-03-13 23:32:51 +08:00
齐保元
24c9277488 [FEATURE]: ADD LORA+ ALGORITHM
Former-commit-id: a0965cd62c85545aa2364e244295df2963308354
2024-03-13 19:43:27 +08:00
hiyouga
4a4e4b4354 support layerwise galore
Former-commit-id: 8664262cde3919e10eaecbd66e8c5d356856362e
2024-03-10 00:24:11 +08:00