Kdump
2c1d0b7a83
[trainer] fix wsd scheduler (#7304)
...
* [trainer] Warmup_stable_decay supports setting the number of stable and decay steps according to warmup_ratio
* Update trainer_utils.py
---------
Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-03-26 15:25:02 +08:00
hoshi-hiyouga
1b1964714e
[misc] update format (#7277)
2025-03-13 02:53:08 +08:00
hoshi-hiyouga
efa86e730c
[misc] upgrade format to py39 (#7256)
2025-03-12 00:08:41 +08:00
Ze-Yi LIN
1358ad9afd
[tracking] add swanlab_logdir param (#7219)
...
* feat: add swanlab_logdir param
* fix
Former-commit-id: 9215ad488b6ac6cd57fe8fa4acdacceb63f68ca5
2025-03-11 00:53:07 +08:00
hoshi-hiyouga
25546b9afe
[model] add QwQ 32b (#7179)
...
Former-commit-id: 8897e48b8cd55407812453ddd4ff98ac7bdc4e91
2025-03-06 11:58:36 +08:00
Ze-Yi LIN
754dbb8b07
[trainer] fix swanlab callback (#7176)
...
Former-commit-id: 6d9acf4bd30db24499118aee16bd19cb19ba9e3d
2025-03-06 00:33:37 +08:00
Ze-Yi LIN
b531afb74e
[webui] display swanlab exp link (#7089)
...
* webui add swanlab link
* change callback name
* update
---------
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Former-commit-id: 27a4b93871c63b839c92940766bd7e0177972c9b
2025-02-27 19:40:54 +08:00
Eric Tang
413aa5944a
[ray] specify ray storage path (#6920)
...
Former-commit-id: 4be6b66b1eaa79955e936ce2b747a8837ecd1e49
2025-02-14 21:55:41 +08:00
hoshi-hiyouga
33d420bbcc
[optim] clean apollo (#6645)
...
* clean apollo code
* update readme
Former-commit-id: 38b8ec4a99189483124b54df9d6bc6b0d318855a
2025-01-15 01:42:50 +08:00
zhuHQ
9b29a431db
[optim] add support to APOLLO (#6617)
...
Former-commit-id: 5a252e5a458457adbd19da3b68a3897ad2962824
2025-01-15 00:24:56 +08:00
hiyouga
708e899769
refactor ray integration, support save ckpt
...
Former-commit-id: 2f50b27e608b2092bfceab6c6e84e6631e973ee2
2025-01-07 09:39:10 +00:00
hiyouga
8d1b77cd6f
fix #6546
...
Former-commit-id: 6fcf2f10faf3b1614896b091591eeef96d717e64
2025-01-07 06:30:44 +00:00
hiyouga
c57fbebd55
support report custom args
...
Former-commit-id: d41254c40a1c5cacf9377096adb27efa9bdb79ea
2024-12-21 21:42:45 +00:00
ZeYi Lin
9d27de776c
feat: ui improve
...
Former-commit-id: 6a1effb1741a13ae5238b0e9b429b4cbe3b6534f
2024-12-20 11:03:02 +08:00
ZeYi Lin
87a8d25f76
fix: bugs
...
Former-commit-id: a2297f97f7587c77d55fbce9ffa81dc60d0b04a1
2024-12-19 21:08:16 +08:00
ZeYi Lin
768914653e
feat: optimize frontend
...
Former-commit-id: 4a78603c141d9bd78bcaf81261b443cf082bf51f
2024-12-19 19:04:19 +08:00
ZeYi Lin
ec2bee271d
feat: swanlab params
...
Former-commit-id: 761b3bdb03e27826fde2ca86d4e37b53c2bbc777
2024-12-19 18:47:27 +08:00
hiyouga
cf10c2dff8
add swanlab
...
Former-commit-id: c85a77c8a8824a56a67d56b97b4877fcd6edeb3d
2024-12-19 07:12:31 +00:00
hiyouga
a117731ecb
support rank0 logger
...
Former-commit-id: 84528eabe560091bfd866b6a0ca864085af7529b
2024-11-02 18:31:04 +08:00
hiyouga
dbbfb5f5dc
use pre-commit
...
Former-commit-id: 7cfede95df22a9ff236788f04159b6b16b8d04bb
2024-10-29 09:07:46 +00:00
hiyouga
fb90faf19a
add docstrings, refactor logger
...
Former-commit-id: c34e489d71f8f539028543ccf8ee92cecedd6276
2024-09-08 00:56:56 +08:00
hiyouga
c7a1c3f43a
add adam_mini to readme
...
Former-commit-id: d610c6bcf8a8ba6f4236f5d11f79571b83f4fb11
2024-08-09 20:02:03 +08:00
moontidef
128cb8d2b4
feat: add support for adammini
...
Former-commit-id: a2d5fafb705ff44db1711e972490f0abebc2012b
2024-08-07 10:08:22 +08:00
moontidef
5243075bb7
fix: rename optimzer to optimizer
...
Former-commit-id: 186dc1fde822e6a603ac273538741ea3853f243e
2024-08-07 10:05:01 +08:00
hiyouga
8ce43766c6
fix up
...
Former-commit-id: 43a56cb331fae899ca35b0c312730d4ab79d0c42
2024-07-15 01:04:56 +08:00
hiyouga
884a4a33ee
refactor pissa, improve llamaboard
...
Former-commit-id: 619556e46c19718f702c97df5d570a2a4c5fb13a
2024-06-28 01:04:24 +08:00
hiyouga
4d2c279083
tiny fix about badam
...
Former-commit-id: 03f49267c7406e36aee35639f86e6e0383897090
2024-06-25 01:54:53 +08:00
hoshi-hiyouga
0105689cf4
Merge pull request #4352 from Ledzy/main
...
[Enhancement] Support ZeRO-3 when using BAdam
Former-commit-id: 0dc75275efa7d7540b472783a52ea6aeaa503c0b
2024-06-25 01:49:13 +08:00
hiyouga
da3b0aab6d
fix templates
...
Former-commit-id: 6f357d59b73309c5955683008632e7f320e7dcb1
2024-06-19 17:44:05 +08:00
Jonery
a22e932b4f
Cleaner integration.
...
Former-commit-id: 26d4b05d424bd71f570195dd433258caf6465d92
2024-06-19 12:29:40 +08:00
Jonery
d1edcfb135
Merge remote-tracking branch 'upstream/main'
...
Former-commit-id: 37834a7e79473ccf50ad7f67745b97c274c326d9
2024-06-17 18:44:51 +08:00
hiyouga
d5a0cc93a2
fix tol
...
Former-commit-id: bdb54bcb477126687db789bd89f2df84e424a2a3
2024-06-16 01:38:44 +08:00
hiyouga
0b571f84b4
support pissa
...
Former-commit-id: ef8e45f2eaf466c54e9a671512a2974575677b08
2024-06-16 01:08:12 +08:00
hiyouga
acfae2e677
add license
...
Former-commit-id: 69cfc98d7c81756a5ab6bf962240e393e449fef0
2024-06-15 17:54:33 +08:00
hiyouga
045cef901e
fix #4209
...
DeepSpeed ZeRO3 has inflight param error when calling model.eval()
Former-commit-id: 4be013f18ea6a35b5a11db98db5f0670ffb41619
2024-06-13 02:25:50 +08:00
hiyouga
95f95bef60
fix #4198
...
Former-commit-id: 945d2c6cc73542adf9272ebd9aa332ea2c1c7361
2024-06-11 15:38:38 +08:00
hiyouga
8cc3bbdc62
fix #4120
...
Former-commit-id: 2a44da678a5e360a9c0f9056397ac9e801329321
2024-06-07 04:18:05 +08:00
hiyouga
0b1f4a34f8
rename files
...
Former-commit-id: e1a8431770fc36c0c9ee7fed4abbc3d7fdcc5efd
2024-06-07 00:09:06 +08:00