Commit Graph

235 Commits

Author SHA1 Message Date
hoshi-hiyouga
2aaaede247 support llama3 2024-04-19 01:13:50 +08:00
hiyouga
3b43a3b7c5 tiny fix 2024-04-18 00:22:17 +08:00
hiyouga
cab0598fd0 add mixtral 8x22B models 2024-04-17 23:35:59 +08:00
hiyouga
5f86053d75 add CodeQwen models 2024-04-17 23:27:22 +08:00
hiyouga
6d641af703 fix #3317 2024-04-17 22:17:19 +08:00
hiyouga
5d62a51c12 update readme and gradio version 2024-04-16 18:09:16 +08:00
hiyouga
6543f3d449 add codegemma 2024-04-16 00:11:15 +08:00
hiyouga
e0dbac2845 support cohere commandR #3184 2024-04-15 23:26:42 +08:00
hoshi-hiyouga
7a8ae3f4ac Merge pull request #3254 from marko1616/feature/Add-support-for-CohereForAI/c4ai-command-r-plus; Add template&support for c4ai-command-r/plus (tested) 2024-04-15 22:59:35 +08:00
hoshi-hiyouga
268f53dddb Update constants.py 2024-04-15 22:56:55 +08:00
hiyouga
cce52351b5 update examples 2024-04-15 22:14:34 +08:00
marko1616
ab033dac4f Typo fix 2024-04-13 17:30:21 +08:00
marko1616
d0705518ee Add c4ai-command-r-plus link 2024-04-13 07:32:40 +08:00
marko1616
6574a721d2 Add template&support(Not tested) 2024-04-13 04:31:33 +08:00
hiyouga
9d4c949461 release v0.6.2 2024-04-11 20:08:51 +08:00
hiyouga
a99f5ed0b6 fix #3225 2024-04-10 23:57:59 +08:00
hiyouga
9a99fbc86d tiny fix 2024-04-08 21:28:39 +08:00
hoshi-hiyouga
4c6c4a0d88 Merge pull request #3161 from hiyouga/feature/add-mediatek-model; support Breeze-7B 2024-04-08 20:56:51 +08:00
codingma
7b76b4ca08 add empty line 2024-04-07 18:28:08 +08:00
codingma
5a780e9eec rename template to breeze 2024-04-07 11:39:54 +08:00
codingma
2565a32bd9 support https://github.com/hiyouga/LLaMA-Factory/issues/3152 2024-04-07 11:34:01 +08:00
sliderSun
1d117b7bb6 fix spell error 2024-04-07 10:59:15 +08:00
sliderSun
21650d467c support Qwen1.5-32B 2024-04-07 10:56:03 +08:00
sliderSun
77044d9ef4 support Qwen1.5-32B 2024-04-07 10:26:13 +08:00
hiyouga
4b920f24d3 back to gradio 4.21 and fix chat 2024-04-04 02:07:20 +08:00
hiyouga
5ddcecda50 fix bug in latest gradio 2024-04-04 00:55:31 +08:00
hiyouga
92dab8a90b simplify readme 2024-04-02 20:07:43 +08:00
hiyouga
b267aeb53f add moe aux loss control #3085 2024-04-02 14:26:31 +08:00
hiyouga
54b7d34908 add qwen1.5 moe 2024-04-01 21:49:40 +08:00
hiyouga
aee634cd20 fix #3077 2024-04-01 21:35:18 +08:00
hiyouga
17bf8a2c3a support ORPO 2024-03-31 18:29:50 +08:00
hiyouga
7a086ed333 support save args in webui #2807 #3046; some ideas are borrowed from @marko1616 2024-03-30 23:09:12 +08:00
hiyouga
8d603f8820 fix #2982 2024-03-28 20:22:31 +08:00
hiyouga
b19c14870d fix #3010 2024-03-28 18:31:17 +08:00
hiyouga
140ad4ad56 fix #2936 2024-03-24 00:43:21 +08:00
hiyouga
a1c8c98c5f fix #2941 2024-03-24 00:28:44 +08:00
hiyouga
8408225162 support fsdp + qlora 2024-03-21 00:36:06 +08:00
hiyouga
70a3052dd8 patch for gemma cpt 2024-03-12 21:21:54 +08:00
hiyouga
60cc17f3a8 fix plot issues 2024-03-12 18:41:35 +08:00
hiyouga
b3247d6a16 support olmo 2024-03-12 18:30:38 +08:00
hiyouga
bdb496644c allow non-packing pretraining 2024-03-09 22:21:46 +08:00
hiyouga
57452a4aa1 add Yi-9B model 2024-03-07 23:11:57 +08:00
hiyouga
28f7862188 support galore 2024-03-07 22:41:36 +08:00
hiyouga
d07ad5cc1c support vllm 2024-03-07 20:26:31 +08:00
hiyouga
0048a2021e tiny fix 2024-03-06 17:25:08 +08:00
hiyouga
3016e65657 fix version checking 2024-03-06 14:51:51 +08:00
hiyouga
9c10854b46 fix sub-process error in thread 2024-03-03 15:04:35 +08:00
hiyouga
894d183214 update readme, add starcoder2, cosmopedia 2024-03-03 01:01:46 +08:00
hiyouga
38d8b2cef8 update chatglm3 template 2024-02-28 21:11:23 +08:00
hiyouga
cfefacaa37 support DoRA, AWQ, AQLM #2512 2024-02-28 19:53:28 +08:00