Commit Graph

  • b053c6454e update readme hiyouga 2024-04-16 02:36:54 +08:00
  • ebf0f4a77c update readme hiyouga 2024-04-16 02:35:36 +08:00
  • efa808069a support unsloth 2024.4 hiyouga 2024-04-16 00:25:03 +08:00
  • b5c5283dd6 add codegemma hiyouga 2024-04-16 00:11:15 +08:00
  • b638c65519 support cohere commandR #3184 hiyouga 2024-04-15 23:26:42 +08:00
  • d4d471450f Feature BAdam Jonery 2024-04-15 23:15:27 +08:00
  • 3144bdec2c Merge pull request #3254 from marko1616/feature/Add-support-for-CohereForAI/c4ai-command-r-plus hoshi-hiyouga 2024-04-15 22:59:35 +08:00
  • c6d6c4c209 Update template.py hoshi-hiyouga 2024-04-15 22:58:01 +08:00
  • f5f1589662 Update constants.py hoshi-hiyouga 2024-04-15 22:56:55 +08:00
  • 276f2cb24e update examples hiyouga 2024-04-15 22:14:34 +08:00
  • 952b785bb3 change default_system accroding to official template marko1616 2024-04-15 20:45:46 +08:00
  • 72dd676208 Revert "Add support for function call(Not strictly following origin)" marko1616 2024-04-15 20:27:09 +08:00
  • dfaa31e991 Add support for function call(Not strictly following origin) marko1616 2024-04-15 20:16:52 +08:00
  • 86556b1c74 Merge pull request #3261 from khazic/main hoshi-hiyouga 2024-04-15 16:30:57 +08:00
  • 0c80751e87 Merge pull request #3276 from liu-zichen/fix_mixtral hoshi-hiyouga 2024-04-15 15:38:16 +08:00
  • 9338f878a3 fix #3273 hiyouga 2024-04-15 15:32:58 +08:00
  • fde3d91242 fix: mixtral output_router_logits liuzc 2024-04-15 12:11:49 +08:00
  • 19adfb88a9 Upgrade README.md khazic 2024-04-13 20:50:49 +08:00
  • daaafa900a Added specimens for single-card full parameter prediction khazic 2024-04-13 20:45:19 +08:00
  • 0dcc9e0bca Typo fix marko1616 2024-04-13 17:30:21 +08:00
  • aeec78b35c Typo fix marko1616 2024-04-13 07:52:11 +08:00
  • c991654cb4 Add c4ai-command-r-plus link marko1616 2024-04-13 07:32:40 +08:00
  • f328413646 Add template&support(Not tested) marko1616 2024-04-13 04:31:33 +08:00
  • 106a0104da fix #3247 hiyouga 2024-04-12 17:41:33 +08:00
  • 5486ea09e3 fix model card hiyouga 2024-04-12 17:11:59 +08:00
  • 31bbbb6d13 fix #3238 hiyouga 2024-04-12 14:28:11 +08:00
  • 1a77de82fa set dev version hiyouga 2024-04-11 20:27:34 +08:00
  • 7468f2535c release v0.6.2 (tag: v0.6.2) hiyouga 2024-04-11 20:08:51 +08:00
  • 38e4f22605 Merge branch 'main' of https://github.com/hiyouga/LLaMA-Factory hiyouga 2024-04-10 23:58:18 +08:00
  • 2bc2fe7b5e fix #3225 hiyouga 2024-04-10 23:57:59 +08:00
  • 6d0140d8a0 Merge pull request #3201 from kno10/patch-1 and fix #3200 hoshi-hiyouga 2024-04-10 00:58:48 +08:00
  • 7856f98965 Update adapter.py hoshi-hiyouga 2024-04-10 00:57:51 +08:00
  • e25ddef08c Update adapter.py hoshi-hiyouga 2024-04-10 00:57:30 +08:00
  • 95a4589bbf Pass additional_target to unsloth Erich Schubert 2024-04-09 17:53:40 +02:00
  • 566d71b7a9 fix quant infer and qwen2moe hiyouga 2024-04-09 17:12:59 +08:00
  • 6030a4a720 tiny fix hiyouga 2024-04-08 21:28:39 +08:00
  • 5dc0cb94d4 Merge pull request #3161 from hiyouga/feature/add-mediatek-model hoshi-hiyouga 2024-04-08 20:56:51 +08:00
  • 325dafcbb0 add empty line codingma 2024-04-07 18:28:08 +08:00
  • 1a8a8b8651 rename template to breeze codingma 2024-04-07 18:27:20 +08:00
  • 61a495cb1e Merge pull request #3160 from sliderSun/main hoshi-hiyouga 2024-04-07 18:00:40 +08:00
  • 75866aa020 rename template to breeze codingma 2024-04-07 11:39:54 +08:00
  • 9e4fda326d support https://github.com/hiyouga/LLaMA-Factory/issues/3152 codingma 2024-04-07 11:34:01 +08:00
  • 1131ddfaff fix spell error sliderSun 2024-04-07 10:59:15 +08:00
  • 9f437b5c43 support Qwen1.5-32B sliderSun 2024-04-07 10:56:03 +08:00
  • 0cc03d3f05 support Qwen1.5-32B sliderSun 2024-04-07 10:26:13 +08:00
  • 04fc2f78bf update readme hiyouga 2024-04-07 00:48:24 +08:00
  • 3ac333fc6a update examples hiyouga 2024-04-04 14:48:21 +08:00
  • a246ac1914 tiny fix hiyouga 2024-04-04 02:19:03 +08:00
  • 48ceac845c back to gradio 4.21 and fix chat hiyouga 2024-04-04 02:07:20 +08:00
  • b1986a06b9 fix bug in latest gradio hiyouga 2024-04-04 00:55:31 +08:00
  • 43d134ba29 fix requires for windows hiyouga 2024-04-03 21:56:43 +08:00
  • 1348f7d860 fix resize vocab at inference #3022 hiyouga 2024-04-03 18:14:24 +08:00
  • f6530222f7 fix #3116 hiyouga 2024-04-03 14:47:59 +08:00
  • a74a7585e0 update vllm example hiyouga 2024-04-02 22:45:20 +08:00
  • 5bf0cca2b8 update readme hiyouga 2024-04-02 22:17:48 +08:00
  • 755b6511ff update examples hiyouga 2024-04-02 21:09:25 +08:00
  • 35621c6089 add zh readme hiyouga 2024-04-02 20:58:45 +08:00
  • 38b59664e6 update examples hiyouga 2024-04-02 20:51:21 +08:00
  • 933a084999 update examples hiyouga 2024-04-02 20:41:49 +08:00
  • c1510d19c7 update readme hiyouga 2024-04-02 20:37:37 +08:00
  • 2074cf99fb update readme hiyouga 2024-04-02 20:22:11 +08:00
  • b12176d818 simplify readme hiyouga 2024-04-02 20:07:43 +08:00
  • 117b67ea30 add moe aux loss control #3085 hiyouga 2024-04-02 14:26:31 +08:00
  • 03e20bb5c6 fix #3022 hiyouga 2024-04-02 13:58:39 +08:00
  • 0c4a1381a4 Update SECURITY.md hiyouga 2024-04-01 23:30:03 +08:00
  • 9e14501edb set dev version hiyouga 2024-04-01 23:24:08 +08:00
  • 1dc963caa6 fix #3083 hiyouga 2024-04-01 22:53:52 +08:00
  • 85726c91ce add qwen1.5 moe hiyouga 2024-04-01 21:49:40 +08:00
  • 40211db275 fix #3077 hiyouga 2024-04-01 21:35:18 +08:00
  • e7f13098c6 support infer 4bit model on GPUs #3023 hiyouga 2024-04-01 17:34:04 +08:00
  • 61eb3a3d46 update webui hiyouga 2024-04-01 16:23:28 +08:00
  • be0a807e8c fix ORPO loss hiyouga 2024-04-01 14:42:41 +08:00
  • 52d402e2a9 fix IPO and ORPO loss hiyouga 2024-04-01 14:37:53 +08:00
  • c5a46f9113 fix plots hiyouga 2024-03-31 19:43:48 +08:00
  • 00e17a377c use log1p in orpo loss hiyouga 2024-03-31 19:27:08 +08:00
  • 9abd83adb1 update readme hiyouga 2024-03-31 18:46:34 +08:00
  • f0d2afcf90 Merge pull request #3066 from hiyouga/orpo hoshi-hiyouga 2024-03-31 18:42:48 +08:00
  • 1aba442bcd support orpo in webui hiyouga 2024-03-31 18:34:59 +08:00
  • d764cd8736 support ORPO hiyouga 2024-03-31 18:29:50 +08:00
  • 526111a303 tiny fix hiyouga 2024-03-31 00:10:29 +08:00
  • b8364046df Merge pull request #3057 from marko1616/bugfix/lora-model-merge hoshi-hiyouga 2024-03-31 00:07:20 +08:00
  • 1f617c6e08 fix blank line contains whitespace marko1616 2024-03-30 23:46:55 +08:00
  • a6858a36c0 Fix Llama model save for full param train marko1616 2024-03-30 23:45:04 +08:00
  • 6198121923 support save args in webui #2807 #3046 hiyouga 2024-03-30 23:09:12 +08:00
  • b0efebf853 upgrade gradio to 4.21.0 hiyouga 2024-03-30 20:37:08 +08:00
  • fbd0584391 release v0.6.1 (tag: v0.6.1) hiyouga 2024-03-29 11:36:08 +08:00
  • 50224b09cc update readme hiyouga 2024-03-28 22:02:32 +08:00
  • 32dcc5a491 add project hiyouga 2024-03-28 20:24:27 +08:00
  • 9408366a36 fix #2982 hiyouga 2024-03-28 20:22:31 +08:00
  • f0e564beaa update readme hiyouga 2024-03-28 18:35:11 +08:00
  • 14b75a0b93 fix #3010 hiyouga 2024-03-28 18:31:17 +08:00
  • 59e6ebf039 update trainers hiyouga 2024-03-28 18:16:27 +08:00
  • 7cdc16abdf Supports custom data set sampling quantity zhangzc 2024-03-27 14:22:50 +08:00
  • dc540dfaa8 fix ds optimizer hoshi-hiyouga 2024-03-26 23:39:56 +08:00
  • 587e65e442 fix #2981 hiyouga 2024-03-26 17:53:04 +08:00
  • a916688723 fix bug hiyouga 2024-03-26 17:30:12 +08:00
  • 3336422760 fix #2961 hiyouga 2024-03-26 17:26:14 +08:00
  • 04423b916f release v0.6.0 (real) (tag: v0.6.0) hiyouga 2024-03-25 23:37:48 +08:00
  • bf8d2f8eda tiny fix hiyouga 2024-03-25 23:28:52 +08:00
  • 2a5d02fd0f update readme hiyouga 2024-03-25 23:06:13 +08:00