Mirror of https://github.com/hiyouga/LLaMA-Factory.git, synced 2025-08-02 03:32:50 +08:00
[assets] update readme (#8110)
parent b83a38eb98
commit f96c085857
41
README.md
@@ -5,7 +5,7 @@
[](https://github.com/hiyouga/LLaMA-Factory/graphs/contributors)
[](https://github.com/hiyouga/LLaMA-Factory/actions/workflows/tests.yml)
[](https://pypi.org/project/llamafactory/)
[](https://scholar.google.com/scholar?cites=12620864006390196564)
[](https://scholar.google.com/scholar?cites=12620864006390196564)
[](https://github.com/hiyouga/LLaMA-Factory/pulls)

[](https://twitter.com/llamafactory_ai)
@@ -16,7 +16,9 @@
[](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory)
[](https://huggingface.co/spaces/hiyouga/LLaMA-Board)
[](https://modelscope.cn/studios/hiyouga/LLaMA-Board)
[](https://aws.amazon.com/cn/blogs/china/a-one-stop-code-free-model-fine-tuning-deployment-platform-based-on-sagemaker-and-llama-factory/)
[](https://aws.amazon.com/cn/blogs/machine-learning/how-apoidea-group-enhances-visual-information-extraction-from-banking-documents-with-multimodal-models-using-llama-factory-on-amazon-sagemaker-hyperpod/)

### Used by [Amazon](https://aws.amazon.com/cn/blogs/machine-learning/how-apoidea-group-enhances-visual-information-extraction-from-banking-documents-with-multimodal-models-using-llama-factory-on-amazon-sagemaker-hyperpod/), [NVIDIA](https://developer.nvidia.com/rtx/ai-toolkit), [Aliyun](https://help.aliyun.com/zh/pai/use-cases/fine-tune-a-llama-3-model-with-llama-factory), etc.

<div align="center" markdown="1">

@@ -30,19 +32,13 @@

[Available for MacOS, Linux, & Windows](https://warp.dev/llama-factory)

</div>

----

<h3 align="center">
Easily fine-tune 100+ large language models with zero-code <a href="#quickstart">CLI</a> and <a href="#fine-tuning-with-llama-board-gui-powered-by-gradio">Web UI</a>
</h3>
### Easily fine-tune 100+ large language models with zero-code [CLI](#quickstart) and [Web UI](#fine-tuning-with-llama-board-gui-powered-by-gradio)

<p align="center">
<picture>
<img alt="Github trend" src="https://trendshift.io/api/badge/repositories/4535">
</picture>
</p>


</div>

👋 Join our [WeChat](assets/wechat.jpg) or [NPU user group](assets/wechat_npu.jpg).

@@ -58,7 +54,7 @@ Choose your path:
- **Colab (free)**: https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing
- **Local machine**: Please refer to [usage](#getting-started)
- **PAI-DSW (free trial)**: [Llama3 Example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory) | [Qwen2-VL Example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_qwen2vl) | [DeepSeek-R1-Distill Example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_deepseek_r1_distill_7b)
- **Amazon SageMaker**: [Blog](https://aws.amazon.com/cn/blogs/china/a-one-stop-code-free-model-fine-tuning-deployment-platform-based-on-sagemaker-and-llama-factory/)
- **Amazon SageMaker**: [Blog](https://aws.amazon.com/cn/blogs/machine-learning/how-apoidea-group-enhances-visual-information-extraction-from-banking-documents-with-multimodal-models-using-llama-factory-on-amazon-sagemaker-hyperpod/)
- **Easy Dataset**: [Fine-tune on Synthetic Data](https://buaa-act.feishu.cn/wiki/GVzlwYcRFiR8OLkHbL6cQpYin7g)

> [!NOTE]
@@ -67,7 +63,7 @@ Choose your path:
## Table of Contents

- [Features](#features)
- [Benchmark](#benchmark)
- [Blogs](#blogs)
- [Changelog](#changelog)
- [Supported Models](#supported-models)
- [Supported Training Approaches](#supported-training-approaches)
@@ -107,18 +103,17 @@ Choose your path:
| Day 0 | Qwen3 / Qwen2.5-VL / Gemma 3 / InternLM 3 / MiniCPM-o-2.6 |
| Day 1 | Llama 3 / GLM-4 / Mistral Small / PaliGemma2 / Llama 4 |

## Benchmark
## Blogs

Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/ptuning), LLaMA Factory's LoRA tuning offers up to **3.7 times faster** training speed with a better Rouge score on the advertising text generation task. By leveraging 4-bit quantization technique, LLaMA Factory's QLoRA further improves the efficiency regarding the GPU memory.
- [How Apoidea Group enhances visual information extraction from banking documents with multimodal models using LLaMA-Factory on Amazon SageMaker HyperPod](https://aws.amazon.com/cn/blogs/machine-learning/how-apoidea-group-enhances-visual-information-extraction-from-banking-documents-with-multimodal-models-using-llama-factory-on-amazon-sagemaker-hyperpod/) (English)
- [Easy Dataset × LLaMA Factory: Enabling LLMs to Efficiently Learn Domain Knowledge](https://buaa-act.feishu.cn/wiki/GVzlwYcRFiR8OLkHbL6cQpYin7g) (English)
- [LLaMA Factory: Fine-tuning the DeepSeek-R1-Distill-Qwen-7B Model For News Classifier](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_deepseek_r1_distill_7b) (Chinese)


<details><summary>All Blogs</summary>

<details><summary>Definitions</summary>

- **Training Speed**: the number of training samples processed per second during the training. (bs=4, cutoff_len=1024)
- **Rouge Score**: Rouge-2 score on the development set of the [advertising text generation](https://aclanthology.org/D19-1321.pdf) task. (bs=4, cutoff_len=1024)
- **GPU Memory**: Peak GPU memory usage in 4-bit quantized training. (bs=1, cutoff_len=1024)
- We adopt `pre_seq_len=128` for ChatGLM's P-Tuning and `lora_rank=32` for LLaMA Factory's LoRA tuning.
- [A One-Stop Code-Free Model Fine-Tuning \& Deployment Platform based on SageMaker and LLaMA-Factory](https://aws.amazon.com/cn/blogs/china/a-one-stop-code-free-model-fine-tuning-deployment-platform-based-on-sagemaker-and-llama-factory/) (Chinese)
- [LLaMA Factory Multi-Modal Fine-Tuning Practice: Fine-Tuning Qwen2-VL for Personal Tourist Guide](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_qwen2vl) (Chinese)
- [LLaMA Factory: Fine-tuning the LLaMA3 Model for Role-Playing](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory) (Chinese)

</details>

48
README_zh.md
@@ -5,7 +5,7 @@
[](https://github.com/hiyouga/LLaMA-Factory/graphs/contributors)
[](https://github.com/hiyouga/LLaMA-Factory/actions/workflows/tests.yml)
[](https://pypi.org/project/llamafactory/)
[](https://scholar.google.com/scholar?cites=12620864006390196564)
[](https://scholar.google.com/scholar?cites=12620864006390196564)
[](https://github.com/hiyouga/LLaMA-Factory/pulls)

[](https://twitter.com/llamafactory_ai)
@@ -18,16 +18,27 @@
[](https://modelscope.cn/studios/hiyouga/LLaMA-Board)
[](https://aws.amazon.com/cn/blogs/china/a-one-stop-code-free-model-fine-tuning-deployment-platform-based-on-sagemaker-and-llama-factory/)

<h3 align="center">
Easily fine-tune 100+ large language models with zero-code <a href="#快速开始">CLI</a> and <a href="#llama-board-可视化微调由-gradio-驱动">Web UI</a>
</h3>
### Used by [Amazon](https://aws.amazon.com/cn/blogs/china/a-one-stop-code-free-model-fine-tuning-deployment-platform-based-on-sagemaker-and-llama-factory/), [NVIDIA](https://developer.nvidia.cn/rtx/ai-toolkit), [Aliyun](https://help.aliyun.com/zh/pai/use-cases/fine-tune-a-llama-3-model-with-llama-factory), etc.

<p align="center">
<picture>
<img alt="Github trend" src="https://trendshift.io/api/badge/repositories/4535">
</picture>
</p>
<div align="center" markdown="1">

### Sponsors ❤️

<a href="https://warp.dev/llama-factory">
<img alt="Warp sponsorship" width="400" src="https://github.com/user-attachments/assets/ab8dd143-b0fd-4904-bdc5-dd7ecac94eae">
</a>

#### [Warp, the intelligent terminal for developers](https://warp.dev/llama-factory)

[Available for MacOS, Linux, and Windows](https://warp.dev/llama-factory)

----

### Easily fine-tune 100+ large language models with zero-code [CLI](#快速开始) and [Web UI](#llama-board-可视化微调由-gradio-驱动)



</div>

👋 Join our [WeChat group](assets/wechat.jpg) or [NPU user group](assets/wechat_npu.jpg).

@@ -54,7 +65,7 @@ https://github.com/user-attachments/assets/43b700c6-a178-41db-b1f8-8190a5d3fcfc
## Table of Contents

- [Features](#项目特色)
- [Benchmark](#性能指标)
- [Official Blogs](#官方博客)
- [Changelog](#更新日志)
- [Models](#模型)
- [Training Approaches](#训练方法)
@@ -94,18 +105,17 @@ https://github.com/user-attachments/assets/43b700c6-a178-41db-b1f8-8190a5d3fcfc
| Day 0 | Qwen3 / Qwen2.5-VL / Gemma 3 / InternLM 3 / MiniCPM-o-2.6 |
| Day 1 | Llama 3 / GLM-4 / Mistral Small / PaliGemma2 / Llama 4 |

## Benchmark
## Official Blogs

Compared to ChatGLM's official [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/ptuning) fine-tuning, LLaMA Factory's LoRA fine-tuning delivers a **3.7x** speedup while achieving a higher Rouge score on the advertising text generation task. Combined with 4-bit quantization, LLaMA Factory's QLoRA fine-tuning further reduces GPU memory consumption.
- [Enhancing visual information extraction from banking documents with multimodal models using LLaMA-Factory on Amazon SageMaker HyperPod](https://aws.amazon.com/cn/blogs/machine-learning/how-apoidea-group-enhances-visual-information-extraction-from-banking-documents-with-multimodal-models-using-llama-factory-on-amazon-sagemaker-hyperpod/) (English)
- [Easy Dataset × LLaMA Factory: Enabling LLMs to Efficiently Learn Domain Knowledge](https://buaa-act.feishu.cn/wiki/KY9xwTGs1iqHrRkjXBwcZP9WnL9) (Chinese)
- [LLaMA Factory: Fine-tuning the DeepSeek-R1-Distill-Qwen-7B Model for a News Headline Classifier](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_deepseek_r1_distill_7b) (Chinese)


<details><summary>All Blogs</summary>

<details><summary>Definitions</summary>

- **Training Speed**: the number of samples processed per second during training. (batch size=4, cutoff length=1024)
- **Rouge Score**: Rouge-2 score on the validation set of the [advertising text generation](https://aclanthology.org/D19-1321.pdf) task. (batch size=4, cutoff length=1024)
- **GPU Memory**: Peak GPU memory usage in 4-bit quantized training. (batch size=1, cutoff length=1024)
- We adopt `pre_seq_len=128` for ChatGLM's P-Tuning and `lora_rank=32` for LLaMA Factory's LoRA fine-tuning.
- [Building Model Hub, a One-Stop Code-Free Model Fine-Tuning and Deployment Platform Based on Amazon SageMaker and LLaMA-Factory](https://aws.amazon.com/cn/blogs/china/a-one-stop-code-free-model-fine-tuning-deployment-platform-based-on-sagemaker-and-llama-factory/) (Chinese)
- [LLaMA Factory Multi-Modal Fine-Tuning Practice: Fine-Tuning Qwen2-VL to Build a Culture and Tourism Model](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_qwen2vl) (Chinese)
- [LLaMA Factory: Fine-Tuning the LLaMA3 Model for Role-Playing](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory) (Chinese)

</details>

1216
assets/benchmark.svg
File diff suppressed because it is too large
Before: Size: 28 KiB