mirror of https://github.com/hiyouga/LLaMA-Factory.git (synced 2025-11-04 18:02:19 +08:00)

add benchmark

Former-commit-id: 85a09cb649be740a47359371499d821ee0d5c81e

parent 92abe91d22
commit 5197fb2fad

README.md (30 lines changed)

@@ -17,12 +17,37 @@
 
 Preview LLaMA Board at **[🤗 Spaces](https://huggingface.co/spaces/hiyouga/LLaMA-Board)**.
 
-Launch LLaMA Board via `CUDA_VISIBLE_DEVICES=0 python src/train_web.py`. (multiple GPUs are not supported yet)
+Launch LLaMA Board via `CUDA_VISIBLE_DEVICES=0 python src/train_web.py`. (multiple GPUs are not supported yet in this mode)
 
 Here is an example of altering the self-cognition of an instruction-tuned language model within 10 minutes on a single GPU.
 
 https://github.com/hiyouga/LLaMA-Factory/assets/16256802/6ba60acc-e2e2-4bec-b846-2d88920d5ba1
 
+## Table of Contents
+
+- [Benchmark](#benchmark)
+- [Changelog](#changelog)
+- [Supported Models](#supported-models)
+- [Supported Training Approaches](#supported-training-approaches)
+- [Provided Datasets](#provided-datasets)
+- [Requirement](#requirement)
+- [Getting Started](#getting-started)
+- [Projects using LLaMA Factory](#projects-using-llama-factory)
+- [License](#license)
+- [Citation](#citation)
+- [Acknowledgement](#acknowledgement)
+
+## Benchmark
+
+Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/ptuning), LLaMA-Factory's LoRA tuning offers up to **3.7 times faster** training speed with a better BLEU score on the advertising text generation task. By leveraging 4-bit quantization technique, LLaMA-Factory's QLoRA further improves the efficiency regarding the GPU memory.
+
+
+
+- Training Speed: the number of training samples processed per second during the training. (bs=4, cutoff_len=1024)
+- BLEU Score: BLEU-4 score on the development set of the [advertising text generation](https://aclanthology.org/D19-1321.pdf) task. (bs=4, cutoff_len=1024)
+- GPU Memory: Peak GPU memory usage in the 4-bit quantized training. (bs=1, cutoff_len=1024)
+- We adopt `pre_seq_len=128` for ChatGLM's P-Tuning and `lora_rank=32` for LLaMA-Factory's LoRA tuning.
+
 ## Changelog
 
 [23/10/21] We supported **[NEFTune](https://arxiv.org/abs/2310.05914)** trick for fine-tuning. Try `--neft_alpha` argument to activate NEFTune, e.g., `--neft_alpha 5`.
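
The changelog entry above mentions `--neft_alpha 5`. As described in the linked NEFTune paper, the technique adds uniform noise to the token embeddings during training, scaled by `alpha / sqrt(L * d)` for sequence length `L` and embedding dimension `d`. The following is a minimal pure-Python sketch of that scaling rule only. The helper name `add_neftune_noise` is hypothetical, and this is not LLaMA-Factory's actual implementation, which the diff does not show.

```python
import math
import random


def add_neftune_noise(embeddings, neft_alpha, rng=None):
    """Return a noisy copy of a (seq_len x dim) embedding matrix.

    Hypothetical helper for illustration; not LLaMA-Factory's code.
    """
    if rng is None:
        rng = random.Random()
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    # Noise magnitude from the NEFTune paper: alpha / sqrt(L * d).
    scale = neft_alpha / math.sqrt(seq_len * dim)
    # Each entry receives independent noise drawn uniformly from [-1, 1].
    return [
        [value + scale * rng.uniform(-1.0, 1.0) for value in row]
        for row in embeddings
    ]


if __name__ == "__main__":
    toy = [[0.0] * 8 for _ in range(4)]  # 4 tokens, 8-dim embeddings
    noisy = add_neftune_noise(toy, neft_alpha=5, rng=random.Random(0))
    # Every perturbation is bounded by alpha / sqrt(L * d).
    print(max(abs(v) for row in noisy for v in row) <= 5 / math.sqrt(4 * 8))
```

In this sketch a larger `neft_alpha` widens the noise range, while longer sequences or wider embeddings shrink it. The noise is a training-time perturbation only.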
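
The Benchmark notes in the hunk above define training speed as the number of training samples processed per second, and the headline figure is a ratio of two such speeds. A minimal sketch of that arithmetic follows. The function names and the numbers in the usage section are placeholders chosen for illustration, not measurements from the benchmark.

```python
import time


def samples_per_second(num_samples, elapsed_seconds):
    """Training speed as defined in the notes: samples processed per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_samples / elapsed_seconds


def speedup(candidate_sps, baseline_sps):
    """How many times faster the candidate is than the baseline."""
    if baseline_sps <= 0:
        raise ValueError("baseline_sps must be positive")
    return candidate_sps / baseline_sps


def measure_throughput(step_fn, num_steps, batch_size):
    """Time num_steps calls of step_fn, each assumed to consume one batch."""
    start = time.perf_counter()
    for _ in range(num_steps):
        step_fn()
    elapsed = time.perf_counter() - start
    return samples_per_second(num_steps * batch_size, elapsed)


if __name__ == "__main__":
    # Placeholder numbers, not benchmark results.
    baseline = samples_per_second(1000, 500.0)   # 2.0 samples/s
    candidate = samples_per_second(1000, 250.0)  # 4.0 samples/s
    print(speedup(candidate, baseline))          # 2.0
```

A fair comparison keeps batch size and cutoff length fixed across both runs, as the notes do with `bs=4, cutoff_len=1024`.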
@@ -477,6 +502,9 @@ CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
 - **[Sunsimiao](https://github.com/thomas-yanxin/Sunsimiao)**: A large language model specialized in Chinese medical domain, based on Baichuan-7B and ChatGLM-6B.
 - **[CareGPT](https://github.com/WangRongsheng/CareGPT)**: A series of large language models for Chinese medical domain, based on LLaMA2-7B and Baichuan-13B.
 
+> [!NOTE]
+> If you have a project that should be incorporated, please contact via email or create a pull request.
+
 ## License
 
 This repository is licensed under the [Apache-2.0 License](LICENSE).

README_zh.md (28 lines changed)

@@ -23,6 +23,31 @@
 
 https://github.com/hiyouga/LLaMA-Factory/assets/16256802/6ba60acc-e2e2-4bec-b846-2d88920d5ba1
 
+## Table of Contents
+
+- [Performance Metrics](#性能指标)
+- [Changelog](#更新日志)
+- [Models](#模型)
+- [Training Approaches](#训练方法)
+- [Datasets](#数据集)
+- [Software Dependencies](#软件依赖)
+- [How to Use](#如何使用)
+- [Projects using LLaMA Factory](#使用了-llama-factory-的项目)
+- [License](#协议)
+- [Citation](#引用)
+- [Acknowledgements](#致谢)
+
+## Performance Metrics
+
+Compared with ChatGLM's official [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/ptuning) fine-tuning, LLaMA-Factory's LoRA fine-tuning delivers a **3.7x** speedup while achieving a higher BLEU score on the advertising text generation task. Combined with 4-bit quantization, LLaMA-Factory's QLoRA fine-tuning further reduces GPU memory consumption.
+
+
+
+- Training Speed: the number of samples processed per second during training. (batch size=4, cutoff length=1024)
+- BLEU Score: BLEU-4 score on the validation set of the [advertising text generation](https://aclanthology.org/D19-1321.pdf) task. (batch size=4, cutoff length=1024)
+- GPU Memory: peak GPU memory usage in 4-bit quantized training. (batch size=1, cutoff length=1024)
+- We adopt `pre_seq_len=128` for ChatGLM's P-Tuning and `lora_rank=32` for LLaMA-Factory's LoRA fine-tuning.
+
 ## Changelog
 
 [23/10/21] We added support for the **[NEFTune](https://arxiv.org/abs/2310.05914)** training trick. Use the `--neft_alpha` argument to enable NEFTune, e.g., `--neft_alpha 5`.
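
The BLEU Score note in the hunk above refers to BLEU-4. The README does not state which tokenizer, smoothing method, or library produced the reported scores, so the following is only a textbook sketch of sentence-level BLEU-4 for a single reference: clipped n-gram precisions for n = 1 to 4, their geometric mean, and a brevity penalty. It applies no smoothing, and the function names are illustrative.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Unsmoothed sentence-level BLEU-4 for one tokenized pair."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(log_precisions) / 4)


if __name__ == "__main__":
    ref = "a b c d e".split()
    print(bleu4(ref, ref))               # identical sentences score 1.0
    print(bleu4("a b c d".split(), ref))  # shorter candidate is penalized
```

Scores from this sketch are not comparable to the reported ones unless tokenization and smoothing match the original evaluation setup.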
@@ -476,6 +501,9 @@ CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
 - **[Sunsimiao](https://github.com/thomas-yanxin/Sunsimiao)**: Sunsimiao, a Chinese medical large language model fine-tuned from Baichuan-7B and ChatGLM-6B on Chinese medical data.
 - **[CareGPT](https://github.com/WangRongsheng/CareGPT)**: CareGPT, a medical large language model project fine-tuned from LLaMA2-7B and Baichuan-13B on Chinese medical data.
 
+> [!NOTE]
+> If you have a project you would like added to the list above, please contact us by email or create a PR.
+
 ## License
 
 The code in this repository is open-sourced under the [Apache-2.0](LICENSE) license.

assets/benchmark.svg (1172 lines changed, normal file)

File diff suppressed because it is too large.
After: Size: 28 KiB