> [!WARNING]
> Merging LoRA weights into a quantized model is not supported.
> [!TIP]
> Use `--model_name_or_path path_to_model` solely to use the exported model or a model fine-tuned in full/freeze mode.
> Use `CUDA_VISIBLE_DEVICES=0`, `--export_quantization_bit 4` and `--export_quantization_dataset data/c4_demo.json` to quantize the model with AutoGPTQ after merging the LoRA weights (see the quantization sketch below).
Usage:

- `merge.sh`: merge the LoRA weights (see the sketch below)
- `quantize.sh`: quantize the model with AutoGPTQ (optional; must be run after `merge.sh`)
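A minimal sketch of what `merge.sh` could contain, assuming the repository's `src/export_model.py` entry point; the base-model, adapter, and export paths are placeholders, and `--adapter_name_or_path`, `--template`, `--finetuning_type`, and `--export_dir` are assumed arguments not spelled out in this section:

```bash
#!/bin/bash
# Hypothetical merge script: loads the base model, applies the LoRA adapter,
# and writes the merged full-precision model to --export_dir.
# All path_to_* values are placeholders for your own paths.
CUDA_VISIBLE_DEVICES=0 python ../../src/export_model.py \
    --model_name_or_path path_to_base_model \
    --adapter_name_or_path path_to_lora_checkpoint \
    --template default \
    --finetuning_type lora \
    --export_dir path_to_merged_model
```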
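And a sketch of `quantize.sh` under the same assumptions, using the flags named in the tip above. Note that `--model_name_or_path` now points at the merged model produced by `merge.sh`, not the original base model, which is why quantization must run after merging:

```bash
#!/bin/bash
# Hypothetical quantization script: runs AutoGPTQ 4-bit quantization on the
# merged model, calibrating on the data/c4_demo.json dataset from the tip.
# path_to_* values are placeholders; path_to_merged_model is merge.sh's output.
CUDA_VISIBLE_DEVICES=0 python ../../src/export_model.py \
    --model_name_or_path path_to_merged_model \
    --template default \
    --export_dir path_to_quantized_model \
    --export_quantization_bit 4 \
    --export_quantization_dataset data/c4_demo.json
```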