> [!WARNING]
> Merging LoRA weights into a quantized model is not supported.
> [!TIP]
> Use `--model_name_or_path path_to_model` solely to use the exported model or a model fine-tuned in full/freeze mode.
>
> Use `CUDA_VISIBLE_DEVICES=0`, `--export_quantization_bit 4` and `--export_quantization_dataset data/c4_demo.json` to quantize the model with AutoGPTQ after merging the LoRA weights.
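For concreteness, a minimal `merge.sh` along these lines might look as follows. This is a sketch, not the repository's exact script: the base-model, adapter, and output paths are placeholders, and the `src/export_model.py` entry point together with the `--adapter_name_or_path`, `--template`, `--finetuning_type`, and `--export_dir` arguments are assumptions about the repository layout.

```bash
#!/bin/bash
# Sketch of merge.sh: merge LoRA adapter weights into the base model
# and export the merged model. All paths below are placeholders.

CUDA_VISIBLE_DEVICES=0 python src/export_model.py \
    --model_name_or_path path_to_base_model \
    --adapter_name_or_path path_to_lora_checkpoint \
    --template default \
    --finetuning_type lora \
    --export_dir path_to_merged_model
```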
Usage:

- `merge.sh`: merge the LoRA weights
- `quantize.sh`: quantize the merged model with AutoGPTQ (optional; must be run after `merge.sh`; see the sketch below)
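Under the same assumptions, a sketch of `quantize.sh` might re-export the merged model with the quantization flags from the tip above. Note that `--model_name_or_path` now points at the merged output of `merge.sh`, not at the LoRA checkpoint; paths remain placeholders.

```bash
#!/bin/bash
# Sketch of quantize.sh: quantize the merged model with AutoGPTQ.
# Run only after merge.sh has produced the merged model.

CUDA_VISIBLE_DEVICES=0 python src/export_model.py \
    --model_name_or_path path_to_merged_model \
    --template default \
    --export_dir path_to_quantized_model \
    --export_quantization_bit 4 \
    --export_quantization_dataset data/c4_demo.json
```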