update readme

Former-commit-id: df9e6bb0636160a93f1d4e9562a2e31a08009be3
hiyouga 2024-03-05 03:20:23 +08:00
parent 8cf9842f7a
commit 02eac3fd09
2 changed files with 3 additions and 3 deletions

@@ -558,9 +558,9 @@ python src/export_model.py \
 > Merging LoRA weights into a quantized model is not supported.
 > [!TIP]
-> Use `--model_name_or_path path_to_export` only to use the exported model.
+> Use `--model_name_or_path path_to_export` solely to use the exported model.
 >
-> Use `--export_quantization_bit 4` and `--export_quantization_dataset data/c4_demo.json` to quantize the model after merging the LoRA weights.
+> Use `--export_quantization_bit 4` and `--export_quantization_dataset data/c4_demo.json` to quantize the model with AutoGPTQ after merging the LoRA weights.
 ### Inference with OpenAI-style API

@@ -559,7 +559,7 @@ python src/export_model.py \
 > [!TIP]
 > Use only `--model_name_or_path path_to_export` to load the exported model.
 >
-> After merging the LoRA weights, you can use `--export_quantization_bit 4` and `--export_quantization_dataset data/c4_demo.json` again to quantize the model.
+> After merging the LoRA weights, you can use `--export_quantization_bit 4` and `--export_quantization_dataset data/c4_demo.json` again to quantize the model with AutoGPTQ.
 ### Inference with OpenAI-style API