diff --git a/README.md b/README.md
index 9491e1dc..cdc019f1 100644
--- a/README.md
+++ b/README.md
@@ -9,6 +9,8 @@
 
 ## Changelog
 
+[23/07/11] Now we support training the **Baichuan-13B** model in this repo. Try `--model_name_or_path baichuan-inc/Baichuan-13B-Base` and `--lora_target W_pack` arguments to use the Baichuan-13B model.
+
 [23/07/09] Now we release [FastEdit](https://github.com/hiyouga/FastEdit)⚡🩹, an easy-to-use package for editing the factual knowledge of large language models efficiently. Please follow [FastEdit](https://github.com/hiyouga/FastEdit) if you are interested.
 
 [23/07/07] Now we support training the **InternLM-7B** model in this repo. Try `--model_name_or_path internlm/internlm-7b` argument to use the InternLM model. Remember to use `--prompt_template intern` argument when you are using the InternLM-chat model.
 
@@ -19,7 +21,7 @@
 
 [23/06/22] Now we align the [demo API](src/api_demo.py) with the [OpenAI's](https://platform.openai.com/docs/api-reference/chat) format where you can insert the fine-tuned model in **arbitrary ChatGPT-based applications**.
 
-[23/06/15] Now we support training the **baichuan-7B** model in this repo. Try `--model_name_or_path baichuan-inc/baichuan-7B` and `--lora_target W_pack` arguments to use the baichuan-7B model. If you want to train with RTX3090, use `git checkout baichuan-7b-rtx3090` to switch to the `baichuan-7b-rtx3090` branch and try the `--baichuan_rtx_gpu true` argument. (Other RTX series GPUs can also be tried)
+[23/06/15] Now we support training the **Baichuan-7B** model in this repo. Try `--model_name_or_path baichuan-inc/Baichuan-7B` and `--lora_target W_pack` arguments to use the Baichuan-7B model. If you want to train with RTX3090, use `git checkout baichuan-7b-rtx3090` to switch to the `baichuan-7b-rtx3090` branch and try the `--baichuan_rtx_gpu true` argument. (Other RTX series GPUs can also be tried)
 
 [23/06/03] Now we support quantized training and inference (aka **[QLoRA](https://github.com/artidoro/qlora)**). Try `--quantization_bit 4/8` argument to work with quantized model. (experimental feature)
 
@@ -30,7 +32,7 @@
 - [LLaMA](https://github.com/facebookresearch/llama) (7B/13B/33B/65B)
 - [BLOOM](https://huggingface.co/bigscience/bloom) & [BLOOMZ](https://huggingface.co/bigscience/bloomz) (560M/1.1B/1.7B/3B/7.1B/176B)
 - [Falcon](https://huggingface.co/tiiuae/falcon-7b) (7B/40B)
-- [baichuan](https://huggingface.co/baichuan-inc/baichuan-7B) (7B)
+- [Baichuan](https://huggingface.co/baichuan-inc/baichuan-7B) (7B/13B)
 - [InternLM](https://github.com/InternLM/InternLM) (7B)
 
 ## Supported Training Approaches
diff --git a/tests/auto_gptq.py b/tests/quantize.py
similarity index 86%
rename from tests/auto_gptq.py
rename to tests/quantize.py
index cdf97305..226b0908 100644
--- a/tests/auto_gptq.py
+++ b/tests/quantize.py
@@ -2,7 +2,7 @@
 # Quantizes fine-tuned models with AutoGPTQ (https://github.com/PanQiWei/AutoGPTQ).
 # Usage: python auto_gptq.py --input_dir path_to_llama_model --output_dir path_to_quant_model --data_file alpaca.json
 # --max_length 1024 --max_samples 1024
-# dataset format: question (string), A (string), B (string), C (string), D (string), answer (Literal["A", "B", "C", "D"])
+# dataset format: instruction (string), input (string), output (string), history (List[string])
 
 
 import fire
@@ -23,7 +23,9 @@ def quantize(input_dir: str, output_dir: str, data_file: str, max_length: int, m
             if "history" in examples:
                 for user_query, bot_resp in examples["history"][i]:
                     prompt += "Human: {}\nAssistant: {}\n".format(user_query, bot_resp)
-            prompt += "Human: {}\nAssistant: {}".format(examples["instruction"][i], examples["output"][i])
+            prompt += "Human: {}\nAssistant: {}".format(
+                examples["instruction"][i] + "\n" + examples["input"][i], examples["output"][i]
+            )
             texts.append(prompt)
 
         return tokenizer(texts, truncation=True, max_length=max_length)
@@ -39,7 +41,7 @@ def quantize(input_dir: str, output_dir: str, data_file: str, max_length: int, m
         desc_act=False
     )
 
-    model = AutoGPTQForCausalLM.from_pretrained(input_dir, quantize_config)
+    model = AutoGPTQForCausalLM.from_pretrained(input_dir, quantize_config, trust_remote_code=True)
     model.quantize(dataset)
     model.save_quantized(output_dir)
 
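
For reference, the second hunk above changes how calibration prompts are built: each record of the new dataset format (`instruction`, `input`, `output`, `history`) is flattened into a single "Human:/Assistant:" string, with `instruction` and `input` joined by a newline. Below is a minimal, standalone sketch of that flattening for one record; the `build_prompt` helper and the sample data are illustrative only and are not part of the repo.

```python
from typing import Any, Dict

def build_prompt(example: Dict[str, Any]) -> str:
    """Flatten one record (instruction/input/output/history) into a
    Human:/Assistant: calibration prompt, mirroring the logic in tests/quantize.py."""
    prompt = ""
    # Prepend previous turns (oldest first), one Human/Assistant pair per turn.
    for user_query, bot_resp in example.get("history", []):
        prompt += "Human: {}\nAssistant: {}\n".format(user_query, bot_resp)
    # Current turn: instruction and input are joined with a newline.
    prompt += "Human: {}\nAssistant: {}".format(
        example["instruction"] + "\n" + example["input"], example["output"]
    )
    return prompt

if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    record = {
        "instruction": "Translate the sentence to French.",
        "input": "Good morning.",
        "output": "Bonjour.",
        "history": [["Hello!", "Hi, how can I help you?"]],
    }
    print(build_prompt(record))
```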