diff --git a/README.md b/README.md
index 19a628d4..33e3fe2c 100644
--- a/README.md
+++ b/README.md
@@ -72,7 +72,7 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/
 
 ## Changelog
 
-[24/08/30] We supported fine-tuning the **[Qwen2-VL](https://qwenlm.github.io/blog/qwen2-vl/)** models.
+[24/08/30] We support fine-tuning the **[Qwen2-VL](https://qwenlm.github.io/blog/qwen2-vl/)** models. Thanks to [@simonJJJ](https://github.com/simonJJJ)'s PR.
 
 [24/08/27] We support **[Liger Kernel](https://github.com/linkedin/Liger-Kernel)**. Try `enable_liger_kernel: true` for efficient training.
 
@@ -88,7 +88,7 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/
 
 [24/05/26] We supported **[SimPO](https://arxiv.org/abs/2405.14734)** algorithm for preference learning. See [examples](examples/README.md) for usage.
 
-[24/05/20] We supported fine-tuning the **PaliGemma** series models. Note that the PaliGemma models are pre-trained models, you need to fine-tune them with `gemma` template for chat completion.
+[24/05/20] We supported fine-tuning the **PaliGemma** series models. Note that the PaliGemma models are pre-trained; you need to fine-tune them with the `paligemma` template for chat completion.
 
 [24/05/18] We supported **[KTO](https://arxiv.org/abs/2402.01306)** algorithm for preference learning. See [examples](examples/README.md) for usage.
diff --git a/README_zh.md b/README_zh.md
index bbe6d159..9465f027 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -73,7 +73,7 @@ https://github.com/user-attachments/assets/e6ce34b0-52d5-4f3e-a830-592106c4c272
 
 ## 更新日志
 
-[24/08/30] 我们支持了 **[Qwen2-VL](https://qwenlm.github.io/blog/qwen2-vl/)** 模型的微调。
+[24/08/30] 我们支持了 **[Qwen2-VL](https://qwenlm.github.io/blog/qwen2-vl/)** 模型的微调。感谢 [@simonJJJ](https://github.com/simonJJJ) 的 PR。
 
 [24/08/27] 我们支持了 **[Liger Kernel](https://github.com/linkedin/Liger-Kernel)**。请使用 `enable_liger_kernel: true` 来加速训练。
 
@@ -89,7 +89,7 @@ https://github.com/user-attachments/assets/e6ce34b0-52d5-4f3e-a830-592106c4c272
 
 [24/05/26] 我们支持了 **[SimPO](https://arxiv.org/abs/2405.14734)** 偏好对齐算法。详细用法请参照 [examples](examples/README_zh.md)。
 
-[24/05/20] 我们支持了 **PaliGemma** 系列模型的微调。注意 PaliGemma 是预训练模型,你需要使用 `gemma` 模板进行微调使其获得对话能力。
+[24/05/20] 我们支持了 **PaliGemma** 系列模型的微调。注意 PaliGemma 是预训练模型,你需要使用 `paligemma` 模板进行微调使其获得对话能力。
 
 [24/05/18] 我们支持了 **[KTO](https://arxiv.org/abs/2402.01306)** 偏好对齐算法。详细用法请参照 [examples](examples/README_zh.md)。
diff --git a/examples/README.md b/examples/README.md
index e92bf052..d6dccb1c 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -134,6 +134,12 @@ FORCE_TORCHRUN=1 NNODES=2 RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llama
 FORCE_TORCHRUN=1 NNODES=2 RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_full/llama3_full_sft_ds3.yaml
 ```
 
+#### Multimodal Supervised Fine-Tuning
+
+```bash
+FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/qwen2vl_full_sft.yaml
+```
+
 #### Batch Predicting and Computing BLEU and ROUGE Scores
 
 ```bash
diff --git a/examples/README_zh.md b/examples/README_zh.md
index 88588c3a..037136a1 100644
--- a/examples/README_zh.md
+++ b/examples/README_zh.md
@@ -134,6 +134,12 @@ FORCE_TORCHRUN=1 NNODES=2 RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llama
 FORCE_TORCHRUN=1 NNODES=2 RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_full/llama3_full_sft_ds3.yaml
 ```
 
+#### 多模态指令监督微调
+
+```bash
+FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/qwen2vl_full_sft.yaml
+```
+
 #### 批量预测并计算 BLEU 和 ROUGE 分数
 
 ```bash
diff --git a/examples/inference/llava1_5.yaml b/examples/inference/llava1_5.yaml
index d613be68..4a7673e1 100644
--- a/examples/inference/llava1_5.yaml
+++ b/examples/inference/llava1_5.yaml
@@ -1,3 +1,3 @@
 model_name_or_path: llava-hf/llava-1.5-7b-hf
-template: vicuna
+template: llava
 visual_inputs: true
diff --git a/examples/inference/qwen2_vl.yaml b/examples/inference/qwen2_vl.yaml
new file mode 100644
index 00000000..a875f0d2
--- /dev/null
+++ b/examples/inference/qwen2_vl.yaml
@@ -0,0 +1,3 @@
+model_name_or_path: Qwen/Qwen2-VL-7B-Instruct
+template: qwen2_vl
+visual_inputs: true
diff --git a/examples/merge_lora/qwen2vl_lora_sft.yaml b/examples/merge_lora/qwen2vl_lora_sft.yaml
new file mode 100644
index 00000000..c71cd87e
--- /dev/null
+++ b/examples/merge_lora/qwen2vl_lora_sft.yaml
@@ -0,0 +1,14 @@
+### Note: DO NOT use quantized model or quantization_bit when merging lora adapters
+
+### model
+model_name_or_path: Qwen/Qwen2-VL-7B-Instruct
+adapter_name_or_path: saves/qwen2_vl-7b/lora/sft
+visual_inputs: true
+template: qwen2_vl
+finetuning_type: lora
+
+### export
+export_dir: models/qwen2_vl_lora_sft
+export_size: 2
+export_device: cpu
+export_legacy_format: false
diff --git a/examples/train_full/qwen2vl_full_sft.yaml b/examples/train_full/qwen2vl_full_sft.yaml
new file mode 100644
index 00000000..1163a37e
--- /dev/null
+++ b/examples/train_full/qwen2vl_full_sft.yaml
@@ -0,0 +1,40 @@
+### model
+model_name_or_path: Qwen/Qwen2-VL-7B-Instruct
+visual_inputs: true
+
+### method
+stage: sft
+do_train: true
+finetuning_type: full
+deepspeed: examples/deepspeed/ds_z3_config.json
+
+### dataset
+dataset: mllm_demo
+template: qwen2_vl
+cutoff_len: 1024
+max_samples: 1000
+overwrite_cache: true
+preprocessing_num_workers: 16
+
+### output
+output_dir: saves/qwen2_vl-7b/full/sft
+logging_steps: 10
+save_steps: 500
+plot_loss: true
+overwrite_output_dir: true
+
+### train
+per_device_train_batch_size: 1
+gradient_accumulation_steps: 2
+learning_rate: 1.0e-5
+num_train_epochs: 3.0
+lr_scheduler_type: cosine
+warmup_ratio: 0.1
+bf16: true
+ddp_timeout: 180000000
+
+### eval
+val_size: 0.1
+per_device_eval_batch_size: 1
+eval_strategy: steps
+eval_steps: 500
diff --git a/examples/train_lora/llava1_5_lora_sft.yaml b/examples/train_lora/llava1_5_lora_sft.yaml
index ec03f82c..f0616ac8 100644
--- a/examples/train_lora/llava1_5_lora_sft.yaml
+++ b/examples/train_lora/llava1_5_lora_sft.yaml
@@ -10,7 +10,7 @@ lora_target: all
 
 ### dataset
 dataset: mllm_demo
-template: vicuna
+template: llava
 cutoff_len: 1024
 max_samples: 1000
 overwrite_cache: true
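Taken together, the files in this patch cover an end-to-end multimodal workflow: fine-tune, optionally merge a LoRA adapter, then run inference. A minimal sketch of that sequence, assuming the usual `llamafactory-cli` subcommands (`train`, `export`, `chat`) are available in the installed version:

```bash
# Full-parameter SFT of Qwen2-VL on the mllm_demo dataset,
# using the new examples/train_full/qwen2vl_full_sft.yaml config
FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/qwen2vl_full_sft.yaml

# If a LoRA adapter was trained instead, merge it into the base model.
# Per the note in the config: do not merge into a quantized base model.
llamafactory-cli export examples/merge_lora/qwen2vl_lora_sft.yaml

# Chat with the model using the new multimodal inference config
llamafactory-cli chat examples/inference/qwen2_vl.yaml
```

The `export` step expects `adapter_name_or_path` to point at an existing adapter checkpoint (`saves/qwen2_vl-7b/lora/sft` in the config), so it only applies to the LoRA path, not the full-parameter run above it.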