support paligemma

Former-commit-id: 2a67457e3944d5e528286cb7203857c13078c484
Parent: a935c5105d
Commit: cce3892f91

README.md (11 lines changed)
@@ -69,12 +69,12 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/
 ## Changelog
 
+[24/05/20] We supported fine-tuning the **PaliGemma** series models. Note that the PaliGemma models are pre-trained models; you need to fine-tune them with the `gemma` template for chat completion.
+
 [24/05/18] We supported **[KTO](https://arxiv.org/abs/2402.01306)** algorithm for preference learning. See [examples](examples/README.md) for usage.
 
 [24/05/14] We supported training and inference on the Ascend NPU devices. Check [installation](#installation) section for details.
 
 [24/05/13] We supported fine-tuning the **Yi-1.5** series models.
 
 <details><summary>Full Changelog</summary>
 
 [24/04/26] We supported fine-tuning the **LLaVA-1.5** multimodal LLMs. See [examples](examples/README.md) for usage.
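The changelog entry above points out that PaliGemma ships as a pre-trained (not chat-tuned) model, and the dependency table later in this diff raises transformers to 4.41.0, the release that introduced PaliGemma support. As a hedged sketch that is not part of this commit, plain-transformers inference with one of the base checkpoints looks roughly like this (the image URL is a placeholder):

```python
# Hedged sketch, not from this commit: plain-transformers inference with a
# pre-trained PaliGemma checkpoint (requires transformers >= 4.41.0).
import requests
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-pt-224"  # one of the checkpoints registered in this commit
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)

# Base PaliGemma expects short task prefixes such as "caption en" rather than
# chat turns; chat behaviour is what fine-tuning with the `gemma` template adds.
image = Image.open(requests.get("https://example.com/cat.jpg", stream=True).raw)  # placeholder URL
inputs = processor(text="caption en", images=image, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(processor.decode(output[0], skip_special_tokens=True))
```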
@@ -160,6 +160,7 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/
 | [LLaVA-1.5](https://huggingface.co/llava-hf) | 7B/13B | q_proj,v_proj | vicuna |
 | [Mistral/Mixtral](https://huggingface.co/mistralai) | 7B/8x7B/8x22B | q_proj,v_proj | mistral |
 | [OLMo](https://huggingface.co/allenai) | 1B/7B | q_proj,v_proj | - |
+| [PaliGemma](https://huggingface.co/google) | 3B | q_proj,v_proj | gemma |
 | [Phi-1.5/2](https://huggingface.co/microsoft) | 1.3B/2.7B | q_proj,v_proj | - |
 | [Phi-3](https://huggingface.co/microsoft) | 3.8B | qkv_proj | phi |
 | [Qwen](https://huggingface.co/Qwen) | 1.8B/7B/14B/72B | c_attn | qwen |
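The new row lists `q_proj,v_proj` as PaliGemma's default LoRA modules and `gemma` as its template. Purely as an illustration of what targeting those modules means, and not LLaMA-Factory's own training code, a bare peft setup might look like the following (the r and alpha values are arbitrary):

```python
# Hedged sketch: attach LoRA adapters to the q_proj/v_proj projections that the
# table lists as PaliGemma's default modules. Hyperparameters are arbitrary.
from peft import LoraConfig, get_peft_model
from transformers import PaliGemmaForConditionalGeneration

model = PaliGemmaForConditionalGeneration.from_pretrained("google/paligemma-3b-pt-224")
lora_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the injected LoRA weights remain trainable
```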
@@ -284,10 +285,10 @@ huggingface-cli login
 | ------------ | ------- | --------- |
 | python       | 3.8     | 3.10      |
 | torch        | 1.13.1  | 2.2.0     |
-| transformers | 4.37.2  | 4.40.1    |
+| transformers | 4.37.2  | 4.41.0    |
 | datasets     | 2.14.3  | 2.19.1    |
-| accelerate   | 0.27.2  | 0.30.0    |
-| peft         | 0.9.0   | 0.10.0    |
+| accelerate   | 0.27.2  | 0.30.1    |
+| peft         | 0.9.0   | 0.11.1    |
 | trl          | 0.8.1   | 0.8.6     |
 
 | Optional     | Minimum | Recommend |
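The table above bumps the recommended versions of transformers, accelerate, and peft. A throwaway helper, not shipped with the project, to compare a local environment against the updated Recommend column:

```python
# Hedged helper: report installed versions next to the README's recommendations.
from importlib.metadata import PackageNotFoundError, version

recommended = {
    "transformers": "4.41.0",  # first release with PaliGemma support
    "datasets": "2.19.1",
    "accelerate": "0.30.1",
    "peft": "0.11.1",
    "trl": "0.8.6",
}
for pkg, want in recommended.items():
    try:
        have = version(pkg)
    except PackageNotFoundError:
        have = "not installed"
    print(f"{pkg:12s} installed={have:15s} recommended={want}")
```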
README_zh.md (11 lines changed)
@@ -69,12 +69,12 @@ https://github.com/hiyouga/LLaMA-Factory/assets/16256802/ec36a9dd-37f4-4f72-81bd
 ## Changelog
 
+[24/05/20] We supported fine-tuning the **PaliGemma** series models. Note that PaliGemma is a pre-trained model; you need to fine-tune it with the `gemma` template to give it chat capability.
+
 [24/05/18] We supported the **[KTO](https://arxiv.org/abs/2402.01306)** preference-alignment algorithm. See [examples](examples/README_zh.md) for detailed usage.
 
 [24/05/14] We supported training and inference on Ascend NPU devices. See the [installation](#安装-llama-factory) section for details.
 
 [24/05/13] We supported fine-tuning the Yi-1.5 series models.
 
 <details><summary>Full Changelog</summary>
 
 [24/04/26] We supported fine-tuning the **LLaVA-1.5** multimodal model. See [examples](examples/README_zh.md) for detailed usage.
@@ -160,6 +160,7 @@ https://github.com/hiyouga/LLaMA-Factory/assets/16256802/ec36a9dd-37f4-4f72-81bd
 | [LLaVA-1.5](https://huggingface.co/llava-hf) | 7B/13B | q_proj,v_proj | vicuna |
 | [Mistral/Mixtral](https://huggingface.co/mistralai) | 7B/8x7B/8x22B | q_proj,v_proj | mistral |
 | [OLMo](https://huggingface.co/allenai) | 1B/7B | q_proj,v_proj | - |
+| [PaliGemma](https://huggingface.co/google) | 3B | q_proj,v_proj | gemma |
 | [Phi-1.5/2](https://huggingface.co/microsoft) | 1.3B/2.7B | q_proj,v_proj | - |
 | [Phi-3](https://huggingface.co/microsoft) | 3.8B | qkv_proj | phi |
 | [Qwen](https://huggingface.co/Qwen) | 1.8B/7B/14B/72B | c_attn | qwen |
@@ -284,10 +285,10 @@ huggingface-cli login
 | ------------ | ------- | --------- |
 | python       | 3.8     | 3.10      |
 | torch        | 1.13.1  | 2.2.0     |
-| transformers | 4.37.2  | 4.40.1    |
+| transformers | 4.37.2  | 4.41.0    |
 | datasets     | 2.14.3  | 2.19.1    |
-| accelerate   | 0.27.2  | 0.30.0    |
-| peft         | 0.9.0   | 0.10.0    |
+| accelerate   | 0.27.2  | 0.30.1    |
+| peft         | 0.9.0   | 0.11.1    |
 | trl          | 0.8.1   | 0.8.6     |
 
 | Optional     | Minimum | Recommend |
@@ -716,6 +716,28 @@ register_model_group(
 )
 
 
+register_model_group(
+    models={
+        "PaliGemma-3B-pt-224": {
+            DownloadSource.DEFAULT: "google/paligemma-3b-pt-224",
+        },
+        "PaliGemma-3B-pt-448": {
+            DownloadSource.DEFAULT: "google/paligemma-3b-pt-448",
+        },
+        "PaliGemma-3B-pt-896": {
+            DownloadSource.DEFAULT: "google/paligemma-3b-pt-896",
+        },
+        "PaliGemma-3B-mix-224": {
+            DownloadSource.DEFAULT: "google/paligemma-3b-mix-224",
+        },
+        "PaliGemma-3B-mix-448": {
+            DownloadSource.DEFAULT: "google/paligemma-3b-mix-448",
+        },
+    },
+    vision=True,
+)
+
+
 register_model_group(
     models={
         "Phi-1.5-1.3B": {
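The final hunk (its file header was lost in this view) registers five PaliGemma checkpoints and flags the group as vision-capable. The snippet below is a self-contained toy version of that registration pattern; the real `register_model_group` and `DownloadSource` in LLaMA-Factory's constants module have more sources and fields, so every name and behavior here is an assumption for illustration only:

```python
# Toy re-implementation of the registration pattern shown in the hunk above.
# Everything below is assumed for illustration; it is not the project's code.
from enum import Enum
from typing import Dict

class DownloadSource(str, Enum):
    DEFAULT = "hf"  # assumed: a Hugging Face Hub repo id

SUPPORTED_MODELS: Dict[str, Dict[DownloadSource, str]] = {}
VISION_MODELS: set = set()

def register_model_group(models: Dict[str, Dict[DownloadSource, str]], vision: bool = False) -> None:
    """Record each checkpoint's download path and optionally mark it as multimodal."""
    for name, sources in models.items():
        SUPPORTED_MODELS[name] = sources
        if vision:
            VISION_MODELS.add(name)

register_model_group(
    models={
        "PaliGemma-3B-pt-224": {DownloadSource.DEFAULT: "google/paligemma-3b-pt-224"},
        "PaliGemma-3B-mix-448": {DownloadSource.DEFAULT: "google/paligemma-3b-mix-448"},
    },
    vision=True,
)

assert "PaliGemma-3B-pt-224" in VISION_MODELS
print(SUPPORTED_MODELS["PaliGemma-3B-mix-448"][DownloadSource.DEFAULT])
```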