Mirror of https://github.com/hiyouga/LLaMA-Factory.git, synced 2025-11-04 18:02:19 +08:00

tiny fix

Former-commit-id: 1fe424323b212094856f423351dc2a15774d39c3
Parent: c71c78da50
Commit: c7efc7f2ed

@@ -75,7 +75,7 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/

## Changelog

-[24/10/09] We supported downloading pre-trained models and datasets from the **[Modelers Hub](https://modelers.cn/models)** for Chinese mainland users. See [this tutorial](#download-from-modelers-hub) for usage.
+[24/10/09] We supported downloading pre-trained models and datasets from the **[Modelers Hub](https://modelers.cn/models)**. See [this tutorial](#download-from-modelers-hub) for usage.

[24/09/19] We support fine-tuning the **[Qwen2.5](https://qwenlm.github.io/blog/qwen2.5/)** models.

@@ -135,7 +135,7 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/

[23/12/12] We supported fine-tuning the latest MoE model **[Mixtral 8x7B](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1)** in our framework. See hardware requirement [here](#hardware-requirement).

-[23/12/01] We supported downloading pre-trained models and datasets from the **[ModelScope Hub](https://modelscope.cn/models)** for Chinese mainland users. See [this tutorial](#download-from-modelscope-hub) for usage.
+[23/12/01] We supported downloading pre-trained models and datasets from the **[ModelScope Hub](https://modelscope.cn/models)**. See [this tutorial](#download-from-modelscope-hub) for usage.

[23/10/21] We supported **[NEFTune](https://arxiv.org/abs/2310.05914)** trick for fine-tuning. Try `neftune_noise_alpha: 5` argument to activate NEFTune.

@@ -365,7 +365,7 @@ cd LLaMA-Factory
pip install -e ".[torch,metrics]"
```

-Extra dependencies available: torch, torch-npu, metrics, deepspeed, liger-kernel, bitsandbytes, hqq, eetq, gptq, awq, aqlm, vllm, galore, badam, adam-mini, qwen, modelscope, quality, openmind
+Extra dependencies available: torch, torch-npu, metrics, deepspeed, liger-kernel, bitsandbytes, hqq, eetq, gptq, awq, aqlm, vllm, galore, badam, adam-mini, qwen, modelscope, openmind, quality

> [!TIP]
> Use `pip install --no-deps -e .` to resolve package conflicts.

@@ -75,6 +75,7 @@ https://github.com/user-attachments/assets/e6ce34b0-52d5-4f3e-a830-592106c4c272
</details>

## Changelog

+[24/10/09] We supported downloading pre-trained models and datasets from the **[Modelers Hub](https://modelers.cn/models)**. See [this tutorial](#从魔乐社区下载) for usage.

[24/09/19] We supported fine-tuning the **[Qwen2.5](https://qwenlm.github.io/blog/qwen2.5/)** models.

@@ -365,7 +366,7 @@ cd LLaMA-Factory
pip install -e ".[torch,metrics]"
```

-Optional extra dependencies: torch, torch-npu, metrics, deepspeed, liger-kernel, bitsandbytes, hqq, eetq, gptq, awq, aqlm, vllm, galore, badam, adam-mini, qwen, modelscope, quality, openmind
+Optional extra dependencies: torch, torch-npu, metrics, deepspeed, liger-kernel, bitsandbytes, hqq, eetq, gptq, awq, aqlm, vllm, galore, badam, adam-mini, qwen, modelscope, openmind, quality

> [!TIP]
> Use `pip install --no-deps -e .` to resolve package conflicts.

@@ -418,6 +419,7 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh
### Data Preparation

Please refer to [data/README_zh.md](data/README_zh.md) for the format of dataset files. You can use datasets on HuggingFace / ModelScope / Modelers or load a local dataset.

> [!NOTE]
> Please update `data/dataset_info.json` when using a custom dataset.

@@ -591,7 +593,7 @@ export USE_MODELSCOPE_HUB=1 # use `set USE_MODELSCOPE_HUB=1` on Windows

### Download from Modelers Hub

-You can also use the Modelers Hub via the method below, downloading datasets and models on the Modelers Hub.
+You can also download datasets and models from the Modelers Hub via the method below.

```bash
export USE_OPENMIND_HUB=1 # use `set USE_OPENMIND_HUB=1` on Windows

@@ -599,7 +601,6 @@ export USE_OPENMIND_HUB=1 # use `set USE_OPENMIND_HUB=1` on Windows

Set `model_name_or_path` to the model ID to load the corresponding model. Browse all available models on the [Modelers Hub](https://modelers.cn/models), e.g., `TeleAI/TeleChat-7B-pt`.

### Use the W&B Dashboard

To log experiment data with [Weights & Biases](https://wandb.ai), add the arguments below to the yaml file.

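For reference, a minimal sketch of how such a hub switch is read; it mirrors the `use_openmind()` helper in `llamafactory.extras.misc` (also touched later in this commit), with an illustrative value:

```python
# Minimal sketch of the USE_OPENMIND_HUB switch documented above.
import os

def use_openmind() -> bool:
    # truthy when USE_OPENMIND_HUB is set to "1" or "true"
    return os.environ.get("USE_OPENMIND_HUB", "0").lower() in ["true", "1"]

os.environ["USE_OPENMIND_HUB"] = "1"
print(use_openmind())  # True -> models/datasets resolve against modelers.cn
```
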
@@ -16,6 +16,7 @@ services:
    volumes:
      - ../../hf_cache:/root/.cache/huggingface
      - ../../ms_cache:/root/.cache/modelscope
+      - ../../om_cache:/root/.cache/openmind
      - ../../data:/app/data
      - ../../output:/app/output
    ports:

@@ -10,6 +10,7 @@ services:
    volumes:
      - ../../hf_cache:/root/.cache/huggingface
      - ../../ms_cache:/root/.cache/modelscope
+      - ../../om_cache:/root/.cache/openmind
      - ../../data:/app/data
      - ../../output:/app/output
      - /usr/local/dcmi:/usr/local/dcmi

@@ -15,6 +15,7 @@ services:
    volumes:
      - ../../hf_cache:/root/.cache/huggingface
      - ../../ms_cache:/root/.cache/modelscope
+      - ../../om_cache:/root/.cache/openmind
      - ../../data:/app/data
      - ../../output:/app/output
      - ../../saves:/app/saves

setup.py (+1)

@@ -60,6 +60,7 @@ extra_require = {
    "adam-mini": ["adam-mini"],
    "qwen": ["transformers_stream_generator"],
    "modelscope": ["modelscope"],
+    "openmind": ["openmind"],
    "dev": ["ruff", "pytest"],
}

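The new `openmind` extra follows the standard setuptools pattern; a minimal, self-contained sketch (the package name here is hypothetical, not the project's):

```python
# Hypothetical, minimal setup.py showing how an optional dependency group
# such as "openmind" is declared and later installed on demand with
#   pip install "demo-package[openmind]"
from setuptools import setup

extra_require = {
    "modelscope": ["modelscope"],
    "openmind": ["openmind"],  # the group added by this commit
}

setup(
    name="demo-package",  # hypothetical name
    version="0.1.0",
    extras_require=extra_require,
)
```
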
@@ -53,7 +53,7 @@ def _load_single_dataset(
    """
    logger.info("Loading dataset {}...".format(dataset_attr))
    data_path, data_name, data_dir, data_files = None, None, None, None
-    if dataset_attr.load_from in ["om_hub", "hf_hub", "ms_hub"]:
+    if dataset_attr.load_from in ["hf_hub", "ms_hub", "om_hub"]:
        data_path = dataset_attr.dataset_name
        data_name = dataset_attr.subset
        data_dir = dataset_attr.folder

@@ -84,24 +84,7 @@ def _load_single_dataset(
    else:
        raise NotImplementedError("Unknown load type: {}.".format(dataset_attr.load_from))

-    if dataset_attr.load_from == "om_hub":
-        try:
-            from openmind import OmDataset
-            from openmind.utils.hub import OM_DATASETS_CACHE
-            cache_dir = model_args.cache_dir or OM_DATASETS_CACHE
-            dataset = OmDataset.load_dataset(
-                path=data_path,
-                name=data_name,
-                data_dir=data_dir,
-                data_files=data_files,
-                split=dataset_attr.split,
-                cache_dir=cache_dir,
-                token=model_args.om_hub_token,
-                streaming=(data_args.streaming and (dataset_attr.load_from != "file")),
-            )
-        except ImportError:
-            raise ImportError("Please install openmind via `pip install openmind -U`")
-    elif dataset_attr.load_from == "ms_hub":
+    if dataset_attr.load_from == "ms_hub":
        require_version("modelscope>=1.11.0", "To fix: pip install modelscope>=1.11.0")
        from modelscope import MsDataset
        from modelscope.utils.config_ds import MS_DATASETS_CACHE

@@ -119,6 +102,23 @@ def _load_single_dataset(
        )
        if isinstance(dataset, MsDataset):
            dataset = dataset.to_hf_dataset()

+    elif dataset_attr.load_from == "om_hub":
+        require_version("openmind>=0.8.0", "To fix: pip install openmind>=0.8.0")
+        from openmind import OmDataset
+        from openmind.utils.hub import OM_DATASETS_CACHE
+
+        cache_dir = model_args.cache_dir or OM_DATASETS_CACHE
+        dataset = OmDataset.load_dataset(
+            path=data_path,
+            name=data_name,
+            data_dir=data_dir,
+            data_files=data_files,
+            split=dataset_attr.split,
+            cache_dir=cache_dir,
+            token=model_args.om_hub_token,
+            streaming=(data_args.streaming and (dataset_attr.load_from != "file")),
+        )
    else:
        dataset = load_dataset(
            path=data_path,

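Taken together, the two hunks above turn the loader into a flat if/elif dispatch with an up-front version check and a lazy import per hub. A simplified sketch of that pattern (the function and its arguments are stand-ins for the real `_load_single_dataset` logic; the hub APIs are the ones the diff uses):

```python
# Simplified sketch of the hub dispatch after this commit: check the
# dependency first, then import the hub client only when selected.
from transformers.utils.versions import require_version

def load_from_hub(load_from: str, data_path: str):
    if load_from == "ms_hub":
        require_version("modelscope>=1.11.0", "To fix: pip install modelscope>=1.11.0")
        from modelscope import MsDataset  # lazy import

        return MsDataset.load(data_path)
    elif load_from == "om_hub":
        require_version("openmind>=0.8.0", "To fix: pip install openmind>=0.8.0")
        from openmind import OmDataset

        return OmDataset.load_dataset(path=data_path)
    else:  # default: the Hugging Face hub
        from datasets import load_dataset

        return load_dataset(path=data_path)
```
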
@@ -20,7 +20,7 @@ from typing import Any, Dict, List, Literal, Optional, Sequence
from transformers.utils import cached_file

from ..extras.constants import DATA_CONFIG
-from ..extras.misc import use_openmind, use_modelscope
+from ..extras.misc import use_modelscope, use_openmind


@dataclass

@@ -30,7 +30,7 @@ class DatasetAttr:
    """

    # basic configs
-    load_from: Literal["hf_hub", "ms_hub", "script", "file"]
+    load_from: Literal["hf_hub", "ms_hub", "om_hub", "script", "file"]
    dataset_name: str
    formatting: Literal["alpaca", "sharegpt"] = "alpaca"
    ranking: bool = False

@@ -97,11 +97,11 @@ def get_dataset_list(dataset_names: Optional[Sequence[str]], dataset_dir: str) -

    dataset_list: List["DatasetAttr"] = []
    for name in dataset_names:
-        if dataset_info is None: # dataset_dir is ONLINE
-            if use_openmind():
-                load_from = "om_hub"
-            elif use_modelscope():
+        if dataset_info is None:  # dataset_dir is ONLINE
+            if use_modelscope():
                load_from = "ms_hub"
+            elif use_openmind():
+                load_from = "om_hub"
            else:
                load_from = "hf_hub"
            dataset_attr = DatasetAttr(load_from, dataset_name=name)

@@ -111,15 +111,15 @@ def get_dataset_list(dataset_names: Optional[Sequence[str]], dataset_dir: str) -
        if name not in dataset_info:
            raise ValueError("Undefined dataset {} in {}.".format(name, DATA_CONFIG))

-        has_om_url = "om_hub_url" in dataset_info[name]
        has_hf_url = "hf_hub_url" in dataset_info[name]
        has_ms_url = "ms_hub_url" in dataset_info[name]
+        has_om_url = "om_hub_url" in dataset_info[name]

-        if has_om_url or has_hf_url or has_ms_url:
-            if has_om_url and (use_openmind() or not has_hf_url):
-                dataset_attr = DatasetAttr("om_hub", dataset_name=dataset_info[name]["om_hub_url"])
+        if has_hf_url or has_ms_url or has_om_url:
+            if has_ms_url and (use_modelscope() or not has_hf_url):
+                dataset_attr = DatasetAttr("ms_hub", dataset_name=dataset_info[name]["ms_hub_url"])
+            elif has_om_url and (use_openmind() or not has_hf_url):
+                dataset_attr = DatasetAttr("om_hub", dataset_name=dataset_info[name]["om_hub_url"])
            else:
                dataset_attr = DatasetAttr("hf_hub", dataset_name=dataset_info[name]["hf_hub_url"])
        elif "script_url" in dataset_info[name]:

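The reordered branches encode a clear priority for a `dataset_info.json` entry: ModelScope first, then Modelers, with Hugging Face as the fallback whenever `hf_hub_url` exists and no switch is set. A small sketch of that decision, using a toy entry in place of the real `dataset_info` mapping:

```python
# Hedged sketch of the dataset-source priority this hunk establishes.
import os

def pick_source(dataset_info: dict) -> str:
    use_ms = os.environ.get("USE_MODELSCOPE_HUB", "0").lower() in ["true", "1"]
    use_om = os.environ.get("USE_OPENMIND_HUB", "0").lower() in ["true", "1"]
    has_hf = "hf_hub_url" in dataset_info
    if "ms_hub_url" in dataset_info and (use_ms or not has_hf):
        return "ms_hub"
    if "om_hub_url" in dataset_info and (use_om or not has_hf):
        return "om_hub"
    return "hf_hub"

# With both URLs present and no env switches set, the HF URL wins:
print(pick_source({"hf_hub_url": "org/data", "ms_hub_url": "org/data"}))  # hf_hub
```
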
@@ -107,7 +107,7 @@ VISION_MODELS = set()
class DownloadSource(str, Enum):
    DEFAULT = "hf"
    MODELSCOPE = "ms"
-    MODELERS = "om"
+    OPENMIND = "om"


def register_model_group(

@@ -164,17 +164,17 @@ register_model_group(
        "Baichuan2-13B-Base": {
            DownloadSource.DEFAULT: "baichuan-inc/Baichuan2-13B-Base",
            DownloadSource.MODELSCOPE: "baichuan-inc/Baichuan2-13B-Base",
-            DownloadSource.MODELERS: "Baichuan/Baichuan2_13b_base_pt"
+            DownloadSource.OPENMIND: "Baichuan/Baichuan2_13b_base_pt",
        },
        "Baichuan2-7B-Chat": {
            DownloadSource.DEFAULT: "baichuan-inc/Baichuan2-7B-Chat",
            DownloadSource.MODELSCOPE: "baichuan-inc/Baichuan2-7B-Chat",
-            DownloadSource.MODELERS: "Baichuan/Baichuan2_7b_chat_pt"
+            DownloadSource.OPENMIND: "Baichuan/Baichuan2_7b_chat_pt",
        },
        "Baichuan2-13B-Chat": {
            DownloadSource.DEFAULT: "baichuan-inc/Baichuan2-13B-Chat",
            DownloadSource.MODELSCOPE: "baichuan-inc/Baichuan2-13B-Chat",
-            DownloadSource.MODELERS: "Baichuan/Baichuan2_13b_chat_pt"
+            DownloadSource.OPENMIND: "Baichuan/Baichuan2_13b_chat_pt",
        },
    },
    template="baichuan2",

@@ -559,11 +559,12 @@ register_model_group(
        "Gemma-2-2B-Instruct": {
            DownloadSource.DEFAULT: "google/gemma-2-2b-it",
            DownloadSource.MODELSCOPE: "LLM-Research/gemma-2-2b-it",
+            DownloadSource.OPENMIND: "LlamaFactory/gemma-2-2b-it",
        },
        "Gemma-2-9B-Instruct": {
            DownloadSource.DEFAULT: "google/gemma-2-9b-it",
            DownloadSource.MODELSCOPE: "LLM-Research/gemma-2-9b-it",
-            DownloadSource.MODELERS: "LlamaFactory/gemma-2-2b-it"
+            DownloadSource.OPENMIND: "LlamaFactory/gemma-2-9b-it",
        },
        "Gemma-2-27B-Instruct": {
            DownloadSource.DEFAULT: "google/gemma-2-27b-it",

@@ -583,6 +584,7 @@ register_model_group(
        "GLM-4-9B-Chat": {
            DownloadSource.DEFAULT: "THUDM/glm-4-9b-chat",
            DownloadSource.MODELSCOPE: "ZhipuAI/glm-4-9b-chat",
+            DownloadSource.OPENMIND: "LlamaFactory/glm-4-9b-chat",
        },
        "GLM-4-9B-1M-Chat": {
            DownloadSource.DEFAULT: "THUDM/glm-4-9b-chat-1m",

@@ -637,6 +639,7 @@ register_model_group(
        "InternLM2.5-1.8B": {
            DownloadSource.DEFAULT: "internlm/internlm2_5-1_8b",
            DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm2_5-1_8b",
+            DownloadSource.OPENMIND: "Intern/internlm2_5-1_8b",
        },
        "InternLM2.5-7B": {
            DownloadSource.DEFAULT: "internlm/internlm2_5-7b",

@@ -645,23 +648,27 @@ register_model_group(
        "InternLM2.5-20B": {
            DownloadSource.DEFAULT: "internlm/internlm2_5-20b",
            DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm2_5-20b",
+            DownloadSource.OPENMIND: "Intern/internlm2_5-20b",
        },
        "InternLM2.5-1.8B-Chat": {
            DownloadSource.DEFAULT: "internlm/internlm2_5-1_8b-chat",
            DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm2_5-1_8b-chat",
+            DownloadSource.OPENMIND: "Intern/internlm2_5-1_8b-chat",
        },
        "InternLM2.5-7B-Chat": {
            DownloadSource.DEFAULT: "internlm/internlm2_5-7b-chat",
            DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm2_5-7b-chat",
+            DownloadSource.OPENMIND: "Intern/internlm2_5-7b-chat",
        },
        "InternLM2.5-7B-1M-Chat": {
            DownloadSource.DEFAULT: "internlm/internlm2_5-7b-chat-1m",
            DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm2_5-7b-chat-1m",
+            DownloadSource.OPENMIND: "Intern/internlm2_5-7b-chat-1m",
        },
        "InternLM2.5-20B-Chat": {
            DownloadSource.DEFAULT: "internlm/internlm2_5-20b-chat",
            DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm2_5-20b-chat",
-            DownloadSource.MODELERS: "Intern/internlm2_5-20b-chat"
+            DownloadSource.OPENMIND: "Intern/internlm2_5-20b-chat",
        },
    },
    template="intern2",

@@ -762,7 +769,7 @@ register_model_group(
        "Llama-3-8B-Chinese-Chat": {
            DownloadSource.DEFAULT: "shenzhi-wang/Llama3-8B-Chinese-Chat",
            DownloadSource.MODELSCOPE: "LLM-Research/Llama3-8B-Chinese-Chat",
-            DownloadSource.MODELERS: "HaM/Llama3-8B-Chinese-Chat",
+            DownloadSource.OPENMIND: "LlamaFactory/Llama3-Chinese-8B-Instruct",
        },
        "Llama-3-70B-Chinese-Chat": {
            DownloadSource.DEFAULT: "shenzhi-wang/Llama3-70B-Chinese-Chat",

@@ -967,7 +974,7 @@ register_model_group(
        "MiniCPM3-4B-Chat": {
            DownloadSource.DEFAULT: "openbmb/MiniCPM3-4B",
            DownloadSource.MODELSCOPE: "OpenBMB/MiniCPM3-4B",
-            DownloadSource.MODELERS: "LlamaFactory/MiniCPM3-4B"
+            DownloadSource.OPENMIND: "LlamaFactory/MiniCPM3-4B",
        },
    },
    template="cpm3",

@@ -1417,14 +1424,17 @@ register_model_group(
        "Qwen2-0.5B-Instruct": {
            DownloadSource.DEFAULT: "Qwen/Qwen2-0.5B-Instruct",
            DownloadSource.MODELSCOPE: "qwen/Qwen2-0.5B-Instruct",
+            DownloadSource.OPENMIND: "LlamaFactory/Qwen2-0.5B-Instruct",
        },
        "Qwen2-1.5B-Instruct": {
            DownloadSource.DEFAULT: "Qwen/Qwen2-1.5B-Instruct",
            DownloadSource.MODELSCOPE: "qwen/Qwen2-1.5B-Instruct",
+            DownloadSource.OPENMIND: "LlamaFactory/Qwen2-1.5B-Instruct",
        },
        "Qwen2-7B-Instruct": {
            DownloadSource.DEFAULT: "Qwen/Qwen2-7B-Instruct",
            DownloadSource.MODELSCOPE: "qwen/Qwen2-7B-Instruct",
+            DownloadSource.OPENMIND: "LlamaFactory/Qwen2-7B-Instruct",
        },
        "Qwen2-72B-Instruct": {
            DownloadSource.DEFAULT: "Qwen/Qwen2-72B-Instruct",

@@ -1707,11 +1717,12 @@ register_model_group(
        "Qwen2-VL-2B-Instruct": {
            DownloadSource.DEFAULT: "Qwen/Qwen2-VL-2B-Instruct",
            DownloadSource.MODELSCOPE: "qwen/Qwen2-VL-2B-Instruct",
-            DownloadSource.MODELERS: "LlamaFactory/Qwen2-VL-2B-Instruct"
+            DownloadSource.OPENMIND: "LlamaFactory/Qwen2-VL-2B-Instruct",
        },
        "Qwen2-VL-7B-Instruct": {
            DownloadSource.DEFAULT: "Qwen/Qwen2-VL-7B-Instruct",
            DownloadSource.MODELSCOPE: "qwen/Qwen2-VL-7B-Instruct",
+            DownloadSource.OPENMIND: "LlamaFactory/Qwen2-VL-7B-Instruct",
        },
        "Qwen2-VL-72B-Instruct": {
            DownloadSource.DEFAULT: "Qwen/Qwen2-VL-72B-Instruct",

@@ -1810,12 +1821,12 @@ register_model_group(
        "TeleChat-7B-Chat": {
            DownloadSource.DEFAULT: "Tele-AI/telechat-7B",
            DownloadSource.MODELSCOPE: "TeleAI/telechat-7B",
-            DownloadSource.MODELERS: "TeleAI/TeleChat-7B-pt"
+            DownloadSource.OPENMIND: "TeleAI/TeleChat-7B-pt",
        },
        "TeleChat-12B-Chat": {
            DownloadSource.DEFAULT: "Tele-AI/TeleChat-12B",
            DownloadSource.MODELSCOPE: "TeleAI/TeleChat-12B",
-            DownloadSource.MODELERS: "TeleAI/TeleChat-12B-pt",
+            DownloadSource.OPENMIND: "TeleAI/TeleChat-12B-pt",
        },
        "TeleChat-12B-v2-Chat": {
            DownloadSource.DEFAULT: "Tele-AI/TeleChat-12B-v2",

@@ -2034,7 +2045,7 @@ register_model_group(
        "Yi-1.5-6B-Chat": {
            DownloadSource.DEFAULT: "01-ai/Yi-1.5-6B-Chat",
            DownloadSource.MODELSCOPE: "01ai/Yi-1.5-6B-Chat",
-            DownloadSource.MODELERS: "LlamaFactory/Yi-1.5-6B-Chat"
+            DownloadSource.OPENMIND: "LlamaFactory/Yi-1.5-6B-Chat",
        },
        "Yi-1.5-9B-Chat": {
            DownloadSource.DEFAULT: "01-ai/Yi-1.5-9B-Chat",

@@ -232,28 +232,34 @@ def torch_gc() -> None:


def try_download_model_from_other_hub(model_args: "ModelArguments") -> str:
-    if (not use_openmind() and not use_modelscope()) or os.path.exists(model_args.model_name_or_path):
+    if (not use_modelscope() and not use_openmind()) or os.path.exists(model_args.model_name_or_path):
        return model_args.model_name_or_path

-    if use_openmind():
-        try:
-            from openmind.utils.hub import snapshot_download
-
-            return snapshot_download(model_args.model_name_or_path, revision=model_args.model_revision, cache_dir=model_args.cache_dir)
-        except ImportError:
-            raise ImportError("Please install openmind and openmind_hub via `pip install openmind -U`")
-
    if use_modelscope():
-        try:
-            from modelscope import snapshot_download
+        require_version("modelscope>=1.11.0", "To fix: pip install modelscope>=1.11.0")
+        from modelscope import snapshot_download

-            revision = "master" if model_args.model_revision == "main" else model_args.model_revision
-            return snapshot_download(model_args.model_name_or_path, revision=revision, cache_dir=model_args.cache_dir)
-        except ImportError:
-            raise ImportError("Please install modelscope via `pip install modelscope -U`")
+        revision = "master" if model_args.model_revision == "main" else model_args.model_revision
+        return snapshot_download(
+            model_args.model_name_or_path,
+            revision=revision,
+            cache_dir=model_args.cache_dir,
+        )
+
+    if use_openmind():
+        require_version("openmind>=0.8.0", "To fix: pip install openmind>=0.8.0")
+        from openmind.utils.hub import snapshot_download
+
+        return snapshot_download(
+            model_args.model_name_or_path,
+            revision=model_args.model_revision,
+            cache_dir=model_args.cache_dir,
+        )


-def use_openmind() -> bool:
-    return os.environ.get("USE_OPENMIND_HUB", "0").lower() in ["true", "1"]
-

def use_modelscope() -> bool:
    return os.environ.get("USE_MODELSCOPE_HUB", "0").lower() in ["true", "1"]


+def use_openmind() -> bool:
+    return os.environ.get("USE_OPENMIND_HUB", "0").lower() in ["true", "1"]

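One behavioral consequence of the reordering, sketched below under the assumption that both switches are enabled and the package is installed as `llamafactory`: ModelScope now wins over Modelers.

```python
# Sketch: with both hub switches on, the reordered checks in
# try_download_model_from_other_hub consult ModelScope first.
import os

os.environ["USE_MODELSCOPE_HUB"] = "1"
os.environ["USE_OPENMIND_HUB"] = "1"

from llamafactory.extras.misc import use_modelscope, use_openmind

assert use_modelscope() and use_openmind()
# Since use_modelscope() is tested first, a model ID would be fetched via
# modelscope's snapshot_download in this configuration.
```
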
@@ -123,7 +123,7 @@ def _check_extra_dependencies(
        require_version("mixture-of-depth>=1.1.6", "To fix: pip install mixture-of-depth>=1.1.6")

    if model_args.infer_backend == "vllm":
-        require_version("vllm>=0.4.3,<=0.6.3", "To fix: pip install vllm>=0.4.3,<=0.6.2")
+        require_version("vllm>=0.4.3,<=0.6.3", "To fix: pip install vllm>=0.4.3,<=0.6.3")

    if finetuning_args.use_galore:
        require_version("galore_torch", "To fix: pip install galore_torch")

@@ -109,15 +109,15 @@ def get_model_path(model_name: str) -> str:
        use_modelscope()
        and path_dict.get(DownloadSource.MODELSCOPE)
        and model_path == path_dict.get(DownloadSource.DEFAULT)
-    ):  # replace path
+    ):  # replace hf path with ms path
        model_path = path_dict.get(DownloadSource.MODELSCOPE)

    if (
        use_openmind()
-        and path_dict.get(DownloadSource.MODELERS)
+        and path_dict.get(DownloadSource.OPENMIND)
        and model_path == path_dict.get(DownloadSource.DEFAULT)
-    ):  # replace path
-        model_path = path_dict.get(DownloadSource.MODELERS)
+    ):  # replace hf path with om path
+        model_path = path_dict.get(DownloadSource.OPENMIND)

    return model_path

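The guard `model_path == path_dict.get(DownloadSource.DEFAULT)` makes the second branch a no-op once the first has fired, so the pair behaves like an if/elif. A condensed sketch of that logic with a toy path table (the helper signature here is simplified, not the WebUI's):

```python
# Condensed sketch of the WebUI path replacement: the default (Hugging Face)
# path is swapped for a ModelScope or Modelers path only when the matching
# switch is on. DownloadSource mirrors the enum renamed in this commit.
from enum import Enum

class DownloadSource(str, Enum):
    DEFAULT = "hf"
    MODELSCOPE = "ms"
    OPENMIND = "om"

def get_model_path(path_dict: dict, use_ms: bool, use_om: bool) -> str:
    model_path = path_dict.get(DownloadSource.DEFAULT)
    if use_ms and path_dict.get(DownloadSource.MODELSCOPE):
        model_path = path_dict[DownloadSource.MODELSCOPE]  # replace hf path with ms path
    elif use_om and path_dict.get(DownloadSource.OPENMIND):
        model_path = path_dict[DownloadSource.OPENMIND]  # replace hf path with om path
    return model_path

paths = {
    DownloadSource.DEFAULT: "Qwen/Qwen2-7B-Instruct",
    DownloadSource.OPENMIND: "LlamaFactory/Qwen2-7B-Instruct",
}
print(get_model_path(paths, use_ms=False, use_om=True))  # LlamaFactory/Qwen2-7B-Instruct
```
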