Commit Graph

7 Commits

Author SHA1 Message Date
leo-pony
acc52e0fe7 [npu] update cann base image and torch 2.4 (#7061)
* Update the base NPU container image version: Hugging Face Transformers requires Python >= 3.10

* Fix a bug: the INSTALL_DEEPSPEED arg must now be a string.

* Update Ascend CANN, CANN-Kernel and corresponding torch and torch-npu version

* Upgrade the package versions that torch-npu requires: torch==2.4.0 and torch-npu==2.4.0.post2
2025-02-25 23:32:01 +08:00
XYZliang
64414905a3 Increase shm_size to 16GB in docker-compose.yml to optimize shared memory allocation for large-scale model fine-tuning tasks.
This pull request increases the shm_size parameter in docker-compose.yml to 16GB. The goal is to enhance the LLaMA-Factory framework’s performance for large model fine-tuning tasks by providing sufficient shared memory for efficient data loading and parallel processing.

This PR also addresses the issues discussed in [this comment](https://github.com/hiyouga/LLaMA-Factory/issues/4316#issuecomment-2466270708) regarding the Shared Memory Limit error.
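
As a rough illustration of the change described in this commit message, a docker-compose.yml fragment might look like the sketch below. Only the `shm_size` key and its 16GB value come from the message; the service name `llamafactory` and the build settings are assumptions made for illustration, not the repository's actual file.

```yaml
services:
  llamafactory:              # hypothetical service name
    build:
      context: .             # assumed build context
      dockerfile: Dockerfile # assumed Dockerfile location
    shm_size: "16gb"         # size of /dev/shm inside the container
```

Docker gives a container 64MB of /dev/shm by default, which data-loader worker processes that exchange batches through shared memory can exhaust quickly.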
2024-11-13 10:13:59 +08:00
hiyouga
3af57795dd tiny fix 2024-10-11 23:51:54 +08:00
MengqingCao
106647a99d fix docker-compose path 2024-06-26 02:15:00 +00:00
hiyouga
efb81b25ec fix #4419 2024-06-25 01:51:29 +08:00
hoshi-hiyouga
721acd8768 Update docker-compose.yml 2024-06-25 00:54:28 +08:00
MengqingCao
d7207e8ad1 update docker files
1. add docker-npu (Dockerfile and docker-compose.yml)
2. move the CUDA Docker files to docker-cuda, with small changes to adapt to the new path
2024-06-24 10:57:36 +00:00