Merge pull request #6152 from hiyouga/hiyouga/add_num_proc_in_data_load

[data] add num_proc in load_dataset

Former-commit-id: b26c490ac3a0a8a6342f940eb6ccb7b8b6d78f93
hoshi-hiyouga 2024-11-27 00:16:15 +08:00 committed by GitHub
commit 6cd90efb82

@@ -128,6 +128,7 @@ def _load_single_dataset(
     cache_dir=model_args.cache_dir,
     token=model_args.hf_hub_token,
     streaming=data_args.streaming,
+    num_proc=data_args.preprocessing_num_workers,
     trust_remote_code=True,
 )
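
Review note: the added line forwards data_args.preprocessing_num_workers to load_dataset unconditionally, on the same call that also receives streaming=data_args.streaming. To my knowledge the Hugging Face datasets library does not support num_proc together with streaming=True and raises an error for that combination; this is worth verifying against the installed version. A minimal sketch of a guard follows. The helper name resolve_num_proc is hypothetical and is not part of this commit.

```python
from typing import Optional


def resolve_num_proc(streaming: bool, preprocessing_num_workers: Optional[int]) -> Optional[int]:
    """Hypothetical helper (not in the commit): choose the num_proc value to pass on.

    Assumption: parallel loading via num_proc is only usable for
    non-streaming loads, so streaming loads fall back to None.
    """
    if streaming:
        # Streaming datasets are read lazily; do not request worker processes.
        return None
    # Non-streaming: forward the configured worker count (may itself be None).
    return preprocessing_num_workers


if __name__ == "__main__":
    print(resolve_num_proc(streaming=False, preprocessing_num_workers=16))
    print(resolve_num_proc(streaming=True, preprocessing_num_workers=16))
```

With such a guard the call site would pass num_proc=resolve_num_proc(data_args.streaming, data_args.preprocessing_num_workers) in place of the raw field, leaving the other arguments unchanged.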