diff --git a/README.md b/README.md
index 852ae132..88b00ec8 100644
--- a/README.md
+++ b/README.md
@@ -12,6 +12,8 @@
 ## Changelog
 
+[23/07/31] Now we support dataset streaming. Try `--streaming` and `--max_steps 100` arguments to stream your dataset.
+
 [23/07/19] Now we support training the **LLaMA-2** models in this repo. Try `--model_name_or_path meta-llama/Llama-2-7b-hf` argument to use the LLaMA-2 model. Remember to use `--template llama2` argument when you are using the LLaMA-2-chat model.
 
 [23/07/18] Now we develop an all-in-one Web UI for training, evaluation and inference. Try `train_web.py` to fine-tune models in your Web browser. Thank [@KanadeSiina](https://github.com/KanadeSiina) and [@codemayq](https://github.com/codemayq) for their efforts in the development.
diff --git a/README_zh.md b/README_zh.md
index 4ca9194d..6b4773eb 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -12,6 +12,8 @@
 ## 更新日志
 
+[23/07/31] 现在我们支持了训练数据流式加载。请尝试使用 `--streaming` 和 `--max_steps 100` 参数来流式加载数据集。
+
 [23/07/19] 现在我们支持了 **LLaMA-2** 模型的训练。请尝试使用 `--model_name_or_path meta-llama/Llama-2-7b-hf` 参数。请注意使用 LLaMA-2-chat 模型需要添加 `--template llama2` 参数。
 
 [23/07/18] 我们开发了支持训练和测试的浏览器一键微调界面。请尝试使用 `train_web.py` 在您的浏览器中微调模型。感谢 [@KanadeSiina](https://github.com/KanadeSiina) 和 [@codemayq](https://github.com/codemayq) 在该功能开发中付出的努力。
diff --git a/src/llmtuner/tuner/core/parser.py b/src/llmtuner/tuner/core/parser.py
index 38c13f76..be515288 100644
--- a/src/llmtuner/tuner/core/parser.py
+++ b/src/llmtuner/tuner/core/parser.py
@@ -108,6 +108,8 @@ def get_train_args(
         logger.warning("`dev_ratio` is incompatible with `streaming`. Disabling development set.")
         data_args.dev_ratio = 0
 
+    assert not (training_args.max_steps == -1 and data_args.streaming), "Please specify `max_steps` in streaming mode."
+
     training_args.optim = "adamw_torch" if training_args.optim == "adamw_hf" else training_args.optim # suppress warning
 
     if model_args.quantization_bit is not None:
@@ -119,10 +121,10 @@ def get_train_args(
             model_args.compute_dtype = torch.float32
 
     # Log on each process the small summary:
-    logger.info(
-        f"Process rank: {training_args.local_rank}, device: {training_args.device}, n_gpu: {training_args.n_gpu}\n"
-        + f" distributed training: {bool(training_args.local_rank != -1)}, 16-bits training: {training_args.fp16}"
-    )
+    logger.info("Process rank: {}, device: {}, n_gpu: {}\n distributed training: {}, 16-bits training: {}".format(
+        training_args.local_rank, training_args.device, training_args.n_gpu,
+        bool(training_args.local_rank != -1), training_args.fp16
+    ))
     logger.info(f"Training/evaluation parameters {training_args}")
 
     # Set seed before initializing model.
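
As context for the assertion added to `get_train_args` (not part of the diff itself): with `--streaming`, the dataset is loaded as an iterable of unknown length, so the number of training steps cannot be derived from the epoch count and must be supplied explicitly via `--max_steps`. Below is a minimal standalone sketch of the same guard; `check_streaming_args` and the `SimpleNamespace` stand-ins are illustrative only, not names from the repository.

```python
from types import SimpleNamespace

def check_streaming_args(training_args, data_args):
    # Same guard as the one added in the diff: a streamed dataset has no known
    # length, so the trainer cannot infer the step count from epochs and needs
    # an explicit `max_steps` (its default in TrainingArguments is -1).
    assert not (training_args.max_steps == -1 and data_args.streaming), \
        "Please specify `max_steps` in streaming mode."

# Passes: streaming with an explicit step budget, e.g. `--streaming --max_steps 100`.
check_streaming_args(SimpleNamespace(max_steps=100), SimpleNamespace(streaming=True))

# Would raise AssertionError: streaming with `max_steps` left at its default of -1.
# check_streaming_args(SimpleNamespace(max_steps=-1), SimpleNamespace(streaming=True))
```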