118 Commits

Author SHA1 Message Date
hoshi-hiyouga
7a04021d04 [optim] clean apollo (#6645)
* clean apollo code

* update readme
2025-01-15 01:42:50 +08:00
zhuHQ
d9189f9f0b [optim] add support to APOLLO (#6617) 2025-01-15 00:24:56 +08:00
hoshi-hiyouga
e3e2c8c689 [inference] fix stop token for object detection (#6624)
* fix stop token

* update minicpm data pipeline

* fix npu qlora examples
2025-01-13 21:34:20 +08:00
codingma
03de5ac912 add nf4 qlora support on Ascend NPU (#6601)
* add nf4 qlora support on Ascend NPU

* add transformers version check

* add python>=3.10 requirement description for npu

* tiny fix

---------

Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>
2025-01-13 19:43:36 +08:00
hiyouga
f6f630a1c9 refactor mllm param logic 2025-01-10 15:45:48 +00:00
hiyouga
d8cac6f546 refactor ray integration, support save ckpt 2025-01-07 09:39:10 +00:00
Eric Tang
1e8e7be0a5 run style check 2025-01-07 08:55:44 +00:00
Kourosh Hakhamaneshi
163ddb680b drafting ray integration
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
2025-01-07 08:55:44 +00:00
Yaser Afshar
1c8ad22a5f Add missing key to init_kwargs 2024-12-17 12:34:05 +00:00
Yaser Afshar
0943776326 Add trust_remote_code parameter and remove True
- Introduced a new model parameter `trust_remote_code`
- Set the default value of `trust_remote_code` to `False`
  to enhance security
2024-12-17 12:25:12 +00:00
hiyouga
7059055e89 update assets 2024-12-14 17:36:03 +00:00
hiyouga
2811814fc4 fix mrope 2024-12-12 15:08:17 +00:00
hiyouga
99c62660c6 support qwen2vl train proj only 2024-12-05 10:37:42 +00:00
hiyouga
e5584dc7ba update examples 2024-12-05 08:48:25 +00:00
hiyouga
1324d158f9 support batch infer in vllm 2024-12-04 13:50:00 +00:00
hiyouga
58ab4579dc add vllm config 2024-11-10 21:28:18 +08:00
hiyouga
93d3b8f43f update tests 2024-11-02 12:41:44 +08:00
hiyouga
21db8ed2f4 use pre-commit 2024-10-29 09:07:46 +00:00
hiyouga
94d5b1bd8f add e2e tests 2024-09-05 21:52:28 +08:00
hiyouga
8e49940746 add rlhf-v dataset 2024-09-01 22:57:41 +08:00
hiyouga
a025c3df61 remove visual_inputs, fix qlora 2024-08-31 00:24:51 +08:00
hiyouga
e08045a946 add examples 2024-08-30 21:43:19 +08:00
hiyouga
3382317e32 refactor mm training 2024-08-30 02:14:31 +08:00
simonJJJ
aeb85f200b initial-commit 2024-08-28 16:51:35 +08:00
hiyouga
0a690ada6f update examples 2024-08-09 20:13:46 +08:00
hiyouga
e2a28f51c6 add adam_mini to readme 2024-08-09 20:02:03 +08:00
hiyouga
86f7099fa3 update scripts 2024-08-09 19:16:23 +08:00
hiyouga
c87023d539 follow #5115 2024-08-09 18:03:00 +08:00
codingma
823e7c122b fix eval_dataset in example 2024-08-07 18:24:19 +08:00
hiyouga
1bbd49faae fix #4944 2024-07-24 16:42:51 +08:00
hoshi-hiyouga
91ba083f37 Update llama3_lora_eval.yaml 2024-07-15 22:55:12 +08:00
codingma
645211dc01 1. change the task name format
2. delete split param in data_args.py
2024-07-15 09:55:33 +08:00
hiyouga
29ebcd75d5 fix up 2024-07-15 01:04:56 +08:00
hoshi-hiyouga
f618b80fa2 Update llava1_5.yaml 2024-07-13 20:30:06 +08:00
codingma
982a1cdd24 1. fix output_dir in llama3_lora_pretrain.yaml
2. add llava1_5.yaml for inference
2024-07-13 13:16:22 +08:00
hiyouga
c9bb0757ec update pissa example 2024-07-06 15:47:32 +08:00
hiyouga
2f78b5d62a update examples 2024-06-28 01:17:07 +08:00
hiyouga
d417e63f92 update examples 2024-06-27 00:53:33 +08:00
hiyouga
095fab58d3 tiny fix about badam 2024-06-25 01:54:53 +08:00
Jonery
97c5235160 add example 2024-06-18 13:50:26 +08:00
hiyouga
2bf2863a58 tiny fix 2024-06-17 17:47:25 +08:00
hiyouga
8c1046d78a support pissa 2024-06-16 01:08:12 +08:00
hiyouga
2d43b8bb49 Update README.md 2024-06-13 16:02:21 +08:00
hiyouga
892e561c28 update examples 2024-06-13 03:26:10 +08:00
hiyouga
a19cdd39fe Update llama3_full_sft_ds3.yaml 2024-06-13 03:16:20 +08:00
hiyouga
b6e008c152 update examples 2024-06-13 03:15:06 +08:00
hiyouga
cae4737907 lora modules: all by default 2024-06-06 03:53:28 +08:00
hiyouga
dc4a00dd63 update train hparams 2024-06-06 01:49:20 +08:00
hiyouga
5a13b3baa6 tiny fix 2024-06-04 00:31:10 +08:00
hiyouga
eed33862bc fix #4005 #4013 2024-06-03 19:12:29 +08:00