1839 Commits

Author SHA1 Message Date
hiyouga
a0df8be4e8 fix packing for eager/sdpa attn
Former-commit-id: 735a033ceb7f2da6da71d138ea091d8a665411a9
2024-07-04 01:52:43 +08:00
hoshi-hiyouga
9dcdaee09c Merge pull request #4224 from chuan298/main
Implement efficient packing without cross-contamination attention

Former-commit-id: ac382cc9fe4ec483658fd54f07f9a123788ce1b1
2024-07-04 01:18:54 +08:00
hiyouga
bd294e7cc3 update packing
Former-commit-id: f3d9c31efa0e64317bdd5b4ed6f78653cf3b5ba4
2024-07-04 01:10:55 +08:00
hoshi-hiyouga
d124ce001b Update packing.py
Former-commit-id: 3cc11aa88839c5b99cfd83d9225770a33d0eb6fd
2024-07-03 23:36:01 +08:00
hiyouga
f849d03533 update func name
Former-commit-id: ed93ac0829fa656194fd32e1ac063843f475746f
2024-07-03 23:29:33 +08:00
hiyouga
7c08a4a82a update arg name
Former-commit-id: 1509ed550b2060f946ce20e3c5a9e5c49e86e3ab
2024-07-03 23:23:24 +08:00
hiyouga
fe888a9073 update hparams
Former-commit-id: 1c4feac44192b1f540208837f5a530b0d3f5fb37
2024-07-03 23:18:58 +08:00
hiyouga
1c8d199740 update ui
Former-commit-id: b1522a3c0951e2e57f873dc6c758aaed33ca374e
2024-07-03 23:13:49 +08:00
hiyouga
d5350d51ef test
Former-commit-id: 610eea0c0a0069fdc9148620b15ffffcfef731ea
2024-07-03 23:05:39 +08:00
hiyouga
b71b7c6a63 update scripts
Former-commit-id: 6dd6bae598d4d0b7b7d80341e88e313e49a49c00
2024-07-03 20:07:44 +08:00
hiyouga
767aae4b72 fix #4609
unwrap_model_for_generation(reward_model) is necessary for zero3 training

Former-commit-id: c8d5b21700577cae8d6ca03359bcf1762c8b7cb8
2024-07-03 19:45:51 +08:00
hiyouga
53b5addb33 update readme
Former-commit-id: 4b5f05b791fce9fdc4678598d7be8dc954f9ff73
2024-07-03 19:39:05 +08:00
hoshi-hiyouga
0271da6054 Merge pull request #4662 from wzh1994/wzh/readme
Add `LazyLLM` to `Projects using LLaMA Factory` in `README.md`

Former-commit-id: 5ac6334cc40cefda91f5344f60ec0d4757d76df4
2024-07-03 15:51:02 +08:00
wangzhihong
440b04aa4f Update README_zh.md
Former-commit-id: d4036add433989ad88d54895b6f5af90b393c009
2024-07-03 14:59:09 +08:00
wangzhihong
e8e8968467 add LazyLLM to Projects using LLaMA Factory in README.md
Former-commit-id: e1d8587ea120ad356df35431f84af92193fcbaf3
2024-07-03 11:12:20 +08:00
hiyouga
e8a1dc2785 tiny fix
Former-commit-id: d944020257f363f38e62de6279b337e399b7c65e
2024-07-03 02:31:50 +08:00
hiyouga
3f2b9e9326 tiny fix
Former-commit-id: 98c4a0af6b3e27ae393d2847f48a01d23d9c8780
2024-07-02 23:06:13 +08:00
hiyouga
1aaee45a94 remove rlhf support for chatglm2&3
Former-commit-id: bcbb5b71961b89719bffb0d202c431c82e6067cc
2024-07-02 23:03:17 +08:00
hiyouga
6b755749b9 upcast logits
Former-commit-id: df61660351c8af30591471807a20869a45bb055a
2024-07-02 22:32:05 +08:00
hiyouga
ca106d1f1b improve rlhf
Former-commit-id: e441780e3db256ca09a442ea9254e7ce16898a07
2024-07-02 22:23:08 +08:00
ancv
260f55ea47 move efficient_packing from data_args to model_args
Former-commit-id: 7b61659c707480bcf8c802c73e10d12ad5b9b965
2024-07-02 18:37:55 +07:00
hiyouga
fe7a181eb8 Update bug-report.yml
Former-commit-id: b92636feff19f144850d7741d8f3fa9fcfdb0580
2024-07-02 19:18:56 +08:00
hiyouga
eb93a6dbce Update bug-report.yml
Former-commit-id: dc04e33b17dfb798eaee137eef08879a0b7114c7
2024-07-02 19:16:12 +08:00
hoshi-hiyouga
5a1f8a7888 Merge pull request #4651 from hzhaoy/add-telechat-1b
Add TeleChat-1B

Former-commit-id: 2da64665d3da9dc0084bb782c65e88bac21f45a1
2024-07-02 17:56:43 +08:00
hzhaoy
1df3f02aca add TeleChat-1B
Former-commit-id: 1b81b43fc483a21e0c2985b98459ecf5137aa4c4
2024-07-02 17:49:04 +08:00
hiyouga
8c3b285da2 fix ppo callbacks
Former-commit-id: 54f1c67c2a802b1d8368a6d1837d4c9a729f2695
2024-07-02 17:34:56 +08:00
hoshi-hiyouga
9174675ba9 Merge branch 'main' into main
Former-commit-id: 7be442f37d53a0c6324728fa1fa8e2c84d7f0fa5
2024-07-01 21:01:09 +08:00
hiyouga
14b37e1e03 tiny fix
Former-commit-id: 5dd2e5c3323f56420b5845a5ed28bcd9d4da5e41
2024-07-01 05:43:17 +08:00
hiyouga
711ffd0aaf tiny fix
Former-commit-id: 19e43c3a9ed771e991cb273d394ab28fb923f868
2024-07-01 03:55:20 +08:00
hiyouga
8baf04d772 add eval acc
Former-commit-id: 7ffde76fbfb6192e3aac31ccc098f31ce89181ae
2024-07-01 03:51:20 +08:00
hiyouga
92607846d0 Update label_issue.yml
Former-commit-id: fffa3defdda02ad579cb703c0704f94bad94f21a
2024-07-01 01:29:09 +08:00
hiyouga
a43f518389 fix #4402 #4617
Deprecate reserved_label_len arg

Former-commit-id: 4b6568984c0be4b31e7aa91b7c0d52b7f7b12b0b
2024-07-01 01:19:27 +08:00
hiyouga
9988b1b029 update readme
Former-commit-id: 7998d969bf942c91cf41a189e3941f6e04c81c6f
2024-07-01 00:22:52 +08:00
hiyouga
35c65ddf8c fix #4398 #4592
Former-commit-id: 8c92d268903c00392c8bd75a731daa1f107d6202
2024-06-30 21:28:51 +08:00
hiyouga
9a0723143a update npu docker
Former-commit-id: 2f4d5174205605b8821d4fb626283e07694ecf80
2024-06-30 21:05:31 +08:00
hiyouga
f7a4f3d9c0 loose gemma2 attention
Former-commit-id: a0b645017a2de3d58b6cbc71bd91ec96fc7a818b
2024-06-29 01:42:14 +08:00
hiyouga
b9f2c6e64e update readme
Former-commit-id: 9f809c311af373508cb51b204ae54b047729a9dc
2024-06-28 06:55:19 +08:00
hiyouga
6ce0b5891b bf16 by default, gemma2 attns
Gemma2 finetuning cannot work until merging https://github.com/huggingface/transformers/pull/31674

Former-commit-id: da66c32c7be0adc28d2185b23e9f62d56acb961c
2024-06-28 06:00:26 +08:00
hiyouga
0bd6bcd95f increase pissa_iter for stability
Former-commit-id: 03f8d9b0fb10ae58e7f68508197330d616957899
2024-06-28 03:18:54 +08:00
hiyouga
7705df9dad fix docker flashattn
Former-commit-id: 0966f5d4616a3877a6b921976dc39e8799831d36
2024-06-28 01:28:59 +08:00
hiyouga
81094dc09a add Gemma2 models
Former-commit-id: 8fc5a248ecfd6861cb90dac6c14fe89cdeaf8921
2024-06-28 01:26:50 +08:00
hiyouga
71b8bb6037 update examples
Former-commit-id: 66f248b90cfa2b29c73060459b2337b78154c47b
2024-06-28 01:17:07 +08:00
hiyouga
884a4a33ee refactor pissa, improve llamaboard
Former-commit-id: 619556e46c19718f702c97df5d570a2a4c5fb13a
2024-06-28 01:04:24 +08:00
hoshi-hiyouga
bf8855f90b Merge pull request #4580 from hzhaoy/bugfix-deepspeed-pissa
Fix bug when using pissa method with deepspeed

Former-commit-id: f260d458f91d6d2b4ed141f64844cded11d5aaad
2024-06-28 00:46:51 +08:00
hiyouga
b588a099db fix #4549
Former-commit-id: c9fdef10de737d1f433209812ef73e29cb60490a
2024-06-28 00:41:58 +08:00
hiyouga
52ab77d008 fix docker file
Former-commit-id: 688f02decb1185deb74b26444f7643cab7d355c1
2024-06-27 20:29:16 +08:00
hiyouga
9805350811 tiny fix
Former-commit-id: c1a78a3a9f8ab9d57577cee37f9c457d60863ba2
2024-06-27 20:14:48 +08:00
hoshi-hiyouga
a70530d8f5 Merge pull request #4590 from injet-zhou/main
Exit the process with the subprocess's return code when utilizing the CLI

Former-commit-id: c6a8a7f239d7aa7c74ba09d55a24d4416181cc02
2024-06-27 20:09:36 +08:00
hoshi-hiyouga
2e8a018ea6 Merge pull request #4461 from hzhaoy/feature/support-flash-attn
support flash-attn in Dockerfile

Former-commit-id: e30a47ab5bda9303c8a2eb814caf0dd40c01b125
2024-06-27 20:05:26 +08:00
hoshi-hiyouga
9999804b4c Merge pull request #4561 from hashstone/fix-docker-npu
fix torch-npu dependency

Former-commit-id: 14867c5cf8be3a5e8a91a6533a615d32d298fd67
2024-06-27 19:58:16 +08:00