42 Commits

Author SHA1 Message Date
BUAADreamer
12c51655ce add llava and instructblip
Former-commit-id: 142fb6f4541a1acfefe66ff2574dabde53b00c06
2024-04-25 00:22:43 +08:00
hiyouga
0cb596fee1 add dpo mix dataset
Former-commit-id: 6def3f8bfa51b2d9d73af112352ce07db972e4c9
2024-04-20 01:31:38 +08:00
hiyouga
106a0104da fix #3247
Former-commit-id: bb67c66f80627805b585d157ba807c0ce378d3f2
2024-04-12 17:41:33 +08:00
hiyouga
d764cd8736 support ORPO
Former-commit-id: f44a4c27e2461cdaa1b16865f597a31033c0e6d9
2024-03-31 18:29:50 +08:00
hiyouga
5ed234ca63 add orca_dpo_pairs dataset
Former-commit-id: af683aacbae462a2a37d76d37df583e217664bd5
2024-03-20 20:09:06 +08:00
SirlyDreamer
6fc2d7e063 Follow HF_ENDPOINT environment variable
Former-commit-id: 22b36a3cfd2909cb624b1bb7385558eda504defe
2024-03-20 08:31:30 +00:00
hiyouga
7c492864e9 update parser
Former-commit-id: d98258aa08d93494ad50d7786064e7fda15f6ca9
2024-03-10 13:35:20 +08:00
hiyouga
62b6a7971a update data/readme
Former-commit-id: aa566e3cea5bc75688b4399a9da07be0b35b921c
2024-02-10 21:04:29 +08:00
hiyouga
1955a8ea5a improve aligner
Former-commit-id: cc7296b92e10c24967fc753393275b71d300683f
2024-02-10 16:39:19 +08:00
Mark Mueller
1ce82f391a Slim Orca data parsing
Former-commit-id: f2d8efede7e20edafed0d5446eb64f2d419949b1
2024-02-08 19:32:20 +01:00
hiyouga
5b8712d061 fix autoset attn impl, update data readme
Former-commit-id: 34a6e5f82baf45cc8dbb11f9f7ab4a480ab7ec5c
2024-01-31 11:58:07 +08:00
hiyouga
75be329994 fix #2282 and update tool prompt
Former-commit-id: 1c412f803866bde32b76f7c26c7b464b6b3651f3
2024-01-22 22:27:30 +08:00
hiyouga
fe4d93c6db add array param format
Former-commit-id: bf910f8a5b21ee552fa9ab069610a3f5f611de57
2024-01-21 22:17:48 +08:00
hiyouga
b7df920860 fix dataset
Former-commit-id: a7ce244a6d83d62f5bbecc588f1978e3791fd3b3
2024-01-18 12:59:30 +08:00
hiyouga
e4a424cb6a enable cutoff len
Former-commit-id: e9513d300c338dfcae98eee7d057bfd00da2da0e
2024-01-18 12:25:42 +08:00
hiyouga
d925ecae1b add models
Former-commit-id: 3a4728557304996bcbe58d7d6380beead7c63c70
2023-12-18 19:09:31 +08:00
hiyouga
934d00ea1e support system column #1765
Former-commit-id: f425584a511c5e42bae8b3ba090eaa898b28adad
2023-12-12 19:45:59 +08:00
hiyouga
f3ffa8310f fix #1784
Former-commit-id: 4e1af5a5d39d9e2f374c1372e2d67120c63fea09
2023-12-09 20:53:18 +08:00
hiyouga
92abe91d22 update dataset
Former-commit-id: a310b22b446118d90dd73906847ed3d01a574b50
2023-11-17 23:19:12 +08:00
hiyouga
7a3a0144a5 support full-parameter PPO
Former-commit-id: 4af967d69475e1c9fdf1a7983cd6b83bd431abff
2023-11-16 02:08:04 +08:00
hiyouga
48ec5355f9 add template, modify datasets
Former-commit-id: 81e54beb4d0f792f4fd7f450643caaf10f2f0b7d
2023-11-09 15:53:23 +08:00
hiyouga
065021d82a update data readme
Former-commit-id: 6a65ef44ed58714c611da60b5af96b85352e8735
2023-11-03 00:15:23 +08:00
hiyouga
4bb643e685 update data readme (zh)
Former-commit-id: b32fb3a984c681732b82f6544d6c05a98c34cf4c
2023-11-02 23:42:49 +08:00
hiyouga
b77c745b1a support sharegpt format, add datasets
Former-commit-id: 202daf8987ccb7523be03ca535b572b5c9e65994
2023-11-02 23:10:04 +08:00
hiyouga
e5b72c6a77 refactor dataset_attr, add eos in pt, fix #757
Former-commit-id: 0feec9a830b917b36686b61938a66e842eccf930
2023-09-01 19:00:45 +08:00
codemayq
a6662b73f5 add readme for dataset
Former-commit-id: bdcb0ea40e726e4c5752f938b379ed9a18e7e1d0
2023-08-23 19:55:45 +08:00
hiyouga
6310613699 update template
Former-commit-id: a95f3a4d62de1073a78125401cf4289ec0523156
2023-08-22 19:46:09 +08:00
Peter Pan
5cac87d317 add rm dataset explanation
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>

Former-commit-id: 1efb95025be6501f1b30b20e7c711d3590b5d1ee
2023-08-22 01:33:59 -04:00
hiyouga
ca719a8697 support DPO training (2305.18290)
Former-commit-id: 6d98de148e4af63a7028dfaeb6cf86eb56a4488f
2023-08-11 03:02:53 +08:00
hiyouga
cb4d1d5ebb restore from git lfs
Former-commit-id: 0c734a37113b773ae7c0bc8b8d1af39b15bc0fb2
2023-08-01 16:33:25 +08:00
hiyouga
9bba01a033 use git lfs
Former-commit-id: 4886d0071751f68c5a2d926bd9fcee0c93337322
2023-08-01 10:14:08 +08:00
hiyouga
030daca686 update dataset
Former-commit-id: 4a044aabbd19c92a9ae93c1c30536f5086fd47f9
2023-07-26 17:05:12 +08:00
hiyouga
c145bbef3c update dataset
Former-commit-id: 4fc2c3293d91d8464527ebd1ddabe572c8355616
2023-07-23 20:01:43 +08:00
hiyouga
a707f5b502 update readme, fix web ui postprocess
Former-commit-id: ba51ab3379100108f7b52a3c2444ccdd99e8a6ef
2023-07-22 14:29:22 +08:00
mrhan1993
8e6b7034fe add Chinese README based on GLM Efficient Tuning, add server_port to web
Former-commit-id: 29e3acd23eafd891667d7a860ec544a5b05d3c33
2023-07-21 16:57:58 +08:00
hiyouga
6d881f161b add datasets
Former-commit-id: 02e4b47dea1b25905c61f2ace88bab112610f021
2023-07-19 20:59:15 +08:00
hiyouga
90fa2dd935 add open assistant dataset
Former-commit-id: 1694cf3078d04a14bce96da04b9d8c52176b1044
2023-06-28 23:09:33 +08:00
hiyouga
7dc1f06a97 add belle multiturn dataset
Former-commit-id: ac907ae1c37969df3cd09d4ab5f3f7f352eb259c
2023-06-16 20:01:16 +08:00
hiyouga
4724ae3492 support RM metrics, add generating Args
Former-commit-id: c461c6190bc124e98dde7f3cf96a59ce40b26fb0
2023-06-12 15:48:48 +08:00
BUAADreamer
4adbb95b03 add some
Former-commit-id: 6982a53ed1f6f9fa03e99623b98fff56bf00317e
2023-06-11 18:55:53 +08:00
hiyouga
33fee45217 add pre-training script
Former-commit-id: 935d58de2b3a2eadc4f0bed28c3ad7dee32e9fd5
2023-05-29 21:37:22 +08:00
hiyouga
17024ebc1a Initial commit
Former-commit-id: 5ca8e1d63727e7bcb8cab16542c763c47e48184a
2023-05-28 18:09:04 +08:00