423A35C7 / LLaMA-Factory
Mirror of https://github.com/hiyouga/LLaMA-Factory.git, synced 2025-08-06 13:42:51 +08:00
LLaMA-Factory / src / llamafactory / data / processors
History
da39715085
After extensive continual pre-training and comparative experiments, this bug was found: during pre-training, Llama 3's tokenizer.eos_token is '<|end_of_text|>', so that same token must also be appended after every data sample here, rather than '<|eot_id|>'; otherwise severe performance degradation is very likely.
...
Former-commit-id: 6979f3f8480755604d8aea8164f6418126e094c5
2024-06-11 16:23:40 +08:00
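The commit message above describes appending the pre-training EOS token after each sample when packing data. A minimal sketch of that pattern, independent of the repository's actual code: the token IDs below (128001 for '<|end_of_text|>', 128009 for '<|eot_id|>') match the Llama 3 vocabulary, and the function name is hypothetical.

```python
# Hypothetical sketch of the fix described in the commit: when packing
# pre-training samples, terminate each one with the tokenizer's
# pre-training EOS id ('<|end_of_text|>' for Llama 3), not the
# chat-turn delimiter '<|eot_id|>'.
END_OF_TEXT_ID = 128001  # '<|end_of_text|>' in the Llama 3 vocab
EOT_ID = 128009          # '<|eot_id|>' -- chat-turn delimiter, wrong for pre-training

def pack_pretrain_samples(tokenized_samples, eos_id=END_OF_TEXT_ID):
    """Concatenate tokenized samples, appending eos_id after each one."""
    packed = []
    for ids in tokenized_samples:
        packed.extend(ids)
        packed.append(eos_id)  # every sample ends with the pre-training EOS
    return packed

samples = [[1, 2, 3], [4, 5]]
print(pack_pretrain_samples(samples))  # [1, 2, 3, 128001, 4, 5, 128001]
```

Using '<|eot_id|>' here would teach the model a document boundary it never saw in original pre-training, which is the mismatch the commit blames for the performance drop.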
__init__.py | refactor data preprocessing, fix mllm rlhf | 2024-05-24 04:08:25 +08:00
feedback.py | update data processors | 2024-06-07 04:15:40 +08:00
pairwise.py | update data processors | 2024-06-07 04:15:40 +08:00
pretrain.py | After extensive continual pre-training and comparative experiments, this bug was found: during pre-training, Llama 3's tokenizer.eos_token is '<|end_of_text|>', so that same token must also be appended after every data sample here, rather than '<|eot_id|>'; otherwise severe performance degradation is very likely. | 2024-06-11 16:23:40 +08:00
processor_utils.py | update data processors | 2024-06-07 04:15:40 +08:00
supervised.py | update data processors | 2024-06-07 04:15:40 +08:00
unsupervised.py | update data processors | 2024-06-07 04:15:40 +08:00