LLaMA-Factory/examples/lora_single_gpu/README.md
2024-03-05 03:16:35 +08:00

Usage (run the scripts in each chain from left to right):

  • pretrain.sh
  • sft.sh -> reward.sh -> ppo.sh
  • sft.sh -> dpo.sh -> predict.sh
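
The chains above could be run with a small wrapper like the one below. This is only a sketch: `run_chain` is a hypothetical helper, not part of the repository, and it assumes the scripts sit in the current directory and that each one exits with a non-zero status on failure.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of LLaMA-Factory): run stage scripts in order
# and stop at the first one that fails, so later stages never start from a
# missing or incomplete earlier stage.
# Assumption: the scripts live in the current directory.
run_chain() {
  local script
  for script in "$@"; do
    echo "running ${script}"
    if ! bash "./${script}"; then
      echo "${script} failed, stopping chain" >&2
      return 1
    fi
  done
}

# Example chains from the usage list (uncomment one):
# run_chain pretrain.sh
# run_chain sft.sh reward.sh ppo.sh
# run_chain sft.sh dpo.sh predict.sh
```

Stopping at the first failure is a deliberate choice here: a later stage in a chain is only meaningful once the earlier one has completed.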