hiyouga 8cf9842f7a add examples
Former-commit-id: 76f31b18eb4d3724f96ea1bad10073677daee36d
2024-03-05 03:16:35 +08:00

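Usage (run the scripts in each chain from left to right):

  • pretrain.sh (pre-training, standalone)
  • sft.sh -> reward.sh -> ppo.sh (supervised fine-tuning, then reward model training, then PPO)
  • sft.sh -> dpo.sh -> predict.sh (supervised fine-tuning, then DPO, then prediction)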
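
Each chain above is meant to be run in order, and a later stage is only useful if the earlier one finished successfully. A minimal sketch of a wrapper that enforces this is shown below. The `run_chain` function is a hypothetical helper, not part of the repository, and it assumes the scripts sit in the current directory and can be run with `bash`.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the repository): run the given scripts
# in order and stop at the first one that exits with a non-zero status.
run_chain() {
  local script
  for script in "$@"; do
    echo "running ${script}"
    bash "${script}" || { echo "failed: ${script}" >&2; return 1; }
  done
}

# Example: the SFT -> reward model -> PPO chain from the usage list.
# Uncomment to run; assumes the three scripts are in the current directory.
# run_chain sft.sh reward.sh ppo.sh
```

Stopping at the first failure avoids launching a later stage against output that an earlier stage never produced.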