hiyouga 6b407092d9 update examples
Former-commit-id: 194e25606515bfa42c3be27d68f68d604191514b
2024-03-06 13:14:57 +08:00

Usage (run the scripts in each chain in the order shown):

  • pretrain.sh
  • sft.sh -> reward.sh -> ppo.sh
  • sft.sh -> dpo.sh -> predict.sh
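
The chains above can be run by hand, one script after another. As a rough sketch, a small wrapper could run a chain in order and stop as soon as one stage fails. The `run_chain` helper below is hypothetical and not part of the repository; it assumes the scripts sit in the current directory and that each stage should only start once the previous one has succeeded.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): run the given scripts
# in order and stop at the first one that fails.
run_chain() {
    local script
    for script in "$@"; do
        echo "==> running ${script}"
        # Abort the chain on failure; later stages are assumed to build on
        # the result of the earlier ones.
        bash "${script}" || {
            echo "==> ${script} failed; aborting chain" >&2
            return 1
        }
    done
}

# Example invocations, matching the chains listed above:
#   run_chain pretrain.sh
#   run_chain sft.sh reward.sh ppo.sh
#   run_chain sft.sh dpo.sh predict.sh
```

Stopping at the first failure avoids launching a later stage when the one before it did not finish.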