hiyouga 8d386775f2 update examples
Former-commit-id: d1587c80de2e3191952a952116039b719d8613d4
2024-03-06 13:14:57 +08:00

Usage:

  • pretrain.sh
  • sft.sh -> reward.sh -> ppo.sh
  • sft.sh -> dpo.sh -> predict.sh
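
The chains above can be driven by a small wrapper. This is a minimal sketch, not part of the examples themselves: it reads each arrow as "run the next script after the previous one succeeds", and it assumes the scripts sit in the current directory and run under bash. The helper name `run_pipeline` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run the stage scripts of one pipeline in order and stop at the
# first stage that is missing or exits with a non-zero status.
# Assumption: stage scripts are in the current directory and run under bash.
run_pipeline() {
  local stage
  for stage in "$@"; do
    # Refuse to continue if a stage script is not present.
    if [ ! -f "$stage" ]; then
      echo "missing stage script: $stage" >&2
      return 1
    fi
    echo "==> running $stage"
    # A failed stage aborts the rest of the chain.
    bash "$stage" || { echo "stage failed: $stage" >&2; return 1; }
  done
}

# Example invocations, one per usage line (uncomment the one you need):
# run_pipeline pretrain.sh
# run_pipeline sft.sh reward.sh ppo.sh
# run_pipeline sft.sh dpo.sh predict.sh
```

Stopping at the first failure avoids running a later stage when an earlier one did not finish. Whether a later stage picks up the earlier stage's output depends on the paths set inside each script, so check those before chaining.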