Interface SingleReinforcementTuningRewardConfig

Configuration for a single reinforcement tuning reward.

Properties

Scores parsed responses for autorater use cases by using a model to compute the reward.

Scores parsed responses by calling a Cloud Run service.

Scores parsed responses for code execution use cases.

Defines how to parse a sample response.

rewardName?: string

A unique reward name used to identify each single reinforcement tuning reward.

Scores parsed responses for simple string-matching use cases by comparing them against a reference answer, without writing Python code.
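
Because rewardName is described as unique per reward, it can be worth checking a set of reward configs for repeated names before submitting them. The following is a minimal sketch, not part of the SDK. RewardConfigSketch is a local stand-in that mirrors only the rewardName field shown on this page, and findDuplicateRewardNames is a hypothetical helper. The example name values are made up for illustration. Whether the service itself rejects duplicate names is not stated here.

```typescript
// Local stand-in (assumption): mirrors only the `rewardName` field documented
// on this page. The real interface has additional scorer and parsing properties.
interface RewardConfigSketch {
  rewardName?: string;
}

// Hypothetical helper: returns each reward name that appears more than once.
function findDuplicateRewardNames(configs: RewardConfigSketch[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const config of configs) {
    const name = config.rewardName;
    if (name === undefined) {
      continue; // rewardName is optional, so unnamed configs are skipped
    }
    if (seen.has(name)) {
      duplicates.add(name);
    }
    seen.add(name);
  }
  return Array.from(duplicates);
}

// Illustrative values only; these names are not defined by the SDK.
const rewards: RewardConfigSketch[] = [
  { rewardName: "exact-match" },
  { rewardName: "unit-tests" },
  { rewardName: "exact-match" },
];

const duplicateNames = findDuplicateRewardNames(rewards);
console.log(duplicateNames);
```

Configs without a rewardName are ignored by this check. A stricter variant could treat a missing name as an error.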