Interface ReinforcementTuningCodeExecutionRewardScorer

Scores parsed responses for code execution use cases.

interface ReinforcementTuningCodeExecutionRewardScorer {
    pythonCodeSnippet?: string;
}

Properties

pythonCodeSnippet?: string

Example Python code snippet that assigns a reward of 1 to an answer matching the user-provided reference answer in the per-prompt references map.
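As a minimal sketch of how this property might be populated, the block below redeclares the interface locally so it is self-contained, then builds a scorer object. The Python function name `score`, its parameters, and the `'answer'` key in the references map are assumptions made for illustration; this page does not specify the contract the snippet must follow.

```typescript
// Local copy of the interface documented on this page, so the sketch
// compiles on its own without importing any package.
interface ReinforcementTuningCodeExecutionRewardScorer {
    pythonCodeSnippet?: string;
}

// Hypothetical Python scoring snippet. The function name `score`, its
// parameters, and the 'answer' key are assumptions, not a documented contract.
const snippet: string = [
    "def score(answer, references):",
    "    # Reward 1 when the answer matches the reference answer, else 0.",
    "    return 1 if answer == references.get('answer') else 0",
].join("\n");

// Scorer carrying the Python snippet as a plain string.
const scorer: ReinforcementTuningCodeExecutionRewardScorer = {
    pythonCodeSnippet: snippet,
};

// The property is optional, so an empty object also satisfies the interface.
const emptyScorer: ReinforcementTuningCodeExecutionRewardScorer = {};
```

Because `pythonCodeSnippet` is optional, code that reads it should handle the `undefined` case before using the string.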