Interface ReinforcementTuningSpec

Specification for a reinforcement tuning job.

Properties

Additional hyper-parameters to use during tuning.

Single reward function configuration for reinforcement tuning.

trainingDatasetUri?: string

Cloud Storage path to the file containing the training dataset for tuning. The dataset must be formatted as a JSONL file.
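
Since the dataset must be JSONL (one JSON value per line), a quick local check before uploading can catch malformed lines. The following is only a sketch: `findInvalidJsonlLines` is a hypothetical helper, not part of the SDK, and it checks JSON syntax only, because the record schema is not described here. Skipping blank lines is an assumption made for convenience.

```typescript
// Hypothetical helper (not part of the SDK): returns the 1-based numbers of
// lines that are not valid JSON. Blank lines are skipped (an assumption).
function findInvalidJsonlLines(text: string): number[] {
  const invalid: number[] = [];
  const lines = text.split(/\r?\n/);
  lines.forEach((line, index) => {
    if (line.trim() === "") {
      return; // tolerate blank lines, e.g. a trailing newline
    }
    try {
      JSON.parse(line);
    } catch {
      invalid.push(index + 1);
    }
  });
  return invalid;
}

// Example input with one malformed line (line 2).
const sampleJsonl = '{"a":1}\n{bad}\n{"b":2}\n';
const invalidLines = findInvalidJsonlLines(sampleJsonl);
console.log(invalidLines);
```

An empty result means every non-blank line parsed as JSON; it does not guarantee that the tuning service will accept the records.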

validationDatasetUri?: string

Cloud Storage path to the file containing the validation dataset for tuning. The dataset must be formatted as a JSONL file. If no validation dataset is provided, the API by default sets aside 25% of the training dataset or 50 examples, whichever is larger, as the validation dataset.
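
To make the default split rule concrete, the sketch below estimates how many examples would be used for validation when `validationDatasetUri` is omitted. It is an illustration under assumptions, not the service's implementation: `DatasetFields` and `estimateDefaultValidationSize` are hypothetical names, the bucket paths are placeholders, and rounding the 25% share up is an assumption (the rounding behavior, and what happens for training sets smaller than 50 examples, are not stated here).

```typescript
// Hypothetical local type mirroring only the two dataset fields documented above.
interface DatasetFields {
  trainingDatasetUri?: string;
  validationDatasetUri?: string;
}

// Estimate of the default validation size: the larger of 25% of the training
// examples or 50 examples. Rounding up with Math.ceil is an assumption.
function estimateDefaultValidationSize(trainingExampleCount: number): number {
  const quarter = Math.ceil(trainingExampleCount * 0.25);
  return Math.max(quarter, 50);
}

// Placeholder Cloud Storage path; no validation dataset is supplied, so the
// default split would apply.
const datasetFields: DatasetFields = {
  trainingDatasetUri: "gs://example-bucket/train.jsonl",
};

if (datasetFields.validationDatasetUri === undefined) {
  console.log(estimateDefaultValidationSize(1000)); // 25% of 1000 is 250, which exceeds 50
  console.log(estimateDefaultValidationSize(100)); // 25% of 100 is 25, so the 50-example floor applies
}
```

Supplying an explicit `validationDatasetUri` avoids depending on the default split.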