Optional composite
Optional hyper: Additional hyper-parameters to use during tuning.
Optional single: Single reward function configuration for reinforcement tuning.
Optional training: Cloud Storage path to file containing training dataset for tuning. The dataset must be formatted as a JSONL file.
Optional validation: Cloud Storage path to file containing validation dataset for tuning. The dataset must be formatted as a JSONL file. If no validation dataset is provided, by default the API splits 25% of the training dataset or 50 examples, whichever is larger, as the validation dataset.
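The default validation split described above can be estimated locally before submitting a job. The following is a minimal sketch, not part of the API: the helper names `count_jsonl_records` and `default_validation_size` are hypothetical, the 25% share is assumed to round down, and the behavior for training sets smaller than 50 examples is not stated in this reference, so it is not modeled here.

```python
import json


def count_jsonl_records(text: str) -> int:
    """Count records in JSONL text; raise ValueError on a line that is not valid JSON."""
    count = 0
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_no} is not valid JSON: {exc}") from exc
        count += 1
    return count


def default_validation_size(num_training_examples: int) -> int:
    """Estimate the default validation size: 25% of the data or 50, whichever is larger.

    Assumption: the 25% share is rounded down; the API's actual rounding is not documented here.
    """
    return max(num_training_examples // 4, 50)
```

For example, a training file with 1000 records would yield an estimated validation set of 250 examples, while a file with 100 records would hit the 50-example floor.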
Specification for reinforcement tuning.