operations()
  Returns the operations Resource.
close()
  Close httplib2 connections.
create(parent, body=None, x__xgafv=None)
  Creates an Evaluation Item.
delete(name, x__xgafv=None)
  Deletes an Evaluation Item.
get(name, x__xgafv=None)
  Gets an Evaluation Item.
list(parent, filter=None, orderBy=None, pageSize=None, pageToken=None, x__xgafv=None)
  Lists Evaluation Items.
list_next()
  Retrieves the next page of results (see the paging sketch after this summary).
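The list() and list_next() methods follow the standard google-api-python-client paging pattern. The sketch below is illustrative only: it assumes this reference was generated for the Vertex AI ("aiplatform") discovery document, and the project/location and the "evaluationItems" response key are placeholders to adjust for your environment.

    from googleapiclient.discovery import build

    # Build the client and navigate to the evaluationItems resource.
    # "aiplatform" / "v1beta1" are assumptions; use the API name and version
    # this reference was generated for.
    service = build("aiplatform", "v1beta1")
    items_api = service.projects().locations().evaluationItems()

    parent = "projects/my-project/locations/us-central1"  # placeholder parent
    request = items_api.list(parent=parent, pageSize=50)
    while request is not None:
        response = request.execute()
        # "evaluationItems" is the assumed name of the list field in the response.
        for item in response.get("evaluationItems", []):
            print(item.get("name"), item.get("displayName"))
        # list_next() returns None when there are no further pages.
        request = items_api.list_next(previous_request=request, previous_response=response)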
close()
Close httplib2 connections.
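close() releases the underlying httplib2 connections. In recent versions of google-api-python-client the service object can also be used as a context manager, which calls close() automatically on exit; the sketch below reuses the assumed API name/version from the paging example above.

    from googleapiclient.discovery import build

    # Context-manager form: close() is called automatically on exit.
    with build("aiplatform", "v1beta1") as service:
        evaluation_items = service.projects().locations().evaluationItems()
        # ... issue requests here ...

    # Explicit form.
    service = build("aiplatform", "v1beta1")
    try:
        pass  # ... issue requests here ...
    finally:
        service.close()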
create(parent, body=None, x__xgafv=None)
Creates an Evaluation Item. Args: parent: string, Required. The resource name of the Location to create the Evaluation Item in. Format: `projects/{project}/locations/{location}` (required) body: object, The request body. The object takes the form of: { # EvaluationItem is a single evaluation request or result. The content of an EvaluationItem is immutable - it cannot be updated once created. EvaluationItems can be deleted when no longer needed. "createTime": "A String", # Output only. Timestamp when this item was created. "displayName": "A String", # Required. The display name of the EvaluationItem. "error": { # The `Status` type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by [gRPC](https://github.com/grpc). Each `Status` message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the [API Design Guide](https://cloud.google.com/apis/design/errors). # Output only. Error for the evaluation item. "code": 42, # The status code, which should be an enum value of google.rpc.Code. "details": [ # A list of messages that carry the error details. There is a common set of message types for APIs to use. { "a_key": "", # Properties of the object. Contains field @type with type URL. }, ], "message": "A String", # A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client. }, "evaluationItemType": "A String", # Required. The type of the EvaluationItem. "evaluationRequest": { # Single evaluation request. # The request to evaluate. "candidateResponses": [ # Optional. Responses from model under test and other baseline models for comparison. { # Responses from model or agent. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, ], "goldenResponse": { # Responses from model or agent. # Optional. The Ideal response or ground truth. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, "prompt": { # Prompt to be evaluated. # Required. The request/prompt to evaluate. "promptTemplateData": { # Message to hold a prompt template and the values to populate the template. # Prompt template data. "values": { # The values for fields in the prompt template. "a_key": { # The base structured datatype containing multi-part content of a message. A `Content` includes a `role` field designating the producer of the `Content` and a `parts` field containing multi-part data that contains the content of the message turn. "parts": [ # Required. Ordered `Parts` that constitute a single message. Parts may have different IANA MIME types. { # A datatype containing media that is part of a multi-part `Content` message. A `Part` consists of data which has an associated datatype. A `Part` can only contain one of the accepted types in `Part.data`. A `Part` must have a fixed IANA MIME type identifying the type and subtype of the media if `inline_data` or `file_data` field is filled with raw bytes. "codeExecutionResult": { # Result of executing the [ExecutableCode]. 
Only generated when using the [CodeExecution] tool, and always follows a `part` containing the [ExecutableCode]. # Optional. Result of executing the [ExecutableCode]. "outcome": "A String", # Required. Outcome of the code execution. "output": "A String", # Optional. Contains stdout when code execution is successful, stderr or other description otherwise. }, "executableCode": { # Code generated by the model that is meant to be executed, and the result returned to the model. Generated when using the [CodeExecution] tool, in which the code will be automatically executed, and a corresponding [CodeExecutionResult] will also be generated. # Optional. Code generated by the model that is meant to be executed. "code": "A String", # Required. The code to be executed. "language": "A String", # Required. Programming language of the `code`. }, "fileData": { # URI based data. # Optional. URI based data. "displayName": "A String", # Optional. Display name of the file data. Used to provide a label or filename to distinguish file datas. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "fileUri": "A String", # Required. URI. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "functionCall": { # A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing the parameters and their values. # Optional. A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] with the parameters and their values. "args": { # Optional. The function parameters and values in JSON object format. See [FunctionDeclaration.parameters] for parameter details. "a_key": "", # Properties of the object. }, "id": "A String", # Optional. The unique id of the function call. If populated, the client to execute the `function_call` and return the response with the matching `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name]. }, "functionResponse": { # The result output from a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function is used as context to the model. This should contain the result of a [FunctionCall] made based on model prediction. # Optional. The result output of a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function call. It is used as context to the model. "id": "A String", # Optional. The id of the function call this response is for. Populated by the client to match the corresponding function call `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name] and [FunctionCall.name]. "response": { # Required. The function response in JSON object format. Use "output" key to specify function output and "error" key to specify error details (if any). If "output" and "error" keys are not specified, then whole "response" is treated as function output. "a_key": "", # Properties of the object. }, }, "inlineData": { # Content blob. # Optional. Inlined bytes data. "data": "A String", # Required. Raw bytes. "displayName": "A String", # Optional. Display name of the blob. 
Used to provide a label or filename to distinguish blobs. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "text": "A String", # Optional. Text part (can be code). "thought": True or False, # Optional. Indicates if the part is thought from the model. "thoughtSignature": "A String", # Optional. An opaque signature for the thought so it can be reused in subsequent requests. "videoMetadata": { # Metadata describes the input video content. # Optional. Video metadata. The metadata should only be specified while the video data is presented in inline_data or file_data. "endOffset": "A String", # Optional. The end offset of the video. "fps": 3.14, # Optional. The frame rate of the video sent to the model. If not specified, the default value will be 1.0. The fps range is (0.0, 24.0]. "startOffset": "A String", # Optional. The start offset of the video. }, }, ], "role": "A String", # Optional. The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. }, }, }, "text": "A String", # Text prompt. "value": "", # Fields and values that can be used to populate the prompt template. }, "rubrics": { # Optional. Named groups of rubrics associated with this prompt. The key is a user-defined name for the rubric group. "a_key": { # A group of rubrics, used for grouping rubrics based on a metric or a version. "displayName": "A String", # Human-readable name for the group. This should be unique within a given context if used for display or selection. Example: "Instruction Following V1", "Content Quality - Summarization Task". "groupId": "A String", # Unique identifier for the group. "rubrics": [ # Rubrics that are part of this group. { # Message representing a single testable criterion for evaluation. One input prompt could have multiple rubrics. "content": { # Content of the rubric, defining the testable criteria. # Required. The actual testable criteria for the rubric. "property": { # Defines criteria based on a specific property. # Evaluation criteria based on a specific property. "description": "A String", # Description of the property being evaluated. Example: "The model's response is grammatically correct." }, }, "importance": "A String", # Optional. The relative importance of this rubric. "rubricId": "A String", # Unique identifier for the rubric. This ID is used to refer to this rubric, e.g., in RubricVerdict. "type": "A String", # Optional. A type designator for the rubric, which can inform how it's evaluated or interpreted by systems or users. It's recommended to use consistent, well-defined, upper snake_case strings. Examples: "SUMMARIZATION_QUALITY", "SAFETY_HARMFUL_CONTENT", "INSTRUCTION_ADHERENCE". }, ], }, }, }, "evaluationResponse": { # Evaluation result. # Output only. The response from evaluation. "candidateResults": [ # Optional. The results for the metric. { # Result for a single candidate. "additionalResults": "", # Optional. Additional results for the metric. "candidate": "A String", # Required. The candidate that is being evaluated. The value is the same as the candidate name in the EvaluationRequest. "explanation": "A String", # Optional. The explanation for the metric. "metric": "A String", # Required. The metric that was evaluated. 
"rubricVerdicts": [ # Optional. The rubric verdicts for the metric. { # Represents the verdict of an evaluation against a single rubric. "evaluatedRubric": { # Message representing a single testable criterion for evaluation. One input prompt could have multiple rubrics. # Required. The full rubric definition that was evaluated. Storing this ensures the verdict is self-contained and understandable, especially if the original rubric definition changes or was dynamically generated. "content": { # Content of the rubric, defining the testable criteria. # Required. The actual testable criteria for the rubric. "property": { # Defines criteria based on a specific property. # Evaluation criteria based on a specific property. "description": "A String", # Description of the property being evaluated. Example: "The model's response is grammatically correct." }, }, "importance": "A String", # Optional. The relative importance of this rubric. "rubricId": "A String", # Unique identifier for the rubric. This ID is used to refer to this rubric, e.g., in RubricVerdict. "type": "A String", # Optional. A type designator for the rubric, which can inform how it's evaluated or interpreted by systems or users. It's recommended to use consistent, well-defined, upper snake_case strings. Examples: "SUMMARIZATION_QUALITY", "SAFETY_HARMFUL_CONTENT", "INSTRUCTION_ADHERENCE". }, "reasoning": "A String", # Optional. Human-readable reasoning or explanation for the verdict. This can include specific examples or details from the evaluated content that justify the given verdict. "verdict": True or False, # Required. Outcome of the evaluation against the rubric, represented as a boolean. `true` indicates a "Pass", `false` indicates a "Fail". }, ], "score": 3.14, # Optional. The score for the metric. }, ], "evaluationRequest": "A String", # Required. The request item that was evaluated. Format: projects/{project}/locations/{location}/evaluationItems/{evaluation_item} "evaluationRun": "A String", # Required. The evaluation run that was used to generate the result. Format: projects/{project}/locations/{location}/evaluationRuns/{evaluation_run} "metadata": "", # Optional. Metadata about the evaluation result. "metric": "A String", # Required. The metric that was evaluated. "request": { # Single evaluation request. # Required. The request that was evaluated. "candidateResponses": [ # Optional. Responses from model under test and other baseline models for comparison. { # Responses from model or agent. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, ], "goldenResponse": { # Responses from model or agent. # Optional. The Ideal response or ground truth. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, "prompt": { # Prompt to be evaluated. # Required. The request/prompt to evaluate. "promptTemplateData": { # Message to hold a prompt template and the values to populate the template. # Prompt template data. "values": { # The values for fields in the prompt template. "a_key": { # The base structured datatype containing multi-part content of a message. 
A `Content` includes a `role` field designating the producer of the `Content` and a `parts` field containing multi-part data that contains the content of the message turn. "parts": [ # Required. Ordered `Parts` that constitute a single message. Parts may have different IANA MIME types. { # A datatype containing media that is part of a multi-part `Content` message. A `Part` consists of data which has an associated datatype. A `Part` can only contain one of the accepted types in `Part.data`. A `Part` must have a fixed IANA MIME type identifying the type and subtype of the media if `inline_data` or `file_data` field is filled with raw bytes. "codeExecutionResult": { # Result of executing the [ExecutableCode]. Only generated when using the [CodeExecution] tool, and always follows a `part` containing the [ExecutableCode]. # Optional. Result of executing the [ExecutableCode]. "outcome": "A String", # Required. Outcome of the code execution. "output": "A String", # Optional. Contains stdout when code execution is successful, stderr or other description otherwise. }, "executableCode": { # Code generated by the model that is meant to be executed, and the result returned to the model. Generated when using the [CodeExecution] tool, in which the code will be automatically executed, and a corresponding [CodeExecutionResult] will also be generated. # Optional. Code generated by the model that is meant to be executed. "code": "A String", # Required. The code to be executed. "language": "A String", # Required. Programming language of the `code`. }, "fileData": { # URI based data. # Optional. URI based data. "displayName": "A String", # Optional. Display name of the file data. Used to provide a label or filename to distinguish file datas. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "fileUri": "A String", # Required. URI. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "functionCall": { # A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing the parameters and their values. # Optional. A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] with the parameters and their values. "args": { # Optional. The function parameters and values in JSON object format. See [FunctionDeclaration.parameters] for parameter details. "a_key": "", # Properties of the object. }, "id": "A String", # Optional. The unique id of the function call. If populated, the client to execute the `function_call` and return the response with the matching `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name]. }, "functionResponse": { # The result output from a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function is used as context to the model. This should contain the result of a [FunctionCall] made based on model prediction. # Optional. The result output of a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function call. It is used as context to the model. "id": "A String", # Optional. 
The id of the function call this response is for. Populated by the client to match the corresponding function call `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name] and [FunctionCall.name]. "response": { # Required. The function response in JSON object format. Use "output" key to specify function output and "error" key to specify error details (if any). If "output" and "error" keys are not specified, then whole "response" is treated as function output. "a_key": "", # Properties of the object. }, }, "inlineData": { # Content blob. # Optional. Inlined bytes data. "data": "A String", # Required. Raw bytes. "displayName": "A String", # Optional. Display name of the blob. Used to provide a label or filename to distinguish blobs. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "text": "A String", # Optional. Text part (can be code). "thought": True or False, # Optional. Indicates if the part is thought from the model. "thoughtSignature": "A String", # Optional. An opaque signature for the thought so it can be reused in subsequent requests. "videoMetadata": { # Metadata describes the input video content. # Optional. Video metadata. The metadata should only be specified while the video data is presented in inline_data or file_data. "endOffset": "A String", # Optional. The end offset of the video. "fps": 3.14, # Optional. The frame rate of the video sent to the model. If not specified, the default value will be 1.0. The fps range is (0.0, 24.0]. "startOffset": "A String", # Optional. The start offset of the video. }, }, ], "role": "A String", # Optional. The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. }, }, }, "text": "A String", # Text prompt. "value": "", # Fields and values that can be used to populate the prompt template. }, "rubrics": { # Optional. Named groups of rubrics associated with this prompt. The key is a user-defined name for the rubric group. "a_key": { # A group of rubrics, used for grouping rubrics based on a metric or a version. "displayName": "A String", # Human-readable name for the group. This should be unique within a given context if used for display or selection. Example: "Instruction Following V1", "Content Quality - Summarization Task". "groupId": "A String", # Unique identifier for the group. "rubrics": [ # Rubrics that are part of this group. { # Message representing a single testable criterion for evaluation. One input prompt could have multiple rubrics. "content": { # Content of the rubric, defining the testable criteria. # Required. The actual testable criteria for the rubric. "property": { # Defines criteria based on a specific property. # Evaluation criteria based on a specific property. "description": "A String", # Description of the property being evaluated. Example: "The model's response is grammatically correct." }, }, "importance": "A String", # Optional. The relative importance of this rubric. "rubricId": "A String", # Unique identifier for the rubric. This ID is used to refer to this rubric, e.g., in RubricVerdict. "type": "A String", # Optional. A type designator for the rubric, which can inform how it's evaluated or interpreted by systems or users. 
It's recommended to use consistent, well-defined, upper snake_case strings. Examples: "SUMMARIZATION_QUALITY", "SAFETY_HARMFUL_CONTENT", "INSTRUCTION_ADHERENCE". }, ], }, }, }, }, "gcsUri": "A String", # The GCS object where the request or response is stored. "labels": { # Optional. Labels for the EvaluationItem. "a_key": "A String", }, "metadata": "", # Optional. Metadata for the EvaluationItem. "name": "A String", # Identifier. The resource name of the EvaluationItem. Format: `projects/{project}/locations/{location}/evaluationItems/{evaluation_item}` } x__xgafv: string, V1 error format. Allowed values 1 - v1 error format 2 - v2 error format Returns: An object of the form: { # EvaluationItem is a single evaluation request or result. The content of an EvaluationItem is immutable - it cannot be updated once created. EvaluationItems can be deleted when no longer needed. "createTime": "A String", # Output only. Timestamp when this item was created. "displayName": "A String", # Required. The display name of the EvaluationItem. "error": { # The `Status` type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by [gRPC](https://github.com/grpc). Each `Status` message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the [API Design Guide](https://cloud.google.com/apis/design/errors). # Output only. Error for the evaluation item. "code": 42, # The status code, which should be an enum value of google.rpc.Code. "details": [ # A list of messages that carry the error details. There is a common set of message types for APIs to use. { "a_key": "", # Properties of the object. Contains field @type with type URL. }, ], "message": "A String", # A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client. }, "evaluationItemType": "A String", # Required. The type of the EvaluationItem. "evaluationRequest": { # Single evaluation request. # The request to evaluate. "candidateResponses": [ # Optional. Responses from model under test and other baseline models for comparison. { # Responses from model or agent. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, ], "goldenResponse": { # Responses from model or agent. # Optional. The Ideal response or ground truth. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, "prompt": { # Prompt to be evaluated. # Required. The request/prompt to evaluate. "promptTemplateData": { # Message to hold a prompt template and the values to populate the template. # Prompt template data. "values": { # The values for fields in the prompt template. "a_key": { # The base structured datatype containing multi-part content of a message. A `Content` includes a `role` field designating the producer of the `Content` and a `parts` field containing multi-part data that contains the content of the message turn. "parts": [ # Required. Ordered `Parts` that constitute a single message. Parts may have different IANA MIME types. 
{ # A datatype containing media that is part of a multi-part `Content` message. A `Part` consists of data which has an associated datatype. A `Part` can only contain one of the accepted types in `Part.data`. A `Part` must have a fixed IANA MIME type identifying the type and subtype of the media if `inline_data` or `file_data` field is filled with raw bytes. "codeExecutionResult": { # Result of executing the [ExecutableCode]. Only generated when using the [CodeExecution] tool, and always follows a `part` containing the [ExecutableCode]. # Optional. Result of executing the [ExecutableCode]. "outcome": "A String", # Required. Outcome of the code execution. "output": "A String", # Optional. Contains stdout when code execution is successful, stderr or other description otherwise. }, "executableCode": { # Code generated by the model that is meant to be executed, and the result returned to the model. Generated when using the [CodeExecution] tool, in which the code will be automatically executed, and a corresponding [CodeExecutionResult] will also be generated. # Optional. Code generated by the model that is meant to be executed. "code": "A String", # Required. The code to be executed. "language": "A String", # Required. Programming language of the `code`. }, "fileData": { # URI based data. # Optional. URI based data. "displayName": "A String", # Optional. Display name of the file data. Used to provide a label or filename to distinguish file datas. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "fileUri": "A String", # Required. URI. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "functionCall": { # A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing the parameters and their values. # Optional. A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] with the parameters and their values. "args": { # Optional. The function parameters and values in JSON object format. See [FunctionDeclaration.parameters] for parameter details. "a_key": "", # Properties of the object. }, "id": "A String", # Optional. The unique id of the function call. If populated, the client to execute the `function_call` and return the response with the matching `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name]. }, "functionResponse": { # The result output from a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function is used as context to the model. This should contain the result of a [FunctionCall] made based on model prediction. # Optional. The result output of a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function call. It is used as context to the model. "id": "A String", # Optional. The id of the function call this response is for. Populated by the client to match the corresponding function call `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name] and [FunctionCall.name]. "response": { # Required. The function response in JSON object format. 
Use "output" key to specify function output and "error" key to specify error details (if any). If "output" and "error" keys are not specified, then whole "response" is treated as function output. "a_key": "", # Properties of the object. }, }, "inlineData": { # Content blob. # Optional. Inlined bytes data. "data": "A String", # Required. Raw bytes. "displayName": "A String", # Optional. Display name of the blob. Used to provide a label or filename to distinguish blobs. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "text": "A String", # Optional. Text part (can be code). "thought": True or False, # Optional. Indicates if the part is thought from the model. "thoughtSignature": "A String", # Optional. An opaque signature for the thought so it can be reused in subsequent requests. "videoMetadata": { # Metadata describes the input video content. # Optional. Video metadata. The metadata should only be specified while the video data is presented in inline_data or file_data. "endOffset": "A String", # Optional. The end offset of the video. "fps": 3.14, # Optional. The frame rate of the video sent to the model. If not specified, the default value will be 1.0. The fps range is (0.0, 24.0]. "startOffset": "A String", # Optional. The start offset of the video. }, }, ], "role": "A String", # Optional. The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. }, }, }, "text": "A String", # Text prompt. "value": "", # Fields and values that can be used to populate the prompt template. }, "rubrics": { # Optional. Named groups of rubrics associated with this prompt. The key is a user-defined name for the rubric group. "a_key": { # A group of rubrics, used for grouping rubrics based on a metric or a version. "displayName": "A String", # Human-readable name for the group. This should be unique within a given context if used for display or selection. Example: "Instruction Following V1", "Content Quality - Summarization Task". "groupId": "A String", # Unique identifier for the group. "rubrics": [ # Rubrics that are part of this group. { # Message representing a single testable criterion for evaluation. One input prompt could have multiple rubrics. "content": { # Content of the rubric, defining the testable criteria. # Required. The actual testable criteria for the rubric. "property": { # Defines criteria based on a specific property. # Evaluation criteria based on a specific property. "description": "A String", # Description of the property being evaluated. Example: "The model's response is grammatically correct." }, }, "importance": "A String", # Optional. The relative importance of this rubric. "rubricId": "A String", # Unique identifier for the rubric. This ID is used to refer to this rubric, e.g., in RubricVerdict. "type": "A String", # Optional. A type designator for the rubric, which can inform how it's evaluated or interpreted by systems or users. It's recommended to use consistent, well-defined, upper snake_case strings. Examples: "SUMMARIZATION_QUALITY", "SAFETY_HARMFUL_CONTENT", "INSTRUCTION_ADHERENCE". }, ], }, }, }, "evaluationResponse": { # Evaluation result. # Output only. The response from evaluation. "candidateResults": [ # Optional. 
The results for the metric. { # Result for a single candidate. "additionalResults": "", # Optional. Additional results for the metric. "candidate": "A String", # Required. The candidate that is being evaluated. The value is the same as the candidate name in the EvaluationRequest. "explanation": "A String", # Optional. The explanation for the metric. "metric": "A String", # Required. The metric that was evaluated. "rubricVerdicts": [ # Optional. The rubric verdicts for the metric. { # Represents the verdict of an evaluation against a single rubric. "evaluatedRubric": { # Message representing a single testable criterion for evaluation. One input prompt could have multiple rubrics. # Required. The full rubric definition that was evaluated. Storing this ensures the verdict is self-contained and understandable, especially if the original rubric definition changes or was dynamically generated. "content": { # Content of the rubric, defining the testable criteria. # Required. The actual testable criteria for the rubric. "property": { # Defines criteria based on a specific property. # Evaluation criteria based on a specific property. "description": "A String", # Description of the property being evaluated. Example: "The model's response is grammatically correct." }, }, "importance": "A String", # Optional. The relative importance of this rubric. "rubricId": "A String", # Unique identifier for the rubric. This ID is used to refer to this rubric, e.g., in RubricVerdict. "type": "A String", # Optional. A type designator for the rubric, which can inform how it's evaluated or interpreted by systems or users. It's recommended to use consistent, well-defined, upper snake_case strings. Examples: "SUMMARIZATION_QUALITY", "SAFETY_HARMFUL_CONTENT", "INSTRUCTION_ADHERENCE". }, "reasoning": "A String", # Optional. Human-readable reasoning or explanation for the verdict. This can include specific examples or details from the evaluated content that justify the given verdict. "verdict": True or False, # Required. Outcome of the evaluation against the rubric, represented as a boolean. `true` indicates a "Pass", `false` indicates a "Fail". }, ], "score": 3.14, # Optional. The score for the metric. }, ], "evaluationRequest": "A String", # Required. The request item that was evaluated. Format: projects/{project}/locations/{location}/evaluationItems/{evaluation_item} "evaluationRun": "A String", # Required. The evaluation run that was used to generate the result. Format: projects/{project}/locations/{location}/evaluationRuns/{evaluation_run} "metadata": "", # Optional. Metadata about the evaluation result. "metric": "A String", # Required. The metric that was evaluated. "request": { # Single evaluation request. # Required. The request that was evaluated. "candidateResponses": [ # Optional. Responses from model under test and other baseline models for comparison. { # Responses from model or agent. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, ], "goldenResponse": { # Responses from model or agent. # Optional. The Ideal response or ground truth. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, "prompt": { # Prompt to be evaluated. # Required. The request/prompt to evaluate. 
"promptTemplateData": { # Message to hold a prompt template and the values to populate the template. # Prompt template data. "values": { # The values for fields in the prompt template. "a_key": { # The base structured datatype containing multi-part content of a message. A `Content` includes a `role` field designating the producer of the `Content` and a `parts` field containing multi-part data that contains the content of the message turn. "parts": [ # Required. Ordered `Parts` that constitute a single message. Parts may have different IANA MIME types. { # A datatype containing media that is part of a multi-part `Content` message. A `Part` consists of data which has an associated datatype. A `Part` can only contain one of the accepted types in `Part.data`. A `Part` must have a fixed IANA MIME type identifying the type and subtype of the media if `inline_data` or `file_data` field is filled with raw bytes. "codeExecutionResult": { # Result of executing the [ExecutableCode]. Only generated when using the [CodeExecution] tool, and always follows a `part` containing the [ExecutableCode]. # Optional. Result of executing the [ExecutableCode]. "outcome": "A String", # Required. Outcome of the code execution. "output": "A String", # Optional. Contains stdout when code execution is successful, stderr or other description otherwise. }, "executableCode": { # Code generated by the model that is meant to be executed, and the result returned to the model. Generated when using the [CodeExecution] tool, in which the code will be automatically executed, and a corresponding [CodeExecutionResult] will also be generated. # Optional. Code generated by the model that is meant to be executed. "code": "A String", # Required. The code to be executed. "language": "A String", # Required. Programming language of the `code`. }, "fileData": { # URI based data. # Optional. URI based data. "displayName": "A String", # Optional. Display name of the file data. Used to provide a label or filename to distinguish file datas. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "fileUri": "A String", # Required. URI. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "functionCall": { # A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing the parameters and their values. # Optional. A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] with the parameters and their values. "args": { # Optional. The function parameters and values in JSON object format. See [FunctionDeclaration.parameters] for parameter details. "a_key": "", # Properties of the object. }, "id": "A String", # Optional. The unique id of the function call. If populated, the client to execute the `function_call` and return the response with the matching `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name]. }, "functionResponse": { # The result output from a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function is used as context to the model. This should contain the result of a [FunctionCall] made based on model prediction. # Optional. 
The result output of a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function call. It is used as context to the model. "id": "A String", # Optional. The id of the function call this response is for. Populated by the client to match the corresponding function call `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name] and [FunctionCall.name]. "response": { # Required. The function response in JSON object format. Use "output" key to specify function output and "error" key to specify error details (if any). If "output" and "error" keys are not specified, then whole "response" is treated as function output. "a_key": "", # Properties of the object. }, }, "inlineData": { # Content blob. # Optional. Inlined bytes data. "data": "A String", # Required. Raw bytes. "displayName": "A String", # Optional. Display name of the blob. Used to provide a label or filename to distinguish blobs. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "text": "A String", # Optional. Text part (can be code). "thought": True or False, # Optional. Indicates if the part is thought from the model. "thoughtSignature": "A String", # Optional. An opaque signature for the thought so it can be reused in subsequent requests. "videoMetadata": { # Metadata describes the input video content. # Optional. Video metadata. The metadata should only be specified while the video data is presented in inline_data or file_data. "endOffset": "A String", # Optional. The end offset of the video. "fps": 3.14, # Optional. The frame rate of the video sent to the model. If not specified, the default value will be 1.0. The fps range is (0.0, 24.0]. "startOffset": "A String", # Optional. The start offset of the video. }, }, ], "role": "A String", # Optional. The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. }, }, }, "text": "A String", # Text prompt. "value": "", # Fields and values that can be used to populate the prompt template. }, "rubrics": { # Optional. Named groups of rubrics associated with this prompt. The key is a user-defined name for the rubric group. "a_key": { # A group of rubrics, used for grouping rubrics based on a metric or a version. "displayName": "A String", # Human-readable name for the group. This should be unique within a given context if used for display or selection. Example: "Instruction Following V1", "Content Quality - Summarization Task". "groupId": "A String", # Unique identifier for the group. "rubrics": [ # Rubrics that are part of this group. { # Message representing a single testable criterion for evaluation. One input prompt could have multiple rubrics. "content": { # Content of the rubric, defining the testable criteria. # Required. The actual testable criteria for the rubric. "property": { # Defines criteria based on a specific property. # Evaluation criteria based on a specific property. "description": "A String", # Description of the property being evaluated. Example: "The model's response is grammatically correct." }, }, "importance": "A String", # Optional. The relative importance of this rubric. 
"rubricId": "A String", # Unique identifier for the rubric. This ID is used to refer to this rubric, e.g., in RubricVerdict. "type": "A String", # Optional. A type designator for the rubric, which can inform how it's evaluated or interpreted by systems or users. It's recommended to use consistent, well-defined, upper snake_case strings. Examples: "SUMMARIZATION_QUALITY", "SAFETY_HARMFUL_CONTENT", "INSTRUCTION_ADHERENCE". }, ], }, }, }, }, "gcsUri": "A String", # The GCS object where the request or response is stored. "labels": { # Optional. Labels for the EvaluationItem. "a_key": "A String", }, "metadata": "", # Optional. Metadata for the EvaluationItem. "name": "A String", # Identifier. The resource name of the EvaluationItem. Format: `projects/{project}/locations/{location}/evaluationItems/{evaluation_item}` }
delete(name, x__xgafv=None)
Deletes an Evaluation Item.

Args:
  name: string, Required. The name of the EvaluationItem resource to be deleted. Format: `projects/{project}/locations/{location}/evaluationItems/{evaluation_item}` (required)
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # This resource represents a long-running operation that is the result of a network API call.
      "done": True or False, # If the value is `false`, it means the operation is still in progress. If `true`, the operation is completed, and either `error` or `response` is available.
      "error": { # The `Status` type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by [gRPC](https://github.com/grpc). Each `Status` message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the [API Design Guide](https://cloud.google.com/apis/design/errors). # The error result of the operation in case of failure or cancellation.
        "code": 42, # The status code, which should be an enum value of google.rpc.Code.
        "details": [ # A list of messages that carry the error details. There is a common set of message types for APIs to use.
          {
            "a_key": "", # Properties of the object. Contains field @type with type URL.
          },
        ],
        "message": "A String", # A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client.
      },
      "metadata": { # Service-specific metadata associated with the operation. It typically contains progress information and common metadata such as create time. Some services might not provide such metadata. Any method that returns a long-running operation should document the metadata type, if any.
        "a_key": "", # Properties of the object. Contains field @type with type URL.
      },
      "name": "A String", # The server-assigned name, which is only unique within the same service that originally returns it. If you use the default HTTP mapping, the `name` should be a resource name ending with `operations/{unique_id}`.
      "response": { # The normal, successful response of the operation. If the original method returns no data on success, such as `Delete`, the response is `google.protobuf.Empty`. If the original method is standard `Get`/`Create`/`Update`, the response should be the resource. For other methods, the response should have the type `XxxResponse`, where `Xxx` is the original method name. For example, if the original method name is `TakeSnapshot()`, the inferred response type is `TakeSnapshotResponse`.
        "a_key": "", # Properties of the object. Contains field @type with type URL.
      },
    }
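Because delete() returns a long-running Operation (schema above), a caller typically checks `done` and then either inspects `error` or treats the item as removed. A sketch using the same assumed client setup and a placeholder resource name:

    from googleapiclient.discovery import build

    service = build("aiplatform", "v1beta1")  # assumed API name/version
    name = "projects/my-project/locations/us-central1/evaluationItems/1234567890"  # placeholder

    operation = (
        service.projects()
        .locations()
        .evaluationItems()
        .delete(name=name)
        .execute()
    )

    if operation.get("done"):
        if "error" in operation:
            print("Delete failed:", operation["error"].get("message"))
        else:
            print("Deleted:", name)
    else:
        # Still in progress: poll the returned operation (for example via the
        # operations sub-resource noted at the top of this page) until `done` is True.
        print("Pending operation:", operation.get("name"))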
get(name, x__xgafv=None)
Gets an Evaluation Item. Args: name: string, Required. The name of the EvaluationItem resource. Format: `projects/{project}/locations/{location}/evaluationItems/{evaluation_item}` (required) x__xgafv: string, V1 error format. Allowed values 1 - v1 error format 2 - v2 error format Returns: An object of the form: { # EvaluationItem is a single evaluation request or result. The content of an EvaluationItem is immutable - it cannot be updated once created. EvaluationItems can be deleted when no longer needed. "createTime": "A String", # Output only. Timestamp when this item was created. "displayName": "A String", # Required. The display name of the EvaluationItem. "error": { # The `Status` type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by [gRPC](https://github.com/grpc). Each `Status` message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the [API Design Guide](https://cloud.google.com/apis/design/errors). # Output only. Error for the evaluation item. "code": 42, # The status code, which should be an enum value of google.rpc.Code. "details": [ # A list of messages that carry the error details. There is a common set of message types for APIs to use. { "a_key": "", # Properties of the object. Contains field @type with type URL. }, ], "message": "A String", # A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client. }, "evaluationItemType": "A String", # Required. The type of the EvaluationItem. "evaluationRequest": { # Single evaluation request. # The request to evaluate. "candidateResponses": [ # Optional. Responses from model under test and other baseline models for comparison. { # Responses from model or agent. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, ], "goldenResponse": { # Responses from model or agent. # Optional. The Ideal response or ground truth. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, "prompt": { # Prompt to be evaluated. # Required. The request/prompt to evaluate. "promptTemplateData": { # Message to hold a prompt template and the values to populate the template. # Prompt template data. "values": { # The values for fields in the prompt template. "a_key": { # The base structured datatype containing multi-part content of a message. A `Content` includes a `role` field designating the producer of the `Content` and a `parts` field containing multi-part data that contains the content of the message turn. "parts": [ # Required. Ordered `Parts` that constitute a single message. Parts may have different IANA MIME types. { # A datatype containing media that is part of a multi-part `Content` message. A `Part` consists of data which has an associated datatype. A `Part` can only contain one of the accepted types in `Part.data`. A `Part` must have a fixed IANA MIME type identifying the type and subtype of the media if `inline_data` or `file_data` field is filled with raw bytes. 
"codeExecutionResult": { # Result of executing the [ExecutableCode]. Only generated when using the [CodeExecution] tool, and always follows a `part` containing the [ExecutableCode]. # Optional. Result of executing the [ExecutableCode]. "outcome": "A String", # Required. Outcome of the code execution. "output": "A String", # Optional. Contains stdout when code execution is successful, stderr or other description otherwise. }, "executableCode": { # Code generated by the model that is meant to be executed, and the result returned to the model. Generated when using the [CodeExecution] tool, in which the code will be automatically executed, and a corresponding [CodeExecutionResult] will also be generated. # Optional. Code generated by the model that is meant to be executed. "code": "A String", # Required. The code to be executed. "language": "A String", # Required. Programming language of the `code`. }, "fileData": { # URI based data. # Optional. URI based data. "displayName": "A String", # Optional. Display name of the file data. Used to provide a label or filename to distinguish file datas. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "fileUri": "A String", # Required. URI. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "functionCall": { # A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing the parameters and their values. # Optional. A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] with the parameters and their values. "args": { # Optional. The function parameters and values in JSON object format. See [FunctionDeclaration.parameters] for parameter details. "a_key": "", # Properties of the object. }, "id": "A String", # Optional. The unique id of the function call. If populated, the client to execute the `function_call` and return the response with the matching `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name]. }, "functionResponse": { # The result output from a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function is used as context to the model. This should contain the result of a [FunctionCall] made based on model prediction. # Optional. The result output of a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function call. It is used as context to the model. "id": "A String", # Optional. The id of the function call this response is for. Populated by the client to match the corresponding function call `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name] and [FunctionCall.name]. "response": { # Required. The function response in JSON object format. Use "output" key to specify function output and "error" key to specify error details (if any). If "output" and "error" keys are not specified, then whole "response" is treated as function output. "a_key": "", # Properties of the object. }, }, "inlineData": { # Content blob. # Optional. Inlined bytes data. "data": "A String", # Required. Raw bytes. 
"displayName": "A String", # Optional. Display name of the blob. Used to provide a label or filename to distinguish blobs. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "text": "A String", # Optional. Text part (can be code). "thought": True or False, # Optional. Indicates if the part is thought from the model. "thoughtSignature": "A String", # Optional. An opaque signature for the thought so it can be reused in subsequent requests. "videoMetadata": { # Metadata describes the input video content. # Optional. Video metadata. The metadata should only be specified while the video data is presented in inline_data or file_data. "endOffset": "A String", # Optional. The end offset of the video. "fps": 3.14, # Optional. The frame rate of the video sent to the model. If not specified, the default value will be 1.0. The fps range is (0.0, 24.0]. "startOffset": "A String", # Optional. The start offset of the video. }, }, ], "role": "A String", # Optional. The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. }, }, }, "text": "A String", # Text prompt. "value": "", # Fields and values that can be used to populate the prompt template. }, "rubrics": { # Optional. Named groups of rubrics associated with this prompt. The key is a user-defined name for the rubric group. "a_key": { # A group of rubrics, used for grouping rubrics based on a metric or a version. "displayName": "A String", # Human-readable name for the group. This should be unique within a given context if used for display or selection. Example: "Instruction Following V1", "Content Quality - Summarization Task". "groupId": "A String", # Unique identifier for the group. "rubrics": [ # Rubrics that are part of this group. { # Message representing a single testable criterion for evaluation. One input prompt could have multiple rubrics. "content": { # Content of the rubric, defining the testable criteria. # Required. The actual testable criteria for the rubric. "property": { # Defines criteria based on a specific property. # Evaluation criteria based on a specific property. "description": "A String", # Description of the property being evaluated. Example: "The model's response is grammatically correct." }, }, "importance": "A String", # Optional. The relative importance of this rubric. "rubricId": "A String", # Unique identifier for the rubric. This ID is used to refer to this rubric, e.g., in RubricVerdict. "type": "A String", # Optional. A type designator for the rubric, which can inform how it's evaluated or interpreted by systems or users. It's recommended to use consistent, well-defined, upper snake_case strings. Examples: "SUMMARIZATION_QUALITY", "SAFETY_HARMFUL_CONTENT", "INSTRUCTION_ADHERENCE". }, ], }, }, }, "evaluationResponse": { # Evaluation result. # Output only. The response from evaluation. "candidateResults": [ # Optional. The results for the metric. { # Result for a single candidate. "additionalResults": "", # Optional. Additional results for the metric. "candidate": "A String", # Required. The candidate that is being evaluated. The value is the same as the candidate name in the EvaluationRequest. "explanation": "A String", # Optional. The explanation for the metric. 
"metric": "A String", # Required. The metric that was evaluated. "rubricVerdicts": [ # Optional. The rubric verdicts for the metric. { # Represents the verdict of an evaluation against a single rubric. "evaluatedRubric": { # Message representing a single testable criterion for evaluation. One input prompt could have multiple rubrics. # Required. The full rubric definition that was evaluated. Storing this ensures the verdict is self-contained and understandable, especially if the original rubric definition changes or was dynamically generated. "content": { # Content of the rubric, defining the testable criteria. # Required. The actual testable criteria for the rubric. "property": { # Defines criteria based on a specific property. # Evaluation criteria based on a specific property. "description": "A String", # Description of the property being evaluated. Example: "The model's response is grammatically correct." }, }, "importance": "A String", # Optional. The relative importance of this rubric. "rubricId": "A String", # Unique identifier for the rubric. This ID is used to refer to this rubric, e.g., in RubricVerdict. "type": "A String", # Optional. A type designator for the rubric, which can inform how it's evaluated or interpreted by systems or users. It's recommended to use consistent, well-defined, upper snake_case strings. Examples: "SUMMARIZATION_QUALITY", "SAFETY_HARMFUL_CONTENT", "INSTRUCTION_ADHERENCE". }, "reasoning": "A String", # Optional. Human-readable reasoning or explanation for the verdict. This can include specific examples or details from the evaluated content that justify the given verdict. "verdict": True or False, # Required. Outcome of the evaluation against the rubric, represented as a boolean. `true` indicates a "Pass", `false` indicates a "Fail". }, ], "score": 3.14, # Optional. The score for the metric. }, ], "evaluationRequest": "A String", # Required. The request item that was evaluated. Format: projects/{project}/locations/{location}/evaluationItems/{evaluation_item} "evaluationRun": "A String", # Required. The evaluation run that was used to generate the result. Format: projects/{project}/locations/{location}/evaluationRuns/{evaluation_run} "metadata": "", # Optional. Metadata about the evaluation result. "metric": "A String", # Required. The metric that was evaluated. "request": { # Single evaluation request. # Required. The request that was evaluated. "candidateResponses": [ # Optional. Responses from model under test and other baseline models for comparison. { # Responses from model or agent. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, ], "goldenResponse": { # Responses from model or agent. # Optional. The Ideal response or ground truth. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, "prompt": { # Prompt to be evaluated. # Required. The request/prompt to evaluate. "promptTemplateData": { # Message to hold a prompt template and the values to populate the template. # Prompt template data. "values": { # The values for fields in the prompt template. "a_key": { # The base structured datatype containing multi-part content of a message. 
A `Content` includes a `role` field designating the producer of the `Content` and a `parts` field containing multi-part data that contains the content of the message turn. "parts": [ # Required. Ordered `Parts` that constitute a single message. Parts may have different IANA MIME types. { # A datatype containing media that is part of a multi-part `Content` message. A `Part` consists of data which has an associated datatype. A `Part` can only contain one of the accepted types in `Part.data`. A `Part` must have a fixed IANA MIME type identifying the type and subtype of the media if `inline_data` or `file_data` field is filled with raw bytes. "codeExecutionResult": { # Result of executing the [ExecutableCode]. Only generated when using the [CodeExecution] tool, and always follows a `part` containing the [ExecutableCode]. # Optional. Result of executing the [ExecutableCode]. "outcome": "A String", # Required. Outcome of the code execution. "output": "A String", # Optional. Contains stdout when code execution is successful, stderr or other description otherwise. }, "executableCode": { # Code generated by the model that is meant to be executed, and the result returned to the model. Generated when using the [CodeExecution] tool, in which the code will be automatically executed, and a corresponding [CodeExecutionResult] will also be generated. # Optional. Code generated by the model that is meant to be executed. "code": "A String", # Required. The code to be executed. "language": "A String", # Required. Programming language of the `code`. }, "fileData": { # URI based data. # Optional. URI based data. "displayName": "A String", # Optional. Display name of the file data. Used to provide a label or filename to distinguish file datas. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "fileUri": "A String", # Required. URI. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "functionCall": { # A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing the parameters and their values. # Optional. A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] with the parameters and their values. "args": { # Optional. The function parameters and values in JSON object format. See [FunctionDeclaration.parameters] for parameter details. "a_key": "", # Properties of the object. }, "id": "A String", # Optional. The unique id of the function call. If populated, the client to execute the `function_call` and return the response with the matching `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name]. }, "functionResponse": { # The result output from a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function is used as context to the model. This should contain the result of a [FunctionCall] made based on model prediction. # Optional. The result output of a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function call. It is used as context to the model. "id": "A String", # Optional. 
The id of the function call this response is for. Populated by the client to match the corresponding function call `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name] and [FunctionCall.name]. "response": { # Required. The function response in JSON object format. Use "output" key to specify function output and "error" key to specify error details (if any). If "output" and "error" keys are not specified, then whole "response" is treated as function output. "a_key": "", # Properties of the object. }, }, "inlineData": { # Content blob. # Optional. Inlined bytes data. "data": "A String", # Required. Raw bytes. "displayName": "A String", # Optional. Display name of the blob. Used to provide a label or filename to distinguish blobs. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "text": "A String", # Optional. Text part (can be code). "thought": True or False, # Optional. Indicates if the part is thought from the model. "thoughtSignature": "A String", # Optional. An opaque signature for the thought so it can be reused in subsequent requests. "videoMetadata": { # Metadata describes the input video content. # Optional. Video metadata. The metadata should only be specified while the video data is presented in inline_data or file_data. "endOffset": "A String", # Optional. The end offset of the video. "fps": 3.14, # Optional. The frame rate of the video sent to the model. If not specified, the default value will be 1.0. The fps range is (0.0, 24.0]. "startOffset": "A String", # Optional. The start offset of the video. }, }, ], "role": "A String", # Optional. The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. }, }, }, "text": "A String", # Text prompt. "value": "", # Fields and values that can be used to populate the prompt template. }, "rubrics": { # Optional. Named groups of rubrics associated with this prompt. The key is a user-defined name for the rubric group. "a_key": { # A group of rubrics, used for grouping rubrics based on a metric or a version. "displayName": "A String", # Human-readable name for the group. This should be unique within a given context if used for display or selection. Example: "Instruction Following V1", "Content Quality - Summarization Task". "groupId": "A String", # Unique identifier for the group. "rubrics": [ # Rubrics that are part of this group. { # Message representing a single testable criterion for evaluation. One input prompt could have multiple rubrics. "content": { # Content of the rubric, defining the testable criteria. # Required. The actual testable criteria for the rubric. "property": { # Defines criteria based on a specific property. # Evaluation criteria based on a specific property. "description": "A String", # Description of the property being evaluated. Example: "The model's response is grammatically correct." }, }, "importance": "A String", # Optional. The relative importance of this rubric. "rubricId": "A String", # Unique identifier for the rubric. This ID is used to refer to this rubric, e.g., in RubricVerdict. "type": "A String", # Optional. A type designator for the rubric, which can inform how it's evaluated or interpreted by systems or users. 
It's recommended to use consistent, well-defined, upper snake_case strings. Examples: "SUMMARIZATION_QUALITY", "SAFETY_HARMFUL_CONTENT", "INSTRUCTION_ADHERENCE". }, ], }, }, }, }, "gcsUri": "A String", # The GCS object where the request or response is stored. "labels": { # Optional. Labels for the EvaluationItem. "a_key": "A String", }, "metadata": "", # Optional. Metadata for the EvaluationItem. "name": "A String", # Identifier. The resource name of the EvaluationItem. Format: `projects/{project}/locations/{location}/evaluationItems/{evaluation_item}` }
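The request body above maps directly onto a Python dictionary when using the google-api-python-client. The following is a minimal sketch only: the API version ("v1beta1"), the project and location, the `evaluationItemType` value, and the prompt/response text are illustrative assumptions, the collection is assumed to be reached via `service.projects().locations().evaluationItems()` as the parent format suggests, and the enum value should be checked against the EvaluationItemType definition.

from googleapiclient import discovery

# Build the aiplatform service. Depending on the location, a regional endpoint
# (via client_options) may be required. "v1beta1" is an assumed API version.
service = discovery.build("aiplatform", "v1beta1")

parent = "projects/my-project/locations/us-central1"  # hypothetical project and location

# Hypothetical request-type item; field names follow the body schema documented above.
body = {
    "displayName": "summarization-sample-001",
    "evaluationItemType": "REQUEST",  # placeholder; consult the EvaluationItemType enum
    "evaluationRequest": {
        "prompt": {"text": "Summarize the following article in two sentences: ..."},
        "candidateResponses": [
            {"candidate": "model-under-test", "text": "The article argues that ..."}
        ],
        "goldenResponse": {"candidate": "reference", "text": "A reference summary ..."},
    },
}

item = (
    service.projects()
    .locations()
    .evaluationItems()
    .create(parent=parent, body=body)
    .execute()
)
print(item["name"])  # projects/.../locations/.../evaluationItems/{evaluation_item}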
list(parent, filter=None, orderBy=None, pageSize=None, pageToken=None, x__xgafv=None)
Lists Evaluation Items. Args: parent: string, Required. The resource name of the Location from which to list the Evaluation Items. Format: `projects/{project}/locations/{location}` (required) filter: string, Optional. Filter expression that matches a subset of the EvaluationItems to show. For field names both snake_case and camelCase are supported. For more information about filter syntax, see [AIP-160](https://google.aip.dev/160). orderBy: string, Optional. A comma-separated list of fields to order by, sorted in ascending order by default. Use `desc` after a field name for descending. pageSize: integer, Optional. The maximum number of Evaluation Items to return. pageToken: string, Optional. A page token, received from a previous `ListEvaluationItems` call. Provide this to retrieve the subsequent page. x__xgafv: string, V1 error format. Allowed values 1 - v1 error format 2 - v2 error format Returns: An object of the form: { # Response message for EvaluationManagementService.ListEvaluationItems. "evaluationItems": [ # List of EvaluationItems in the requested page. { # EvaluationItem is a single evaluation request or result. The content of an EvaluationItem is immutable - it cannot be updated once created. EvaluationItems can be deleted when no longer needed. "createTime": "A String", # Output only. Timestamp when this item was created. "displayName": "A String", # Required. The display name of the EvaluationItem. "error": { # The `Status` type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by [gRPC](https://github.com/grpc). Each `Status` message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the [API Design Guide](https://cloud.google.com/apis/design/errors). # Output only. Error for the evaluation item. "code": 42, # The status code, which should be an enum value of google.rpc.Code. "details": [ # A list of messages that carry the error details. There is a common set of message types for APIs to use. { "a_key": "", # Properties of the object. Contains field @type with type URL. }, ], "message": "A String", # A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client. }, "evaluationItemType": "A String", # Required. The type of the EvaluationItem. "evaluationRequest": { # Single evaluation request. # The request to evaluate. "candidateResponses": [ # Optional. Responses from model under test and other baseline models for comparison. { # Responses from model or agent. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, ], "goldenResponse": { # Responses from model or agent. # Optional. The Ideal response or ground truth. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, "prompt": { # Prompt to be evaluated. # Required. The request/prompt to evaluate. "promptTemplateData": { # Message to hold a prompt template and the values to populate the template. # Prompt template data. "values": { # The values for fields in the prompt template. 
"a_key": { # The base structured datatype containing multi-part content of a message. A `Content` includes a `role` field designating the producer of the `Content` and a `parts` field containing multi-part data that contains the content of the message turn. "parts": [ # Required. Ordered `Parts` that constitute a single message. Parts may have different IANA MIME types. { # A datatype containing media that is part of a multi-part `Content` message. A `Part` consists of data which has an associated datatype. A `Part` can only contain one of the accepted types in `Part.data`. A `Part` must have a fixed IANA MIME type identifying the type and subtype of the media if `inline_data` or `file_data` field is filled with raw bytes. "codeExecutionResult": { # Result of executing the [ExecutableCode]. Only generated when using the [CodeExecution] tool, and always follows a `part` containing the [ExecutableCode]. # Optional. Result of executing the [ExecutableCode]. "outcome": "A String", # Required. Outcome of the code execution. "output": "A String", # Optional. Contains stdout when code execution is successful, stderr or other description otherwise. }, "executableCode": { # Code generated by the model that is meant to be executed, and the result returned to the model. Generated when using the [CodeExecution] tool, in which the code will be automatically executed, and a corresponding [CodeExecutionResult] will also be generated. # Optional. Code generated by the model that is meant to be executed. "code": "A String", # Required. The code to be executed. "language": "A String", # Required. Programming language of the `code`. }, "fileData": { # URI based data. # Optional. URI based data. "displayName": "A String", # Optional. Display name of the file data. Used to provide a label or filename to distinguish file datas. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "fileUri": "A String", # Required. URI. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "functionCall": { # A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing the parameters and their values. # Optional. A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] with the parameters and their values. "args": { # Optional. The function parameters and values in JSON object format. See [FunctionDeclaration.parameters] for parameter details. "a_key": "", # Properties of the object. }, "id": "A String", # Optional. The unique id of the function call. If populated, the client to execute the `function_call` and return the response with the matching `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name]. }, "functionResponse": { # The result output from a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function is used as context to the model. This should contain the result of a [FunctionCall] made based on model prediction. # Optional. The result output of a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function call. 
It is used as context to the model. "id": "A String", # Optional. The id of the function call this response is for. Populated by the client to match the corresponding function call `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name] and [FunctionCall.name]. "response": { # Required. The function response in JSON object format. Use "output" key to specify function output and "error" key to specify error details (if any). If "output" and "error" keys are not specified, then whole "response" is treated as function output. "a_key": "", # Properties of the object. }, }, "inlineData": { # Content blob. # Optional. Inlined bytes data. "data": "A String", # Required. Raw bytes. "displayName": "A String", # Optional. Display name of the blob. Used to provide a label or filename to distinguish blobs. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "text": "A String", # Optional. Text part (can be code). "thought": True or False, # Optional. Indicates if the part is thought from the model. "thoughtSignature": "A String", # Optional. An opaque signature for the thought so it can be reused in subsequent requests. "videoMetadata": { # Metadata describes the input video content. # Optional. Video metadata. The metadata should only be specified while the video data is presented in inline_data or file_data. "endOffset": "A String", # Optional. The end offset of the video. "fps": 3.14, # Optional. The frame rate of the video sent to the model. If not specified, the default value will be 1.0. The fps range is (0.0, 24.0]. "startOffset": "A String", # Optional. The start offset of the video. }, }, ], "role": "A String", # Optional. The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. }, }, }, "text": "A String", # Text prompt. "value": "", # Fields and values that can be used to populate the prompt template. }, "rubrics": { # Optional. Named groups of rubrics associated with this prompt. The key is a user-defined name for the rubric group. "a_key": { # A group of rubrics, used for grouping rubrics based on a metric or a version. "displayName": "A String", # Human-readable name for the group. This should be unique within a given context if used for display or selection. Example: "Instruction Following V1", "Content Quality - Summarization Task". "groupId": "A String", # Unique identifier for the group. "rubrics": [ # Rubrics that are part of this group. { # Message representing a single testable criterion for evaluation. One input prompt could have multiple rubrics. "content": { # Content of the rubric, defining the testable criteria. # Required. The actual testable criteria for the rubric. "property": { # Defines criteria based on a specific property. # Evaluation criteria based on a specific property. "description": "A String", # Description of the property being evaluated. Example: "The model's response is grammatically correct." }, }, "importance": "A String", # Optional. The relative importance of this rubric. "rubricId": "A String", # Unique identifier for the rubric. This ID is used to refer to this rubric, e.g., in RubricVerdict. "type": "A String", # Optional. 
A type designator for the rubric, which can inform how it's evaluated or interpreted by systems or users. It's recommended to use consistent, well-defined, upper snake_case strings. Examples: "SUMMARIZATION_QUALITY", "SAFETY_HARMFUL_CONTENT", "INSTRUCTION_ADHERENCE". }, ], }, }, }, "evaluationResponse": { # Evaluation result. # Output only. The response from evaluation. "candidateResults": [ # Optional. The results for the metric. { # Result for a single candidate. "additionalResults": "", # Optional. Additional results for the metric. "candidate": "A String", # Required. The candidate that is being evaluated. The value is the same as the candidate name in the EvaluationRequest. "explanation": "A String", # Optional. The explanation for the metric. "metric": "A String", # Required. The metric that was evaluated. "rubricVerdicts": [ # Optional. The rubric verdicts for the metric. { # Represents the verdict of an evaluation against a single rubric. "evaluatedRubric": { # Message representing a single testable criterion for evaluation. One input prompt could have multiple rubrics. # Required. The full rubric definition that was evaluated. Storing this ensures the verdict is self-contained and understandable, especially if the original rubric definition changes or was dynamically generated. "content": { # Content of the rubric, defining the testable criteria. # Required. The actual testable criteria for the rubric. "property": { # Defines criteria based on a specific property. # Evaluation criteria based on a specific property. "description": "A String", # Description of the property being evaluated. Example: "The model's response is grammatically correct." }, }, "importance": "A String", # Optional. The relative importance of this rubric. "rubricId": "A String", # Unique identifier for the rubric. This ID is used to refer to this rubric, e.g., in RubricVerdict. "type": "A String", # Optional. A type designator for the rubric, which can inform how it's evaluated or interpreted by systems or users. It's recommended to use consistent, well-defined, upper snake_case strings. Examples: "SUMMARIZATION_QUALITY", "SAFETY_HARMFUL_CONTENT", "INSTRUCTION_ADHERENCE". }, "reasoning": "A String", # Optional. Human-readable reasoning or explanation for the verdict. This can include specific examples or details from the evaluated content that justify the given verdict. "verdict": True or False, # Required. Outcome of the evaluation against the rubric, represented as a boolean. `true` indicates a "Pass", `false` indicates a "Fail". }, ], "score": 3.14, # Optional. The score for the metric. }, ], "evaluationRequest": "A String", # Required. The request item that was evaluated. Format: projects/{project}/locations/{location}/evaluationItems/{evaluation_item} "evaluationRun": "A String", # Required. The evaluation run that was used to generate the result. Format: projects/{project}/locations/{location}/evaluationRuns/{evaluation_run} "metadata": "", # Optional. Metadata about the evaluation result. "metric": "A String", # Required. The metric that was evaluated. "request": { # Single evaluation request. # Required. The request that was evaluated. "candidateResponses": [ # Optional. Responses from model under test and other baseline models for comparison. { # Responses from model or agent. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. 
}, ], "goldenResponse": { # Responses from model or agent. # Optional. The Ideal response or ground truth. "candidate": "A String", # Required. The name of the candidate that produced the response. "text": "A String", # Text response. "value": "", # Fields and values that can be used to populate the response template. }, "prompt": { # Prompt to be evaluated. # Required. The request/prompt to evaluate. "promptTemplateData": { # Message to hold a prompt template and the values to populate the template. # Prompt template data. "values": { # The values for fields in the prompt template. "a_key": { # The base structured datatype containing multi-part content of a message. A `Content` includes a `role` field designating the producer of the `Content` and a `parts` field containing multi-part data that contains the content of the message turn. "parts": [ # Required. Ordered `Parts` that constitute a single message. Parts may have different IANA MIME types. { # A datatype containing media that is part of a multi-part `Content` message. A `Part` consists of data which has an associated datatype. A `Part` can only contain one of the accepted types in `Part.data`. A `Part` must have a fixed IANA MIME type identifying the type and subtype of the media if `inline_data` or `file_data` field is filled with raw bytes. "codeExecutionResult": { # Result of executing the [ExecutableCode]. Only generated when using the [CodeExecution] tool, and always follows a `part` containing the [ExecutableCode]. # Optional. Result of executing the [ExecutableCode]. "outcome": "A String", # Required. Outcome of the code execution. "output": "A String", # Optional. Contains stdout when code execution is successful, stderr or other description otherwise. }, "executableCode": { # Code generated by the model that is meant to be executed, and the result returned to the model. Generated when using the [CodeExecution] tool, in which the code will be automatically executed, and a corresponding [CodeExecutionResult] will also be generated. # Optional. Code generated by the model that is meant to be executed. "code": "A String", # Required. The code to be executed. "language": "A String", # Required. Programming language of the `code`. }, "fileData": { # URI based data. # Optional. URI based data. "displayName": "A String", # Optional. Display name of the file data. Used to provide a label or filename to distinguish file datas. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "fileUri": "A String", # Required. URI. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "functionCall": { # A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing the parameters and their values. # Optional. A predicted [FunctionCall] returned from the model that contains a string representing the [FunctionDeclaration.name] with the parameters and their values. "args": { # Optional. The function parameters and values in JSON object format. See [FunctionDeclaration.parameters] for parameter details. "a_key": "", # Properties of the object. }, "id": "A String", # Optional. The unique id of the function call. If populated, the client to execute the `function_call` and return the response with the matching `id`. "name": "A String", # Required. 
The name of the function to call. Matches [FunctionDeclaration.name]. }, "functionResponse": { # The result output from a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function is used as context to the model. This should contain the result of a [FunctionCall] made based on model prediction. # Optional. The result output of a [FunctionCall] that contains a string representing the [FunctionDeclaration.name] and a structured JSON object containing any output from the function call. It is used as context to the model. "id": "A String", # Optional. The id of the function call this response is for. Populated by the client to match the corresponding function call `id`. "name": "A String", # Required. The name of the function to call. Matches [FunctionDeclaration.name] and [FunctionCall.name]. "response": { # Required. The function response in JSON object format. Use "output" key to specify function output and "error" key to specify error details (if any). If "output" and "error" keys are not specified, then whole "response" is treated as function output. "a_key": "", # Properties of the object. }, }, "inlineData": { # Content blob. # Optional. Inlined bytes data. "data": "A String", # Required. Raw bytes. "displayName": "A String", # Optional. Display name of the blob. Used to provide a label or filename to distinguish blobs. This field is only returned in PromptMessage for prompt management. It is currently used in the Gemini GenerateContent calls only when server side tools (code_execution, google_search, and url_context) are enabled. "mimeType": "A String", # Required. The IANA standard MIME type of the source data. }, "text": "A String", # Optional. Text part (can be code). "thought": True or False, # Optional. Indicates if the part is thought from the model. "thoughtSignature": "A String", # Optional. An opaque signature for the thought so it can be reused in subsequent requests. "videoMetadata": { # Metadata describes the input video content. # Optional. Video metadata. The metadata should only be specified while the video data is presented in inline_data or file_data. "endOffset": "A String", # Optional. The end offset of the video. "fps": 3.14, # Optional. The frame rate of the video sent to the model. If not specified, the default value will be 1.0. The fps range is (0.0, 24.0]. "startOffset": "A String", # Optional. The start offset of the video. }, }, ], "role": "A String", # Optional. The producer of the content. Must be either 'user' or 'model'. Useful to set for multi-turn conversations, otherwise can be left blank or unset. }, }, }, "text": "A String", # Text prompt. "value": "", # Fields and values that can be used to populate the prompt template. }, "rubrics": { # Optional. Named groups of rubrics associated with this prompt. The key is a user-defined name for the rubric group. "a_key": { # A group of rubrics, used for grouping rubrics based on a metric or a version. "displayName": "A String", # Human-readable name for the group. This should be unique within a given context if used for display or selection. Example: "Instruction Following V1", "Content Quality - Summarization Task". "groupId": "A String", # Unique identifier for the group. "rubrics": [ # Rubrics that are part of this group. { # Message representing a single testable criterion for evaluation. One input prompt could have multiple rubrics. "content": { # Content of the rubric, defining the testable criteria. # Required. 
The actual testable criteria for the rubric. "property": { # Defines criteria based on a specific property. # Evaluation criteria based on a specific property. "description": "A String", # Description of the property being evaluated. Example: "The model's response is grammatically correct." }, }, "importance": "A String", # Optional. The relative importance of this rubric. "rubricId": "A String", # Unique identifier for the rubric. This ID is used to refer to this rubric, e.g., in RubricVerdict. "type": "A String", # Optional. A type designator for the rubric, which can inform how it's evaluated or interpreted by systems or users. It's recommended to use consistent, well-defined, upper snake_case strings. Examples: "SUMMARIZATION_QUALITY", "SAFETY_HARMFUL_CONTENT", "INSTRUCTION_ADHERENCE". }, ], }, }, }, }, "gcsUri": "A String", # The GCS object where the request or response is stored. "labels": { # Optional. Labels for the EvaluationItem. "a_key": "A String", }, "metadata": "", # Optional. Metadata for the EvaluationItem. "name": "A String", # Identifier. The resource name of the EvaluationItem. Format: `projects/{project}/locations/{location}/evaluationItems/{evaluation_item}` }, ], "nextPageToken": "A String", # A token to retrieve the next page of results. }
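A minimal sketch of calling list() and walking the response with the google-api-python-client; the API version, project and location, and the filter value are illustrative assumptions, while the field paths follow the response schema documented above.

from googleapiclient import discovery

service = discovery.build("aiplatform", "v1beta1")  # assumed API version
parent = "projects/my-project/locations/us-central1"  # hypothetical project and location

response = (
    service.projects()
    .locations()
    .evaluationItems()
    .list(
        parent=parent,
        filter='display_name="summarization-sample-001"',  # hypothetical AIP-160 filter
        orderBy="create_time desc",  # assumed sortable field name
        pageSize=50,
    )
    .execute()
)

for item in response.get("evaluationItems", []):
    print(item["name"], item.get("evaluationItemType"))
    # For result items, rubric verdicts are nested under evaluationResponse.candidateResults.
    for result in item.get("evaluationResponse", {}).get("candidateResults", []):
        for verdict in result.get("rubricVerdicts", []):
            print("  ", verdict["evaluatedRubric"].get("rubricId"), verdict["verdict"])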
list_next()
Retrieves the next page of results. Args: previous_request: The request for the previous page. (required) previous_response: The response from the request for the previous page. (required) Returns: A request object that you can call 'execute()' on to request the next page. Returns None if there are no more items in the collection.
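list_next() follows the standard google-api-python-client pagination pattern: pass the previous request and response, and it returns a ready-to-execute request for the next page, or None once the nextPageToken is exhausted. A sketch, with a hypothetical project and location and an assumed API version:

from googleapiclient import discovery

service = discovery.build("aiplatform", "v1beta1")  # assumed API version
items = service.projects().locations().evaluationItems()

request = items.list(
    parent="projects/my-project/locations/us-central1",  # hypothetical
    pageSize=100,
)
while request is not None:
    response = request.execute()
    for item in response.get("evaluationItems", []):
        print(item["name"])
    # list_next returns None when there are no more pages.
    request = items.list_next(previous_request=request, previous_response=response)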