Vertex AI API . projects . locations . ragCorpora . ragFiles

Instance Methods

operations()

Returns the operations Resource.

close()

Close httplib2 connections.

delete(name, x__xgafv=None)

Deletes a RagFile.

get(name, x__xgafv=None)

Gets a RagFile.

import_(parent, body=None, x__xgafv=None)

Import files from Google Cloud Storage or Google Drive into a RagCorpus.

list(parent, pageSize=None, pageToken=None, x__xgafv=None)

Lists RagFiles in a RagCorpus.

list_next(previous_request, previous_response)

Retrieves the next page of results.

Method Details

close()
Close httplib2 connections.
delete(name, x__xgafv=None)
Deletes a RagFile.

Args:
  name: string, Required. The name of the RagFile resource to be deleted. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file}` (required)
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # This resource represents a long-running operation that is the result of a network API call.
  "done": True or False, # If the value is `false`, it means the operation is still in progress. If `true`, the operation is completed, and either `error` or `response` is available.
  "error": { # The `Status` type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by [gRPC](https://github.com/grpc). Each `Status` message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the [API Design Guide](https://cloud.google.com/apis/design/errors). # The error result of the operation in case of failure or cancellation.
    "code": 42, # The status code, which should be an enum value of google.rpc.Code.
    "details": [ # A list of messages that carry the error details. There is a common set of message types for APIs to use.
      {
        "a_key": "", # Properties of the object. Contains field @type with type URL.
      },
    ],
    "message": "A String", # A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client.
  },
  "metadata": { # Service-specific metadata associated with the operation. It typically contains progress information and common metadata such as create time. Some services might not provide such metadata. Any method that returns a long-running operation should document the metadata type, if any.
    "a_key": "", # Properties of the object. Contains field @type with type URL.
  },
  "name": "A String", # The server-assigned name, which is only unique within the same service that originally returns it. If you use the default HTTP mapping, the `name` should be a resource name ending with `operations/{unique_id}`.
  "response": { # The normal, successful response of the operation. If the original method returns no data on success, such as `Delete`, the response is `google.protobuf.Empty`. If the original method is standard `Get`/`Create`/`Update`, the response should be the resource. For other methods, the response should have the type `XxxResponse`, where `Xxx` is the original method name. For example, if the original method name is `TakeSnapshot()`, the inferred response type is `TakeSnapshotResponse`.
    "a_key": "", # Properties of the object. Contains field @type with type URL.
  },
}
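
The resource name format and the Operation fields documented above can be exercised with a small helper. The following is a minimal sketch, not the library's own code: `rag_file_name`, `operation_outcome`, and `delete_rag_file` are hypothetical helper names, and `rag_files` is assumed to be the resource object obtained from `service.projects().locations().ragCorpora().ragFiles()`.

```python
from typing import Any, Dict


def rag_file_name(project: str, location: str, rag_corpus: str, rag_file: str) -> str:
    """Build a RagFile resource name in the documented format."""
    return (
        f"projects/{project}/locations/{location}"
        f"/ragCorpora/{rag_corpus}/ragFiles/{rag_file}"
    )


def operation_outcome(operation: Dict[str, Any]) -> str:
    """Classify an Operation dict as 'running', 'failed', or 'succeeded'.

    Follows the documented contract: while `done` is false (or absent) the
    operation is still in progress; once it is true, either `error` or
    `response` is available.
    """
    if not operation.get("done", False):
        return "running"
    if "error" in operation:
        return "failed"
    return "succeeded"


def delete_rag_file(rag_files: Any, name: str) -> Dict[str, Any]:
    """Call delete() on the ragFiles resource and return the Operation dict.

    `rag_files` is an assumption: the object returned by
    service.projects().locations().ragCorpora().ragFiles().
    """
    return rag_files.delete(name=name).execute()
```

Because `delete` returns a long-running operation, a caller would typically check `operation_outcome(...)` on the returned dict before treating the RagFile as gone.
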
get(name, x__xgafv=None)
Gets a RagFile.

Args:
  name: string, Required. The name of the RagFile resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file}` (required)
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # A RagFile contains user data for chunking, embedding and indexing.
  "createTime": "A String", # Output only. Timestamp when this RagFile was created.
  "description": "A String", # Optional. The description of the RagFile.
  "directUploadSource": { # The input content is encapsulated and uploaded in the request. # Output only. The RagFile is encapsulated and uploaded in the UploadRagFile request.
  },
  "displayName": "A String", # Required. The display name of the RagFile. The name can be up to 128 characters long and can consist of any UTF-8 characters.
  "fileStatus": { # RagFile status. # Output only. State of the RagFile.
    "errorStatus": "A String", # Output only. Only when the `state` field is ERROR.
    "state": "A String", # Output only. RagFile state.
  },
  "gcsSource": { # The Google Cloud Storage location for the input content. # Output only. Google Cloud Storage location of the RagFile. It does not support wildcards in the Cloud Storage uri for now.
    "uris": [ # Required. Google Cloud Storage URI(-s) to the input file(s). May contain wildcards. For more information on wildcards, see https://cloud.google.com/storage/docs/gsutil/addlhelp/WildcardNames.
      "A String",
    ],
  },
  "googleDriveSource": { # The Google Drive location for the input content. # Output only. Google Drive location. Supports importing individual files as well as Google Drive folders.
    "resourceIds": [ # Required. Google Drive resource IDs.
      { # The type and ID of the Google Drive resource.
        "resourceId": "A String", # Required. The ID of the Google Drive resource.
        "resourceType": "A String", # Required. The type of the Google Drive resource.
      },
    ],
  },
  "jiraSource": { # The Jira source for the ImportRagFilesRequest. # The RagFile is imported from a Jira query.
    "jiraQueries": [ # Required. The Jira queries.
      { # JiraQueries contains the Jira queries and corresponding authentication.
        "apiKeyConfig": { # The API secret. # Required. The SecretManager secret version resource name (e.g. projects/{project}/secrets/{secret}/versions/{version}) storing the Jira API key. See [Manage API tokens for your Atlassian account](https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/).
          "apiKeySecretVersion": "A String", # Required. The SecretManager secret version resource name storing API key. e.g. projects/{project}/secrets/{secret}/versions/{version}
        },
        "customQueries": [ # A list of custom Jira queries to import. For information about JQL (Jira Query Language), see https://support.atlassian.com/jira-service-management-cloud/docs/use-advanced-search-with-jira-query-language-jql/
          "A String",
        ],
        "email": "A String", # Required. The Jira email address.
        "projects": [ # A list of Jira projects to import in their entirety.
          "A String",
        ],
        "serverUri": "A String", # Required. The Jira server URI.
      },
    ],
  },
  "name": "A String", # Output only. The resource name of the RagFile.
  "ragFileType": "A String", # Output only. The type of the RagFile.
  "sharePointSources": { # The SharePointSources to pass to ImportRagFiles. # The RagFile is imported from a SharePoint source.
    "sharePointSources": [ # The SharePoint sources.
      { # An individual SharePointSource.
        "clientId": "A String", # The Application ID for the app registered in Microsoft Azure Portal. The application must also be configured with the MS Graph permissions "Files.Read.All", "Sites.Read.All", and "BrowserSiteLists.Read.All".
        "clientSecret": { # The API secret. # The application secret for the app registered in Azure.
          "apiKeySecretVersion": "A String", # Required. The SecretManager secret version resource name storing API key. e.g. projects/{project}/secrets/{secret}/versions/{version}
        },
        "driveId": "A String", # The ID of the drive to download from.
        "driveName": "A String", # The name of the drive to download from.
        "fileId": "A String", # Output only. The SharePoint file ID.
        "sharepointFolderId": "A String", # The ID of the SharePoint folder to download from.
        "sharepointFolderPath": "A String", # The path of the SharePoint folder to download from.
        "sharepointSiteName": "A String", # The name of the SharePoint site to download from. This can be the site name or the site id.
        "tenantId": "A String", # Unique identifier of the Azure Active Directory Instance.
      },
    ],
  },
  "sizeBytes": "A String", # Output only. The size of the RagFile in bytes.
  "slackSource": { # The Slack source for the ImportRagFilesRequest. # The RagFile is imported from a Slack channel.
    "channels": [ # Required. The Slack channels.
      { # SlackChannels contains the Slack channels and corresponding access token.
        "apiKeyConfig": { # The API secret. # Required. The SecretManager secret version resource name (e.g. projects/{project}/secrets/{secret}/versions/{version}) storing the Slack channel access token that has access to the slack channel IDs. See: https://api.slack.com/tutorials/tracks/getting-a-token.
          "apiKeySecretVersion": "A String", # Required. The SecretManager secret version resource name storing API key. e.g. projects/{project}/secrets/{secret}/versions/{version}
        },
        "channels": [ # Required. The Slack channel IDs.
          { # SlackChannel contains the Slack channel ID and the time range to import.
            "channelId": "A String", # Required. The Slack channel ID.
            "endTime": "A String", # Optional. The ending timestamp for messages to import.
            "startTime": "A String", # Optional. The starting timestamp for messages to import.
          },
        ],
      },
    ],
  },
  "updateTime": "A String", # Output only. Timestamp when this RagFile was last updated.
}
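
A returned RagFile is a plain dict, so its status fields can be read directly. The sketch below is illustrative only: `summarize_rag_file` is a hypothetical helper, and it assumes that `sizeBytes` arrives as a decimal string (as the "A String" placeholder above suggests) and that at most one of the source fields is populated.

```python
from typing import Any, Dict, Optional

# Source fields documented on the RagFile object above.
_SOURCE_KEYS = (
    "directUploadSource",
    "gcsSource",
    "googleDriveSource",
    "jiraSource",
    "sharePointSources",
    "slackSource",
)


def summarize_rag_file(rag_file: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Extract the name, state, error text, size, and source kind of a RagFile."""
    status = rag_file.get("fileStatus", {})
    size = rag_file.get("sizeBytes")
    # Report the first source field present in the dict, if any.
    source = next((key for key in _SOURCE_KEYS if key in rag_file), None)
    return {
        "name": rag_file.get("name"),
        "state": status.get("state"),
        # errorStatus is only set when `state` is ERROR.
        "error": status.get("errorStatus"),
        # Assumption: sizeBytes is a decimal string, so convert it to int.
        "size_bytes": int(size) if size is not None else None,
        "source": source,
    }
```
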
import_(parent, body=None, x__xgafv=None)
Import files from Google Cloud Storage or Google Drive into a RagCorpus.

Args:
  parent: string, Required. The name of the RagCorpus resource into which to import files. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus}` (required)
  body: object, The request body.
    The object takes the form of:

{ # Request message for VertexRagDataService.ImportRagFiles.
  "importRagFilesConfig": { # Config for importing RagFiles. # Required. The config for the RagFiles to be synced and imported into the RagCorpus.
    "gcsSource": { # The Google Cloud Storage location for the input content. # Google Cloud Storage location. Supports importing individual files as well as entire Google Cloud Storage directories. Sample formats: - `gs://bucket_name/my_directory/object_name/my_file.txt` - `gs://bucket_name/my_directory`
      "uris": [ # Required. Google Cloud Storage URI(-s) to the input file(s). May contain wildcards. For more information on wildcards, see https://cloud.google.com/storage/docs/gsutil/addlhelp/WildcardNames.
        "A String",
      ],
    },
    "googleDriveSource": { # The Google Drive location for the input content. # Google Drive location. Supports importing individual files as well as Google Drive folders.
      "resourceIds": [ # Required. Google Drive resource IDs.
        { # The type and ID of the Google Drive resource.
          "resourceId": "A String", # Required. The ID of the Google Drive resource.
          "resourceType": "A String", # Required. The type of the Google Drive resource.
        },
      ],
    },
    "jiraSource": { # The Jira source for the ImportRagFilesRequest. # Jira queries with their corresponding authentication.
      "jiraQueries": [ # Required. The Jira queries.
        { # JiraQueries contains the Jira queries and corresponding authentication.
          "apiKeyConfig": { # The API secret. # Required. The SecretManager secret version resource name (e.g. projects/{project}/secrets/{secret}/versions/{version}) storing the Jira API key. See [Manage API tokens for your Atlassian account](https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/).
            "apiKeySecretVersion": "A String", # Required. The SecretManager secret version resource name storing API key. e.g. projects/{project}/secrets/{secret}/versions/{version}
          },
          "customQueries": [ # A list of custom Jira queries to import. For information about JQL (Jira Query Language), see https://support.atlassian.com/jira-service-management-cloud/docs/use-advanced-search-with-jira-query-language-jql/
            "A String",
          ],
          "email": "A String", # Required. The Jira email address.
          "projects": [ # A list of Jira projects to import in their entirety.
            "A String",
          ],
          "serverUri": "A String", # Required. The Jira server URI.
        },
      ],
    },
    "maxEmbeddingRequestsPerMin": 42, # Optional. The maximum number of queries per minute that this job is allowed to make to the embedding model specified on the corpus. This value is specific to this job and not shared across other import jobs. Consult the Quotas page on the project to set an appropriate value here. If unspecified, a default value of 1,000 QPM is used.
    "partialFailureBigquerySink": { # The BigQuery location for the output content. # The BigQuery destination to write partial failures to. It should be a bigquery table resource name (e.g. "bq://projectId.bqDatasetId.bqTableId"). The dataset must exist. If the table does not exist, it will be created with the expected schema. If the table exists, the schema will be validated and data will be added to this existing table. Deprecated. Prefer to use `import_result_bq_sink`.
      "outputUri": "A String", # Required. BigQuery URI to a project or table, up to 2000 characters long. When only the project is specified, the Dataset and Table are created. When the full table reference is specified, the Dataset must exist and table must not exist. Accepted forms: * BigQuery path. For example: `bq://projectId` or `bq://projectId.bqDatasetId` or `bq://projectId.bqDatasetId.bqTableId`.
    },
    "partialFailureGcsSink": { # The Google Cloud Storage location where the output is to be written to. # The Cloud Storage path to write partial failures to. Deprecated. Prefer to use `import_result_gcs_sink`.
      "outputUriPrefix": "A String", # Required. Google Cloud Storage URI to output directory. If the uri doesn't end with '/', a '/' will be automatically appended. The directory is created if it doesn't exist.
    },
    "ragFileChunkingConfig": { # Specifies the size and overlap of chunks for RagFiles. # Specifies the size and overlap of chunks after importing RagFiles.
      "chunkOverlap": 42, # The overlap between chunks.
      "chunkSize": 42, # The size of the chunks.
    },
    "ragFileParsingConfig": { # Specifies the parsing config for RagFiles. # Specifies the parsing config for RagFiles.
      "useAdvancedPdfParsing": True or False, # Whether to use advanced PDF parsing.
    },
    "sharePointSources": { # The SharePointSources to pass to ImportRagFiles. # SharePoint sources.
      "sharePointSources": [ # The SharePoint sources.
        { # An individual SharePointSource.
          "clientId": "A String", # The Application ID for the app registered in Microsoft Azure Portal. The application must also be configured with the MS Graph permissions "Files.Read.All", "Sites.Read.All", and "BrowserSiteLists.Read.All".
          "clientSecret": { # The API secret. # The application secret for the app registered in Azure.
            "apiKeySecretVersion": "A String", # Required. The SecretManager secret version resource name storing API key. e.g. projects/{project}/secrets/{secret}/versions/{version}
          },
          "driveId": "A String", # The ID of the drive to download from.
          "driveName": "A String", # The name of the drive to download from.
          "fileId": "A String", # Output only. The SharePoint file ID.
          "sharepointFolderId": "A String", # The ID of the SharePoint folder to download from.
          "sharepointFolderPath": "A String", # The path of the SharePoint folder to download from.
          "sharepointSiteName": "A String", # The name of the SharePoint site to download from. This can be the site name or the site id.
          "tenantId": "A String", # Unique identifier of the Azure Active Directory Instance.
        },
      ],
    },
    "slackSource": { # The Slack source for the ImportRagFilesRequest. # Slack channels with their corresponding access tokens.
      "channels": [ # Required. The Slack channels.
        { # SlackChannels contains the Slack channels and corresponding access token.
          "apiKeyConfig": { # The API secret. # Required. The SecretManager secret version resource name (e.g. projects/{project}/secrets/{secret}/versions/{version}) storing the Slack channel access token that has access to the slack channel IDs. See: https://api.slack.com/tutorials/tracks/getting-a-token.
            "apiKeySecretVersion": "A String", # Required. The SecretManager secret version resource name storing API key. e.g. projects/{project}/secrets/{secret}/versions/{version}
          },
          "channels": [ # Required. The Slack channel IDs.
            { # SlackChannel contains the Slack channel ID and the time range to import.
              "channelId": "A String", # Required. The Slack channel ID.
              "endTime": "A String", # Optional. The ending timestamp for messages to import.
              "startTime": "A String", # Optional. The starting timestamp for messages to import.
            },
          ],
        },
      ],
    },
  },
}

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # This resource represents a long-running operation that is the result of a network API call.
  "done": True or False, # If the value is `false`, it means the operation is still in progress. If `true`, the operation is completed, and either `error` or `response` is available.
  "error": { # The `Status` type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by [gRPC](https://github.com/grpc). Each `Status` message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the [API Design Guide](https://cloud.google.com/apis/design/errors). # The error result of the operation in case of failure or cancellation.
    "code": 42, # The status code, which should be an enum value of google.rpc.Code.
    "details": [ # A list of messages that carry the error details. There is a common set of message types for APIs to use.
      {
        "a_key": "", # Properties of the object. Contains field @type with type URL.
      },
    ],
    "message": "A String", # A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client.
  },
  "metadata": { # Service-specific metadata associated with the operation. It typically contains progress information and common metadata such as create time. Some services might not provide such metadata. Any method that returns a long-running operation should document the metadata type, if any.
    "a_key": "", # Properties of the object. Contains field @type with type URL.
  },
  "name": "A String", # The server-assigned name, which is only unique within the same service that originally returns it. If you use the default HTTP mapping, the `name` should be a resource name ending with `operations/{unique_id}`.
  "response": { # The normal, successful response of the operation. If the original method returns no data on success, such as `Delete`, the response is `google.protobuf.Empty`. If the original method is standard `Get`/`Create`/`Update`, the response should be the resource. For other methods, the response should have the type `XxxResponse`, where `Xxx` is the original method name. For example, if the original method name is `TakeSnapshot()`, the inferred response type is `TakeSnapshotResponse`.
    "a_key": "", # Properties of the object. Contains field @type with type URL.
  },
}
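
The request body for `import_` is an ordinary nested dict. As a rough sketch of a Cloud Storage import, the helper below assembles only the fields it is given; `build_gcs_import_body` and `import_from_gcs` are hypothetical names, and `rag_files` is again assumed to be the resource returned by `service.projects().locations().ragCorpora().ragFiles()`.

```python
from typing import Any, Dict, Optional, Sequence


def build_gcs_import_body(
    uris: Sequence[str],
    chunk_size: Optional[int] = None,
    chunk_overlap: Optional[int] = None,
    max_embedding_requests_per_min: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble an import_ request body for a Cloud Storage source."""
    if not uris:
        raise ValueError("gcsSource.uris is required and must not be empty")
    config: Dict[str, Any] = {"gcsSource": {"uris": list(uris)}}

    # Only include the chunking config when at least one value is supplied.
    chunking: Dict[str, int] = {}
    if chunk_size is not None:
        chunking["chunkSize"] = chunk_size
    if chunk_overlap is not None:
        chunking["chunkOverlap"] = chunk_overlap
    if chunking:
        config["ragFileChunkingConfig"] = chunking

    # Omitting this field leaves the documented default in effect.
    if max_embedding_requests_per_min is not None:
        config["maxEmbeddingRequestsPerMin"] = max_embedding_requests_per_min

    return {"importRagFilesConfig": config}


def import_from_gcs(rag_files: Any, parent: str, uris: Sequence[str], **options: Any) -> Dict[str, Any]:
    """Start an import and return the long-running Operation dict.

    `rag_files` is an assumption: the ragFiles resource of a built service.
    """
    body = build_gcs_import_body(uris, **options)
    return rag_files.import_(parent=parent, body=body).execute()
```

The other sources (`googleDriveSource`, `jiraSource`, `sharePointSources`, `slackSource`) would slot into `importRagFilesConfig` the same way, using the field layouts shown in the request body above.
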
list(parent, pageSize=None, pageToken=None, x__xgafv=None)
Lists RagFiles in a RagCorpus.

Args:
  parent: string, Required. The resource name of the RagCorpus from which to list the RagFiles. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus}` (required)
  pageSize: integer, Optional. The standard list page size.
  pageToken: string, Optional. The standard list page token. Typically obtained via ListRagFilesResponse.next_page_token of the previous VertexRagDataService.ListRagFiles call.
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # Response message for VertexRagDataService.ListRagFiles.
  "nextPageToken": "A String", # A token to retrieve the next page of results. Pass to ListRagFilesRequest.page_token to obtain that page.
  "ragFiles": [ # List of RagFiles in the requested page.
    { # A RagFile contains user data for chunking, embedding and indexing.
      "createTime": "A String", # Output only. Timestamp when this RagFile was created.
      "description": "A String", # Optional. The description of the RagFile.
      "directUploadSource": { # The input content is encapsulated and uploaded in the request. # Output only. The RagFile is encapsulated and uploaded in the UploadRagFile request.
      },
      "displayName": "A String", # Required. The display name of the RagFile. The name can be up to 128 characters long and can consist of any UTF-8 characters.
      "fileStatus": { # RagFile status. # Output only. State of the RagFile.
        "errorStatus": "A String", # Output only. Only when the `state` field is ERROR.
        "state": "A String", # Output only. RagFile state.
      },
      "gcsSource": { # The Google Cloud Storage location for the input content. # Output only. Google Cloud Storage location of the RagFile. It does not support wildcards in the Cloud Storage uri for now.
        "uris": [ # Required. Google Cloud Storage URI(-s) to the input file(s). May contain wildcards. For more information on wildcards, see https://cloud.google.com/storage/docs/gsutil/addlhelp/WildcardNames.
          "A String",
        ],
      },
      "googleDriveSource": { # The Google Drive location for the input content. # Output only. Google Drive location. Supports importing individual files as well as Google Drive folders.
        "resourceIds": [ # Required. Google Drive resource IDs.
          { # The type and ID of the Google Drive resource.
            "resourceId": "A String", # Required. The ID of the Google Drive resource.
            "resourceType": "A String", # Required. The type of the Google Drive resource.
          },
        ],
      },
      "jiraSource": { # The Jira source for the ImportRagFilesRequest. # The RagFile is imported from a Jira query.
        "jiraQueries": [ # Required. The Jira queries.
          { # JiraQueries contains the Jira queries and corresponding authentication.
            "apiKeyConfig": { # The API secret. # Required. The SecretManager secret version resource name (e.g. projects/{project}/secrets/{secret}/versions/{version}) storing the Jira API key. See [Manage API tokens for your Atlassian account](https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/).
              "apiKeySecretVersion": "A String", # Required. The SecretManager secret version resource name storing API key. e.g. projects/{project}/secrets/{secret}/versions/{version}
            },
            "customQueries": [ # A list of custom Jira queries to import. For information about JQL (Jira Query Language), see https://support.atlassian.com/jira-service-management-cloud/docs/use-advanced-search-with-jira-query-language-jql/
              "A String",
            ],
            "email": "A String", # Required. The Jira email address.
            "projects": [ # A list of Jira projects to import in their entirety.
              "A String",
            ],
            "serverUri": "A String", # Required. The Jira server URI.
          },
        ],
      },
      "name": "A String", # Output only. The resource name of the RagFile.
      "ragFileType": "A String", # Output only. The type of the RagFile.
      "sharePointSources": { # The SharePointSources to pass to ImportRagFiles. # The RagFile is imported from a SharePoint source.
        "sharePointSources": [ # The SharePoint sources.
          { # An individual SharePointSource.
            "clientId": "A String", # The Application ID for the app registered in Microsoft Azure Portal. The application must also be configured with the MS Graph permissions "Files.Read.All", "Sites.Read.All", and "BrowserSiteLists.Read.All".
            "clientSecret": { # The API secret. # The application secret for the app registered in Azure.
              "apiKeySecretVersion": "A String", # Required. The SecretManager secret version resource name storing API key. e.g. projects/{project}/secrets/{secret}/versions/{version}
            },
            "driveId": "A String", # The ID of the drive to download from.
            "driveName": "A String", # The name of the drive to download from.
            "fileId": "A String", # Output only. The SharePoint file ID.
            "sharepointFolderId": "A String", # The ID of the SharePoint folder to download from.
            "sharepointFolderPath": "A String", # The path of the SharePoint folder to download from.
            "sharepointSiteName": "A String", # The name of the SharePoint site to download from. This can be the site name or the site id.
            "tenantId": "A String", # Unique identifier of the Azure Active Directory Instance.
          },
        ],
      },
      "sizeBytes": "A String", # Output only. The size of the RagFile in bytes.
      "slackSource": { # The Slack source for the ImportRagFilesRequest. # The RagFile is imported from a Slack channel.
        "channels": [ # Required. The Slack channels.
          { # SlackChannels contains the Slack channels and corresponding access token.
            "apiKeyConfig": { # The API secret. # Required. The SecretManager secret version resource name (e.g. projects/{project}/secrets/{secret}/versions/{version}) storing the Slack channel access token that has access to the slack channel IDs. See: https://api.slack.com/tutorials/tracks/getting-a-token.
              "apiKeySecretVersion": "A String", # Required. The SecretManager secret version resource name storing API key. e.g. projects/{project}/secrets/{secret}/versions/{version}
            },
            "channels": [ # Required. The Slack channel IDs.
              { # SlackChannel contains the Slack channel ID and the time range to import.
                "channelId": "A String", # Required. The Slack channel ID.
                "endTime": "A String", # Optional. The ending timestamp for messages to import.
                "startTime": "A String", # Optional. The starting timestamp for messages to import.
              },
            ],
          },
        ],
      },
      "updateTime": "A String", # Output only. Timestamp when this RagFile was last updated.
    },
  ],
}
list_next(previous_request, previous_response)
Retrieves the next page of results.

Args:
  previous_request: The request for the previous page. (required)
  previous_response: The response from the request for the previous page. (required)

Returns:
  A request object that you can call 'execute()' on to request the next page. Returns None if there are no more items in the collection.
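
Taken together, `list` and `list_next` support a simple paging loop: execute a request, consume `ragFiles`, then ask `list_next` for the following request until it returns None. The generator below is a minimal sketch of that loop; `iter_rag_files` is a hypothetical helper, and `rag_files` is assumed to be the ragFiles resource of a built service.

```python
from typing import Any, Dict, Iterator, Optional


def iter_rag_files(
    rag_files: Any, parent: str, page_size: Optional[int] = None
) -> Iterator[Dict[str, Any]]:
    """Yield every RagFile in a RagCorpus, following pages via list_next()."""
    request = rag_files.list(parent=parent, pageSize=page_size)
    while request is not None:
        response = request.execute()
        # A page with no files may omit the ragFiles key entirely.
        for rag_file in response.get("ragFiles", []):
            yield rag_file
        # list_next() returns None once there are no more pages.
        request = rag_files.list_next(
            previous_request=request, previous_response=response
        )
```

A caller might write `for f in iter_rag_files(rag_files, parent): print(f["name"])` without handling `nextPageToken` directly.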