Class LocalTokenizer

java.lang.Object
com.google.genai.LocalTokenizer

public final class LocalTokenizer extends Object
[Experimental] Text-only local tokenizer.

This class provides a local tokenizer for text-only token counting and token computation.

LIMITATIONS:

  • Supports text-based tokenization only; multimodal tokenization is not supported.
  • Forward compatibility depends on the open-source tokenizer models for future Gemini versions.

NOTE: The SDK's local tokenizer implementation is experimental and may change in the future. It supports text-based tokenization only.

  • Constructor Details

    • LocalTokenizer

      public LocalTokenizer(String modelName)
      Creates a new LocalTokenizer for the specified model.
      Parameters:
      modelName - the name of the model to load (e.g., "gemini-1.5-flash")
      Throws:
      IllegalArgumentException - if the model name is not supported or the tokenizer cannot be loaded
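
Because the constructor throws IllegalArgumentException for an unsupported model name, a caller may want to try several candidate names in order. The following is a minimal sketch of that pattern, not part of the SDK: the class TokenizerFallback and the method firstSupported are hypothetical names, and the factory parameter stands in for the constructor so that the example compiles without the SDK on the classpath.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper (not part of the SDK): picks the first model name
// that a tokenizer factory accepts.
final class TokenizerFallback {
    private TokenizerFallback() {}

    /**
     * Tries each model name in order and returns the first object the factory
     * can build. A name rejected with IllegalArgumentException is skipped.
     * Returns an empty Optional when every name is rejected.
     */
    static <T> Optional<T> firstSupported(
            List<String> modelNames, Function<String, T> factory) {
        for (String name : modelNames) {
            try {
                return Optional.of(factory.apply(name));
            } catch (IllegalArgumentException e) {
                // Unsupported model name or tokenizer could not be loaded;
                // fall through to the next candidate.
            }
        }
        return Optional.empty();
    }
}
```

With the SDK available, the constructor reference LocalTokenizer::new has the shape Function<String, LocalTokenizer> and could be passed as the factory. Other exception types are deliberately not caught, so unexpected failures still surface to the caller.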
  • Method Details

    • countTokens

      public CountTokensResult countTokens(List<Content> contents, CountTokensConfig config)
      Counts the number of tokens in a list of content objects.
      Parameters:
      contents - The contents to tokenize.
      config - The configuration for counting tokens.
      Returns:
      A CountTokensResult containing the total number of tokens.
    • countTokens

      public CountTokensResult countTokens(List<Content> contents)
      Counts the number of tokens in a list of content objects using default configuration.
      Parameters:
      contents - The contents to tokenize.
      Returns:
      A CountTokensResult containing the total number of tokens.
    • countTokens

      public CountTokensResult countTokens(Content content, CountTokensConfig config)
      Counts the number of tokens in a single content object.
      Parameters:
      content - The content to tokenize.
      config - The configuration for counting tokens.
      Returns:
      A CountTokensResult containing the total number of tokens.
    • countTokens

      public CountTokensResult countTokens(Content content)
      Counts the number of tokens in a single content object using default configuration.
      Parameters:
      content - The content to tokenize.
      Returns:
      A CountTokensResult containing the total number of tokens.
    • countTokens

      public CountTokensResult countTokens(String content, CountTokensConfig config)
      Counts the number of tokens in a text string.
      Parameters:
      content - The text content to tokenize.
      config - The configuration for counting tokens.
      Returns:
      A CountTokensResult containing the total number of tokens.
    • countTokens

      public CountTokensResult countTokens(String content)
      Counts the number of tokens in a text string using default configuration.
      Parameters:
      content - The text content to tokenize.
      Returns:
      A CountTokensResult containing the total number of tokens.
    • computeTokens

      public ComputeTokensResult computeTokens(List<Content> contents)
      Computes the token ids and string pieces for a list of content objects.
      Parameters:
      contents - The contents to tokenize.
      Returns:
      A ComputeTokensResult containing the token information.
    • computeTokens

      public ComputeTokensResult computeTokens(Content content)
      Computes the token ids and string pieces for a single content object.
      Parameters:
      content - The content to tokenize.
      Returns:
      A ComputeTokensResult containing the token information.
    • computeTokens

      public ComputeTokensResult computeTokens(String content)
      Computes the token ids and string pieces for a text string.
      Parameters:
      content - The text content to tokenize.
      Returns:
      A ComputeTokensResult containing the token information.
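
A common use of the countTokens(String) overload is checking a prompt against a token budget before sending it. The sketch below illustrates that check under stated assumptions: TokenBudget and fits are hypothetical names, and the counter parameter is a stand-in for a call to countTokens so that the example runs without the SDK. The accessors of CountTokensResult are not listed in this section, so the adapter that extracts the total is left to the caller.

```java
import java.util.function.ToIntFunction;

// Hypothetical helper (not part of the SDK): checks text against a token budget.
final class TokenBudget {
    private TokenBudget() {}

    /**
     * Returns true when the token count reported by the counter for the given
     * text does not exceed maxTokens.
     *
     * @param counter assumed adapter that maps a string to its token count,
     *                for example by wrapping LocalTokenizer.countTokens(String)
     */
    static boolean fits(String text, int maxTokens, ToIntFunction<String> counter) {
        if (maxTokens < 0) {
            throw new IllegalArgumentException("maxTokens must be non-negative");
        }
        return counter.applyAsInt(text) <= maxTokens;
    }
}
```

In tests, the counter can be any function, such as a whitespace splitter, which keeps the budget logic independent of the tokenizer model. In production the lambda would call countTokens and read the total from the returned CountTokensResult.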