Interface SentencepieceModel.NormalizerSpecOrBuilder

All Superinterfaces:
com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder<SentencepieceModel.NormalizerSpec>, com.google.protobuf.MessageLiteOrBuilder, com.google.protobuf.MessageOrBuilder
All Known Implementing Classes:
SentencepieceModel.NormalizerSpec, SentencepieceModel.NormalizerSpec.Builder
Enclosing class:
SentencepieceModel

public static interface SentencepieceModel.NormalizerSpecOrBuilder extends com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder<SentencepieceModel.NormalizerSpec>
  • Method Summary

    Modifier and Type
    Method
    Description
    boolean
    getAddDummyPrefix()
    Adds a dummy whitespace at the beginning of the text so that "world" is treated the same way in "world" and in "hello world".
    boolean
    getEscapeWhitespaces()
    Replaces whitespace with a meta symbol.
    String
    getName()
    Name of the normalization rule.
    com.google.protobuf.ByteString
    getNameBytes()
    Name of the normalization rule.
    String
    getNormalizationRuleTsv()
    Custom normalization rule file in TSV format.
    com.google.protobuf.ByteString
    getNormalizationRuleTsvBytes()
    Custom normalization rule file in TSV format.
    com.google.protobuf.ByteString
    getPrecompiledCharsmap()
    Pre-compiled normalization rule created by the Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method.
    boolean
    getRemoveExtraWhitespaces()
    Removes leading, trailing, and duplicate internal whitespace.
    boolean
    hasAddDummyPrefix()
    Adds a dummy whitespace at the beginning of the text so that "world" is treated the same way in "world" and in "hello world".
    boolean
    hasEscapeWhitespaces()
    Replaces whitespace with a meta symbol.
    boolean
    hasName()
    Name of the normalization rule.
    boolean
    hasNormalizationRuleTsv()
    Custom normalization rule file in TSV format.
    boolean
    hasPrecompiledCharsmap()
    Pre-compiled normalization rule created by the Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method.
    boolean
    hasRemoveExtraWhitespaces()
    Removes leading, trailing, and duplicate internal whitespace.

    Methods inherited from interface com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder

    getDefaultInstanceForType, getExtension, getExtension, getExtension, getExtension, getExtension, getExtension, getExtensionCount, getExtensionCount, getExtensionCount, hasExtension, hasExtension, hasExtension

    Methods inherited from interface com.google.protobuf.MessageLiteOrBuilder

    isInitialized

    Methods inherited from interface com.google.protobuf.MessageOrBuilder

    findInitializationErrors, getAllFields, getDescriptorForType, getField, getInitializationErrorString, getOneofFieldDescriptor, getRepeatedField, getRepeatedFieldCount, getUnknownFields, hasField, hasOneof
  • Method Details

    • hasName

      boolean hasName()
       Name of the normalization rule.
       
      optional string name = 1;
      Returns:
      Whether the name field is set.
    • getName

      String getName()
       Name of the normalization rule.
       
      optional string name = 1;
      Returns:
      The name.
    • getNameBytes

      com.google.protobuf.ByteString getNameBytes()
       Name of the normalization rule.
       
      optional string name = 1;
      Returns:
      The bytes for name.
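
The has/get pairs in this interface follow proto2 explicit-presence semantics: the has method reports whether the field was set, while the getter falls back to the field's default when it was not, and the Bytes variant exposes the UTF-8 encoding of the string. A minimal sketch of that contract follows. It uses a hypothetical stand-in class (OptionalStringField is not part of the generated API), and "nmt_nfkc" is only an example value:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in that mimics hasName()/getName()/getNameBytes();
// it is NOT the generated SentencepieceModel.NormalizerSpec class.
class PresenceSketch {
    static final class OptionalStringField {
        private String value; // null means "never set"

        void set(String v) { value = v; }

        // Mirrors hasName(): true only after an explicit set.
        boolean has() { return value != null; }

        // Mirrors getName(): an unset proto2 string reads as "".
        String get() { return value == null ? "" : value; }

        // Mirrors getNameBytes(): the UTF-8 encoding of the string.
        byte[] getBytes() { return get().getBytes(StandardCharsets.UTF_8); }
    }

    public static void main(String[] args) {
        OptionalStringField name = new OptionalStringField();
        System.out.println(name.has() + " [" + name.get() + "]");
        name.set("nmt_nfkc"); // example value only
        System.out.println(name.has() + " [" + name.get() + "] "
                + name.getBytes().length + " bytes");
    }
}
```

The same pattern applies to the boolean fields, except that their getters return the declared default (true) when unset.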
    • hasPrecompiledCharsmap

      boolean hasPrecompiledCharsmap()
       Pre-compiled normalization rule created by the
       Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method.
       This field is usually set by the Builder::GetNormalizerSpec() method.
       
      optional bytes precompiled_charsmap = 2;
      Returns:
      Whether the precompiledCharsmap field is set.
    • getPrecompiledCharsmap

      com.google.protobuf.ByteString getPrecompiledCharsmap()
       Pre-compiled normalization rule created by the
       Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method.
       This field is usually set by the Builder::GetNormalizerSpec() method.
       
      optional bytes precompiled_charsmap = 2;
      Returns:
      The precompiledCharsmap.
    • hasAddDummyPrefix

      boolean hasAddDummyPrefix()
       Adds a dummy whitespace at the beginning of the text so that "world"
       is treated the same way in "world" and in "hello world".
       
      optional bool add_dummy_prefix = 3 [default = true];
      Returns:
      Whether the addDummyPrefix field is set.
    • getAddDummyPrefix

      boolean getAddDummyPrefix()
       Adds a dummy whitespace at the beginning of the text so that "world"
       is treated the same way in "world" and in "hello world".
       
      optional bool add_dummy_prefix = 3 [default = true];
      Returns:
      The addDummyPrefix.
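
To illustrate why the dummy prefix matters, the sketch below splits text into chunks that each keep their leading space. It is a simplification under stated assumptions (plain ASCII spaces, a hypothetical helper class DummyPrefixSketch) and not the library's actual segmentation:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the SentencePiece API.
class DummyPrefixSketch {
    // Prepends one space when the flag is on (the field defaults to true).
    static String normalize(String text, boolean addDummyPrefix) {
        return addDummyPrefix ? " " + text : text;
    }

    // Splits before every space so each chunk keeps its leading space.
    static List<String> chunks(String text) {
        return Arrays.asList(text.split("(?= )"));
    }

    public static void main(String[] args) {
        System.out.println(chunks(normalize("world", true)));
        System.out.println(chunks(normalize("hello world", true)));
        System.out.println(chunks(normalize("world", false)));
    }
}
```

With the prefix enabled, both inputs yield the same space-led chunk for "world". With it disabled, the sentence-initial "world" has no leading space and no longer matches the chunk produced inside "hello world".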
    • hasRemoveExtraWhitespaces

      boolean hasRemoveExtraWhitespaces()
       Removes leading, trailing, and duplicate internal whitespace.
       
      optional bool remove_extra_whitespaces = 4 [default = true];
      Returns:
      Whether the removeExtraWhitespaces field is set.
    • getRemoveExtraWhitespaces

      boolean getRemoveExtraWhitespaces()
       Removes leading, trailing, and duplicate internal whitespace.
       
      optional bool remove_extra_whitespaces = 4 [default = true];
      Returns:
      The removeExtraWhitespaces.
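
The effect described for remove_extra_whitespaces can be approximated as below. This is a sketch only: the helper class is hypothetical and it treats just the ASCII space as whitespace, which may be narrower than what the library handles:

```java
// Hypothetical helper; treats only the ASCII space as whitespace (an assumption).
class ExtraWhitespaceSketch {
    static String removeExtraWhitespaces(String text) {
        return text
            .replaceAll("^ +| +$", "")  // drop leading and trailing spaces
            .replaceAll(" {2,}", " ");  // collapse internal runs to one space
    }

    public static void main(String[] args) {
        System.out.println("[" + removeExtraWhitespaces("  hello   world ") + "]");
    }
}
```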
    • hasEscapeWhitespaces

      boolean hasEscapeWhitespaces()
       Replaces whitespace with a meta symbol.
       This field must be true to train a SentencePiece model.
       
      optional bool escape_whitespaces = 5 [default = true];
      Returns:
      Whether the escapeWhitespaces field is set.
    • getEscapeWhitespaces

      boolean getEscapeWhitespaces()
       Replaces whitespace with a meta symbol.
       This field must be true to train a SentencePiece model.
       
      optional bool escape_whitespaces = 5 [default = true];
      Returns:
      The escapeWhitespaces.
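
This page does not name the meta symbol. SentencePiece itself uses U+2581 (LOWER ONE EIGHTH BLOCK), and the sketch below assumes that character. The helper class is hypothetical and only replaces ASCII spaces:

```java
// Hypothetical helper; assumes U+2581 as the meta symbol, which is the
// character SentencePiece uses but which this page does not state.
class EscapeWhitespaceSketch {
    static final char META = '\u2581';

    // Replaces every ASCII space with the meta symbol.
    static String escape(String text) { return text.replace(' ', META); }

    // Reverses escape(), turning meta symbols back into spaces.
    static String unescape(String pieces) { return pieces.replace(META, ' '); }

    public static void main(String[] args) {
        String escaped = escape(" hello world");
        System.out.println(escaped);
        System.out.println("[" + unescape(escaped) + "]");
    }
}
```

Because the replacement is one-to-one, the original spacing can be recovered from the escaped text.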
    • hasNormalizationRuleTsv

      boolean hasNormalizationRuleTsv()
       Custom normalization rule file in TSV format. For the format, see
       https://github.com/google/sentencepiece/blob/master/doc/normalization.md
       This field is used only by the SentencePieceTrainer::Train() method, which
       compiles the rules into the binary form stored in `precompiled_charsmap`.
       
      optional string normalization_rule_tsv = 6;
      Returns:
      Whether the normalizationRuleTsv field is set.
    • getNormalizationRuleTsv

      String getNormalizationRuleTsv()
       Custom normalization rule file in TSV format. For the format, see
       https://github.com/google/sentencepiece/blob/master/doc/normalization.md
       This field is used only by the SentencePieceTrainer::Train() method, which
       compiles the rules into the binary form stored in `precompiled_charsmap`.
       
      optional string normalization_rule_tsv = 6;
      Returns:
      The normalizationRuleTsv.
    • getNormalizationRuleTsvBytes

      com.google.protobuf.ByteString getNormalizationRuleTsvBytes()
       Custom normalization rule file in TSV format. For the format, see
       https://github.com/google/sentencepiece/blob/master/doc/normalization.md
       This field is used only by the SentencePieceTrainer::Train() method, which
       compiles the rules into the binary form stored in `precompiled_charsmap`.
       
      optional string normalization_rule_tsv = 6;
      Returns:
      The bytes for normalizationRuleTsv.
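
For orientation, the sketch below reads a rule file into a source-to-target map. The row layout it assumes (space-separated hexadecimal code points, source column, a tab, then target column) is taken from the linked normalization document and should be checked against it. The parser class is hypothetical, and it does not produce the binary `precompiled_charsmap` form:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for illustration; the row layout (hex code points,
// space-separated, source<TAB>target) is an assumption, not confirmed here.
class RuleTsvSketch {
    // Turns "41 302 300" into the string made of those code points.
    static String decode(String hexCodePoints) {
        StringBuilder sb = new StringBuilder();
        for (String hex : hexCodePoints.trim().split(" +")) {
            if (!hex.isEmpty()) sb.appendCodePoint(Integer.parseInt(hex, 16));
        }
        return sb.toString();
    }

    // Maps each source sequence to its replacement, in file order.
    static Map<String, String> parse(String tsv) {
        Map<String, String> rules = new LinkedHashMap<>();
        for (String line : tsv.split("\n")) {
            if (line.trim().isEmpty()) continue;   // skip blank lines
            String[] cols = line.split("\t", -1);  // keep an empty target column
            String target = cols.length > 1 ? decode(cols[1]) : "";
            rules.put(decode(cols[0]), target);
        }
        return rules;
    }

    public static void main(String[] args) {
        Map<String, String> rules = parse("41 302 300\t1EA6\n");
        System.out.println(rules.size() + " rule(s) parsed");
    }
}
```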