Package com.google.genai.proto
Class SentencepieceModel.NormalizerSpec
java.lang.Object
com.google.protobuf.AbstractMessageLite
com.google.protobuf.AbstractMessage
com.google.protobuf.GeneratedMessageV3
com.google.protobuf.GeneratedMessageV3.ExtendableMessage<SentencepieceModel.NormalizerSpec>
com.google.genai.proto.SentencepieceModel.NormalizerSpec
- All Implemented Interfaces:
SentencepieceModel.NormalizerSpecOrBuilder
,com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder<SentencepieceModel.NormalizerSpec>
,com.google.protobuf.Message
,com.google.protobuf.MessageLite
,com.google.protobuf.MessageLiteOrBuilder
,com.google.protobuf.MessageOrBuilder
,Serializable
- Enclosing class:
- SentencepieceModel
public static final class SentencepieceModel.NormalizerSpec
extends com.google.protobuf.GeneratedMessageV3.ExtendableMessage<SentencepieceModel.NormalizerSpec>
implements SentencepieceModel.NormalizerSpecOrBuilder
NormalizerSpec encodes various parameters for string normalization.
Protobuf type
com.google.genai.proto.NormalizerSpec
-
Nested Class Summary
Nested Classes
| Modifier and Type | Class | Description |
|---|---|---|
| static final class | SentencepieceModel.NormalizerSpec.Builder | NormalizerSpec encodes various parameters for string normalization. |
Nested classes/interfaces inherited from class com.google.protobuf.GeneratedMessageV3
com.google.protobuf.GeneratedMessageV3.ExtendableBuilder<MessageT extends com.google.protobuf.GeneratedMessageV3.ExtendableMessage<MessageT>, BuilderT extends com.google.protobuf.GeneratedMessageV3.ExtendableBuilder<MessageT, BuilderT>>, com.google.protobuf.GeneratedMessageV3.ExtendableMessage<MessageT extends com.google.protobuf.GeneratedMessageV3.ExtendableMessage<MessageT>>, com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder<MessageT extends com.google.protobuf.GeneratedMessageV3.ExtendableMessage<MessageT>>, com.google.protobuf.GeneratedMessageV3.FieldAccessorTable
-
Field Summary
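Fields
| Modifier and Type | Field | Description |
|---|---|---|
| static final int | ADD_DUMMY_PREFIX_FIELD_NUMBER | |
| static final int | ESCAPE_WHITESPACES_FIELD_NUMBER | |
| static final int | NAME_FIELD_NUMBER | |
| static final int | NORMALIZATION_RULE_TSV_FIELD_NUMBER | |
| static final com.google.protobuf.Parser<SentencepieceModel.NormalizerSpec> | PARSER | Deprecated. |
| static final int | PRECOMPILED_CHARSMAP_FIELD_NUMBER | |
| static final int | REMOVE_EXTRA_WHITESPACES_FIELD_NUMBER | |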
-
Method Summary
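| Modifier and Type | Method | Description |
|---|---|---|
| boolean | equals(Object obj) | |
| boolean | getAddDummyPrefix() | Adds a dummy whitespace at the beginning of the text so that "world" is treated the same way in "world" and in "hello world". |
| static SentencepieceModel.NormalizerSpec | getDefaultInstance() | |
| SentencepieceModel.NormalizerSpec | getDefaultInstanceForType() | |
| static final com.google.protobuf.Descriptors.Descriptor | getDescriptor() | |
| boolean | getEscapeWhitespaces() | Replaces whitespace with a meta symbol. |
| String | getName() | Name of the normalization rule. |
| com.google.protobuf.ByteString | getNameBytes() | Name of the normalization rule. |
| String | getNormalizationRuleTsv() | Custom normalization rule file in TSV format. |
| com.google.protobuf.ByteString | getNormalizationRuleTsvBytes() | Custom normalization rule file in TSV format. |
| com.google.protobuf.Parser<SentencepieceModel.NormalizerSpec> | getParserForType() | |
| com.google.protobuf.ByteString | getPrecompiledCharsmap() | Pre-compiled normalization rule created by the Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method. |
| boolean | getRemoveExtraWhitespaces() | Removes leading, trailing, and duplicate internal whitespace. |
| int | getSerializedSize() | |
| boolean | hasAddDummyPrefix() | Adds a dummy whitespace at the beginning of the text so that "world" is treated the same way in "world" and in "hello world". |
| boolean | hasEscapeWhitespaces() | Replaces whitespace with a meta symbol. |
| int | hashCode() | |
| boolean | hasName() | Name of the normalization rule. |
| boolean | hasNormalizationRuleTsv() | Custom normalization rule file in TSV format. |
| boolean | hasPrecompiledCharsmap() | Pre-compiled normalization rule created by the Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method. |
| boolean | hasRemoveExtraWhitespaces() | Removes leading, trailing, and duplicate internal whitespace. |
| final boolean | isInitialized() | |
| static SentencepieceModel.NormalizerSpec.Builder | newBuilder() | |
| static SentencepieceModel.NormalizerSpec.Builder | newBuilder(SentencepieceModel.NormalizerSpec prototype) | |
| SentencepieceModel.NormalizerSpec.Builder | newBuilderForType() | |
| static SentencepieceModel.NormalizerSpec | parseDelimitedFrom(InputStream input) | |
| static SentencepieceModel.NormalizerSpec | parseDelimitedFrom(InputStream input, com.google.protobuf.ExtensionRegistryLite extensionRegistry) | |
| static SentencepieceModel.NormalizerSpec | parseFrom(byte[] data) | |
| static SentencepieceModel.NormalizerSpec | parseFrom(byte[] data, com.google.protobuf.ExtensionRegistryLite extensionRegistry) | |
| static SentencepieceModel.NormalizerSpec | parseFrom(com.google.protobuf.ByteString data) | |
| static SentencepieceModel.NormalizerSpec | parseFrom(com.google.protobuf.ByteString data, com.google.protobuf.ExtensionRegistryLite extensionRegistry) | |
| static SentencepieceModel.NormalizerSpec | parseFrom(com.google.protobuf.CodedInputStream input) | |
| static SentencepieceModel.NormalizerSpec | parseFrom(com.google.protobuf.CodedInputStream input, com.google.protobuf.ExtensionRegistryLite extensionRegistry) | |
| static SentencepieceModel.NormalizerSpec | parseFrom(InputStream input) | |
| static SentencepieceModel.NormalizerSpec | parseFrom(InputStream input, com.google.protobuf.ExtensionRegistryLite extensionRegistry) | |
| static SentencepieceModel.NormalizerSpec | parseFrom(ByteBuffer data) | |
| static SentencepieceModel.NormalizerSpec | parseFrom(ByteBuffer data, com.google.protobuf.ExtensionRegistryLite extensionRegistry) | |
| static com.google.protobuf.Parser<SentencepieceModel.NormalizerSpec> | parser() | |
| SentencepieceModel.NormalizerSpec.Builder | toBuilder() | |
| void | writeTo(com.google.protobuf.CodedOutputStream output) | |
Methods inherited from class com.google.protobuf.GeneratedMessageV3.ExtendableMessage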
getAllFields, getAllFieldsRaw, getExtension, getExtension, getExtension, getExtension, getExtension, getExtension, getExtensionCount, getExtensionCount, getExtensionCount, getField, getRepeatedField, getRepeatedFieldCount, hasExtension, hasExtension, hasExtension, hasField
Methods inherited from class com.google.protobuf.GeneratedMessageV3
getDescriptorForType, getOneofFieldDescriptor, getUnknownFields, hasOneof
Methods inherited from class com.google.protobuf.AbstractMessage
findInitializationErrors, getInitializationErrorString, toString
Methods inherited from class com.google.protobuf.AbstractMessageLite
toByteArray, toByteString, writeDelimitedTo, writeTo
Methods inherited from interface com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder
getExtension, getExtension, getExtension, getExtension, getExtension, getExtension, getExtensionCount, getExtensionCount, getExtensionCount, hasExtension, hasExtension, hasExtension
Methods inherited from interface com.google.protobuf.MessageLite
toByteArray, toByteString, writeDelimitedTo, writeTo
Methods inherited from interface com.google.protobuf.MessageOrBuilder
findInitializationErrors, getAllFields, getDescriptorForType, getField, getInitializationErrorString, getOneofFieldDescriptor, getRepeatedField, getRepeatedFieldCount, getUnknownFields, hasField, hasOneof
-
Field Details
-
NAME_FIELD_NUMBER
public static final int NAME_FIELD_NUMBER
-
PRECOMPILED_CHARSMAP_FIELD_NUMBER
public static final int PRECOMPILED_CHARSMAP_FIELD_NUMBER
-
ADD_DUMMY_PREFIX_FIELD_NUMBER
public static final int ADD_DUMMY_PREFIX_FIELD_NUMBER
-
REMOVE_EXTRA_WHITESPACES_FIELD_NUMBER
public static final int REMOVE_EXTRA_WHITESPACES_FIELD_NUMBER
-
ESCAPE_WHITESPACES_FIELD_NUMBER
public static final int ESCAPE_WHITESPACES_FIELD_NUMBER
-
NORMALIZATION_RULE_TSV_FIELD_NUMBER
public static final int NORMALIZATION_RULE_TSV_FIELD_NUMBER
-
PARSER
@Deprecated public static final com.google.protobuf.Parser<SentencepieceModel.NormalizerSpec> PARSER
Deprecated.
-
-
Method Details
-
getDescriptor
public static final com.google.protobuf.Descriptors.Descriptor getDescriptor()
-
hasName
public boolean hasName()
Name of the normalization rule.
optional string name = 1;
- Specified by:
hasName
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- Whether the name field is set.
-
getName
public String getName()
Name of the normalization rule.
optional string name = 1;
- Specified by:
getName
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The name.
-
getNameBytes
public com.google.protobuf.ByteString getNameBytes()
Name of the normalization rule.
optional string name = 1;
- Specified by:
getNameBytes
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The bytes for name.
-
hasPrecompiledCharsmap
public boolean hasPrecompiledCharsmap()
Pre-compiled normalization rule created by the Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method. This field is usually set by the Builder::GetNormalizerSpec() method.
optional bytes precompiled_charsmap = 2;
- Specified by:
hasPrecompiledCharsmap
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- Whether the precompiledCharsmap field is set.
-
getPrecompiledCharsmap
public com.google.protobuf.ByteString getPrecompiledCharsmap()
Pre-compiled normalization rule created by the Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method. This field is usually set by the Builder::GetNormalizerSpec() method.
optional bytes precompiled_charsmap = 2;
- Specified by:
getPrecompiledCharsmap
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The precompiledCharsmap.
-
hasAddDummyPrefix
public boolean hasAddDummyPrefix()
Adds a dummy whitespace at the beginning of the text so that "world" is treated the same way in "world" and in "hello world".
optional bool add_dummy_prefix = 3 [default = true];
- Specified by:
hasAddDummyPrefix
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- Whether the addDummyPrefix field is set.
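The difference between hasAddDummyPrefix() and getAddDummyPrefix() follows the usual proto2 optional-field convention: the has-method reports whether a value was explicitly set, while the getter falls back to the declared default (true for this field). The sketch below models that behavior in plain Java. OptionalBoolField is a hypothetical stand-in written for illustration, not part of the generated class.

```java
/** Hypothetical stand-in modeling a proto2 optional bool with a declared default. */
final class OptionalBoolField {
    private final boolean defaultValue;
    private boolean present; // presence bit: true once a value was explicitly set
    private boolean value;

    OptionalBoolField(boolean defaultValue) {
        this.defaultValue = defaultValue;
    }

    /** Stores an explicit value and marks the field as present. */
    void set(boolean newValue) {
        present = true;
        value = newValue;
    }

    /** Mirrors hasXxx(): true only after an explicit set. */
    boolean has() {
        return present;
    }

    /** Mirrors getXxx(): the explicit value if present, otherwise the default. */
    boolean get() {
        return present ? value : defaultValue;
    }
}
```

With a default of true, an unset field reports has() == false and get() == true, so callers that need to distinguish "unset" from "explicitly true" should check the has-method first.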
-
getAddDummyPrefix
public boolean getAddDummyPrefix()
Adds a dummy whitespace at the beginning of the text so that "world" is treated the same way in "world" and in "hello world".
optional bool add_dummy_prefix = 3 [default = true];
- Specified by:
getAddDummyPrefix
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The addDummyPrefix.
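To make the effect of add_dummy_prefix concrete, here is a minimal plain-Java sketch of the idea. DummyPrefixSketch and its apply method are hypothetical names; the code does not call the tokenizer and ignores edge cases such as empty input, so treat it as an illustration rather than the library's actual behavior.

```java
/** Hypothetical illustration of the add_dummy_prefix flag; not library code. */
final class DummyPrefixSketch {
    /** Prepends a single space when the flag is on, otherwise returns the text unchanged. */
    static String apply(String text, boolean addDummyPrefix) {
        return addDummyPrefix ? " " + text : text;
    }

    public static void main(String[] args) {
        String alone = apply("world", true);
        String inSentence = apply("hello world", true);
        // With the prefix, "world" is preceded by a space in both inputs.
        System.out.println(alone.contains(" world") && inSentence.contains(" world"));
    }
}
```

Without the prefix, a word at the start of the text would have no leading space while the same word later in the text would, which is the inconsistency the flag is meant to avoid.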
-
hasRemoveExtraWhitespaces
public boolean hasRemoveExtraWhitespaces()
Removes leading, trailing, and duplicate internal whitespace.
optional bool remove_extra_whitespaces = 4 [default = true];
- Specified by:
hasRemoveExtraWhitespaces
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- Whether the removeExtraWhitespaces field is set.
-
getRemoveExtraWhitespaces
public boolean getRemoveExtraWhitespaces()
Removes leading, trailing, and duplicate internal whitespace.
optional bool remove_extra_whitespaces = 4 [default = true];
- Specified by:
getRemoveExtraWhitespaces
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The removeExtraWhitespaces.
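The whitespace cleanup described for remove_extra_whitespaces can be sketched as follows. This is a simplified plain-Java illustration that only considers the ASCII space character; ExtraWhitespaceSketch is a hypothetical name, and the library's handling of other whitespace characters may differ.

```java
/** Hypothetical illustration of the remove_extra_whitespaces flag; not library code. */
final class ExtraWhitespaceSketch {
    /** Strips leading and trailing spaces, then collapses runs of spaces into one. */
    static String removeExtraWhitespaces(String text) {
        String stripped = text.replaceAll("^ +| +$", ""); // leading and trailing spaces
        return stripped.replaceAll(" {2,}", " ");         // duplicate internal spaces
    }
}
```

For example, removeExtraWhitespaces("  hello   world ") yields "hello world".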
-
hasEscapeWhitespaces
public boolean hasEscapeWhitespaces()
Replaces whitespace with a meta symbol. This field must be true to train a SentencePiece model.
optional bool escape_whitespaces = 5 [default = true];
- Specified by:
hasEscapeWhitespaces
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- Whether the escapeWhitespaces field is set.
-
getEscapeWhitespaces
public boolean getEscapeWhitespaces()
Replaces whitespace with a meta symbol. This field must be true to train a SentencePiece model.
optional bool escape_whitespaces = 5 [default = true];
- Specified by:
getEscapeWhitespaces
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The escapeWhitespaces.
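This page does not name the meta symbol. The SentencePiece project documents it as U+2581 (LOWER ONE EIGHTH BLOCK), and the sketch below assumes that character. EscapeWhitespaceSketch is a hypothetical name; the code only illustrates the substitution on ASCII spaces and does not call the library.

```java
/** Hypothetical illustration of the escape_whitespaces flag; not library code. */
final class EscapeWhitespaceSketch {
    /** Assumed meta symbol: U+2581, as described by the SentencePiece project. */
    static final char META_SPACE = '\u2581';

    /** Replaces every ASCII space with the meta symbol. */
    static String escape(String text) {
        return text.replace(' ', META_SPACE);
    }

    /** Reverses escape(), turning meta symbols back into spaces. */
    static String unescape(String text) {
        return text.replace(META_SPACE, ' ');
    }
}
```

Because the substitution is reversible, escaped text can be converted back to the original spacing after tokenization.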
-
hasNormalizationRuleTsv
public boolean hasNormalizationRuleTsv()
Custom normalization rule file in TSV format. See https://github.com/google/sentencepiece/blob/master/doc/normalization.md. This field is only used by the SentencePieceTrainer::Train() method, which compiles the rule into the binary rule stored in `precompiled_charsmap`.
optional string normalization_rule_tsv = 6;
- Specified by:
hasNormalizationRuleTsv
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- Whether the normalizationRuleTsv field is set.
-
getNormalizationRuleTsv
public String getNormalizationRuleTsv()
Custom normalization rule file in TSV format. See https://github.com/google/sentencepiece/blob/master/doc/normalization.md. This field is only used by the SentencePieceTrainer::Train() method, which compiles the rule into the binary rule stored in `precompiled_charsmap`.
optional string normalization_rule_tsv = 6;
- Specified by:
getNormalizationRuleTsv
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The normalizationRuleTsv.
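As a rough illustration of what a rule file might contain, the sketch below parses a single rule line under the assumption, taken from the linked normalization.md rather than from this page, that each line holds source code points and target code points as space-separated hexadecimal values in two tab-separated columns. NormalizationRuleLineSketch and its methods are hypothetical names, and the real trainer may accept more than this.

```java
import java.util.Arrays;

/** Hypothetical parser for one line of a normalization rule TSV; not library code. */
final class NormalizationRuleLineSketch {
    /** Decodes space-separated hex code points, e.g. "41 302 300", into a String. */
    static String decodeCodePoints(String hexList) {
        String trimmed = hexList.trim();
        if (trimmed.isEmpty()) {
            return ""; // an empty column maps to the empty string
        }
        int[] codePoints = Arrays.stream(trimmed.split(" +"))
                .mapToInt(hex -> Integer.parseInt(hex, 16))
                .toArray();
        return new String(codePoints, 0, codePoints.length);
    }

    /** Splits one line into {source, target}; any further columns are ignored. */
    static String[] parseLine(String line) {
        String[] columns = line.split("\t", -1);
        if (columns.length < 2) {
            throw new IllegalArgumentException("expected at least two tab-separated columns");
        }
        return new String[] {decodeCodePoints(columns[0]), decodeCodePoints(columns[1])};
    }
}
```

Under that assumption, the line "41 302 300", a tab, then "1EA6" maps the sequence A + U+0302 + U+0300 to the single character U+1EA6.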
-
getNormalizationRuleTsvBytes
public com.google.protobuf.ByteString getNormalizationRuleTsvBytes()
Custom normalization rule file in TSV format. See https://github.com/google/sentencepiece/blob/master/doc/normalization.md. This field is only used by the SentencePieceTrainer::Train() method, which compiles the rule into the binary rule stored in `precompiled_charsmap`.
optional string normalization_rule_tsv = 6;
- Specified by:
getNormalizationRuleTsvBytes
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The bytes for normalizationRuleTsv.
-
isInitialized
public final boolean isInitialized()
- Specified by:
isInitialized
in interface com.google.protobuf.MessageLiteOrBuilder
- Overrides:
isInitialized
in class com.google.protobuf.GeneratedMessageV3.ExtendableMessage<SentencepieceModel.NormalizerSpec>
-
writeTo
public void writeTo(com.google.protobuf.CodedOutputStream output) throws IOException
- Specified by:
writeTo
in interface com.google.protobuf.MessageLite
- Overrides:
writeTo
in class com.google.protobuf.GeneratedMessageV3
- Throws:
IOException
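writeTo emits the message in the protobuf wire format. As a hedged illustration of what that looks like for two of this message's fields, the sketch below hand-encodes name (field 1, length-delimited) and add_dummy_prefix (field 3, varint) using the standard rule tag = (field_number << 3) | wire_type. WireFormatSketch is a hypothetical helper, not the generated serializer, and it writes both fields as if they had been explicitly set.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical hand-rolled encoder for two NormalizerSpec fields; not library code. */
final class WireFormatSketch {
    private static final int WIRE_TYPE_VARINT = 0;
    private static final int WIRE_TYPE_LENGTH_DELIMITED = 2;

    /** Writes a non-negative int as a base-128 varint, least significant group first. */
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // continuation bit set
            value >>>= 7;
        }
        out.write(value);
    }

    /** Encodes name (field 1) followed by add_dummy_prefix (field 3). */
    static byte[] encode(String name, boolean addDummyPrefix) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (1 << 3) | WIRE_TYPE_LENGTH_DELIMITED); // tag 0x0A
        writeVarint(out, utf8.length);                           // payload length
        out.write(utf8, 0, utf8.length);                         // UTF-8 payload
        writeVarint(out, (3 << 3) | WIRE_TYPE_VARINT);           // tag 0x18
        out.write(addDummyPrefix ? 1 : 0);                       // bool as varint 0 or 1
        return out.toByteArray();
    }
}
```

For name "nfkc" and add_dummy_prefix false, this produces the bytes 0A 04 6E 66 6B 63 18 00.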
-
getSerializedSize
public int getSerializedSize()
- Specified by:
getSerializedSize
in interface com.google.protobuf.MessageLite
- Overrides:
getSerializedSize
in class com.google.protobuf.GeneratedMessageV3
-
equals
public boolean equals(Object obj)
- Specified by:
equals
in interface com.google.protobuf.Message
- Overrides:
equals
in class com.google.protobuf.AbstractMessage
-
hashCode
public int hashCode()
- Specified by:
hashCode
in interface com.google.protobuf.Message
- Overrides:
hashCode
in class com.google.protobuf.AbstractMessage
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(ByteBuffer data) throws com.google.protobuf.InvalidProtocolBufferException
- Throws:
com.google.protobuf.InvalidProtocolBufferException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(ByteBuffer data, com.google.protobuf.ExtensionRegistryLite extensionRegistry) throws com.google.protobuf.InvalidProtocolBufferException
- Throws:
com.google.protobuf.InvalidProtocolBufferException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(com.google.protobuf.ByteString data) throws com.google.protobuf.InvalidProtocolBufferException
- Throws:
com.google.protobuf.InvalidProtocolBufferException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(com.google.protobuf.ByteString data, com.google.protobuf.ExtensionRegistryLite extensionRegistry) throws com.google.protobuf.InvalidProtocolBufferException
- Throws:
com.google.protobuf.InvalidProtocolBufferException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(byte[] data) throws com.google.protobuf.InvalidProtocolBufferException
- Throws:
com.google.protobuf.InvalidProtocolBufferException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(byte[] data, com.google.protobuf.ExtensionRegistryLite extensionRegistry) throws com.google.protobuf.InvalidProtocolBufferException
- Throws:
com.google.protobuf.InvalidProtocolBufferException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(InputStream input) throws IOException
- Throws:
IOException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(InputStream input, com.google.protobuf.ExtensionRegistryLite extensionRegistry) throws IOException
- Throws:
IOException
-
parseDelimitedFrom
public static SentencepieceModel.NormalizerSpec parseDelimitedFrom(InputStream input) throws IOException
- Throws:
IOException
-
parseDelimitedFrom
public static SentencepieceModel.NormalizerSpec parseDelimitedFrom(InputStream input, com.google.protobuf.ExtensionRegistryLite extensionRegistry) throws IOException
- Throws:
IOException
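parseDelimitedFrom differs from parseFrom(InputStream) in that it expects the message to be preceded by its size encoded as a varint, the framing produced by writeDelimitedTo, which allows several messages to share one stream. The sketch below shows that framing on raw byte payloads in plain Java. DelimitedFramingSketch is a hypothetical helper that only illustrates the length prefix; it does not parse NormalizerSpec fields.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical illustration of varint length-prefixed framing; not library code. */
final class DelimitedFramingSketch {
    /** Writes the payload length as a varint, then the payload bytes. */
    static void writeDelimited(OutputStream out, byte[] payload) throws IOException {
        int length = payload.length;
        while ((length & ~0x7F) != 0) {
            out.write((length & 0x7F) | 0x80);
            length >>>= 7;
        }
        out.write(length);
        out.write(payload);
    }

    /** Reads one framed payload, or returns null if the stream is already at its end. */
    static byte[] readDelimited(InputStream in) throws IOException {
        int current = in.read();
        if (current == -1) {
            return null; // clean end of stream, no more messages
        }
        int length = current & 0x7F;
        int shift = 7;
        while ((current & 0x80) != 0) {
            current = in.read();
            if (current == -1) throw new EOFException("truncated length prefix");
            if (shift > 28) throw new IOException("length prefix too long");
            length |= (current & 0x7F) << shift;
            shift += 7;
        }
        if (length < 0) throw new IOException("negative length");
        byte[] payload = new byte[length];
        int offset = 0;
        while (offset < length) {
            int read = in.read(payload, offset, length - offset);
            if (read == -1) throw new EOFException("truncated payload");
            offset += read;
        }
        return payload;
    }
}
```

Calling readDelimited repeatedly on the same stream returns each payload in the order it was written and then null once the stream is exhausted.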
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(com.google.protobuf.CodedInputStream input) throws IOException
- Throws:
IOException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(com.google.protobuf.CodedInputStream input, com.google.protobuf.ExtensionRegistryLite extensionRegistry) throws IOException
- Throws:
IOException
-
newBuilderForType
public SentencepieceModel.NormalizerSpec.Builder newBuilderForType()
- Specified by:
newBuilderForType
in interface com.google.protobuf.Message
- Specified by:
newBuilderForType
in interface com.google.protobuf.MessageLite
-
newBuilder
public static SentencepieceModel.NormalizerSpec.Builder newBuilder()
-
newBuilder
public static SentencepieceModel.NormalizerSpec.Builder newBuilder(SentencepieceModel.NormalizerSpec prototype)
-
toBuilder
public SentencepieceModel.NormalizerSpec.Builder toBuilder()
- Specified by:
toBuilder
in interface com.google.protobuf.Message
- Specified by:
toBuilder
in interface com.google.protobuf.MessageLite
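newBuilder(), newBuilder(prototype) and toBuilder() reflect the usual protobuf pattern of immutable messages paired with mutable builders: to change a field, copy the message into a builder, modify it, and build a new message. The sketch below models that pattern with a small hand-written class. SpecSketch and its setters are hypothetical stand-ins written for illustration and are not the generated API.

```java
/** Hypothetical immutable message with a builder, modeling the copy-and-modify pattern. */
final class SpecSketch {
    private final String name;
    private final boolean addDummyPrefix;

    private SpecSketch(Builder builder) {
        this.name = builder.name;
        this.addDummyPrefix = builder.addDummyPrefix;
    }

    String getName() { return name; }
    boolean getAddDummyPrefix() { return addDummyPrefix; }

    /** Starts from default values. */
    static Builder newBuilder() { return new Builder(); }

    /** Starts from a copy of an existing message. */
    static Builder newBuilder(SpecSketch prototype) { return prototype.toBuilder(); }

    /** Copies this message's fields into a fresh builder. */
    Builder toBuilder() {
        Builder builder = new Builder();
        builder.name = name;
        builder.addDummyPrefix = addDummyPrefix;
        return builder;
    }

    static final class Builder {
        private String name = "";
        private boolean addDummyPrefix = true; // mirrors [default = true]

        Builder setName(String value) { name = value; return this; }
        Builder setAddDummyPrefix(boolean value) { addDummyPrefix = value; return this; }
        SpecSketch build() { return new SpecSketch(this); }
    }
}
```

A call such as base.toBuilder().setAddDummyPrefix(false).build() returns a new object and leaves base unchanged, which is what makes sharing message instances safe.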
-
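getDefaultInstance
public static SentencepieceModel.NormalizerSpec getDefaultInstance()
-
parser
public static com.google.protobuf.Parser<SentencepieceModel.NormalizerSpec> parser()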
-
getParserForType
public com.google.protobuf.Parser<SentencepieceModel.NormalizerSpec> getParserForType()
- Specified by:
getParserForType
in interface com.google.protobuf.Message
- Specified by:
getParserForType
in interface com.google.protobuf.MessageLite
- Overrides:
getParserForType
in class com.google.protobuf.GeneratedMessageV3
-
getDefaultInstanceForType
public SentencepieceModel.NormalizerSpec getDefaultInstanceForType()
- Specified by:
getDefaultInstanceForType
in interface com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder<SentencepieceModel.NormalizerSpec>
- Specified by:
getDefaultInstanceForType
in interface com.google.protobuf.MessageLiteOrBuilder
- Specified by:
getDefaultInstanceForType
in interface com.google.protobuf.MessageOrBuilder
-