Class SpeculativeDecodingSpec (1.100.0)

SpeculativeDecodingSpec(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Configuration for Speculative Decoding.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

Attributes

.. list-table::
   :header-rows: 1

   * - Name
     - Description
   * - draft_model_speculation
     - google.cloud.aiplatform_v1.types.SpeculativeDecodingSpec.DraftModelSpeculation

       Draft model speculation. This field is a member of oneof_ speculation.
   * - ngram_speculation
     - google.cloud.aiplatform_v1.types.SpeculativeDecodingSpec.NgramSpeculation

       N-Gram speculation. This field is a member of oneof_ speculation.
   * - speculative_token_count
     - int

       The number of speculative tokens to generate at each step.

Classes

DraftModelSpeculation

DraftModelSpeculation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Draft model speculation works by using a smaller draft model to generate candidate tokens for speculative decoding.
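
As a rough illustration of the idea, the following toy sketch runs one draft-then-verify step. It is not the service's implementation. It assumes greedy decoding, models represented as plain functions from a token list to the next token, and sequential verification for readability (real systems typically verify all candidates in one batched pass). All names are hypothetical.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def target_next(context: List[int]) -> int:
    """Toy 'large' model: always counts upward modulo 10."""
    return (context[-1] + 1) % 10


def draft_next(context: List[int]) -> int:
    """Toy 'small' model: agrees with the target except after token 3."""
    return 0 if context[-1] == 3 else (context[-1] + 1) % 10


def speculative_step(
    context: List[int],
    draft: NextToken,
    target: NextToken,
    speculative_token_count: int,
) -> List[int]:
    """Return the tokens accepted in one speculative decoding step."""
    # 1. The draft model proposes a run of candidate tokens.
    proposed: List[int] = []
    for _ in range(speculative_token_count):
        proposed.append(draft(context + proposed))

    # 2. The target model checks each candidate in order.
    accepted: List[int] = []
    for token in proposed:
        expected = target(context + accepted)
        if token != expected:
            # First disagreement: keep the target's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(token)

    # 3. Every candidate matched, so the target adds one more token.
    accepted.append(target(context + accepted))
    return accepted


step = speculative_step([1], draft_next, target_next, speculative_token_count=4)
print(step)
```

In this sketch the output always equals what the target model alone would produce; the draft model only affects how many tokens are accepted per step.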

NgramSpeculation

NgramSpeculation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

N-Gram speculation works by finding matching tokens in the previous prompt sequence and using them as speculation for generating new tokens.
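
A minimal sketch of this kind of lookup is shown below. It is an illustration under stated assumptions, not the service's algorithm: it takes the last ``ngram_size`` tokens, searches for the most recent earlier occurrence, and proposes the tokens that followed it. The function and parameter names are hypothetical.

```python
from typing import List, Sequence


def ngram_speculate(
    tokens: Sequence[int],
    ngram_size: int,
    speculative_token_count: int,
) -> List[int]:
    """Propose candidate tokens by matching the trailing n-gram earlier in `tokens`."""
    if ngram_size <= 0 or len(tokens) <= ngram_size:
        return []

    suffix = list(tokens[-ngram_size:])
    # Scan right to left so the most recent earlier match wins.
    # The range excludes the position of the suffix itself.
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if list(tokens[start:start + ngram_size]) == suffix:
            follow = start + ngram_size
            return list(tokens[follow:follow + speculative_token_count])

    # No earlier occurrence: nothing to speculate.
    return []


candidates = ngram_speculate([5, 6, 7, 8, 9, 5, 6], ngram_size=2, speculative_token_count=3)
print(candidates)
```

The proposed tokens are only guesses; as with any speculative method, they still have to be verified by the model before being accepted.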

Methods

SpeculativeDecodingSpec

SpeculativeDecodingSpec(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Configuration for Speculative Decoding.