Frequently asked questions

Does Custom Voice support SSML?

  • Custom Voice supports all SSML features except for emphasis (on the roadmap) and pitch prosody (coming soon).
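
A quick pre-flight check can catch the two unsupported features before a request is sent. The following is a minimal sketch, not an official tool: the function name `find_unsupported_ssml` is hypothetical, and it assumes the features in question appear as the standard SSML `<emphasis>` element and the `pitch` attribute of `<prosody>`.

```python
import xml.etree.ElementTree as ET


def find_unsupported_ssml(ssml):
    """Return one description per <emphasis> element or prosody pitch attribute found.

    Hypothetical helper: it only looks for the two features the FAQ lists as
    unsupported and does not validate the SSML in any other way.
    """
    root = ET.fromstring(ssml)
    problems = []
    for element in root.iter():
        # Drop an XML namespace prefix such as "{http://www.w3.org/2001/10/synthesis}".
        tag = element.tag.rsplit("}", 1)[-1]
        if tag == "emphasis":
            problems.append("<emphasis> element")
        elif tag == "prosody" and "pitch" in element.attrib:
            problems.append("pitch attribute on <prosody>")
    return problems


sample = (
    '<speak>Hello <emphasis level="strong">there</emphasis>. '
    '<prosody rate="slow" pitch="+2st">Welcome back.</prosody></speak>'
)
print(find_unsupported_ssml(sample))
# prints ['<emphasis> element', 'pitch attribute on <prosody>']
```

A `<prosody>` element that sets only `rate` or `volume` passes the check, since only the `pitch` attribute is flagged.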

How long can an SSML sentence be?

  • We suggest avoiding SSML sentence tags and letting us infer the sentence structure. If you must use SSML sentences, we will generate up to 30 seconds of audio per SSML sentence.

  • Each sentence can contain at most 480 phonemes. Break up longer sentences with punctuation (e.g. periods) as needed.
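
To spot sentences that may need extra punctuation, the text can be scanned before synthesis. The sketch below is illustrative only: phoneme counts cannot be read off plain text, so it uses character count as a rough stand-in, and both the `MAX_CHARS` budget and the `long_sentences` helper are assumptions rather than documented values or APIs.

```python
import re

# Assumption: character count is only a rough proxy for phoneme count.
# 400 is a hypothetical budget chosen for illustration, not a documented limit.
MAX_CHARS = 400


def long_sentences(text, max_chars=MAX_CHARS):
    """Return the sentences in `text` that are longer than `max_chars` characters."""
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if len(s) > max_chars]


text = "Short one. " + "word " * 100 + "end. Another short one."
for sentence in long_sentences(text):
    print(len(sentence), "characters:", sentence[:30] + "...")
```

Any sentence the helper returns is a candidate for splitting with a period or similar punctuation.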

Will there be voice differences between two versions of a Custom Voice model?

  • Some differences between two versions of a Custom Voice model are to be expected as our technology evolves, even when both versions are trained on the same audio data. If you notice such differences, please send us some samples so that we can investigate.

Where can I report issues to Google?