Convert speech to text

This document shows you how to use Vertex AI Studio to convert speech to text. The speech-to-text feature in Vertex AI Studio is a quick way to transcribe short audio files. For more advanced features and higher limits, you can use the dedicated Speech-to-Text service.

The following table compares the two services.

Tool	Description	Use Case
Vertex AI Studio	A quick way to transcribe short audio files using the Chirp model directly in the studio.	Best for quick tests and transcribing audio files under 60 seconds.
Speech-to-Text	A dedicated service with more models, advanced features, and support for much longer audio files.	Suitable for production workloads and transcribing files up to 8 hours long.
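
The comparison above amounts to a simple rule based on audio length. The following is a minimal sketch of that rule in Python, not an official helper. The function name `suggest_tool` and both constants are hypothetical, and the sketch assumes that a file of exactly 60 seconds is still accepted by Vertex AI Studio.

```python
# Limits taken from the comparison table; the constant names are illustrative.
STUDIO_MAX_SECONDS = 60                      # assumption: 60 s is inclusive
SPEECH_TO_TEXT_MAX_SECONDS = 8 * 60 * 60     # 8 hours


def suggest_tool(duration_seconds: float) -> str:
    """Suggest a transcription tool for an audio file of the given length."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    if duration_seconds <= STUDIO_MAX_SECONDS:
        return "Vertex AI Studio"
    if duration_seconds <= SPEECH_TO_TEXT_MAX_SECONDS:
        return "Speech-to-Text"
    # The table describes no option for audio longer than 8 hours.
    raise ValueError("audio is longer than 8 hours")
```

The rule only considers duration. File size and audio format also matter for Vertex AI Studio, as described under Limitations.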

To learn how to convert text to speech, see Convert text to speech.

Convert speech to text

To convert speech to text, follow these steps:

In the Vertex AI section of the Google Cloud console, go to the Vertex AI Studio page.

Click Generate speech.
Select the Speech-to-text tab.
In the Speech section, click Browse to select the audio file that you want to convert to text.
In the Language list, select the language of the speech in the audio file.
Click Submit.

The converted text appears in the Text field.
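
The console steps above have no code of their own. For readers who want to script a similar transcription, the following sketch sends a short audio file to the dedicated Speech-to-Text service with the Chirp model. It is an illustration under stated assumptions, not a description of what Vertex AI Studio does internally. It assumes that the `google-cloud-speech` Python package (v2 API) is installed, that application default credentials are configured, and that Chirp is available in the chosen region. The function names, the default region `us-central1`, and the default language code `en-US` are placeholders.

```python
from pathlib import Path


def recognizer_path(project_id: str, location: str) -> str:
    """Build the resource name of the implicit default recognizer ("_")."""
    return f"projects/{project_id}/locations/{location}/recognizers/_"


def transcribe_short_wav(
    wav_path: str,
    project_id: str,
    location: str = "us-central1",   # assumption: a region that serves Chirp
    language_code: str = "en-US",    # assumption: language of the recording
) -> str:
    """Send one short audio file to Speech-to-Text v2 and return the transcript."""
    # Imported here so this module still loads when google-cloud-speech is absent.
    from google.api_core.client_options import ClientOptions
    from google.cloud.speech_v2 import SpeechClient
    from google.cloud.speech_v2.types import cloud_speech

    # Regional models are reached through a regional endpoint.
    client = SpeechClient(
        client_options=ClientOptions(
            api_endpoint=f"{location}-speech.googleapis.com"
        )
    )
    config = cloud_speech.RecognitionConfig(
        auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
        language_codes=[language_code],
        model="chirp",
    )
    request = cloud_speech.RecognizeRequest(
        recognizer=recognizer_path(project_id, location),
        config=config,
        content=Path(wav_path).read_bytes(),
    )
    response = client.recognize(request=request)
    # Each result carries ranked alternatives; keep the top one per result.
    return " ".join(
        result.alternatives[0].transcript
        for result in response.results
        if result.alternatives
    )
```

A call might look like `transcribe_short_wav("sample.wav", "my-project")`, where both arguments are placeholders for your own file and project ID.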

Limitations

Audio files can be at most 60 seconds long or 10 MB in size, whichever limit is reached first.
Files are transcribed with the Chirp model.
Only 16-bit linear PCM WAV files are supported.

You can use the Speech-to-Text UI directly to overcome these limitations.
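
Before uploading, it can help to check a file against the limits listed in this section. The following sketch uses only the Python standard library. The function name `check_wav_for_studio` is hypothetical, and the code assumes that 10 MB means 10 × 1024 × 1024 bytes, which this page does not specify.

```python
import os
import wave

MAX_SECONDS = 60.0
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB


def check_wav_for_studio(path: str) -> list[str]:
    """Return a list of problems; an empty list means the limits are met."""
    problems = []
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than 10 MB")
    try:
        # The wave module reads linear PCM WAV files and rejects other data.
        with wave.open(path, "rb") as wav:
            bits = 8 * wav.getsampwidth()
            if bits != 16:
                problems.append(f"sample width is {bits}-bit, not 16-bit")
            duration = wav.getnframes() / wav.getframerate()
            if duration > MAX_SECONDS:
                problems.append(f"duration is {duration:.1f} s, over 60 s")
    except (wave.Error, EOFError) as exc:
        problems.append(f"not a linear PCM WAV file: {exc}")
    return problems
```

For example, `check_wav_for_studio("sample.wav")` returns an empty list for a short 16-bit PCM recording and one message per limit that the file exceeds.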

What's next

For more models, advanced features, and the ability to transcribe files up to 8 hours long, see Speech-to-Text.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-08-23 UTC.