This document shows you how to use Vertex AI Studio to convert speech to text. The speech-to-text feature in Vertex AI Studio is a quick way to transcribe short audio files. For more advanced features and higher limits, you can use the dedicated Speech-to-Text service. The table below compares the two services.

To learn how to convert text to speech, see Convert text to speech.

To convert speech to text, follow these steps:

1. In the Vertex AI section of the Google Cloud console, go to the Vertex AI Studio page.
2. Click Generate speech.
3. Select the Speech-to-text tab.
4. In the Speech section, click Browse to select the audio file that you want to convert to text.
5. In the Language list, select the language of the speech in the audio file.
6. Click Submit.

The converted text appears in the Text field.

To overcome the limitations of speech-to-text in Vertex AI Studio, you can use the Speech-to-Text UI directly.

| Tool | Description | Use case |
| --- | --- | --- |
| Vertex AI Studio | A quick way to transcribe short audio files using the Chirp model directly in the studio. | Best for quick tests and transcribing audio files under 60 seconds. |
| Speech-to-Text | A dedicated service with more models, advanced features, and support for much longer audio files. | Suitable for production workloads and transcribing files up to 8 hours long. |
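
The duration limits in the comparison above (under 60 seconds for Vertex AI Studio, up to 8 hours for Speech-to-Text) can be checked locally before you upload a file. The following is a minimal sketch, not part of either product: it assumes the audio is an uncompressed WAV file, treats the two figures as hard cut-offs, and uses hypothetical helper names (`wav_duration_seconds`, `suggest_tool`) chosen for illustration only.

```python
import wave

# Limits taken from the comparison table. Treating them as exact
# cut-offs is an assumption made for this sketch.
STUDIO_MAX_SECONDS = 60
SPEECH_TO_TEXT_MAX_SECONDS = 8 * 60 * 60


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def suggest_tool(duration_seconds):
    """Suggest a transcription tool for audio of the given duration.

    Hypothetical helper: it only encodes the table above and does not
    call any Google Cloud API.
    """
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    if duration_seconds < STUDIO_MAX_SECONDS:
        return "Vertex AI Studio"
    if duration_seconds <= SPEECH_TO_TEXT_MAX_SECONDS:
        return "Speech-to-Text"
    # Longer than the 8-hour figure in the table.
    return "split the audio first"
```

For example, `suggest_tool(wav_duration_seconds("meeting.wav"))` returns the name of the tool that fits the file's length, where `meeting.wav` is a placeholder path. Other audio formats would need a different way to read the duration.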
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-23 UTC.