Media Translation (Beta)

Add real-time audio translation directly to your content and applications.

Scale quickly and globally with dynamic audio translation

Media Translation API delivers real-time speech translation to your content and applications directly from your audio data. Leveraging Google’s machine learning technologies, the API offers enhanced accuracy and simplified integration while equipping you with a comprehensive set of features to further refine your translation results. Improve user experience with low-latency streaming translation and scale quickly with straightforward internationalization.

Proven record of quality

Google Cloud’s translation and speech recognition technologies have been widely recognized for their quality, thanks to Google’s machine learning expertise. Bringing cutting-edge technologies together, Media Translation API provides you with state-of-the-art audio translation along with the features of our popular Translation API and Speech-to-Text API.

Seamless content translation

Translate content directly from your audio data. Media Translation API improves translation accuracy by optimizing how its audio-to-text and translation models work together, and it removes the friction of chaining multiple API calls yourself. Make one API call, and Media Translation takes care of the rest.

Streaming translation at speed

Stream translation output as you supply audio from a microphone or a prerecorded audio file. Media Translation API minimizes the latency between audio input and translation results, which improves the user experience and enables real-time engagement across languages and geographies.


Streaming translation

Translate in real time as audio streams in from a microphone or a prerecorded audio file. The API optimizes the integration to reduce latency.
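Streaming from a prerecorded file generally means reading the audio in small pieces and sending each piece as soon as it is read, rather than uploading the whole file first. The following is a minimal sketch of that client-side step only. The names `audio_chunks` and `CHUNK_BYTES` and the 4096-byte chunk size are illustrative assumptions, not part of Media Translation API, and the actual streaming request to the service is not shown.

```python
import io
from typing import BinaryIO, Iterator

# Assumed chunk size for illustration; choose one that suits your audio format.
CHUNK_BYTES = 4096


def audio_chunks(stream: BinaryIO, chunk_bytes: int = CHUNK_BYTES) -> Iterator[bytes]:
    """Yield successive chunks of audio bytes until the stream is exhausted."""
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:
            # An empty read means end of file (or end of the in-memory buffer).
            break
        yield chunk


if __name__ == "__main__":
    # Stand-in for a prerecorded audio file: 10,000 bytes of silence.
    fake_audio = io.BytesIO(b"\x00" * 10_000)
    for index, chunk in enumerate(audio_chunks(fake_audio)):
        # In a real client, each chunk would go into one streaming request.
        print(f"chunk {index}: {len(chunk)} bytes")
```

Because `audio_chunks` is a generator, chunks are produced lazily, so a caller can forward each one to a streaming call without holding the whole file in memory.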

Automatic punctuation

The API accurately punctuates your translation results (e.g., commas, periods, question marks).

Enhanced models

Media Translation API comes with two enhanced models (video, phone call), so you can optimize accuracy for your specific audio use case.

Language support

Media Translation API supports 12 languages.

"At OnePlus, we aim to share the best technology with the world, hand in hand with our users. One important feature for our product is face-to-face communication across countries, time zones, and even languages. With Google Cloud’s Media Translation API, we are now able to provide real-time streaming translation for video chat with a simple API integration and ensure our customers feel effortlessly connected with minimal latency."

Gary Chen, Head of Software Product, OnePlus


Media Translation API is priced monthly based on the amount of audio successfully translated by the service and on the model used for translation. Usage is rounded up to the next 15-second increment.
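To make the rounding rule concrete, here is a small sketch of how billable duration could be estimated. It assumes the 15-second rounding applies to each audio duration you pass in; how the service actually aggregates usage, and the per-model rates, are covered in the pricing details and are not reproduced here. The function name `billable_seconds` is hypothetical.

```python
import math

# Billing increment stated in the pricing note above.
BILLING_INCREMENT_SECONDS = 15


def billable_seconds(audio_seconds: float) -> int:
    """Round a duration up to the next 15-second increment.

    Assumption: rounding is applied to the duration passed in. Check the
    pricing details for how usage is aggregated before rounding.
    """
    if audio_seconds <= 0:
        return 0
    increments = math.ceil(audio_seconds / BILLING_INCREMENT_SECONDS)
    return increments * BILLING_INCREMENT_SECONDS


if __name__ == "__main__":
    for duration in (1, 15, 16, 61.5):
        print(f"{duration} s of audio -> {billable_seconds(duration)} billable s")
```

Under this assumption, 16 seconds of audio counts as 30 billable seconds, while exactly 15 seconds counts as 15.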

View pricing details

Take the next step

Start building on Google Cloud with $300 in free credits and 20+ always free products.

This product is in beta. Learn more about product launch stages.

Cloud AI products comply with our SLA policies. They may offer different latency or availability guarantees from other Google Cloud services.