The Multimodal Live API enables low-latency, bidirectional voice and video
interactions with Gemini. With it, you can give end users natural, human-like
voice conversations, including the ability to interrupt the model's responses
with voice commands. The model accepts text, audio, and video input, and it
can produce text and audio output.
The Multimodal Live API is available in the Gemini API as the
BidiGenerateContent method and is built on
WebSockets.
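Because the API is exposed over WebSockets, you can also connect to it without the SDK. The sketch below is a minimal illustration only, assuming the public Gemini API endpoint URL, camelCase JSON message shapes for BidiGenerateContent (setup, clientContent, serverContent), the third-party websockets package, and an API key in a GEMINI_API_KEY environment variable; none of these details come from this page, and they differ for Vertex AI.

# Minimal raw-WebSocket sketch of a BidiGenerateContent session.
# Assumptions (not from this page): the endpoint URL, the JSON field
# names, and an API key in GEMINI_API_KEY.
import asyncio
import json
import os

import websockets

ENDPOINT = (
    "wss://generativelanguage.googleapis.com/ws/"
    "google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent"
    f"?key={os.environ['GEMINI_API_KEY']}"
)

async def main():
    async with websockets.connect(ENDPOINT) as ws:
        # The first message must be the session setup.
        await ws.send(json.dumps({
            "setup": {
                "model": "models/gemini-2.0-flash-exp",
                "generationConfig": {"responseModalities": ["TEXT"]},
            }
        }))
        await ws.recv()  # setupComplete acknowledgement

        # Send one user turn and mark the turn as complete.
        await ws.send(json.dumps({
            "clientContent": {
                "turns": [{"role": "user",
                           "parts": [{"text": "Hello? Gemini, are you there?"}]}],
                "turnComplete": True,
            }
        }))

        # Stream server messages until the model finishes its turn.
        async for raw in ws:
            msg = json.loads(raw)
            server_content = msg.get("serverContent", {})
            for part in server_content.get("modelTurn", {}).get("parts", []):
                print(part.get("text", ""), end="")
            if server_content.get("turnComplete"):
                break

asyncio.run(main())

In practice, the Google Gen AI SDK shown next handles this protocol for you.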
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
Then use the Gen AI SDK to open a Live API session and send a text prompt:

import asyncio

from google import genai
from google.genai.types import LiveConnectConfig, HttpOptions, Modality

client = genai.Client(http_options=HttpOptions(api_version="v1beta1"))
model_id = "gemini-2.0-flash-exp"

async def main():
    async with client.aio.live.connect(
        model=model_id,
        config=LiveConnectConfig(response_modalities=[Modality.TEXT]),
    ) as session:
        text_input = "Hello? Gemini, are you there?"
        print("> ", text_input, "\n")
        await session.send(input=text_input, end_of_turn=True)

        # Collect the streamed text chunks for this turn.
        response = []
        async for message in session.receive():
            if message.text:
                response.append(message.text)

        print("".join(response))

asyncio.run(main())

# Example output:
# > Hello? Gemini, are you there?
# Yes, I'm here. What would you like to talk about?
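The same connection can return audio instead of text. The following is a hedged sketch, not taken from this page: it assumes the SDK's message.data convenience property yields the streamed inline audio bytes and that the output is 16-bit PCM at 24 kHz mono, which it wraps in a WAV container with Python's standard wave module.

# Sketch: request audio output and save it as a WAV file.
# Assumptions (not from this page): `message.data` holds inline audio
# chunks, and the output format is 16-bit PCM, 24 kHz, mono.
import asyncio
import wave

from google import genai
from google.genai.types import LiveConnectConfig, HttpOptions, Modality

client = genai.Client(http_options=HttpOptions(api_version="v1beta1"))

async def main():
    async with client.aio.live.connect(
        model="gemini-2.0-flash-exp",
        config=LiveConnectConfig(response_modalities=[Modality.AUDIO]),
    ) as session:
        await session.send(input="Tell me a short joke.", end_of_turn=True)

        # Accumulate streamed audio chunks for the whole turn.
        audio = bytearray()
        async for message in session.receive():
            if message.data:
                audio.extend(message.data)

        with wave.open("response.wav", "wb") as f:
            f.setnchannels(1)      # mono
            f.setsampwidth(2)      # 16-bit samples
            f.setframerate(24000)  # 24 kHz output
            f.writeframes(bytes(audio))

asyncio.run(main())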
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-03-11 UTC."],[],[]]