Starting April 29, 2025, Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have no prior usage of these models, including new projects. For details, see Model versions and lifecycle.
Generate text from multimodal prompt
This sample demonstrates how to generate text from a multimodal prompt using the Gemini model. The prompt interleaves three images with two text parts: the first two images are each followed by a text part naming their city and landmark, and the third image has no text, so the model generates a text response that follows the pattern and describes the final image.
Code sample
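Both samples below build the same underlying generateContent request: a single user turn whose parts interleave inline PNG data with text labels. As a rough sketch of that request shape sent directly to the Vertex AI REST endpoint (this REST call is not part of the original sample; the google-auth-library usage and the base64 placeholders are illustrative assumptions), it could look like this in Node.js:

const {GoogleAuth} = require('google-auth-library');

async function callGenerateContentRest(projectId, location = 'us-central1') {
  // Acquire Application Default Credentials with the cloud-platform scope.
  const auth = new GoogleAuth({
    scopes: 'https://www.googleapis.com/auth/cloud-platform',
  });
  const client = await auth.getClient();

  const url =
    `https://${location}-aiplatform.googleapis.com/v1/projects/${projectId}` +
    `/locations/${location}/publishers/google/models/gemini-2.0-flash-001:generateContent`;

  // Interleaved image and text parts; replace the placeholders with
  // base64-encoded PNG data (see getBase64 in the Node.js sample below).
  const body = {
    contents: [
      {
        role: 'user',
        parts: [
          {inlineData: {mimeType: 'image/png', data: '<BASE64_IMAGE_1>'}},
          {text: 'city: Rome, Landmark: the Colosseum'},
          {inlineData: {mimeType: 'image/png', data: '<BASE64_IMAGE_2>'}},
          {text: 'city: Beijing, Landmark: Forbidden City'},
          {inlineData: {mimeType: 'image/png', data: '<BASE64_IMAGE_3>'}},
        ],
      },
    ],
  };

  // client.request() attaches the credentials and returns the parsed JSON.
  const res = await client.request({url, method: 'POST', data: body});
  console.log(res.data.candidates[0].content.parts[0].text);
}

The C# and Node.js client libraries shown next wrap this same structure in typed request objects, so you don't have to construct the HTTP call yourself.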
C#

Before trying this sample, follow the C# setup instructions in the [Vertex AI quickstart using client libraries](/vertex-ai/docs/start/client-libraries).

For more information, see the [Vertex AI C# API reference documentation](/dotnet/docs/reference/Google.Cloud.AIPlatform.V1/latest).

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

using Google.Api.Gax.Grpc;
using Google.Cloud.AIPlatform.V1;
using Google.Protobuf;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public class MultimodalMultiImage
{
    public async Task<string> GenerateContent(
        string projectId = "your-project-id",
        string location = "us-central1",
        string publisher = "google",
        string model = "gemini-2.0-flash-001"
    )
    {
        // Create a client that targets the regional Vertex AI endpoint.
        var predictionServiceClient = new PredictionServiceClientBuilder
        {
            Endpoint = $"{location}-aiplatform.googleapis.com"
        }.Build();

        // Download the sample images and convert them to ByteString payloads.
        ByteString colosseum = await ReadImageFileAsync(
            "https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark1.png");

        ByteString forbiddenCity = await ReadImageFileAsync(
            "https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark2.png");

        ByteString christRedeemer = await ReadImageFileAsync(
            "https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark3.png");

        // Interleave image and text parts; the last image has no text label.
        var generateContentRequest = new GenerateContentRequest
        {
            Model = $"projects/{projectId}/locations/{location}/publishers/{publisher}/models/{model}",
            Contents =
            {
                new Content
                {
                    Role = "USER",
                    Parts =
                    {
                        new Part { InlineData = new() { MimeType = "image/png", Data = colosseum }},
                        new Part { Text = "city: Rome, Landmark: the Colosseum" },
                        new Part { InlineData = new() { MimeType = "image/png", Data = forbiddenCity }},
                        new Part { Text = "city: Beijing, Landmark: Forbidden City" },
                        new Part { InlineData = new() { MimeType = "image/png", Data = christRedeemer }}
                    }
                }
            }
        };

        // Stream the response and concatenate the generated text.
        using PredictionServiceClient.StreamGenerateContentStream response = predictionServiceClient.StreamGenerateContent(generateContentRequest);

        StringBuilder fullText = new();

        AsyncResponseStream<GenerateContentResponse> responseStream = response.GetResponseStream();
        await foreach (GenerateContentResponse responseItem in responseStream)
        {
            fullText.Append(responseItem.Candidates[0].Content.Parts[0].Text);
        }
        return fullText.ToString();
    }

    private static async Task<ByteString> ReadImageFileAsync(string url)
    {
        using HttpClient client = new();
        using var response = await client.GetAsync(url);
        byte[] imageBytes = await response.Content.ReadAsByteArrayAsync();
        return ByteString.CopyFrom(imageBytes);
    }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the [Vertex AI quickstart using client libraries](/vertex-ai/docs/start/client-libraries).

For more information, see the [Vertex AI Node.js API reference documentation](/nodejs/docs/reference/aiplatform/latest).

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

const {VertexAI} = require('@google-cloud/vertexai');
const axios = require('axios');

async function getBase64(url) {
  const image = await axios.get(url, {responseType: 'arraybuffer'});
  return Buffer.from(image.data).toString('base64');
}

/**
 * TODO(developer): Update these variables before running the sample.
 */
async function sendMultiModalPromptWithImage(
  projectId = 'PROJECT_ID',
  location = 'us-central1',
  model = 'gemini-2.0-flash-001'
) {
  // For images, the SDK supports base64 strings
  const landmarkImage1 = await getBase64(
    'https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark1.png'
  );
  const landmarkImage2 = await getBase64(
    'https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark2.png'
  );
  const landmarkImage3 = await getBase64(
    'https://storage.googleapis.com/cloud-samples-data/vertex-ai/llm/prompts/landmark3.png'
  );

  // Initialize Vertex with your Cloud project and location
  const vertexAI = new VertexAI({project: projectId, location: location});

  const generativeVisionModel = vertexAI.getGenerativeModel({
    model: model,
  });

  // Pass multimodal prompt
  const request = {
    contents: [
      {
        role: 'user',
        parts: [
          {
            inlineData: {
              data: landmarkImage1,
              mimeType: 'image/png',
            },
          },
          {
            text: 'city: Rome, Landmark: the Colosseum',
          },
          {
            inlineData: {
              data: landmarkImage2,
              mimeType: 'image/png',
            },
          },
          {
            text: 'city: Beijing, Landmark: Forbidden City',
          },
          {
            inlineData: {
              data: landmarkImage3,
              mimeType: 'image/png',
            },
          },
        ],
      },
    ],
  };

  // Create the response
  const response = await generativeVisionModel.generateContent(request);
  // Wait for the response to complete
  const aggregatedResponse = await response.response;
  // Select the text from the response
  const fullTextResponse =
    aggregatedResponse.candidates[0].content.parts[0].text;

  console.log(fullTextResponse);
}

What's next

To search and filter code samples for other Google Cloud products, see the [Google Cloud sample browser](/docs/samples?product=generativeaionvertexai).

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.