Starting April 29, 2025, the Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have not previously used these models, including new projects. For details, see Model versions and lifecycle.
imagetext is the name of the model that supports image captioning. imagetext
generates a caption from an image you provide, in the language that you
specify. The model supports the following languages: English (en), German
(de), French (fr), Spanish (es), and Italian (it).
To explore this model in the console, see the Image Captioning model card in
Model Garden.
    {
      "instances": [
        {
          "image": {
            // Union field can be only one of the following:
            "bytesBase64Encoded": string,
            "gcsUri": string,
            // End of list of possible types for union field.
            "mimeType": string
          }
        }
      ],
      "parameters": {
        "sampleCount": integer,
        "storageUri": string,
        "language": string,
        "seed": integer
      }
    }
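The request body schema above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name `build_caption_request` and its arguments are hypothetical, and it assumes the 10 MB size limit applies to the raw image bytes (the source does not say whether the limit is measured before or after base64 encoding). The language codes and the 1-3 range for `sampleCount` come from this page.

```python
import base64
import json

# Language codes listed on this page.
SUPPORTED_LANGUAGES = {"en", "de", "fr", "es", "it"}
# Assumption: "10 MB" is read as 10 * 2**20 bytes and applied to the raw image.
MAX_IMAGE_BYTES = 10 * 1024 * 1024


def build_caption_request(image_bytes=None, gcs_uri=None, language="en", sample_count=1):
    """Return a dict shaped like the imagetext request body (hypothetical helper)."""
    # The image field is a union: exactly one of the two sources may be set.
    if (image_bytes is None) == (gcs_uri is None):
        raise ValueError("provide exactly one of image_bytes or gcs_uri")
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if not 1 <= sample_count <= 3:
        raise ValueError("sample_count must be between 1 and 3")

    if image_bytes is not None:
        if len(image_bytes) > MAX_IMAGE_BYTES:
            raise ValueError("image exceeds the 10 MB size limit")
        encoded = base64.b64encode(image_bytes).decode("ascii")
        image = {"bytesBase64Encoded": encoded}
    else:
        image = {"gcsUri": gcs_uri}

    return {
        "instances": [{"image": image}],
        "parameters": {"sampleCount": sample_count, "language": language},
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file read in binary mode.
    body = build_caption_request(image_bytes=b"placeholder", language="es", sample_count=2)
    print(json.dumps(body, indent=2))
```

Validating the union field and the parameter ranges before sending keeps malformed requests from reaching the endpoint. The optional `storageUri`, `seed`, and `mimeType` fields are left out of this sketch.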
Last updated 2025-08-25 UTC.

# Image captions

> **Caution:** Starting on June 24, 2025, Imagen versions 1 and 2 are deprecated. Imagen models `imagegeneration@002`, `imagegeneration@005`, and `imagegeneration@006` will be removed on September 24, 2025. For more information about migrating to Imagen 3, see [Migrate to Imagen 3](/vertex-ai/generative-ai/docs/image/migrate-to-imagen-3).

`imagetext` is the name of the model that supports image captioning. `imagetext`
generates a caption from an image you provide, in the language that you
specify. The model supports the following languages: English (`en`), German
(`de`), French (`fr`), Spanish (`es`), and Italian (`it`).

To explore this model in the console, see the `Image Captioning` model card in
the Model Garden.

[View Imagen for Captioning & VQA model card](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/imagetext)

Use cases
---------

Some common use cases for image captioning include:

- Creators can generate captions for uploaded images and videos (for example, a short description of a video sequence)
- Generate captions to describe products
- Integrate captioning with an app using the API to create new experiences

HTTP request
------------

    POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/imagetext:predict

Request body
------------

    {
      "instances": [
        {
          "image": {
            // Union field can be only one of the following:
            "bytesBase64Encoded": string,
            "gcsUri": string,
            // End of list of possible types for union field.
            "mimeType": string
          }
        }
      ],
      "parameters": {
        "sampleCount": integer,
        "storageUri": string,
        "language": string,
        "seed": integer
      }
    }

Use the following parameters for the Imagen model `imagetext`.
For more information, see
[Get image descriptions using visual captioning](/vertex-ai/generative-ai/docs/image/image-captioning).

Sample request
--------------

### REST

To request image captions by using the Vertex AI API, send a POST request to the
publisher model endpoint.

Before using any of the request data,
make the following replacements:

- `PROJECT_ID`: Your Google Cloud [project ID](/resource-manager/docs/creating-managing-projects#identifiers).
- `LOCATION`: Your project's region. For example, `us-central1`, `europe-west2`, or `asia-northeast3`. For a list of available regions, see [Generative AI on Vertex AI locations](/vertex-ai/generative-ai/docs/learn/locations-genai).
- `B64_IMAGE`: The image to get captions for. The image must be specified as a [base64-encoded](/vertex-ai/generative-ai/docs/image/base64-encode) byte string. Size limit: 10 MB.
- `RESPONSE_COUNT`: The number of image captions you want to generate. Accepted integer values: 1-3.
- `LANGUAGE_CODE`: One of the supported language codes. Languages supported:
  - English (`en`)
  - French (`fr`)
  - German (`de`)
  - Italian (`it`)
  - Spanish (`es`)

HTTP method and URL:

```
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagetext:predict
```

Request JSON body:

```
{
  "instances": [
    {
      "image": {
        "bytesBase64Encoded": "B64_IMAGE"
      }
    }
  ],
  "parameters": {
    "sampleCount": RESPONSE_COUNT,
    "language": "LANGUAGE_CODE"
  }
}
```

To send your request, choose one of these options:

#### curl

> **Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login), or by using [Cloud Shell](/shell/docs), which automatically logs you into the `gcloud` CLI. You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).

Save the request body in a file named `request.json`,
and execute the following command:

```
curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagetext:predict"
```

#### PowerShell

> **Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login). You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).

Save the request body in a file named `request.json`,
and execute the following command:

```
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
  -Method POST `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -InFile request.json `
  -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagetext:predict" | Select-Object -Expand Content
```

The following sample responses are for a request with `"sampleCount": 2`. The response returns two prediction strings.

**English (`en`):**

```
{
  "predictions": [
    "a yellow mug with a sheep on it sits next to a slice of cake",
    "a cup of coffee with a heart shaped latte art next to a slice of cake"
  ],
  "deployedModelId": "DEPLOYED_MODEL_ID",
  "model": "projects/PROJECT_ID/locations/LOCATION/models/MODEL_ID",
  "modelDisplayName": "MODEL_DISPLAYNAME",
  "modelVersionId": "1"
}
```

**Spanish (`es`):**

```
{
  "predictions": [
    "una taza de café junto a un plato de pastel de chocolate",
    "una taza de café con una forma de corazón en la espuma"
  ]
}
```

Response body
-------------

    {
      "predictions": [ string ]
    }

Sample response
---------------

    {
      "predictions": [
        "text1",
        "text2"
      ]
    }
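
The curl and PowerShell commands and the response body shown earlier can also be mirrored from Python. This is a minimal sketch using only the standard library, not an official client: the function names `predict_url`, `extract_captions`, and `request_captions` are hypothetical, and the sending step assumes the `gcloud` CLI is installed and logged in, as in the curl example. The demo at the bottom only parses a sample response and makes no network call.

```python
import json
import subprocess
import urllib.request


def predict_url(project_id, location="us-central1"):
    """Build the imagetext:predict endpoint URL in the form shown on this page."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/imagetext:predict"
    )


def extract_captions(response_text):
    """Return the caption strings from a response body given as JSON text."""
    payload = json.loads(response_text)
    # A response without "predictions" yields an empty list.
    return list(payload.get("predictions", []))


def request_captions(project_id, body, location="us-central1"):
    """Send the request body (a dict) and return the captions.

    Assumes `gcloud auth print-access-token` works in the current environment.
    """
    token = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout.strip()
    request = urllib.request.Request(
        predict_url(project_id, location),
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return extract_captions(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Parse the sample response from this page without contacting the API.
    sample = '{"predictions": ["text1", "text2"]}'
    for caption in extract_captions(sample):
        print(caption)
```

Keeping URL construction and response parsing separate from the network call lets both be checked offline; `request_captions` raises `urllib.error.HTTPError` on non-2xx responses, which callers may want to catch.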