Batch-Textgenerierung

Mithilfe von Batchvorhersagen können Sie mehrere multimodale Prompts senden, die nicht latenzempfindlich sind. Im Gegensatz zur Onlinevorhersage, bei der jeweils nur ein Eingabeprompt gleichzeitig möglich ist, können Sie eine große Anzahl von multimodalen Prompts in einer einzelnen Batchanfrage senden. Anschließend werden Ihre Antworten asynchron in den Speicherort der BigQuery-Speicherausgabe eingefügt.

Für Batchanfragen für Gemini-Modelle wird ein Rabatt von 50% auf Standardanfragen gewährt. Weitere Informationen finden Sie auf der Preisseite.

Multimodale Modelle, die Batchvorhersagen unterstützen

Die folgenden multimodalen Modelle unterstützen Batchvorhersagen.

gemini-1.5-flash-002
gemini-1.5-flash-001
gemini-1.5-pro-002
gemini-1.5-pro-001
gemini-1.0-pro-002
gemini-1.0-pro-001

Eingaben vorbereiten

Batchanfragen für multimodale Modelle akzeptieren BigQuery- und Cloud Storage-Speicherquellen.

BigQuery-Speichereingabe

Der Inhalt in der request-Spalte muss gültiges JSON-Format sein. Diese JSON-Daten stellen Ihre Eingabe für das Modell dar.
Der Inhalt in der JSON-Anleitung muss mit der Struktur einer GenerateContentRequest übereinstimmen.
Ihre Eingabetabelle kann auch andere Spalten als request enthalten. Sie werden bei der Inhaltsgenerierung ignoriert, aber in der Ausgabetabelle enthalten. Das System reserviert zwei Spaltennamen für die Ausgabe: response und status. Sie liefern Informationen zum Ergebnis des Batchvorhersagejobs.
Bei der Batch-Vorhersage wird das Feld fileData für Gemini nicht unterstützt.

Beispiel für Eingabe (JSON)
`{ "contents": [ { "role": "user", "parts": { "text": "Give me a recipe for banana bread." } } ], "system_instruction": { "parts": [ { "text": "You are a chef." } ] } }`

Cloud Storage-Eingabe

Dateiformat: JSON Lines (JSONL)
Befindet sich in us-central1
Entsprechende Leseberechtigungen für das Dienstkonto

Einschränkungen für „fileData“ für bestimmte Gemini-Modelle

Beispiel für Eingabe (JSONL)

Beispiel für Eingabe (JSONL)
{"request":{"contents": [{"role": "user", "parts": [{"text": "What is the relation between the following video and image samples?"}, {"file_data": {"file_uri": "gs://cloud-samples-data/generative-ai/video/animals.mp4", "mime_type": "video/mp4"}}, {"file_data": {"file_uri": "gs://cloud-samples-data/generative-ai/image/cricket.jpeg", "mime_type": "image/jpeg"}}]}]}} {"request":{"contents": [{"role": "user", "parts": [{"text": "Describe what is happening in this video."}, {"file_data": {"file_uri": "gs://cloud-samples-data/generative-ai/video/another_video.mov", "mime_type": "video/mov"}}]}]}}


{"request":{"contents": [{"role": "user", "parts": [{"text": "What is the relation between the following video and image samples?"}, {"file_data": {"file_uri": "gs://cloud-samples-data/generative-ai/video/animals.mp4", "mime_type": "video/mp4"}}, {"file_data": {"file_uri": "gs://cloud-samples-data/generative-ai/image/cricket.jpeg", "mime_type": "image/jpeg"}}]}]}}
{"request":{"contents": [{"role": "user", "parts": [{"text": "Describe what is happening in this video."}, {"file_data": {"file_uri": "gs://cloud-samples-data/generative-ai/video/another_video.mov", "mime_type": "video/mov"}}]}]}}

Batchantwort anfordern

Abhängig von der Anzahl der Eingabeelemente, die Sie eingereicht haben, kann die Batchvgenerierung eine Weile dauern.

REST

Senden Sie zum Testen eines multimodalen Code-Prompts mit der Vertex AI API eine POST-Anfrage an den Endpunkt des Publisher-Modells.

Ersetzen Sie diese Werte in den folgenden Anfragedaten:

PROJECT_ID ist der Name Ihres Google Cloud-Projekts.
BP_JOB_NAME: Ein Name, den Sie für Ihren Job auswählen.
INPUT_URI: Der URI der Eingabequelle. Dies ist ein BigQuery-Tabellen-URI im Format bq://PROJECT_ID.DATASET.TABLE. Oder die URI Ihres Cloud Storage-Buckets.
INPUT_SOURCE: Der Typ der Eingabequelle. Optionen sind bigquerySource und gcsSource.
INSTANCES_FORMAT: Format der Eingabeinstanzen – kann „jsonl“ oder „bigquery“ sein.
OUTPUT_URI: Der URI der Ausgabe- oder Zielausgabetabelle im Format bq://PROJECT_ID.DATASET.TABLE. Wenn die Tabelle noch nicht vorhanden ist, wird sie für Sie erstellt.

HTTP-Methode und URL:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/batchPredictionJobs

JSON-Text der Anfrage:

{
    "displayName": "BP_JOB_NAME",
    "model": "publishers/google/models/gemini-1.0-pro-002",
    "inputConfig": {
      "instancesFormat":"INSTANCES_FORMAT",
      "inputSource":{ INPUT_SOURCE
        "inputUri" : "INPUT_URI"
      }
    },
    "outputConfig": {
      "predictionsFormat":"bigquery",
      "bigqueryDestination":{
        "outputUri": "OUTPUT_URI"
        }
    }
}

Wenn Sie die Anfrage senden möchten, wählen Sie eine der folgenden Optionen aus:

curl

Hinweis: Der folgende Befehl setzt voraus, dass Sie sich mit Ihrem Nutzerkonto bei der gcloud CLI angemeldet haben. Dazu haben Sie gcloud init oder gcloud auth login ausgeführt oder die Cloud Shell genutzt, die Sie automatisch bei der gcloud CLI anmeldet. Um herauszufinden, welches Konto gerade aktiv ist, führen Sie gcloud auth list aus.

Speichern Sie den Anfragetext in einer Datei mit dem Namen request.json und führen Sie den folgenden Befehl aus:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/batchPredictionJobs"

PowerShell

Hinweis: Der folgende Befehl setzt voraus, dass Sie sich mit Ihrem Nutzerkonto bei der gcloud CLI angemeldet haben. Dazu führen Sie gcloud init oder gcloud auth login aus. Um herauszufinden, welches Konto gerade aktiv ist, führen Sie gcloud auth list aus.

Speichern Sie den Anfragetext in einer Datei mit dem Namen request.json und führen Sie den folgenden Befehl aus:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/batchPredictionJobs" | Select-Object -Expand Content

Sie sollten eine JSON-Antwort ähnlich wie diese erhalten:

{
  "name": "projects/{PROJECT_ID}/locations/us-central1/batchPredictionJobs/{BATCH_JOB_ID}",
  "displayName": "My first batch prediction",
  "model": "projects/{PROJECT_ID}/locations/us-central1/models/gemini-1.0-pro-002",
  "inputConfig": {
    "instancesFormat": "bigquery",
    "bigquerySource": {
      "inputUri": "bq://{PROJECT_ID}.mydataset.batch_predictions_input"
    }
  },
  "modelParameters": {},
  "outputConfig": {
    "predictionsFormat": "bigquery",
    "bigqueryDestination": {
      "outputUri": "bq://{PROJECT_ID}.mydataset.batch_predictions_output"
    }
  },
  "state": "JOB_STATE_PENDING",
  "createTime": "2023-07-12T20:46:52.148717Z",
  "updateTime": "2023-07-12T20:46:52.148717Z",
  "modelVersionId": "1"
}

Die Antwort enthält eine eindeutige Kennung für den Batchjob. Sie können den Status des Batch-Jobs mit BATCH_JOB_ID abfragen, bis der Job state den Wert JOB_STATE_SUCCEEDED hat. Beispiel:

curl \
  -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/batchPredictionJobs/BATCH_JOB_ID

Batchausgabe abrufen

Wenn eine Batchvorhersage abgeschlossen ist, wird die Ausgabe in der BigQuery-Tabelle gespeichert, die Sie in der Anfrage angegeben haben.

BigQuery-Ausgabebeispiel

Anfrage	Antwort	Status
'{"content":[{...}]}'	{ "candidates": [ { "content": { "role": "model", "parts": [ { "text": "In a medium bowl, whisk together the flour, baking soda, baking powder." } ] }, "finishReason": "STOP", "safetyRatings": [ { "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "probability": "NEGLIGIBLE", "probabilityScore": 0.14057204, "severity": "HARM_SEVERITY_NEGLIGIBLE", "severityScore": 0.14270912 } ] } ], "usageMetadata": { "promptTokenCount": 8, "candidatesTokenCount": 396, "totalTokenCount": 404 } }

Anfrage

Antwort

Status

'{"content":[{...}]}'

{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "text": "In a medium bowl, whisk together the flour, baking soda, baking powder."
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE",
          "probabilityScore": 0.14057204,
          "severity": "HARM_SEVERITY_NEGLIGIBLE",
          "severityScore": 0.14270912
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 8,
    "candidatesTokenCount": 396,
    "totalTokenCount": 404
  }
}

Beispiel für Cloud Storage-Ausgabe

PROJECT_ID=[PROJECT ID]
REGION="us-central1"
MODEL_URI="publishers/google/models/gemini-1.0-pro-001@default"
INPUT_URI="[GCS INPUT URI]"
OUTPUT_URI="[OUTPUT URI]"

# Setting variables based on parameters
ENDPOINT="${REGION}-autopush-aiplatform.sandbox.googleapis.com"
API_VERSION=v1
ENV=autopush
BP_JOB_NAME="BP_testing_`date +%Y%m%d_%H%M%S`"

curl \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${ENDPOINT}/${API_VERSION}/projects/${PROJECT_ID}/locations/${REGION}/batchPredictionJobs \
-d '{
    "name": "'${BP_JOB_NAME}'",
    "displayName": "'${BP_JOB_NAME}'",
    "model": "'${MODEL_URI}'",
    "inputConfig": {
      "instancesFormat":"jsonl",
      "gcsSource":{
        "uris" : "'${INPUT_URI}'"
      }
    },
    "outputConfig": {
      "predictionsFormat":"jsonl",
      "gcsDestination":{
        "outputUriPrefix": "'${OUTPUT_URI}'"
      }
    },
    "labels": {"stage": "'${ENV}'"},
}'

Nächste Schritte

Informationen zum Optimieren eines Gemini-Modells finden Sie in der Übersicht über die Modellabstimmung für Gemini
Weitere Informationen zur Batch Prediction API