OpenAI compatibility

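Gemini models can be accessed through the OpenAI libraries (Python and TypeScript/JavaScript) as well as the REST API. When you use the OpenAI libraries with Vertex AI, only Google Cloud Auth is supported. If you aren't already using the OpenAI libraries, we recommend calling the Gemini API directly.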

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Replace with your Google Cloud project ID
project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)

response = client.chat.completions.create(
  model="google/gemini-2.0-flash-001",
  messages=[
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain to me how AI works"}
  ]
)

print(response.choices[0].message)

What changed?

  • api_key=credentials.token: To use Google Cloud authentication, obtain a Google Cloud auth token as shown in the sample code.

  • base_url: Tells the OpenAI library to send requests to Google Cloud instead of the default URL.

  • model="google/gemini-2.0-flash-001": Selects a compatible Gemini model from the models that Vertex hosts.
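As a minimal sketch of how the base_url value in the sample is assembled, the helper below builds the endpoint string from a project ID and a location. The name build_base_url is hypothetical and is not part of the OpenAI or Google libraries; the URL pattern is copied from the sample above.

```python
def build_base_url(project_id: str, location: str = "us-central1") -> str:
    """Return the OpenAI-compatible endpoint URL used in the sample above.

    build_base_url is a hypothetical helper for illustration only.
    """
    if not project_id or not location:
        raise ValueError("project_id and location must be non-empty")
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/endpoints/openapi"
    )


# "my-project" is a placeholder project ID.
print(build_base_url("my-project"))
```

The returned string can be passed as base_url when the OpenAI client is constructed.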

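Thinking

Gemini 2.5 models are trained to think through complex problems, which significantly improves their reasoning. The Gemini API provides a "thinking budget" parameter that gives fine-grained control over how much the model thinks.

Unlike the Gemini API, the OpenAI API offers three levels of thinking control: "low", "medium", and "high". Behind the scenes, these are mapped to thinking token budgets of 1,000, 8,000, and 24,000.

To disable thinking, set the reasoning effort to "none".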

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Replace with your Google Cloud project ID
project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)

response = client.chat.completions.create(
  model="google/gemini-2.5-flash-preview-04-17",
  reasoning_effort="low",
  messages=[
      {"role": "system", "content": "You are a helpful assistant."},
      {
          "role": "user",
          "content": "Explain to me how AI works"
      }
  ]
)
print(response.choices[0].message)
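The mapping between reasoning effort levels and thinking token budgets described in this section can be written down as a small lookup. This is only an illustrative sketch: the names THINKING_BUDGETS and thinking_budget are hypothetical, the mapping is applied by the service rather than by client code, and returning None for "none" is an assumption used here to represent disabled thinking.

```python
from typing import Optional

# Budgets as stated in this section; the dictionary itself is illustrative.
THINKING_BUDGETS = {"low": 1_000, "medium": 8_000, "high": 24_000}


def thinking_budget(reasoning_effort: str) -> Optional[int]:
    """Return the documented token budget, or None when thinking is disabled."""
    if reasoning_effort == "none":
        return None  # Assumption: None stands for "thinking disabled".
    try:
        return THINKING_BUDGETS[reasoning_effort]
    except KeyError:
        raise ValueError(
            f"unsupported reasoning_effort: {reasoning_effort!r}"
        ) from None


for effort in ("none", "low", "medium", "high"):
    print(effort, thinking_budget(effort))
```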

Streaming

The Gemini API supports streaming responses.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Replace with your Google Cloud project ID
project_id = "PROJECT_ID"
location = "us-central1"

credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)
response = client.chat.completions.create(
  model="google/gemini-2.0-flash",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  stream=True
)

for chunk in response:
  print(chunk.choices[0].delta)
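The loop above prints each delta as it arrives. If the full text is needed afterwards, the deltas can be joined, as in the following sketch. The helper names collect_stream_text and _fake_chunk are hypothetical, and the fake chunks only imitate the choices[0].delta.content shape of streamed chunks so that the sketch runs without a network call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream_text(chunks: Iterable) -> str:
    """Join the text fragments carried by streamed chat completion chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # Skip chunks that carry no choices.
        content = getattr(chunk.choices[0].delta, "content", None)
        if content:  # Deltas may carry no text at all.
            parts.append(content)
    return "".join(parts)


def _fake_chunk(text):
    """Build an offline stand-in for one streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [_fake_chunk("Hel"), _fake_chunk(None), _fake_chunk("lo!")]
print(collect_stream_text(fake_stream))
```

With a real client, the object returned by create(..., stream=True) would be passed in place of fake_stream.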

Function calling

Function calling makes it easier to get structured data output from generative models. It is supported in the Gemini API.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Replace with your Google Cloud project ID
project_id = "PROJECT_ID"
location = "us-central1"

credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)

tools = [
  {
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the weather in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. Chicago, IL",
          },
          "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
      },
    }
  }
]

messages = [{"role": "user", "content": "What's the weather like in Chicago today?"}]
response = client.chat.completions.create(
  model="google/gemini-2.0-flash",
  messages=messages,
  tools=tools,
  tool_choice="auto"
)

print(response)
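The sample above stops at printing the response. A typical next step is to run the function that the model asked for and to send the result back as a tool message. The sketch below shows one way to do that, under stated assumptions: get_weather returns canned data rather than a real forecast, LOCAL_TOOLS and run_tool_calls are hypothetical names, and fake_message imitates the tool_calls shape of response.choices[0].message so that the code runs offline.

```python
import json
from types import SimpleNamespace


def get_weather(location: str, unit: str = "celsius") -> dict:
    """Hypothetical local implementation; returns canned data."""
    return {"location": location, "unit": unit, "forecast": "unknown"}


# Maps tool names declared in `tools` to local callables.
LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_calls(message) -> list:
    """Execute each requested tool call and build the `tool` role replies."""
    replies = []
    for call in message.tool_calls or []:
        func = LOCAL_TOOLS[call.function.name]
        # Arguments arrive as a JSON-encoded string.
        arguments = json.loads(call.function.arguments)
        replies.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(func(**arguments)),
        })
    return replies


# Stand-in for response.choices[0].message, so the sketch runs offline.
fake_message = SimpleNamespace(tool_calls=[SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(
        name="get_weather",
        arguments='{"location": "Chicago, IL"}',
    ),
)])
print(run_tool_calls(fake_message))
```

To continue the conversation, the assistant message and the returned replies would be appended to messages before calling client.chat.completions.create again.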

Image understanding

Gemini models are natively multimodal and provide best-in-class performance on many common vision tasks.

Python

from google.auth import default
import google.auth.transport.requests

import base64
from openai import OpenAI

# TODO(developer): Replace with your Google Cloud project ID
project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

# Function to encode the image as a base64 string
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode("utf-8")

# Getting the base64 string
# TODO(developer): Replace with the path to your image
base64_image = encode_image("Path/to/image.jpeg")

response = client.chat.completions.create(
  model="google/gemini-2.0-flash",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is in this image?",
        },
        {
          "type": "image_url",
          "image_url": {
            "url": f"data:image/jpeg;base64,{base64_image}"
          },
        },
      ],
    }
  ],
)

print(response.choices[0])
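The sample above hardcodes image/jpeg in the data URL. For other image types, the MIME type can be guessed from the file extension, as in this sketch. The helper name to_data_url is hypothetical, and whether a given image type is accepted by the model is not verified here.

```python
import base64
import mimetypes
import os
import tempfile


def to_data_url(image_path: str) -> str:
    """Encode an image file as a data URL, guessing the MIME type by extension."""
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"cannot determine an image MIME type for {image_path!r}")
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


# Offline demonstration with placeholder bytes (not a real image).
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.png")
    with open(sample_path, "wb") as sample_file:
        sample_file.write(b"placeholder bytes")
    print(to_data_url(sample_path)[:22])
```

The returned string can be used as the url value of an image_url content part.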

Generate an image

Python

from google.auth import default
import google.auth.transport.requests

import base64
from openai import OpenAI

# TODO(developer): Replace with your Google Cloud project ID
project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

# Function to encode the image as a base64 string
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode("utf-8")

# Getting the base64 string
# TODO(developer): Replace with the path to your image
base64_image = encode_image("Path/to/image.jpeg")

response = client.chat.completions.create(
  model="google/gemini-2.0-flash",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is in this image?",
        },
        {
          "type": "image_url",
          "image_url": {
            "url": f"data:image/jpeg;base64,{base64_image}"
          },
        },
      ],
    }
  ],
)

print(response.choices[0])

Audio understanding

Analyze audio input:

Python

from google.auth import default
import google.auth.transport.requests

import base64
from openai import OpenAI

# TODO(developer): Replace with your Google Cloud project ID
project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

# Read the audio file and encode it as a base64 string
with open("/path/to/your/audio/file.wav", "rb") as audio_file:
  base64_audio = base64.b64encode(audio_file.read()).decode("utf-8")

response = client.chat.completions.create(
  model="google/gemini-2.0-flash",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Transcribe this audio",
        },
        {
          "type": "input_audio",
          "input_audio": {
            "data": base64_audio,
            "format": "wav"
          }
        }
      ],
    }
  ],
)

print(response.choices[0].message.content)
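The input_audio part in the sample above can also be built by a small helper that derives the format from the file extension. This is a sketch under assumptions: audio_content_part and SUPPORTED_AUDIO_FORMATS are hypothetical names, and the set contains the formats that the OpenAI input_audio content part defines ("wav" and "mp3"). Whether the endpoint described here accepts both is not confirmed by this page.

```python
import base64
import os
import tempfile
from pathlib import Path

# Formats defined for the OpenAI `input_audio` content part (see lead-in).
SUPPORTED_AUDIO_FORMATS = {"wav", "mp3"}


def audio_content_part(audio_path: str) -> dict:
    """Build an `input_audio` content part from an audio file on disk."""
    audio_format = Path(audio_path).suffix.lstrip(".").lower()
    if audio_format not in SUPPORTED_AUDIO_FORMATS:
        raise ValueError(f"unsupported audio format: {audio_format!r}")
    data = base64.b64encode(Path(audio_path).read_bytes()).decode("utf-8")
    return {
        "type": "input_audio",
        "input_audio": {"data": data, "format": audio_format},
    }


# Offline demonstration with placeholder bytes (not real audio).
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.wav")
    Path(sample_path).write_bytes(b"placeholder bytes")
    print(audio_content_part(sample_path)["input_audio"]["format"])
```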

Structured output

Gemini models can output JSON objects in a structure that you define.

Python

from google.auth import default
import google.auth.transport.requests

from pydantic import BaseModel
from openai import OpenAI

# TODO(developer): Replace with your Google Cloud project ID
project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

class CalendarEvent(BaseModel):
  name: str
  date: str
  participants: list[str]

completion = client.beta.chat.completions.parse(
  model="google/gemini-2.0-flash",
  messages=[
      {"role": "system", "content": "Extract the event information."},
      {"role": "user", "content": "John and Susan are going to an AI conference on Friday."},
  ],
  response_format=CalendarEvent,
)

print(completion.choices[0].message.parsed)

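Current limitations

  • Credentials are valid for 1 hour by default. Once they expire, they must be refreshed. For details, see this code example.

  • Support for the OpenAI libraries is still in preview while feature support is being expanded. If you have questions or run into problems, post in the Google Cloud Community.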
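Because credentials expire, long-running code needs to refresh the token before reusing it. The sketch below shows one possible pattern and is not the code example referenced in the list above. The names ensure_fresh_token and FakeCredentials are hypothetical; FakeCredentials only imitates the valid, token, and refresh members of google-auth credentials so that the sketch runs offline.

```python
def ensure_fresh_token(credentials, request) -> str:
    """Refresh `credentials` if they are no longer valid and return the token."""
    if not credentials.valid:
        credentials.refresh(request)
    return credentials.token


class FakeCredentials:
    """Offline stand-in mimicking `valid`, `token`, and `refresh`."""

    def __init__(self):
        self.token = None
        self.valid = False
        self.refresh_count = 0

    def refresh(self, request):
        # A real implementation would contact the auth server here.
        self.refresh_count += 1
        self.token = f"token-{self.refresh_count}"
        self.valid = True


creds = FakeCredentials()
print(ensure_fresh_token(creds, request=None))
```

With real credentials, the request argument would be google.auth.transport.requests.Request(), as in the samples on this page. One option after a refresh is to construct a new OpenAI client with the returned token as api_key.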

What's next