Compatibilidade com o OpenAI

Os modelos do Gemini podem ser acessados usando as bibliotecas OpenAI (Python e TypeScript / Javascript) e a API REST. Somente a Google Cloud Auth é compatível com a biblioteca OpenAI na Vertex AI. Se você ainda não usa as bibliotecas OpenAI, recomendamos chamar a API Gemini diretamente.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
#project_id = "PROJECT_ID"
location = "us-central1"

# # Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)

response = client.chat.completions.create(
  model="google/gemini-2.0-flash-001",
  messages=[
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain to me how AI works"}
  ]
)

print(response.choices[0].message)

O que mudou?

  • api_key=credentials.token: para usar a autenticação Google Cloud , receba um token de autenticaçãoGoogle Cloud usando o código de exemplo.

  • base_url: informa à biblioteca OpenAI para enviar solicitações para Google Cloud em vez do URL padrão.

  • model="google/gemini-2.0-flash-001": escolha um modelo compatível do Gemini entre os modelos hospedados pela Vertex.

Thinking

Os modelos Gemini 2.5 são treinados para pensar em problemas complexos, o que resulta em um raciocínio muito melhor. A API Gemini vem com um parâmetro"orçamento de pensamento", que oferece controle detalhado sobre o quanto o modelo vai pensar.

Ao contrário da API Gemini, a API OpenAI oferece três níveis de controle de pensamento: "baixo", "médio" e "alto", que são mapeados nos bastidores para orçamentos de token de pensamento de 1K, 8K e 24K.

Para desativar o pensamento, defina o esforço de raciocínio como "nenhum".

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
#project_id = PROJECT_ID
location = "us-central1"

# # Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)

response = client.chat.completions.create(
  model="google/gemini-2.5-flash-preview-04-17",
  reasoning_effort="low",
  messages=[
      {"role": "system", "content": "You are a helpful assistant."},
      {
          "role": "user",
          "content": "Explain to me how AI works"
      }
  ]
)
print(response.choices[0].message)

Streaming

A API Gemini oferece suporte a respostas de streaming.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
#project_id = PROJECT_ID
location = "us-central1"

credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)
response = client.chat.completions.create(
model="google/gemini-2.0-flash",
messages=[
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "Hello!"}
],
stream=True
)

for chunk in response:
  print(chunk.choices[0].delta)

Chamadas de função

A chamada de função facilita a geração de saídas de dados estruturados de modelos generativos e é compatível com a API Gemini.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
#project_id = PROJECT_ID
location = "us-central1"

credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)

tools = [
{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get the weather in a given location",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {
          "type": "string",
          "description": "The city and state, e.g. Chicago, IL",
        },
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
      },
      "required": ["location"],
    },
  }
}
]

messages = [{"role": "user", "content": "What's the weather like in Chicago today?"}]
response = client.chat.completions.create(
model="google/gemini-2.0-flash",
messages=messages,
tools=tools,
tool_choice="auto"
)

print(response)

Compreensão de imagens

Os modelos Gemini são multimodal por padrão e oferecem o melhor desempenho em muitas tarefas de visão comuns.

Python

from google.auth import default
import google.auth.transport.requests

import base64
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

# Function to encode the image
def encode_image(image_path):
with open(image_path, "rb") as image_file:
  return base64.b64encode(image_file.read()).decode('utf-8')

# Getting the base64 string
#base64_image = encode_image("Path/to/image.jpeg")

response = client.chat.completions.create(
model="google/gemini-2.0-flash",
messages=[
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "What is in this image?",
      },
      {
        "type": "image_url",
        "image_url": {
          "url":  f"data:image/jpeg;base64,{base64_image}"
        },
      },
    ],
  }
],
)

print(response.choices[0])

Gerar uma imagem

Python

from google.auth import default
import google.auth.transport.requests

import base64
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

# Function to encode the image
def encode_image(image_path):
with open(image_path, "rb") as image_file:
  return base64.b64encode(image_file.read()).decode('utf-8')

# Getting the base64 string
#base64_image = encode_image("Path/to/image.jpeg")
base64_image = encode_image("/content/wayfairsofa.jpg")

response = client.chat.completions.create(
model="google/gemini-2.0-flash",
messages=[
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "What is in this image?",
      },
      {
        "type": "image_url",
        "image_url": {
          "url":  f"data:image/jpeg;base64,{base64_image}"
        },
      },
    ],
  }
],
)

print(response.choices[0])

Compreensão de áudio

Analisar a entrada de áudio:

Python

from google.auth import default
import google.auth.transport.requests

import base64
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

with open("/path/to/your/audio/file.wav", "rb") as audio_file:
base64_audio = base64.b64encode(audio_file.read()).decode('utf-8')

response = client.chat.completions.create(
  model="gemini-2.0-flash",
  messages=[
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "Transcribe this audio",
      },
      {
            "type": "input_audio",
            "input_audio": {
              "data": base64_audio,
              "format": "wav"
        }
      }
    ],
  }
],
)

print(response.choices[0].message.content)

Resposta estruturada

Os modelos do Gemini podem gerar objetos JSON em qualquer estrutura definida.

Python

from google.auth import default
import google.auth.transport.requests

from pydantic import BaseModel
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

class CalendarEvent(BaseModel):
  name: str
  date: str
  participants: list[str]

completion = client.beta.chat.completions.parse(
  model="google/gemini-2.0-flash",
  messages=[
      {"role": "system", "content": "Extract the event information."},
      {"role": "user", "content": "John and Susan are going to an AI conference on Friday."},
  ],
  response_format=CalendarEvent,
)

print(completion.choices[0].message.parsed)

Limitações atuais

  • As credenciais ficam ativas por uma hora por padrão. Após a expiração, elas precisam ser atualizadas. Consulte este exemplo de código para mais informações.

  • O suporte para as bibliotecas do OpenAI ainda está em versão prévia enquanto ampliamos o suporte a recursos. Se tiver dúvidas ou problemas, poste na Google Cloud Comunidade.

A seguir