This page shows you how to use the Gemini model to perform bounding box detection, which identifies objects in an image and returns their coordinates. This experimental launch gives developers a tool for object detection and localization within images and video. By identifying and delineating objects with bounding boxes, developers can use the detected objects and their locations in their own applications.
To learn more, see the SDK reference documentation.
Before running the code example, set the environment variables listed with it to use the Gen AI SDK with Vertex AI.
Learn about bounding box detection
The model returns bounding boxes in the [y_min, x_min, y_max, x_max] format. The top left corner of the image is the origin. The x and y axes run horizontally and vertically, respectively. Coordinate values are normalized to 0-1000 for every image, regardless of its pixel dimensions.
Code example
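Because coordinates are normalized to 0-1000 rather than given in pixels, you typically need to rescale them before drawing or cropping. The following is a minimal sketch of that conversion, assuming integer coordinates. The helper name `to_pixel_box` and the (left, top, right, bottom) output order are illustrative choices, not part of the SDK.

```python
def to_pixel_box(box, image_width, image_height):
    """Rescale a normalized box to pixel coordinates.

    `box` is [y_min, x_min, y_max, x_max] with values in 0-1000 and the
    origin at the top left corner. Returns (left, top, right, bottom) in
    pixels, the order most drawing libraries expect.
    """
    y_min, x_min, y_max, x_max = box
    # x values scale with the image width, y values with the image height.
    left = x_min * image_width // 1000
    top = y_min * image_height // 1000
    right = x_max * image_width // 1000
    bottom = y_max * image_height // 1000
    return (left, top, right, bottom)


if __name__ == "__main__":
    # A box covering the middle of a 640x480 image.
    print(to_pixel_box([100, 250, 500, 750], 640, 480))
```

Note that the y values come first in the model output, so the helper swaps the order when producing the pixel tuple.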
Python
Install
pip install --upgrade google-genai
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
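If you want to confirm from Python that the variables above are visible to your process before creating a client, a small check like the following can help. This is a sketch only: `missing_vertex_env` is a hypothetical helper, not an SDK function.

```python
import os

# The three variables exported in the shell snippet above.
REQUIRED_VARS = (
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "GOOGLE_GENAI_USE_VERTEXAI",
)


def missing_vertex_env(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vertex_env()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Environment is configured for Vertex AI.")
```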
Bounding box detection
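As a minimal sketch of how a detection request might look with the Gen AI SDK, the example below asks the model for JSON output and validates the returned boxes. Several details are assumptions made for illustration: the model name `gemini-2.5-flash`, the `box_2d` and `label` field names in the schema, the `image/jpeg` MIME type, and the helper names `parse_boxes` and `detect_boxes`. Check the SDK reference documentation for the models and options available to your project.

```python
import json

# Assumed response schema: a JSON array of objects with a normalized box
# and a label. The field names are illustrative, not mandated by the SDK.
BOX_SCHEMA = {
    "type": "ARRAY",
    "items": {
        "type": "OBJECT",
        "properties": {
            "box_2d": {"type": "ARRAY", "items": {"type": "INTEGER"}},
            "label": {"type": "STRING"},
        },
        "required": ["box_2d", "label"],
    },
}


def parse_boxes(response_text):
    """Parse the JSON text and check each [y_min, x_min, y_max, x_max] box."""
    boxes = []
    for item in json.loads(response_text):
        box = item["box_2d"]
        if len(box) != 4 or not all(0 <= v <= 1000 for v in box):
            raise ValueError(f"unexpected box: {box!r}")
        boxes.append({"label": item["label"], "box_2d": [int(v) for v in box]})
    return boxes


def detect_boxes(image_uri, prompt, model="gemini-2.5-flash"):
    """Send one image and a prompt to the model and return parsed boxes."""
    # Imported here so the parsing helper works without the SDK installed.
    from google import genai
    from google.genai import types

    # With the environment variables set, the client targets Vertex AI.
    client = genai.Client()
    response = client.models.generate_content(
        model=model,
        contents=[
            types.Part.from_uri(file_uri=image_uri, mime_type="image/jpeg"),
            prompt,
        ],
        config=types.GenerateContentConfig(
            response_mime_type="application/json",
            response_schema=BOX_SCHEMA,
        ),
    )
    return parse_boxes(response.text)
```

You would call `detect_boxes` with the URI of your own image and a prompt that names the objects to locate. The returned coordinates are still normalized to 0-1000.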
Last updated 2025-08-21 UTC.