In this experimental launch, we are providing developers with a powerful tool for object detection and localization within images and video. By accurately identifying and delineating objects with bounding boxes, developers can unlock a wide range of applications and enhance the intelligence of their projects.
Key Benefits:
- Simple: Integrate object detection capabilities into your applications with ease, regardless of your computer vision expertise.
- Customizable: Produce bounding boxes based on custom instructions (e.g. "I want to see bounding boxes of all the green objects in this image"), without having to train a custom model.
Technical Details:
- Input: Your prompt and associated images or video frames.
- Output: Bounding boxes in the
[y_min, x_min, y_max, x_max]
format. The top left corner is the origin. Thex
andy
axis go horizontally and vertically, respectively. Coordinate values are normalized to 0-1000 for every image. - Visualization: AI Studio users will see bounding boxes plotted within the UI. Vertex AI users should visualize their bounding boxes through custom visualization code.
Gen AI SDK for Python
Install
pip install --upgrade google-genai
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values # with appropriate values for your project. export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT export GOOGLE_CLOUD_LOCATION=us-central1 export GOOGLE_GENAI_USE_VERTEXAI=True