Object tracking tracks multiple objects detected in an input video or in video
segments. It returns labels (tags) for the detected entities along with the
location of each entity in the frame.
Object tracking differs from label detection. Label detection provides labels
for the entire frame (without bounding boxes), whereas object tracking detects
individual objects and provides, for each one, a label along with a bounding box
that describes its location in the frame. For example, a video of vehicles
crossing an intersection may produce labels such as "car", "truck", "bike",
"tires", "lights", "window", and so on. Each label includes a series
of bounding boxes showing the location of the object in the frame.
Each bounding box also has an associated time segment with a time offset
(timestamp), measured as a duration from the beginning of the video. The
annotation also contains additional entity information, including an entity ID
that you can use to look up that entity in the Google Knowledge Graph Search API.
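As a rough illustration of how the label, entity ID, bounding boxes, and time offsets fit together, the following sketch walks one object annotation and converts each box to pixel coordinates. It assumes the JSON field names of the REST response (`entity`, `entityId`, `description`, `frames`, `normalizedBoundingBox`, `timeOffset`), box coordinates normalized to the range 0 to 1, and offsets encoded as strings such as "0.5s". The `sample` dictionary, the `TrackedBox` type, and the helper names are hypothetical and hand-written for this example; they are not real API output.

```python
from typing import Iterator, NamedTuple


class TrackedBox(NamedTuple):
    """One bounding box of a tracked object, in pixel coordinates."""
    label: str
    entity_id: str
    seconds: float
    left: int
    top: int
    right: int
    bottom: int


def parse_offset(offset: str) -> float:
    """Convert a duration string such as '1.5s' to seconds (assumed encoding)."""
    return float(offset.rstrip("s"))


def iter_boxes(annotation: dict, width: int, height: int) -> Iterator[TrackedBox]:
    """Yield one TrackedBox per frame of a single object annotation."""
    entity = annotation.get("entity", {})
    for frame in annotation.get("frames", []):
        box = frame.get("normalizedBoundingBox", {})
        yield TrackedBox(
            label=entity.get("description", ""),
            entity_id=entity.get("entityId", ""),
            seconds=parse_offset(frame.get("timeOffset", "0s")),
            # Scale normalized coordinates by the frame size.
            left=round(box.get("left", 0.0) * width),
            top=round(box.get("top", 0.0) * height),
            right=round(box.get("right", 0.0) * width),
            bottom=round(box.get("bottom", 0.0) * height),
        )


# Hypothetical annotation for illustration only; values are made up.
sample = {
    "entity": {"entityId": "/m/0k4j", "description": "car"},
    "frames": [
        {"normalizedBoundingBox": {"left": 0.25, "top": 0.5, "right": 0.75, "bottom": 1.0},
         "timeOffset": "0s"},
        {"normalizedBoundingBox": {"left": 0.5, "top": 0.5, "right": 1.0, "bottom": 1.0},
         "timeOffset": "0.5s"},
    ],
}

boxes = list(iter_boxes(sample, width=1280, height=720))
for tracked in boxes:
    print(tracked)
```

The `entity_id` value on each box is the identifier you would pass to the Google Knowledge Graph Search API to look up more information about the entity.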
To make an object tracking request, call the annotate method and specify
OBJECT_TRACKING in the features field.
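A minimal sketch of such a request body is shown below. It only builds and prints the JSON; it does not send it. The `inputUri` field name is an assumption based on the REST representation of the annotate method, and the Cloud Storage path and the `build_object_tracking_request` helper are hypothetical placeholders.

```python
import json


def build_object_tracking_request(input_uri: str) -> dict:
    """Return a JSON-serializable annotate body that asks for object tracking."""
    return {
        # Assumed field name for the video location (a Cloud Storage URI).
        "inputUri": input_uri,
        # The feature named in the text above.
        "features": ["OBJECT_TRACKING"],
    }


# Hypothetical bucket and object name, for illustration only.
body = build_object_tracking_request("gs://example-bucket/intersection.mp4")
print(json.dumps(body, indent=2))
```

In practice this body would be sent to the annotate method with appropriate authentication, either directly over REST or through a client library.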
Note: There is a limit on the size of the detected objects. Very small objects
in the video might not get detected.

Check out the Video Intelligence API visualizer
(https://zackakil.github.io/video-intelligence-api-visualiser/#Object%20Tracking)
to see this feature in action.

For examples, see the Object Tracking (/video-intelligence/docs/object-tracking)
and Shot Change Detection (/video-intelligence/docs/shot-detection) tutorials.

Last updated 2025-09-03 UTC.