Index
- ImageAnnotator(interface)
- AnnotateFileRequest(message)
- AnnotateFileResponse(message)
- AnnotateImageRequest(message)
- AnnotateImageResponse(message)
- AsyncAnnotateFileRequest(message)
- AsyncAnnotateFileResponse(message)
- AsyncBatchAnnotateFilesRequest(message)
- AsyncBatchAnnotateFilesResponse(message)
- AsyncBatchAnnotateImagesRequest(message)
- BatchAnnotateFilesRequest(message)
- BatchAnnotateFilesResponse(message)
- BatchAnnotateImagesRequest(message)
- BatchAnnotateImagesResponse(message)
- Block(message)
- Block.BlockType(enum)
- BoundingPoly(message)
- ColorInfo(message)
- CropHint(message)
- CropHintsAnnotation(message)
- CropHintsParams(message)
- DominantColorsAnnotation(message)
- EntityAnnotation(message)
- FaceAnnotation(message)
- FaceAnnotation.Landmark(message)
- FaceAnnotation.Landmark.Type(enum)
- Feature(message)
- Feature.Type(enum)
- GcsDestination(message)
- GcsSource(message)
- Image(message)
- ImageAnnotationContext(message)
- ImageContext(message)
- ImageProperties(message)
- ImageSource(message)
- InputConfig(message)
- LatLongRect(message)
- Likelihood(enum)
- LocalizedObjectAnnotation(message)
- LocationInfo(message)
- NormalizedVertex(message)
- OperationMetadata(message)
- OperationMetadata.State(enum)
- OutputConfig(message)
- Page(message)
- Paragraph(message)
- Position(message)
- Product(message)
- Product.KeyValue(message)
- ProductSearchParams(message)
- ProductSearchResults(message)
- ProductSearchResults.GroupedResult(message)
- ProductSearchResults.ObjectAnnotation(message)
- ProductSearchResults.Result(message)
- Property(message)
- SafeSearchAnnotation(message)
- Symbol(message)
- TextAnnotation(message)
- TextAnnotation.DetectedBreak(message)
- TextAnnotation.DetectedBreak.BreakType(enum)
- TextAnnotation.DetectedLanguage(message)
- TextAnnotation.TextProperty(message)
- TextDetectionParams(message)
- Vertex(message)
- WebDetection(message)
- WebDetection.WebEntity(message)
- WebDetection.WebImage(message)
- WebDetection.WebLabel(message)
- WebDetection.WebPage(message)
- WebDetectionParams(message)
- Word(message)
ImageAnnotator
Service that performs Google Cloud Vision API detection tasks over client images, such as face, landmark, logo, label, and text detection. The ImageAnnotator service returns detected entities from the images.
| AsyncBatchAnnotateFiles | |
|---|---|
| 
 Run asynchronous image detection and annotation for a list of generic files, such as PDF files, which may contain multiple pages and multiple images per page. Progress and results can be retrieved through the  
 | |
| AsyncBatchAnnotateImages | |
|---|---|
| 
 Run asynchronous image detection and annotation for a list of images. Progress and results can be retrieved through the  This service will write image annotation outputs to json files in customer Google Cloud Storage bucket, each json file containing BatchAnnotateImagesResponse proto. 
 | |
| BatchAnnotateFiles | |
|---|---|
| Service that performs image detection and annotation for a batch of files. Currently, only "application/pdf", "image/tiff" and "image/gif" are supported. The service extracts at most 5 frames (GIF) or pages (PDF or TIFF) from each file provided (customers can specify which 5 in AnnotateFileRequest.pages) and performs detection and annotation for each extracted image. | |
| BatchAnnotateImages | |
|---|---|
| Run image detection and annotation for a batch of images. | |
AnnotateFileRequest
A request to annotate one single file, e.g. a PDF, TIFF or GIF file.
| Fields | |
|---|---|
| input_config | Required. Information about the input file. | 
| features[] | Required. Requested features. | 
| image_context | Additional context that may accompany the image(s) in the file. | 
| pages[] | Pages of the file on which to perform image annotation. Page numbering starts at 1: page 1 is the first page of the file and page 2 is the second. Page numbers can also be negative: page -1 is the last page and page -2 is the second-to-last page. At most 5 pages are supported per request. If the file is a GIF instead of a PDF or TIFF, a page refers to a GIF frame. If this field is empty, the service annotates the first 5 pages of the file by default. |
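The page-numbering rules for pages[] can be sketched in a few lines of Python. This is only an illustration of the semantics described in the table, not the service's implementation: the helper name `resolve_pages` is hypothetical, and rejecting 0 or out-of-range values is an assumption, since the field description does not say how such values are handled.

```python
from typing import List, Sequence

MAX_PAGES_PER_REQUEST = 5  # limit stated in the pages[] description


def resolve_pages(pages: Sequence[int], total_pages: int) -> List[int]:
    """Map pages[] values (1-based, negatives count from the end) to 1-based page numbers."""
    if not pages:
        # Empty field: the first 5 pages, or fewer for a short file.
        return list(range(1, min(total_pages, MAX_PAGES_PER_REQUEST) + 1))
    if len(pages) > MAX_PAGES_PER_REQUEST:
        raise ValueError("at most 5 pages are supported per request")
    resolved = []
    for page in pages:
        # Assumption: 0 and out-of-range values are rejected; the reference does not say.
        if page == 0 or abs(page) > total_pages:
            raise ValueError(f"page {page} is out of range for a {total_pages}-page file")
        resolved.append(page if page > 0 else total_pages + page + 1)
    return resolved


print(resolve_pages([1, 2, -1, -2], total_pages=10))
```

For a 10-page file, -1 resolves to page 10 and -2 to page 9, matching the "last page" and "second-to-last page" wording above.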
AnnotateFileResponse
Response to a single file annotation request. A file may contain one or more images, which individually have their own responses.
| Fields | |
|---|---|
| input_config | Information about the file for which this response is generated. | 
| responses[] | Individual responses to images found within the file. This field will be empty if the  | 
| total_pages | This field gives the total number of pages in the file. |
| error | If set, represents the error message for the failed request. The  | 
AnnotateImageRequest
Request for performing Google Cloud Vision API tasks over a user-provided image, with user-requested features, and with context information.
| Fields | |
|---|---|
| image | The image to be processed. | 
| features[] | Requested features. | 
| image_context | Additional context that may accompany the image. | 
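As a rough sketch of how the three fields fit together, the following Python builds a request as a plain dictionary keyed by the proto field names used in this reference. The helper name `annotate_image_request` and the example URL are hypothetical, and the dictionary layout is an assumption for illustration: client libraries and JSON transports may expect different casing or typed message objects.

```python
from typing import Any, Dict, Iterable, Optional


def annotate_image_request(
    image_uri: str,
    feature_types: Iterable[str],
    max_results: Optional[int] = None,
    language_hints: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Assemble an AnnotateImageRequest-shaped dict: image, features[], image_context."""
    features = []
    for feature_type in feature_types:
        feature: Dict[str, Any] = {"type": feature_type}
        if max_results is not None:
            feature["max_results"] = max_results
        features.append(feature)
    request: Dict[str, Any] = {
        "image": {"source": {"image_uri": image_uri}},
        "features": features,
    }
    if language_hints:
        # image_context is optional, so it is only added when there is something to say.
        request["image_context"] = {"language_hints": list(language_hints)}
    return request


# Hypothetical URL, used only to show the shape of the result.
request = annotate_image_request(
    "https://example.com/photo.jpg", ["LABEL_DETECTION", "TEXT_DETECTION"], max_results=5
)
print(request)
```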
AnnotateImageResponse
Response to an image annotation request.
| Fields | |
|---|---|
| face_annotations[] | If present, face detection has completed successfully. | 
| landmark_annotations[] | If present, landmark detection has completed successfully. | 
| logo_annotations[] | If present, logo detection has completed successfully. | 
| label_annotations[] | If present, label detection has completed successfully. | 
| localized_object_annotations[] | If present, localized object detection has completed successfully. This will be sorted descending by confidence score. | 
| text_annotations[] | If present, text (OCR) detection has completed successfully. | 
| full_text_annotation | If present, text (OCR) detection or document (OCR) text detection has completed successfully. This annotation provides the structural hierarchy for the OCR detected text. | 
| safe_search_annotation | If present, safe-search annotation has completed successfully. | 
| image_properties_annotation | If present, image properties were extracted successfully. | 
| crop_hints_annotation | If present, crop hints have completed successfully. | 
| web_detection | If present, web detection has completed successfully. | 
| product_search_results | If present, product search has completed successfully. | 
| error | If set, represents the error message for the operation. Note that filled-in image annotations are guaranteed to be correct, even when  | 
| context | If present, contextual information is needed to understand where this image comes from. | 
AsyncAnnotateFileRequest
An offline file annotation request.
| Fields | |
|---|---|
| input_config | Required. Information about the input file. | 
| features[] | Required. Requested features. | 
| image_context | Additional context that may accompany the image(s) in the file. | 
| output_config | Required. The desired output location and metadata (e.g. format). | 
AsyncAnnotateFileResponse
The response for a single offline file annotation request.
| Fields | |
|---|---|
| output_config | The output location and metadata from AsyncAnnotateFileRequest. | 
AsyncBatchAnnotateFilesRequest
Multiple async file annotation requests are batched into a single service call.
| Fields | |
|---|---|
| requests[] | Required. Individual async file annotation requests for this batch. | 
| parent | 
 Optional. Target project and location to make a call. Format:  If no parent is specified, a region will be chosen automatically. Supported location-ids:   Example:  | 
AsyncBatchAnnotateFilesResponse
Response to an async batch file annotation request.
| Fields | |
|---|---|
| responses[] | The list of file annotation responses, one for each request in AsyncBatchAnnotateFilesRequest. | 
AsyncBatchAnnotateImagesRequest
Request for async image annotation for a list of images.
| Fields | |
|---|---|
| requests[] | Required. Individual image annotation requests for this batch. | 
| output_config | Required. The desired output location and metadata (e.g. format). | 
| parent | 
 Optional. Target project and location to make a call. Format:  If no parent is specified, a region will be chosen automatically. Supported location-ids:  Example:  | 
BatchAnnotateFilesRequest
A list of requests to annotate files using the BatchAnnotateFiles API.
| Fields | |
|---|---|
| requests[] | Required. The list of file annotation requests. Currently, only one AnnotateFileRequest is supported per BatchAnnotateFilesRequest. |
| parent | 
 Optional. Target project and location to make a call. Format:  If no parent is specified, a region will be chosen automatically. Supported location-ids:   Example:  | 
BatchAnnotateFilesResponse
A list of file annotation responses.
| Fields | |
|---|---|
| responses[] | The list of file annotation responses, each response corresponding to each AnnotateFileRequest in BatchAnnotateFilesRequest. | 
BatchAnnotateImagesRequest
Multiple image annotation requests are batched into a single service call.
| Fields | |
|---|---|
| requests[] | Required. Individual image annotation requests for this batch. | 
| parent | 
 Optional. Target project and location to make a call. Format:  If no parent is specified, a region will be chosen automatically. Supported location-ids:   Example:  | 
BatchAnnotateImagesResponse
Response to a batch image annotation request.
| Fields | |
|---|---|
| responses[] | Individual responses to image annotation requests within the batch. | 
Block
Logical element on the page.
| Fields | |
|---|---|
| property | Additional information detected for the block. | 
| bounding_box | The bounding box for the block. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected, the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example: when the text is horizontal, the top edge runs 0----1 and the bottom edge runs 3----2; when it is rotated 180 degrees around the top-left corner, the top edge runs 2----3 and the bottom edge runs 1----0, and the vertex order will still be (0, 1, 2, 3). |
| paragraphs[] | List of paragraphs in this block (if this block is of type text). |
| block_type | Detected block type (text, image etc) for this block. | 
| confidence | Confidence of the OCR results on the block. Range [0, 1]. |
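Because the vertex order follows the text's natural orientation rather than the image's, the direction of the edge from vertex 0 to vertex 1 indicates how the text is rotated. The sketch below shows one way to read that off. The helper name `text_rotation_degrees` and the representation of vertices as (x, y) tuples are assumptions made for illustration, as is the usual image convention that y grows downward.

```python
import math
from typing import Sequence, Tuple


def text_rotation_degrees(vertices: Sequence[Tuple[float, float]]) -> float:
    """Angle of the edge from vertex 0 to vertex 1, in degrees within [0, 360)."""
    (x0, y0), (x1, y1) = vertices[0], vertices[1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360


# Horizontal text: vertex 0 is top-left and vertex 1 is top-right in the image.
upright = [(0, 0), (10, 0), (10, 5), (0, 5)]
# Rotated 180 degrees: vertex 0 now sits at the bottom-right of the image.
flipped = [(10, 5), (0, 5), (0, 0), (10, 0)]

print(text_rotation_degrees(upright), text_rotation_degrees(flipped))
```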
BlockType
Type of a block (text, image etc) as identified by OCR.
| Enums | |
|---|---|
| UNKNOWN | Unknown block type. | 
| TEXT | Regular text block. | 
| TABLE | Table block. | 
| PICTURE | Image block. | 
| RULER | Horizontal/vertical line box. | 
| BARCODE | Barcode block. | 
BoundingPoly
A bounding polygon for the detected image annotation.
| Fields | |
|---|---|
| vertices[] | The bounding polygon vertices. | 
| normalized_vertices[] | The bounding polygon normalized vertices. | 
ColorInfo
Color information consists of RGB channels, a score, and the fraction of the image that the color occupies.
| Fields | |
|---|---|
| color | RGB components of the color. | 
| score | Image-specific score for this color. Value in range [0, 1]. |
| pixel_fraction | The fraction of pixels the color occupies in the image. Value in range [0, 1]. |
CropHint
Single crop hint that is used to generate a new crop when serving an image.
| Fields | |
|---|---|
| bounding_poly | The bounding polygon for the crop region. The coordinates of the bounding box are in the original image's scale. | 
| confidence | Confidence of this being a salient region. Range [0, 1]. |
| importance_fraction | Fraction of importance of this salient region with respect to the original image. |
CropHintsAnnotation
Set of crop hints that are used to generate new crops when serving images.
| Fields | |
|---|---|
| crop_hints[] | Crop hint results. | 
CropHintsParams
Parameters for crop hints annotation request.
| Fields | |
|---|---|
| aspect_ratios[] | Aspect ratios in floats, representing the ratio of the width to the height of the image. For example, if the desired aspect ratio is 4/3, the corresponding float value should be 1.33333. If not specified, the best possible crop is returned. The number of provided aspect ratios is limited to a maximum of 16; any aspect ratios provided after the 16th are ignored. |
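The width-to-height conversion can be illustrated with a small helper. This is a sketch only: the function name `aspect_ratios` is hypothetical, and rounding to five decimal places is an assumption chosen to mirror the 1.33333 example in the field description.

```python
from typing import Iterable, List, Tuple

MAX_ASPECT_RATIOS = 16  # ratios after the 16th are ignored, per the field description


def aspect_ratios(sizes: Iterable[Tuple[int, int]]) -> List[float]:
    """Convert (width, height) pairs into width/height floats for aspect_ratios[]."""
    ratios = [round(width / height, 5) for width, height in sizes]
    # Trim client-side so the request contains only the values the service would use.
    return ratios[:MAX_ASPECT_RATIOS]


print(aspect_ratios([(4, 3), (16, 9), (1, 1)]))
```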
DominantColorsAnnotation
Set of dominant colors and their corresponding scores.
| Fields | |
|---|---|
| colors[] | RGB color values with their score and pixel fraction. | 
EntityAnnotation
Set of detected entity features.
| Fields | |
|---|---|
| mid | Opaque entity ID. Some IDs may be available in Google Knowledge Graph Search API. |
| locale | 
 The language code for the locale in which the entity textual  | 
| description | 
 Entity textual description, expressed in its  | 
| score | Overall score of the result. Range [0, 1]. |
| confidence | 
 Deprecated. Use  | 
| topicality | The relevancy of the ICA (Image Content Annotation) label to the image. For example, the relevancy of "tower" is likely higher to an image containing the detected "Eiffel Tower" than to an image containing a detected distant towering building, even though the confidence that there is a tower in each image may be the same. Range [0, 1]. |
| bounding_poly | Image region to which this entity belongs. Not produced for  | 
| locations[] | The location information for the detected entity. Multiple  | 
| properties[] | Some entities may have optional user-supplied  | 
FaceAnnotation
A face annotation object contains the results of face detection.
| Fields | |
|---|---|
| bounding_poly | The bounding polygon around the face. The coordinates of the bounding box are in the original image's scale. The bounding box is computed to "frame" the face in accordance with human expectations. It is based on the landmarker results. Note that one or more x and/or y coordinates may not be generated in the  | 
| fd_bounding_poly | The  
 (face detection) prefix. | 
| landmarks[] | Detected face landmarks. | 
| roll_angle | Roll angle, which indicates the amount of clockwise/anti-clockwise rotation of the face relative to the image vertical about the axis perpendicular to the face. Range [-180,180]. |
| pan_angle | Yaw angle, which indicates the leftward/rightward angle that the face is pointing relative to the vertical plane perpendicular to the image. Range [-180,180]. |
| tilt_angle | Pitch angle, which indicates the upwards/downwards angle that the face is pointing relative to the image's horizontal plane. Range [-180,180]. |
| detection_confidence | Detection confidence. Range [0, 1]. |
| landmarking_confidence | Face landmarking confidence. Range [0, 1]. |
| joy_likelihood | Joy likelihood. | 
| sorrow_likelihood | Sorrow likelihood. | 
| anger_likelihood | Anger likelihood. | 
| surprise_likelihood | Surprise likelihood. | 
| under_exposed_likelihood | Under-exposed likelihood. | 
| blurred_likelihood | Blurred likelihood. | 
| headwear_likelihood | Headwear likelihood. | 
Landmark
A face-specific landmark (for example, a face feature).
| Fields | |
|---|---|
| type | Face landmark type. | 
| position | Face landmark position. | 
Type
Face landmark (feature) type. Left and right are defined from the vantage of the viewer of the image without considering mirror projections typical of photos. So, LEFT_EYE, typically, is the person's right eye.
| Enums | |
|---|---|
| UNKNOWN_LANDMARK | Unknown face landmark detected. Should not be filled. | 
| LEFT_EYE | Left eye. | 
| RIGHT_EYE | Right eye. | 
| LEFT_OF_LEFT_EYEBROW | Left of left eyebrow. | 
| RIGHT_OF_LEFT_EYEBROW | Right of left eyebrow. | 
| LEFT_OF_RIGHT_EYEBROW | Left of right eyebrow. | 
| RIGHT_OF_RIGHT_EYEBROW | Right of right eyebrow. | 
| MIDPOINT_BETWEEN_EYES | Midpoint between eyes. | 
| NOSE_TIP | Nose tip. | 
| UPPER_LIP | Upper lip. | 
| LOWER_LIP | Lower lip. | 
| MOUTH_LEFT | Mouth left. | 
| MOUTH_RIGHT | Mouth right. | 
| MOUTH_CENTER | Mouth center. | 
| NOSE_BOTTOM_RIGHT | Nose, bottom right. | 
| NOSE_BOTTOM_LEFT | Nose, bottom left. | 
| NOSE_BOTTOM_CENTER | Nose, bottom center. | 
| LEFT_EYE_TOP_BOUNDARY | Left eye, top boundary. | 
| LEFT_EYE_RIGHT_CORNER | Left eye, right corner. | 
| LEFT_EYE_BOTTOM_BOUNDARY | Left eye, bottom boundary. | 
| LEFT_EYE_LEFT_CORNER | Left eye, left corner. | 
| RIGHT_EYE_TOP_BOUNDARY | Right eye, top boundary. | 
| RIGHT_EYE_RIGHT_CORNER | Right eye, right corner. | 
| RIGHT_EYE_BOTTOM_BOUNDARY | Right eye, bottom boundary. | 
| RIGHT_EYE_LEFT_CORNER | Right eye, left corner. | 
| LEFT_EYEBROW_UPPER_MIDPOINT | Left eyebrow, upper midpoint. | 
| RIGHT_EYEBROW_UPPER_MIDPOINT | Right eyebrow, upper midpoint. | 
| LEFT_EAR_TRAGION | Left ear tragion. | 
| RIGHT_EAR_TRAGION | Right ear tragion. | 
| LEFT_EYE_PUPIL | Left eye pupil. | 
| RIGHT_EYE_PUPIL | Right eye pupil. | 
| FOREHEAD_GLABELLA | Forehead glabella. | 
| CHIN_GNATHION | Chin gnathion. | 
| CHIN_LEFT_GONION | Chin left gonion. | 
| CHIN_RIGHT_GONION | Chin right gonion. | 
| LEFT_CHEEK_CENTER | Left cheek center. | 
| RIGHT_CHEEK_CENTER | Right cheek center. | 
Feature
The type of Google Cloud Vision API detection to perform, and the maximum number of results to return for that type. Multiple Feature objects can be specified in the features list.
| Fields | |
|---|---|
| type | The feature type. | 
| max_results | 
 Maximum number of results of this type. Does not apply to  | 
| model | Model to use for the feature. Supported values: "builtin/stable" (the default if unset) and "builtin/latest". |
Type
Type of Google Cloud Vision API feature to be extracted.
| Enums | |
|---|---|
| TYPE_UNSPECIFIED | Unspecified feature type. | 
| FACE_DETECTION | Run face detection. | 
| LANDMARK_DETECTION | Run landmark detection. | 
| LOGO_DETECTION | Run logo detection. | 
| LABEL_DETECTION | Run label detection. | 
| TEXT_DETECTION | Run text detection / optical character recognition (OCR). Text detection is optimized for areas of text within a larger image; if the image is a document, use DOCUMENT_TEXT_DETECTION instead. |
| DOCUMENT_TEXT_DETECTION | Run dense text document OCR. Takes precedence when both DOCUMENT_TEXT_DETECTION and TEXT_DETECTION are present. |
| SAFE_SEARCH_DETECTION | Run Safe Search to detect potentially unsafe or undesirable content. | 
| IMAGE_PROPERTIES | Compute a set of image properties, such as the image's dominant colors. | 
| CROP_HINTS | Run crop hints. | 
| WEB_DETECTION | Run web detection. | 
| PRODUCT_SEARCH | Run Product Search. | 
| OBJECT_LOCALIZATION | Run localizer for object detection. | 
GcsDestination
The Google Cloud Storage location where the output will be written to.
| Fields | |
|---|---|
| uri | 
 Google Cloud Storage URI prefix where the results will be stored. Results will be in JSON format and preceded by its corresponding input URI prefix. This field can either represent a Google Cloud Storage file prefix or Google Cloud Storage directory. In either case, the uri should be unique because in order to get all of the output files, you will need to do a wildcard Google Cloud Storage search on the uri prefix you provide. Examples: 
 If multiple outputs, each response is still AnnotateFileResponse, each of which contains some subset of the full list of AnnotateImageResponse. Multiple outputs can happen if, for example, the output JSON is too large and overflows into multiple sharded files. | 
GcsSource
The Google Cloud Storage location where the input will be read from.
| Fields | |
|---|---|
| uri | Google Cloud Storage URI for the input file. This must only be a Google Cloud Storage object. Wildcards are not currently supported. |
Image
Client image to perform Google Cloud Vision API tasks over.
| Fields | |
|---|---|
| content | 
 Image content, represented as a stream of bytes. Note: As with all  Currently, this field only works for BatchAnnotateImages requests. It does not work for AsyncBatchAnnotateImages requests. | 
| source | Google Cloud Storage image location, or publicly-accessible image URL. If both  | 
ImageAnnotationContext
If an image was produced from a file (e.g. a PDF), this message gives information about the source of that image.
| Fields | |
|---|---|
| uri | The URI of the file used to produce the image. |
| page_number | If the file was a PDF or TIFF, this field gives the page number within the file used to produce the image. |
ImageContext
Image context and/or feature-specific parameters.
| Fields | |
|---|---|
| lat_long_rect | Not used. | 
| language_hints[] | 
 List of languages to use for TEXT_DETECTION. In most cases, an empty value yields the best results since it enables automatic language detection. For languages based on the Latin alphabet, setting  | 
| crop_hints_params | Parameters for crop hints annotation request. | 
| product_search_params | Parameters for product search. | 
| web_detection_params | Parameters for web detection. | 
| text_detection_params | Parameters for text detection and document text detection. | 
ImageProperties
Stores image properties, such as dominant colors.
| Fields | |
|---|---|
| dominant_colors | If present, dominant colors completed successfully. | 
ImageSource
External image source (Google Cloud Storage or web URL image location).
| Fields | |
|---|---|
| gcs_image_uri | 
 Use  The Google Cloud Storage URI of the form  | 
| image_uri | 
 The URI of the source image. Can be either: 
 When both  | 
InputConfig
The desired input location and metadata.
| Fields | |
|---|---|
| gcs_source | The Google Cloud Storage location to read the input from. | 
| content | 
 File content, represented as a stream of bytes. Note: As with all  Currently, this field only works for BatchAnnotateFiles requests. It does not work for AsyncBatchAnnotateFiles requests. | 
| mime_type | The type of the file. Currently only "application/pdf", "image/tiff" and "image/gif" are supported. Wildcards are not supported. |
LatLongRect
Rectangle determined by min and max LatLng pairs.
| Fields | |
|---|---|
| min_lat_lng | Min lat/long pair. | 
| max_lat_lng | Max lat/long pair. | 
Likelihood
A bucketized representation of likelihood, which is intended to give clients highly stable results across model upgrades.
| Enums | |
|---|---|
| UNKNOWN | Unknown likelihood. | 
| VERY_UNLIKELY | It is very unlikely. | 
| UNLIKELY | It is unlikely. | 
| POSSIBLE | It is possible. | 
| LIKELY | It is likely. | 
| VERY_LIKELY | It is very likely. | 
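Since the buckets are ordered, a common pattern is to compare a returned likelihood against a threshold. The sketch below is a local stand-in, not the generated client enum: the numeric values are an assumption that simply follows the order of the table, and the helper name `at_least` is hypothetical.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    """Buckets in the order the table lists them, from least to most likely."""
    UNKNOWN = 0
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    VERY_LIKELY = 5


def at_least(value: str, threshold: Likelihood) -> bool:
    """True if the named likelihood meets the threshold; UNKNOWN never does."""
    likelihood = Likelihood[value]
    return likelihood is not Likelihood.UNKNOWN and likelihood >= threshold


print(at_least("LIKELY", Likelihood.POSSIBLE))
```

Thresholding on buckets rather than raw scores fits the stated goal of keeping results stable across model upgrades.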
LocalizedObjectAnnotation
Set of detected objects with bounding boxes.
| Fields | |
|---|---|
| mid | Object ID that should align with EntityAnnotation mid. |
| language_code | The BCP-47 language code, such as "en-US" or "sr-Latn". For more information, see http://www.unicode.org/reports/tr35/#Unicode_locale_identifier. |
| name | 
 Object name, expressed in its  | 
| score | Score of the result. Range [0, 1]. |
| bounding_poly | Image region to which this object belongs. This must be populated. | 
LocationInfo
Detected entity location information.
| Fields | |
|---|---|
| lat_lng | lat/long location coordinates. | 
NormalizedVertex
A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.
| Fields | |
|---|---|
| x | X coordinate. |
| y | Y coordinate. |
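Normalized coordinates can be mapped back to pixels by scaling with the original image size. The following is a minimal sketch under stated assumptions: vertices are represented as dictionaries with `x` and `y` keys, a missing key is treated as 0, and the helper name `to_pixels` is hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def to_pixels(
    normalized_vertices: Sequence[Dict[str, float]], width: int, height: int
) -> List[Tuple[int, int]]:
    """Scale [0, 1] coordinates to pixel positions in a width x height image."""
    return [
        # Assumption: an absent coordinate means 0.
        (round(vertex.get("x", 0.0) * width), round(vertex.get("y", 0.0) * height))
        for vertex in normalized_vertices
    ]


print(to_pixels([{"x": 0.25, "y": 0.5}, {"x": 1.0}], width=640, height=480))
```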
OperationMetadata
Contains metadata for the BatchAnnotateImages operation.
| Fields | |
|---|---|
| state | Current state of the batch operation. | 
| create_time | The time when the batch request was received. | 
| update_time | The time when the operation result was last updated. | 
State
Batch operation states.
| Enums | |
|---|---|
| STATE_UNSPECIFIED | Invalid. | 
| CREATED | Request is received. | 
| RUNNING | Request is actively being processed. | 
| DONE | The batch processing is done. | 
| CANCELLED | The batch processing was cancelled. | 
OutputConfig
The desired output location and metadata.
| Fields | |
|---|---|
| gcs_destination | The Google Cloud Storage location to write the output(s) to. | 
| batch_size | 
 The max number of response protos to put into each output JSON file on Google Cloud Storage. The valid range is [1, 100]. If not specified, the default value is 20. For example, for one pdf file with 100 pages, 100 response protos will be generated. If  Currently, batch_size only applies to GcsDestination, with potential future support for other output configurations. | 
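The effect of batch_size on the number of output files is simple arithmetic, sketched below. The helper name `output_file_count` is hypothetical, and the calculation assumes each file is filled up to batch_size before the next one starts; the description only guarantees that batch_size is the maximum per file.

```python
import math
from typing import Optional

DEFAULT_BATCH_SIZE = 20  # default stated in the batch_size description


def output_file_count(response_count: int, batch_size: Optional[int] = None) -> int:
    """Estimate how many JSON files hold response_count response protos."""
    size = DEFAULT_BATCH_SIZE if batch_size is None else batch_size
    if not 1 <= size <= 100:
        raise ValueError("batch_size must be in the range [1, 100]")
    return math.ceil(response_count / size)


# A 100-page PDF produces 100 response protos.
print(output_file_count(100), output_file_count(100, batch_size=30))
```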
Page
Detected page from OCR.
| Fields | |
|---|---|
| property | Additional information detected on the page. | 
| width | Page width. For PDFs the unit is points. For images (including TIFFs) the unit is pixels. |
| height | Page height. For PDFs the unit is points. For images (including TIFFs) the unit is pixels. |
| blocks[] | List of blocks of text, images etc on this page. |
| confidence | Confidence of the OCR results on the page. Range [0, 1]. |
Paragraph
Structural unit of text representing a number of words in certain order.
| Fields | |
|---|---|
| property | Additional information detected for the paragraph. | 
| bounding_box | The bounding box for the paragraph. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example: when the text is horizontal, the top edge runs 0----1 and the bottom edge runs 3----2; when it's rotated 180 degrees around the top-left corner, the top edge runs 2----3 and the bottom edge runs 1----0, and the vertex order will still be (0, 1, 2, 3). |
| words[] | List of all words in this paragraph. |
| confidence | Confidence of the OCR results for the paragraph. Range [0, 1]. |
Position
A 3D position in the image, used primarily for Face detection landmarks. A valid Position must have both x and y coordinates. The position coordinates are in the same scale as the original image.
| Fields | |
|---|---|
| x | X coordinate. |
| y | Y coordinate. |
| z | Z coordinate (or depth). |
Product
A Product contains ReferenceImages.
| Fields | |
|---|---|
| name | 
 The resource name of the product. Format is:  This field is ignored when creating a product. | 
| display_name | The user-provided name for this Product. Must not be empty. Must be at most 4096 characters long. |
| description | User-provided metadata to be stored with this product. Must be at most 4096 characters long. |
| product_category | Immutable. The category for the product identified by the reference image. This should be one of "homegoods-v2", "apparel-v2", "toys-v2", "packagedgoods-v1" or "general-v1". The legacy categories "homegoods", "apparel", and "toys" are still supported, but these should not be used for new products. |
| product_labels[] | Key-value pairs that can be attached to a product. At query time, constraints can be specified based on the product_labels. Note that integer values can be provided as strings, e.g. "1199". Only strings with integer values can match a range-based restriction which is to be supported soon. Multiple values can be assigned to the same key. One product may have up to 500 product_labels. Notice that the total number of distinct product_labels over all products in one ProductSet cannot exceed 1M, otherwise the product search pipeline will refuse to work for that ProductSet. | 
KeyValue
A product label represented as a key-value pair.
| Fields | |
|---|---|
| key | The key of the label attached to the product. Cannot be empty and cannot exceed 128 bytes. |
| value | The value of the label attached to the product. Cannot be empty and cannot exceed 128 bytes. |
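The size limits on labels can be checked client-side before a product is created or updated. The sketch below is illustrative only: the helper name `validate_product_labels` is hypothetical, labels are modeled as (key, value) tuples, and measuring the 128-byte limit on the UTF-8 encoding is an assumption.

```python
from typing import List, Sequence, Tuple

MAX_LABEL_BYTES = 128         # per key and per value
MAX_LABELS_PER_PRODUCT = 500  # per-product limit from the product_labels description


def validate_product_labels(labels: Sequence[Tuple[str, str]]) -> List[str]:
    """Return human-readable problems; an empty list means the labels look valid."""
    errors = []
    if len(labels) > MAX_LABELS_PER_PRODUCT:
        errors.append(f"{len(labels)} labels exceed the limit of {MAX_LABELS_PER_PRODUCT}")
    for index, (key, value) in enumerate(labels):
        for kind, text in (("key", key), ("value", value)):
            size = len(text.encode("utf-8"))  # assumption: bytes are counted in UTF-8
            if size == 0:
                errors.append(f"label {index}: {kind} must not be empty")
            elif size > MAX_LABEL_BYTES:
                errors.append(f"label {index}: {kind} is {size} bytes, limit is {MAX_LABEL_BYTES}")
    return errors


# The same key may carry multiple values, so repeated keys are not an error.
print(validate_product_labels([("color", "red"), ("color", "blue"), ("", "x" * 129)]))
```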
ProductSearchParams
Parameters for a product search request.
| Fields | |
|---|---|
| bounding_poly | The bounding polygon around the area of interest in the image. If it is not specified, system discretion will be applied. | 
| product_set | 
 The resource name of a  Format is:  | 
| product_categories[] | The list of product categories to search in. Currently, we only consider the first category, and either "homegoods-v2", "apparel-v2", "toys-v2", "packagedgoods-v1", or "general-v1" should be specified. The legacy categories "homegoods", "apparel", and "toys" are still supported but will be deprecated. For new products, please use "homegoods-v2", "apparel-v2", or "toys-v2" for better product search accuracy. It is recommended to migrate existing products to these categories as well. |
| filter | The filtering expression. This can be used to restrict search results based on Product labels. We currently support an AND of OR of key-value expressions, where each expression within an OR must have the same key. An '=' should be used to connect the key and value. For example, "(color = red OR color = blue) AND brand = Google" is acceptable, but "(color = red OR brand = Google)" is not acceptable. "color: red" is not acceptable because it uses a ':' instead of an '='. |
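One way to stay within the "AND of ORs, same key inside each OR" rule is to generate the filter from a mapping of key to allowed values, so that mixed-key OR groups cannot be written by accident. The helper name `build_filter` is hypothetical, and the sketch does not attempt any escaping of keys or values, since the reference does not describe an escaping syntax.

```python
from typing import Dict, List


def build_filter(constraints: Dict[str, List[str]]) -> str:
    """Build an AND of per-key OR groups, using '=' between key and value."""
    groups = []
    for key, values in constraints.items():
        if not values:
            raise ValueError(f"no values given for key {key!r}")
        alternatives = " OR ".join(f"{key} = {value}" for value in values)
        # Parenthesize only when a key has several alternatives.
        groups.append(f"({alternatives})" if len(values) > 1 else alternatives)
    return " AND ".join(groups)


print(build_filter({"color": ["red", "blue"], "brand": ["Google"]}))
```

With the input shown, the result is the acceptable expression quoted in the filter description.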
ProductSearchResults
Results for a product search request.
| Fields | |
|---|---|
| index_time | Timestamp of the index which provided these results. Products added to the product set and products removed from the product set after this time are not reflected in the current results. | 
| results[] | List of results, one for each product match. | 
| product_grouped_results[] | List of results grouped by products detected in the query image. Each entry corresponds to one bounding polygon in the query image, and contains the matching products specific to that region. There may be duplicate product matches in the union of all the per-product results. | 
GroupedResult
Information about the products similar to a single product in a query image.
| Fields | |
|---|---|
| bounding_poly | The bounding polygon around the product detected in the query image. | 
| results[] | List of results, one for each product match. | 
| object_annotations[] | List of generic predictions for the object in the bounding box. | 
ObjectAnnotation
Prediction for what the object in the bounding box is.
| Fields | |
|---|---|
| mid | Object ID that should align with EntityAnnotation mid. |
| language_code | The BCP-47 language code, such as "en-US" or "sr-Latn". For more information, see http://www.unicode.org/reports/tr35/#Unicode_locale_identifier. |
| name | 
 Object name, expressed in its  | 
| score | Score of the result. Range [0, 1]. |
Result
Information about a product.
| Fields | |
|---|---|
| product | The Product. | 
| score | A confidence level on the match, ranging from 0 (no confidence) to 1 (full confidence). |
| image | The resource name of the image from the product that is the closest match to the query. |
Property
A Property consists of a user-supplied name/value pair.
| Fields | |
|---|---|
| name | Name of the property. |
| value | Value of the property. |
| uint64_value | Value of numeric properties. |
SafeSearchAnnotation
Set of features pertaining to the image, computed by computer vision methods over safe-search verticals (for example, adult, spoof, medical, violence).
| Fields | |
|---|---|
| adult | Represents the adult content likelihood for the image. Adult content may contain elements such as nudity, pornographic images or cartoons, or sexual activities. | 
| spoof | Spoof likelihood. The likelihood that a modification was made to the image's canonical version to make it appear funny or offensive. |
| medical | Likelihood that this is a medical image. | 
| violence | Likelihood that this image contains violent content. | 
| racy | Likelihood that the request image contains racy content. Racy content may include (but is not limited to) skimpy or sheer clothing, strategically covered nudity, lewd or provocative poses, or close-ups of sensitive body areas. | 
Symbol
A single symbol representation.
| Fields | |
|---|---|
| property | Additional information detected for the symbol. | 
| bounding_box | The bounding box for the symbol. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example: * when the text is horizontal it might look like: 0----1 | | 3----2 * when it's rotated 180 degrees around the top-left corner it becomes: 2----3 | | 1----0 and the vertex order will still be (0, 1, 2, 3). | 
| text | The actual UTF-8 representation of the symbol. | 
| confidence | Confidence of the OCR results for the symbol. Range [0, 1]. | 
TextAnnotation
TextAnnotation contains a structured representation of OCR-extracted text. The hierarchy of an OCR-extracted text structure is: TextAnnotation -> Page -> Block -> Paragraph -> Word -> Symbol. Each structural component, starting from Page, may have its own properties. Properties describe detected languages, breaks, and so on. See the TextAnnotation.TextProperty message definition below for more detail.
| Fields | |
|---|---|
| pages[] | List of pages detected by OCR. | 
| text | UTF-8 text detected on the pages. | 
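
The Page -> Block -> Paragraph -> Word -> Symbol hierarchy can be walked with nested loops. The sketch below assumes the annotation has been decoded into plain dicts and that the repeated fields are named `pages`, `blocks`, `paragraphs`, `words`, and `symbols`, as in the corresponding message definitions; `iter_words` and the sample data are hypothetical.

```python
def iter_words(text_annotation):
    """Yield each word's text by joining the text of its symbols."""
    for page in text_annotation.get("pages", []):
        for block in page.get("blocks", []):
            for paragraph in block.get("paragraphs", []):
                for word in paragraph.get("words", []):
                    yield "".join(
                        symbol.get("text", "")
                        for symbol in word.get("symbols", [])
                    )


# Hand-written sample shaped like a decoded TextAnnotation.
sample = {
    "pages": [{
        "blocks": [{
            "paragraphs": [{
                "words": [
                    {"symbols": [{"text": "O"}, {"text": "K"}]},
                    {"symbols": [{"text": "!"}]},
                ]
            }]
        }]
    }]
}
words = list(iter_words(sample))
```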
DetectedBreak
Detected start or end of a structural component.
| Fields | |
|---|---|
| type | Detected break type. | 
| is_prefix | True if break prepends the element. | 
BreakType
Enum to denote the type of break found. New line, space etc.
| Enums | |
|---|---|
| UNKNOWN | Unknown break label type. | 
| SPACE | Regular space. | 
| SURE_SPACE | Sure space (very wide). | 
| EOL_SURE_SPACE | Line-wrapping break. | 
| HYPHEN | End-line hyphen that is not present in text; does not co-occur with SPACE, LEADER_SPACE, or LINE_BREAK. | 
| LINE_BREAK | Line break that ends a paragraph. | 
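
One way to use detected breaks is to rebuild readable text from a sequence of symbols. The sketch below assumes dict-shaped symbols whose `property.detected_break` carries `type` and `is_prefix`. The separator chosen for each break type (in particular, rendering HYPHEN as a hyphen plus newline) is an illustrative choice, not something the API prescribes.

```python
# Illustrative mapping from break type to the text emitted for it.
BREAK_TEXT = {
    "UNKNOWN": "",
    "SPACE": " ",
    "SURE_SPACE": " ",
    "EOL_SURE_SPACE": "\n",
    "HYPHEN": "-\n",
    "LINE_BREAK": "\n",
}


def render_symbols(symbols):
    """Join symbol text, placing each break before or after its symbol."""
    parts = []
    for symbol in symbols:
        brk = (symbol.get("property") or {}).get("detected_break")
        sep = BREAK_TEXT.get(brk.get("type", "UNKNOWN"), "") if brk else ""
        if brk and brk.get("is_prefix"):
            parts.append(sep + symbol["text"])  # break prepends the element
        else:
            parts.append(symbol["text"] + sep)  # break follows the element
    return "".join(parts)


# Hand-written sample symbols, for illustration only.
symbols = [
    {"text": "H"},
    {"text": "i", "property": {"detected_break": {"type": "SPACE"}}},
    {"text": "y"},
    {"text": "o", "property": {"detected_break": {"type": "LINE_BREAK"}}},
]
rendered = render_symbols(symbols)
```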
DetectedLanguage
Detected language for a structural component.
| Fields | |
|---|---|
| language_code | The BCP-47 language code, such as "en-US" or "sr-Latn". For more information, see http://www.unicode.org/reports/tr35/#Unicode_locale_identifier. | 
| confidence | Confidence of detected language. Range [0, 1]. | 
TextProperty
Additional information detected on the structural component.
| Fields | |
|---|---|
| detected_languages[] | A list of detected languages together with confidence. | 
| detected_break | Detected start or end of a text segment. | 
TextDetectionParams
Parameters for text detections. This is used to control TEXT_DETECTION and DOCUMENT_TEXT_DETECTION features.
| Fields | |
|---|---|
| enable_text_detection_confidence_score | By default, the Cloud Vision API includes confidence scores only for DOCUMENT_TEXT_DETECTION results. Set the flag to true to include confidence scores for TEXT_DETECTION as well. | 
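
As a rough sketch, a request that turns this flag on might be assembled as a plain dict like the one below. It assumes the parameters are carried in the `text_detection_params` field of ImageContext and uses the proto field names shown in this reference (a JSON transport may expect camelCase names instead). The `build_text_request` helper and the bucket path are hypothetical.

```python
import json


def build_text_request(image_uri, with_confidence=True):
    """Build a dict shaped like an AnnotateImageRequest for TEXT_DETECTION."""
    return {
        "image": {"source": {"image_uri": image_uri}},
        "features": [{"type": "TEXT_DETECTION"}],
        "image_context": {
            "text_detection_params": {
                "enable_text_detection_confidence_score": with_confidence,
            }
        },
    }


# Placeholder URI; substitute a real image location.
request = build_text_request("gs://example-bucket/receipt.png")
print(json.dumps(request, indent=2))
```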
Vertex
A vertex represents a 2D point in the image. NOTE: the vertex coordinates are in the same scale as the original image.
| Fields | |
|---|---|
| x | X coordinate. | 
| y | Y coordinate. | 
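
Because Vertex coordinates are in the scale of the original image, converting them to the 0-to-1 range requires the image dimensions. The helper below is a minimal sketch of that conversion; `normalize_vertices` is a hypothetical name, and treating a missing coordinate as 0 is an assumption about how default values are serialized.

```python
def normalize_vertices(vertices, width, height):
    """Scale pixel-space vertices into the range [0, 1] on each axis."""
    if width <= 0 or height <= 0:
        raise ValueError("image width and height must be positive")
    return [
        {"x": v.get("x", 0) / width, "y": v.get("y", 0) / height}
        for v in vertices
    ]


# A box covering the left half of a hypothetical 200x100 image.
box = [{"x": 0, "y": 0}, {"x": 100, "y": 0},
       {"x": 100, "y": 100}, {"x": 0, "y": 100}]
normalized = normalize_vertices(box, width=200, height=100)
```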
WebDetection
Relevant information for the image from the Internet.
| Fields | |
|---|---|
| web_entities[] | Deduced entities from similar images on the Internet. | 
| full_matching_images[] | Fully matching images from the Internet. Can include resized copies of the query image. | 
| partial_matching_images[] | Partial matching images from the Internet. Those images are similar enough to share some key-point features. For example, an original image will likely have partial matches for its crops. | 
| pages_with_matching_images[] | Web pages containing the matching images from the Internet. | 
| visually_similar_images[] | The visually similar image results. | 
| best_guess_labels[] | The service's best guess as to the topic of the request image. Inferred from similar images on the open web. | 
WebEntity
Entity deduced from similar images on the Internet.
| Fields | |
|---|---|
| entity_id | Opaque entity ID. | 
| score | Overall relevancy score for the entity. Not normalized and not comparable across different image queries. | 
| description | Canonical description of the entity, in English. | 
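
Since entity scores are not normalized and not comparable across image queries, they are best used to rank entities within a single response. The sketch below does that for a dict-shaped WebDetection; `top_entities` and the sample values are hypothetical.

```python
def top_entities(web_detection, limit=3):
    """Return the highest-scoring described entities from one response."""
    entities = [
        entity for entity in web_detection.get("web_entities", [])
        if entity.get("description")  # skip entities with no description
    ]
    ranked = sorted(entities, key=lambda e: e.get("score", 0.0), reverse=True)
    return ranked[:limit]


# Made-up scores and IDs, for illustration only.
detection = {
    "web_entities": [
        {"entity_id": "id-a", "score": 0.4, "description": "Bridge"},
        {"entity_id": "id-b", "score": 1.2, "description": "Landmark"},
        {"entity_id": "id-c", "score": 0.9},
    ]
}
best = [entity["description"] for entity in top_entities(detection)]
```

Comparing these scores against a fixed cutoff shared by different images would not be meaningful, for the reason given in the score field description.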
WebImage
Metadata for online images.
| Fields | |
|---|---|
| url | The result image URL. | 
| score | (Deprecated) Overall relevancy score for the image. | 
WebLabel
Label to provide extra metadata for the web detection.
| Fields | |
|---|---|
| label | Label for extra metadata. | 
| language_code | The BCP-47 language code for  | 
WebPage
Metadata for web pages.
| Fields | |
|---|---|
| url | The result web page URL. | 
| score | (Deprecated) Overall relevancy score for the web page. | 
| page_title | Title for the web page; may contain HTML markup. | 
| full_matching_images[] | Fully matching images on the page. Can include resized copies of the query image. | 
| partial_matching_images[] | Partial matching images on the page. Those images are similar enough to share some key-point features. For example, an original image will likely have partial matches for its crops. | 
WebDetectionParams
Parameters for web detection request.
| Fields | |
|---|---|
| include_geo_results | Whether to include results derived from the geo information in the image. | 
Word
A word representation.
| Fields | |
|---|---|
| property | Additional information detected for the word. | 
| bounding_box | The bounding box for the word. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected, the rotation is represented as around the top-left corner, as defined when the text is read in the 'natural' orientation. For example: when the text is horizontal, the top edge runs 0----1 and the bottom edge runs 3----2; when it is rotated 180 degrees around the top-left corner, the top edge runs 2----3 and the bottom edge runs 1----0. In both cases the vertex order is still (0, 1, 2, 3). | 
| symbols[] | List of symbols in the word. The order of the symbols follows the natural reading order. | 
| confidence | Confidence of the OCR results for the word. Range [0, 1]. |
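
Because the vertex order stays (0, 1, 2, 3) under rotation, the direction of the edge from vertex 0 to vertex 1 gives an estimate of how the text is rotated. The sketch below assumes a dict-shaped bounding box with a `vertices` list and image coordinates in which y grows downward; `text_rotation_degrees` is a hypothetical helper, not an API call.

```python
import math


def text_rotation_degrees(bounding_box):
    """Estimate text rotation from the edge between vertex 0 and vertex 1.

    Returns an angle in [0, 360). With y growing downward, increasing
    angles correspond to clockwise rotation as seen on screen.
    """
    v0, v1 = bounding_box["vertices"][0], bounding_box["vertices"][1]
    dx = v1.get("x", 0) - v0.get("x", 0)
    dy = v1.get("y", 0) - v0.get("y", 0)
    return math.degrees(math.atan2(dy, dx)) % 360


# Horizontal text: vertex 0 is top-left, vertex 1 is top-right.
horizontal = {"vertices": [{"x": 0, "y": 0}, {"x": 10, "y": 0},
                           {"x": 10, "y": 5}, {"x": 0, "y": 5}]}
# The same box rotated 180 degrees: vertex 0 now sits at the bottom-right.
upside_down = {"vertices": [{"x": 10, "y": 5}, {"x": 0, "y": 5},
                            {"x": 0, "y": 0}, {"x": 10, "y": 0}]}

angle_horizontal = text_rotation_degrees(horizontal)
angle_upside_down = text_rotation_degrees(upside_down)
```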