Optical Character Recognition (OCR) is one of the three Vertex AI
pre-trained APIs on Google Distributed Cloud (GDC) air-gapped. The OCR
service detects text, including handwritten text, in various file types,
such as images and document files.
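OCR offers the following methods in Distributed Cloud to recognize text:
BatchAnnotateImages: Detect text from JPEG or PNG images.
BatchAnnotateFiles: Detect text from a batch of PDF or TIFF files for inline requests.
AsyncBatchAnnotateFiles: Detect text from a batch of PDF or TIFF files in a storage bucket for offline requests.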
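As a rough illustration of how file type maps to method (images go to BatchAnnotateImages; PDF and TIFF files go to BatchAnnotateFiles for inline requests or AsyncBatchAnnotateFiles when they sit in a storage bucket), the following sketch picks a method name from a file name. The helper `choose_ocr_method` and the extra extensions `.jpeg` and `.tif` are assumptions made for this example, not part of the OCR API.

```python
from pathlib import Path

# Extension sets are an assumption for this sketch: the page lists JPG, PNG,
# PDF, and TIFF; ".jpeg" and ".tif" are included as common spellings.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
FILE_EXTENSIONS = {".pdf", ".tif", ".tiff"}


def choose_ocr_method(filename: str, in_storage_bucket: bool = False) -> str:
    """Return the name of the OCR method that matches the file type.

    Hypothetical helper: it only returns a method name and calls no service.
    """
    suffix = Path(filename).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "BatchAnnotateImages"
    if suffix in FILE_EXTENSIONS:
        # Offline requests read PDF or TIFF files from a storage bucket.
        if in_storage_bucket:
            return "AsyncBatchAnnotateFiles"
        return "BatchAnnotateFiles"
    raise ValueError(f"Unsupported file type for OCR: {filename}")
```

For example, `choose_ocr_method("scan.pdf", in_storage_bucket=True)` selects the offline method, while an unsupported extension raises `ValueError` before any request is built.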
Learn more about the supported languages
detected by the text recognition feature.
Optical character recognition features
The OCR API can detect and extract text from images. The
following two annotation features support optical character recognition:
TEXT_DETECTION detects and extracts text from any image. For example, a
photograph might contain a street or traffic sign. The OCR
service returns a JSON file with the extracted string, individual words, and
their bounding boxes.
Figure 1. Road sign photograph where the OCR API detects
words and their bounding boxes.
DOCUMENT_TEXT_DETECTION also extracts text from an image, but the service
optimizes the response for dense text and documents. For example, a scanned
image of typed text might contain several paragraphs and headings. The
OCR service returns a JSON file with page, block, paragraph,
word, and break information.
Figure 2. Scanned image of typed text where the OCR API detects information such as words, pages, and paragraphs.
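To make the page, block, paragraph, word, and break hierarchy concrete, here is a minimal sketch that rebuilds paragraph text from a DOCUMENT_TEXT_DETECTION-style result. The field names (`pages`, `blocks`, `paragraphs`, `words`, `symbols`, `detectedBreak`) and the break type names follow the Cloud Vision API JSON layout, which is an assumption here; the response returned in Distributed Cloud might differ. The function `paragraphs_from_annotation` and the sample data are hypothetical.

```python
from typing import Dict, List

# Break types (Cloud Vision naming, assumed) and the text each one stands for.
BREAK_TEXT = {
    "SPACE": " ",
    "SURE_SPACE": " ",
    "EOL_SURE_SPACE": "\n",
    "LINE_BREAK": "\n",
}


def paragraphs_from_annotation(annotation: Dict) -> List[str]:
    """Walk pages > blocks > paragraphs > words > symbols and join the text."""
    paragraphs = []
    for page in annotation.get("pages", []):
        for block in page.get("blocks", []):
            for paragraph in block.get("paragraphs", []):
                pieces = []
                for word in paragraph.get("words", []):
                    for symbol in word.get("symbols", []):
                        pieces.append(symbol.get("text", ""))
                        # A break, when present, follows the symbol it is attached to.
                        break_type = (
                            symbol.get("property", {})
                            .get("detectedBreak", {})
                            .get("type")
                        )
                        pieces.append(BREAK_TEXT.get(break_type, ""))
                paragraphs.append("".join(pieces).strip())
    return paragraphs


# Hand-written sample, not real service output.
sample = {
    "pages": [{"blocks": [{"paragraphs": [{"words": [
        {"symbols": [
            {"text": "H"},
            {"text": "i", "property": {"detectedBreak": {"type": "SPACE"}}},
        ]},
        {"symbols": [
            {"text": "a"},
            {"text": "l"},
            {"text": "l", "property": {"detectedBreak": {"type": "LINE_BREAK"}}},
        ]},
    ]}]}]}]
}

print(paragraphs_from_annotation(sample))
```

Joining at the symbol level keeps the detected breaks in place, so spacing and line ends in the rebuilt paragraph follow what the response reports.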
Handwritten text
Figure 3 is an image of handwritten text. The OCR API detects and
extracts text from these images. For a list of handwriting scripts that
support handwriting recognition, see
Handwriting scripts.
Figure 3. Handwriting image where the OCR API detects text.
Optical character recognition limits
The BatchAnnotateImages and BatchAnnotateFiles API methods support only a
single request per batch call.
The following table lists the current limits of the OCR service
in Distributed Cloud.
| File limit for OCR | Value |
| --- | --- |
| Maximum number of pages | Five |
| Maximum file size | 20 MB |
| Maximum image size | 20 million pixels (length x width) |
Files submitted to the OCR API that exceed the maximum number of pages or
the maximum file size return an error. Files that exceed the maximum image
size are downsized to 20 million pixels.
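These limits can be checked on the client before a file is submitted. The sketch below is one way to do that under stated assumptions: 20 MB is taken as 20 x 1,048,576 bytes, and the downsizing is modeled as a uniform scale that preserves the aspect ratio. Neither detail is confirmed by this page, and the function names are hypothetical.

```python
import math
from typing import List, Tuple

MAX_PAGES = 5
MAX_FILE_BYTES = 20 * 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes
MAX_PIXELS = 20_000_000


def check_limits(file_bytes: int, pages: int = 1) -> List[str]:
    """Return the limit violations that would make the service return an error."""
    problems = []
    if pages > MAX_PAGES:
        problems.append(f"{pages} pages exceeds the maximum of {MAX_PAGES}")
    if file_bytes > MAX_FILE_BYTES:
        problems.append(f"{file_bytes} bytes exceeds the maximum of {MAX_FILE_BYTES}")
    return problems


def downsized_dimensions(width: int, height: int) -> Tuple[int, int]:
    """Estimate image dimensions after downsizing to at most 20 million pixels.

    Oversized images are not rejected; this models the downsizing as a uniform
    scale, which is an assumption about the service's behavior.
    """
    pixels = width * height
    if pixels <= MAX_PIXELS:
        return width, height
    scale = math.sqrt(MAX_PIXELS / pixels)
    return math.floor(width * scale), math.floor(height * scale)
```

An empty list from `check_limits` means neither hard limit is exceeded. `downsized_dimensions` only estimates how much resolution an oversized image might lose, which can matter for small text.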
Supported file types for OCR
The OCR pre-trained API detects and transcribes
text from the following file types:
PDF
TIFF
JPG
PNG
You must store the files locally in your Distributed Cloud environment. You
can't access files hosted in Cloud Storage or publicly available files for
text detection.
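Because the files must be available locally, one way to submit an image is to send its bytes inline. The sketch below builds a JSON request body with base64-encoded content and a single request per call. The body layout (`requests`, `image.content`, `features.type`) is modeled on the Cloud Vision API and is an assumption for Distributed Cloud; `build_image_request` is a hypothetical helper that sends nothing over the network.

```python
import base64
import json

SUPPORTED_FEATURES = {"TEXT_DETECTION", "DOCUMENT_TEXT_DETECTION"}


def build_image_request(image_bytes: bytes, feature: str = "TEXT_DETECTION") -> dict:
    """Build a request body carrying one image inline as base64 text."""
    if feature not in SUPPORTED_FEATURES:
        raise ValueError(f"Unknown OCR feature: {feature}")
    return {
        # Exactly one entry, since only a single request per batch call is supported.
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": feature}],
            }
        ]
    }


# Placeholder bytes stand in for a local file read with open(path, "rb").read().
body = build_image_request(b"\x89PNG placeholder", "DOCUMENT_TEXT_DETECTION")
print(json.dumps(body, indent=2))
```

Validating the feature name up front keeps a typo from reaching the service, and the base64 step is what lets binary image data travel inside a JSON body.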
Last updated 2025-08-29 UTC.