Small batch file annotation online

The Vision API can provide online (immediate) annotation of multiple pages or frames from PDF, TIFF, or GIF files, whether the file is stored locally or in Cloud Storage.

For each file, you can request online feature detection and annotation of up to 5 frames (GIF; "image/gif") or pages (PDF; "application/pdf", or TIFF; "image/tiff") of your choosing.

The example annotations on this page are for DOCUMENT_TEXT_DETECTION, but online small batch annotation is available for all Vision features.

Example file (first five pages of a PDF file):
gs://cloud-samples-data/vision/document_understanding/custom_0773375000.pdf

Limitations

At most 5 pages are annotated per file. You can specify which pages (up to 5) to annotate.
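
The page limit can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Vision client libraries: the helper name validate_pages and the constant are hypothetical, and the rule that page 0 is invalid is inferred from the samples on this page, which state that the first page is 1 and the last page is -1.

```python
MAX_PAGES_PER_REQUEST = 5  # limit stated in the Limitations section


def validate_pages(pages):
    """Return the page list if it respects the documented limits, else raise ValueError.

    Hypothetical client-side helper; the service performs its own validation.
    """
    pages = list(pages)
    if len(pages) > MAX_PAGES_PER_REQUEST:
        raise ValueError(
            f"at most {MAX_PAGES_PER_REQUEST} pages per file, got {len(pages)}"
        )
    for page in pages:
        # Page numbering starts at 1 (not 0); -1 refers to the last page.
        if not isinstance(page, int) or page == 0:
            raise ValueError(f"invalid page number: {page!r}")
    return pages
```

For example, validate_pages([1, 2, -1]) returns the list unchanged, while a list of six pages raises ValueError.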

Authentication

Set up your Google Cloud project and authentication

Currently supported feature types

Feature type: Description
CROP_HINTS: Determine suggested vertices for a crop region on an image.
DOCUMENT_TEXT_DETECTION: Perform OCR on dense text images, such as documents (PDF/TIFF), and images with handwriting. TEXT_DETECTION can be used for sparse text images. Takes precedence when both DOCUMENT_TEXT_DETECTION and TEXT_DETECTION are present.
FACE_DETECTION: Detect faces within the image.
IMAGE_PROPERTIES: Compute a set of image properties, such as the image's dominant colors.
LABEL_DETECTION: Add labels based on image content.
LANDMARK_DETECTION: Detect geographic landmarks within the image.
LOGO_DETECTION: Detect company logos within the image.
OBJECT_LOCALIZATION: Detect and extract multiple objects in an image.
SAFE_SEARCH_DETECTION: Run SafeSearch to detect potentially unsafe or undesirable content.
TEXT_DETECTION: Perform Optical Character Recognition (OCR) on text within the image. Text detection is optimized for areas of sparse text within a larger image. If the image is a document (PDF/TIFF), has dense text, or contains handwriting, use DOCUMENT_TEXT_DETECTION instead.
WEB_DETECTION: Detect topical entities such as news, events, or celebrities within the image, and find similar images on the web using the power of Google Image Search.
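
The feature types listed above map directly to the "features" array of a REST request. As a rough sketch, the list can be built and checked in plain Python; the names SUPPORTED_FEATURES and make_features are hypothetical and exist only for this illustration.

```python
# Feature type names copied from the list above.
SUPPORTED_FEATURES = {
    "CROP_HINTS",
    "DOCUMENT_TEXT_DETECTION",
    "FACE_DETECTION",
    "IMAGE_PROPERTIES",
    "LABEL_DETECTION",
    "LANDMARK_DETECTION",
    "LOGO_DETECTION",
    "OBJECT_LOCALIZATION",
    "SAFE_SEARCH_DETECTION",
    "TEXT_DETECTION",
    "WEB_DETECTION",
}


def make_features(*names):
    """Build the REST "features" array, rejecting names not in the list above."""
    unknown = [name for name in names if name not in SUPPORTED_FEATURES]
    if unknown:
        raise ValueError(f"unsupported feature type(s): {unknown}")
    return [{"type": name} for name in names]
```

Calling make_features("DOCUMENT_TEXT_DETECTION", "LABEL_DETECTION") yields two objects in the same {"type": ...} shape used by the request bodies on this page.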

Sample code

You can either send an annotation request with a locally stored file, or use a file that is stored on Cloud Storage.

Using a locally stored file

Use the following code samples to get any feature annotation for a locally stored file.

REST

To perform online PDF/TIFF/GIF feature detection for a small batch of files, make a POST request and provide the appropriate request body:

Before using any of the request data, make the following replacements:

  • BASE64_ENCODED_FILE: The base64 representation (ASCII string) of your binary file data. This string should look similar to the following string:
    • JVBERi0xLjUNCiW1tbW1...ydHhyZWYNCjk5NzM2OQ0KJSVFT0Y=
    Visit the base64 encode topic for more information.
  • PROJECT_ID: Your Google Cloud project ID.

Field-specific considerations:

  • inputConfig.mimeType - One of the following: "application/pdf", "image/tiff" or "image/gif".
  • pages - the pages of the file on which to perform feature detection (at most 5). Page numbering starts at 1; -1 refers to the last page.
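
If the MIME type is not known in advance, it can be derived from the file name. This is a small sketch under the assumption that the file extension reflects the actual format; MIME_TYPES and guess_mime_type are hypothetical names, not part of any Vision library.

```python
from pathlib import PurePosixPath

# The three MIME types accepted by inputConfig.mimeType, keyed by common extensions.
MIME_TYPES = {
    ".pdf": "application/pdf",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
    ".gif": "image/gif",
}


def guess_mime_type(filename):
    """Map a file name (local path or gs:// URI) to a supported MIME type."""
    suffix = PurePosixPath(filename).suffix.lower()
    try:
        return MIME_TYPES[suffix]
    except KeyError:
        raise ValueError(f"unsupported file extension: {suffix!r}") from None
```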

HTTP method and URL:

POST https://vision.googleapis.com/v1/files:annotate

Request JSON body:

{
  "requests": [
    {
      "inputConfig": {
        "content": "BASE64_ENCODED_FILE",
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "pages": [
        1,2,3,4,5
      ]
    }
  ]
}
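
The request body above can also be generated rather than written by hand. The sketch below uses only the Python standard library to base64-encode a local file and save the result as request.json; the function names and default arguments are illustrative assumptions, not an official tool.

```python
import base64
import json
from pathlib import Path


def build_request_body(
    file_bytes,
    mime_type="application/pdf",
    feature_type="DOCUMENT_TEXT_DETECTION",
    pages=(1, 2, 3, 4, 5),
):
    """Return a dict with the same shape as the request JSON body shown above."""
    return {
        "requests": [
            {
                "inputConfig": {
                    # The API expects the base64 representation as an ASCII string.
                    "content": base64.b64encode(file_bytes).decode("ascii"),
                    "mimeType": mime_type,
                },
                "features": [{"type": feature_type}],
                "pages": list(pages),
            }
        ]
    }


def write_request_file(input_path, output_path="request.json"):
    """Read a local file and write the corresponding request body to disk."""
    body = build_request_body(Path(input_path).read_bytes())
    Path(output_path).write_text(json.dumps(body, indent=2), encoding="utf-8")
    return body
```

Calling write_request_file("path/to/your/file.pdf") produces a request.json that can be used with the curl or PowerShell commands on this page.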

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_ID" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://vision.googleapis.com/v1/files:annotate"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://vision.googleapis.com/v1/files:annotate" | Select-Object -Expand Content
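
The curl and PowerShell commands above send the same three headers and the same JSON body. As a hedged sketch, an equivalent request object can be prepared with Python's standard urllib; build_annotate_request is a hypothetical helper, and the access token is assumed to come from gcloud auth print-access-token, as in the commands above. The block only builds the request and does not send it.

```python
import json
import urllib.request

ENDPOINT = "https://vision.googleapis.com/v1/files:annotate"


def build_annotate_request(body, access_token, project_id):
    """Prepare (but do not send) the POST request used by the commands above."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "x-goog-user-project": project_id,
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


# To send it, pass the result to urllib.request.urlopen and read the response:
#   with urllib.request.urlopen(request) as reply:
#       result = json.load(reply)
```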

Response:

A successful annotate request immediately returns a JSON response.

For this feature (DOCUMENT_TEXT_DETECTION), the JSON response is similar to that of an image's document text detection request. The response contains bounding boxes for blocks broken down by paragraphs, words, and individual symbols. The full text is also detected. The response also contains a context field showing the location of the PDF or TIFF that was specified and the result's page number in the file.
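
The structure described above can be walked with ordinary dictionary access once the JSON response is parsed. The following sketch assumes the REST field names responses, fullTextAnnotation, text, and context.pageNumber; the first three mirror the client library samples on this page, while pageNumber is an assumption about the context field's layout. The helper name page_texts is hypothetical.

```python
def page_texts(response_json):
    """Return (page_number, full_text) pairs for the first file in a parsed response."""
    results = []
    # One entry per file in the batch; this sketch reads only the first file.
    file_response = response_json["responses"][0]
    # One nested entry per annotated page (or frame) of that file.
    for image_response in file_response.get("responses", []):
        page_number = image_response.get("context", {}).get("pageNumber")
        text = image_response.get("fullTextAnnotation", {}).get("text", "")
        results.append((page_number, text))
    return results
```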

The following response JSON is only for a single page (page 2) and has been shortened for clarity.

Java

Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java reference documentation.

import com.google.cloud.vision.v1.AnnotateFileRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateFilesRequest;
import com.google.cloud.vision.v1.BatchAnnotateFilesResponse;
import com.google.cloud.vision.v1.Block;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.InputConfig;
import com.google.cloud.vision.v1.Page;
import com.google.cloud.vision.v1.Paragraph;
import com.google.cloud.vision.v1.Symbol;
import com.google.cloud.vision.v1.Word;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BatchAnnotateFiles {

  public static void batchAnnotateFiles() throws IOException {
    String filePath = "path/to/your/file.pdf";
    batchAnnotateFiles(filePath);
  }

  public static void batchAnnotateFiles(String filePath) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (ImageAnnotatorClient imageAnnotatorClient = ImageAnnotatorClient.create()) {
      // You can send multiple files to be annotated; this sample demonstrates how to do this with
      // one file. If you want to use multiple files, you have to create an `AnnotateFileRequest`
      // object for each file that you want annotated.
      // First, read the file's contents.
      Path path = Paths.get(filePath);
      byte[] data = Files.readAllBytes(path);
      ByteString content = ByteString.copyFrom(data);

      // Specify the input config with the file's contents and its type.
      // Supported mime_type: application/pdf, image/tiff, image/gif
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#inputconfig
      InputConfig inputConfig =
          InputConfig.newBuilder().setMimeType("application/pdf").setContent(content).build();

      // Set the type of annotation you want to perform on the file
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
      Feature feature = Feature.newBuilder().setType(Feature.Type.DOCUMENT_TEXT_DETECTION).build();

      // Build the request object for that one file. Note: for additional files you have to create
      // additional `AnnotateFileRequest` objects and store them in a list to be used below.
      // Since we are sending a file of type `application/pdf`, we can use the `pages` field to
      // specify which pages to process. The service can process up to 5 pages per document file.
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileRequest
      AnnotateFileRequest fileRequest =
          AnnotateFileRequest.newBuilder()
              .setInputConfig(inputConfig)
              .addFeatures(feature)
              .addPages(1) // Process the first page
              .addPages(2) // Process the second page
              .addPages(-1) // Process the last page
              .build();

      // Add each `AnnotateFileRequest` object to the batch request.
      BatchAnnotateFilesRequest request =
          BatchAnnotateFilesRequest.newBuilder().addRequests(fileRequest).build();

      // Make the synchronous batch request.
      BatchAnnotateFilesResponse response = imageAnnotatorClient.batchAnnotateFiles(request);

      // Process the results, just get the first result, since only one file was sent in this
      // sample.
      for (AnnotateImageResponse imageResponse :
          response.getResponsesList().get(0).getResponsesList()) {
        System.out.format("Full text: %s%n", imageResponse.getFullTextAnnotation().getText());
        for (Page page : imageResponse.getFullTextAnnotation().getPagesList()) {
          for (Block block : page.getBlocksList()) {
            System.out.format("%nBlock confidence: %s%n", block.getConfidence());
            for (Paragraph par : block.getParagraphsList()) {
              System.out.format("\tParagraph confidence: %s%n", par.getConfidence());
              for (Word word : par.getWordsList()) {
                System.out.format("\t\tWord confidence: %s%n", word.getConfidence());
                for (Symbol symbol : word.getSymbolsList()) {
                  System.out.format(
                      "\t\t\tSymbol: %s, (confidence: %s)%n",
                      symbol.getText(), symbol.getConfidence());
                }
              }
            }
          }
        }
      }
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const fileName = 'path/to/your/file.pdf';

// Imports the Google Cloud client libraries
const {ImageAnnotatorClient} = require('@google-cloud/vision').v1;
const fs = require('fs').promises;

// Instantiates a client
const client = new ImageAnnotatorClient();

// You can send multiple files to be annotated, this sample demonstrates how to do this with
// one file. If you want to use multiple files, you have to create a request object for each file that you want annotated.
async function batchAnnotateFiles() {
  // First, specify the input config with the file's contents and its type.
  // Supported mime_type: application/pdf, image/tiff, image/gif
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#inputconfig
  const inputConfig = {
    mimeType: 'application/pdf',
    content: await fs.readFile(fileName),
  };

  // Set the type of annotation you want to perform on the file
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
  const features = [{type: 'DOCUMENT_TEXT_DETECTION'}];

  // Build the request object for that one file. Note: for additional files you have to create
  // additional file request objects and store them in a list to be used below.
  // Since we are sending a file of type `application/pdf`, we can use the `pages` field to
  // specify which pages to process. The service can process up to 5 pages per document file.
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileRequest
  const fileRequest = {
    inputConfig: inputConfig,
    features: features,
    // Annotate the first two pages and the last one (max 5 pages)
    // First page starts at 1, and not 0. Last page is -1.
    pages: [1, 2, -1],
  };

  // Add each `AnnotateFileRequest` object to the batch request.
  const request = {
    requests: [fileRequest],
  };

  // Make the synchronous batch request.
  const [result] = await client.batchAnnotateFiles(request);

  // Process the results, just get the first result, since only one file was sent in this
  // sample.
  const responses = result.responses[0].responses;

  for (const response of responses) {
    console.log(`Full text: ${response.fullTextAnnotation.text}`);
    for (const page of response.fullTextAnnotation.pages) {
      for (const block of page.blocks) {
        console.log(`Block confidence: ${block.confidence}`);
        for (const paragraph of block.paragraphs) {
          console.log(` Paragraph confidence: ${paragraph.confidence}`);
          for (const word of paragraph.words) {
            const symbol_texts = word.symbols.map(symbol => symbol.text);
            const word_text = symbol_texts.join('');
            console.log(
              `  Word text: ${word_text} (confidence: ${word.confidence})`
            );
            for (const symbol of word.symbols) {
              console.log(
                `   Symbol: ${symbol.text} (confidence: ${symbol.confidence})`
              );
            }
          }
        }
      }
    }
  }
}

batchAnnotateFiles();

Python

Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.



from google.cloud import vision_v1


def sample_batch_annotate_files(file_path="path/to/your/document.pdf"):
    """Perform batch file annotation."""
    client = vision_v1.ImageAnnotatorClient()

    # Supported mime_type: application/pdf, image/tiff, image/gif
    mime_type = "application/pdf"
    with open(file_path, "rb") as f:
        content = f.read()
    input_config = {"mime_type": mime_type, "content": content}
    features = [{"type_": vision_v1.Feature.Type.DOCUMENT_TEXT_DETECTION}]

    # The service can process up to 5 pages per document file. Here we specify
    # the first, second, and last page of the document to be processed.
    pages = [1, 2, -1]
    requests = [{"input_config": input_config, "features": features, "pages": pages}]

    response = client.batch_annotate_files(requests=requests)
    for image_response in response.responses[0].responses:
        print(f"Full text: {image_response.full_text_annotation.text}")
        for page in image_response.full_text_annotation.pages:
            for block in page.blocks:
                print(f"\nBlock confidence: {block.confidence}")
                for par in block.paragraphs:
                    print(f"\tParagraph confidence: {par.confidence}")
                    for word in par.words:
                        print(f"\t\tWord confidence: {word.confidence}")
                        for symbol in word.symbols:
                            print(
                                f"\t\t\tSymbol: {symbol.text}, "
                                f"(confidence: {symbol.confidence})"
                            )

Using a file on Cloud Storage

Use the following code samples to get any feature annotation for a file on Cloud Storage.

REST

To perform online PDF/TIFF/GIF feature detection for a small batch of files, make a POST request and provide the appropriate request body:

Before using any of the request data, make the following replacements:

  • CLOUD_STORAGE_FILE_URI: the path to a valid file (PDF, TIFF, or GIF) in a Cloud Storage bucket. You must have at least read access to the file. Example:
    • gs://cloud-samples-data/vision/document_understanding/custom_0773375000.pdf
  • PROJECT_ID: Your Google Cloud project ID.
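
A Cloud Storage URI can be sanity-checked before it is placed in the request. This is an illustrative sketch only: parse_gcs_uri and gcs_input_config are hypothetical helpers, and the check covers the gs://BUCKET/OBJECT shape only, not whether the object exists or is readable.

```python
def parse_gcs_uri(uri):
    """Split a gs://BUCKET/OBJECT URI into (bucket, object_name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    bucket, _, object_name = uri[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError(f"URI must name both a bucket and an object: {uri!r}")
    return bucket, object_name


def gcs_input_config(uri, mime_type="application/pdf"):
    """Build the inputConfig object for a file stored in Cloud Storage."""
    parse_gcs_uri(uri)  # raises ValueError if the URI is malformed
    return {"gcsSource": {"uri": uri}, "mimeType": mime_type}
```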

Field-specific considerations:

  • inputConfig.mimeType - One of the following: "application/pdf", "image/tiff" or "image/gif".
  • pages - the pages of the file on which to perform feature detection (at most 5). Page numbering starts at 1; -1 refers to the last page.

HTTP method and URL:

POST https://vision.googleapis.com/v1/files:annotate

Request JSON body:

{
  "requests": [
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "CLOUD_STORAGE_FILE_URI"
        },
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "pages": [
        1,2,3,4,5
      ]
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "x-goog-user-project: PROJECT_ID" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://vision.googleapis.com/v1/files:annotate"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://vision.googleapis.com/v1/files:annotate" | Select-Object -Expand Content

Response:

A successful annotate request immediately returns a JSON response.

For this feature (DOCUMENT_TEXT_DETECTION), the JSON response is similar to that of an image's document text detection request. The response contains bounding boxes for blocks broken down by paragraphs, words, and individual symbols. The full text is also detected. The response also contains a context field showing the location of the PDF or TIFF that was specified and the result's page number in the file.

The following response JSON is only for a single page (page 2) and has been shortened for clarity.

Java

Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java reference documentation.

import com.google.cloud.vision.v1.AnnotateFileRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateFilesRequest;
import com.google.cloud.vision.v1.BatchAnnotateFilesResponse;
import com.google.cloud.vision.v1.Block;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.GcsSource;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.InputConfig;
import com.google.cloud.vision.v1.Page;
import com.google.cloud.vision.v1.Paragraph;
import com.google.cloud.vision.v1.Symbol;
import com.google.cloud.vision.v1.Word;
import java.io.IOException;

public class BatchAnnotateFilesGcs {

  public static void batchAnnotateFilesGcs() throws IOException {
    String gcsUri = "gs://cloud-samples-data/vision/document_understanding/kafka.pdf";
    batchAnnotateFilesGcs(gcsUri);
  }

  public static void batchAnnotateFilesGcs(String gcsUri) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (ImageAnnotatorClient imageAnnotatorClient = ImageAnnotatorClient.create()) {
      // You can send multiple files to be annotated; this sample demonstrates how to do this with
      // one file. If you want to use multiple files, you have to create an `AnnotateFileRequest`
      // object for each file that you want annotated.
      // First, specify where the Vision API can find the file.
      GcsSource gcsSource = GcsSource.newBuilder().setUri(gcsUri).build();

      // Specify the input config with the file's uri and its type.
      // Supported mime_type: application/pdf, image/tiff, image/gif
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#inputconfig
      InputConfig inputConfig =
          InputConfig.newBuilder().setMimeType("application/pdf").setGcsSource(gcsSource).build();

      // Set the type of annotation you want to perform on the file
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
      Feature feature = Feature.newBuilder().setType(Feature.Type.DOCUMENT_TEXT_DETECTION).build();

      // Build the request object for that one file. Note: for additional files you have to create
      // additional `AnnotateFileRequest` objects and store them in a list to be used below.
      // Since we are sending a file of type `application/pdf`, we can use the `pages` field to
      // specify which pages to process. The service can process up to 5 pages per document file.
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileRequest
      AnnotateFileRequest fileRequest =
          AnnotateFileRequest.newBuilder()
              .setInputConfig(inputConfig)
              .addFeatures(feature)
              .addPages(1) // Process the first page
              .addPages(2) // Process the second page
              .addPages(-1) // Process the last page
              .build();

      // Add each `AnnotateFileRequest` object to the batch request.
      BatchAnnotateFilesRequest request =
          BatchAnnotateFilesRequest.newBuilder().addRequests(fileRequest).build();

      // Make the synchronous batch request.
      BatchAnnotateFilesResponse response = imageAnnotatorClient.batchAnnotateFiles(request);

      // Process the results, just get the first result, since only one file was sent in this
      // sample.
      for (AnnotateImageResponse imageResponse :
          response.getResponsesList().get(0).getResponsesList()) {
        System.out.format("Full text: %s%n", imageResponse.getFullTextAnnotation().getText());
        for (Page page : imageResponse.getFullTextAnnotation().getPagesList()) {
          for (Block block : page.getBlocksList()) {
            System.out.format("%nBlock confidence: %s%n", block.getConfidence());
            for (Paragraph par : block.getParagraphsList()) {
              System.out.format("\tParagraph confidence: %s%n", par.getConfidence());
              for (Word word : par.getWordsList()) {
                System.out.format("\t\tWord confidence: %s%n", word.getConfidence());
                for (Symbol symbol : word.getSymbolsList()) {
                  System.out.format(
                      "\t\t\tSymbol: %s, (confidence: %s)%n",
                      symbol.getText(), symbol.getConfidence());
                }
              }
            }
          }
        }
      }
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const gcsSourceUri = 'gs://cloud-samples-data/vision/document_understanding/kafka.pdf';

// Imports the Google Cloud client libraries
const {ImageAnnotatorClient} = require('@google-cloud/vision').v1;

// Instantiates a client
const client = new ImageAnnotatorClient();

// You can send multiple files to be annotated, this sample demonstrates how to do this with
// one file. If you want to use multiple files, you have to create a request object for each file that you want annotated.
async function batchAnnotateFiles() {
  // First, specify the input config with the file's URI and its type.
  // Supported mime_type: application/pdf, image/tiff, image/gif
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#inputconfig
  const inputConfig = {
    mimeType: 'application/pdf',
    gcsSource: {
      uri: gcsSourceUri,
    },
  };

  // Set the type of annotation you want to perform on the file
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
  const features = [{type: 'DOCUMENT_TEXT_DETECTION'}];

  // Build the request object for that one file. Note: for additional files you have to create
  // additional file request objects and store them in a list to be used below.
  // Since we are sending a file of type `application/pdf`, we can use the `pages` field to
  // specify which pages to process. The service can process up to 5 pages per document file.
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileRequest
  const fileRequest = {
    inputConfig: inputConfig,
    features: features,
    // Annotate the first two pages and the last one (max 5 pages)
    // First page starts at 1, and not 0. Last page is -1.
    pages: [1, 2, -1],
  };

  // Add each `AnnotateFileRequest` object to the batch request.
  const request = {
    requests: [fileRequest],
  };

  // Make the synchronous batch request.
  const [result] = await client.batchAnnotateFiles(request);

  // Process the results, just get the first result, since only one file was sent in this
  // sample.
  const responses = result.responses[0].responses;

  for (const response of responses) {
    console.log(`Full text: ${response.fullTextAnnotation.text}`);
    for (const page of response.fullTextAnnotation.pages) {
      for (const block of page.blocks) {
        console.log(`Block confidence: ${block.confidence}`);
        for (const paragraph of block.paragraphs) {
          console.log(` Paragraph confidence: ${paragraph.confidence}`);
          for (const word of paragraph.words) {
            const symbol_texts = word.symbols.map(symbol => symbol.text);
            const word_text = symbol_texts.join('');
            console.log(
              `  Word text: ${word_text} (confidence: ${word.confidence})`
            );
            for (const symbol of word.symbols) {
              console.log(
                `   Symbol: ${symbol.text} (confidence: ${symbol.confidence})`
              );
            }
          }
        }
      }
    }
  }
}

batchAnnotateFiles();

Python

Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


from google.cloud import vision_v1


def sample_batch_annotate_files(
    storage_uri="gs://cloud-samples-data/vision/document_understanding/kafka.pdf",
):
    """Perform batch file annotation."""
    mime_type = "application/pdf"

    client = vision_v1.ImageAnnotatorClient()

    gcs_source = {"uri": storage_uri}
    input_config = {"gcs_source": gcs_source, "mime_type": mime_type}
    features = [{"type_": vision_v1.Feature.Type.DOCUMENT_TEXT_DETECTION}]

    # The service can process up to 5 pages per document file.
    # Here we specify the first, second, and last page of the document to be
    # processed.
    pages = [1, 2, -1]
    requests = [{"input_config": input_config, "features": features, "pages": pages}]

    response = client.batch_annotate_files(requests=requests)
    for image_response in response.responses[0].responses:
        print(f"Full text: {image_response.full_text_annotation.text}")
        for page in image_response.full_text_annotation.pages:
            for block in page.blocks:
                print(f"\nBlock confidence: {block.confidence}")
                for par in block.paragraphs:
                    print(f"\tParagraph confidence: {par.confidence}")
                    for word in par.words:
                        print(f"\t\tWord confidence: {word.confidence}")
                        for symbol in word.symbols:
                            print(
                                f"\t\t\tSymbol: {symbol.text}, "
                                f"(confidence: {symbol.confidence})"
                            )

Try it

Try small batch online feature detection below.

You can use the PDF file specified already or specify your own file in its place.

There are three feature types specified for this request:

  • DOCUMENT_TEXT_DETECTION
  • LABEL_DETECTION
  • CROP_HINTS

You can add or remove other feature types by changing the appropriate object in the request ({"type": "FEATURE_NAME"}).

Send the request by selecting Execute.

Request body:

{
  "requests": [
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "gs://cloud-samples-data/vision/document_understanding/custom_0773375000.pdf"
        },
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        },
        {
          "type": "LABEL_DETECTION"
        },
        {
          "type": "CROP_HINTS"
        }
      ],
      "pages": [
        1,
        2,
        3,
        4,
        5
      ]
    }
  ]
}
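
Feature types in a request body like the one above can also be added or removed programmatically. The sketch below works on the parsed JSON as a plain dictionary; with_features is a hypothetical helper, and it returns a modified copy so the original body is left untouched.

```python
import copy


def with_features(request_body, add=(), remove=()):
    """Return a copy of the request body with feature types added or removed."""
    body = copy.deepcopy(request_body)
    for file_request in body["requests"]:
        # Drop the feature objects whose type was asked to be removed.
        kept = [f for f in file_request.get("features", []) if f["type"] not in remove]
        present = {f["type"] for f in kept}
        # Append each requested type once, skipping types already present.
        for name in add:
            if name not in present:
                kept.append({"type": name})
                present.add(name)
        file_request["features"] = kept
    return body
```

For instance, with_features(body, add=["FACE_DETECTION"], remove=["CROP_HINTS"]) swaps one feature type for another while keeping the input configuration and page list as they are.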