Offline batch file annotation

The Vision API can run any Vision API feature detection on PDF and TIFF files stored in Cloud Storage.

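Feature detection on PDF and TIFF files must be requested with the files:asyncBatchAnnotate function, which performs an offline (asynchronous) request and reports its status through the operations resources.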

Output from a PDF/TIFF request is written to JSON files created in the specified Cloud Storage bucket.

Limitations

The Vision API accepts PDF/TIFF files of up to 2000 pages. Larger files return an error.

Authentication

API keys are not supported for files:asyncBatchAnnotate requests. For instructions on authenticating with a service account, see "Using a service account".

The account used for authentication must have access to the Cloud Storage bucket that you specify for the output (roles/editor or roles/storage.objectCreator or above).

You can use an API key to query the status of the operation; for instructions, see "Using an API key".

Feature detection requests

Currently, PDF/TIFF document detection is only available for files stored in Cloud Storage buckets. The response JSON files are likewise saved to a Cloud Storage bucket.

Command line

To perform PDF/TIFF document text detection, make a POST request and provide the appropriate request body:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://vision.googleapis.com/v1/files:asyncBatchAnnotate -d "{
  'requests':[
    {
      'inputConfig': {
        'gcsSource': {
          'uri': 'gs://your-source-bucket-name/folder/multi-page-file.pdf'
        },
        'mimeType': 'application/pdf'
      },
      'features': [
        {
          'type': 'DOCUMENT_TEXT_DETECTION'
        }
      ],
      'outputConfig': {
        'gcsDestination': {
          'uri': 'gs://your-bucket-name/folder/'
        },
        'batchSize': 1
      }
    }
  ]
}"

Where:

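inputConfig - replaces the image field used in other Vision API requests. It contains two child fields:
- gcsSource.uri - the Cloud Storage URI of the PDF or TIFF file (accessible to the user or service account making the request).
- mimeType - one of the accepted file types: application/pdf or image/tiff.
outputConfig - specifies output details. It contains two child fields:
- gcsDestination.uri - a valid Cloud Storage URI. The user or service account making the request must have write permission on the bucket. The filename is output-x-to-y, where x and y represent the PDF/TIFF page numbers included in that output file. If the file exists, its contents are overwritten.
- batchSize - specifies how many pages of output to include in each output JSON file.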
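
The request body described above can also be assembled programmatically. The following is a minimal sketch, not an official helper: the function name build_request_body and the idea of inferring mimeType from the file extension are assumptions made for illustration, while the field names mirror the request body shown above.

```python
import json

# Assumption: the MIME type is inferred from the file extension. The API itself
# only requires that mimeType be application/pdf or image/tiff.
MIME_TYPES = {
    ".pdf": "application/pdf",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
}


def build_request_body(source_uri, destination_uri, batch_size=1,
                       feature_type="DOCUMENT_TEXT_DETECTION"):
    """Return a files:asyncBatchAnnotate request body as a dict (hypothetical helper)."""
    lowered = source_uri.lower()
    for extension, mime_type in MIME_TYPES.items():
        if lowered.endswith(extension):
            break
    else:
        raise ValueError(f"unsupported file type: {source_uri}")

    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": source_uri},
                    "mimeType": mime_type,
                },
                "features": [{"type": feature_type}],
                "outputConfig": {
                    "gcsDestination": {"uri": destination_uri},
                    "batchSize": batch_size,
                },
            }
        ]
    }


if __name__ == "__main__":
    body = build_request_body(
        "gs://your-source-bucket-name/folder/multi-page-file.pdf",
        "gs://your-bucket-name/folder/",
    )
    # The printed JSON can be saved to a file and passed to curl with -d @file.
    print(json.dumps(body, indent=2))
```

Rejecting unknown extensions locally is a design choice in this sketch; it surfaces an unsupported file type before any request is sent.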

Response:

A successful asyncBatchAnnotate request returns a response with a single name field:

{
  "name": "projects/usable-auth-library/operations/1efec2285bd442df"
}

This name represents a long-running operation with an associated ID (for example, 1efec2285bd442df), which you can query using the v1.operations API.

To retrieve your Vision annotation response, send a GET request to the v1.operations endpoint, passing the operation ID in the URL:

curl -X GET -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
https://vision.googleapis.com/v1/operations/1efec2285bd442df
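
To go from the name field of the initial response to the URL used in the GET request above, only the trailing operation ID is needed. The sketch below shows one way to do that; the helper names operation_id and operation_url are hypothetical, and the endpoint is the one used in the curl command.

```python
# Endpoint taken from the curl example above.
OPERATIONS_ENDPOINT = "https://vision.googleapis.com/v1/operations"


def operation_id(name):
    """Return the trailing ID of an operation name.

    Accepts both 'projects/<project>/operations/<id>' and 'operations/<id>'.
    """
    op_id = name.rstrip("/").rsplit("/", 1)[-1]
    if not op_id:
        raise ValueError(f"no operation ID in {name!r}")
    return op_id


def operation_url(name):
    """Build the status URL for the operation with the given name."""
    return f"{OPERATIONS_ENDPOINT}/{operation_id(name)}"


if __name__ == "__main__":
    print(operation_url("projects/usable-auth-library/operations/1efec2285bd442df"))
```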

If the operation is in progress:

{
  "name": "operations/1efec2285bd442df",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.OperationMetadata",
    "state": "RUNNING",
    "createTime": "2019-05-15T21:10:08.401917049Z",
    "updateTime": "2019-05-15T21:10:33.700763554Z"
  }
}

Once the operation has completed, the state shows as DONE and your results are written to the Cloud Storage location you specified:

{
  "name": "operations/1efec2285bd442df",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.OperationMetadata",
    "state": "DONE",
    "createTime": "2019-05-15T20:56:30.622473785Z",
    "updateTime": "2019-05-15T20:56:41.666379749Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.AsyncBatchAnnotateFilesResponse",
    "responses": [
      {
        "outputConfig": {
          "gcsDestination": {
            "uri": "gs://your-bucket-name/folder/"
          },
          "batchSize": 1
        }
      }
    ]
  }
}
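
When polling from a script, the two responses above can be handled by one small function. This is a sketch under stated assumptions: the name summarize_operation is hypothetical, the input is the already-parsed JSON of an operations response, and the error check relies on the standard error field of long-running operations, which the examples above do not show.

```python
def summarize_operation(operation):
    """Return (state, done, output_uris) for a parsed operations response.

    state comes from metadata.state, done from the top-level done flag, and
    output_uris lists every gcsDestination.uri found in response.responses.
    """
    # Assumption: a failed operation carries a top-level "error" object.
    if "error" in operation:
        raise RuntimeError(f"operation failed: {operation['error']}")

    state = operation.get("metadata", {}).get("state", "UNKNOWN")
    done = bool(operation.get("done", False))
    responses = operation.get("response", {}).get("responses", [])
    output_uris = [
        item["outputConfig"]["gcsDestination"]["uri"]
        for item in responses
        if "outputConfig" in item
    ]
    return state, done, output_uris
```

A caller would typically keep polling while done is false and read the output files only once it becomes true.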

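The JSON in your output file is similar to the response of an image document text detection request, with an additional context field that shows the location of the specified PDF or TIFF and the page number within that file: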

output-1-to-1.json

Full response

{
  "inputConfig": {
    "gcsSource": {
      "uri": "gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf"
    },
    "mimeType": "application/pdf"
  },
  "responses": [
    {
      "fullTextAnnotation": {
        "pages": [
          {
            "property": {
              "detectedLanguages": [
                {
                  "languageCode": "en",
                  "confidence": 0.94
                }
              ]
            },
            "width": 612,
            "height": 792,
            "blocks": [
              {
                "boundingBox": {
                  "normalizedVertices": [
                    {
                      "x": 0.12908497,
                      "y": 0.10479798
                    },
                    ...
                    {
                      "x": 0.12908497,
                      "y": 0.1199495
                    }
                  ]
                },
                "paragraphs": [
                  {
                  ...
                    },
                    "words": [
                      {
                        ...
                        },
                        "symbols": [
                          {
                          ...
                            "text": "C",
                            "confidence": 0.99
                          },
                          {
                            "property": {
                              "detectedLanguages": [
                                {
                                  "languageCode": "en"
                                }
                              ]
                            },
                            "text": "O",
                            "confidence": 0.99
                          },
             ...
             }
            ]
          }
        ],
        "text": "CONTENTS\n.\n1-1\nII-1\nIII-1\nList of Statistical Tables...
        \nHow to Use This Census Report ..\nTable Finding Guide .\nUser
        Notes .......\nStatistical Tables.........\nAppendixes
        \nA Geographic Terms and Concepts .........\nB Definitions of
        Subject Characteristics.\nData Collection and Processing Procedures...
        \nQuestionnaire. ........\nE Maps .................\nF Operational
        Overview and accuracy of the Data.......\nG Residence Rule and
        Residence Situations for the \n2010 Census of the United States...
        \nH Acknowledgments .....\nE\n*Appendix may be found in the separate
        volume, CPH-1-A, Summary Population and\nHousing Characteristics,
        Selected Appendixes, on the Internet at
        <www.census.gov\n/prod/cen2010/cph-1-a.pdf>.\nContents\n"
      },
      "context": {
        "uri": "gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf",
        "pageNumber": 1
      }
    }
  ]
}
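
Reading such an output file usually comes down to pulling the page number and the full text out of each entry in responses. The following sketch assumes the file content has already been downloaded as a string; the function name pages_from_output is hypothetical, and the field names are those shown in the response above.

```python
import json


def pages_from_output(json_text):
    """Yield (page_number, text) for each response in one output JSON file."""
    document = json.loads(json_text)
    for response in document.get("responses", []):
        # context.pageNumber identifies the page of the source PDF/TIFF.
        page_number = response.get("context", {}).get("pageNumber")
        # fullTextAnnotation may be absent, for example on a blank page.
        text = response.get("fullTextAnnotation", {}).get("text", "")
        yield page_number, text


if __name__ == "__main__":
    sample = json.dumps({
        "responses": [
            {
                "fullTextAnnotation": {"text": "CONTENTS\n"},
                "context": {
                    "uri": "gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf",
                    "pageNumber": 1,
                },
            }
        ]
    })
    for page_number, text in pages_from_output(sample):
        print(page_number, repr(text))
```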

Go

Before trying this sample, follow the Go setup instructions in "Vision quickstart using client libraries". For more information, see the Vision Go API reference documentation.

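To authenticate to Vision, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".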


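import (
	"context"
	"fmt"
	"io"

	vision "cloud.google.com/go/vision/apiv1"
	"cloud.google.com/go/vision/v2/apiv1/visionpb"
)

// detectAsyncDocumentURI performs Optical Character Recognition (OCR) on a
// PDF file stored in GCS.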
func detectAsyncDocumentURI(w io.Writer, gcsSourceURI, gcsDestinationURI string) error {
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx)
	if err != nil {
		return err
	}
	// Release the client's resources when the function returns.
	defer client.Close()

	request := &visionpb.AsyncBatchAnnotateFilesRequest{
		Requests: []*visionpb.AsyncAnnotateFileRequest{
			{
				Features: []*visionpb.Feature{
					{
						Type: visionpb.Feature_DOCUMENT_TEXT_DETECTION,
					},
				},
				InputConfig: &visionpb.InputConfig{
					GcsSource: &visionpb.GcsSource{Uri: gcsSourceURI},
					// Supported MimeTypes are: "application/pdf" and "image/tiff".
					MimeType: "application/pdf",
				},
				OutputConfig: &visionpb.OutputConfig{
					GcsDestination: &visionpb.GcsDestination{Uri: gcsDestinationURI},
					// How many pages should be grouped into each json output file.
					BatchSize: 2,
				},
			},
		},
	}

	operation, err := client.AsyncBatchAnnotateFiles(ctx, request)
	if err != nil {
		return err
	}

	fmt.Fprintln(w, "Waiting for the operation to finish.")

	resp, err := operation.Wait(ctx)
	if err != nil {
		return err
	}

	fmt.Fprintf(w, "%v", resp)

	return nil
}

Java

Before trying this sample, follow the Java setup instructions in "Vision quickstart using client libraries". For more information, see the Vision Java API reference documentation.

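To authenticate to Vision, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".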

/**
 * Performs document text OCR with PDF/TIFF as source files on Google Cloud Storage.
 *
 * @param gcsSourcePath The path to the remote file on Google Cloud Storage to detect document
 *     text on.
 * @param gcsDestinationPath The path to the remote file on Google Cloud Storage to store the
 *     results on.
 * @throws Exception on errors while closing the client.
 */
public static void detectDocumentsGcs(String gcsSourcePath, String gcsDestinationPath)
    throws Exception {

  // Initialize client that will be used to send requests. This client only needs to be created
  // once, and can be reused for multiple requests. After completing all of your requests, call
  // the "close" method on the client to safely clean up any remaining background resources.
  try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
    List<AsyncAnnotateFileRequest> requests = new ArrayList<>();

    // Set the GCS source path for the remote file.
    GcsSource gcsSource = GcsSource.newBuilder().setUri(gcsSourcePath).build();

    // Create the configuration with the specified MIME (Multipurpose Internet Mail Extensions)
    // types
    InputConfig inputConfig =
        InputConfig.newBuilder()
            .setMimeType(
                "application/pdf") // Supported MimeTypes: "application/pdf", "image/tiff"
            .setGcsSource(gcsSource)
            .build();

    // Set the GCS destination path for where to save the results.
    GcsDestination gcsDestination =
        GcsDestination.newBuilder().setUri(gcsDestinationPath).build();

    // Create the configuration for the output with the batch size.
    // The batch size sets how many pages should be grouped into each JSON output file.
    OutputConfig outputConfig =
        OutputConfig.newBuilder().setBatchSize(2).setGcsDestination(gcsDestination).build();

    // Select the Feature required by the vision API
    Feature feature = Feature.newBuilder().setType(Feature.Type.DOCUMENT_TEXT_DETECTION).build();

    // Build the OCR request
    AsyncAnnotateFileRequest request =
        AsyncAnnotateFileRequest.newBuilder()
            .addFeatures(feature)
            .setInputConfig(inputConfig)
            .setOutputConfig(outputConfig)
            .build();

    requests.add(request);

    // Perform the OCR request
    OperationFuture<AsyncBatchAnnotateFilesResponse, OperationMetadata> response =
        client.asyncBatchAnnotateFilesAsync(requests);

    System.out.println("Waiting for the operation to finish.");

    // Wait for the request to finish. (The result is not used, since the API saves the result to
    // the specified location on GCS.)
    List<AsyncAnnotateFileResponse> result =
        response.get(180, TimeUnit.SECONDS).getResponsesList();

    // Once the request has completed and the output has been
    // written to GCS, we can list all the output files.
    Storage storage = StorageOptions.getDefaultInstance().getService();

    // Get the destination location from the gcsDestinationPath
    Pattern pattern = Pattern.compile("gs://([^/]+)/(.+)");
    Matcher matcher = pattern.matcher(gcsDestinationPath);

    if (matcher.find()) {
      String bucketName = matcher.group(1);
      String prefix = matcher.group(2);

      // Get the list of objects with the given prefix from the GCS bucket
      Bucket bucket = storage.get(bucketName);
      com.google.api.gax.paging.Page<Blob> pageList = bucket.list(BlobListOption.prefix(prefix));

      Blob firstOutputFile = null;

      // List objects with the given prefix.
      System.out.println("Output files:");
      for (Blob blob : pageList.iterateAll()) {
        System.out.println(blob.getName());

        // Process the first output file from GCS.
        // Since we specified batch size = 2, the first response contains
        // the first two pages of the input file.
        if (firstOutputFile == null) {
          firstOutputFile = blob;
        }
      }

      // Stop early if no output files were found under the prefix.
      if (firstOutputFile == null) {
        System.out.println("No output files found.");
        return;
      }

      // Get the contents of the file and convert the JSON contents to an AnnotateFileResponse
      // object. If the Blob is small, read all its content in one request.
      // (Note: the file is a .json file)
      // Storage guide: https://cloud.google.com/storage/docs/downloading-objects
      String jsonContents = new String(firstOutputFile.getContent());
      Builder builder = AnnotateFileResponse.newBuilder();
      JsonFormat.parser().merge(jsonContents, builder);

      // Build the AnnotateFileResponse object
      AnnotateFileResponse annotateFileResponse = builder.build();

      // Parse through the object to get the actual response for the first page of the input file.
      AnnotateImageResponse annotateImageResponse = annotateFileResponse.getResponses(0);

      // Here we print the full text from the first page.
      // The response contains more information:
      // annotation/pages/blocks/paragraphs/words/symbols
      // including confidence score and bounding boxes
      System.out.format("%nText: %s%n", annotateImageResponse.getFullTextAnnotation().getText());
    } else {
      System.out.println("No MATCH");
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in "Vision quickstart using client libraries". For more information, see the Vision Node.js API reference documentation.

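To authenticate to Vision, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".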


// Imports the Google Cloud client libraries
const vision = require('@google-cloud/vision').v1;

// Creates a client
const client = new vision.ImageAnnotatorClient();

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// Bucket where the file resides
// const bucketName = 'my-bucket';
// Path to PDF file within bucket
// const fileName = 'path/to/document.pdf';
// The folder to store the results
// const outputPrefix = 'results'

const gcsSourceUri = `gs://${bucketName}/${fileName}`;
const gcsDestinationUri = `gs://${bucketName}/${outputPrefix}/`;

const inputConfig = {
  // Supported mime_types are: 'application/pdf' and 'image/tiff'
  mimeType: 'application/pdf',
  gcsSource: {
    uri: gcsSourceUri,
  },
};
const outputConfig = {
  gcsDestination: {
    uri: gcsDestinationUri,
  },
};
const features = [{type: 'DOCUMENT_TEXT_DETECTION'}];
const request = {
  requests: [
    {
      inputConfig: inputConfig,
      features: features,
      outputConfig: outputConfig,
    },
  ],
};

const [operation] = await client.asyncBatchAnnotateFiles(request);
const [filesResponse] = await operation.promise();
const destinationUri =
  filesResponse.responses[0].outputConfig.gcsDestination.uri;
console.log('Json saved to: ' + destinationUri);

Python

Before trying this sample, follow the Python setup instructions in "Vision quickstart using client libraries". For more information, see the Vision Python API reference documentation.

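To authenticate to Vision, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".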

def async_detect_document(gcs_source_uri, gcs_destination_uri):
    """OCR with PDF/TIFF as source files on GCS"""
    import json
    import re
    from google.cloud import vision
    from google.cloud import storage

    # Supported mime_types are: 'application/pdf' and 'image/tiff'
    mime_type = "application/pdf"

    # How many pages should be grouped into each json output file.
    batch_size = 2

    client = vision.ImageAnnotatorClient()

    feature = vision.Feature(type_=vision.Feature.Type.DOCUMENT_TEXT_DETECTION)

    gcs_source = vision.GcsSource(uri=gcs_source_uri)
    input_config = vision.InputConfig(gcs_source=gcs_source, mime_type=mime_type)

    gcs_destination = vision.GcsDestination(uri=gcs_destination_uri)
    output_config = vision.OutputConfig(
        gcs_destination=gcs_destination, batch_size=batch_size
    )

    async_request = vision.AsyncAnnotateFileRequest(
        features=[feature], input_config=input_config, output_config=output_config
    )

    operation = client.async_batch_annotate_files(requests=[async_request])

    print("Waiting for the operation to finish.")
    operation.result(timeout=420)

    # Once the request has completed and the output has been
    # written to GCS, we can list all the output files.
    storage_client = storage.Client()

    match = re.match(r"gs://([^/]+)/(.+)", gcs_destination_uri)
    bucket_name = match.group(1)
    prefix = match.group(2)

    bucket = storage_client.get_bucket(bucket_name)

    # List objects with the given prefix, filtering out folders.
    blob_list = [
        blob
        for blob in list(bucket.list_blobs(prefix=prefix))
        if not blob.name.endswith("/")
    ]
    print("Output files:")
    for blob in blob_list:
        print(blob.name)

    # Process the first output file from GCS.
    # Since we specified batch_size=2, the first response contains
    # the first two pages of the input file.
    output = blob_list[0]

    json_string = output.download_as_bytes().decode("utf-8")
    response = json.loads(json_string)

    # The actual response for the first page of the input file.
    first_page_response = response["responses"][0]
    annotation = first_page_response["fullTextAnnotation"]

    # Here we print the full text from the first page.
    # The response contains more information:
    # annotation/pages/blocks/paragraphs/words/symbols
    # including confidence scores and bounding boxes
    print("Full text:\n")
    print(annotation["text"])
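
The sample above reads only the first output file. When every output file is processed in page order, note that Cloud Storage lists object names lexicographically, so a name such as output-11-to-12.json is listed before output-3-to-4.json. The sketch below sorts names by their first page number instead; the helper name sort_output_names is hypothetical, and the .json suffix follows the output-1-to-1.json example shown earlier.

```python
import re

# Matches the output-x-to-y.json naming described for gcsDestination.uri.
OUTPUT_NAME = re.compile(r"output-(\d+)-to-(\d+)\.json$")


def sort_output_names(names):
    """Return output file names ordered by first page number.

    Names that do not match the output-x-to-y.json pattern (for example a
    folder placeholder ending in '/') are skipped.
    """
    matched = []
    for name in names:
        match = OUTPUT_NAME.search(name)
        if match:
            matched.append((int(match.group(1)), name))
    return [name for _, name in sorted(matched)]


if __name__ == "__main__":
    listed = [
        "results/output-1-to-2.json",
        "results/output-11-to-12.json",
        "results/output-3-to-4.json",
    ]
    for name in sort_output_names(listed):
        print(name)
```

With the sample above, the helper could be applied to the blob names collected in blob_list before downloading each file.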

gcloud

The gcloud command you use depends on the file type.

To perform PDF text detection, use the gcloud ml vision detect-text-pdf command as shown in the following example:
```
gcloud ml vision detect-text-pdf gs://my_bucket/input_file  gs://my_bucket/out_put_prefix
```
To perform TIFF text detection, use the gcloud ml vision detect-text-tiff command as shown in the following example:
```
gcloud ml vision detect-text-tiff gs://my_bucket/input_file  gs://my_bucket/out_put_prefix
```
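
Because the command depends on the file type, a script that handles both formats has to pick one. The following sketch only builds the argument list and does not run gcloud; the helper name gcloud_detect_text_command and the choice to decide by file extension are assumptions for illustration, while the two subcommands are the ones shown above.

```python
import shlex

# Assumption: the file type is recognized by its extension.
GCLOUD_SUBCOMMANDS = {
    ".pdf": "detect-text-pdf",
    ".tif": "detect-text-tiff",
    ".tiff": "detect-text-tiff",
}


def gcloud_detect_text_command(input_uri, output_prefix_uri):
    """Return the gcloud argument list for the given input file (hypothetical helper)."""
    lowered = input_uri.lower()
    for extension, subcommand in GCLOUD_SUBCOMMANDS.items():
        if lowered.endswith(extension):
            return ["gcloud", "ml", "vision", subcommand, input_uri, output_prefix_uri]
    raise ValueError(f"cannot infer file type from {input_uri!r}")


if __name__ == "__main__":
    command = gcloud_detect_text_command(
        "gs://my_bucket/input_file.pdf", "gs://my_bucket/out_put_prefix"
    )
    # Print a shell-quoted command line; pass the list to subprocess.run to execute it.
    print(shlex.join(command))
```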

Other languages

C#: Follow the C# setup instructions on the client libraries page, then visit the Vision reference documentation for .NET.

PHP: Follow the PHP setup instructions on the client libraries page, then visit the Vision reference documentation for PHP.

Ruby: Follow the Ruby setup instructions on the client libraries page, then visit the Vision reference documentation for Ruby.