This quickstart guides the Application Operator (AO) through the process of using the Vertex AI Optical Character Recognition (OCR) pre-trained API on Google Distributed Cloud (GDC) air-gapped.
Before you begin
Follow these steps before trying OCR:
Set up a project using the GDC console to group the Vertex AI services. For information about creating and using projects, see Create a project.
Ask your Project IAM Admin to grant you the AI OCR Developer (ai-ocr-developer) role in your project namespace.
Download the gdcloud command-line interface (CLI).
Set up your service account
Create a service account and a service key for it. In the following commands, replace SERVICE_ACCOUNT with the name of your service account, SERVICE_KEY with the name of the key file, and PROJECT_ID with your project ID.
${HOME}/gdcloud init # set URI and project
${HOME}/gdcloud auth login
${HOME}/gdcloud iam service-accounts create SERVICE_ACCOUNT --project=PROJECT_ID
${HOME}/gdcloud iam service-accounts keys create "SERVICE_KEY".json --project=PROJECT_ID --iam-account=SERVICE_ACCOUNT
Grant access to project resources
Grant access to the OCR service account by providing your project ID, the name of your service account, and the role ai-ocr-developer.
${HOME}/gdcloud iam service-accounts add-iam-policy-binding --project=PROJECT_ID --iam-account=SERVICE_ACCOUNT --role=role/ai-ocr-developer
Set your environment variables
Before running the OCR pre-trained service, set your environment variable.
export GOOGLE_APPLICATION_CREDENTIALS="SERVICE_KEY".json
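Before requesting a token, it can help to confirm that the variable points to a readable key file. The following is a minimal sketch and not part of the GDC tooling: the function name check_credentials_path is hypothetical, and it only checks that the file exists and parses as JSON, without assuming anything about the fields inside the key.

```python
import json
import os


def check_credentials_path(env=None):
    """Return the key file path if it exists and contains valid JSON.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Key file not found: {path}")
    with open(path, "r", encoding="utf-8") as f:
        json.load(f)  # raises json.JSONDecodeError if the file is malformed
    return path
```

Calling check_credentials_path() with no arguments inspects the current shell environment.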
Authenticate the request
You must get a token to authenticate the requests to the OCR pre-trained service. Follow these steps:
gdcloud CLI
Export the identity token for the specified account to an environment variable:
export TOKEN="$($HOME/gdcloud auth print-identity-token --audiences=https://ENDPOINT)"
Replace ENDPOINT with the OCR endpoint. For more information, see View service statuses and endpoints.
Python
Install the google-auth client library.
pip install google-auth
Save the following code to a Python script, and update ENDPOINT to the OCR endpoint. For more information, see View service statuses and endpoints.
import google.auth
from google.auth.transport import requests

api_endpoint = "https://ENDPOINT"

creds, project_id = google.auth.default()
creds = creds.with_gdch_audience(api_endpoint)

def test_get_token():
    req = requests.Request()
    creds.refresh(req)
    print(creds.token)

if __name__ == "__main__":
    test_get_token()
Run the script to fetch the token.
You must add the fetched token to the header of the curl requests, as in the following example:
-H "Authorization: Bearer TOKEN"
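If you send the request from Python instead of curl, the same headers can be assembled in one place. This is a minimal sketch: build_headers is a hypothetical helper name, and the header names and the projects/PROJECT_ID format are taken from the curl example in this guide.

```python
def build_headers(token, project_id):
    """Build the HTTP headers used by the curl example in this guide."""
    token = token.strip()  # drop the trailing newline left by print or shell output
    if not token:
        raise ValueError("token must not be empty")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
        "x-goog-user-project": f"projects/{project_id}",
    }
```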
Make the curl request:
curl
echo '{"requests": [{"image": {"content": "'iVBORw0KGgoAAAANSUhEUgAAAMgAAAArCAMAAAAKVjeAAAAAA3NCSVQICAjb4U/gAAAADFBMVEX///8AAABnZ2cMDAzMh6MLAAAAX3pUWHRSYXcgcHJvZmlsZSB0eXBlIEFQUDEAAAiZ40pPzUstykxWKCjKT8vMSeVSAANjEy4TSxNLo0QDAwMLAwgwNDAwNgSSRkC2OVQo0QAFmJibpQGhuVmymSmIzwUAT7oVaBst2IwAAAEjSURBVGiB7ZRBFsMgCEShvf+d+9o0VmAwxpCuZjZGkYGfaEQoiqIoiqIoiqKoG6Sqg6lbTqK1LfwWTpUjSJ0IMnIhyAXdDaL6mwSQPpg5hgeT9H7c5sG1FES/wiA2OgkSLUPfW7wSRNWUdSAuih19drTUFnCuiyBO+6ob7WBGTPJ5tZYDJ4NAJYgvEoesUgoC+8bntgikczALSXQGJLMcuj7nOfAduQbStkm3fQnkUQACP9EZkB3mCsgZ3QEiDkRQ0r9A4K55kHaswlUmyApIVsVH04oGxO1NSoDfbw2IujmI5hX7fNeeDkDaWAbSX/cIIjY4B+KTAoj5xaDelkAEWobooW2/xyZFkH0DTF4GsZ84HIejg4x7UWuAnlSzZIqiJvUCFxYEUadKypwAAAAASUVORK5CYII='" }, "features": [ { "type": "DOCUMENT_TEXT_DETECTION" } ] }] }' | curl --cacert CERTIFICATE_NAME --data-binary @- -H "Content-Type: application/json" -H "Authorization: Bearer TOKEN" -H "x-goog-user-project: projects/PROJECT_ID" https://ENDPOINT/v1/images:annotate
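The content field in the request above is a base64-encoded image. To send your own image, you can generate the same JSON body from a file. The following is a sketch under stated assumptions: the helper names build_annotate_request and request_body_from_file are hypothetical, and the body layout simply mirrors the curl example.

```python
import base64
import json


def build_annotate_request(image_bytes, feature_type="DOCUMENT_TEXT_DETECTION"):
    """Return a request body shaped like the one in the curl example."""
    content = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": content},
                "features": [{"type": feature_type}],
            }
        ]
    }


def request_body_from_file(path):
    """Read an image file and return the JSON request body as a string."""
    with open(path, "rb") as f:
        return json.dumps(build_annotate_request(f.read()))
```

The string returned by request_body_from_file can be piped to the curl command in place of the echo output.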
Run the OCR pre-trained API sample script
This example shows you how to interact with an OCR pre-trained API.
Check whether the client library for OCR is installed.
pip freeze | grep vision # output example: google-cloud-vision==3.0.0
If the existing version doesn't match the client library in https://CONSOLE_ENDPOINT/.well-known/static/client-libraries, uninstall the client library.
pip uninstall google-cloud-vision
Download the client library for OCR, specifying your console endpoint in place of CONSOLE_ENDPOINT, as in the following example:
wget https://CONSOLE_ENDPOINT/.well-known/static/client-libraries/google-cloud-vision
Extract the tar file, and install it using pip. If errors are generated because something isn't found, install any missing dependencies.
tar -xvzf CLIENT_LIBRARY
pip install -r FOLDER/requirements.txt --no-index --find-links FOLDER
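The version check from the first step can also be done from Python, which is convenient after installation to confirm the result. This is a minimal sketch using the standard library's importlib.metadata: the helper names are hypothetical, and the expected version is whatever the client-libraries page for your console endpoint lists.

```python
from importlib import metadata


def installed_version(package="google-cloud-vision"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def needs_reinstall(expected_version, package="google-cloud-vision"):
    """Return True if the package is missing or differs from expected_version."""
    return installed_version(package) != expected_version
```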
Use the OCR client library script to generate the token, and make requests to the OCR service.
Set up your environment variable.
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "SERVICE_KEY.json"
Run the OCR sample
Replace ENDPOINT with the OCR endpoint that you use for your organization.
from google.cloud import vision
import google.auth
from google.auth.transport import requests
from google.api_core.client_options import ClientOptions

audience = "https://ENDPOINT:443"
api_endpoint = "ENDPOINT:443"

def vision_client(creds):
    """Create vision client."""
    opts = ClientOptions(api_endpoint=api_endpoint)
    return vision.ImageAnnotatorClient(credentials=creds, client_options=opts)

def main():
    """Load the default credentials and refresh them to obtain a token."""
    creds = None
    try:
        creds, project_id = google.auth.default()
        creds = creds.with_gdch_audience(audience)
        req = requests.Request()
        creds.refresh(req)
        print("Got token: ")
        print(creds.token)
    except Exception as e:
        print("Caught exception: " + str(e))
        raise e
    return creds

def vision_func(creds):
    """Send one base64-encoded image to the OCR service and print the response."""
    vc = vision_client(creds)
    image = {"content": "iVBORw0KGgoAAAANSUhEUgAAAMgAAAArCAMAAAAKVjeAAAAAA3NCSVQICAjb4U/gAAAADFBMVEX///8AAABnZ2cMDAzMh6MLAAAAX3pUWHRSYXcgcHJvZmlsZSB0eXBlIEFQUDEAAAiZ40pPzUstykxWKCjKT8vMSeVSAANjEy4TSxNLo0QDAwMLAwgwNDAwNgSSRkC2OVQo0QAFmJibpQGhuVmymSmIzwUAT7oVaBst2IwAAAEjSURBVGiB7ZRBFsMgCEShvf+d+9o0VmAwxpCuZjZGkYGfaEQoiqIoiqIoiqKoG6Sqg6lbTqK1LfwWTpUjSJ0IMnIhyAXdDaL6mwSQPpg5hgeT9H7c5sG1FES/wiA2OgkSLUPfW7wSRNWUdSAuih19drTUFnCuiyBO+6ob7WBGTPJ5tZYDJ4NAJYgvEoesUgoC+8bntgikczALSXQGJLMcuj7nOfAduQbStkm3fQnkUQACP9EZkB3mCsgZ3QEiDkRQ0r9A4K55kHaswlUmyApIVsVH04oGxO1NSoDfbw2IujmI5hX7fNeeDkDaWAbSX/cIIjY4B+KTAoj5xaDelkAEWobooW2/xyZFkH0DTF4GsZ84HIejg4x7UWuAnlSzZIqiJvUCFxYEUadKypwAAAAASUVORK5CYII="}
    features = [{"type_": vision.Feature.Type.DOCUMENT_TEXT_DETECTION}]
    # Each requests element corresponds to a single image. To annotate more
    # images, create a request element for each image and add it to
    # the array of requests
    req = {"image": image, "features": features}
    metadata = [("x-goog-user-project", "projects/PROJECT_ID")]
    resp = vc.annotate_image(req, metadata=metadata)
    print(resp)

if __name__ == "__main__":
    creds = main()
    vision_func(creds)
Replace PROJECT_ID with the ID of the project that you want to use.
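The sample prints the whole response. To keep only the detected text, you might add a small helper like the one below. This is a sketch, assuming the response has the shape of the google-cloud-vision AnnotateImageResponse type, with error.message and full_text_annotation.text fields; extract_text is a hypothetical name, and the fields returned by your deployment may differ.

```python
def extract_text(resp):
    """Return the detected text from an annotate_image response.

    Assumes `resp` exposes `error.message` and `full_text_annotation.text`.
    Raises RuntimeError if the service reported an error for this image.
    """
    if resp.error.message:
        raise RuntimeError(f"OCR request failed: {resp.error.message}")
    return resp.full_text_annotation.text
```

In the sample, calling print(extract_text(resp)) in place of print(resp) would show only the recognized text.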
What's next
- Learn more about how to Detect text in images.
- Learn more about how to Detect text in images offline.