This page provides an overview of how to manage access control for projects and documents.
Data Access Control overview
Data Access Control is a key feature of Document AI Warehouse. It controls who has access to which resource in Document AI Warehouse, and what level of access they have.
Document AI Warehouse APIs are built on Google Cloud. HTTPS is used to ensure secure data transmission over the internet. Authentication and authorization are enforced on the Document AI Warehouse APIs to protect the service and user data based on Google identities.
Document AI Warehouse APIs use OAuth2 for authentication with a user account. All the API methods require the https://www.googleapis.com/auth/cloud-platform OAuth scope.
Document AI Warehouse enforces access control for customer data, based on Cloud IAM. Document AI Warehouse defines a set of roles and associated permissions for you to restrict different users' access to the data stored in our service. For more information, see the IAM roles and permissions section.
Use a service account to enforce basic access control
You need a service account that has been granted the required permissions to access the Document AI Warehouse API. If you go through the "Provision through Google Cloud console" step in the Quickstart guide, a service account is automatically provisioned with the Document AI Warehouse Admin role.
Access control mode
Document AI Warehouse provides three access control modes:
- Universal access: No document-level access control
- Document-level access control with your own identity service
- Document-level access control with Cloud Identity
Users need to choose one of the access modes during the provisioning process. The following sections outline the difference between the three access control modes, and demonstrate how to enable each mode.
Universal access
Universal access control lets you use Identity and Access Management (IAM) alone to manage permissions. IAM applies the same permissions to all documents under the project with the authenticated identity.
In this mode, when you have finished the provisioning procedure in the quickstart guide, you and all of your users are able to access all the documents under the selected Google Cloud project in the Document AI Warehouse service using the service account, with the permissions associated with the service account.
The rest of this document discusses document-level access control. If you are using universal access, feel free to skip the rest of the document.
Document-level access control
For Document AI Warehouse users, you can either:
- Bring your own identity service
  - Both the end user and end-user membership groups are required in the request metadata. If your company has its own way of authenticating the user and identifying what groups the user belongs to, use this option.
- Use Cloud Identity
  - Only the end user is required in the request metadata because Document AI Warehouse collects the membership groups from Cloud Identity for customers. The difference between this and using a custom identity service is that you manage the user's group memberships using Cloud Identity versus an in-house system.
There are a few limitations with using the document-level access mode:
- Only members and roles in the ACL are supported. IAM conditions are ignored.
- Custom roles are not supported in the ACL.
- Document AI Warehouse does not verify end-user credentials. Document AI Warehouse only verifies the service account credentials to make sure the calls are from the customers. End user credentials need to be verified on the customer's side.
- Customers need to provide the end user (and all the groups that the end user is a member of if not using the Cloud Identity option) in the request metadata to enforce the access control.
- The number of membership groups for the end user should be less than 100.
Document-level access control with the customer's own identity service
You can choose this mode if you want to do the following:
- Grant end users or groups different permissions on individual documents.
- Use your own identity service.
This mode enables you to use IAM and access control lists (ACLs) together to manage permissions. Each document in Document AI Warehouse can be configured with a specific document-level ACL. Authentication and authorization happen as follows:
- The service account credential is authenticated and authorized to access the service.
- In the request metadata, include the end user and end-user membership groups. Either the end user or at least one of the groups the end user belongs to needs to have permission to access the document.
Document AI Warehouse grants access to the requested document only if both conditions in the preceding list are satisfied.
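The two conditions can be sketched as a small decision function. This is a minimal illustration and not the service's implementation: the function name is_access_granted, the acl_members set, and the way principals are represented as prefixed strings are all assumptions made for this example.

```python
def is_access_granted(service_account_authorized, end_user, group_ids, acl_members):
    """Return True only if both conditions hold.

    acl_members is a hypothetical set of principals that hold a role
    permitting the requested action on the document.
    """
    # Condition 1: the service account must be allowed to call the service.
    if not service_account_authorized:
        return False
    # Condition 2: the end user, or at least one of the user's groups,
    # must be among the principals with permission on the document.
    principals = {end_user, *group_ids}
    return not principals.isdisjoint(acl_members)


acl = {"group:fake_group_id_2"}
groups = ["group:fake_group_id_1", "group:fake_group_id_2"]
print(is_access_granted(True, "user:fake_user_id", groups, acl))
print(is_access_granted(True, "user:fake_user_id", groups[:1], acl))
```

The first call is granted through group membership; the second is denied because neither the user nor the remaining group appears in the ACL.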
The UserInfo (including the end user ID and user membership group IDs) of the RequestMetadata provided in the API call is used to validate whether the end user is allowed to perform the corresponding action against the requested document resource. For example, the UserInfo provided in the GetDocument API call is used to validate whether the end user is allowed to view the document. If either the end user or one of the membership groups is allowed to view the document, then the end user is allowed to view the document.
Sample RequestMetadata in JSON format:

    {
      "request_metadata": {
        "user_info": {
          "id": "user:fake_user_id",
          "group_ids": [
            "group:fake_group_id_1",
            "group:fake_group_id_2",
            "group:fake_group_id_3"
          ]
        }
      }
    }
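A proxy that assembles this metadata for each request might use a helper like the following. It is a sketch under stated assumptions: build_request_metadata and MAX_GROUPS are hypothetical names, the output mirrors the field names of the sample above, and the group check reflects the limitation that the number of membership groups should be less than 100. Passing no groups corresponds to the Cloud Identity option, where only the end user is required.

```python
import json

# Assumption: enforce the documented limit (fewer than 100 groups) client-side.
MAX_GROUPS = 100


def build_request_metadata(user_id, group_ids=None):
    """Build a request_metadata dictionary shaped like the sample above."""
    user_info = {"id": f"user:{user_id}"}
    if group_ids is not None:
        if len(group_ids) >= MAX_GROUPS:
            raise ValueError("the end user should belong to fewer than 100 groups")
        user_info["group_ids"] = [f"group:{group_id}" for group_id in group_ids]
    return {"request_metadata": {"user_info": user_info}}


metadata = build_request_metadata("fake_user_id", ["fake_group_id_1", "fake_group_id_2"])
print(json.dumps(metadata, indent=2))
```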
In addition to following the Quickstart guide, this access control mode requires a few additional steps before you start sending requests to Document AI Warehouse:
- Fetch group memberships for a given end user from your directory service (for example, Azure Active Directory or Okta).
- Follow the instructions under the Configure access control section to set a default project policy. You could also set a document-level ACL for specific documents after creation.
After completing the preceding steps, you are ready to use the service account to make API calls to Document AI Warehouse with end user and group membership information in the RequestMetadata section of the request body.
In this mode, you should deploy a proxy to authenticate and authorize the end users. The proxy uses the service account that has been granted the admin role to access the service. The service account key should be protected so that it is only used by the proxy.
As an out-of-the-box solution, the Document AI Warehouse console is a proxy that can store the service account key, authenticate the end users through Google identities, and forward the requests to Document AI Warehouse.
Document-level access control with Cloud Identity
As an alternative to using your own identity service, you could also opt in to use Cloud Identity to simplify the process.
To centrally manage users and groups, Google Cloud customers can set up Cloud Identity from scratch or federate identities between Google and other identity providers, such as Active Directory and Azure Active Directory.
The UserInfo section of RequestMetadata provided in the API call is used to validate whether the end user is allowed to perform the corresponding action against the requested document resource. Using Cloud Identity, only the end user ID is required in the RequestMetadata, and Document AI Warehouse collects the membership group information from the Cloud Identity service. If either the end user or one of the membership groups is allowed to access the document, then the end user is allowed to access the document.
Sample RequestMetadata in JSON format:

    {
      "request_metadata": {
        "user_info": {
          "id": "user:fake_user_id"
        }
      }
    }
In addition to following the Quickstart guide, this access control mode requires a few additional steps before you start sending requests to Document AI Warehouse:
- Integrate with Cloud Identity for the end users and groups.
- Follow the instructions under the Configure access control section to set a default project policy. You could also set a document-level ACL for specific documents after creation.
After completing the preceding steps, you are ready to use the service account to make API calls to Document AI Warehouse with end-user information in the RequestMetadata section of the request body.
Configure access control
Before you begin
Before you begin, make sure you have completed the Quickstart page.
SetAcl and FetchAcl
When a new project is created, no project ACL is set. Using the service account, the project owner can call the Document AI Warehouse SetAcl API with the projectOwner field set to true to set a default project policy that uses predefined roles for the project. Members in the project policy have access to all the documents under the project, depending on the roles granted. You can grant access to admin users or groups in the default project policy.
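As a rough illustration, a default project policy request could be assembled as shown below. This is a sketch and not the confirmed request format: build_project_set_acl_body is a hypothetical helper, the policy layout (bindings, each with a role and members) follows the general IAM policy shape and is assumed to apply here, and the member IDs are placeholders. Only the projectOwner field is taken from the description above; check the SetAcl API reference for the exact request body.

```python
import json


def build_project_set_acl_body(role_members):
    """Assemble a hypothetical SetAcl body for a default project policy.

    role_members maps a predefined role name to a list of member strings.
    """
    return {
        # Assumption: the policy uses the general IAM shape of role bindings.
        "policy": {
            "bindings": [
                {"role": role, "members": list(members)}
                for role, members in role_members.items()
            ]
        },
        # Set to true by the project owner when setting the project policy.
        "projectOwner": True,
    }


body = build_project_set_acl_body(
    {"roles/contentwarehouse.documentAdmin": ["group:fake_admin_group_id"]}
)
print(json.dumps(body, indent=2))
```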
To make calls to the Document Schema API using the service account, see projects.locations.documentSchemas.
The following table summarizes the required role for each document action. For more information about the permissions granted to each role, see IAM roles and permissions.
Document API method | Required roles |
---|---|
CreateDocument | roles/contentwarehouse.documentCreator |
UpdateDocument | roles/contentwarehouse.documentEditor |
DeleteDocument, SetAcl | roles/contentwarehouse.documentAdmin |
GetDocument, FetchAcl, SearchDocuments | roles/contentwarehouse.documentViewer |
CreateDocument
Grant the end user or group Creator access if not granted:
- Optional: Fetch membership groups for the end user Admin from your identity service. This step can be skipped if you use Cloud Identity.
- Grant end user A (or the group that user A is a member of) the role roles/contentwarehouse.documentCreator at the project level by calling SetAcl using the service account with end user Admin (and membership groups) in the request metadata. The end user Admin has documentAdmin access at the project level.
Create a document:
- Optional: Fetch membership groups for end user A from your identity service. This step can be skipped if you use Cloud Identity.
- Make the call to CreateDocument with end user A (and membership groups) in the request metadata to create a document using the service account. After the document is created, end user A can view and edit the document by default. You can also specify a default policy to grant users or groups access during creation, for example, granting groupX documentViewer access, groupY documentEditor access, and groupZ documentAdmin access.
GetDocument and FetchAcl
After the document is created, end user A or the members of groupX, groupY, or groupZ are able to call GetDocument to view the document, or call FetchAcl to view the ACL of the document. Here are the steps:
- Optional: Fetch membership groups for end user A from your identity service. This step can be skipped if you use Cloud Identity.
- Make the call to GetDocument or FetchAcl using the service account with end user A (and membership groups) in the request metadata.
The call from end user B is rejected if B is not a member of groupX, groupY, or groupZ.
UpdateDocument, DeleteDocument, and SetAcl
After the document is created, only end user A or members of groupY or groupZ are allowed to call UpdateDocument to update the document. Only end user A or members of groupZ are allowed to call DeleteDocument to delete the document, or SetAcl to share the document with other end users or groups. Here are the steps:
- Optional: Fetch membership groups for end user A from your identity service. This step can be skipped if you use Cloud Identity.
- Make the call to UpdateDocument, DeleteDocument, or SetAcl using the service account with end user A (and membership groups) in the request metadata.
The call from members of groupX is denied because they only have documentViewer access to the document.
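The groupX, groupY, and groupZ walkthrough can be reproduced with a small lookup. This is a simplified model for illustration only: the names REQUIRED_ROLE, ROLE_RANK, DOCUMENT_ACL, and can_call are hypothetical, and the ordering of roles (admin covers editor, which covers viewer) is an assumption inferred from the walkthrough rather than a statement of how the service evaluates permissions.

```python
# Role required for each method, as listed in the table of required roles.
REQUIRED_ROLE = {
    "GetDocument": "documentViewer",
    "FetchAcl": "documentViewer",
    "UpdateDocument": "documentEditor",
    "DeleteDocument": "documentAdmin",
    "SetAcl": "documentAdmin",
}

# Assumption inferred from the walkthrough: a higher role also covers
# the actions allowed by the lower ones.
ROLE_RANK = {"documentViewer": 1, "documentEditor": 2, "documentAdmin": 3}

# Document-level ACL from the example: principal -> granted role.
DOCUMENT_ACL = {
    "group:groupX": "documentViewer",
    "group:groupY": "documentEditor",
    "group:groupZ": "documentAdmin",
}


def can_call(method, principals, acl=DOCUMENT_ACL):
    """Return True if any principal holds a role that covers the method."""
    needed = ROLE_RANK[REQUIRED_ROLE[method]]
    return any(ROLE_RANK[acl[p]] >= needed for p in principals if p in acl)


for group in ("group:groupX", "group:groupY", "group:groupZ"):
    allowed = [m for m in REQUIRED_ROLE if can_call(m, [group])]
    print(group, allowed)
```

In this model a member of groupX can only call GetDocument and FetchAcl, a member of groupY can also call UpdateDocument, and a member of groupZ can call every listed method.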
SearchDocuments
The documents returned depend on the roles granted to the end user. For example, for an empty search query, all documents under the project are returned if the end user has documentViewer access at the project level. Otherwise, only the documents for which the end user has the contentwarehouse.documents.get permission are returned.
To make a call to the SearchDocuments API, perform the following steps:
- Optional: Fetch membership groups for end user A from your identity service. This step can be skipped if you use Cloud Identity.
- Make the call to SearchDocuments using the service account with end user A (and membership groups) in the request metadata.
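The filtering described at the start of this section, for an empty search query, can be sketched as follows. The function visible_documents and its data structures are hypothetical and exist only to illustrate the rule; the real filtering is performed by the service.

```python
def visible_documents(documents, principals, project_viewers):
    """Return the document names an end user would see for an empty query.

    documents maps a document name to the set of principals that hold the
    contentwarehouse.documents.get permission on it (hypothetical model).
    project_viewers is the set of principals with documentViewer access at
    the project level.
    """
    principals = set(principals)
    # Project-level viewers see every document under the project.
    if principals & project_viewers:
        return sorted(documents)
    # Otherwise, only documents the end user (or a group) may get are returned.
    return sorted(name for name, readers in documents.items() if principals & readers)


docs = {
    "doc1": {"user:fake_user_id"},
    "doc2": {"group:fake_group_id_1"},
    "doc3": {"group:fake_group_id_9"},
}
user = ["user:fake_user_id", "group:fake_group_id_1"]
print(visible_documents(docs, user, project_viewers=set()))
print(visible_documents(docs, user, project_viewers={"group:fake_group_id_1"}))
```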
Document Link APIs
Document Link API method | Required roles |
---|---|
CreateDocumentLink | Source: roles/contentwarehouse.documentEditor; Target: roles/contentwarehouse.documentViewer |
ListLinkedTargets, ListLinkedSources | roles/contentwarehouse.documentViewer |
DeleteDocumentLink | Source: roles/contentwarehouse.documentEditor |
CreateDocumentLink
End users are able to link document doc1 (the source) and document doc2 (the target) if they have the contentwarehouse.documents.update permission for doc1 and the contentwarehouse.documents.get permission for doc2.
ListLinkedTargets and ListLinkedSources
End users can only list the target or source documents for which they have the contentwarehouse.documents.get permission.
DeleteDocumentLink
End users are able to delete the links if they have the contentwarehouse.documents.update permission on the source documents.
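The permission rules for the Document Link APIs can be summarized in a few checks. This is an illustrative sketch: the function names and the idea of passing in a set of permission strings per document are assumptions, while the two permission names are the ones used in this section.

```python
UPDATE = "contentwarehouse.documents.update"
GET = "contentwarehouse.documents.get"


def can_create_link(source_perms, target_perms):
    """Linking needs update on the source document and get on the target."""
    return UPDATE in source_perms and GET in target_perms


def can_delete_link(source_perms):
    """Deleting a link needs update on the source document."""
    return UPDATE in source_perms


def listable(linked_docs):
    """Keep only the linked documents the end user may get.

    linked_docs maps a document name to the end user's permissions on it.
    """
    return sorted(name for name, perms in linked_docs.items() if GET in perms)


doc1_perms = {GET, UPDATE}  # end user's permissions on doc1
doc2_perms = {GET}          # end user's permissions on doc2
print(can_create_link(doc1_perms, doc2_perms))
print(can_create_link(doc2_perms, doc1_perms))
print(listable({"doc1": doc1_perms, "doc2": doc2_perms, "doc3": set()}))
```

Linking doc1 to doc2 is allowed, but the reverse direction is not, because the end user lacks the update permission on doc2.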