Organize documents in folders

This page gives a brief overview of folders and explains how to manage documents using folders.

What are Document AI Warehouse folders

A folder is a special type of document. It can't include inline content or have any associated content, but users can still add properties to it. A folder serves as a container to group and label documents. Users can attach a document to multiple folders, and a folder can contain multiple documents. Folders can also be used in the documents.search API to restrict results to the child documents of a specific folder.
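
As an illustration of the search filter mentioned above, the following minimal sketch builds a documents.search request body restricted to one folder. It assumes the filter is passed as documentQuery.folderNameFilter and that the folder is identified by its full document resource name. The helper name build_folder_search_body is hypothetical, so check the documents.search reference before relying on these field names.

```python
import json


def build_folder_search_body(folder_name: str, user_email: str) -> dict:
    """Build a documents.search request body limited to one folder.

    Assumption: the folder filter is sent as documentQuery.folderNameFilter.
    """
    return {
        "documentQuery": {"folderNameFilter": folder_name},
        "requestMetadata": {"userInfo": {"id": f"user:{user_email}"}},
    }


# Placeholder values in the same style as the REST examples on this page.
search_body = build_folder_search_body(
    "projects/PROJECT_NUMBER/locations/LOCATION/documents/FOLDER",
    "USER_EMAIL_ID",
)
print(json.dumps(search_body, indent=2))
```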

Before you begin

Before you begin, make sure you have completed the steps on the Quickstart page.

Create a Document AI Warehouse folder with a schema

To create a schema for instantiating folders, do the following:

REST

Request:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" -d '{
"display_name": "abc",
"property_definitions": [
  {
    "name": "Name",
    "display_name": "Name",
    "is_repeatable": false,
    "is_filterable": true,
    "is_searchable": true,
    "is_metadata": true,
    "is_required": true,
    "text_type_options": {},
    "schema_sources": []
  }
],
"document_is_folder": true
}' \
"https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/documentSchemas"

Response:

{
  "name": "SCHEMA_NAME",
  "displayName": "abc",
  "documentIsFolder": true,
  "updateTime": "2022-08-31T16:10:43.111978Z",
  "createTime": "2022-08-31T16:10:43.111978Z"
}

Python

from google.cloud import contentwarehouse_v1

def create_folder_schema():
    # Create a client
    client = contentwarehouse_v1.DocumentSchemaServiceClient()

    # Initialize request argument(s)
    document_schema = contentwarehouse_v1.DocumentSchema()
    document_schema.display_name = "display_name_value"
    document_schema.document_is_folder = True

    request = contentwarehouse_v1.CreateDocumentSchemaRequest(
        parent="projects/PROJECT_NUMBER/locations/LOCATION",
        document_schema=document_schema,
    )

    # Make the request
    return client.create_document_schema(request=request)
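
The Python sample above creates the folder schema without the Name property that the REST request defines. As a sketch only, the following standard-library helper assembles the same JSON body as the REST request so that it can be inspected or sent with any HTTP client. The helper name build_folder_schema_body is hypothetical, and the property flags simply mirror the REST example.

```python
import json


def build_folder_schema_body(display_name: str, property_names: list) -> dict:
    """Return a documentSchemas request body for a folder schema."""
    return {
        "display_name": display_name,
        "property_definitions": [
            {
                "name": name,
                "display_name": name,
                "is_repeatable": False,
                "is_filterable": True,
                "is_searchable": True,
                "is_metadata": True,
                "is_required": True,
                "text_type_options": {},
            }
            for name in property_names
        ],
        # This flag marks documents created from the schema as folders.
        "document_is_folder": True,
    }


schema_body = build_folder_schema_body("abc", ["Name"])
print(json.dumps(schema_body, indent=2))
```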

Add a document to a Document AI Warehouse folder

To add a document to a folder, create a document link whose source is the folder and whose target is the document:

REST

Request:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" -d '{
"document_link": {
  "source_document_reference": {
    "document_name": "projects/PROJECT_NUMBER/locations/LOCATION/documents/FOLDER"
  },
  "target_document_reference": {
    "document_name": "projects/PROJECT_NUMBER/locations/LOCATION/documents/DOCUMENT"
  }
},
"requestMetadata": {
  "userInfo": {
    "id": "user:USER_EMAIL_ID"
  }
}
}' \
"https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/documents/FOLDER/documentLinks"

Response:

{
  "name": "LINK_NAME",
  "source_document_reference": {
   "document_name": "FOLDER_NAME"
  },
  "target_document_reference": {
   "document_name": "DOCUMENT_NAME"
  }
}

Python

from google.cloud import contentwarehouse_v1

def add_to_folder(folder: str, doc: str):
    """Link the document `doc` into `folder`; both are full document resource names."""
    # Create a client
    client = contentwarehouse_v1.DocumentLinkServiceClient()

    # Initialize request argument(s)
    link = contentwarehouse_v1.DocumentLink()
    link.source_document_reference = contentwarehouse_v1.DocumentReference()
    link.source_document_reference.document_name = folder
    link.target_document_reference = contentwarehouse_v1.DocumentReference()
    link.target_document_reference.document_name = doc

    request = contentwarehouse_v1.CreateDocumentLinkRequest(
        parent=folder,
        document_link=link,
    )

    # Make the request
    return client.create_document_link(request=request)
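
Because a document can be attached to multiple folders, links are often created in a loop. The following sketch builds the documentLinks URL and request body for each folder, mirroring the REST request above. The helper names and the IDs (123, us, folder-a, folder-b, doc-1, user@example.com) are hypothetical placeholders, and nothing is sent over the network.

```python
API_ROOT = "https://contentwarehouse.googleapis.com/v1"


def document_name(project_number: str, location: str, document_id: str) -> str:
    """Build a full document (or folder) resource name."""
    return f"projects/{project_number}/locations/{location}/documents/{document_id}"


def build_link_request(folder_name: str, doc_name: str, user_email: str) -> tuple:
    """Return the (url, body) pair for creating one folder-to-document link."""
    url = f"{API_ROOT}/{folder_name}/documentLinks"
    body = {
        "document_link": {
            # The folder is the source of the link; the document is the target.
            "source_document_reference": {"document_name": folder_name},
            "target_document_reference": {"document_name": doc_name},
        },
        "requestMetadata": {"userInfo": {"id": f"user:{user_email}"}},
    }
    return url, body


# Attach one document to two folders (all IDs are made up for illustration).
doc = document_name("123", "us", "doc-1")
folders = [document_name("123", "us", fid) for fid in ("folder-a", "folder-b")]
requests_to_send = [build_link_request(f, doc, "user@example.com") for f in folders]

for url, body in requests_to_send:
    print(url)
```

Each (url, body) pair can then be posted with the same headers as the curl example, or the folder and document names can be passed to add_to_folder.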

List child documents under a Document AI Warehouse folder

To list immediate child documents under a folder, do the following:

REST

Request:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/documents/FOLDER/linkedTargets"

Response:

{
  "documentLinks": [
    {
      "name": "LINK_NAME1",
      "source_document_reference": {
        "document_name": "FOLDER_NAME"
      },
      "target_document_reference": {
        "document_name": "DOCUMENT_NAME1"
      }
    },
    {
      "name": "LINK_NAME2",
      "source_document_reference": {
        "document_name": "FOLDER_NAME"
      },
      "target_document_reference": {
        "document_name": "DOCUMENT_NAME2"
      }
    }
    ...
  ]
}

Python

from google.cloud import contentwarehouse_v1

def list_sub_docs(folder: str):
    """List the links from `folder` to its immediate child documents."""
    # Create a client
    client = contentwarehouse_v1.DocumentLinkServiceClient()

    # Initialize request argument(s)
    request = contentwarehouse_v1.ListLinkedTargetsRequest(
        parent=folder,
    )

    # Make the request
    return client.list_linked_targets(request=request)
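
The response is a list of link objects, so getting the child document names means reading the target reference of each link. A minimal sketch over a parsed JSON response follows. It assumes the response has the shape shown in the REST example, accepts either snake_case or camelCase keys as a precaution, and uses hypothetical helper names.

```python
def _field(obj: dict, snake: str, camel: str):
    """Read a field that may be serialized in snake_case or camelCase."""
    return obj.get(snake, obj.get(camel))


def child_document_names(response: dict) -> list:
    """Return the target document name of every link in the response."""
    links = _field(response, "document_links", "documentLinks") or []
    names = []
    for link in links:
        target = _field(
            link, "target_document_reference", "targetDocumentReference"
        ) or {}
        name = _field(target, "document_name", "documentName")
        if name:
            names.append(name)
    return names


# Sample data shaped like the REST response, using the same placeholder names.
sample_response = {
    "documentLinks": [
        {
            "name": "LINK_NAME1",
            "source_document_reference": {"document_name": "FOLDER_NAME"},
            "target_document_reference": {"document_name": "DOCUMENT_NAME1"},
        },
        {
            "name": "LINK_NAME2",
            "source_document_reference": {"document_name": "FOLDER_NAME"},
            "target_document_reference": {"document_name": "DOCUMENT_NAME2"},
        },
    ]
}

children = child_document_names(sample_response)
print(children)
```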

Remove a document from a Document AI Warehouse folder

To remove a document from a folder, you need the link name. You can retrieve the link name using the documents.linkedTargets method in the preceding step.

REST

Request:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/documents/FOLDER/documentLinks/LINK:delete"

Python

from google.cloud import contentwarehouse_v1

def remove_doc_from_folder(link: str):
    """Delete the folder-to-document link identified by its full resource name `link`."""
    # Create a client
    client = contentwarehouse_v1.DocumentLinkServiceClient()

    # Initialize request argument(s)
    request = contentwarehouse_v1.DeleteDocumentLinkRequest(
        name=link,
    )

    # Make the request
    return client.delete_document_link(request=request)
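
Finding the link to delete can be sketched as a lookup over the linkedTargets response, followed by building the :delete URL used in the REST request above. The helper names are hypothetical, and the sample response and resource names are made-up placeholders shaped like the examples on this page.

```python
from typing import Optional

API_ROOT = "https://contentwarehouse.googleapis.com/v1"


def _field(obj: dict, snake: str, camel: str):
    """Read a field that may be serialized in snake_case or camelCase."""
    return obj.get(snake, obj.get(camel))


def find_link_name(response: dict, doc_name: str) -> Optional[str]:
    """Return the name of the link whose target is doc_name, or None."""
    for link in _field(response, "document_links", "documentLinks") or []:
        target = _field(
            link, "target_document_reference", "targetDocumentReference"
        ) or {}
        if _field(target, "document_name", "documentName") == doc_name:
            return link.get("name")
    return None


def delete_url(link_name: str) -> str:
    """Build the REST URL that deletes the given document link."""
    return f"{API_ROOT}/{link_name}:delete"


folder = "projects/123/locations/us/documents/folder-a"
listing = {
    "documentLinks": [
        {
            "name": f"{folder}/documentLinks/link-1",
            "source_document_reference": {"document_name": folder},
            "target_document_reference": {
                "document_name": "projects/123/locations/us/documents/doc-1"
            },
        }
    ]
}

link_name = find_link_name(listing, "projects/123/locations/us/documents/doc-1")
if link_name is not None:
    print(delete_url(link_name))
```

The link name found this way can also be passed directly to remove_doc_from_folder.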