Before you start
To ingest sample documents into the Document AI Warehouse, see the Quickstart Guide.
GenAI search
GenAI search retrieves the top-k documents most relevant to a search query, which can be either keywords or a natural-language question. It returns pinpointed answers from a body of customer-uploaded documents and sorts the search results by relevance.
The caller sets k by specifying it in the qaSizeLimit field of the search request. Large language models determine how relevant each document is to the search query.
What data gets searched?
- The document's plain_text.
- For an imported Document AI object, the embedded cloud_ai_document.text.
Filtering, pagination, histogramming, custom synonyms, and document-level, fine-grained access control are not supported.
Make a search request call
To call the search service, you must use a search request, which is defined as follows:
{
  "documentQuery": {
    object (DocumentQuery)
  },
  "qaSizeLimit": integer
}
The parent field must be filled in with the format:
/projects/PROJECT_ID/locations/LOCATION
The qaSizeLimit field is required for GenAI search.
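As a rough illustration of the request shape described above, the following Python sketch assembles the parent string and the request body. The function name build_genai_search_request and the sample values are hypothetical, the check that qaSizeLimit is positive is an assumption rather than a documented rule, and isNlQuery is set to true because the Document Query section requires it for GenAI search. The sketch only builds the payload; it does not call the service.

```python
import json


def build_genai_search_request(project_id, location, query, qa_size_limit):
    """Return (parent, body) for a GenAI search request (illustrative helper)."""
    # Assumption: k must be a positive integer; the guide only says it is required.
    if not isinstance(qa_size_limit, int) or qa_size_limit < 1:
        raise ValueError("qaSizeLimit must be a positive integer")

    # Parent format exactly as given in this guide.
    parent = f"/projects/{project_id}/locations/{location}"

    body = {
        "documentQuery": {
            "query": query,
            "isNlQuery": True,  # required for GenAI search
        },
        "qaSizeLimit": qa_size_limit,  # the k in "top-k"
    }
    return parent, body


# Hypothetical sample values, for illustration only.
parent, body = build_genai_search_request(
    "my-project", "us", "What is the invoice total?", 3
)
print(parent)
print(json.dumps(body, indent=2))
```

Keeping the payload construction in one small function makes it easy to validate the required fields before the request is sent by whatever client you use.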
Response to a search request
The search response is defined as follows:
{
  "matchingDocuments": [
    {
      object (MatchingDocument)
    }
  ],
  "metadata": {
    object (ResponseMetadata)
  }
}
Document Query
The documentQuery field is defined as follows:
{
  "query": string,
  "isNlQuery": boolean
}
The query field holds the requesting user's search query, which can be either keywords or a natural-language question. Typically, it comes from the search field in the UI. The isNlQuery field must be set to true for GenAI search.
Matching document
A matching document looks like this:
{
  "document": {
    object (Document)
  },
  "searchTextSnippet": string,
  "qaResult": {
    object (QAResult)
  }
}
The searchTextSnippet field contains a snippet that answers the user's natural-language query. The snippet contains no HTML bold tags; the highlights in the answer snippet are given in QAResult.highlights instead. Note: See the full reference for MatchingDocument.
GenAI search result
The qaResult field of a matching document holds the GenAI search result information:
{
  "highlights": {
    object (Highlight)
  },
  "confidence_score": float
}
Highlight
This is a text span in the search-text snippet that represents a highlighted section, such as answer context or a highly relevant sentence.
{
  "start_index": integer,
  "end_index": integer
}
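To show how the highlight spans relate to the snippet, here is a minimal Python sketch that slices searchTextSnippet using each highlight's start_index and end_index. It assumes the indexes are character offsets into the snippet with an exclusive end, which this guide does not state, and it accepts highlights as either a single object (as in the schema above) or a list. The function name highlighted_spans and the sample data are made up for illustration.

```python
def highlighted_spans(matching_document):
    """Return the highlighted substrings of a matching document's snippet."""
    snippet = matching_document.get("searchTextSnippet", "")
    highlights = matching_document.get("qaResult", {}).get("highlights", [])

    # The schema above shows one object; tolerate a list of objects as well.
    if isinstance(highlights, dict):
        highlights = [highlights]

    spans = []
    for highlight in highlights:
        start = highlight.get("start_index", 0)
        end = highlight.get("end_index", len(snippet))
        # Assumption: character offsets, end index exclusive.
        spans.append(snippet[start:end])
    return spans


# Made-up matching document, for illustration only.
sample = {
    "searchTextSnippet": "The invoice total is 42 USD, due in May.",
    "qaResult": {
        "highlights": [{"start_index": 21, "end_index": 27}],
        "confidence_score": 0.9,
    },
}
print(highlighted_spans(sample))
```

If the service turns out to use inclusive end offsets or byte offsets, only the slicing line needs to change.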
Questions and answers from a set of documents
To generate an answer using GenAI, you must use a search request with documentNameFilter, which is defined as follows:
{
  "documentQuery": {
    "query": "QUERY",
    "isNlQuery": true,
    "documentNameFilter": [
      "projects/PROJECT_NUMBER/locations/LOCATION/documents/DOCUMENT_ID_1",
      "projects/PROJECT_NUMBER/locations/LOCATION/documents/DOCUMENT_ID_2"
    ]
  },
  "qaSizeLimit": integer
}
Avoid adding other filters to documentQuery because the other filters are not yet functional.
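The question-answering request above can be sketched in Python as follows. The helpers document_name and build_qa_request are hypothetical names, the sample project number and document IDs are placeholders, and the sketch deliberately sets no filter other than documentNameFilter, in line with the advice above. It only builds the request body and does not call the service.

```python
def document_name(project_number, location, document_id):
    """Build a document resource name in the format used by documentNameFilter."""
    return f"projects/{project_number}/locations/{location}/documents/{document_id}"


def build_qa_request(query, document_names, qa_size_limit):
    """Build a GenAI question-answering request restricted to given documents."""
    names = list(document_names)
    # Assumption: an empty filter would not restrict the search, so reject it here.
    if not names:
        raise ValueError("at least one document name is required")
    return {
        "documentQuery": {
            "query": query,
            "isNlQuery": True,  # boolean, not the string "true"
            "documentNameFilter": names,
            # No other filters are set: they are not yet functional.
        },
        "qaSizeLimit": qa_size_limit,
    }


# Placeholder identifiers, for illustration only.
names = [document_name("123456", "us", doc_id) for doc_id in ("doc-1", "doc-2")]
qa_request = build_qa_request("Who signed the contract?", names, 2)
print(qa_request)
```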
If an answer can be found within the given set of documents, the answer is stored in the questionAnswer field.
{
  "document": {
    object (Document)
  },
  "questionAnswer": "QUESTION_ANSWER"
}
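Reading the answer back could look like the Python sketch below. It assumes that the object shown above is an entry of the matchingDocuments list in the search response and that questionAnswer is absent or empty when no answer was found; neither detail is spelled out in this guide. The function name first_answer and the sample response are invented for illustration.

```python
from typing import Optional


def first_answer(search_response: dict) -> Optional[str]:
    """Return the first non-empty questionAnswer in the response, or None."""
    # Assumption: answers are attached to entries of matchingDocuments.
    for match in search_response.get("matchingDocuments", []):
        answer = match.get("questionAnswer")
        if answer:
            return answer
    return None


# Invented response, for illustration only.
sample_response = {
    "matchingDocuments": [
        {"document": {"name": "projects/123456/locations/us/documents/doc-1"}},
        {
            "document": {"name": "projects/123456/locations/us/documents/doc-2"},
            "questionAnswer": "The contract was signed by the vendor.",
        },
    ]
}
print(first_answer(sample_response))
```

Returning None rather than an empty string lets the caller distinguish "no answer found" from a real answer.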
Next steps
Proceed to the GenAI quickstart guide to understand and run GenAI in Document AI Warehouse.
Proceed to the GenAI search guide to learn how to manage GenAI searches.