Import from Spanner (Public Preview)

To ingest data from Spanner, create a data store and then import the data by using either the Google Cloud console or the API.

Set up Spanner access from a different project

If your Spanner data is in the same project as Agentspace, skip to Import data from Spanner.

To give Agentspace access to Spanner data that is in a different project, follow these steps:

In the following identifier, replace PROJECT_NUMBER with your Agentspace project number, and then copy the result. This is your Agentspace service account identifier:
```
service-PROJECT_NUMBER@gcp-sa-discoveryengine.iam.gserviceaccount.com
```
Go to the IAM & Admin page.

Switch to your Spanner project on the IAM & Admin page and click Grant Access.
For New principals, enter the identifier for the service account and select one of the following:
- If you won't use Data Boost during import, select the Cloud Spanner > Cloud Spanner Database Reader role.
- If you plan to use Data Boost during import, select the Cloud Spanner > Cloud Spanner Database Admin role, or a custom role that has the permissions of Cloud Spanner Database Reader plus spanner.databases.useDataBoost. For information about Data Boost, see Data Boost overview in the Spanner documentation.
Click Save.
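
For readers who prefer the command line, the console steps above can be approximated with gcloud. The following is a minimal sketch, not the documented procedure: the project number and project ID are hypothetical example values, and `roles/spanner.databaseReader` is assumed to be the role ID behind Cloud Spanner Database Reader. The sketch only prints the command so that you can review it before running it.

```shell
# Hypothetical example values; replace them with your own.
AGENTSPACE_PROJECT_NUMBER="123456789012"
SPANNER_PROJECT_ID="my-spanner-project"

# Service account identifier, following the pattern shown earlier.
SERVICE_ACCOUNT="service-${AGENTSPACE_PROJECT_NUMBER}@gcp-sa-discoveryengine.iam.gserviceaccount.com"

# Assumed role ID for Cloud Spanner Database Reader (import without Data Boost).
ROLE="roles/spanner.databaseReader"

# Build the command as a string and print it for review before running it.
GRANT_COMMAND="gcloud projects add-iam-policy-binding ${SPANNER_PROJECT_ID} --member=serviceAccount:${SERVICE_ACCOUNT} --role=${ROLE}"
echo "${GRANT_COMMAND}"
```

If you plan to use Data Boost, the same pattern would apply with the role that you selected for that case, for example `roles/spanner.databaseAdmin` or your custom role.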

Next, go to Import data from Spanner.

Import data from Spanner

Console

To use the console to ingest data from Spanner, follow these steps:

In the Google Cloud console, go to the Agentspace page.

Go to the Data Stores page.
Click New data store.
On the Source page, select Cloud Spanner.
Specify the project ID, instance ID, database ID, and table ID of the data that you plan to import.
Select whether to turn on Data Boost. For information about Data Boost, see Data Boost overview in the Spanner documentation.
Click Continue.
Choose a region for your data store.
Enter a name for your data store.
Click Create.
To check the status of your ingestion, go to the Data Stores page and click your data store name to see details about it on its Data page. When the status column on the Activity tab changes from In progress to Import completed, the ingestion is complete.

Depending on the size of your data, ingestion can take several minutes or several hours.

REST

To use the command line to create a data store and ingest data from Spanner, follow these steps:

Create a data store.

```
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-H "X-Goog-User-Project: PROJECT_ID" \
"https://discoveryengine.googleapis.com/v1alpha/projects/PROJECT_ID/locations/global/collections/default_collection/dataStores?dataStoreId=DATA_STORE_ID" \
-d '{
  "displayName": "DISPLAY_NAME",
  "industryVertical": "GENERIC",
  "solutionTypes": ["SOLUTION_TYPE_SEARCH"],
  "contentConfig": "CONTENT_REQUIRED"
}'
```

Replace the following:

- PROJECT_ID: the ID of your Agentspace project.
- DATA_STORE_ID: the ID of the data store. The ID can contain only lowercase letters, digits, underscores, and hyphens.
- DISPLAY_NAME: the display name of the data store. This might be displayed in the Google Cloud console.
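
Because the data store ID is restricted to lowercase letters, digits, underscores, and hyphens, it can help to check a candidate ID locally before sending the request. The following is a minimal sketch of such a check; the function name `is_valid_data_store_id` is hypothetical, and the check covers only the character rule stated above, not any other constraint the API might enforce.

```shell
# Returns 0 if the ID is non-empty and contains only lowercase letters,
# digits, underscores, and hyphens; returns 1 otherwise.
# The allowed characters are listed explicitly to avoid locale-dependent ranges.
is_valid_data_store_id() {
  case "$1" in
    "") return 1 ;;
    *[!abcdefghijklmnopqrstuvwxyz0123456789_-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Example usage with a hypothetical ID.
if is_valid_data_store_id "spanner-store_01"; then
  echo "ID is valid"
else
  echo "ID contains unsupported characters" >&2
fi
```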

Import data from Spanner.
```
  curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://discoveryengine.googleapis.com/v1/projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATA_STORE_ID/branches/0/documents:import" \
  -d '{
    "cloudSpannerSource": {
      "projectId": "SPANNER_PROJECT_ID",
      "instanceId": "INSTANCE_ID",
      "databaseId": "DATABASE_ID",
      "tableId": "TABLE_ID",
      "enableDataBoost": "DATA_BOOST_BOOLEAN"
    },
    "reconciliationMode": "RECONCILIATION_MODE",
    "autoGenerateIds": "AUTO_GENERATE_IDS",
    "idField": "ID_FIELD"
  }'
```
Replace the following:
- PROJECT_ID: the ID of your Agentspace project.
- DATA_STORE_ID: the ID of the data store.
- SPANNER_PROJECT_ID: the ID of your Spanner project.
- INSTANCE_ID: the ID of your Spanner instance.
- DATABASE_ID: the ID of your Spanner database.
- TABLE_ID: the ID of your Spanner table.
- DATA_BOOST_BOOLEAN: optional. A boolean (true or false) that specifies whether to turn on Data Boost. For information about Data Boost, see Data Boost overview in the Spanner documentation.
- RECONCILIATION_MODE: optional. Values are FULL and INCREMENTAL. Default is INCREMENTAL. Specifying INCREMENTAL causes an incremental refresh of data from Spanner to your data store. This is an upsert operation: it adds new documents and replaces existing documents with updated documents that have the same ID. Specifying FULL causes a full rebase of the documents in your data store: new and updated documents are added to your data store, and documents that are not in Spanner are removed from your data store. The FULL mode is helpful if you want to automatically delete documents that you no longer need.
- AUTO_GENERATE_IDS: optional. A boolean (true or false) that specifies whether to automatically generate document IDs. If set to true, document IDs are generated based on a hash of the payload. Generated document IDs might not remain consistent across multiple imports. If you auto-generate IDs over multiple imports, Google highly recommends setting reconciliationMode to FULL to maintain consistent document IDs.
- ID_FIELD: optional. Specifies which field is used as the document ID. Set this field only when autoGenerateIds is unset or false.
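
To make the placeholders concrete, the following sketch assembles a request body from shell variables. All values are hypothetical examples, including the `doc_id` column name. The sketch writes `enableDataBoost` as a JSON boolean and leaves out `autoGenerateIds` because an ID field is supplied; treat both choices as assumptions to adapt, not as the required form.

```shell
# Hypothetical example values; replace them with your own.
SPANNER_PROJECT_ID="my-spanner-project"
INSTANCE_ID="my-instance"
DATABASE_ID="my-database"
TABLE_ID="my_table"
ENABLE_DATA_BOOST="false"          # JSON boolean: true or false
RECONCILIATION_MODE="INCREMENTAL"  # INCREMENTAL or FULL
ID_FIELD="doc_id"                  # hypothetical column that holds document IDs

# Reject an unsupported reconciliation mode before building the body.
case "${RECONCILIATION_MODE}" in
  FULL|INCREMENTAL) ;;
  *) echo "reconciliationMode must be FULL or INCREMENTAL" >&2; exit 1 ;;
esac

# Assemble the JSON body. autoGenerateIds is omitted because idField is set.
REQUEST_BODY=$(cat <<EOF
{
  "cloudSpannerSource": {
    "projectId": "${SPANNER_PROJECT_ID}",
    "instanceId": "${INSTANCE_ID}",
    "databaseId": "${DATABASE_ID}",
    "tableId": "${TABLE_ID}",
    "enableDataBoost": ${ENABLE_DATA_BOOST}
  },
  "reconciliationMode": "${RECONCILIATION_MODE}",
  "idField": "${ID_FIELD}"
}
EOF
)

# Print the body for review.
echo "${REQUEST_BODY}"
```

The resulting body could then be passed to the import request with `-d "${REQUEST_BODY}"` in place of the inline JSON.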

Next steps

To attach your data store to an app, create an app and select your data store following the steps in Create a search app.
To preview how your search results appear after your app and data store are set up, see Preview search results.