This document describes how you migrate from streaming data replication to CDC replication through Pub/Sub when using the BigQuery Connector for SAP.
If you're an existing BigQuery Connector for SAP user and want your BigQuery table to directly reflect changes from the source SAP tables, then you can use CDC replication through Pub/Sub. When your data changes in SAP, the BigQuery Connector for SAP captures the changes, and sends them through Pub/Sub directly to your BigQuery table, applying the changes without duplicates.
Before you begin
Before you begin, learn about the Pub/Sub service and its terminology.
Make sure that you or your administrators have completed the following prerequisites:
- Update the BigQuery Connector for SAP to version 2.9 or later.
- Enable the Google Cloud APIs.
- Set up authentication.
- Create a dead-letter topic.
Update the BigQuery Connector for SAP
To use CDC replication through Pub/Sub, update your BigQuery Connector for SAP to version 2.9 or later. For information about how to update your connector, see Update the BigQuery Connector for SAP.
Enable the Google Cloud APIs
For CDC replication through Pub/Sub, make sure the following APIs are enabled:
- Pub/Sub API
- BigQuery API
- IAM Service Account Credentials API
For information about how to enable Google Cloud APIs, see Enabling APIs.
Set up authentication
To use CDC replication through Pub/Sub, in addition to the existing BigQuery roles, you need to grant the Pub/Sub Editor role to the service account used for authorization. Make sure that the service account has the following roles:
- Pub/Sub Editor
- BigQuery Data Editor
- BigQuery Job User
For information about how to grant a role, see Grant a single role.
For information about the supported authentication methods and how to set up authentication, see Set up authentication.
Create a dead-letter topic
To handle messages that a BigQuery subscription fails to write to the target BigQuery table, we recommend that you create a dead-letter topic. This dead-letter topic serves as a separate queue where Pub/Sub automatically publishes undeliverable messages after all retries.
Migration steps
Follow these high-level migration steps:
- Stop the SAP LT Replication Server replication configuration.
- Prepare the deduplicated CDC table.
- Initiate migration through the BigQuery Connector for SAP.
- Configure the replication settings.
- Activate the SAP LT Replication Server replication configuration.
- Check replication in BigQuery.
Stop the SAP LT Replication Server replication configuration
In the SAP GUI, enter the transaction code
LTRC
.Stop the SAP LT Replication Server replication configuration that you want to migrate.
Prepare the deduplicated CDC table
In BigQuery, you need to create a deduplicated CDC table from your existing BigQuery staging table. When you use CDC replication through Pub/Sub, this CDC table becomes the external table in BigQuery for your SAP data.
To prepare a deduplicated CDC table, do the following:
In the Google Cloud console, go to the BigQuery page.
Click Compose new query.
Create a CDC table from your existing BigQuery staging table:
In the Query editor textarea, to copy your existing BigQuery staging table structure, enter the following query:
CREATE TABLE CDC_TABLE LIKE STAGING_TABLE
Replace the following:
CDC_TABLE
: A name for your BigQuery CDC table.STAGING_TABLE
: The name of the BigQuery staging table.
Click Run.
This step creates a CDC table structure without primary keys.
Modify the CDC table definition to add primary keys:
In the Query editor textarea, to add primary keys, enter the following query:
ALTER TABLE CDC_TABLE ADD PRIMARY KEY(PRIMARY_KEYS) NOT INFORCED
Replace the following:
CDC_TABLE
: The name of the CDC table.PRIMARY_KEYS
: The primary keys as a comma-separated list. You can retrieve the list of primary keys from the source SAP table.
Click Run.
Deduplicate the CDC table:
In the Query editor textarea, to deduplicate the CDC table, enter the following query:
INSERT INTO CDC_TABLE (COLUMN_NAMES ) SELECT COLUMN_NAMES FROM STAGING_TABLE WHERE is_deleted = false QUALIFY ROW_NUMBER() OVER( PARTITION BY PRIMARY_KEYS ORDER BY recordstamp DESC ) = 1
Replace the following:
CDC_TABLE
: The name of the CDC table.COLUMN_NAMES
: The column names as a comma-separated list. You can retrieve the list of column names from the SAP source system. Alternatively, use an asterisk (*) to select all columns.PRIMARY_KEYS
: The primary keys of the CDC table.STAGING_TABLE
: The name of the staging table.
Click Run.
Alternatively, you can use the Copy option in BigQuery Studio to copy the source staging table structure and create a CDC table.
Initiate migration through the BigQuery Connector for SAP
- In the SAP GUI, enter the
/GOOG/SLT_SETTINGS
transaction preceded by/n
:/n/GOOG/SLT_SETTINGS
- From the drop-down menu in the Google Cloud Partner field, select BigQuery.
- From the Settings Table drop-down menu in the launch screen for the
/GOOG/SLT_SETTINGS
transaction, select Mass Transfers. - In the Mass Transfer Key field, specify the mass transfer ID of the SAP LT Replication Server replication configuration that you want to migrate.
- Click the Execute icon.
- In the BigQuery Settings Maintenance - Mass Transfers screen, click Migrate.
- In the migration confirmation dialog, click Yes.
Configure the replication settings
- In the SAP GUI, enter the
/GOOG/SLT_SETTINGS
transaction preceded by/n
:/n/GOOG/SLT_SETTINGS
- From the drop-down menu in the Google Cloud Partner field, select BigQuery CDC via Pub/Sub.
- From the Settings Table drop-down menu in the launch screen for the
/GOOG/SLT_SETTINGS
transaction, select Mass Transfers. - In the Mass Transfer Key field, specify the mass transfer ID of the SAP LT Replication Server replication configuration that you migrated in the section Initiate migration through the BigQuery Connector for SAP.
- Click the Execute icon. The BigQuery Settings Maintenance - Mass Transfers screen displays.
- Verify that the migrated SAP LT Replication Server replication configuration is displayed.
For the SAP LT Replication Server replication configuration, specify the following table attributes:
In the External Table Name field, enter the name of the CDC table that you created in the preceding step Prepare the deduplicated CDC table.
To enable CDC replication to BigQuery through Pub/Sub, select the Repl CPS checkbox.
In the Pub/Sub Schema field, enter a name for the Pub/Sub schema. The BigQuery Connector for SAP automatically creates or updates an existing Avro schema to match the SAP table definition.
In the Pub/Sub Topic field, enter a name for the Pub/Sub topic where messages are published. The BigQuery Connector for SAP automatically creates a topic.
In the Pub/Sub Subscription field, enter a name for the BigQuery subscription that consumes messages from the Pub/Sub topic and writes to BigQuery. The BigQuery Connector for SAP automatically creates a subscription and links it to the topic specified in the preceding step.
In the Dead Letter Queue field, enter the name of the Pub/Sub dead-letter topic. This dead-letter topic receives messages that Pub/Sub fails to write to BigQuery through the BigQuery subscription.
To enable the Change Data Capture (CDC) functionality for the table, select the Act. CDC (Activate CDC) checkbox.
To optimize performance for initial loads, select the Cache Val (Cache Validation) checkbox.
When you select this field, the connector provides maximum performance for large data transfers. The system runs Google Cloud pipeline validations at intervals and caches the results. The connector provides rapid data transformation and sends data directly to Pub/Sub. For more information, see Cache Validations.
Activate the SAP LT Replication Server replication configuration
In the SAP GUI, enter the transaction code
LTRC
.Activate the configuration.
Activating the configuration creates all necessary Pub/Sub related resources and initiates replication. The configured CDC table reflects all new and unprocessed changes.
Check replication in BigQuery
In BigQuery, verify that the values in the CDC table are the same as your source SAP table. For more information about how to do it, see Validate the replication to BigQuery.
Troubleshoot
For information about diagnosing and resolving issues that you might encounter when you migrate replication, see BigQuery Connector for SAP troubleshooting guide.
Get support
If you need help resolving problems with the migration steps, then collect all available diagnostic information and contact Cloud Customer Care. For information about contacting Customer Care, see Getting support for SAP on Google Cloud.