Replicating data from SAP S/4HANA to BigQuery through SAP Datasphere

This document provides an overview of how you can replicate data from SAP S/4HANA to BigQuery through SAP Datasphere. SAP Datasphere offers data transformation capabilities and ensures that the data is loaded correctly into BigQuery. It uses the BigQuery Storage Write API to write the data to a BigQuery dataset in near real time.

Replication methods

To read data from the SAP S/4HANA source system, you can use one of the following replication methods:

  • CDS-based replication: This method uses Core Data Services (CDS) views, which are semantically rich and offer predefined models of your business data within SAP S/4HANA.

  • SLT-based replication: This method directly replicates underlying database tables from your SAP S/4HANA system, typically using SAP Landscape Transformation (SLT) as the replication technology. The replicated tables can then be used as a data foundation for Google Cloud Cortex Framework.

Key differences and considerations

The following table summarizes the key differences between the two replication methods:

| Feature | CDS-based replication | SLT-based replication |
| --- | --- | --- |
| Data representation | Provides business-oriented data views, often combining data from multiple tables into meaningful structures. | Provides direct access to raw table structures. If you require granular control over data or need to replicate custom tables, then this replication method is appropriate for your needs. |
| Ease of use | If you're using the pre-delivered CDS views in SAP Datasphere, then setting up the replication is more straightforward. | Might require more technical configuration, especially for the initial setup of SLT and for handling table dependencies. |
| Flexibility | Less flexible if you need to replicate custom tables or tables that aren't exposed through standard CDS views. | Offers greater flexibility to replicate any table, including custom-developed ones. |
| Performance | Causes extra CPU consumption on the production SAP database due to business logic and delta calculations. Less data is transferred due to possible pre-aggregations in the CDS view logic. | Replicates entire tables without pre-aggregations, so more data is transferred and the method is network-intensive. However, it consumes fewer database CPU resources. |
| Cortex Framework compatibility | Not compatible with pre-delivered technical accelerators. | Fully compatible with pre-delivered technical accelerators. |
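
To make the data representation and performance trade-offs more concrete, the following minimal Python sketch contrasts the two shapes of data by using toy in-memory rows. The table names, field names, and values are illustrative assumptions. The join and aggregation only stand in for the kind of logic that a CDS view might apply; they don't reflect how SAP Datasphere or SLT is implemented.

```python
from collections import defaultdict

# Toy stand-ins for two raw tables, as table-level replication would deliver them.
# Table names, field names, and values are illustrative assumptions.
sales_headers = [
    {"order_id": "1001", "customer": "C-01"},
    {"order_id": "1002", "customer": "C-02"},
]
sales_items = [
    {"order_id": "1001", "item": 10, "net_value": 50.0},
    {"order_id": "1001", "item": 20, "net_value": 25.0},
    {"order_id": "1002", "item": 10, "net_value": 40.0},
]


def raw_table_rows(headers, items):
    """Table-level replication: every row of every table is transferred as is."""
    return len(headers) + len(items)


def business_view(headers, items):
    """View-style replication: join and pre-aggregate on the source side.

    This costs work at the source but yields fewer, business-shaped rows.
    """
    totals = defaultdict(float)
    for row in items:
        totals[row["order_id"]] += row["net_value"]
    return [
        {
            "order_id": h["order_id"],
            "customer": h["customer"],
            "order_net_value": totals[h["order_id"]],
        }
        for h in headers
    ]


view = business_view(sales_headers, sales_items)
print(raw_table_rows(sales_headers, sales_items))  # 5 rows transferred
print(len(view))                                   # 2 rows transferred
```

In this toy case, the view-style result transfers two rows instead of five, at the price of the join and aggregation being computed at the source. That is the same trade-off that the performance comparison describes.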

Choose your replication method

The best replication method for you depends on your intended use case and on considerations such as the following:

  • Business requirements: If you primarily need standard business data and want a quick setup, then CDS-based replication is likely a good fit. If you require more specialized data or have heavily customized tables, then consider SLT-based replication.

  • Cortex Framework compatibility: If you want to use Cortex Framework, then use SLT-based replication.

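  • Technical expertise: CDS-based replication with pre-delivered CDS views is the more straightforward option to set up. SLT-based replication might require more technical configuration, so it's better suited to teams with SAP Basis or data replication expertise.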

  • Data volume and complexity: For multi-terabyte data volumes or complex table relationships, SLT-based replication might be more scalable.
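
As a rough aid, some of the considerations above can be written as a small decision helper. The following Python sketch is a simplification under stated assumptions: the `Requirements` fields and the `suggest_method` function are hypothetical names, the order of the checks is an assumption, and the data volume rule is only a hint because SLT-based replication "might be" more scalable rather than always being so. The sketch leaves out the technical expertise factor.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Inputs for the sketch; all field names are hypothetical."""
    uses_cortex_framework: bool = False
    needs_custom_tables: bool = False
    large_or_complex_data: bool = False


def suggest_method(req: Requirements) -> tuple[str, str]:
    """Return (method, reason) based on the considerations listed above."""
    # Checked first because it is the only hard constraint in the list.
    if req.uses_cortex_framework:
        return ("SLT-based", "Cortex Framework compatibility")
    if req.needs_custom_tables:
        return ("SLT-based", "specialized data or customized tables")
    # The guidance only says SLT "might be" more scalable; treat this as a hint.
    if req.large_or_complex_data:
        return ("SLT-based", "data volume and complexity")
    return ("CDS-based", "standard business data and quick setup")


print(suggest_method(Requirements()))
print(suggest_method(Requirements(uses_cortex_framework=True)))
```

The helper returns a suggestion together with the consideration that triggered it, so you can use the result as a starting point for discussion rather than as a final decision.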

What's next