Managed I/O supports the following capabilities for Apache Iceberg:
Catalog | Batch read | Batch write | Streaming write | Dynamic table creation | Dynamic destinations |
---|---|---|---|---|---|
Hadoop | Supported | Supported | Supported | Supported | Supported |
Hive | Supported | Supported | Supported | Supported | Supported |
REST-based catalogs | Supported | Supported | Supported | Supported | Supported |
BigQuery metastore | Supported | Supported | Supported | Supported | Supported |
For BigQuery tables for Apache Iceberg, use the BigQueryIO connector with BigQuery Storage API. The table must already exist; dynamic table creation is not supported.
Requirements
Managed I/O for Apache Iceberg requires the Apache Beam SDK for Java, version 2.58.0 or later.
Configuration
Managed I/O uses the following configuration parameters for Apache Iceberg:
Read and write configuration | Data type | Description |
---|---|---|
table | string | The identifier of the Apache Iceberg table. Example: "db.table1". |
catalog_name | string | The name of the catalog. Example: "local". |
catalog_properties | map | A map of configuration properties for the Apache Iceberg catalog. The required properties depend on the catalog. For more information, see CatalogUtil in the Apache Iceberg documentation. |
config_properties | map | An optional set of Hadoop configuration properties. For more information, see CatalogUtil in the Apache Iceberg documentation. |

Write configuration | Data type | Description |
---|---|---|
triggering_frequency_seconds | integer | For streaming write pipelines, the frequency at which the sink attempts to produce snapshots, in seconds. |
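The parameters above are passed to Managed I/O as a plain map. The following is a minimal sketch of how such a map might be assembled in Java, using only the standard library so that it runs without the Beam SDK on the classpath. The class name `IcebergManagedConfig`, the helper methods, the bucket path `gs://example-bucket/warehouse`, and the 60-second frequency are hypothetical values chosen for illustration. The catalog properties assume a Hadoop catalog configured with the Iceberg `type` and `warehouse` keys; other catalogs need different properties.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IcebergManagedConfig {

    // Builds the parameters shared by read and write pipelines.
    static Map<String, Object> buildReadConfig(
            String table, String catalogName, String warehouse) {
        // Assumption: a Hadoop catalog, identified by the Iceberg catalog
        // property keys "type" and "warehouse".
        Map<String, String> catalogProperties = new LinkedHashMap<>();
        catalogProperties.put("type", "hadoop");
        catalogProperties.put("warehouse", warehouse);

        Map<String, Object> config = new LinkedHashMap<>();
        config.put("table", table);
        config.put("catalog_name", catalogName);
        config.put("catalog_properties", catalogProperties);
        return config;
    }

    // Adds the write-only parameter used by streaming write pipelines.
    static Map<String, Object> buildWriteConfig(
            String table, String catalogName, String warehouse,
            int triggeringFrequencySeconds) {
        Map<String, Object> config = buildReadConfig(table, catalogName, warehouse);
        config.put("triggering_frequency_seconds", triggeringFrequencySeconds);
        return config;
    }

    public static void main(String[] args) {
        // Hypothetical values; replace with your own table and warehouse.
        Map<String, Object> writeConfig = buildWriteConfig(
                "db.table1", "local", "gs://example-bucket/warehouse", 60);
        System.out.println(writeConfig);
    }
}
```

In a Beam pipeline, a map like this would typically be supplied through `Managed.write(Managed.ICEBERG).withConfig(writeConfig)`, or through `Managed.read(Managed.ICEBERG).withConfig(...)` for reads. The optional `config_properties` entry is omitted here and can be added as another map-valued key when Hadoop configuration is needed.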
For more information and code examples, see the following topics: