Managed I/O supports reading from and writing to Apache Kafka.
Requirements
Requires Apache Beam SDK for Java version 2.58.0 or later.
Configuration
Managed I/O uses the following configuration parameters for Apache Kafka.
Read and write configuration | Data type | Description |
---|---|---|
bootstrap_servers | string | Required. A comma-separated list of Kafka bootstrap servers. Example: localhost:9092. |
topic | string | Required. The Kafka topic to read or write. |
file_descriptor_path | string | The path to a protocol buffer file descriptor set. Applies only if data_format is "PROTO". |
data_format | string | The format of the messages. Supported values: "AVRO", "JSON", "PROTO", "RAW". The default value is "RAW", which reads or writes the raw bytes of the message payload. |
message_name | string | The name of the protocol buffer message. Required if data_format is "PROTO". |
schema | string | The Kafka message schema. The expected schema type depends on the data format: an Avro schema for "AVRO" or a JSON schema for "JSON". For read pipelines, this parameter is ignored if confluent_schema_registry_url is specified. |
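For example, a minimal read pipeline might pass the required parameters as follows. This is a sketch: the broker address and topic name are placeholders, and obtaining the output through PCollectionRowTuple.empty(...).apply(...).getSinglePCollection() is one possible way to apply the managed transform in the Java SDK, not part of the configuration reference.

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.managed.Managed;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionRowTuple;
import org.apache.beam.sdk.values.Row;

public class KafkaManagedReadSketch {
  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create();

    // Common configuration parameters (placeholder values).
    Map<String, Object> config = new HashMap<>();
    config.put("bootstrap_servers", "localhost:9092"); // required
    config.put("topic", "my-topic");                   // required
    config.put("data_format", "RAW");                  // optional; "RAW" is the default

    // Apply the managed Kafka read and take its single output PCollection of rows.
    PCollection<Row> messages =
        PCollectionRowTuple.empty(pipeline)
            .apply(Managed.read(Managed.KAFKA).withConfig(config))
            .getSinglePCollection();

    pipeline.run().waitUntilFinish();
  }
}
```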
Read configuration | Data type | Description |
---|---|---|
auto_offset_reset_config | string | Specifies the behavior when there is no initial offset or the current offset no longer exists on the Kafka server. The following values are supported: "earliest", which resets to the earliest offset, and "latest", which resets to the latest offset. The default value is "latest". |
confluent_schema_registry_subject | string | The subject of a Confluent schema registry. Required if confluent_schema_registry_url is specified. |
confluent_schema_registry_url | string | The URL of a Confluent schema registry. If specified, the schema parameter is ignored. |
consumer_config_updates | map | Sets configuration parameters for the Kafka consumer. For more information, see Consumer configs in the Kafka documentation. You can use this parameter to customize the Kafka consumer. |
max_read_time_seconds | int | The maximum read time, in seconds. This option produces a bounded PCollection and is mainly intended for testing or other non-production scenarios. |
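To illustrate how the read-only parameters combine with the required ones, the following fragment builds a configuration map for Managed.read(Managed.KAFKA).withConfig(config). The offset policy, time limit, and consumer property values are placeholders chosen for this sketch.

```java
Map<String, Object> config = new HashMap<>();
config.put("bootstrap_servers", "localhost:9092");   // required; placeholder address
config.put("topic", "my-topic");                     // required; placeholder topic

// Read-specific options.
config.put("auto_offset_reset_config", "earliest");  // start from the earliest offset when none is committed
config.put("max_read_time_seconds", 30);             // produces a bounded PCollection; intended for testing
config.put("consumer_config_updates",
    Map.of("max.poll.records", "250"));              // passed through to the Kafka consumer
```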
Write configuration | Data type | Description |
---|---|---|
producer_config_updates | map | Sets configuration parameters for the Kafka producer. For more information, see Producer configs in the Kafka documentation. You can use this parameter to customize the Kafka producer. |
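A write configuration follows the same pattern. In this sketch the broker, topic, and producer properties are placeholders:

```java
Map<String, Object> config = new HashMap<>();
config.put("bootstrap_servers", "localhost:9092");   // required; placeholder address
config.put("topic", "my-topic");                     // required; placeholder topic
config.put("data_format", "RAW");

// Pass additional Kafka producer properties through producer_config_updates.
config.put("producer_config_updates",
    Map.of(
        "compression.type", "gzip",
        "linger.ms", "20"));

// The map is attached to the managed Kafka sink in the same way as for reads.
Managed.ManagedTransform kafkaWrite = Managed.write(Managed.KAFKA).withConfig(config);
```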
To read Avro or JSON messages, you must specify a message schema. To set a schema directly, use the schema parameter. To provide the schema through a Confluent schema registry, set the confluent_schema_registry_url and confluent_schema_registry_subject parameters.
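For example, an Avro read can carry its schema inline or defer to a Confluent schema registry. Both alternatives below are sketches with placeholder values, and they are mutually exclusive for reads because the registry takes precedence over the schema parameter.

```java
// Alternative 1: set the schema directly (placeholder Avro record schema).
Map<String, Object> inlineSchemaConfig = new HashMap<>();
inlineSchemaConfig.put("bootstrap_servers", "localhost:9092");
inlineSchemaConfig.put("topic", "orders");
inlineSchemaConfig.put("data_format", "AVRO");
inlineSchemaConfig.put("schema",
    "{\"type\":\"record\",\"name\":\"Order\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"long\"},"
        + "{\"name\":\"item\",\"type\":\"string\"}]}");

// Alternative 2 (read pipelines): resolve the schema from a Confluent schema registry.
Map<String, Object> registryConfig = new HashMap<>();
registryConfig.put("bootstrap_servers", "localhost:9092");
registryConfig.put("topic", "orders");
registryConfig.put("data_format", "AVRO");
registryConfig.put("confluent_schema_registry_url", "http://localhost:8081");  // placeholder URL
registryConfig.put("confluent_schema_registry_subject", "orders-value");       // placeholder subject
```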
To read or write Protocol Buffer messages, either specify a message schema or set the file_descriptor_path parameter.
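A Protocol Buffer configuration might look like the following sketch; the descriptor set path and message name are placeholders.

```java
Map<String, Object> protoConfig = new HashMap<>();
protoConfig.put("bootstrap_servers", "localhost:9092");      // placeholder address
protoConfig.put("topic", "orders");                          // placeholder topic
protoConfig.put("data_format", "PROTO");
protoConfig.put("message_name", "com.example.OrderEvent");   // required for PROTO; placeholder name
protoConfig.put("file_descriptor_path",
    "gs://my-bucket/descriptors/order_events.pb");           // placeholder descriptor set path
```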
For more information and code examples, see the following topics: