Module batcher (2.28.0)

User-friendly container for the Google Cloud Bigtable MutationsBatcher.

Classes

MutationsBatchError

MutationsBatchError(message, exc)

Error in the batch request

MutationsBatcher

MutationsBatcher(
    table,
    flush_count=100,
    max_row_bytes=20971520,
    flush_interval=1,
    batch_completed_callback=None,
)

A MutationsBatcher is used in batch cases where the number of mutations is large or unknown. It stores DirectRow objects in memory until one of the size limits is reached or flush() is called explicitly. When a flush occurs, the DirectRow objects held in memory are sent to Cloud Bigtable. Batching mutations is more efficient than sending individual requests.

This class is not suited for systems where each mutation must be guaranteed to be sent, since calling mutate() may only result in an in-memory change. If the system crashes, any DirectRow objects remaining in memory are not guaranteed to be sent to the service, even if mutate() has already returned.

Note on thread safety: The same MutationsBatcher cannot be shared by multiple end-user threads.
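The buffering behavior described above can be sketched as follows. This is a minimal illustration, not the library's own sample: the helper name queue_greetings, the column family "cf1", and the row-key scheme are hypothetical, and the table and batcher are assumed to have been created elsewhere (for example with MutationsBatcher(table, flush_count=100, flush_interval=1)).

```python
def queue_greetings(table, batcher, greetings):
    """Queue one DirectRow per greeting, then force a final flush.

    `table` is assumed to be a Bigtable Table and `batcher` a
    MutationsBatcher built for that table. The column family "cf1"
    and the "greetingN" row keys are placeholders for illustration.
    """
    for i, text in enumerate(greetings):
        row = table.direct_row(f"greeting{i}".encode())
        row.set_cell("cf1", b"greeting", text.encode())
        # mutate() may only buffer the row in memory; nothing is
        # guaranteed to reach the service until a flush happens.
        batcher.mutate(row)
    # Send whatever is still buffered before returning.
    batcher.flush()
```

Because mutate() can return before anything is sent, the explicit flush() at the end is what pushes the remaining buffered rows to the service in this sketch.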

Parameters
Name Description
table class

A Table instance (google.cloud.bigtable.table.Table).

flush_count int

(Optional) Max number of rows to buffer before flushing. When the batch reaches this number of rows, finish_batch() is called to mutate the current row batch. Default is FLUSH_COUNT (100 rows, per the signature above).

max_row_bytes int

(Optional) Max total size, in bytes, of row mutations to buffer before flushing. When the batch reaches this size, finish_batch() is called to mutate the current row batch. Default is MAX_ROW_BYTES (20971520 bytes, i.e. 20 MiB, per the signature above).

flush_interval float

(Optional) The interval (in seconds) between asynchronous flushes. Default is 1 second.

batch_completed_callback Callable[list:[google.rpc.status_pb2.Status]] = None

(Optional) A callable for handling responses after the current batch is sent. The callable expects a list of gRPC Status objects.
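A callback of this shape might look like the sketch below. The names failed_mutations and on_batch_completed are hypothetical; the only assumptions about the input are that each item exposes the code and message fields of google.rpc.status_pb2.Status, and that code 0 means OK.

```python
# Hypothetical sink for failures; a real system might log or retry instead.
failed_mutations = []


def on_batch_completed(statuses):
    """Record every non-OK status from a completed batch.

    Each status is assumed to carry `code` (0 means OK) and `message`,
    as google.rpc.status_pb2.Status does.
    """
    for index, status in enumerate(statuses):
        if status.code != 0:
            failed_mutations.append((index, status.code, status.message))
```

Such a function would be passed as batch_completed_callback=on_batch_completed when constructing the batcher.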