When bigtable row mutations run into retryable errors and exhausts all retry budgets, the responses of batched/flushed row mutations are all masked/hidden by the client who later suppresses all retryable errors.
Additionally, the returned response of the client is no longer Status types but the default None value.
To force a retryable error to be raised in normal conditions, manually set a very tiny mutation_timeout (0.1 ms), run below code, you get None values in the response.
import datetime
from google.cloud import bigtable
project_id = 'ningk-test-project'
instance_id = 'ningk-test'
table_id = 'test'
client = bigtable.Client(project=project_id, admin=True)
instance = client.instance(instance_id)
table = instance.table(table_id, mutation_timeout=0.0001)
timestamp = datetime.datetime.utcnow()
column_family_id = 'stats'
rows = [
table.direct_row('stats#a1#20220106001'),
table.direct_row('stats#a1#20220106002')
]
rows[0].set_cell(column_family_id,
'name',
'dango',
timestamp)
rows[0].set_cell(column_family_id,
'type',
'alpha',
timestamp)
rows[1].set_cell(column_family_id,
'name',
'pump',
timestamp)
rows[1].set_cell(column_family_id,
'type',
'beta',
timestamp)
from google.cloud.bigtable.batcher import MutationsBatcher
FLUSH_COUNT = 1000
MAX_ROW_BYTES = 5242880 # 5MB
class _MutationsBatcher(MutationsBatcher):
def __init__(
self, table, flush_count=FLUSH_COUNT, max_row_bytes=MAX_ROW_BYTES):
super().__init__(table, flush_count, max_row_bytes)
self.rows = []
def set_flush_callback(self, callback_fn):
self.callback_fn = callback_fn
def flush(self):
if len(self.rows) != 0:
response = self.table.mutate_rows(self.rows)
self.callback_fn(response)
self.total_mutation_count = 0
self.total_size = 0
self.rows = []
batcher = _MutationsBatcher(table)
def callback(response):
for i, status in enumerate(response):
print(type(status))
batcher.set_flush_callback(callback)
for row in rows:
batcher.mutate(row)
batcher.flush()
You will get
<class 'NoneType'>
<class 'NoneType'>
More context can be found here: https://issues.apache.org/jira/browse/BEAM-13606.
The expected solution:
- The returned response should always contain Status types instead of None. A DEADLINE_EXCEEDED might be a good fit here as retry budget is exhausted;
- All retryable errors shouldn't be suppressed by the client because then the invoker of the client wouldn't know what rows need to be retried.
When bigtable row mutations run into retryable errors and exhausts all retry budgets, the responses of batched/flushed row mutations are all masked/hidden by the client who later suppresses all retryable errors.
Additionally, the returned response of the client is no longer Status types but the default None value.
To force a retryable error to be raised in normal conditions, manually set a very tiny mutation_timeout (0.1 ms), run below code, you get None values in the response.
You will get
More context can be found here: https://issues.apache.org/jira/browse/BEAM-13606.
The expected solution: