Conversation
ryzhyk
left a comment
There was a problem hiding this comment.
Thank you! I think this is the right direction, but maybe we can simplify the design.
|
It would be nice to add this to other connectors as well. But I don't think we will be able to do this in this PR. We should however be able to distinguish between the connector not supporting this counter and the counter being 0, so we can show the in different ways in the UI. |
mythical-fred
left a comment
There was a problem hiding this comment.
Requesting changes: commit message hygiene ("address review comments"), and docs for new user-visible metric. See inline for a functional blocker.
b9a90c2 to
8c96cb3
Compare
mythical-fred
left a comment
There was a problem hiding this comment.
Still missing docs for the new user-visible metric (batch_records_written). Please add a short note in docs.feldera.com (metrics/stats) before merge.
When a Delta Lake output connector writes a large batch, the write can take a long time with no visible progress. Add a `batch_records_written` metric that reports how many records have been written so far in the current batch, giving users live feedback during long writes. The metric is updated incrementally per chunk by each parallel write range and resets to 0 after the batch is committed or on failure. Forward-progress updates use `fetch_add` to atomically increment the shared counter, while rollbacks on retry use `fetch_sub` to subtract exactly the failed attempt's contribution. Batch boundaries and terminal failures use `store(0) for a clean reset. The counter is owned by encoder, e.g. DeltaTableWriter and registered with the controller's OutputEndpointMetrics via register_batch_progress_counter() during construction. To make this possible, add_output() is called before encoder creation so the metrics slot exists when the connector is built. Connectors that don't register a counter show batch_records_written: null. Signed-off-by: Swanand Mulay <73115739+swanandx@users.noreply.github.com>
check commit message
Ref: #5970
Describe Manual Test Plan
ran locally and checked stats endpoint
Checklist
Breaking Changes?
adding metrics should not be breaking change