[adapters] Add metrics and statistics for pipeline backpressure stalls.
I noticed these showing up in the log and immediately wanted them to show
up elsewhere, since they're important.
Signed-off-by: Ben Pfaff <blp@feldera.com>
"Time in seconds that the pipeline was stalled because one or more output connectors' output buffers were full.\n\nThis value is greater than or equal to `output_stall_seconds`.",
"If the pipeline is currently stalled because one or more output connectors' output buffers were full, this is the time in seconds for which it has been stalled.\n\nIf the pipeline is not currently stalled, this is zero.\n\nIf this is nonzero, then the output connectors causing the stall can be identified by observing which values of `output_connector_queued_records` are greater than or equal to the configured maximum (which defaults to 1,000,000).",
docs.feldera.com/docs/operations/metrics.md (2 additions, 0 deletions)
@@ -71,6 +71,8 @@ which Feldera is built.
 | <a name='dbsp_runtime_elapsed_seconds_total'>`dbsp_runtime_elapsed_seconds_total`</a> |counter | Time elapsed while the pipeline is executing a step, multiplied by the number of foreground and background threads, in seconds. |
 | <a name='dbsp_step_latency_seconds'>`dbsp_step_latency_seconds`</a> |histogram | Latency of DBSP steps over the last 60 seconds or 1000 steps, whichever is less, in seconds |
 | <a name='dbsp_steps_total'>`dbsp_steps_total`</a> |counter | Total number of DBSP steps executed. |
+| <a name='output_stall_seconds'>`output_stall_seconds`</a> |gauge | If the pipeline is currently stalled because one or more output connectors' output buffers were full, this is the time in seconds for which it has been stalled.<br/><br/>If the pipeline is not currently stalled, this is zero.<br/><br/>If this is nonzero, then the output connectors causing the stall can be identified by observing which values of `output_connector_queued_records` are greater than or equal to the configured maximum (which defaults to 1,000,000). |
+| <a name='output_stall_seconds_total'>`output_stall_seconds_total`</a> |counter | Time in seconds that the pipeline was stalled because one or more output connectors' output buffers were full.<br/><br/>This value is greater than or equal to `output_stall_seconds`. |
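The `output_stall_seconds` description suggests a simple check: when the gauge is nonzero, look for connectors whose `output_connector_queued_records` has reached the configured maximum. A minimal Python sketch of that check over Prometheus text-format output might look like the following. The `endpoint` label name and the idea of passing the scraped text in as a string are assumptions for illustration, not something this commit specifies. The threshold defaults to the documented 1,000,000 and would need to be overridden for connectors configured differently.

```python
import re

# Matches `name{labels} value` or `name value`; a trailing timestamp is ignored.
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Parse Prometheus text-format lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if m is None:
            continue
        name, raw_labels, value = m.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def stalled_connectors(text, max_queued_records=1_000_000):
    """Return the label sets of connectors whose queue is at or over the limit,
    but only if output_stall_seconds reports an ongoing stall."""
    samples = parse_samples(text)
    stalled = any(n == "output_stall_seconds" and v > 0 for n, _, v in samples)
    if not stalled:
        return []
    return [
        labels
        for n, labels, v in samples
        if n == "output_connector_queued_records" and v >= max_queued_records
    ]


# Hypothetical scrape output; the `endpoint` label is an assumption.
EXAMPLE = """\
# TYPE output_stall_seconds gauge
output_stall_seconds 12.5
output_connector_queued_records{endpoint="kafka_out"} 1000000
output_connector_queued_records{endpoint="file_out"} 42
"""

if __name__ == "__main__":
    print(stalled_connectors(EXAMPLE))
```

The function returns an empty list whenever the gauge is zero, which mirrors the documented behavior that `output_stall_seconds` is zero while the pipeline is not stalled.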
openapi.json (7 additions, 0 deletions)
@@ -8266,6 +8266,7 @@
 "total_processed_records",
 "total_processed_bytes",
 "total_completed_records",
+"output_stall_msecs",
 "total_initiated_steps",
 "total_completed_steps",
 "pipeline_complete"
@@ -8312,6 +8313,12 @@
   "description": "Time at which the pipeline process from which we resumed started, in seconds since the epoch.",
   "minimum": 0
 },
+"output_stall_msecs": {
+  "type": "integer",
+  "format": "int64",
+  "description": "If the pipeline is stalled because one or more output connectors' output\nbuffers are full, this is the number of milliseconds that the current\nstall has lasted.\n\nIf this is nonzero, then the output connectors causing the stall can be\nidentified by noticing `ExternalOutputEndpointMetrics::queued_records`\nis greater than or equal to `ConnectorConfig::max_queued_records`.\n\nIn the ordinary case, the pipeline is not stalled, and this value is 0.",
+  "minimum": 0
+},
 "pipeline_complete": {
   "type": "boolean",
   "description": "True if the pipeline has processed all input data to completion."
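The `output_stall_msecs` description in `openapi.json` gives the same rule in terms of the stats API: a nonzero value means a stall is in progress, and the responsible connectors are those whose `queued_records` is at least `max_queued_records`. The Python sketch below applies that rule to already-decoded JSON. The flattened shape of `outputs` and the `endpoint_name` key are hypothetical, chosen only for illustration; the actual response layout should be taken from the schema.

```python
def stall_report(global_metrics, outputs):
    """Summarize an ongoing output stall, or return None if there is none.

    global_metrics: dict containing `output_stall_msecs` (milliseconds).
    outputs: list of dicts with hypothetical keys `endpoint_name`,
             `queued_records`, and `max_queued_records`.
    """
    stall_msecs = global_metrics.get("output_stall_msecs", 0)
    if stall_msecs == 0:
        return None  # the ordinary case: the pipeline is not stalled
    full = [
        o["endpoint_name"]
        for o in outputs
        if o["queued_records"] >= o["max_queued_records"]
    ]
    return {
        "stalled_for_seconds": stall_msecs / 1000.0,
        "full_connectors": full,
    }


if __name__ == "__main__":
    # Made-up values for illustration only.
    metrics = {"output_stall_msecs": 2500}
    outputs = [
        {"endpoint_name": "a", "queued_records": 1_000_000, "max_queued_records": 1_000_000},
        {"endpoint_name": "b", "queued_records": 10, "max_queued_records": 1_000_000},
    ]
    print(stall_report(metrics, outputs))
```

Dividing by 1000 converts the API's milliseconds to the seconds used by the `output_stall_seconds` metric, so the two views can be compared directly.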