[adapters] Report how much data each input connector supplies per step. #6000

Merged
blp merged 1 commit into main from step-size on Apr 7, 2026

Conversation

@blp
Member

@blp blp commented Apr 6, 2026

Describe Manual Test Plan

I tried this out by hand with a pipeline of mine.

@blp blp requested a review from ryzhyk April 6, 2026 20:01
@blp blp self-assigned this Apr 6, 2026
@blp blp added the performance, connectors, rust, user-reported, profiler, and metrics labels Apr 6, 2026

@mythical-fred mythical-fred left a comment

LGTM

Contributor

@gz gz left a comment

Can we make it a metric (a gauge?) too, if we don't already have it there?

Contributor

@ryzhyk ryzhyk left a comment

It does seem like it would be useful to expose this metric via /metrics as well (I guess as a histogram?)

@gz
Contributor

gz commented Apr 7, 2026

I was thinking that a gauge (https://prometheus.io/docs/concepts/metric_types/#gauge) makes more sense.

@blp
Member Author

blp commented Apr 7, 2026

With a gauge, you only get history as detailed as your metrics polling frequency. I think that a typical metrics polling interval is a minute or more. That's fine if you want long-term data; it's not useful if you want a detailed history of how the input batch size varied over, say, the last minute. So I think that the right kind of metric here depends on the goal. For debugging, I suspect that it is a histogram, possibly a sliding histogram like the one we use for dbsp_step_latency_seconds.
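To make the trade-off above concrete, here is a minimal Rust sketch, not code from this PR. It compares what a scraper would see from a gauge polled every few steps with what a sliding window over recent batch sizes retains. The names `SlidingWindow` and `gauge_samples` are hypothetical, and the window keeps raw samples as a simple stand-in for a bucketed sliding histogram; it makes no claim about how dbsp_step_latency_seconds is implemented.

```rust
use std::collections::VecDeque;

/// Hypothetical sliding window over the most recent `capacity` batch sizes.
/// Raw samples stand in for a bucketed sliding histogram.
pub struct SlidingWindow {
    capacity: usize,
    samples: VecDeque<u64>,
}

impl SlidingWindow {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "window must hold at least one sample");
        Self { capacity, samples: VecDeque::with_capacity(capacity) }
    }

    /// Record the size of one step's input batch, evicting the oldest sample
    /// once the window is full.
    pub fn record(&mut self, batch_size: u64) {
        if self.samples.len() == self.capacity {
            self.samples.pop_front();
        }
        self.samples.push_back(batch_size);
    }

    /// Nearest-rank quantile over the window; `None` if nothing was recorded.
    pub fn quantile(&self, q: f64) -> Option<u64> {
        if self.samples.is_empty() {
            return None;
        }
        let mut sorted: Vec<u64> = self.samples.iter().copied().collect();
        sorted.sort_unstable();
        let rank = ((q.clamp(0.0, 1.0) * sorted.len() as f64).ceil() as usize).max(1);
        Some(sorted[rank - 1])
    }
}

/// What a scraper sees from a gauge: only the value that happens to be
/// current at each poll, here once every `steps_per_poll` steps.
pub fn gauge_samples(batch_sizes: &[u64], steps_per_poll: usize) -> Vec<u64> {
    assert!(steps_per_poll > 0, "must poll at least once per some number of steps");
    batch_sizes
        .iter()
        .copied()
        .skip(steps_per_poll - 1)
        .step_by(steps_per_poll)
        .collect()
}
```

With batch sizes `[10, 5000, 10, 10, 10, 10]` and a poll every third step, the gauge scraper only ever observes `10`, while a six-sample window still reports `5000` as its maximum. That is the gap between the two metric types for short-term debugging.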

@blp blp added this pull request to the merge queue Apr 7, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 7, 2026
@blp blp added this pull request to the merge queue Apr 7, 2026
@gz
Contributor

gz commented Apr 7, 2026

> With a gauge, you only get history as detailed as your metrics polling frequency

Agreed. My thinking was that you already put it in samply for low-level debugging, whereas metrics are also used by users who are more interested in long-term perf.

@blp
Member Author

blp commented Apr 7, 2026

> > With a gauge, you only get history as detailed as your metrics polling frequency
>
> Agreed. My thinking was that you already put it in samply for low-level debugging, whereas metrics are also used by users who are more interested in long-term perf.

That's reasonable.

So the gauge(s) would report the number of bytes/records in the most recent input batch, I guess.
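A rough sketch of what such gauges could look like, assuming one pair of gauges per input endpoint that is overwritten on every step. The type `InputBatchGauges` and the metric names `input_last_batch_bytes` and `input_last_batch_records` are invented for illustration and are not taken from this PR or from Feldera's existing metrics. The output follows the Prometheus text exposition format, and label-value escaping is omitted for brevity.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical per-endpoint gauges holding the size of the most recent
/// input batch. Names are illustrative only.
#[derive(Default)]
pub struct InputBatchGauges {
    last_batch_bytes: AtomicU64,
    last_batch_records: AtomicU64,
}

impl InputBatchGauges {
    /// Called once per step; overwrites the previous values, which is what
    /// makes these gauges rather than counters.
    pub fn record_step(&self, bytes: u64, records: u64) {
        self.last_batch_bytes.store(bytes, Ordering::Relaxed);
        self.last_batch_records.store(records, Ordering::Relaxed);
    }

    /// Render both gauges in Prometheus text exposition format.
    /// Assumes `endpoint` needs no label-value escaping.
    pub fn render(&self, endpoint: &str) -> String {
        let bytes = self.last_batch_bytes.load(Ordering::Relaxed);
        let records = self.last_batch_records.load(Ordering::Relaxed);
        let mut out = String::new();
        out.push_str("# TYPE input_last_batch_bytes gauge\n");
        out.push_str(&format!(
            "input_last_batch_bytes{{endpoint=\"{}\"}} {}\n",
            endpoint, bytes
        ));
        out.push_str("# TYPE input_last_batch_records gauge\n");
        out.push_str(&format!(
            "input_last_batch_records{{endpoint=\"{}\"}} {}\n",
            endpoint, records
        ));
        out
    }
}
```

Because each step overwrites the stored values, a scrape only reflects the latest batch. That suits long-term trend monitoring but not step-by-step debugging, which matches the split discussed above.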

Merged via the queue into main with commit 48d51fb Apr 7, 2026
1 check passed
@blp blp deleted the step-size branch April 7, 2026 18:36

Labels

connectors, metrics, performance, profiler, rust, user-reported

4 participants