feat: report user secrets adoption summary in telemetry #24854
Add a deployment-wide user secrets summary to the telemetry snapshot so we can track adoption of user secrets.

The summary reports:

- A breakdown of secrets by which injection fields are populated: EnvNameOnly, FilePathOnly, Both, Neither
- The distribution of secrets per user (max, p25, p50, p75, p90)

All metrics are scoped to active non-system users. Soft-deleted users are excluded. The percentile distribution is computed across the entire active non-system user base, including users with zero secrets, so the percentiles reflect deployment-wide adoption.

Assisted by Coder Agents.
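For reviewers who want to sanity-check the semantics, here is a minimal in-memory sketch of the summary described above. It is not the query in this PR: the `Secret` and `Summary` types, the `Summarize` function, and the nearest-rank percentile method are all assumptions made for illustration, and the real implementation may compute percentiles differently (for example in SQL).

```go
package main

import (
	"fmt"
	"sort"
)

// Secret is a hypothetical, simplified stand-in for a user secret row.
type Secret struct {
	UserID   string
	EnvName  string
	FilePath string
}

// Summary is a hypothetical shape for the reported metrics.
type Summary struct {
	EnvNameOnly, FilePathOnly, Both, Neither int
	Max, P25, P50, P75, P90                  int
}

// percentile returns the nearest-rank percentile of an ascending slice.
// Nearest-rank is an assumption chosen for simplicity.
func percentile(sorted []int, p int) int {
	if len(sorted) == 0 {
		return 0
	}
	rank := (p*len(sorted) + 99) / 100 // integer ceil(p/100 * n)
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

// Summarize takes the IDs of active non-system users and all secrets.
// Secrets owned by anyone outside activeUsers are ignored entirely.
func Summarize(activeUsers []string, secrets []Secret) Summary {
	// Seed every active user with zero so users without secrets
	// still contribute to the distribution.
	counts := make(map[string]int, len(activeUsers))
	for _, id := range activeUsers {
		counts[id] = 0
	}
	var s Summary
	for _, sec := range secrets {
		if _, ok := counts[sec.UserID]; !ok {
			continue // owner is soft-deleted or a system user
		}
		counts[sec.UserID]++
		hasEnv, hasFile := sec.EnvName != "", sec.FilePath != ""
		switch {
		case hasEnv && hasFile:
			s.Both++
		case hasEnv:
			s.EnvNameOnly++
		case hasFile:
			s.FilePathOnly++
		default:
			s.Neither++
		}
	}
	perUser := make([]int, 0, len(counts))
	for _, c := range counts {
		perUser = append(perUser, c)
	}
	sort.Ints(perUser)
	if n := len(perUser); n > 0 {
		s.Max = perUser[n-1]
	}
	s.P25, s.P50 = percentile(perUser, 25), percentile(perUser, 50)
	s.P75, s.P90 = percentile(perUser, 75), percentile(perUser, 90)
	return s
}

func main() {
	users := []string{"alice", "bob", "carol", "dave"}
	secrets := []Secret{
		{UserID: "alice", EnvName: "TOKEN"},
		{UserID: "alice", FilePath: "/tmp/key"},
		{UserID: "alice", EnvName: "CERT", FilePath: "/tmp/cert"},
		{UserID: "bob"},
		{UserID: "zed", EnvName: "OLD"}, // not an active user, so skipped
	}
	fmt.Printf("%+v\n", Summarize(users, secrets))
}
```

With two of the four active users holding no secrets, the median in this sketch is 0, which is the intended effect of including zero-secret users: low percentiles stay at zero until adoption spreads across the deployment.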
hugodutka
left a comment
Approving, pending the 2 issues pointed out in the review.
One more thing: we could use a field on the summary struct to indicate the period it's summarizing. Similar issue as in the past: https://github.com/coder/coder-telemetry-server/pull/36#issuecomment-3816970252
-- distribution is active non-system users. Soft-deleted users are
-- excluded because Coder soft-deletes by flipping users.deleted
-- rather than removing rows, so their secrets persist in user_secrets
-- but are no longer reachable. System users (is_system = true) cover
Out of scope for this work, but this comment has me wondering whether user secrets should be cleaned up when users.deleted is flipped. The secrets are unreachable, so it doesn't seem urgent, but I think best practice would be to delete them during that flip so that we store as little sensitive information as possible.
This is a really good catch. Created https://linear.app/codercom/issue/PLAT-183/add-user-secret-cleanup-to-user-soft-delete-trigger to track.
dylanhuff-at-coder
left a comment
Two small comments, otherwise looks good
I didn't bother with the timestamps/window mainly because the data represents "feature usage right now" rather than over some interval. Using the telemetry server receive time to know what feature usage is at a rough point in time seems sufficient for our purposes.
Corresponding coder-telemetry-server PR: https://github.com/coder/coder-telemetry-server/pull/40