feat(coderd): add provisioner_daemons to /debug/health endpoint#11393
Conversation
1b2bc9a to
53ed901
Compare
53ed901 to
b83013b
Compare
There was a problem hiding this comment.
Do you understand why this keeps getting marked as changed?
There was a problem hiding this comment.
I think it's the timestamp on the files changing; I haven't yet figured out a way to ignore this.
There was a problem hiding this comment.
Thinking out loud, is it possible that a daemon reports an error, but after that, the last seen isn't updated. Then after stale interval the error disappears, and perhaps nobody ever notices it?
Also, don't we want to apply the same rules to r.ProvisionerDaemons?
There was a problem hiding this comment.
is it possible that a daemon reports an error, but after that, the last seen isn't updated. Then after stale interval the error disappears, and perhaps nobody ever notices it?
Yes, and this is by design. If a provisioner daemon connects at some point, has a transient error, and then never heartbeats, I don't think it makes sense to warn about this.
There was a problem hiding this comment.
That's fair. I was entertaining the possibility of some error state resulting in the provisioner exiting, crashing or plain stopping communication with server. But I didn't have anything concrete in mind, so this is fine.
There was a problem hiding this comment.
This feels more like Config instead of Deps, but I'll leave it up to you.
62d42e8 to
19e896a
Compare
|
@johnstcn Nice contribution! Could you please link it with the umbrella issue and fill in the PR description? |
Adds a healthcheck for provisioner daemons to /debug/health endpoint.
Part of #10676