Change waitForProcessing to use exponential backoff#3937
Conversation
There was a problem hiding this comment.
Pull request overview
This PR changes SARIF processing polling to use exponential backoff, reducing request volume during delayed analysis processing.
Changes:
- Replaces fixed 5-second polling with exponential backoff.
- Adds an initial delay before the first processing-status request.
- Limits processing-status polling to 5 intended attempts.
Show a summary per file
| File | Description |
|---|---|
src/upload-lib.ts |
Updates waitForProcessing polling constants and loop behavior to use exponential backoff. |
Copilot's findings
- Files reviewed: 1/1 changed files
- Comments generated: 2
| statusCheckBackoff *= STATUS_CHECK_BACKOFF_MULTIPLIER; | ||
| await util.delay(statusCheckBackoff, { allowProcessExit: false }); |
There was a problem hiding this comment.
I don't think this comment is quite right about the worst case calculations, but it's still a good idea and it would be cleaner to put the timeout check at the end of the loop, instead of doing an extra loop iteration just to break out early.
| for ( | ||
| let statusCheckingCount = 0; | ||
| statusCheckingCount <= STATUS_CHECK_MAX_TRIES; // Aborts on the last loop iteration | ||
| statusCheckingCount++ | ||
| ) { | ||
| if (statusCheckingCount === STATUS_CHECK_MAX_TRIES) { |
e930918 to
dfc1411
Compare
mbg
left a comment
There was a problem hiding this comment.
Thanks for preparing this! The change makes to me as-is and there's nothing blocking here.
We may want to review how well chosen the backoff pattern is, but that's something we can adjust later if needed.
Just one comment about the (potential) delay this imposes on the unit tests.
Also agree with Copilot that it would be nice to have a test that checks that this works as expected.
| let statusCheckBackoff = STATUS_CHECK_INITIAL_BACKOFF_MILLISECONDS; | ||
| await util.delay(statusCheckBackoff, { allowProcessExit: false }); |
There was a problem hiding this comment.
Minor: I think this also adds a STATUS_CHECK_INITIAL_BACKOFF_MILLISECONDS long delay to tests where we can get a "response" from a stubbed client.request faster. If that is the case, we could either make this initial wait conditional on e.g. the NODE_ENV (like in #3930) or stub util.delay in relevant tests to not wait.
The goal is to change
waitForProcessingto backoff a bit more aggressively and reduce the maximum number of requests that it makes. During periods where analysis processing is being delayed, we're seeing a large number of extra GET requests that I believe are coming from here. See linked internal issue for more info.In this PR I'm suggesting changing it to exponential backoff that'll make a maximum of 5 requests, instead of the current behaviour which can make up to 24 requests. I also add a wait before the first poll because processing will always take at least a couple of seconds so it's unlikely that an immediate poll will get anything. Let me know if you think this is reasonable. We can also tweak values to keep the maximum timeout to 2 minutes if that would be better.
Risk assessment
For internal use only. Please select the risk level of this change:
Which use cases does this change impact?
Workflow types:
dynamicworkflows (Default Setup, Code Quality, ...).Products:
analysis-kinds: code-scanning.analysis-kinds: code-quality.upload-sarifaction.Environments:
github.comand/or GitHub Enterprise Cloud with Data Residency.How did/will you validate this change?
.test.tsfiles).pr-checks).If something goes wrong after this change is released, what are the mitigation and rollback strategies?
How will you know if something goes wrong after this change is released?
Are there any special considerations for merging or releasing this change?
Merge / deployment checklist