Skip to content

feat(workflows): Refactor wait-for-required-workflows#9813

Closed
candiduslynx wants to merge 7 commits into
mainfrom
feat/workflows/wait-refactor
Closed

feat(workflows): Refactor wait-for-required-workflows#9813
candiduslynx wants to merge 7 commits into
mainfrom
feat/workflows/wait-refactor

Conversation

@candiduslynx
Copy link
Copy Markdown
Contributor

@candiduslynx candiduslynx commented Apr 10, 2023

Closes #9797
Tested via #9814

Copy link
Copy Markdown
Member

@erezrokah erezrokah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment here #9797 (comment)

I don't understand how we can submit a fix for this issue until we confirm we can de-duplicate using IDs

@candiduslynx
Copy link
Copy Markdown
Contributor Author

See my comment here #9797 (comment)

As we saw (without IDs logged, but fairly obvious due to the name uniqueness) the duplicated check run from paged API, the only suitable conclusion is the following:

  • page 1 was fetched
  • another check finished
    • vital: it had to fall into the 1st page
  • we fetched the 2nd page, that would now start from the same entry the 1st one ended

I don't understand how we can submit a fix for this issue until we confirm we can de-duplicate using IDs

See API doc for getting check run (ID is sufficient key).

@candiduslynx candiduslynx added the automerge Automatically merge once required checks pass label Apr 10, 2023
@candiduslynx candiduslynx self-assigned this Apr 10, 2023
@candiduslynx candiduslynx requested a review from erezrokah April 10, 2023 16:51
@erezrokah
Copy link
Copy Markdown
Member

erezrokah commented Apr 10, 2023

As we saw (without IDs logged, but fairly obvious due to the name uniqueness) the duplicated check run from paged API, the only suitable conclusion is the following:

We did not see any duplicate IDs, only names. I suggest we at least wait until we see duplicate IDs. You can even log any detected duplicates before doing the filter here to confirm https://github.com/cloudquery/cloudquery/pull/9813/files#diff-9878c300a2552b0ecce61f2e611d6b959132e0ccd85ce4619c765cd1b35d0757R93

I'd rather wait to get additional information based on the debug logs we added rather than doing a refactor to an important part of our code base, especially since we've haven't seen this issue very often (or at least to justify the refactor).

TLDR: Let's try and reproduce the issue first, then fix it. If we can't reproduce it (or rarely reproduce it), I don't see a reason to do such a significant change

@candiduslynx
Copy link
Copy Markdown
Contributor Author

or at least to justify the refactor

Well, this approach allows us to see only the new check runs we saw instead of the already precessed ones, making debugging much easier.

  • it tracks the workflows using maps and will fail if we ever encounter extra check (like, we wait for 50 validate-release and receive 51)

@erezrokah
Copy link
Copy Markdown
Member

erezrokah commented Apr 10, 2023

  • it tracks the workflows using maps and will fail if we ever encounter extra check (like, we wait for 50 validate-release and receive 51)

I think we can agree this code is already complex and error prone (as seen in the time spent on recent PRs) so I think it's better to identify the issue first, than do any code changes later. Is there any information missing from the current logs that allows us to verify our assumption?

Also you tested only on a single plugin, so I don't think that will reproduce the edge cases we have been seeing

@candiduslynx candiduslynx force-pushed the feat/workflows/wait-refactor branch from b090b6c to 70bd1d7 Compare April 10, 2023 19:35
@candiduslynx
Copy link
Copy Markdown
Contributor Author

@erezrokah See #9797 (comment) for the repro

@erezrokah
Copy link
Copy Markdown
Member

Replaced by #9822

@erezrokah erezrokah closed this Apr 11, 2023
@candiduslynx candiduslynx deleted the feat/workflows/wait-refactor branch April 11, 2023 09:10
@erezrokah
Copy link
Copy Markdown
Member

We discussed this in a Zoom call, and figured that once we add re-usable workflows we should be able to give each plugin validate release and test policies job a distinct name such as validate release / validate release aws so we could do an array equals instead of a length check here:

if (matchingRuns.length === actions.length) {

That should make the wait for all workflow more robust and also allow us to log and know which plugins are waiting for workflows with the same name

kodiakhq Bot pushed a commit that referenced this pull request Apr 11, 2023

#### Summary

Fixes #9797. Follow up to #9793 and replaces #9813.

#9797 (comment) Confirms the ID is duplicated and we can filter using it

<!--
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

automerge Automatically merge once required checks pass

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: Duplicates in wait-for-required-workflows resulting in false-positive success

2 participants