Skip to content

Fix flaky Minecraft HoC Pegasus share page scenario#61801

Merged
fisher-alice merged 3 commits into
stagingfrom
alice/ui-minecraft
Oct 16, 2024
Merged

Fix flaky Minecraft HoC Pegasus share page scenario#61801
fisher-alice merged 3 commits into
stagingfrom
alice/ui-minecraft

Conversation

@fisher-alice

@fisher-alice fisher-alice commented Oct 15, 2024

Copy link
Copy Markdown
Contributor

This PR includes a prospective fix for the flaky hour_of_code_finish.feature UI test. This UI test has been flaky on Ipad and in particular on iPhone (top failing UI test on iPhone).

Screenshot 2024-10-16 at 8 46 07 AM

This UI test has been disabled in Safari #60103 because the test was flaky at > 0.33 on Safari. I propose that we merge this PR and monitor for a few days. If the failure rate decreases on Iphone and Ipad, we can consider enabling this UI test on Safari.
@wilkie proposed a refactoring of Phaser's isLoading which I think would also be helpful in deflaking this test. (More details below.)

When looking at the Saucelabs replays of the UI test, there is one scenario which is frequently flaky: 'Pegasus share page preserves certificate when redirecting' in hour_of_code_finish.feature. I observed 2 different errors which resulted in a failed state. A third error that didn't lead to a failed state but I thought was still noteworthy is described in the Follow-up work section.

Error 1:

HTTP Status: 500
{
	"message": "An unknown server-side error occurred while processing the command. Original error: null is not an object (evaluating 'Craft.gameController.game.load.isLoading')",
	"error": "unknown error"
}

To help resolve the first error, I am adding the step 'And I wait for the lab page to fully load' so that there is time to check for Craft.phaserLoaded.

Then /^I wait until the Minecraft game is loaded$/ do
wait = Selenium::WebDriver::Wait.new(timeout: 60)
wait.until {@browser.execute_script('return Craft.phaserLoaded();')}
end

FYI, In craft.js, Craft.phaserLoaded is defined as

Craft.phaserLoaded = function () {
  return (
    Craft.gameController &&
    Craft.gameController.game &&
    !Craft.gameController.game.load.isLoading
  );
};

@wilkie is proposing a possible solution at #59857 to help resolve the null is not an object (evaluating 'Craft.gameController.game.load.isLoading error. He investigated the logic behind phaserLoaded. I think it would also be helpful to update the code in craft to avoid the JS error.

Error 2:

Screenshot 2024-10-15 at 5 03 25 PM

In the second error, we have completed the the first 4 steps of the scenario, and we are now on step Given I am on "http://code.org/api/hour/finish/mc".

When we check And I wait until current URL contains "/congrats"we see this message at the top of the screen:
'Puma caught this error: undefined method by_short_code for {"dance-ai-2023"=>...'

by_short_code is defined at

by_short_code = CDO.cache.fetch("Tutorials/#{@table}/by_short_code") {@contents.index_by {|row| row[:short_code]}}
.

When '/api/hour/finish/mc' redirects to '/congrats', the fetching of tutorial data appears to not have been successful.
I propose adding a wait after the Minecraft game loads and after Given I am on "http://code.org/api/hour/finish/mc".
After consulting with @wilkie, we are going to monitor this UI test as we are unsure why this error is occurring.
From Wilkie: 'the code seems to deterministically build a cached key/value store that maps things like "mc" to the CdoTutorial record... and I can't see why that would fail unless the database itself were misbehaving somehow.'

I also added a check for Craft in the 'I wait until the Minecraft game is loaded' so that phaserLoaded is not called unless Craft is defined.

Links

jira

Testing story

I ran this updated scenario against test 10 times and it passed 10 times in a row.
Screenshot 2024-10-16 at 9 42 00 AM

Deployment strategy

Follow-up work

UPDATE: Resolved Slack thread
One of the three errors from the Saucelabs replay videos involved a YouTube error

Screenshot 2024-10-15 at 5 11 21 PM

Note that this error didn't lead to a failed state (the Minecraft loading error did), but I was able to repro this on my iPhone on production so logging a ticket to investigate this at https://codedotorg.atlassian.net/browse/LABS-1119.

IMG_2784

Privacy

Security

Caching

PR Checklist:

  • Tests provide adequate coverage
  • Privacy and Security impacts have been assessed
  • Code is well-commented
  • New features are translatable or updates will not break translations
  • Relevant documentation has been added or updated
  • User impact is well-understood and desirable
  • Pull Request is labeled appropriately
  • Follow-up work items (including potential tech debt) are tracked and linked

@fisher-alice fisher-alice changed the title Fix flaky Minecraft HoC Egasus share page scenario Fix flaky Minecraft HoC Pegasus share page scenario Oct 15, 2024
@fisher-alice fisher-alice requested a review from a team October 15, 2024 22:34
@fisher-alice fisher-alice marked this pull request as ready for review October 15, 2024 22:34
@fisher-alice fisher-alice requested a review from wilkie October 16, 2024 14:48

@wilkie wilkie left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants