Use Local Webdriver for UI Tests in CI#65064
Conversation
This reverts commit 014ac15.
…al-selenium-in-drone
…labs rather than local
…e do locally doesn't result in fewer total attempts against sauce
In favor of using local selenium for the first run anytime we're in the CI environment
This reverts commit 4a8731a.
cat5inthecradle
left a comment
There was a problem hiding this comment.
LGTM. I like the helper function renames. Excited to make this change!
snickell
left a comment
There was a problem hiding this comment.
Yayyy! LGTM, its gonna be wild how much impact this should have on our saucelabs usage. I love resource usage efficiency fixes, I don't know why, but so satisfying.
| from_secret: SAUCE_USERNAME | ||
| SAUCE_ACCESS_KEY: | ||
| from_secret: SAUCE_ACCESS_KEY | ||
| shm_size: '2gb' # necessary to avoid page crashes in Selenium |
There was a problem hiding this comment.
I wouldn't mind a comment (or a link back to a GH comment on this PR!) showing what these look like, in case somebody in the future needs to re-evaluate and say bump it up to 3gb or whatever.
| options.add_argument('window-size=1280,1024') if [:chrome, :firefox].include?(browser) | ||
| options.add_argument('headless') if headless | ||
| options.add_argument('--window-size=1280,1024') if [:chrome, :firefox].include?(browser) | ||
| options.add_argument('--headless') if headless |
There was a problem hiding this comment.
Not part of the review, just me being curious: were these not taking effect before without the -- being added?
There was a problem hiding this comment.
They were working for Chrome, which I think is what most people use for manual local testing, but not for Firefox
Follow-up to #65064, which added logic to use a local webdriver for the first run of all UI and Eyes tests, only falling back to SauceLabs for any tests which fail that first run. This PR adds a commit flag which can be used to selectively override that behavior if a developer has a specific reason to want even passing tests to run in SauceLabs (for example, if they want to share a video of a newly-added UI test running successfully). [Slack thread](https://codedotorg.slack.com/archives/C08AMQ869QX/p1753933950644269)
* Allow Overriding Local Webdriver Tests Follow-up to #65064, which added logic to use a local webdriver for the first run of all UI and Eyes tests, only falling back to SauceLabs for any tests which fail that first run. This PR adds a commit flag which can be used to selectively override that behavior if a developer has a specific reason to want even passing tests to run in SauceLabs (for example, if they want to share a video of a newly-added UI test running successfully). [Slack thread](https://codedotorg.slack.com/archives/C08AMQ869QX/p1753933950644269) * restore accidentally-removed whitespace * [skip local webdriver] empty commit to test new commit flag
Specifically, I added a new
--first-run-localflag torunner.rb, which tells it to use the local Selenium webdriver for the first run of a given test, and only use the configured webdriver (ie, Saucelabs) for reruns.We can then use this flag in our CI tests to be a bit more efficient with our Saucelabs credits. We expect most tests to pass just fine with the local webdriver, but Saucelabs is more consistent and provides a better experience for debugging legitimate failures.
Also did a little miscellaneous cleanup while I was in the area:
ENV['TEST_LOCAL'] == 'true'with a new helper methodENV['CI']with an existing helper methodLinks
Related work:
Alternative to:
Testing story
Relying on CI tests to validate this CI-test-focused change. Note that running the local selenium webdriver alongside the puma server processes does result in slightly slower responses, which increases test flakiness slightly even for the reruns happening in Saucelabs. I deflaked a few tests in response and have been consistently getting green checkmarks since, despite multiple reruns; I think it is reasonably stable. We should be prepared to invest a bit more time in further deflaking if other engineers run into issues, though.
Also verified on Drone that this change does not noticeably impact total runtime of UI tests; see for example this run with the new logic disabled versus this run with it enabled.
Deployment strategy
We should plan to notify developers when this gets merged; warn them about the potential for increased flakiness, and ask them to speak up if they notice anything. If this change does cause issues, we can easily gate it behind a commit message flag while we resolve them.
Follow-up work
I'd like to add some kind of logging or metrics for tracking local runs vs saucelabs runs. In particular, I'd like for there to be a way for us to identify any tests that are consistently failing locally but consistently passing remotely.