fix race condition while seeding ui tests #70356
Merged
The previous approach using a .seeded file timestamp failed when new scripts were added to UI_TEST_SCRIPTS after the CI cache was built. The cached .seeded file had a newer timestamp than new script files, so they wouldn't get seeded.

This change replaces the timestamp-based approach with MD5 hash comparison (following the existing parse_dsl_files pattern):
- Add md5 column to scripts table
- Compute MD5 of script file contents and compare against stored hash
- Only seed scripts where hash differs or script doesn't exist
- Remove the .seeded file workaround from ci.rake

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

When a unit is saved in levelbuilder mode, write_script_json now computes and stores the MD5 hash of the written content. This ensures that when we next build in the levelbuilder environment, the incremental seeding will recognize it as already matching the database state.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Test seed_from_json_file stores md5 when provided
- Test seed_from_json_file works without md5 parameter
- Test write_script_json updates md5 in levelbuilder mode
- Test write_script_json does nothing outside levelbuilder mode

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet <noreply@anthropic.com>
This reverts commit c19db76.
etaderhold approved these changes Jan 20, 2026
Depends on #70424.
MD5-Based Incremental Script Seeding
Problem
Our CI system uses a cached build from the `staging` branch as a starting point. The seed process previously used a `.seeded` file as a timestamp marker - only `.script_json` files newer than `.seeded` would be re-seeded.

This approach causes drone to fail on PRs which add new scripts to `UI_TEST_SCRIPTS`. The cached `.seeded` file has a newer timestamp than the new script files, so they don't get seeded. This caused `seed:courses_ui_tests` to fail when trying to reference units that were never seeded. Here are two recent examples of this happening:

Solution
Replace timestamp-based incremental seeding with MD5 hash-based seeding, following the existing pattern for level files:
code-dot-org/dashboard/lib/level_loader.rb
Lines 34 to 35 in 894d30f
code-dot-org/dashboard/lib/tasks/seed.rake
Lines 441 to 442 in 894d30f
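As a rough illustration of why the hash comparison catches scripts that the timestamp check misses, here is a minimal sketch. It is not the actual `seed.rake` code: the helper names `scripts_newer_than_marker` and `scripts_needing_seed` are hypothetical, and a plain Hash (`stored_md5s`) stands in for the `md5` column on the `scripts` table.

```ruby
require 'digest'
require 'tmpdir'

# Old approach (simplified): only files modified after the marker are picked up.
def scripts_newer_than_marker(script_files, marker_path)
  marker_mtime = File.mtime(marker_path)
  script_files.select { |path| File.mtime(path) > marker_mtime }
end

# New approach (simplified): pick up files whose MD5 differs from the stored
# hash, or that have no stored hash at all.
# stored_md5s is a hypothetical Hash (script name => md5) standing in for the
# md5 column on the scripts table.
def scripts_needing_seed(script_files, stored_md5s)
  script_files.select do |path|
    name = File.basename(path, '.script_json')
    Digest::MD5.hexdigest(File.read(path)) != stored_md5s[name]
  end
end

if __FILE__ == $PROGRAM_NAME
  Dir.mktmpdir do |dir|
    new_script = File.join(dir, 'aiml-2021.script_json')
    marker = File.join(dir, '.seeded')
    File.write(new_script, '{"script": {"name": "aiml-2021"}}')
    File.write(marker, '')

    # Simulate a script file whose mtime is older than the cached marker.
    an_hour_ago = Time.now - 3600
    File.utime(an_hour_ago, an_hour_ago, new_script)

    p scripts_newer_than_marker([new_script], marker) # the new script is skipped
    p scripts_needing_seed([new_script], {})          # the new script is selected
  end
end
```

The hash check depends only on file contents and stored state, so it gives the same answer regardless of when the CI cache or the marker file was created.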
Changes:
- Add `md5` column to `scripts` table to store the hash of each script's `.script_json` contents (done in add md5 column to scripts table via db migration #70424)
- Update `update_scripts` in `seed.rake` to compare file MD5 against stored hash instead of file timestamps
- Pass MD5 through the `ScriptSeed` call stack (`seed_from_json_file` → `seed_from_json` → `seed_from_hash` → `import_script`)
- Update the stored MD5 in `write_script_json` when levelbuilders update a script or lesson, to avoid triggering a redundant re-seed in levelbuilder env

Testing Story
CI Validation
- `aiml-2021` added to `UI_TEST_SCRIPTS` but not seeded because the `.seeded` file timestamp is newer
- Removing the `.seeded` file forces full re-seed, confirming the issue is with incremental seeding logic
- `aiml-2021` now seeds successfully using the new MD5-based approach

Automated Tests
Added 4 unit tests:
In `script_seed_test.rb`:
- `seed_from_json_file stores md5 when provided` - verifies MD5 is stored in the database when passed through the seeding call stack
- `seed_from_json_file works without md5 parameter` - verifies backward compatibility when MD5 is not provided

In `unit_test.rb`:
- `write_script_json updates md5 in levelbuilder mode` - verifies levelbuilder writes MD5 field to match file contents
- `write_script_json does nothing outside levelbuilder mode` - verifies no-op behavior in non-levelbuilder environments

Performance
Benchmarked overhead of MD5 hashing and JSON parsing for all ~600 script files:
Total overhead of ~1.4s is acceptable compared to the 90+ seconds saved by not re-seeding unchanged scripts.
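For anyone who wants to reproduce a measurement of this kind, the following is a minimal sketch using Ruby's standard `Benchmark` module. It is an assumption about how such a benchmark could be written, not the script used for the numbers above; `measure_seed_overhead` is a hypothetical helper, and the caller supplies whatever list of `.script_json` paths applies locally.

```ruby
require 'benchmark'
require 'digest'
require 'json'

# Hypothetical helper: times MD5 hashing and JSON parsing separately over the
# contents of the given files and returns the elapsed seconds for each step.
def measure_seed_overhead(paths)
  # Read files up front so disk I/O is not counted in either measurement.
  contents = paths.map { |path| File.read(path) }

  md5_seconds = Benchmark.realtime do
    contents.each { |content| Digest::MD5.hexdigest(content) }
  end

  parse_seconds = Benchmark.realtime do
    contents.each { |content| JSON.parse(content) }
  end

  {md5: md5_seconds, json_parse: parse_seconds, total: md5_seconds + parse_seconds}
end
```

Calling `measure_seed_overhead(Dir.glob(File.join(scripts_dir, '*.script_json')))`, where `scripts_dir` is the local directory holding the script files, returns a Hash of timings that can be compared against the cost of a full re-seed.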