Skip to content

fix race condition while seeding ui tests#70356

Merged
davidsbailey merged 10 commits into
stagingfrom
debug-ui-test-seed-race
Jan 20, 2026
Merged

fix race condition while seeding ui tests#70356
davidsbailey merged 10 commits into
stagingfrom
debug-ui-test-seed-race

Conversation

@davidsbailey

@davidsbailey davidsbailey commented Jan 15, 2026

Copy link
Copy Markdown
Member

Depends on #70424.

MD5-Based Incremental Script Seeding

Problem

Our CI system uses a cached build from the staging branch as a starting point. The seed process previously used a .seeded file as a timestamp marker - only .script_json files newer than .seeded would be re-seeded.

This approach causes drone to fail on PRs which add new scripts to UI_TEST_SCRIPTS. The cached .seeded file has a newer timestamp than the new script files, so they don't get seeded. This caused seed:courses_ui_tests to fail when trying to reference units that were never seeded. Here are two recent examples of this happening:

Solution

Replace timestamp-based incremental seeding with MD5 hash-based seeding, following the existing pattern for level files:

Changes:

  • Add md5 column to scripts table to store the hash of each script's .script_json contents (done in add md5 column to scripts table via db migration #70424)
  • Update update_scripts in seed.rake to compare file MD5 against stored hash instead of file timestamps
  • Pass MD5 through the ScriptSeed call stack (seed_from_json_fileseed_from_jsonseed_from_hashimport_script)
  • Update write_script_json when levelbuilders update a script or lesson, to avoid triggering a redundant re-seed in levelbuilder env

Testing Story

CI Validation

Commit CI Result Observation
c19db76 ❌ Fails Illustrates the original issue - aiml-2021 added to UI_TEST_SCRIPTS but not seeded because .seeded file timestamp is newer
e3c834f ✅ Passes Workaround: removing .seeded file forces full re-seed, confirming the issue is with incremental seeding logic
8feb420 ✅ Passes aiml-2021 now seeds successfully using the new MD5-based approach

Automated Tests

Added 4 unit tests:

In script_seed_test.rb:

  • seed_from_json_file stores md5 when provided - verifies MD5 is stored in the database when passed through the seeding call stack
  • seed_from_json_file works without md5 parameter - verifies backward compatibility when MD5 is not provided

In unit_test.rb:

  • write_script_json updates md5 in levelbuilder mode - verifies levelbuilder writes MD5 field to match file contents
  • write_script_json does nothing outside levelbuilder mode - verifies no-op behavior in non-levelbuilder environments

Performance

Benchmarked overhead of MD5 hashing and JSON parsing for all ~600 script files:

# MD5 hashing only (~0.6s)
Benchmark.measure { Dir.glob('config/scripts_json/*.script_json').each { |f| Digest::MD5.hexdigest(File.read(f)) } }
# => real=0.60s

# JSON parsing only (~0.8s)
Benchmark.measure { Dir.glob('config/scripts_json/*.script_json').each { |f| JSON.parse(File.read(f)) } }
# => real=0.82s

Total overhead of ~1.4s is acceptable compared to the 90+ seconds saved by not re-seeding unchanged scripts.

davidsbailey and others added 7 commits January 15, 2026 09:39
The previous approach using a .seeded file timestamp failed when new
scripts were added to UI_TEST_SCRIPTS after the CI cache was built.
The cached .seeded file had a newer timestamp than new script files,
so they wouldn't get seeded.

This change replaces the timestamp-based approach with MD5 hash
comparison (following the existing parse_dsl_files pattern):
- Add md5 column to scripts table
- Compute MD5 of script file contents and compare against stored hash
- Only seed scripts where hash differs or script doesn't exist
- Remove the .seeded file workaround from ci.rake

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When a unit is saved in levelbuilder mode, write_script_json now
computes and stores the MD5 hash of the written content. This ensures
that when we next build in the levelbuilder environment, the
incremental seeding will recognize it as already
matching the database state.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Test seed_from_json_file stores md5 when provided
- Test seed_from_json_file works without md5 parameter
- Test write_script_json updates md5 in levelbuilder mode
- Test write_script_json does nothing outside levelbuilder mode

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet <noreply@anthropic.com>
@davidsbailey davidsbailey changed the base branch from staging to migrate-add-md5-to-scripts January 20, 2026 19:16
@davidsbailey davidsbailey marked this pull request as ready for review January 20, 2026 20:35
@davidsbailey davidsbailey changed the title debug race condition while seeding ui tests fix race condition while seeding ui tests Jan 20, 2026
Base automatically changed from migrate-add-md5-to-scripts to staging January 20, 2026 23:06
@davidsbailey davidsbailey merged commit 532415c into staging Jan 20, 2026
5 of 6 checks passed
@davidsbailey davidsbailey deleted the debug-ui-test-seed-race branch January 20, 2026 23:14
@davidsbailey davidsbailey restored the debug-ui-test-seed-race branch January 20, 2026 23:14
@davidsbailey davidsbailey deleted the debug-ui-test-seed-race branch February 18, 2026 21:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants