feat: add CI pipeline for go-memory-load-mysql, mongo, and grpc #4107
pathakharshit wants to merge 116 commits into
Pull request overview
Adds CI coverage for additional Go “memory load” sample apps by wiring them into the existing golang_docker workflow and providing per-app docker runner scripts.
Changes:
- Added new docker-based workflow scripts for go-memory-load-mysql, go-memory-load-mongo, and go-memory-load-grpc.
- Enabled the replay phase in the existing go-memory-load script.
- Expanded the .github/workflows/golang_docker.yml matrix to run the new apps and switched the samples-go checkout ref.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| .github/workflows/test_workflow_scripts/golang/go_memory_load_mysql/golang-docker.sh | New CI runner script for MySQL load test app (record + load + replay) with memory monitoring. |
| .github/workflows/test_workflow_scripts/golang/go_memory_load_mongo/golang-docker.sh | New CI runner script for Mongo load test app (record + load + replay) with memory monitoring. |
| .github/workflows/test_workflow_scripts/golang/go_memory_load_grpc/golang-docker.sh | New CI runner script for gRPC load test app (record + load + replay) with memory monitoring. |
| .github/workflows/test_workflow_scripts/golang/go_memory_load/golang-docker.sh | Un-commented replay stage to actually run replay in CI. |
| .github/workflows/golang_docker.yml | Added new matrix entries for the new apps and changed the samples-go ref. |

    # Extract the http_req_failed percentage, e.g. "3.26%" from:
    #   http_req_failed.................: 3.26% ✓ 10 ✗ 296
    local fail_pct
    fail_pct="$(grep -oP 'http_req_failed[.]*:\s+\K[0-9]+(\.[0-9]+)?' "$k6_log" | head -1 || true)"

check_k6_failure_rate relies on grep -P with \K, which is not available in all environments (and can vary across runners). Prefer a POSIX-ish parser (e.g., awk/sed) to extract the failure rate to avoid CI brittleness.
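
A minimal sketch of the awk-based alternative suggested here. The helper name `extract_fail_pct` is hypothetical, and the sketch assumes the k6 summary prints the rate as a whitespace-separated field ending in `%`, as in the sample line quoted in the script's comment. It relies only on POSIX awk features, so it does not depend on GNU grep's PCRE support.

```shell
# Hypothetical helper: print the numeric part of the http_req_failed
# percentage (e.g. "3.26") from a k6 summary log, using only POSIX awk.
# Prints nothing if the metric line is absent.
extract_fail_pct() {
  awk '
    /http_req_failed/ {
      # Scan the fields and print the first one shaped like "<number>%",
      # with the trailing "%" stripped.
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^[0-9]+(\.[0-9]+)?%$/) {
          sub(/%$/, "", $i)
          print $i
          exit
        }
      }
    }
  ' "$1"
}
```

Inside `check_k6_failure_rate`, this could be used as `fail_pct="$(extract_fail_pct "$k6_log")"`. For the gRPC script, the same pattern would apply with `grpc_req_failed` as the metric name.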

    docker compose build

    section "Cleaning previous artifacts"
    sudo rm -rf keploy/

This cleanup uses sudo rm -rf keploy/ unconditionally. In environments where sudo isn’t available (or requires a password), the script will fail under set -e. Use the existing run_with_keploy_privileges helper (or a conditional command -v sudo) for cleanup to keep the script consistent and portable.
Suggested change:

    - sudo rm -rf keploy/
    + run_with_keploy_privileges rm -rf keploy/

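
The body of `run_with_keploy_privileges` is not shown in this excerpt, so the following is only a sketch of the "conditional `command -v sudo`" option under stated assumptions. The name `run_privileged` is hypothetical; it runs the command directly when already root, uses `sudo` only when non-interactive sudo works, and otherwise falls back to the current user.

```shell
# Hypothetical helper (not the PR's run_with_keploy_privileges).
run_privileged() {
  if [ "$(id -u)" -eq 0 ]; then
    # Already root: no sudo needed.
    "$@"
  elif command -v sudo >/dev/null 2>&1 && sudo -n true 2>/dev/null; then
    # sudo exists and does not prompt for a password.
    sudo "$@"
  else
    # No usable sudo: try as the current user instead of failing outright.
    "$@"
  fi
}

# Usage: run_privileged rm -rf keploy/
```

The fallback branch can still fail if the files are root-owned, but the failure then comes from `rm` itself rather than from a missing or password-protected `sudo`.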

    # Extract the http_req_failed percentage, e.g. "3.26%" from:
    #   http_req_failed.................: 3.26% ✓ 10 ✗ 296
    local fail_pct
    fail_pct="$(grep -oP 'http_req_failed[.]*:\s+\K[0-9]+(\.[0-9]+)?' "$k6_log" | head -1 || true)"

check_k6_failure_rate relies on grep -P with \K, which is not available in all environments (and can vary across runners). Prefer a POSIX-ish parser (e.g., awk/sed) to extract the failure rate to avoid CI brittleness.

    section "Recording load-test traffic"
    run_with_keploy_privileges "$RECORD_BIN" record -c "docker compose up" --container-name "$APP_CONTAINER_NAME" --memory-limit "$RECORD_MEMORY_LIMIT_MB" --enable-sampling --generate-github-actions=false 2>&1 | tee record.txt &
    record_pid=$!
    echo "Started Keploy record process with PID: $record_pid"

    keploy_container="$(wait_for_keploy_container 120)"
    echo "Detected Keploy container: $keploy_container"
    # apply_keploy_memory_limit "$keploy_container"
    start_memory_monitor "$keploy_container" "$record_pid" "record"

record_pid=$! captures the PID of tee (last process in the background pipeline), not the Keploy record process. As a result, start_memory_monitor will stop monitoring early and kill -TERM "$phase_pid" won’t terminate the recorder when a memory violation/OOM is detected. Capture the actual Keploy record PID (e.g., via pgrep/ps after start) or avoid the pipeline so $! refers to the recorder process, and pass that PID into the monitor.
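
A sketch of the "avoid the pipeline" option, assuming it is acceptable to write the log to a file and stream it separately. The helper name `start_recorder` is hypothetical. Because the background job contains a single command, `$!` refers to that command rather than to `tee`.

```shell
# Hypothetical helper: start a command in the background with its output
# captured in a log file, and store its PID in the global record_pid.
start_recorder() {
  log_file="$1"
  shift
  # Redirecting to a file (instead of piping into tee) keeps the command as
  # the only process in the background job, so $! is its PID.
  "$@" >"$log_file" 2>&1 &
  record_pid=$!
}

# Usage sketch (arguments are placeholders):
#   start_recorder record.txt "$RECORD_BIN" record -c "docker compose up"
#   tail -f record.txt &   # optional: stream the log to the console
```

One caveat: if the command is launched through a wrapper such as a shell function or `sudo`, `$!` is the wrapper's PID, and whether a signal reaches the recorder depends on how the wrapper forwards it. In that case looking up the recorder's PID after start, as the comment also suggests, may be the safer route.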

    docker compose build

    section "Cleaning previous artifacts"
    sudo rm -rf keploy/

This cleanup uses sudo rm -rf keploy/ unconditionally. In environments where sudo isn’t available (or requires a password), the script will fail under set -e. Use the existing run_with_keploy_privileges helper (or a conditional command -v sudo) for cleanup to keep the script consistent and portable.
Suggested change:

    - sudo rm -rf keploy/
    + run_with_keploy_privileges rm -rf keploy/


    # Extract the grpc_req_failed percentage, e.g. "3.26%" from:
    #   grpc_req_failed.................: 3.26% ✓ 10 ✗ 296
    # Fall back to http_req_failed for compatibility.
    local fail_pct
    fail_pct="$(grep -oP 'grpc_req_failed[.]*:\s+\K[0-9]+(\.[0-9]+)?' "$k6_log" | head -1 || true)"
    if [ -z "$fail_pct" ]; then
      fail_pct="$(grep -oP 'http_req_failed[.]*:\s+\K[0-9]+(\.[0-9]+)?' "$k6_log" | head -1 || true)"
    fi

check_k6_failure_rate relies on grep -P with \K, which is not available in all environments (and can vary across runners). Prefer a POSIX-ish parser (e.g., awk/sed) to extract the failure rate to avoid CI brittleness.

    section "Recording load-test traffic"
    run_with_keploy_privileges "$RECORD_BIN" record -c "docker compose up" --container-name "$APP_CONTAINER_NAME" --memory-limit "$RECORD_MEMORY_LIMIT_MB" --enable-sampling --generate-github-actions=false 2>&1 | tee record.txt &
    record_pid=$!
    echo "Started Keploy record process with PID: $record_pid"

    keploy_container="$(wait_for_keploy_container 120)"
    echo "Detected Keploy container: $keploy_container"
    # apply_keploy_memory_limit "$keploy_container"
    start_memory_monitor "$keploy_container" "$record_pid" "record"

record_pid=$! captures the PID of tee (last process in the background pipeline), not the Keploy record process. As a result, start_memory_monitor will stop monitoring early and kill -TERM "$phase_pid" won’t terminate the recorder when a memory violation/OOM is detected. Capture the actual Keploy record PID (e.g., via pgrep/ps after start) or avoid the pipeline so $! refers to the recorder process, and pass that PID into the monitor.

    docker compose build

    section "Cleaning previous artifacts"
    sudo rm -rf keploy/

This cleanup uses sudo rm -rf keploy/ unconditionally. In environments where sudo isn’t available (or requires a password), the script will fail under set -e. Use the existing run_with_keploy_privileges helper (or a conditional command -v sudo) for cleanup to keep the script consistent and portable.
Suggested change:

    - sudo rm -rf keploy/
    + run_with_keploy_privileges rm -rf keploy/


    section "Recording load-test traffic"
    run_with_keploy_privileges "$RECORD_BIN" record -c "docker compose up" --container-name "$APP_CONTAINER_NAME" --memory-limit "$RECORD_MEMORY_LIMIT_MB" --enable-sampling --generate-github-actions=false 2>&1 | tee record.txt &
    record_pid=$!
    echo "Started Keploy record process with PID: $record_pid"

    keploy_container="$(wait_for_keploy_container 120)"
    echo "Detected Keploy container: $keploy_container"
    # apply_keploy_memory_limit "$keploy_container"
    start_memory_monitor "$keploy_container" "$record_pid" "record"

record_pid=$! captures the PID of tee (last process in the background pipeline), not the Keploy record process. As a result, start_memory_monitor will stop monitoring early and kill -TERM "$phase_pid" won’t terminate the recorder when a memory violation/OOM is detected. Capture the actual Keploy record PID (e.g., via pgrep/ps after start) or avoid the pipeline so $! refers to the recorder process, and pass that PID into the monitor.

      - name: Checkout the samples-go repository
        uses: actions/checkout@v4
        with:
          repository: keploy/samples-go
    -     ref: main
    +     ref: feat/all-memory-load-apps
          path: samples-go

This workflow now checks out keploy/samples-go from feat/all-memory-load-apps instead of main. CI will become dependent on a non-default branch that may be rebased/deleted, causing flaky failures. Prefer pinning to a commit SHA/tag, or merging the required samples into main and keeping CI on main.
🚀 Keploy Performance Test Results
Multi-Run Validation: Tests run 3 times, pipeline fails only if 2+ runs show regression.
Thresholds: P50 < 5ms, P90 < 15ms, P99 < 70ms, RPS >= 100 (±1% tolerance), Error Rate < 1%
✅ Result: PASSED - Only 0 out of 3 runs failed (threshold: 2)
P50, P90, and P99 percentiles naturally filter out outliers
Signed-off-by: Harshit Pathak <harshit07pathak@gmail.com>
…sure) Signed-off-by: Harshit Pathak <harshit07pathak@gmail.com>
… RCA

Add fmt.Fprintf(os.Stderr, "TRACE-...") lines to follow each mongo mock through AddMock and ResolveRange. Pairs with TRACE-MONGO-EMIT in integrations to pin where the chronic-6 mongo mocks die before reaching mocks.yaml in go-memory-load-mongo.

- TRACE-ADDMOCK-IN: every entry, with mock kind/reqTs/lifetime + firstReqSeen + bound/closed state of outChan so we know which branch will fire.
- TRACE-ADDMOCK-DROP-CLOSED: outChan-already-closed drop path.
- TRACE-ADDMOCK-FORWARD-PREFIRST: pre-firstReqSeen forwarding to outChan.
- TRACE-ADDMOCK-BUFFER: buffered path (firstReqSeen + outChan bound).
- TRACE-RESOLVE: every ResolveRange call with window/before/after/flushed.
- TRACE-RESOLVE-STALECUT: every stale-7s-cutoff drop with mock reqTs.

Diagnostic only. Will be removed once the cause is pinned.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Harshit Pathak <harshit07pathak@gmail.com>
EmitMock has a pre-AddMock ctx.Err() check at session.go:392. If sess.Ctx is cancelled during shutdown while the mongo decoder is still flushing the chronic-6 teardown bytes, the mock returns silently from this path and never reaches syncMock.AddMock, appearing in our V2 tracing as TRACE-MONGOV2-EMIT-DROP with err="context canceled".

Add TRACE-EMITMOCK-CTXDONE-DROP at the ctx-err return site so we can see how many mocks die here and at what timestamps, to confirm or falsify the hypothesis.

Diagnostic only: pure logging, no behavior change. Will be removed once the cause is pinned.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Harshit Pathak <harshit07pathak@gmail.com>
Previous pipeline created at 06:54:10 sat in pending state with 0 jobs for 15+ minutes. The "Prepare Binary and Run Workflows" matrix never generated, a symptom of forge-config fetch timeout at pipeline-create time per keploy-ci-debug skill.

Empty retrigger forces GitHub to re-evaluate the workflow trigger. No code change; diagnostic traces from da59f10 remain in effect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Harshit Pathak <harshit07pathak@gmail.com>
Run gofmt to fix:
- import order: syncMock should come before pTls alphabetically
- struct field alignment in pendingTC

These were introduced in 00bbba6 (TC hold + agent-side pressure check) and surfaced in golangci-lint's gofmt step. Pure formatting, no behavior change.

Also serves to fresh-trigger the prepare-and-run workflow after two prior runs (26272536927, 26273608297) sat in a GitHub Actions concurrency hold without generating their job matrix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Harshit Pathak <harshit07pathak@gmail.com>
Chronic-6 RCA confirmed on run 26277486420 (mongo-rbrb):
- last mongo mock at 08:49:20, last HTTP TC at 08:49:41 — 21s gap
- 6 HTTP TCs captured AFTER mongo's async decoder stopped emitting
- replay: connection EOF on get-orders-1..5 + get-analytics-top-products-1
The mongo v2 decoder runs on an async goroutine pipeline (encode.go
asyncMongoDecode). Under memory-limited recording with k6 load, the
pipeline can fall behind HTTP capture by 20+ seconds at shutdown.
The HTTP integration commits TCs as soon as the round-trip completes,
so when the recording window closes, HTTP captures the teardown TCs
but mongo has no time to drain its decode backlog — leaving orphan
TCs whose underlying mongo queries were never persisted to mocks.yaml.
Fix preserves the user's invariant ("no partial mocks; if mock is
dropped the corresponding TC must also be dropped"):
- syncMock.SyncMockManager grows a lastMongoMockResTime field,
updated under m.mu in AddMock when a Mongo-kind mock arrives.
Stored as the youngest observed ResTimestampMock so the agent's
TC-commit path can compare any HTTP TC's req-time against it.
- LastMongoMockResTime() accessor returns the youngest observed time
(zero if no mongo activity yet). Callers MUST treat zero as "mongo
not in use" and NOT drop based on it — otherwise pre-handshake HTTP
TCs in mongo apps would be wrongly dropped before the first mongo
mock decodes.
- HandleIncoming drain() gains a per-TC orphan check after the
existing 500ms tcHold + pressure-window check: when LastMongoMock
is non-zero AND the TC's req-time is more than mongoSilenceTolerance
(5s) NEWER than the youngest mongo mock, drop the TC and emit
diag/stage-tc-mongo-silence-drop.
Sized 5s because the normal decoder lag is ~50-200ms under load;
5s comfortably catches the 21s shutdown-orphan window without
dropping legitimate "mongo briefly idle" testcases.
Does NOT touch mongo encode.go / asyncMongoDecode itself — that
async architecture is load-bearing for throughput. The atomicity
guarantee is enforced at the TC commit boundary instead.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Harshit Pathak <harshit07pathak@gmail.com>
Pulls in keploy/integrations debug/window-shift-diagnostic@700907d, which fixes mongo mock-timestamp drift under decoder back-pressure (summary-17 RCA: find:customers mock 85 ms outside its test window).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Harshit Pathak <harshit07pathak@gmail.com>
The socket-time fix (integrations 700907d) traded one bug for another:
- FIXED: rbrb summary-17 (find:customers mock outside per-test window)
- REGRESSED: rbrl get/delete-large-payloads-by-id-3 (mocks for these TCs went missing, likely a wire vs decoder-pipeline timestamp desync that I haven't pinned yet)

Net mongo failure count went 7 → 4, but with explicit regression on previously-green tests. User preference is zero regressions, so revert the socket-time stamp change.

The orphan-TC drop in HandleIncoming (still in place, commit 1e79b15 in keploy) remains the load-bearing fix for the chronic-6 pattern. Summary-17 becomes an intermittent corner case; it needs separate RCA for the per-test window vs async-decoder-lag interaction, but is not the systematic chronic-6 issue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Harshit Pathak <harshit07pathak@gmail.com>
…rain

Race in the orphan-TC drop: the existing code reads syncMock.LastMongoMockResTime() at DRAIN time (after the 500ms tcHold). In that 500ms window, late mongo mocks for OTHER tests arrive and refresh the live timestamp. The orphan-check then sees a fresh value and emits a TC that was actually orphaned at arrival.

Confirmed on run 26281482736 mongo-rbrl post-large-payloads-8:
- TC arrived at 10:10:49.965
- LAST mongo mock at TC arrival: 10:10:43.389 (6.6s gap → orphan)
- After tcHold, fresh mongo mocks at 10:10:49.988+ refreshed the live value; drain-time gap was only 22ms → orphan check didn't fire → TC committed without its underlying mongo insert → EOF at replay.

Fix: pendingTC now stores the LastMongoMockResTime() value taken at TC arrival. Drain compares against this frozen value, ignoring any mocks that landed during the tcHold window. This is the correct semantics because the orphan condition is "this TC's req-time arrived after a long gap in mongo decoding", which is a property of the moment the TC was captured, not of the moment it's being emitted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Harshit Pathak <harshit07pathak@gmail.com>
…n MySQL mock drop

Two related atomicity bugs caused TC-without-mock orphans in the go-memory-load-mongo (rbrb) and go-memory-load-mysql (rbrl) lanes:

Bug 1 (mongo rbrb, delete-large-payloads-by-id-5): The 1 MB GET response ahead of the DELETE bytes in decodeChan caused the async decoder to process the delete mock AFTER the 500 ms tcHold expired. By the time AddMock's backward currentPressureStart extension fired, the TC had already been committed without its mock.

Fix: track `arrivedDuringPressure` at TC arrival. drain() now holds such TCs (up to pressureHold=10 s) while pressure is active, giving async decoders time to emit their mocks and trigger the extension before the TC is drained.

Bug 2 (mysql rbrl, chronic-6: get-orders-1..5 + get-analytics-1): MySQL mocks are dropped in recordMock() BEFORE AddMock is called, so AddMock's in-line backward currentPressureStart extension never fires for dropped MySQL mocks. IsHTTPTCInPressureWindow saw currentPressureStart = pressure-fire-time, not the earlier mysql ReqTimestampMock, so TCs whose round-trip completed just before pressure were not dropped.

Fix: add ExtendPressureWindow(reqTimestamp) to syncMock and call it from recordMock when dropping, ensuring currentPressureStart is extended even when AddMock is bypassed.

Also adds IsMemoryPressureActive() to syncMock for drain() to query whether pressure is still in progress.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Harshit Pathak <harshit07pathak@gmail.com>
Describe the changes that are made
Added k6 load-test CI pipelines for MySQL, MongoDB, and gRPC sample apps,
mirroring the existing Postgres (go-memory-load) pipeline pattern. Also
enabled full record + replay for the existing Postgres pipeline (replay was
previously disabled).
Each new pipeline:
Links & References
Closes: #
🔗 Related PRs
🐞 Related Issues
NA
📄 Related Documents
NA
What type of PR is this?
Added e2e test pipeline?
Added comments for hard-to-understand areas?
Added to documentation?
Are there any sample code or steps to test the changes?
Steps to test:
Golang On Docker workflow
- go-memory-load-mysql (×3 configs)
- go-memory-load-mongo (×3 configs)
- go-memory-load-grpc (×3 configs)
- go-memory-load (×3 configs), now with replay enabled
Self Review done?
Any relevant screenshots, recordings or logs?
NA