feat: noise-based obfuscation-aware mock matching across all parsers #4026
slayerjain merged 17 commits into main
Conversation
Add support for mocks containing obfuscated (redacted) secret values prefixed with `__KEPLOY_REDACT__:`. During replay, obfuscated fields are completely excluded from the match score so they don't affect whether a mock is selected. This enables secret protection without breaking mock matching.

Changes:
- Add shared ObfuscationPrefix constant and helpers in util/obfuscate.go
- ExactBodyMatch: two-pass approach, trying an exact string match first, then a JSON-level comparison that skips obfuscated fields
- PerformFuzzyMatch: strip obfuscated values from mock bodies before Levenshtein/Jaccard similarity computation
- Add Info-level match percentage logging throughout the pipeline (schema match, exact body, body key, fuzzy)
- Add 14 unit tests covering scoring, stripping, and edge cases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a RecordHooks interface to the OSS record service, mirroring the TestHooks pattern used in replay. This allows enterprise to inject behaviour (e.g. secret obfuscation) into the recording pipeline via Before/AfterTestCaseInsert and Before/AfterMockInsert hooks without wrapping DB interfaces. Includes BaseRecordHooks (embeddable no-op), struct-based context params for forward compatibility, and SetRecordHooks/GetRecordHooks on the Recorder.
Move noise from unstructured Metadata["noise"] JSON string to a typed Noise []string field on MockSpec. This gives the mock matcher a defined structure to read obfuscation patterns from.
Add Noise []string to all per-protocol schema structs (HTTPSchema, GrpcSpec, MongoSpec, DNSSchema, GenericSchema, RedisSchema, KafkaSchema, HTTP2Schema, postgres.Spec, mysql.Spec) and wire it through EncodeMock and DecodeMocks so the field is serialized to/from YAML.
Move the Noise []string field from the per-protocol schema structs and MockSpec up to the Mock struct and NetworkTrafficDoc. This places noise patterns at the YAML root level alongside version/kind/name rather than buried inside each protocol's spec, since noise is mock-level metadata that is protocol-agnostic.
Replace prefix-based obfuscation detection with noise-pattern matching using Mock.Noise regex patterns. This handles all obfuscated character classes (alphanumeric, digit-only, hex) uniformly.

Changes:
- Rewrite util/obfuscate.go with NoiseChecker type (compile, cache, check)
- Rework HTTP parser to use NoiseChecker instead of prefix checks
- Add noise check in JSONDiffWithNoiseControl for HTTP/gRPC matchers
- Add noise check in MySQL paramValueEqual
- Add noise check in Generic findExactMatch/findBinaryMatch

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
This PR introduces noise-pattern–based obfuscation awareness for mock matching by persisting per-mock regex “noise” patterns (Mock.Noise) and teaching multiple matchers/parsers to treat matching values as ignorable noise during comparison.
Changes:
- Add `Mock.Noise` and persist it through YAML encoding/decoding (`noise` field in mock docs).
- Introduce `util.NoiseChecker` (plus helpers like `StripNoisyJSON` and `JSONBodyMatchScore`) and thread it into JSON diff and protocol matchers (HTTP/MySQL/Generic).
- Add `RecordHooks` to allow injecting behavior around test case/mock insertion in the recording pipeline.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| pkg/service/record/record.go | Adds RecordHooks to the recorder and calls hook callbacks around inserts. |
| pkg/service/record/hooks.go | Introduces hook interfaces and no-op base implementation. |
| pkg/platform/yaml/yaml.go | Adds persisted noise field to YAML mock document schema. |
| pkg/platform/yaml/mockdb/util.go | Encodes/decodes Mock.Noise into/from YAML documents. |
| pkg/models/mock.go | Adds Noise []string to models.Mock. |
| pkg/matcher/utils.go | Extends JSON diff to accept an obfuscation NoiseChecker and skip noisy values. |
| pkg/matcher/schema/match.go | Updates JSON diff call signature usage. |
| pkg/matcher/http/match.go | Updates JSON diff call signature usage. |
| pkg/matcher/http/absmatch.go | Updates JSON diff call signature usage. |
| pkg/matcher/grpc/match.go | Updates JSON diff call signature usage. |
| pkg/agent/proxy/integrations/util/obfuscate.go | Adds NoiseChecker with cached regex compilation and JSON helpers for stripping/scoring. |
| pkg/agent/proxy/integrations/mysql/replayer/match.go | Skips noisy param values during MySQL statement execute param matching. |
| pkg/agent/proxy/integrations/http/match.go | Implements noise-aware exact/fuzzy HTTP body matching and adds match logging. |
| pkg/agent/proxy/integrations/http/match_test.go | Adds tests for noise-aware JSON scoring, stripping, and exact body matching. |
| pkg/agent/proxy/integrations/generic/match.go | Skips/handles noisy generic payloads in exact/binary matching. |
| cli/provider/core_service.go | Updates recorder constructor call to pass hooks argument. |
```go
// Noise holds exact-match regex patterns for obfuscated values.
// During mock matching, any stored value matching a pattern in this
// list is skipped (treated as noise). Written by the enterprise
// secret-protection obfuscator.
Noise []string `json:"Noise,omitempty" bson:"noise,omitempty" yaml:"noise,omitempty"`
```
Mock.Noise is added, but Mock.DeepCopy() currently doesn't copy the Noise slice, so any code that deep-copies mocks (e.g., to avoid races) will silently drop noise patterns and lose obfuscation-aware matching. Update DeepCopy to deep-copy Noise (and include it in the returned Mock).
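A minimal sketch of the suggested DeepCopy fix, using a reduced stand-in struct (the real models.Mock has many more fields):

```go
package main

import "fmt"

// Mock is a reduced stand-in for models.Mock; only Noise is relevant here.
type Mock struct {
	Name  string
	Noise []string
}

// DeepCopy returns a copy whose Noise slice shares no backing array with the
// original, so later writes to the original cannot race with readers of the copy.
func (m *Mock) DeepCopy() *Mock {
	c := &Mock{Name: m.Name}
	if m.Noise != nil {
		c.Noise = make([]string, len(m.Noise))
		copy(c.Noise, m.Noise)
	}
	return c
}

func main() {
	orig := &Mock{Name: "m1", Noise: []string{"^[a-f0-9]{32}$"}}
	cp := orig.DeepCopy()
	orig.Noise[0] = "mutated"
	fmt.Println(cp.Noise[0]) // prints "^[a-f0-9]{32}$": the copy is unaffected
}
```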
```go
nc := util.NewNoiseChecker(mock.Noise)
for requestIndex, reqBuff := range reqBuffs {
	mockData := mock.Spec.GenericRequests[requestIndex].Message[0].Data

	// If mock data is noisy (obfuscated), give it a perfect similarity score
	if nc != nil && nc.IsNoisy(mockData) {
		if 1.0 > mxSim {
			mxSim = 1.0
			mxIdx = idx
		}
		continue
	}
```
In findBinaryMatch, treating a single noisy mock buffer as a perfect similarity (mxSim=1.0) can cause this mock to win even if other request buffers are completely different, since matching uses a global max similarity across all buffers. Instead, skip noisy buffers in the similarity calculation (or aggregate similarity across all non-noisy buffers) so noise doesn't force an unconditional best match.
```go
h.Logger.Info("http mock schema match results",
	zap.Int("schema_matched", len(schemaMatched)),
	zap.Int("total_http_mocks", len(unfilteredMocks)))
```
These Info-level logs run for every request match attempt and will likely flood logs in normal operation. Consider downgrading to Debug (or gating behind a verbose flag) to avoid high log volume and performance impact.
```go
matched, total, noisy := util.JSONBodyMatchScore(mockData, reqData, nc)

pct := 100.0
if total > 0 {
	pct = float64(matched) / float64(total) * 100
}
h.Logger.Info("http mock match score (noise-aware)",
	zap.String("mock", mock.Name),
	zap.Int("matched_fields", matched),
	zap.Int("total_fields", total),
	zap.Int("noisy_fields_skipped", noisy),
	zap.Float64("match_percentage", pct))

if matched == total {
	return true, mock
}
```
The noise-aware ExactBodyMatch uses JSONBodyMatchScore and then treats matched==total as an exact match, but the score only iterates over keys present in the mock JSON. This allows requests with extra non-noisy fields to still be considered an "exact" body match. If this is meant to be exact equality ignoring noisy fields, ensure the comparison also fails on extra request keys (except those corresponding to skipped noisy fields), e.g., by stripping noisy fields from both sides and doing a deep equality check.
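A sketch of the "strip noisy fields from both sides, then deep-compare" approach the comment suggests. `isNoisy` is a hypothetical stand-in for the NoiseChecker (here, anything equal to the placeholder `"<redacted>"` counts as noise); the key point is that extra non-noisy request keys survive stripping and fail the equality check:

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// isNoisy is a stand-in for NoiseChecker matching; the real checker uses
// the mock's regex noise patterns.
func isNoisy(v interface{}) bool {
	s, ok := v.(string)
	return ok && s == "<redacted>"
}

// stripByMockNoise drops every key whose *mock* value is noisy from both
// sides, so a plain deep-equality check afterwards also rejects requests
// with extra non-noisy keys.
func stripByMockNoise(mock, req interface{}) (interface{}, interface{}) {
	mm, okM := mock.(map[string]interface{})
	rm, okR := req.(map[string]interface{})
	if !okM || !okR {
		return mock, req
	}
	outM := map[string]interface{}{}
	outR := map[string]interface{}{}
	for k, mv := range mm {
		if isNoisy(mv) {
			continue // noisy in the mock: excluded from both sides
		}
		if rv, ok := rm[k]; ok {
			outM[k], outR[k] = stripByMockNoise(mv, rv)
		} else {
			outM[k] = mv // missing in the request: will fail DeepEqual
		}
	}
	for k, rv := range rm {
		if _, inMock := mm[k]; !inMock {
			outR[k] = rv // extra request key: will fail DeepEqual
		}
	}
	return outM, outR
}

func exactIgnoringNoise(mockJSON, reqJSON []byte) bool {
	var m, r interface{}
	if json.Unmarshal(mockJSON, &m) != nil || json.Unmarshal(reqJSON, &r) != nil {
		return false
	}
	sm, sr := stripByMockNoise(m, r)
	return reflect.DeepEqual(sm, sr)
}

func main() {
	mock := []byte(`{"user":"alice","token":"<redacted>"}`)
	fmt.Println(exactIgnoringNoise(mock, []byte(`{"user":"alice","token":"anything"}`)))       // true
	fmt.Println(exactIgnoringNoise(mock, []byte(`{"user":"alice","token":"x","admin":true}`))) // false
}
```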
```go
h.Logger.Info("http mock match score (noise-aware)",
	zap.String("mock", mock.Name),
	zap.Int("matched_fields", matched),
	zap.Int("total_fields", total),
	zap.Int("noisy_fields_skipped", noisy),
	zap.Float64("match_percentage", pct))
```
Logging a per-mock noise-aware match score at Info inside the match loop can be extremely noisy and expensive at runtime (N mocks per request). Consider changing this to Debug (or only logging the winning mock / when no match is found) to reduce operational overhead.
```go
if hookErr := r.hooks.AfterTestCaseInsert(ctx, &TestCaseContext{
	TestCase: testCase, TestSetID: newTestSetID,
}); hookErr != nil {
	r.logger.Error("AfterTestCaseInsert hook failed", zap.Error(hookErr))
}
```
Hook failure is logged as an error but doesn't provide any actionable next step or context (e.g., which test case failed). Consider including identifiers (test case name/ID, testSetID) and a next-step hint (e.g., "disable custom record hooks" or "check enterprise hook implementation logs").
```go
if hookErr := r.hooks.BeforeMockInsert(ctx, &MockContext{
	Mock: mock, TestSetID: newTestSetID,
}); hookErr != nil {
	r.logger.Error("BeforeMockInsert hook failed", zap.Error(hookErr))
}
```
Hook failure is logged as an error but doesn't provide any actionable next step or context (e.g., which mock failed). Consider including identifiers (mock name/kind, testSetID) and a next-step hint (e.g., "disable custom record hooks" or "check enterprise hook implementation logs").
```go
if hookErr := r.hooks.AfterMockInsert(ctx, &MockContext{
	Mock: mock, TestSetID: newTestSetID,
}); hookErr != nil {
	r.logger.Error("AfterMockInsert hook failed", zap.Error(hookErr))
}
```
Hook failure is logged as an error but doesn't provide any actionable next step or context (e.g., which mock failed). Consider including identifiers (mock name/kind, testSetID) and a next-step hint (e.g., "disable custom record hooks" or "check enterprise hook implementation logs").
```go
compiled, err := regexp.Compile(pattern)
if err != nil {
	return nil // skip invalid patterns silently
}
```
getCachedRegexp silently drops invalid regex patterns (returns nil) with no surfaced error, which can make "noise" mismatches very hard to diagnose. Consider returning an error (or at least collecting/reporting invalid patterns via a debug log or counter) so misconfigured Mock.Noise patterns are visible to operators.
```go
h.Logger.Info("http mock body key match results",
	zap.Int("body_key_matched", len(bodyMatched)),
	zap.Int("schema_matched", len(schemaMatched)))
```
This Info-level aggregate log is emitted on every JSON-body match pass and may flood logs under load. Consider changing to Debug (or only logging when multiple candidates remain / when matching fails) to keep production logs actionable.
🚀 Keploy Performance Test Results

Multi-Run Validation: tests run 3 times; the pipeline fails only if 2+ runs show regression.
Thresholds: P50 < 5ms, P90 < 15ms, P99 < 70ms, RPS >= 100 (±1% tolerance), Error Rate < 1%
✅ Result: PASSED (0 out of 3 runs failed; threshold: 2). P50, P90, and P99 percentiles naturally filter out outliers.
…pipeline

Point the build-and-upload and build-docker-image jobs to the integrations branch with obfuscation-aware parser changes so the CI pipeline tests the full stack together.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Blank lines after Metadata break gofmt's column alignment group, causing the CI lint check to fail. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- DeepCopy now copies the Noise slice to prevent race conditions
- findBinaryMatch aggregates similarity across non-noisy buffers instead of forcing perfect score on noisy ones
- Downgrade per-request match logs from Info to Debug to reduce noise
- ExactBodyMatch now rejects requests with extra non-noisy keys
- Hook failure logs include testSetID, name, and kind for debugging
- getCachedRegexp warns on invalid regex patterns instead of silently dropping them

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 8 comments.
```go
compiled, err := regexp.Compile(pattern)
if err != nil {
	log.Printf("WARNING: invalid noise regex pattern %q: %v — pattern will be ignored", pattern, err)
	return nil
```
getCachedRegexp logs via log.Printf("WARNING: ...") when a noise regex is invalid. This adds a warning-style log line (and uses the stdlib logger) without a clear next step for users. Consider returning/propagating the compile error to the caller so it can be logged via the existing zap logger with actionable guidance (e.g., which config/mock produced the pattern), or silently ignoring invalid patterns if they’re not user-actionable.
```go
case []interface{}:
	rv, ok := reqVal.([]interface{})
	if !ok {
		return false
	}
	for i := 0; i < len(mv) && i < len(rv); i++ {
		if nc.IsNoisyValue(mv[i]) {
			continue
		}
		if HasExtraNonNoisyKeys(mv[i], rv[i], nc) {
			return true
		}
	}
	return false
```
HasExtraNonNoisyKeys doesn’t treat extra elements in request arrays as “extra non-noisy keys”. In the []interface{} case it only compares up to min(len(mv), len(rv)) and then returns false, so a request like [1,2,3] can be considered “exact” against a mock [1,2] (when matched == total). Consider adding a length check (e.g., if len(rv) > len(mv) then return true) so arrays with additional request elements don’t incorrectly pass as exact matches.
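A minimal sketch of the suggested length check for arrays. `hasExtraNonNoisyElems` and `isNoisy` are hypothetical helpers, not the real `HasExtraNonNoisyKeys`/NoiseChecker; the point is only that extra trailing request elements count as a mismatch:

```go
package main

import "fmt"

// isNoisy is a stand-in for NoiseChecker.IsNoisyValue.
func isNoisy(v interface{}) bool { return v == "<redacted>" }

// hasExtraNonNoisyElems compares the overlapping prefix (skipping noisy mock
// values) and additionally rejects requests with extra trailing elements.
func hasExtraNonNoisyElems(mock, req []interface{}) bool {
	if len(req) > len(mock) {
		return true // the request has elements the mock never recorded
	}
	for i := range req {
		if isNoisy(mock[i]) {
			continue
		}
		if mock[i] != req[i] {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasExtraNonNoisyElems([]interface{}{1, 2}, []interface{}{1, 2, 3})) // true
	fmt.Println(hasExtraNonNoisyElems([]interface{}{1, 2}, []interface{}{1, 2}))    // false
}
```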
```go
if hookErr := r.hooks.BeforeTestCaseInsert(ctx, &TestCaseContext{
	TestCase: testCase, TestSetID: newTestSetID,
}); hookErr != nil {
	r.logger.Error("BeforeTestCaseInsert hook failed",
		zap.Error(hookErr),
		zap.String("testSetID", newTestSetID),
		zap.String("testCaseName", testCase.Name))
}
```
Hook error logs (e.g., BeforeTestCaseInsert hook failed) don’t provide any actionable next step for the user, even though recording continues after the failure. Consider including guidance such as how to disable the hook/feature or where to look (enterprise hook implementation) and whether the failure affects recording results, so users know what to do when they see this error.
```go
if hookErr := r.hooks.AfterTestCaseInsert(ctx, &TestCaseContext{
	TestCase: testCase, TestSetID: newTestSetID,
}); hookErr != nil {
	r.logger.Error("AfterTestCaseInsert hook failed",
		zap.Error(hookErr),
		zap.String("testSetID", newTestSetID),
		zap.String("testCaseName", testCase.Name))
}
```
Hook error logs (e.g., AfterTestCaseInsert hook failed) don’t provide an actionable next step and don’t indicate whether the test case was still recorded successfully. Consider enriching the log message/fields to clarify impact (insert succeeded vs not) and provide guidance on resolving or disabling the hook implementation.
```go
if hookErr := r.hooks.AfterMockInsert(ctx, &MockContext{
	Mock: mock, TestSetID: newTestSetID,
}); hookErr != nil {
	r.logger.Error("AfterMockInsert hook failed",
		zap.Error(hookErr),
		zap.String("testSetID", newTestSetID),
		zap.String("mockName", mock.Name),
		zap.String("mockKind", mock.GetKind()))
}
```
Hook error logs (e.g., AfterMockInsert hook failed) don’t provide an actionable next step and don’t clarify whether the mock insert succeeded (it appears to, since this is in the success branch). Consider adjusting the message/fields to state the insert result and include guidance for fixing or disabling the hook implementation so users can act on the error.
```go
// First pass: exact string match (fastest path)
for _, mock := range schemaMatched {
	if mock.Spec.HTTPReq.Body == string(body) {
		h.Logger.Info("http mock matched",
			zap.String("mock", mock.Name),
			zap.Float64("match_percentage", 100.0),
			zap.String("match_type", "exact_body"))
		return true, mock
```
These new Info logs on every successful HTTP mock match (http mock matched) may be very high-volume in normal operation (proxy matching runs per request) and could noticeably increase log noise/cost. Consider downgrading to Debug (or gating behind a verbose flag), keeping Info for only user-actionable lifecycle events.
```yaml
- name: Add Private Parsers
  if: ${{ (github.event_name == 'pull_request' && !github.event.pull_request.head.repo.fork) || (github.event_name == 'push' && github.ref == 'refs/heads/main') }}
  uses: ./.github/actions/setup-private-parsers
  with:
    ssh-private-key: ${{ secrets.INTEGRATIONS_REPO_DEPLOY_KEY_PRIVATE }}
    go-cache: true
    integration-ref: feat/obfuscation-aware-matching
```
CI is pinned to a feature branch of the private parsers repo via integration-ref: feat/obfuscation-aware-matching. This can break builds when the branch is force-pushed/deleted and makes CI results less reproducible. Consider pinning to a tag/commit SHA (or omitting integration-ref to use the default) and updating it through a controlled release process.
```yaml
- name: Add Private Parsers
  if: ${{ (github.event_name == 'pull_request' && !github.event.pull_request.head.repo.fork) || (github.event_name == 'push' && github.ref == 'refs/heads/main') }}
  uses: ./.github/actions/setup-private-parsers
  with:
    ssh-private-key: ${{ secrets.INTEGRATIONS_REPO_DEPLOY_KEY_PRIVATE }}
    go-cache: true
    integration-ref: feat/obfuscation-aware-matching
```
This second integration-ref: feat/obfuscation-aware-matching occurrence also pins CI to a moving branch for the docker-image build job. Consider using the same pinned tag/SHA approach here as well so both jobs build against an immutable parser revision.
- Remove stdlib log.Printf from getCachedRegexp; silently skip invalid regex patterns since they are not user-actionable
- Add array length check in HasExtraNonNoisyKeys so requests with extra array elements don't pass as exact matches
- Add actionable guidance and impact clarity to all hook error logs
- Downgrade all "http mock matched" Info logs to Debug to reduce per-request log volume
- Pin CI integration-ref to commit SHA instead of moving branch name

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Ayush Sharma <kshitij3160@gmail.com>
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.
```go
// Build set of mock keys (excluding noisy ones)
mockKeys := make(map[string]struct{}, len(mv))
for k, v := range mv {
	if !nc.IsNoisyValue(v) {
		mockKeys[k] = struct{}{}
	}
}
for k := range rv {
	if _, exists := mockKeys[k]; !exists {
		return true
	}
}
```
HasExtraNonNoisyKeys treats any request key not in mockKeys as “extra”, but mockKeys is built by excluding keys whose mock value is noisy. This makes common cases like {password: <noisy>} in the mock be considered an extra key in the request (since password is excluded from mockKeys), causing otherwise-valid noise-aware exact matches to be rejected. Consider treating keys as present regardless of noise (only skip value comparisons/recursion for noisy fields), or include noisy keys in the allowed key set while still ignoring their values.
```go
case []interface{}:
	var result []interface{}
	for _, item := range v {
		if nc.IsNoisyValue(item) {
			continue
		}
		result = append(result, StripNoisyFields(item, nc))
	}
	return result
```
StripNoisyFields uses a nil []interface{} result when all array elements are stripped. encoding/json marshals a nil slice as null, which can change semantics (and similarity scoring) compared to an empty array []. Initializing result with make([]interface{}, 0, len(v)) avoids emitting null for empty arrays.
```go
// String-based fuzzy matching (Levenshtein distance)
reqStr := string(reqBuff)
if util.IsASCII(reqStr) {
	idx := h.findStringMatch(reqStr, mockStrings)
	if idx != -1 {
		h.Logger.Debug("string match found", zap.String("mock name", tcsMocks[idx].Name))
		dist := levenshtein.ComputeDistance(reqStr, mockStrings[idx])
		maxLen := len(reqStr)
		if len(mockStrings[idx]) > maxLen {
			maxLen = len(mockStrings[idx])
		}
		pct := 0.0
		if maxLen > 0 {
			pct = (1.0 - float64(dist)/float64(maxLen)) * 100
		}
```
PerformFuzzyMatch recomputes Levenshtein distance (ComputeDistance) for logging even though findStringMatch already computed distances while selecting the best match. This adds extra O(n*m) work on the hot path purely for debug output. Consider returning the distance from findStringMatch (or computing the percentage only when debug is enabled) to avoid the duplicate computation.
```diff
 // JSONDiffWithNoiseControl compares JSON with support for both Path-based noise (e.g. "body.user.id")
 // and Global noise (e.g. "timestamp") to be ignored everywhere.
-func JSONDiffWithNoiseControl(validatedJSON ValidatedJSON, noise map[string][]string, ignoreOrdering bool) (JSONComparisonResult, error) {
+func JSONDiffWithNoiseControl(validatedJSON ValidatedJSON, noise map[string][]string, ignoreOrdering bool, obfuscationNoise *util.NoiseChecker) (JSONComparisonResult, error) {
 	// Split noise into Path-based (contains dots) and Global (no dots)
 	pathNoise := make(map[string][]string)
```
JSONDiffWithNoiseControl now accepts obfuscationNoise, but all in-repo callers pass nil (HTTP matcher, gRPC matcher, schema matcher). As a result, the new obfuscation-aware branch in matchJSONWithNoiseHandlingIndexed is currently unused in this codebase. If the PR intends to make matchers obfuscation-aware, plumb a real NoiseChecker from the relevant source; otherwise consider removing this parameter to avoid dead/untested code paths.
```diff
 		continue
 	}

 	_ = base64.StdEncoding.EncodeToString(reqBuff)
-	encoded, _ := util.DecodeBase64(mock.Spec.GenericRequests[requestIndex].Message[0].Data)
+	encoded, _ := util.DecodeBase64(mockData)
```
_ = base64.StdEncoding.EncodeToString(reqBuff) is a no-op (result unused) and can be removed. Leaving it in looks like leftover debugging and makes the inner matching loop harder to read.
- HasExtraNonNoisyKeys: include noisy keys in allowed key set so requests with noisy fields are not falsely rejected (fixes TestExactBodyMatch_NoisyFullMatch)
- StripNoisyFields: initialize empty slice instead of nil to avoid marshaling as null
- findStringMatch: return distance alongside index to eliminate duplicate Levenshtein computation in PerformFuzzyMatch
- Remove unused obfuscationNoise param from JSONDiffWithNoiseControl and all callers; obfuscation-aware matching lives in the proxy layer
- Remove no-op base64 encoding in generic/match.go
- Remove integration-ref from CI workflow to match main branch

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
```diff
 dist := levenshtein.ComputeDistance(req, mock)
 if dist == 0 {
-	return 0
+	return 0, 0
 }
```
In findStringMatch, when an exact string match is found (dist == 0), the function currently returns index 0 regardless of which mock string matched. This can select the wrong mock if the exact match occurs at a later index (especially after noise stripping). Return the current idx (and distance 0) instead of hardcoding 0.
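A sketch of the suggested fix, with a minimal Levenshtein standing in for `levenshtein.ComputeDistance` (the real function and loop structure differ; this only illustrates returning the current index and distance on an exact match):

```go
package main

import "fmt"

// distance is a minimal Levenshtein implementation standing in for
// levenshtein.ComputeDistance.
func distance(a, b string) int {
	prev := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(a); i++ {
		curr := make([]int, len(b)+1)
		curr[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			curr[j] = minInt(minInt(prev[j]+1, curr[j-1]+1), prev[j-1]+cost)
		}
		prev = curr
	}
	return prev[len(b)]
}

func minInt(x, y int) int {
	if x < y {
		return x
	}
	return y
}

// findStringMatch returns the best index and its distance. On an exact match
// it returns the current idx (the review's fix) rather than a hardcoded 0,
// and the caller can reuse the returned distance instead of recomputing it.
func findStringMatch(req string, mocks []string) (int, int) {
	bestIdx, bestDist := -1, -1
	for idx, m := range mocks {
		d := distance(req, m)
		if d == 0 {
			return idx, 0
		}
		if bestDist == -1 || d < bestDist {
			bestIdx, bestDist = idx, d
		}
	}
	return bestIdx, bestDist
}

func main() {
	idx, dist := findStringMatch("GET /users", []string{"POST /a", "GET /users"})
	fmt.Println(idx, dist) // prints "1 0": the exact match at index 1, not index 0
}
```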
```go
for idx := range tcsMocks {
	mockBody := []byte(mockStrings[idx])
	k := util.AdaptiveK(len(reqBuff), 3, 8, 5)
	shingles1 := util.CreateShingles(mockBody, k)
	shingles2 := util.CreateShingles(reqBuff, k)
	similarity := util.JaccardSimilarity(shingles1, shingles2)
	if mxSim < similarity {
```
PerformFuzzyMatch's binary (Jaccard) loop recomputes request shingles (CreateShingles(reqBuff, k)) for every mock. Since reqBuff and k are constant across iterations, precomputing the request shingles once outside the loop would avoid repeated work and reduce CPU for large bodies / many mocks.
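A sketch of the hoisting suggestion, with simplified stand-ins for `util.CreateShingles` and `util.JaccardSimilarity` (the real helpers may differ in representation and tuning):

```go
package main

import "fmt"

// createShingles returns the set of k-length substrings of data.
func createShingles(data []byte, k int) map[string]struct{} {
	s := make(map[string]struct{})
	for i := 0; i+k <= len(data); i++ {
		s[string(data[i:i+k])] = struct{}{}
	}
	return s
}

// jaccard computes |A ∩ B| / |A ∪ B| over shingle sets.
func jaccard(a, b map[string]struct{}) float64 {
	inter := 0
	for s := range a {
		if _, ok := b[s]; ok {
			inter++
		}
	}
	union := len(a) + len(b) - inter
	if union == 0 {
		return 1.0
	}
	return float64(inter) / float64(union)
}

func main() {
	req := []byte("GET /users/42")
	mocks := [][]byte{[]byte("GET /orders/7"), []byte("GET /users/99")}

	k := 3
	reqShingles := createShingles(req, k) // hoisted out of the loop, as the review suggests

	bestIdx, bestSim := -1, -1.0
	for i, m := range mocks {
		if sim := jaccard(createShingles(m, k), reqShingles); sim > bestSim {
			bestIdx, bestSim = i, sim
		}
	}
	fmt.Println(bestIdx) // prints 1: the /users mock shares more shingles with the request
}
```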
```go
// Global cache for compiled regexes to avoid recompiling the same patterns
// across multiple mock comparisons.
var (
	noiseCacheMu sync.RWMutex
	noiseCache   = make(map[string]*regexp.Regexp)
)
```
noiseCache is a global map with no eviction/size bound. Since cache keys are the raw regex patterns (likely derived from recorded mocks), long-running processes that load many distinct mocks/patterns can grow this map unbounded and retain memory permanently. Consider bounding the cache (LRU/TTL) or scoping it to a replay/recording session instead of using a process-wide map.
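A sketch of one way to bound the cache. The size cap and full-eviction policy are illustrative choices (not Keploy's actual implementation); it also caches invalid patterns as nil entries so a bad pattern is not recompiled on every call:

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// boundedRegexCache caps the number of cached patterns; when full it is
// cleared wholesale, trading occasional recompilation for a hard memory bound.
type boundedRegexCache struct {
	mu         sync.Mutex
	entries    map[string]*regexp.Regexp
	maxEntries int
}

func newBoundedRegexCache(max int) *boundedRegexCache {
	return &boundedRegexCache{entries: make(map[string]*regexp.Regexp), maxEntries: max}
}

func (c *boundedRegexCache) get(pattern string) *regexp.Regexp {
	c.mu.Lock()
	defer c.mu.Unlock()
	if re, ok := c.entries[pattern]; ok {
		return re // may be nil: a cached "invalid pattern" negative result
	}
	if len(c.entries) >= c.maxEntries {
		c.entries = make(map[string]*regexp.Regexp) // full eviction
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		re = nil // cache the failure so the bad pattern is not recompiled
	}
	c.entries[pattern] = re
	return re
}

func main() {
	c := newBoundedRegexCache(2)
	fmt.Println(c.get(`^[0-9]+$`) != nil) // true: valid pattern compiles
	fmt.Println(c.get(`([`) != nil)       // false: invalid pattern cached as nil
}
```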
- findStringMatch: return actual idx on exact match instead of hardcoded 0, which could select the wrong mock
- PerformFuzzyMatch: precompute request shingles once outside the Jaccard loop to avoid redundant O(n) work per mock
- Bound noiseCache to 1024 entries with full eviction to prevent unbounded memory growth in long-running processes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
```go
// JSON-level comparison skipping noisy fields
if !pkg.IsJSON([]byte(mockBody)) || !pkg.IsJSON(body) {
	continue
}

var mockData, reqData interface{}
if err := json.Unmarshal([]byte(mockBody), &mockData); err != nil {
	continue
}
if err := json.Unmarshal(body, &reqData); err != nil {
	continue
}
```
ExactBodyMatch does JSON parsing of the request body inside the per-mock loop (pkg.IsJSON + json.Unmarshal of body each iteration). This is O(#mocks) re-parsing and can become a noticeable hot path when many mocks share the same schema. Parse/validate the request JSON once before the loop and reuse reqData (and an isReqJSON bool) while iterating mocks.
```go
// Binary fuzzy matching (Jaccard similarity) with stripped mock bodies
mxSim := -1.0
mxIdx := -1
k := util.AdaptiveK(len(reqBuff), 3, 8, 5)
reqShingles := util.CreateShingles(reqBuff, k)
for idx := range tcsMocks {
	mockBody := []byte(mockStrings[idx])
	mockShingles := util.CreateShingles(mockBody, k)
	similarity := util.JaccardSimilarity(mockShingles, reqShingles)
	if mxSim < similarity {
		mxSim = similarity
		mxIdx = idx
	}
}
```
PerformFuzzyMatch now duplicates the Jaccard-similarity loop that already exists in findBinaryMatch, but against mockStrings (noise-stripped). This duplication makes it easier for the two implementations to drift. Consider refactoring so the binary fuzzy match logic lives in one place (e.g., update findBinaryMatch to accept the preprocessed mock bodies / NoiseChecker and call it here).
```go
func getCachedRegexp(pattern string) *regexp.Regexp {
	noiseCacheMu.RLock()
	re := noiseCache[pattern]
	noiseCacheMu.RUnlock()
	if re != nil {
		return re
	}
	compiled, err := regexp.Compile(pattern)
	if err != nil {
		return nil // invalid pattern — silently skipped; not user-actionable
	}
```
getCachedRegexp recompiles the same invalid regex pattern on every call because failures aren’t cached (it returns nil immediately). If an invalid pattern appears in Mock.Noise, this can become a repeated compile cost during matching. Consider caching a negative result (e.g., store a sentinel) or filtering/validating patterns once so subsequent checks don’t re-run regexp.Compile.
```go
// HasExtraNonNoisyKeys checks whether reqVal contains keys not present in
// mockVal (excluding keys whose mock value is noisy). Returns true if extra
// non-noisy keys exist, meaning the request is not an exact match.
func HasExtraNonNoisyKeys(mockVal, reqVal interface{}, nc *NoiseChecker) bool {
	switch mv := mockVal.(type) {
	case map[string]interface{}:
		rv, ok := reqVal.(map[string]interface{})
		if !ok {
			return false
		}
		// Build set of all mock keys — noisy keys are still valid keys,
		// we only skip their value comparison, not their presence.
		mockKeys := make(map[string]struct{}, len(mv))
```
The doc comment for HasExtraNonNoisyKeys says it excludes keys whose mock value is noisy, but the implementation explicitly treats noisy keys as valid keys for presence checks (and only skips value recursion). Please align the comment and behavior to avoid confusion for future maintainers (either update the comment or adjust the key-handling logic).
- ExactBodyMatch: parse request JSON once before the noise-aware loop instead of re-parsing per mock
- Extract jaccardBestMatch helper to deduplicate Jaccard similarity logic between findBinaryMatch and PerformFuzzyMatch
- getCachedRegexp: cache invalid patterns as nil entries to avoid repeated regexp.Compile calls for the same bad pattern

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🚀 Keploy Performance Test Results

Multi-Run Validation: Tests run 3 times; the pipeline fails only if 2+ runs show regression.
Thresholds: P50 < 5ms, P90 < 15ms, P99 < 70ms, RPS >= 100 (±1% tolerance), Error Rate < 1%
✅ Result: PASSED (0 out of 3 runs failed; threshold: 2)
P50, P90, and P99 percentiles naturally filter out outliers.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
```go
nc := util.NewNoiseChecker(mock.Noise)
var simSum float64
var simCount int
for requestIndex, reqBuff := range reqBuffs {
	mockData := mock.Spec.GenericRequests[requestIndex].Message[0].Data
	// Skip noisy (obfuscated) buffers — don't let them influence similarity
	if nc != nil && nc.IsNoisy(mockData) {
		continue
	}
	encoded, _ := util.DecodeBase64(mockData)
	similarity := fuzzyCheck(encoded, reqBuff)
	simSum += similarity
	simCount++
}
// Compute average similarity across non-noisy buffers
if simCount > 0 {
	avgSim := simSum / float64(simCount)
	if avgSim > mxSim {
```
If all buffers for a mock are marked noisy, simCount stays 0 and the mock is never considered (even though, conceptually, it may be a valid match when everything is obfuscated). Handle the simCount == 0 case explicitly (e.g., treat it as a perfect/neutral similarity or fall back to a different tie-breaker) so fully-redacted generic interactions remain matchable.
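The suggested handling can be sketched as below: noisy buffers are excluded from the score, and a mock whose buffers are all noisy gets a neutral similarity of 1.0 so it stays matchable. Names are illustrative, not Keploy's exact code:

```go
package main

import "fmt"

// averageSimilarity averages per-buffer similarities, skipping noisy
// (obfuscated) buffers. If every buffer is noisy, it returns a neutral
// 1.0 instead of dropping the mock from consideration.
func averageSimilarity(sims []float64, noisy []bool) float64 {
	var simSum float64
	var simCount int
	for i, s := range sims {
		if noisy[i] {
			continue // obfuscated buffer: excluded from the score
		}
		simSum += s
		simCount++
	}
	if simCount == 0 {
		return 1.0 // fully-redacted interaction: neutral, still matchable
	}
	return simSum / float64(simCount)
}

func main() {
	fmt.Println(averageSimilarity([]float64{0.4, 0.8}, []bool{false, true})) // prints 0.4
	fmt.Println(averageSimilarity([]float64{0.4, 0.8}, []bool{true, true}))  // prints 1
}
```

Treating the fully-noisy case as 1.0 biases selection toward redacted mocks when nothing else distinguishes them; a different tie-breaker (e.g. request ordering) is an equally valid design choice.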
```go
compiled, err := regexp.Compile(pattern)
if err != nil {
	compiled = nil // will be cached as negative result
}
```
Invalid noise regex patterns are silently cached as nil and then ignored by NewNoiseChecker, which can disable obfuscation-awareness without any visibility (leading to surprising mismatches in production). Consider returning an error (or collecting invalid patterns) from NewNoiseChecker, or at least emitting a one-time log/metric when a pattern fails to compile so operators can detect misconfigured Mock.Noise.
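One way to add that visibility is to validate patterns once at construction time, as in the sketch below. This `NoiseChecker` is a simplified stand-in for the `util` type (its real fields and methods may differ); the point is that each invalid `Mock.Noise` pattern is logged exactly once rather than silently dropped:

```go
package main

import (
	"fmt"
	"log"
	"regexp"
)

// NoiseChecker holds the compiled noise patterns (simplified stand-in).
type NoiseChecker struct {
	patterns []*regexp.Regexp
}

// NewNoiseChecker compiles the noise patterns up front, logging each one
// that fails so operators can detect misconfigured Mock.Noise entries.
func NewNoiseChecker(noise []string) *NoiseChecker {
	if len(noise) == 0 {
		return nil
	}
	nc := &NoiseChecker{}
	for _, p := range noise {
		re, err := regexp.Compile(p)
		if err != nil {
			// Surface the misconfiguration instead of silently ignoring it.
			log.Printf("invalid noise pattern %q: %v (pattern skipped)", p, err)
			continue
		}
		nc.patterns = append(nc.patterns, re)
	}
	return nc
}

// IsNoisy reports whether value matches any valid noise pattern.
func (nc *NoiseChecker) IsNoisy(value string) bool {
	for _, re := range nc.patterns {
		if re.MatchString(value) {
			return true
		}
	}
	return false
}

func main() {
	nc := NewNoiseChecker([]string{"[bad", "^secret-"}) // logs the bad pattern once
	fmt.Println(nc.IsNoisy("secret-abc"), nc.IsNoisy("plain")) // prints true false
}
```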
```go
// HasExtraNonNoisyKeys checks whether reqVal contains keys not present in
// mockVal (excluding keys whose mock value is noisy). Returns true if extra
// non-noisy keys exist, meaning the request is not an exact match.
func HasExtraNonNoisyKeys(mockVal, reqVal interface{}, nc *NoiseChecker) bool {
```
The doc comment doesn’t match the implementation: the function always treats mock keys as present regardless of whether their values are noisy (it skips value comparison/recursion for noisy values, but not key presence). Update the comment to reflect actual behavior (e.g., ‘extra keys not present in the mock cause mismatch; nested extras under noisy branches are ignored’) to prevent incorrect usage/assumptions by future callers.
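The intended key-presence semantics can be illustrated with the sketch below. It deliberately omits the `NoiseChecker` parameter and the noisy-value handling to focus on the point in dispute: every mock key counts as present, and only an extra key on the request side causes a mismatch:

```go
package main

import "fmt"

// HasExtraNonNoisyKeys reports whether reqVal has keys absent from mockVal.
// All mock keys count as present; in the real implementation, noise only
// suppresses value comparison/recursion, never key presence.
func HasExtraNonNoisyKeys(mockVal, reqVal interface{}) bool {
	mv, ok1 := mockVal.(map[string]interface{})
	rv, ok2 := reqVal.(map[string]interface{})
	if !ok1 || !ok2 {
		return false
	}
	for key, rsub := range rv {
		msub, present := mv[key]
		if !present {
			return true // extra key in the request: not an exact match
		}
		if HasExtraNonNoisyKeys(msub, rsub) {
			return true
		}
	}
	return false
}

func main() {
	mock := map[string]interface{}{"a": 1.0}
	req := map[string]interface{}{"a": 1.0, "b": 2.0}
	fmt.Println(HasExtraNonNoisyKeys(mock, req)) // prints true
}
```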
- findBinaryMatch: treat fully-noisy mocks as neutral (1.0) so they remain matchable when all buffers are obfuscated
- NewNoiseChecker: log invalid regex patterns at construction time so operators can detect misconfigured Mock.Noise
- HasExtraNonNoisyKeys: update doc comment to accurately reflect that all mock keys are present regardless of noise; only value comparison is skipped for noisy fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🚀 Keploy Performance Test Results

Multi-Run Validation: Tests run 3 times; the pipeline fails only if 2+ runs show regression.
Thresholds: P50 < 5ms, P90 < 15ms, P99 < 70ms, RPS >= 100 (±1% tolerance), Error Rate < 1%
✅ Result: PASSED (0 out of 3 runs failed; threshold: 2)
P50, P90, and P99 percentiles naturally filter out outliers.
Summary

- Replace prefix-based obfuscation helpers (`IsObfuscated`, `ContainsObfuscatedValue`) with noise-pattern matching using `Mock.Noise` regex patterns
- Add a `NoiseChecker` type in `util/obfuscate.go` with compile/cache/check methods that handles all obfuscated character classes (alphanumeric, digit-only, hex) uniformly
- Update JSON matching (`JSONDiffWithNoiseControl`), MySQL `paramValueEqual`, and Generic binary match to use noise-based detection
- `matchJSONWithNoiseHandlingIndexed` change

Parsers changed

- `ExactBodyMatch`, `PerformFuzzyMatch` to use `NoiseChecker` from `mock.Noise`
- `JSONDiffWithNoiseControl` accepts `*NoiseChecker`; covers HTTP Matcher + gRPC Matcher
- `paramValueEqual` skips noisy mock param values
- `findExactMatch`/`findBinaryMatch` skip noisy mock buffers

Related PRs

Test plan

- Unit tests cover `JSONBodyMatchScore`, `StripNoisyJSON`, `ExactBodyMatch`
- `go build ./...` passes

🤖 Generated with Claude Code