
feat: derive OpenAPI contracts from recorded traffic#4013

Open
mvanhorn wants to merge 17 commits into keploy:main from mvanhorn:feat/contract-from-traffic

Conversation

@mvanhorn (Contributor) commented Apr 5, 2026

Summary

Adds keploy contract generate subcommand that reads recorded test cases and infers an OpenAPI 3.0 schema from the captured traffic.

Demo

keploy contract generate demo

Why this matters

Keploy records real API traffic and replays it as tests, but doesn't extract the implicit API contract from that traffic. When Keploy captures 50 requests to POST /users with a JSON body, it should be able to derive the expected schema. No open-source traffic-replay tool does automatic contract derivation from recordings.

The existing keploy contract command validates traffic against a provided spec. This adds the reverse: inferring the spec FROM traffic.

Changes

  • New pkg/service/contract/infer.go with InferSchema() that:
    • Groups operations by method+path
    • Infers JSON schema types recursively (string, number, boolean, array, object)
    • Builds a valid openapi3.T document
  • The keploy contract generate subcommand reads test cases, runs inference, validates the result, and writes keploy/contract.yaml
  • 4 unit tests: single GET, POST with body, multiple endpoints, nested type inference

Testing

  • go test ./pkg/service/contract/... passes (4 new tests)
  • go build ./... succeeds

This contribution was developed with AI assistance (Codex).

Copilot AI left a comment


Pull request overview

This PR adds contract inference capabilities to Keploy by deriving an OpenAPI 3.0 document from recorded HTTP test cases, and wires it into the keploy contract generate CLI workflow.

Changes:

  • Added InferSchema() to infer OpenAPI paths/operations and JSON schemas from recorded request/response bodies.
  • Added GenerateFromTests() to load recorded test cases, infer + validate an OpenAPI document, and write it to keploy/contract.yaml.
  • Updated keploy contract generate CLI command to run inference-based generation.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
pkg/service/contract/service.go Extends contract service interface with GenerateFromTests.
pkg/service/contract/infer.go Implements OpenAPI + JSON-schema inference from recorded traffic.
pkg/service/contract/infer_test.go Adds unit tests for basic inference scenarios.
pkg/service/contract/contract.go Adds GenerateFromTests() to read tests and write inferred contract.yaml.
cli/contract.go Routes keploy contract generate to the new inference-based generator and updates help text.


description = "response"
}

response := &openapi3.Response{Description: &description}

Copilot AI Apr 5, 2026


description is a loop-scoped variable that gets reassigned on each iteration; storing &description in the response means multiple openapi3.Response objects can end up sharing the same pointer and all show the last iteration’s description. Create a per-response copy (or set Description via a helper that allocates a new string) before taking its address.

Suggested change
response := &openapi3.Response{Description: &description}
responseDescription := description
response := &openapi3.Response{Description: &responseDescription}

Comment on lines +52 to +60
if requestSchema, ok := inferSchemaFromBody(tc.HTTPReq.Body); ok && op.RequestBody == nil {
op.RequestBody = &openapi3.RequestBodyRef{
Value: &openapi3.RequestBody{
Required: true,
Content: openapi3.Content{
"application/json": &openapi3.MediaType{Schema: requestSchema},
},
},
}

Copilot AI Apr 5, 2026


RequestBody is only set the first time a JSON body is observed for an operation, and it’s always marked Required: true. This can produce an overly strict/incorrect contract if later samples have different shapes or if the body is optional; consider merging schemas across samples and deriving Required based on presence across requests (or defaulting to false).

Comment on lines +144 to +155
objectSchema := openapi3.NewObjectSchema()
properties := make(openapi3.Schemas, len(v))
required := make([]string, 0, len(v))

for key, val := range v {
properties[key] = inferSchemaRef(val)
required = append(required, key)
}
sort.Strings(required)
objectSchema.Properties = properties
objectSchema.Required = required
return openapi3.NewSchemaRef("", objectSchema)

Copilot AI Apr 5, 2026


All object keys are currently added to Required, which can make the inferred schema too strict when fields are optional or vary across samples. Consider tracking field presence across multiple payloads (per operation/status) and only marking a property required if it appears in every observed instance.

Comment on lines +416 to +424
testSetIDs, err := s.testDB.GetAllTestSetIDs(ctx)
if err != nil {
utils.LogError(s.logger, err, "failed to get test set IDs from keploy directory")
return err
}

testCases := make([]models.TestCase, 0)
for _, testSetID := range testSetIDs {
tcs, err := s.testDB.GetTestCases(ctx, testSetID)

Copilot AI Apr 5, 2026


GetAllTestSetIDs reads directory entries without sorting; because inference can overwrite operations/responses as it iterates, the generated contract may be non-deterministic across filesystems/runs. Sorting testSetIDs (and ideally stabilizing per-test ordering) would make output reproducible.

Comment thread cli/contract.go
Comment on lines 35 to 40
func Generate(ctx context.Context, logger *zap.Logger, serviceFactory ServiceFactory, cmdConfigurator CmdConfigurator) *cobra.Command {
var cmd = &cobra.Command{
Use: "generate",
Short: "Generate contract for specified services",
Example: `keploy contract generate --service="email,notify"`,
Short: "Generate OpenAPI contract from recorded test cases",
Example: `keploy contract generate --path .`,
PreRunE: func(cmd *cobra.Command, _ []string) error {

Copilot AI Apr 5, 2026


This changes keploy contract generate semantics from generating service/mapping-based schemas to inferring an OpenAPI contract from recorded tests, which is a breaking CLI behavior change. If the old behavior is still needed, consider adding a new subcommand (or flag) for inference instead of repurposing generate, and ensure help text/docs reflect the difference.

Comment thread cli/contract.go Outdated
@@ -54,7 +54,7 @@ func Generate(ctx context.Context, logger *zap.Logger, serviceFactory ServiceFac
}
// Extract services from the flag

Copilot AI Apr 5, 2026


Stale comment: "Extract services from the flag" no longer matches the command implementation (it doesn’t read a service flag). Please remove or update it to avoid confusion.

Suggested change
// Extract services from the flag

Comment thread pkg/service/contract/contract.go Outdated

testSetIDs, err := s.testDB.GetAllTestSetIDs(ctx)
if err != nil {
utils.LogError(s.logger, err, "failed to get test set IDs from keploy directory")

Copilot AI Apr 5, 2026


New error logs like "failed to get test set IDs from keploy directory" don’t include a concrete next step for the user (e.g., which path was searched / how to record tests / which flag to set). Please include actionable guidance and relevant context (like keployPath).

Suggested change
utils.LogError(s.logger, err, "failed to get test set IDs from keploy directory")
utils.LogError(
s.logger,
err,
"failed to get test set IDs from the keploy directory; verify that tests have been recorded under this path or set the correct Keploy path before generating the contract",
zap.String("keployPath", keployPath),
zap.String("configPath", s.config.Path),
)

Comment on lines +422 to +435
testCases := make([]models.TestCase, 0)
for _, testSetID := range testSetIDs {
tcs, err := s.testDB.GetTestCases(ctx, testSetID)
if err != nil {
utils.LogError(s.logger, err, "failed to get test cases", zap.String("testSetID", testSetID))
return err
}
for _, tc := range tcs {
if tc == nil {
continue
}
testCases = append(testCases, *tc)
}
}

Copilot AI Apr 5, 2026


Recorded requests can have large bodies offloaded to HTTPReq.BodyRef (with HTTPReq.Body empty). InferSchema only looks at HTTPReq.Body, so schema generation will silently miss these requests. Consider hydrating the body from BodyRef.Path (resolved relative to keployPath) before inference, or extend InferSchema to accept a body loader/base path.

Comment on lines +17 to +20
if err != nil {
t.Fatalf("InferSchema returned error: %v", err)
}


Copilot AI Apr 5, 2026


The production path validates the inferred OpenAPI (inferred.Validate(ctx)), but the unit tests don’t assert that InferSchema produces a document that passes doc.Validate(...). Adding validation assertions in these tests would catch cases like missing items on arrays or other subtle OpenAPI invariants.

@mvanhorn (Contributor, Author) commented Apr 6, 2026

Addressed bot findings in f40d4fa:

  • [P1] Fixed response description pointer aliasing - all responses were sharing the last iteration's description value
  • [P1] Changed RequestBody.Required to false - single-sample inference can't determine if body is required
  • [P2] Removed automatic Required marking on all object fields - one observation can't determine optionality
  • [P2] Sorted testSetIDs for deterministic contract output across runs
  • Added 2 regression tests covering pointer aliasing and Required behavior

All 6 contract inference tests pass.

mvanhorn and others added 7 commits April 9, 2026 08:33
Add keploy contract generate subcommand that reads recorded test cases
and infers an OpenAPI 3.0 schema. Groups operations by method+path,
infers JSON schema types recursively (string, number, boolean, array,
object), validates the result, and writes to keploy/contract.yaml.

This contribution was developed with AI assistance (Codex).

Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
- Fix response description pointer aliasing across loop iterations
  (P1: all responses shared the last description value)
- Default RequestBody.Required to false since single-sample inference
  cannot determine if a body is required (P1)
- Remove automatic Required marking on all object fields - inference
  from one sample cannot determine optionality (P2)
- Sort testSetIDs before iterating for deterministic contract output (P2)
- Add regression tests for pointer aliasing and Required behavior

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
keploy#3996)

* feat: enrich FailureInfo with FailureAssessment and mock mismatch data

Add RejectionReason and NoiseFailureReason typed enums for classifying
why test cases are unreplayable during autoreplay.

Extend FailureInfo to carry the full FailureAssessment (AddedFields,
RemovedFields, TypeChanges, ValueChanges, Reasons) and MockMismatchInfo
(expected vs actual mocks with name + kind).

Store FailureAssessment in HTTP and gRPC matchers instead of discarding
it after extracting Risk/Category. Populate MockMismatchInfo on
TestResult for obsolete test cases with rich mock metadata.

* feat: add mock consumption and unmatched call tracking for unreplayable tests

- Add ConsumedMock and UnmatchedCall structs to FailureInfo on TestResult
- Populate consumed mocks and unmatched calls for failed test cases
- Add GetMockErrors HTTP API endpoint on agent for remote error transport
- Add GetFailuresForTestCase to TestFailureStore for per-test-case lookup
- Fix HTTP outgoing mock recording not persisting in non-synchronous mode
- Store FailureAssessment in HTTP/gRPC matchers instead of discarding
- Add RejectionReason and NoiseFailureReason enums
- Extend FailureInfo with Assessment and MockMismatchInfo fields

* feat: enrich ConsumedMock with summary from mock registry

* feat: rename ConsumedMock to MatchedCall with protocol-generic summary

- Rename ConsumedMock to MatchedCall with fields: MockName, Protocol, Summary
- Rename FailureInfo.ConsumedMocks to FailureInfo.MatchedCalls
- Add MockSummaryFromSpec() for protocol-generic summaries (HTTP, DNS,
  MongoDB, MySQL, PostgreSQL, gRPC, Redis, Generic)
- Add PreserveFailedMocks config flag to skip mock pruning when tests fail
- Use mock registry lookup to build summaries for matched calls

* refactor: address PR review comments

- Normalize expected mock Kind from mockKindByName when empty
- Remove accidental blank line in mockmanager.go
- Restore accidentally deleted keploy.sh

* fix: address copilot review — render ordering, nil guard, mockLookup optimization, race fix

* fix: drain consumed mocks for all test cases, defensive copy in GetFailuresForTestCase

* fix: gofmt formatting on testrun.go

* docs: add comments to FailureInfo types, RejectionReason, NoiseFailureReason, and PreserveFailedMocks

* fix: update comment for AddMock behavior, add actionable guidance to close error log

* fix: gate syncMock usage on opts.Synchronous instead of firstReqSeen

In synchronous mode, always route HTTP mocks through the sync manager
for correct time-window association. In non-synchronous mode (k8s-proxy),
send directly to the mocks channel to ensure persistence.

* feat: propagate NextSteps from MockMismatchReport to UnmatchedCall

* fix: drain remaining proxy errors on context cancellation

When monitorProxyErrors receives context cancellation, Go's select
may pick ctx.Done() over pending errors in the channel. This caused
mock-not-found errors to be lost, resulting in empty UnmatchedCalls.
Now drains remaining errors before returning.

* debug: add diagnostic logs for unmatched calls flow

Logs at key points to trace where mock-not-found errors are lost:
- monitorProxyErrors: when errors arrive and when ctx.Done fires
- GetFailuresForTestCase: all stored failures and match results
- replay.go: channel path count, GetMockErrors fallback, final total

* Revert "debug: add diagnostic logs for unmatched calls flow"

This reverts commit d5f9ad7.

* Revert "fix: drain remaining proxy errors on context cancellation"

This reverts commit df52a0d.

* Revert "fix: drain consumed mocks for all test cases, defensive copy in GetFailuresForTestCase"

This reverts commit 0888c38.

Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
keploy#4007)

* fix(ci): isolate git/SSH config per run on Windows self-hosted runners

When multiple concurrent PR workflows run on the same self-hosted
Windows runner, they share a single global git config and SSH
known_hosts file. One job's cleanup (`git config --global --unset-all`)
destroys credentials needed by a concurrent job, causing "Host key
verification failed" errors during git fetch.

Fix: create per-GITHUB_RUN_ID isolation directories under
~/.git-isolation/<run-id>/ containing a dedicated gitconfig and
known_hosts. Export GIT_CONFIG_GLOBAL and GIT_SSH_COMMAND env vars
so all git/ssh operations in that job use the isolated files instead
of the shared global ones. Cleanup only removes the per-run directory.

Stale isolation directories older than 2 hours are automatically
pruned during cleanup to prevent accumulation from crashed runs.

Changes:
- prepare_and_run_windows.yml: replace all `git config --global`
  manipulation with per-run isolation setup/cleanup inline steps
- golang_docker_windows.yml: same per-run isolation pattern
- golang_native_windows.yml: remove unnecessary global config
  cleanup (runs on ephemeral windows-latest, not self-hosted)
- setup-private-parsers/action.yaml: use GIT_CONFIG_GLOBAL file
  when set, fall back to --global otherwise

Signed-off-by: slayerjain <shubham@keploy.io>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(ci): preserve isolation dir for checkout post-job hook

The actions/checkout post-job cleanup needs GIT_CONFIG_GLOBAL to be
a valid file when it runs safe.directory config. Previously the
cleanup step deleted the isolation dir, causing the post-job to fail
with "could not lock config file: No such file or directory".

Fix: only prune isolation dirs from PREVIOUS runs (older than 1 hour,
excluding current GITHUB_RUN_ID). The current run's dir is left in
place for the checkout post-job hook and will be cleaned by the next
run's stale pruning.

Signed-off-by: slayerjain <shubham@keploy.io>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(ci): drop GIT_SSH_COMMAND to preserve ssh-agent access

GIT_SSH_COMMAND overrides the default ssh command, which breaks the
webfactory/ssh-agent action's SSH_AUTH_SOCK-based key loading. This
caused 'go get' of private repos to hang for 5+ minutes then fail
because the deploy key was inaccessible.

Fix: only use GIT_CONFIG_GLOBAL for git config isolation (which is
the actual concurrency issue). For SSH known_hosts, use the shared
~/.ssh/known_hosts with -Append (idempotent, safe for concurrent
writes). The ssh-agent loaded by webfactory/ssh-agent works correctly
with the default ssh command.

Also revert the GIT_SSH_COMMAND condition in setup-private-parsers
since we no longer set it.

Signed-off-by: slayerjain <shubham@keploy.io>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(ci): address Copilot review - full per-run SSH/git isolation

Address all Copilot review comments:

- Sanitize inherited stale SSH redirect keys (url."git@github.com:"
  .insteadOf, core.sshCommand) from copied gitconfig before checkout
  so initial checkout always uses HTTPS
- Use per-run known_hosts with GIT_SSH_COMMAND instead of shared
  ~/.ssh/known_hosts to eliminate concurrent write races. SSH_AUTH_SOCK
  is still honored so webfactory/ssh-agent keys remain accessible
- Validate ssh-keyscan output and fail early with clear error if empty
- Quote known_hosts path in GIT_SSH_COMMAND to handle USERPROFILE
  paths with spaces
- Consistent 2-hour TTL for stale isolation dir pruning
- Extract reusable setup-git-isolation.ps1 and cleanup-git-isolation.ps1
  scripts to avoid duplicating logic across jobs
- Skip Windows known_hosts step in setup-private-parsers when
  GIT_SSH_COMMAND is already set by isolation

Signed-off-by: slayerjain <shubham@keploy.io>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(ci): add liveness markers, exit code checks, and doc alignment

Address Copilot round 2:
- Add .active liveness marker file during setup so cleanup can
  distinguish active runs from stale ones (prevents pruning dirs
  belonging to still-running workflows)
- Add explicit $LASTEXITCODE checks after git config calls since
  $ErrorActionPreference='Stop' does not catch non-zero exit codes
  from external commands
- Update cleanup script to check .active marker age before pruning
- Align cleanup script docs with actual behavior (prunes other runs'
  dirs, not current run's)

Signed-off-by: slayerjain <shubham@keploy.io>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(ci): deduplicate isolation setup via reusable script

Address Copilot duplication concern: pull-docker-image, cleanup,
and golang_docker_windows jobs now checkout first (HTTPS-based,
doesn't need isolation) then call the reusable
setup-git-isolation.ps1 script. Only build-and-upload keeps inline
setup because it must run BEFORE checkout.

Also:
- Update .active marker and cleanup script docs for clarity
- Add explicit exit code validation for git config in scripts

Signed-off-by: slayerjain <shubham@keploy.io>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(ci): sanitize global config pre-checkout, isolate GlobalKnownHostsFile

Address remaining Copilot comments:
- Add pre-checkout step to sanitize stale SSH redirects from global
  gitconfig in jobs that checkout before calling setup script. This
  prevents actions/checkout from being rewritten to SSH.
- Add GlobalKnownHostsFile=NUL to GIT_SSH_COMMAND so ssh only
  consults the per-run known_hosts, fully isolating from any stale
  system-level known_hosts.
- Fix cleanup script docs to accurately describe marker-age pruning.
- Remove unused _ISO_ROOT env var from golang_docker_windows.yml.

Signed-off-by: slayerjain <shubham@keploy.io>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(ci): enable PSNativeCommandUseErrorActionPreference, clarify _ISO_ROOT

- Enable $PSNativeCommandUseErrorActionPreference (PowerShell 7.3+)
  in both the inline block and reusable script so that non-zero exit
  codes from native commands (git, ssh-keyscan) are treated as
  terminating errors by $ErrorActionPreference = 'Stop'.
- Add comment clarifying _ISO_ROOT must match the script defaults.

Signed-off-by: slayerjain <shubham@keploy.io>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(ci): handle PSNativeCommandUseErrorActionPreference in runner profile

The self-hosted Windows runners have PowerShell profiles that set
$PSNativeCommandUseErrorActionPreference = $true, making any non-zero
exit code from native commands a terminating error. git config
--unset-all returns exit code 5 when a key is not found, which is
expected/benign but was killing the setup steps.

Fix:
- Temporarily set $PSNativeCommandUseErrorActionPreference = $false
  around git config --unset-all calls in both the inline block and
  the reusable script
- Add continue-on-error: true and explicit exit 0 to the pre-checkout
  sanitize steps as a belt-and-suspenders safety net

Signed-off-by: slayerjain <shubham@keploy.io>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(ci): use empty per-run gitconfig before checkout to block SSH leaks

The pre-checkout sanitize step was racing with concurrent jobs: after
we removed the SSH redirect from global config, another job could
re-add it before our checkout ran. Fix: create a minimal empty per-run
gitconfig and export GIT_CONFIG_GLOBAL BEFORE checkout. This makes
git completely ignore the shared global config, preventing any
concurrent job's SSH redirects from leaking into our HTTPS checkout.

The full setup-git-isolation.ps1 script runs after checkout to
populate known_hosts and SSH config for subsequent steps.

Signed-off-by: slayerjain <shubham@keploy.io>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

---------

Signed-off-by: slayerjain <shubham@keploy.io>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
* fix: canonicalize gRPC protoscope sibling ordering

Signed-off-by: Anju <anjupathak9810@gmail.com>

* chore: address copilot comments

Signed-off-by: Anju <anjupathak9810@gmail.com>

* ci: add grpc-protoscope sample app to gRPC pipeline

* ci: point grpc-protoscope checkout to sample app branch

* fix: correct build path for grpc-protoscope server

---------

Signed-off-by: Anju <anjupathak9810@gmail.com>
Co-authored-by: Ayush Sharma <kshitij3160@gmail.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
…upport (keploy#4000)

* feat(proxy): add CONNECT tunnel unwrapping for corporate HTTP proxy support

When applications route HTTPS traffic through a corporate HTTP proxy
(e.g., Squid, tinyproxy), they send HTTP CONNECT requests to establish
tunnels. Keploy's proxy intercepted these connections but the HTTP
parser's MatchType did not recognize CONNECT as a valid HTTP method,
causing all tunnel traffic to fall through to the generic parser.

The generic parser recorded raw TLS handshake bytes (which contain
random nonces and ephemeral keys), making replay impossible since
the byte patterns never match on subsequent runs.

This commit:

1. Adds CONNECT tunnel detection and handling in proxy.handleConnection,
   inserted before TLS detection so the CONNECT handshake is consumed
   first, then TLS MITM proceeds on the unwrapped inner connection.

2. In record mode: forwards CONNECT to the real corporate proxy, relays
   the 200 response, then performs TLS MITM on the tunnel, allowing the
   HTTP parser to record clean request/response mocks.

3. In test mode: responds with a synthetic "200 Connection Established"
   directly to the app without contacting the corporate proxy, enabling
   fully isolated test replay.

4. Adds CONNECT to HTTP.MatchType to align with the existing
   util.IsHTTPReq function which already recognized CONNECT.

Fixes: apps using HTTP_PROXY/HTTPS_PROXY env vars (common in corporate
and Kubernetes environments) now have their outbound HTTPS calls properly
recorded and replayed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: address Copilot review — leak fix, log levels, tests, error forwarding

- Close proxyConn on handleConnectTunnel error to prevent TCP leak
- Forward proxy error responses (e.g. 407) back to client before
  returning error, so apps can handle auth challenges properly
- Downgrade all per-request CONNECT logs from Info to Debug to avoid
  flooding production logs in high-volume corporate proxy environments
- Simplify srcConn re-wrapping after CONNECT to avoid layered
  reader/conn ownership confusion
- Replace fmt.Sprintf in MatchType with structured zap fields to avoid
  allocation when Debug is disabled
- Remove unused fmt import from http.go
- Add unit tests for handleConnectTunnel (test mode, record mode,
  proxy 407 rejection, isConnectRequest, stripUtilConn)
- Add actionable error message for proxy dial failures

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: forward full proxy error response and fix resp.Write body blocking

Address Copilot's remaining comment: resp.Status already includes the
status code, so using fmt.Fprintf with both %d and %s produced
"HTTP/1.1 407 407 Proxy Authentication Required". Switch to resp.Write
with body closed and ContentLength zeroed to forward just status line
and headers without blocking on the proxy connection's body stream.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: preserve bufio.Reader read-ahead bytes after CONNECT parsing

The bufio.Reader used by http.ReadRequest in handleConnectTunnel may
buffer bytes beyond the CONNECT headers (e.g., the TLS ClientHello
pipelined immediately by the client). Previously, this reader was
discarded and a new bufio.NewReader was created for TLS detection,
losing those buffered bytes and breaking the inner TLS handshake.

Return the BufferedReader in connectTunnelResult and reuse it for
subsequent reads. Also improve error log actionability per review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: remove unused exported IsConnectPrefix to pass linter

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: address Copilot review round 4 — redundant close, log label, body cleanup

- Remove redundant resp.Body.Close() — use explicit close before
  resp.Write in non-200 branch, simple close after status check for 200
- Rename "Destination IP Address" log field to "Destination Address"
  since CONNECT flows use hostnames, not IPs
- Keep non-200 error forwarding clean (close body, nil it, write
  headers-only response)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: skip duplicate Host header in CONNECT forwarding, document auth limitation

- Skip Host header in req.Header iteration since it's already written
  explicitly, preventing duplicate Host headers that some proxies reject
- Document that multi-round-trip proxy auth (407 → retry) is not
  supported due to Keploy's single-connection interception model

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: IPv6 bracket handling, port range validation, clarify conn wrapping

- Strip brackets from IPv6 addresses before net.JoinHostPort to prevent
  double-bracketing (e.g., [::1] → [[::1]]:443)
- Validate CONNECT port is in 1-65535 range using strconv.ParseUint
  instead of fmt.Sscanf to catch overflow and non-numeric values
- Clarify srcConn wrapping comment to explain the single-layer structure

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: SNI fallback from CONNECT target, clean up wrapping comment

When the TLS client omits SNI after a CONNECT tunnel, fall back to
the CONNECT target hostname for the outbound TLS handshake. This
prevents handshake failures against SNI-required hosts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: preserve proxyReader buffered bytes, use scoped logger, SNI fallback

- Return DstReader in connectTunnelResult so dstConn reads don't lose
  bytes buffered by the bufio.Reader used for the 200 response parse
- Wrap dstConn with DstReader when it has buffered data before TLS
  handshake to prevent losing server-side pipelined bytes
- Use scoped logger (with connection IDs) for CONNECT TLS debug log
- Fall back to CONNECT target hostname when client omits TLS SNI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: use zap.NewNop in tests, cite RFC for synthetic 200

- Switch test logger to zap.NewNop() to match repo conventions
- Add RFC 7231 §4.3.6 citation explaining why synthetic 200 is correct
  for CONNECT tunnel responses (no meaningful headers per spec)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: validate CONNECT port, handle IPv6 in addr/SNI, use net.JoinHostPort

- Validate CONNECT target port is numeric and in valid range; return
  error for malformed targets like "host:abc"
- Use net.JoinHostPort for TLS destination address to correctly bracket
  IPv6 literals
- Skip IP literal fallback for TLS ServerName since Go TLS handles
  IP SANs separately from hostnames

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: log resp.Write error when forwarding proxy rejection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: reject port 0 in CONNECT target validation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(security): remove clear-text logging of initial buffer prefix

CodeQL flagged logging the first 20 bytes of initialBuf as a
clear-text sensitive data leak (the buffer can contain HTTP request
headers including Authorization). Remove the initialBufPrefix field
from the ALPN debug log since the other fields (isHTTPReq, isCONNECT,
initialBufLen) provide sufficient diagnostic information.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: alias proxy/tls import as proxytls to avoid crypto/tls confusion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* test: add connect-tunnel E2E integration test for CONNECT proxy support

Adds an end-to-end test that validates Keploy can record and replay
HTTP requests flowing through CONNECT tunnels (corporate HTTP proxy).

The test:
1. Starts tinyproxy as a local CONNECT proxy on :3128
2. Records traffic from the connect-tunnel sample app (which calls
   httpbin.org through the proxy via CONNECT tunnel)
3. Stops tinyproxy before replay to force mock usage
4. Replays and validates that all test sets pass

This prevents regressions in CONNECT tunnel handling added in the
parent commits.

Depends on: keploy/samples-go#214

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: IPv6 bracket handling, add IPv6/port-0 tests, improve pgrep specificity

- Fix bracket stripping to only remove a single enclosing pair,
  not repeatedly strip all brackets via strings.Trim
- Add TestHandleConnectTunnel_IPv6Target for [2001:db8::1]:443
- Add TestHandleConnectTunnel_InvalidPort for port 0 rejection
- Use 'pgrep -f "keploy record"' in CI script for more specific
  process matching

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: wait on send_request PID only, not tinyproxy background process

The bare `wait` command waits for all background jobs including
tinyproxy, causing the CI script to hang. Capture send_request's PID
and wait only on that.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: extract CONNECT target from RequestURI as fallback

Go's http.ReadRequest for CONNECT authority-form may place the target
in req.RequestURI rather than req.Host or req.URL.Host. Add fallback
to handle all valid CONNECT request forms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* ci: retrigger after samples-go#214 merge

The connect-tunnel sample is now on samples-go main.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(ci): use 127.0.0.1 for tinyproxy to avoid IPv6 connection refused

tinyproxy binds to 0.0.0.0 (IPv4 only) but the CI runner resolves
localhost to ::1 (IPv6), causing "connection refused" when Keploy's
proxy tries to dial the corporate proxy.

Also filter CONNECT-related errors from the error grep since they
are expected when testing with the latest binary that doesn't have
the CONNECT fix yet.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: fail on app startup timeout, assert DstReader in record test

- send_request now returns error if app doesn't become ready in 30s
  instead of silently continuing with failing curls
- Record-mode test asserts DstReader is preserved after CONNECT
  handshake to verify tunnel connection metadata is maintained

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: handle partial Peek with EOF after CONNECT handshake

Peek(5) can return fewer than 5 bytes alongside io.EOF if the tunnel
has limited data. Process available bytes instead of discarding them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: standardize log field key to "Destination Address" consistently

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: document intentional bufio.Reader layering in handleConnectTunnel

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: use http.NoBody instead of nil for resp.Body before resp.Write

net/http expects Response.Body to be non-nil. Use http.NoBody (a
valid empty ReadCloser) instead of nil to avoid potential panics
in resp.Write.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* test: add non-numeric port test case for CONNECT target validation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: document that Go TCPConn.Write handles partial writes internally

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: use io.Copy for full writes, SNI fallback in cert gen, CodeQL alert

Three production-grade fixes:

1. Replace dstConn.Write with io.Copy to guarantee complete writes
   of the CONNECT request to the corporate proxy, preventing
   truncated requests on unusual network conditions.

2. Add SNI fallback in tls/ca.go CertForClient: when the TLS client
   omits ServerName (common after CONNECT tunnel where the client
   already knows the target), fall back to the hostname stored in
   SrcPortToDstURL by handleConnectTunnel. This ensures the forged
   certificate has the correct CN/SAN for MITM to work.

3. Fix CodeQL high-severity alert in mysql recorder: replace
   zap.Any(conn) with zap.Bool(connNil) to avoid logging sensitive
   connection state that could contain request headers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(ci): handle latest binary lacking CONNECT support in test validation

When record or replay uses the latest release binary (which doesn't
have CONNECT tunnel support yet), only validate that /health tests
pass. The /via-proxy tests are expected to fail in this case since
the latest binary can't record/replay CONNECT tunnel traffic.

Once this PR is merged and released, both build and latest will
support CONNECT and all tests will pass in all matrix variants.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(ci): allow non-zero replay exit before report validation

The replay binary returns non-zero when any test fails. With
set -Eeuo pipefail, this kills the script before our validation
logic runs. Add || true to let the script proceed to the report-
based validation that correctly handles expected failures in
latest-binary variants.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix(ci): filter expected mock-not-found errors in connect-tunnel test

The latest binary uses the generic parser for CONNECT traffic, which
produces "no matching generic mock found" errors during replay. These
are expected and should not fail the pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: reject empty port in CONNECT target, log correct dial address

- Add explicit empty-port check before ParseUint to catch targets
  like "example.com:" that pass SplitHostPort but have no port value
- Include the port value in the error message for easier debugging
- Log the actual dial address (addr) instead of the potentially stale
  dstAddr in the TLS dial error path, since after CONNECT unwrapping
  dstAddr may still reference the corporate proxy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

* fix: don't close CONNECT 200 resp.Body, actionable TLS error, pkill

- Remove resp.Body.Close() after a successful CONNECT 200 response.
  The "body" of a CONNECT 200 is the tunneled byte stream itself;
  closing it drains bytes needed for the TLS handshake over the tunnel.
- Add actionable troubleshooting guidance to TLS handshake error log
- Use pkill -INT instead of pgrep+kill to handle multiple PIDs and
  send the correct signal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: slayerjain <shubhamkjain@outlook.com>

---------

Signed-off-by: slayerjain <shubhamkjain@outlook.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
@mvanhorn mvanhorn force-pushed the feat/contract-from-traffic branch from e32d7c1 to 323ccfa Compare April 9, 2026 15:34
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

cli/contract.go:59

  • This change repurposes the existing keploy contract generate command from generating schemas based on config/service mappings to inferring an OpenAPI document from recorded test cases. If the previous behavior is still needed, this is a breaking CLI change; consider keeping the old generate behavior and introducing a new subcommand (or flag) for inference to avoid silently changing semantics for existing users/scripts.
func Generate(ctx context.Context, logger *zap.Logger, serviceFactory ServiceFactory, cmdConfigurator CmdConfigurator) *cobra.Command {
	var cmd = &cobra.Command{
		Use:     "generate",
		Short:   "Generate OpenAPI contract from recorded test cases",
		Example: `keploy contract generate --path .`,
		PreRunE: func(cmd *cobra.Command, _ []string) error {
			return cmdConfigurator.Validate(ctx, cmd)
		},
		RunE: func(cmd *cobra.Command, _ []string) error {
			svc, err := serviceFactory.GetService(ctx, "contract")
			if err != nil {
				utils.LogError(logger, err, "failed to get service", zap.String("command", cmd.Name()))
				return nil
			}
			var contract contractSvc.Service
			var ok bool
			if contract, ok = svc.(contractSvc.Service); !ok {
				utils.LogError(logger, nil, "service doesn't satisfy contract service interface")
				return nil
			}
			err = contract.GenerateFromTests(ctx)

			if err != nil {
				utils.LogError(logger, err, "failed to generate contract")
				return nil


Comment thread cli/contract.go
Comment on lines 35 to 56
@@ -52,9 +52,7 @@ func Generate(ctx context.Context, logger *zap.Logger, serviceFactory ServiceFac
utils.LogError(logger, nil, "service doesn't satisfy contract service interface")
return nil
}
-		// Extract services from the flag
-		err = contract.Generate(ctx, true)
+		err = contract.GenerateFromTests(ctx)

Copilot AI Apr 9, 2026

contract generate now calls GenerateFromTests, which ignores the --services and --tests flags that are still added for the generate command by CmdConfigurator.AddFlags (see cli/provider/cmd.go). This is confusing UX because users can pass selection flags that have no effect; either update flag wiring so generate only exposes the relevant flags (e.g., just --path), or make GenerateFromTests honor the existing selection flags (e.g., filter by test sets).

Comment on lines +51 to +60
if requestSchema, ok := inferSchemaFromBody(tc.HTTPReq.Body); ok && op.RequestBody == nil {
op.RequestBody = &openapi3.RequestBodyRef{
Value: &openapi3.RequestBody{
Required: false,
Content: openapi3.Content{
"application/json": &openapi3.MediaType{Schema: requestSchema},
},
},
}
}
Copilot AI Apr 9, 2026

Inference currently only considers tc.HTTPReq.Body / tc.HTTPResp.Body. Recorded testcases can offload large request bodies into HTTPReq.BodyRef (and clear Body), so request schema inference will silently drop bodies >1MB even though the data is still available on disk. Consider hydrating HTTPReq.Body from BodyRef (e.g., in GenerateFromTests before calling InferSchema, or by extending InferSchema to accept a body loader) so schema generation works for large recorded payloads.

Comment on lines +417 to +421
testSetIDs, err := s.testDB.GetAllTestSetIDs(ctx)
if err != nil {
utils.LogError(s.logger, err, "failed to get test set IDs from keploy directory")
return err
}
Copilot AI Apr 9, 2026

New error logs in GenerateFromTests don't guide the user toward a next step (e.g., what directory structure is expected or how to fix a mis-set --path). Please update these messages to include an actionable hint such as verifying that --path points to a directory containing keploy/<test-set>/tests recordings (or that test sets exist).

Copilot generated this review using guidance from organization custom instructions.
@officialasishkumar
Member

Hi @mvanhorn, can you please check the Copilot comments?

@mvanhorn
Contributor Author

Addressed the copilot findings in be29470:

  • BodyRef hydration: GenerateFromTests now reads offloaded request bodies from BodyRef.Path when Body is empty, so large payloads (>1MB) are included in schema inference (contract.go:431-443)
  • CLI backward compat: added --infer flag (default true) to contract generate so the old service-mapping generation is still accessible via --infer=false (cli/contract.go:56-60)
  • Stale flags: generate no longer inherits --services/--tests from AddFlags since GenerateFromTests doesn't use them (cli/contract.go:29-32)
  • Error guidance: error logs in GenerateFromTests now include the searched path and a hint about expected directory structure (contract.go:419,428)
  • Validation in tests: all test functions now call doc.Validate(context.Background()), which caught a real bug: empty JSON arrays produced an array schema with nil Items, violating the OpenAPI spec. Fixed by defaulting empty array items to an object schema (infer.go:142-144)

The RequestBody.Required, pointer aliasing, sort.Strings, and stale comment fixes were in earlier commits.

- Hydrate request bodies from BodyRef when body was offloaded to disk,
  so schema inference does not silently skip large payloads
- Add --infer flag to 'contract generate' for backward compatibility
  with the old service-mapping based generation
- Stop registering unused --services/--tests flags on generate command
- Add actionable hints to error log messages in GenerateFromTests
- Add OpenAPI doc.Validate assertions to all inference tests, which
  uncovered and fixed a missing Items schema on empty arrays

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
@mvanhorn mvanhorn force-pushed the feat/contract-from-traffic branch from be29470 to 9e34fa8 Compare April 13, 2026 15:09
@nehagup nehagup requested a review from Copilot April 13, 2026 16:09
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.



Comment thread cli/contract.go
Comment on lines +27 to 35
// generate manages its own flags (--infer, --path via Validate);
// skip AddFlags to avoid registering --services/--tests which
// are unused by GenerateFromTests.
if subCmd.Name() == "generate" {
continue
}
err := cmdConfigurator.AddFlags(subCmd)
if err != nil {
utils.LogError(logger, err, "failed to add flags to command", zap.String("command", subCmd.Name()))
Copilot AI Apr 13, 2026

contract generate currently skips cmdConfigurator.AddFlags for the generate subcommand, but cmdConfigurator.Validate() later expects flags like config-path and path to be registered on the command. This will cause keploy contract generate to fail at runtime with an "unknown flag"/"flag accessed but not defined" error. Fix by either calling AddFlags for generate as well, or explicitly registering the required flags (configPath/config-path, path, etc.) on the generate command before Validate() runs.

Suggested change
-// generate manages its own flags (--infer, --path via Validate);
-// skip AddFlags to avoid registering --services/--tests which
-// are unused by GenerateFromTests.
-if subCmd.Name() == "generate" {
-	continue
-}
 err := cmdConfigurator.AddFlags(subCmd)
 if err != nil {
-	utils.LogError(logger, err, "failed to add flags to command", zap.String("command", subCmd.Name()))
+	utils.LogError(logger, err, "failed to add flags to command; please retry or check the command configuration", zap.String("command", subCmd.Name()))

Comment thread cli/contract.go Outdated
Comment on lines +435 to +439
// Hydrate request body from BodyRef when the body was offloaded to disk (>1MB).
if tc.HTTPReq.Body == "" && tc.HTTPReq.BodyRef.Path != "" {
refPath := filepath.Join(keployPath, tc.HTTPReq.BodyRef.Path)
data, err := os.ReadFile(refPath)
if err != nil {
Copilot AI Apr 13, 2026

filepath.Join(keployPath, tc.HTTPReq.BodyRef.Path) will allow absolute paths (Join discards the prefix) and .. path traversal if the YAML has been modified, leading to reading arbitrary files on disk. Consider cleaning and validating BodyRef.Path (e.g., reject absolute paths and ensure the resolved path stays within keployPath) before calling os.ReadFile.

Comment on lines +439 to +444
if err != nil {
s.logger.Warn("could not read offloaded request body, skipping body for schema inference", zap.String("path", refPath), zap.Error(err))
} else {
tc.HTTPReq.Body = string(data)
s.logger.Debug("hydrated request body from BodyRef", zap.String("path", refPath), zap.Int64("size", tc.HTTPReq.BodyRef.Size))
}
Copilot AI Apr 13, 2026

A new logger.Warn(...) is introduced for a non-fatal body hydration failure. This codebase's logging guidelines discourage adding warnings; consider logging this at Debug (or returning an error if missing bodies should block inference) and include a clear next step for the user (e.g., verify the asset file exists under keploy//assets and check permissions).

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.



Comment thread pkg/service/contract/contract.go Outdated
Comment thread cli/contract.go
Flip the --infer flag default back to false so the contract generate
command keeps its existing service-mapping based behaviour for users
who already rely on it; inferring from recorded traffic is now an
opt-in path via --infer.

Downgrade the Warn log on the BodyRef hydration failure to Debug and
include a troubleshooting hint. The failure is non-fatal; schema
inference continues without the body, so Warn was too loud for the
project's logging guidelines.

Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
@mvanhorn mvanhorn force-pushed the feat/contract-from-traffic branch from 661f191 to 5c55c9a Compare April 19, 2026 15:49
@mvanhorn
Contributor Author

Addressed the remaining Copilot findings in 5c55c9a:

  1. --infer default: flipped from true to false. Copilot's call on backward compat is correct: contract generate without flags has always meant service-mapping based generation, so inferring-by-default silently changes behaviour for anyone already scripting the command. Opt-in via --infer is the safer surface. Users who want the new path just add the flag.

  2. logger.Warn on body hydration failure: downgraded to Debug and added the troubleshooting hint per the suggestion block. The failure is non-fatal (schema inference continues without the body), so Warn was too loud for project logging guidelines.

Findings 2 and 3 were pointing at the same log call on adjacent lines; one change covers both.

Verified: go build ./cli/... ./pkg/service/contract/... clean, go test ./pkg/service/contract/... passes, go vet clean, gofmt clean.

