Skip to content

crypto: fix crash on invalid mTLS certs in CREATE/ALTER CONNECTION#35894

Merged
jasonhernandez merged 1 commit intomainfrom
jason/sec-258-mtls-crash-fix
Apr 10, 2026
Merged

crypto: fix crash on invalid mTLS certs in CREATE/ALTER CONNECTION#35894
jasonhernandez merged 1 commit intomainfrom
jason/sec-258-mtls-crash-fix

Conversation

@jasonhernandez
Copy link
Copy Markdown
Contributor

@jasonhernandez jasonhernandez commented Apr 7, 2026

Summary

  • Fix panic in CSR HTTP client when TLS configuration is invalid (.unwrap().map_err() on reqwest builder)
  • Wrap CREATE CONNECTION ... VALIDATE and ALTER CONNECTION ... VALIDATE tasks in ore_catch_unwind so TLS handshake panics return AdapterError instead of crashing the coordinator
  • Add test verifying invalid TLS material (garbage PKCS#12, PEM, DER) is rejected at construction time

Motivation

When a user runs CREATE CONNECTION or ALTER CONNECTION with malformed mTLS certificates (e.g., corrupt PEM, invalid PKCS#12), the connection validation task panics and takes down the coordinator. After this fix, those cases return a user-facing error instead.

Test plan

  • New unit test: test_invalid_tls_config_returns_error — garbage PKCS#12/PEM/DER rejected with error
  • Existing kafka-auth schema-registry mTLS tests pass
  • CI green (3 failures are pre-existing flakes in SLT/MySQL CDC/Postgres CDC, unrelated)

🤖 Generated with Claude Code

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 7, 2026

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

@jasonhernandez jasonhernandez force-pushed the jason/sec-258-mtls-crash-fix branch from 66a24e8 to c648256 Compare April 7, 2026 17:08
@jasonhernandez jasonhernandez changed the title crypto: fix mTLS crash in CSR client and add panic guard (SEC-258) crypto: fix mTLS crash in CSR client and add panic guard Apr 7, 2026
@jasonhernandez jasonhernandez changed the title crypto: fix mTLS crash in CSR client and add panic guard crypto: fix crash on invalid mTLS certs in CREATE/ALTER CONNECTION Apr 7, 2026
@jasonhernandez jasonhernandez force-pushed the jason/sec-258-mtls-crash-fix branch from c648256 to 8690248 Compare April 7, 2026 17:30
@jasonhernandez jasonhernandez marked this pull request as ready for review April 7, 2026 17:32
@jasonhernandez jasonhernandez requested review from a team as code owners April 7, 2026 17:32
@jasonhernandez jasonhernandez requested a review from mtabebe April 7, 2026 17:32
@jasonhernandez jasonhernandez force-pushed the jason/sec-258-mtls-crash-fix branch from 8690248 to c9b5171 Compare April 9, 2026 20:41
The CSR HTTP client builder could panic on invalid TLS configuration
(e.g., malformed certificates) because it used `.unwrap()` on the
reqwest builder. Replace with `.map_err()` to surface the error.

Wrap CREATE CONNECTION and ALTER CONNECTION validation tasks in
`ore_catch_unwind` so that panics in TLS handshakes don't crash the
coordinator — they now return an AdapterError instead.

Add test verifying that invalid TLS material (garbage PKCS#12, PEM,
DER) is rejected at construction time with an error.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jasonhernandez jasonhernandez force-pushed the jason/sec-258-mtls-crash-fix branch from c9b5171 to dae2ed7 Compare April 10, 2026 15:19
@jasonhernandez jasonhernandez merged commit 4218b69 into main Apr 10, 2026
118 checks passed
@jasonhernandez jasonhernandez deleted the jason/sec-258-mtls-crash-fix branch April 10, 2026 15:45
Err(_panic) => {
tracing::error!("connection validation panicked");
Err(AdapterError::Internal(
"connection validation panicked".into(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should really keep the panic message available somewhere instead of just silencing it entirely, otherwise how can we debug it?

Err(_panic) => {
tracing::error!("alter connection validation panicked");
Err(AdapterError::Internal(
"connection validation panicked".into(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants