Skip to content

crypto: migrate reqwest from native-tls to rustls across workspace#35869

Closed
jasonhernandez wants to merge 31 commits intojason/sec-218-combined-tls-rustlsfrom
jason/sec-239-reqwest-rustls
Closed

crypto: migrate reqwest from native-tls to rustls across workspace#35869
jasonhernandez wants to merge 31 commits intojason/sec-218-combined-tls-rustlsfrom
jason/sec-239-reqwest-rustls

Conversation

@jasonhernandez
Copy link
Copy Markdown
Contributor

@jasonhernandez jasonhernandez commented Apr 4, 2026

Summary

  • Migrate ~20 reqwest deps to default-features = false + rustls-tls-webpki-roots-no-provider
  • Remove native-tls-vendored from ccsr, switch Identity to from_pem
  • Switch mysql_async from native-tls-tls to rustls-tls, update ClientIdentity API
  • Switch segment from native-tls-vendored to rustls-tls
  • Switch sentry from transport (native-tls) to reqwest (TLS via feature unification)
  • Remove unused postgres-openssl patch
  • Fix deny.toml ring ban wrappers (pre-existing issue)
  • Install aws-lc-rs CryptoProvider in build scripts (mz-npm, fivetran-destination)
  • Revert rdkafka from ssl-awslc back to ssl-vendored to fix duplicate SSL symbol linker errors (rdkafka's bundled BoringSSL conflicts with openssl-sys still in tree via tiberius/hyper-tls)

Remaining native-tls paths (require separate follow-ups)

  • tiberius fork — uses native-tls; upstream rustls 0.21 uses ring; needs fork update to rustls 0.23 + aws_lc_rs → SEC-248
  • hyper-tls via LaunchDarkly SDK — uses hyper 0.14; needs SDK upgrade to hyper 1.x + hyper-rustls → SEC-249
  • iceberg-rust fork — hardcodes reqwest/native-tls; needs fork update
  • azure_coreenable_reqwest_rustls activates reqwest/rustls-tls which bundles ring; needs upstream enable_reqwest_rustls_no_provider feature
  • duckdblibduckdb-sys build script needs TLS; kept native-tls (build-dep only)

Why rdkafka stays on ssl-vendored for now

rdkafka's ssl-awslc feature bundles BoringSSL (AWS-LC) statically, exporting SSL_new, SSL_CTX_ctrl, etc. These conflict with the same symbols from openssl-sys (brought in by tiberius and hyper-tls via native-tls). The rdkafka AWS-LC migration can only happen after SEC-248 and SEC-249 remove openssl-sys from the dependency tree entirely.

Chains off #35867. Part of SEC-239.

Test plan

  • cargo check --all-targets passes locally
  • cargo deny check bans passes
  • cargo deny check licenses passes
  • Full CI

🤖 Generated with Claude Code

jasonhernandez and others added 4 commits April 3, 2026 22:55
Migrate all reqwest dependencies across the workspace from the default
native-tls backend to rustls-tls, eliminating the native-tls/OpenSSL
dependency chain that caused duplicate SSL symbol errors with aws-lc-rs
when building with --all-features.

Changes per dependency:
- reqwest: Add default-features = false + rustls-tls-webpki-roots-no-provider
  to ~20 crates (build-deps use rustls-tls-webpki-roots for self-contained
  CryptoProvider)
- ccsr: Remove native-tls-vendored, switch Identity to use reqwest::Identity::from_pem
- hyper-tls → hyper-rustls 0.24: Migrate LaunchDarkly SDK connectors in
  adapter and dyncfg-launchdarkly to use HttpsConnectorBuilder
- tiberius: Switch native-tls → rustls feature in sql-server-util, storage,
  storage-types
- mysql_async: Switch native-tls-tls → rustls-tls in storage-types,
  update ClientIdentity API to provide cert_chain + priv_key separately
- segment: Switch native-tls-vendored → rustls-tls
- sentry: Replace transport (native-tls) with reqwest + rustls features
- azure_*: Set default-features = false, enable enable_reqwest_rustls
- duckdb: Remove native-tls feature
- Remove unused postgres-openssl patch

Remaining: iceberg-rust fork hardcodes reqwest/native-tls in its workspace
Cargo.toml — requires a separate fork update.

Part of SEC-239.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hyper-rustls 0.24 depends on rustls 0.21 + ring, which violates the
workspace ring ban and creates duplicate crate versions. Keep hyper-tls
for the LaunchDarkly SDK connectors (behind telemetry feature flag)
until the LD SDK is upgraded to hyper 1.x.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Revert tiberius native-tls → rustls: the MaterializeInc tiberius fork
  uses rustls 0.21 + ring, violating the workspace ring ban
- Fix mz-npm and fivetran-destination build deps: use
  rustls-tls-webpki-roots-no-provider and explicitly install aws-lc-rs
  CryptoProvider in build scripts
- Sentry: use just "reqwest" feature (TLS comes from workspace feature
  unification), not "rustls" which activates ring through reqwest

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The ring ban's wrappers list was missing the intermediate crates
(rustls, rustls-webpki) between ring and the already-allowed aws-config.
This was a pre-existing issue on the parent branch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 4, 2026

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

jasonhernandez and others added 10 commits April 4, 2026 11:59
- Revert azure_* crates to defaults: azure_core's enable_reqwest_rustls
  activates reqwest/rustls-tls which bundles ring, causing duplicate
  symbol errors with aws-lc-rs on Linux. azure_core needs an upstream
  enable_reqwest_rustls_no_provider feature.
- Restore duckdb native-tls feature: libduckdb-sys build script uses
  reqwest and needs a TLS backend; with resolver "2" build-dep features
  are separate from normal deps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
rdkafka's ssl-awslc feature bundles BoringSSL (AWS-LC), which exports
SSL_new, SSL_CTX_ctrl, etc. These conflict with openssl-sys symbols
still in the tree via tiberius (native-tls) and hyper-tls (LaunchDarkly
SDK), causing "duplicate symbol" linker errors in CI.

Revert rdkafka to ssl-vendored (OpenSSL) so both rdkafka and the
remaining native-tls paths use the same SSL library. The rdkafka AWS-LC
migration must wait until tiberius and the LD SDK are migrated off
native-tls, fully removing openssl-sys from the dependency tree.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Switch tiberius from the native-tls feature (openssl) to the rustls
feature using rustls 0.23 + aws-lc-rs. This removes tiberius as an
openssl-sys dependency path.

Uses personal fork (jasonhernandez/tiberius, branch
rustls-0.23-on-mz-changes) which cherry-picks the rustls 0.23 upgrade
onto the existing mz_changes commits (Row::build, money types,
trust_cert_ca_pem).

Remaining native-tls paths: hyper-tls (LaunchDarkly SDK), reqwest
dev/build-deps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When both `aws-lc-rs` and `ring` features are active (as in CI's
--all-features cargo-test-2 job), rustls cannot auto-detect which
CryptoProvider to use. Install aws-lc-rs as the default provider
early in binary entrypoints (environmentd, clusterd) and in tests
that use tokio-postgres-rustls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ness

Extend the CryptoProvider installation to testdrive, sqllogictest,
orchestratord, and the environmentd test harness. These all use TLS
(via reqwest, tokio-postgres-rustls, or server-core) and need the
provider installed before any TLS operation when both aws-lc-rs and
ring features are active.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add fips_crypto_provider() to persistcli main (fixes short-zippy).
Move CryptoProvider install in test harness from try_start to
try_start_with_trigger so all test paths are covered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
tokio-postgres-rustls v0.13 hardcodes a dependency on ring. Replace it
with mz-tls-util::MakeRustlsConnect which uses the process-wide
CryptoProvider (aws-lc-rs). This removes one direct ring dependency
path.

Ring still enters via kube's hyper-rustls (feature unification), so
the CryptoProvider install_default() calls remain necessary for
--all-features builds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add kube/aws-lc-rs feature to avoid ring via hyper-rustls
- Remove segment/rustls-tls feature (was enabling reqwest/rustls-tls
  which bundles ring; TLS comes from workspace feature unification)

With ring fully removed, rustls auto-detects aws-lc-rs as the sole
CryptoProvider, eliminating the need for explicit install_default()
calls. All CryptoProvider panics from --all-features builds are
resolved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Include Cargo.lock changes from ring elimination and remove ring
wrappers from deny.toml since ring is no longer in the dependency tree.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jasonhernandez jasonhernandez force-pushed the jason/sec-239-reqwest-rustls branch from 0dae9fd to 3998765 Compare April 6, 2026 06:01
jasonhernandez and others added 14 commits April 5, 2026 23:34
When tokio-postgres connects via Unix socket, it still calls
make_tls_connect with the socket path as the domain (e.g.
"/var/run/postgresql"). Since socket paths are not valid DNS names,
ServerName::try_from fails, preventing all Unix socket connections.

Use "localhost" as a placeholder ServerName for Unix socket paths,
since TLS is never actually negotiated over Unix sockets.

Fixes CI failures where environmentd could not connect to the internal
CockroachDB via Unix socket for consensus and timestamp oracle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
tokio-postgres passes an empty string as the domain to
make_tls_connect for Unix socket connections (Host::Unix maps to
hostname=None, which unwrap_or("") produces ""). The previous fix
only handled "/" prefixed paths but missed the empty string case,
so ServerName::try_from("") still failed with InvalidDnsNameError.

Handle both empty strings and "/" prefixed paths by using "localhost"
as a placeholder ServerName, since TLS is never negotiated over Unix
sockets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Update kafka-auth test expected error: with rustls, invalid PEM data
   produces "failed to parse private key PEM" instead of the old OpenSSL
   error "No supported data to decode".

2. Add [patch.crates-io] for mysql_async pointing to MaterializeInc fork
   with two rustls backend fixes:
   - ClientIdentity::load() uses private_key() for all key formats
     (PKCS#1, PKCS#8, SEC1), not just rsa_private_keys() (PKCS#1 only)
   - DangerousVerifier matches "not valid for name" (rustls 0.23 Display
     format) in addition to "NotValidForName" for skip_domain_validation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
With rustls, certificate validation failures produce "invalid peer
certificate: UnknownIssuer" instead of the native-tls "TLS error".
Update both mysql-cdc and mysql-cdc-old-syntax test variants.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Identity::from_pem validated PEM data with rustls-pki-types but the
From<Identity> for reqwest::Identity conversion used expect() with
reqwest::Identity::from_pem, which may parse differently. If reqwest
couldn't parse the PEM, this would panic and crash the session.

Add reqwest validation at construction time so the expect() is safe.
This may fix the session crash seen in kafka-auth mTLS tests (SEC-258).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Now that tiberius, reqwest, and all other native-tls consumers have been
migrated to rustls, openssl-sys is no longer in the dependency tree.
This eliminates the duplicate symbol conflict that previously blocked
the ssl-awslc migration.

Switch rdkafka from ssl-vendored (vendored OpenSSL) to ssl-awslc
(pre-built AWS-LC at /opt/aws-lc in the CI builder image). This:
- Removes openssl and openssl-sys from the dependency tree entirely
- Eliminates dual crypto library cleanup conflicts (SIGABRT on exit)
- Aligns Kafka TLS with the rest of the stack on AWS-LC for FIPS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-LC"

native-tls is still in the dependency tree via iceberg-rust (reqwest
native-tls feature), LaunchDarkly SDK (hyper-tls), and duckdb. Without
ssl-vendored, there is no OpenSSL for native-tls to link against.

Revert to ssl-vendored until native-tls is fully eliminated from the
dependency tree. The SIGABRT from dual crypto libraries is intermittent
and less blocking than linker failures.

This reverts commit fdbd336.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Target the jasonhernandez/rustls-migration branch which replaces
reqwest's native-tls feature with rustls-tls-webpki-roots, removing
one of the blockers for full OpenSSL elimination.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Target the jasonhernandez/rustls-migration branch which replaces
reqwest's native-tls feature with rustls-tls-webpki-roots-no-provider.
This eliminates ring from the dependency tree (iceberg was the last
consumer pulling it in via reqwest's default rustls crypto provider).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The CSR (Confluent Schema Registry) connection validation could crash the
entire environmentd process instead of returning a SQL error. Three fixes:

1. Replace `.unwrap()` on `reqwest::ClientBuilder::build()` in
   `src/ccsr/src/config.rs` with proper `?` error propagation. When
   `build()` fails (e.g., rustls detects a key-cert mismatch), the
   unwrap panicked, triggering the enhanced panic handler's
   `process::abort()` and killing all connections.

2. Wrap CREATE/ALTER CONNECTION validation tasks in `ore_catch_unwind()`
   to prevent any panic in the validation path from aborting the process.
   Panics are now caught and returned as SQL errors, which also surfaces
   the panic message for diagnosis.

3. Update kafka-auth test expectations for rustls error format:
   - "alert certificate unknown" -> "CertificateUnknown" (rustls TLS
     alert Display format)
   - "key values mismatch" -> "keys may not be consistent" (rustls
     InconsistentKeys error)
   - "certificate verify failed" -> "UnknownIssuer" (rustls cert
     verification error)

Part of SEC-258.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CI revealed that:
- build() fails for mTLS identities with "builder error" (likely due to
  aws-lc/vendored-OpenSSL symbol collision when loading private keys)
- aws-lc produces OpenSSL-compatible error strings, so "certificate
  verify failed" is still correct for the ssl.td test
- Use {e:#} in error formatting to include the full error chain

Update test expectations to match actual errors:
- mssl tests: expect "failed to build schema registry HTTP client"
- ssl.td: revert to "certificate verify failed" (aws-lc compat)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jasonhernandez jasonhernandez force-pushed the jason/sec-239-reqwest-rustls branch 12 times, most recently from 7a431a0 to bb3582f Compare April 7, 2026 05:36
Remove all native-tls/OpenSSL dependencies and unify on rustls + aws-lc-rs:

- Switch rdkafka from ssl-vendored (OpenSSL) to ssl-awslc, using the
  MaterializeInc/rust-rdkafka fork's jasonhernandez/ssl-awslc branch.
  This eliminates the duplicate crypto library symbols that caused
  SIGABRTs and reqwest build() failures during mTLS handshakes.

- Switch LaunchDarkly SDK to use built-in rustls connector (enable the
  `rustls` feature) instead of hyper-tls. Remove hyper-tls from both
  mz-dyncfg-launchdarkly and mz-adapter's telemetry feature.

- Switch azure_core/azure_identity/azure_storage from default features
  (enable_reqwest → native-tls) to enable_reqwest_rustls. This removes
  native-tls from the Azure SDK path.

- Remove native-tls feature from duckdb. The MaterializeInc duckdb-rs
  fork defaults to reqwest/rustls-tls in libduckdb-sys.

- Ban native-tls, hyper-tls, and openssl-sys in deny.toml with wrappers
  for remaining transitive deps (azure_core for native-tls until
  upstream supports rustls-tls-no-provider per Azure/azure-sdk-for-rust#1680,
  rdkafka-sys for openssl-sys). Add skip entries for duplicate rustls
  versions from LD SDK's hyper-rustls 0.24.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jasonhernandez jasonhernandez force-pushed the jason/sec-239-reqwest-rustls branch from bb3582f to c6e07ff Compare April 7, 2026 05:48
@jasonhernandez jasonhernandez force-pushed the jason/sec-239-reqwest-rustls branch 2 times, most recently from 4738733 to 29f0a8f Compare April 7, 2026 15:36
@jasonhernandez jasonhernandez force-pushed the jason/sec-239-reqwest-rustls branch from 29f0a8f to e426812 Compare April 7, 2026 15:52
@jasonhernandez
Copy link
Copy Markdown
Contributor Author

Closing: superseded by the new crypto migration PR series (#35940-#35952). Original stacked-PR chain is obsolete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant