quic: apply multiple security posture improvements#63483
Conversation
Signed-off-by: James M Snell <jasnell@gmail.com>
Signed-off-by: James M Snell <jasnell@gmail.com> Assisted-by: Opencode:Opus 4.6
Signed-off-by: James M Snell <jasnell@gmail.com> Assisted-by: Opencode:Opus 4.6
Signed-off-by: James M Snell <jasnell@gmail.com> Assisted-by: Opencode/Opus 4.6
Signed-off-by: James M Snell <jasnell@gmail.com>
Signed-off-by: James M Snell <jasnell@gmail.com> Assisted-by: Opencode/Opus 4.6
Signed-off-by: James M Snell <jasnell@gmail.com>
On the client, add verifyPeer: 'auto', 'strict', and 'manual' modes. The 'strict' mode will reject invalid certs at the handshake layer, while the 'manual' mode allows the application to inspect the peer cert and decide whether to trust it or not. The 'auto' mode is the default and will reject invalid certs at a middle layer after the onhandshake event. Signed-off-by: James M Snell <jasnell@gmail.com> Assisted-by: Opencode/Opus 4.6
Signed-off-by: James M Snell <jasnell@gmail.com> Assisted-by: Opencode/Opus 4.6
Signed-off-by: James M Snell <jasnell@gmail.com>
Signed-off-by: James M Snell <jasnell@gmail.com> Assisted-by: Opencode:Opus 4.6
|
Review requested:
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #63483 +/- ##
==========================================
- Coverage 90.17% 90.15% -0.03%
==========================================
Files 718 718
Lines 227731 227924 +193
Branches 42768 42764 -4
==========================================
+ Hits 205365 205492 +127
- Misses 14145 14212 +67
+ Partials 8221 8220 -1
🚀 New features to boost your workflow:
|
| `validationErrorReason` and `validationErrorCode` if validation failed. The | ||
| application is responsible for checking these values and deciding whether to | ||
| continue. Use this mode for custom validation logic, certificate pinning, or | ||
| intentionally accepting self-signed certificates. |
There was a problem hiding this comment.
A comment, not a blocker: imo, in general, there's going to be a huge advantage to connecting this through to the existing TLS APIs.
This setup in isolation works fine, but if we want people to be able to easily use QUIC & HTTP/3 in practice, it's going to be much easier if they can reuse TLS config.
For HTTP/3, I strongly suspect every deployment is going to deploy it alongside HTTP/2 and HTTP/1. That means they'll already have TLS configuration for our normal TLS APIs. Deviating from those is a pain. All server frameworks & HTTP client libraries etc supporting this (Express, Fastify etc) will need to build translation to support a single set of TLS options and convert into both normal TLS & QUIC's TLS option format (e.g. wrapping their existing checkServerIdentity callback to use this approach for QUIC). Ditto for ALPN, SNI, etc etc. Anybody using these APIs directly will need to build that conversion layer for themselves from scratch as well.
Also reimplementing our whole TLS setup creates lots of risks of introducing new issues - like the bug in strict & auto below, which skips hostname validation completely. Keeping one implementation secure is hard enough 😆
Fine with this for now while it's experimental and we're building & testing things, but imo we have to aim towards aligning them for this to go stable, which means either using the existing API options here, or implementing these new options for node:tls as well.
For this specific case I've conveniently looked at the latter already - we actually don't currently send proper TLS alerts when our clients reject a certificate in normal TLS (even with rejectUnauthorized: true, this same 'auto' behaviour doesn't send a TLS alerts, just a clean connection close after handshake). It would be really nice to do auto-rejection properly there too. In fact we can do even better: with OpenSSL 3+, I think we can do async verification in JS visa SSL_set_retry_verify and send proper alerts without completing the handshake. There's some good general TLS improvement space available here.
Anyway, as above - not a blocker to do immediately, just mentioning this to try and stay on the same page for where this is going. I really don't want to have two incompatible TLS configuration APIs we (and the whole ecosystem) have to deal with forever.
There was a problem hiding this comment.
Yeah, I've been wanting to come back around to this. I intentionally did not want to reuse the existing SecureContext because I think some of the options there just don't make sense for QUIC but this is definitely something I think we need to work through.
| } | ||
|
|
||
| uint64_t Stream::last_activity_timestamp() const { | ||
| uint64_t ts = stats()->received_at; |
There was a problem hiding this comment.
received_at is only updated when the client sends data (for HTTP/3, message body data I think? So ignores headers). If the server is sending data, it won't clear the timeout, so long responses (large downloads, slow processing) will get cut off even though the connection is active.
It also doesn't actually protect against the slowloris equivalent: the core of that vulnerability isn't about opening silent streams, but sending data very slowly. This specifically does allow exactly that, with no maximum request timeout. You can send a byte every 30 seconds and keep the stream open forever.
That's from an HTTP POV. On the QUIC side, I'm not sure stream idle timeouts are useful at all.
For HTTP/3 a per-stream timeout of some sort totally makes sense, and would be a good security improvement. For QUIC, we have no idea what the application protocol is doing, and I think in very many non-HTTP cases it is normal to have open streams where the client and/or server is idle on the stream for long periods. Media-over-QUIC does this on control streams, and those would all be killed by this timeout. I asked Claude, it thinks similar logic also applies to MASQUE (control channel), DNS-over-QUIC (for AXFR), SMB-over-QUIC (long-term idle by design), and MQTT-over-QUIC (often idle by design). We don't add idle timeouts on net.Socket or TLSSocket which are the closest equivalents (and I don't think we should).
Idle request timeouts are an HTTP concern.
So, I think:
- We should add this to HTTP/3, not QUIC.
- We should make it a total request time timeout: from opening the stream to the end of the request body. That's the slowloris and unused-stream risk, and matches HTTP/1 (which has headerTimeout & requestTimeout). We don't want a server-side timeout for how long the server has to respond once it's fully received a request: a server doing lots of processing, long-polling, or streaming very large data is an application concern imo, not something to block at the protocol level.
There was a problem hiding this comment.
I think having it for QUIC streams makes sense too. It's easily configurable to switch off entirely if the application calls for it. Good catch on the headers check tho.
| * `'strict'` — OpenSSL aborts the TLS handshake immediately if the server's | ||
| certificate fails validation. The `session.opened` promise rejects with a | ||
| TLS error. The application cannot inspect the certificate or the error | ||
| details. This is the most secure mode. |
There was a problem hiding this comment.
I just did a quick look, I think currently we don't call SSL_set1_host anywhere, which is required to make OpenSSL verify the hostname. Since strict doesn't do checks JS side, it effectively ignores hostnames entirely, which defeats the point. For auto, it could check JS side (matching node:tls checkServerIdentity checks) but it doesn't anywhere I can see, so that won't work either.
None of the new tests here cover good-cert + wrong-hostname, for auto or strict, but I strongly suspect they would show this verification is broken in both cases.
Improve the security footing of the quic implementation with multiple improvements
There are still a number of ergonomic issues with the API, particularly around streams, but the security details are shaping up nicely.
@nodejs/quic