Skip to content

Increase default keepAliveTimeout to 65s#59203

Open
pras529 wants to merge 2 commits intonodejs:mainfrom
pras529:feat/keepalive-timeout-60s
Open

Increase default keepAliveTimeout to 65s#59203
pras529 wants to merge 2 commits intonodejs:mainfrom
pras529:feat/keepalive-timeout-60s

Conversation

@pras529
Copy link
Copy Markdown

@pras529 pras529 commented Jul 25, 2025

Summary

This PR increases the default value of http.Server.keepAliveTimeout from 5000ms (5s) to 65000ms (65s). This change aligns Node.js with the keep-alive timeout expectations of most browsers, proxies, and mobile clients, reducing the risk of subtle long-lived connection failures due to premature socket closure.

Details

  • Default Change:
    The keepAliveTimeout default is now set to 65 seconds (65000ms) instead of the previous 5 seconds (5000ms).
  • Documentation & Comments:
    All relevant documentation and code comments have been updated to reflect the new default.

Motivation

The previous default of 5 seconds is shorter than the defaults used by common infrastructure, which can lead to unexpected connection terminations or failures. Setting the default to 65 seconds brings Node.js in line with industry standards and improves interoperability.

References


Please review carefully as this changes default server behavior for all applications that do not explicitly set keepAliveTimeout.

@nodejs-github-bot
Copy link
Copy Markdown
Collaborator

Review requested:

  • @nodejs/http
  • @nodejs/net

@nodejs-github-bot nodejs-github-bot added http Issues or PRs related to the http subsystem. needs-ci PRs that need a full CI run. labels Jul 25, 2025
@pras529
Copy link
Copy Markdown
Author

pras529 commented Jul 25, 2025

@jasnell @targos @lpinca Please review my pull request when you have a chance. Thank you!

Copy link
Copy Markdown
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this should have been raised a long time ago.

What are the http keep-alive timeouts in browsers, smartphones etc? Can you provide references?

I did a quick check and Chrome would default to 300s. @nodejs/tsc can you weight it if you want to increase it up to 350s?

Fastify uses 72 seconds to avoid any issues with ALB/ELBs and other load balancers that expect 60s (having the same timeout is slightly problematic).

Increasing this exposes users to the possibility of DoS attacks by exhaustion of file descriptor - just keeping the sockets open would prevent new connections from arriving.

Comment thread doc/changelogs/CHANGELOG_V24.md Outdated
Comment thread lib/_http_server.js Outdated
Comment thread doc/api/http.md Outdated
Comment thread doc/api/http.md Outdated
Comment thread doc/api/http.md Outdated
Comment thread doc/api/http.md Outdated
@mcollina mcollina added the semver-major PRs that contain breaking changes and should be released in the next major version. label Jul 25, 2025
@pras529 pras529 marked this pull request as draft July 25, 2025 09:07
@codecov
Copy link
Copy Markdown

codecov bot commented Jul 25, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.06%. Comparing base (7215d9b) to head (e59f1f3).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #59203      +/-   ##
==========================================
+ Coverage   90.04%   90.06%   +0.02%     
==========================================
  Files         648      648              
  Lines      191041   191041              
  Branches    37448    37455       +7     
==========================================
+ Hits       172026   172065      +39     
+ Misses      11651    11601      -50     
- Partials     7364     7375      +11     
Files with missing lines Coverage Δ
lib/_http_server.js 97.23% <100.00%> (ø)

... and 32 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@pras529 pras529 marked this pull request as ready for review July 25, 2025 09:42
@pras529
Copy link
Copy Markdown
Author

pras529 commented Jul 25, 2025

Hey @mcollina ,

Thanks for your feedback on the keep-alive timeout.

I’ve chosen 60 seconds to align with defaults used by major cloud load balancers like AWS ELB/ALB, ensuring broad compatibility and preventing unexpected connection drops. Increasing the timeout to 72 seconds, as seen in Fastify and NGINX, could slightly boost connection reuse, but also raises the risk of file descriptor exhaustion under DoS scenarios.

Overall, 60 seconds provides a safer balance between interoperability and security for most environments. I’m open to discussing further if there’s a strong case for a higher value.

@pras529
Copy link
Copy Markdown
Author

pras529 commented Jul 25, 2025

@mcollina , I've resolved this issue and removed the breaking change alert regarding http.Server.keepAliveTimeout. The documentation now reflects the correct status, in line with the release team's guidance. Please review my pull request when you have a chance. Thank you!

@pras529 pras529 changed the title Increase default keepAliveTimeout to 60s and warn on legacy 5s usage Increase default keepAliveTimeout to 60s Jul 25, 2025
@pras529 pras529 requested a review from mcollina July 25, 2025 09:57
@mcollina
Copy link
Copy Markdown
Member

mcollina commented Jul 25, 2025

I've seen 60 seconds cause problems due to time network delays - I've actually seen this happen. Going over 60 seconds is necessary to account for that.

@pras529
Copy link
Copy Markdown
Author

pras529 commented Jul 25, 2025

Okay, sure—that makes sense. Is 72 seconds sufficient to account for those network delays, or do you recommend going even higher? Happy to adjust as needed based on your experience.

@mcollina
Copy link
Copy Markdown
Member

I literally picked 72 with no reasoning. We could do 65.

@pras529
Copy link
Copy Markdown
Author

pras529 commented Jul 25, 2025

Thanks for clarifying. I’m fine with 65 seconds if that works better—happy to update the default to 65 to provide a bit more buffer over 60. Let me know if you have any concerns or if you think another value might be preferable.

@pras529 pras529 changed the title Increase default keepAliveTimeout to 60s Increase default keepAliveTimeout to 65s Jul 25, 2025
@mcollina
Copy link
Copy Markdown
Member

@pras529 there are 4 frailing tests and linting to be fixed, could you address them?

@pras529
Copy link
Copy Markdown
Author

pras529 commented Jul 25, 2025

@mcollina Yes, I’ll address the failing tests and linting issues—just need a bit of time to work through them.

Copy link
Copy Markdown
Member

@pimterry pimterry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very happy to increase this as well, but it's worth noting that the description in the linked issue doesn't quite match the reality of what's going on. The example reproduction in #59193 specifically isn't correct - in fact the response there is delivered just fine AFAICT. The keepAliveTimeout we're talking about here is how long the server will wait between requests, to keep a connection alive, when there's no response pending. In cases like that repro, it's not relevant: the response will be delivered to the client, and then the connection will close 5 seconds later (11 seconds after the initial connection started, in that example).

I've done some quick research, there's a wide split in defaults elsewhere:

I did a quick check and Chrome would default to 300s.

I don't think browser's configuration matters as much as LBs & reverse proxies. Given a RST when reusing an idle connection they'll retry transparently without exposing the error (I've just tested Chrome & Firefox and confirmed) and I've seen them apply heuristics and various connection pooling games to manage keep-alive and idle sockets there in the past - it's not a fixed value.

Of course, it's still nice for performance to keep the connection open, but in browser-facing performance-sensitive cases you'll probably want HTTP/2 anyway, which this PR doesn't touch.

The only case where there's going to cause an actual problem (e.g. those 502 errors) is when a) the client which doesn't gracefully handle this (it's not a browser) and b) you hit this race condition where the server closes the socket, but the client sends the next request before it processes the FIN packet.

Unless the Node timeout is greater than the LB timeout (i.e. many minutes) then that race condition is always possible, but any increase will exponentially decrease the probability of this race happening (particularly on production sites, where I'd strongly expect most LB->backend connections are rarely idle for more than a few seconds anyway).

Anyway, that's a long way of saying: given the above I think boosting this to 30+ seconds is a good idea, happy to go with 65 seconds as a balanced option 👍 but I don't think we should worry about going beyond that - we're balancing protection against resource usage vs race condition resistance, and there's diminishing returns.

Comment thread lib/_http_server.js Outdated
@yujiang
Copy link
Copy Markdown

yujiang commented Jul 25, 2025

Thanks for referencing this and working on a fix!
Looks like a great step forward, and I appreciate you taking the time to address it 🙏

@pras529 pras529 force-pushed the feat/keepalive-timeout-60s branch from e59f1f3 to 695d5d0 Compare July 25, 2025 12:56
@pras529 pras529 requested a review from pimterry July 25, 2025 13:02
Copy link
Copy Markdown
Member

@pimterry pimterry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code all looks good to me 👍

The first commit message is failing the linting - you'll need to use git to modify that. It's probably best to just squash all your commits and change the resulting commit message to describe the end result of the whole change here, because that message is what will be shown in the CHANGELOG when this is released.

pras529 added a commit to pras529/node that referenced this pull request Jul 25, 2025
http: Requested changes addressed and resolved

http: Increase default keepAliveTimeout to 65s

http: explicitly set keepAliveTimeout to 65s in tests

http: increase keepAliveTimeout default

Increase the default keepAliveTimeout to 65 seconds in the http module. Tests have been
updated to explicitly set the new timeout value. All requested changes
have been addressed and resolved.

PR-URL: nodejs#59203
@pras529 pras529 force-pushed the feat/keepalive-timeout-60s branch from 695d5d0 to 33c1556 Compare July 25, 2025 17:19
@pras529
Copy link
Copy Markdown
Author

pras529 commented Jul 26, 2025

Hello @mcollina,

Thank you for your review. I’ve addressed the changes you requested and updated the commit message to fully comply with our guidelines. I’m still a bit uncertain about the Linux and macOS test results—could you please take a look and let me know if anything else needs adjustment?

Thanks for your help!

Copy link
Copy Markdown
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more nit, sorry.

Comment thread doc/api/http.md Outdated
@mcollina
Copy link
Copy Markdown
Member

I'm on the fence to consider this a breaking change or not. If so, this change will probably go out in v25,
but I think it's pretty harmless to ship in v24 as well.

@pimterry wdyt?

@mcollina mcollina added semver-minor PRs that contain new features and should be released in the next minor version. baking-for-lts PRs that need to wait before landing in a LTS release. and removed semver-major PRs that contain breaking changes and should be released in the next major version. labels Jul 29, 2025
@mcollina
Copy link
Copy Markdown
Member

@pras529 can you adjust the first commit message? (You can also squash all commits too)

Copy link
Copy Markdown
Member

@marco-ippolito marco-ippolito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I think this is semver major

@Ethan-Arrowood
Copy link
Copy Markdown
Contributor

I'm content with 65 seconds. Appreciate this change greatly.

Copy link
Copy Markdown
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Copy Markdown
Member

@RaisinTen RaisinTen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but the failing test assertions need to be updated

@RafaelGSS
Copy link
Copy Markdown
Member

What's the reason for it be semver-minor and not semver-major? @mcollina

@pimterry
Copy link
Copy Markdown
Member

What's the reason for it be semver-minor and not semver-major?

#59203 (comment)

If we think there's a real risk that this could break anything in existing code, or if people would just prefer to be cautious, I'm not opposed to making this major, but personally I'd be quite surprised it caused any issues in practice. For users it should really just be a bug fix & performance improvement.

Do we have any specific likely scenario where this might break things?

@RafaelGSS
Copy link
Copy Markdown
Member

It's hard to say its real impact, but by experience, whenever we change a default option on HTTP, it tends to break people. It's also hard to have those actions covered by our test suite.

I feel that we might want to wait for it to land on Node.js 25 and then backport after assessing its impact. I think CITGM wouldn't cover this scenario too.

@mcollina
Copy link
Copy Markdown
Member

@RafaelGSSy assumption is that risk in shipping on Node 24 before it goes LTS is close to zero, but very high afterwars.

@ex1st
Copy link
Copy Markdown

ex1st commented Aug 4, 2025

Hi folks! If you plan to increase keepAliveTimeout to 65 seconds, should you also increase headersTimeout to 66 at least? Please check #27363

@pimterry
Copy link
Copy Markdown
Member

pimterry commented Aug 4, 2025

A very good question @ex1st! From what I can see though, that issue was resolved by #32329, which changed how headerTimeout was measured entirely, so it shouldn't overlap with keepAliveTimeout as it used to. From my reading of the code there, nowadays the header timer doesn't start until a new request starts, and it's disabled when it finishes, so it's not measuring idle time at all. Does that make sense?

@BethGriggs
Copy link
Copy Markdown
Member

Is this change still something we're looking to land on Node.js 25? It seemed to get a lot of approvals/support for changing the default at the time.

@pimterry
Copy link
Copy Markdown
Member

It would be great to get this merged, @BethGriggs mentioned yesterday that it's still causing issues, and it would be a clear improvement for lots of use cases. It looks like there's a pending issue on the commit message and some debate about whether it's semver-major or not.

I'm happy to go for semver-major if there's uncertainty, just in case. Any objections to that? I'll update it now but shout if anybody feels that's an issue. That'll delay releasing this significantly unfortunately but it's better than leaving it stuck.

@pras529 can you update the commit message & force push, so the checks pass? Since this was done the requirements have also changed, so you'll now also need to include a 'Signed-Off-By' footer - see the docs at https://github.com/nodejs/node/blob/main/doc/contributing/pull-requests.md#commit-message-guidelines for details & examples. En route, a rebase would be helpful since this is now a long way back from main, and the tests will need to be passing too, if that doesn't get resolved by the rebase & rerun (logs are gone, so it's unclear what failed there).

Copy link
Copy Markdown
Member

@RafaelGSS RafaelGSS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can land it on v26 as semver-major - if we land that in the next couple of days. If we don't see any major breakage, we could consider backporting it to other release lines

@RafaelGSS RafaelGSS added semver-major PRs that contain breaking changes and should be released in the next major version. and removed semver-minor PRs that contain new features and should be released in the next minor version. baking-for-lts PRs that need to wait before landing in a LTS release. labels Apr 15, 2026
@RafaelGSS
Copy link
Copy Markdown
Member

RafaelGSS commented Apr 15, 2026

@pimterry feel free to open a fresh new PR and add @pras529 on Co-Authored-By section in case of silent communication.

@RafaelGSS RafaelGSS added dont-land-on-v20.x PRs that should not land on the v20.x-staging branch and should not be released in v20.x. dont-land-on-v22.x PRs that should not land on the v22.x-staging branch and should not be released in v22.x. dont-land-on-v24.x PRs that should not land on the v24.x-staging branch and should not be released in v24.x. dont-land-on-v25.x PRs that should not land on the v25.x-staging branch and should not be released in v25.x. and removed semver-major PRs that contain breaking changes and should be released in the next major version. labels Apr 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dont-land-on-v20.x PRs that should not land on the v20.x-staging branch and should not be released in v20.x. dont-land-on-v22.x PRs that should not land on the v22.x-staging branch and should not be released in v22.x. dont-land-on-v24.x PRs that should not land on the v24.x-staging branch and should not be released in v24.x. dont-land-on-v25.x PRs that should not land on the v25.x-staging branch and should not be released in v25.x. http Issues or PRs related to the http subsystem. needs-ci PRs that need a full CI run. notable-change PRs with changes that should be highlighted in changelogs.

Projects

None yet

Development

Successfully merging this pull request may close these issues.