Skip to content

Don't share server between worker 0 and descendants on refork#3602

Merged
joshuay03 merged 1 commit intopuma:mainfrom
joshuay03:fix-3599-2
Nov 28, 2025
Merged

Don't share server between worker 0 and descendants on refork#3602
joshuay03 merged 1 commit intopuma:mainfrom
joshuay03:fix-3599-2

Conversation

@joshuay03
Copy link
Copy Markdown
Collaborator

@joshuay03 joshuay03 commented Jan 11, 2025

Description

Closes #3599

Context:

  • Both Cluster and Worker subclass Runner
  • The app is loaded in the Runner here
  • Except for some irrelevant cases, the app is loaded for the first time in Runner#start_server

In ordinary cluster mode, when forking workers from the master process:

  1. If preloading, the server is started in the new process before the worker is initialized so that the app is loaded in, memoized, and inherited from the cluster runner
  2. If not preloading, the server is started in Worker#run so that the app is loaded in each worker runner

When fork_worker is enabled, only worker 0 follows the above. Then, when forking workers from worker 0:

  1. Currently, worker 0's server is given to the worker. This behaviour is different to the above. Instead of starting a new server, both the new worker and worker 0 now share the same server thanks to CoW, at least until the server is first modified in the new worker. This could have unknown repercussions considering the server object is not intentionally designed to be used this way (at least that's my understanding from my contributions here so far). My presumption is that this implementation was the most straightforward way to ensure the forked worker benefits from CoW of the app.
  2. On this branch, worker 0's children now have their own server from the get go, just like with a worker forked from master, but still benefits from CoW of the app as is the intention behind fork_worker. I've attempted to explain in a comment why I opted for the implementation in this PR.

Your checklist for this pull request

  • I have reviewed the guidelines for contributing to this repository.
  • I have added (or updated) appropriate tests if this PR fixes a bug or adds a feature.
  • My pull request is 100 lines added/removed or less so that it can be easily reviewed.
  • If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
  • If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
  • I have updated the documentation accordingly.
  • All new and existing tests passed, including Rubocop.

@joshuay03 joshuay03 force-pushed the fix-3599-2 branch 12 times, most recently from f68e078 to ec3242b Compare January 11, 2025 05:48
@joshuay03 joshuay03 changed the title Don't share server between worker 0 and >= grandchildren on refork Don't share server between worker 0 and descendants on refork Jan 11, 2025
@joshuay03 joshuay03 force-pushed the fix-3599-2 branch 2 times, most recently from b7fc17f to 92816b5 Compare January 11, 2025 09:02
@joshuay03 joshuay03 force-pushed the fix-3599-2 branch 3 times, most recently from 6854dcf to 0438488 Compare January 12, 2025 02:56
Comment thread lib/puma/cluster.rb
pipes[:wakeup] = @wakeup
end

server = start_server if preload?
Copy link
Copy Markdown
Collaborator Author

@joshuay03 joshuay03 Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we can pass the preloaded app into the new worker, we can consistently start the server there instead.

@joshuay03 joshuay03 force-pushed the fix-3599-2 branch 4 times, most recently from 5439297 to 461fd56 Compare January 12, 2025 10:38
attr_reader :index, :master

def initialize(index:, master:, launcher:, pipes:, server: nil)
def initialize(index:, master:, launcher:, pipes:, app: nil)
Copy link
Copy Markdown
Collaborator Author

@joshuay03 joshuay03 Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two options here:

  1. Continue to accept server and simply ensure that @app is initialized with @server.app if so. Then, calling start_server in #spawn_worker will use the preloaded worker 0 app to start a new server which we can give to the new worker. This would then mirror the current Cluster#spawn_worker implementation.
  2. Let the worker consistently be responsible for starting its own server, and instead accept the preloaded app where needed, which is what we really intend to preload. This makes the whole inheritance chain much easier to grok imo, and introduces a clear separation of responsibilities.

@joshuay03 joshuay03 force-pushed the fix-3599-2 branch 3 times, most recently from 37dee58 to 7982fe9 Compare January 12, 2025 13:09
Comment on lines +454 to +482
def test_application_is_loaded_exactly_once_if_using_fork_worker
cli_server "-w #{workers} --fork-worker test/rackup/write_to_stdout_on_boot.ru"

get_worker_pids
loading_app_count = @server_log.scan("Loading app").length
assert_equal 1, loading_app_count # loaded in worker 0

@server_log.clear
Process.kill :SIGURG, @pid

get_worker_pids 1, workers - 1
loading_app_count = @server_log.scan("Loading app").length
assert_equal 0, loading_app_count
end

def test_application_is_loaded_exactly_once_if_using_preload_app_with_fork_worker
cli_server "-w #{workers} --preload --fork-worker test/rackup/write_to_stdout_on_boot.ru"

get_worker_pids
loading_app_count = @server_log.scan("Loading app").length
assert_equal 1, loading_app_count # loaded in master

@server_log.clear
Process.kill :SIGURG, @pid

get_worker_pids 1, workers - 1
loading_app_count = @server_log.scan("Loading app").length
assert_equal 0, loading_app_count
end
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These pass on master as well, I added it to ensure I don't regress on this behaviour. I've also done some puts debugging to verify this manually.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted into #3690

Comment thread lib/puma/cluster.rb Outdated
@joshuay03 joshuay03 marked this pull request as ready for review January 12, 2025 13:50
@github-actions github-actions Bot added the waiting-for-review Waiting on review from anyone label Jan 12, 2025
@joshuay03
Copy link
Copy Markdown
Collaborator Author

@nateberkopec Can this be considered for v7?

@joshuay03 joshuay03 mentioned this pull request Aug 2, 2025
@dentarg
Copy link
Copy Markdown
Member

dentarg commented Aug 2, 2025

Is this an alternative to #3643?

@dentarg
Copy link
Copy Markdown
Member

dentarg commented Aug 2, 2025

Or is it an alternative to #3629?

@joshuay03
Copy link
Copy Markdown
Collaborator Author

joshuay03 commented Aug 3, 2025

Is this an alternative to #3643?

Neither actually, this is purely to address #3599. mold_worker should also benefit from this, but I'm going to only focus my testing on fork_worker which is the topic of the issue.

I'll redo the load testing comparison in my last comment on the issue now that @MSP-Greg's long-tail fix and my other fix to address the issue are on master, to confirm that this is still needed (I suspect it is).

Edit: This is still needed: #3599 (comment)

@joshuay03 joshuay03 removed the waiting-for-review Waiting on review from anyone label Nov 28, 2025
@github-actions github-actions Bot added the waiting-for-review Waiting on review from anyone label Nov 28, 2025
@joshuay03
Copy link
Copy Markdown
Collaborator Author

Merging because CI is green (except for the Bundler v4 failures on main) and I've left more than enough evidence in the form of benchmarks in #3599 that this results in a significant performance improvement to fork_worker.

@joshuay03 joshuay03 removed the waiting-for-review Waiting on review from anyone label Nov 28, 2025
@joshuay03 joshuay03 merged commit 731b97d into puma:main Nov 28, 2025
60 of 71 checks passed
@joshuay03 joshuay03 moved this from In Progress / Pending Review to Done in Open Source Nov 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Using fork_worker results in a large performance degradation (5x slowly)

3 participants