Fix FIN_WAIT_2 accumulation by draining sockets before close #615

Open
renaudallard wants to merge 1 commit into tinyproxy:master from renaudallard:fix-finwait2-leak

@renaudallard

Re-opens #600 with fresh code on top of current master (the original branch was deleted during a repo cleanup). @rofl0r had said he planned to merge it after reading up on the close semantics — this is the same fix, rebased and tidied up.

Problem

On OpenBSD, proxied connections accumulate in FIN_WAIT_2 and are never reaped. Once enough build up, the proxy stalls.

tcp 0 0 172.20.66.101.8080 172.20.66.3.17914 FIN_WAIT_2
tcp 0 0 172.20.66.101.8080 172.20.66.3.13047 FIN_WAIT_2
...

After relay_connection() sends a FIN with shutdown(SHUT_WR), conn_destroy_contents() calls close() without waiting for the peer's FIN, orphaning the socket while it is still in FIN_WAIT_2. The idle-timeout and poll-error return paths are worse: they skip shutdown() entirely. On Linux this is masked by net.ipv4.tcp_fin_timeout (default 60s), which reaps orphaned FIN_WAIT_2 sockets; OpenBSD has no equivalent, so the orphaned sockets persist indefinitely.

Why a read after shutdown (re: @rofl0r's question on #600)

It isn't that a read is "required" after shutdown in general — it's the mechanism to detect the peer's FIN so we keep the fd open until the four-way handshake finishes. shutdown(SHUT_WR) sends our FIN while the fd stays open; read() returning 0 means the peer's FIN arrived and the socket moved to TIME_WAIT. Closing then is harmless because TIME_WAIT is self-limiting (2×MSL). Closing earlier, in FIN_WAIT_2, orphans the socket and leaves its fate to a kernel timeout OpenBSD doesn't have. References: RFC 793 §3.5 and Stevens, UNIX Network Programming Vol. 1 §6.6.
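
To make the ordering concrete, here is a minimal sketch of the half-close-then-wait pattern. It is not code from this patch: half_close_and_wait() is a hypothetical name, and it deliberately has no timeout, so it only illustrates how read() == 0 signals the peer's FIN.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only: send our FIN, then discard
 * input until the peer's FIN shows up as read() == 0.
 * Returns the number of bytes discarded, or -1 on error. */
static long half_close_and_wait(int fd)
{
    char buf[512];
    long discarded = 0;

    assert(fd >= 0);

    /* Send our FIN; the descriptor stays open for reading.
     * ENOTCONN only means the connection is already gone. */
    if (shutdown(fd, SHUT_WR) < 0 && errno != ENOTCONN)
        return -1;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);

        if (n == 0)
            return discarded;   /* EOF: the peer's FIN has arrived */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }
        discarded += n;         /* late data from the peer, ignore it */
    }
}
```

Only after this returns is close() called, so the kernel never sees an orphaned socket in FIN_WAIT_2.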

Fix

  • Add close_socket() in sock.c: shutdown(SHUT_WR), drain the peer with a 10s SO_RCVTIMEO until read() returns 0, then close(). It's self-contained, so it also covers the timeout/poll-error paths that previously skipped shutdown(). The 10s cap means a peer that never sends its FIN can't tie up a worker thread indefinitely.
  • Use it for both descriptors in conn_destroy_contents().
  • Add the matching shutdown(server_fd, SHUT_WR) in relay_connection(), symmetric with the existing client-side shutdown, so the upstream gets its FIN promptly.
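
For readers who want the shape of the helper without opening the diff, the steps in the first bullet can be sketched roughly as follows. This is an illustration under assumptions, not the close_socket() in sock.c: the name close_socket_sketch(), the DRAIN_TIMEOUT_SECS macro, the buffer size, and the choice to ignore setsockopt()/shutdown() failures are all mine.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define DRAIN_TIMEOUT_SECS 10   /* the 10s cap from the description */

/* Illustrative sketch only; the real helper may differ in detail. */
static int close_socket_sketch(int fd)
{
    struct timeval tv = { .tv_sec = DRAIN_TIMEOUT_SECS, .tv_usec = 0 };
    char buf[1024];
    ssize_t n;

    assert(fd >= 0);

    /* Bound each read() so a silent peer cannot block us forever. */
    (void) setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);

    /* Send our FIN. The result is ignored: an earlier code path may
     * already have shut down the write side. */
    (void) shutdown(fd, SHUT_WR);

    /* Discard input until EOF (peer's FIN), an error, or a timeout.
     * A timed-out read() returns -1 with EAGAIN/EWOULDBLOCK, which
     * ends the loop. */
    do {
        n = read(fd, buf, sizeof buf);
    } while (n > 0 || (n < 0 && errno == EINTR));

    return close(fd);
}
```

Note that in this sketch SO_RCVTIMEO bounds each read() individually; a peer that keeps trickling data would restart the clock, so a hard total deadline would need an explicit elapsed-time check. Because the helper performs its own shutdown(), callers on the timeout and poll-error paths need no extra step before handing it a descriptor.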

Testing

Tested on OpenBSD: connections transition through TIME_WAIT normally instead of accumulating in FIN_WAIT_2. Builds cleanly on Linux with no new warnings.

conn_destroy_contents() called close() right after a FIN was sent to
the peer, orphaning the socket while it was still in FIN_WAIT_2.  Linux
reaps such sockets via net.ipv4.tcp_fin_timeout, but OpenBSD has no
equivalent, so they pile up until the proxy stalls.

Add close_socket(), which sends our FIN with shutdown(SHUT_WR), drains
the peer until read() returns 0, then close()s.  By then the close
handshake is complete and the socket moves to TIME_WAIT, which the
kernel reaps on its own.  Use it for both descriptors in
conn_destroy_contents() and add the matching shutdown(server_fd,
SHUT_WR) in relay_connection().