diff mbox series

tcp: Add an extra check for consecutive failed keepalive probes

Message ID 20250109161802.3599-1-sensor1010@163.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series tcp: Add an extra check for consecutive failed keepalive probes | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1 this patch: 1
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/build_clang success Errors and warnings before: 3 this patch: 3
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/checkpatch warning CHECK: Alignment should match open parenthesis WARNING: line length of 85 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 3 this patch: 3
netdev/source_inline success Was 0 now: 0
netdev/contest fail net-next-2025-01-09--18-00 (tests: 882)

Commit Message

Lizhe Jan. 9, 2025, 4:18 p.m. UTC
Add an additional check to handle situations where consecutive
keepalive probe packets are sent without receiving a response.

Signed-off-by: Lizhe <sensor1010@163.com>
---
 net/ipv4/tcp_timer.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Neal Cardwell Jan. 9, 2025, 4:31 p.m. UTC | #1
On Thu, Jan 9, 2025 at 11:21 AM Lizhe <sensor1010@163.com> wrote:
>
> Add an additional check to handle situations where consecutive
> keepalive probe packets are sent without receiving a response.
>
> Signed-off-by: Lizhe <sensor1010@163.com>
> ---
>  net/ipv4/tcp_timer.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index b412ed88ccd9..5a5dee8cd6d3 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -828,6 +828,12 @@ static void tcp_keepalive_timer (struct timer_list *t)
>                 }
>                 if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
>                         icsk->icsk_probes_out++;
> +                       if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
> +                               tcp_send_active_reset(sk, GFP_ATOMIC,
> +                                               SK_RST_REASON_TCP_KEEPALIVE_TIMEOUT);
> +                               tcp_write_err(sk);
> +                               goto out;
> +                       }
>                         elapsed = keepalive_intvl_when(tp);
>                 } else {
>                         /* If keepalive was lost due to local congestion,
> --

Can you please explain the exact motivation for your patch, ideally
providing either a tcpdump trace or packetdrill test to document the
scenario you are concerned about?

The Linux TCP keepalive logic in tcp_keepalive_timer() already
includes logic (a few lines above the spot you propose to patch) that
ensures that a connection will be closed with ETIMEDOUT if consecutive
keepalive probes fail:

                if ((user_timeout != 0 &&
                    elapsed >= msecs_to_jiffies(user_timeout) &&
                    icsk->icsk_probes_out > 0) ||
                    (user_timeout == 0 &&
                    icsk->icsk_probes_out >= keepalive_probes(tp))) {
                        tcp_send_active_reset(sk, GFP_ATOMIC,

SK_RST_REASON_TCP_KEEPALIVE_TIMEOUT);
                        tcp_write_err(sk);
                        goto out;
                }
                if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
                        icsk->icsk_probes_out++;
                        elapsed = keepalive_intvl_when(tp);

AFAICT your patch nearly duplicates the existing logic, but changes
the application-visible behavior to close the connection after one
fewer timer expiration, thus breaking the semantics of the
net.ipv4.tcp_keepalive_probes.

neal

---

ps: For reference, here is a packetdrill test we use to test this
functionality; this passes on recent Linux kernels:

// Test TCP keepalive behavior without TCP timestamps enabled.

`../common/defaults.sh`

// Create a socket.
    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0

   +0 bind(3, ..., ...) = 0
   +0 listen(3, 1) = 0

// Establish a connection.
   +0 < S 0:0(0) win 20000 <mss 1000,nop,nop,sackOK,nop,wscale 8>
   +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
  +.1 < . 1:1(0) ack 1 win 20000
   +0 accept(3, ..., ...) = 4

// Verify keepalives are disabled by default.
   +0 getsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [0], [4]) = 0
// Enable keepalives:
   +0 setsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0

// Verify default TCP_KEEPIDLE is 7200, from net.ipv4.tcp_keepalive_time=7200:
   +0 getsockopt(4, SOL_TCP, TCP_KEEPIDLE, [7200], [4]) = 0
// Start sending keepalive probes after 3 seconds of idle
   +0 setsockopt(4, SOL_TCP, TCP_KEEPIDLE, [3], 4) = 0

// Verify default TCP_KEEPINTVL is 75, from net.ipv4.tcp_keepalive_intvl=75:
   +0 getsockopt(4, SOL_TCP, TCP_KEEPINTVL, [75], [4]) = 0
// Send keepalive probes every 2 seconds.
   +0 setsockopt(4, SOL_TCP, TCP_KEEPINTVL, [2], 4) = 0

// Verify default TCP_KEEPCNT is 9, from net.ipv4.tcp_keepalive_probes=9:
   +0 getsockopt(4, SOL_TCP, TCP_KEEPCNT, [9], [4]) = 0
// Send 4 keepalive probes before giving up.
   +0 setsockopt(4, SOL_TCP, TCP_KEEPCNT, [4], 4) = 0

// Set up an epoll operation to verify that connections terminated by failed
// keepalives will wake up blocked epoll waiters with EPOLLERR|EPOLLHUP:
   +0 epoll_create(1) = 5
   +0 epoll_ctl(5, EPOLL_CTL_ADD, 4, {events=EPOLLERR, fd=4}) = 0
   +0...11 epoll_wait(5, {events=EPOLLERR|EPOLLHUP, fd=4}, 1, 15000) = 1

// Verify keepalive behavior looks correct, given the parameters above:

// Start sending keepalive probes after 3 seconds of idle.
   +3 > . 0:0(0) ack 1
// Send keepalive probes every 2 seconds.
   +2 > . 0:0(0) ack 1
   +2 > . 0:0(0) ack 1
   +2 > . 0:0(0) ack 1
   +2 > R. 1:1(0) ack 1
// Sent 4 keepalive probes and then gave up and reset the connection.

// Verify that we get the expected error when we try to use the socket:
   +0 read(4, ..., 1000) = -1 ETIMEDOUT (Connection timed out)
diff mbox series

Patch

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index b412ed88ccd9..5a5dee8cd6d3 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -828,6 +828,12 @@  static void tcp_keepalive_timer (struct timer_list *t)
 		}
 		if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
 			icsk->icsk_probes_out++;
+			if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
+				tcp_send_active_reset(sk, GFP_ATOMIC,
+						SK_RST_REASON_TCP_KEEPALIVE_TIMEOUT);
+				tcp_write_err(sk);
+				goto out;
+			}
 			elapsed = keepalive_intvl_when(tp);
 		} else {
 			/* If keepalive was lost due to local congestion,