
SUNRPC: timeout and cancel TLS handshake with -ETIMEDOUT

Message ID ee226061afc4152fb8c6a829565dc5af390842ec.1731678901.git.bcodding@redhat.com (mailing list archive)
State New

Commit Message

Benjamin Coddington Nov. 15, 2024, 1:59 p.m. UTC
We've noticed a situation where an unstable TCP connection can cause the
TLS handshake to time out waiting for userspace to complete it.  When this
happens, we don't want to return from xs_tls_handshake_sync() with zero, as
this will cause the upper xprt to be set CONNECTED, and subsequent attempts
to transmit will be returned with -EPIPE.  The sunrpc machine does not
recover from this situation and will spin attempting to transmit.

The return value of tls_handshake_cancel() can be used to detect a race
with completion:

 * tls_handshake_cancel - cancel a pending handshake
 * Return values:
 *   %true - Uncompleted handshake request was canceled
 *   %false - Handshake request already completed or not found

If true, we do not want the upper xprt to be connected, so return
-ETIMEDOUT.  If false, it's possible the handshake request was lost and
that may be the reason for our timeout.  Again we do not want the upper
xprt to be connected, so return -ETIMEDOUT.

Ensure that we always return an error from xs_tls_handshake_sync() if we
call tls_handshake_cancel().

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 net/sunrpc/xprtsock.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)


base-commit: a9cda7c0ffedb47b23002e109bd26ab2a2ab99c9

Comments

Chuck Lever Nov. 15, 2024, 2:34 p.m. UTC | #1
On Fri, Nov 15, 2024 at 08:59:36AM -0500, Benjamin Coddington wrote:
> We've noticed a situation where an unstable TCP connection can cause the
> TLS handshake to time out waiting for userspace to complete it.  When this
> happens, we don't want to return from xs_tls_handshake_sync() with zero, as
> this will cause the upper xprt to be set CONNECTED, and subsequent attempts
> to transmit will be returned with -EPIPE.  The sunrpc machine does not
> recover from this situation and will spin attempting to transmit.
> 
> The return value of tls_handshake_cancel() can be used to detect a race
> with completion:
> 
>  * tls_handshake_cancel - cancel a pending handshake
>  * Return values:
>  *   %true - Uncompleted handshake request was canceled
>  *   %false - Handshake request already completed or not found
> 
> If true, we do not want the upper xprt to be connected, so return
> -ETIMEDOUT.  If false, it's possible the handshake request was lost and
> that may be the reason for our timeout.  Again we do not want the upper
> xprt to be connected, so return -ETIMEDOUT.

If false, it might be that the handshake succeeded but the kernel
missed the downcall. I think that means the kernel doesn't have the
handshake completion status, so it shouldn't assume the handshake
succeeded, only that it finished.

Reviewed-by: Chuck Lever <chuck.lever@oracle.com>


> Ensure that we always return an error from xs_tls_handshake_sync() if we
> call tls_handshake_cancel().
> 
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> ---
>  net/sunrpc/xprtsock.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 1326fbf45a34..95161a8cd138 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -2614,11 +2614,10 @@ static int xs_tls_handshake_sync(struct rpc_xprt *lower_xprt, struct xprtsec_par
>  	rc = wait_for_completion_interruptible_timeout(&lower_transport->handshake_done,
>  						       XS_TLS_HANDSHAKE_TO);
>  	if (rc <= 0) {
> -		if (!tls_handshake_cancel(sk)) {
> -			if (rc == 0)
> -				rc = -ETIMEDOUT;
> -			goto out_put_xprt;
> -		}
> +		tls_handshake_cancel(sk);
> +		if (rc == 0)
> +			rc = -ETIMEDOUT;
> +		goto out_put_xprt;
>  	}
>  
>  	rc = lower_transport->xprt_err;
> 
> base-commit: a9cda7c0ffedb47b23002e109bd26ab2a2ab99c9
> -- 
> 2.47.0
>

Patch

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 1326fbf45a34..95161a8cd138 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2614,11 +2614,10 @@ static int xs_tls_handshake_sync(struct rpc_xprt *lower_xprt, struct xprtsec_par
 	rc = wait_for_completion_interruptible_timeout(&lower_transport->handshake_done,
 						       XS_TLS_HANDSHAKE_TO);
 	if (rc <= 0) {
-		if (!tls_handshake_cancel(sk)) {
-			if (rc == 0)
-				rc = -ETIMEDOUT;
-			goto out_put_xprt;
-		}
+		tls_handshake_cancel(sk);
+		if (rc == 0)
+			rc = -ETIMEDOUT;
+		goto out_put_xprt;
 	}
 
 	rc = lower_transport->xprt_err;
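
To make the behavioural change in the hunk above easier to see, here is a minimal userspace sketch of the timeout branch before and after the patch. It is a model only, not kernel code: the function names handshake_result_old() and handshake_result_new() are hypothetical, the wait result and the tls_handshake_cancel() return value are passed in as plain arguments, and the xprt_err parameter is an assumed stand-in for lower_transport->xprt_err.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Userspace model of the "rc <= 0" branch in xs_tls_handshake_sync().
 * All names here are hypothetical; nothing below is kernel API.
 *
 *   wait_rc   result of the wait: > 0 completed, 0 timed out, < 0 interrupted
 *   canceled  what tls_handshake_cancel() returned
 *   xprt_err  stand-in for lower_transport->xprt_err
 */

/* Pre-patch flow: a successful cancel falls through to xprt_err. */
static int handshake_result_old(long wait_rc, bool canceled, int xprt_err)
{
	if (wait_rc <= 0) {
		if (!canceled) {
			if (wait_rc == 0)
				return -ETIMEDOUT;
			return (int)wait_rc;
		}
		/* canceled == true: no early exit, xprt_err is returned */
	}
	return xprt_err;
}

/* Post-patch flow: any timeout or interruption returns an error. */
static int handshake_result_new(long wait_rc, bool canceled, int xprt_err)
{
	(void)canceled;	/* the cancel result no longer affects the outcome */

	if (wait_rc <= 0)
		return wait_rc == 0 ? -ETIMEDOUT : (int)wait_rc;
	return xprt_err;
}
```

In this model, a timed-out wait followed by a successful cancel makes the old flow return whatever xprt_err holds, while the new flow returns -ETIMEDOUT regardless of the cancel result. The completed case (wait_rc > 0) is unchanged between the two.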