diff mbox series

tls: defer close to kernel task

Message ID 20240410-ktls-defer-close-v1-1-b59e6626b8e4@pengutronix.de (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series tls: defer close to kernel task | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 942 this patch: 942
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/build_clang success Errors and warnings before: 953 this patch: 953
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 953 this patch: 953
netdev/checkpatch warning WARNING: line length of 84 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest warning net-next-2024-04-10--12-00 (tests: 868)

Commit Message

Sascha Hauer April 10, 2024, 6:33 a.m. UTC
proto->close is normally called from a userspace task which can be
interrupted by signals. When asynchronous encryption is used then KTLS
sends out the final data at close time. When a signal comes in during
close then it can happen tcp_sendmsg_locked() is interrupted by that
signal while waiting for memory in sk_stream_wait_memory() which then
returns with -ERSTARTSYS. It is not possible to recover from this situation
and the final transmit data is lost.

With this patch we defer the close operation to a kernel task which
doesn't get signals.

The described situation happens when KTLS is used in conjunction with
io_uring, as io_uring uses task_work_add() to add work to the current
userspace task.

The problem is discussed in [1] and [2] and the solution implemented in
this patch is suggested by Pavel Begunkov here [3]

[1] https://lore.kernel.org/all/20231010141932.GD3114228@pengutronix.de/
[2] https://lore.kernel.org/all/20240315100159.3898944-1-s.hauer@pengutronix.de/
[3] https://lore.kernel.org/all/bfc6afa9-501f-40b6-929a-3aa8c0298265@gmail.com

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---

Note I only need this asynchronous close for the
ctx->tx_conf == TLS_SW case. I can refactor the patch to only go
asynchronous when necessary if that's desired.
---
 net/tls/tls_main.c | 37 +++++++++++++++++++++++++++++++------
 1 file changed, 31 insertions(+), 6 deletions(-)


---
base-commit: fec50db7033ea478773b159e0e2efb135270e3b7
change-id: 20240410-ktls-defer-close-002934564b09

Best regards,

Comments

Jakub Kicinski April 10, 2024, 1:11 p.m. UTC | #1
On Wed, 10 Apr 2024 08:33:07 +0200 Sascha Hauer wrote:
> proto->close is normally called from a userspace task which can be
> interrupted by signals. When asynchronous encryption is used then KTLS
> sends out the final data at close time. When a signal comes in during
> close then it can happen tcp_sendmsg_locked() is interrupted by that
> signal while waiting for memory in sk_stream_wait_memory() which then
> returns with -ERSTARTSYS. It is not possible to recover from this situation
> and the final transmit data is lost.
> 
> With this patch we defer the close operation to a kernel task which
> doesn't get signals.
> 
> The described situation happens when KTLS is used in conjunction with
> io_uring, as io_uring uses task_work_add() to add work to the current
> userspace task.
> 
> The problem is discussed in [1] and [2] and the solution implemented in
> this patch is suggested by Pavel Begunkov here [3]

Appears to crash reliably.
Please run the tls selftests with KASAN enabled.
Sascha Hauer April 12, 2024, 7:25 a.m. UTC | #2
On Wed, Apr 10, 2024 at 06:11:28AM -0700, Jakub Kicinski wrote:
> On Wed, 10 Apr 2024 08:33:07 +0200 Sascha Hauer wrote:
> > proto->close is normally called from a userspace task which can be
> > interrupted by signals. When asynchronous encryption is used then KTLS
> > sends out the final data at close time. When a signal comes in during
> > close then it can happen tcp_sendmsg_locked() is interrupted by that
> > signal while waiting for memory in sk_stream_wait_memory() which then
> > returns with -ERSTARTSYS. It is not possible to recover from this situation
> > and the final transmit data is lost.
> > 
> > With this patch we defer the close operation to a kernel task which
> > doesn't get signals.
> > 
> > The described situation happens when KTLS is used in conjunction with
> > io_uring, as io_uring uses task_work_add() to add work to the current
> > userspace task.
> > 
> > The problem is discussed in [1] and [2] and the solution implemented in
> > this patch is suggested by Pavel Begunkov here [3]
> 
> Appears to crash reliably.
> Please run the tls selftests with KASAN enabled.

Oops, will run the tests and fix the fallout before resending.

Sascha
diff mbox series

Patch

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index b4674f03d71a9..b0b7e0d2f1145 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -365,16 +365,21 @@  static void tls_sk_proto_cleanup(struct sock *sk,
 	}
 }
 
-static void tls_sk_proto_close(struct sock *sk, long timeout)
+struct tls_close_work {
+	struct work_struct work;
+	struct tls_context *ctx;
+	long timeout;
+};
+
+static void deferred_close(struct work_struct *work)
 {
+	struct tls_close_work *cw = container_of(work, struct tls_close_work, work);
+	struct tls_context *ctx = cw->ctx;
+	struct sock *sk = ctx->sk;
 	struct inet_connection_sock *icsk = inet_csk(sk);
-	struct tls_context *ctx = tls_get_ctx(sk);
 	long timeo = sock_sndtimeo(sk, 0);
 	bool free_ctx;
 
-	if (ctx->tx_conf == TLS_SW)
-		tls_sw_cancel_work_tx(ctx);
-
 	lock_sock(sk);
 	free_ctx = ctx->tx_conf != TLS_HW && ctx->rx_conf != TLS_HW;
 
@@ -395,10 +400,30 @@  static void tls_sk_proto_close(struct sock *sk, long timeout)
 		tls_sw_strparser_done(ctx);
 	if (ctx->rx_conf == TLS_SW)
 		tls_sw_free_ctx_rx(ctx);
-	ctx->sk_proto->close(sk, timeout);
+	ctx->sk_proto->close(sk, cw->timeout);
 
 	if (free_ctx)
 		tls_ctx_free(sk, ctx);
+
+	kfree(cw);
+}
+
+static void tls_sk_proto_close(struct sock *sk, long timeout)
+{
+	struct tls_context *ctx = tls_get_ctx(sk);
+	struct tls_close_work *cw;
+
+	if (ctx->tx_conf == TLS_SW)
+		tls_sw_cancel_work_tx(ctx);
+
+	cw = kmalloc(sizeof(*cw), GFP_KERNEL);
+	if (!cw)
+		return;
+
+	INIT_WORK(&cw->work, deferred_close);
+	cw->timeout = timeout;
+	cw->ctx = ctx;
+	queue_work(system_unbound_wq, &cw->work);
 }
 
 static __poll_t tls_sk_poll(struct file *file, struct socket *sock,