diff mbox series

[v2] net/tls: Fix slab-use-after-free in tls_encrypt_done

Message ID VI1P193MB0752321F24623E024C87886A99D6A@VI1P193MB0752.EURP193.PROD.OUTLOOK.COM (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [v2] net/tls: Fix slab-use-after-free in tls_encrypt_done | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1363 this patch: 1363
netdev/cc_maintainers success CCed 7 of 7 maintainers
netdev/build_clang success Errors and warnings before: 1388 this patch: 1388
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1388 this patch: 1388
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 22 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Juntong Deng Oct. 17, 2023, 4:22 p.m. UTC
In the current implementation, ctx->async_wait.completion is completed
after spin_lock_bh, which causes tls_sw_release_resources_tx to
continue executing and return to tls_sk_proto_cleanup, then return
to tls_sk_proto_close, and after that enter tls_sw_free_ctx_tx to kfree
the entire struct tls_sw_context_tx (including ctx->encrypt_compl_lock).

Since ctx->encrypt_compl_lock has been freed, subsequent spin_unlock_bh
will result in slab-use-after-free error. Due to SMP, even using
spin_lock_bh does not prevent tls_sw_release_resources_tx from continuing
on other CPUs. After tls_sw_release_resources_tx is woken up, there is no
attempt to hold ctx->encrypt_compl_lock again, therefore everything
described above is possible.

The fix is to put complete(&ctx->async_wait.completion) after
spin_unlock_bh, making the release after the unlock. Since complete is
only executed if pending is 0, which means this is the last record, there
is no need to worry about race condition causing duplicate completes.

Since tls_sw_release_resources_tx freed all un-sent records in the
ctx->tx_list, subsequent schedule transmission is meaningless, and
the entire struct tls_sw_context_tx has been freed (including
ctx->tx_bitmask and ctx->tx_work.work), continuing to execute
subsequent code will cause use-after-free, so return directly.

Reported-by: syzbot+29c22ea2d6b2c5fd2eae@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=29c22ea2d6b2c5fd2eae
Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
---
V1 -> V2: Fix possible use-after-free caused by test_and_set_bit

 net/tls/tls_sw.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Jakub Kicinski Oct. 19, 2023, 1:25 a.m. UTC | #1
On Wed, 18 Oct 2023 00:22:15 +0800 Juntong Deng wrote:
> In the current implementation, ctx->async_wait.completion is completed
> after spin_lock_bh, which causes tls_sw_release_resources_tx to
> continue executing and return to tls_sk_proto_cleanup, then return
> to tls_sk_proto_close, and after that enter tls_sw_free_ctx_tx to kfree
> the entire struct tls_sw_context_tx (including ctx->encrypt_compl_lock).
> 
> Since ctx->encrypt_compl_lock has been freed, subsequent spin_unlock_bh
> will result in slab-use-after-free error. Due to SMP, even using
> spin_lock_bh does not prevent tls_sw_release_resources_tx from continuing
> on other CPUs. After tls_sw_release_resources_tx is woken up, there is no
> attempt to hold ctx->encrypt_compl_lock again, therefore everything
> described above is possible.

Whoever triggered the Tx should wait for all outstanding encryption 
to finish before exiting sendmsg() (or alike).  This looks like 
a band-aid. Sabrina is working on fixes for the async code, lets
get those in first before attempting spot fixes.
diff mbox series

Patch

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 270712b8d391..b21f5fecc84e 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -441,6 +441,7 @@  static void tls_encrypt_done(void *data, int err)
 	struct sk_msg *msg_en;
 	bool ready = false;
 	struct sock *sk;
+	int async_notify;
 	int pending;
 
 	msg_en = &rec->msg_encrypted;
@@ -482,10 +483,13 @@  static void tls_encrypt_done(void *data, int err)
 
 	spin_lock_bh(&ctx->encrypt_compl_lock);
 	pending = atomic_dec_return(&ctx->encrypt_pending);
+	async_notify = ctx->async_notify;
+	spin_unlock_bh(&ctx->encrypt_compl_lock);
 
-	if (!pending && ctx->async_notify)
+	if (!pending && async_notify) {
 		complete(&ctx->async_wait.completion);
-	spin_unlock_bh(&ctx->encrypt_compl_lock);
+		return;
+	}
 
 	if (!ready)
 		return;