[net,2/2] net/smc: fix kernel panic caused by race of smc_sock

A crash occurs when smc_cdc_tx_handler() tries to access smc_sock
but smc_release() has already freed it.

[ 4570.695099] BUG: unable to handle page fault for address: 000000002eae9e88
[ 4570.696048] #PF: supervisor write access in kernel mode
[ 4570.696728] #PF: error_code(0x0002) - not-present page
[ 4570.697401] PGD 0 P4D 0
[ 4570.697716] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 4570.698228] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc4+ #111
[ 4570.699013] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/0
[ 4570.699933] RIP: 0010:_raw_spin_lock+0x1a/0x30
<...>
[ 4570.711446] Call Trace:
[ 4570.711746]  <IRQ>
[ 4570.711992]  smc_cdc_tx_handler+0x41/0xc0
[ 4570.712470]  smc_wr_tx_tasklet_fn+0x213/0x560
[ 4570.712981]  ? smc_cdc_tx_dismisser+0x10/0x10
[ 4570.713489]  tasklet_action_common.isra.17+0x66/0x140
[ 4570.714083]  __do_softirq+0x123/0x2f4
[ 4570.714521]  irq_exit_rcu+0xc4/0xf0
[ 4570.714934]  common_interrupt+0xba/0xe0

Though smc_cdc_tx_handler() checked the existence of smc connection,
smc_release() may have already dismissed and released the smc socket
before smc_cdc_tx_handler() further visits it.

smc_cdc_tx_handler()           |smc_release()
if (!conn)                     |
                               |
                               |smc_cdc_tx_dismiss_slots()
                               |      smc_cdc_tx_dismisser()
                               |
                               |sock_put(&smc->sk) <- last sock_put,
                               |                      smc_sock freed
bh_lock_sock(&smc->sk) (panic) |

To make sure we won't receive any CDC messages after we free the
smc_sock, add a refcount on the smc_connection for inflight CDC
message(posted to the QP but haven't received related CQE), and
don't release the smc_connection until all the inflight CDC messages
haven been done, for both success or failed ones.

Using refcount on CDC messages brings another problem: when the link
is going to be destroyed, smcr_link_clear() will reset the QP, which
then remove all the pending CQEs related to the QP in the CQ. To make
sure all the CQEs will always come back so the refcount on the
smc_connection can always reach 0, smc_ib_modify_qp_reset() was replaced
by smc_ib_modify_qp_error().
And remove the timeout in smc_wr_tx_wait_no_pending_sends() since we
need to wait for all pending WQEs done, or we may encounter use-after-
free when handling CQEs.

For IB device removal routine, we need to wait for all the QPs on that
device been destroyed before we can destroy CQs on the device, or
the refcount on smc_connection won't reach 0 and smc_sock cannot be
released.

Fixes: 5f08318f617b ("smc: connection data control (CDC)")
Reported-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
---
 net/smc/smc.h      |  5 +++++
 net/smc/smc_cdc.c  | 52 +++++++++++++++++++++-------------------------
 net/smc/smc_cdc.h  |  2 +-
 net/smc/smc_core.c | 25 +++++++++++++++++-----
 net/smc/smc_ib.c   |  4 ++--
 net/smc/smc_ib.h   |  1 +
 net/smc/smc_wr.c   | 41 +++---------------------------------
 net/smc/smc_wr.h   |  3 +--
 8 files changed, 57 insertions(+), 76 deletions(-)

Message ID	20211228090325.27263-3-dust.li@linux.alibaba.com (mailing list archive)
State	Accepted
Commit	349d43127dac00c15231e8ffbcaabd70f7b0e544
Delegated to:	Netdev Maintainers
Headers	show Return-Path: <netdev-owner@kernel.org> From: Dust Li <dust.li@linux.alibaba.com> To: Karsten Graul <kgraul@linux.ibm.com>, "David S . Miller" <davem@davemloft.net>, Jakub Kicinski <kuba@kernel.org> Cc: linux-s390@vger.kernel.org, netdev@vger.kernel.org, Wen Gu <guwen@linux.alibaba.com>, Tony Lu <tonylu@linux.alibaba.com> Subject: [PATCH net 2/2] net/smc: fix kernel panic caused by race of smc_sock Date: Tue, 28 Dec 2021 17:03:25 +0800 Message-Id: <20211228090325.27263-3-dust.li@linux.alibaba.com> In-Reply-To: <20211228090325.27263-1-dust.li@linux.alibaba.com> References: <20211228090325.27263-1-dust.li@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	net/smc: fix kernel panic caused by race of smc_sock \| expand [net,0/2] net/smc: fix kernel panic caused by race of smc_sock [net,1/2] net/smc: don't send CDC/LLC message if link not ready [net,2/2] net/smc: fix kernel panic caused by race of smc_sock

Context	Check	Description
netdev/tree_selection	success	Clearly marked for net
netdev/fixes_present	success	Fixes tag present in non-next series
netdev/subject_prefix	success	Link
netdev/cover_letter	success	Series has a cover letter
netdev/patch_count	success	Link
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers	fail	1 blamed authors not CCed: ubraun@linux.vnet.ibm.com; 1 maintainers not CCed: ubraun@linux.vnet.ibm.com
netdev/build_clang	success	Errors and warnings before: 0 this patch: 0
netdev/module_param	success	Was 0 now: 0
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/verify_fixes	success	Fixes tag looks correct
netdev/build_allmodconfig_warn	success	Errors and warnings before: 0 this patch: 0
netdev/checkpatch	warning	WARNING: line length of 87 exceeds 80 columns
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0

[net,2/2] net/smc: fix kernel panic caused by race of smc_sock

Checks

Commit Message

Comments

Patch