diff mbox series

net/ncsi: Avoid channel_monitor hrtimer deadlock

Message ID 20210329152039.15189-1-eajames@linux.ibm.com (mailing list archive)
State Accepted
Commit 03cb4d05b4ea9a3491674ca40952adb708d549fa
Delegated to: Netdev Maintainers
Headers show
Series net/ncsi: Avoid channel_monitor hrtimer deadlock | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Guessed tree name to be net-next
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cc_maintainers success CCed 4 of 4 maintainers
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 19 this patch: 19
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 36 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 19 this patch: 19
netdev/header_inline success Link

Commit Message

Eddie James March 29, 2021, 3:20 p.m. UTC
From: Milton Miller <miltonm@us.ibm.com>

Calling ncsi_stop_channel_monitor from channel_monitor is a guaranteed
deadlock on SMP because stop calls del_timer_sync on the timer that
inoked channel_monitor as its timer function.

Recognise the inherent race of marking the monitor disabled before
deleting the timer by just returning if enable was cleared.  After
a timeout (the default case -- reset to START when response received)
just mark the monitor.enabled false.

If the channel has an entrie on the channel_queue list, or if the
state is not ACTIVE or INACTIVE, then warn and mark the timer stopped
and don't restart, as the locking is broken somehow.

Fixes: 0795fb2021f0 ("net/ncsi: Stop monitor if channel times out or is inactive")
Signed-off-by: Milton Miller <miltonm@us.ibm.com>
Signed-off-by: Eddie James <eajames@linux.ibm.com>
---
 net/ncsi/ncsi-manage.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

Comments

patchwork-bot+netdevbpf@kernel.org March 30, 2021, 8:20 p.m. UTC | #1
Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Mon, 29 Mar 2021 10:20:39 -0500 you wrote:
> From: Milton Miller <miltonm@us.ibm.com>
> 
> Calling ncsi_stop_channel_monitor from channel_monitor is a guaranteed
> deadlock on SMP because stop calls del_timer_sync on the timer that
> inoked channel_monitor as its timer function.
> 
> Recognise the inherent race of marking the monitor disabled before
> deleting the timer by just returning if enable was cleared.  After
> a timeout (the default case -- reset to START when response received)
> just mark the monitor.enabled false.
> 
> [...]

Here is the summary with links:
  - net/ncsi: Avoid channel_monitor hrtimer deadlock
    https://git.kernel.org/netdev/net/c/03cb4d05b4ea

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
diff mbox series

Patch

diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
index a9cb355324d1..ffff8da707b8 100644
--- a/net/ncsi/ncsi-manage.c
+++ b/net/ncsi/ncsi-manage.c
@@ -105,13 +105,20 @@  static void ncsi_channel_monitor(struct timer_list *t)
 	monitor_state = nc->monitor.state;
 	spin_unlock_irqrestore(&nc->lock, flags);
 
-	if (!enabled || chained) {
-		ncsi_stop_channel_monitor(nc);
-		return;
-	}
+	if (!enabled)
+		return;		/* expected race disabling timer */
+	if (WARN_ON_ONCE(chained))
+		goto bad_state;
+
 	if (state != NCSI_CHANNEL_INACTIVE &&
 	    state != NCSI_CHANNEL_ACTIVE) {
-		ncsi_stop_channel_monitor(nc);
+bad_state:
+		netdev_warn(ndp->ndev.dev,
+			    "Bad NCSI monitor state channel %d 0x%x %s queue\n",
+			    nc->id, state, chained ? "on" : "off");
+		spin_lock_irqsave(&nc->lock, flags);
+		nc->monitor.enabled = false;
+		spin_unlock_irqrestore(&nc->lock, flags);
 		return;
 	}
 
@@ -136,10 +143,9 @@  static void ncsi_channel_monitor(struct timer_list *t)
 		ncsi_report_link(ndp, true);
 		ndp->flags |= NCSI_DEV_RESHUFFLE;
 
-		ncsi_stop_channel_monitor(nc);
-
 		ncm = &nc->modes[NCSI_MODE_LINK];
 		spin_lock_irqsave(&nc->lock, flags);
+		nc->monitor.enabled = false;
 		nc->state = NCSI_CHANNEL_INVISIBLE;
 		ncm->data[2] &= ~0x1;
 		spin_unlock_irqrestore(&nc->lock, flags);