diff mbox

IB/cm: Fix a recently introduced deadlock

Message ID 56866E6A.8040503@sandisk.com (mailing list archive)
State Accepted
Headers show

Commit Message

Bart Van Assche Jan. 1, 2016, 12:17 p.m. UTC
ib_send_cm_drep() calls cm_enter_timewait() while holding a spinlock
that can be locked from inside an interrupt handler. Hence do not
enable interrupts inside cm_enter_timewait() if called with interrupts
disabled.

This patch fixes e.g. the following deadlock:

Comments

Erez Shitrit Jan. 6, 2016, 11:35 a.m. UTC | #1
On Fri, Jan 1, 2016 at 2:17 PM, Bart Van Assche
<bart.vanassche@sandisk.com> wrote:
> ib_send_cm_drep() calls cm_enter_timewait() while holding a spinlock
> that can be locked from inside an interrupt handler. Hence do not
> enable interrupts inside cm_enter_timewait() if called with interrupts
> disabled.
>
> This patch fixes e.g. the following deadlock:
>
> =================================
> [ INFO: inconsistent lock state ]
> 4.4.0-rc7+ #1 Tainted: G            E
> ---------------------------------
> inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> swapper/8/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
> (&(&cm_id_priv->lock)->rlock){?.+...}, at: [<ffffffffa036eec4>] cm_establish+0x
> 74/0x1b0 [ib_cm]
> {HARDIRQ-ON-W} state was registered at:
>   [<ffffffff810a3c11>] mark_held_locks+0x71/0x90
>   [<ffffffff810a3e87>] trace_hardirqs_on_caller+0xa7/0x1c0
>   [<ffffffff810a3fad>] trace_hardirqs_on+0xd/0x10
>   [<ffffffff8151c40b>] _raw_spin_unlock_irq+0x2b/0x40
>   [<ffffffffa036ea8e>] cm_enter_timewait+0xae/0x100 [ib_cm]
>   [<ffffffffa036ff76>] ib_send_cm_drep+0xb6/0x190 [ib_cm]
>   [<ffffffffa052ed08>] srp_cm_handler+0x128/0x1a0 [ib_srp]
>   [<ffffffffa0370340>] cm_process_work+0x20/0xf0 [ib_cm]
>   [<ffffffffa0371335>] cm_dreq_handler+0x135/0x2c0 [ib_cm]
>   [<ffffffffa03733c5>] cm_work_handler+0x75/0xd0 [ib_cm]
>   [<ffffffff8107184d>] process_one_work+0x1bd/0x460
>   [<ffffffff81073148>] worker_thread+0x118/0x420
>   [<ffffffff81078454>] kthread+0xe4/0x100
>   [<ffffffff8151cbbf>] ret_from_fork+0x3f/0x70
> irq event stamp: 1672286
> hardirqs last  enabled at (1672283): [<ffffffff81408ec0>] poll_idle+0x10/0x80
> hardirqs last disabled at (1672284): [<ffffffff8151d304>] common_interrupt+0x84/0x89
> softirqs last  enabled at (1672286): [<ffffffff8105b4dc>] _local_bh_enable+0x1c/0x50
> softirqs last disabled at (1672285): [<ffffffff8105b697>] irq_enter+0x47/0x70
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock(&(&cm_id_priv->lock)->rlock);
>   <Interrupt>
>     lock(&(&cm_id_priv->lock)->rlock);
>
>  *** DEADLOCK ***
>
> no locks held by swapper/8/0.
>
> stack backtrace:
> CPU: 8 PID: 0 Comm: swapper/8 Tainted: G            E   4.4.0-rc7+ #1
> Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 11/17/2014
>  ffff88045af5e950 ffff88046e503a88 ffffffff81251c1b 0000000000000007
>  0000000000000006 0000000000000003 ffff88045af5ddc0 ffff88046e503ad8
>  ffffffff810a32f4 0000000000000000 0000000000000000 0000000000000001
> Call Trace:
>  <IRQ>  [<ffffffff81251c1b>] dump_stack+0x4f/0x74
>  [<ffffffff810a32f4>] print_usage_bug+0x184/0x190
>  [<ffffffff810a36e2>] mark_lock_irq+0xf2/0x290
>  [<ffffffff810a3995>] mark_lock+0x115/0x1b0
>  [<ffffffff810a3b8c>] mark_irqflags+0x15c/0x170
>  [<ffffffff810a4fef>] __lock_acquire+0x1ef/0x560
>  [<ffffffff810a53c2>] lock_acquire+0x62/0x80
>  [<ffffffff8151bd33>] _raw_spin_lock_irqsave+0x43/0x60
>  [<ffffffffa036eec4>] cm_establish+0x74/0x1b0 [ib_cm]
>  [<ffffffffa036f031>] ib_cm_notify+0x31/0x100 [ib_cm]
>  [<ffffffffa0637f24>] srpt_qp_event+0x54/0xd0 [ib_srpt]
>  [<ffffffffa0196052>] mlx4_ib_qp_event+0x72/0xc0 [mlx4_ib]
>  [<ffffffffa00775b9>] mlx4_qp_event+0x69/0xd0 [mlx4_core]
>  [<ffffffffa006000e>] mlx4_eq_int+0x51e/0xd50 [mlx4_core]
>  [<ffffffffa006084f>] mlx4_msi_x_interrupt+0xf/0x20 [mlx4_core]
>  [<ffffffff810b67b0>] handle_irq_event_percpu+0x40/0x110
>  [<ffffffff810b68bf>] handle_irq_event+0x3f/0x70
>  [<ffffffff810ba7f9>] handle_edge_irq+0x79/0x120
>  [<ffffffff81007f3d>] handle_irq+0x5d/0x130
>  [<ffffffff810071fd>] do_IRQ+0x6d/0x130
>  [<ffffffff8151d309>] common_interrupt+0x89/0x89
>  <EOI>  [<ffffffff8140895f>] cpuidle_enter_state+0xcf/0x200
>  [<ffffffff81408aa2>] cpuidle_enter+0x12/0x20
>  [<ffffffff810990d6>] call_cpuidle+0x36/0x60
>  [<ffffffff81099163>] cpuidle_idle_call+0x63/0x110
>  [<ffffffff8109930a>] cpu_idle_loop+0xfa/0x130
>  [<ffffffff8109934e>] cpu_startup_entry+0xe/0x10
>  [<ffffffff8103c443>] start_secondary+0x83/0x90
>
> Fixes: commit be4b499323bf ("IB/cm: Do not queue work to a device that's going away")
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>

Acked-by: Erez Shitrit <erezsh@mellanox.com>

> Cc: Erez Shitrit <erezsh@mellanox.com>
> Cc: stable <stable@vger.kernel.org>
> ---
>  drivers/infiniband/core/cm.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index 0a26dd6..d6d2b35 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -782,11 +782,11 @@ static void cm_enter_timewait(struct cm_id_private *cm_id_priv)
>         wait_time = cm_convert_to_ms(cm_id_priv->av.timeout);
>
>         /* Check if the device started its remove_one */
> -       spin_lock_irq(&cm.lock);
> +       spin_lock_irqsave(&cm.lock, flags);
>         if (!cm_dev->going_down)
>                 queue_delayed_work(cm.wq, &cm_id_priv->timewait_info->work.work,
>                                    msecs_to_jiffies(wait_time));
> -       spin_unlock_irq(&cm.lock);
> +       spin_unlock_irqrestore(&cm.lock, flags);
>
>         cm_id_priv->timewait_info = NULL;
>  }
> --
> 2.1.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Doug Ledford Jan. 19, 2016, 9:06 p.m. UTC | #2
On 01/06/2016 06:35 AM, Erez Shitrit wrote:
> On Fri, Jan 1, 2016 at 2:17 PM, Bart Van Assche
> <bart.vanassche@sandisk.com> wrote:
>> ib_send_cm_drep() calls cm_enter_timewait() while holding a spinlock
>> that can be locked from inside an interrupt handler. Hence do not
>> enable interrupts inside cm_enter_timewait() if called with interrupts
>> disabled.
>>
>> This patch fixes e.g. the following deadlock:
>>
>> =================================
>> [ INFO: inconsistent lock state ]
>> 4.4.0-rc7+ #1 Tainted: G            E
>> ---------------------------------
>> inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
>> swapper/8/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
>> (&(&cm_id_priv->lock)->rlock){?.+...}, at: [<ffffffffa036eec4>] cm_establish+0x
>> 74/0x1b0 [ib_cm]
>> {HARDIRQ-ON-W} state was registered at:
>>   [<ffffffff810a3c11>] mark_held_locks+0x71/0x90
>>   [<ffffffff810a3e87>] trace_hardirqs_on_caller+0xa7/0x1c0
>>   [<ffffffff810a3fad>] trace_hardirqs_on+0xd/0x10
>>   [<ffffffff8151c40b>] _raw_spin_unlock_irq+0x2b/0x40
>>   [<ffffffffa036ea8e>] cm_enter_timewait+0xae/0x100 [ib_cm]
>>   [<ffffffffa036ff76>] ib_send_cm_drep+0xb6/0x190 [ib_cm]
>>   [<ffffffffa052ed08>] srp_cm_handler+0x128/0x1a0 [ib_srp]
>>   [<ffffffffa0370340>] cm_process_work+0x20/0xf0 [ib_cm]
>>   [<ffffffffa0371335>] cm_dreq_handler+0x135/0x2c0 [ib_cm]
>>   [<ffffffffa03733c5>] cm_work_handler+0x75/0xd0 [ib_cm]
>>   [<ffffffff8107184d>] process_one_work+0x1bd/0x460
>>   [<ffffffff81073148>] worker_thread+0x118/0x420
>>   [<ffffffff81078454>] kthread+0xe4/0x100
>>   [<ffffffff8151cbbf>] ret_from_fork+0x3f/0x70
>> irq event stamp: 1672286
>> hardirqs last  enabled at (1672283): [<ffffffff81408ec0>] poll_idle+0x10/0x80
>> hardirqs last disabled at (1672284): [<ffffffff8151d304>] common_interrupt+0x84/0x89
>> softirqs last  enabled at (1672286): [<ffffffff8105b4dc>] _local_bh_enable+0x1c/0x50
>> softirqs last disabled at (1672285): [<ffffffff8105b697>] irq_enter+0x47/0x70
>>
>> other info that might help us debug this:
>>  Possible unsafe locking scenario:
>>
>>        CPU0
>>        ----
>>   lock(&(&cm_id_priv->lock)->rlock);
>>   <Interrupt>
>>     lock(&(&cm_id_priv->lock)->rlock);
>>
>>  *** DEADLOCK ***
>>
>> no locks held by swapper/8/0.
>>
>> stack backtrace:
>> CPU: 8 PID: 0 Comm: swapper/8 Tainted: G            E   4.4.0-rc7+ #1
>> Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 11/17/2014
>>  ffff88045af5e950 ffff88046e503a88 ffffffff81251c1b 0000000000000007
>>  0000000000000006 0000000000000003 ffff88045af5ddc0 ffff88046e503ad8
>>  ffffffff810a32f4 0000000000000000 0000000000000000 0000000000000001
>> Call Trace:
>>  <IRQ>  [<ffffffff81251c1b>] dump_stack+0x4f/0x74
>>  [<ffffffff810a32f4>] print_usage_bug+0x184/0x190
>>  [<ffffffff810a36e2>] mark_lock_irq+0xf2/0x290
>>  [<ffffffff810a3995>] mark_lock+0x115/0x1b0
>>  [<ffffffff810a3b8c>] mark_irqflags+0x15c/0x170
>>  [<ffffffff810a4fef>] __lock_acquire+0x1ef/0x560
>>  [<ffffffff810a53c2>] lock_acquire+0x62/0x80
>>  [<ffffffff8151bd33>] _raw_spin_lock_irqsave+0x43/0x60
>>  [<ffffffffa036eec4>] cm_establish+0x74/0x1b0 [ib_cm]
>>  [<ffffffffa036f031>] ib_cm_notify+0x31/0x100 [ib_cm]
>>  [<ffffffffa0637f24>] srpt_qp_event+0x54/0xd0 [ib_srpt]
>>  [<ffffffffa0196052>] mlx4_ib_qp_event+0x72/0xc0 [mlx4_ib]
>>  [<ffffffffa00775b9>] mlx4_qp_event+0x69/0xd0 [mlx4_core]
>>  [<ffffffffa006000e>] mlx4_eq_int+0x51e/0xd50 [mlx4_core]
>>  [<ffffffffa006084f>] mlx4_msi_x_interrupt+0xf/0x20 [mlx4_core]
>>  [<ffffffff810b67b0>] handle_irq_event_percpu+0x40/0x110
>>  [<ffffffff810b68bf>] handle_irq_event+0x3f/0x70
>>  [<ffffffff810ba7f9>] handle_edge_irq+0x79/0x120
>>  [<ffffffff81007f3d>] handle_irq+0x5d/0x130
>>  [<ffffffff810071fd>] do_IRQ+0x6d/0x130
>>  [<ffffffff8151d309>] common_interrupt+0x89/0x89
>>  <EOI>  [<ffffffff8140895f>] cpuidle_enter_state+0xcf/0x200
>>  [<ffffffff81408aa2>] cpuidle_enter+0x12/0x20
>>  [<ffffffff810990d6>] call_cpuidle+0x36/0x60
>>  [<ffffffff81099163>] cpuidle_idle_call+0x63/0x110
>>  [<ffffffff8109930a>] cpu_idle_loop+0xfa/0x130
>>  [<ffffffff8109934e>] cpu_startup_entry+0xe/0x10
>>  [<ffffffff8103c443>] start_secondary+0x83/0x90
>>
>> Fixes: commit be4b499323bf ("IB/cm: Do not queue work to a device that's going away")
>> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> 
> Acked-by: Erez Shitrit <erezsh@mellanox.com>

Thanks, applied.

> 
>> Cc: Erez Shitrit <erezsh@mellanox.com>
>> Cc: stable <stable@vger.kernel.org>
>> ---
>>  drivers/infiniband/core/cm.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
>> index 0a26dd6..d6d2b35 100644
>> --- a/drivers/infiniband/core/cm.c
>> +++ b/drivers/infiniband/core/cm.c
>> @@ -782,11 +782,11 @@ static void cm_enter_timewait(struct cm_id_private *cm_id_priv)
>>         wait_time = cm_convert_to_ms(cm_id_priv->av.timeout);
>>
>>         /* Check if the device started its remove_one */
>> -       spin_lock_irq(&cm.lock);
>> +       spin_lock_irqsave(&cm.lock, flags);
>>         if (!cm_dev->going_down)
>>                 queue_delayed_work(cm.wq, &cm_id_priv->timewait_info->work.work,
>>                                    msecs_to_jiffies(wait_time));
>> -       spin_unlock_irq(&cm.lock);
>> +       spin_unlock_irqrestore(&cm.lock, flags);
>>
>>         cm_id_priv->timewait_info = NULL;
>>  }
>> --
>> 2.1.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

=================================
[ INFO: inconsistent lock state ]
4.4.0-rc7+ #1 Tainted: G            E
---------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/8/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
(&(&cm_id_priv->lock)->rlock){?.+...}, at: [<ffffffffa036eec4>] cm_establish+0x
74/0x1b0 [ib_cm]
{HARDIRQ-ON-W} state was registered at:
  [<ffffffff810a3c11>] mark_held_locks+0x71/0x90
  [<ffffffff810a3e87>] trace_hardirqs_on_caller+0xa7/0x1c0
  [<ffffffff810a3fad>] trace_hardirqs_on+0xd/0x10
  [<ffffffff8151c40b>] _raw_spin_unlock_irq+0x2b/0x40
  [<ffffffffa036ea8e>] cm_enter_timewait+0xae/0x100 [ib_cm]
  [<ffffffffa036ff76>] ib_send_cm_drep+0xb6/0x190 [ib_cm]
  [<ffffffffa052ed08>] srp_cm_handler+0x128/0x1a0 [ib_srp]
  [<ffffffffa0370340>] cm_process_work+0x20/0xf0 [ib_cm]
  [<ffffffffa0371335>] cm_dreq_handler+0x135/0x2c0 [ib_cm]
  [<ffffffffa03733c5>] cm_work_handler+0x75/0xd0 [ib_cm]
  [<ffffffff8107184d>] process_one_work+0x1bd/0x460
  [<ffffffff81073148>] worker_thread+0x118/0x420
  [<ffffffff81078454>] kthread+0xe4/0x100
  [<ffffffff8151cbbf>] ret_from_fork+0x3f/0x70
irq event stamp: 1672286
hardirqs last  enabled at (1672283): [<ffffffff81408ec0>] poll_idle+0x10/0x80
hardirqs last disabled at (1672284): [<ffffffff8151d304>] common_interrupt+0x84/0x89
softirqs last  enabled at (1672286): [<ffffffff8105b4dc>] _local_bh_enable+0x1c/0x50
softirqs last disabled at (1672285): [<ffffffff8105b697>] irq_enter+0x47/0x70

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&cm_id_priv->lock)->rlock);
  <Interrupt>
    lock(&(&cm_id_priv->lock)->rlock);

 *** DEADLOCK ***

no locks held by swapper/8/0.

stack backtrace:
CPU: 8 PID: 0 Comm: swapper/8 Tainted: G            E   4.4.0-rc7+ #1
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 11/17/2014
 ffff88045af5e950 ffff88046e503a88 ffffffff81251c1b 0000000000000007
 0000000000000006 0000000000000003 ffff88045af5ddc0 ffff88046e503ad8
 ffffffff810a32f4 0000000000000000 0000000000000000 0000000000000001
Call Trace:
 <IRQ>  [<ffffffff81251c1b>] dump_stack+0x4f/0x74
 [<ffffffff810a32f4>] print_usage_bug+0x184/0x190
 [<ffffffff810a36e2>] mark_lock_irq+0xf2/0x290
 [<ffffffff810a3995>] mark_lock+0x115/0x1b0
 [<ffffffff810a3b8c>] mark_irqflags+0x15c/0x170
 [<ffffffff810a4fef>] __lock_acquire+0x1ef/0x560
 [<ffffffff810a53c2>] lock_acquire+0x62/0x80
 [<ffffffff8151bd33>] _raw_spin_lock_irqsave+0x43/0x60
 [<ffffffffa036eec4>] cm_establish+0x74/0x1b0 [ib_cm]
 [<ffffffffa036f031>] ib_cm_notify+0x31/0x100 [ib_cm]
 [<ffffffffa0637f24>] srpt_qp_event+0x54/0xd0 [ib_srpt]
 [<ffffffffa0196052>] mlx4_ib_qp_event+0x72/0xc0 [mlx4_ib]
 [<ffffffffa00775b9>] mlx4_qp_event+0x69/0xd0 [mlx4_core]
 [<ffffffffa006000e>] mlx4_eq_int+0x51e/0xd50 [mlx4_core]
 [<ffffffffa006084f>] mlx4_msi_x_interrupt+0xf/0x20 [mlx4_core]
 [<ffffffff810b67b0>] handle_irq_event_percpu+0x40/0x110
 [<ffffffff810b68bf>] handle_irq_event+0x3f/0x70
 [<ffffffff810ba7f9>] handle_edge_irq+0x79/0x120
 [<ffffffff81007f3d>] handle_irq+0x5d/0x130
 [<ffffffff810071fd>] do_IRQ+0x6d/0x130
 [<ffffffff8151d309>] common_interrupt+0x89/0x89
 <EOI>  [<ffffffff8140895f>] cpuidle_enter_state+0xcf/0x200
 [<ffffffff81408aa2>] cpuidle_enter+0x12/0x20
 [<ffffffff810990d6>] call_cpuidle+0x36/0x60
 [<ffffffff81099163>] cpuidle_idle_call+0x63/0x110
 [<ffffffff8109930a>] cpu_idle_loop+0xfa/0x130
 [<ffffffff8109934e>] cpu_startup_entry+0xe/0x10
 [<ffffffff8103c443>] start_secondary+0x83/0x90

Fixes: commit be4b499323bf ("IB/cm: Do not queue work to a device that's going away")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Erez Shitrit <erezsh@mellanox.com>
Cc: stable <stable@vger.kernel.org>
---
 drivers/infiniband/core/cm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 0a26dd6..d6d2b35 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -782,11 +782,11 @@  static void cm_enter_timewait(struct cm_id_private *cm_id_priv)
 	wait_time = cm_convert_to_ms(cm_id_priv->av.timeout);
 
 	/* Check if the device started its remove_one */
-	spin_lock_irq(&cm.lock);
+	spin_lock_irqsave(&cm.lock, flags);
 	if (!cm_dev->going_down)
 		queue_delayed_work(cm.wq, &cm_id_priv->timewait_info->work.work,
 				   msecs_to_jiffies(wait_time));
-	spin_unlock_irq(&cm.lock);
+	spin_unlock_irqrestore(&cm.lock, flags);
 
 	cm_id_priv->timewait_info = NULL;
 }