Message ID | 20220417131414.98144-1-duoming@zju.edu.cn (mailing list archive) |
---|---|
State | Superseded |
Series | [V5] drivers: infiniband: hw: Fix deadlock in irdma_cleanup_cm_core() |
> Subject: [PATCH V5] drivers: infiniband: hw: Fix deadlock in
> irdma_cleanup_cm_core()
>
> There is a deadlock in irdma_cleanup_cm_core(), which is shown
> below:
>
>    (Thread 1)              |      (Thread 2)
>                            | irdma_schedule_cm_timer()
> irdma_cleanup_cm_core()    |  add_timer()
>  spin_lock_irqsave() //(1) |  (wait a time)
>  ...                       | irdma_cm_timer_tick()
>  del_timer_sync()          |  spin_lock_irqsave() //(2)
>  (wait timer to stop)      |  ...
>
> We hold cm_core->ht_lock in position (1) of thread 1 and use del_timer_sync()
> to wait timer to stop, but timer handler also need cm_core->ht_lock in position (2)
> of thread 2.
> As a result, irdma_cleanup_cm_core() will block forever.
>
> This patch removes the check of timer_pending() in irdma_cleanup_cm_core(),
> because the del_timer_sync() function will just return directly if there isn't a
> pending timer. As a result, the lock is redundant, because there is no resource it
> could protect.
>
> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> ---
> Changes in V5:
>   - Remove mod_timer() in irdma_schedule_cm_timer and irdma_cm_timer_tick.
>
>  drivers/infiniband/hw/irdma/cm.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c
> index dedb3b7edd8..4b6b1065f85 100644
> --- a/drivers/infiniband/hw/irdma/cm.c
> +++ b/drivers/infiniband/hw/irdma/cm.c
> @@ -3251,10 +3251,7 @@ void irdma_cleanup_cm_core(struct
> irdma_cm_core *cm_core)
>  	if (!cm_core)
>  		return;
>
> -	spin_lock_irqsave(&cm_core->ht_lock, flags);
> -	if (timer_pending(&cm_core->tcp_timer))
> -		del_timer_sync(&cm_core->tcp_timer);
> -	spin_unlock_irqrestore(&cm_core->ht_lock, flags);
> +	del_timer_sync(&cm_core->tcp_timer);
>
>  	destroy_workqueue(cm_core->event_wq);
>  	cm_core->dev->ws_reset(&cm_core->iwdev->vsi);
> --

I am not sure the deadlock is possible practically since all CM nodes should be
culled by the time we get to irdma_cleanup_cm_core.

However, timer_pending check and locks are redundant and should be removed.

The subject line for patches to our driver are typically prefixed with "RDMA/irdma: "

Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Hello,

On Mon, 18 Apr 2022 14:57:06 +0000 Saleem, Shiraz wrote:

> > There is a deadlock in irdma_cleanup_cm_core(), which is shown
> > below:
> >
> >    (Thread 1)              |      (Thread 2)
> >                            | irdma_schedule_cm_timer()
> > irdma_cleanup_cm_core()    |  add_timer()
> >  spin_lock_irqsave() //(1) |  (wait a time)
> >  ...                       | irdma_cm_timer_tick()
> >  del_timer_sync()          |  spin_lock_irqsave() //(2)
> >  (wait timer to stop)      |  ...
> >
> > We hold cm_core->ht_lock in position (1) of thread 1 and use del_timer_sync()
> > to wait timer to stop, but timer handler also need cm_core->ht_lock in position (2)
> > of thread 2.
> > As a result, irdma_cleanup_cm_core() will block forever.
> >
> > This patch removes the check of timer_pending() in irdma_cleanup_cm_core(),
> > because the del_timer_sync() function will just return directly if there isn't a
> > pending timer. As a result, the lock is redundant, because there is no resource it
> > could protect.
> >
> > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> > ---
> > Changes in V5:
> >   - Remove mod_timer() in irdma_schedule_cm_timer and irdma_cm_timer_tick.
> >
> >  drivers/infiniband/hw/irdma/cm.c | 5 +----
> >  1 file changed, 1 insertion(+), 4 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c
> > index dedb3b7edd8..4b6b1065f85 100644
> > --- a/drivers/infiniband/hw/irdma/cm.c
> > +++ b/drivers/infiniband/hw/irdma/cm.c
> > @@ -3251,10 +3251,7 @@ void irdma_cleanup_cm_core(struct
> > irdma_cm_core *cm_core)
> >  	if (!cm_core)
> >  		return;
> >
> > -	spin_lock_irqsave(&cm_core->ht_lock, flags);
> > -	if (timer_pending(&cm_core->tcp_timer))
> > -		del_timer_sync(&cm_core->tcp_timer);
> > -	spin_unlock_irqrestore(&cm_core->ht_lock, flags);
> > +	del_timer_sync(&cm_core->tcp_timer);
> >
> >  	destroy_workqueue(cm_core->event_wq);
> >  	cm_core->dev->ws_reset(&cm_core->iwdev->vsi);
> > --
>
> I am not sure the deadlock is possible practically since all CM nodes should be
> culled by the time we get to irdma_cleanup_cm_core.

I think the deadlock is possible, because the timer is a delay mechanism that
could execute at any time, although all CM nodes are culled.

> However, timer_pending check and locks are redundant and should be removed.
>
> The subject line for patches to our driver are typically prefixed with "RDMA/irdma: "

I sent "[PATCH V6] RDMA/irdma: Fix deadlock in irdma_cleanup_cm_core()" just now.

Best regards,
Duoming Zhou
diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c
index dedb3b7edd8..4b6b1065f85 100644
--- a/drivers/infiniband/hw/irdma/cm.c
+++ b/drivers/infiniband/hw/irdma/cm.c
@@ -3251,10 +3251,7 @@ void irdma_cleanup_cm_core(struct irdma_cm_core *cm_core)
 	if (!cm_core)
 		return;
 
-	spin_lock_irqsave(&cm_core->ht_lock, flags);
-	if (timer_pending(&cm_core->tcp_timer))
-		del_timer_sync(&cm_core->tcp_timer);
-	spin_unlock_irqrestore(&cm_core->ht_lock, flags);
+	del_timer_sync(&cm_core->tcp_timer);
 
 	destroy_workqueue(cm_core->event_wq);
 	cm_core->dev->ws_reset(&cm_core->iwdev->vsi);
There is a deadlock in irdma_cleanup_cm_core(), which is shown
below:

   (Thread 1)              |      (Thread 2)
                           | irdma_schedule_cm_timer()
irdma_cleanup_cm_core()    |  add_timer()
 spin_lock_irqsave() //(1) |  (wait a time)
 ...                       | irdma_cm_timer_tick()
 del_timer_sync()          |  spin_lock_irqsave() //(2)
 (wait timer to stop)      |  ...

We hold cm_core->ht_lock in position (1) of thread 1 and use
del_timer_sync() to wait timer to stop, but timer handler also need
cm_core->ht_lock in position (2) of thread 2.
As a result, irdma_cleanup_cm_core() will block forever.

This patch removes the check of timer_pending() in
irdma_cleanup_cm_core(), because the del_timer_sync() function will
just return directly if there isn't a pending timer. As a result, the
lock is redundant, because there is no resource it could protect.

Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
---
Changes in V5:
  - Remove mod_timer() in irdma_schedule_cm_timer and irdma_cm_timer_tick.

 drivers/infiniband/hw/irdma/cm.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)
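
The locking pattern described in the commit message can be sketched in userspace. This is only an analogy, not driver code: a pthread stands in for the timer handler, pthread_join() stands in for del_timer_sync(), and a mutex stands in for cm_core->ht_lock. Every name below (demo_core, demo_timer_tick, demo_cleanup, demo_run) is hypothetical and does not exist in the irdma driver.

```c
#include <pthread.h>

/* Hypothetical stand-in for struct irdma_cm_core (userspace analogy only). */
struct demo_core {
	pthread_mutex_t ht_lock;	/* plays the role of cm_core->ht_lock */
	pthread_t timer_thread;		/* plays the role of cm_core->tcp_timer */
	int ticks;			/* state the handler updates under the lock */
};

/* Plays the role of irdma_cm_timer_tick(): it needs ht_lock to do its work. */
static void *demo_timer_tick(void *arg)
{
	struct demo_core *core = arg;

	pthread_mutex_lock(&core->ht_lock);
	core->ticks++;
	pthread_mutex_unlock(&core->ht_lock);
	return NULL;
}

/*
 * Plays the role of the patched irdma_cleanup_cm_core(): wait for the
 * handler to finish WITHOUT holding ht_lock. If ht_lock were taken before
 * pthread_join(), and the handler had not acquired it yet, the handler
 * would block on the lock while this function blocks on the handler.
 */
static void demo_cleanup(struct demo_core *core)
{
	pthread_join(core->timer_thread, NULL);
}

/* Runs one handler, cleans up, and returns how many times the handler ran. */
int demo_run(void)
{
	struct demo_core core = { .ticks = 0 };
	int ticks;

	if (pthread_mutex_init(&core.ht_lock, NULL) != 0)
		return -1;
	if (pthread_create(&core.timer_thread, NULL, demo_timer_tick, &core) != 0) {
		pthread_mutex_destroy(&core.ht_lock);
		return -1;
	}

	demo_cleanup(&core);

	/* The handler has finished, so reading ticks without the lock is safe. */
	ticks = core.ticks;
	pthread_mutex_destroy(&core.ht_lock);
	return ticks;
}
```

Calling demo_run() from a small main() (built with -pthread where the toolchain requires it) should return 1. The analogy is loose: a kernel timer handler runs in softirq context rather than in a thread, so the sketch only shows the ordering constraint between the lock and the wait, not the real execution model.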