
IB/ipoib: replace local_irq_disable() with proper locking

Message ID 20180504144555.5595-1-bigeasy@linutronix.de (mailing list archive)
State Changes Requested
Delegated to: Doug Ledford

Commit Message

Sebastian Andrzej Siewior May 4, 2018, 2:45 p.m. UTC
Commit 78bfe0b5b67f ("IPoIB: Take dev->xmit_lock around mc_list accesses")
introduced xmit_lock lock in ipoib_mcast_restart_task() and commit
932ff279a43a ("[NET]: Add netif_tx_lock") preserved the locking order while
dev->xmit_lock has been replaced with a helper. The netif_tx_lock should
not be acquired with disabled interrupts because it is meant to be a BH
disabling lock.

The priv->lock is always acquired with interrupts disabled. The only
place where netif_addr_lock() and priv->lock nest is
ipoib_mcast_restart_task(). By reversing the lock order and taking
netif_addr lock with bottom halves disabled it is possible to get rid of
the local_irq_save() completely.

This requires taking priv->lock with spin_lock_irq() inside the netif_addr
locked section. It's safe to do so because the caller is either a worker
function or __ipoib_ib_dev_flush() which are both calling with interrupts
enabled.

Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)
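
The change in nesting described in the commit message can be modelled in user space. What follows is only a sketch, not kernel code: `struct ctx` and every `model_*` function are hypothetical stand-ins for the CPU interrupt state, the BH-disable depth, and the two locks. It assumes, as the commit message states, that priv->lock is always taken with interrupts disabled and that the callers run with interrupts enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the execution context; none of this is kernel API. */
struct ctx {
    bool irqs_off;                 /* hard interrupts disabled */
    int  bh_off;                   /* bottom-half disable depth */
    bool addr_locked;              /* models the netif_addr lock */
    bool priv_locked;              /* models priv->lock */
    bool addr_taken_with_irqs_off; /* IRQ state when the addr lock was taken */
};

/* Models local_irq_save()/local_irq_restore(). */
static void model_irq_save(struct ctx *c, bool *flags) { *flags = c->irqs_off; c->irqs_off = true; }
static void model_irq_restore(struct ctx *c, bool flags) { c->irqs_off = flags; }

/* Models netif_addr_lock()/netif_addr_unlock(). */
static void model_addr_lock(struct ctx *c)
{
    assert(!c->addr_locked);
    c->addr_locked = true;
    c->addr_taken_with_irqs_off = c->irqs_off;
}
static void model_addr_unlock(struct ctx *c) { assert(c->addr_locked); c->addr_locked = false; }

/* Models the _bh variants: disable bottom halves, then take the lock. */
static void model_addr_lock_bh(struct ctx *c) { c->bh_off++; model_addr_lock(c); }
static void model_addr_unlock_bh(struct ctx *c) { model_addr_unlock(c); c->bh_off--; }

/* Models spin_lock(&priv->lock): only valid with interrupts already off. */
static void model_priv_lock(struct ctx *c)
{
    assert(c->irqs_off);
    assert(!c->priv_locked);
    c->priv_locked = true;
}
static void model_priv_unlock(struct ctx *c) { assert(c->priv_locked); c->priv_locked = false; }

/* Models spin_lock_irq(): the caller must arrive with interrupts enabled,
 * because the matching unlock re-enables them unconditionally. */
static void model_priv_lock_irq(struct ctx *c)
{
    assert(!c->irqs_off);
    c->irqs_off = true;
    model_priv_lock(c);
}
static void model_priv_unlock_irq(struct ctx *c) { model_priv_unlock(c); c->irqs_off = false; }

/* Nesting before the patch: IRQs off, then addr lock, then priv->lock. */
bool old_nesting(struct ctx *c)
{
    bool flags;

    model_irq_save(c, &flags);
    model_addr_lock(c);
    model_priv_lock(c);
    /* ... multicast list walk would happen here ... */
    model_priv_unlock(c);
    model_addr_unlock(c);
    model_irq_restore(c, flags);
    return c->addr_taken_with_irqs_off;
}

/* Nesting after the patch: addr lock with BH off, then priv->lock with IRQs off. */
bool new_nesting(struct ctx *c)
{
    model_addr_lock_bh(c);
    model_priv_lock_irq(c);
    /* ... multicast list walk would happen here ... */
    model_priv_unlock_irq(c);
    model_addr_unlock_bh(c);
    return c->addr_taken_with_irqs_off;
}
```

Starting from a zero-initialised `struct ctx`, `old_nesting()` reports that the address lock was taken with interrupts off, while `new_nesting()` reports that it was not. Both leave the context as they found it.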

Comments

Jason Gunthorpe May 15, 2018, 11 p.m. UTC | #1
On Fri, May 04, 2018 at 04:45:54PM +0200, Sebastian Andrzej Siewior wrote:
> Commit 78bfe0b5b67f ("IPoIB: Take dev->xmit_lock around mc_list accesses")
> introduced xmit_lock lock in ipoib_mcast_restart_task() and commit
> 932ff279a43a ("[NET]: Add netif_tx_lock") preserved the locking order while
> dev->xmit_lock has been replaced with a helper. The netif_tx_lock should
> not be acquired with disabled interrupts because it is meant to be a BH
> disabling lock.

This comment is talking about the tx_lock, which was true long ago,
but these days we are taking the netif_addr_lock, which is
different.. So at least the last sentence needs to be reworded..

> The priv->lock is always acquired with interrupts disabled. The only
> place where netif_addr_lock() and priv->lock nest is
> ipoib_mcast_restart_task(). By reversing the lock order and taking
> netif_addr lock with bottom halves disabled it is possible to get rid of
> the local_irq_save() completely.

I'm also having trouble following this, where did the locking ordering
get reversed in this patch? I see the ordering is the same, but the
irq disable has moved.

> This requires taking priv->lock with spin_lock_irq() inside the netif_addr
> locked section. It's safe to do so because the caller is either a worker
> function or __ipoib_ib_dev_flush() which are both calling with interrupts
> enabled.

Otherwise the patch seems fine to me, I also can think of no reason
why we need to have the IRQ disabled prior to getting the addr_lock,
it looks like a holdover from 10 years ago, maybe it made sense when
this was the xmit lock..
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sebastian Andrzej Siewior May 16, 2018, 1:12 p.m. UTC | #2
On 2018-05-15 17:00:25 [-0600], Jason Gunthorpe wrote:
> On Fri, May 04, 2018 at 04:45:54PM +0200, Sebastian Andrzej Siewior wrote:
> > Commit 78bfe0b5b67f ("IPoIB: Take dev->xmit_lock around mc_list accesses")
> > introduced xmit_lock lock in ipoib_mcast_restart_task() and commit
> > 932ff279a43a ("[NET]: Add netif_tx_lock") preserved the locking order while
> > dev->xmit_lock has been replaced with a helper. The netif_tx_lock should
> > not be acquired with disabled interrupts because it is meant to be a BH
> > disabling lock.
> 
> This comment is talking about the tx_lock, which was true long ago,
> but these days we are taking the netif_addr_lock, which is
> different.. So at least the last sentence needs to be reworded..

I referred to what happened during those two commits (back then, ages
ago). Are you sure this still needs rewording?

> > The priv->lock is always acquired with interrupts disabled. The only
> > place where netif_addr_lock() and priv->lock nest is
> > ipoib_mcast_restart_task(). By reversing the lock order and taking
> > netif_addr lock with bottom halves disabled it is possible to get rid of
> > the local_irq_save() completely.
> 
> I'm also having trouble following this, where did the locking ordering
> get reversed in this patch? I see the ordering is the same, but the
> irq disable has moved.

I meant the part where netif_addr_lock() is no longer acquired with
disabled interrupts. Now that I read it myself again I understand the
confusion. I will try to reword it.

> > This requires taking priv->lock with spin_lock_irq() inside the netif_addr
> > locked section. It's safe to do so because the caller is either a worker
> > function or __ipoib_ib_dev_flush() which are both calling with interrupts
> > enabled.
> 
> Otherwise the patch seems fine to me, I also can think of no reason
> why we need to have the IRQ disabled prior to getting the addr_lock,
> it looks like a holdover from 10 years ago, maybe it made sense when
> this was the xmit lock..

good.

Sebastian
Jason Gunthorpe May 16, 2018, 5:05 p.m. UTC | #3
On Wed, May 16, 2018 at 03:12:28PM +0200, Sebastian Andrzej Siewior wrote:
> On 2018-05-15 17:00:25 [-0600], Jason Gunthorpe wrote:
> > On Fri, May 04, 2018 at 04:45:54PM +0200, Sebastian Andrzej Siewior wrote:
> > > Commit 78bfe0b5b67f ("IPoIB: Take dev->xmit_lock around mc_list accesses")
> > > introduced xmit_lock lock in ipoib_mcast_restart_task() and commit
> > > 932ff279a43a ("[NET]: Add netif_tx_lock") preserved the locking order while
> > > dev->xmit_lock has been replaced with a helper. The netif_tx_lock should
> > > not be acquired with disabled interrupts because it is meant to be a BH
> > > disabling lock.
> > 
> > This comment is talking about the tx_lock, which was true long ago,
> > but these days we are taking the netif_addr_lock, which is
> > different.. So at least the last sentence needs to be reworded..
> 
> I referred to what happened during those two commits (back then, ages
> ago). Are you sure this still needs rewording?

Well, it just doesn't make any sense in this context because the code
doesn't take the netif_tx_lock and netif_addr_lock is not 'meant to be
a BH disabling lock'

> > > The priv->lock is always acquired with interrupts disabled. The only
> > > place where netif_addr_lock() and priv->lock nest is
> > > ipoib_mcast_restart_task(). By reversing the lock order and taking
> > > netif_addr lock with bottom halves disabled it is possible to get rid of
> > > the local_irq_save() completely.
> > 
> > I'm also having trouble following this, where did the locking ordering
> > get reversed in this patch? I see the ordering is the same, but the
> > irq disable has moved.
> 
> I meant the part where netif_addr_lock() is no longer acquired with
> disabled interrupts. Now that I read it myself again I understand the
> confusion. I will try to reword it.
> 
> > > This requires taking priv->lock with spin_lock_irq() inside the netif_addr
> > > locked section. It's safe to do so because the caller is either a worker
> > > function or __ipoib_ib_dev_flush() which are both calling with interrupts
> > > enabled.
> > 
> > Otherwise the patch seems fine to me, I also can think of no reason
> > why we need to have the IRQ disabled prior to getting the addr_lock,
> > it looks like a holdover from 10 years ago, maybe it made sense when
> > this was the xmit lock..
> 
> good.

Send a v2 with a tighter commit message please

Jason
Erez Shitrit May 17, 2018, 6:57 a.m. UTC | #4
On Fri, May 4, 2018 at 5:45 PM, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> Commit 78bfe0b5b67f ("IPoIB: Take dev->xmit_lock around mc_list accesses")
> introduced xmit_lock lock in ipoib_mcast_restart_task() and commit
> 932ff279a43a ("[NET]: Add netif_tx_lock") preserved the locking order while
> dev->xmit_lock has been replaced with a helper. The netif_tx_lock should
> not be acquired with disabled interrupts because it is meant to be a BH
> disabling lock.
>
> The priv->lock is always acquired with interrupts disabled. The only
> place where netif_addr_lock() and priv->lock nest is
> ipoib_mcast_restart_task(). By reversing the lock order and taking
> netif_addr lock with bottom halves disabled it is possible to get rid of
> the local_irq_save() completely.
>
> This requires taking priv->lock with spin_lock_irq() inside the netif_addr
> locked section. It's safe to do so because the caller is either a worker
> function or __ipoib_ib_dev_flush() which are both calling with interrupts
> enabled.
>
> Cc: Doug Ledford <dledford@redhat.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Reviewed-by: Erez Shitrit <erezsh@mellanox.com>

> ---
>  drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 15 ++++++---------
>  1 file changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> index 9b3f47ae2016..6709328d90f8 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> @@ -886,7 +886,6 @@ void ipoib_mcast_restart_task(struct work_struct *work)
>         struct netdev_hw_addr *ha;
>         struct ipoib_mcast *mcast, *tmcast;
>         LIST_HEAD(remove_list);
> -       unsigned long flags;
>         struct ib_sa_mcmember_rec rec;
>
>         if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
> @@ -898,9 +897,8 @@ void ipoib_mcast_restart_task(struct work_struct *work)
>
>         ipoib_dbg_mcast(priv, "restarting multicast task\n");
>
> -       local_irq_save(flags);
> -       netif_addr_lock(dev);
> -       spin_lock(&priv->lock);
> +       netif_addr_lock_bh(dev);
> +       spin_lock_irq(&priv->lock);
>
>         /*
>          * Unfortunately, the networking core only gives us a list of all of
> @@ -978,9 +976,8 @@ void ipoib_mcast_restart_task(struct work_struct *work)
>                 }
>         }
>
> -       spin_unlock(&priv->lock);
> -       netif_addr_unlock(dev);
> -       local_irq_restore(flags);
> +       spin_unlock_irq(&priv->lock);
> +       netif_addr_unlock_bh(dev);
>
>         ipoib_mcast_remove_list(&remove_list);
>
> @@ -988,9 +985,9 @@ void ipoib_mcast_restart_task(struct work_struct *work)
>          * Double check that we are still up
>          */
>         if (test_bit(IPOIB_FLAG_OPER_UP, &priv->flags)) {
> -               spin_lock_irqsave(&priv->lock, flags);
> +               spin_lock_irq(&priv->lock);
>                 __ipoib_mcast_schedule_join_thread(priv, NULL, 0);
> -               spin_unlock_irqrestore(&priv->lock, flags);
> +               spin_unlock_irq(&priv->lock);
>         }
>  }
>
> --
> 2.17.0
>

Patch

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index 9b3f47ae2016..6709328d90f8 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -886,7 +886,6 @@  void ipoib_mcast_restart_task(struct work_struct *work)
 	struct netdev_hw_addr *ha;
 	struct ipoib_mcast *mcast, *tmcast;
 	LIST_HEAD(remove_list);
-	unsigned long flags;
 	struct ib_sa_mcmember_rec rec;
 
 	if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
@@ -898,9 +897,8 @@  void ipoib_mcast_restart_task(struct work_struct *work)
 
 	ipoib_dbg_mcast(priv, "restarting multicast task\n");
 
-	local_irq_save(flags);
-	netif_addr_lock(dev);
-	spin_lock(&priv->lock);
+	netif_addr_lock_bh(dev);
+	spin_lock_irq(&priv->lock);
 
 	/*
 	 * Unfortunately, the networking core only gives us a list of all of
@@ -978,9 +976,8 @@  void ipoib_mcast_restart_task(struct work_struct *work)
 		}
 	}
 
-	spin_unlock(&priv->lock);
-	netif_addr_unlock(dev);
-	local_irq_restore(flags);
+	spin_unlock_irq(&priv->lock);
+	netif_addr_unlock_bh(dev);
 
 	ipoib_mcast_remove_list(&remove_list);
 
@@ -988,9 +985,9 @@  void ipoib_mcast_restart_task(struct work_struct *work)
 	 * Double check that we are still up
 	 */
 	if (test_bit(IPOIB_FLAG_OPER_UP, &priv->flags)) {
-		spin_lock_irqsave(&priv->lock, flags);
+		spin_lock_irq(&priv->lock);
 		__ipoib_mcast_schedule_join_thread(priv, NULL, 0);
-		spin_unlock_irqrestore(&priv->lock, flags);
+		spin_unlock_irq(&priv->lock);
 	}
 }
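
The last hunk replaces spin_lock_irqsave() with spin_lock_irq(). A minimal user-space sketch of why that substitution depends on the caller running with interrupts enabled is given below. It is a model under stated assumptions, not kernel code: `irqs_enabled` and all `model_*` and `state_after_*` names are hypothetical, and the lock itself is left out because only the interrupt state matters here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU's interrupt-enable state. */
static bool irqs_enabled = true;

/* Models spin_lock_irqsave(): remember the previous state, then disable. */
static bool model_lock_irqsave(void)
{
    bool was_enabled = irqs_enabled;

    irqs_enabled = false;
    return was_enabled;
}

/* Models spin_unlock_irqrestore(): put back whatever was saved. */
static void model_unlock_irqrestore(bool was_enabled)
{
    irqs_enabled = was_enabled;
}

/* Models spin_lock_irq(): disable unconditionally, nothing is saved. */
static void model_lock_irq(void)
{
    irqs_enabled = false;
}

/* Models spin_unlock_irq(): enable unconditionally. */
static void model_unlock_irq(void)
{
    assert(!irqs_enabled); /* must pair with model_lock_irq() */
    irqs_enabled = true;
}

/* Interrupt state after an irqsave/irqrestore pair entered in state `initial`. */
bool state_after_irqsave_pair(bool initial)
{
    irqs_enabled = initial;
    bool flags = model_lock_irqsave();
    model_unlock_irqrestore(flags);
    return irqs_enabled;
}

/* Interrupt state after an irq/irq pair entered in state `initial`. */
bool state_after_irq_pair(bool initial)
{
    irqs_enabled = initial;
    model_lock_irq();
    model_unlock_irq();
    return irqs_enabled;
}
```

In this model the irqsave pair always returns the state it was entered with, whereas the irq pair always ends with interrupts enabled. The two only agree when the caller starts with interrupts enabled, which is the condition the commit message relies on for the worker function and __ipoib_ib_dev_flush().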