diff mbox

[09/22] IB/ipoib: fix IPOIB_MCAST_RUN flag usage

Message ID 73f38862d167a9849482464519c04b7c1f0a8b7c.1423703861.git.dledford@redhat.com (mailing list archive)
State Rejected
Headers show

Commit Message

Doug Ledford Feb. 12, 2015, 1:43 a.m. UTC
The usage of IPOIB_MCAST_RUN as a flag is inconsistent.  In some places
it is used to mean "our device is administratively allowed to send
multicast joins/leaves/packets" and in other places it means "our
multicast join task thread is currently running and will process your
request if you put it on the queue".  However, this latter meaning is in
fact flawed as there is a race condition between the join task testing
the mcast list and finding it empty of remaining work, dropping the
mcast mutex and also the priv->lock spinlock, and clearing the
IPOIB_MCAST_RUN flag.  Further, there are numerous locations that use
the flag in the former fashion, and when all tasks complete and the task
thread clears the RUN flag, all of those other locations will fail to
ever again queue any work.  This results in the interface coming up fine
initially, but having problems adding new multicast groups after the
first round of groups have all been added and the RUN flag is cleared by
the join task thread when it thinks it is done.  To resolve this issue,
convert all locations in the code to treat the RUN flag as an indicator
that the multicast portion of this interface is in fact administratively
up and joins/leaves/sends can be performed.  There is no harm (other
than a slight performance penalty) to never clearing this flag and using
it in this fashion as it simply means that a few places that used to
micro-optimize how often this task was queued on a work queue will now
queue the task a few extra times.  We can address that suboptimal
behavior in future patches.

Signed-off-by: Doug Ledford <dledford@redhat.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

Comments

Erez Shitrit Feb. 13, 2015, 8:50 a.m. UTC | #1
On 2/12/2015 3:43 AM, Doug Ledford wrote:
> The usage of IPOIB_MCAST_RUN as a flag is inconsistent.  In some places
> it is used to mean "our device is administratively allowed to send
> multicast joins/leaves/packets" and in other places it means "our
> multicast join task thread is currently running and will process your
> request if you put it on the queue".  However, this latter meaning is in
> fact flawed as there is a race condition between the join task testing
> the mcast list and finding it empty of remaining work, dropping the
> mcast mutex and also the priv->lock spinlock, and clearing the
> IPOIB_MCAST_RUN flag.  Further, there are numerous locations that use
> the flag in the former fashion, and when all tasks complete and the task
> thread clears the RUN flag, all of those other locations will fail to
> ever again queue any work.  This results in the interface coming up fine
> initially, but having problems adding new multicast groups after the
> first round of groups have all been added and the RUN flag is cleared by
> the join task thread when it thinks it is done.  To resolve this issue,
> convert all locations in the code to treat the RUN flag as an indicator
> that the multicast portion of this interface is in fact administratively
> up and joins/leaves/sends can be performed.  There is no harm (other
> than a slight performance penalty) to never clearing this flag and using
> it in this fashion as it simply means that a few places that used to
> micro-optimize how often this task was queued on a work queue will now
> queue the task a few extra times.  We can address that suboptimal
> behavior in future patches.
>
> Signed-off-by: Doug Ledford <dledford@redhat.com>
> ---
>   drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 11 +++++------
>   1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> index bc50dd0d0e4..91b8fe118ec 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> @@ -630,8 +630,6 @@ void ipoib_mcast_join_task(struct work_struct *work)
>   	}
>   
>   	ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n");
> -
> -	clear_bit(IPOIB_MCAST_RUN, &priv->flags);
>   }
>   
>   int ipoib_mcast_start_thread(struct net_device *dev)
> @@ -641,8 +639,8 @@ int ipoib_mcast_start_thread(struct net_device *dev)
>   	ipoib_dbg_mcast(priv, "starting multicast thread\n");
>   
>   	mutex_lock(&mcast_mutex);
> -	if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags))
> -		queue_delayed_work(priv->wq, &priv->mcast_task, 0);
> +	set_bit(IPOIB_MCAST_RUN, &priv->flags);

Hi Doug,
Can you use IPOIB_FLAG_OPER_UP instead of IPOIB_MCAST_RUN, in all these 
places and get rid of that RUN bit?

> +	queue_delayed_work(priv->wq, &priv->mcast_task, 0);
>   	mutex_unlock(&mcast_mutex);
>   
>   	return 0;
> @@ -725,7 +723,7 @@ void ipoib_mcast_send(struct net_device *dev, u8 *daddr, struct sk_buff *skb)
>   		memcpy(mcast->mcmember.mgid.raw, mgid, sizeof (union ib_gid));
>   		__ipoib_mcast_add(dev, mcast);
>   		list_add_tail(&mcast->list, &priv->multicast_list);
> -		if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags))
> +		if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
>   			queue_delayed_work(priv->wq, &priv->mcast_task, 0);
>   	}
>   
> @@ -951,7 +949,8 @@ void ipoib_mcast_restart_task(struct work_struct *work)
>   	/*
>   	 * Restart our join task if needed
>   	 */
> -	ipoib_mcast_start_thread(dev);
> +	if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
> +		queue_delayed_work(priv->wq, &priv->mcast_task, 0);
>   	rtnl_unlock();
>   }
>   

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Doug Ledford Feb. 13, 2015, 1:30 p.m. UTC | #2
On Fri, 2015-02-13 at 10:50 +0200, Erez Shitrit wrote:
> On 2/12/2015 3:43 AM, Doug Ledford wrote:
> > The usage of IPOIB_MCAST_RUN as a flag is inconsistent.  In some places
> > it is used to mean "our device is administratively allowed to send
> > multicast joins/leaves/packets" and in other places it means "our
> > multicast join task thread is currently running and will process your
> > request if you put it on the queue".  However, this latter meaning is in
> > fact flawed as there is a race condition between the join task testing
> > the mcast list and finding it empty of remaining work, dropping the
> > mcast mutex and also the priv->lock spinlock, and clearing the
> > IPOIB_MCAST_RUN flag.  Further, there are numerous locations that use
> > the flag in the former fashion, and when all tasks complete and the task
> > thread clears the RUN flag, all of those other locations will fail to
> > ever again queue any work.  This results in the interface coming up fine
> > initially, but having problems adding new multicast groups after the
> > first round of groups have all been added and the RUN flag is cleared by
> > the join task thread when it thinks it is done.  To resolve this issue,
> > convert all locations in the code to treat the RUN flag as an indicator
> > that the multicast portion of this interface is in fact administratively
> > up and joins/leaves/sends can be performed.  There is no harm (other
> > than a slight performance penalty) to never clearing this flag and using
> > it in this fashion as it simply means that a few places that used to
> > micro-optimize how often this task was queued on a work queue will now
> > queue the task a few extra times.  We can address that suboptimal
> > behavior in future patches.
> >
> > Signed-off-by: Doug Ledford <dledford@redhat.com>
> > ---
> >   drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 11 +++++------
> >   1 file changed, 5 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> > index bc50dd0d0e4..91b8fe118ec 100644
> > --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> > +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> > @@ -630,8 +630,6 @@ void ipoib_mcast_join_task(struct work_struct *work)
> >   	}
> >   
> >   	ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n");
> > -
> > -	clear_bit(IPOIB_MCAST_RUN, &priv->flags);
> >   }
> >   
> >   int ipoib_mcast_start_thread(struct net_device *dev)
> > @@ -641,8 +639,8 @@ int ipoib_mcast_start_thread(struct net_device *dev)
> >   	ipoib_dbg_mcast(priv, "starting multicast thread\n");
> >   
> >   	mutex_lock(&mcast_mutex);
> > -	if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags))
> > -		queue_delayed_work(priv->wq, &priv->mcast_task, 0);
> > +	set_bit(IPOIB_MCAST_RUN, &priv->flags);
> 
> Hi Doug,
> Can you use IPOIB_FLAG_OPER_UP instead of IPOIB_MCAST_RUN, in all these 
> places and get rid of that RUN bit?

Yes, I think that should be easily doable.
diff mbox

Patch

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index bc50dd0d0e4..91b8fe118ec 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -630,8 +630,6 @@  void ipoib_mcast_join_task(struct work_struct *work)
 	}
 
 	ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n");
-
-	clear_bit(IPOIB_MCAST_RUN, &priv->flags);
 }
 
 int ipoib_mcast_start_thread(struct net_device *dev)
@@ -641,8 +639,8 @@  int ipoib_mcast_start_thread(struct net_device *dev)
 	ipoib_dbg_mcast(priv, "starting multicast thread\n");
 
 	mutex_lock(&mcast_mutex);
-	if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags))
-		queue_delayed_work(priv->wq, &priv->mcast_task, 0);
+	set_bit(IPOIB_MCAST_RUN, &priv->flags);
+	queue_delayed_work(priv->wq, &priv->mcast_task, 0);
 	mutex_unlock(&mcast_mutex);
 
 	return 0;
@@ -725,7 +723,7 @@  void ipoib_mcast_send(struct net_device *dev, u8 *daddr, struct sk_buff *skb)
 		memcpy(mcast->mcmember.mgid.raw, mgid, sizeof (union ib_gid));
 		__ipoib_mcast_add(dev, mcast);
 		list_add_tail(&mcast->list, &priv->multicast_list);
-		if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags))
+		if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
 			queue_delayed_work(priv->wq, &priv->mcast_task, 0);
 	}
 
@@ -951,7 +949,8 @@  void ipoib_mcast_restart_task(struct work_struct *work)
 	/*
 	 * Restart our join task if needed
 	 */
-	ipoib_mcast_start_thread(dev);
+	if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
+		queue_delayed_work(priv->wq, &priv->mcast_task, 0);
 	rtnl_unlock();
 }