diff mbox

[rdma-next,V1,5/9] IB/ipoib: rtnl_unlock can not come after free_netdev

Message ID 20161228124728.26619-6-leon@kernel.org (mailing list archive)
State Accepted
Headers show

Commit Message

Leon Romanovsky Dec. 28, 2016, 12:47 p.m. UTC
From: Feras Daoud <ferasda@mellanox.com>

The ipoib_vlan_add function calls rtnl_unlock after free_netdev,
rtnl_unlock not only releases the lock, but also calls netdev_run_todo.
The latter function browses the net_todo_list array and completes the
unregistration of all its net_device instances. If we call free_netdev
before rtnl_unlock, then netdev_run_todo call over the freed device causes
panic.
To fix, move rtnl_unlock call before free_netdev call.

Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support")
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
---
 drivers/infiniband/ulp/ipoib/ipoib_vlan.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--
2.10.2

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Or Gerlitz Dec. 29, 2016, 8:55 p.m. UTC | #1
On Wed, Dec 28, 2016 at 2:47 PM, Leon Romanovsky <leon@kernel.org> wrote:
> From: Feras Daoud <ferasda@mellanox.com>
>
> The ipoib_vlan_add function calls rtnl_unlock after free_netdev,
> rtnl_unlock not only releases the lock, but also calls netdev_run_todo.
> The latter function browses the net_todo_list array and completes the
> unregistration of all its net_device instances. If we call free_netdev
> before rtnl_unlock, then netdev_run_todo call over the freed device causes
> panic.


RU claiming that we are crashing 100.0% here? since when? ever?

> To fix, move rtnl_unlock call before free_netdev call.
>
> Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support")
> Cc: Or Gerlitz <ogerlitz@mellanox.com>

> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
> index 9682183..d9dab4a 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
> @@ -168,11 +168,11 @@ int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey)
>  out:
>         up_write(&ppriv->vlan_rwsem);
>
> +       rtnl_unlock();
> +
>         if (result)
>                 free_netdev(priv->dev);
>
> -       rtnl_unlock();
> -
>         return result;
>  }
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky Jan. 1, 2017, 6:39 a.m. UTC | #2
On Thu, Dec 29, 2016 at 10:55:34PM +0200, Or Gerlitz wrote:
> On Wed, Dec 28, 2016 at 2:47 PM, Leon Romanovsky <leon@kernel.org> wrote:
> > From: Feras Daoud <ferasda@mellanox.com>
> >
> > The ipoib_vlan_add function calls rtnl_unlock after free_netdev,
> > rtnl_unlock not only releases the lock, but also calls netdev_run_todo.
> > The latter function browses the net_todo_list array and completes the
> > unregistration of all its net_device instances. If we call free_netdev
> > before rtnl_unlock, then netdev_run_todo call over the freed device causes
> > panic.
>
>
> RU claiming that we are crashing 100.0% here? since when? ever?

According to our internal bugtracker (see #131352), this bug was
literaly forever (4 years old), and it was fixed internaly in
MOFED 2.0.

>
> > To fix, move rtnl_unlock call before free_netdev call.
> >
> > Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support")
> > Cc: Or Gerlitz <ogerlitz@mellanox.com>
>
> > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
> > index 9682183..d9dab4a 100644
> > --- a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
> > +++ b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
> > @@ -168,11 +168,11 @@ int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey)
> >  out:
> >         up_write(&ppriv->vlan_rwsem);
> >
> > +       rtnl_unlock();
> > +
> >         if (result)
> >                 free_netdev(priv->dev);
> >
> > -       rtnl_unlock();
> > -
> >         return result;
> >  }
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz Jan. 1, 2017, 7:19 a.m. UTC | #3
On Sun, Jan 1, 2017 at 8:39 AM, Leon Romanovsky <leonro@mellanox.com> wrote:
> On Thu, Dec 29, 2016 at 10:55:34PM +0200, Or Gerlitz wrote:
>> On Wed, Dec 28, 2016 at 2:47 PM, Leon Romanovsky <leon@kernel.org> wrote:
>> > From: Feras Daoud <ferasda@mellanox.com>
>> >
>> > The ipoib_vlan_add function calls rtnl_unlock after free_netdev,
>> > rtnl_unlock not only releases the lock, but also calls netdev_run_todo.
>> > The latter function browses the net_todo_list array and completes the
>> > unregistration of all its net_device instances. If we call free_netdev
>> > before rtnl_unlock, then netdev_run_todo call over the freed device causes
>> > panic.
>>
>>
>> RU claiming that we are crashing 100.0% here? since when? ever?
>
> According to our internal bugtracker (see #131352), this bug was
> literaly forever (4 years old), and it was fixed internaly in
> MOFED 2.0.

Did you try it out? e.g to see that this crashes 100% as you claim?

I am pretty much sure this is not correct, and I'd like to see Erez commenting
on that and supporting your claim.

>> > To fix, move rtnl_unlock call before free_netdev call.
>> > Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support")
>> > Cc: Or Gerlitz <ogerlitz@mellanox.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Erez Shitrit Jan. 1, 2017, 7:45 a.m. UTC | #4
On Sun, Jan 1, 2017 at 9:19 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Sun, Jan 1, 2017 at 8:39 AM, Leon Romanovsky <leonro@mellanox.com> wrote:
>> On Thu, Dec 29, 2016 at 10:55:34PM +0200, Or Gerlitz wrote:
>>> On Wed, Dec 28, 2016 at 2:47 PM, Leon Romanovsky <leon@kernel.org> wrote:
>>> > From: Feras Daoud <ferasda@mellanox.com>
>>> >
>>> > The ipoib_vlan_add function calls rtnl_unlock after free_netdev,
>>> > rtnl_unlock not only releases the lock, but also calls netdev_run_todo.
>>> > The latter function browses the net_todo_list array and completes the
>>> > unregistration of all its net_device instances. If we call free_netdev
>>> > before rtnl_unlock, then netdev_run_todo call over the freed device causes
>>> > panic.
>>>
>>>
>>> RU claiming that we are crashing 100.0% here? since when? ever?
>>
>> According to our internal bugtracker (see #131352), this bug was
>> literaly forever (4 years old), and it was fixed internaly in
>> MOFED 2.0.
>
> Did you try it out? e.g to see that this crashes 100% as you claim?
>
> I am pretty much sure this is not correct, and I'd like to see Erez commenting
> on that and supporting your claim.
>
It happens in error flow only, when the call for __ipoib_vlan_add
returns with error, in that case we will call free_netdev before
rtnl_unlock.

The fix is similar to what we have in ipoib_vlan_delete function:

"
         ....
         rtnl_unlock();

         if (dev) {
                 free_netdev(dev);
                 return 0;
         }
"

First rtnl_unlock() and then  free_netdev

Thanks, Erez

>>> > To fix, move rtnl_unlock call before free_netdev call.
>>> > Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support")
>>> > Cc: Or Gerlitz <ogerlitz@mellanox.com>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
index 9682183..d9dab4a 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
@@ -168,11 +168,11 @@  int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey)
 out:
 	up_write(&ppriv->vlan_rwsem);

+	rtnl_unlock();
+
 	if (result)
 		free_netdev(priv->dev);

-	rtnl_unlock();
-
 	return result;
 }