diff mbox series

RDMA/siw: Fix IPv6 addr_list locking

Message ID 20190826141740.12969-1-bmt@zurich.ibm.com (mailing list archive)
State Superseded
Delegated to: Jason Gunthorpe
Headers show
Series RDMA/siw: Fix IPv6 addr_list locking | expand

Commit Message

Bernard Metzler Aug. 26, 2019, 2:17 p.m. UTC
Walking the address list of an inet6_dev requires
appropriate locking. Since the called function
siw_listen_address() may sleep, we have to use
rtnl_lock() + rcu_read_lock_bh() instead of
read_lock_bh().

Also introduces checks if we get a device from
in_dev_get(), or in6_dev_get().

Reported-by: Bart Van Assche <bvanassche@acm.org>
Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
---
 drivers/infiniband/sw/siw/siw_cm.c | 33 +++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 12 deletions(-)

Comments

Jason Gunthorpe Aug. 26, 2019, 2:25 p.m. UTC | #1
On Mon, Aug 26, 2019 at 04:17:40PM +0200, Bernard Metzler wrote:
> Walking the address list of an inet6_dev requires
> appropriate locking. Since the called function
> siw_listen_address() may sleep, we have to use
> rtnl_lock() + rcu_read_lock_bh() instead of
> read_lock_bh().

What is the RCU for if you have RTNL?

Jason
Bernard Metzler Aug. 26, 2019, 3:12 p.m. UTC | #2
-----"Jason Gunthorpe" <jgg@ziepe.ca> wrote: -----

>To: "Bernard Metzler" <bmt@zurich.ibm.com>
>From: "Jason Gunthorpe" <jgg@ziepe.ca>
>Date: 08/26/2019 04:25PM
>Cc: linux-rdma@vger.kernel.org, bvanassche@acm.org,
>dledford@redhat.com
>Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix IPv6 addr_list locking
>
>On Mon, Aug 26, 2019 at 04:17:40PM +0200, Bernard Metzler wrote:
>> Walking the address list of an inet6_dev requires
>> appropriate locking. Since the called function
>> siw_listen_address() may sleep, we have to use
>> rtnl_lock() + rcu_read_lock_bh() instead of
>> read_lock_bh().
>
>What is the RCU for if you have RTNL?
>

Frankly, I looked around in net/ipv6 and found, if not
rwlocked, addr_list walking to be rcu protected, even
if rtnl_lock()'d (e.g. addrconf_verify_rtnl()).

You are saying this is useless and overdone, since all
changes to that list are rtnl_lock protected right?
I was not sure about that.

For the IPv4 case further up, we also take the rtnl_lock,
and RCU-deref the address pointer (via
in_dev_for_each_ifa_rtnl()).


Hmmm.

Thanks for your help,
Bernard.
Jason Gunthorpe Aug. 26, 2019, 3:14 p.m. UTC | #3
On Mon, Aug 26, 2019 at 03:12:20PM +0000, Bernard Metzler wrote:
> 
> >To: "Bernard Metzler" <bmt@zurich.ibm.com>
> >From: "Jason Gunthorpe" <jgg@ziepe.ca>
> >Date: 08/26/2019 04:25PM
> >Cc: linux-rdma@vger.kernel.org, bvanassche@acm.org,
> >dledford@redhat.com
> >Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix IPv6 addr_list locking
> >
> >On Mon, Aug 26, 2019 at 04:17:40PM +0200, Bernard Metzler wrote:
> >> Walking the address list of an inet6_dev requires
> >> appropriate locking. Since the called function
> >> siw_listen_address() may sleep, we have to use
> >> rtnl_lock() + rcu_read_lock_bh() instead of
> >> read_lock_bh().
> >
> >What is the RCU for if you have RTNL?
> >
> 
> Frankly, I looked around in net/ipv6 and found, if not
> rwlocked, addr_list walking to be rcu protected, even
> if rtnl_lock()'d (e.g. addrconf_verify_rtnl()).
> 
> You are saying this is useless and overdone, since all
> changes to that list are rtnl_lock protected right?
> I was not sure about that.

I'm not sure, I thought it was the case that rtnl protected the
address lists on write.

> For the IPv4 case further up, we also take the rtnl_lock,
> and RCU-deref the address pointer (via
> in_dev_for_each_ifa_rtnl()).

That uses rtnl_derference which calls into rcu_dereference_protected
which is saying 'this RCU protected data is being read outside RCU
because we are holding the write side lock'

Which means it is locked by RTNL

Jason
Bernard Metzler Aug. 26, 2019, 3:33 p.m. UTC | #4
-----"Jason Gunthorpe" <jgg@ziepe.ca> wrote: -----

>To: "Bernard Metzler" <BMT@zurich.ibm.com>
>From: "Jason Gunthorpe" <jgg@ziepe.ca>
>Date: 08/26/2019 05:14PM
>Cc: linux-rdma@vger.kernel.org, bvanassche@acm.org,
>dledford@redhat.com
>Subject: [EXTERNAL] Re: Re: [PATCH] RDMA/siw: Fix IPv6 addr_list
>locking
>
>On Mon, Aug 26, 2019 at 03:12:20PM +0000, Bernard Metzler wrote:
>> 
>> >To: "Bernard Metzler" <bmt@zurich.ibm.com>
>> >From: "Jason Gunthorpe" <jgg@ziepe.ca>
>> >Date: 08/26/2019 04:25PM
>> >Cc: linux-rdma@vger.kernel.org, bvanassche@acm.org,
>> >dledford@redhat.com
>> >Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix IPv6 addr_list
>locking
>> >
>> >On Mon, Aug 26, 2019 at 04:17:40PM +0200, Bernard Metzler wrote:
>> >> Walking the address list of an inet6_dev requires
>> >> appropriate locking. Since the called function
>> >> siw_listen_address() may sleep, we have to use
>> >> rtnl_lock() + rcu_read_lock_bh() instead of
>> >> read_lock_bh().
>> >
>> >What is the RCU for if you have RTNL?
>> >
>> 
>> Frankly, I looked around in net/ipv6 and found, if not
>> rwlocked, addr_list walking to be rcu protected, even
>> if rtnl_lock()'d (e.g. addrconf_verify_rtnl()).
>> 
>> You are saying this is useless and overdone, since all
>> changes to that list are rtnl_lock protected right?
>> I was not sure about that.
>
>I'm not sure, I thought it was the case that rtnl protected the
>address lists on write.
>
>> For the IPv4 case further up, we also take the rtnl_lock,
>> and RCU-deref the address pointer (via
>> in_dev_for_each_ifa_rtnl()).
>
>That uses rtnl_derference which calls into rcu_dereference_protected
>which is saying 'this RCU protected data is being read outside RCU
>because we are holding the write side lock'
>
>Which means it is locked by RTNL
>
OK, makes sense.

So this is probably really overdone.

Thanks,
Bernard.
Bernard Metzler Aug. 27, 2019, 4:43 p.m. UTC | #5
-----"Jason Gunthorpe" <jgg@ziepe.ca> wrote: -----

>To: "Bernard Metzler" <BMT@zurich.ibm.com>
>From: "Jason Gunthorpe" <jgg@ziepe.ca>
>Date: 08/26/2019 05:14PM
>Cc: linux-rdma@vger.kernel.org, bvanassche@acm.org,
>dledford@redhat.com
>Subject: [EXTERNAL] Re: Re: [PATCH] RDMA/siw: Fix IPv6 addr_list
>locking
>
>On Mon, Aug 26, 2019 at 03:12:20PM +0000, Bernard Metzler wrote:
>> 
>> >To: "Bernard Metzler" <bmt@zurich.ibm.com>
>> >From: "Jason Gunthorpe" <jgg@ziepe.ca>
>> >Date: 08/26/2019 04:25PM
>> >Cc: linux-rdma@vger.kernel.org, bvanassche@acm.org,
>> >dledford@redhat.com
>> >Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix IPv6 addr_list
>locking
>> >
>> >On Mon, Aug 26, 2019 at 04:17:40PM +0200, Bernard Metzler wrote:
>> >> Walking the address list of an inet6_dev requires
>> >> appropriate locking. Since the called function
>> >> siw_listen_address() may sleep, we have to use
>> >> rtnl_lock() + rcu_read_lock_bh() instead of
>> >> read_lock_bh().
>> >
>> >What is the RCU for if you have RTNL?
>> >
>> 
>> Frankly, I looked around in net/ipv6 and found, if not
>> rwlocked, addr_list walking to be rcu protected, even
>> if rtnl_lock()'d (e.g. addrconf_verify_rtnl()).
>> 
>> You are saying this is useless and overdone, since all
>> changes to that list are rtnl_lock protected right?
>> I was not sure about that.
>
>I'm not sure, I thought it was the case that rtnl protected the
>address lists on write.
>
>> For the IPv4 case further up, we also take the rtnl_lock,
>> and RCU-deref the address pointer (via
>> in_dev_for_each_ifa_rtnl()).
>
>That uses rtnl_derference which calls into rcu_dereference_protected
>which is saying 'this RCU protected data is being read outside RCU
>because we are holding the write side lock'
>
>Which means it is locked by RTNL
>

Yes, I think you are right - RCU locking is not appropriate,
since we may sleep -- and that's completely against the RCU
principle.

Sorry for the confusion. Will resend the patch.
I will also take the opportunity to add skipping dead
or tentative IPv6 addresses for the wildcard listen
case.


Thanks!
Bernard.
diff mbox series

Patch

diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
index 1db5ad3d9580..8291bfcde995 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -1962,6 +1962,10 @@  int siw_create_listen(struct iw_cm_id *id, int backlog)
 		struct sockaddr_in s_laddr, *s_raddr;
 		const struct in_ifaddr *ifa;
 
+		if (!in_dev) {
+			rv = -ENODEV;
+			goto out;
+		}
 		memcpy(&s_laddr, &id->local_addr, sizeof(s_laddr));
 		s_raddr = (struct sockaddr_in *)&id->remote_addr;
 
@@ -1991,22 +1995,26 @@  int siw_create_listen(struct iw_cm_id *id, int backlog)
 		struct sockaddr_in6 *s_laddr = &to_sockaddr_in6(id->local_addr),
 			*s_raddr = &to_sockaddr_in6(id->remote_addr);
 
+		if (!in6_dev) {
+			rv = -ENODEV;
+			goto out;
+		}
 		siw_dbg(id->device,
 			"laddr %pI6:%d, raddr %pI6:%d\n",
 			&s_laddr->sin6_addr, ntohs(s_laddr->sin6_port),
 			&s_raddr->sin6_addr, ntohs(s_raddr->sin6_port));
 
-		read_lock_bh(&in6_dev->lock);
-		list_for_each_entry(ifp, &in6_dev->addr_list, if_list) {
-			struct sockaddr_in6 bind_addr;
-
+		rtnl_lock();
+		rcu_read_lock_bh();
+		list_for_each_entry_rcu(ifp, &in6_dev->addr_list, if_list) {
 			if (ipv6_addr_any(&s_laddr->sin6_addr) ||
 			    ipv6_addr_equal(&s_laddr->sin6_addr, &ifp->addr)) {
-				bind_addr.sin6_family = AF_INET6;
-				bind_addr.sin6_port = s_laddr->sin6_port;
-				bind_addr.sin6_flowinfo = 0;
-				bind_addr.sin6_addr = ifp->addr;
-				bind_addr.sin6_scope_id = dev->ifindex;
+				struct sockaddr_in6 bind_addr  = {
+					.sin6_family = AF_INET6,
+					.sin6_port = s_laddr->sin6_port,
+					.sin6_flowinfo = 0,
+					.sin6_addr = ifp->addr,
+					.sin6_scope_id = dev->ifindex };
 
 				rv = siw_listen_address(id, backlog,
 						(struct sockaddr *)&bind_addr,
@@ -2015,12 +2023,13 @@  int siw_create_listen(struct iw_cm_id *id, int backlog)
 					listeners++;
 			}
 		}
-		read_unlock_bh(&in6_dev->lock);
-
+		rcu_read_unlock_bh();
+		rtnl_unlock();
 		in6_dev_put(in6_dev);
 	} else {
-		return -EAFNOSUPPORT;
+		rv = -EAFNOSUPPORT;
 	}
+out:
 	if (listeners)
 		rv = 0;
 	else if (!rv)