diff mbox series

[v2] RDMA/core: Fix race when resolving IP address

Message ID 1561711763-24705-1-git-send-email-dag.moxnes@oracle.com (mailing list archive)
State Superseded
Headers show
Series [v2] RDMA/core: Fix race when resolving IP address | expand

Commit Message

Dag Moxnes June 28, 2019, 8:49 a.m. UTC
Use neighbour lock when copying MAC address from neighbour data struct
in dst_fetch_ha.

When not using the lock, it is possible for the function to race with
neigh_update, causing it to copy an invalid MAC address.

It is possible to provoke this error by calling rdma_resolve_addr in a
tight loop, while deleting the corresponding ARP entry in another tight
loop.

Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com>
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>

---
v1 -> v2:
   * Modified implementation to improve readability
---
 drivers/infiniband/core/addr.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Comments

Parav Pandit July 5, 2019, 2:19 a.m. UTC | #1
On Fri, Jun 28, 2019 at 2:20 PM Dag Moxnes <dag.moxnes@oracle.com> wrote:
>
> Use neighbour lock when copying MAC address from neighbour data struct
> in dst_fetch_ha.
>
> When not using the lock, it is possible for the function to race with
> neigh_update, causing it to copy an invalid MAC address.
>
> It is possible to provoke this error by calling rdma_resolve_addr in a
> tight loop, while deleting the corresponding ARP entry in another tight
> loop.
>
> Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com>
> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
>
> ---
> v1 -> v2:
>    * Modified implementation to improve readability
> ---
>  drivers/infiniband/core/addr.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
> index 2f7d141598..51323ffbc5 100644
> --- a/drivers/infiniband/core/addr.c
> +++ b/drivers/infiniband/core/addr.c
> @@ -333,11 +333,14 @@ static int dst_fetch_ha(const struct dst_entry *dst,
>         if (!n)
>                 return -ENODATA;
>
> -       if (!(n->nud_state & NUD_VALID)) {
> +       read_lock_bh(&n->lock);
> +       if (n->nud_state & NUD_VALID) {
> +               memcpy(dev_addr->dst_dev_addr, n->ha, MAX_ADDR_LEN);
> +               read_unlock_bh(&n->lock);
> +       } else {
> +               read_unlock_bh(&n->lock);
>                 neigh_event_send(n, NULL);
>                 ret = -ENODATA;
> -       } else {
> -               memcpy(dev_addr->dst_dev_addr, n->ha, MAX_ADDR_LEN);
>         }
>
>         neigh_release(n);
> --
> 2.20.1
>
Reviewed-by: Parav Pandit <parav@mellanox.com>

A sample trace such as below in commit message would be good to have.
Or the similar one that you noticed with ARP delete sequence.

neigh_changeaddr()
  neigh_flush_dev()
   n->nud_state = NUD_NOARP;

Having some issues with office outlook, so replying via gmail.
Leon Romanovsky July 5, 2019, 4:09 a.m. UTC | #2
On Fri, Jul 05, 2019 at 07:49:06AM +0530, Parav Pandit wrote:
> On Fri, Jun 28, 2019 at 2:20 PM Dag Moxnes <dag.moxnes@oracle.com> wrote:
> >
> > Use neighbour lock when copying MAC address from neighbour data struct
> > in dst_fetch_ha.
> >
> > When not using the lock, it is possible for the function to race with
> > neigh_update, causing it to copy an invalid MAC address.
> >
> > It is possible to provoke this error by calling rdma_resolve_addr in a
> > tight loop, while deleting the corresponding ARP entry in another tight
> > loop.
> >
> > Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com>
> > Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
> >
> > ---
> > v1 -> v2:
> >    * Modified implementation to improve readability
> > ---
> >  drivers/infiniband/core/addr.c | 9 ++++++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
> > index 2f7d141598..51323ffbc5 100644
> > --- a/drivers/infiniband/core/addr.c
> > +++ b/drivers/infiniband/core/addr.c
> > @@ -333,11 +333,14 @@ static int dst_fetch_ha(const struct dst_entry *dst,
> >         if (!n)
> >                 return -ENODATA;
> >
> > -       if (!(n->nud_state & NUD_VALID)) {
> > +       read_lock_bh(&n->lock);
> > +       if (n->nud_state & NUD_VALID) {
> > +               memcpy(dev_addr->dst_dev_addr, n->ha, MAX_ADDR_LEN);
> > +               read_unlock_bh(&n->lock);
> > +       } else {
> > +               read_unlock_bh(&n->lock);
> >                 neigh_event_send(n, NULL);
> >                 ret = -ENODATA;
> > -       } else {
> > -               memcpy(dev_addr->dst_dev_addr, n->ha, MAX_ADDR_LEN);
> >         }
> >
> >         neigh_release(n);
> > --
> > 2.20.1
> >
> Reviewed-by: Parav Pandit <parav@mellanox.com>
>
> A sample trace such as below in commit message would be good to have.
> Or the similar one that you noticed with ARP delete sequence.
>
> neigh_changeaddr()
>   neigh_flush_dev()
>    n->nud_state = NUD_NOARP;
>
> Having some issues with office outlook, so replying via gmail.

Your replies from gmail looks much better when you used Outlook - proper
spacing between quoted text and your reply.

Thanks
Dag Moxnes July 5, 2019, 9:03 a.m. UTC | #3
Den 05.07.2019 04:19, skrev Parav Pandit:
> On Fri, Jun 28, 2019 at 2:20 PM Dag Moxnes <dag.moxnes@oracle.com> wrote:
>> Use neighbour lock when copying MAC address from neighbour data struct
>> in dst_fetch_ha.
>>
>> When not using the lock, it is possible for the function to race with
>> neigh_update, causing it to copy an invalid MAC address.
>>
>> It is possible to provoke this error by calling rdma_resolve_addr in a
>> tight loop, while deleting the corresponding ARP entry in another tight
>> loop.
>>
>> Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com>
>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
>>
>> ---
>> v1 -> v2:
>>     * Modified implementation to improve readability
>> ---
>>   drivers/infiniband/core/addr.c | 9 ++++++---
>>   1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
>> index 2f7d141598..51323ffbc5 100644
>> --- a/drivers/infiniband/core/addr.c
>> +++ b/drivers/infiniband/core/addr.c
>> @@ -333,11 +333,14 @@ static int dst_fetch_ha(const struct dst_entry *dst,
>>          if (!n)
>>                  return -ENODATA;
>>
>> -       if (!(n->nud_state & NUD_VALID)) {
>> +       read_lock_bh(&n->lock);
>> +       if (n->nud_state & NUD_VALID) {
>> +               memcpy(dev_addr->dst_dev_addr, n->ha, MAX_ADDR_LEN);
>> +               read_unlock_bh(&n->lock);
>> +       } else {
>> +               read_unlock_bh(&n->lock);
>>                  neigh_event_send(n, NULL);
>>                  ret = -ENODATA;
>> -       } else {
>> -               memcpy(dev_addr->dst_dev_addr, n->ha, MAX_ADDR_LEN);
>>          }
>>
>>          neigh_release(n);
>> --
>> 2.20.1
>>
> Reviewed-by: Parav Pandit <parav@mellanox.com>
>
> A sample trace such as below in commit message would be good to have.
> Or the similar one that you noticed with ARP delete sequence.
>
> neigh_changeaddr()
>    neigh_flush_dev()
>     n->nud_state = NUD_NOARP;
>
> Having some issues with office outlook, so replying via gmail.

Hi Parav,

Thanks for your review. I'll add a sample trace to the commit message as

you suggest.


Regards,

-Dag
Jason Gunthorpe July 9, 2019, 12:26 p.m. UTC | #4
On Fri, Jul 05, 2019 at 07:09:50AM +0300, Leon Romanovsky wrote:
> On Fri, Jul 05, 2019 at 07:49:06AM +0530, Parav Pandit wrote:
> > On Fri, Jun 28, 2019 at 2:20 PM Dag Moxnes <dag.moxnes@oracle.com> wrote:
> > >
> > > Use neighbour lock when copying MAC address from neighbour data struct
> > > in dst_fetch_ha.
> > >
> > > When not using the lock, it is possible for the function to race with
> > > neigh_update, causing it to copy an invalid MAC address.
> > >
> > > It is possible to provoke this error by calling rdma_resolve_addr in a
> > > tight loop, while deleting the corresponding ARP entry in another tight
> > > loop.
> > >
> > > Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com>
> > > Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
> > >
> > > v1 -> v2:
> > >    * Modified implementation to improve readability
> > >  drivers/infiniband/core/addr.c | 9 ++++++---
> > >  1 file changed, 6 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
> > > index 2f7d141598..51323ffbc5 100644
> > > +++ b/drivers/infiniband/core/addr.c
> > > @@ -333,11 +333,14 @@ static int dst_fetch_ha(const struct dst_entry *dst,
> > >         if (!n)
> > >                 return -ENODATA;
> > >
> > > -       if (!(n->nud_state & NUD_VALID)) {
> > > +       read_lock_bh(&n->lock);
> > > +       if (n->nud_state & NUD_VALID) {
> > > +               memcpy(dev_addr->dst_dev_addr, n->ha, MAX_ADDR_LEN);
> > > +               read_unlock_bh(&n->lock);
> > > +       } else {
> > > +               read_unlock_bh(&n->lock);
> > >                 neigh_event_send(n, NULL);
> > >                 ret = -ENODATA;
> > > -       } else {
> > > -               memcpy(dev_addr->dst_dev_addr, n->ha, MAX_ADDR_LEN);
> > >         }
> > >
> > >         neigh_release(n);
> > >
> > Reviewed-by: Parav Pandit <parav@mellanox.com>
> >
> > A sample trace such as below in commit message would be good to have.
> > Or the similar one that you noticed with ARP delete sequence.
> >
> > neigh_changeaddr()
> >   neigh_flush_dev()
> >    n->nud_state = NUD_NOARP;
> >
> > Having some issues with office outlook, so replying via gmail.
> 
> Your replies from gmail looks much better when you used Outlook - proper
> spacing between quoted text and your reply.

Why not use thunderbird or something?

Jason
diff mbox series

Patch

diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index 2f7d141598..51323ffbc5 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -333,11 +333,14 @@  static int dst_fetch_ha(const struct dst_entry *dst,
 	if (!n)
 		return -ENODATA;
 
-	if (!(n->nud_state & NUD_VALID)) {
+	read_lock_bh(&n->lock);
+	if (n->nud_state & NUD_VALID) {
+		memcpy(dev_addr->dst_dev_addr, n->ha, MAX_ADDR_LEN);
+		read_unlock_bh(&n->lock);
+	} else {
+		read_unlock_bh(&n->lock);
 		neigh_event_send(n, NULL);
 		ret = -ENODATA;
-	} else {
-		memcpy(dev_addr->dst_dev_addr, n->ha, MAX_ADDR_LEN);
 	}
 
 	neigh_release(n);