Message ID | 168805181129.164730.67097436102991889.stgit@manet.1015granger.net (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Handle ARPHRD_NONE devices for siw | expand |
On 6/29/2023 11:16 AM, Chuck Lever wrote: > From: Chuck Lever <chuck.lever@oracle.com> > > We would like to enable the use of siw on top of a VPN that is > constructed and managed via a tun device. That hasn't worked up > until now because ARPHRD_NONE devices (such as tun devices) have > no GID for the RDMA/core to look up. > > But it turns out that the egress device has already been picked for > us -- no GID is necessary. addr_handler() just has to do the right > thing with it. > > Suggested-by: Jason Gunthorpe <jgg@nvidia.com> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > --- > drivers/infiniband/core/cma.c | 15 +++++++++++++++ > 1 file changed, 15 insertions(+) > > diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c > index 889b3e4ea980..07bb5ac4019d 100644 > --- a/drivers/infiniband/core/cma.c > +++ b/drivers/infiniband/core/cma.c > @@ -700,6 +700,21 @@ cma_validate_port(struct ib_device *device, u32 port, > if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port)) > goto out; > > + /* Linux iWARP devices have but one port */ I don't believe this comment is correct, or necessary. In-tree drivers exist for several multi-port iWARP devices, and the port bnumber passed to rdma_protocol_iwarp() and rdma_get_gid_attr() will follow, no? The code looks correct otherwise... Tom. > + if (rdma_protocol_iwarp(device, port)) { > + sgid_attr = rdma_get_gid_attr(device, port, 0); > + if (IS_ERR(sgid_attr)) > + goto out; > + > + rcu_read_lock(); > + ndev = rcu_dereference(sgid_attr->ndev); > + if (!net_eq(dev_net(ndev), dev_addr->net) || > + ndev->ifindex != bound_if_index) > + sgid_attr = ERR_PTR(-ENODEV); > + rcu_read_unlock(); > + goto out; > + } > + > if (dev_type == ARPHRD_ETHER && rdma_protocol_roce(device, port)) { > ndev = dev_get_by_index(dev_addr->net, bound_if_index); > if (!ndev) > > >
> On Jul 1, 2023, at 12:24 PM, Tom Talpey <tom@talpey.com> wrote: > > On 6/29/2023 11:16 AM, Chuck Lever wrote: >> From: Chuck Lever <chuck.lever@oracle.com> >> We would like to enable the use of siw on top of a VPN that is >> constructed and managed via a tun device. That hasn't worked up >> until now because ARPHRD_NONE devices (such as tun devices) have >> no GID for the RDMA/core to look up. >> But it turns out that the egress device has already been picked for >> us -- no GID is necessary. addr_handler() just has to do the right >> thing with it. >> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >> --- >> drivers/infiniband/core/cma.c | 15 +++++++++++++++ >> 1 file changed, 15 insertions(+) >> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c >> index 889b3e4ea980..07bb5ac4019d 100644 >> --- a/drivers/infiniband/core/cma.c >> +++ b/drivers/infiniband/core/cma.c >> @@ -700,6 +700,21 @@ cma_validate_port(struct ib_device *device, u32 port, >> if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port)) >> goto out; >> + /* Linux iWARP devices have but one port */ > > I don't believe this comment is correct, or necessary. In-tree drivers > exist for several multi-port iWARP devices, and the port bnumber passed > to rdma_protocol_iwarp() and rdma_get_gid_attr() will follow, no? Then I must have misunderstood what Jason said about the reason for the rdma_protocol_iwarp() check. He said that we are able to do this kind of GID lookup because iWARP devices have only a single port. Jason? > The code looks correct otherwise... > > Tom. > >> + if (rdma_protocol_iwarp(device, port)) { >> + sgid_attr = rdma_get_gid_attr(device, port, 0); >> + if (IS_ERR(sgid_attr)) >> + goto out; >> + >> + rcu_read_lock(); >> + ndev = rcu_dereference(sgid_attr->ndev); >> + if (!net_eq(dev_net(ndev), dev_addr->net) || >> + ndev->ifindex != bound_if_index) >> + sgid_attr = ERR_PTR(-ENODEV); >> + rcu_read_unlock(); >> + goto out; >> + } >> + >> if (dev_type == ARPHRD_ETHER && rdma_protocol_roce(device, port)) { >> ndev = dev_get_by_index(dev_addr->net, bound_if_index); >> if (!ndev) -- Chuck Lever
On Sat, Jul 01, 2023 at 04:27:23PM +0000, Chuck Lever III wrote: > > > > On Jul 1, 2023, at 12:24 PM, Tom Talpey <tom@talpey.com> wrote: > > > > On 6/29/2023 11:16 AM, Chuck Lever wrote: > >> From: Chuck Lever <chuck.lever@oracle.com> > >> We would like to enable the use of siw on top of a VPN that is > >> constructed and managed via a tun device. That hasn't worked up > >> until now because ARPHRD_NONE devices (such as tun devices) have > >> no GID for the RDMA/core to look up. > >> But it turns out that the egress device has already been picked for > >> us -- no GID is necessary. addr_handler() just has to do the right > >> thing with it. > >> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> > >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > >> --- > >> drivers/infiniband/core/cma.c | 15 +++++++++++++++ > >> 1 file changed, 15 insertions(+) > >> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c > >> index 889b3e4ea980..07bb5ac4019d 100644 > >> --- a/drivers/infiniband/core/cma.c > >> +++ b/drivers/infiniband/core/cma.c > >> @@ -700,6 +700,21 @@ cma_validate_port(struct ib_device *device, u32 port, > >> if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port)) > >> goto out; > >> + /* Linux iWARP devices have but one port */ > > > > I don't believe this comment is correct, or necessary. In-tree drivers > > exist for several multi-port iWARP devices, and the port bnumber passed > > to rdma_protocol_iwarp() and rdma_get_gid_attr() will follow, no? > > Then I must have misunderstood what Jason said about the reason > for the rdma_protocol_iwarp() check. He said that we are able to > do this kind of GID lookup because iWARP devices have only a > single port. > > Jason? I don't know alot about iwarp - tom does iwarp really have multiported *struct ib_device* models? This is different from multiport hw. If it is multiport how do the gid tables work across the ports? Jason
On 7/3/2023 5:07 PM, Jason Gunthorpe wrote: > On Sat, Jul 01, 2023 at 04:27:23PM +0000, Chuck Lever III wrote: >> >> >>> On Jul 1, 2023, at 12:24 PM, Tom Talpey <tom@talpey.com> wrote: >>> >>> On 6/29/2023 11:16 AM, Chuck Lever wrote: >>>> From: Chuck Lever <chuck.lever@oracle.com> >>>> We would like to enable the use of siw on top of a VPN that is >>>> constructed and managed via a tun device. That hasn't worked up >>>> until now because ARPHRD_NONE devices (such as tun devices) have >>>> no GID for the RDMA/core to look up. >>>> But it turns out that the egress device has already been picked for >>>> us -- no GID is necessary. addr_handler() just has to do the right >>>> thing with it. >>>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> >>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >>>> --- >>>> drivers/infiniband/core/cma.c | 15 +++++++++++++++ >>>> 1 file changed, 15 insertions(+) >>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c >>>> index 889b3e4ea980..07bb5ac4019d 100644 >>>> --- a/drivers/infiniband/core/cma.c >>>> +++ b/drivers/infiniband/core/cma.c >>>> @@ -700,6 +700,21 @@ cma_validate_port(struct ib_device *device, u32 port, >>>> if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port)) >>>> goto out; >>>> + /* Linux iWARP devices have but one port */ >>> >>> I don't believe this comment is correct, or necessary. In-tree drivers >>> exist for several multi-port iWARP devices, and the port bnumber passed >>> to rdma_protocol_iwarp() and rdma_get_gid_attr() will follow, no? >> >> Then I must have misunderstood what Jason said about the reason >> for the rdma_protocol_iwarp() check. He said that we are able to >> do this kind of GID lookup because iWARP devices have only a >> single port. >> >> Jason? > > I don't know alot about iwarp - tom does iwarp really have multiported > *struct ib_device* models? This is different from multiport hw. I don't see how the iWARP protocol impacts this, but I believe the cxgb4 provider implements multiport. It sets the ibdev.phys_port_cnt anyway. Perhaps this is incorrect. > If it is multiport how do the gid tables work across the ports? Again, not sure how to respond. iWARP doesn't express the gid as a protocol element. And the iw_cm really doesn't either, although it does implement a gid-type API I guess. That's local behavior though, not something that goes on the wire directly. Maybe I should ask... what does the "Linux iWARP devices have but one port" actually mean in the comment? Would the code below it not work if that were not the case? All I'm saying is that the comment seems to be unnecessary, and confusing. Tom.
> On Jul 4, 2023, at 10:23 AM, Tom Talpey <tom@talpey.com> wrote: > > On 7/3/2023 5:07 PM, Jason Gunthorpe wrote: >> On Sat, Jul 01, 2023 at 04:27:23PM +0000, Chuck Lever III wrote: >>> >>> >>>> On Jul 1, 2023, at 12:24 PM, Tom Talpey <tom@talpey.com> wrote: >>>> >>>> On 6/29/2023 11:16 AM, Chuck Lever wrote: >>>>> From: Chuck Lever <chuck.lever@oracle.com> >>>>> We would like to enable the use of siw on top of a VPN that is >>>>> constructed and managed via a tun device. That hasn't worked up >>>>> until now because ARPHRD_NONE devices (such as tun devices) have >>>>> no GID for the RDMA/core to look up. >>>>> But it turns out that the egress device has already been picked for >>>>> us -- no GID is necessary. addr_handler() just has to do the right >>>>> thing with it. >>>>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> >>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >>>>> --- >>>>> drivers/infiniband/core/cma.c | 15 +++++++++++++++ >>>>> 1 file changed, 15 insertions(+) >>>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c >>>>> index 889b3e4ea980..07bb5ac4019d 100644 >>>>> --- a/drivers/infiniband/core/cma.c >>>>> +++ b/drivers/infiniband/core/cma.c >>>>> @@ -700,6 +700,21 @@ cma_validate_port(struct ib_device *device, u32 port, >>>>> if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port)) >>>>> goto out; >>>>> + /* Linux iWARP devices have but one port */ >>>> >>>> I don't believe this comment is correct, or necessary. In-tree drivers >>>> exist for several multi-port iWARP devices, and the port bnumber passed >>>> to rdma_protocol_iwarp() and rdma_get_gid_attr() will follow, no? >>> >>> Then I must have misunderstood what Jason said about the reason >>> for the rdma_protocol_iwarp() check. He said that we are able to >>> do this kind of GID lookup because iWARP devices have only a >>> single port. >>> >>> Jason? >> I don't know alot about iwarp - tom does iwarp really have multiported >> *struct ib_device* models? This is different from multiport hw. > > I don't see how the iWARP protocol impacts this, but I believe the > cxgb4 provider implements multiport. It sets the ibdev.phys_port_cnt > anyway. Perhaps this is incorrect. > >> If it is multiport how do the gid tables work across the ports? > > Again, not sure how to respond. iWARP doesn't express the gid as a > protocol element. And the iw_cm really doesn't either, although it > does implement a gid-type API I guess. That's local behavior though, > not something that goes on the wire directly. > > Maybe I should ask... what does the "Linux iWARP devices have but one > port" actually mean in the comment? Would the code below it not work > if that were not the case? All I'm saying is that the comment seems > to be unnecessary, and confusing. It replaces a code comment you complained about in an earlier review regarding the use of "if (rdma_protocol_iwarp())". As far as I understand, /in Linux/ each iWARP endpoint gets its own ib_device and that device has exactly one port. So for example, a physical device that has two ports would appear as two ib_devices each with a single port. Is that not how it works? It's certainly possible I've misunderstood something. Again, this is not about the protocol, it's about how Linux implements it. I'm open to specific suggestions to improve the comment, or I can remove it if the code is sufficiently clear. -- Chuck Lever
On Tue, Jul 04, 2023 at 02:54:59PM +0000, Chuck Lever III wrote: > > > > On Jul 4, 2023, at 10:23 AM, Tom Talpey <tom@talpey.com> wrote: > > > > On 7/3/2023 5:07 PM, Jason Gunthorpe wrote: > >> On Sat, Jul 01, 2023 at 04:27:23PM +0000, Chuck Lever III wrote: > >>> > >>> > >>>> On Jul 1, 2023, at 12:24 PM, Tom Talpey <tom@talpey.com> wrote: > >>>> > >>>> On 6/29/2023 11:16 AM, Chuck Lever wrote: > >>>>> From: Chuck Lever <chuck.lever@oracle.com> > >>>>> We would like to enable the use of siw on top of a VPN that is > >>>>> constructed and managed via a tun device. That hasn't worked up > >>>>> until now because ARPHRD_NONE devices (such as tun devices) have > >>>>> no GID for the RDMA/core to look up. > >>>>> But it turns out that the egress device has already been picked for > >>>>> us -- no GID is necessary. addr_handler() just has to do the right > >>>>> thing with it. > >>>>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> > >>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > >>>>> --- > >>>>> drivers/infiniband/core/cma.c | 15 +++++++++++++++ > >>>>> 1 file changed, 15 insertions(+) > >>>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c > >>>>> index 889b3e4ea980..07bb5ac4019d 100644 > >>>>> --- a/drivers/infiniband/core/cma.c > >>>>> +++ b/drivers/infiniband/core/cma.c > >>>>> @@ -700,6 +700,21 @@ cma_validate_port(struct ib_device *device, u32 port, > >>>>> if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port)) > >>>>> goto out; > >>>>> + /* Linux iWARP devices have but one port */ > >>>> > >>>> I don't believe this comment is correct, or necessary. In-tree drivers > >>>> exist for several multi-port iWARP devices, and the port bnumber passed > >>>> to rdma_protocol_iwarp() and rdma_get_gid_attr() will follow, no? > >>> > >>> Then I must have misunderstood what Jason said about the reason > >>> for the rdma_protocol_iwarp() check. He said that we are able to > >>> do this kind of GID lookup because iWARP devices have only a > >>> single port. > >>> > >>> Jason? > >> I don't know alot about iwarp - tom does iwarp really have multiported > >> *struct ib_device* models? This is different from multiport hw. > > > > I don't see how the iWARP protocol impacts this, but I believe the > > cxgb4 provider implements multiport. It sets the ibdev.phys_port_cnt > > anyway. Perhaps this is incorrect. > > > >> If it is multiport how do the gid tables work across the ports? > > > > Again, not sure how to respond. iWARP doesn't express the gid as a > > protocol element. And the iw_cm really doesn't either, although it > > does implement a gid-type API I guess. That's local behavior though, > > not something that goes on the wire directly. > > > > Maybe I should ask... what does the "Linux iWARP devices have but one > > port" actually mean in the comment? Would the code below it not work > > if that were not the case? All I'm saying is that the comment seems > > to be unnecessary, and confusing. > > It replaces a code comment you complained about in an earlier review > regarding the use of "if (rdma_protocol_iwarp())". As far as I > understand, /in Linux/ each iWARP endpoint gets its own ib_device > and that device has exactly one port. > > So for example, a physical device that has two ports would appear > as two ib_devices each with a single port. Is that not how it > works? It's certainly possible I've misunderstood something. That is how I would expect it to work. Multi-port ib_device is really only something that exists to support IB's APM, and iWarp doesn't have that. Otherwise verbs says a QP is bound to a single IB device's port and a single GID of that port. It should not float around between multiple ports. So, I don't know what the iwarp drivers did here. As for rthe comment, I don't think it is quite right, this code already knows what ib_device port it is supposed to be using somehow, so it doesn't matter. I think it should be more like: In iWarp mode we need to have a sgid entry to be able to locate the netdevice. iWarp drivers are not allowed to associate more than one net device with their gid tables, so returning the first entry is sufficient. iWarp will ignore the GID entries actual GID, and the passed in GID may not even be present in the GID table for tunnels and other non-ethernet netdevices. Jason
> On Jul 10, 2023, at 1:06 PM, Jason Gunthorpe <jgg@nvidia.com> wrote: > > On Tue, Jul 04, 2023 at 02:54:59PM +0000, Chuck Lever III wrote: >> >> >>> On Jul 4, 2023, at 10:23 AM, Tom Talpey <tom@talpey.com> wrote: >>> >>> On 7/3/2023 5:07 PM, Jason Gunthorpe wrote: >>>> On Sat, Jul 01, 2023 at 04:27:23PM +0000, Chuck Lever III wrote: >>>>> >>>>> >>>>>> On Jul 1, 2023, at 12:24 PM, Tom Talpey <tom@talpey.com> wrote: >>>>>> >>>>>> On 6/29/2023 11:16 AM, Chuck Lever wrote: >>>>>>> From: Chuck Lever <chuck.lever@oracle.com> >>>>>>> We would like to enable the use of siw on top of a VPN that is >>>>>>> constructed and managed via a tun device. That hasn't worked up >>>>>>> until now because ARPHRD_NONE devices (such as tun devices) have >>>>>>> no GID for the RDMA/core to look up. >>>>>>> But it turns out that the egress device has already been picked for >>>>>>> us -- no GID is necessary. addr_handler() just has to do the right >>>>>>> thing with it. >>>>>>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> >>>>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >>>>>>> --- >>>>>>> drivers/infiniband/core/cma.c | 15 +++++++++++++++ >>>>>>> 1 file changed, 15 insertions(+) >>>>>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c >>>>>>> index 889b3e4ea980..07bb5ac4019d 100644 >>>>>>> --- a/drivers/infiniband/core/cma.c >>>>>>> +++ b/drivers/infiniband/core/cma.c >>>>>>> @@ -700,6 +700,21 @@ cma_validate_port(struct ib_device *device, u32 port, >>>>>>> if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port)) >>>>>>> goto out; >>>>>>> + /* Linux iWARP devices have but one port */ >>>>>> >>>>>> I don't believe this comment is correct, or necessary. In-tree drivers >>>>>> exist for several multi-port iWARP devices, and the port bnumber passed >>>>>> to rdma_protocol_iwarp() and rdma_get_gid_attr() will follow, no? >>>>> >>>>> Then I must have misunderstood what Jason said about the reason >>>>> for the rdma_protocol_iwarp() check. He said that we are able to >>>>> do this kind of GID lookup because iWARP devices have only a >>>>> single port. >>>>> >>>>> Jason? >>>> I don't know alot about iwarp - tom does iwarp really have multiported >>>> *struct ib_device* models? This is different from multiport hw. >>> >>> I don't see how the iWARP protocol impacts this, but I believe the >>> cxgb4 provider implements multiport. It sets the ibdev.phys_port_cnt >>> anyway. Perhaps this is incorrect. >>> >>>> If it is multiport how do the gid tables work across the ports? >>> >>> Again, not sure how to respond. iWARP doesn't express the gid as a >>> protocol element. And the iw_cm really doesn't either, although it >>> does implement a gid-type API I guess. That's local behavior though, >>> not something that goes on the wire directly. >>> >>> Maybe I should ask... what does the "Linux iWARP devices have but one >>> port" actually mean in the comment? Would the code below it not work >>> if that were not the case? All I'm saying is that the comment seems >>> to be unnecessary, and confusing. >> >> It replaces a code comment you complained about in an earlier review >> regarding the use of "if (rdma_protocol_iwarp())". As far as I >> understand, /in Linux/ each iWARP endpoint gets its own ib_device >> and that device has exactly one port. >> >> So for example, a physical device that has two ports would appear >> as two ib_devices each with a single port. Is that not how it >> works? It's certainly possible I've misunderstood something. > > That is how I would expect it to work. Multi-port ib_device is really > only something that exists to support IB's APM, and iWarp doesn't have > that. > > Otherwise verbs says a QP is bound to a single IB device's port and a > single GID of that port. It should not float around between multiple > ports. > > So, I don't know what the iwarp drivers did here. > > As for rthe comment, I don't think it is quite right, this code > already knows what ib_device port it is supposed to be using somehow, > so it doesn't matter. > > I think it should be more like: > > In iWarp mode we need to have a sgid entry to be able to locate the > netdevice. iWarp drivers are not allowed to associate more than one > net device with their gid tables, so returning the first entry is > sufficient. iWarp will ignore the GID entries actual GID, and the > passed in GID may not even be present in the GID table for tunnels > and other non-ethernet netdevices. I can make that change and post a refresh. I'd like to hear from Tom first. -- Chuck Lever
On 7/11/2023 6:49 PM, Chuck Lever III wrote: > > >> On Jul 10, 2023, at 1:06 PM, Jason Gunthorpe <jgg@nvidia.com> wrote: >> >> On Tue, Jul 04, 2023 at 02:54:59PM +0000, Chuck Lever III wrote: >>> >>> >>>> On Jul 4, 2023, at 10:23 AM, Tom Talpey <tom@talpey.com> wrote: >>>> >>>> On 7/3/2023 5:07 PM, Jason Gunthorpe wrote: >>>>> On Sat, Jul 01, 2023 at 04:27:23PM +0000, Chuck Lever III wrote: >>>>>> >>>>>> >>>>>>> On Jul 1, 2023, at 12:24 PM, Tom Talpey <tom@talpey.com> wrote: >>>>>>> >>>>>>> On 6/29/2023 11:16 AM, Chuck Lever wrote: >>>>>>>> From: Chuck Lever <chuck.lever@oracle.com> >>>>>>>> We would like to enable the use of siw on top of a VPN that is >>>>>>>> constructed and managed via a tun device. That hasn't worked up >>>>>>>> until now because ARPHRD_NONE devices (such as tun devices) have >>>>>>>> no GID for the RDMA/core to look up. >>>>>>>> But it turns out that the egress device has already been picked for >>>>>>>> us -- no GID is necessary. addr_handler() just has to do the right >>>>>>>> thing with it. >>>>>>>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> >>>>>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >>>>>>>> --- >>>>>>>> drivers/infiniband/core/cma.c | 15 +++++++++++++++ >>>>>>>> 1 file changed, 15 insertions(+) >>>>>>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c >>>>>>>> index 889b3e4ea980..07bb5ac4019d 100644 >>>>>>>> --- a/drivers/infiniband/core/cma.c >>>>>>>> +++ b/drivers/infiniband/core/cma.c >>>>>>>> @@ -700,6 +700,21 @@ cma_validate_port(struct ib_device *device, u32 port, >>>>>>>> if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port)) >>>>>>>> goto out; >>>>>>>> + /* Linux iWARP devices have but one port */ >>>>>>> >>>>>>> I don't believe this comment is correct, or necessary. In-tree drivers >>>>>>> exist for several multi-port iWARP devices, and the port bnumber passed >>>>>>> to rdma_protocol_iwarp() and rdma_get_gid_attr() will follow, no? >>>>>> >>>>>> Then I must have misunderstood what Jason said about the reason >>>>>> for the rdma_protocol_iwarp() check. He said that we are able to >>>>>> do this kind of GID lookup because iWARP devices have only a >>>>>> single port. >>>>>> >>>>>> Jason? >>>>> I don't know alot about iwarp - tom does iwarp really have multiported >>>>> *struct ib_device* models? This is different from multiport hw. >>>> >>>> I don't see how the iWARP protocol impacts this, but I believe the >>>> cxgb4 provider implements multiport. It sets the ibdev.phys_port_cnt >>>> anyway. Perhaps this is incorrect. >>>> >>>>> If it is multiport how do the gid tables work across the ports? >>>> >>>> Again, not sure how to respond. iWARP doesn't express the gid as a >>>> protocol element. And the iw_cm really doesn't either, although it >>>> does implement a gid-type API I guess. That's local behavior though, >>>> not something that goes on the wire directly. >>>> >>>> Maybe I should ask... what does the "Linux iWARP devices have but one >>>> port" actually mean in the comment? Would the code below it not work >>>> if that were not the case? All I'm saying is that the comment seems >>>> to be unnecessary, and confusing. >>> >>> It replaces a code comment you complained about in an earlier review >>> regarding the use of "if (rdma_protocol_iwarp())". As far as I >>> understand, /in Linux/ each iWARP endpoint gets its own ib_device >>> and that device has exactly one port. >>> >>> So for example, a physical device that has two ports would appear >>> as two ib_devices each with a single port. Is that not how it >>> works? It's certainly possible I've misunderstood something. >> >> That is how I would expect it to work. Multi-port ib_device is really >> only something that exists to support IB's APM, and iWarp doesn't have >> that. >> >> Otherwise verbs says a QP is bound to a single IB device's port and a >> single GID of that port. It should not float around between multiple >> ports. >> >> So, I don't know what the iwarp drivers did here. >> >> As for rthe comment, I don't think it is quite right, this code >> already knows what ib_device port it is supposed to be using somehow, >> so it doesn't matter. >> >> I think it should be more like: >> >> In iWarp mode we need to have a sgid entry to be able to locate the >> netdevice. iWarp drivers are not allowed to associate more than one >> net device with their gid tables, so returning the first entry is >> sufficient. iWarp will ignore the GID entries actual GID, and the >> passed in GID may not even be present in the GID table for tunnels >> and other non-ethernet netdevices. > > I can make that change and post a refresh. I'd like > to hear from Tom first. The extra detail is helpful, but I'm still queasy about tying this to iWARP. The situation seems much more general. But the breadcrumbs are here so if the situation arises in another transport, there's a trail for others to follow. Tom.
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 889b3e4ea980..07bb5ac4019d 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -700,6 +700,21 @@ cma_validate_port(struct ib_device *device, u32 port, if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port)) goto out; + /* Linux iWARP devices have but one port */ + if (rdma_protocol_iwarp(device, port)) { + sgid_attr = rdma_get_gid_attr(device, port, 0); + if (IS_ERR(sgid_attr)) + goto out; + + rcu_read_lock(); + ndev = rcu_dereference(sgid_attr->ndev); + if (!net_eq(dev_net(ndev), dev_addr->net) || + ndev->ifindex != bound_if_index) + sgid_attr = ERR_PTR(-ENODEV); + rcu_read_unlock(); + goto out; + } + if (dev_type == ARPHRD_ETHER && rdma_protocol_roce(device, port)) { ndev = dev_get_by_index(dev_addr->net, bound_if_index); if (!ndev)