Message ID | 1450710084-22547-1-git-send-email-matanb@mellanox.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
On 12/21/2015 5:01 PM, Matan Barak wrote: > Previously, cma_match_net_dev called cma_protocol_roce which > tried to verify that the IB device uses RoCE protocol. However, > if rdma_id didn't have a bounded port, it used the first port > of the device. maybe prefer a higher then code speak language e.g "if the rdma id didn't have" also below "unbounded rdma ids" > > In VPI systems, the first port might be an IB port while the second > one could be an Ethernet port. This made requests for unbounded rdma_ids > that come from the Ethernet port fail. add "to" --> "Ethernet port to fail" > Fixing this by passing the port of the request and checking this port > of the device. OK, so this fix will work for both ib/eth and eth/ib configs, right? good. > > Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE') Reported-by: Or Gerlitz <ogerlitz@mellanox.com> > Signed-off-by: Matan Barak <matanb@mellanox.com> Doug, the bug fixes a commit from from 4.3, lets fix it in 4.4 and later we will send it to -stable as well. So for 4.4 there's this one and the kvfree fix [1] Or. [1] https://patchwork.kernel.org/patch/7868481/ > --- > Hi Doug, > > This patch fixes a bug in VPI systems, where the first port is configured > as IB and the second one is configured as Ethernet. > In this case, if the rdma_id isn't bounded to a port, cma_match_net_dev > will try to verify that the first port is a RoCE port and fail. > This is fixed by passing the port of the incoming request. > > Regards, > Matan > > drivers/infiniband/core/cma.c | 16 +++++++++------- > 1 files changed, 9 insertions(+), 7 deletions(-) > > diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c > index d2d5d00..c8a265c 100644 > --- a/drivers/infiniband/core/cma.c > +++ b/drivers/infiniband/core/cma.c > @@ -1265,15 +1265,17 @@ static bool cma_protocol_roce(const struct rdma_cm_id *id) > return cma_protocol_roce_dev_port(device, port_num); > } > > -static bool cma_match_net_dev(const struct rdma_id_private *id_priv, > - const struct net_device *net_dev) > +static bool cma_match_net_dev(const struct rdma_cm_id *id, > + const struct net_device *net_dev, > + u8 port_num) > { > - const struct rdma_addr *addr = &id_priv->id.route.addr; > + const struct rdma_addr *addr = &id->route.addr; > > if (!net_dev) > /* This request is an AF_IB request or a RoCE request */ > - return addr->src_addr.ss_family == AF_IB || > - cma_protocol_roce(&id_priv->id); > + return (!id->port_num || id->port_num == port_num) && > + (addr->src_addr.ss_family == AF_IB || > + cma_protocol_roce_dev_port(id->device, port_num)); > > return !addr->dev_addr.bound_dev_if || > (net_eq(dev_net(net_dev), addr->dev_addr.net) && > @@ -1295,13 +1297,13 @@ static struct rdma_id_private *cma_find_listener( > hlist_for_each_entry(id_priv, &bind_list->owners, node) { > if (cma_match_private_data(id_priv, ib_event->private_data)) { > if (id_priv->id.device == cm_id->device && > - cma_match_net_dev(id_priv, net_dev)) > + cma_match_net_dev(&id_priv->id, net_dev, req->port)) > return id_priv; > list_for_each_entry(id_priv_dev, > &id_priv->listen_list, > listen_list) { > if (id_priv_dev->id.device == cm_id->device && > - cma_match_net_dev(id_priv_dev, net_dev)) > + cma_match_net_dev(&id_priv_dev->id, net_dev, req->port)) > return id_priv_dev; > } > } -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/21/2015 5:01 PM, Matan Barak wrote: > This patch fixes a bug in VPI systems, where the first port is configured > as IB and the second one is configured as Ethernet. In this case, if the rdma_id isn't bounded to a port, cma_match_net_dev will try to verify that the first port is a RoCE port and fail. This is fixed by passing the port of the incoming request. OK -- we have another bug down there, cma loopback doesn't work, same reject reason (below).This happens in both VPI and non-VPI configurations. Works well with 4.2.3 Or. $ rping -d -v -c -a 127.0.0.1 -C 1 verbose client count 1 created cm_id 0x6087d0 cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x6087d0 (parent) cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0x6087d0 (parent) rdma_resolve_addr - rdma_resolve_route successful created pd 0x60e5f0 created channel 0x608250 created cq 0x608a20 created qp 0x6082e0 rping_setup_buffers called on cb 0x606010 allocated & registered buffers... cq_thread started. wait for CONNECTED state 10 cma_event type RDMA_CM_EVENT_REJECTED cma_id 0x6087d0 (parent) cma event RDMA_CM_EVENT_REJECTED, error 28 connect error -1 rping_free_buffers called on cb 0x606010 destroy cm_id 0x6087d0 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/21/2015 5:01 PM, Matan Barak wrote: > Previously, cma_match_net_dev called cma_protocol_roce which > tried to verify that the IB device uses RoCE protocol. However, > if rdma_id didn't have a bounded port, it used the first port > of the device. > > In VPI systems, the first port might be an IB port while the second > one could be an Ethernet port. This made requests for unbounded rdma_ids > that come from the Ethernet port fail. > Fixing this by passing the port of the request and checking this port > of the device. > > Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE') > Signed-off-by: Matan Barak<matanb@mellanox.com> seems that the patch is missing from patchworks, I can't explain that. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/22/2015 9:17 AM, Or Gerlitz wrote: > On 12/21/2015 5:01 PM, Matan Barak wrote: >> This patch fixes a bug in VPI systems, where the first port is >> configured >> as IB and the second one is configured as Ethernet. In this case, if >> the rdma_id isn't bounded to a port, cma_match_net_dev will try to >> verify that the first port is a RoCE port and fail. This is fixed by >> passing the port of the incoming request. > > OK -- we have another bug down there, cma loopback doesn't work, same > reject reason (below).This happens in both VPI and non-VPI > configurations. > > Works well with 4.2.3 I made more checks with the 4.2.3 kernel (before all the IB core/cache changes that went in 4.3) -- rdma-cm loopback does work as long as there are active IB ports in the system. When there are no active IB ports, but rather only Eth ports (VPI or plain Eth), rping fails: # rping -d -v -c -a 127.0.0.1 -C 1 created cm_id 0x6087d0 cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x6087d0 (parent) rdma_resolve_route: No such device so it seems we had something there before the 4.3 changes. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/22/2015 05:47 AM, Or Gerlitz wrote: > On 12/21/2015 5:01 PM, Matan Barak wrote: >> Previously, cma_match_net_dev called cma_protocol_roce which >> tried to verify that the IB device uses RoCE protocol. However, >> if rdma_id didn't have a bounded port, it used the first port >> of the device. >> >> In VPI systems, the first port might be an IB port while the second >> one could be an Ethernet port. This made requests for unbounded rdma_ids >> that come from the Ethernet port fail. >> Fixing this by passing the port of the request and checking this port >> of the device. >> >> Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev >> on RoCE') >> Signed-off-by: Matan Barak<matanb@mellanox.com> > > seems that the patch is missing from patchworks, I can't explain that. I've already downloaded it and marked it accepted.
On Tue, Dec 22, 2015 at 8:58 PM, Doug Ledford <dledford@redhat.com> wrote: > On 12/22/2015 05:47 AM, Or Gerlitz wrote: >> On 12/21/2015 5:01 PM, Matan Barak wrote: >>> Previously, cma_match_net_dev called cma_protocol_roce which >>> tried to verify that the IB device uses RoCE protocol. However, >>> if rdma_id didn't have a bounded port, it used the first port >>> of the device. >>> >>> In VPI systems, the first port might be an IB port while the second >>> one could be an Ethernet port. This made requests for unbounded rdma_ids >>> that come from the Ethernet port fail. >>> Fixing this by passing the port of the request and checking this port >>> of the device. >>> >>> Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev >>> on RoCE') >>> Signed-off-by: Matan Barak<matanb@mellanox.com> >> >> seems that the patch is missing from patchworks, I can't explain that. > > I've already downloaded it and marked it accepted. > Thanks Doug. Would you like that I'll repost the patch with the commit message changed as Or suggested or is the current version good enough? Regarding the Ethernet loopback issue, I started looking into that, but as Or stated, it's broken even before the RoCE patches. > -- > Doug Ledford <dledford@redhat.com> > GPG KeyID: 0E572FDD > > Matan -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/22/2015 02:26 PM, Matan Barak wrote: > On Tue, Dec 22, 2015 at 8:58 PM, Doug Ledford <dledford@redhat.com> wrote: >> On 12/22/2015 05:47 AM, Or Gerlitz wrote: >>> On 12/21/2015 5:01 PM, Matan Barak wrote: >>>> Previously, cma_match_net_dev called cma_protocol_roce which >>>> tried to verify that the IB device uses RoCE protocol. However, >>>> if rdma_id didn't have a bounded port, it used the first port >>>> of the device. >>>> >>>> In VPI systems, the first port might be an IB port while the second >>>> one could be an Ethernet port. This made requests for unbounded rdma_ids >>>> that come from the Ethernet port fail. >>>> Fixing this by passing the port of the request and checking this port >>>> of the device. >>>> >>>> Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev >>>> on RoCE') >>>> Signed-off-by: Matan Barak<matanb@mellanox.com> >>> >>> seems that the patch is missing from patchworks, I can't explain that. >> >> I've already downloaded it and marked it accepted. >> > > Thanks Doug. Would you like that I'll repost the patch with the commit > message changed as Or suggested or is the current version good enough? I rewrote most of the commit message myself. It now reads as such: commit b00a1609f845df4c75bc62f56d0d71a01fe87cc6 Author: Matan Barak <matanb@mellanox.com> Date: Mon Dec 21 17:01:24 2015 +0200 IB/cma: cma_match_net_dev needs to take into account port_num Previously, cma_match_net_dev called cma_protocol_roce which tried to verify that the IB device uses RoCE protocol. However, if rdma_id wasn't bound to a port, then the check would occur against the first port of the device without regard to whether that port was even of the same type as the type of port the incoming packet was received on. Fix this by passing the port of the request and only checking against the same port of the device. Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> > Regarding the Ethernet loopback issue, I started looking into that, > but as Or stated, it's broken even before the RoCE patches. Sure, let me know if you root cause it, I'd like to get it into 4.5 if possible.
FWIW, looks good to me,
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/22/2015 02:26 PM, Matan Barak wrote: > On Tue, Dec 22, 2015 at 8:58 PM, Doug Ledford <dledford@redhat.com> wrote: >> On 12/22/2015 05:47 AM, Or Gerlitz wrote: >>> On 12/21/2015 5:01 PM, Matan Barak wrote: >>>> Previously, cma_match_net_dev called cma_protocol_roce which >>>> tried to verify that the IB device uses RoCE protocol. However, >>>> if rdma_id didn't have a bounded port, it used the first port >>>> of the device. >>>> >>>> In VPI systems, the first port might be an IB port while the second >>>> one could be an Ethernet port. This made requests for unbounded rdma_ids >>>> that come from the Ethernet port fail. >>>> Fixing this by passing the port of the request and checking this port >>>> of the device. >>>> >>>> Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev >>>> on RoCE') >>>> Signed-off-by: Matan Barak<matanb@mellanox.com> >>> >>> seems that the patch is missing from patchworks, I can't explain that. >> >> I've already downloaded it and marked it accepted. >> > > Thanks Doug. Would you like that I'll repost the patch with the commit > message changed as Or suggested or is the current version good enough? > > Regarding the Ethernet loopback issue, I started looking into that, > but as Or stated, it's broken even before the RoCE patches. Ping. Any progress on this?
On Wed, Dec 23, 2015 at 6:08 PM, Doug Ledford <dledford@redhat.com> wrote: > On 12/22/2015 02:26 PM, Matan Barak wrote: >> On Tue, Dec 22, 2015 at 8:58 PM, Doug Ledford <dledford@redhat.com> wrote: >>> On 12/22/2015 05:47 AM, Or Gerlitz wrote: >>>> On 12/21/2015 5:01 PM, Matan Barak wrote: >>>>> Previously, cma_match_net_dev called cma_protocol_roce which >>>>> tried to verify that the IB device uses RoCE protocol. However, >>>>> if rdma_id didn't have a bounded port, it used the first port >>>>> of the device. >>>>> >>>>> In VPI systems, the first port might be an IB port while the second >>>>> one could be an Ethernet port. This made requests for unbounded rdma_ids >>>>> that come from the Ethernet port fail. >>>>> Fixing this by passing the port of the request and checking this port >>>>> of the device. >>>>> >>>>> Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev >>>>> on RoCE') >>>>> Signed-off-by: Matan Barak<matanb@mellanox.com> >>>> >>>> seems that the patch is missing from patchworks, I can't explain that. >>> >>> I've already downloaded it and marked it accepted. >>> >> >> Thanks Doug. Would you like that I'll repost the patch with the commit >> message changed as Or suggested or is the current version good enough? >> >> Regarding the Ethernet loopback issue, I started looking into that, >> but as Or stated, it's broken even before the RoCE patches. > > Ping. Any progress on this? Yeah, there's some progress - the basic problem is that we don't have a bounded ndev and thus cma_resolve_iboe_route returns -ENODEV. The root cause for this is that we have to store the ndev in cma_bind_loopback. Even after doing that, cma_set_loopback changes the sgid to be the localhost GID, which doesn't exist in the GID table and thus will fail later in the GID lookup. I think that regarding loopback, we actually want to send the data on the link local default GID, which is guaranteed to exist. That's why I think we should: 1. Change the cma_src_addr and cma_dst_addr in cma_bind_loopback to be the default GID. 2. Store the associated ndev of this default GID as the bounded device. 3. In cma_resolve_loopback, get the MAC of this bounded device and store it as the DMAC. 4. In cma_resolve_iboe_route, don't try to do route resolve if the dGID matches the default GID. It's still not working though, but this is where I'm headed. What do you think? > > > -- > Doug Ledford <dledford@redhat.com> > GPG KeyID: 0E572FDD > > Regards, Matan -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Dec 23, 2015 at 6:08 PM, Doug Ledford <dledford@redhat.com> wrote: > On 12/22/2015 02:26 PM, Matan Barak wrote: >> On Tue, Dec 22, 2015 at 8:58 PM, Doug Ledford <dledford@redhat.com> wrote: >>> On 12/22/2015 05:47 AM, Or Gerlitz wrote: >>>> On 12/21/2015 5:01 PM, Matan Barak wrote: >>>>> Previously, cma_match_net_dev called cma_protocol_roce which >>>>> tried to verify that the IB device uses RoCE protocol. However, >>>>> if rdma_id didn't have a bounded port, it used the first port >>>>> of the device. >>>>> >>>>> In VPI systems, the first port might be an IB port while the second >>>>> one could be an Ethernet port. This made requests for unbounded rdma_ids >>>>> that come from the Ethernet port fail. >>>>> Fixing this by passing the port of the request and checking this port >>>>> of the device. >>>>> >>>>> Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev >>>>> on RoCE') >>>>> Signed-off-by: Matan Barak<matanb@mellanox.com> >>>> >>>> seems that the patch is missing from patchworks, I can't explain that. >>> >>> I've already downloaded it and marked it accepted. >>> >> >> Thanks Doug. Would you like that I'll repost the patch with the commit >> message changed as Or suggested or is the current version good enough? >> >> Regarding the Ethernet loopback issue, I started looking into that, >> but as Or stated, it's broken even before the RoCE patches. > > Ping. Any progress on this? Yeah, there's some progress - the basic problem is that we don't have a bounded ndev and thus cma_resolve_iboe_route returns -ENODEV. The root cause for this is that we have to store the ndev in cma_bind_loopback. Even after doing that, cma_set_loopback changes the sgid to be the localhost GID, which doesn't exist in the GID table and thus will fail later in the GID lookup. I think that regarding loopback, we actually want to send the data on the link local default GID, which is guaranteed to exist. That's why I think we should: 1. Change the cma_src_addr and cma_dst_addr in cma_bind_loopback to be the default GID. 2. Store the associated ndev of this default GID as the bounded device. 3. In cma_resolve_loopback, get the MAC of this bounded device and store it as the DMAC. 4. In cma_resolve_iboe_route, don't try to do route resolve if the dGID matches the default GID. It's still not working though, but this is where I'm headed. What do you think? > > > -- > Doug Ledford <dledford@redhat.com> > GPG KeyID: 0E572FDD > > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/23/2015 11:35 AM, Matan Barak wrote: > On Wed, Dec 23, 2015 at 6:08 PM, Doug Ledford <dledford@redhat.com> wrote: >> On 12/22/2015 02:26 PM, Matan Barak wrote: >>> On Tue, Dec 22, 2015 at 8:58 PM, Doug Ledford <dledford@redhat.com> wrote: >>>> On 12/22/2015 05:47 AM, Or Gerlitz wrote: >>>>> On 12/21/2015 5:01 PM, Matan Barak wrote: >>>>>> Previously, cma_match_net_dev called cma_protocol_roce which >>>>>> tried to verify that the IB device uses RoCE protocol. However, >>>>>> if rdma_id didn't have a bounded port, it used the first port >>>>>> of the device. >>>>>> >>>>>> In VPI systems, the first port might be an IB port while the second >>>>>> one could be an Ethernet port. This made requests for unbounded rdma_ids >>>>>> that come from the Ethernet port fail. >>>>>> Fixing this by passing the port of the request and checking this port >>>>>> of the device. >>>>>> >>>>>> Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev >>>>>> on RoCE') >>>>>> Signed-off-by: Matan Barak<matanb@mellanox.com> >>>>> >>>>> seems that the patch is missing from patchworks, I can't explain that. >>>> >>>> I've already downloaded it and marked it accepted. >>>> >>> >>> Thanks Doug. Would you like that I'll repost the patch with the commit >>> message changed as Or suggested or is the current version good enough? >>> >>> Regarding the Ethernet loopback issue, I started looking into that, >>> but as Or stated, it's broken even before the RoCE patches. >> >> Ping. Any progress on this? > > Yeah, there's some progress - the basic problem is that we don't have > a bounded ndev and thus cma_resolve_iboe_route returns -ENODEV. Which makes sense considering that 127.0.0.1 doesn't belong to any of the devs. > The root cause for this is that we have to store the ndev in > cma_bind_loopback. Even after doing that, cma_set_loopback changes the > sgid to be the localhost GID, which doesn't exist in the GID table and > thus will fail later in the GID lookup. Again, makes sense. > I think that regarding loopback, we actually want to send the data on > the link local default GID, Which link local default GID? If you have more than one port or card, then that is not a unique value. > which is guaranteed to exist. And in many cases, multiple times. > That's why I > think we should: > 1. Change the cma_src_addr and cma_dst_addr in cma_bind_loopback to be > the default GID. > 2. Store the associated ndev of this default GID as the bounded device. > 3. In cma_resolve_loopback, get the MAC of this bounded device and > store it as the DMAC. > 4. In cma_resolve_iboe_route, don't try to do route resolve if the > dGID matches the default GID. > > It's still not working though, but this is where I'm headed. What do you think? Let's punt this until later. It only effects the situation when you use 127.0.0.1 as the address. If you use the local IP address of a specific interface, you get the same loopback behavior, but no failures (and on top of that instead of getting a random device to handle the loopback transfer, you get a specific device of your choosing). To me, that qualifies as a reasonable workaround. The 127.0.0.1 behavior has been broken for a while (and I'm not sure it should have ever been relied upon anyway), so I don't think we have to hold things up.
On Wed, Dec 23, 2015 at 7:57 PM, Doug Ledford <dledford@redhat.com> wrote: > On 12/23/2015 11:35 AM, Matan Barak wrote: >> On Wed, Dec 23, 2015 at 6:08 PM, Doug Ledford <dledford@redhat.com> wrote: >>> On 12/22/2015 02:26 PM, Matan Barak wrote: >>>> On Tue, Dec 22, 2015 at 8:58 PM, Doug Ledford <dledford@redhat.com> wrote: >>>>> On 12/22/2015 05:47 AM, Or Gerlitz wrote: >>>>>> On 12/21/2015 5:01 PM, Matan Barak wrote: >>>>>>> Previously, cma_match_net_dev called cma_protocol_roce which >>>>>>> tried to verify that the IB device uses RoCE protocol. However, >>>>>>> if rdma_id didn't have a bounded port, it used the first port >>>>>>> of the device. >>>>>>> >>>>>>> In VPI systems, the first port might be an IB port while the second >>>>>>> one could be an Ethernet port. This made requests for unbounded rdma_ids >>>>>>> that come from the Ethernet port fail. >>>>>>> Fixing this by passing the port of the request and checking this port >>>>>>> of the device. >>>>>>> >>>>>>> Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev >>>>>>> on RoCE') >>>>>>> Signed-off-by: Matan Barak<matanb@mellanox.com> >>>>>> >>>>>> seems that the patch is missing from patchworks, I can't explain that. >>>>> >>>>> I've already downloaded it and marked it accepted. >>>>> >>>> >>>> Thanks Doug. Would you like that I'll repost the patch with the commit >>>> message changed as Or suggested or is the current version good enough? >>>> >>>> Regarding the Ethernet loopback issue, I started looking into that, >>>> but as Or stated, it's broken even before the RoCE patches. >>> >>> Ping. Any progress on this? >> >> Yeah, there's some progress - the basic problem is that we don't have >> a bounded ndev and thus cma_resolve_iboe_route returns -ENODEV. > > Which makes sense considering that 127.0.0.1 doesn't belong to any of > the devs. > >> The root cause for this is that we have to store the ndev in >> cma_bind_loopback. Even after doing that, cma_set_loopback changes the >> sgid to be the localhost GID, which doesn't exist in the GID table and >> thus will fail later in the GID lookup. > > Again, makes sense. > >> I think that regarding loopback, we actually want to send the data on >> the link local default GID, > > Which link local default GID? If you have more than one port or card, > then that is not a unique value. We assume that every RoCE port has an associated net device. Since a net device should have a unique MAC, it should have a unique IPv6 link local address and thus a unique GID. > >> which is guaranteed to exist. > > And in many cases, multiple times. > >> That's why I >> think we should: >> 1. Change the cma_src_addr and cma_dst_addr in cma_bind_loopback to be >> the default GID. >> 2. Store the associated ndev of this default GID as the bounded device. >> 3. In cma_resolve_loopback, get the MAC of this bounded device and >> store it as the DMAC. >> 4. In cma_resolve_iboe_route, don't try to do route resolve if the >> dGID matches the default GID. >> >> It's still not working though, but this is where I'm headed. What do you think? > > Let's punt this until later. It only effects the situation when you use > 127.0.0.1 as the address. If you use the local IP address of a specific > interface, you get the same loopback behavior, but no failures (and on > top of that instead of getting a random device to handle the loopback > transfer, you get a specific device of your choosing). To me, that > qualifies as a reasonable workaround. The 127.0.0.1 behavior has been > broken for a while (and I'm not sure it should have ever been relied > upon anyway), so I don't think we have to hold things up. > I totally agree that it's better to use the local IP address and not just get a random device by using 127.0.0.1. You could get a specific device by binding it, but then - use its local IP instead of 127.0.0.1. > -- > Doug Ledford <dledford@redhat.com> > GPG KeyID: 0E572FDD > > Matan -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/24/2015 9:57 AM, Matan Barak wrote: > I totally agree that it's better to use the local IP address and not > just get a random device by using 127.0.0.1. You could get a specific > device by binding it, but then - use its local IP instead of > 127.0.0.1. Yes guys, it might be better but the user can do that. However, loopback ala 127.0.0.1 is working in the rdma-cm since the 2.6.18 day one and was broken recently when Eth ports are around. Lets fix. Indeed, this is 4.5 and not 4.4 material. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index d2d5d00..c8a265c 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1265,15 +1265,17 @@ static bool cma_protocol_roce(const struct rdma_cm_id *id) return cma_protocol_roce_dev_port(device, port_num); } -static bool cma_match_net_dev(const struct rdma_id_private *id_priv, - const struct net_device *net_dev) +static bool cma_match_net_dev(const struct rdma_cm_id *id, + const struct net_device *net_dev, + u8 port_num) { - const struct rdma_addr *addr = &id_priv->id.route.addr; + const struct rdma_addr *addr = &id->route.addr; if (!net_dev) /* This request is an AF_IB request or a RoCE request */ - return addr->src_addr.ss_family == AF_IB || - cma_protocol_roce(&id_priv->id); + return (!id->port_num || id->port_num == port_num) && + (addr->src_addr.ss_family == AF_IB || + cma_protocol_roce_dev_port(id->device, port_num)); return !addr->dev_addr.bound_dev_if || (net_eq(dev_net(net_dev), addr->dev_addr.net) && @@ -1295,13 +1297,13 @@ static struct rdma_id_private *cma_find_listener( hlist_for_each_entry(id_priv, &bind_list->owners, node) { if (cma_match_private_data(id_priv, ib_event->private_data)) { if (id_priv->id.device == cm_id->device && - cma_match_net_dev(id_priv, net_dev)) + cma_match_net_dev(&id_priv->id, net_dev, req->port)) return id_priv; list_for_each_entry(id_priv_dev, &id_priv->listen_list, listen_list) { if (id_priv_dev->id.device == cm_id->device && - cma_match_net_dev(id_priv_dev, net_dev)) + cma_match_net_dev(&id_priv_dev->id, net_dev, req->port)) return id_priv_dev; } }
Previously, cma_match_net_dev called cma_protocol_roce which tried to verify that the IB device uses RoCE protocol. However, if rdma_id didn't have a bounded port, it used the first port of the device. In VPI systems, the first port might be an IB port while the second one could be an Ethernet port. This made requests for unbounded rdma_ids that come from the Ethernet port fail. Fixing this by passing the port of the request and checking this port of the device. Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE') Signed-off-by: Matan Barak <matanb@mellanox.com> --- Hi Doug, This patch fixes a bug in VPI systems, where the first port is configured as IB and the second one is configured as Ethernet. In this case, if the rdma_id isn't bounded to a port, cma_match_net_dev will try to verify that the first port is a RoCE port and fail. This is fixed by passing the port of the incoming request. Regards, Matan drivers/infiniband/core/cma.c | 16 +++++++++------- 1 files changed, 9 insertions(+), 7 deletions(-)