diff mbox series

RDMA/srpt: Filter out AGN bits

Message ID 20190814151507.140572-1-bvanassche@acm.org (mailing list archive)
State Not Applicable
Headers show
Series RDMA/srpt: Filter out AGN bits | expand

Commit Message

Bart Van Assche Aug. 14, 2019, 3:15 p.m. UTC
The ib_srpt driver derives its default service GUID from the node GUID
of the first encountered HCA. Since that service GUID is passed to
ib_cm_listen(), the AGN bits must not be set. Since the AGN bits can
be set in the node GUID of RoCE HCAs, filter these bits out. This
patch avoids that loading the ib_srpt driver fails as follows for the
hns driver:

  ib_srpt srpt_add_one(hns_0) failed.

Cc: oulijun <oulijun@huawei.com>
Reported-by: oulijun <oulijun@huawei.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Jason Gunthorpe Aug. 19, 2019, 12:21 p.m. UTC | #1
On Wed, Aug 14, 2019 at 08:15:07AM -0700, Bart Van Assche wrote:
> The ib_srpt driver derives its default service GUID from the node GUID
> of the first encountered HCA. Since that service GUID is passed to
> ib_cm_listen(), the AGN bits must not be set. Since the AGN bits can
> be set in the node GUID of RoCE HCAs, filter these bits out. This
> patch avoids that loading the ib_srpt driver fails as follows for the
> hns driver:
> 
>   ib_srpt srpt_add_one(hns_0) failed.
> 
> Cc: oulijun <oulijun@huawei.com>
> Reported-by: oulijun <oulijun@huawei.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>  drivers/infiniband/ulp/srpt/ib_srpt.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
> index e25c70a56be6..114bf8d6c82b 100644
> +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
> @@ -3109,7 +3109,8 @@ static void srpt_add_one(struct ib_device *device)
>  	srpt_use_srq(sdev, sdev->port[0].port_attrib.use_srq);
>  
>  	if (!srpt_service_guid)
> -		srpt_service_guid = be64_to_cpu(device->node_guid);
> +		srpt_service_guid = be64_to_cpu(device->node_guid) &
> +			~IB_SERVICE_ID_AGN_MASK;

This seems kind of sketchy, masking bits in the GUID is going to make
it non-unique.. Should we do this only for roce or something?

Hal, do you have any insight?

Jason
Hal Rosenstock Aug. 19, 2019, 1:40 p.m. UTC | #2
On 8/19/2019 8:21 AM, Jason Gunthorpe wrote:
> On Wed, Aug 14, 2019 at 08:15:07AM -0700, Bart Van Assche wrote:
>> The ib_srpt driver derives its default service GUID from the node GUID
>> of the first encountered HCA. Since that service GUID is passed to
>> ib_cm_listen(), the AGN bits must not be set. Since the AGN bits can
>> be set in the node GUID of RoCE HCAs, filter these bits out. This
>> patch avoids that loading the ib_srpt driver fails as follows for the
>> hns driver:
>>
>>   ib_srpt srpt_add_one(hns_0) failed.
>>
>> Cc: oulijun <oulijun@huawei.com>
>> Reported-by: oulijun <oulijun@huawei.com>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>>  drivers/infiniband/ulp/srpt/ib_srpt.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
>> index e25c70a56be6..114bf8d6c82b 100644
>> +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
>> @@ -3109,7 +3109,8 @@ static void srpt_add_one(struct ib_device *device)
>>  	srpt_use_srq(sdev, sdev->port[0].port_attrib.use_srq);
>>  
>>  	if (!srpt_service_guid)
>> -		srpt_service_guid = be64_to_cpu(device->node_guid);
>> +		srpt_service_guid = be64_to_cpu(device->node_guid) &
>> +			~IB_SERVICE_ID_AGN_MASK;
> 
> This seems kind of sketchy, masking bits in the GUID is going to make
> it non-unique.. Should we do this only for roce or something?
> 
> Hal, do you have any insight?

include/rdma/ib_cm.h:#define IB_SERVICE_ID_AGN_MASK
cpu_to_be64(0xFF00000000000000ULL)

IB_SERVICE_ID_AGN_MASK masks entire first byte of OUI which seems like
too much to me as it contains company related bits.

Would it work just masking the first 2 bits (local/global and X bits) ?

-- Hal

> Jason
>
Jason Gunthorpe Aug. 19, 2019, 1:46 p.m. UTC | #3
On Mon, Aug 19, 2019 at 09:40:24AM -0400, Hal Rosenstock wrote:
> On 8/19/2019 8:21 AM, Jason Gunthorpe wrote:
> > On Wed, Aug 14, 2019 at 08:15:07AM -0700, Bart Van Assche wrote:
> >> The ib_srpt driver derives its default service GUID from the node GUID
> >> of the first encountered HCA. Since that service GUID is passed to
> >> ib_cm_listen(), the AGN bits must not be set. Since the AGN bits can
> >> be set in the node GUID of RoCE HCAs, filter these bits out. This
> >> patch avoids that loading the ib_srpt driver fails as follows for the
> >> hns driver:

What is the actual problem anyhow? Is some roce GUID using the local
bits and overlapping with the IB_CM_ASSIGN_SERVICE_ID? 

Ie generated VF MAC or something?

> >>   ib_srpt srpt_add_one(hns_0) failed.
> >>
> >> Cc: oulijun <oulijun@huawei.com>
> >> Reported-by: oulijun <oulijun@huawei.com>
> >> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> >>  drivers/infiniband/ulp/srpt/ib_srpt.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
> >> index e25c70a56be6..114bf8d6c82b 100644
> >> +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
> >> @@ -3109,7 +3109,8 @@ static void srpt_add_one(struct ib_device *device)
> >>  	srpt_use_srq(sdev, sdev->port[0].port_attrib.use_srq);
> >>  
> >>  	if (!srpt_service_guid)
> >> -		srpt_service_guid = be64_to_cpu(device->node_guid);
> >> +		srpt_service_guid = be64_to_cpu(device->node_guid) &
> >> +			~IB_SERVICE_ID_AGN_MASK;
> > 
> > This seems kind of sketchy, masking bits in the GUID is going to make
> > it non-unique.. Should we do this only for roce or something?
> > 
> > Hal, do you have any insight?
> 
> include/rdma/ib_cm.h:#define IB_SERVICE_ID_AGN_MASK
> cpu_to_be64(0xFF00000000000000ULL)
> 
> IB_SERVICE_ID_AGN_MASK masks entire first byte of OUI which seems like
> too much to me as it contains company related bits.
> 
> Would it work just masking the first 2 bits (local/global and X bits) ?

Maybe if we see a local GUID it should generate a random global GUID
instead.

Jason
Hal Rosenstock Aug. 19, 2019, 1:49 p.m. UTC | #4
On 8/19/2019 9:46 AM, Jason Gunthorpe wrote:
> On Mon, Aug 19, 2019 at 09:40:24AM -0400, Hal Rosenstock wrote:
>> On 8/19/2019 8:21 AM, Jason Gunthorpe wrote:
>>> On Wed, Aug 14, 2019 at 08:15:07AM -0700, Bart Van Assche wrote:
>>>> The ib_srpt driver derives its default service GUID from the node GUID
>>>> of the first encountered HCA. Since that service GUID is passed to
>>>> ib_cm_listen(), the AGN bits must not be set. Since the AGN bits can
>>>> be set in the node GUID of RoCE HCAs, filter these bits out. This
>>>> patch avoids that loading the ib_srpt driver fails as follows for the
>>>> hns driver:
> 
> What is the actual problem anyhow? Is some roce GUID using the local
> bits and overlapping with the IB_CM_ASSIGN_SERVICE_ID? 
> 
> Ie generated VF MAC or something?
> 
>>>>   ib_srpt srpt_add_one(hns_0) failed.
>>>>
>>>> Cc: oulijun <oulijun@huawei.com>
>>>> Reported-by: oulijun <oulijun@huawei.com>
>>>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>>>>  drivers/infiniband/ulp/srpt/ib_srpt.c | 3 ++-
>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
>>>> index e25c70a56be6..114bf8d6c82b 100644
>>>> +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
>>>> @@ -3109,7 +3109,8 @@ static void srpt_add_one(struct ib_device *device)
>>>>  	srpt_use_srq(sdev, sdev->port[0].port_attrib.use_srq);
>>>>  
>>>>  	if (!srpt_service_guid)
>>>> -		srpt_service_guid = be64_to_cpu(device->node_guid);
>>>> +		srpt_service_guid = be64_to_cpu(device->node_guid) &
>>>> +			~IB_SERVICE_ID_AGN_MASK;
>>>
>>> This seems kind of sketchy, masking bits in the GUID is going to make
>>> it non-unique.. Should we do this only for roce or something?
>>>
>>> Hal, do you have any insight?
>>
>> include/rdma/ib_cm.h:#define IB_SERVICE_ID_AGN_MASK
>> cpu_to_be64(0xFF00000000000000ULL)
>>
>> IB_SERVICE_ID_AGN_MASK masks entire first byte of OUI which seems like
>> too much to me as it contains company related bits.
>>
>> Would it work just masking the first 2 bits (local/global and X bits) ?
> 
> Maybe if we see a local GUID it should generate a random global GUID
> instead.

I think the OpenIB OUI could be used for that if you want...

> Jason
>
Bart Van Assche Aug. 19, 2019, 3:11 p.m. UTC | #5
On 8/19/19 5:21 AM, Jason Gunthorpe wrote:
> On Wed, Aug 14, 2019 at 08:15:07AM -0700, Bart Van Assche wrote:
>> The ib_srpt driver derives its default service GUID from the node GUID
>> of the first encountered HCA. Since that service GUID is passed to
>> ib_cm_listen(), the AGN bits must not be set. Since the AGN bits can
>> be set in the node GUID of RoCE HCAs, filter these bits out. This
>> patch avoids that loading the ib_srpt driver fails as follows for the
>> hns driver:
>>
>>    ib_srpt srpt_add_one(hns_0) failed.
>>
>> Cc: oulijun <oulijun@huawei.com>
>> Reported-by: oulijun <oulijun@huawei.com>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>>   drivers/infiniband/ulp/srpt/ib_srpt.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
>> index e25c70a56be6..114bf8d6c82b 100644
>> +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
>> @@ -3109,7 +3109,8 @@ static void srpt_add_one(struct ib_device *device)
>>   	srpt_use_srq(sdev, sdev->port[0].port_attrib.use_srq);
>>   
>>   	if (!srpt_service_guid)
>> -		srpt_service_guid = be64_to_cpu(device->node_guid);
>> +		srpt_service_guid = be64_to_cpu(device->node_guid) &
>> +			~IB_SERVICE_ID_AGN_MASK;
> 
> This seems kind of sketchy, masking bits in the GUID is going to make
> it non-unique.. Should we do this only for roce or something?

Hi Jason and Hal,

The I/O controller GUID can be used in the srp_daemon configuration file 
for filtering purposes. The srp_daemon only supports IB networks.

In the IBTA spec I found the following about the I/O controller GUID: 
"An EUI-64 GUID used to uniquely identify the controller. This could be 
the same one as the Node/Port GUID if there is only one controller."

Does uniqueness of the I/O controller GUID only matter in InfiniBand 
networks or does it also matter in other RDMA networks?

How about using 0 as default value for the srpt_service_guid in RoCE 
networks?

Thanks,

Bart.
Jason Gunthorpe Aug. 19, 2019, 3:17 p.m. UTC | #6
On Mon, Aug 19, 2019 at 08:11:21AM -0700, Bart Van Assche wrote:
> On 8/19/19 5:21 AM, Jason Gunthorpe wrote:
> > On Wed, Aug 14, 2019 at 08:15:07AM -0700, Bart Van Assche wrote:
> > > The ib_srpt driver derives its default service GUID from the node GUID
> > > of the first encountered HCA. Since that service GUID is passed to
> > > ib_cm_listen(), the AGN bits must not be set. Since the AGN bits can
> > > be set in the node GUID of RoCE HCAs, filter these bits out. This
> > > patch avoids that loading the ib_srpt driver fails as follows for the
> > > hns driver:
> > > 
> > >    ib_srpt srpt_add_one(hns_0) failed.
> > > 
> > > Cc: oulijun <oulijun@huawei.com>
> > > Reported-by: oulijun <oulijun@huawei.com>
> > > Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> > >   drivers/infiniband/ulp/srpt/ib_srpt.c | 3 ++-
> > >   1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
> > > index e25c70a56be6..114bf8d6c82b 100644
> > > +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
> > > @@ -3109,7 +3109,8 @@ static void srpt_add_one(struct ib_device *device)
> > >   	srpt_use_srq(sdev, sdev->port[0].port_attrib.use_srq);
> > >   	if (!srpt_service_guid)
> > > -		srpt_service_guid = be64_to_cpu(device->node_guid);
> > > +		srpt_service_guid = be64_to_cpu(device->node_guid) &
> > > +			~IB_SERVICE_ID_AGN_MASK;
> > 
> > This seems kind of sketchy, masking bits in the GUID is going to make
> > it non-unique.. Should we do this only for roce or something?
> 
> Hi Jason and Hal,
> 
> The I/O controller GUID can be used in the srp_daemon configuration file for
> filtering purposes. The srp_daemon only supports IB networks.
> 
> In the IBTA spec I found the following about the I/O controller GUID: "An
> EUI-64 GUID used to uniquely identify the controller. This could be the same
> one as the Node/Port GUID if there is only one controller."

Yes, and the CM uses a magic GUID to indicate service ID selection,
and it looks like that magic GUID was *very* poorly choosen. It also
looks like it is stealth ABI

> Does uniqueness of the I/O controller GUID only matter in InfiniBand
> networks or does it also matter in other RDMA networks?
> 
> How about using 0 as default value for the srpt_service_guid in RoCE
> networks?

How does SRP connection management even work on RoCE?? The CM MADs
still carry a service_id? How do the sides exchange the service ID to
start the connection? Or is it ultimately overriden in the CM to use
an IP port based service ID?

Jason
Bart Van Assche Aug. 19, 2019, 3:45 p.m. UTC | #7
On 8/19/19 8:17 AM, Jason Gunthorpe wrote:
> On Mon, Aug 19, 2019 at 08:11:21AM -0700, Bart Van Assche wrote:
>> Does uniqueness of the I/O controller GUID only matter in InfiniBand
>> networks or does it also matter in other RDMA networks?
>>
>> How about using 0 as default value for the srpt_service_guid in RoCE
>> networks?
> 
> How does SRP connection management even work on RoCE?? The CM MADs
> still carry a service_id? How do the sides exchange the service ID to
> start the connection? Or is it ultimately overriden in the CM to use
> an IP port based service ID?

The ib_srpt kernel driver would have to set id_ext to a unique value if 
srpt_service_guid would be zero since the SRP initiator kernel driver 
uses the IOC GUID + id_ext + initiator_ext combination in its connection 
uniqueness check (srp_conn_unique()).

The ib_srp kernel driver supports both the IB/CM and the RDMA/CM. The 
srp_daemon software tells ib_srp to use the IB/CM. Software like 
blktests tells ib_srp to use the RDMA/CM. From 
https://github.com/osandov/blktests/blob/master/tests/srp/rc:

   srp_single_login \
     "id_ext=$ioc_guid,ioc_guid=$ioc_guid,dest=$dest,$add_param" \
     "$p/add_target"

The most important parameter in the login string is $dest. That is a 
string with the following format:

   <IP address>:<port number>.

Bart.
Jason Gunthorpe Aug. 19, 2019, 4:16 p.m. UTC | #8
On Mon, Aug 19, 2019 at 08:45:58AM -0700, Bart Van Assche wrote:
> On 8/19/19 8:17 AM, Jason Gunthorpe wrote:
> > On Mon, Aug 19, 2019 at 08:11:21AM -0700, Bart Van Assche wrote:
> > > Does uniqueness of the I/O controller GUID only matter in InfiniBand
> > > networks or does it also matter in other RDMA networks?
> > > 
> > > How about using 0 as default value for the srpt_service_guid in RoCE
> > > networks?
> > 
> > How does SRP connection management even work on RoCE?? The CM MADs
> > still carry a service_id? How do the sides exchange the service ID to
> > start the connection? Or is it ultimately overriden in the CM to use
> > an IP port based service ID?
> 
> The ib_srpt kernel driver would have to set id_ext to a unique value if
> srpt_service_guid would be zero since the SRP initiator kernel driver uses
> the IOC GUID + id_ext + initiator_ext combination in its connection
> uniqueness check (srp_conn_unique()).

Sounds like you should just generate something random for RDMA/CM mode ?

Still a bit confused how this is usable though if the initiating side
needs the service ID?

Jason
Bart Van Assche Aug. 19, 2019, 4:48 p.m. UTC | #9
On 8/19/19 9:16 AM, Jason Gunthorpe wrote:
> On Mon, Aug 19, 2019 at 08:45:58AM -0700, Bart Van Assche wrote:
>> On 8/19/19 8:17 AM, Jason Gunthorpe wrote:
>>> On Mon, Aug 19, 2019 at 08:11:21AM -0700, Bart Van Assche wrote:
>>>> Does uniqueness of the I/O controller GUID only matter in InfiniBand
>>>> networks or does it also matter in other RDMA networks?
>>>>
>>>> How about using 0 as default value for the srpt_service_guid in RoCE
>>>> networks?
>>>
>>> How does SRP connection management even work on RoCE?? The CM MADs
>>> still carry a service_id? How do the sides exchange the service ID to
>>> start the connection? Or is it ultimately overriden in the CM to use
>>> an IP port based service ID?
>>
>> The ib_srpt kernel driver would have to set id_ext to a unique value if
>> srpt_service_guid would be zero since the SRP initiator kernel driver uses
>> the IOC GUID + id_ext + initiator_ext combination in its connection
>> uniqueness check (srp_conn_unique()).
> 
> Sounds like you should just generate something random for RDMA/CM mode ?
> 
> Still a bit confused how this is usable though if the initiating side
> needs the service ID?

Hi Jason,

When I read Lijun Ou's e-mails for the first time I was assuming that my 
patch had been tested on top of a recent kernel. After having reread 
these e-mails I think my patch had been tested on top of kernel v4.14 
and is not necessary for more recent kernels. So I think we can drop the 
patch at the start of this e-mail thread.

Bart.
Doug Ledford Aug. 20, 2019, 5:08 p.m. UTC | #10
On Mon, 2019-08-19 at 09:48 -0700, Bart Van Assche wrote:
> Hi Jason,
> 
> When I read Lijun Ou's e-mails for the first time I was assuming that
> my 
> patch had been tested on top of a recent kernel. After having reread 
> these e-mails I think my patch had been tested on top of kernel v4.14 
> and is not necessary for more recent kernels. So I think we can drop
> the 
> patch at the start of this e-mail thread.

Hi Bart,

Thanks for the heads up.  I'll drop this out of patchworks and we can
revisit things if that turns out to be incorrect.
diff mbox series

Patch

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index e25c70a56be6..114bf8d6c82b 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -3109,7 +3109,8 @@  static void srpt_add_one(struct ib_device *device)
 	srpt_use_srq(sdev, sdev->port[0].port_attrib.use_srq);
 
 	if (!srpt_service_guid)
-		srpt_service_guid = be64_to_cpu(device->node_guid);
+		srpt_service_guid = be64_to_cpu(device->node_guid) &
+			~IB_SERVICE_ID_AGN_MASK;
 
 	if (rdma_port_get_link_layer(device, 1) == IB_LINK_LAYER_INFINIBAND)
 		sdev->cm_id = ib_create_cm_id(device, srpt_cm_handler, sdev);