diff mbox

[3/3] IB/mlx4: Report checksum offload cap when query device

Message ID 68bea4df2c89e5457aaaccf914756bc309d742a7.1442413048.git.bodong@mellanox.com (mailing list archive)
State Changes Requested
Headers show

Commit Message

Bodong Wang Sept. 16, 2015, 3:56 p.m. UTC
Signed-off-by: Bodong Wang <bodong@mellanox.com>
---
 drivers/infiniband/hw/mlx4/main.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Doug Ledford Sept. 18, 2015, 2:46 a.m. UTC | #1
On 09/16/2015 11:56 AM, Bodong Wang wrote:
> Signed-off-by: Bodong Wang <bodong@mellanox.com>
> ---
>  drivers/infiniband/hw/mlx4/main.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
> index 8be6db8..a70ca6a 100644
> --- a/drivers/infiniband/hw/mlx4/main.c
> +++ b/drivers/infiniband/hw/mlx4/main.c
> @@ -217,6 +217,9 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
>  		props->device_cap_flags |= IB_DEVICE_MANAGED_FLOW_STEERING;
>  	}
>  
> +	props->csum_cap.eth_csum_cap |= IB_CSUM_SUPPORT_RAW;
> +	props->csum_cap.ib_csum_cap |= IB_CSUM_SUPPORT_UD;
> +
>  	props->vendor_id	   = be32_to_cpup((__be32 *) (out_mad->data + 36)) &
>  		0xffffff;
>  	props->vendor_part_id	   = dev->dev->persist->pdev->device;
> 

This patch highlights something I didn't think about on the previous
patch.  Why separate eth/ib if you have per QP flags?  The QP denotes
the ib/eth relationship without the need to separate it into two
different caps.  In other words, you can never have an IB qp type on eth
because the only eth QP types we support other than RAW are all RDMA and
not IP.  Really, there's enough spare bits in ib_device_cap_flags that
you could do away with the new caps entirely.  Right now, we support UD
(which we already have a flag for), we can add two flags (for RAW and
RC) and that should cover all of the foreseeable options as that would
allow us to extend IP CSUM support to cover connected mode and cover all
of the current options.  I don't see us doing IP traffic in any other
situation, so I thing that should suffice.  Bits 25 and 26 could be used
for the two new bits.  Then you just need to extend the bits to user space.
Or Gerlitz Sept. 21, 2015, 9:41 p.m. UTC | #2
On Fri, Sep 18, 2015 at 5:46 AM, Doug Ledford <dledford@redhat.com> wrote:
> On 09/16/2015 11:56 AM, Bodong Wang wrote:
>> Signed-off-by: Bodong Wang <bodong@mellanox.com>
>> ---
>>  drivers/infiniband/hw/mlx4/main.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
>> index 8be6db8..a70ca6a 100644
>> --- a/drivers/infiniband/hw/mlx4/main.c
>> +++ b/drivers/infiniband/hw/mlx4/main.c
>> @@ -217,6 +217,9 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
>>               props->device_cap_flags |= IB_DEVICE_MANAGED_FLOW_STEERING;
>>       }
>>
>> +     props->csum_cap.eth_csum_cap |= IB_CSUM_SUPPORT_RAW;
>> +     props->csum_cap.ib_csum_cap |= IB_CSUM_SUPPORT_UD;
>> +

> This patch highlights something I didn't think about on the previous
> patch.  Why separate eth/ib if you have per QP flags?  The QP denotes
> the ib/eth relationship without the need to separate it into two
> different caps.  In other words, you can never have an IB qp type on eth
> because the only eth QP types we support other than RAW are all RDMA and
> not IP.  Really, there's enough spare bits in ib_device_cap_flags that
> you could do away with the new caps entirely.  Right now, we support UD
> (which we already have a flag for), we can add two flags (for RAW and
> RC) and that should cover all of the foreseeable options as that would
> allow us to extend IP CSUM support to cover connected mode and cover all
> of the current options.  I don't see us doing IP traffic in any other
> situation, so I thing that should suffice.  Bits 25 and 26 could be used
> for the two new bits.  Then you just need to extend the bits to user space.

Doug,

The vendor may support the offload for a certain QP type only over
certain link. E.g mlx4 support checksum for UD QPs only over IB but
not over Eth,  checksum for RC QPs isn't supported, and RAW_PACKET QPs
are available anyway only for Eth links. But if this is what you think
needs to be done, I guess we can do that.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Doug Ledford Sept. 21, 2015, 9:59 p.m. UTC | #3
On 09/21/2015 05:41 PM, Or Gerlitz wrote:
> On Fri, Sep 18, 2015 at 5:46 AM, Doug Ledford <dledford@redhat.com> wrote:
>> On 09/16/2015 11:56 AM, Bodong Wang wrote:
>>> Signed-off-by: Bodong Wang <bodong@mellanox.com>
>>> ---
>>>  drivers/infiniband/hw/mlx4/main.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
>>> index 8be6db8..a70ca6a 100644
>>> --- a/drivers/infiniband/hw/mlx4/main.c
>>> +++ b/drivers/infiniband/hw/mlx4/main.c
>>> @@ -217,6 +217,9 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
>>>               props->device_cap_flags |= IB_DEVICE_MANAGED_FLOW_STEERING;
>>>       }
>>>
>>> +     props->csum_cap.eth_csum_cap |= IB_CSUM_SUPPORT_RAW;
>>> +     props->csum_cap.ib_csum_cap |= IB_CSUM_SUPPORT_UD;
>>> +
> 
>> This patch highlights something I didn't think about on the previous
>> patch.  Why separate eth/ib if you have per QP flags?  The QP denotes
>> the ib/eth relationship without the need to separate it into two
>> different caps.  In other words, you can never have an IB qp type on eth
>> because the only eth QP types we support other than RAW are all RDMA and
>> not IP.  Really, there's enough spare bits in ib_device_cap_flags that
>> you could do away with the new caps entirely.  Right now, we support UD
>> (which we already have a flag for), we can add two flags (for RAW and
>> RC) and that should cover all of the foreseeable options as that would
>> allow us to extend IP CSUM support to cover connected mode and cover all
>> of the current options.  I don't see us doing IP traffic in any other
>> situation, so I thing that should suffice.  Bits 25 and 26 could be used
>> for the two new bits.  Then you just need to extend the bits to user space.
> 
> Doug,
> 
> The vendor may support the offload for a certain QP type only over
> certain link.

Exactly my point.

> E.g mlx4 support checksum for UD QPs only over IB but
> not over Eth,

As it should be.  We are, after all, talking about IP embedded in UD
RDMA.  Over IB that makes sense.  Over Eth it makes no sense.  If you
are going to do IP on Eth, then just do IP, don't do IP in UD.  I see no
reason to support this construct.

>  checksum for RC QPs isn't supported,

But could be for IB.  The fact that it isn't is why there is an ongoing
effort to work around this issue.

> and RAW_PACKET QPs
> are available anyway only for Eth links.

Correct.  And this will never be available on IB.

> But if this is what you think
> needs to be done, I guess we can do that.

Here's the only matrix of IP checksumming that makes sense:

1) UD over IB (because it is one of the supported IPoIB types)
2) RC over IB (same as above)
3) Raw ETH over Eth (because IP over Eth makes sense and is a common
type of packet to send on Raw Eth, but Raw Eth will never be sent on IB
as it isn't supported there at all)

Anything else would require adding more Raw ETH QPs elsewhere, or
expanding the IPoIB spec to include more connection types.
Or Gerlitz Sept. 22, 2015, 4:45 a.m. UTC | #4
On Tue, Sep 22, 2015 at 12:59 AM, Doug Ledford <dledford@redhat.com> wrote:

> Here's the only matrix of IP checksumming that makes sense:
>
> 1) UD over IB (because it is one of the supported IPoIB types)
> 2) RC over IB (same as above)
> 3) Raw ETH over Eth (because IP over Eth makes sense and is a common
> type of packet to send on Raw Eth, but Raw Eth will never be sent on IB
> as it isn't supported there at all)
>
> Anything else would require adding more Raw ETH QPs elsewhere, or
> expanding the IPoIB spec to include more connection types.

OK, I'm good with that, so you want two news bits of device caps and
not the too fancy
matrix of checksum-caps-for-qp-type-and-link-type.

>> Really, there's enough spare bits in ib_device_cap_flags that
>> you could do away with the new caps entirely.

Yes, Bodong, please go ahead and use bits 26,27 of
include/rdma/ib_verbs.h :: enum ib_device_cap_flags
for the two new caps Doug asked.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 8be6db8..a70ca6a 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -217,6 +217,9 @@  static int mlx4_ib_query_device(struct ib_device *ibdev,
 		props->device_cap_flags |= IB_DEVICE_MANAGED_FLOW_STEERING;
 	}
 
+	props->csum_cap.eth_csum_cap |= IB_CSUM_SUPPORT_RAW;
+	props->csum_cap.ib_csum_cap |= IB_CSUM_SUPPORT_UD;
+
 	props->vendor_id	   = be32_to_cpup((__be32 *) (out_mad->data + 36)) &
 		0xffffff;
 	props->vendor_part_id	   = dev->dev->persist->pdev->device;