diff mbox

[RFC,08/11] IB/Verbs: Use management helper has_iwarp() for, iwarp-check

Message ID 55157B98.1060103@profitbricks.com (mailing list archive)
State New, archived
Headers show

Commit Message

Michael Wang March 27, 2015, 3:47 p.m. UTC
Introduce helper has_iwarp() to help us check if an IB device
support IWARP protocol.

Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Michael Wang <yun.wang@profitbricks.com>
---
 include/rdma/ib_verbs.h                 | 13 +++++++++++++
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |  2 +-
 2 files changed, 14 insertions(+), 1 deletion(-)

Comments

Jason Gunthorpe March 27, 2015, 4:13 p.m. UTC | #1
On Fri, Mar 27, 2015 at 04:47:36PM +0100, Michael Wang wrote:
> 
> Introduce helper has_iwarp() to help us check if an IB device
> support IWARP protocol.

Should probably be !has_rdma_read_sges()

True if the device can handle more than one SGE entry on a RDMA READ
work request.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Wang March 27, 2015, 4:17 p.m. UTC | #2
On 03/27/2015 05:13 PM, Jason Gunthorpe wrote:
> On Fri, Mar 27, 2015 at 04:47:36PM +0100, Michael Wang wrote:
>> Introduce helper has_iwarp() to help us check if an IB device
>> support IWARP protocol.
> Should probably be !has_rdma_read_sges()
>
> True if the device can handle more than one SGE entry on a RDMA READ
> work request.
Sounds better :-) will change it in next version.

Regards,
Michael Wang

>
> Jason

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ira Weiny March 27, 2015, 5:16 p.m. UTC | #3
On Fri, Mar 27, 2015 at 10:13:19AM -0600, Jason Gunthorpe wrote:
> On Fri, Mar 27, 2015 at 04:47:36PM +0100, Michael Wang wrote:
> > 
> > Introduce helper has_iwarp() to help us check if an IB device
> > support IWARP protocol.
> 
> Should probably be !has_rdma_read_sges()
> 
> True if the device can handle more than one SGE entry on a RDMA READ
> work request.

Isn't this value already provided by the query_device verb?

The verbs spec states the Query HCA contains the:

"Maximum number of scatter/gather entries per Work Request supported by the
HCA."

-- Ira

> 
> Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe March 27, 2015, 5:29 p.m. UTC | #4
On Fri, Mar 27, 2015 at 01:16:31PM -0400, ira.weiny wrote:
> On Fri, Mar 27, 2015 at 10:13:19AM -0600, Jason Gunthorpe wrote:
> > On Fri, Mar 27, 2015 at 04:47:36PM +0100, Michael Wang wrote:
> > > 
> > > Introduce helper has_iwarp() to help us check if an IB device
> > > support IWARP protocol.
> > 
> > Should probably be !has_rdma_read_sges()
> > 
> > True if the device can handle more than one SGE entry on a RDMA READ
> > work request.
> 
> Isn't this value already provided by the query_device verb?
> 
> The verbs spec states the Query HCA contains the:
> 
> "Maximum number of scatter/gather entries per Work Request supported by the
> HCA."

http://www.spinics.net/lists/linux-rdma/msg22565.html

''Unlike IB, the iWARP protocol only allows 1 target/sink SGE in an
rdma read''

It is one of those annoying verbs is different on iWarp things.

So the max sge in the query_verbs must only apply to send/rdma write
on iWarp?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Wang March 27, 2015, 5:35 p.m. UTC | #5
On Fri, Mar 27, 2015 at 6:16 PM, ira.weiny <ira.weiny@intel.com> wrote:
> On Fri, Mar 27, 2015 at 10:13:19AM -0600, Jason Gunthorpe wrote:
>> On Fri, Mar 27, 2015 at 04:47:36PM +0100, Michael Wang wrote:
>> >
>> > Introduce helper has_iwarp() to help us check if an IB device
>> > support IWARP protocol.
>>
>> Should probably be !has_rdma_read_sges()
>>
>> True if the device can handle more than one SGE entry on a RDMA READ
>> work request.
>
> Isn't this value already provided by the query_device verb?
>
> The verbs spec states the Query HCA contains the:
>
> "Maximum number of scatter/gather entries per Work Request supported by the
> HCA."

I'm not sure but may be query_device() is just too expensive for this path?
I need some investigation here too.

Regards,
Michael Wang

>
> -- Ira
>
>>
>> Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Wang March 30, 2015, 3:10 p.m. UTC | #6
On 03/27/2015 06:29 PM, Jason Gunthorpe wrote:
> On Fri, Mar 27, 2015 at 01:16:31PM -0400, ira.weiny wrote:
>> [snip]
> http://www.spinics.net/lists/linux-rdma/msg22565.html
>
> ''Unlike IB, the iWARP protocol only allows 1 target/sink SGE in an
> rdma read''
>
> It is one of those annoying verbs is different on iWarp things.
>
> So the max sge in the query_verbs must only apply to send/rdma write
> on iWarp?
I found that actually we don't have to touch this one which
only used by HW driver currently.

I think we can leave these scenes there in device driver, since
vendor could have different way to classify the usage of transfer
and link layer.

Our purpose is to introduce IB core management approach, which
may not be applicable on device level, maybe we can just pass them :-)

Regards,
Michael Wang


>
> Jason

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Doug Ledford March 30, 2015, 4:13 p.m. UTC | #7
On Fri, 2015-03-27 at 16:47 +0100, Michael Wang wrote:
> Introduce helper has_iwarp() to help us check if an IB device
> support IWARP protocol.

This is a needless redirection.  Just stick with the original
rdma_transport_is_iwarp().

> Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
> Cc: Doug Ledford <dledford@redhat.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Cc: Sean Hefty <sean.hefty@intel.com>
> Signed-off-by: Michael Wang <yun.wang@profitbricks.com>
> ---
>  include/rdma/ib_verbs.h                 | 13 +++++++++++++
>  net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |  2 +-
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index e796104..0ef9cd7 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1836,6 +1836,19 @@ static inline int has_mcast(struct ib_device *device)
>  }
>  
>  /**
> + * has_iwarp - Check if a device support IWARP protocol.
> + *
> + * @device: Device to be checked
> + *
> + * Return 0 when a device has none port to support
> + * IWARP protocol.
> + */
> +static inline int has_iwarp(struct ib_device *device)
> +{
> +    return rdma_transport_is_iwarp(device);
> +}
> +
> +/**
>   * cap_smi - Check if the port of device has the capability
>   * Subnet Management Interface.
>   *
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> index a7b5891..48aeb5e 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> @@ -118,7 +118,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
>  
>  static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
>  {
> -    if (rdma_transport_is_iwarp(xprt->sc_cm_id->device))
> +    if (has_iwarp(xprt->sc_cm_id->device))
>          return 1;
>      else
>          return min_t(int, sge_count, xprt->sc_max_sge);
Michael Wang March 30, 2015, 4:21 p.m. UTC | #8
On 03/30/2015 06:13 PM, Doug Ledford wrote:
> On Fri, 2015-03-27 at 16:47 +0100, Michael Wang wrote:
>> Introduce helper has_iwarp() to help us check if an IB device
>> support IWARP protocol.
> This is a needless redirection.  Just stick with the original
> rdma_transport_is_iwarp().

Agree, will leave it there :-)

Regards,
Michael Wang

>
>> Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
>> Cc: Doug Ledford <dledford@redhat.com>
>> Cc: Ira Weiny <ira.weiny@intel.com>
>> Cc: Sean Hefty <sean.hefty@intel.com>
>> Signed-off-by: Michael Wang <yun.wang@profitbricks.com>
>> ---
>>  include/rdma/ib_verbs.h                 | 13 +++++++++++++
>>  net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |  2 +-
>>  2 files changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
>> index e796104..0ef9cd7 100644
>> --- a/include/rdma/ib_verbs.h
>> +++ b/include/rdma/ib_verbs.h
>> @@ -1836,6 +1836,19 @@ static inline int has_mcast(struct ib_device *device)
>>  }
>>  
>>  /**
>> + * has_iwarp - Check if a device support IWARP protocol.
>> + *
>> + * @device: Device to be checked
>> + *
>> + * Return 0 when a device has none port to support
>> + * IWARP protocol.
>> + */
>> +static inline int has_iwarp(struct ib_device *device)
>> +{
>> +    return rdma_transport_is_iwarp(device);
>> +}
>> +
>> +/**
>>   * cap_smi - Check if the port of device has the capability
>>   * Subnet Management Interface.
>>   *
>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>> index a7b5891..48aeb5e 100644
>> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>> @@ -118,7 +118,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
>>  
>>  static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
>>  {
>> -    if (rdma_transport_is_iwarp(xprt->sc_cm_id->device))
>> +    if (has_iwarp(xprt->sc_cm_id->device))
>>          return 1;
>>      else
>>          return min_t(int, sge_count, xprt->sc_max_sge);
>

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe March 30, 2015, 10:35 p.m. UTC | #9
On Mon, Mar 30, 2015 at 05:10:12PM +0200, Michael Wang wrote:
> I found that actually we don't have to touch this one which
> only used by HW driver currently.

I'm having a hard time understanding this, the code in question was in

net/sunrpc/xprtrdma/svc_rdma_recvfrom.c

Which is the NFS ULP, not a device driver.

Regards,
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Wang March 31, 2015, 7:39 a.m. UTC | #10
On 03/31/2015 12:35 AM, Jason Gunthorpe wrote:
> On Mon, Mar 30, 2015 at 05:10:12PM +0200, Michael Wang wrote:
>> I found that actually we don't have to touch this one which
>> only used by HW driver currently.
> I'm having a hard time understanding this, the code in question was in
>
> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>
> Which is the NFS ULP, not a device driver.

I'm not familiar with this part too :-P but yes, it looks like an ulp
to support NFS.

Actually I'm thinking about Doug's idea to use rdma_transport_is_XX()
instead of the current basic helper, thus may be use rdma_transport_is_iwarp()
in here could be better, since it's actually a feature of iwarp tech
that RDMA Read only support one scatter-gather entry.

But I need more investigation on that idea, there are some part
especially inside device driver I'm not very clear, things could be more
easier if the semantic there is compatible with Doug's proposal ;-)

Regards,
Michael Wang

>
> Regards,
> Jason

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tom Talpey March 31, 2015, 11:19 a.m. UTC | #11
On 3/31/2015 3:39 AM, Michael Wang wrote:
> On 03/31/2015 12:35 AM, Jason Gunthorpe wrote:
>> On Mon, Mar 30, 2015 at 05:10:12PM +0200, Michael Wang wrote:
>>> I found that actually we don't have to touch this one which
>>> only used by HW driver currently.
>> I'm having a hard time understanding this, the code in question was in
>>
>> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>>
>> Which is the NFS ULP, not a device driver.
>
> I'm not familiar with this part too :-P but yes, it looks like an ulp
> to support NFS.

It's the NFS server itself.

>
> Actually I'm thinking about Doug's idea to use rdma_transport_is_XX()
> instead of the current basic helper, thus may be use rdma_transport_is_iwarp()
> in here could be better, since it's actually a feature of iwarp tech
> that RDMA Read only support one scatter-gather entry.

No, you should expose an attribute to surface the maximum length of
the remote gather list, which varies by adapter as well as protocol.
The fact that iWARP is different from IB is not relevant, and conflates
unrelated properties.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Wang March 31, 2015, 11:41 a.m. UTC | #12
Hi, Tom

Thanks for the comments :-)

On 03/31/2015 01:19 PM, Tom Talpey wrote:
> [snip]
>
>>
>> Actually I'm thinking about Doug's idea to use rdma_transport_is_XX()
>> instead of the current basic helper, thus may be use rdma_transport_is_iwarp()
>> in here could be better, since it's actually a feature of iwarp tech
>> that RDMA Read only support one scatter-gather entry.
>
> No, you should expose an attribute to surface the maximum length of
> the remote gather list, which varies by adapter as well as protocol.
> The fact that iWARP is different from IB is not relevant, and conflates
> unrelated properties.

To be confirmed, so your point is that the max-read-sges will be different
even the transport is the same IWRAP, and that depends on the capability
of adapter, correct?

I currently only find this one place where infer max-read-sges from
transport type, it looks more like a special case to me rather than a generic
method we could exposed... and also  not very related with IB management
helper concept IMHO.

Regards,
Michael Wang

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tom Talpey March 31, 2015, 1:56 p.m. UTC | #13
On 3/31/2015 7:41 AM, Michael Wang wrote:
> Hi, Tom
>
> Thanks for the comments :-)
>
> On 03/31/2015 01:19 PM, Tom Talpey wrote:

[oops - repeating my reply with full cc's]

>> [snip]
>>
>>>
>>> Actually I'm thinking about Doug's idea to use rdma_transport_is_XX()
>>> instead of the current basic helper, thus may be use rdma_transport_is_iwarp()
>>> in here could be better, since it's actually a feature of iwarp tech
>>> that RDMA Read only support one scatter-gather entry.
>>
>> No, you should expose an attribute to surface the maximum length of
>> the remote gather list, which varies by adapter as well as protocol.
>> The fact that iWARP is different from IB is not relevant, and conflates
>> unrelated properties.
>
> To be confirmed, so your point is that the max-read-sges will be different
> even the transport is the same IWRAP, and that depends on the capability
> of adapter, correct?

Yes, in fact the iWARP protocol does not preclude multiple read SGEs,
even though most iWARP implementations have chosen to support just one.

Even for multi-SGE-capable adapters, there is a limit of SGL size, based
on the adapter's work request format and other factors. So my argument
is that upper layers can and should query that, not make a blanket
decision based on protocol type.

>
> I currently only find this one place where infer max-read-sges from
> transport type, it looks more like a special case to me rather than a generic
> method we could exposed... and also  not very related with IB management
> helper concept IMHO.

It is most certainly not a special case, but you could decide to
introduce it in many ways. I'm not commenting on that.

My main concern is that you do not introduce a new and clumsy "is iWARP"
rule as an adapter-specific API requirement to expose the RDMA Read SGE
behavior. That's what your initial message seemed to imply?


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Wang March 31, 2015, 1:58 p.m. UTC | #14
On 03/31/2015 03:42 PM, Tom Talpey wrote:

> On 3/31/2015 7:41 AM, Michael Wang wrote:
>> [snip]
> Yes, in fact the iWARP protocol does not preclude multiple read SGEs,
> even though most iWARP implementations have chosen to support just one.
>
> Even for multi-SGE-capable adapters, there is a limit of SGL size, based
> on the adapter's work request format and other factors. So my argument
> is that upper layers can and should query that, not make a blanket
> decision based on protocol type.

Thanks for the explanation  Sounds like some new callback on device level
like query_device() to acquire the right info.

>> I currently only find this one place where infer max-read-sges from
>> transport type, it looks more like a special case to me rather than a generic
>> method we could exposed... and also  not very related with IB management
>> helper concept IMHO.
> It is most certainly not a special case, but you could decide to
> introduce it in many ways. I'm not commenting on that.
>
> My main concern is that you do not introduce a new and clumsy "is iWARP"
> rule as an adapter-specific API requirement to expose the RDMA Read SGE
> behavior. That's what your initial message seemed to imply?

Yeah I planed to just use rdma_transport_iwarp() to replace the check, it's
actually meaningless but just refine, frankly speaking I would prefer some
driver developer to work on that part, at this point I prefer to focus on
introducing the management helpers firstly 

Maybe we could mark it as a TODO temporarily, if later we found more
scenes using similar logical, we can collect them and do some reform 

Regards,
Michael Wang


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe March 31, 2015, 11:20 p.m. UTC | #15
On Mon, Mar 30, 2015 at 12:13:00PM -0400, Doug Ledford wrote:
> On Fri, 2015-03-27 at 16:47 +0100, Michael Wang wrote:
> > Introduce helper has_iwarp() to help us check if an IB device
> > support IWARP protocol.
> 
> This is a needless redirection.  Just stick with the original
> rdma_transport_is_iwarp().

Sticking with the original isn't really the point.

The point here, is to document what Tom was talking about - some ports
can only support one RDMA READ SGL entry, even though they support
multiple RDMA WRITE SGL entries. Today the query API assumes
READ/WRITE/SEND are symmetrical.

has_rdma_read_sgl() is a good way to document that for now, and is a big
flashing marker where something might need to be fixed in the
future.

This tells everyone reading the code and the API that when working
with RDMA READ they need to be aware of this limitation.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index e796104..0ef9cd7 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1836,6 +1836,19 @@  static inline int has_mcast(struct ib_device *device)
 }
 
 /**
+ * has_iwarp - Check if a device support IWARP protocol.
+ *
+ * @device: Device to be checked
+ *
+ * Return 0 when a device has none port to support
+ * IWARP protocol.
+ */
+static inline int has_iwarp(struct ib_device *device)
+{
+    return rdma_transport_is_iwarp(device);
+}
+
+/**
  * cap_smi - Check if the port of device has the capability
  * Subnet Management Interface.
  *
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index a7b5891..48aeb5e 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -118,7 +118,7 @@  static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
 
 static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
 {
-    if (rdma_transport_is_iwarp(xprt->sc_cm_id->device))
+    if (has_iwarp(xprt->sc_cm_id->device))
         return 1;
     else
         return min_t(int, sge_count, xprt->sc_max_sge);