[rdma-next,01/13] RDMA: Add EFA related definitions

Message ID: 1543925069-8838-2-git-send-email-galpress@amazon.com (mailing list archive)
State: Changes Requested
Series: Elastic Fabric Adapter (EFA) driver

Commit Message

Gal Pressman Dec. 4, 2018, 12:04 p.m. UTC
Add EFA node type, transport type and protocol type to core code.
EFA relies on an underlying implementation similar to reliable datagram, so
we also define a new QP type named Scalable Reliable Datagram (SRD).

EFA reliable datagram transport provides reliable out-of-order delivery,
transparently utilizing multiple network paths to reduce network tail
latency. Its interface is similar to UD, in particular it supports
message size up to MTU, with error handling extended to support reliable
communication.

Signed-off-by: Gal Pressman <galpress@amazon.com>
---
 drivers/infiniband/core/verbs.c | 2 ++
 include/rdma/ib_verbs.h         | 9 +++++++--
 2 files changed, 9 insertions(+), 2 deletions(-)

Comments

Leon Romanovsky Dec. 4, 2018, 12:44 p.m. UTC | #1
On Tue, Dec 04, 2018 at 02:04:17PM +0200, Gal Pressman wrote:
> Add EFA node type, transport type and protocol type to core code.
> EFA relies on an underlying implementation similar to reliable datagram, so
> we also define a new QP type named Scalable Reliable Datagram (SRD).
>
> EFA reliable datagram transport provides reliable out-of-order delivery,
> transparently utilizing multiple network paths to reduce network tail
> latency. Its interface is similar to UD, in particular it supports
> message size up to MTU, with error handling extended to support reliable
> communication.
>
> Signed-off-by: Gal Pressman <galpress@amazon.com>
> ---
>  drivers/infiniband/core/verbs.c | 2 ++
>  include/rdma/ib_verbs.h         | 9 +++++++--
>  2 files changed, 9 insertions(+), 2 deletions(-)
>

Do you have any specification/documentation for that?

I'm afraid that awesome press release [1] is not enough.

[1]
https://aws.amazon.com/about-aws/whats-new/2018/11/introducing-elastic-fabric-adapter/

Thanks
Gal Pressman Dec. 4, 2018, 2:38 p.m. UTC | #2
On 04-Dec-18 14:44, Leon Romanovsky wrote:
> On Tue, Dec 04, 2018 at 02:04:17PM +0200, Gal Pressman wrote:
>> Add EFA node type, transport type and protocol type to core code.
>> EFA relies on an underlying implementation similar to reliable datagram, so
>> we also define a new QP type named Scalable Reliable Datagram (SRD).
>>
>> EFA reliable datagram transport provides reliable out-of-order delivery,
>> transparently utilizing multiple network paths to reduce network tail
>> latency. Its interface is similar to UD, in particular it supports
>> message size up to MTU, with error handling extended to support reliable
>> communication.
>>
>> Signed-off-by: Gal Pressman <galpress@amazon.com>
>> ---
>>  drivers/infiniband/core/verbs.c | 2 ++
>>  include/rdma/ib_verbs.h         | 9 +++++++--
>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>
> 
> Do you have any specification/documentation for that?
> 
> I'm afraid that awesome press release [1] is not enough.
> 
> [1]
> https://aws.amazon.com/about-aws/whats-new/2018/11/introducing-elastic-fabric-adapter/
> 
> Thanks
> 

Hey Leon,
The commit message (and part of the cover letter) contains a description of SRD.
It is similar to UD in most ways with the addition of reliable out-of-order
delivery. Work request usage is the same as for UD, with a minor difference in
the completion status codes.

The exact specification is internal to AWS, but if you have any more questions
I'll be more than happy to answer.
Leon Romanovsky Dec. 4, 2018, 3:45 p.m. UTC | #3
On Tue, Dec 04, 2018 at 04:38:52PM +0200, Gal Pressman wrote:
> On 04-Dec-18 14:44, Leon Romanovsky wrote:
> > On Tue, Dec 04, 2018 at 02:04:17PM +0200, Gal Pressman wrote:
> >> Add EFA node type, transport type and protocol type to core code.
> >> EFA relies on an underlying implementation similar to reliable datagram, so
> >> we also define a new QP type named Scalable Reliable Datagram (SRD).
> >>
> >> EFA reliable datagram transport provides reliable out-of-order delivery,
> >> transparently utilizing multiple network paths to reduce network tail
> >> latency. Its interface is similar to UD, in particular it supports
> >> message size up to MTU, with error handling extended to support reliable
> >> communication.
> >>
> >> Signed-off-by: Gal Pressman <galpress@amazon.com>
> >> ---
> >>  drivers/infiniband/core/verbs.c | 2 ++
> >>  include/rdma/ib_verbs.h         | 9 +++++++--
> >>  2 files changed, 9 insertions(+), 2 deletions(-)
> >>
> >
> > Do you have any specification/documentation for that?
> >
> > I'm afraid that awesome press release [1] is not enough.
> >
> > [1]
> > https://aws.amazon.com/about-aws/whats-new/2018/11/introducing-elastic-fabric-adapter/
> >
> > Thanks
> >
>
> Hey Leon,
> The commit message (and part of the cover letter) contains a description of SRD.
> It is similar to UD in most ways with the addition of reliable out-of-order
> delivery. Work request usage is the same as for UD, with a minor difference in
> the completion status codes.
>
> The exact specification is internal to AWS, but if you have any more questions
> I'll be more than happy to answer.

All structures which you extended are backed by IBTA, and everything
that is "internal to ..." is supposed to be implemented through the various
extensions which we already have in RDMA/core. For example, in the case of
SRD, we have IB_QPT_DRIVER exactly for that.

Otherwise, please provide the full semantics of this SRD type: out-of-order
semantics, error handling, state diagram, retransmission, etc.

Thanks
Gal Pressman Dec. 5, 2018, 8:56 a.m. UTC | #4
On 04-Dec-18 17:45, Leon Romanovsky wrote:
> On Tue, Dec 04, 2018 at 04:38:52PM +0200, Gal Pressman wrote:
>> On 04-Dec-18 14:44, Leon Romanovsky wrote:
>>> On Tue, Dec 04, 2018 at 02:04:17PM +0200, Gal Pressman wrote:
>>>> Add EFA node type, transport type and protocol type to core code.
>>>> EFA relies on an underlying implementation similar to reliable datagram, so
>>>> we also define a new QP type named Scalable Reliable Datagram (SRD).
>>>>
>>>> EFA reliable datagram transport provides reliable out-of-order delivery,
>>>> transparently utilizing multiple network paths to reduce network tail
>>>> latency. Its interface is similar to UD, in particular it supports
>>>> message size up to MTU, with error handling extended to support reliable
>>>> communication.
>>>>
>>>> Signed-off-by: Gal Pressman <galpress@amazon.com>
>>>> ---
>>>>  drivers/infiniband/core/verbs.c | 2 ++
>>>>  include/rdma/ib_verbs.h         | 9 +++++++--
>>>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>>>
>>>
>>> Do you have any specification/documentation for that?
>>>
>>> I'm afraid that awesome press release [1] is not enough.
>>>
>>> [1]
>>> https://aws.amazon.com/about-aws/whats-new/2018/11/introducing-elastic-fabric-adapter/
>>>
>>> Thanks
>>>
>>
>> Hey Leon,
>> The commit message (and part of the cover letter) contains a description of SRD.
>> It is similar to UD in most ways with the addition of reliable out-of-order
>> delivery. Work request usage is the same as for UD, with a minor difference in
>> the completion status codes.
>>
>> The exact specification is internal to AWS, but if you have any more questions
>> I'll be more than happy to answer.
> 
> All structures which you extended are backed by IBTA, and everything
> that is "internal to ..." is supposed to be implemented through the various
> extensions which we already have in RDMA/core. For example, in the case of
> SRD, we have IB_QPT_DRIVER exactly for that.
>
> Otherwise, please provide the full semantics of this SRD type: out-of-order
> semantics, error handling, state diagram, retransmission, etc.
> 
> Thanks
> 

We can use IB_QPT_DRIVER. If I understand correctly, the only downside is that
kernel QPs will not be able to utilize SRD.
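
For illustration, the IB_QPT_DRIVER route discussed above could look roughly
like the userspace sketch below. It is not kernel code and not the EFA
driver: every demo_* / DEMO_* name is hypothetical, only the 0xFF value for
the driver QP type is taken from the diff context in this thread, and the
rule that kernel callers cannot create the driver-specific type is modelled
purely as the assumption stated in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the core QP type enum. Only DRIVER = 0xFF mirrors the value
 * visible in the ib_verbs.h diff context; the UD value is illustrative. */
enum demo_qp_type {
	DEMO_QPT_UD = 4,
	DEMO_QPT_DRIVER = 0xFF,
};

/* Hypothetical driver-private sub-type, carried next to the core QP type. */
enum demo_driver_qp_type {
	DEMO_DRIVER_QPT_NONE = 0,
	DEMO_DRIVER_QPT_SRD = 1,
};

static_assert(DEMO_QPT_DRIVER == 0xFF, "driver QP type must stay at 0xFF");

/* Hypothetical create request: core type plus driver-private sub-type. */
struct demo_qp_create_req {
	enum demo_qp_type qp_type;
	enum demo_driver_qp_type driver_qp_type;
	bool from_kernel; /* true when the caller is an in-kernel consumer */
};

/* Returns 0 if the request would be accepted, -1 otherwise. */
int demo_validate_qp_create(const struct demo_qp_create_req *req)
{
	switch (req->qp_type) {
	case DEMO_QPT_UD:
		/* UD is a core QP type, so any caller may request it. */
		return 0;
	case DEMO_QPT_DRIVER:
		/* Assumption: the driver-specific type is user-space only. */
		if (req->from_kernel)
			return -1;
		/* The core sees only "driver"; the sub-type selects SRD. */
		return req->driver_qp_type == DEMO_DRIVER_QPT_SRD ? 0 : -1;
	default:
		return -1;
	}
}
```

The point of the sketch is that the core enum stays untouched: SRD exists
only as a driver-private sub-type, which is also why a core-level kernel
consumer has no way to name it.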
Hefty, Sean Dec. 5, 2018, 7:23 p.m. UTC | #5
>  enum {
> @@ -119,14 +120,16 @@ enum rdma_transport_type {
>  	RDMA_TRANSPORT_IB,
>  	RDMA_TRANSPORT_IWARP,
>  	RDMA_TRANSPORT_USNIC,
> -	RDMA_TRANSPORT_USNIC_UDP
> +	RDMA_TRANSPORT_USNIC_UDP,
> +	RDMA_TRANSPORT_EFA,
>  };
> 
>  enum rdma_protocol_type {
>  	RDMA_PROTOCOL_IB,
>  	RDMA_PROTOCOL_IBOE,
>  	RDMA_PROTOCOL_IWARP,
> -	RDMA_PROTOCOL_USNIC_UDP
> +	RDMA_PROTOCOL_USNIC_UDP,
> +	RDMA_PROTOCOL_EFA,

EFA is the (marketing?) name of the NIC, not really the transport or protocol.  You called the protocol SRD in the cover letter.  I'm not sure if that would apply as both the transport and the protocol, but it seems a better option than EFA.

- Sean
Gal Pressman Dec. 6, 2018, 8:57 a.m. UTC | #6
On 05-Dec-18 21:23, Hefty, Sean wrote:
>>  enum {
>> @@ -119,14 +120,16 @@ enum rdma_transport_type {
>>  	RDMA_TRANSPORT_IB,
>>  	RDMA_TRANSPORT_IWARP,
>>  	RDMA_TRANSPORT_USNIC,
>> -	RDMA_TRANSPORT_USNIC_UDP
>> +	RDMA_TRANSPORT_USNIC_UDP,
>> +	RDMA_TRANSPORT_EFA,
>>  };
>>
>>  enum rdma_protocol_type {
>>  	RDMA_PROTOCOL_IB,
>>  	RDMA_PROTOCOL_IBOE,
>>  	RDMA_PROTOCOL_IWARP,
>> -	RDMA_PROTOCOL_USNIC_UDP
>> +	RDMA_PROTOCOL_USNIC_UDP,
>> +	RDMA_PROTOCOL_EFA,
> 
> EFA is the (marketing?) name of the NIC, not really the transport or protocol.  You called the protocol SRD in the cover letter.  I'm not sure if that would apply as both the transport and the protocol, but it seems a better option than EFA.

We support both SRD and UD; we consider EFA a family of protocols.

> 
> - Sean
>
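
As a rough illustration of what the patch under discussion adds, the
node-type-to-transport mapping and the new capability bit can be modelled in
plain userspace C. This is a sketch, not the kernel source: the demo_* /
DEMO_* names are stand-ins, the enum values are illustrative, and only the
three capability bit values are copied from the diff.

```c
#include <assert.h>

/* Stand-ins for the node and transport enums touched by the patch. */
enum demo_node_type {
	DEMO_NODE_IB_CA = 1,
	DEMO_NODE_RNIC,
	DEMO_NODE_USNIC,
	DEMO_NODE_USNIC_UDP,
	DEMO_NODE_EFA,		/* new in this patch */
};

enum demo_transport_type {
	DEMO_TRANSPORT_IB,
	DEMO_TRANSPORT_IWARP,
	DEMO_TRANSPORT_USNIC,
	DEMO_TRANSPORT_USNIC_UDP,
	DEMO_TRANSPORT_EFA,	/* new in this patch */
};

/* Capability bits as they appear in the ib_verbs.h hunk. */
#define DEMO_CAP_PROT_RAW_PACKET 0x01000000
#define DEMO_CAP_PROT_USNIC      0x02000000
#define DEMO_CAP_PROT_EFA        0x04000000

/* The new protocol bit must not collide with its neighbours. */
static_assert((DEMO_CAP_PROT_EFA &
	       (DEMO_CAP_PROT_RAW_PACKET | DEMO_CAP_PROT_USNIC)) == 0,
	      "EFA capability bit overlaps an existing protocol bit");

/* Mirrors the shape of rdma_node_get_transport(): a chain of explicit
 * checks, with IB as the fallback for every other node type. */
enum demo_transport_type demo_node_get_transport(enum demo_node_type type)
{
	if (type == DEMO_NODE_USNIC)
		return DEMO_TRANSPORT_USNIC;
	if (type == DEMO_NODE_USNIC_UDP)
		return DEMO_TRANSPORT_USNIC_UDP;
	if (type == DEMO_NODE_RNIC)
		return DEMO_TRANSPORT_IWARP;
	if (type == DEMO_NODE_EFA)
		return DEMO_TRANSPORT_EFA;

	return DEMO_TRANSPORT_IB;
}
```

Without the added EFA check, an EFA node would fall through to the IB
default, which is why the patch touches verbs.c as well as the header.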
Patch

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 178899e3ce73..970744ffbf33 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -206,6 +206,8 @@  rdma_node_get_transport(enum rdma_node_type node_type)
 		return RDMA_TRANSPORT_USNIC_UDP;
 	if (node_type == RDMA_NODE_RNIC)
 		return RDMA_TRANSPORT_IWARP;
+	if (node_type == RDMA_NODE_EFA)
+		return RDMA_TRANSPORT_EFA;
 
 	return RDMA_TRANSPORT_IB;
 }
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 92633c15125b..8d4b07b346b7 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -108,6 +108,7 @@  enum rdma_node_type {
 	RDMA_NODE_RNIC,
 	RDMA_NODE_USNIC,
 	RDMA_NODE_USNIC_UDP,
+	RDMA_NODE_EFA,
 };
 
 enum {
@@ -119,14 +120,16 @@  enum rdma_transport_type {
 	RDMA_TRANSPORT_IB,
 	RDMA_TRANSPORT_IWARP,
 	RDMA_TRANSPORT_USNIC,
-	RDMA_TRANSPORT_USNIC_UDP
+	RDMA_TRANSPORT_USNIC_UDP,
+	RDMA_TRANSPORT_EFA,
 };
 
 enum rdma_protocol_type {
 	RDMA_PROTOCOL_IB,
 	RDMA_PROTOCOL_IBOE,
 	RDMA_PROTOCOL_IWARP,
-	RDMA_PROTOCOL_USNIC_UDP
+	RDMA_PROTOCOL_USNIC_UDP,
+	RDMA_PROTOCOL_EFA,
 };
 
 __attribute_const__ enum rdma_transport_type
@@ -538,6 +541,7 @@  static inline struct rdma_hw_stats *rdma_alloc_hw_stats_struct(
 #define RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP 0x00800000
 #define RDMA_CORE_CAP_PROT_RAW_PACKET   0x01000000
 #define RDMA_CORE_CAP_PROT_USNIC        0x02000000
+#define RDMA_CORE_CAP_PROT_EFA          0x04000000
 
 #define RDMA_CORE_PORT_IB_GRH_REQUIRED (RDMA_CORE_CAP_IB_GRH_REQUIRED \
 					| RDMA_CORE_CAP_PROT_ROCE     \
@@ -1095,6 +1099,7 @@  enum ib_qp_type {
 	IB_QPT_RAW_PACKET = 8,
 	IB_QPT_XRC_INI = 9,
 	IB_QPT_XRC_TGT,
+	IB_QPT_SRD,
 	IB_QPT_MAX,
 	IB_QPT_DRIVER = 0xFF,
 	/* Reserve a range for qp types internal to the low level driver.