diff mbox series

[for-next] infiniband:cma: add a parameter for the packet lifetime

Message ID 20221122090206.865-1-lengchao@huawei.com (mailing list archive)
State Changes Requested
Delegated to: Jason Gunthorpe
Series [for-next] infiniband:cma: add a parameter for the packet lifetime

Commit Message

Chao Leng Nov. 22, 2022, 9:02 a.m. UTC
The default packet lifetime (CMA_IBOE_PACKET_LIFETIME) is currently 18,
which makes the minimum ack timeout about 2 seconds
(2^(18+1) * 4us = 2.097 seconds).
The packet lifetime is the maximum time a packet may spend in transit
on the network, so the right value depends heavily on the network itself.
2 seconds is too long for simple lossless networks.
Users should be able to adjust the packet lifetime to suit their network,
so add a module parameter for it.

Signed-off-by: Chao Leng <lengchao@huawei.com>
---
 drivers/infiniband/core/cma.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
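
The arithmetic in the commit message can be checked with a small helper.
This is a minimal sketch, not kernel code: the function name and the two
unit macros are made up for illustration. The commit message uses a 4 us
base unit; the InfiniBand timeout encoding is usually quoted with a base
unit of 4.096 us, which is included here as an assumption for comparison.

```c
#include <stdint.h>

/* Base time unit in nanoseconds (both names are hypothetical).
 * THREAD_UNIT_NS is the rounded 4 us used in the commit message;
 * IB_UNIT_NS is the 4.096 us unit usually quoted for InfiniBand. */
#define THREAD_UNIT_NS 4000ULL
#define IB_UNIT_NS     4096ULL

/* Minimum ack timeout in nanoseconds for a packet lifetime exponent,
 * following the 2^(lifetime + 1) * unit formula from the commit message.
 * Assumes packet_lifetime is small enough that the shift cannot overflow. */
static uint64_t min_ack_timeout_ns(unsigned int packet_lifetime,
				   uint64_t unit_ns)
{
	return unit_ns << (packet_lifetime + 1);
}
```

With the 4 us unit, a lifetime of 18 gives 2097152000 ns (about 2.097 s),
matching the commit message. With 4.096 us the same lifetime gives about
2.147 s, so the rounding does not change the conclusion.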

Comments

Jason Gunthorpe Nov. 22, 2022, 2:08 p.m. UTC | #1
On Tue, Nov 22, 2022 at 05:02:06PM +0800, Chao Leng wrote:
> The default packet lifetime (CMA_IBOE_PACKET_LIFETIME) is currently 18,
> which makes the minimum ack timeout about 2 seconds
> (2^(18+1) * 4us = 2.097 seconds).
> The packet lifetime is the maximum time a packet may spend in transit
> on the network, so the right value depends heavily on the network itself.
> 2 seconds is too long for simple lossless networks.
> Users should be able to adjust the packet lifetime to suit their network,
> so add a module parameter for it.
> 
> Signed-off-by: Chao Leng <lengchao@huawei.com>
> ---
>  drivers/infiniband/core/cma.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index cc2222b85c88..8e2ff5d610e3 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -50,6 +50,10 @@ MODULE_LICENSE("Dual BSD/GPL");
>  #define CMA_IBOE_PACKET_LIFETIME 18
>  #define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
>  
> +static unsigned char cma_packet_lifetime = CMA_IBOE_PACKET_LIFETIME;
> +module_param_named(packet_lifetime, cma_packet_lifetime, byte, 0644);
> +MODULE_PARM_DESC(packet_lifetime, "max transmission time of the packet");

No new module parameters

Maybe something in netlink would be appropriate, I'm not sure how
best to deal with this.

Really, the entire retransmit strategy in CM is not suitable for
ethernet networks, this is just one symptom.

Jason
Chao Leng Nov. 23, 2022, 2:13 a.m. UTC | #2
On 2022/11/22 22:08, Jason Gunthorpe wrote:
> No new module parameters
> 
> Maybe something in netlink would be appropriate, I'm not sure how
> best to deal with this.
> 
> Really, the entire retransmit strategy in CM is not suitable for
> ethernet networks, this is just one symptom.
What do you think about changing CMA_IBOE_PACKET_LIFETIME to 16?
The maximum transmission time of packets would then be a little over 500ms,
which I think is long enough for RoCE networks.
To be honest, 2 seconds is too long.
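
To see why 16 is the value that lands just above 500ms, the choice can be
turned around: given a timeout budget, find the largest lifetime that still
fits. This is a rough sketch under stated assumptions. The function name is
hypothetical, the 2^(lifetime + 1) * unit formula is taken from the commit
message, and the cap of 30 assumes the resulting ack timeout exponent has
to fit in 5 bits.

```c
#include <stdint.h>

/* Returns the largest packet lifetime exponent (0..30) whose minimum ack
 * timeout, 2^(lifetime + 1) * unit_ns, does not exceed budget_ns.
 * Returns -1 if even a lifetime of 0 exceeds the budget. */
static int max_lifetime_for_budget(uint64_t budget_ns, uint64_t unit_ns)
{
	int best = -1;

	for (int lifetime = 0; lifetime <= 30; lifetime++) {
		if ((unit_ns << (lifetime + 1)) <= budget_ns)
			best = lifetime;
		else
			break;	/* timeouts only grow from here on */
	}
	return best;
}
```

With a 4000 ns unit, a 600ms budget yields 16 (timeout 524.288ms), while the
next step up, 17, would already cost about 1.05 s.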
Jason Gunthorpe Nov. 23, 2022, 7:48 p.m. UTC | #3
On Wed, Nov 23, 2022 at 10:13:48AM +0800, Chao Leng wrote:
> 
> 
> On 2022/11/22 22:08, Jason Gunthorpe wrote:
> > No new module parameters
> > 
> > Maybe something in netlink would be appropriate, I'm not sure how
> > best to deal with this.
> > 
> > Really, the entire retransmit strategy in CM is not suitable for
> > ethernet networks, this is just one symptom.
> What do you think about changing CMA_IBOE_PACKET_LIFETIME to 16?
> The maximum transmission time of packets would then be a little over 500ms,
> which I think is long enough for RoCE networks.
> To be honest, 2 seconds is too long.

I don't have an informed opinion on this. I agree that 2s is too long though

Do we have any information to back up what this should be?

Jason
Chao Leng Nov. 24, 2022, 8:19 a.m. UTC | #4
On 2022/11/24 3:48, Jason Gunthorpe wrote:
> I don't have an informed opinion on this. I agree that 2s is too long though
> 
> Do we have any information to back up what this should be?
Assume the network is a three-layer Clos topology, so every packet
passes through five switch hops. Assume each switch has a 128MB buffer
and the port transmission rate is 25 Gbit/s. The maximum transmission
time of a packet is then about 200ms (128MB*5/25Gbit/s).
Doubling that for redundancy still gives less than 500ms.
So if CMA_IBOE_PACKET_LIFETIME is changed to 16, the maximum
transmission time of a packet will be a little over 500ms,
which is long enough.
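
The estimate above can be reproduced as follows. This is only a sketch of
the same back-of-the-envelope model: the function name is hypothetical, MB
is taken as 10^6 bytes, and the model assumes the packet waits behind a
completely full buffer at every hop.

```c
/* Worst-case queueing delay in milliseconds when a packet waits behind a
 * full buffer at every hop. buffer_mbytes uses decimal megabytes and
 * port_gbits is the port rate in Gbit/s (both are modelling assumptions). */
static double worst_case_delay_ms(unsigned int hops, double buffer_mbytes,
				  double port_gbits)
{
	double buffer_bits = buffer_mbytes * 1e6 * 8.0;
	double drain_seconds = buffer_bits / (port_gbits * 1e9);

	return hops * drain_seconds * 1e3;
}
```

For five hops, 128MB buffers and 25 Gbit/s ports this gives roughly 205ms,
and doubling it stays near 410ms, below the 500ms mark.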
Jason Gunthorpe Nov. 24, 2022, 1:22 p.m. UTC | #5
On Thu, Nov 24, 2022 at 04:19:35PM +0800, Chao Leng wrote:
> 
> 
> On 2022/11/24 3:48, Jason Gunthorpe wrote:
> > I don't have an informed opinion on this. I agree that 2s is too long though
> > 
> > Do we have any information to back up what this should be?
> Assume the network is a three-layer Clos topology, so every packet
> passes through five switch hops. Assume each switch has a 128MB buffer
> and the port transmission rate is 25 Gbit/s. The maximum transmission
> time of a packet is then about 200ms (128MB*5/25Gbit/s).
> Doubling that for redundancy still gives less than 500ms.

We also have to worry about HCA processing time which is driven by CPU
loading more than anything

> So if CMA_IBOE_PACKET_LIFETIME is changed to 16, the maximum
> transmission time of a packet will be a little over 500ms,
> which is long enough.

That makes sense to me, put it in a commit message and send a patch

Jason
Chao Leng Nov. 25, 2022, 12:43 a.m. UTC | #6
On 2022/11/24 21:22, Jason Gunthorpe wrote:
> On Thu, Nov 24, 2022 at 04:19:35PM +0800, Chao Leng wrote:
>>
>>
>> On 2022/11/24 3:48, Jason Gunthorpe wrote:
>>> I don't have an informed opinion on this. I agree that 2s is too long though
>>>
>>> Do we have any information to back up what this should be?
>> Assume the network is a three-layer Clos topology, so every packet
>> passes through five switch hops. Assume each switch has a 128MB buffer
>> and the port transmission rate is 25 Gbit/s. The maximum transmission
>> time of a packet is then about 200ms (128MB*5/25Gbit/s).
>> Doubling that for redundancy still gives less than 500ms.
> 
> We also have to worry about HCA processing time which is driven by CPU
> loading more than anything
The ack timeout is determined by two factors: the packet lifetime and
the HCA processing time.
The HCA processing time is already accounted for: it is covered by the
HCA ack delay, which is controlled by the HCA driver.
> 
>> So if CMA_IBOE_PACKET_LIFETIME is changed to 16, the maximum
>> transmission time of a packet will be a little over 500ms,
>> which is long enough.
> 
> That makes sense to me, put it in a commit message and send a patch
Ok, thank you very much.
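
One way to picture how the two factors mentioned in this reply could
combine is sketched here. It is an illustrative model, not the code in the
kernel's CM: the function name is hypothetical, and it assumes both values
are exponents of the same base unit and that a round trip costs two packet
lifetimes plus the responder's ack delay.

```c
#include <stdint.h>

/* Smallest timeout exponent that covers two one-way packet lifetimes plus
 * the HCA ack delay, with all three expressed as powers of two of the same
 * base unit. Both inputs are assumed to be at most 31. */
static unsigned int combined_ack_timeout_exp(unsigned int packet_lifetime,
					     unsigned int hca_ack_delay)
{
	/* Round trip in base units: 2 * 2^lifetime + 2^delay. */
	uint64_t units = (2ULL << packet_lifetime) + (1ULL << hca_ack_delay);
	unsigned int exp = 0;

	/* Round up to the next power of two. */
	while ((1ULL << exp) < units)
		exp++;
	return exp;
}
```

Under this model the HCA ack delay rides on top of the packet lifetime
rather than replacing it, so lowering the lifetime from 18 to 16 lowers the
resulting exponent regardless of the delay the driver reports.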

Patch

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index cc2222b85c88..8e2ff5d610e3 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -50,6 +50,10 @@  MODULE_LICENSE("Dual BSD/GPL");
 #define CMA_IBOE_PACKET_LIFETIME 18
 #define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
 
+static unsigned char cma_packet_lifetime = CMA_IBOE_PACKET_LIFETIME;
+module_param_named(packet_lifetime, cma_packet_lifetime, byte, 0644);
+MODULE_PARM_DESC(packet_lifetime, "max transmission time of the packet");
+
 static const char * const cma_events[] = {
 	[RDMA_CM_EVENT_ADDR_RESOLVED]	 = "address resolved",
 	[RDMA_CM_EVENT_ADDR_ERROR]	 = "address error",
@@ -3301,7 +3305,7 @@  static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
 	if (id_priv->timeout_set && id_priv->timeout)
 		route->path_rec->packet_life_time = id_priv->timeout - 1;
 	else
-		route->path_rec->packet_life_time = CMA_IBOE_PACKET_LIFETIME;
+		route->path_rec->packet_life_time = cma_packet_lifetime;
 	mutex_unlock(&id_priv->qp_mutex);
 
 	if (!route->path_rec->mtu) {
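
The selection logic in the second hunk can be modelled in userspace as
below. This is a sketch for reading the diff, not kernel code: the struct
and function names are hypothetical stand-ins for the fields the hunk reads
from id_priv, and locking is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two id_priv fields the hunk reads. */
struct timeout_cfg {
	bool    timeout_set;	/* an explicit timeout was configured */
	uint8_t timeout;	/* that timeout, as an exponent */
};

/* Mirrors the if/else in the hunk: an explicit non-zero timeout wins and
 * is converted to a packet lifetime by subtracting one; otherwise the
 * default (the new module parameter in the patch) is used. */
static uint8_t pick_packet_life_time(const struct timeout_cfg *cfg,
				     uint8_t default_lifetime)
{
	if (cfg->timeout_set && cfg->timeout)
		return cfg->timeout - 1;
	return default_lifetime;
}
```

The point to notice is that the default only applies when no per-connection
timeout has been set, so the patch changes behaviour only on that path.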