Message ID | 20221122090206.865-1-lengchao@huawei.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Jason Gunthorpe |
Series | [for-next] infiniband:cma: add a parameter for the packet lifetime |
On Tue, Nov 22, 2022 at 05:02:06PM +0800, Chao Leng wrote:
> Now the default packet lifetime(CMA_IBOE_PACKET_LIFETIME) is 18.
> That means the minimum ack timeout is 2 seconds(2^(18+1)*4us=2.097seconds).
> The packet lifetime means the maximum transmission time of packets
> on the network, the maximum transmission time of packets is closely
> related to the network. 2 seconds is too long for simple lossless networks.
> The packet lifetime should allow the user to adjust according to the
> network situation.
> So add a parameter for the packet lifetime.
>
> Signed-off-by: Chao Leng <lengchao@huawei.com>
> ---
>  drivers/infiniband/core/cma.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index cc2222b85c88..8e2ff5d610e3 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -50,6 +50,10 @@ MODULE_LICENSE("Dual BSD/GPL");
>  #define CMA_IBOE_PACKET_LIFETIME 18
>  #define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
>
> +static unsigned char cma_packet_lifetime = CMA_IBOE_PACKET_LIFETIME;
> +module_param_named(packet_lifetime, cma_packet_lifetime, byte, 0644);
> +MODULE_PARM_DESC(packet_lifetime, "max transmission time of the packet");

No new module parameters

Maybe something in netlink would be appropriate, I'm not sure how
best to deal with this.

Really, the entire retransmit strategy in CM is not suitable for
ethernet networks, this is just one symptom.

Jason
On 2022/11/22 22:08, Jason Gunthorpe wrote:
> On Tue, Nov 22, 2022 at 05:02:06PM +0800, Chao Leng wrote:
>> [...]
>> +static unsigned char cma_packet_lifetime = CMA_IBOE_PACKET_LIFETIME;
>> +module_param_named(packet_lifetime, cma_packet_lifetime, byte, 0644);
>> +MODULE_PARM_DESC(packet_lifetime, "max transmission time of the packet");
>
> No new module parameters
>
> Maybe something in netlink would be appropriate, I'm not sure how
> best to deal with this.
>
> Really, the entire retransmit strategy in CM is not suitable for
> ethernet networks, this is just one symptom.

What do you think about changing CMA_IBOE_PACKET_LIFETIME to 16?
The maximum transmission time of packets will then be about 500+ms,
which I think is long enough for RoCE networks.
2 seconds is too long, to be honest.
On Wed, Nov 23, 2022 at 10:13:48AM +0800, Chao Leng wrote:
> On 2022/11/22 22:08, Jason Gunthorpe wrote:
> > [...]
> > Really, the entire retransmit strategy in CM is not suitable for
> > ethernet networks, this is just one symptom.
> What do you think about changing CMA_IBOE_PACKET_LIFETIME to 16?
> The maximum transmission time of packets will then be about 500+ms,
> which I think is long enough for RoCE networks.
> 2 seconds is too long, to be honest.

I don't have an informed opinion on this. I agree that 2s is too long, though.

Do we have any information to back up what this should be?

Jason
On 2022/11/24 3:48, Jason Gunthorpe wrote:
> On Wed, Nov 23, 2022 at 10:13:48AM +0800, Chao Leng wrote:
>> [...]
>> What do you think about changing CMA_IBOE_PACKET_LIFETIME to 16?
>> The maximum transmission time of packets will then be about 500+ms,
>> which I think is long enough for RoCE networks.
>> 2 seconds is too long, to be honest.
>
> I don't have an informed opinion on this. I agree that 2s is too long, though.
>
> Do we have any information to back up what this should be?

Assume the network is a Clos topology with three layers, so every packet
passes through five switch hops. Assume each switch has a 128MB buffer
and the port transmission rate is 25 Gbit/s. The maximum transmission time
of a packet is then about 200ms (128MB*5/25Gbit/s).
Doubling that for safety margin still gives less than 500ms.
So change CMA_IBOE_PACKET_LIFETIME to 16: the maximum transmission time
of the packet will be about 500+ms, which is long enough.
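The buffer-drain estimate in the message above can be reproduced with a few lines of C. This is only a sketch of the arithmetic: the function name `worst_case_transit_ms` is hypothetical, and it assumes that "128MB" means 128 * 10^6 bytes and that every one of the five switch buffers is completely full when the packet arrives.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Worst-case time, in whole milliseconds, for a packet to drain through
 * `hops` switches whose buffers are all full (an assumption of this sketch).
 * buffer_bytes: per-switch buffer size in bytes
 * link_bps:     port transmission rate in bit/s
 */
uint64_t worst_case_transit_ms(uint64_t buffer_bytes, unsigned int hops,
			       uint64_t link_bps)
{
	assert(link_bps > 0);

	/* bytes -> bits, times the number of hops, scaled to milliseconds */
	return buffer_bytes * 8 * hops * 1000 / link_bps;
}
```

With the figures from the message (128000000 bytes, 5 hops, 25000000000 bit/s) the function returns 204, and twice that is 408, which is below the 500ms bound mentioned in the thread.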
On Thu, Nov 24, 2022 at 04:19:35PM +0800, Chao Leng wrote:
> On 2022/11/24 3:48, Jason Gunthorpe wrote:
> > [...]
> > Do we have any information to back up what this should be?
> Assume the network is a Clos topology with three layers, so every packet
> passes through five switch hops. Assume each switch has a 128MB buffer
> and the port transmission rate is 25 Gbit/s. The maximum transmission time
> of a packet is then about 200ms (128MB*5/25Gbit/s).
> Doubling that for safety margin still gives less than 500ms.

We also have to worry about HCA processing time, which is driven by CPU
loading more than anything.

> So change CMA_IBOE_PACKET_LIFETIME to 16: the maximum transmission time
> of the packet will be about 500+ms, which is long enough.

That makes sense to me, put it in a commit message and send a patch.

Jason
On 2022/11/24 21:22, Jason Gunthorpe wrote:
> On Thu, Nov 24, 2022 at 04:19:35PM +0800, Chao Leng wrote:
>> [...]
>> Doubling that for safety margin still gives less than 500ms.
>
> We also have to worry about HCA processing time, which is driven by CPU
> loading more than anything.

The ack timeout for retransmission is affected by two factors: the packet
lifetime and the HCA processing time.
The HCA processing time is already accounted for: it is covered by the
HCA ack delay, which is controlled by the HCA driver.

>> So change CMA_IBOE_PACKET_LIFETIME to 16: the maximum transmission time
>> of the packet will be about 500+ms, which is long enough.
>
> That makes sense to me, put it in a commit message and send a patch.

Ok, thank you very much.
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index cc2222b85c88..8e2ff5d610e3 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -50,6 +50,10 @@ MODULE_LICENSE("Dual BSD/GPL");
 #define CMA_IBOE_PACKET_LIFETIME 18
 #define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
 
+static unsigned char cma_packet_lifetime = CMA_IBOE_PACKET_LIFETIME;
+module_param_named(packet_lifetime, cma_packet_lifetime, byte, 0644);
+MODULE_PARM_DESC(packet_lifetime, "max transmission time of the packet");
+
 static const char * const cma_events[] = {
 	[RDMA_CM_EVENT_ADDR_RESOLVED]	 = "address resolved",
 	[RDMA_CM_EVENT_ADDR_ERROR]	 = "address error",
@@ -3301,7 +3305,7 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
 	if (id_priv->timeout_set && id_priv->timeout)
 		route->path_rec->packet_life_time = id_priv->timeout - 1;
 	else
-		route->path_rec->packet_life_time = CMA_IBOE_PACKET_LIFETIME;
+		route->path_rec->packet_life_time = cma_packet_lifetime;
 	mutex_unlock(&id_priv->qp_mutex);
 
 	if (!route->path_rec->mtu) {
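The second hunk only changes the fallback branch, so a timeout already set on the connection ID still takes precedence over the proposed parameter. The user-space model below illustrates that selection logic. It is not kernel code: `struct id_timeout` and `pick_packet_life_time` are hypothetical names that stand in for the two `rdma_id_private` fields and the if/else shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CMA_IBOE_PACKET_LIFETIME 18

/* Hypothetical stand-in for the two rdma_id_private fields the hunk reads. */
struct id_timeout {
	bool timeout_set;
	uint8_t timeout;
};

/*
 * Mirrors the if/else in cma_resolve_iboe_route() from the diff:
 * a non-zero per-ID timeout wins (minus one, as in the diff); otherwise
 * the fallback is used. In the patch the fallback would be the
 * packet_lifetime module parameter instead of the compile-time constant.
 */
uint8_t pick_packet_life_time(const struct id_timeout *id, uint8_t fallback)
{
	assert(id != NULL);

	if (id->timeout_set && id->timeout)
		return id->timeout - 1;
	return fallback;
}
```

In this model an ID with `timeout_set` true and `timeout` 14 yields 13 regardless of the fallback, while an ID without a timeout yields whatever fallback is passed in, for example `CMA_IBOE_PACKET_LIFETIME`.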
Now the default packet lifetime(CMA_IBOE_PACKET_LIFETIME) is 18.
That means the minimum ack timeout is 2 seconds(2^(18+1)*4us=2.097seconds).
The packet lifetime means the maximum transmission time of packets
on the network, the maximum transmission time of packets is closely
related to the network. 2 seconds is too long for simple lossless networks.
The packet lifetime should allow the user to adjust according to the
network situation.
So add a parameter for the packet lifetime.

Signed-off-by: Chao Leng <lengchao@huawei.com>
---
 drivers/infiniband/core/cma.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
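The timeout arithmetic in the commit message can be checked with a short C sketch. The helper names `lifetime_ns` and `min_ack_timeout_ns` are hypothetical, and the code uses the 4 us unit quoted in the commit message. The InfiniBand specification defines the unit as 4.096 us, so real values would be about 2.4% larger than those computed here.

```c
#include <assert.h>
#include <stdint.h>

/* Time unit as used in the commit message (4 us), in nanoseconds. */
#define TIMEOUT_UNIT_NS 4000ULL

/* Packet lifetime for a given exponent: unit * 2^exp. */
uint64_t lifetime_ns(unsigned int exp)
{
	assert(exp <= 31);	/* keeps the shift well within 64 bits */
	return TIMEOUT_UNIT_NS << exp;
}

/*
 * Minimum ack timeout as described in the commit message:
 * unit * 2^(exp + 1), i.e. twice the packet lifetime.
 */
uint64_t min_ack_timeout_ns(unsigned int exp)
{
	return lifetime_ns(exp + 1);
}
```

For the current default of 18, `min_ack_timeout_ns(18)` is 2097152000 ns, the 2.097 seconds quoted in the commit message. For the value 16 proposed later in the thread it is 524288000 ns, which matches the "500+ms" figure.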