
Fwd: mlx5_ib_post_send panic on s390x

Message ID VI1PR0502MB300817FC6256218DE800497BD1220@VI1PR0502MB3008.eurprd05.prod.outlook.com (mailing list archive)
State Not Applicable

Commit Message

Parav Pandit March 12, 2017, 8:20 p.m. UTC
Hi Ursula,

> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
> owner@vger.kernel.org] On Behalf Of Ursula Braun
> Sent: Thursday, March 9, 2017 3:54 AM
> To: Eli Cohen <eli@mellanox.com>; Matan Barak <matanb@mellanox.com>
> Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky
> <leonro@mellanox.com>; linux-rdma@vger.kernel.org
> Subject: Re: Fwd: mlx5_ib_post_send panic on s390x
> 
> 
> On 03/06/2017 02:08 PM, Eli Cohen wrote:
> >>>
> >>> The problem seems to be caused by the usage of plain memcpy in
> set_data_inl_seg().
> >>> The address provided by SMC-code in struct ib_send_wr *wr is an
> >>> address belonging to an area mapped with the ib_dma_map_single()
> >>> call. On s390x those kind of addresses require extra access functions (see
> arch/s390/include/asm/io.h).
> >>>
> >
> > By definition, when you are posting a send request with inline, the address
> must be mapped to the cpu so plain memcpy should work.
> >
> In the past I run SMC-R with Connect X3 cards. The mlx4 driver does not seem to
> contain extra coding for IB_SEND_INLINE flag for ib_post_send. Does this mean
> for SMC-R to run on Connect X3 cards the IB_SEND_INLINE flag is ignored, and
> thus I needed the ib_dma_map_single() call for the area used with
> ib_post_send()? Does this mean I should stay away from the IB_SEND_INLINE
> flag, if I want to run the same SMC-R code with both, Connect X3 cards and
> Connect X4 cards?
> 

I encountered the same kernel panic that you mentioned last week on ConnectX-4 adapters with SMC-R on x86_64.
Shall I submit the fix below to the netdev mailing list?
I have tested this change. I also have an optimization that avoids the DMA mapping for wr_tx_dma_addr:

-               lnk->wr_tx_sges[i].addr =
-                       lnk->wr_tx_dma_addr + i * SMC_WR_BUF_SIZE;
+               lnk->wr_tx_sges[i].addr = (uintptr_t)(lnk->wr_tx_bufs + i);

I also have a fix for processing IB_SEND_INLINE in the mlx4 driver, on a slightly older kernel base; it is attached below. I can rebase my kernel and provide the corresponding fix in the mlx5_ib driver.
Let me know.

Regards,
Parav Pandit
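
[Editorial note] To make Eli's point above concrete: with IB_SEND_INLINE the provider driver copies the payload into the WQE at post time, so the sge carries a plain CPU virtual address rather than a DMA-mapped one. A minimal sketch of posting such a send follows; it is not the actual SMC-R code, and the zero lkey is only illustrative (the lkey is not used for inline data):

#include <rdma/ib_verbs.h>

/* Sketch: post a small payload as inline data.  "buf" is an ordinary
 * kernel virtual address; no ib_dma_map_single() is involved, because
 * the provider memcpy()s the data into the WQE inside ib_post_send().
 */
static int post_inline_send(struct ib_qp *qp, void *buf, u32 len)
{
	struct ib_sge sge = {
		.addr   = (uintptr_t)buf,	/* CPU address, not a dma_addr_t */
		.length = len,
		.lkey   = 0,			/* ignored for inline sends */
	};
	struct ib_send_wr wr = {
		.wr_id      = (uintptr_t)buf,
		.sg_list    = &sge,
		.num_sge    = 1,
		.opcode     = IB_WR_SEND,
		.send_flags = IB_SEND_INLINE | IB_SEND_SIGNALED,
	};
	struct ib_send_wr *bad_wr;

	return ib_post_send(qp, &wr, &bad_wr);
}

This is why the one-line SMC-R diff above switches wr_tx_sges[i].addr from the DMA address to the buffer's own address: for inline sends the provider's memcpy must read from a CPU-dereferenceable address, which the DMA-mapped address on s390x is not.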



Comments

Parav Pandit March 12, 2017, 8:38 p.m. UTC | #1
I meant the mlx4_ib driver below. Sorry for the typo.

> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
> owner@vger.kernel.org] On Behalf Of Parav Pandit
> Sent: Sunday, March 12, 2017 3:21 PM
> To: Ursula Braun <ubraun@linux.vnet.ibm.com>; Eli Cohen
> <eli@mellanox.com>; Matan Barak <matanb@mellanox.com>
> Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky
> <leonro@mellanox.com>; linux-rdma@vger.kernel.org
> Subject: RE: Fwd: mlx5_ib_post_send panic on s390x
> 
> Hi Ursula,
> 
> > -----Original Message-----
> > From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
> > owner@vger.kernel.org] On Behalf Of Ursula Braun
> > Sent: Thursday, March 9, 2017 3:54 AM
> > To: Eli Cohen <eli@mellanox.com>; Matan Barak <matanb@mellanox.com>
> > Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky
> > <leonro@mellanox.com>; linux-rdma@vger.kernel.org
> > Subject: Re: Fwd: mlx5_ib_post_send panic on s390x
> >
> >
> > On 03/06/2017 02:08 PM, Eli Cohen wrote:
> > >>>
> > >>> The problem seems to be caused by the usage of plain memcpy in
> > set_data_inl_seg().
> > >>> The address provided by SMC-code in struct ib_send_wr *wr is an
> > >>> address belonging to an area mapped with the ib_dma_map_single()
> > >>> call. On s390x those kind of addresses require extra access
> > >>> functions (see
> > arch/s390/include/asm/io.h).
> > >>>
> > >
> > > By definition, when you are posting a send request with inline, the
> > > address
> > must be mapped to the cpu so plain memcpy should work.
> > >
> > In the past I run SMC-R with Connect X3 cards. The mlx4 driver does
> > not seem to contain extra coding for IB_SEND_INLINE flag for
> > ib_post_send. Does this mean for SMC-R to run on Connect X3 cards the
> > IB_SEND_INLINE flag is ignored, and thus I needed the
> > ib_dma_map_single() call for the area used with ib_post_send()? Does
> > this mean I should stay away from the IB_SEND_INLINE flag, if I want
> > to run the same SMC-R code with both, Connect X3 cards and Connect X4
> cards?
> >
> I had encountered the same kernel panic that you mentioned last week on
> ConnectX-4 adapters with smc-r on x86_64.
> Shall I submit below fix to netdev mailing list?
> I have tested above change. I also have optimization that avoids dma mapping
> for wr_tx_dma_addr.
> 
> -               lnk->wr_tx_sges[i].addr =
> -                       lnk->wr_tx_dma_addr + i * SMC_WR_BUF_SIZE;
> +               lnk->wr_tx_sges[i].addr = (uintptr_t)(lnk->wr_tx_bufs +
> + i);
> 
> I also have fix for processing IB_SEND_INLINE in mlx4 driver on little older
> kernel base.
> I have attached below. I can rebase my kernel and provide fix in mlx5_ib driver.
> Let me know.
> 
> Regards,
> Parav Pandit
> 
> diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
> index a2e4ca5..0d984f5 100644
> --- a/drivers/infiniband/hw/mlx4/qp.c
> +++ b/drivers/infiniband/hw/mlx4/qp.c
> @@ -2748,6 +2748,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct
> ib_send_wr *wr,
>  	unsigned long flags;
>  	int nreq;
>  	int err = 0;
> +	int inl = 0;
>  	unsigned ind;
>  	int uninitialized_var(stamp);
>  	int uninitialized_var(size);
> @@ -2958,30 +2959,97 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct
> ib_send_wr *wr,
>  		default:
>  			break;
>  		}
> +		if (wr->send_flags & IB_SEND_INLINE && wr->num_sge) {
> +			struct mlx4_wqe_inline_seg *seg;
> +			void *addr;
> +			int len, seg_len;
> +			int num_seg;
> +			int off, to_copy;
> +
> +			inl = 0;
> +
> +			seg = wqe;
> +			wqe += sizeof *seg;
> +			off = ((uintptr_t) wqe) & (MLX4_INLINE_ALIGN - 1);
> +			num_seg = 0;
> +			seg_len = 0;
> +
> +			for (i = 0; i < wr->num_sge; ++i) {
> +				addr = (void *) (uintptr_t) wr->sg_list[i].addr;
> +				len  = wr->sg_list[i].length;
> +				inl += len;
> +
> +				if (inl > 16) {
> +					inl = 0;
> +					err = ENOMEM;
> +					*bad_wr = wr;
> +					goto out;
> +				}
> 
> -		/*
> -		 * Write data segments in reverse order, so as to
> -		 * overwrite cacheline stamp last within each
> -		 * cacheline.  This avoids issues with WQE
> -		 * prefetching.
> -		 */
> +				while (len >= MLX4_INLINE_ALIGN - off) {
> +					to_copy = MLX4_INLINE_ALIGN - off;
> +					memcpy(wqe, addr, to_copy);
> +					len -= to_copy;
> +					wqe += to_copy;
> +					addr += to_copy;
> +					seg_len += to_copy;
> +					wmb(); /* see comment below */
> +					seg->byte_count =
> htonl(MLX4_INLINE_SEG | seg_len);
> +					seg_len = 0;
> +					seg = wqe;
> +					wqe += sizeof *seg;
> +					off = sizeof *seg;
> +					++num_seg;
> +				}
> 
> -		dseg = wqe;
> -		dseg += wr->num_sge - 1;
> -		size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) /
> 16);
> +				memcpy(wqe, addr, len);
> +				wqe += len;
> +				seg_len += len;
> +				off += len;
> +			}
> 
> -		/* Add one more inline data segment for ICRC for MLX sends */
> -		if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI ||
> -			     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI ||
> -			     qp->mlx4_ib_qp_type &
> -			     (MLX4_IB_QPT_PROXY_SMI_OWNER |
> MLX4_IB_QPT_TUN_SMI_OWNER))) {
> -			set_mlx_icrc_seg(dseg + 1);
> -			size += sizeof (struct mlx4_wqe_data_seg) / 16;
> -		}
> +			if (seg_len) {
> +				++num_seg;
> +				/*
> +				 * Need a barrier here to make sure
> +				 * all the data is visible before the
> +				 * byte_count field is set.  Otherwise
> +				 * the HCA prefetcher could grab the
> +				 * 64-byte chunk with this inline
> +				 * segment and get a valid (!=
> +				 * 0xffffffff) byte count but stale
> +				 * data, and end up sending the wrong
> +				 * data.
> +				 */
> +				wmb();
> +				seg->byte_count = htonl(MLX4_INLINE_SEG |
> seg_len);
> +			}
> 
> -		for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
> -			set_data_seg(dseg, wr->sg_list + i);
> +			size += (inl + num_seg * sizeof (*seg) + 15) / 16;
> +		} else {
> +			/*
> +			 * Write data segments in reverse order, so as to
> +			 * overwrite cacheline stamp last within each
> +			 * cacheline.  This avoids issues with WQE
> +			 * prefetching.
> +			 */
> +
> +			dseg = wqe;
> +			dseg += wr->num_sge - 1;
> +			size += wr->num_sge * (sizeof (struct
> mlx4_wqe_data_seg) / 16);
> +
> +			/* Add one more inline data segment for ICRC for MLX
> sends */
> +			if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI
> ||
> +				     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI
> ||
> +				     qp->mlx4_ib_qp_type &
> +				     (MLX4_IB_QPT_PROXY_SMI_OWNER |
> MLX4_IB_QPT_TUN_SMI_OWNER))) {
> +				set_mlx_icrc_seg(dseg + 1);
> +				size += sizeof (struct mlx4_wqe_data_seg) / 16;
> +			}
> 
> +			for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
> +				set_data_seg(dseg, wr->sg_list + i);
> +		}
>  		/*
>  		 * Possibly overwrite stamping in cacheline with LSO
>  		 * segment only after making sure all data segments

Ursula Braun March 14, 2017, 3:02 p.m. UTC | #2
Hi Parav,

I tried your mlx4-patch together with SMC on s390x, but it failed.
The SMC-R code tries to send 44 bytes as inline in 1 sge.
I wonder about a length check with 16 bytes, which probably explains the failure.
See my question below in the patch:

On 03/12/2017 09:20 PM, Parav Pandit wrote:
> Hi Ursula,
> 
>> -----Original Message-----
>> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
>> owner@vger.kernel.org] On Behalf Of Ursula Braun
>> Sent: Thursday, March 9, 2017 3:54 AM
>> To: Eli Cohen <eli@mellanox.com>; Matan Barak <matanb@mellanox.com>
>> Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky
>> <leonro@mellanox.com>; linux-rdma@vger.kernel.org
>> Subject: Re: Fwd: mlx5_ib_post_send panic on s390x
>>
>>
>>
>> On 03/06/2017 02:08 PM, Eli Cohen wrote:
>>>>>
>>>>> The problem seems to be caused by the usage of plain memcpy in
>> set_data_inl_seg().
>>>>> The address provided by SMC-code in struct ib_send_wr *wr is an
>>>>> address belonging to an area mapped with the ib_dma_map_single()
>>>>> call. On s390x those kind of addresses require extra access functions (see
>> arch/s390/include/asm/io.h).
>>>>>
>>>
>>> By definition, when you are posting a send request with inline, the address
>> must be mapped to the cpu so plain memcpy should work.
>>>
>> In the past I run SMC-R with Connect X3 cards. The mlx4 driver does not seem to
>> contain extra coding for IB_SEND_INLINE flag for ib_post_send. Does this mean
>> for SMC-R to run on Connect X3 cards the IB_SEND_INLINE flag is ignored, and
>> thus I needed the ib_dma_map_single() call for the area used with
>> ib_post_send()? Does this mean I should stay away from the IB_SEND_INLINE
>> flag, if I want to run the same SMC-R code with both, Connect X3 cards and
>> Connect X4 cards?
>>
> I had encountered the same kernel panic that you mentioned last week on ConnectX-4 adapters with smc-r on x86_64.
> Shall I submit below fix to netdev mailing list?
> I have tested above change. I also have optimization that avoids dma mapping for wr_tx_dma_addr.
> 
> -               lnk->wr_tx_sges[i].addr =
> -                       lnk->wr_tx_dma_addr + i * SMC_WR_BUF_SIZE;
> +               lnk->wr_tx_sges[i].addr = (uintptr_t)(lnk->wr_tx_bufs + i);
> 
> I also have fix for processing IB_SEND_INLINE in mlx4 driver on little older kernel base.
> I have attached below. I can rebase my kernel and provide fix in mlx5_ib driver.
> Let me know.
> 
> Regards,
> Parav Pandit
> 
> diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
> index a2e4ca5..0d984f5 100644
> --- a/drivers/infiniband/hw/mlx4/qp.c
> +++ b/drivers/infiniband/hw/mlx4/qp.c
> @@ -2748,6 +2748,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
>  	unsigned long flags;
>  	int nreq;
>  	int err = 0;
> +	int inl = 0;
>  	unsigned ind;
>  	int uninitialized_var(stamp);
>  	int uninitialized_var(size);
> @@ -2958,30 +2959,97 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
>  		default:
>  			break;
>  		}
> +		if (wr->send_flags & IB_SEND_INLINE && wr->num_sge) {
> +			struct mlx4_wqe_inline_seg *seg;
> +			void *addr;
> +			int len, seg_len;
> +			int num_seg;
> +			int off, to_copy;
> +
> +			inl = 0;
> +
> +			seg = wqe;
> +			wqe += sizeof *seg;
> +			off = ((uintptr_t) wqe) & (MLX4_INLINE_ALIGN - 1);
> +			num_seg = 0;
> +			seg_len = 0;
> +
> +			for (i = 0; i < wr->num_sge; ++i) {
> +				addr = (void *) (uintptr_t) wr->sg_list[i].addr;
> +				len  = wr->sg_list[i].length;
> +				inl += len;
> +
> +				if (inl > 16) {
> +					inl = 0;
> +					err = ENOMEM;
> +					*bad_wr = wr;
> +					goto out;
> +				}
SMC-R fails due to this check. inl is 44 here. Why is 16 a limit for IB_SEND_INLINE data?
The SMC-R code calls ib_create_qp() with max_inline_data=44. And the function does not
seem to return an error.
>  
> -		/*
> -		 * Write data segments in reverse order, so as to
> -		 * overwrite cacheline stamp last within each
> -		 * cacheline.  This avoids issues with WQE
> -		 * prefetching.
> -		 */
> +				while (len >= MLX4_INLINE_ALIGN - off) {
> +					to_copy = MLX4_INLINE_ALIGN - off;
> +					memcpy(wqe, addr, to_copy);
> +					len -= to_copy;
> +					wqe += to_copy;
> +					addr += to_copy;
> +					seg_len += to_copy;
> +					wmb(); /* see comment below */
> +					seg->byte_count = htonl(MLX4_INLINE_SEG | seg_len);
> +					seg_len = 0;
> +					seg = wqe;
> +					wqe += sizeof *seg;
> +					off = sizeof *seg;
> +					++num_seg;
> +				}
>  
> -		dseg = wqe;
> -		dseg += wr->num_sge - 1;
> -		size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) / 16);
> +				memcpy(wqe, addr, len);
> +				wqe += len;
> +				seg_len += len;
> +				off += len;
> +			}
>  
> -		/* Add one more inline data segment for ICRC for MLX sends */
> -		if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI ||
> -			     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI ||
> -			     qp->mlx4_ib_qp_type &
> -			     (MLX4_IB_QPT_PROXY_SMI_OWNER | MLX4_IB_QPT_TUN_SMI_OWNER))) {
> -			set_mlx_icrc_seg(dseg + 1);
> -			size += sizeof (struct mlx4_wqe_data_seg) / 16;
> -		}
> +			if (seg_len) {
> +				++num_seg;
> +				/*
> +				 * Need a barrier here to make sure
> +				 * all the data is visible before the
> +				 * byte_count field is set.  Otherwise
> +				 * the HCA prefetcher could grab the
> +				 * 64-byte chunk with this inline
> +				 * segment and get a valid (!=
> +				 * 0xffffffff) byte count but stale
> +				 * data, and end up sending the wrong
> +				 * data.
> +				 */
> +				wmb();
> +				seg->byte_count = htonl(MLX4_INLINE_SEG | seg_len);
> +			}
>  
> -		for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
> -			set_data_seg(dseg, wr->sg_list + i);
> +			size += (inl + num_seg * sizeof (*seg) + 15) / 16;
> +		} else {
> +			/*
> +			 * Write data segments in reverse order, so as to
> +			 * overwrite cacheline stamp last within each
> +			 * cacheline.  This avoids issues with WQE
> +			 * prefetching.
> +			 */
> +
> +			dseg = wqe;
> +			dseg += wr->num_sge - 1;
> +			size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) / 16);
> +
> +			/* Add one more inline data segment for ICRC for MLX sends */
> +			if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI ||
> +				     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI ||
> +				     qp->mlx4_ib_qp_type &
> +				     (MLX4_IB_QPT_PROXY_SMI_OWNER | MLX4_IB_QPT_TUN_SMI_OWNER))) {
> +				set_mlx_icrc_seg(dseg + 1);
> +				size += sizeof (struct mlx4_wqe_data_seg) / 16;
> +			}
>  
> +			for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
> +				set_data_seg(dseg, wr->sg_list + i);
> +		}
>  		/*
>  		 * Possibly overwrite stamping in cacheline with LSO
>  		 * segment only after making sure all data segments
> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body
>> of a message to majordomo@vger.kernel.org More majordomo info at
>> http://vger.kernel.org/majordomo-info.html

Parav Pandit March 14, 2017, 3:24 p.m. UTC | #3
Hi Ursula,


> -----Original Message-----
> From: Ursula Braun [mailto:ubraun@linux.vnet.ibm.com]
> Sent: Tuesday, March 14, 2017 10:02 AM
> To: Parav Pandit <parav@mellanox.com>; Eli Cohen <eli@mellanox.com>;
> Matan Barak <matanb@mellanox.com>
> Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky
> <leonro@mellanox.com>; linux-rdma@vger.kernel.org
> Subject: Re: Fwd: mlx5_ib_post_send panic on s390x
> 
> Hi Parav,
> 
> I tried your mlx4-patch together with SMC on s390x, but it failed.
> The SMC-R code tries to send 44 bytes as inline in 1 sge.
> I wonder about a length check with 16 bytes, which probably explains the
> failure.
> See my question below in the patch:
> 
> On 03/12/2017 09:20 PM, Parav Pandit wrote:
> > Hi Ursula,
> >
> >> -----Original Message-----
> >> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
> >> owner@vger.kernel.org] On Behalf Of Ursula Braun
> >> Sent: Thursday, March 9, 2017 3:54 AM
> >> To: Eli Cohen <eli@mellanox.com>; Matan Barak <matanb@mellanox.com>
> >> Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky
> >> <leonro@mellanox.com>; linux-rdma@vger.kernel.org
> >> Subject: Re: Fwd: mlx5_ib_post_send panic on s390x
> >>
> >>
> >> On 03/06/2017 02:08 PM, Eli Cohen wrote:
> >>>>>
> >>>>> The problem seems to be caused by the usage of plain memcpy in
> >>>> set_data_inl_seg().
> >>>>> The address provided by SMC-code in struct ib_send_wr *wr is an
> >>>>> address belonging to an area mapped with the ib_dma_map_single()
> >>>>> call. On s390x those kind of addresses require extra access
> >>>>> functions (see
> >>>> arch/s390/include/asm/io.h).
> >>>>>
> >>>
> >>> By definition, when you are posting a send request with inline, the
> >>> address
> >> must be mapped to the cpu so plain memcpy should work.
> >>>
> >> In the past I run SMC-R with Connect X3 cards. The mlx4 driver does
> >> not seem to contain extra coding for IB_SEND_INLINE flag for
> >> ib_post_send. Does this mean for SMC-R to run on Connect X3 cards the
> >> IB_SEND_INLINE flag is ignored, and thus I needed the
> >> ib_dma_map_single() call for the area used with ib_post_send()? Does
> >> this mean I should stay away from the IB_SEND_INLINE flag, if I want
> >> to run the same SMC-R code with both, Connect X3 cards and Connect X4
> cards?
> >>
> > I had encountered the same kernel panic that you mentioned last week on
> ConnectX-4 adapters with smc-r on x86_64.
> > Shall I submit below fix to netdev mailing list?
> > I have tested above change. I also have optimization that avoids dma mapping
> for wr_tx_dma_addr.
> >
> > -               lnk->wr_tx_sges[i].addr =
> > -                       lnk->wr_tx_dma_addr + i * SMC_WR_BUF_SIZE;
> > +               lnk->wr_tx_sges[i].addr = (uintptr_t)(lnk->wr_tx_bufs
> > + + i);
> >
> > I also have fix for processing IB_SEND_INLINE in mlx4 driver on little older
> kernel base.
> > I have attached below. I can rebase my kernel and provide fix in mlx5_ib driver.
> > Let me know.
> >
> > Regards,
> > Parav Pandit
> >
> > diff --git a/drivers/infiniband/hw/mlx4/qp.c
> > b/drivers/infiniband/hw/mlx4/qp.c index a2e4ca5..0d984f5 100644
> > --- a/drivers/infiniband/hw/mlx4/qp.c
> > +++ b/drivers/infiniband/hw/mlx4/qp.c
> > @@ -2748,6 +2748,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct
> ib_send_wr *wr,
> >  	unsigned long flags;
> >  	int nreq;
> >  	int err = 0;
> > +	int inl = 0;
> >  	unsigned ind;
> >  	int uninitialized_var(stamp);
> >  	int uninitialized_var(size);
> > @@ -2958,30 +2959,97 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct
> ib_send_wr *wr,
> >  		default:
> >  			break;
> >  		}
> > +		if (wr->send_flags & IB_SEND_INLINE && wr->num_sge) {
> > +			struct mlx4_wqe_inline_seg *seg;
> > +			void *addr;
> > +			int len, seg_len;
> > +			int num_seg;
> > +			int off, to_copy;
> > +
> > +			inl = 0;
> > +
> > +			seg = wqe;
> > +			wqe += sizeof *seg;
> > +			off = ((uintptr_t) wqe) & (MLX4_INLINE_ALIGN - 1);
> > +			num_seg = 0;
> > +			seg_len = 0;
> > +
> > +			for (i = 0; i < wr->num_sge; ++i) {
> > +				addr = (void *) (uintptr_t) wr->sg_list[i].addr;
> > +				len  = wr->sg_list[i].length;
> > +				inl += len;
> > +
> > +				if (inl > 16) {
> > +					inl = 0;
> > +					err = ENOMEM;
> > +					*bad_wr = wr;
> > +					goto out;
> > +				}
> SMC-R fails due to this check. inl is 44 here. Why is 16 a limit for
> IB_SEND_INLINE data?
> The SMC-R code calls ib_create_qp() with max_inline_data=44. And the function
> does not seem to return an error.
> >

This check should be against the QP's max_inline_data variable.
It was just an error check that I should have fixed; I was testing with NVMe, where the inline data was only worth 16 bytes.
I will fix this. Is it possible to change the limit to 44 and do a quick test?
The final patch will have the right check here, in addition to the check in create_qp.
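
[Editorial note] A sketch of the corrected bound Parav describes; qp->max_inline_data as a cached per-QP field is an assumption about the eventual patch, and note the negative errno (the posted patch's "err = ENOMEM" is missing the minus sign):

/* Hypothetical corrected check: bound the accumulated inline length
 * by the limit negotiated at QP creation instead of a hard-coded 16.
 */
if (unlikely(inl > qp->max_inline_data)) {	/* assumed per-QP field */
	inl = 0;
	err = -ENOMEM;	/* negative errno, per kernel convention */
	*bad_wr = wr;
	goto out;
}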

> > -		/*
> > -		 * Write data segments in reverse order, so as to
> > -		 * overwrite cacheline stamp last within each
> > -		 * cacheline.  This avoids issues with WQE
> > -		 * prefetching.
> > -		 */
> > +				while (len >= MLX4_INLINE_ALIGN - off) {
> > +					to_copy = MLX4_INLINE_ALIGN - off;
> > +					memcpy(wqe, addr, to_copy);
> > +					len -= to_copy;
> > +					wqe += to_copy;
> > +					addr += to_copy;
> > +					seg_len += to_copy;
> > +					wmb(); /* see comment below */
> > +					seg->byte_count =
> htonl(MLX4_INLINE_SEG | seg_len);
> > +					seg_len = 0;
> > +					seg = wqe;
> > +					wqe += sizeof *seg;
> > +					off = sizeof *seg;
> > +					++num_seg;
> > +				}
> >
> > -		dseg = wqe;
> > -		dseg += wr->num_sge - 1;
> > -		size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) /
> 16);
> > +				memcpy(wqe, addr, len);
> > +				wqe += len;
> > +				seg_len += len;
> > +				off += len;
> > +			}
> >
> > -		/* Add one more inline data segment for ICRC for MLX sends */
> > -		if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI ||
> > -			     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI ||
> > -			     qp->mlx4_ib_qp_type &
> > -			     (MLX4_IB_QPT_PROXY_SMI_OWNER |
> MLX4_IB_QPT_TUN_SMI_OWNER))) {
> > -			set_mlx_icrc_seg(dseg + 1);
> > -			size += sizeof (struct mlx4_wqe_data_seg) / 16;
> > -		}
> > +			if (seg_len) {
> > +				++num_seg;
> > +				/*
> > +				 * Need a barrier here to make sure
> > +				 * all the data is visible before the
> > +				 * byte_count field is set.  Otherwise
> > +				 * the HCA prefetcher could grab the
> > +				 * 64-byte chunk with this inline
> > +				 * segment and get a valid (!=
> > +				 * 0xffffffff) byte count but stale
> > +				 * data, and end up sending the wrong
> > +				 * data.
> > +				 */
> > +				wmb();
> > +				seg->byte_count = htonl(MLX4_INLINE_SEG |
> seg_len);
> > +			}
> >
> > -		for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
> > -			set_data_seg(dseg, wr->sg_list + i);
> > +			size += (inl + num_seg * sizeof (*seg) + 15) / 16;
> > +		} else {
> > +			/*
> > +			 * Write data segments in reverse order, so as to
> > +			 * overwrite cacheline stamp last within each
> > +			 * cacheline.  This avoids issues with WQE
> > +			 * prefetching.
> > +			 */
> > +
> > +			dseg = wqe;
> > +			dseg += wr->num_sge - 1;
> > +			size += wr->num_sge * (sizeof (struct
> mlx4_wqe_data_seg) / 16);
> > +
> > +			/* Add one more inline data segment for ICRC for MLX
> sends */
> > +			if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI
> ||
> > +				     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI
> ||
> > +				     qp->mlx4_ib_qp_type &
> > +				     (MLX4_IB_QPT_PROXY_SMI_OWNER |
> MLX4_IB_QPT_TUN_SMI_OWNER))) {
> > +				set_mlx_icrc_seg(dseg + 1);
> > +				size += sizeof (struct mlx4_wqe_data_seg) / 16;
> > +			}
> >
> > +			for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
> > +				set_data_seg(dseg, wr->sg_list + i);
> > +		}
> >  		/*
> >  		 * Possibly overwrite stamping in cacheline with LSO
> >  		 * segment only after making sure all data segments
> >
Ursula Braun March 16, 2017, 11:51 a.m. UTC | #4
Hi Parav,

I ran your new mlx4 code together with changed SMC-R code that no longer
maps the IB_SEND_INLINE area. It worked - great!

Below I have added a small improvement idea in your patch.

Nevertheless I am still not sure if I should keep the IB_SEND_INLINE flag
in the SMC-R code, since there is no guarantee that this will work with
all kinds of RoCE devices. The maximum length for IB_SEND_INLINE depends
on the RoCE driver - right? Is there an interface to determine such a
maximum length? Would ib_create_qp() return with an error if the
SMC-R specified .cap.max_inline_data = 44 is not supported by a RoCE driver?
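
[Editorial note] The thread's answer (see Parav's reply in #5) is that a provider is supposed to fail ib_create_qp() for an unsupported cap.max_inline_data, and the ULP should fall back. A minimal sketch of such a probe-and-fallback; the smc_link_sketch structure and function name are hypothetical stand-ins, not the actual SMC-R code:

#include <linux/err.h>
#include <rdma/ib_verbs.h>

struct smc_link_sketch {		/* hypothetical stand-in for struct smc_link */
	struct ib_pd *pd;
	struct ib_qp *qp;
	bool use_inline;		/* set IB_SEND_INLINE on tx WRs? */
};

/* Try to create the QP with 44 bytes of inline data; if the provider
 * rejects that, retry without inline so the caller keeps using
 * ib_dma_map_single()ed send buffers as before.
 */
static int smc_sketch_create_qp(struct smc_link_sketch *lnk,
				struct ib_qp_init_attr *attr)
{
	attr->cap.max_inline_data = 44;		/* SMC-R send WR payload size */
	lnk->qp = ib_create_qp(lnk->pd, attr);
	if (!IS_ERR(lnk->qp)) {
		lnk->use_inline = true;
		return 0;
	}

	attr->cap.max_inline_data = 0;		/* fall back: no IB_SEND_INLINE */
	lnk->qp = ib_create_qp(lnk->pd, attr);
	if (IS_ERR(lnk->qp))
		return PTR_ERR(lnk->qp);
	lnk->use_inline = false;
	return 0;
}

This only helps if providers actually fail the first call for an unsupported limit - which, as the next message notes, mlx4 at this point does not.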

On 03/14/2017 04:24 PM, Parav Pandit wrote:
> Hi Ursula,
> 
> 
>> -----Original Message-----
>> From: Ursula Braun [mailto:ubraun@linux.vnet.ibm.com]
>> Sent: Tuesday, March 14, 2017 10:02 AM
>> To: Parav Pandit <parav@mellanox.com>; Eli Cohen <eli@mellanox.com>;
>> Matan Barak <matanb@mellanox.com>
>> Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky
>> <leonro@mellanox.com>; linux-rdma@vger.kernel.org
>> Subject: Re: Fwd: mlx5_ib_post_send panic on s390x
>>
>> Hi Parav,
>>
>> I tried your mlx4-patch together with SMC on s390x, but it failed.
>> The SMC-R code tries to send 44 bytes as inline in 1 sge.
>> I wonder about a length check with 16 bytes, which probably explains the
>> failure.
>> See my question below in the patch:
>>
>> On 03/12/2017 09:20 PM, Parav Pandit wrote:
>>> Hi Ursula,
>>>
>>>> -----Original Message-----
>>>> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
>>>> owner@vger.kernel.org] On Behalf Of Ursula Braun
>>>> Sent: Thursday, March 9, 2017 3:54 AM
>>>> To: Eli Cohen <eli@mellanox.com>; Matan Barak <matanb@mellanox.com>
>>>> Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky
>>>> <leonro@mellanox.com>; linux-rdma@vger.kernel.org
>>>> Subject: Re: Fwd: mlx5_ib_post_send panic on s390x
>>>>
>>>>
>>>>
>>>> On 03/06/2017 02:08 PM, Eli Cohen wrote:
>>>>>>>
>>>>>>> The problem seems to be caused by the usage of plain memcpy in
>>>> set_data_inl_seg().
>>>>>>> The address provided by SMC-code in struct ib_send_wr *wr is an
>>>>>>> address belonging to an area mapped with the ib_dma_map_single()
>>>>>>> call. On s390x those kind of addresses require extra access
>>>>>>> functions (see
>>>> arch/s390/include/asm/io.h).
>>>>>>>
>>>>>
>>>>> By definition, when you are posting a send request with inline, the
>>>>> address
>>>> must be mapped to the cpu so plain memcpy should work.
>>>>>
>>>> In the past I run SMC-R with Connect X3 cards. The mlx4 driver does
>>>> not seem to contain extra coding for IB_SEND_INLINE flag for
>>>> ib_post_send. Does this mean for SMC-R to run on Connect X3 cards the
>>>> IB_SEND_INLINE flag is ignored, and thus I needed the
>>>> ib_dma_map_single() call for the area used with ib_post_send()? Does
>>>> this mean I should stay away from the IB_SEND_INLINE flag, if I want
>>>> to run the same SMC-R code with both, Connect X3 cards and Connect X4
>> cards?
>>>>
>>> I had encountered the same kernel panic that you mentioned last week on
>> ConnectX-4 adapters with smc-r on x86_64.
>>> Shall I submit below fix to netdev mailing list?
>>> I have tested above change. I also have optimization that avoids dma mapping
>> for wr_tx_dma_addr.
>>>
>>> -               lnk->wr_tx_sges[i].addr =
>>> -                       lnk->wr_tx_dma_addr + i * SMC_WR_BUF_SIZE;
>>> +               lnk->wr_tx_sges[i].addr = (uintptr_t)(lnk->wr_tx_bufs
>>> + + i);
>>>
>>> I also have fix for processing IB_SEND_INLINE in mlx4 driver on little older
>> kernel base.
>>> I have attached below. I can rebase my kernel and provide fix in mlx5_ib driver.
>>> Let me know.
>>>
>>> Regards,
>>> Parav Pandit
>>>
>>> diff --git a/drivers/infiniband/hw/mlx4/qp.c
>>> b/drivers/infiniband/hw/mlx4/qp.c index a2e4ca5..0d984f5 100644
>>> --- a/drivers/infiniband/hw/mlx4/qp.c
>>> +++ b/drivers/infiniband/hw/mlx4/qp.c
>>> @@ -2748,6 +2748,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct
>> ib_send_wr *wr,
>>>  	unsigned long flags;
>>>  	int nreq;
>>>  	int err = 0;
>>> +	int inl = 0;
>>>  	unsigned ind;
>>>  	int uninitialized_var(stamp);
>>>  	int uninitialized_var(size);
>>> @@ -2958,30 +2959,97 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct
>> ib_send_wr *wr,
>>>  		default:
>>>  			break;
>>>  		}
>>> +		if (wr->send_flags & IB_SEND_INLINE && wr->num_sge) {
>>> +			struct mlx4_wqe_inline_seg *seg;
>>> +			void *addr;
>>> +			int len, seg_len;
>>> +			int num_seg;
>>> +			int off, to_copy;
>>> +
>>> +			inl = 0;
>>> +
>>> +			seg = wqe;
>>> +			wqe += sizeof *seg;
>>> +			off = ((uintptr_t) wqe) & (MLX4_INLINE_ALIGN - 1);
>>> +			num_seg = 0;
>>> +			seg_len = 0;
>>> +
>>> +			for (i = 0; i < wr->num_sge; ++i) {
>>> +				addr = (void *) (uintptr_t) wr->sg_list[i].addr;
>>> +				len  = wr->sg_list[i].length;
>>> +				inl += len;
>>> +
>>> +				if (inl > 16) {
>>> +					inl = 0;
>>> +					err = ENOMEM;
>>> +					*bad_wr = wr;
>>> +					goto out;
>>> +				}
>> SMC-R fails due to this check. inl is 44 here. Why is 16 a limit for
>> IB_SEND_INLINE data?
>> The SMC-R code calls ib_create_qp() with max_inline_data=44. And the function
>> does not seem to return an error.
>>>
> This check should be for max_inline_data variable of the QP.
> This was just for error check, I should have fixed it. I was testing with nvme where inline data was only worth 16 bytes.
> I will fix this. Is it possible to change to 44 and do quick test?
> Final patch will have right check in addition to check in create_qp?
> 
>>> -		/*
>>> -		 * Write data segments in reverse order, so as to
>>> -		 * overwrite cacheline stamp last within each
>>> -		 * cacheline.  This avoids issues with WQE
>>> -		 * prefetching.
>>> -		 */
>>> +				while (len >= MLX4_INLINE_ALIGN - off) {
With this code there are two memcpy calls, one with to_copy=44 and the next one with len 0.
I suggest changing the check to "len > MLX4_INLINE_ALIGN - off".
>>> +					to_copy = MLX4_INLINE_ALIGN - off;
>>> +					memcpy(wqe, addr, to_copy);
>>> +					len -= to_copy;
>>> +					wqe += to_copy;
>>> +					addr += to_copy;
>>> +					seg_len += to_copy;
>>> +					wmb(); /* see comment below */
>>> +					seg->byte_count =
>> htonl(MLX4_INLINE_SEG | seg_len);
>>> +					seg_len = 0;
>>> +					seg = wqe;
>>> +					wqe += sizeof *seg;
>>> +					off = sizeof *seg;
>>> +					++num_seg;
>>> +				}
>>>
>>> -		dseg = wqe;
>>> -		dseg += wr->num_sge - 1;
>>> -		size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) /
>> 16);
>>> +				memcpy(wqe, addr, len);
>>> +				wqe += len;
>>> +				seg_len += len;
>>> +				off += len;
>>> +			}
>>>
>>> -		/* Add one more inline data segment for ICRC for MLX sends */
>>> -		if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI ||
>>> -			     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI ||
>>> -			     qp->mlx4_ib_qp_type &
>>> -			     (MLX4_IB_QPT_PROXY_SMI_OWNER |
>> MLX4_IB_QPT_TUN_SMI_OWNER))) {
>>> -			set_mlx_icrc_seg(dseg + 1);
>>> -			size += sizeof (struct mlx4_wqe_data_seg) / 16;
>>> -		}
>>> +			if (seg_len) {
>>> +				++num_seg;
>>> +				/*
>>> +				 * Need a barrier here to make sure
>>> +				 * all the data is visible before the
>>> +				 * byte_count field is set.  Otherwise
>>> +				 * the HCA prefetcher could grab the
>>> +				 * 64-byte chunk with this inline
>>> +				 * segment and get a valid (!=
>>> +				 * 0xffffffff) byte count but stale
>>> +				 * data, and end up sending the wrong
>>> +				 * data.
>>> +				 */
>>> +				wmb();
>>> +				seg->byte_count = htonl(MLX4_INLINE_SEG |
>> seg_len);
>>> +			}
>>>
>>> -		for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
>>> -			set_data_seg(dseg, wr->sg_list + i);
>>> +			size += (inl + num_seg * sizeof (*seg) + 15) / 16;
>>> +		} else {
>>> +			/*
>>> +			 * Write data segments in reverse order, so as to
>>> +			 * overwrite cacheline stamp last within each
>>> +			 * cacheline.  This avoids issues with WQE
>>> +			 * prefetching.
>>> +			 */
>>> +
>>> +			dseg = wqe;
>>> +			dseg += wr->num_sge - 1;
>>> +			size += wr->num_sge * (sizeof (struct
>> mlx4_wqe_data_seg) / 16);
>>> +
>>> +			/* Add one more inline data segment for ICRC for MLX
>> sends */
>>> +			if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI
>> ||
>>> +				     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI
>> ||
>>> +				     qp->mlx4_ib_qp_type &
>>> +				     (MLX4_IB_QPT_PROXY_SMI_OWNER |
>> MLX4_IB_QPT_TUN_SMI_OWNER))) {
>>> +				set_mlx_icrc_seg(dseg + 1);
>>> +				size += sizeof (struct mlx4_wqe_data_seg) / 16;
>>> +			}
>>>
>>> +			for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
>>> +				set_data_seg(dseg, wr->sg_list + i);
>>> +		}
>>>  		/*
>>>  		 * Possibly overwrite stamping in cacheline with LSO
>>>  		 * segment only after making sure all data segments
>>>
> 

Parav Pandit March 20, 2017, 9:04 p.m. UTC | #5
Hi Ursula,

Regarding the suggestion: the code still needs to check for len >= MLX4_INLINE_ALIGN - off, because 44 = 64 - 20, which is still a valid case (len == MLX4_INLINE_ALIGN - off).
But I agree that it shouldn't do a 2nd memcpy with zero length.
Therefore there should be an additional check for len != 0.

Coming to the IB_SEND_INLINE part: when ib_create_qp is called and the HCA doesn't support cap.max_inline_data, the provider HCA driver is supposed to fail the call.
And the ULP is expected to fall back to a non-inline scheme.

As it appears, the mlx4 driver is not failing this call, which is a bug that needs a fix.
Instead of failing the call, I prefer to provide the data path sooner, based on my inline patch in this email thread.

Parav

> -----Original Message-----
> From: Ursula Braun [mailto:ubraun@linux.vnet.ibm.com]
> Sent: Thursday, March 16, 2017 6:51 AM
> To: Parav Pandit <parav@mellanox.com>; Eli Cohen <eli@mellanox.com>;
> Matan Barak <matanb@mellanox.com>
> Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky
> <leonro@mellanox.com>; linux-rdma@vger.kernel.org
> Subject: Re: Fwd: mlx5_ib_post_send panic on s390x
> 
> Hi Parav,
> 
> I run your new mlx4-Code together with changed SMC-R code no longer
> mapping the IB_SEND_INLINE area. It worked - great!
> 
> Below I have added a small improvement idea in your patch.
> 
> Nevertheless I am still not sure, if I should keep the IB_SEND_INLINE flag in
> the SMC-R code, since there is no guarantee that this will work with all kinds
> of RoCE-devices. The maximum length for IB_SEND_INLINE depends on the
> RoCE-driver - right? Is there an interface to determine such a maximum
> length? Would ib_create_qp() return with an error, if the SMC-R specified
> .cap.max_inline_data = 44 is not supported by a RoCE-driver?

Patch

diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index a2e4ca5..0d984f5 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -2748,6 +2748,7 @@  int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
 	unsigned long flags;
 	int nreq;
 	int err = 0;
+	int inl = 0;
 	unsigned ind;
 	int uninitialized_var(stamp);
 	int uninitialized_var(size);
@@ -2958,30 +2959,97 @@  int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
 		default:
 			break;
 		}
+		if (wr->send_flags & IB_SEND_INLINE && wr->num_sge) {
+			struct mlx4_wqe_inline_seg *seg;
+			void *addr;
+			int len, seg_len;
+			int num_seg;
+			int off, to_copy;
+
+			inl = 0;
+
+			seg = wqe;
+			wqe += sizeof *seg;
+			off = ((uintptr_t) wqe) & (MLX4_INLINE_ALIGN - 1);
+			num_seg = 0;
+			seg_len = 0;
+
+			for (i = 0; i < wr->num_sge; ++i) {
+				addr = (void *) (uintptr_t) wr->sg_list[i].addr;
+				len  = wr->sg_list[i].length;
+				inl += len;
+
+				if (inl > 16) {
+					inl = 0;
+					err = ENOMEM;
+					*bad_wr = wr;
+					goto out;
+				}
 
-		/*
-		 * Write data segments in reverse order, so as to
-		 * overwrite cacheline stamp last within each
-		 * cacheline.  This avoids issues with WQE
-		 * prefetching.
-		 */
+				while (len >= MLX4_INLINE_ALIGN - off) {
+					to_copy = MLX4_INLINE_ALIGN - off;
+					memcpy(wqe, addr, to_copy);
+					len -= to_copy;
+					wqe += to_copy;
+					addr += to_copy;
+					seg_len += to_copy;
+					wmb(); /* see comment below */
+					seg->byte_count = htonl(MLX4_INLINE_SEG | seg_len);
+					seg_len = 0;
+					seg = wqe;
+					wqe += sizeof *seg;
+					off = sizeof *seg;
+					++num_seg;
+				}
 
-		dseg = wqe;
-		dseg += wr->num_sge - 1;
-		size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) / 16);
+				memcpy(wqe, addr, len);
+				wqe += len;
+				seg_len += len;
+				off += len;
+			}
 
-		/* Add one more inline data segment for ICRC for MLX sends */
-		if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI ||
-			     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI ||
-			     qp->mlx4_ib_qp_type &
-			     (MLX4_IB_QPT_PROXY_SMI_OWNER | MLX4_IB_QPT_TUN_SMI_OWNER))) {
-			set_mlx_icrc_seg(dseg + 1);
-			size += sizeof (struct mlx4_wqe_data_seg) / 16;
-		}
+			if (seg_len) {
+				++num_seg;
+				/*
+				 * Need a barrier here to make sure
+				 * all the data is visible before the
+				 * byte_count field is set.  Otherwise
+				 * the HCA prefetcher could grab the
+				 * 64-byte chunk with this inline
+				 * segment and get a valid (!=
+				 * 0xffffffff) byte count but stale
+				 * data, and end up sending the wrong
+				 * data.
+				 */
+				wmb();
+				seg->byte_count = htonl(MLX4_INLINE_SEG | seg_len);
+			}
 
-		for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
-			set_data_seg(dseg, wr->sg_list + i);
+			size += (inl + num_seg * sizeof (*seg) + 15) / 16;
+		} else {
+			/*
+			 * Write data segments in reverse order, so as to
+			 * overwrite cacheline stamp last within each
+			 * cacheline.  This avoids issues with WQE
+			 * prefetching.
+			 */
+
+			dseg = wqe;
+			dseg += wr->num_sge - 1;
+			size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) / 16);
+
+			/* Add one more inline data segment for ICRC for MLX sends */
+			if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI ||
+				     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI ||
+				     qp->mlx4_ib_qp_type &
+				     (MLX4_IB_QPT_PROXY_SMI_OWNER | MLX4_IB_QPT_TUN_SMI_OWNER))) {
+				set_mlx_icrc_seg(dseg + 1);
+				size += sizeof (struct mlx4_wqe_data_seg) / 16;
+			}
 
+			for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
+				set_data_seg(dseg, wr->sg_list + i);
+		}
 		/*
 		 * Possibly overwrite stamping in cacheline with LSO
 		 * segment only after making sure all data segments