Message ID | VI1PR0502MB300817FC6256218DE800497BD1220@VI1PR0502MB3008.eurprd05.prod.outlook.com (mailing list archive) |
---|---|
State | Not Applicable |
I meant mlx4_ib* driver below. Sorry for typo. > -----Original Message----- > From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma- > owner@vger.kernel.org] On Behalf Of Parav Pandit > Sent: Sunday, March 12, 2017 3:21 PM > To: Ursula Braun <ubraun@linux.vnet.ibm.com>; Eli Cohen > <eli@mellanox.com>; Matan Barak <matanb@mellanox.com> > Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky > <leonro@mellanox.com>; linux-rdma@vger.kernel.org > Subject: RE: Fwd: mlx5_ib_post_send panic on s390x > > Hi Ursula, > > > -----Original Message----- > > From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma- > > owner@vger.kernel.org] On Behalf Of Ursula Braun > > Sent: Thursday, March 9, 2017 3:54 AM > > To: Eli Cohen <eli@mellanox.com>; Matan Barak <matanb@mellanox.com> > > Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky > > <leonro@mellanox.com>; linux-rdma@vger.kernel.org > > Subject: Re: Fwd: mlx5_ib_post_send panic on s390x > > > > > > > > On 03/06/2017 02:08 PM, Eli Cohen wrote: > > >>> > > >>> The problem seems to be caused by the usage of plain memcpy in > > set_data_inl_seg(). > > >>> The address provided by SMC-code in struct ib_send_wr *wr is an > > >>> address belonging to an area mapped with the ib_dma_map_single() > > >>> call. On s390x those kind of addresses require extra access > > >>> functions (see > > arch/s390/include/asm/io.h). > > >>> > > > > > > By definition, when you are posting a send request with inline, the > > > address > > must be mapped to the cpu so plain memcpy should work. > > > > > In the past I run SMC-R with Connect X3 cards. The mlx4 driver does > > not seem to contain extra coding for IB_SEND_INLINE flag for > > ib_post_send. Does this mean for SMC-R to run on Connect X3 cards the > > IB_SEND_INLINE flag is ignored, and thus I needed the > > ib_dma_map_single() call for the area used with ib_post_send()? Does > > this mean I should stay away from the IB_SEND_INLINE flag, if I want > > to run the same SMC-R code with both, Connect X3 cards and Connect X4 > cards? > > > I had encountered the same kernel panic that you mentioned last week on > ConnectX-4 adapters with smc-r on x86_64. > Shall I submit below fix to netdev mailing list? > I have tested above change. I also have optimization that avoids dma mapping > for wr_tx_dma_addr. > > - lnk->wr_tx_sges[i].addr = > - lnk->wr_tx_dma_addr + i * SMC_WR_BUF_SIZE; > + lnk->wr_tx_sges[i].addr = (uintptr_t)(lnk->wr_tx_bufs + > + i); > > I also have fix for processing IB_SEND_INLINE in mlx4 driver on little older > kernel base. > I have attached below. I can rebase my kernel and provide fix in mlx5_ib driver. > Let me know. 
> > Regards, > Parav Pandit > > diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c > index a2e4ca5..0d984f5 100644 > --- a/drivers/infiniband/hw/mlx4/qp.c > +++ b/drivers/infiniband/hw/mlx4/qp.c > @@ -2748,6 +2748,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct > ib_send_wr *wr, > unsigned long flags; > int nreq; > int err = 0; > + int inl = 0; > unsigned ind; > int uninitialized_var(stamp); > int uninitialized_var(size); > @@ -2958,30 +2959,97 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct > ib_send_wr *wr, > default: > break; > } > + if (wr->send_flags & IB_SEND_INLINE && wr->num_sge) { > + struct mlx4_wqe_inline_seg *seg; > + void *addr; > + int len, seg_len; > + int num_seg; > + int off, to_copy; > + > + inl = 0; > + > + seg = wqe; > + wqe += sizeof *seg; > + off = ((uintptr_t) wqe) & (MLX4_INLINE_ALIGN - 1); > + num_seg = 0; > + seg_len = 0; > + > + for (i = 0; i < wr->num_sge; ++i) { > + addr = (void *) (uintptr_t) wr->sg_list[i].addr; > + len = wr->sg_list[i].length; > + inl += len; > + > + if (inl > 16) { > + inl = 0; > + err = ENOMEM; > + *bad_wr = wr; > + goto out; > + } > > - /* > - * Write data segments in reverse order, so as to > - * overwrite cacheline stamp last within each > - * cacheline. This avoids issues with WQE > - * prefetching. > - */ > + while (len >= MLX4_INLINE_ALIGN - off) { > + to_copy = MLX4_INLINE_ALIGN - off; > + memcpy(wqe, addr, to_copy); > + len -= to_copy; > + wqe += to_copy; > + addr += to_copy; > + seg_len += to_copy; > + wmb(); /* see comment below */ > + seg->byte_count = > htonl(MLX4_INLINE_SEG | seg_len); > + seg_len = 0; > + seg = wqe; > + wqe += sizeof *seg; > + off = sizeof *seg; > + ++num_seg; > + } > > - dseg = wqe; > - dseg += wr->num_sge - 1; > - size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) / > 16); > + memcpy(wqe, addr, len); > + wqe += len; > + seg_len += len; > + off += len; > + } > > - /* Add one more inline data segment for ICRC for MLX sends */ > - if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI || > - qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI || > - qp->mlx4_ib_qp_type & > - (MLX4_IB_QPT_PROXY_SMI_OWNER | > MLX4_IB_QPT_TUN_SMI_OWNER))) { > - set_mlx_icrc_seg(dseg + 1); > - size += sizeof (struct mlx4_wqe_data_seg) / 16; > - } > + if (seg_len) { > + ++num_seg; > + /* > + * Need a barrier here to make sure > + * all the data is visible before the > + * byte_count field is set. Otherwise > + * the HCA prefetcher could grab the > + * 64-byte chunk with this inline > + * segment and get a valid (!= > + * 0xffffffff) byte count but stale > + * data, and end up sending the wrong > + * data. > + */ > + wmb(); > + seg->byte_count = htonl(MLX4_INLINE_SEG | > seg_len); > + } > > - for (i = wr->num_sge - 1; i >= 0; --i, --dseg) > - set_data_seg(dseg, wr->sg_list + i); > + size += (inl + num_seg * sizeof (*seg) + 15) / 16; > + } else { > + /* > + * Write data segments in reverse order, so as to > + * overwrite cacheline stamp last within each > + * cacheline. This avoids issues with WQE > + * prefetching. 
> + */ > + > + dseg = wqe; > + dseg += wr->num_sge - 1; > + size += wr->num_sge * (sizeof (struct > mlx4_wqe_data_seg) / 16); > + > + /* Add one more inline data segment for ICRC for MLX > sends */ > + if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI > || > + qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI > || > + qp->mlx4_ib_qp_type & > + (MLX4_IB_QPT_PROXY_SMI_OWNER | > MLX4_IB_QPT_TUN_SMI_OWNER))) { > + set_mlx_icrc_seg(dseg + 1); > + size += sizeof (struct mlx4_wqe_data_seg) / 16; > + } > > + for (i = wr->num_sge - 1; i >= 0; --i, --dseg) > + set_data_seg(dseg, wr->sg_list + i); > + } > /* > * Possibly overwrite stamping in cacheline with LSO > * segment only after making sure all data segments > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-rdma" > > in the body of a message to majordomo@vger.kernel.org More majordomo > > info at http://vger.kernel.org/majordomo-info.html
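For context on the SMC-R change quoted above: with IB_SEND_INLINE the provider copies the payload into the WQE on the CPU, so the SGE can carry the CPU-virtual address of the send buffer instead of a DMA-mapped address. The following is a minimal, hypothetical sketch of such a post, not the SMC-R code itself; it assumes the QP was created with qp_init_attr.cap.max_inline_data >= len.

#include <linux/err.h>
#include <rdma/ib_verbs.h>

/*
 * Sketch only: post a small buffer inline on an RC QP. Because the
 * provider memcpy()s the payload into the WQE, no ib_dma_map_single()
 * mapping of 'buf' is needed for this path.
 */
static int post_small_send_inline(struct ib_qp *qp, void *buf, u32 len)
{
	struct ib_sge sge = {
		.addr   = (uintptr_t)buf,	/* CPU-virtual address is sufficient */
		.length = len,
		.lkey   = 0,			/* typically ignored for inline sends */
	};
	struct ib_send_wr wr = {
		.sg_list    = &sge,
		.num_sge    = 1,
		.opcode     = IB_WR_SEND,
		.send_flags = IB_SEND_INLINE | IB_SEND_SIGNALED,
	};
	struct ib_send_wr *bad_wr;

	return ib_post_send(qp, &wr, &bad_wr);
}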
Hi Parav, I tried your mlx4-patch together with SMC on s390x, but it failed. The SMC-R code tries to send 44 bytes as inline in 1 sge. I wonder about a length check with 16 bytes, which probably explains the failure. See my question below in the patch: On 03/12/2017 09:20 PM, Parav Pandit wrote: > Hi Ursula, > >> -----Original Message----- >> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma- >> owner@vger.kernel.org] On Behalf Of Ursula Braun >> Sent: Thursday, March 9, 2017 3:54 AM >> To: Eli Cohen <eli@mellanox.com>; Matan Barak <matanb@mellanox.com> >> Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky >> <leonro@mellanox.com>; linux-rdma@vger.kernel.org >> Subject: Re: Fwd: mlx5_ib_post_send panic on s390x >> >> >> >> On 03/06/2017 02:08 PM, Eli Cohen wrote: >>>>> >>>>> The problem seems to be caused by the usage of plain memcpy in >> set_data_inl_seg(). >>>>> The address provided by SMC-code in struct ib_send_wr *wr is an >>>>> address belonging to an area mapped with the ib_dma_map_single() >>>>> call. On s390x those kind of addresses require extra access functions (see >> arch/s390/include/asm/io.h). >>>>> >>> >>> By definition, when you are posting a send request with inline, the address >> must be mapped to the cpu so plain memcpy should work. >>> >> In the past I run SMC-R with Connect X3 cards. The mlx4 driver does not seem to >> contain extra coding for IB_SEND_INLINE flag for ib_post_send. Does this mean >> for SMC-R to run on Connect X3 cards the IB_SEND_INLINE flag is ignored, and >> thus I needed the ib_dma_map_single() call for the area used with >> ib_post_send()? Does this mean I should stay away from the IB_SEND_INLINE >> flag, if I want to run the same SMC-R code with both, Connect X3 cards and >> Connect X4 cards? >> > I had encountered the same kernel panic that you mentioned last week on ConnectX-4 adapters with smc-r on x86_64. > Shall I submit below fix to netdev mailing list? > I have tested above change. I also have optimization that avoids dma mapping for wr_tx_dma_addr. > > - lnk->wr_tx_sges[i].addr = > - lnk->wr_tx_dma_addr + i * SMC_WR_BUF_SIZE; > + lnk->wr_tx_sges[i].addr = (uintptr_t)(lnk->wr_tx_bufs + i); > > I also have fix for processing IB_SEND_INLINE in mlx4 driver on little older kernel base. > I have attached below. I can rebase my kernel and provide fix in mlx5_ib driver. > Let me know. > > Regards, > Parav Pandit > > diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c > index a2e4ca5..0d984f5 100644 > --- a/drivers/infiniband/hw/mlx4/qp.c > +++ b/drivers/infiniband/hw/mlx4/qp.c > @@ -2748,6 +2748,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, > unsigned long flags; > int nreq; > int err = 0; > + int inl = 0; > unsigned ind; > int uninitialized_var(stamp); > int uninitialized_var(size); > @@ -2958,30 +2959,97 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, > default: > break; > } > + if (wr->send_flags & IB_SEND_INLINE && wr->num_sge) { > + struct mlx4_wqe_inline_seg *seg; > + void *addr; > + int len, seg_len; > + int num_seg; > + int off, to_copy; > + > + inl = 0; > + > + seg = wqe; > + wqe += sizeof *seg; > + off = ((uintptr_t) wqe) & (MLX4_INLINE_ALIGN - 1); > + num_seg = 0; > + seg_len = 0; > + > + for (i = 0; i < wr->num_sge; ++i) { > + addr = (void *) (uintptr_t) wr->sg_list[i].addr; > + len = wr->sg_list[i].length; > + inl += len; > + > + if (inl > 16) { > + inl = 0; > + err = ENOMEM; > + *bad_wr = wr; > + goto out; > + } SMC-R fails due to this check. 
inl is 44 here. Why is 16 a limit for IB_SEND_INLINE data? The SMC-R code calls ib_create_qp() with max_inline_data=44. And the function does not seem to return an error. > > - /* > - * Write data segments in reverse order, so as to > - * overwrite cacheline stamp last within each > - * cacheline. This avoids issues with WQE > - * prefetching. > - */ > + while (len >= MLX4_INLINE_ALIGN - off) { > + to_copy = MLX4_INLINE_ALIGN - off; > + memcpy(wqe, addr, to_copy); > + len -= to_copy; > + wqe += to_copy; > + addr += to_copy; > + seg_len += to_copy; > + wmb(); /* see comment below */ > + seg->byte_count = htonl(MLX4_INLINE_SEG | seg_len); > + seg_len = 0; > + seg = wqe; > + wqe += sizeof *seg; > + off = sizeof *seg; > + ++num_seg; > + } > > - dseg = wqe; > - dseg += wr->num_sge - 1; > - size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) / 16); > + memcpy(wqe, addr, len); > + wqe += len; > + seg_len += len; > + off += len; > + } > > - /* Add one more inline data segment for ICRC for MLX sends */ > - if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI || > - qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI || > - qp->mlx4_ib_qp_type & > - (MLX4_IB_QPT_PROXY_SMI_OWNER | MLX4_IB_QPT_TUN_SMI_OWNER))) { > - set_mlx_icrc_seg(dseg + 1); > - size += sizeof (struct mlx4_wqe_data_seg) / 16; > - } > + if (seg_len) { > + ++num_seg; > + /* > + * Need a barrier here to make sure > + * all the data is visible before the > + * byte_count field is set. Otherwise > + * the HCA prefetcher could grab the > + * 64-byte chunk with this inline > + * segment and get a valid (!= > + * 0xffffffff) byte count but stale > + * data, and end up sending the wrong > + * data. > + */ > + wmb(); > + seg->byte_count = htonl(MLX4_INLINE_SEG | seg_len); > + } > > - for (i = wr->num_sge - 1; i >= 0; --i, --dseg) > - set_data_seg(dseg, wr->sg_list + i); > + size += (inl + num_seg * sizeof (*seg) + 15) / 16; > + } else { > + /* > + * Write data segments in reverse order, so as to > + * overwrite cacheline stamp last within each > + * cacheline. This avoids issues with WQE > + * prefetching. > + */ > + > + dseg = wqe; > + dseg += wr->num_sge - 1; > + size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) / 16); > + > + /* Add one more inline data segment for ICRC for MLX sends */ > + if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI || > + qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI || > + qp->mlx4_ib_qp_type & > + (MLX4_IB_QPT_PROXY_SMI_OWNER | MLX4_IB_QPT_TUN_SMI_OWNER))) { > + set_mlx_icrc_seg(dseg + 1); > + size += sizeof (struct mlx4_wqe_data_seg) / 16; > + } > > + for (i = wr->num_sge - 1; i >= 0; --i, --dseg) > + set_data_seg(dseg, wr->sg_list + i); > + } > /* > * Possibly overwrite stamping in cacheline with LSO > * segment only after making sure all data segments > >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body >> of a message to majordomo@vger.kernel.org More majordomo info at >> http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Ursula, > -----Original Message----- > From: Ursula Braun [mailto:ubraun@linux.vnet.ibm.com] > Sent: Tuesday, March 14, 2017 10:02 AM > To: Parav Pandit <parav@mellanox.com>; Eli Cohen <eli@mellanox.com>; > Matan Barak <matanb@mellanox.com> > Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky > <leonro@mellanox.com>; linux-rdma@vger.kernel.org > Subject: Re: Fwd: mlx5_ib_post_send panic on s390x > > Hi Parav, > > I tried your mlx4-patch together with SMC on s390x, but it failed. > The SMC-R code tries to send 44 bytes as inline in 1 sge. > I wonder about a length check with 16 bytes, which probably explains the > failure. > See my question below in the patch: > > On 03/12/2017 09:20 PM, Parav Pandit wrote: > > Hi Ursula, > > > >> -----Original Message----- > >> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma- > >> owner@vger.kernel.org] On Behalf Of Ursula Braun > >> Sent: Thursday, March 9, 2017 3:54 AM > >> To: Eli Cohen <eli@mellanox.com>; Matan Barak <matanb@mellanox.com> > >> Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky > >> <leonro@mellanox.com>; linux-rdma@vger.kernel.org > >> Subject: Re: Fwd: mlx5_ib_post_send panic on s390x > >> > >> > >> > >> On 03/06/2017 02:08 PM, Eli Cohen wrote: > >>>>> > >>>>> The problem seems to be caused by the usage of plain memcpy in > >> set_data_inl_seg(). > >>>>> The address provided by SMC-code in struct ib_send_wr *wr is an > >>>>> address belonging to an area mapped with the ib_dma_map_single() > >>>>> call. On s390x those kind of addresses require extra access > >>>>> functions (see > >> arch/s390/include/asm/io.h). > >>>>> > >>> > >>> By definition, when you are posting a send request with inline, the > >>> address > >> must be mapped to the cpu so plain memcpy should work. > >>> > >> In the past I run SMC-R with Connect X3 cards. The mlx4 driver does > >> not seem to contain extra coding for IB_SEND_INLINE flag for > >> ib_post_send. Does this mean for SMC-R to run on Connect X3 cards the > >> IB_SEND_INLINE flag is ignored, and thus I needed the > >> ib_dma_map_single() call for the area used with ib_post_send()? Does > >> this mean I should stay away from the IB_SEND_INLINE flag, if I want > >> to run the same SMC-R code with both, Connect X3 cards and Connect X4 > cards? > >> > > I had encountered the same kernel panic that you mentioned last week on > ConnectX-4 adapters with smc-r on x86_64. > > Shall I submit below fix to netdev mailing list? > > I have tested above change. I also have optimization that avoids dma mapping > for wr_tx_dma_addr. > > > > - lnk->wr_tx_sges[i].addr = > > - lnk->wr_tx_dma_addr + i * SMC_WR_BUF_SIZE; > > + lnk->wr_tx_sges[i].addr = (uintptr_t)(lnk->wr_tx_bufs > > + + i); > > > > I also have fix for processing IB_SEND_INLINE in mlx4 driver on little older > kernel base. > > I have attached below. I can rebase my kernel and provide fix in mlx5_ib driver. > > Let me know. 
> > > > Regards, > > Parav Pandit > > > > diff --git a/drivers/infiniband/hw/mlx4/qp.c > > b/drivers/infiniband/hw/mlx4/qp.c index a2e4ca5..0d984f5 100644 > > --- a/drivers/infiniband/hw/mlx4/qp.c > > +++ b/drivers/infiniband/hw/mlx4/qp.c > > @@ -2748,6 +2748,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct > ib_send_wr *wr, > > unsigned long flags; > > int nreq; > > int err = 0; > > + int inl = 0; > > unsigned ind; > > int uninitialized_var(stamp); > > int uninitialized_var(size); > > @@ -2958,30 +2959,97 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct > ib_send_wr *wr, > > default: > > break; > > } > > + if (wr->send_flags & IB_SEND_INLINE && wr->num_sge) { > > + struct mlx4_wqe_inline_seg *seg; > > + void *addr; > > + int len, seg_len; > > + int num_seg; > > + int off, to_copy; > > + > > + inl = 0; > > + > > + seg = wqe; > > + wqe += sizeof *seg; > > + off = ((uintptr_t) wqe) & (MLX4_INLINE_ALIGN - 1); > > + num_seg = 0; > > + seg_len = 0; > > + > > + for (i = 0; i < wr->num_sge; ++i) { > > + addr = (void *) (uintptr_t) wr->sg_list[i].addr; > > + len = wr->sg_list[i].length; > > + inl += len; > > + > > + if (inl > 16) { > > + inl = 0; > > + err = ENOMEM; > > + *bad_wr = wr; > > + goto out; > > + } > SMC-R fails due to this check. inl is 44 here. Why is 16 a limit for > IB_SEND_INLINE data? > The SMC-R code calls ib_create_qp() with max_inline_data=44. And the function > does not seem to return an error. > > This check should be for max_inline_data variable of the QP. This was just for error check, I should have fixed it. I was testing with nvme where inline data was only worth 16 bytes. I will fix this. Is it possible to change to 44 and do quick test? Final patch will have right check in addition to check in create_qp? > > - /* > > - * Write data segments in reverse order, so as to > > - * overwrite cacheline stamp last within each > > - * cacheline. This avoids issues with WQE > > - * prefetching. > > - */ > > + while (len >= MLX4_INLINE_ALIGN - off) { > > + to_copy = MLX4_INLINE_ALIGN - off; > > + memcpy(wqe, addr, to_copy); > > + len -= to_copy; > > + wqe += to_copy; > > + addr += to_copy; > > + seg_len += to_copy; > > + wmb(); /* see comment below */ > > + seg->byte_count = > htonl(MLX4_INLINE_SEG | seg_len); > > + seg_len = 0; > > + seg = wqe; > > + wqe += sizeof *seg; > > + off = sizeof *seg; > > + ++num_seg; > > + } > > > > - dseg = wqe; > > - dseg += wr->num_sge - 1; > > - size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) / > 16); > > + memcpy(wqe, addr, len); > > + wqe += len; > > + seg_len += len; > > + off += len; > > + } > > > > - /* Add one more inline data segment for ICRC for MLX sends */ > > - if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI || > > - qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI || > > - qp->mlx4_ib_qp_type & > > - (MLX4_IB_QPT_PROXY_SMI_OWNER | > MLX4_IB_QPT_TUN_SMI_OWNER))) { > > - set_mlx_icrc_seg(dseg + 1); > > - size += sizeof (struct mlx4_wqe_data_seg) / 16; > > - } > > + if (seg_len) { > > + ++num_seg; > > + /* > > + * Need a barrier here to make sure > > + * all the data is visible before the > > + * byte_count field is set. Otherwise > > + * the HCA prefetcher could grab the > > + * 64-byte chunk with this inline > > + * segment and get a valid (!= > > + * 0xffffffff) byte count but stale > > + * data, and end up sending the wrong > > + * data. 
> > + */ > > + wmb(); > > + seg->byte_count = htonl(MLX4_INLINE_SEG | > seg_len); > > + } > > > > - for (i = wr->num_sge - 1; i >= 0; --i, --dseg) > > - set_data_seg(dseg, wr->sg_list + i); > > + size += (inl + num_seg * sizeof (*seg) + 15) / 16; > > + } else { > > + /* > > + * Write data segments in reverse order, so as to > > + * overwrite cacheline stamp last within each > > + * cacheline. This avoids issues with WQE > > + * prefetching. > > + */ > > + > > + dseg = wqe; > > + dseg += wr->num_sge - 1; > > + size += wr->num_sge * (sizeof (struct > mlx4_wqe_data_seg) / 16); > > + > > + /* Add one more inline data segment for ICRC for MLX > sends */ > > + if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI > || > > + qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI > || > > + qp->mlx4_ib_qp_type & > > + (MLX4_IB_QPT_PROXY_SMI_OWNER | > MLX4_IB_QPT_TUN_SMI_OWNER))) { > > + set_mlx_icrc_seg(dseg + 1); > > + size += sizeof (struct mlx4_wqe_data_seg) / 16; > > + } > > > > + for (i = wr->num_sge - 1; i >= 0; --i, --dseg) > > + set_data_seg(dseg, wr->sg_list + i); > > + } > > /* > > * Possibly overwrite stamping in cacheline with LSO > > * segment only after making sure all data segments > > > >> -- > >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" > >> in the body of a message to majordomo@vger.kernel.org More majordomo > >> info at http://vger.kernel.org/majordomo-info.html
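To make the fix Parav describes concrete, here is a sketch of the bound check with the hard-coded 16 replaced by a per-QP limit. The field qp->max_inline_data is an illustrative assumption, standing for whatever the driver records from qp_init_attr.cap.max_inline_data at QP creation time; the rest of the loop is as in the patch above.

	for (i = 0; i < wr->num_sge; ++i) {
		addr = (void *)(uintptr_t)wr->sg_list[i].addr;
		len  = wr->sg_list[i].length;
		inl += len;

		/*
		 * Reject the post once the accumulated inline payload exceeds
		 * what this QP was created with, instead of a hard-coded 16.
		 * qp->max_inline_data is an assumed/illustrative field name.
		 */
		if (inl > qp->max_inline_data) {
			inl = 0;
			err = -ENOMEM;	/* negative errno, as in the other error paths */
			*bad_wr = wr;
			goto out;
		}

		/* inline copy of sg_list[i] into the WQE continues as in the patch */
	}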
Hi Parav, I run your new mlx4-Code together with changed SMC-R code no longer mapping the IB_SEND_INLINE area. It worked - great! Below I have added a small improvement idea in your patch. Nevertheless I am still not sure, if I should keep the IB_SEND_INLINE flag in the SMC-R code, since there is no guarantee that this will work with all kinds of RoCE-devices. The maximum length for IB_SEND_INLINE depends on the RoCE-driver - right? Is there an interface to determine such a maximum length? Would ib_create_qp() return with an error, if the SMC-R specified .cap.max_inline_data = 44 is not supported by a RoCE-driver? On 03/14/2017 04:24 PM, Parav Pandit wrote: > Hi Ursula, > > >> -----Original Message----- >> From: Ursula Braun [mailto:ubraun@linux.vnet.ibm.com] >> Sent: Tuesday, March 14, 2017 10:02 AM >> To: Parav Pandit <parav@mellanox.com>; Eli Cohen <eli@mellanox.com>; >> Matan Barak <matanb@mellanox.com> >> Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky >> <leonro@mellanox.com>; linux-rdma@vger.kernel.org >> Subject: Re: Fwd: mlx5_ib_post_send panic on s390x >> >> Hi Parav, >> >> I tried your mlx4-patch together with SMC on s390x, but it failed. >> The SMC-R code tries to send 44 bytes as inline in 1 sge. >> I wonder about a length check with 16 bytes, which probably explains the >> failure. >> See my question below in the patch: >> >> On 03/12/2017 09:20 PM, Parav Pandit wrote: >>> Hi Ursula, >>> >>>> -----Original Message----- >>>> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma- >>>> owner@vger.kernel.org] On Behalf Of Ursula Braun >>>> Sent: Thursday, March 9, 2017 3:54 AM >>>> To: Eli Cohen <eli@mellanox.com>; Matan Barak <matanb@mellanox.com> >>>> Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky >>>> <leonro@mellanox.com>; linux-rdma@vger.kernel.org >>>> Subject: Re: Fwd: mlx5_ib_post_send panic on s390x >>>> >>>> >>>> >>>> On 03/06/2017 02:08 PM, Eli Cohen wrote: >>>>>>> >>>>>>> The problem seems to be caused by the usage of plain memcpy in >>>> set_data_inl_seg(). >>>>>>> The address provided by SMC-code in struct ib_send_wr *wr is an >>>>>>> address belonging to an area mapped with the ib_dma_map_single() >>>>>>> call. On s390x those kind of addresses require extra access >>>>>>> functions (see >>>> arch/s390/include/asm/io.h). >>>>>>> >>>>> >>>>> By definition, when you are posting a send request with inline, the >>>>> address >>>> must be mapped to the cpu so plain memcpy should work. >>>>> >>>> In the past I run SMC-R with Connect X3 cards. The mlx4 driver does >>>> not seem to contain extra coding for IB_SEND_INLINE flag for >>>> ib_post_send. Does this mean for SMC-R to run on Connect X3 cards the >>>> IB_SEND_INLINE flag is ignored, and thus I needed the >>>> ib_dma_map_single() call for the area used with ib_post_send()? Does >>>> this mean I should stay away from the IB_SEND_INLINE flag, if I want >>>> to run the same SMC-R code with both, Connect X3 cards and Connect X4 >> cards? >>>> >>> I had encountered the same kernel panic that you mentioned last week on >> ConnectX-4 adapters with smc-r on x86_64. >>> Shall I submit below fix to netdev mailing list? >>> I have tested above change. I also have optimization that avoids dma mapping >> for wr_tx_dma_addr. >>> >>> - lnk->wr_tx_sges[i].addr = >>> - lnk->wr_tx_dma_addr + i * SMC_WR_BUF_SIZE; >>> + lnk->wr_tx_sges[i].addr = (uintptr_t)(lnk->wr_tx_bufs >>> + + i); >>> >>> I also have fix for processing IB_SEND_INLINE in mlx4 driver on little older >> kernel base. >>> I have attached below. 
I can rebase my kernel and provide fix in mlx5_ib driver. >>> Let me know. >>> >>> Regards, >>> Parav Pandit >>> >>> diff --git a/drivers/infiniband/hw/mlx4/qp.c >>> b/drivers/infiniband/hw/mlx4/qp.c index a2e4ca5..0d984f5 100644 >>> --- a/drivers/infiniband/hw/mlx4/qp.c >>> +++ b/drivers/infiniband/hw/mlx4/qp.c >>> @@ -2748,6 +2748,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct >> ib_send_wr *wr, >>> unsigned long flags; >>> int nreq; >>> int err = 0; >>> + int inl = 0; >>> unsigned ind; >>> int uninitialized_var(stamp); >>> int uninitialized_var(size); >>> @@ -2958,30 +2959,97 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct >> ib_send_wr *wr, >>> default: >>> break; >>> } >>> + if (wr->send_flags & IB_SEND_INLINE && wr->num_sge) { >>> + struct mlx4_wqe_inline_seg *seg; >>> + void *addr; >>> + int len, seg_len; >>> + int num_seg; >>> + int off, to_copy; >>> + >>> + inl = 0; >>> + >>> + seg = wqe; >>> + wqe += sizeof *seg; >>> + off = ((uintptr_t) wqe) & (MLX4_INLINE_ALIGN - 1); >>> + num_seg = 0; >>> + seg_len = 0; >>> + >>> + for (i = 0; i < wr->num_sge; ++i) { >>> + addr = (void *) (uintptr_t) wr->sg_list[i].addr; >>> + len = wr->sg_list[i].length; >>> + inl += len; >>> + >>> + if (inl > 16) { >>> + inl = 0; >>> + err = ENOMEM; >>> + *bad_wr = wr; >>> + goto out; >>> + } >> SMC-R fails due to this check. inl is 44 here. Why is 16 a limit for >> IB_SEND_INLINE data? >> The SMC-R code calls ib_create_qp() with max_inline_data=44. And the function >> does not seem to return an error. >>> > This check should be for max_inline_data variable of the QP. > This was just for error check, I should have fixed it. I was testing with nvme where inline data was only worth 16 bytes. > I will fix this. Is it possible to change to 44 and do quick test? > Final patch will have right check in addition to check in create_qp? > >>> - /* >>> - * Write data segments in reverse order, so as to >>> - * overwrite cacheline stamp last within each >>> - * cacheline. This avoids issues with WQE >>> - * prefetching. >>> - */ >>> + while (len >= MLX4_INLINE_ALIGN - off) { With this code there are 2 memcpy-Calls, one with to_copy=44, and the next one with len 0. I suggest to change the check to "len > MLX4_INLINE_ALIGN - off". >>> + to_copy = MLX4_INLINE_ALIGN - off; >>> + memcpy(wqe, addr, to_copy); >>> + len -= to_copy; >>> + wqe += to_copy; >>> + addr += to_copy; >>> + seg_len += to_copy; >>> + wmb(); /* see comment below */ >>> + seg->byte_count = >> htonl(MLX4_INLINE_SEG | seg_len); >>> + seg_len = 0; >>> + seg = wqe; >>> + wqe += sizeof *seg; >>> + off = sizeof *seg; >>> + ++num_seg; >>> + } >>> >>> - dseg = wqe; >>> - dseg += wr->num_sge - 1; >>> - size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) / >> 16); >>> + memcpy(wqe, addr, len); >>> + wqe += len; >>> + seg_len += len; >>> + off += len; >>> + } >>> >>> - /* Add one more inline data segment for ICRC for MLX sends */ >>> - if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI || >>> - qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI || >>> - qp->mlx4_ib_qp_type & >>> - (MLX4_IB_QPT_PROXY_SMI_OWNER | >> MLX4_IB_QPT_TUN_SMI_OWNER))) { >>> - set_mlx_icrc_seg(dseg + 1); >>> - size += sizeof (struct mlx4_wqe_data_seg) / 16; >>> - } >>> + if (seg_len) { >>> + ++num_seg; >>> + /* >>> + * Need a barrier here to make sure >>> + * all the data is visible before the >>> + * byte_count field is set. 
Otherwise >>> + * the HCA prefetcher could grab the >>> + * 64-byte chunk with this inline >>> + * segment and get a valid (!= >>> + * 0xffffffff) byte count but stale >>> + * data, and end up sending the wrong >>> + * data. >>> + */ >>> + wmb(); >>> + seg->byte_count = htonl(MLX4_INLINE_SEG | >> seg_len); >>> + } >>> >>> - for (i = wr->num_sge - 1; i >= 0; --i, --dseg) >>> - set_data_seg(dseg, wr->sg_list + i); >>> + size += (inl + num_seg * sizeof (*seg) + 15) / 16; >>> + } else { >>> + /* >>> + * Write data segments in reverse order, so as to >>> + * overwrite cacheline stamp last within each >>> + * cacheline. This avoids issues with WQE >>> + * prefetching. >>> + */ >>> + >>> + dseg = wqe; >>> + dseg += wr->num_sge - 1; >>> + size += wr->num_sge * (sizeof (struct >> mlx4_wqe_data_seg) / 16); >>> + >>> + /* Add one more inline data segment for ICRC for MLX >> sends */ >>> + if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI >> || >>> + qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI >> || >>> + qp->mlx4_ib_qp_type & >>> + (MLX4_IB_QPT_PROXY_SMI_OWNER | >> MLX4_IB_QPT_TUN_SMI_OWNER))) { >>> + set_mlx_icrc_seg(dseg + 1); >>> + size += sizeof (struct mlx4_wqe_data_seg) / 16; >>> + } >>> >>> + for (i = wr->num_sge - 1; i >= 0; --i, --dseg) >>> + set_data_seg(dseg, wr->sg_list + i); >>> + } >>> /* >>> * Possibly overwrite stamping in cacheline with LSO >>> * segment only after making sure all data segments >>> >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" >>>> in the body of a message to majordomo@vger.kernel.org More majordomo >>>> info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
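Regarding Ursula's question: the requested inline capacity is negotiated through qp_init_attr.cap.max_inline_data at ib_create_qp() time. A provider that cannot honor it is expected to fail the call (Parav confirms this in his reply below), and the ULP is then expected to fall back to a non-inline send path. A minimal sketch of such a fallback follows; the 44-byte size and the use_inline flag are illustrative assumptions, not the actual SMC-R code.

#include <linux/err.h>
#include <rdma/ib_verbs.h>

/*
 * Sketch: ask for inline support and fall back if the provider refuses.
 * '44' mirrors the SMC-R control-message size discussed in this thread.
 */
static struct ib_qp *create_tx_qp(struct ib_pd *pd,
				  struct ib_qp_init_attr *attr,
				  bool *use_inline)
{
	struct ib_qp *qp;

	attr->cap.max_inline_data = 44;
	qp = ib_create_qp(pd, attr);
	if (!IS_ERR(qp)) {
		*use_inline = true;
		return qp;
	}

	/*
	 * Provider rejected the inline request: retry without it and keep
	 * posting from ib_dma_map_single()'d buffers, without IB_SEND_INLINE.
	 */
	attr->cap.max_inline_data = 0;
	*use_inline = false;
	return ib_create_qp(pd, attr);
}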
Hi Ursula,

For the suggestion, it still needs to keep checking for len >= MLX4_INLINE_ALIGN - off, because 44 = 64 - 20, which is still a valid case (len == MLX4_INLINE_ALIGN - off). But I agree that it shouldn't do a 2nd memcpy with zero length. Therefore there should be an additional check for len != 0.

Coming to the IB_SEND_INLINE data part: when ib_create_qp() is called and the HCA doesn't support cap.max_inline_data, the provider HCA driver is supposed to fail the call, and the ULP is expected to fall back to a non-inline scheme.

As it appears, the mlx4 driver is not failing this call, which is a bug that needs a fix. Instead of failing the call, I prefer to provide the data path sooner, based on my inline patch in this email thread.

Parav

> -----Original Message-----
> From: Ursula Braun [mailto:ubraun@linux.vnet.ibm.com]
> Sent: Thursday, March 16, 2017 6:51 AM
> To: Parav Pandit <parav@mellanox.com>; Eli Cohen <eli@mellanox.com>; Matan Barak <matanb@mellanox.com>
> Cc: Saeed Mahameed <saeedm@mellanox.com>; Leon Romanovsky <leonro@mellanox.com>; linux-rdma@vger.kernel.org
> Subject: Re: Fwd: mlx5_ib_post_send panic on s390x
>
> Hi Parav,
>
> I run your new mlx4-Code together with changed SMC-R code no longer mapping the IB_SEND_INLINE area. It worked - great!
>
> Below I have added a small improvement idea in your patch.
>
> Nevertheless I am still not sure, if I should keep the IB_SEND_INLINE flag in the SMC-R code, since there is no guarantee that this will work with all kinds of RoCE-devices. The maximum length for IB_SEND_INLINE depends on the RoCE-driver - right? Is there an interface to determine such a maximum length? Would ib_create_qp() return with an error, if the SMC-R specified .cap.max_inline_data = 44 is not supported by a RoCE-driver?
>
> >>> +				while (len >= MLX4_INLINE_ALIGN - off) {
> With this code there are 2 memcpy-Calls, one with to_copy=44, and the next one with len 0.
> I suggest to change the check to "len > MLX4_INLINE_ALIGN - off".

-- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
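Combining Ursula's observation with Parav's answer above: the loop condition stays at >= (a segment that exactly fills the rest of the cacheline, such as 44 bytes with off == 20, is valid), and the zero-length trailing memcpy() is simply skipped. A sketch of the adjusted copy path follows, using the variables from the patch; this is an illustration only, not the patch that was eventually submitted.

	while (len >= MLX4_INLINE_ALIGN - off) {
		to_copy = MLX4_INLINE_ALIGN - off;
		memcpy(wqe, addr, to_copy);
		len     -= to_copy;
		wqe     += to_copy;
		addr    += to_copy;
		seg_len += to_copy;
		wmb();	/* data must be visible before byte_count is set */
		seg->byte_count = htonl(MLX4_INLINE_SEG | seg_len);
		seg_len = 0;
		seg = wqe;
		wqe += sizeof(*seg);
		off  = sizeof(*seg);
		++num_seg;
	}

	/*
	 * Copy a tail only if one is left over; this avoids the second
	 * memcpy() with length 0 seen in the 44-byte (44 = 64 - 20) case.
	 */
	if (len) {
		memcpy(wqe, addr, len);
		wqe     += len;
		seg_len += len;
		off     += len;
	}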
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index a2e4ca5..0d984f5 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -2748,6 +2748,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
 	unsigned long flags;
 	int nreq;
 	int err = 0;
+	int inl = 0;
 	unsigned ind;
 	int uninitialized_var(stamp);
 	int uninitialized_var(size);
@@ -2958,30 +2959,97 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
 		default:
 			break;
 		}
+		if (wr->send_flags & IB_SEND_INLINE && wr->num_sge) {
+			struct mlx4_wqe_inline_seg *seg;
+			void *addr;
+			int len, seg_len;
+			int num_seg;
+			int off, to_copy;
+
+			inl = 0;
+
+			seg = wqe;
+			wqe += sizeof *seg;
+			off = ((uintptr_t) wqe) & (MLX4_INLINE_ALIGN - 1);
+			num_seg = 0;
+			seg_len = 0;
+
+			for (i = 0; i < wr->num_sge; ++i) {
+				addr = (void *) (uintptr_t) wr->sg_list[i].addr;
+				len = wr->sg_list[i].length;
+				inl += len;
+
+				if (inl > 16) {
+					inl = 0;
+					err = ENOMEM;
+					*bad_wr = wr;
+					goto out;
+				}
 
-		/*
-		 * Write data segments in reverse order, so as to
-		 * overwrite cacheline stamp last within each
-		 * cacheline.  This avoids issues with WQE
-		 * prefetching.
-		 */
+				while (len >= MLX4_INLINE_ALIGN - off) {
+					to_copy = MLX4_INLINE_ALIGN - off;
+					memcpy(wqe, addr, to_copy);
+					len -= to_copy;
+					wqe += to_copy;
+					addr += to_copy;
+					seg_len += to_copy;
+					wmb(); /* see comment below */
+					seg->byte_count = htonl(MLX4_INLINE_SEG | seg_len);
+					seg_len = 0;
+					seg = wqe;
+					wqe += sizeof *seg;
+					off = sizeof *seg;
+					++num_seg;
+				}
 
-		dseg = wqe;
-		dseg += wr->num_sge - 1;
-		size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) / 16);
+				memcpy(wqe, addr, len);
+				wqe += len;
+				seg_len += len;
+				off += len;
+			}
 
-		/* Add one more inline data segment for ICRC for MLX sends */
-		if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI ||
-			     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI ||
-			     qp->mlx4_ib_qp_type &
-			     (MLX4_IB_QPT_PROXY_SMI_OWNER | MLX4_IB_QPT_TUN_SMI_OWNER))) {
-			set_mlx_icrc_seg(dseg + 1);
-			size += sizeof (struct mlx4_wqe_data_seg) / 16;
-		}
+			if (seg_len) {
+				++num_seg;
+				/*
+				 * Need a barrier here to make sure
+				 * all the data is visible before the
+				 * byte_count field is set.  Otherwise
+				 * the HCA prefetcher could grab the
+				 * 64-byte chunk with this inline
+				 * segment and get a valid (!=
+				 * 0xffffffff) byte count but stale
+				 * data, and end up sending the wrong
+				 * data.
+				 */
+				wmb();
+				seg->byte_count = htonl(MLX4_INLINE_SEG | seg_len);
+			}
 
-		for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
-			set_data_seg(dseg, wr->sg_list + i);
+			size += (inl + num_seg * sizeof (*seg) + 15) / 16;
+		} else {
+			/*
+			 * Write data segments in reverse order, so as to
+			 * overwrite cacheline stamp last within each
+			 * cacheline.  This avoids issues with WQE
+			 * prefetching.
+			 */
+
+			dseg = wqe;
+			dseg += wr->num_sge - 1;
+			size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) / 16);
+
+			/* Add one more inline data segment for ICRC for MLX sends */
+			if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI ||
+				     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI ||
+				     qp->mlx4_ib_qp_type &
+				     (MLX4_IB_QPT_PROXY_SMI_OWNER | MLX4_IB_QPT_TUN_SMI_OWNER))) {
+				set_mlx_icrc_seg(dseg + 1);
+				size += sizeof (struct mlx4_wqe_data_seg) / 16;
+			}
 
+			for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
+				set_data_seg(dseg, wr->sg_list + i);
+		}
 		/*
 		 * Possibly overwrite stamping in cacheline with LSO
 		 * segment only after making sure all data segments
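As a sanity check on the size accounting in the inline branch, the arithmetic for the 44-byte SMC-R send discussed in this thread works out as follows, assuming sizeof(struct mlx4_wqe_inline_seg) is 4 bytes (the segment header is just its byte_count word):

	/*
	 * One SGE of 44 bytes, carried in a single inline segment:
	 *
	 *   inl     = 44
	 *   num_seg = 1
	 *   size   += (inl + num_seg * sizeof(*seg) + 15) / 16
	 *           = (44 + 1 * 4 + 15) / 16
	 *           = 63 / 16
	 *           = 3   -> three 16-byte WQE chunks for the inline data
	 */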