Message ID | 896e9a9e-43b6-7a21-e41b-861e4f795436@mellanox.com (mailing list archive) |
---|---|
State | Not Applicable |
----- Original Message -----
> From: "Max Gurtovoy" <maxg@mellanox.com>
> To: "Laurence Oberman" <loberman@redhat.com>, "Leon Romanovsky" <leonro@mellanox.com>
> Cc: "Bart Van Assche" <bart.vanassche@sandisk.com>, "Doug Ledford" <dledford@redhat.com>, "Sagi Grimberg"
> <sagi@grimberg.me>, "Israel Rukshin" <israelr@mellanox.com>, linux-rdma@vger.kernel.org
> Sent: Wednesday, April 26, 2017 4:31:57 AM
> Subject: Re: [PATCH, untested] mlx5: Avoid that mlx5_ib_sg_to_klms() overflows the klms[] array
>
>
>
> On 4/25/2017 11:37 PM, Laurence Oberman wrote:
> >
> >
> > ----- Original Message -----
> >> From: "Leon Romanovsky" <leonro@mellanox.com>
> >> To: "Bart Van Assche" <bart.vanassche@sandisk.com>
> >> Cc: "Doug Ledford" <dledford@redhat.com>, "Max Gurtovoy"
> >> <maxg@mellanox.com>, "Sagi Grimberg" <sagi@grimberg.me>,
> >> "Israel Rukshin" <israelr@mellanox.com>, "Laurence Oberman"
> >> <loberman@redhat.com>, linux-rdma@vger.kernel.org
> >> Sent: Tuesday, April 25, 2017 1:58:49 PM
> >> Subject: Re: [PATCH, untested] mlx5: Avoid that mlx5_ib_sg_to_klms()
> >> overflows the klms[] array
> >>
> >> On Mon, Apr 24, 2017 at 03:15:28PM -0700, Bart Van Assche wrote:
> >>> ib_map_mr_sg() can pass an SG-list to .map_mr_sg() that is larger
> >>> than what fits into a single MR. .map_mr_sg() must not attempt to
> >>> map more SG-list elements than what fits into a single MR.
> >>> Hence make sure that mlx5_ib_sg_to_klms() does not write outside
> >>> the MR klms[] array.
> >>>
> >>> Fixes: b005d3164713 ("mlx5: Add arbitrary sg list support")
> >>> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> >>> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
> >>> Cc: Sagi Grimberg <sagi@grimberg.me>
> >>> Cc: Leon Romanovsky <leonro@mellanox.com>
> >>> Cc: Israel Rukshin <israelr@mellanox.com>
> >>> Cc: <stable@vger.kernel.org>
> >>> ---
> >>>  drivers/infiniband/hw/mlx5/mr.c | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>
> >> Bart,
> >>
> >> Thanks a lot, it indeed looks right.
> >> Acked-by: Leon Romanovsky <leonro@mellanox.com>
> >>
> >> Thanks
> >>
> >
> >
> > Hello Bart, Leon, Max and Israel.
> >
> > I cloned off Barts tree.
> >
> > git clone https://github.com/bvanassche/linux
> > cd linux
> > git checkout block-scsi-for-next
> >
> > I checked all patches were in for this test.
> >
> > a83e404 IB/srp: Reenable IB_MR_TYPE_SG_GAPS
> > dfa5a2b mlx5: Avoid that mlx5_ib_sg_to_klms() overflows the klms[] array
> > f759c80 mlx5: Fix mlx5_ib_map_mr_sg mr lengt
>
> Hi,
> copying Sagi's request from different thread:
>
> "
> Can you please enable srp_add_one debug:
>
> echo "func srp_add_one +p" > /sys/kernel/debug/dynamic_debug/control
>
> In addition apply the following:
> --
> diff --git a/drivers/infiniband/hw/mlx5/mr.c
> b/drivers/infiniband/hw/mlx5/mr.c
> index d9c6c0ea750b..040fbc387e4f 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -1403,6 +1403,8 @@ mlx5_alloc_priv_descs(struct ib_device *device,
>          int add_size;
>          int ret;
>
> +       WARN_ON_ONCE(ndescs > device->attr.max_fast_reg_page_list_len);
> +
>          add_size = max_t(int, MLX5_UMR_ALIGN - ARCH_KMALLOC_MINALIGN, 0);
>
>          mr->descs_alloc = kzalloc(size + add_size, GFP_KERNEL);
>
> "
>
> Max.
>
> >
> > Built and tested the kernel.
> >
> > However this issue is not resolved :(
> >
> > [ 2707.931909] scsi host1: ib_srp: failed RECV status WR flushed (5) for
> > CQE ffff8817edca86b0
> > [ 2708.089806] mlx5_0:dump_cqe:262:(pid 20129): dump error cqe
> > [ 2708.121342] 00000000 00000000 00000000 00000000
> > [ 2708.147104] 00000000 00000000 00000000 00000000
> > [ 2708.172633] 00000000 00000000 00000000 00000000
> > [ 2708.198702] 00000000 0f007806 2500002a 14a527d0
> > [ 2732.434127] scsi host1: ib_srp: reconnect succeeded
> > [ 2733.048023] scsi host1: ib_srp: failed RECV status WR flushed (5) for
> > CQE ffff8817ed0a9c30
> >
> > [root@localhost ~]# [ 2746.413277] mlx5_0:dump_cqe:262:(pid 15877): dump
> > error cqe
> > [ 2746.443240] 00000000 00000000 00000000 00000000
> > [ 2746.469323] 00000000 00000000 00000000 00000000
> > [ 2746.495310] 00000000 00000000 00000000 00000000
> > [ 2746.521407] 00000000 0f007806 25000032 003c7ad0
> > [ 2752.445899] scsi host1: ib_srp: reconnect succeeded
> > [ 2752.481835] scsi host1: ib_srp: failed RECV status WR flushed (5) for
> > CQE ffff8817ed0a9cf0
> > [ 2763.267386] mlx5_0:dump_cqe:262:(pid 15877): dump error cqe
> > [ 2763.297826] 00000000 00000000 00000000 00000000
> > [ 2763.323352] 00000000 00000000 00000000 00000000
> > [ 2763.348722] 00000000 00000000 00000000 00000000
> > [ 2763.374681] 00000000 0f007806 2500003a 00084bd0
> >
> > [root@localhost ~]# [ 2769.385203] fast_io_fail_tmo expired for SRP
> > port-1:1 / host1.
> > [ 2769.415956] scsi host1: ib_srp: reconnect succeeded
> > [ 2769.450258] scsi host1: ib_srp: failed RECV status WR flushed (5) for
> > CQE ffff8817ed0a9cf0
> > [ 2780.064627] mlx5_0:dump_cqe:262:(pid 18771): dump error cqe
> > [ 2780.093520] 00000000 00000000 00000000 00000000
> > [ 2780.120067] 00000000 00000000 00000000 00000000
> > [ 2780.145575] 00000000 00000000 00000000 00000000
> > [ 2780.171153] 00000000 0f007806 25000042 000833d0
> > [ 2785.923399] scsi host1: ib_srp: reconnect succeeded
> > [ 2785.957504] scsi host1: ib_srp: failed RECV status WR flushed (5) for
> > CQE ffff8817ed0a9cf0
> > [ 2796.463426] mlx5_0:dump_cqe:262:(pid 18771): dump error cqe
> > [ 2796.495257] 00000000 00000000 00000000 00000000
> > [ 2796.521506] 00000000 00000000 00000000 00000000
> > [ 2796.547640] 00000000 00000000 00000000 00000000
> > [ 2796.573120] 00000000 0f007806 2500004a 00083bd0
> > [ 2802.562578] scsi host1: ib_srp: reconnect succeeded
> > [ 2802.596880] scsi host1: ib_srp: failed RECV status WR flushed (5) for
> > CQE ffff8817ed0a9cf0
> >
> > Regards
> > Laurence
> >
>

Doing this now
Thanks
Laurence
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
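
The patch description quoted in this message says that a `.map_mr_sg()` implementation must not map more SG-list elements than fit into a single MR. A minimal user-space sketch of that bounded-fill pattern follows. It is not the mlx5 driver code: `struct seg`, `struct desc`, `struct mr_sketch` and `fill_descs()` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for an SG-list element and an MR descriptor. */
struct seg  { uint64_t addr; uint32_t len; };
struct desc { uint64_t addr; uint32_t len; };

struct mr_sketch {
    struct desc *descs;   /* descriptor array with max_descs slots */
    size_t max_descs;     /* capacity of descs[] */
    size_t ndescs;        /* slots actually filled by the last call */
};

/*
 * Copy at most mr->max_descs segments into mr->descs[] and return how
 * many were mapped. Segments beyond the capacity are left untouched so
 * that the caller can map them with another MR.
 */
size_t fill_descs(struct mr_sketch *mr, const struct seg *sgl, size_t nents)
{
    size_t i;

    assert(mr != NULL && sgl != NULL);

    for (i = 0; i < nents; i++) {
        if (i >= mr->max_descs)   /* stop before writing past the last slot */
            break;
        mr->descs[i].addr = sgl[i].addr;
        mr->descs[i].len  = sgl[i].len;
    }
    mr->ndescs = i;
    return i;
}
```

Returning the number of elements actually mapped, rather than failing outright, lets the caller notice a short mapping and continue with the remaining elements.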
----- Original Message -----
> From: "Laurence Oberman" <loberman@redhat.com>
> To: "Max Gurtovoy" <maxg@mellanox.com>
> Cc: "Leon Romanovsky" <leonro@mellanox.com>, "Bart Van Assche" <bart.vanassche@sandisk.com>, "Doug Ledford"
> <dledford@redhat.com>, "Sagi Grimberg" <sagi@grimberg.me>, "Israel Rukshin" <israelr@mellanox.com>,
> linux-rdma@vger.kernel.org
> Sent: Wednesday, April 26, 2017 7:47:37 AM
> Subject: Re: [PATCH, untested] mlx5: Avoid that mlx5_ib_sg_to_klms() overflows the klms[] array
>
> [...]
>
> Doing this now
> Thanks
> Laurence

Max

The Patch is not correct.

drivers/infiniband/hw/mlx5/mr.c: In function 'mlx5_alloc_priv_descs':
drivers/infiniband/hw/mlx5/mr.c:1406:30: error: 'struct ib_device' has no member named 'attr'
  WARN_ON_ONCE(ndescs > device->attr.max_fast_reg_page_list_len);
                              ^
./include/asm-generic/bug.h:117:27: note: in definition of macro 'WARN_ON_ONCE'
  int __ret_warn_once = !!(condition); \

I think you meant to give me

WARN_ON_ONCE(ndescs > ib_device_attr->attr.max_fast_reg_page_list_len);

Can you confirm

Thanks
Laurence
----- Original Message -----
> From: "Laurence Oberman" <loberman@redhat.com>
> To: "Max Gurtovoy" <maxg@mellanox.com>
> Cc: "Leon Romanovsky" <leonro@mellanox.com>, "Bart Van Assche" <bart.vanassche@sandisk.com>, "Doug Ledford"
> <dledford@redhat.com>, "Sagi Grimberg" <sagi@grimberg.me>, "Israel Rukshin" <israelr@mellanox.com>,
> linux-rdma@vger.kernel.org
> Sent: Wednesday, April 26, 2017 8:18:13 AM
> Subject: Re: [PATCH, untested] mlx5: Avoid that mlx5_ib_sg_to_klms() overflows the klms[] array
>
> [...]
>
> Max
>
> The Patch is not correct.
>
> drivers/infiniband/hw/mlx5/mr.c: In function 'mlx5_alloc_priv_descs':
> drivers/infiniband/hw/mlx5/mr.c:1406:30: error: 'struct ib_device' has no
> member named 'attr'
> WARN_ON_ONCE(ndescs > device->attr.max_fast_reg_page_list_len);
> ^
> ./include/asm-generic/bug.h:117:27: note: in definition of macro
> 'WARN_ON_ONCE'
> int __ret_warn_once = !!(condition); \
>
> I think you meant to give me
>
> WARN_ON_ONCE(ndescs > ib_device_attr->attr.max_fast_reg_page_list_len);
>
> Can you confirm
>
> Thanks
> Laurence

Oops rather this

WARN_ON_ONCE(ndescs > device->ib_device_attr.max_fast_reg_page_list_len);
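
The compiler error in this exchange turns on which member of `struct ib_device` holds the device attributes. Below is a minimal user-space sketch of the check the debug patch was aiming for. It assumes that the embedded attribute structure is named `attrs`, which the thread itself does not confirm. The `_sketch` types and `ndescs_exceeds_limit()` are hypothetical stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the kernel's attribute structure. */
struct ib_device_attr_sketch {
    unsigned int max_fast_reg_page_list_len;
};

/* Hypothetical, cut-down stand-in for struct ib_device. */
struct ib_device_sketch {
    /* Assumption: the attribute struct is embedded under the name "attrs". */
    struct ib_device_attr_sketch attrs;
};

/*
 * Return true when a request for ndescs descriptors exceeds the device
 * limit, i.e. the condition the debug WARN_ON_ONCE() was meant to flag.
 */
bool ndescs_exceeds_limit(const struct ib_device_sketch *device, int ndescs)
{
    assert(device != NULL);
    return ndescs > 0 &&
           (unsigned int)ndescs > device->attrs.max_fast_reg_page_list_len;
}
```

Under that assumption the debug line would read `WARN_ON_ONCE(ndescs > device->attrs.max_fast_reg_page_list_len);`. Neither spelling proposed in the thread is verified here, so the member name should be checked against `include/rdma/ib_verbs.h` in the tree being built.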
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index d9c6c0ea750b..040fbc387e4f 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1403,6 +1403,8 @@ mlx5_alloc_priv_descs(struct ib_device *device,
 	int add_size;
 	int ret;
 
+	WARN_ON_ONCE(ndescs > device->attr.max_fast_reg_page_list_len);
+
 	add_size = max_t(int, MLX5_UMR_ALIGN - ARCH_KMALLOC_MINALIGN, 0);
 
 	mr->descs_alloc = kzalloc(size + add_size, GFP_KERNEL);