Message ID: 20201215122306.3886-1-bmt@zurich.ibm.com (mailing list archive)
State:      Superseded
Series:     RDMA/siw: Fix handling of zero-sized Read and Receive Queues.
Hi Bernard,

I love your patch! Perhaps something to improve:

[auto build test WARNING on rdma/for-next]
[also build test WARNING on linus/master v5.10 next-20201215]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Bernard-Metzler/RDMA-siw-Fix-handling-of-zero-sized-Read-and-Receive-Queues/20201215-202632
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
config: arm-randconfig-r035-20201215 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project a29ecca7819a6ed4250d3689b12b1f664bb790d7)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm cross compiling tool for clang build
        # apt-get install binutils-arm-linux-gnueabi
        # https://github.com/0day-ci/linux/commit/2626590f990ffc9ed5607f7027ffadbfc9073692
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Bernard-Metzler/RDMA-siw-Fix-handling-of-zero-sized-Read-and-Receive-Queues/20201215-202632
        git checkout 2626590f990ffc9ed5607f7027ffadbfc9073692
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/infiniband/sw/siw/siw_qp_rx.c:681:6: warning: variable 'flags' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
           if (unlikely(!qp->attrs.irq_size)) {
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:78:22: note: expanded from macro 'unlikely'
   # define unlikely(x) __builtin_expect(!!(x), 0)
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/infiniband/sw/siw/siw_qp_rx.c:729:39: note: uninitialized use occurs here
           spin_unlock_irqrestore(&qp->sq_lock, flags);
                                                ^~~~~
   drivers/infiniband/sw/siw/siw_qp_rx.c:681:2: note: remove the 'if' if its condition is always false
           if (unlikely(!qp->attrs.irq_size)) {
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/infiniband/sw/siw/siw_qp_rx.c:673:21: note: initialize the variable 'flags' to silence this warning
           unsigned long flags;
                              ^
                               = 0
   1 warning generated.

vim +681 drivers/infiniband/sw/siw/siw_qp_rx.c

   643	
   644	/*
   645	 * siw_init_rresp:
   646	 *
   647	 * Process inbound RDMA READ REQ. Produce a pseudo READ RESPONSE WQE.
   648	 * Put it at the tail of the IRQ, if there is another WQE currently in
   649	 * transmit processing. If not, make it the current WQE to be processed
   650	 * and schedule transmit processing.
   651	 *
   652	 * Can be called from softirq context and from process
   653	 * context (RREAD socket loopback case!)
   654	 *
   655	 * return value:
   656	 *	0: success,
   657	 *	failure code otherwise
   658	 */
   659	
   660	static int siw_init_rresp(struct siw_qp *qp, struct siw_rx_stream *srx)
   661	{
   662		struct siw_wqe *tx_work = tx_wqe(qp);
   663		struct siw_sqe *resp;
   664	
   665		uint64_t raddr = be64_to_cpu(srx->hdr.rreq.sink_to),
   666			 laddr = be64_to_cpu(srx->hdr.rreq.source_to);
   667		uint32_t length = be32_to_cpu(srx->hdr.rreq.read_size),
   668			 lkey = be32_to_cpu(srx->hdr.rreq.source_stag),
   669			 rkey = be32_to_cpu(srx->hdr.rreq.sink_stag),
   670			 msn = be32_to_cpu(srx->hdr.rreq.ddp_msn);
   671	
   672		int run_sq = 1, rv = 0;
   673		unsigned long flags;
   674	
   675		if (unlikely(msn != srx->ddp_msn[RDMAP_UNTAGGED_QN_RDMA_READ])) {
   676			siw_init_terminate(qp, TERM_ERROR_LAYER_DDP,
   677					   DDP_ETYPE_UNTAGGED_BUF,
   678					   DDP_ECODE_UT_INVALID_MSN_RANGE, 0);
   679			return -EPROTO;
   680		}
 > 681		if (unlikely(!qp->attrs.irq_size)) {
   682			run_sq = 0;
   683			goto error_irq;
   684		}
   685		spin_lock_irqsave(&qp->sq_lock, flags);
   686	
   687		if (tx_work->wr_status == SIW_WR_IDLE) {
   688			/*
   689			 * immediately schedule READ response w/o
   690			 * consuming IRQ entry: IRQ must be empty.
   691			 */
   692			tx_work->processed = 0;
   693			tx_work->mem[0] = NULL;
   694			tx_work->wr_status = SIW_WR_QUEUED;
   695			resp = &tx_work->sqe;
   696		} else {
   697			resp = irq_alloc_free(qp);
   698			run_sq = 0;
   699		}
   700		if (likely(resp)) {
   701			resp->opcode = SIW_OP_READ_RESPONSE;
   702	
   703			resp->sge[0].length = length;
   704			resp->sge[0].laddr = laddr;
   705			resp->sge[0].lkey = lkey;
   706	
   707			/* Keep aside message sequence number for potential
   708			 * error reporting during Read Response generation.
   709			 */
   710			resp->sge[1].length = msn;
   711	
   712			resp->raddr = raddr;
   713			resp->rkey = rkey;
   714			resp->num_sge = length ? 1 : 0;
   715	
   716			/* RRESP now valid as current TX wqe or placed into IRQ */
   717			smp_store_mb(resp->flags, SIW_WQE_VALID);
   718		} else {
   719	error_irq:
   720			pr_warn("siw: [QP %u]: IRQ exceeded or null, size %d\n",
   721				qp_id(qp), qp->attrs.irq_size);
   722	
   723			siw_init_terminate(qp, TERM_ERROR_LAYER_RDMAP,
   724					   RDMAP_ETYPE_REMOTE_OPERATION,
   725					   RDMAP_ECODE_CATASTROPHIC_STREAM, 0);
   726			rv = -EPROTO;
   727		}
   728	
   729		spin_unlock_irqrestore(&qp->sq_lock, flags);
   730	
   731		if (run_sq)
   732			rv = siw_sq_start(qp);
   733	
   734		return rv;
   735	}
   736	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Tue, Dec 15, 2020 at 01:23:06PM +0100, Bernard Metzler wrote:
> During connection setup, the application may choose to zero-size
> inbound and outbound READ queues, as well as the Receive queue.
> This patch fixes handling of zero-sized queues.

This fixes a crasher? Copy the fixes line and oops from Kamal?

Jason
-----"Jason Gunthorpe" <jgg@nvidia.com> wrote: -----

>To: "Bernard Metzler" <bmt@zurich.ibm.com>
>From: "Jason Gunthorpe" <jgg@nvidia.com>
>Date: 12/15/2020 06:48PM
>Cc: <linux-rdma@vger.kernel.org>, <kamalheib1@gmail.com>,
><yi.zhang@redhat.com>, <linux-nvme@lists.infradead.org>
>Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix handling of zero-sized
>Read and Receive Queues.
>
>On Tue, Dec 15, 2020 at 01:23:06PM +0100, Bernard Metzler wrote:
>> During connection setup, the application may choose to zero-size
>> inbound and outbound READ queues, as well as the Receive queue.
>> This patch fixes handling of zero-sized queues.
>
>This fixes a crasher? Copy the fixes line and oops from Kamal?
>
>Jason
>

Yes, it does. Will resend with a small change fixing an uninitialized
variable I just introduced with the patch (uuuhh).

I will attach Kamal's oops report, but his fix was blocking allowed
application behavior (setting some queue sizes to zero). The intention
of my patch is to fix siw to handle that application behavior
correctly. So the fixes lines will be different.

Thanks,
Bernard.
On Tue, Dec 15, 2020 at 01:23:06PM +0100, Bernard Metzler wrote:
> During connection setup, the application may choose to zero-size
> inbound and outbound READ queues, as well as the Receive queue.
> This patch fixes handling of zero-sized queues.
>
> Reported-by: Kamal Heib <kamalheib1@gmail.com>
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
> ---
>  drivers/infiniband/sw/siw/siw.h       |  2 +-
>  drivers/infiniband/sw/siw/siw_qp.c    | 54 ++++++++++++++++-----------
>  drivers/infiniband/sw/siw/siw_qp_rx.c | 26 +++++++++----
>  drivers/infiniband/sw/siw/siw_qp_tx.c |  4 +-
>  drivers/infiniband/sw/siw/siw_verbs.c | 18 +++++++--
>  5 files changed, 68 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/infiniband/sw/siw/siw.h b/drivers/infiniband/sw/siw/siw.h
> index e9753831ac3f..6f17392f975a 100644
> --- a/drivers/infiniband/sw/siw/siw.h
> +++ b/drivers/infiniband/sw/siw/siw.h
> @@ -654,7 +654,7 @@ static inline struct siw_sqe *orq_get_free(struct siw_qp *qp)
>  {
>  	struct siw_sqe *orq_e = orq_get_tail(qp);
>
> -	if (orq_e && READ_ONCE(orq_e->flags) == 0)
> +	if (READ_ONCE(orq_e->flags) == 0)
>  		return orq_e;
>
>  	return NULL;
> diff --git a/drivers/infiniband/sw/siw/siw_qp.c b/drivers/infiniband/sw/siw/siw_qp.c
> index 875d36d4b1c6..b686a09a75ae 100644
> --- a/drivers/infiniband/sw/siw/siw_qp.c
> +++ b/drivers/infiniband/sw/siw/siw_qp.c
> @@ -199,26 +199,28 @@ void siw_qp_llp_write_space(struct sock *sk)
>
>  static int siw_qp_readq_init(struct siw_qp *qp, int irq_size, int orq_size)
>  {
> -	irq_size = roundup_pow_of_two(irq_size);
> -	orq_size = roundup_pow_of_two(orq_size);
> -
> -	qp->attrs.irq_size = irq_size;
> -	qp->attrs.orq_size = orq_size;
> -
> -	qp->irq = vzalloc(irq_size * sizeof(struct siw_sqe));
> -	if (!qp->irq) {
> -		siw_dbg_qp(qp, "irq malloc for %d failed\n", irq_size);
> -		qp->attrs.irq_size = 0;
> -		return -ENOMEM;
> +	if (irq_size) {
> +		irq_size = roundup_pow_of_two(irq_size);
> +		qp->irq = vzalloc(irq_size * sizeof(struct siw_sqe));
> +		if (!qp->irq) {
> +			siw_dbg_qp(qp, "irq malloc for %d failed\n", irq_size);

Please don't copy prints after kernel allocators.
You won't miss failure in allocations.

Thanks
diff --git a/drivers/infiniband/sw/siw/siw.h b/drivers/infiniband/sw/siw/siw.h
index e9753831ac3f..6f17392f975a 100644
--- a/drivers/infiniband/sw/siw/siw.h
+++ b/drivers/infiniband/sw/siw/siw.h
@@ -654,7 +654,7 @@ static inline struct siw_sqe *orq_get_free(struct siw_qp *qp)
 {
 	struct siw_sqe *orq_e = orq_get_tail(qp);
 
-	if (orq_e && READ_ONCE(orq_e->flags) == 0)
+	if (READ_ONCE(orq_e->flags) == 0)
 		return orq_e;
 
 	return NULL;
diff --git a/drivers/infiniband/sw/siw/siw_qp.c b/drivers/infiniband/sw/siw/siw_qp.c
index 875d36d4b1c6..b686a09a75ae 100644
--- a/drivers/infiniband/sw/siw/siw_qp.c
+++ b/drivers/infiniband/sw/siw/siw_qp.c
@@ -199,26 +199,28 @@ void siw_qp_llp_write_space(struct sock *sk)
 
 static int siw_qp_readq_init(struct siw_qp *qp, int irq_size, int orq_size)
 {
-	irq_size = roundup_pow_of_two(irq_size);
-	orq_size = roundup_pow_of_two(orq_size);
-
-	qp->attrs.irq_size = irq_size;
-	qp->attrs.orq_size = orq_size;
-
-	qp->irq = vzalloc(irq_size * sizeof(struct siw_sqe));
-	if (!qp->irq) {
-		siw_dbg_qp(qp, "irq malloc for %d failed\n", irq_size);
-		qp->attrs.irq_size = 0;
-		return -ENOMEM;
+	if (irq_size) {
+		irq_size = roundup_pow_of_two(irq_size);
+		qp->irq = vzalloc(irq_size * sizeof(struct siw_sqe));
+		if (!qp->irq) {
+			siw_dbg_qp(qp, "irq malloc for %d failed\n", irq_size);
+			qp->attrs.irq_size = 0;
+			return -ENOMEM;
+		}
 	}
-	qp->orq = vzalloc(orq_size * sizeof(struct siw_sqe));
-	if (!qp->orq) {
-		siw_dbg_qp(qp, "orq malloc for %d failed\n", orq_size);
-		qp->attrs.orq_size = 0;
-		qp->attrs.irq_size = 0;
-		vfree(qp->irq);
-		return -ENOMEM;
+	if (orq_size) {
+		orq_size = roundup_pow_of_two(orq_size);
+		qp->orq = vzalloc(orq_size * sizeof(struct siw_sqe));
+		if (!qp->orq) {
+			siw_dbg_qp(qp, "orq malloc for %d failed\n", orq_size);
+			qp->attrs.orq_size = 0;
+			qp->attrs.irq_size = 0;
+			vfree(qp->irq);
+			return -ENOMEM;
+		}
 	}
+	qp->attrs.irq_size = irq_size;
+	qp->attrs.orq_size = orq_size;
 	siw_dbg_qp(qp, "ORD %d, IRD %d\n", orq_size, irq_size);
 	return 0;
 }
@@ -288,13 +290,14 @@ int siw_qp_mpa_rts(struct siw_qp *qp, enum mpa_v2_ctrl ctrl)
 	if (ctrl & MPA_V2_RDMA_WRITE_RTR)
 		wqe->sqe.opcode = SIW_OP_WRITE;
 	else if (ctrl & MPA_V2_RDMA_READ_RTR) {
-		struct siw_sqe *rreq;
+		struct siw_sqe *rreq = NULL;
 
 		wqe->sqe.opcode = SIW_OP_READ;
 
 		spin_lock(&qp->orq_lock);
 
-		rreq = orq_get_free(qp);
+		if (qp->attrs.orq_size)
+			rreq = orq_get_free(qp);
 		if (rreq) {
 			siw_read_to_orq(rreq, &wqe->sqe);
 			qp->orq_put++;
@@ -889,6 +892,9 @@ int siw_activate_tx(struct siw_qp *qp)
 	struct siw_wqe *wqe = tx_wqe(qp);
 	int rv = 1;
 
+	if (!qp->attrs.irq_size)
+		goto no_irq;
+
 	irqe = &qp->irq[qp->irq_get % qp->attrs.irq_size];
 
 	if (irqe->flags & SIW_WQE_VALID) {
@@ -933,6 +939,7 @@ int siw_activate_tx(struct siw_qp *qp)
 		goto out;
 	}
 
+no_irq:
 	sqe = sq_get_next(qp);
 	if (sqe) {
 skip_irq:
@@ -971,7 +978,7 @@ int siw_activate_tx(struct siw_qp *qp)
 		}
 		spin_lock(&qp->orq_lock);
 
-		if (!siw_orq_empty(qp)) {
+		if (qp->attrs.orq_size && !siw_orq_empty(qp)) {
 			qp->tx_ctx.orq_fence = 1;
 			rv = 0;
 		}
@@ -981,6 +988,11 @@ int siw_activate_tx(struct siw_qp *qp)
 		   wqe->sqe.opcode == SIW_OP_READ_LOCAL_INV) {
 		struct siw_sqe *rreq;
 
+		if (unlikely(!qp->attrs.orq_size)) {
+			/* We negotiated not to send READ req's */
+			rv = -EINVAL;
+			goto out;
+		}
 		wqe->sqe.num_sge = 1;
 
 		spin_lock(&qp->orq_lock);
diff --git a/drivers/infiniband/sw/siw/siw_qp_rx.c b/drivers/infiniband/sw/siw/siw_qp_rx.c
index 4bd1f1f84057..981e11f31b2d 100644
--- a/drivers/infiniband/sw/siw/siw_qp_rx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_rx.c
@@ -678,6 +678,10 @@ static int siw_init_rresp(struct siw_qp *qp, struct siw_rx_stream *srx)
 				   DDP_ECODE_UT_INVALID_MSN_RANGE, 0);
 		return -EPROTO;
 	}
+	if (unlikely(!qp->attrs.irq_size)) {
+		run_sq = 0;
+		goto error_irq;
+	}
 	spin_lock_irqsave(&qp->sq_lock, flags);
 
 	if (tx_work->wr_status == SIW_WR_IDLE) {
@@ -712,8 +716,9 @@ static int siw_init_rresp(struct siw_qp *qp, struct siw_rx_stream *srx)
 		/* RRESP now valid as current TX wqe or placed into IRQ */
 		smp_store_mb(resp->flags, SIW_WQE_VALID);
 	} else {
-		pr_warn("siw: [QP %u]: irq %d exceeded %d\n", qp_id(qp),
-			qp->irq_put % qp->attrs.irq_size, qp->attrs.irq_size);
+error_irq:
+		pr_warn("siw: [QP %u]: IRQ exceeded or null, size %d\n",
+			qp_id(qp), qp->attrs.irq_size);
 
 		siw_init_terminate(qp, TERM_ERROR_LAYER_RDMAP,
 				   RDMAP_ETYPE_REMOTE_OPERATION,
@@ -740,6 +745,9 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
 	struct siw_sqe *orqe;
 	struct siw_wqe *wqe = NULL;
 
+	if (unlikely(!qp->attrs.orq_size))
+		return -EPROTO;
+
 	/* make sure ORQ indices are current */
 	smp_mb();
 
@@ -796,8 +804,8 @@ int siw_proc_rresp(struct siw_qp *qp)
 	 */
 	rv = siw_orqe_start_rx(qp);
 	if (rv) {
-		pr_warn("siw: [QP %u]: ORQ empty at idx %d\n",
-			qp_id(qp), qp->orq_get % qp->attrs.orq_size);
+		pr_warn("siw: [QP %u]: ORQ empty, size %d\n",
+			qp_id(qp), qp->attrs.orq_size);
 		goto error_term;
 	}
 	rv = siw_rresp_check_ntoh(srx, frx);
@@ -1290,11 +1298,13 @@ static int siw_rdmap_complete(struct siw_qp *qp, int error)
 				wc_status);
 		siw_wqe_put_mem(wqe, SIW_OP_READ);
 
-		if (!error)
+		if (!error) {
 			rv = siw_check_tx_fence(qp);
-		else
-			/* Disable current ORQ eleement */
-			WRITE_ONCE(orq_get_current(qp)->flags, 0);
+		} else {
+			/* Disable current ORQ element */
+			if (qp->attrs.orq_size)
+				WRITE_ONCE(orq_get_current(qp)->flags, 0);
+		}
 
 		break;
 
 	case RDMAP_RDMA_READ_REQ:
diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
index d19d8325588b..7989c4043db4 100644
--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -1107,8 +1107,8 @@ int siw_qp_sq_process(struct siw_qp *qp)
 		/*
 		 * RREQ may have already been completed by inbound RRESP!
 		 */
-		if (tx_type == SIW_OP_READ ||
-		    tx_type == SIW_OP_READ_LOCAL_INV) {
+		if ((tx_type == SIW_OP_READ ||
+		     tx_type == SIW_OP_READ_LOCAL_INV) && qp->attrs.orq_size) {
 			/* Cleanup pending entry in ORQ */
 			qp->orq_put--;
 			qp->orq[qp->orq_put % qp->attrs.orq_size].flags = 0;
diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
index 7cf3242ffb41..95003678cf3f 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.c
+++ b/drivers/infiniband/sw/siw/siw_verbs.c
@@ -362,13 +362,23 @@ struct ib_qp *siw_create_qp(struct ib_pd *pd,
 	if (rv)
 		goto err_out;
 
+	num_sqe = attrs->cap.max_send_wr;
+	num_rqe = attrs->cap.max_recv_wr;
+
 	/* All queue indices are derived from modulo operations
 	 * on a free running 'get' (consumer) and 'put' (producer)
 	 * unsigned counter. Having queue sizes at power of two
 	 * avoids handling counter wrap around.
 	 */
-	num_sqe = roundup_pow_of_two(attrs->cap.max_send_wr);
-	num_rqe = roundup_pow_of_two(attrs->cap.max_recv_wr);
+	if (num_sqe)
+		num_sqe = roundup_pow_of_two(num_sqe);
+	else {
+		/* Zero sized SQ is not supported */
+		rv = -EINVAL;
+		goto err_out;
+	}
+	if (num_rqe)
+		num_rqe = roundup_pow_of_two(num_rqe);
 
 	if (udata)
 		qp->sendq = vmalloc_user(num_sqe * sizeof(struct siw_sqe));
@@ -960,9 +970,9 @@ int siw_post_receive(struct ib_qp *base_qp, const struct ib_recv_wr *wr,
 	unsigned long flags;
 	int rv = 0;
 
-	if (qp->srq) {
+	if (qp->srq || qp->attrs.rq_size == 0) {
 		*bad_wr = wr;
-		return -EOPNOTSUPP; /* what else from errno.h? */
+		return -EINVAL;
 	}
 	if (!rdma_is_kernel_res(&qp->base_qp.res)) {
 		siw_dbg_qp(qp, "no kernel post_recv for user mapped rq\n");
During connection setup, the application may choose to zero-size
inbound and outbound READ queues, as well as the Receive queue.
This patch fixes handling of zero-sized queues.

Reported-by: Kamal Heib <kamalheib1@gmail.com>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
---
 drivers/infiniband/sw/siw/siw.h       |  2 +-
 drivers/infiniband/sw/siw/siw_qp.c    | 54 ++++++++++++++++-----------
 drivers/infiniband/sw/siw/siw_qp_rx.c | 26 +++++++++----
 drivers/infiniband/sw/siw/siw_qp_tx.c |  4 +-
 drivers/infiniband/sw/siw/siw_verbs.c | 18 +++++++--
 5 files changed, 68 insertions(+), 36 deletions(-)