From patchwork Wed Jul 10 13:36:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Junxian Huang X-Patchwork-Id: 13729356 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EFABB194141; Wed, 10 Jul 2024 13:42:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.188 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618954; cv=none; b=rwul4i76phbQF3wcO4SSk6DWUwFmpN2LFkTqVDWKARe63D5JBGkpQBXQfo7EZwgHM9RyLaPp1Jf1ORyzwP/DAvtwXHYXlgvAEDIDbrB+xjttu2k4snyeKF7R/DERUpgWpHUX/sXb8DIAeavSrUE8sPbWjSrsDTXmU8rG7qM0c58= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618954; c=relaxed/simple; bh=PND3fEFJhzk25l89uTHt9gHFkld+VZcBV1FuXneP828=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=b78OBjM8BKhDUpMAsO8xKw7k3btTPljUihuie2CiKUanFjIQqqfHjo/6u52/sV7IhIn1M1vZavN+RA7Prcd5dX10tUeTrN8vz6X8tSKdsh2eFqYKzVTWr2GsGMLWTShtKvs4q3aVNxtk9eTFBguntgUST1orCZgAPcG6oZFb2uM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com; spf=pass smtp.mailfrom=hisilicon.com; arc=none smtp.client-ip=45.249.212.188 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=hisilicon.com Received: from mail.maildlp.com (unknown [172.19.88.194]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4WJzYH4kvGzjX65; Wed, 10 Jul 2024 21:41:55 +0800 (CST) Received: from kwepemf100018.china.huawei.com (unknown [7.202.181.17]) by mail.maildlp.com (Postfix) with ESMTPS id 58AF714035F; Wed, 10 Jul 2024 21:42:23 +0800 (CST) Received: 
from localhost.localdomain (10.90.30.45) by kwepemf100018.china.huawei.com (7.202.181.17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Wed, 10 Jul 2024 21:42:22 +0800 From: Junxian Huang To: , CC: , , , Subject: [PATCH v2 for-rc 1/8] RDMA/hns: Check atomic wr length Date: Wed, 10 Jul 2024 21:36:58 +0800 Message-ID: <20240710133705.896445-2-huangjunxian6@hisilicon.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20240710133705.896445-1-huangjunxian6@hisilicon.com> References: <20240710133705.896445-1-huangjunxian6@hisilicon.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemf100018.china.huawei.com (7.202.181.17) 8 bytes is the only supported length of atomic. Add this check in set_rc_wqe(). Besides, stop processing WQEs and return from set_rc_wqe() if there is any error. Fixes: 384f88185112 ("RDMA/hns: Add atomic support") Signed-off-by: Junxian Huang --- drivers/infiniband/hw/hns/hns_roce_device.h | 2 ++ drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 9 +++++++-- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index ff0b3f68ee3a..05005079258c 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -91,6 +91,8 @@ /* Configure to HW for PAGE_SIZE larger than 4KB */ #define PG_SHIFT_OFFSET (PAGE_SHIFT - 12) +#define ATOMIC_WR_LEN 8 + #define HNS_ROCE_IDX_QUE_ENTRY_SZ 4 #define SRQ_DB_REG 0x230 diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index 4287818a737f..eb6052ee8938 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -591,11 +591,16 @@ static inline int set_rc_wqe(struct hns_roce_qp *qp, 
(wr->send_flags & IB_SEND_SIGNALED) ? 1 : 0); if (wr->opcode == IB_WR_ATOMIC_CMP_AND_SWP || - wr->opcode == IB_WR_ATOMIC_FETCH_AND_ADD) + wr->opcode == IB_WR_ATOMIC_FETCH_AND_ADD) { + if (msg_len != ATOMIC_WR_LEN) + return -EINVAL; set_atomic_seg(wr, rc_sq_wqe, valid_num_sge); - else if (wr->opcode != IB_WR_REG_MR) + } else if (wr->opcode != IB_WR_REG_MR) { ret = set_rwqe_data_seg(&qp->ibqp, wr, rc_sq_wqe, &curr_idx, valid_num_sge); + if (ret) + return ret; + } /* * The pipeline can sequentially post all valid WQEs into WQ buffer, From patchwork Wed Jul 10 13:36:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Junxian Huang X-Patchwork-Id: 13729352 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D6C8E18EFDC; Wed, 10 Jul 2024 13:42:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.187 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618950; cv=none; b=QRVT4yLVSRilZtniTXYjAlcMgRi6c/Mg4KfbinphRWyO7654Ww5BwNGwWgc6Ix1VdRDnJRSIYJi/upwvO/BsLhPtEUKYbRNzHvOHA04lT2W16+TyC7hHbTfirW11y5qpM/F7JmbwA0Ac4x0gUuzVvrgveMuLt6h185eKda9q3Hk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618950; c=relaxed/simple; bh=ogdrkRvit/re/c45w1VUdcStsjjDgF65FlFx3Dta8dY=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=uLlo5GT6CkrRQTz/ONIvLkcCO5Zq049Rp+aYGMPAGziaqJmwizzQqRpwbL8eHIxQjoN+vxwkR+hKl2oTLa5IFvl4XuNrLOd46zQwdCqF8+EpmWM266Ve486sb+7SMwU02apqvl60v7MehfPB3+ajaJ5xVbzGT4DTilVPfwW5nSc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com; spf=pass smtp.mailfrom=hisilicon.com; arc=none smtp.client-ip=45.249.212.187 
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=hisilicon.com
Received: from mail.maildlp.com (unknown [172.19.163.252]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4WJzSX2dZfzxVwH; Wed, 10 Jul 2024 21:37:48 +0800 (CST)
Received: from kwepemf100018.china.huawei.com (unknown [7.202.181.17]) by mail.maildlp.com (Postfix) with ESMTPS id AB2E5180ADC; Wed, 10 Jul 2024 21:42:23 +0800 (CST)
Received: from localhost.localdomain (10.90.30.45) by kwepemf100018.china.huawei.com (7.202.181.17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Wed, 10 Jul 2024 21:42:23 +0800
From: Junxian Huang
To: ,
CC: , , ,
Subject: [PATCH v2 for-rc 2/8] RDMA/hns: Fix soft lockup under heavy CEQE load
Date: Wed, 10 Jul 2024 21:36:59 +0800
Message-ID: <20240710133705.896445-3-huangjunxian6@hisilicon.com>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20240710133705.896445-1-huangjunxian6@hisilicon.com>
References: <20240710133705.896445-1-huangjunxian6@hisilicon.com>
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemf100018.china.huawei.com (7.202.181.17)

CEQEs are currently handled in the interrupt handler. This may keep the CPU core in interrupt context for too long and lead to a soft lockup under heavy load.

Handle CEQEs in a BH workqueue and set an upper limit on the number of CEQEs handled by a single call of the work handler.
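As a rough illustration of the per-call upper limit described above, the following userspace C sketch drains a queue with a budget. The names demo_eq and demo_drain are hypothetical and do not appear in the driver; the sketch models only the loop bound, not the interrupt or workqueue plumbing, and it assumes that entries left over after one call are picked up by a later call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a completion event queue. */
struct demo_eq {
	unsigned int pending;    /* entries still waiting to be handled */
	unsigned int cons_index; /* consumer index, advanced once per entry */
};

/*
 * Handle at most `budget` entries in one call, in the same way the work
 * handler bounds its loop. Returns the number of entries handled.
 */
static unsigned int demo_drain(struct demo_eq *eq, unsigned int budget)
{
	unsigned int handled = 0;

	assert(eq != NULL);

	while (eq->pending && handled < budget) {
		eq->pending--;
		eq->cons_index++;
		handled++;
	}

	return handled;
}
```

With 10 pending entries and a budget of 4, three successive calls handle 4, 4 and 2 entries, so no single call runs for longer than the budget allows.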
Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08") Signed-off-by: Junxian Huang --- drivers/infiniband/hw/hns/hns_roce_device.h | 1 + drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 89 ++++++++++++--------- 2 files changed, 54 insertions(+), 36 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index 05005079258c..f8451e5ab107 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -717,6 +717,7 @@ struct hns_roce_eq { int shift; int event_type; int sub_type; + struct work_struct work; }; struct hns_roce_eq_table { diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index eb6052ee8938..2f16554c96be 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -36,6 +36,7 @@ #include #include #include +#include #include #include #include @@ -6140,33 +6141,11 @@ static struct hns_roce_ceqe *next_ceqe_sw_v2(struct hns_roce_eq *eq) !!(eq->cons_index & eq->entries)) ? 
ceqe : NULL; } -static irqreturn_t hns_roce_v2_ceq_int(struct hns_roce_dev *hr_dev, - struct hns_roce_eq *eq) +static irqreturn_t hns_roce_v2_ceq_int(struct hns_roce_eq *eq) { - struct hns_roce_ceqe *ceqe = next_ceqe_sw_v2(eq); - irqreturn_t ceqe_found = IRQ_NONE; - u32 cqn; - - while (ceqe) { - /* Make sure we read CEQ entry after we have checked the - * ownership bit - */ - dma_rmb(); - - cqn = hr_reg_read(ceqe, CEQE_CQN); - - hns_roce_cq_completion(hr_dev, cqn); - - ++eq->cons_index; - ceqe_found = IRQ_HANDLED; - atomic64_inc(&hr_dev->dfx_cnt[HNS_ROCE_DFX_CEQE_CNT]); - - ceqe = next_ceqe_sw_v2(eq); - } + queue_work(system_bh_wq, &eq->work); - update_eq_db(eq); - - return IRQ_RETVAL(ceqe_found); + return IRQ_HANDLED; } static irqreturn_t hns_roce_v2_msix_interrupt_eq(int irq, void *eq_ptr) @@ -6177,7 +6156,7 @@ static irqreturn_t hns_roce_v2_msix_interrupt_eq(int irq, void *eq_ptr) if (eq->type_flag == HNS_ROCE_CEQ) /* Completion event interrupt */ - int_work = hns_roce_v2_ceq_int(hr_dev, eq); + int_work = hns_roce_v2_ceq_int(eq); else /* Asynchronous event interrupt */ int_work = hns_roce_v2_aeq_int(hr_dev, eq); @@ -6545,6 +6524,34 @@ static int hns_roce_v2_create_eq(struct hns_roce_dev *hr_dev, return ret; } +static void hns_roce_ceq_work(struct work_struct *work) +{ + struct hns_roce_eq *eq = from_work(eq, work, work); + struct hns_roce_ceqe *ceqe = next_ceqe_sw_v2(eq); + struct hns_roce_dev *hr_dev = eq->hr_dev; + int ceqe_num = 0; + u32 cqn; + + while (ceqe && ceqe_num < hr_dev->caps.ceqe_depth) { + /* Make sure we read CEQ entry after we have checked the + * ownership bit + */ + dma_rmb(); + + cqn = hr_reg_read(ceqe, CEQE_CQN); + + hns_roce_cq_completion(hr_dev, cqn); + + ++eq->cons_index; + ++ceqe_num; + atomic64_inc(&hr_dev->dfx_cnt[HNS_ROCE_DFX_CEQE_CNT]); + + ceqe = next_ceqe_sw_v2(eq); + } + + update_eq_db(eq); +} + static int __hns_roce_request_irq(struct hns_roce_dev *hr_dev, int irq_num, int comp_num, int aeq_num, int other_num) { @@ -6576,21 
+6583,24 @@ static int __hns_roce_request_irq(struct hns_roce_dev *hr_dev, int irq_num, j - other_num - aeq_num); for (j = 0; j < irq_num; j++) { - if (j < other_num) + if (j < other_num) { ret = request_irq(hr_dev->irq[j], hns_roce_v2_msix_interrupt_abn, 0, hr_dev->irq_names[j], hr_dev); - - else if (j < (other_num + comp_num)) + } else if (j < (other_num + comp_num)) { + INIT_WORK(&eq_table->eq[j - other_num].work, + hns_roce_ceq_work); ret = request_irq(eq_table->eq[j - other_num].irq, hns_roce_v2_msix_interrupt_eq, 0, hr_dev->irq_names[j + aeq_num], &eq_table->eq[j - other_num]); - else + } else { ret = request_irq(eq_table->eq[j - other_num].irq, hns_roce_v2_msix_interrupt_eq, 0, hr_dev->irq_names[j - comp_num], &eq_table->eq[j - other_num]); + } + if (ret) { dev_err(hr_dev->dev, "request irq error!\n"); goto err_request_failed; @@ -6600,12 +6610,16 @@ static int __hns_roce_request_irq(struct hns_roce_dev *hr_dev, int irq_num, return 0; err_request_failed: - for (j -= 1; j >= 0; j--) - if (j < other_num) + for (j -= 1; j >= 0; j--) { + if (j < other_num) { free_irq(hr_dev->irq[j], hr_dev); - else - free_irq(eq_table->eq[j - other_num].irq, - &eq_table->eq[j - other_num]); + continue; + } + free_irq(eq_table->eq[j - other_num].irq, + &eq_table->eq[j - other_num]); + if (j < other_num + comp_num) + cancel_work_sync(&eq_table->eq[j - other_num].work); + } err_kzalloc_failed: for (i -= 1; i >= 0; i--) @@ -6626,8 +6640,11 @@ static void __hns_roce_free_irq(struct hns_roce_dev *hr_dev) for (i = 0; i < hr_dev->caps.num_other_vectors; i++) free_irq(hr_dev->irq[i], hr_dev); - for (i = 0; i < eq_num; i++) + for (i = 0; i < eq_num; i++) { free_irq(hr_dev->eq_table.eq[i].irq, &hr_dev->eq_table.eq[i]); + if (i < hr_dev->caps.num_comp_vectors) + cancel_work_sync(&hr_dev->eq_table.eq[i].work); + } for (i = 0; i < irq_num; i++) kfree(hr_dev->irq_names[i]); From patchwork Wed Jul 10 13:37:00 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Junxian Huang X-Patchwork-Id: 13729353 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D6C44187844; Wed, 10 Jul 2024 13:42:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.187 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618951; cv=none; b=h38ancrag7/TDj4v3H4k/m5GvXEBupluVrJ/xasFp6nS/RmgYwHxXW84b+VWlx6k9F3wp4ysRrL6YhIs7Xn8G9R5UWJqlLNlDkKtoQsFUhZM+9Mo8HsVDWj4yFzrMMbSdLlvM7+SSw6KNyJKt+97FAjKU5AoKO5wTTaT45QvmRE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618951; c=relaxed/simple; bh=f0IdYP8Ghq/+BSWKh69LdDWg9TexC/0W8jKIIZOVhLM=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=d2bCpPRXxqUcZ//Qse2Eiv09X4ifCiNg9mdR1EhqZ8J5KOOd7lUsIZqux6ncJomyCDc/wK53csuPVCIrWssLZhsok0X+bEPqiLQJxd0rq+0wuka9ipSoPrRPFL2HkntmD1CiwZjMyxokTfCpvxw3thsqApbYdqdmOIyAfcW8QpA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com; spf=pass smtp.mailfrom=hisilicon.com; arc=none smtp.client-ip=45.249.212.187 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=hisilicon.com Received: from mail.maildlp.com (unknown [172.19.163.48]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4WJzSP1FhCzwWmQ; Wed, 10 Jul 2024 21:37:41 +0800 (CST) Received: from kwepemf100018.china.huawei.com (unknown [7.202.181.17]) by mail.maildlp.com (Postfix) with ESMTPS id 0ADAF180088; Wed, 10 Jul 2024 21:42:24 +0800 (CST) Received: from localhost.localdomain (10.90.30.45) by kwepemf100018.china.huawei.com (7.202.181.17) with 
Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Wed, 10 Jul 2024 21:42:23 +0800 From: Junxian Huang To: , CC: , , , Subject: [PATCH v2 for-rc 3/8] RDMA/hns: Fix unmatch exception handling when init eq table fails Date: Wed, 10 Jul 2024 21:37:00 +0800 Message-ID: <20240710133705.896445-4-huangjunxian6@hisilicon.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20240710133705.896445-1-huangjunxian6@hisilicon.com> References: <20240710133705.896445-1-huangjunxian6@hisilicon.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemf100018.china.huawei.com (7.202.181.17) The hw ctx should be destroyed when init eq table fails. Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08") Signed-off-by: Junxian Huang --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 25 +++++++++++----------- 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index 2f16554c96be..cbbc142afc1b 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -6368,9 +6368,16 @@ static void hns_roce_v2_int_mask_enable(struct hns_roce_dev *hr_dev, roce_write(hr_dev, ROCEE_VF_ABN_INT_CFG_REG, enable_flag); } -static void hns_roce_v2_destroy_eqc(struct hns_roce_dev *hr_dev, u32 eqn) +static void free_eq_buf(struct hns_roce_dev *hr_dev, struct hns_roce_eq *eq) +{ + hns_roce_mtr_destroy(hr_dev, &eq->mtr); +} + +static void hns_roce_v2_destroy_eqc(struct hns_roce_dev *hr_dev, + struct hns_roce_eq *eq) { struct device *dev = hr_dev->dev; + int eqn = eq->eqn; int ret; u8 cmd; @@ -6381,12 +6388,9 @@ static void hns_roce_v2_destroy_eqc(struct hns_roce_dev *hr_dev, u32 eqn) ret = hns_roce_destroy_hw_ctx(hr_dev, cmd, eqn & HNS_ROCE_V2_EQN_M); if (ret) - dev_err(dev, "[mailbox 
cmd] destroy eqc(%u) failed.\n", eqn); -} + dev_err(dev, "[mailbox cmd] destroy eqc(%d) failed.\n", eqn); -static void free_eq_buf(struct hns_roce_dev *hr_dev, struct hns_roce_eq *eq) -{ - hns_roce_mtr_destroy(hr_dev, &eq->mtr); + free_eq_buf(hr_dev, eq); } static void init_eq_config(struct hns_roce_dev *hr_dev, struct hns_roce_eq *eq) @@ -6733,7 +6737,7 @@ static int hns_roce_v2_init_eq_table(struct hns_roce_dev *hr_dev) err_create_eq_fail: for (i -= 1; i >= 0; i--) - free_eq_buf(hr_dev, &eq_table->eq[i]); + hns_roce_v2_destroy_eqc(hr_dev, &eq_table->eq[i]); kfree(eq_table->eq); return ret; @@ -6753,11 +6757,8 @@ static void hns_roce_v2_cleanup_eq_table(struct hns_roce_dev *hr_dev) __hns_roce_free_irq(hr_dev); destroy_workqueue(hr_dev->irq_workq); - for (i = 0; i < eq_num; i++) { - hns_roce_v2_destroy_eqc(hr_dev, i); - - free_eq_buf(hr_dev, &eq_table->eq[i]); - } + for (i = 0; i < eq_num; i++) + hns_roce_v2_destroy_eqc(hr_dev, &eq_table->eq[i]); kfree(eq_table->eq); } From patchwork Wed Jul 10 13:37:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Junxian Huang X-Patchwork-Id: 13729354 Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9C244194088; Wed, 10 Jul 2024 13:42:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.255 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618953; cv=none; b=tGwh3A1iErqkTiC1y0JuN2Cxg6vDxLK57DTFi3VCRj1WH+mUEEWBzRu8WrSUDtqVohAYGvHiyCeli28AX4h6ndQ3jy+YqtQG2DpG7PNb48FpwRe3kiCLp+WrOFr1WTeSla34atidGe63sHex4yBDHeBGp/swYXvF7+dwWIpH1Kc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618953; c=relaxed/simple; bh=0bhKmbWEhTQZaRCWuVo53XMfLGkm79/LUryaLp70VAs=; 
h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=osw8A4EW7y46Sp00J6PMZUG6EW9QQ0EHJ9QLuplskk4wZeEVk7yA4mzrIVNkO8bYjQ+2L1Ff/e8qNEExASYEko4e3GDTLB40wWQka8lV5bUQwCN9Zm7Jcobh9EphbzhzQp2tseYmPsSmTsqjxe80eiRp3rr2YK7Ie9UrwLi57y4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com; spf=pass smtp.mailfrom=hisilicon.com; arc=none smtp.client-ip=45.249.212.255 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=hisilicon.com Received: from mail.maildlp.com (unknown [172.19.163.252]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4WJzSP4gfTz1T633; Wed, 10 Jul 2024 21:37:41 +0800 (CST) Received: from kwepemf100018.china.huawei.com (unknown [7.202.181.17]) by mail.maildlp.com (Postfix) with ESMTPS id 58C54180ADC; Wed, 10 Jul 2024 21:42:24 +0800 (CST) Received: from localhost.localdomain (10.90.30.45) by kwepemf100018.china.huawei.com (7.202.181.17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Wed, 10 Jul 2024 21:42:23 +0800 From: Junxian Huang To: , CC: , , , Subject: [PATCH v2 for-rc 4/8] RDMA/hns: Fix missing pagesize and alignment check in FRMR Date: Wed, 10 Jul 2024 21:37:01 +0800 Message-ID: <20240710133705.896445-5-huangjunxian6@hisilicon.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20240710133705.896445-1-huangjunxian6@hisilicon.com> References: <20240710133705.896445-1-huangjunxian6@hisilicon.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemf100018.china.huawei.com (7.202.181.17) From: Chengchang Tang The offset requires 128B alignment and the page size ranges from 4K to 128M. 
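The two constraints above can be sketched as a small standalone C predicate. This is an illustration only: demo_frmr_params_ok and the DEMO_ constants are hypothetical names, with the numeric values (128 bytes, 4K, 128M) taken from the description above. In the patch itself a failed check makes hns_roce_map_mr_sg() return early with zero mapped entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values from the description: 128-byte alignment, 4K to 128M pages. */
#define DEMO_FRMR_ALIGN_SIZE 128u
#define DEMO_MIN_PAGE_SIZE   (1u << 12)
#define DEMO_MAX_PAGE_SIZE   (1u << 27)

/* The mask test below is only valid for a power-of-two alignment. */
static_assert((DEMO_FRMR_ALIGN_SIZE & (DEMO_FRMR_ALIGN_SIZE - 1)) == 0,
	      "alignment must be a power of two");

/* Hypothetical helper: true when offset and page size pass both checks. */
static bool demo_frmr_params_ok(uint32_t sg_offset, uint32_t page_size)
{
	bool aligned = (sg_offset & (DEMO_FRMR_ALIGN_SIZE - 1)) == 0;

	return aligned &&
	       page_size >= DEMO_MIN_PAGE_SIZE &&
	       page_size <= DEMO_MAX_PAGE_SIZE;
}
```

For example, an offset of 256 with a 4K page passes, while an offset of 64 or a 2K page is rejected.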
Fixes: 68a997c5d28c ("RDMA/hns: Add FRMR support for hip08") Signed-off-by: Chengchang Tang Signed-off-by: Junxian Huang --- drivers/infiniband/hw/hns/hns_roce_device.h | 4 ++++ drivers/infiniband/hw/hns/hns_roce_mr.c | 5 +++++ 2 files changed, 9 insertions(+) diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index f8451e5ab107..7d5931872f8a 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -83,6 +83,7 @@ #define MR_TYPE_DMA 0x03 #define HNS_ROCE_FRMR_MAX_PA 512 +#define HNS_ROCE_FRMR_ALIGN_SIZE 128 #define PKEY_ID 0xffff #define NODE_DESC_SIZE 64 @@ -189,6 +190,9 @@ enum { #define HNS_HW_PAGE_SHIFT 12 #define HNS_HW_PAGE_SIZE (1 << HNS_HW_PAGE_SHIFT) +#define HNS_HW_MAX_PAGE_SHIFT 27 +#define HNS_HW_MAX_PAGE_SIZE (1 << HNS_HW_MAX_PAGE_SHIFT) + struct hns_roce_uar { u64 pfn; unsigned long index; diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c b/drivers/infiniband/hw/hns/hns_roce_mr.c index 1a61dceb3319..846da8c78b8b 100644 --- a/drivers/infiniband/hw/hns/hns_roce_mr.c +++ b/drivers/infiniband/hw/hns/hns_roce_mr.c @@ -443,6 +443,11 @@ int hns_roce_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, int sg_nents, struct hns_roce_mtr *mtr = &mr->pbl_mtr; int ret, sg_num = 0; + if (!IS_ALIGNED(*sg_offset, HNS_ROCE_FRMR_ALIGN_SIZE) || + ibmr->page_size < HNS_HW_PAGE_SIZE || + ibmr->page_size > HNS_HW_MAX_PAGE_SIZE) + return sg_num; + mr->npages = 0; mr->page_list = kvcalloc(mr->pbl_mtr.hem_cfg.buf_pg_count, sizeof(dma_addr_t), GFP_KERNEL); From patchwork Wed Jul 10 13:37:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Junxian Huang X-Patchwork-Id: 13729355 Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with 
ESMTPS id A248219412D; Wed, 10 Jul 2024 13:42:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.255 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618954; cv=none; b=vBY2PLNkXjo4ylWPlG+PCZvqnXPu+XijH3csrfGd+eYSothp7QCqVTd8KlDF87k81kCGY94AYfSihSrJUh+cBotJjwbZlLZcwy+Iy0JxTM3/P5/Qh6Y0w7kZxHmsN+mrdV80eSx8PDW/q1fZ7BGhyn/5nXvFTQ4sPueBMSCeJlU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618954; c=relaxed/simple; bh=P6dMXAa4aK7kznempixsatXqdLe6ioCSMbTLC/ydCyo=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=TfDBEhHyCW2VFRTFRFi2a47zAUHFwYQyGwJjn+y1WxBFH5Gnx6wQAZHNlc01KMaTn6ZqwAxEZTjkuUVZ1fzVLajqR/QxnXNat2Z5sT1p6/660EhUhUNQDqBgqbgazWPGwGTHcuPc2z5o1ymZzmnhSWj7LV6mkem5LvuhX1QShfg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com; spf=pass smtp.mailfrom=hisilicon.com; arc=none smtp.client-ip=45.249.212.255 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=hisilicon.com Received: from mail.maildlp.com (unknown [172.19.163.48]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4WJzSQ01nPz1T63R; Wed, 10 Jul 2024 21:37:42 +0800 (CST) Received: from kwepemf100018.china.huawei.com (unknown [7.202.181.17]) by mail.maildlp.com (Postfix) with ESMTPS id AAFDE18009B; Wed, 10 Jul 2024 21:42:24 +0800 (CST) Received: from localhost.localdomain (10.90.30.45) by kwepemf100018.china.huawei.com (7.202.181.17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Wed, 10 Jul 2024 21:42:24 +0800 From: Junxian Huang To: , CC: , , , Subject: [PATCH v2 for-rc 5/8] RDMA/hns: Fix shift-out-bounds when max_inline_data is 0 Date: Wed, 10 Jul 2024 21:37:02 +0800 Message-ID: 
<20240710133705.896445-6-huangjunxian6@hisilicon.com>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20240710133705.896445-1-huangjunxian6@hisilicon.com>
References: <20240710133705.896445-1-huangjunxian6@hisilicon.com>
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemf100018.china.huawei.com (7.202.181.17)

From: Chengchang Tang

A shift-out-of-bounds may occur if max_inline_data has not been set. The related log:

UBSAN: shift-out-of-bounds in kernel/include/linux/log2.h:57:13
shift exponent 64 is too large for 64-bit type 'long unsigned int'
Call trace:
 dump_backtrace+0xb0/0x118
 show_stack+0x20/0x38
 dump_stack_lvl+0xbc/0x120
 dump_stack+0x1c/0x28
 __ubsan_handle_shift_out_of_bounds+0x104/0x240
 set_ext_sge_param+0x40c/0x420 [hns_roce_hw_v2]
 hns_roce_create_qp+0xf48/0x1c40 [hns_roce_hw_v2]
 create_qp.part.0+0x294/0x3c0
 ib_create_qp_kernel+0x7c/0x150
 create_mad_qp+0x11c/0x1e0
 ib_mad_init_device+0x834/0xc88
 add_client_context+0x248/0x318
 enable_device_and_get+0x158/0x280
 ib_register_device+0x4ac/0x610
 hns_roce_init+0x890/0xf98 [hns_roce_hw_v2]
 __hns_roce_hw_v2_init_instance+0x398/0x720 [hns_roce_hw_v2]
 hns_roce_hw_v2_init_instance+0x108/0x1e0 [hns_roce_hw_v2]
 hclge_init_roce_client_instance+0x1a0/0x358 [hclge]
 hclge_init_client_instance+0xa0/0x508 [hclge]
 hnae3_register_client+0x18c/0x210 [hnae3]
 hns_roce_hw_v2_init+0x28/0xff8 [hns_roce_hw_v2]
 do_one_initcall+0xe0/0x510
 do_init_module+0x110/0x370
 load_module+0x2c6c/0x2f20
 init_module_from_file+0xe0/0x140
 idempotent_init_module+0x24c/0x350
 __arm64_sys_finit_module+0x88/0xf8
 invoke_syscall+0x68/0x1a0
 el0_svc_common.constprop.0+0x11c/0x150
 do_el0_svc+0x38/0x50
 el0_svc+0x50/0xa0
 el0t_64_sync_handler+0xc0/0xc8
 el0t_64_sync+0x1a4/0x1a8

Fixes: 0c5e259b06a8 ("RDMA/hns: Fix incorrect sge nums calculation")
Signed-off-by: Chengchang Tang
Signed-off-by: Junxian Huang
---
drivers/infiniband/hw/hns/hns_roce_qp.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c index db34665d1dfb..1de384ce4d0e 100644 --- a/drivers/infiniband/hw/hns/hns_roce_qp.c +++ b/drivers/infiniband/hw/hns/hns_roce_qp.c @@ -532,13 +532,15 @@ static unsigned int get_sge_num_from_max_inl_data(bool is_ud_or_gsi, { unsigned int inline_sge; - inline_sge = roundup_pow_of_two(max_inline_data) / HNS_ROCE_SGE_SIZE; + if (!max_inline_data) + return 0; /* * if max_inline_data less than * HNS_ROCE_SGE_IN_WQE * HNS_ROCE_SGE_SIZE, * In addition to ud's mode, no need to extend sge. */ + inline_sge = roundup_pow_of_two(max_inline_data) / HNS_ROCE_SGE_SIZE; if (!is_ud_or_gsi && inline_sge <= HNS_ROCE_SGE_IN_WQE) inline_sge = 0; From patchwork Wed Jul 10 13:37:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Junxian Huang X-Patchwork-Id: 13729351 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 01DDD192B88; Wed, 10 Jul 2024 13:42:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.187 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618949; cv=none; b=pouNG+5qEW0+58Hszr79KGR8fhV6Xjpb/QThIkDG6uzbEK+6I9HBsMwEhwBRXbO80UhGs8tC+QFezHW4dhjzjsb4mqT85o8R5bGlZfIaBe6gzUDQc7pfFK8ywxxihJ7kFBBckKjc+iFJGcsBXdrnuC+lQ2H4yvF9Nrsx9+/vIZY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618949; c=relaxed/simple; bh=IxBtV6nqUj65HpN3G6WZfLsFkOvYTXrL2fLPYWJmyDI=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=gnlhwQ1m2m1gi3nDjPvJVtXwez+6DhA3/OXgfztz1zmrGMvJd2biSTl9N9RZAeTmj0rTU0+IG4wGEfKWwaUyRCHqi5VjSQ0vMYVlEkCO/I2EsswPYBVvyZIE/AmZZhUYGypr9lCyLqs2b9CLv8KlslK7LfgFmrfE92Yes0UHV/s=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com; spf=pass smtp.mailfrom=hisilicon.com; arc=none smtp.client-ip=45.249.212.187
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=hisilicon.com
Received: from mail.maildlp.com (unknown [172.19.163.174]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4WJzSY53jYzxVww; Wed, 10 Jul 2024 21:37:49 +0800 (CST)
Received: from kwepemf100018.china.huawei.com (unknown [7.202.181.17]) by mail.maildlp.com (Postfix) with ESMTPS id 083C5140257; Wed, 10 Jul 2024 21:42:25 +0800 (CST)
Received: from localhost.localdomain (10.90.30.45) by kwepemf100018.china.huawei.com (7.202.181.17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Wed, 10 Jul 2024 21:42:24 +0800
From: Junxian Huang
To: ,
CC: , , ,
Subject: [PATCH v2 for-rc 6/8] RDMA/hns: Fix undefined behavior caused by invalid max_sge
Date: Wed, 10 Jul 2024 21:37:03 +0800
Message-ID: <20240710133705.896445-7-huangjunxian6@hisilicon.com>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20240710133705.896445-1-huangjunxian6@hisilicon.com>
References: <20240710133705.896445-1-huangjunxian6@hisilicon.com>
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemf100018.china.huawei.com (7.202.181.17)

From: Chengchang Tang

If max_sge has been set to 0, roundup_pow_of_two() in set_srq_basic_param() may have undefined behavior.
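To see why a zero argument is a problem, here is a minimal userspace sketch. It assumes the rounding is computed as 1UL << fls_long(n - 1), which is consistent with the "shift exponent 64" UBSAN report quoted earlier in this series. The helpers demo_fls_long and demo_roundup_pow_of_two are hypothetical stand-ins, not the kernel implementations.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits needed to represent x (0 for x == 0). */
static unsigned int demo_fls_long(unsigned long x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}

	return bits;
}

/*
 * Round n up to the next power of two as 1UL << fls(n - 1).
 * For n == 0, n - 1 wraps to ULONG_MAX and the shift count would equal
 * the width of unsigned long, which C leaves undefined, so zero is
 * rejected up front. Values above the largest representable power of
 * two are rejected for the same reason.
 */
static unsigned long demo_roundup_pow_of_two(unsigned long n)
{
	assert(n != 0);
	assert(n <= (ULONG_MAX >> 1) + 1);

	return 1UL << demo_fls_long(n - 1);
}
```

The shift count that a zero input would produce can be inspected safely with demo_fls_long(0UL - 1), which equals the bit width of unsigned long. Rejecting zero before the rounding step, as this sketch does with an assertion, avoids ever performing that shift.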
Fixes: 9dd052474a26 ("RDMA/hns: Allocate one more recv SGE for HIP08") Signed-off-by: Chengchang Tang Signed-off-by: Junxian Huang --- drivers/infiniband/hw/hns/hns_roce_srq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_srq.c b/drivers/infiniband/hw/hns/hns_roce_srq.c index f1997abc97ca..c9b8233f4b05 100644 --- a/drivers/infiniband/hw/hns/hns_roce_srq.c +++ b/drivers/infiniband/hw/hns/hns_roce_srq.c @@ -297,7 +297,7 @@ static int set_srq_basic_param(struct hns_roce_srq *srq, max_sge = proc_srq_sge(hr_dev, srq, !!udata); if (attr->max_wr > hr_dev->caps.max_srq_wrs || - attr->max_sge > max_sge) { + attr->max_sge > max_sge || !attr->max_sge) { ibdev_err(&hr_dev->ib_dev, "invalid SRQ attr, depth = %u, sge = %u.\n", attr->max_wr, attr->max_sge); From patchwork Wed Jul 10 13:37:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Junxian Huang X-Patchwork-Id: 13729358 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CAEE9194137; Wed, 10 Jul 2024 13:42:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.188 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618955; cv=none; b=a0a8KWcvFnusrjUYlNmPCT+EDMlFHTJCOOTpKcvmC5p9Kt41/Bg+X01aGy1cOoQx+VwvbGDsElftpD8L0VKEuyLeOUWzMCidheulhDxrcj4+Hl0kuKIPJbkVpmnTHJco4rNrcBCggdgnDOLrVRzT6R/WK/djPEXh4uKOt4RHKQk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618955; c=relaxed/simple; bh=mF3Y6W2YlXcMBpbUar7YsW8yKzS0IxQSydLUM8baSOc=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=CMi+5xPBNBelAkxq2NP5FXmiuYegQSl6eAzrRidXg1G7vEMAFrSiu/SxWMaldmSBNXdCjlErcWm0HypfN3H6tigPnRw153qe4erUi5a7TetT6O7qriwKkoHsUd24UANLLMjkGGl+dyVvUGU2EKv1febTqQJWwwbjGMz/xlzWo/Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com; spf=pass smtp.mailfrom=hisilicon.com; arc=none smtp.client-ip=45.249.212.188 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=hisilicon.com Received: from mail.maildlp.com (unknown [172.19.163.174]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4WJzWw6SrLzdhCN; Wed, 10 Jul 2024 21:40:44 +0800 (CST) Received: from kwepemf100018.china.huawei.com (unknown [7.202.181.17]) by mail.maildlp.com (Postfix) with ESMTPS id 62017140257; Wed, 10 Jul 2024 21:42:25 +0800 (CST) Received: from localhost.localdomain (10.90.30.45) by kwepemf100018.china.huawei.com (7.202.181.17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Wed, 10 Jul 2024 21:42:24 +0800 From: Junxian Huang To: , CC: , , , Subject: [PATCH v2 for-rc 7/8] RDMA/hns: Fix insufficient extend DB for VFs. Date: Wed, 10 Jul 2024 21:37:04 +0800 Message-ID: <20240710133705.896445-8-huangjunxian6@hisilicon.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20240710133705.896445-1-huangjunxian6@hisilicon.com> References: <20240710133705.896445-1-huangjunxian6@hisilicon.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemf100018.china.huawei.com (7.202.181.17) From: Chengchang Tang VFs and its PF will share the memory of the extend DB. Currently, the number of extend DB allocated by driver is only enough for PF. 
This leads to a probability of DB loss and some other problems in scenarios where both PF and VFs use a large number of QPs. Fixes: 6b63597d3540 ("RDMA/hns: Add TSQ link table support") Signed-off-by: Chengchang Tang Signed-off-by: Junxian Huang --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index cbbc142afc1b..aecd137c1e60 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -2463,14 +2463,16 @@ static int set_llm_cfg_to_hw(struct hns_roce_dev *hr_dev, static struct hns_roce_link_table * alloc_link_table_buf(struct hns_roce_dev *hr_dev) { + u16 total_sl = hr_dev->caps.sl_num * hr_dev->func_num; struct hns_roce_v2_priv *priv = hr_dev->priv; struct hns_roce_link_table *link_tbl; u32 pg_shift, size, min_size; link_tbl = &priv->ext_llm; pg_shift = hr_dev->caps.llm_buf_pg_sz + PAGE_SHIFT; - size = hr_dev->caps.num_qps * HNS_ROCE_V2_EXT_LLM_ENTRY_SZ; - min_size = HNS_ROCE_EXT_LLM_MIN_PAGES(hr_dev->caps.sl_num) << pg_shift; + size = hr_dev->caps.num_qps * hr_dev->func_num * + HNS_ROCE_V2_EXT_LLM_ENTRY_SZ; + min_size = HNS_ROCE_EXT_LLM_MIN_PAGES(total_sl) << pg_shift; /* Alloc data table */ size = max(size, min_size); From patchwork Wed Jul 10 13:37:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Junxian Huang X-Patchwork-Id: 13729357 Received: from szxga03-in.huawei.com (szxga03-in.huawei.com [45.249.212.189]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B52EE194145; Wed, 10 Jul 2024 13:42:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.189 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618955; 
cv=none; b=W32b7nA7NaZX6pNMPmQkrY3/9OAGWKylqqcDBjenDHBe1DbFGERd5exeqwU6khoaZwkYxIFXKN5yv3oJOt5sOep4E/0AQvj4DolVIVJdiLLFFuJdWtm1IEYKQcnNhztE8H/uug4LpaBRBDyeZw4T5erY1zuIqW+kLTYmkj8MBWU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720618955; c=relaxed/simple; bh=IrET6SJcvqLFD/xHfsUKDv0n4sdBIoR7mhvWUiHWpx0=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=HIgwMKGKX22CxWfQD4LQAkRQbR35AROBBIf0kTRyohcE/RDxW/OCUiHmW6kKrnc0v5WwabUlD4F1KP3EwFRrVDE92jvg4OR61a2FkyITfrKH90y9sYv8jpJx/nqvRvtrdpdySOe+Guzx2nZXGOOKYWe0VSBX0rnAcJr7iv/2p9g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com; spf=pass smtp.mailfrom=hisilicon.com; arc=none smtp.client-ip=45.249.212.189 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=hisilicon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=hisilicon.com Received: from mail.maildlp.com (unknown [172.19.163.174]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4WJzTK5bKbzQlCx; Wed, 10 Jul 2024 21:38:29 +0800 (CST) Received: from kwepemf100018.china.huawei.com (unknown [7.202.181.17]) by mail.maildlp.com (Postfix) with ESMTPS id B5A29140257; Wed, 10 Jul 2024 21:42:25 +0800 (CST) Received: from localhost.localdomain (10.90.30.45) by kwepemf100018.china.huawei.com (7.202.181.17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Wed, 10 Jul 2024 21:42:25 +0800 From: Junxian Huang To: , CC: , , , Subject: [PATCH v2 for-rc 8/8] RDMA/hns: Fix mbx timing out before CMD execution is completed Date: Wed, 10 Jul 2024 21:37:05 +0800 Message-ID: <20240710133705.896445-9-huangjunxian6@hisilicon.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20240710133705.896445-1-huangjunxian6@hisilicon.com> References: <20240710133705.896445-1-huangjunxian6@hisilicon.com> Precedence: bulk 
X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemf100018.china.huawei.com (7.202.181.17) From: Chengchang Tang When a large number of tasks are issued, the speed of HW processing mbx will slow down. The standard for judging mbx timeout in the current firmware is 30ms, and the current timeout standard for the driver is also 30ms. Considering that firmware scheduling in multi-function scenarios takes a certain amount of time, this will cause the driver to time out too early and report a failure before mbx execution times out. This patch introduces a new mechanism that can set different timeouts for different cmds and extends the timeout of mbx to 35ms. Fixes: a04ff739f2a9 ("RDMA/hns: Add command queue support for hip08 RoCE driver") Signed-off-by: Chengchang Tang Signed-off-by: Junxian Huang --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 35 +++++++++++++++++----- drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 6 ++++ 2 files changed, 34 insertions(+), 7 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index aecd137c1e60..621b057fb9da 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -1275,12 +1275,38 @@ static int hns_roce_cmd_err_convert_errno(u16 desc_ret) return -EIO; } +static u32 hns_roce_cmdq_tx_timeout(u16 opcode, u32 tx_timeout) +{ + static const struct hns_roce_cmdq_tx_timeout_map cmdq_tx_timeout[] = { + {HNS_ROCE_OPC_POST_MB, HNS_ROCE_OPC_POST_MB_TIMEOUT}, + }; + int i; + + for (i = 0; i < ARRAY_SIZE(cmdq_tx_timeout); i++) + if (cmdq_tx_timeout[i].opcode == opcode) + return cmdq_tx_timeout[i].tx_timeout; + + return tx_timeout; +} + +static void hns_roce_wait_csq_done(struct hns_roce_dev *hr_dev, u16 opcode) +{ + struct hns_roce_v2_priv *priv = hr_dev->priv; + u32 tx_timeout = hns_roce_cmdq_tx_timeout(opcode, 
priv->cmq.tx_timeout); + u32 timeout = 0; + + do { + if (hns_roce_cmq_csq_done(hr_dev)) + break; + udelay(1); + } while (++timeout < tx_timeout); +} + static int __hns_roce_cmq_send(struct hns_roce_dev *hr_dev, struct hns_roce_cmq_desc *desc, int num) { struct hns_roce_v2_priv *priv = hr_dev->priv; struct hns_roce_v2_cmq_ring *csq = &priv->cmq.csq; - u32 timeout = 0; u16 desc_ret; u32 tail; int ret; @@ -1301,12 +1327,7 @@ static int __hns_roce_cmq_send(struct hns_roce_dev *hr_dev, atomic64_inc(&hr_dev->dfx_cnt[HNS_ROCE_DFX_CMDS_CNT]); - do { - if (hns_roce_cmq_csq_done(hr_dev)) - break; - udelay(1); - } while (++timeout < priv->cmq.tx_timeout); - + hns_roce_wait_csq_done(hr_dev, le16_to_cpu(desc->opcode)); if (hns_roce_cmq_csq_done(hr_dev)) { ret = 0; for (i = 0; i < num; i++) { diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h index def1d15a03c7..c65f68a14a26 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h @@ -224,6 +224,12 @@ enum hns_roce_opcode_type { HNS_SWITCH_PARAMETER_CFG = 0x1033, }; +#define HNS_ROCE_OPC_POST_MB_TIMEOUT 35000 +struct hns_roce_cmdq_tx_timeout_map { + u16 opcode; + u32 tx_timeout; +}; + enum { TYPE_CRQ, TYPE_CSQ,
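A note on the per-opcode timeout introduced in patch 8/8: the idea is a small lookup table that overrides the default command-queue timeout for selected opcodes, followed by a bounded polling loop. The following is only a user-space sketch of that idea, not the driver code. The opcode values, the `SKETCH_` names, the `done` callback and the omission of the 1 us delay per poll are all assumptions made so the example can be compiled outside the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical opcode values, chosen for illustration only. */
#define SKETCH_OPC_POST_MB        0x0001
#define SKETCH_OPC_OTHER          0x0002

/* Poll budgets; in the driver each poll is followed by udelay(1). */
#define SKETCH_POST_MB_TIMEOUT    35000u
#define SKETCH_DEFAULT_TIMEOUT    30000u

struct sketch_tx_timeout_map {
	uint16_t opcode;
	uint32_t tx_timeout;
};

/* Return the opcode-specific timeout if one is listed, else the default. */
static uint32_t sketch_cmdq_tx_timeout(uint16_t opcode, uint32_t default_timeout)
{
	static const struct sketch_tx_timeout_map map[] = {
		{ SKETCH_OPC_POST_MB, SKETCH_POST_MB_TIMEOUT },
	};
	size_t i;

	for (i = 0; i < ARRAY_SIZE(map); i++)
		if (map[i].opcode == opcode)
			return map[i].tx_timeout;

	return default_timeout;
}

/*
 * Poll done() until it reports completion or the budget for this opcode
 * is used up. Returns true if completion was observed in time.
 */
static bool sketch_wait_done(uint16_t opcode, uint32_t default_timeout,
			     bool (*done)(void *ctx), void *ctx)
{
	uint32_t budget = sketch_cmdq_tx_timeout(opcode, default_timeout);
	uint32_t polls = 0;

	do {
		if (done(ctx))
			return true;
		/* The driver delays 1 us here; omitted in this sketch. */
	} while (++polls < budget);

	return false;
}

/* Stub "hardware": reports completion after *ctx unsuccessful polls. */
static bool sketch_done_after(void *ctx)
{
	uint32_t *remaining = ctx;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

With a stub that completes after 32000 polls, a wait on `SKETCH_OPC_POST_MB` succeeds because its budget is 35000, while a wait on `SKETCH_OPC_OTHER` with the 30000 default gives up first. Keeping the overrides in a table means further opcodes can get their own timeout by adding one entry, without touching the polling loop.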