From patchwork Fri Jun 9 11:01:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Selvin Xavier X-Patchwork-Id: 13273721 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEFD0C7EE25 for ; Fri, 9 Jun 2023 11:15:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229776AbjFILPM (ORCPT ); Fri, 9 Jun 2023 07:15:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57842 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236561AbjFILOq (ORCPT ); Fri, 9 Jun 2023 07:14:46 -0400 Received: from mail-pl1-x629.google.com (mail-pl1-x629.google.com [IPv6:2607:f8b0:4864:20::629]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4138D2737 for ; Fri, 9 Jun 2023 04:14:30 -0700 (PDT) Received: by mail-pl1-x629.google.com with SMTP id d9443c01a7336-1b2439e9004so5256875ad.3 for ; Fri, 09 Jun 2023 04:14:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; t=1686309270; x=1688901270; h=references:in-reply-to:message-id:date:subject:cc:to:from:from:to :cc:subject:date:message-id:reply-to; bh=1goHtNkDhbstCc5HCdOWnRrRV63nLY8Bax+1w3aZqg8=; b=fHWV+29XjKnzUBq36DR7VSwEpmR1QAkoYciw+UeYBzZ8KsUJQaLxnsCdvgnpuTwATy xbd2OcFXGQB3xBGxSujGITGyMesoyEuFecEumWqTWDK2EGHtZBtF1QfBlZzXPOgWo6+y q4oVcqYIKPt6tjWAxs2g4By6IJvcZUcyC3wYc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686309270; x=1688901270; h=references:in-reply-to:message-id:date:subject:cc:to:from :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=1goHtNkDhbstCc5HCdOWnRrRV63nLY8Bax+1w3aZqg8=; b=LPzQdnTQU35GJqm264WL/N3ZjgL5KyC4vbbJZhBi1ExeYVY6VquSk2QF1dxYlS1z50 voXU8bq838bszNUG+GVrdDDAuJqrXU/dSoIdyE4FbNglttn9G+T6ZbJLi0UOhh/IIaCP Z11mkuz0ZmgTmlVBS9vhxJj9V0HfVVLaGOIHl4g2n/A0ILueA4XcnV6By4rhDPy/HVST L5H00aV1AIEh3RhzY78SFxywUaTNy5xgLqJzJw437+f0ZXWPCeLEBNRf/leREGiu9hnK yvL3ayUf44apOYG1uUqx76dnh2ngYqq0+T/jXVGBxHBAznndC3uhW8TRRs0Vp3U0s3vL vMmg== X-Gm-Message-State: AC+VfDwMoA9oD/bS9BCFKMEXllBKVxhDgYM9G5Zh1kpfQ5GJe/AR7zsP spkYeMdhPCajdueWpqWYSPRfLFeSfoh80fQIauA= X-Google-Smtp-Source: ACHHUZ75CiD00bnGEac0ClaapyzPI3IFZped8FLqZFcAlFD0YQUuFwnL5BP0yRUoKqrnhZ9zNhILUg== X-Received: by 2002:a17:902:db0d:b0:1b1:e88b:f63 with SMTP id m13-20020a170902db0d00b001b1e88b0f63mr805202plx.61.1686309269709; Fri, 09 Jun 2023 04:14:29 -0700 (PDT) Received: from dhcp-10-192-206-197.iig.avagotech.net.net ([192.19.234.250]) by smtp.gmail.com with ESMTPSA id q4-20020a170902dac400b001b0142908f7sm2992954plx.291.2023.06.09.04.14.26 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Fri, 09 Jun 2023 04:14:28 -0700 (PDT) From: Selvin Xavier To: jgg@ziepe.ca, leon@kernel.org Cc: linux-rdma@vger.kernel.org, andrew.gospodarek@broadcom.com, kashyap.desai@broadcom.com, Selvin Xavier Subject: [PATCH v2 for-next 06/17] RDMA/bnxt_re: Avoid the command wait if firmware is inactive Date: Fri, 9 Jun 2023 04:01:43 -0700 Message-Id: <1686308514-11996-7-git-send-email-selvin.xavier@broadcom.com> X-Mailer: git-send-email 2.5.5 In-Reply-To: <1686308514-11996-1-git-send-email-selvin.xavier@broadcom.com> References: <1686308514-11996-1-git-send-email-selvin.xavier@broadcom.com> Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Kashyap Desai Add a check to avoid waiting if driver already detects a FW timeout. Return success for resource destroy in case the device is detached. Add helper function to map timeout error code to success. Signed-off-by: Kashyap Desai Signed-off-by: Selvin Xavier --- drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 52 +++++++++++++++++++++++++++--- 1 file changed, 48 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c index 3b242cc..3c4f72a 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c @@ -54,9 +54,46 @@ static void bnxt_qplib_service_creq(struct tasklet_struct *t); /** + * bnxt_qplib_map_rc - map return type based on opcode + * @opcode - roce slow path opcode + * + * In some cases like firmware halt is detected, the driver is supposed to + * remap the error code of the timed out command. + * + * It is not safe to assume hardware is really inactive so certain opcodes + * like destroy qp etc are not safe to be returned success, but this function + * will be called when FW already reports a timeout. This would be possible + * only when FW crashes and resets. This will clear all the HW resources. + * + * Returns: + * 0 to communicate success to caller. + * Non zero error code to communicate failure to caller. + */ +static int bnxt_qplib_map_rc(u8 opcode) +{ + switch (opcode) { + case CMDQ_BASE_OPCODE_DESTROY_QP: + case CMDQ_BASE_OPCODE_DESTROY_SRQ: + case CMDQ_BASE_OPCODE_DESTROY_CQ: + case CMDQ_BASE_OPCODE_DEALLOCATE_KEY: + case CMDQ_BASE_OPCODE_DEREGISTER_MR: + case CMDQ_BASE_OPCODE_DELETE_GID: + case CMDQ_BASE_OPCODE_DESTROY_QP1: + case CMDQ_BASE_OPCODE_DESTROY_AH: + case CMDQ_BASE_OPCODE_DEINITIALIZE_FW: + case CMDQ_BASE_OPCODE_MODIFY_ROCE_CC: + case CMDQ_BASE_OPCODE_SET_LINK_AGGR_MODE: + return 0; + default: + return -ETIMEDOUT; + } +} + +/** * __wait_for_resp - Don't hold the cpu context and wait for response * @rcfw - rcfw channel instance of rdev * @cookie - cookie to track the command + * @opcode - rcfw submitted for given opcode * * Wait for command completion in sleepable context. * @@ -64,7 +101,7 @@ static void bnxt_qplib_service_creq(struct tasklet_struct *t); * 0 if command is completed by firmware. * Non zero error code for rest of the case. */ -static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) +static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie, u8 opcode) { struct bnxt_qplib_cmdq_ctx *cmdq; u16 cbit; @@ -74,6 +111,9 @@ static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) cbit = cookie % rcfw->cmdq_depth; do { + if (test_bit(ERR_DEVICE_DETACHED, &cmdq->flags)) + return bnxt_qplib_map_rc(opcode); + /* Non zero means command completed */ ret = wait_event_timeout(cmdq->waitq, !test_bit(cbit, cmdq->cmdq_bitmap), @@ -94,6 +134,7 @@ static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) * __block_for_resp - hold the cpu context and wait for response * @rcfw - rcfw channel instance of rdev * @cookie - cookie to track the command + * @opcode - rcfw submitted for given opcode * * This function will hold the cpu (non-sleepable context) and * wait for command completion. Maximum holding interval is 8 second. @@ -102,7 +143,7 @@ static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) * -ETIMEOUT if command is not completed in specific time interval. * 0 if command is completed by firmware. */ -static int __block_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) +static int __block_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie, u8 opcode) { struct bnxt_qplib_cmdq_ctx *cmdq = &rcfw->cmdq; unsigned long issue_time = 0; @@ -112,6 +153,9 @@ static int __block_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) issue_time = jiffies; do { + if (test_bit(ERR_DEVICE_DETACHED, &cmdq->flags)) + return bnxt_qplib_map_rc(opcode); + udelay(1); bnxt_qplib_service_creq(&rcfw->creq.creq_tasklet); @@ -267,9 +311,9 @@ int bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw, } while (retry_cnt--); if (msg->block) - rc = __block_for_resp(rcfw, cookie); + rc = __block_for_resp(rcfw, cookie, opcode); else - rc = __wait_for_resp(rcfw, cookie); + rc = __wait_for_resp(rcfw, cookie, opcode); if (rc) { /* timed out */ dev_err(&rcfw->pdev->dev, "cmdq[%#x]=%#x timedout (%d)msec\n",