From patchwork Mon Jul 17 05:53:13 2023
X-Patchwork-Submitter: Selvin Xavier
X-Patchwork-Id: 13315202
From: Selvin Xavier
To: jgg@ziepe.ca, leon@kernel.org
Cc: linux-rdma@vger.kernel.org, andrew.gospodarek@broadcom.com,
 michael.chan@broadcom.com, Chandramohan Akula, Selvin Xavier
Subject: [PATCH for-next v2 6/7] RDMA/bnxt_re: Implement doorbell pacing algorithm
Date: Sun, 16 Jul 2023 22:53:13 -0700
Message-Id: <1689573194-27687-7-git-send-email-selvin.xavier@broadcom.com>
In-Reply-To: <1689573194-27687-1-git-send-email-selvin.xavier@broadcom.com>
References: <1689573194-27687-1-git-send-email-selvin.xavier@broadcom.com>
X-Mailing-List: linux-rdma@vger.kernel.org

From: Chandramohan Akula

User applications alert the driver when the Doorbell FIFO reaches the
alarm threshold. The driver then updates the pacing parameters in the
shared page so that applications pace at the maximum level until the
DB FIFO congestion drops to the pacing threshold. The driver keeps
checking the DB FIFO depth at the pacing interval and gradually adjusts
the pacing level.
Once the pacing level returns to its default value (no congestion in
the FIFO), pacing is complete.

Signed-off-by: Chandramohan Akula
Signed-off-by: Selvin Xavier
---
 drivers/infiniband/hw/bnxt_re/bnxt_re.h |   5 ++
 drivers/infiniband/hw/bnxt_re/main.c    | 124 ++++++++++++++++++++++++++++++++
 2 files changed, 129 insertions(+)

diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index 1543f80..2175103 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -121,8 +121,10 @@ struct bnxt_re_pacing {
 	u32 dbq_pacing_time; /* ms */
 	u32 dbr_def_do_pacing;
 	bool dbr_pacing;
+	struct mutex dbq_lock; /* synchronize db pacing algo */
 };
 
+#define BNXT_RE_MAX_DBR_DO_PACING 0xFFFF
 #define BNXT_RE_DBR_PACING_TIME 5 /* ms */
 #define BNXT_RE_PACING_ALGO_THRESHOLD 250 /* Entries in DB FIFO */
 #define BNXT_RE_PACING_ALARM_TH_MULTIPLE 2 /* Multiple of pacing algo threshold */
@@ -193,6 +195,8 @@ struct bnxt_re_dev {
 	u32 is_virtfn;
 	u32 num_vfs;
 	struct bnxt_re_pacing pacing;
+	struct work_struct dbq_fifo_check_work;
+	struct delayed_work dbq_pacing_work;
 };
 
 #define to_bnxt_re_dev(ptr, member) \
@@ -203,6 +207,7 @@ struct bnxt_re_dev {
 #define BNXT_RE_ROCEV2_IPV6_PACKET 3
 
 #define BNXT_RE_CHECK_RC(x) ((x) && ((x) != -ETIMEDOUT))
+void bnxt_re_pacing_alert(struct bnxt_re_dev *rdev);
 
 static inline struct device *rdev_to_dev(struct bnxt_re_dev *rdev)
 {
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index 13cd84d..6469811 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -475,6 +475,125 @@ static void bnxt_re_set_default_pacing_data(struct bnxt_re_dev *rdev)
 		pacing_data->pacing_th * BNXT_RE_PACING_ALARM_TH_MULTIPLE;
 }
 
+static void __wait_for_fifo_occupancy_below_th(struct bnxt_re_dev *rdev)
+{
+	u32 read_val, fifo_occup;
+
+	/* loop shouldn't run infinitely as the occupancy usually goes
+	 * below pacing algo threshold as soon as pacing kicks in.
+	 */
+	while (1) {
+		read_val = readl(rdev->en_dev->bar0 + rdev->pacing.dbr_db_fifo_reg_off);
+		fifo_occup = BNXT_RE_MAX_FIFO_DEPTH -
+			((read_val & BNXT_RE_DB_FIFO_ROOM_MASK) >>
+			 BNXT_RE_DB_FIFO_ROOM_SHIFT);
+		/* Fifo occupancy cannot be greater than the MAX FIFO depth */
+		if (fifo_occup > BNXT_RE_MAX_FIFO_DEPTH)
+			break;
+
+		if (fifo_occup < rdev->qplib_res.pacing_data->pacing_th)
+			break;
+	}
+}
+
+static void bnxt_re_db_fifo_check(struct work_struct *work)
+{
+	struct bnxt_re_dev *rdev = container_of(work, struct bnxt_re_dev,
+			dbq_fifo_check_work);
+	struct bnxt_qplib_db_pacing_data *pacing_data;
+	u32 pacing_save;
+
+	if (!mutex_trylock(&rdev->pacing.dbq_lock))
+		return;
+	pacing_data = rdev->qplib_res.pacing_data;
+	pacing_save = rdev->pacing.do_pacing_save;
+	__wait_for_fifo_occupancy_below_th(rdev);
+	cancel_delayed_work_sync(&rdev->dbq_pacing_work);
+	if (pacing_save > rdev->pacing.dbr_def_do_pacing) {
+		/* Double the do_pacing value during the congestion */
+		pacing_save = pacing_save << 1;
+	} else {
+		/*
+		 * when a new congestion is detected increase the do_pacing
+		 * by 8 times. And also increase the pacing_th by 4 times. The
+		 * reason to increase pacing_th is to give more space for the
+		 * queue to oscillate down without getting empty, but also more
+		 * room for the queue to increase without causing another alarm.
+		 */
+		pacing_save = pacing_save << 3;
+		pacing_data->pacing_th = rdev->pacing.pacing_algo_th * 4;
+	}
+
+	if (pacing_save > BNXT_RE_MAX_DBR_DO_PACING)
+		pacing_save = BNXT_RE_MAX_DBR_DO_PACING;
+
+	pacing_data->do_pacing = pacing_save;
+	rdev->pacing.do_pacing_save = pacing_data->do_pacing;
+	pacing_data->alarm_th =
+		pacing_data->pacing_th * BNXT_RE_PACING_ALARM_TH_MULTIPLE;
+	schedule_delayed_work(&rdev->dbq_pacing_work,
+			      msecs_to_jiffies(rdev->pacing.dbq_pacing_time));
+	mutex_unlock(&rdev->pacing.dbq_lock);
+}
+
+static void bnxt_re_pacing_timer_exp(struct work_struct *work)
+{
+	struct bnxt_re_dev *rdev = container_of(work, struct bnxt_re_dev,
+			dbq_pacing_work.work);
+	struct bnxt_qplib_db_pacing_data *pacing_data;
+	u32 read_val, fifo_occup;
+
+	if (!mutex_trylock(&rdev->pacing.dbq_lock))
+		return;
+
+	pacing_data = rdev->qplib_res.pacing_data;
+	read_val = readl(rdev->en_dev->bar0 + rdev->pacing.dbr_db_fifo_reg_off);
+	fifo_occup = BNXT_RE_MAX_FIFO_DEPTH -
+		((read_val & BNXT_RE_DB_FIFO_ROOM_MASK) >>
+		 BNXT_RE_DB_FIFO_ROOM_SHIFT);
+
+	if (fifo_occup > pacing_data->pacing_th)
+		goto restart_timer;
+
+	/*
+	 * Instead of immediately going back to the default do_pacing
+	 * reduce it by 1/8 times and restart the timer.
+	 */
+	pacing_data->do_pacing = pacing_data->do_pacing - (pacing_data->do_pacing >> 3);
+	pacing_data->do_pacing = max_t(u32, rdev->pacing.dbr_def_do_pacing, pacing_data->do_pacing);
+	if (pacing_data->do_pacing <= rdev->pacing.dbr_def_do_pacing) {
+		bnxt_re_set_default_pacing_data(rdev);
+		goto dbq_unlock;
+	}
+
+restart_timer:
+	schedule_delayed_work(&rdev->dbq_pacing_work,
+			      msecs_to_jiffies(rdev->pacing.dbq_pacing_time));
+dbq_unlock:
+	rdev->pacing.do_pacing_save = pacing_data->do_pacing;
+	mutex_unlock(&rdev->pacing.dbq_lock);
+}
+
+void bnxt_re_pacing_alert(struct bnxt_re_dev *rdev)
+{
+	struct bnxt_qplib_db_pacing_data *pacing_data;
+
+	if (!rdev->pacing.dbr_pacing)
+		return;
+	mutex_lock(&rdev->pacing.dbq_lock);
+	pacing_data = rdev->qplib_res.pacing_data;
+
+	/*
+	 * Increase the alarm_th to max so that other user lib instances do not
+	 * keep alerting the driver.
+	 */
+	pacing_data->alarm_th = BNXT_RE_MAX_FIFO_DEPTH;
+	pacing_data->do_pacing = BNXT_RE_MAX_DBR_DO_PACING;
+	cancel_work_sync(&rdev->dbq_fifo_check_work);
+	schedule_work(&rdev->dbq_fifo_check_work);
+	mutex_unlock(&rdev->pacing.dbq_lock);
+}
+
 static int bnxt_re_initialize_dbr_pacing(struct bnxt_re_dev *rdev)
 {
 	if (bnxt_re_hwrm_dbr_pacing_qcfg(rdev))
@@ -506,11 +625,16 @@ static int bnxt_re_initialize_dbr_pacing(struct bnxt_re_dev *rdev)
 	rdev->qplib_res.pacing_data->fifo_room_shift = BNXT_RE_DB_FIFO_ROOM_SHIFT;
 	rdev->qplib_res.pacing_data->grc_reg_offset = rdev->pacing.dbr_db_fifo_reg_off;
 	bnxt_re_set_default_pacing_data(rdev);
+	/* Initialize worker for DBR Pacing */
+	INIT_WORK(&rdev->dbq_fifo_check_work, bnxt_re_db_fifo_check);
+	INIT_DELAYED_WORK(&rdev->dbq_pacing_work, bnxt_re_pacing_timer_exp);
 	return 0;
 }
 
 static void bnxt_re_deinitialize_dbr_pacing(struct bnxt_re_dev *rdev)
 {
+	cancel_work_sync(&rdev->dbq_fifo_check_work);
+	cancel_delayed_work_sync(&rdev->dbq_pacing_work);
 	if (rdev->pacing.dbr_page)
 		free_page((u64)rdev->pacing.dbr_page);
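
Note for readers following the arithmetic: the escalate and decay steps in bnxt_re_db_fifo_check() and bnxt_re_pacing_timer_exp() can be modelled in user space. The sketch below is not driver code and is only an approximation under stated assumptions. Every sim_* name is hypothetical. The values 0xFFFF, 250 and 2 are copied from the bnxt_re.h hunk. It assumes that pacing_algo_th equals BNXT_RE_PACING_ALGO_THRESHOLD and that bnxt_re_set_default_pacing_data() restores do_pacing and pacing_th, since only the last line of that function is visible in this diff. The separate do_pacing_save field is collapsed into do_pacing, and locking and workqueues are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the bnxt_re.h hunk; the SIM_ names are hypothetical. */
#define SIM_MAX_DO_PACING  0xFFFFu /* BNXT_RE_MAX_DBR_DO_PACING */
#define SIM_ALGO_TH        250u    /* BNXT_RE_PACING_ALGO_THRESHOLD (assumed pacing_algo_th) */
#define SIM_ALARM_MULTIPLE 2u      /* BNXT_RE_PACING_ALARM_TH_MULTIPLE */

struct sim_pacing {
	uint32_t def_do_pacing; /* supplied by firmware in the driver; caller-chosen here */
	uint32_t do_pacing;     /* also stands in for do_pacing_save */
	uint32_t pacing_th;
	uint32_t alarm_th;
};

/* Assumed behaviour of bnxt_re_set_default_pacing_data(). */
static void sim_reset(struct sim_pacing *p, uint32_t def_do_pacing)
{
	assert(def_do_pacing > 0);
	p->def_do_pacing = def_do_pacing;
	p->do_pacing = def_do_pacing;
	p->pacing_th = SIM_ALGO_TH;
	p->alarm_th = p->pacing_th * SIM_ALARM_MULTIPLE;
}

/* Escalation step, modelled on bnxt_re_db_fifo_check(). */
static void sim_escalate(struct sim_pacing *p)
{
	uint32_t v = p->do_pacing;

	if (v > p->def_do_pacing) {
		v <<= 1;                        /* ongoing congestion: double */
	} else {
		v <<= 3;                        /* new congestion: 8x pacing ... */
		p->pacing_th = SIM_ALGO_TH * 4; /* ... and 4x pacing threshold */
	}
	if (v > SIM_MAX_DO_PACING)
		v = SIM_MAX_DO_PACING;

	p->do_pacing = v;
	p->alarm_th = p->pacing_th * SIM_ALARM_MULTIPLE;
}

/*
 * Decay step, modelled on bnxt_re_pacing_timer_exp().
 * Returns 1 if the timer would be re-armed, 0 once pacing is back to default.
 */
static int sim_decay(struct sim_pacing *p, uint32_t fifo_occup)
{
	if (fifo_occup > p->pacing_th)
		return 1;                       /* still congested: just re-arm */

	p->do_pacing -= p->do_pacing >> 3;      /* shed 1/8 of the current value */
	if (p->do_pacing < p->def_do_pacing)
		p->do_pacing = p->def_do_pacing;

	if (p->do_pacing <= p->def_do_pacing) {
		sim_reset(p, p->def_do_pacing);
		return 0;
	}
	return 1;
}
```

With a hypothetical default of 64, the first alert raises do_pacing to 512 and pacing_th to 1000, a second alert doubles do_pacing to 1024, and each uncongested timer tick then sheds one eighth until the defaults are restored.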