From patchwork Fri Sep 15 12:01:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ilpo_J=C3=A4rvinen?= X-Patchwork-Id: 13386975 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A240EE645B for ; Fri, 15 Sep 2023 12:02:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234437AbjIOMCK (ORCPT ); Fri, 15 Sep 2023 08:02:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50772 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232836AbjIOMCJ (ORCPT ); Fri, 15 Sep 2023 08:02:09 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B72492134; Fri, 15 Sep 2023 05:02:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694779321; x=1726315321; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hHkykAT64KFeDdyczTYQ6xW3x+LfVkMtEWBZ1a9eYwk=; b=GZ7yyH1HgVIK3RHsqYypMx3fYqnlzZWF+YHSwildnzX4+bLiXFhBWQvT sur5YhuL/jZtSlKrFX0TzUVXqg8jjaogGW93pRL0M0ZPUOT7XyPMoJaJ5 eE5KDSJwgDIq24YubWqHGOfUAVVD0EqUGxFpV2by8nNList4ZMjEbiHI0 ugtZJLfVct9tUgaDfxuk8VyV54KDKsGsC06QWoY58Wyx0d9Q5QGO/SGcW A8VD900j12xMnpaulrl1aHacD1CbP6/037p/3bJWVD9ARLPHC2YqW/Sgn glvHFGrMpt2JDUK8tST7WT+mqCmV3IBi5xT5HHGvhQkMZSi2Mu586BDZ4 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="378145932" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="378145932" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:00 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10833"; 
a="774292741" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="774292741" Received: from srdoo-mobl1.ger.corp.intel.com (HELO ijarvine-mobl2.ger.corp.intel.com) ([10.252.38.99]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:01:55 -0700 From: =?utf-8?q?Ilpo_J=C3=A4rvinen?= To: linux-pci@vger.kernel.org, Bjorn Helgaas , Lorenzo Pieralisi , Rob Herring , =?utf-8?q?Krzysztof_Wilczy=C5=84ski?= , Lukas Wunner , Alexandru Gagniuc , Krishna chaitanya chundru , Srinivas Pandruvada , "Rafael J . Wysocki" , linux-pm@vger.kernel.org, Bjorn Helgaas , Jonathan Corbet , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Alex Deucher , Daniel Lezcano , Amit Kucheria , Zhang Rui , =?utf-8?q?Ilpo_J=C3=A4rvinen?= Subject: [PATCH v2 01/10] PCI: Protect Link Control 2 Register with RMW locking Date: Fri, 15 Sep 2023 15:01:33 +0300 Message-Id: <20230915120142.32987-2-ilpo.jarvinen@linux.intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> References: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org The PCIe Bandwidth Controller performs RMW accesses to the Link Control 2 Register, which can occur concurrently with other sources of Link Control 2 Register writes. Therefore, add the Link Control 2 Register to the PCI Express Capability Registers that need RMW locking. Signed-off-by: Ilpo Järvinen --- Documentation/PCI/pciebus-howto.rst | 8 ++++---- include/linux/pci.h | 1 + 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/Documentation/PCI/pciebus-howto.rst b/Documentation/PCI/pciebus-howto.rst index a0027e8fb0d0..3ba322ca1ce1 100644 --- a/Documentation/PCI/pciebus-howto.rst +++ b/Documentation/PCI/pciebus-howto.rst @@ -218,7 +218,7 @@ that is shared between many drivers including the service drivers. 
RMW Capability accessors (pcie_capability_clear_and_set_word(), pcie_capability_set_word(), and pcie_capability_clear_word()) protect a selected set of PCI Express Capability Registers (Link Control -Register and Root Control Register). Any change to those registers -should be performed using RMW accessors to avoid problems due to -concurrent updates. For the up-to-date list of protected registers, -see pcie_capability_clear_and_set_word(). +Register, Root Control Register, and Link Control 2 Register). Any +change to those registers should be performed using RMW accessors to +avoid problems due to concurrent updates. For the up-to-date list of +protected registers, see pcie_capability_clear_and_set_word(). diff --git a/include/linux/pci.h b/include/linux/pci.h index 8c7c2c3c6c65..16db80f8b15c 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1243,6 +1243,7 @@ static inline int pcie_capability_clear_and_set_word(struct pci_dev *dev, { switch (pos) { case PCI_EXP_LNKCTL: + case PCI_EXP_LNKCTL2: case PCI_EXP_RTCTL: return pcie_capability_clear_and_set_word_locked(dev, pos, clear, set); From patchwork Fri Sep 15 12:01:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ilpo_J=C3=A4rvinen?= X-Patchwork-Id: 13386976 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0DAE0EE6457 for ; Fri, 15 Sep 2023 12:02:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234534AbjIOMCP (ORCPT ); Fri, 15 Sep 2023 08:02:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45790 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234530AbjIOMCN (ORCPT ); Fri, 15 Sep 2023 08:02:13 -0400 Received: from mgamail.intel.com 
(mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BA0852113; Fri, 15 Sep 2023 05:02:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694779327; x=1726315327; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gwM5fnXuG8powd+ajq+cI+5GyVGOVUj5BEzCq+gdTz4=; b=VB2BubRn/B9nIimyDe8PxSegw0jKwv37h+9RATSbmEAajSKfsnuIUl7e 3glKYpRwNL397bNClM8d622t9Rt2p8OO9jSBfXm2O6MfU6aAxoTCHnYD/ 0lt+LAad3zzTNwJgTL1pFdPPPk/TFb+sXJkUVl0znhmJJu3cwNHpZcGvV 6LhMuXB3MRC56wCDeMPW52xXoTS8q4xA0MjB3N+Bonmor1TWJnAQCPhPV t0NtrKdiRkDxVFfGyIR9wz80tA+sLKLDa1E4WxwZb9AOvVRn2MkyIIMft 5Ky+Dbjnqa9KW2Ufv2b3RKc2R9nDTJ++jpGGjXi/0qKH8a2z7CxKJJCuO A==; X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="378145957" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="378145957" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:07 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="774292786" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="774292786" Received: from srdoo-mobl1.ger.corp.intel.com (HELO ijarvine-mobl2.ger.corp.intel.com) ([10.252.38.99]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:01 -0700 From: =?utf-8?q?Ilpo_J=C3=A4rvinen?= To: linux-pci@vger.kernel.org, Bjorn Helgaas , Lorenzo Pieralisi , Rob Herring , =?utf-8?q?Krzysztof_Wilczy=C5=84ski?= , Lukas Wunner , Alexandru Gagniuc , Krishna chaitanya chundru , Srinivas Pandruvada , "Rafael J . 
Wysocki" , linux-pm@vger.kernel.org, Alex Deucher , =?utf-8?q?Christian_K=C3=B6nig?= , "Pan, Xinhui" , David Airlie , Daniel Vetter , amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org Cc: Alex Deucher , Daniel Lezcano , Amit Kucheria , Zhang Rui , =?utf-8?q?Ilpo_J=C3=A4rvinen?= Subject: [PATCH v2 02/10] drm/radeon: Use RMW accessors for changing LNKCTL2 Date: Fri, 15 Sep 2023 15:01:34 +0300 Message-Id: <20230915120142.32987-3-ilpo.jarvinen@linux.intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> References: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org Don't assume that only the driver would be accessing LNKCTL2. In the case of upstream (parent), the driver does not even own the device it's changing the registers for. Use RMW capability accessors which do proper locking to avoid losing concurrent updates to the register value. This change is also useful as a cleanup. 
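The lost update that the RMW accessors guard against can be modelled in userspace. The sketch below is illustrative only and is not kernel code: struct rmw_op, model_rmw_apply(), model_racy() and model_serialized() are hypothetical names, and the MODEL_* mask values are assumed to mirror the LNKCTL2 field masks in pci_regs.h. It replays two writers that both read the register before either writes back, and compares that with the serialized order a lock enforces.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the LNKCTL2 field masks; hypothetical model names. */
#define MODEL_LNKCTL2_TLS        0x000f /* Target Link Speed */
#define MODEL_LNKCTL2_ENTER_COMP 0x0010 /* Enter Compliance */
#define MODEL_LNKCTL2_TX_MARGIN  0x0380 /* Transmit Margin */

/* One read-modify-write request: bits to clear, then bits to set. */
struct rmw_op {
	uint16_t clear;
	uint16_t set;
};

/* The arithmetic both the open-coded and the accessor variants perform. */
static uint16_t model_rmw_apply(uint16_t old, struct rmw_op op)
{
	assert((op.set & ~0xffff) == 0);
	return (uint16_t)((old & ~op.clear) | op.set);
}

/*
 * Unlocked interleaving: both writers read the register before either
 * writes it back, so writer B's write-back discards writer A's update.
 */
static uint16_t model_racy(uint16_t reg, struct rmw_op a, struct rmw_op b)
{
	uint16_t a_seen = reg;
	uint16_t b_seen = reg;

	reg = model_rmw_apply(a_seen, a);
	reg = model_rmw_apply(b_seen, b); /* stale read: A's bits are lost */
	return reg;
}

/*
 * Serialized order, which is what holding a lock across the whole
 * read-modify-write sequence guarantees: B observes A's result.
 */
static uint16_t model_serialized(uint16_t reg, struct rmw_op a, struct rmw_op b)
{
	reg = model_rmw_apply(reg, a);
	reg = model_rmw_apply(reg, b);
	return reg;
}
```

With the register starting at 0x0001, writer A setting the Target Link Speed field to 0x3 and writer B setting Enter Compliance, the racy order ends at 0x0011 (the speed change is lost) while the serialized order ends at 0x0013.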
Suggested-by: Lukas Wunner Signed-off-by: Ilpo Järvinen --- drivers/gpu/drm/radeon/cik.c | 40 ++++++++++++++---------------------- drivers/gpu/drm/radeon/si.c | 40 ++++++++++++++---------------------- 2 files changed, 30 insertions(+), 50 deletions(-) diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c index 10be30366c2b..b5e96a8fc2c1 100644 --- a/drivers/gpu/drm/radeon/cik.c +++ b/drivers/gpu/drm/radeon/cik.c @@ -9592,28 +9592,18 @@ static void cik_pcie_gen3_enable(struct radeon_device *rdev) PCI_EXP_LNKCTL_HAWD); /* linkctl2 */ - pcie_capability_read_word(root, PCI_EXP_LNKCTL2, - &tmp16); - tmp16 &= ~(PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN); - tmp16 |= (bridge_cfg2 & - (PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN)); - pcie_capability_write_word(root, - PCI_EXP_LNKCTL2, - tmp16); - - pcie_capability_read_word(rdev->pdev, - PCI_EXP_LNKCTL2, - &tmp16); - tmp16 &= ~(PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN); - tmp16 |= (gpu_cfg2 & - (PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN)); - pcie_capability_write_word(rdev->pdev, - PCI_EXP_LNKCTL2, - tmp16); + pcie_capability_clear_and_set_word(root, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN, + bridge_cfg2 & + (PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN)); + pcie_capability_clear_and_set_word(rdev->pdev, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN, + gpu_cfg2 & + (PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN)); tmp = RREG32_PCIE_PORT(PCIE_LC_CNTL4); tmp &= ~LC_SET_QUIESCE; @@ -9627,15 +9617,15 @@ static void cik_pcie_gen3_enable(struct radeon_device *rdev) speed_cntl &= ~LC_FORCE_DIS_SW_SPEED_CHANGE; WREG32_PCIE_PORT(PCIE_LC_SPEED_CNTL, speed_cntl); - pcie_capability_read_word(rdev->pdev, PCI_EXP_LNKCTL2, &tmp16); - tmp16 &= ~PCI_EXP_LNKCTL2_TLS; + tmp16 = 0; if (speed_cap == PCIE_SPEED_8_0GT) tmp16 |= PCI_EXP_LNKCTL2_TLS_8_0GT; /* gen3 */ else if 
(speed_cap == PCIE_SPEED_5_0GT) tmp16 |= PCI_EXP_LNKCTL2_TLS_5_0GT; /* gen2 */ else tmp16 |= PCI_EXP_LNKCTL2_TLS_2_5GT; /* gen1 */ - pcie_capability_write_word(rdev->pdev, PCI_EXP_LNKCTL2, tmp16); + pcie_capability_clear_and_set_word(rdev->pdev, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_TLS, tmp16); speed_cntl = RREG32_PCIE_PORT(PCIE_LC_SPEED_CNTL); speed_cntl |= LC_INITIATE_LINK_SPEED_CHANGE; diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c index a91012447b56..32871ca09a0f 100644 --- a/drivers/gpu/drm/radeon/si.c +++ b/drivers/gpu/drm/radeon/si.c @@ -7189,28 +7189,18 @@ static void si_pcie_gen3_enable(struct radeon_device *rdev) PCI_EXP_LNKCTL_HAWD); /* linkctl2 */ - pcie_capability_read_word(root, PCI_EXP_LNKCTL2, - &tmp16); - tmp16 &= ~(PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN); - tmp16 |= (bridge_cfg2 & - (PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN)); - pcie_capability_write_word(root, - PCI_EXP_LNKCTL2, - tmp16); - - pcie_capability_read_word(rdev->pdev, - PCI_EXP_LNKCTL2, - &tmp16); - tmp16 &= ~(PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN); - tmp16 |= (gpu_cfg2 & - (PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN)); - pcie_capability_write_word(rdev->pdev, - PCI_EXP_LNKCTL2, - tmp16); + pcie_capability_clear_and_set_word(root, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN, + bridge_cfg2 & + (PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN)); + pcie_capability_clear_and_set_word(rdev->pdev, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN, + gpu_cfg2 & + (PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN)); tmp = RREG32_PCIE_PORT(PCIE_LC_CNTL4); tmp &= ~LC_SET_QUIESCE; @@ -7224,15 +7214,15 @@ static void si_pcie_gen3_enable(struct radeon_device *rdev) speed_cntl &= ~LC_FORCE_DIS_SW_SPEED_CHANGE; WREG32_PCIE_PORT(PCIE_LC_SPEED_CNTL, speed_cntl); - pcie_capability_read_word(rdev->pdev, PCI_EXP_LNKCTL2, &tmp16); - tmp16 &= 
~PCI_EXP_LNKCTL2_TLS; + tmp16 = 0; if (speed_cap == PCIE_SPEED_8_0GT) tmp16 |= PCI_EXP_LNKCTL2_TLS_8_0GT; /* gen3 */ else if (speed_cap == PCIE_SPEED_5_0GT) tmp16 |= PCI_EXP_LNKCTL2_TLS_5_0GT; /* gen2 */ else tmp16 |= PCI_EXP_LNKCTL2_TLS_2_5GT; /* gen1 */ - pcie_capability_write_word(rdev->pdev, PCI_EXP_LNKCTL2, tmp16); + pcie_capability_clear_and_set_word(rdev->pdev, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_TLS, tmp16); speed_cntl = RREG32_PCIE_PORT(PCIE_LC_SPEED_CNTL); speed_cntl |= LC_INITIATE_LINK_SPEED_CHANGE; From patchwork Fri Sep 15 12:01:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ilpo_J=C3=A4rvinen?= X-Patchwork-Id: 13386977 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9ABEBEE6457 for ; Fri, 15 Sep 2023 12:02:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234541AbjIOMC2 (ORCPT ); Fri, 15 Sep 2023 08:02:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47386 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234432AbjIOMC1 (ORCPT ); Fri, 15 Sep 2023 08:02:27 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9D68E139; Fri, 15 Sep 2023 05:02:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694779335; x=1726315335; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=oWLIDmT8V8yYTYSf+Z4MKL980NvQzXj0f/SDHZh0VdQ=; b=Y9159pSIuvrK4qQhk+Q8PvaF+vtuq4B5y1syorMmjmbqT6cmS1eRa18U HQhHHvw3xLahBS9JBnUpg774hbIl95AkFtJlZ9+byPB6tOPudJZ5/Uocy 1RiNI2HLWaReBdzuNIGNanTPjsZPiPXtMGP3HNH1ZBFWHkXNJoreAU4QE 
JAAto/Mq8/ZVYDhL4EFY/3ZbI3yYFNhHlGNgU9oqKBt7KRvbdICm/FIBn Hvqkc6HuPU/ST0NYgKDTkvsLqbQZJuFzjlAAW//hB/QQnSBKlRmyCxeuh 2UXRdUA4WNFIcNkIgL/6ZNL6sGZLb2805WIByIm4LaT/2kRrFqbDLVJvs g==; X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="378145982" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="378145982" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:14 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="774292805" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="774292805" Received: from srdoo-mobl1.ger.corp.intel.com (HELO ijarvine-mobl2.ger.corp.intel.com) ([10.252.38.99]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:08 -0700 From: =?utf-8?q?Ilpo_J=C3=A4rvinen?= To: linux-pci@vger.kernel.org, Bjorn Helgaas , Lorenzo Pieralisi , Rob Herring , =?utf-8?q?Krzysztof_Wilczy=C5=84ski?= , Lukas Wunner , Alexandru Gagniuc , Krishna chaitanya chundru , Srinivas Pandruvada , "Rafael J . Wysocki" , linux-pm@vger.kernel.org, Alex Deucher , =?utf-8?q?Christian_K=C3=B6nig?= , "Pan, Xinhui" , David Airlie , Daniel Vetter , amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org Cc: Alex Deucher , Daniel Lezcano , Amit Kucheria , Zhang Rui , =?utf-8?q?Ilpo_J=C3=A4rvinen?= Subject: [PATCH v2 03/10] drm/amdgpu: Use RMW accessors for changing LNKCTL2 Date: Fri, 15 Sep 2023 15:01:35 +0300 Message-Id: <20230915120142.32987-4-ilpo.jarvinen@linux.intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> References: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org Don't assume that only the driver would be accessing LNKCTL2. In the case of upstream (parent), the driver does not even own the device it's changing the registers for. 
Use RMW capability accessors which do proper locking to avoid losing concurrent updates to the register value. This change is also useful as a cleanup. Suggested-by: Lukas Wunner Signed-off-by: Ilpo Järvinen --- drivers/gpu/drm/amd/amdgpu/cik.c | 41 ++++++++++++-------------------- drivers/gpu/drm/amd/amdgpu/si.c | 41 ++++++++++++-------------------- 2 files changed, 30 insertions(+), 52 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c index e63abdf52b6c..7bcd41996927 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik.c +++ b/drivers/gpu/drm/amd/amdgpu/cik.c @@ -1638,28 +1638,18 @@ static void cik_pcie_gen3_enable(struct amdgpu_device *adev) PCI_EXP_LNKCTL_HAWD); /* linkctl2 */ - pcie_capability_read_word(root, PCI_EXP_LNKCTL2, - &tmp16); - tmp16 &= ~(PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN); - tmp16 |= (bridge_cfg2 & - (PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN)); - pcie_capability_write_word(root, - PCI_EXP_LNKCTL2, - tmp16); - - pcie_capability_read_word(adev->pdev, - PCI_EXP_LNKCTL2, - &tmp16); - tmp16 &= ~(PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN); - tmp16 |= (gpu_cfg2 & - (PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN)); - pcie_capability_write_word(adev->pdev, - PCI_EXP_LNKCTL2, - tmp16); + pcie_capability_clear_and_set_word(root, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN, + bridge_cfg2 & + (PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN)); + pcie_capability_clear_and_set_word(adev->pdev, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN, + gpu_cfg2 & + (PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN)); tmp = RREG32_PCIE(ixPCIE_LC_CNTL4); tmp &= ~PCIE_LC_CNTL4__LC_SET_QUIESCE_MASK; @@ -1674,16 +1664,15 @@ static void cik_pcie_gen3_enable(struct amdgpu_device *adev) speed_cntl &= ~PCIE_LC_SPEED_CNTL__LC_FORCE_DIS_SW_SPEED_CHANGE_MASK; WREG32_PCIE(ixPCIE_LC_SPEED_CNTL, speed_cntl); - 
pcie_capability_read_word(adev->pdev, PCI_EXP_LNKCTL2, &tmp16); - tmp16 &= ~PCI_EXP_LNKCTL2_TLS; - + tmp16 = 0; if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN3) tmp16 |= PCI_EXP_LNKCTL2_TLS_8_0GT; /* gen3 */ else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN2) tmp16 |= PCI_EXP_LNKCTL2_TLS_5_0GT; /* gen2 */ else tmp16 |= PCI_EXP_LNKCTL2_TLS_2_5GT; /* gen1 */ - pcie_capability_write_word(adev->pdev, PCI_EXP_LNKCTL2, tmp16); + pcie_capability_clear_and_set_word(adev->pdev, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_TLS, tmp16); speed_cntl = RREG32_PCIE(ixPCIE_LC_SPEED_CNTL); speed_cntl |= PCIE_LC_SPEED_CNTL__LC_INITIATE_LINK_SPEED_CHANGE_MASK; diff --git a/drivers/gpu/drm/amd/amdgpu/si.c b/drivers/gpu/drm/amd/amdgpu/si.c index 4b81f29e5fd5..8ea60fdd1b1d 100644 --- a/drivers/gpu/drm/amd/amdgpu/si.c +++ b/drivers/gpu/drm/amd/amdgpu/si.c @@ -2331,28 +2331,18 @@ static void si_pcie_gen3_enable(struct amdgpu_device *adev) gpu_cfg & PCI_EXP_LNKCTL_HAWD); - pcie_capability_read_word(root, PCI_EXP_LNKCTL2, - &tmp16); - tmp16 &= ~(PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN); - tmp16 |= (bridge_cfg2 & - (PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN)); - pcie_capability_write_word(root, - PCI_EXP_LNKCTL2, - tmp16); - - pcie_capability_read_word(adev->pdev, - PCI_EXP_LNKCTL2, - &tmp16); - tmp16 &= ~(PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN); - tmp16 |= (gpu_cfg2 & - (PCI_EXP_LNKCTL2_ENTER_COMP | - PCI_EXP_LNKCTL2_TX_MARGIN)); - pcie_capability_write_word(adev->pdev, - PCI_EXP_LNKCTL2, - tmp16); + pcie_capability_clear_and_set_word(root, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN, + bridge_cfg2 & + (PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN)); + pcie_capability_clear_and_set_word(adev->pdev, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN, + gpu_cfg2 & + (PCI_EXP_LNKCTL2_ENTER_COMP | + PCI_EXP_LNKCTL2_TX_MARGIN)); tmp = 
RREG32_PCIE_PORT(PCIE_LC_CNTL4); tmp &= ~LC_SET_QUIESCE; @@ -2365,16 +2355,15 @@ static void si_pcie_gen3_enable(struct amdgpu_device *adev) speed_cntl &= ~LC_FORCE_DIS_SW_SPEED_CHANGE; WREG32_PCIE_PORT(PCIE_LC_SPEED_CNTL, speed_cntl); - pcie_capability_read_word(adev->pdev, PCI_EXP_LNKCTL2, &tmp16); - tmp16 &= ~PCI_EXP_LNKCTL2_TLS; - + tmp16 = 0; if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN3) tmp16 |= PCI_EXP_LNKCTL2_TLS_8_0GT; /* gen3 */ else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN2) tmp16 |= PCI_EXP_LNKCTL2_TLS_5_0GT; /* gen2 */ else tmp16 |= PCI_EXP_LNKCTL2_TLS_2_5GT; /* gen1 */ - pcie_capability_write_word(adev->pdev, PCI_EXP_LNKCTL2, tmp16); + pcie_capability_clear_and_set_word(adev->pdev, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_TLS, tmp16); speed_cntl = RREG32_PCIE_PORT(PCIE_LC_SPEED_CNTL); speed_cntl |= LC_INITIATE_LINK_SPEED_CHANGE; From patchwork Fri Sep 15 12:01:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ilpo_J=C3=A4rvinen?= X-Patchwork-Id: 13386978 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BFCA0EE645A for ; Fri, 15 Sep 2023 12:03:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234485AbjIOMDJ (ORCPT ); Fri, 15 Sep 2023 08:03:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55688 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232836AbjIOMDJ (ORCPT ); Fri, 15 Sep 2023 08:03:09 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0F37B2722; Fri, 15 Sep 2023 05:02:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; 
s=Intel; t=1694779365; x=1726315365; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vDsDIHqtLtfNmUG3Y8t5VDuHkp8RJgeZagGcp/a8iuA=; b=PD0Pbphd+cTrXrBvdDVaWpF4myBOn9G1gLBVBDyobay5CxrmEKWRdZ2Q uNwoCAp//+s9Mu8gny17P13Q9g0MdpUipcjuqUFBn9n2LI7YoLFsy8ozP rJirOo8WNVjdm9ZRKwIlA7oQ5pxpvCnLnuge/hEqB0qSVvHWXR4wtKK8X RhJXm8RZxIlHbB+ER9GUxQDyGPe3BlsHkwXfZ3CVaKNtKlUNTq7sXpa7d f3VAKlTLy6HE9/MGAF0HZnq895dqlZJek1tMBRnX7YQ339zDQl0MyzYBz IUGpuhzuWtwDfmZyQ0C7IZiD98i4iHHZti+uNtqCG0s0QaBHAvsZmsaSL g==; X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="378146017" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="378146017" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:22 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="774292848" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="774292848" Received: from srdoo-mobl1.ger.corp.intel.com (HELO ijarvine-mobl2.ger.corp.intel.com) ([10.252.38.99]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:14 -0700 From: =?utf-8?q?Ilpo_J=C3=A4rvinen?= To: linux-pci@vger.kernel.org, Bjorn Helgaas , Lorenzo Pieralisi , Rob Herring , =?utf-8?q?Krzysztof_Wilczy=C5=84ski?= , Lukas Wunner , Alexandru Gagniuc , Krishna chaitanya chundru , Srinivas Pandruvada , "Rafael J . 
Wysocki" , linux-pm@vger.kernel.org, Dennis Dalessandro , Jason Gunthorpe , Leon Romanovsky , linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Alex Deucher , Daniel Lezcano , Amit Kucheria , Zhang Rui , =?utf-8?q?Ilpo_J=C3=A4rvinen?= Subject: [PATCH v2 04/10] IB/hfi1: Use RMW accessors for changing LNKCTL2 Date: Fri, 15 Sep 2023 15:01:36 +0300 Message-Id: <20230915120142.32987-5-ilpo.jarvinen@linux.intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> References: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org Don't assume that only the driver would be accessing LNKCTL2. In the case of upstream (parent), the driver does not even own the device it's changing the registers for. Use RMW capability accessors which do proper locking to avoid losing concurrent updates to the register value. This change is also useful as a cleanup. 
Suggested-by: Lukas Wunner Signed-off-by: Ilpo Järvinen --- drivers/infiniband/hw/hfi1/pcie.c | 30 ++++++++---------------------- 1 file changed, 8 insertions(+), 22 deletions(-) diff --git a/drivers/infiniband/hw/hfi1/pcie.c b/drivers/infiniband/hw/hfi1/pcie.c index 08732e1ac966..60a177f52eb5 100644 --- a/drivers/infiniband/hw/hfi1/pcie.c +++ b/drivers/infiniband/hw/hfi1/pcie.c @@ -1212,14 +1212,11 @@ int do_pcie_gen3_transition(struct hfi1_devdata *dd) (u32)lnkctl2); /* only write to parent if target is not as high as ours */ if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) < target_vector) { - lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS; - lnkctl2 |= target_vector; - dd_dev_info(dd, "%s: ..new link control2: 0x%x\n", __func__, - (u32)lnkctl2); - ret = pcie_capability_write_word(parent, - PCI_EXP_LNKCTL2, lnkctl2); + ret = pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_TLS, + target_vector); if (ret) { - dd_dev_err(dd, "Unable to write to PCI config\n"); + dd_dev_err(dd, "Unable to change PCI target speed\n"); return_error = 1; goto done; } @@ -1228,22 +1225,11 @@ int do_pcie_gen3_transition(struct hfi1_devdata *dd) } dd_dev_info(dd, "%s: setting target link speed\n", __func__); - ret = pcie_capability_read_word(dd->pcidev, PCI_EXP_LNKCTL2, &lnkctl2); + ret = pcie_capability_clear_and_set_word(dd->pcidev, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_TLS, + target_vector); if (ret) { - dd_dev_err(dd, "Unable to read from PCI config\n"); - return_error = 1; - goto done; - } - - dd_dev_info(dd, "%s: ..old link control2: 0x%x\n", __func__, - (u32)lnkctl2); - lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS; - lnkctl2 |= target_vector; - dd_dev_info(dd, "%s: ..new link control2: 0x%x\n", __func__, - (u32)lnkctl2); - ret = pcie_capability_write_word(dd->pcidev, PCI_EXP_LNKCTL2, lnkctl2); - if (ret) { - dd_dev_err(dd, "Unable to write to PCI config\n"); + dd_dev_err(dd, "Unable to change PCI target speed\n"); return_error = 1; goto done; } From patchwork Fri Sep 15 12:01:37 2023 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ilpo_J=C3=A4rvinen?= X-Patchwork-Id: 13386979 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BD6CBEE645B for ; Fri, 15 Sep 2023 12:03:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234575AbjIOMDS (ORCPT ); Fri, 15 Sep 2023 08:03:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47616 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234567AbjIOMDN (ORCPT ); Fri, 15 Sep 2023 08:03:13 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 879D62729; Fri, 15 Sep 2023 05:02:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694779368; x=1726315368; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=jEq8VuTODvxdM49avhgWzqUC3yyIiRlCMHso5pMiaHY=; b=mUqBlUGikBC/pQGXNhUiux4ISA+vf0dIYTyR4kL6wpm78vLB4eDqPsM6 1zZZKOlnYd4xaReJj7BV8G8T7u9Oxl3Igl8G20hMUSDSbjAzC/zkyGN4X OSQiKTsqeLFIqo+qStkiJSk/uG97tjMqpDYVxmM9Kv3eZpG9fu1sif6Kd nJ2NTXolH6plBSEO5BZgxfCQIgdxqvtK4ZzEoO875s+2Ephzrw3bVmd3O +jhH/79sirtk3tSJqhSRC/FqFIf9pIOiZd0XTEUzEXeFcg03C56c9Djvq j0JQM7bjJGaVYEA+oNwdRqkrz1pt5cTDXaL/3BVZ2tSPaoYGye4PQ5Pyg g==; X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="378146041" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="378146041" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:26 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="774292874" X-IronPort-AV: 
E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="774292874" Received: from srdoo-mobl1.ger.corp.intel.com (HELO ijarvine-mobl2.ger.corp.intel.com) ([10.252.38.99]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:21 -0700 From: =?utf-8?q?Ilpo_J=C3=A4rvinen?= To: linux-pci@vger.kernel.org, Bjorn Helgaas , Lorenzo Pieralisi , Rob Herring , =?utf-8?q?Krzysztof_Wilczy=C5=84ski?= , Lukas Wunner , Alexandru Gagniuc , Krishna chaitanya chundru , Srinivas Pandruvada , "Rafael J . Wysocki" , linux-pm@vger.kernel.org, Bjorn Helgaas , linux-kernel@vger.kernel.org Cc: Alex Deucher , Daniel Lezcano , Amit Kucheria , Zhang Rui , =?utf-8?q?Ilpo_J=C3=A4rvinen?= Subject: [PATCH v2 05/10] PCI: Store all PCIe Supported Link Speeds Date: Fri, 15 Sep 2023 15:01:37 +0300 Message-Id: <20230915120142.32987-6-ilpo.jarvinen@linux.intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> References: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org struct pci_bus stores max_bus_speed. The Implementation Note in PCIe r6.0.1 sec 7.5.3.18, however, recommends determining supported Link Speeds using the Supported Link Speeds Vector in the Link Capabilities 2 Register (when available). Add pcie_bus_speeds to struct pci_bus to cache the Supported Link Speeds. The value is taken directly from the Supported Link Speeds Vector, or synthetized from the Max Link Speed in the Link Capabilities Register when the Link Capabilities 2 Register is not available. The pcie_bus_speeds field keeps the extra reserved zero at the least significant bit to match the Link Capabilities 2 Register layout. 
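As a worked example of the fallback described above, the following userspace sketch mirrors the selection logic with plain integers. It is an illustration under assumptions, not the kernel implementation: model_supported_speeds() is a hypothetical name, the MODEL_LNKCAP2_* values follow the definitions in this patch, and the MODEL_LNKCAP_* values are assumed to match the Max Link Speed encoding in pci_regs.h.

```c
#include <stdint.h>

/* Assumed Max Link Speed field of the Link Capabilities Register. */
#define MODEL_LNKCAP_SLS        0x0000000f
#define MODEL_LNKCAP_SLS_2_5GB  0x00000001
#define MODEL_LNKCAP_SLS_5_0GB  0x00000002

/* Supported Link Speeds Vector; bit 0 is reserved, so 2.5GT/s is bit 1. */
#define MODEL_LNKCAP2_SLS       0x000000fe
#define MODEL_LNKCAP2_SLS_2_5GB 0x00000002
#define MODEL_LNKCAP2_SLS_5_0GB 0x00000004
#define MODEL_LNKCAP2_SLS_8_0GB 0x00000008

/*
 * Return the supported speeds bitmask in Link Capabilities 2 layout.
 * Prefer the vector when it is non-zero; otherwise build it from the
 * Max Link Speed, which can only be 2.5GT/s or 5GT/s on devices that
 * predate the Link Capabilities 2 Register.
 */
static uint8_t model_supported_speeds(uint32_t linkcap, uint32_t linkcap2)
{
	uint8_t speeds = (uint8_t)(linkcap2 & MODEL_LNKCAP2_SLS);

	if (speeds)
		return speeds;

	/* 2.5GT/s is always supported; add 5GT/s if it is the maximum. */
	speeds = MODEL_LNKCAP2_SLS_2_5GB;
	if ((linkcap & MODEL_LNKCAP_SLS) == MODEL_LNKCAP_SLS_5_0GB)
		speeds |= MODEL_LNKCAP2_SLS_5_0GB;
	return speeds;
}
```

A device reporting a vector of 0x0e keeps it unchanged. A device with an all-zero Link Capabilities 2 Register and a Max Link Speed of 5GT/s yields 0x06, and one limited to 2.5GT/s yields 0x02.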
Suggested-by: Lukas Wunner Signed-off-by: Ilpo Järvinen --- drivers/pci/probe.c | 28 +++++++++++++++++++++++++++- include/linux/pci.h | 1 + include/uapi/linux/pci_regs.h | 1 + 3 files changed, 29 insertions(+), 1 deletion(-) diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 795534589b98..ca1d797a30cb 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -767,6 +767,29 @@ static enum pci_bus_speed agp_speed(int agp3, int agpstat) return agp_speeds[index]; } +/* + * Implementation Note in PCIe r6.0.1 sec 7.5.3.18 recommends determining + * supported link speeds using the Supported Link Speeds Vector in the Link + * Capabilities 2 Register (when available). + */ +static u8 pcie_get_supported_speeds(u32 linkcap, u32 linkcap2) +{ + u8 speeds; + + speeds = linkcap2 & PCI_EXP_LNKCAP2_SLS; + if (speeds) + return speeds; + + /* + * Synthetize supported link speeds from the Max Link Speed in the + * Link Capabilities Register. + */ + speeds = PCI_EXP_LNKCAP2_SLS_2_5GB; + if ((linkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_5_0GB) + speeds |= PCI_EXP_LNKCAP2_SLS_5_0GB; + return speeds; +} + static void pci_set_bus_speed(struct pci_bus *bus) { struct pci_dev *bridge = bus->self; @@ -814,12 +837,15 @@ static void pci_set_bus_speed(struct pci_bus *bus) } if (pci_is_pcie(bridge)) { - u32 linkcap; + u32 linkcap, linkcap2; u16 linksta; pcie_capability_read_dword(bridge, PCI_EXP_LNKCAP, &linkcap); bus->max_bus_speed = pcie_link_speed[linkcap & PCI_EXP_LNKCAP_SLS]; + pcie_capability_read_dword(bridge, PCI_EXP_LNKCAP2, &linkcap2); + bus->pcie_bus_speeds = pcie_get_supported_speeds(linkcap, linkcap2); + pcie_capability_read_word(bridge, PCI_EXP_LNKSTA, &linksta); pcie_update_link_speed(bus, linksta); } diff --git a/include/linux/pci.h b/include/linux/pci.h index 16db80f8b15c..cb03f3ff9d23 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -664,6 +664,7 @@ struct pci_bus { unsigned char primary; /* Number of primary bridge */ unsigned char 
max_bus_speed; /* enum pci_bus_speed */ unsigned char cur_bus_speed; /* enum pci_bus_speed */ + u8 pcie_bus_speeds;/* Supported Link Speeds Vector (+ reserved 0 at LSB) */ #ifdef CONFIG_PCI_DOMAINS_GENERIC int domain_nr; #endif diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h index e5f558d96493..2b27e4f6854a 100644 --- a/include/uapi/linux/pci_regs.h +++ b/include/uapi/linux/pci_regs.h @@ -674,6 +674,7 @@ #define PCI_EXP_DEVSTA2 0x2a /* Device Status 2 */ #define PCI_CAP_EXP_RC_ENDPOINT_SIZEOF_V2 0x2c /* end of v2 EPs w/o link */ #define PCI_EXP_LNKCAP2 0x2c /* Link Capabilities 2 */ +#define PCI_EXP_LNKCAP2_SLS 0x000000fe /* Supported Link Speeds Vector */ #define PCI_EXP_LNKCAP2_SLS_2_5GB 0x00000002 /* Supported Speed 2.5GT/s */ #define PCI_EXP_LNKCAP2_SLS_5_0GB 0x00000004 /* Supported Speed 5GT/s */ #define PCI_EXP_LNKCAP2_SLS_8_0GB 0x00000008 /* Supported Speed 8GT/s */ From patchwork Fri Sep 15 12:01:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ilpo_J=C3=A4rvinen?= X-Patchwork-Id: 13386980 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 69461EE645C for ; Fri, 15 Sep 2023 12:03:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234570AbjIOMDT (ORCPT ); Fri, 15 Sep 2023 08:03:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47652 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234584AbjIOMDR (ORCPT ); Fri, 15 Sep 2023 08:03:17 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AAFC62D52; Fri, 15 Sep 2023 05:02:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; 
i=@intel.com; q=dns/txt; s=Intel; t=1694779369; x=1726315369; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5GI9De0NedLS8Fb+1iCt8QAfte6ia8Vm/9fJLfKqyBE=; b=gH0arkA97F8RRCais4AVL+rYXaitkPJ/uVA6ezsFJNl2TcaC1YLpo+hl HGG8B3vDz1rPKOvVY2quwsc5veJaT8kWZHrY5+jGkY2kvbl3rt/c3Wi4o HCpgOo81lXN4YOI8upslYohs41gq9uCMulgmgyEiiIR/OmvRreS0So6K7 N/r0WnyCkBRWJzL1YtT06H/DTdaKa0hroiykTb5PE8SG8E3dgkZwR/oQG AuILiXfg7Rhob8c4EuWsIOEz2b91mDw6XfwMnSCy5zzRcMl8V6eJuBSBZ +b/CJebkiTGwq0qCF8a1PNS4WD2uEwMdNPA/kd92bchtmHykP/WSnuQsS w==; X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="378146062" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="378146062" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="774292920" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="774292920" Received: from srdoo-mobl1.ger.corp.intel.com (HELO ijarvine-mobl2.ger.corp.intel.com) ([10.252.38.99]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:27 -0700 From: =?utf-8?q?Ilpo_J=C3=A4rvinen?= To: linux-pci@vger.kernel.org, Bjorn Helgaas , Lorenzo Pieralisi , Rob Herring , =?utf-8?q?Krzysztof_Wilczy=C5=84ski?= , Lukas Wunner , Alexandru Gagniuc , Krishna chaitanya chundru , Srinivas Pandruvada , "Rafael J . 
Wysocki" , linux-pm@vger.kernel.org, Bjorn Helgaas , linux-kernel@vger.kernel.org Cc: Alex Deucher , Daniel Lezcano , Amit Kucheria , Zhang Rui , =?utf-8?q?Ilpo_J=C3=A4rvinen?= Subject: [PATCH v2 06/10] PCI: Cache PCIe device's Supported Speed Vector Date: Fri, 15 Sep 2023 15:01:38 +0300 Message-Id: <20230915120142.32987-7-ilpo.jarvinen@linux.intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> References: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org The Supported Link Speeds Vector in the Link Capabilities 2 Register corresponds to the bus below on Root Ports and Downstream Ports, whereas it corresponds to the bus above on Upstream Ports and Endpoints. Only the former is currently cached in pcie_bus_speeds in the struct pci_bus. The supported link speeds are the intersection of these two vectors. Store the device's Supported Link Speeds Vector into the struct pci_bus when Function 0 is enumerated (a Multi-Function Device must report the same speeds for all Functions) so that the intersection of Supported Link Speeds is easy to calculate.
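The intersection mentioned above is a plain bitwise AND of the two cached vectors. The sketch below is illustrative only: common_speeds() and highest_common_speed_bit() are hypothetical helper names, and the example vectors in the note that follows are made up.

```c
#include <stdint.h>

/* Speeds supported by both ends of the link (bit 0 stays reserved/zero). */
static uint8_t common_speeds(uint8_t bus_speeds, uint8_t dev_speeds)
{
	return bus_speeds & dev_speeds;
}

/*
 * Bit index of the fastest speed both ends support:
 * 1 = 2.5 GT/s, 2 = 5 GT/s, 3 = 8 GT/s, 4 = 16 GT/s, and so on.
 * Returns 0 when the two vectors share no speed.
 */
static int highest_common_speed_bit(uint8_t bus_speeds, uint8_t dev_speeds)
{
	uint8_t common = common_speeds(bus_speeds, dev_speeds);
	int bit;

	for (bit = 7; bit >= 1; bit--) {
		if (common & (1u << bit))
			return bit;
	}
	return 0;
}
```

For a port advertising 0x1e (2.5 to 16 GT/s) and a device advertising 0x0e (2.5 to 8 GT/s), the intersection is 0x0e and the fastest common speed is bit 3, i.e. 8 GT/s.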
Suggested-by: Lukas Wunner Signed-off-by: Ilpo Järvinen --- drivers/pci/probe.c | 10 ++++++++++ drivers/pci/remove.c | 2 ++ include/linux/pci.h | 1 + 3 files changed, 13 insertions(+) diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index ca1d797a30cb..a9408f2420e5 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2564,6 +2564,7 @@ static void pci_set_msi_domain(struct pci_dev *dev) void pci_device_add(struct pci_dev *dev, struct pci_bus *bus) { + u8 dev_speeds = 0; int ret; pci_configure_device(dev); @@ -2590,11 +2591,20 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus) pci_init_capabilities(dev); + if (pci_is_pcie(dev) && PCI_FUNC(dev->devfn) == 0) { + u32 linkcap, linkcap2; + + pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &linkcap); + pcie_capability_read_dword(dev, PCI_EXP_LNKCAP2, &linkcap2); + dev_speeds = pcie_get_supported_speeds(linkcap, linkcap2); + } /* * Add the device to our list of discovered devices * and the bus list for fixup functions, etc. 
*/ down_write(&pci_bus_sem); + if (dev_speeds) + bus->pcie_dev_speeds = dev_speeds; list_add_tail(&dev->bus_list, &bus->devices); up_write(&pci_bus_sem); diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c index d749ea8250d6..656784cfb291 100644 --- a/drivers/pci/remove.c +++ b/drivers/pci/remove.c @@ -36,6 +36,8 @@ static void pci_destroy_dev(struct pci_dev *dev) device_del(&dev->dev); down_write(&pci_bus_sem); + if (pci_is_pcie(dev) && PCI_FUNC(dev->devfn) == 0) + dev->bus->pcie_dev_speeds = 0; list_del(&dev->bus_list); up_write(&pci_bus_sem); diff --git a/include/linux/pci.h b/include/linux/pci.h index cb03f3ff9d23..b8bd3dc92032 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -665,6 +665,7 @@ struct pci_bus { unsigned char max_bus_speed; /* enum pci_bus_speed */ unsigned char cur_bus_speed; /* enum pci_bus_speed */ u8 pcie_bus_speeds;/* Supported Link Speeds Vector (+ reserved 0 at LSB) */ + u8 pcie_dev_speeds;/* Device's Supported Link Speeds Vector (+ 0 at LSB) */ #ifdef CONFIG_PCI_DOMAINS_GENERIC int domain_nr; #endif From patchwork Fri Sep 15 12:01:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ilpo_J=C3=A4rvinen?= X-Patchwork-Id: 13386981 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4898DEE6457 for ; Fri, 15 Sep 2023 12:03:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234506AbjIOMDq (ORCPT ); Fri, 15 Sep 2023 08:03:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49014 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234495AbjIOMDq (ORCPT ); Fri, 15 Sep 2023 08:03:46 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net 
(Postfix) with ESMTPS id 2C1332D75; Fri, 15 Sep 2023 05:03:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694779385; x=1726315385; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=WxEVy4hdhGkm4tL7rKxsWilDBWWfyowIoUWmNvGZdbc=; b=hsxPK0Cd1ryfz8ut5UiQz8NjvmBGqm5C3N+ntI6zZGPJa6KFd5WEo1IL WaLcks3RzVv/fruJ6/jiE0PwDGNfnJRjxf/eOg73m6bp7pMoBLldCoCfo wTk+U4XNiebs/BKItKx8Hk/UgEG1Zj+g20r4Hgr77RuR7kgu/faWoAOKV BFLlfEzM6iKcXyWPsUxO16twFHoUQqVGvm1fpk/v9+wGF43krure2vXAX lZFLwyj8W0xiCcXcDA/+xfI+GeJr/dMeO10oSjwMa2AsEIWqEHqO76H37 MdPTiwsrnrPTkrmXEvwzPLx6RBvpBNwqUVuFoga3Axx9ieu9nBaS0iiPP g==; X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="378146082" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="378146082" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="774292950" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="774292950" Received: from srdoo-mobl1.ger.corp.intel.com (HELO ijarvine-mobl2.ger.corp.intel.com) ([10.252.38.99]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:33 -0700 From: =?utf-8?q?Ilpo_J=C3=A4rvinen?= To: linux-pci@vger.kernel.org, Bjorn Helgaas , Lorenzo Pieralisi , Rob Herring , =?utf-8?q?Krzysztof_Wilczy=C5=84ski?= , Lukas Wunner , Alexandru Gagniuc , Krishna chaitanya chundru , Srinivas Pandruvada , "Rafael J . 
Wysocki" , linux-pm@vger.kernel.org, Bjorn Helgaas , =?utf-8?q?Ilpo_J=C3=A4rvinen?= , linux-kernel@vger.kernel.org Cc: Alex Deucher , Daniel Lezcano , Amit Kucheria , Zhang Rui Subject: [PATCH v2 07/10] PCI/LINK: Re-add BW notification portdrv as PCIe BW controller Date: Fri, 15 Sep 2023 15:01:39 +0300 Message-Id: <20230915120142.32987-8-ilpo.jarvinen@linux.intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> References: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org This mostly reverts b4c7d2076b4e ("PCI/LINK: Remove bandwidth notification"), with a few small tweaks: 1) Call it PCIe bwctrl (bandwidth controller) instead of just bandwidth notifications. 2) Don't print the notifications into the kernel log, just keep the current link speed updated. 3) Use concurrency-safe LNKCTL RMW operations. 4) Read the link speed after enabling the notification to ensure the current link speed is correct from the start. 5) Add a local variable in probe for srv->port. 6) Handle the link speed read and LBMS write race in pcie_bw_notification_irq(). The reason for 1) is to indicate the increased scope of the driver. A subsequent commit extends the driver to allow controlling PCIe bandwidths from user space upon crossing thermal thresholds. While 2) is somewhat unfortunate, the log spam was the source of complaints that eventually led to the removal of the bandwidth notifications driver (see the links below for further information). Once this driver is re-added, user space can, if it wishes, observe the link speed changes using the current bus speed files under sysfs.
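For illustration, user space could poll the speed with a few lines of C. This is only a sketch under assumptions: it takes the sysfs file to be the per-device current_link_speed attribute, the device address in the usage note is a placeholder, and read_link_speed() is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysfs attribute (for example a PCI device's
 * current_link_speed file) into buf, without the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or read.
 */
static int read_link_speed(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* Strip the newline that sysfs appends to the value. */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

A caller would pass a path such as /sys/bus/pci/devices/0000:00:1c.0/current_link_speed (placeholder address) and print or compare the returned string. The exact string format depends on the kernel version.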
Link: https://lore.kernel.org/all/20190429185611.121751-1-helgaas@kernel.org/ Link: https://lore.kernel.org/linux-pci/20190501142942.26972-1-keith.busch@intel.com/ Link: https://lore.kernel.org/linux-pci/20200115221008.GA191037@google.com/ Suggested-by: Lukas Wunner Signed-off-by: Ilpo Järvinen --- drivers/pci/pcie/Kconfig | 8 +++ drivers/pci/pcie/Makefile | 1 + drivers/pci/pcie/bwctrl.c | 131 +++++++++++++++++++++++++++++++++++++ drivers/pci/pcie/portdrv.c | 9 +-- drivers/pci/pcie/portdrv.h | 10 ++- 5 files changed, 153 insertions(+), 6 deletions(-) create mode 100644 drivers/pci/pcie/bwctrl.c diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig index 228652a59f27..1ef8073fa89a 100644 --- a/drivers/pci/pcie/Kconfig +++ b/drivers/pci/pcie/Kconfig @@ -137,6 +137,14 @@ config PCIE_PTM This is only useful if you have devices that support PTM, but it is safe to enable even if you don't. +config PCIE_BW + bool "PCI Express Bandwidth Change Notification" + depends on PCIEPORTBUS + help + This enables PCI Express Bandwidth Change Notification. If + you know link width or rate changes occur to correct unreliable + links, you may answer Y. 
+ config PCIE_EDR bool "PCI Express Error Disconnect Recover support" depends on PCIE_DPC && ACPI diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile index 8de4ed5f98f1..175065a495cf 100644 --- a/drivers/pci/pcie/Makefile +++ b/drivers/pci/pcie/Makefile @@ -12,4 +12,5 @@ obj-$(CONFIG_PCIEAER_INJECT) += aer_inject.o obj-$(CONFIG_PCIE_PME) += pme.o obj-$(CONFIG_PCIE_DPC) += dpc.o obj-$(CONFIG_PCIE_PTM) += ptm.o +obj-$(CONFIG_PCIE_BW) += bwctrl.o obj-$(CONFIG_PCIE_EDR) += edr.o diff --git a/drivers/pci/pcie/bwctrl.c b/drivers/pci/pcie/bwctrl.c new file mode 100644 index 000000000000..4fc6718fc0e5 --- /dev/null +++ b/drivers/pci/pcie/bwctrl.c @@ -0,0 +1,131 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * PCI Express Link Bandwidth Notification services driver + * Author: Alexandru Gagniuc + * + * Copyright (C) 2019, Dell Inc + * + * The PCIe Link Bandwidth Notification provides a way to notify the + * operating system when the link width or data rate changes. This + * capability is required for all root ports and downstream ports + * supporting links wider than x1 and/or multiple link speeds. + * + * This service port driver hooks into the bandwidth notification interrupt + * watching for link speed changes or links becoming degraded in operation + * and updates the cached link speed exposed to user space. 
+ */ + +#define dev_fmt(fmt) "bwctrl: " fmt + +#include "../pci.h" +#include "portdrv.h" + +static bool pcie_link_bandwidth_notification_supported(struct pci_dev *dev) +{ + int ret; + u32 lnk_cap; + + ret = pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnk_cap); + return (ret == PCIBIOS_SUCCESSFUL) && (lnk_cap & PCI_EXP_LNKCAP_LBNC); +} + +static void pcie_enable_link_bandwidth_notification(struct pci_dev *dev) +{ + u16 link_status; + int ret; + + pcie_capability_write_word(dev, PCI_EXP_LNKSTA, PCI_EXP_LNKSTA_LBMS); + pcie_capability_set_word(dev, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_LBMIE); + + /* Read after enabling notifications to ensure link speed is up to date */ + ret = pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &link_status); + if (ret == PCIBIOS_SUCCESSFUL) + pcie_update_link_speed(dev->subordinate, link_status); +} + +static void pcie_disable_link_bandwidth_notification(struct pci_dev *dev) +{ + pcie_capability_clear_word(dev, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_LBMIE); +} + +static irqreturn_t pcie_bw_notification_irq(int irq, void *context) +{ + struct pcie_device *srv = context; + struct pci_dev *port = srv->port; + u16 link_status, events; + int ret; + + ret = pcie_capability_read_word(port, PCI_EXP_LNKSTA, &link_status); + events = link_status & PCI_EXP_LNKSTA_LBMS; + + if (ret != PCIBIOS_SUCCESSFUL || !events) + return IRQ_NONE; + + pcie_capability_write_word(port, PCI_EXP_LNKSTA, events); + + /* + * The write to clear LBMS prevents getting interrupt from the + * latest link speed when the link speed changes between the above + * LNKSTA read and write. Therefore, re-read the speed before + * updating it. 
+ */ + ret = pcie_capability_read_word(port, PCI_EXP_LNKSTA, &link_status); + if (ret != PCIBIOS_SUCCESSFUL) + return IRQ_HANDLED; + pcie_update_link_speed(port->subordinate, link_status); + + return IRQ_HANDLED; +} + +static int pcie_bandwidth_notification_probe(struct pcie_device *srv) +{ + struct pci_dev *port = srv->port; + int ret; + + /* Single-width or single-speed ports do not have to support this. */ + if (!pcie_link_bandwidth_notification_supported(port)) + return -ENODEV; + + ret = request_irq(srv->irq, pcie_bw_notification_irq, + IRQF_SHARED, "PCIe BW ctrl", srv); + if (ret) + return ret; + + pcie_enable_link_bandwidth_notification(port); + pci_info(port, "enabled with IRQ %d\n", srv->irq); + + return 0; +} + +static void pcie_bandwidth_notification_remove(struct pcie_device *srv) +{ + pcie_disable_link_bandwidth_notification(srv->port); + free_irq(srv->irq, srv); +} + +static int pcie_bandwidth_notification_suspend(struct pcie_device *srv) +{ + pcie_disable_link_bandwidth_notification(srv->port); + return 0; +} + +static int pcie_bandwidth_notification_resume(struct pcie_device *srv) +{ + pcie_enable_link_bandwidth_notification(srv->port); + return 0; +} + +static struct pcie_port_service_driver pcie_bandwidth_notification_driver = { + .name = "pcie_bwctrl", + .port_type = PCIE_ANY_PORT, + .service = PCIE_PORT_SERVICE_BWCTRL, + .probe = pcie_bandwidth_notification_probe, + .suspend = pcie_bandwidth_notification_suspend, + .resume = pcie_bandwidth_notification_resume, + .remove = pcie_bandwidth_notification_remove, +}; + +int __init pcie_bwctrl_init(void) +{ + return pcie_port_service_register(&pcie_bandwidth_notification_driver); +} diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c index 46fad0d813b2..ed33049bffd6 100644 --- a/drivers/pci/pcie/portdrv.c +++ b/drivers/pci/pcie/portdrv.c @@ -67,7 +67,7 @@ static int pcie_message_numbers(struct pci_dev *dev, int mask, */ if (mask & (PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP | - 
PCIE_PORT_SERVICE_BWNOTIF)) { + PCIE_PORT_SERVICE_BWCTRL)) { pcie_capability_read_word(dev, PCI_EXP_FLAGS, &reg16); *pme = (reg16 & PCI_EXP_FLAGS_IRQ) >> 9; nvec = *pme + 1; @@ -149,11 +149,11 @@ static int pcie_port_enable_irq_vec(struct pci_dev *dev, int *irqs, int mask) /* PME, hotplug and bandwidth notification share an MSI/MSI-X vector */ if (mask & (PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP | - PCIE_PORT_SERVICE_BWNOTIF)) { + PCIE_PORT_SERVICE_BWCTRL)) { pcie_irq = pci_irq_vector(dev, pme); irqs[PCIE_PORT_SERVICE_PME_SHIFT] = pcie_irq; irqs[PCIE_PORT_SERVICE_HP_SHIFT] = pcie_irq; - irqs[PCIE_PORT_SERVICE_BWNOTIF_SHIFT] = pcie_irq; + irqs[PCIE_PORT_SERVICE_BWCTRL_SHIFT] = pcie_irq; } if (mask & PCIE_PORT_SERVICE_AER) @@ -270,7 +270,7 @@ static int get_port_device_capability(struct pci_dev *dev) pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &linkcap); if (linkcap & PCI_EXP_LNKCAP_LBNC) - services |= PCIE_PORT_SERVICE_BWNOTIF; + services |= PCIE_PORT_SERVICE_BWCTRL; } return services; @@ -828,6 +828,7 @@ static void __init pcie_init_services(void) pcie_pme_init(); pcie_dpc_init(); pcie_hp_init(); + pcie_bwctrl_init(); } static int __init pcie_portdrv_init(void) diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h index 58a2b1a1cae4..f622c8a02a5b 100644 --- a/drivers/pci/pcie/portdrv.h +++ b/drivers/pci/pcie/portdrv.h @@ -20,8 +20,8 @@ #define PCIE_PORT_SERVICE_HP (1 << PCIE_PORT_SERVICE_HP_SHIFT) #define PCIE_PORT_SERVICE_DPC_SHIFT 3 /* Downstream Port Containment */ #define PCIE_PORT_SERVICE_DPC (1 << PCIE_PORT_SERVICE_DPC_SHIFT) -#define PCIE_PORT_SERVICE_BWNOTIF_SHIFT 4 /* Bandwidth notification */ -#define PCIE_PORT_SERVICE_BWNOTIF (1 << PCIE_PORT_SERVICE_BWNOTIF_SHIFT) +#define PCIE_PORT_SERVICE_BWCTRL_SHIFT 4 /* Bandwidth Controller (notifications) */ +#define PCIE_PORT_SERVICE_BWCTRL (1 << PCIE_PORT_SERVICE_BWCTRL_SHIFT) #define PCIE_PORT_DEVICE_MAXSERVICES 5 @@ -53,6 +53,12 @@ int pcie_dpc_init(void); static inline int
pcie_dpc_init(void) { return 0; } #endif +#ifdef CONFIG_PCIE_BW +int pcie_bwctrl_init(void); +#else +static inline int pcie_bwctrl_init(void) { return 0; } +#endif + /* Port Type */ #define PCIE_ANY_PORT (~0) From patchwork Fri Sep 15 12:01:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ilpo_J=C3=A4rvinen?= X-Patchwork-Id: 13386982 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DF08EE6457 for ; Fri, 15 Sep 2023 12:03:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234581AbjIOMEA (ORCPT ); Fri, 15 Sep 2023 08:04:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54544 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234495AbjIOMD7 (ORCPT ); Fri, 15 Sep 2023 08:03:59 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 653FB30D0; Fri, 15 Sep 2023 05:03:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694779389; x=1726315389; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3hklPNGedGopQwHTctJ3dyBwTBHlEl8CThCkfuzvxWs=; b=H3j+Yotdjc1dRYRdtxL5hrH/gvjWJlj1wjwWjXuL6isuIHSd15eEaXVH xHU3Mx2KO3CwTphlSYgrkwzcDJzCtiRJAIKC3d/GslooEVEqlkzfptshu gZAgda00wSjoDnKNoRAAV5LPBdpYF9OgNyHF3OpBMmyMpocTGIk/oLspr wN0HbE6fU2SIP2tpO2/oFa76BL6MLfDGn0gcJQ+82d499CyP/VLyl+Ikw 95bLZIFdrBj99qKBAK3JugmBFEG0+Y02AYtu6eK+VPjUyWhdz8uqcQrzt ic7PQJdKpny2ztiEVSh2U4JuA1ecxgX1Hat66drDfK0klDW/AsFhuZ/pA g==; X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="378146102" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="378146102" Received: 
from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="774292986" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="774292986" Received: from srdoo-mobl1.ger.corp.intel.com (HELO ijarvine-mobl2.ger.corp.intel.com) ([10.252.38.99]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:39 -0700 From: =?utf-8?q?Ilpo_J=C3=A4rvinen?= To: linux-pci@vger.kernel.org, Bjorn Helgaas , Lorenzo Pieralisi , Rob Herring , =?utf-8?q?Krzysztof_Wilczy=C5=84ski?= , Lukas Wunner , Alexandru Gagniuc , Krishna chaitanya chundru , Srinivas Pandruvada , "Rafael J . Wysocki" , linux-pm@vger.kernel.org, Bjorn Helgaas , =?utf-8?q?Ilpo_J=C3=A4rvinen?= , linux-kernel@vger.kernel.org Cc: Alex Deucher , Daniel Lezcano , Amit Kucheria , Zhang Rui Subject: [PATCH v2 08/10] PCI/bwctrl: Add "controller" part into PCIe bwctrl Date: Fri, 15 Sep 2023 15:01:40 +0300 Message-Id: <20230915120142.32987-9-ilpo.jarvinen@linux.intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> References: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org Add "controller" parts into PCIe bwctrl for limiting PCIe Link Speed (due to thermal reasons). PCIe bandwidth controller introduces an in-kernel API to set PCIe Link Speed. This new API is intended to be used in an upcoming commit that adds a thermal cooling device to throttle PCIe bandwidth when thermal thresholds are reached. No users are introduced in this commit yet. The PCIe bandwidth control procedure is as follows. The requested speed is validated against Link Speeds supported by the port and downstream device. Then bandwidth controller sets the Target Link Speed in the Link Control 2 Register and retrains the PCIe Link. 
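The validation and register update steps can be sketched in user space as follows. This is a simplified model under stated assumptions, not the driver's code: speeds are expressed as Target Link Speed encodings (1 = 2.5 GT/s, 2 = 5 GT/s, 3 = 8 GT/s, ...), which match the bit positions in the Supported Link Speeds Vector. The names highest_bit(), select_target_speed() and set_target_speed() are hypothetical, and link retraining is left out.

```c
#include <stdint.h>

#define LNKCTL2_TLS 0x000fu /* Target Link Speed field in Link Control 2 */

/* Bit index of the fastest speed in a Supported Link Speeds Vector. */
static int highest_bit(uint8_t vec)
{
	for (int b = 7; b >= 1; b--) {
		if (vec & (1u << b))
			return b;
	}
	return 0;
}

/*
 * Pick the fastest speed supported by both port and device that does not
 * exceed the request. Returns the Target Link Speed encoding, or -1 when
 * the request is invalid or above what the port supports.
 */
static int select_target_speed(int requested, uint8_t port_vec, uint8_t dev_vec)
{
	uint8_t common;

	if (requested < 1 || requested > highest_bit(port_vec))
		return -1;

	/* With no device present, only the lowest speed can be selected. */
	if (!dev_vec)
		dev_vec = 0x02;

	common = port_vec & dev_vec;
	for (int b = requested; b >= 1; b--) {
		if (common & (1u << b))
			return b;
	}
	return -1;
}

/* Read-modify-write of the Target Link Speed field in a LNKCTL2 value. */
static uint16_t set_target_speed(uint16_t lnkctl2, int tls)
{
	return (uint16_t)((lnkctl2 & ~LNKCTL2_TLS) | ((unsigned int)tls & LNKCTL2_TLS));
}
```

Requesting 16 GT/s (encoding 4) on a port advertising 0x1e with a device advertising 0x0e yields encoding 3, i.e. 8 GT/s. In the driver, the register update is followed by retraining the link, which this sketch does not model.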
Bandwidth notifications enable the cur_bus_speed in the struct pci_bus to keep track of PCIe Link Speed changes. This keeps the link speed seen through sysfs correct (both for the PCI device and the thermal cooling device). While bandwidth notifications should also be generated when the bandwidth controller alters the PCIe Link Speed, a few platforms do not deliver the LBMS interrupt after Link Training as expected. Thus, after changing the Link Speed, the bandwidth controller performs an additional read of the Link Status Register to ensure cur_bus_speed is consistent with the new PCIe Link Speed. Signed-off-by: Ilpo Järvinen --- MAINTAINERS | 6 ++ drivers/pci/pcie/Kconfig | 9 +- drivers/pci/pcie/bwctrl.c | 177 +++++++++++++++++++++++++++++++++++-- include/linux/pci-bwctrl.h | 17 ++++ 4 files changed, 200 insertions(+), 9 deletions(-) create mode 100644 include/linux/pci-bwctrl.h diff --git a/MAINTAINERS b/MAINTAINERS index 90f13281d297..cd5c9b9ad32b 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -16569,6 +16569,12 @@ F: include/linux/pci* F: include/uapi/linux/pci* F: lib/pci* +PCIE BANDWIDTH CONTROLLER +M: Ilpo Järvinen +S: Supported +F: drivers/pci/pcie/bwctrl.c +F: include/linux/pci-bwctrl.h + PCIE DRIVER FOR AMAZON ANNAPURNA LABS M: Jonathan Chocron L: linux-pci@vger.kernel.org diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig index 1ef8073fa89a..1c6509cf169a 100644 --- a/drivers/pci/pcie/Kconfig +++ b/drivers/pci/pcie/Kconfig @@ -138,12 +138,13 @@ config PCIE_PTM is safe to enable even if you don't. config PCIE_BW - bool "PCI Express Bandwidth Change Notification" + bool "PCI Express Bandwidth Controller" depends on PCIEPORTBUS help - This enables PCI Express Bandwidth Change Notification. If - you know link width or rate changes occur to correct unreliable - links, you may answer Y. + This enables PCI Express Bandwidth Controller. The Bandwidth + Controller allows controlling PCIe link speed and listens for link + speed Change Notifications.
If you know link width or rate changes + occur to correct unreliable links, you may answer Y. config PCIE_EDR bool "PCI Express Error Disconnect Recover support" diff --git a/drivers/pci/pcie/bwctrl.c b/drivers/pci/pcie/bwctrl.c index 4fc6718fc0e5..e3172d69476f 100644 --- a/drivers/pci/pcie/bwctrl.c +++ b/drivers/pci/pcie/bwctrl.c @@ -1,14 +1,16 @@ // SPDX-License-Identifier: GPL-2.0+ /* - * PCI Express Link Bandwidth Notification services driver + * PCIe bandwidth controller + * * Author: Alexandru Gagniuc * * Copyright (C) 2019, Dell Inc + * Copyright (C) 2023 Intel Corporation. * - * The PCIe Link Bandwidth Notification provides a way to notify the - * operating system when the link width or data rate changes. This - * capability is required for all root ports and downstream ports - * supporting links wider than x1 and/or multiple link speeds. + * The PCIe Bandwidth Controller provides a way to alter PCIe link speeds + * and notify the operating system when the link width or data rate changes. + * The notification capability is required for all Root Ports and Downstream + * Ports supporting links wider than x1 and/or multiple link speeds. 
* * This service port driver hooks into the bandwidth notification interrupt * watching for link speed changes or links becoming degraded in operation @@ -17,9 +19,48 @@ #define dev_fmt(fmt) "bwctrl: " fmt +#include +#include +#include +#include +#include +#include + +#include + #include "../pci.h" #include "portdrv.h" +/** + * struct bwctrl_service_data - PCIe Port Bandwidth Controller + * @set_speed_mutex: serializes link speed changes + */ +struct bwctrl_service_data { + struct mutex set_speed_mutex; +}; + +static bool bwctrl_valid_pcie_speed(enum pci_bus_speed speed) +{ + return (speed >= PCIE_SPEED_2_5GT) && (speed <= PCIE_SPEED_64_0GT); +} + +static u16 speed2lnkctl2(enum pci_bus_speed speed) +{ + static const u16 speed_conv[] = { + [PCIE_SPEED_2_5GT] = PCI_EXP_LNKCTL2_TLS_2_5GT, + [PCIE_SPEED_5_0GT] = PCI_EXP_LNKCTL2_TLS_5_0GT, + [PCIE_SPEED_8_0GT] = PCI_EXP_LNKCTL2_TLS_8_0GT, + [PCIE_SPEED_16_0GT] = PCI_EXP_LNKCTL2_TLS_16_0GT, + [PCIE_SPEED_32_0GT] = PCI_EXP_LNKCTL2_TLS_32_0GT, + [PCIE_SPEED_64_0GT] = PCI_EXP_LNKCTL2_TLS_64_0GT, + }; + + if (WARN_ON_ONCE(!bwctrl_valid_pcie_speed(speed))) + return 0; + + return speed_conv[speed]; +} + static bool pcie_link_bandwidth_notification_supported(struct pci_dev *dev) { int ret; @@ -77,8 +118,118 @@ static irqreturn_t pcie_bw_notification_irq(int irq, void *context) return IRQ_HANDLED; } +/* Configure target speed to the requested speed and set train link */ +static int bwctrl_set_speed(struct pci_dev *port, u16 lnkctl2_speed) +{ + int ret; + + ret = pcie_capability_clear_and_set_word(port, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_TLS, lnkctl2_speed); + if (ret != PCIBIOS_SUCCESSFUL) + return pcibios_err_to_errno(ret); + + return 0; +} + +static int bwctrl_select_speed(struct pcie_device *srv, enum pci_bus_speed *speed) +{ + struct pci_bus *bus = srv->port->subordinate; + u8 speeds, dev_speeds; + int i; + + if (*speed > PCIE_LNKCAP2_SLS2SPEED(bus->pcie_bus_speeds)) + return -EINVAL; + + dev_speeds = 
READ_ONCE(bus->pcie_dev_speeds); + /* Only the lowest speed can be set when there are no devices */ + if (!dev_speeds) + dev_speeds = PCI_EXP_LNKCAP2_SLS_2_5GB; + + /* + * Implementation Note in PCIe r6.0.1 sec 7.5.3.18 recommends OS to + * utilize Supported Link Speeds vector for determining which link + * speeds are supported. + * + * Take into account Supported Link Speeds both from the Root Port + * and the device. + */ + speeds = bus->pcie_bus_speeds & dev_speeds; + i = BIT(fls(speeds)); + while (i >= PCI_EXP_LNKCAP2_SLS_2_5GB) { + enum pci_bus_speed candidate; + + if (speeds & i) { + candidate = PCIE_LNKCAP2_SLS2SPEED(i); + if (candidate <= *speed) { + *speed = candidate; + return 0; + } + } + i >>= 1; + } + + return -EINVAL; +} + +/** + * bwctrl_set_current_speed - Set downstream link speed for PCIe port + * @srv: PCIe port + * @speed: PCIe bus speed to set + * + * Attempts to set PCIe port link speed to @speed. As long as @speed is less + * than the maximum of what is supported by @srv, the speed is adjusted + * downwards to the best speed supported by both the port and device + * underneath it. 
+ * + * Return: + * * 0 - on success + * * -EINVAL - @speed is higher than the maximum @srv supports + * * -ETIMEDOUT - changing link speed took too long + * * -EAGAIN - link speed was changed but @speed was not achieved + */ +int bwctrl_set_current_speed(struct pcie_device *srv, enum pci_bus_speed speed) +{ + struct bwctrl_service_data *data = get_service_data(srv); + struct pci_dev *port = srv->port; + u16 link_status; + int ret; + + if (WARN_ON_ONCE(!bwctrl_valid_pcie_speed(speed))) + return -EINVAL; + + ret = bwctrl_select_speed(srv, &speed); + if (ret < 0) + return ret; + + mutex_lock(&data->set_speed_mutex); + ret = bwctrl_set_speed(port, speed2lnkctl2(speed)); + if (ret < 0) + goto unlock; + + ret = pcie_retrain_link(port, true); + if (ret < 0) + goto unlock; + + /* + * Ensure link speed updates also with platforms that have problems + * with notifications + */ + ret = pcie_capability_read_word(port, PCI_EXP_LNKSTA, &link_status); + if (ret == PCIBIOS_SUCCESSFUL) + pcie_update_link_speed(port->subordinate, link_status); + + if (port->subordinate->cur_bus_speed != speed) + ret = -EAGAIN; + +unlock: + mutex_unlock(&data->set_speed_mutex); + + return ret; +} + static int pcie_bandwidth_notification_probe(struct pcie_device *srv) { + struct bwctrl_service_data *data; struct pci_dev *port = srv->port; int ret; @@ -91,16 +242,32 @@ static int pcie_bandwidth_notification_probe(struct pcie_device *srv) if (ret) return ret; + data = kzalloc(sizeof(*data), GFP_KERNEL); + if (!data) { + ret = -ENOMEM; + goto free_irq; + } + mutex_init(&data->set_speed_mutex); + set_service_data(srv, data); + pcie_enable_link_bandwidth_notification(port); pci_info(port, "enabled with IRQ %d\n", srv->irq); return 0; + +free_irq: + free_irq(srv->irq, srv); + return ret; } static void pcie_bandwidth_notification_remove(struct pcie_device *srv) { + struct bwctrl_service_data *data = get_service_data(srv); + pcie_disable_link_bandwidth_notification(srv->port); free_irq(srv->irq, srv); + 
mutex_destroy(&data->set_speed_mutex); + kfree(data); } static int pcie_bandwidth_notification_suspend(struct pcie_device *srv) diff --git a/include/linux/pci-bwctrl.h b/include/linux/pci-bwctrl.h new file mode 100644 index 000000000000..8eae09bd03b5 --- /dev/null +++ b/include/linux/pci-bwctrl.h @@ -0,0 +1,17 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * PCIe bandwidth controller + * + * Copyright (C) 2023 Intel Corporation. + */ + +#ifndef LINUX_PCI_BWCTRL_H +#define LINUX_PCI_BWCTRL_H + +#include + +struct pcie_device; + +int bwctrl_set_current_speed(struct pcie_device *srv, enum pci_bus_speed speed); + +#endif From patchwork Fri Sep 15 12:01:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ilpo_J=C3=A4rvinen?= X-Patchwork-Id: 13386983 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 48D00EE645D for ; Fri, 15 Sep 2023 12:03:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234595AbjIOMEC (ORCPT ); Fri, 15 Sep 2023 08:04:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54598 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234546AbjIOMEB (ORCPT ); Fri, 15 Sep 2023 08:04:01 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 508C630D5; Fri, 15 Sep 2023 05:03:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694779390; x=1726315390; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Lpc/55qfjPc+bstEFURf/p/RCtJBjsTaLxy0OsxpmoE=; b=gDvMv4ySZ/9IaYLpP0uo3V5zX1Vqy4X06G1qCUpuxwzzDyPvd68KLPQc 
yXtm5wTT22SSLEv0uiXQ+j8cfyZaKAuS/mMlqjQgTXuctvomomjE9oe6k zyvgj81jxz9YWlDlcWXNEQwRE+aFMqHHnLv77IPgS2RJvW7Ni6YgdJE+9 aaNr401UJKV5fWn+1x8BxgYzjW//YpLonq53huO/z2+dNbaM1P3iFoRjx tANHq04WMqanYsEEKmIUfF2iwOL2uLskK7LECKjGn0Vq30Jhkmaf5JCqL olkxz4H4SLTKH4O1M7IuNd1FJBqkr7x3MzeisZWlche2ypBxdCmadvjg3 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="378146116" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="378146116" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="774293033" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="774293033" Received: from srdoo-mobl1.ger.corp.intel.com (HELO ijarvine-mobl2.ger.corp.intel.com) ([10.252.38.99]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:45 -0700 From: =?utf-8?q?Ilpo_J=C3=A4rvinen?= To: linux-pci@vger.kernel.org, Bjorn Helgaas , Lorenzo Pieralisi , Rob Herring , =?utf-8?q?Krzysztof_Wilczy=C5=84ski?= , Lukas Wunner , Alexandru Gagniuc , Krishna chaitanya chundru , Srinivas Pandruvada , "Rafael J . Wysocki" , linux-pm@vger.kernel.org, =?utf-8?q?Ilpo_J=C3=A4rvinen?= , Bjorn Helgaas , Daniel Lezcano , Amit Kucheria , Zhang Rui , linux-kernel@vger.kernel.org Cc: Alex Deucher Subject: [PATCH v2 09/10] thermal: Add PCIe cooling driver Date: Fri, 15 Sep 2023 15:01:41 +0300 Message-Id: <20230915120142.32987-10-ilpo.jarvinen@linux.intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> References: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org Add a thermal cooling driver to provide a path to access the PCIe bandwidth controller using the usual thermal interfaces. A cooling device is instantiated for controllable PCIe ports from the bwctrl service driver.
The thermal side state 0 means no throttling, i.e., maximum supported PCIe speed. Signed-off-by: Ilpo Järvinen Acked-by: Rafael J. Wysocki # From the cooling device interface perspective --- MAINTAINERS | 1 + drivers/pci/pcie/bwctrl.c | 11 ++++ drivers/thermal/Kconfig | 10 +++ drivers/thermal/Makefile | 2 + drivers/thermal/pcie_cooling.c | 107 +++++++++++++++++++++++++++++++++ include/linux/pci-bwctrl.h | 16 +++++ 6 files changed, 147 insertions(+) create mode 100644 drivers/thermal/pcie_cooling.c diff --git a/MAINTAINERS b/MAINTAINERS index cd5c9b9ad32b..32974417ad52 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -16573,6 +16573,7 @@ PCIE BANDWIDTH CONTROLLER M: Ilpo Järvinen S: Supported F: drivers/pci/pcie/bwctrl.c +F: drivers/thermal/pcie_cooling.c F: include/linux/pci-bwctrl.h PCIE DRIVER FOR AMAZON ANNAPURNA LABS diff --git a/drivers/pci/pcie/bwctrl.c b/drivers/pci/pcie/bwctrl.c index e3172d69476f..13c73546244e 100644 --- a/drivers/pci/pcie/bwctrl.c +++ b/drivers/pci/pcie/bwctrl.c @@ -34,9 +34,11 @@ /** * struct bwctrl_service_data - PCIe Port Bandwidth Controller * @set_speed_mutex: serializes link speed changes + * @cdev: thermal cooling device associated with the port */ struct bwctrl_service_data { struct mutex set_speed_mutex; + struct thermal_cooling_device *cdev; }; static bool bwctrl_valid_pcie_speed(enum pci_bus_speed speed) @@ -253,8 +255,16 @@ static int pcie_bandwidth_notification_probe(struct pcie_device *srv) pcie_enable_link_bandwidth_notification(port); pci_info(port, "enabled with IRQ %d\n", srv->irq); + data->cdev = pcie_cooling_device_register(port, srv); + if (IS_ERR(data->cdev)) { + ret = PTR_ERR(data->cdev); + goto disable_notifications; + } return 0; +disable_notifications: + pcie_disable_link_bandwidth_notification(srv->port); + kfree(data); free_irq: free_irq(srv->irq, srv); return ret; @@ -264,6 +274,7 @@ static void pcie_bandwidth_notification_remove(struct pcie_device *srv) { struct bwctrl_service_data *data = get_service_data(srv); 
+ pcie_cooling_device_unregister(data->cdev); pcie_disable_link_bandwidth_notification(srv->port); free_irq(srv->irq, srv); mutex_destroy(&data->set_speed_mutex); diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig index c81a00fbca7d..3a071396f1c6 100644 --- a/drivers/thermal/Kconfig +++ b/drivers/thermal/Kconfig @@ -219,6 +219,16 @@ config DEVFREQ_THERMAL If you want this support, you should say Y here. +config PCIE_THERMAL + bool "PCIe cooling support" + depends on PCIEPORTBUS + select PCIE_BW + help + This implements PCIe cooling mechanism through bandwidth reduction + for PCIe devices. + + If you want this support, you should say Y here. + config THERMAL_EMULATION bool "Thermal emulation mode support" help diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile index c934cab309ae..a0b25a2742b7 100644 --- a/drivers/thermal/Makefile +++ b/drivers/thermal/Makefile @@ -30,6 +30,8 @@ thermal_sys-$(CONFIG_CPU_IDLE_THERMAL) += cpuidle_cooling.o # devfreq cooling thermal_sys-$(CONFIG_DEVFREQ_THERMAL) += devfreq_cooling.o +thermal_sys-$(CONFIG_PCIE_THERMAL) += pcie_cooling.o + obj-$(CONFIG_K3_THERMAL) += k3_bandgap.o k3_j72xx_bandgap.o # platform thermal drivers obj-y += broadcom/ diff --git a/drivers/thermal/pcie_cooling.c b/drivers/thermal/pcie_cooling.c new file mode 100644 index 000000000000..c23b59dd0331 --- /dev/null +++ b/drivers/thermal/pcie_cooling.c @@ -0,0 +1,107 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * PCIe cooling device + * + * Copyright (C) 2023 Intel Corporation. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#define COOLING_DEV_TYPE_PREFIX "PCIe_Port_Link_Speed_" + +struct pcie_cooling_device { + struct pci_dev *port; + struct pcie_device *pdev; +}; + +static int pcie_cooling_get_max_level(struct thermal_cooling_device *cdev, unsigned long *state) +{ + struct pcie_cooling_device *pcie_cdev = cdev->devdata; + + /* cooling state 0 is same as the maximum PCIe speed */ + *state = pcie_cdev->port->subordinate->max_bus_speed - PCIE_SPEED_2_5GT; + + return 0; +} + +static int pcie_cooling_get_cur_level(struct thermal_cooling_device *cdev, unsigned long *state) +{ + struct pcie_cooling_device *pcie_cdev = cdev->devdata; + + /* cooling state 0 is same as the maximum PCIe speed */ + *state = cdev->max_state - + (pcie_cdev->port->subordinate->cur_bus_speed - PCIE_SPEED_2_5GT); + + return 0; +} + +static int pcie_cooling_set_cur_level(struct thermal_cooling_device *cdev, unsigned long state) +{ + struct pcie_cooling_device *pcie_cdev = cdev->devdata; + enum pci_bus_speed speed; + + /* cooling state 0 is same as the maximum PCIe speed */ + speed = (cdev->max_state - state) + PCIE_SPEED_2_5GT; + + return bwctrl_set_current_speed(pcie_cdev->pdev, speed); +} + +static struct thermal_cooling_device_ops pcie_cooling_ops = { + .get_max_state = pcie_cooling_get_max_level, + .get_cur_state = pcie_cooling_get_cur_level, + .set_cur_state = pcie_cooling_set_cur_level, +}; + +struct thermal_cooling_device *pcie_cooling_device_register(struct pci_dev *port, + struct pcie_device *pdev) +{ + struct pcie_cooling_device *pcie_cdev; + struct thermal_cooling_device *cdev; + size_t name_len; + char *name; + + pcie_cdev = kzalloc(sizeof(*pcie_cdev), GFP_KERNEL); + if (!pcie_cdev) + return ERR_PTR(-ENOMEM); + + pcie_cdev->port = port; + pcie_cdev->pdev = pdev; + + name_len = strlen(COOLING_DEV_TYPE_PREFIX) + strlen(pci_name(port)) + 1; + name = kzalloc(name_len, GFP_KERNEL); + if (!name) { + kfree(pcie_cdev); + 
return ERR_PTR(-ENOMEM); + } + + snprintf(name, name_len, COOLING_DEV_TYPE_PREFIX "%s", pci_name(port)); + cdev = thermal_cooling_device_register(name, pcie_cdev, &pcie_cooling_ops); + kfree(name); + + return cdev; +} + +void pcie_cooling_device_unregister(struct thermal_cooling_device *cdev) +{ + struct pcie_cooling_device *pcie_cdev = cdev->devdata; + + thermal_cooling_device_unregister(cdev); + kfree(pcie_cdev); +} + +/* For bus_speed <-> state arithmetic */ +static_assert(PCIE_SPEED_2_5GT + 1 == PCIE_SPEED_5_0GT); +static_assert(PCIE_SPEED_5_0GT + 1 == PCIE_SPEED_8_0GT); +static_assert(PCIE_SPEED_8_0GT + 1 == PCIE_SPEED_16_0GT); +static_assert(PCIE_SPEED_16_0GT + 1 == PCIE_SPEED_32_0GT); +static_assert(PCIE_SPEED_32_0GT + 1 == PCIE_SPEED_64_0GT); + +MODULE_AUTHOR("Ilpo Järvinen "); +MODULE_DESCRIPTION("PCIe cooling driver"); diff --git a/include/linux/pci-bwctrl.h b/include/linux/pci-bwctrl.h index 8eae09bd03b5..366445517b72 100644 --- a/include/linux/pci-bwctrl.h +++ b/include/linux/pci-bwctrl.h @@ -11,7 +11,23 @@ #include struct pcie_device; +struct thermal_cooling_device; int bwctrl_set_current_speed(struct pcie_device *srv, enum pci_bus_speed speed); +#ifdef CONFIG_PCIE_THERMAL +struct thermal_cooling_device *pcie_cooling_device_register(struct pci_dev *port, + struct pcie_device *pdev); +void pcie_cooling_device_unregister(struct thermal_cooling_device *cdev); +#else +static inline struct thermal_cooling_device *pcie_cooling_device_register(struct pci_dev *port, + struct pcie_device *pdev) +{ + return NULL; +} +static inline void pcie_cooling_device_unregister(struct thermal_cooling_device *cdev) +{ +} +#endif + #endif From patchwork Fri Sep 15 12:01:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ilpo_J=C3=A4rvinen?= X-Patchwork-Id: 13386984 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from 
vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 48F61EE645B for ; Fri, 15 Sep 2023 12:04:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234432AbjIOMEe (ORCPT ); Fri, 15 Sep 2023 08:04:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47892 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234444AbjIOMEd (ORCPT ); Fri, 15 Sep 2023 08:04:33 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2C8433A87; Fri, 15 Sep 2023 05:03:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694779422; x=1726315422; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0aSOAVXV3pig4axO9+jHlML+tu/afJdMdghU/1L+h8Q=; b=LYnF/WXkwsqXg5q3WVKYknZVhocc3RVjBdRDL2kcRSDtNX4/da+tX257 pAt6W13pGA8bdxdffCQTURqXVd4JJ/uPSZwEAS5R/IpubkuyDrpypBs8t m4x97iMWHSzJS9EFQsJNw6PUXXnnTMfK+0KohRKcL/+BDpWXZx74nEALM U2qWIlyp3e5EZOfKvcOafDJKDjEwFLTyiW+IrliP3acuIJtjegk+FkRKo jACKJ2dcQFeyqKnHeS8sjXwMlakvhl6ubBu/52m99VrpSD+/vHes8qqtO eSP6+WpUj7KBZ6qttBxwuY36wm1WdrySehQMsgfirnq/ND7KxChX3rHr5 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="378146139" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="378146139" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10833"; a="774293067" X-IronPort-AV: E=Sophos;i="6.02,149,1688454000"; d="scan'208";a="774293067" Received: from srdoo-mobl1.ger.corp.intel.com (HELO ijarvine-mobl2.ger.corp.intel.com) ([10.252.38.99]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Sep 2023 05:02:51 -0700 From: =?utf-8?q?Ilpo_J=C3=A4rvinen?= To: 
linux-pci@vger.kernel.org, Bjorn Helgaas , Lorenzo Pieralisi , Rob Herring , =?utf-8?q?Krzysztof_Wilczy=C5=84ski?= , Lukas Wunner , Alexandru Gagniuc , Krishna chaitanya chundru , Srinivas Pandruvada , "Rafael J . Wysocki" , linux-pm@vger.kernel.org, Shuah Khan , =?utf-8?q?Ilpo_J=C3=A4rvinen?= , linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Cc: Alex Deucher , Daniel Lezcano , Amit Kucheria , Zhang Rui Subject: [PATCH v2 10/10] selftests/pcie_bwctrl: Create selftests Date: Fri, 15 Sep 2023 15:01:42 +0300 Message-Id: <20230915120142.32987-11-ilpo.jarvinen@linux.intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> References: <20230915120142.32987-1-ilpo.jarvinen@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org Create selftests for PCIe BW control through the PCIe cooling device sysfs interface. First, the BW control selftest finds the PCIe port to test with. By default, the PCIe port with the highest bus speed is selected but another PCIe port can be provided with -d parameter. The actual test steps the cur_state of the cooling device one-by-one from max_state to what the cur_state was initially. The speed change is confirmed by observing the current_link_speed for the corresponding PCIe port. 
Signed-off-by: Ilpo Järvinen --- MAINTAINERS | 1 + tools/testing/selftests/Makefile | 1 + tools/testing/selftests/pcie_bwctrl/Makefile | 2 + .../pcie_bwctrl/set_pcie_cooling_state.sh | 122 ++++++++++++++++++ .../selftests/pcie_bwctrl/set_pcie_speed.sh | 67 ++++++++++ 5 files changed, 193 insertions(+) create mode 100644 tools/testing/selftests/pcie_bwctrl/Makefile create mode 100755 tools/testing/selftests/pcie_bwctrl/set_pcie_cooling_state.sh create mode 100755 tools/testing/selftests/pcie_bwctrl/set_pcie_speed.sh diff --git a/MAINTAINERS b/MAINTAINERS index 32974417ad52..84e6687a646b 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -16575,6 +16575,7 @@ S: Supported F: drivers/pci/pcie/bwctrl.c F: drivers/thermal/pcie_cooling.c F: include/linux/pci-bwctrl.h +F: tools/testing/selftests/pcie_bwctrl/ PCIE DRIVER FOR AMAZON ANNAPURNA LABS M: Jonathan Chocron diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index 42806add0114..18ad9acd440a 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -59,6 +59,7 @@ TARGETS += net/mptcp TARGETS += net/openvswitch TARGETS += netfilter TARGETS += nsfs +TARGETS += pcie_bwctrl TARGETS += perf_events TARGETS += pidfd TARGETS += pid_namespace diff --git a/tools/testing/selftests/pcie_bwctrl/Makefile b/tools/testing/selftests/pcie_bwctrl/Makefile new file mode 100644 index 000000000000..3e84e26341d1 --- /dev/null +++ b/tools/testing/selftests/pcie_bwctrl/Makefile @@ -0,0 +1,2 @@ +TEST_PROGS = set_pcie_cooling_state.sh +include ../lib.mk diff --git a/tools/testing/selftests/pcie_bwctrl/set_pcie_cooling_state.sh b/tools/testing/selftests/pcie_bwctrl/set_pcie_cooling_state.sh new file mode 100755 index 000000000000..3a8f91f0309e --- /dev/null +++ b/tools/testing/selftests/pcie_bwctrl/set_pcie_cooling_state.sh @@ -0,0 +1,122 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0-or-later + +SYSFS= +# Kselftest framework requirement - SKIP code is 4. 
+ksft_skip=4 +retval=0 +skipmsg="skip all tests:" + +PCIEPORTTYPE="PCIe_Port_Link_Speed" + +prerequisite() +{ + local ports + + if [ $UID != 0 ]; then + echo $skipmsg must be run as root >&2 + exit $ksft_skip + fi + + SYSFS=`mount -t sysfs | head -1 | awk '{ print $3 }'` + + if [ ! -d "$SYSFS" ]; then + echo $skipmsg sysfs is not mounted >&2 + exit $ksft_skip + fi + + if ! ls $SYSFS/class/thermal/cooling_device* > /dev/null 2>&1; then + echo $skipmsg thermal cooling devices missing >&2 + exit $ksft_skip + fi + + ports=`grep -e "^$PCIEPORTTYPE" $SYSFS/class/thermal/cooling_device*/type | wc -l` + if [ $ports -eq 0 ]; then + echo $skipmsg pcie cooling devices missing >&2 + exit $ksft_skip + fi +} + +testport= +find_pcie_port() +{ + local patt="$1" + local pcieports + local max + local cur + local delta + local bestdelta=-1 + + pcieports=`grep -l -F -e "$patt" $SYSFS/class/thermal/cooling_device*/type` + if [ -z "$pcieports" ]; then + return + fi + pcieports=${pcieports//\/type/} + # Find the port with the highest PCIe Link Speed + for port in $pcieports; do + max=`cat $port/max_state` + cur=`cat $port/cur_state` + delta=$((max-cur)) + if [ $delta -gt $bestdelta ]; then + testport="$port" + bestdelta=$delta + fi + done +} + +sysfspcidev= +find_sysfs_pci_dev() +{ + local typefile="$1/type" + local pcidir + + pcidir="$SYSFS/bus/pci/devices/`sed -e "s|^${PCIEPORTTYPE}_||g" $typefile`" + + if [ -r "$pcidir/current_link_speed" ]; then + sysfspcidev="$pcidir/current_link_speed" + fi +} + +usage() +{ + echo "Usage $0 [ -d dev ]" + echo -e "\t-d: PCIe port BDF string (e.g., 0000:00:04.0)" +} + +pattern="$PCIEPORTTYPE" +parse_arguments() +{ + while getopts d:h opt; do + case $opt in + h) + usage "$0" + exit 0 + ;; + d) + pattern="${PCIEPORTTYPE}_$OPTARG" + ;; + *) + usage "$0" + exit 0 + ;; + esac + done +} + +parse_arguments "$@" +prerequisite +find_pcie_port "$pattern" +if [ -z "$testport" ]; then + echo $skipmsg "pcie cooling device not found from sysfs" >&2 + exit $ksft_skip
+fi +find_sysfs_pci_dev "$testport" +if [ -z "$sysfspcidev" ]; then + echo $skipmsg "PCIe port device not found from sysfs" >&2 + exit $ksft_skip +fi + +./set_pcie_speed.sh "$testport" "$sysfspcidev" +retval=$? + +exit $retval diff --git a/tools/testing/selftests/pcie_bwctrl/set_pcie_speed.sh b/tools/testing/selftests/pcie_bwctrl/set_pcie_speed.sh new file mode 100755 index 000000000000..584596949312 --- /dev/null +++ b/tools/testing/selftests/pcie_bwctrl/set_pcie_speed.sh @@ -0,0 +1,67 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0-or-later + +set -e + +TESTNAME=set_pcie_speed + +declare -a PCIELINKSPEED=( + "2.5 GT/s PCIe" + "5.0 GT/s PCIe" + "8.0 GT/s PCIe" + "16.0 GT/s PCIe" + "32.0 GT/s PCIe" + "64.0 GT/s PCIe" +) + +# Kselftest framework requirement - SKIP code is 4. +ksft_skip=4 +retval=0 + +coolingdev="$1" +statefile="$coolingdev/cur_state" +maxfile="$coolingdev/max_state" +linkspeedfile="$2" + +oldstate=`cat $statefile` +maxstate=`cat $maxfile` + +set_state() +{ + local state=$1 + local linkspeed + local expected_linkspeed + + echo $state > $statefile + + sleep 1 + + linkspeed="`cat $linkspeedfile`" + expected_linkspeed=$((maxstate-state)) + expected_str="${PCIELINKSPEED[$expected_linkspeed]}" + if [ ! "${expected_str}" = "${linkspeed}" ]; then + echo "$TESTNAME failed: expected: ${expected_str}; got ${linkspeed}" + retval=1 + fi +} + +cleanup_skip () +{ + set_state $oldstate + exit $ksft_skip +} + +trap cleanup_skip EXIT + +echo "$TESTNAME: testing states $maxstate .. $oldstate with $coolingdev" +for i in $(seq $maxstate -1 $oldstate); do + set_state "$i" +done + +trap EXIT +if [ $retval -eq 0 ]; then + echo "$TESTNAME [PASS]" +else + echo "$TESTNAME [FAIL]" +fi +exit $retval
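A note for readers of the selftest: the check in set_state relies on cooling state 0 mapping to the port's maximum link speed, so the index of the expected speed is maxstate - state. A minimal sketch of that arithmetic follows. The helper name expected_linkspeed_str and the example max_state of 3 (a port whose maximum is 16.0 GT/s) are assumptions for illustration only and are not part of the patch.

```shell
#!/bin/bash
# Illustration only: expected_linkspeed_str is a hypothetical helper,
# not part of set_pcie_speed.sh.

# Same table as in set_pcie_speed.sh, lowest speed first.
declare -a PCIELINKSPEED=(
	"2.5 GT/s PCIe"
	"5.0 GT/s PCIe"
	"8.0 GT/s PCIe"
	"16.0 GT/s PCIe"
	"32.0 GT/s PCIe"
	"64.0 GT/s PCIe"
)

# $1: max_state of the cooling device, $2: cooling state to translate.
# Prints the link speed string expected for that state.
expected_linkspeed_str()
{
	local maxstate=$1
	local state=$2

	# States above max_state have no corresponding speed.
	if [ "$state" -gt "$maxstate" ]; then
		return 1
	fi

	# Cooling state 0 is the maximum speed, max_state is 2.5 GT/s.
	echo "${PCIELINKSPEED[$((maxstate-state))]}"
}

# Assumed example: a port whose maximum speed is 16.0 GT/s has max_state 3.
expected_linkspeed_str 3 0	# prints "16.0 GT/s PCIe"
expected_linkspeed_str 3 3	# prints "2.5 GT/s PCIe"
```

This mirrors the kernel side, where pcie_cooling_set_cur_level() computes speed = (max_state - state) + PCIE_SPEED_2_5GT.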