From patchwork Wed Mar 13 08:35:57 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Ming4"
X-Patchwork-Id: 13591141
From: Li Ming
To: dan.j.williams@intel.com, rrichter@amd.com, terry.bowman@amd.com
Cc: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org, Li Ming
Subject: [RFC PATCH 1/6] PCI/RCEC: Introduce pcie_walk_rcec_all()
Date: Wed, 13 Mar 2024 08:35:57 +0000
Message-Id: <20240313083602.239201-2-ming4.li@intel.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20240313083602.239201-1-ming4.li@intel.com>
References: <20240313083602.239201-1-ming4.li@intel.com>
X-Mailing-List: linux-cxl@vger.kernel.org

The PCIe RCEC core only provides pcie_walk_rcec(), which walks the RCiEP
devices associated with an RCEC. The CXL subsystem also needs a helper
that can walk all devices in the RCEC's associated bus range, not only
RCiEPs, for the RAS error case below.

CXL r3.1 section 12.2.2 mentions that CXL.cachemem protocol errors
detected by a CXL root port could be logged in the RCEC's AER Extended
Capability. The recommended solution from CXL r3.1 section 9.18.1.5 is:

  "Probe all CXL Downstream Ports and determine whether they have logged
   an error in the CXL.io or CXL.cachemem status registers."

Introduce pcie_walk_rcec_all() so that the CXL RAS error handler can
locate all CXL root ports or CXL devices in the RCEC's associated bus
range.

Signed-off-by: Li Ming
---
 drivers/pci/pci.h       |  6 ++++++
 drivers/pci/pcie/rcec.c | 44 +++++++++++++++++++++++++++++++++++++++--
 2 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 5ecbcf041179..a068f2d7dd28 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -444,6 +444,9 @@ void pcie_link_rcec(struct pci_dev *rcec);
 void pcie_walk_rcec(struct pci_dev *rcec,
 		    int (*cb)(struct pci_dev *, void *),
 		    void *userdata);
+void pcie_walk_rcec_all(struct pci_dev *rcec,
+			int (*cb)(struct pci_dev *, void *),
+			void *userdata);
 #else
 static inline void pci_rcec_init(struct pci_dev *dev) { }
 static inline void pci_rcec_exit(struct pci_dev *dev) { }
@@ -451,6 +454,9 @@ static inline void pcie_link_rcec(struct pci_dev *rcec) { }
 static inline void pcie_walk_rcec(struct pci_dev *rcec,
 				  int (*cb)(struct pci_dev *, void *),
 				  void *userdata) { }
+static inline void pcie_walk_rcec_all(struct pci_dev *rcec,
+				      int (*cb)(struct pci_dev *, void *),
+				      void *userdata) { }
 #endif

 #ifdef CONFIG_PCI_ATS
diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
index d0bcd141ac9c..189de280660c 100644
--- a/drivers/pci/pcie/rcec.c
+++ b/drivers/pci/pcie/rcec.c
@@ -65,6 +65,15 @@ static int walk_rcec_helper(struct pci_dev *dev, void *data)
 	return 0;
 }

+static int walk_rcec_all_helper(struct pci_dev *dev, void *data)
+{
+	struct walk_rcec_data *rcec_data = data;
+
+	rcec_data->user_callback(dev, rcec_data->user_data);
+
+	return 0;
+}
+
 static void walk_rcec(int (*cb)(struct pci_dev *dev, void *data),
 		      void *userdata)
 {
@@ -83,7 +92,7 @@ static void walk_rcec(int (*cb)(struct pci_dev *dev, void *data),
 	nextbusn = rcec->rcec_ea->nextbusn;
 	lastbusn = rcec->rcec_ea->lastbusn;

-	/* All RCiEP devices are on the same bus as the RCEC */
+	/* All devices are on the same bus as the RCEC */
 	if (nextbusn == 0xff && lastbusn == 0x00)
 		return;

@@ -96,7 +105,7 @@ static void walk_rcec(int (*cb)(struct pci_dev *dev, void *data),
 		if (!bus)
 			continue;

-		/* Find RCiEP devices on the given bus ranges */
+		/* Find devices on the given bus ranges */
 		pci_walk_bus(bus, cb, rcec_data);
 	}
 }
@@ -146,6 +155,37 @@ void pcie_walk_rcec(struct pci_dev *rcec, int (*cb)(struct pci_dev *, void *),
 	walk_rcec(walk_rcec_helper, &rcec_data);
 }

+/**
+ * pcie_walk_rcec_all - Walk all devices in RCEC's bus range.
+ * @rcec:	RCEC whose devices should be walked
+ * @cb:		Callback to be called for each device found
+ * @userdata:	Arbitrary pointer to be passed to callback
+ *
+ * It is implemented only for CXL cases.
+ * Per CXL r3.1 section 12.2.2, CXL protocol errors detected by a
+ * CXL root port could be logged in an RCEC's AER Extended Capability.
+ * Per CXL r3.1 section 9.18.1.5, the recommendation is to probe
+ * all CXL root ports to determine whether they have logged an error.
+ * Provide this function so CXL can walk the given RCEC; the CXL driver
+ * will figure out which CXL root ports detected errors.
+ *
+ * If @cb returns anything other than 0, break out.
+ */
+void pcie_walk_rcec_all(struct pci_dev *rcec, int (*cb)(struct pci_dev *, void *),
+			void *userdata)
+{
+	struct walk_rcec_data rcec_data;
+
+	if (!rcec->rcec_ea)
+		return;
+
+	rcec_data.rcec = rcec;
+	rcec_data.user_callback = cb;
+	rcec_data.user_data = userdata;
+
+	walk_rcec(walk_rcec_all_helper, &rcec_data);
+}
+
 void pci_rcec_init(struct pci_dev *dev)
 {
 	struct rcec_ea *rcec_ea;

From patchwork Wed Mar 13 08:35:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Ming4"
X-Patchwork-Id: 13591142
From: Li Ming
To: dan.j.williams@intel.com, rrichter@amd.com, terry.bowman@amd.com
Cc: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org, Li Ming
Subject: [RFC PATCH 2/6] PCI/CXL: A new attribute to indicate CXL-capable host bridge
Date: Wed, 13 Mar 2024 08:35:58 +0000
Message-Id: <20240313083602.239201-3-ming4.li@intel.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20240313083602.239201-1-ming4.li@intel.com>
References: <20240313083602.239201-1-ming4.li@intel.com>
X-Mailing-List: linux-cxl@vger.kernel.org

Introduce a new attribute called 'is_cxl' in struct pci_host_bridge to
indicate whether the PCI host bridge is CXL capable.

Suggested-by: Dan Williams
Signed-off-by: Li Ming
---
 drivers/acpi/pci_root.c | 1 +
 include/linux/pci.h     | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 58b89b8d950e..4ab0dc8450ce 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -1034,6 +1034,7 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
 		goto out_release_info;

 	host_bridge = to_pci_host_bridge(bus->bridge);
+	host_bridge->is_cxl = is_cxl(root);
 	if (!(root->osc_control_set & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL))
 		host_bridge->native_pcie_hotplug = 0;
 	if (!(root->osc_control_set & OSC_PCI_SHPC_NATIVE_HP_CONTROL))
diff --git a/include/linux/pci.h b/include/linux/pci.h
index bf6c02bee49f..bbe90e730285 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -587,6 +587,7 @@ struct pci_host_bridge {
 	unsigned int	preserve_config:1;	/* Preserve FW resource setup */
 	unsigned int	size_windows:1;		/* Enable root bus sizing */
 	unsigned int	msi_domain:1;		/* Bridge wants MSI domain */
+	unsigned int	is_cxl:1;		/* CXL capable Host Bridge */

 	/* Resource alignment requirements */
 	resource_size_t (*align_resource)(struct pci_dev *dev,

From patchwork Wed Mar 13 08:35:59 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Ming4"
X-Patchwork-Id: 13591143
From: Li Ming
To: dan.j.williams@intel.com, rrichter@amd.com, terry.bowman@amd.com
Cc: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org, Li Ming
Subject: [RFC PATCH 3/6] PCI/AER: Enable RCEC to report internal error for CXL root port
Date: Wed, 13 Mar 2024 08:35:59 +0000
Message-Id: <20240313083602.239201-4-ming4.li@intel.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20240313083602.239201-1-ming4.li@intel.com>
References: <20240313083602.239201-1-ming4.li@intel.com>
X-Mailing-List: linux-cxl@vger.kernel.org

Per CXL r3.1 section 12.2.2, CXL.cachemem protocol errors detected by a
CXL root port could be logged in the RCEC's AER Extended Capability as
PCI_ERR_UNC_INTN or PCI_ERR_COR_INTERNAL. Unmask these errors for that
case.
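
For illustration only (not part of this patch): a minimal user-space
sketch of what "unmask" means at the register level, assuming the two
AER mask registers are modelled as plain 32-bit fields. The struct and
function names are hypothetical, and the bit values are quoted from
memory of include/uapi/linux/pci_regs.h, so treat them as assumptions.
The kernel itself goes through PCI config space accessors rather than a
struct like this.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit values, as recalled from include/uapi/linux/pci_regs.h */
#define PCI_ERR_UNC_INTN	0x00400000	/* Uncorrectable Internal Error */
#define PCI_ERR_COR_INTERNAL	0x00004000	/* Corrected Internal Error */

/* Hypothetical stand-in for the two AER mask registers of an RCEC */
struct aer_mask_regs {
	uint32_t uncor_mask;	/* AER Uncorrectable Error Mask */
	uint32_t cor_mask;	/* AER Correctable Error Mask */
};

/*
 * A set bit in a mask register suppresses reporting of that error.
 * Clear only the internal-error bits; every other mask bit is kept.
 */
static void unmask_internal_errors(struct aer_mask_regs *regs)
{
	assert(regs);
	regs->uncor_mask &= ~(uint32_t)PCI_ERR_UNC_INTN;
	regs->cor_mask &= ~(uint32_t)PCI_ERR_COR_INTERNAL;
}
```

Calling unmask_internal_errors() on a struct whose mask fields have
these bits set clears exactly those two bits and nothing else.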

Signed-off-by: Li Ming
---
 drivers/pci/pcie/aer.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 42a3bd35a3e1..364c74e47273 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -985,7 +985,7 @@ static bool cxl_error_is_native(struct pci_dev *dev)
 {
 	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);

-	return (pcie_ports_native || host->native_aer);
+	return (pcie_ports_native || host->native_aer) && host->is_cxl;
 }

 static bool is_internal_error(struct aer_err_info *info)
@@ -1041,8 +1041,13 @@ static int handles_cxl_error_iter(struct pci_dev *dev, void *data)
 {
 	bool *handles_cxl = data;

-	if (!*handles_cxl)
-		*handles_cxl = is_cxl_mem_dev(dev) && cxl_error_is_native(dev);
+	if (!*handles_cxl && cxl_error_is_native(dev)) {
+		if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END &&
+		    dev->rcec && is_cxl_mem_dev(dev))
+			*handles_cxl = true;
+		if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
+			*handles_cxl = true;
+	}

 	/* Non-zero terminates iteration */
 	return *handles_cxl;
@@ -1054,13 +1059,18 @@ static bool handles_cxl_errors(struct pci_dev *rcec)

 	if (pci_pcie_type(rcec) == PCI_EXP_TYPE_RC_EC &&
 	    pcie_aer_is_native(rcec))
-		pcie_walk_rcec(rcec, handles_cxl_error_iter, &handles_cxl);
+		pcie_walk_rcec_all(rcec, handles_cxl_error_iter, &handles_cxl);

 	return handles_cxl;
 }

-static void cxl_rch_enable_rcec(struct pci_dev *rcec)
+static void cxl_enable_rcec(struct pci_dev *rcec)
 {
+	/*
+	 * Enable RCEC's internal error report for two cases:
+	 * 1. RCiEP detected CXL.cachemem protocol errors
+	 * 2. CXL root port detected CXL.cachemem protocol errors.
+	 */
 	if (!handles_cxl_errors(rcec))
 		return;

@@ -1069,7 +1079,7 @@ static void cxl_rch_enable_rcec(struct pci_dev *rcec)
 }

 #else
-static inline void cxl_rch_enable_rcec(struct pci_dev *dev) { }
+static inline void cxl_enable_rcec(struct pci_dev *dev) { }
 static inline void cxl_rch_handle_error(struct pci_dev *dev,
 					struct aer_err_info *info) { }
 #endif
@@ -1494,7 +1504,7 @@ static int aer_probe(struct pcie_device *dev)
 		return status;
 	}

-	cxl_rch_enable_rcec(port);
+	cxl_enable_rcec(port);
 	aer_enable_rootport(rpc);
 	pci_info(port, "enabled with IRQ %d\n", dev->irq);
 	return 0;

From patchwork Wed Mar 13 08:36:00 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Ming4"
X-Patchwork-Id: 13591144
From: Li Ming
To: dan.j.williams@intel.com, rrichter@amd.com, terry.bowman@amd.com
Cc: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org, Li Ming
Subject: [RFC PATCH 4/6] PCI/AER: Extend RCH RAS error handling to support VH topology case
Date: Wed, 13 Mar 2024 08:36:00 +0000
Message-Id: <20240313083602.239201-5-ming4.li@intel.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20240313083602.239201-1-ming4.li@intel.com>
References: <20240313083602.239201-1-ming4.li@intel.com>
X-Mailing-List: linux-cxl@vger.kernel.org

When an RCEC captures CXL.cachemem protocol errors detected by a CXL
root port, the recommendation from CXL r3.1 section 9.18.1.5 is:

  "Probe all CXL Downstream Ports and determine whether they have logged
   an error in the CXL.io or CXL.cachemem status registers."

The flow is similar to RCH RAS error handling, so reuse it to support
the above case.

Signed-off-by: Li Ming
---
 drivers/pci/pcie/aer.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 364c74e47273..79bfa5fb78f4 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -996,11 +996,15 @@ static bool is_internal_error(struct aer_err_info *info)
 	return info->status & PCI_ERR_UNC_INTN;
 }

-static int cxl_rch_handle_error_iter(struct pci_dev *dev, void *data)
+static int cxl_handle_error_iter(struct pci_dev *dev, void *data)
 {
 	struct aer_err_info *info = (struct aer_err_info *)data;
 	const struct pci_error_handlers *err_handler;

+	/* Skip RCiEP devices not associated with an RCEC */
+	if ((pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) &&
+	    !dev->rcec)
+		return 0;
 	if (!is_cxl_mem_dev(dev) || !cxl_error_is_native(dev))
 		return 0;

@@ -1025,16 +1029,16 @@ static int cxl_rch_handle_error_iter(struct pci_dev *dev, void *data)
 	return 0;
 }

-static void cxl_rch_handle_error(struct pci_dev *dev, struct aer_err_info *info)
+static void cxl_handle_error(struct pci_dev *dev, struct aer_err_info *info)
 {
 	/*
 	 * Internal errors of an RCEC indicate an AER error in an
-	 * RCH's downstream port. Check and handle them in the CXL.mem
-	 * device driver.
+	 * RCH's downstream port or a CXL root port. Check and handle
+	 * them in the CXL.mem device driver.
*/ if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC && is_internal_error(info)) - pcie_walk_rcec(dev, cxl_rch_handle_error_iter, info); + pcie_walk_rcec_all(dev, cxl_handle_error_iter, info); } static int handles_cxl_error_iter(struct pci_dev *dev, void *data) @@ -1080,8 +1084,8 @@ static void cxl_enable_rcec(struct pci_dev *rcec) #else static inline void cxl_enable_rcec(struct pci_dev *dev) { } -static inline void cxl_rch_handle_error(struct pci_dev *dev, - struct aer_err_info *info) { } +static inline void cxl_handle_error(struct pci_dev *dev, + struct aer_err_info *info) { } #endif /** @@ -1119,7 +1123,7 @@ static void pci_aer_handle_error(struct pci_dev *dev, struct aer_err_info *info) static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info) { - cxl_rch_handle_error(dev, info); + cxl_handle_error(dev, info); pci_aer_handle_error(dev, info); pci_dev_put(dev); } From patchwork Wed Mar 13 08:36:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Ming4" X-Patchwork-Id: 13591145 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EECE22E64E; Wed, 13 Mar 2024 09:04:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.8 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710320687; cv=none; b=A5kllIJeOhFaO+nKhTqNz/jn7uIqKBTZtdMPW40lT3s/+gS2YlzKg2VbQiSsZARsbare64VtLFiImCIirNG1uxfHS+CqkjtYbWyQZBT2Q48sC27wWJT3sSKpJbjF2RZORbqfv9/yZUhowC0Mz4uz2IkcPJCkppVnTqsXg1fxD+0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710320687; c=relaxed/simple; bh=bdjRVrFiBAZVBjFLvU3URBuucAkGET+Mkv7SXM2nCHo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=rfNVyr7Myf79MwWb7T6/t8qAOKtIz5DyUvdLkcWBm828Um1RWRmzapics3rObd6aCcqr5XyHe79628+AIaW+NswJgbq8Lzws6Chw+LqTFmNnoPYinLcIkf6utNzYbPZ/e6kmGGT2q9xZFmUjilltWnYsst3yppAz/E7gh51FbYQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=nHqpBe+s; arc=none smtp.client-ip=192.198.163.8 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="nHqpBe+s" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1710320686; x=1741856686; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=bdjRVrFiBAZVBjFLvU3URBuucAkGET+Mkv7SXM2nCHo=; b=nHqpBe+sHBpfrEdmb/+KnnFw6U64YWVPntohXd2f+65pbwNIbQUcZjp+ RTJgCY8P6QNQ8NqBLsATJoU8qssWU+VQ+t5zPAtvh47fMF+Fh6RDppJ4+ Ki0glwuMl+tAgfMhISmIfKdjmh4Y4uc60xtLromv1aLd6u/+rrkJTM9Vo GAPO+1FNF77ebVrAVQys9x+k6nlnjEaeoWcIltj8m+nMovrWVd8Mgi3gZ yyWvzA4bbDI1uNwt8xVLhQs1Jg7IPhVAfk2C+GOfUa3MFBc4iLz8/6n9g c5/aVeXvwLzdKCI7lraAbxgNwt5JblYxTBDXDt9c+V2CcLb1Yf10V3fBh A==; X-IronPort-AV: E=McAfee;i="6600,9927,11011"; a="22586900" X-IronPort-AV: E=Sophos;i="6.07,119,1708416000"; d="scan'208";a="22586900" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by fmvoesa102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2024 02:04:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.07,119,1708416000"; d="scan'208";a="12265646" Received: from s2600wttr.bj.intel.com ([10.240.192.113]) by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2024 02:04:43 -0700 From: Li Ming To: dan.j.williams@intel.com, rrichter@amd.com, 
terry.bowman@amd.com Cc: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org, Li Ming Subject: [RFC PATCH 5/6] cxl: Use __free() for cxl_pci/mem_find_port() to drop put_device() Date: Wed, 13 Mar 2024 08:36:01 +0000 Message-Id: <20240313083602.239201-6-ming4.li@intel.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20240313083602.239201-1-ming4.li@intel.com> References: <20240313083602.239201-1-ming4.li@intel.com> Precedence: bulk X-Mailing-List: linux-cxl@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Introduce a new helper function called put_cxl_port() to instead of the put_device() in order to release the device reference of struct cxl_port got via get_device() in cxl_pci/mem_find_port(). Besides, use scope-based resource management __free() to drop the open coded put_device() for each cxl_pci/mem_find_port(). Suggested-by: Dan Williams Signed-off-by: Li Ming --- drivers/cxl/core/pci.c | 6 ++---- drivers/cxl/core/port.c | 9 +++++++++ drivers/cxl/cxl.h | 2 ++ drivers/cxl/mem.c | 5 ++--- drivers/cxl/pci.c | 12 +++++------- 5 files changed, 20 insertions(+), 14 deletions(-) diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c index 6c9c8d92f8f7..7254484330d2 100644 --- a/drivers/cxl/core/pci.c +++ b/drivers/cxl/core/pci.c @@ -902,15 +902,13 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds) struct pci_dev *pdev = to_pci_dev(cxlds->dev); struct aer_capability_regs aer_regs; struct cxl_dport *dport; - struct cxl_port *port; int severity; + struct cxl_port *port __free(put_cxl_port) = + cxl_pci_find_port(pdev, &dport); - port = cxl_pci_find_port(pdev, &dport); if (!port) return; - put_device(&port->dev); - if (!cxl_rch_get_aer_info(dport->regs.dport_aer, &aer_regs)) return; diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index e59d9d37aa65..6e2fc2fce7c9 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -1671,6 +1671,15 @@ struct cxl_port *cxl_mem_find_port(struct 
cxl_memdev *cxlmd,
 }
 EXPORT_SYMBOL_NS_GPL(cxl_mem_find_port, CXL);
 
+void put_cxl_port(struct cxl_port *port)
+{
+	if (!port)
+		return;
+
+	put_device(&port->dev);
+}
+EXPORT_SYMBOL_NS_GPL(put_cxl_port, CXL);
+
 static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
 				    struct cxl_port *port, int *target_map)
 {
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index b6017c0c57b4..476158782e3e 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -743,6 +743,8 @@ struct cxl_port *cxl_pci_find_port(struct pci_dev *pdev,
 				   struct cxl_dport **dport);
 struct cxl_port *cxl_mem_find_port(struct cxl_memdev *cxlmd,
 				   struct cxl_dport **dport);
+void put_cxl_port(struct cxl_port *port);
+DEFINE_FREE(put_cxl_port, struct cxl_port *, if (_T) put_cxl_port(_T))
 bool schedule_cxl_memdev_detach(struct cxl_memdev *cxlmd);
 
 struct cxl_dport *devm_cxl_add_dport(struct cxl_port *port,
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index c5c9d8e0d88d..5aaa8ee2a46d 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -109,7 +109,6 @@ static int cxl_mem_probe(struct device *dev)
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
 	struct device *endpoint_parent;
-	struct cxl_port *parent_port;
 	struct cxl_dport *dport;
 	struct dentry *dentry;
 	int rc;
@@ -146,7 +145,8 @@ static int cxl_mem_probe(struct device *dev)
 	if (rc)
 		return rc;
 
-	parent_port = cxl_mem_find_port(cxlmd, &dport);
+	struct cxl_port *parent_port __free(put_cxl_port) =
+		cxl_mem_find_port(cxlmd, &dport);
 	if (!parent_port) {
 		dev_err(dev, "CXL port topology not found\n");
 		return -ENXIO;
@@ -170,7 +170,6 @@ static int cxl_mem_probe(struct device *dev)
 	rc = devm_cxl_add_endpoint(endpoint_parent, cxlmd, dport);
 unlock:
 	device_unlock(endpoint_parent);
-	put_device(&parent_port->dev);
 	if (rc)
 		return rc;
 
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 4fd1f207c84e..d0ec8c5b1e99 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -473,23 +473,21 @@ static bool is_cxl_restricted(struct pci_dev *pdev)
 static int cxl_rcrb_get_comp_regs(struct pci_dev *pdev,
 				  struct cxl_register_map *map)
 {
-	struct cxl_port *port;
 	struct cxl_dport *dport;
 	resource_size_t component_reg_phys;
+	struct cxl_port *port __free(put_cxl_port) =
+		cxl_pci_find_port(pdev, &dport);
+
+	if (!port)
+		return -EPROBE_DEFER;
 
 	*map = (struct cxl_register_map) {
 		.host = &pdev->dev,
 		.resource = CXL_RESOURCE_NONE,
 	};
 
-	port = cxl_pci_find_port(pdev, &dport);
-	if (!port)
-		return -EPROBE_DEFER;
-
 	component_reg_phys = cxl_rcd_component_reg_phys(&pdev->dev, dport);
 
-	put_device(&port->dev);
-
 	if (component_reg_phys == CXL_RESOURCE_NONE)
 		return -ENXIO;
 
From patchwork Wed Mar 13 08:36:02 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Ming4"
X-Patchwork-Id: 13591146
From: Li Ming
To: dan.j.williams@intel.com, rrichter@amd.com, terry.bowman@amd.com
Cc: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org, Li Ming
Subject: [RFC PATCH 6/6] cxl/pci: Support to handle root port RAS errors captured by RCEC
Date: Wed, 13 Mar 2024 08:36:02 +0000
Message-Id:
 <20240313083602.239201-7-ming4.li@intel.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20240313083602.239201-1-ming4.li@intel.com>
References: <20240313083602.239201-1-ming4.li@intel.com>
Precedence: bulk
X-Mailing-List: linux-cxl@vger.kernel.org
MIME-Version: 1.0

The CXL subsystem already supports RCH RAS error handling, which depends
on the RCEC. Reuse and extend that RCH topology support to handle errors
detected by a root port and logged in the RCEC.

Signed-off-by: Li Ming
---
 drivers/cxl/core/pci.c | 83 ++++++++++++++++++++++++++++--------------
 1 file changed, 56 insertions(+), 27 deletions(-)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 7254484330d2..154812f1f450 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -837,18 +837,6 @@ void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_setup_parent_dport, CXL);
 
-static void cxl_handle_rdport_cor_ras(struct cxl_dev_state *cxlds,
-				      struct cxl_dport *dport)
-{
-	return __cxl_handle_cor_ras(cxlds, dport->regs.ras);
-}
-
-static bool cxl_handle_rdport_ras(struct cxl_dev_state *cxlds,
-				  struct cxl_dport *dport)
-{
-	return __cxl_handle_ras(cxlds, dport->regs.ras);
-}
-
 /*
  * Copy the AER capability registers using 32 bit read accesses.
  * This is necessary because RCRB AER capability is MMIO mapped. Clear the
@@ -897,10 +885,45 @@ static bool cxl_rch_get_aer_severity(struct aer_capability_regs *aer_regs,
 	return false;
 }
 
-static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
+/* Get AER severity from CXL RAS Capability */
+static bool cxl_ras_get_aer_severity(void __iomem *ras_base, int *severity)
+{
+	void __iomem *addr;
+	u32 ue_severity;
+	u32 status;
+
+	if (!ras_base)
+		return false;
+
+	addr = ras_base + CXL_RAS_UNCORRECTABLE_STATUS_OFFSET;
+	status = readl(addr);
+	addr = ras_base + CXL_RAS_UNCORRECTABLE_SEVERITY_OFFSET;
+	ue_severity = readl(addr);
+	status &= CXL_RAS_UNCORRECTABLE_STATUS_MASK;
+	if (status) {
+		if (status & ue_severity)
+			*severity = AER_FATAL;
+		else
+			*severity = AER_NONFATAL;
+
+		return true;
+	}
+
+	addr = ras_base + CXL_RAS_CORRECTABLE_STATUS_OFFSET;
+	status = readl(addr);
+	if (status & CXL_RAS_CORRECTABLE_STATUS_MASK) {
+		*severity = AER_CORRECTABLE;
+		return true;
+	}
+
+	return false;
+}
+
+static void cxl_handle_dport_errors(struct cxl_dev_state *cxlds)
 {
 	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
 	struct aer_capability_regs aer_regs;
+	struct pci_dev *dport_pdev;
 	struct cxl_dport *dport;
 	int severity;
 	struct cxl_port *port __free(put_cxl_port) =
@@ -909,31 +932,38 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
 	if (!port)
 		return;
 
-	if (!cxl_rch_get_aer_info(dport->regs.dport_aer, &aer_regs))
-		return;
-
-	if (!cxl_rch_get_aer_severity(&aer_regs, &severity))
-		return;
+	if (cxlds->rcd) {
+		if (!cxl_rch_get_aer_info(dport->regs.dport_aer, &aer_regs))
+			return;
 
-	pci_print_aer(pdev, severity, &aer_regs);
+		if (!cxl_rch_get_aer_severity(&aer_regs, &severity))
+			return;
 
+		pci_print_aer(pdev, severity, &aer_regs);
+	} else {
+		dport_pdev = to_pci_dev(dport->dport_dev);
+		/* TODO: add support for switch downstream port error handling */
+		if (pci_pcie_type(dport_pdev) != PCI_EXP_TYPE_ROOT_PORT)
+			return;
+		if (!cxl_ras_get_aer_severity(dport->regs.ras, &severity))
+			return;
+	}
 	if (severity == AER_CORRECTABLE)
-		cxl_handle_rdport_cor_ras(cxlds, dport);
+		__cxl_handle_cor_ras(cxlds, dport->regs.ras);
 	else
-		cxl_handle_rdport_ras(cxlds, dport);
+		__cxl_handle_ras(cxlds, dport->regs.ras);
+
 }
 
 #else
-static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds) { }
+static void cxl_handle_dport_errors(struct cxl_dev_state *cxlds) { }
 #endif
 
 void cxl_cor_error_detected(struct pci_dev *pdev)
 {
 	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
 
-	if (cxlds->rcd)
-		cxl_handle_rdport_errors(cxlds);
-
+	cxl_handle_dport_errors(cxlds);
 	cxl_handle_endpoint_cor_ras(cxlds);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_cor_error_detected, CXL);
@@ -946,8 +976,7 @@ pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
 	struct device *dev = &cxlmd->dev;
 	bool ue;
 
-	if (cxlds->rcd)
-		cxl_handle_rdport_errors(cxlds);
+	cxl_handle_dport_errors(cxlds);
 
 	/*
 	 * A frozen channel indicates an impending reset which is fatal to