From patchwork Wed Oct 23 01:43:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13846300 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C79EA20323; Wed, 23 Oct 2024 01:43:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729647808; cv=none; b=HTnl4AWwJ0qs+KzjbOZPdVtLqSa3xg61Fch5+IYJjC8n1USJ5oBwJZWM+zr3KzdIuWU5vyrMzizsG9kui9VqobUvOY5IGzi2wqSfKN9e09h6p7t1nVMQrJN7fCoALWmVfzGEMbYgVo92cZguTJHzfCIi7hd2NLUBfxCadMSNAhQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729647808; c=relaxed/simple; bh=hRlt6zWXN+JO1BCxg9TJx3kZIWg+qY5u+t8CECAw79I=; h=Subject:From:To:Cc:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=mUwWZgEq6eGZLCAnRmoA3vzqTCgCOCmU+Dm7E0t7UwqKh5OBY422Aa15BIWGle1IO55Tg0koJQJVxWXwEtin2IYSN0ZjVGoOZ6l7nNQtLGbGQcD0n+c8bShflYE6eS4yvG5gcO/JYxO/7eX40xh0Ffa+FAUrcBIrYHYx2awMtAE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=ejr6m//b; arc=none smtp.client-ip=192.198.163.18 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="ejr6m//b" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729647807; x=1761183807; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hRlt6zWXN+JO1BCxg9TJx3kZIWg+qY5u+t8CECAw79I=; b=ejr6m//bC9VdH/UX3jy62fotKCym8DXrzOnQ9xRN5X+8MvpMimCJJNY7 ogp+Di7jnz9TdZMn2jcLDoDYEQpl6pRwUbujkdrjcmUcjnRVjJkAKrWj0 y7zBIy465obwiyGZemcQr+bkLXCxWGBs1A8go7uLufQeieTYO1Pu6O8E2 a7L/ckav8IojmiyE5dcz1YJ/W39JJsiaWZpJJDWYz5kokqPBy+q2vmcaR kzO/KBrTDuJvya/36LziPEcMPiPDw4tWuf18qQ5sa+hyky7EMshEHAGho B/9Yg2OqjmAb7UliyV0isjQ9xA+Z1/xuvx6hwLIZENJDlWUaIRedLE+q+ A==; X-CSE-ConnectionGUID: lz3YJNCcTxq9Nq2SgTMJZw== X-CSE-MsgGUID: NU8jjOWvS7WtHQmKIGFJyA== X-IronPort-AV: E=McAfee;i="6700,10204,11233"; a="28658048" X-IronPort-AV: E=Sophos;i="6.11,223,1725346800"; d="scan'208";a="28658048" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by fmvoesa112.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Oct 2024 18:43:26 -0700 X-CSE-ConnectionGUID: i2Z/HCKzQLOZxygwkIZ+3Q== X-CSE-MsgGUID: mWDpW2H9QVCnl0wFzaZ8ew== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,223,1725346800"; d="scan'208";a="84630701" Received: from cmdeoliv-mobl4.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.125.110.222]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Oct 2024 18:43:25 -0700 Subject: [PATCH v2 1/6] cxl/port: Fix CXL port initialization order when the subsystem is built-in From: Dan Williams To: ira.weiny@intel.com Cc: Gregory Price , Gregory Price , stable@vger.kernel.org, Davidlohr Bueso , Jonathan Cameron , Dave Jiang , Alison Schofield , Vishal Verma , dave.jiang@intel.com, linux-cxl@vger.kernel.org Date: Tue, 22 Oct 2024 18:43:24 -0700 Message-ID: <172964780249.81806.11601867702278939388.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <172964779333.81806.8852577918216421011.stgit@dwillia2-xfh.jf.intel.com> References: <172964779333.81806.8852577918216421011.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c Precedence: bulk X-Mailing-List: linux-cxl@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 When the CXL subsystem is built-in the module init order is determined by Makefile order. That order violates expectations. The expectation is that cxl_acpi and cxl_mem can race to attach and that if cxl_acpi wins the race cxl_mem will find the enabled CXL root ports it needs and if cxl_acpi loses the race it will retrigger cxl_mem to attach via cxl_bus_rescan(). That only works if cxl_acpi can assume ports are enabled immediately upon cxl_acpi_probe() return. That in turn can only happen in the CONFIG_CXL_ACPI=y case if the cxl_port object appears before the cxl_acpi object in the Makefile. Fix up the order to prevent initialization failures, and make sure that cxl_port is built-in if cxl_acpi is also built-in. As for what contributed to this not being found earlier, the CXL regression environment, cxl_test, builds all CXL functionality as a module to allow to symbol mocking and other dynamic reload tests. As a result there is no regression coverage for the built-in case. Reported-by: Gregory Price Closes: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net Tested-by: Gregory Price Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Cc: Cc: Davidlohr Bueso Cc: Jonathan Cameron Cc: Dave Jiang Cc: Alison Schofield Cc: Vishal Verma Cc: Ira Weiny Signed-off-by: Dan Williams Reviewed-by: Jonathan Cameron Reviewed-by: Ira Weiny Tested-by: Alejandro Lucero Reviewed-by: Alejandro Lucero --- drivers/cxl/Kconfig | 1 + drivers/cxl/Makefile | 20 ++++++++++++++------ 2 files changed, 15 insertions(+), 6 deletions(-) diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig index 29c192f20082..876469e23f7a 100644 --- a/drivers/cxl/Kconfig +++ b/drivers/cxl/Kconfig @@ -60,6 +60,7 @@ config CXL_ACPI default CXL_BUS select ACPI_TABLE_LIB select ACPI_HMAT + select CXL_PORT help Enable support for host managed device memory (HDM) resources published by a platform's ACPI CXL memory layout description. See diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile index db321f48ba52..2caa90fa4bf2 100644 --- a/drivers/cxl/Makefile +++ b/drivers/cxl/Makefile @@ -1,13 +1,21 @@ # SPDX-License-Identifier: GPL-2.0 + +# Order is important here for the built-in case: +# - 'core' first for fundamental init +# - 'port' before platform root drivers like 'acpi' so that CXL-root ports +# are immediately enabled +# - 'mem' and 'pmem' before endpoint drivers so that memdevs are +# immediately enabled +# - 'pci' last, also mirrors the hardware enumeration hierarchy obj-y += core/ -obj-$(CONFIG_CXL_PCI) += cxl_pci.o -obj-$(CONFIG_CXL_MEM) += cxl_mem.o +obj-$(CONFIG_CXL_PORT) += cxl_port.o obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o -obj-$(CONFIG_CXL_PORT) += cxl_port.o +obj-$(CONFIG_CXL_MEM) += cxl_mem.o +obj-$(CONFIG_CXL_PCI) += cxl_pci.o -cxl_mem-y := mem.o -cxl_pci-y := pci.o +cxl_port-y := port.o cxl_acpi-y := acpi.o cxl_pmem-y := pmem.o security.o -cxl_port-y := port.o +cxl_mem-y := mem.o +cxl_pci-y := pci.o From patchwork Wed Oct 23 01:43:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13846301 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 84FE820323 for ; Wed, 23 Oct 2024 01:43:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729647817; cv=none; b=Wi0u8RSNDDcl9l36RYSkXXr2xWRbWIXUwptcZ/OQmwNkkvxkjiOsIC1O81+hwaQb7ztUhN2vCYVI+3RzV2+pat2MoeNCF0uIfwKaz15Fe5gb3IxMaN7hFrDGMvQ7fb8i8GVnN8yi1nrAzLyivFBPQKJi9zoVCQWoee4Q4JJp/Zo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729647817; c=relaxed/simple; bh=dwNJEPj6jHE8CJs/VGrofccHyDyGMrVYwMK3E8SxFxY=; h=Subject:From:To:Cc:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=j9gCHEKagMlB22LiRBnrDr4RK82X6J7n/HLAVC2091A+bRx6XkdWthdEW+OemazdZ/mQyEYtUTOoslFjpJHczBsi+LCvXTz3ptOgxrBw2CB/QrZeOmecAEgKEyxdQvlK1nGJQ6lHvauHfukVHiDxYLjNsun2YP5H/8N9f8ORakc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Sb6UDXEk; arc=none smtp.client-ip=192.198.163.18 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Sb6UDXEk" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729647815; x=1761183815; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dwNJEPj6jHE8CJs/VGrofccHyDyGMrVYwMK3E8SxFxY=; b=Sb6UDXEkxHZNmva6JK0IUhHb9U6ZmImNJ1nD4Q5nhp1/bBrCP4h8q9IK DjbDyw70ERUCJThuChXcf2xSAYPc9tH2V3U0zoY3QlKmQN0/up1Tmi9Kh 8wo7RvY3tmY1H9X1bldr0X/i8zdFcHMem4Q2PczysSVL5Gtoexv9Z06f3 6StnArzhlkzjW39Oa8NdRIhLDYLmLgfZ8ZJqJWNWfywGOci9OfuieXEG/ S6NAfcH9Z7P3tFmkF8W8M8+/7XRvmVANzjp9QrI0SEl+novIrDDYq1mkF USSKX1YXATOjhYG+zjXLXA4t0x2zzSySfllelOXbMmfobvtZ16AaQyXMg w==; X-CSE-ConnectionGUID: 3tLA65WnQHudjTrBz1rEuQ== X-CSE-MsgGUID: PZkkgdrsS1S7Vgy0oxS3rw== X-IronPort-AV: E=McAfee;i="6700,10204,11233"; a="28658053" X-IronPort-AV: E=Sophos;i="6.11,223,1725346800"; d="scan'208";a="28658053" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by fmvoesa112.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Oct 2024 18:43:34 -0700 X-CSE-ConnectionGUID: A5sJ9CMKQKaDfFLJv4akpg== X-CSE-MsgGUID: XKNHBFwMQs2mMbQSOZKAag== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,223,1725346800"; d="scan'208";a="84630734" Received: from cmdeoliv-mobl4.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.125.110.222]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Oct 2024 18:43:34 -0700 Subject: [PATCH v2 2/6] cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices() From: Dan Williams To: ira.weiny@intel.com Cc: vishal.l.verma@intel.com, alison.schofield@intel.com, dave.jiang@intel.com, linux-cxl@vger.kernel.org Date: Tue, 22 Oct 2024 18:43:32 -0700 Message-ID: <172964781104.81806.4277549800082443769.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <172964779333.81806.8852577918216421011.stgit@dwillia2-xfh.jf.intel.com> References: <172964779333.81806.8852577918216421011.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c Precedence: bulk X-Mailing-List: linux-cxl@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 It turns out since its original introduction, pre-2.6.12, bus_rescan_devices() has skipped devices that might be in the process of attaching or detaching from their driver. For CXL this behavior is unwanted and expects that cxl_bus_rescan() is a probe barrier. That behavior is simple enough to achieve with bus_for_each_dev() paired with call to device_attach(), and it is unclear why bus_rescan_devices() took the position of lockless consumption of dev->driver which is racy. The "Fixes:" but no "Cc: stable" on this patch reflects that the issue is merely by inspection since the bug that triggered the discovery of this potential problem [1] is fixed by other means. However, a stable backport should do no harm. Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Signed-off-by: Dan Williams Tested-by: Gregory Price Reviewed-by: Jonathan Cameron Reviewed-by: Ira Weiny --- drivers/cxl/core/port.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index e666ec6a9085..af92c67bc954 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -2084,11 +2084,18 @@ static void cxl_bus_remove(struct device *dev) static struct workqueue_struct *cxl_bus_wq; -static void cxl_bus_rescan_queue(struct work_struct *w) +static int cxl_rescan_attach(struct device *dev, void *data) { - int rc = bus_rescan_devices(&cxl_bus_type); + int rc = device_attach(dev); + + dev_vdbg(dev, "rescan: %s\n", rc ? "attach" : "detached"); - pr_debug("CXL bus rescan result: %d\n", rc); + return 0; +} + +static void cxl_bus_rescan_queue(struct work_struct *w) +{ + bus_for_each_dev(&cxl_bus_type, NULL, NULL, cxl_rescan_attach); } void cxl_bus_rescan(void) From patchwork Wed Oct 23 01:43:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13846302 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6E18520323 for ; Wed, 23 Oct 2024 01:43:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729647825; cv=none; b=GGmnfoz16SisvFg2bHpWeS03bkEh9bbvqycsnjgwFZ7PLb98YaSTYtd6pstRPYumRlcqH+88XBr+yMU75J8ocmzMNdsS/tBcxjIc4LjUkTj11ZTO5Ai5qKLr/C4B4tr4Sb+247UaOuv9mz3O5KZAgaTBydfkKVQr596eHk4nnCY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729647825; c=relaxed/simple; bh=juOGAtdjznPc0RKrnj5+K0hX4qicmjHRC5I+vDohYYI=; h=Subject:From:To:Cc:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=itc/TKVLtJj1SgHnCRz7rL/zD7Np/BdO/8YnSb4SS29Z80JEY6xCdbpmlB5acz/lRlGBVkepLOdcEexWcx0Gct9XKiYXQDbhQ/HHQdVIzQff0qe6E6RehGQGw1T7GijsW4eFNhi8TE1kRVY5CszGg0KE1hkq5Sel09m+Pkkxg0U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=YaKOaXKk; arc=none smtp.client-ip=192.198.163.18 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="YaKOaXKk" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729647824; x=1761183824; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=juOGAtdjznPc0RKrnj5+K0hX4qicmjHRC5I+vDohYYI=; b=YaKOaXKkL4FCh74strGq9dr+KGCQgsM+zBJCzh2ozC8HzpLxYQ221gee pGI9Ncbi8kcWxhnuhkLWGj8wvhQIa2tPiOQQ2B6J/4/T5LMAwkbS25pGN GzmzKYO3xFSzYg/eLrlYfNAIgU5DjgKCZa3SfEeruZY48nKYP+iaTurDT ue3ldbLlYO47MECWGTih02dHUEozEU1RAzUKtfBZ870rlTW0vahtOaPpv hK73HMDBsgJ8yAREW4ywXmP6iCYv1i/cX4IrSQMuiKQSj5DMo2wgmkct7 jaqKkfep/yGfvZO7gAuaxUN4+85yDlnEQjY2+VHR5T/LYYrtCBQS/az+d g==; X-CSE-ConnectionGUID: o8LpFVF9Sn+fwxqEVBnehQ== X-CSE-MsgGUID: IiqqZbeXSR+qhiEXxpy3lQ== X-IronPort-AV: E=McAfee;i="6700,10204,11233"; a="28658057" X-IronPort-AV: E=Sophos;i="6.11,223,1725346800"; d="scan'208";a="28658057" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by fmvoesa112.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Oct 2024 18:43:43 -0700 X-CSE-ConnectionGUID: sh9FYzkmT2Gp3DCdZhSqFg== X-CSE-MsgGUID: DM8MvJYvS++44cn4ZY8GWA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,223,1725346800"; d="scan'208";a="84630744" Received: from cmdeoliv-mobl4.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.125.110.222]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Oct 2024 18:43:42 -0700 Subject: [PATCH v2 3/6] cxl/acpi: Ensure ports ready at cxl_acpi_probe() return From: Dan Williams To: ira.weiny@intel.com Cc: vishal.l.verma@intel.com, alison.schofield@intel.com, dave.jiang@intel.com, linux-cxl@vger.kernel.org Date: Tue, 22 Oct 2024 18:43:40 -0700 Message-ID: <172964781969.81806.17276352414854540808.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <172964779333.81806.8852577918216421011.stgit@dwillia2-xfh.jf.intel.com> References: <172964779333.81806.8852577918216421011.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c Precedence: bulk X-Mailing-List: linux-cxl@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 In order to ensure root CXL ports are enabled upon cxl_acpi_probe() when the 'cxl_port' driver is built as a module, arrange for the module to be pre-loaded or built-in. The "Fixes:" but no "Cc: stable" on this patch reflects that the issue is merely by inspection since the bug that triggered the discovery of this potential problem [1] is fixed by other means. However, a stable backport should do no harm. Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Signed-off-by: Dan Williams Tested-by: Gregory Price Reviewed-by: Jonathan Cameron Reviewed-by: Ira Weiny --- drivers/cxl/acpi.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c index 82b78e331d8e..432b7cfd12a8 100644 --- a/drivers/cxl/acpi.c +++ b/drivers/cxl/acpi.c @@ -924,6 +924,13 @@ static void __exit cxl_acpi_exit(void) /* load before dax_hmem sees 'Soft Reserved' CXL ranges */ subsys_initcall(cxl_acpi_init); + +/* + * Arrange for host-bridge ports to be active synchronous with + * cxl_acpi_probe() exit. + */ +MODULE_SOFTDEP("pre: cxl_port"); + module_exit(cxl_acpi_exit); MODULE_DESCRIPTION("CXL ACPI: Platform Support"); MODULE_LICENSE("GPL v2"); From patchwork Wed Oct 23 01:43:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13846303 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A03EC20323; Wed, 23 Oct 2024 01:43:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.10 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729647834; cv=none; b=UoPQgO1xhG1oVSQ/VvPjpaZFc9P/R3frWKj5WnvY9TGiKaIzViOQpN+1HnDC9kagqBYO3FMvt4K65DQZa4W9T+Xw8oVyjSVijiPxXKVNnjXhyXRSPmN+JtaiYx7oYnPMr7WiSlhz3jzW2LWvhmLrv+fe/SLW1zyAjFR75FGHSkk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729647834; c=relaxed/simple; bh=S5SBBD34sDQ+NONT3hUfjKBSzcn8qXTm64swiCBSqr0=; h=Subject:From:To:Cc:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=klRe8SLvKG0+OJc//6q0cdy/pyPTfKNf1tCB+zei6eATLCnxc3gvrvRpqOpUho5sU/gY2VCYjv8JIog2NC3rpy0FfV4sa+gGnPfYCsEB1Hm1g/YtDtPXaffuCbzKSIivrWKnItWNuz3aSd0MvnwNciLbDzPVr6fhgpRSUzR4pSY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=BJyVL+O2; arc=none smtp.client-ip=198.175.65.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="BJyVL+O2" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729647832; x=1761183832; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=S5SBBD34sDQ+NONT3hUfjKBSzcn8qXTm64swiCBSqr0=; b=BJyVL+O2783aKByymKxVauZpPr/rlChm1u926ek7aqzNIZwDR//SJKuR 8Iso/WUnIdQkVcFH457BYkFJQvX9E90lNMvYERDURCrFPCChHZjszYkT2 j4CzPs+rWtB5BO2l1CoyWMSECmhJZBR0CssuCfYWxuwolPHuYkL0BrONc dicmh9ABi2XBfpHC2OWIWGvz9Ob1TiuxeswHfCqXlPdH4zTwuRjnvl76h Mna7PZ5ruyz1hdbx47rpQsYJPj7XtAZQXmv0xwBPqOYUhIRrihhRoQmDS gAFGateY/EXKX5LHNtHSZi7lkvZI245jKFvMcoUc/J3mqaovKwsV57Ol+ A==; X-CSE-ConnectionGUID: hPdAX1+jR++W+zrU1O1rFQ== X-CSE-MsgGUID: ftKi00aXQDq0pwl+vswHng== X-IronPort-AV: E=McAfee;i="6700,10204,11222"; a="46675945" X-IronPort-AV: E=Sophos;i="6.11,199,1725346800"; d="scan'208";a="46675945" Received: from fmviesa001.fm.intel.com ([10.60.135.141]) by orvoesa102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Oct 2024 18:43:52 -0700 X-CSE-ConnectionGUID: DJemh+FDTI2zxZuGYJwvuw== X-CSE-MsgGUID: /AB1vowvR7WHTPh/BgX3sg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,223,1725346800"; d="scan'208";a="110844040" Received: from cmdeoliv-mobl4.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.125.110.222]) by smtpauth.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Oct 2024 18:43:51 -0700 Subject: [PATCH v2 4/6] cxl/port: Fix use-after-free, permit out-of-order decoder shutdown From: Dan Williams To: ira.weiny@intel.com Cc: stable@vger.kernel.org, Jonathan Cameron , Greg Kroah-Hartman , Davidlohr Bueso , Dave Jiang , Alison Schofield , Zijun Hu , vishal.l.verma@intel.com, dave.jiang@intel.com, linux-cxl@vger.kernel.org Date: Tue, 22 Oct 2024 18:43:49 -0700 Message-ID: <172964782781.81806.17902885593105284330.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <172964779333.81806.8852577918216421011.stgit@dwillia2-xfh.jf.intel.com> References: <172964779333.81806.8852577918216421011.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c Precedence: bulk X-Mailing-List: linux-cxl@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 In support of investigating an initialization failure report [1], cxl_test was updated to register mock memory-devices after the mock root-port/bus device had been registered. That led to cxl_test crashing with a use-after-free bug with the following signature: cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem0:decoder7.0 @ 0 next: cxl_switch_uport.0 nr_eps: 1 nr_targets: 1 cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem4:decoder14.0 @ 1 next: cxl_switch_uport.0 nr_eps: 2 nr_targets: 1 cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[0] = cxl_switch_dport.0 for mem0:decoder7.0 @ 0 1) cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[1] = cxl_switch_dport.4 for mem4:decoder14.0 @ 1 [..] cxld_unregister: cxl decoder14.0: cxl_region_decode_reset: cxl_region region3: mock_decoder_reset: cxl_port port3: decoder3.0 reset 2) mock_decoder_reset: cxl_port port3: decoder3.0: out of order reset, expected decoder3.1 cxl_endpoint_decoder_release: cxl decoder14.0: [..] cxld_unregister: cxl decoder7.0: 3) cxl_region_decode_reset: cxl_region region3: Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP PTI [..] RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core] [..] Call Trace: cxl_region_decode_reset+0x69/0x190 [cxl_core] cxl_region_detach+0xe8/0x210 [cxl_core] cxl_decoder_kill_region+0x27/0x40 [cxl_core] cxld_unregister+0x5d/0x60 [cxl_core] At 1) a region has been established with 2 endpoint decoders (7.0 and 14.0). Those endpoints share a common switch-decoder in the topology (3.0). At teardown, 2), decoder14.0 is the first to be removed and hits the "out of order reset case" in the switch decoder. The effect though is that region3 cleanup is aborted leaving it in-tact and referencing decoder14.0. At 3) the second attempt to teardown region3 trips over the stale decoder14.0 object which has long since been deleted. The fix here is to recognize that the CXL specification places no mandate on in-order shutdown of switch-decoders, the driver enforces in-order allocation, and hardware enforces in-order commit. So, rather than fail and leave objects dangling, always remove them. In support of making cxl_region_decode_reset() always succeed, cxl_region_invalidate_memregion() failures are turned into warnings. Crashing the kernel is ok there since system integrity is at risk if caches cannot be managed around physical address mutation events like CXL region destruction. A new device_for_each_child_reverse_from() is added to cleanup port->commit_end after all dependent decoders have been disabled. In other words if decoders are allocated 0->1->2 and disabled 1->2->0 then port->commit_end only decrements from 2 after 2 has been disabled, and it decrements all the way to zero since 1 was disabled previously. Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Cc: Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware") Reviewed-by: Jonathan Cameron Cc: Greg Kroah-Hartman Cc: Davidlohr Bueso Cc: Dave Jiang Cc: Alison Schofield Cc: Ira Weiny Cc: Zijun Hu Signed-off-by: Dan Williams Reviewed-by: Ira Weiny --- drivers/base/core.c | 35 +++++++++++++++++++++++++++++ drivers/cxl/core/hdm.c | 50 +++++++++++++++++++++++++++++++++++------- drivers/cxl/core/region.c | 48 +++++++++++----------------------------- drivers/cxl/cxl.h | 3 ++- include/linux/device.h | 3 +++ tools/testing/cxl/test/cxl.c | 14 ++++-------- 6 files changed, 100 insertions(+), 53 deletions(-) diff --git a/drivers/base/core.c b/drivers/base/core.c index a4c853411a6b..e42f1ad73078 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -4037,6 +4037,41 @@ int device_for_each_child_reverse(struct device *parent, void *data, } EXPORT_SYMBOL_GPL(device_for_each_child_reverse); +/** + * device_for_each_child_reverse_from - device child iterator in reversed order. + * @parent: parent struct device. + * @from: optional starting point in child list + * @fn: function to be called for each device. + * @data: data for the callback. + * + * Iterate over @parent's child devices, starting at @from, and call @fn + * for each, passing it @data. This helper is identical to + * device_for_each_child_reverse() when @from is NULL. + * + * @fn is checked each iteration. If it returns anything other than 0, + * iteration stop and that value is returned to the caller of + * device_for_each_child_reverse_from(); + */ +int device_for_each_child_reverse_from(struct device *parent, + struct device *from, const void *data, + int (*fn)(struct device *, const void *)) +{ + struct klist_iter i; + struct device *child; + int error = 0; + + if (!parent->p) + return 0; + + klist_iter_init_node(&parent->p->klist_children, &i, + (from ? &from->p->knode_parent : NULL)); + while ((child = prev_device(&i)) && !error) + error = fn(child, data); + klist_iter_exit(&i); + return error; +} +EXPORT_SYMBOL_GPL(device_for_each_child_reverse_from); + /** * device_find_child - device iterator for locating a particular device. * @parent: parent struct device diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index 3df10517a327..223c273c0cd1 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -712,7 +712,44 @@ static int cxl_decoder_commit(struct cxl_decoder *cxld) return 0; } -static int cxl_decoder_reset(struct cxl_decoder *cxld) +static int commit_reap(struct device *dev, const void *data) +{ + struct cxl_port *port = to_cxl_port(dev->parent); + struct cxl_decoder *cxld; + + if (!is_switch_decoder(dev) && !is_endpoint_decoder(dev)) + return 0; + + cxld = to_cxl_decoder(dev); + if (port->commit_end == cxld->id && + ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) { + port->commit_end--; + dev_dbg(&port->dev, "reap: %s commit_end: %d\n", + dev_name(&cxld->dev), port->commit_end); + } + + return 0; +} + +void cxl_port_commit_reap(struct cxl_decoder *cxld) +{ + struct cxl_port *port = to_cxl_port(cxld->dev.parent); + + lockdep_assert_held_write(&cxl_region_rwsem); + + /* + * Once the highest committed decoder is disabled, free any other + * decoders that were pinned allocated by out-of-order release. + */ + port->commit_end--; + dev_dbg(&port->dev, "reap: %s commit_end: %d\n", dev_name(&cxld->dev), + port->commit_end); + device_for_each_child_reverse_from(&port->dev, &cxld->dev, NULL, + commit_reap); +} +EXPORT_SYMBOL_NS_GPL(cxl_port_commit_reap, CXL); + +static void cxl_decoder_reset(struct cxl_decoder *cxld) { struct cxl_port *port = to_cxl_port(cxld->dev.parent); struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev); @@ -721,14 +758,14 @@ static int cxl_decoder_reset(struct cxl_decoder *cxld) u32 ctrl; if ((cxld->flags & CXL_DECODER_F_ENABLE) == 0) - return 0; + return; - if (port->commit_end != id) { + if (port->commit_end == id) + cxl_port_commit_reap(cxld); + else dev_dbg(&port->dev, "%s: out of order reset, expected decoder%d.%d\n", dev_name(&cxld->dev), port->id, port->commit_end); - return -EBUSY; - } down_read(&cxl_dpa_rwsem); ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id)); @@ -741,7 +778,6 @@ static int cxl_decoder_reset(struct cxl_decoder *cxld) writel(0, hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(id)); up_read(&cxl_dpa_rwsem); - port->commit_end--; cxld->flags &= ~CXL_DECODER_F_ENABLE; /* Userspace is now responsible for reconfiguring this decoder */ @@ -751,8 +787,6 @@ static int cxl_decoder_reset(struct cxl_decoder *cxld) cxled = to_cxl_endpoint_decoder(&cxld->dev); cxled->state = CXL_DECODER_STATE_MANUAL; } - - return 0; } static int cxl_setup_hdm_decoder_from_dvsec( diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index e701e4b04032..3478d2058303 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -232,8 +232,8 @@ static int cxl_region_invalidate_memregion(struct cxl_region *cxlr) "Bypassing cpu_cache_invalidate_memregion() for testing!\n"); return 0; } else { - dev_err(&cxlr->dev, - "Failed to synchronize CPU cache state\n"); + dev_WARN(&cxlr->dev, + "Failed to synchronize CPU cache state\n"); return -ENXIO; } } @@ -242,19 +242,17 @@ static int cxl_region_invalidate_memregion(struct cxl_region *cxlr) return 0; } -static int cxl_region_decode_reset(struct cxl_region *cxlr, int count) +static void cxl_region_decode_reset(struct cxl_region *cxlr, int count) { struct cxl_region_params *p = &cxlr->params; - int i, rc = 0; + int i; /* - * Before region teardown attempt to flush, and if the flush - * fails cancel the region teardown for data consistency - * concerns + * Before region teardown attempt to flush, evict any data cached for + * this region, or scream loudly about missing arch / platform support + * for CXL teardown. */ - rc = cxl_region_invalidate_memregion(cxlr); - if (rc) - return rc; + cxl_region_invalidate_memregion(cxlr); for (i = count - 1; i >= 0; i--) { struct cxl_endpoint_decoder *cxled = p->targets[i]; @@ -277,23 +275,17 @@ static int cxl_region_decode_reset(struct cxl_region *cxlr, int count) cxl_rr = cxl_rr_load(iter, cxlr); cxld = cxl_rr->decoder; if (cxld->reset) - rc = cxld->reset(cxld); - if (rc) - return rc; + cxld->reset(cxld); set_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags); } endpoint_reset: - rc = cxled->cxld.reset(&cxled->cxld); - if (rc) - return rc; + cxled->cxld.reset(&cxled->cxld); set_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags); } /* all decoders associated with this region have been torn down */ clear_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags); - - return 0; } static int commit_decoder(struct cxl_decoder *cxld) @@ -409,16 +401,8 @@ static ssize_t commit_store(struct device *dev, struct device_attribute *attr, * still pending. */ if (p->state == CXL_CONFIG_RESET_PENDING) { - rc = cxl_region_decode_reset(cxlr, p->interleave_ways); - /* - * Revert to committed since there may still be active - * decoders associated with this region, or move forward - * to active to mark the reset successful - */ - if (rc) - p->state = CXL_CONFIG_COMMIT; - else - p->state = CXL_CONFIG_ACTIVE; + cxl_region_decode_reset(cxlr, p->interleave_ways); + p->state = CXL_CONFIG_ACTIVE; } } @@ -2054,13 +2038,7 @@ static int cxl_region_detach(struct cxl_endpoint_decoder *cxled) get_device(&cxlr->dev); if (p->state > CXL_CONFIG_ACTIVE) { - /* - * TODO: tear down all impacted regions if a device is - * removed out of order - */ - rc = cxl_region_decode_reset(cxlr, p->interleave_ways); - if (rc) - goto out; + cxl_region_decode_reset(cxlr, p->interleave_ways); p->state = CXL_CONFIG_ACTIVE; } diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 0d8b810a51f0..5406e3ab3d4a 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -359,7 +359,7 @@ struct cxl_decoder { struct cxl_region *region; unsigned long flags; int (*commit)(struct cxl_decoder *cxld); - int (*reset)(struct cxl_decoder *cxld); + void (*reset)(struct cxl_decoder *cxld); }; /* @@ -730,6 +730,7 @@ static inline bool is_cxl_root(struct cxl_port *port) int cxl_num_decoders_committed(struct cxl_port *port); bool is_cxl_port(const struct device *dev); struct cxl_port *to_cxl_port(const struct device *dev); +void cxl_port_commit_reap(struct cxl_decoder *cxld); struct pci_bus; int devm_cxl_register_pci_bus(struct device *host, struct device *uport_dev, struct pci_bus *bus); diff --git a/include/linux/device.h b/include/linux/device.h index b4bde8d22697..667cb6db9019 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -1078,6 +1078,9 @@ int device_for_each_child(struct device *dev, void *data, int (*fn)(struct device *dev, void *data)); int device_for_each_child_reverse(struct device *dev, void *data, int (*fn)(struct device *dev, void *data)); +int device_for_each_child_reverse_from(struct device *parent, + struct device *from, const void *data, + int (*fn)(struct device *, const void *)); struct device *device_find_child(struct device *dev, void *data, int (*match)(struct device *dev, void *data)); struct device *device_find_child_by_name(struct device *parent, diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c index 90d5afd52dd0..c5bbd89b3192 100644 --- a/tools/testing/cxl/test/cxl.c +++ b/tools/testing/cxl/test/cxl.c @@ -693,26 +693,22 @@ static int mock_decoder_commit(struct cxl_decoder *cxld) return 0; } -static int mock_decoder_reset(struct cxl_decoder *cxld) +static void mock_decoder_reset(struct cxl_decoder *cxld) { struct cxl_port *port = to_cxl_port(cxld->dev.parent); int id = cxld->id; if ((cxld->flags & CXL_DECODER_F_ENABLE) == 0) - return 0; + return; dev_dbg(&port->dev, "%s reset\n", dev_name(&cxld->dev)); - if (port->commit_end != id) { + if (port->commit_end == id) + cxl_port_commit_reap(cxld); + else dev_dbg(&port->dev, "%s: out of order reset, expected decoder%d.%d\n", dev_name(&cxld->dev), port->id, port->commit_end); - return -EBUSY; - } - - port->commit_end--; cxld->flags &= ~CXL_DECODER_F_ENABLE; - - return 0; } static void default_mock_decoder(struct cxl_decoder *cxld) From patchwork Wed Oct 23 01:43:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13846304 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0098D20323 for ; Wed, 23 Oct 2024 01:44:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.10 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729647843; cv=none; b=NXQkhkxqHXGNJLjVVvyurv8/ho+eph2twAiDgxNYXi/UTRe1zGIrhPNMchCYmU0EgvSDWqZnKoqEXrJq1568Oq0B033vbRQjfa/JSAW32M8PRiHBx1MkFD8dnQCEb6a3RNDKw9213BqNqbxIzPUnXsnov6hlh8nHKYlvB9p68RY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729647843; c=relaxed/simple; bh=dnQwbUB68OkA+2eKZ7BUN1fY2s+v4Sxk//VQ3Ml6v1I=; h=Subject:From:To:Cc:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=eu0/u1QNsow5NPAvWtYjxx/ecPk9fB3LhPHX7OEuOw6nGvAmgBwc1PJQcT3SAideP9DlBsL4R4mDffD95Jee/TKcPzbf0fbj85qTYICTQAwqRZvnzPFQOyo2/XOBeNq8SdQF8FGZV1PttGnuC9Apk0DcULeyEZ2OKvTWe+K5PCk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=R58uUQj+; arc=none smtp.client-ip=198.175.65.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="R58uUQj+" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729647842; x=1761183842; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dnQwbUB68OkA+2eKZ7BUN1fY2s+v4Sxk//VQ3Ml6v1I=; b=R58uUQj+8ht4WNx9FpUx8cckINUGxnnD+XSxJ3tcj7h7aW1kyY5LW+CH iwAjZ3C793bFQschko/GKQGhy4P7iykZHDujAU6BgXcxa0Z/BVy8joiL7 r+xsMlFxjsBklVrIyZyqQSXiYQmCpnDwa3vkFco3v73mtebaxQhRqSXfl T1C807gcfU5B3WtbSXLTCEzQOuoixBZQkkp8lpHMqfZ5vi9KEU+5S8mqX 8sqSKzyNWOSF5M8oE6Bs6bpPKqnY47l8Co2i+x/hiUfLJQt51jH5WxfMH o2U1M78gTyVEildd5Ek/fR6JOp+tXxm6vs+Jf1n6YjbrM8jOCleSv0GOD g==; X-CSE-ConnectionGUID: aL+blriDQiuJbmbM2vtYIA== X-CSE-MsgGUID: D4/iOe9BSnytcGWQ0UWY4A== X-IronPort-AV: E=McAfee;i="6700,10204,11222"; a="46675957" X-IronPort-AV: E=Sophos;i="6.11,199,1725346800"; d="scan'208";a="46675957" Received: from fmviesa001.fm.intel.com ([10.60.135.141]) by orvoesa102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Oct 2024 18:44:01 -0700 X-CSE-ConnectionGUID: 0gf9+tbsQgmb/GWw+EUyWw== X-CSE-MsgGUID: F+6dFQgtRa2k5GKPgKSwUg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,223,1725346800"; d="scan'208";a="110844073" Received: from cmdeoliv-mobl4.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.125.110.222]) by smtpauth.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Oct 2024 18:44:00 -0700 Subject: [PATCH v2 5/6] cxl/port: Prevent out-of-order decoder allocation From: Dan Williams To: ira.weiny@intel.com Cc: Zijun Hu , Davidlohr Bueso , Vishal Verma , Alison Schofield , Jonathan Cameron , dave.jiang@intel.com, linux-cxl@vger.kernel.org Date: Tue, 22 Oct 2024 18:43:57 -0700 Message-ID: <172964783668.81806.14962699553881333486.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <172964779333.81806.8852577918216421011.stgit@dwillia2-xfh.jf.intel.com> References: <172964779333.81806.8852577918216421011.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c Precedence: bulk X-Mailing-List: linux-cxl@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 With the recent change to allow out-of-order decoder de-commit it highlights a need to strengthen the in-order decoder commit guarantees. As it stands match_free_decoder() ensures that if 2 regions are racing decoder allocations the one that wins the race will get the lower id decoder, but that still leaves the race to *commit* the decoder. Rather than have this complicated case of "reserved in-order, but may still commit out-of-order", just arrange for the reservation order to match the commit-order. In other words, prevent subsequent allocations until the last reservation is committed. This precludes overlapping region creation events and requires the previous regionN to either move forward to the decoder commit stage or drop its reservation before regionN+1 can move forward. That is, provided that regionN and regionN+1 decode through the same switch port. As a side effect this allows match_free_decoder() to drop its dependency on needing write access to the device_find_child() @data parameter [1]. Reported-by: Zijun Hu Closes: http://lore.kernel.org/20240905-const_dfc_prepare-v4-0-4180e1d5a244@quicinc.com Cc: Davidlohr Bueso Cc: Vishal Verma Cc: Alison Schofield Cc: Jonathan Cameron Signed-off-by: Dan Williams Reviewed-by: Jonathan Cameron Reviewed-by: Ira Weiny --- drivers/cxl/core/region.c | 43 +++++++++++++++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 10 deletions(-) diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 3478d2058303..dff618c708dc 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -778,26 +778,50 @@ static size_t show_targetN(struct cxl_region *cxlr, char *buf, int pos) return rc; } +static int check_commit_order(struct device *dev, const void *data) +{ + struct cxl_decoder *cxld = to_cxl_decoder(dev); + + /* + * if port->commit_end is not the only free decoder, then out of + * order shutdown has occurred, block further allocations until + * that is resolved + */ + if (((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) + return -EBUSY; + return 0; +} + static int match_free_decoder(struct device *dev, void *data) { + struct cxl_port *port = to_cxl_port(dev->parent); struct cxl_decoder *cxld; - int *id = data; + int rc; if (!is_switch_decoder(dev)) return 0; cxld = to_cxl_decoder(dev); - /* enforce ordered allocation */ - if (cxld->id != *id) + if (cxld->id != port->commit_end + 1) return 0; - if (!cxld->region) - return 1; - - (*id)++; + if (cxld->region) { + dev_dbg(dev->parent, + "next decoder to commit (%s) is already reserved (%s)\n", + dev_name(dev), dev_name(&cxld->region->dev)); + return 0; + } - return 0; + rc = device_for_each_child_reverse_from(dev->parent, dev, NULL, + check_commit_order); + if (rc) { + dev_dbg(dev->parent, + "unable to allocate %s due to out of order shutdown\n", + dev_name(dev)); + return 0; + } + return 1; } static int match_auto_decoder(struct device *dev, void *data) @@ -824,7 +848,6 @@ cxl_region_find_decoder(struct cxl_port *port, struct cxl_region *cxlr) { struct device *dev; - int id = 0; if (port == cxled_to_port(cxled)) return &cxled->cxld; @@ -833,7 +856,7 @@ cxl_region_find_decoder(struct cxl_port *port, dev = device_find_child(&port->dev, &cxlr->params, match_auto_decoder); else - dev = device_find_child(&port->dev, &id, match_free_decoder); + dev = device_find_child(&port->dev, NULL, match_free_decoder); if (!dev) return NULL; /* From patchwork Wed Oct 23 01:44:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13846305 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 75D8220323 for ; Wed, 23 Oct 2024 01:44:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.10 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729647851; cv=none; b=n4DPxTCPf/Un46TMPm+oIDB81pZQ9Yh1lc9izhIFVmSXRuuIGmVcwdh9oj9jSahSKGmtTyZ4Q0X/p1eHEGoQBERxQ0o95zRY6OLBrXF9vslFdHUBqlzgy6gQCLugGxWi1wAOjh/m46ekok5ShIWuX/9DltYi5CgS1a4O4si0sqk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729647851; c=relaxed/simple; bh=8+1Oxg2KD3Wc6rz5kN6jZmUPo8IGhAIV8pkcpyE4JhM=; h=Subject:From:To:Cc:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Tevb8shG468siUYekbtOWU/oMoVJ176r/SO6nuA3mQ7XI4BLT4fmNqO2XcuZCtcDluu+jI9IgGE2zEYCtbN0v0GxoTyzlouYJuCqrsnO5nzpEeV4df9JJ4TRZitlEA10uIGlaJCAUHV3IIaVontyMR3lZiv7kZEJ5uZVtaQKUIY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=ncbiu9w8; arc=none smtp.client-ip=198.175.65.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="ncbiu9w8" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729647849; x=1761183849; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8+1Oxg2KD3Wc6rz5kN6jZmUPo8IGhAIV8pkcpyE4JhM=; b=ncbiu9w88SNqV6VBhGatP1BrdrcI5vIwVwEvjwSnN2pbS7Q0sbXWvNO7 NxhROOFvTTL3tG+LZ4q5fUfJiTHuBS6pDPEZN+It5ZFUNPsFQs0o89+fy CrU9i0pGHSDZS+wLVa/Sdpb+RRfSSOeYstLcw+IIPzM2zALLmiacOVSdK bCncEfIzjFYx88uFFPcDKGbok96igdAUQjLLrePZQ9cgeZjnEqWl/qTWF lEv1+5ATqMK8/+Y03XTbdpVOnLh+l/aMwb4raSzdQeSt/3a3hMdOKVTso JTEKJFN546VLttfUcBOxxkXYY26/N+oiWu3C1P6y4/D2ojbTuSiUOduYF A==; X-CSE-ConnectionGUID: IA7DMlk4QLusni/iktDdNA== X-CSE-MsgGUID: mQLnjKeGQFaPy/VaPS8G4A== X-IronPort-AV: E=McAfee;i="6700,10204,11222"; a="46675968" X-IronPort-AV: E=Sophos;i="6.11,199,1725346800"; d="scan'208";a="46675968" Received: from fmviesa001.fm.intel.com ([10.60.135.141]) by orvoesa102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Oct 2024 18:44:09 -0700 X-CSE-ConnectionGUID: AudK1LvdTQucC9r4+nZ97g== X-CSE-MsgGUID: KeRn4tcrTO6ckgcvoN6Zvw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,223,1725346800"; d="scan'208";a="110844108" Received: from cmdeoliv-mobl4.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.125.110.222]) by smtpauth.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Oct 2024 18:44:08 -0700 Subject: [PATCH v2 6/6] cxl/test: Improve init-order fidelity relative to real-world systems From: Dan Williams To: ira.weiny@intel.com Cc: Davidlohr Bueso , Jonathan Cameron , Dave Jiang , Alison Schofield , Vishal Verma , dave.jiang@intel.com, linux-cxl@vger.kernel.org Date: Tue, 22 Oct 2024 18:44:06 -0700 Message-ID: <172964784521.81806.15791069994065969243.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <172964779333.81806.8852577918216421011.stgit@dwillia2-xfh.jf.intel.com> References: <172964779333.81806.8852577918216421011.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c Precedence: bulk X-Mailing-List: linux-cxl@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The investigation of an initialization failure [1] highlighted that cxl_test does not reflect the init-order of real world systems. The expected order is root/bus first then async probing of the memory devices. Fix up cxl_test to reflect that order. While it did not reproduce the initial bug report (since that is dependent on built-in vs modular builds), it did reveal a separate latent bug in the subsystem's decoder shutdown flow. Fix for that sent separately. Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Cc: Davidlohr Bueso Cc: Jonathan Cameron Cc: Dave Jiang Cc: Alison Schofield Cc: Vishal Verma Cc: Ira Weiny Signed-off-by: Dan Williams Reviewed-by: Jonathan Cameron Reviewed-by: Ira Weiny --- tools/testing/cxl/test/cxl.c | 186 +++++++++++++++++++++++------------------- tools/testing/cxl/test/mem.c | 1 2 files changed, 104 insertions(+), 83 deletions(-) diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c index c5bbd89b3192..050725afa45d 100644 --- a/tools/testing/cxl/test/cxl.c +++ b/tools/testing/cxl/test/cxl.c @@ -1058,7 +1058,7 @@ static void mock_companion(struct acpi_device *adev, struct device *dev) #define SZ_64G (SZ_32G * 2) #endif -static __init int cxl_rch_init(void) +static __init int cxl_rch_topo_init(void) { int rc, i; @@ -1086,30 +1086,8 @@ static __init int cxl_rch_init(void) goto err_bridge; } - for (i = 0; i < ARRAY_SIZE(cxl_rcd); i++) { - int idx = NR_MEM_MULTI + NR_MEM_SINGLE + i; - struct platform_device *rch = cxl_rch[i]; - struct platform_device *pdev; - - pdev = platform_device_alloc("cxl_rcd", idx); - if (!pdev) - goto err_mem; - pdev->dev.parent = &rch->dev; - set_dev_node(&pdev->dev, i % 2); - - rc = platform_device_add(pdev); - if (rc) { - platform_device_put(pdev); - goto err_mem; - } - cxl_rcd[i] = pdev; - } - return 0; -err_mem: - for (i = ARRAY_SIZE(cxl_rcd) - 1; i >= 0; i--) - platform_device_unregister(cxl_rcd[i]); err_bridge: for (i = ARRAY_SIZE(cxl_rch) - 1; i >= 0; i--) { struct platform_device *pdev = cxl_rch[i]; @@ -1123,12 +1101,10 @@ static __init int cxl_rch_init(void) return rc; } -static void cxl_rch_exit(void) +static void cxl_rch_topo_exit(void) { int i; - for (i = ARRAY_SIZE(cxl_rcd) - 1; i >= 0; i--) - platform_device_unregister(cxl_rcd[i]); for (i = ARRAY_SIZE(cxl_rch) - 1; i >= 0; i--) { struct platform_device *pdev = cxl_rch[i]; @@ -1139,7 +1115,7 @@ static void cxl_rch_exit(void) } } -static __init int cxl_single_init(void) +static __init int cxl_single_topo_init(void) { int i, rc; @@ -1224,29 +1200,8 @@ static __init int cxl_single_init(void) cxl_swd_single[i] = pdev; } - for (i = 0; i < ARRAY_SIZE(cxl_mem_single); i++) { - struct platform_device *dport = cxl_swd_single[i]; - struct platform_device *pdev; - - pdev = platform_device_alloc("cxl_mem", NR_MEM_MULTI + i); - if (!pdev) - goto err_mem; - pdev->dev.parent = &dport->dev; - set_dev_node(&pdev->dev, i % 2); - - rc = platform_device_add(pdev); - if (rc) { - platform_device_put(pdev); - goto err_mem; - } - cxl_mem_single[i] = pdev; - } - return 0; -err_mem: - for (i = ARRAY_SIZE(cxl_mem_single) - 1; i >= 0; i--) - platform_device_unregister(cxl_mem_single[i]); err_dport: for (i = ARRAY_SIZE(cxl_swd_single) - 1; i >= 0; i--) platform_device_unregister(cxl_swd_single[i]); @@ -1269,12 +1224,10 @@ static __init int cxl_single_init(void) return rc; } -static void cxl_single_exit(void) +static void cxl_single_topo_exit(void) { int i; - for (i = ARRAY_SIZE(cxl_mem_single) - 1; i >= 0; i--) - platform_device_unregister(cxl_mem_single[i]); for (i = ARRAY_SIZE(cxl_swd_single) - 1; i >= 0; i--) platform_device_unregister(cxl_swd_single[i]); for (i = ARRAY_SIZE(cxl_swu_single) - 1; i >= 0; i--) @@ -1291,6 +1244,91 @@ static void cxl_single_exit(void) } } +static void cxl_mem_exit(void) +{ + int i; + + for (i = ARRAY_SIZE(cxl_rcd) - 1; i >= 0; i--) + platform_device_unregister(cxl_rcd[i]); + for (i = ARRAY_SIZE(cxl_mem_single) - 1; i >= 0; i--) + platform_device_unregister(cxl_mem_single[i]); + for (i = ARRAY_SIZE(cxl_mem) - 1; i >= 0; i--) + platform_device_unregister(cxl_mem[i]); +} + +static int cxl_mem_init(void) +{ + int i, rc; + + for (i = 0; i < ARRAY_SIZE(cxl_mem); i++) { + struct platform_device *dport = cxl_switch_dport[i]; + struct platform_device *pdev; + + pdev = platform_device_alloc("cxl_mem", i); + if (!pdev) + goto err_mem; + pdev->dev.parent = &dport->dev; + set_dev_node(&pdev->dev, i % 2); + + rc = platform_device_add(pdev); + if (rc) { + platform_device_put(pdev); + goto err_mem; + } + cxl_mem[i] = pdev; + } + + for (i = 0; i < ARRAY_SIZE(cxl_mem_single); i++) { + struct platform_device *dport = cxl_swd_single[i]; + struct platform_device *pdev; + + pdev = platform_device_alloc("cxl_mem", NR_MEM_MULTI + i); + if (!pdev) + goto err_single; + pdev->dev.parent = &dport->dev; + set_dev_node(&pdev->dev, i % 2); + + rc = platform_device_add(pdev); + if (rc) { + platform_device_put(pdev); + goto err_single; + } + cxl_mem_single[i] = pdev; + } + + for (i = 0; i < ARRAY_SIZE(cxl_rcd); i++) { + int idx = NR_MEM_MULTI + NR_MEM_SINGLE + i; + struct platform_device *rch = cxl_rch[i]; + struct platform_device *pdev; + + pdev = platform_device_alloc("cxl_rcd", idx); + if (!pdev) + goto err_rcd; + pdev->dev.parent = &rch->dev; + set_dev_node(&pdev->dev, i % 2); + + rc = platform_device_add(pdev); + if (rc) { + platform_device_put(pdev); + goto err_rcd; + } + cxl_rcd[i] = pdev; + } + + return 0; + +err_rcd: + for (i = ARRAY_SIZE(cxl_rcd) - 1; i >= 0; i--) + platform_device_unregister(cxl_rcd[i]); +err_single: + for (i = ARRAY_SIZE(cxl_mem_single) - 1; i >= 0; i--) + platform_device_unregister(cxl_mem_single[i]); +err_mem: + for (i = ARRAY_SIZE(cxl_mem) - 1; i >= 0; i--) + platform_device_unregister(cxl_mem[i]); + return rc; +} + static __init int cxl_test_init(void) { int rc, i; @@ -1403,29 +1441,11 @@ static __init int cxl_test_init(void) cxl_switch_dport[i] = pdev; } - for (i = 0; i < ARRAY_SIZE(cxl_mem); i++) { - struct platform_device *dport = cxl_switch_dport[i]; - struct platform_device *pdev; - - pdev = platform_device_alloc("cxl_mem", i); - if (!pdev) - goto err_mem; - pdev->dev.parent = &dport->dev; - set_dev_node(&pdev->dev, i % 2); - - rc = platform_device_add(pdev); - if (rc) { - platform_device_put(pdev); - goto err_mem; - } - cxl_mem[i] = pdev; - } - - rc = cxl_single_init(); + rc = cxl_single_topo_init(); if (rc) - goto err_mem; + goto err_dport; - rc = cxl_rch_init(); + rc = cxl_rch_topo_init(); if (rc) goto err_single; @@ -1438,19 +1458,20 @@ static __init int cxl_test_init(void) rc = platform_device_add(cxl_acpi); if (rc) - goto err_add; + goto err_root; + + rc = cxl_mem_init(); + if (rc) + goto err_root; return 0; -err_add: +err_root: platform_device_put(cxl_acpi); err_rch: - cxl_rch_exit(); + cxl_rch_topo_exit(); err_single: - cxl_single_exit(); -err_mem: - for (i = ARRAY_SIZE(cxl_mem) - 1; i >= 0; i--) - platform_device_unregister(cxl_mem[i]); + cxl_single_topo_exit(); err_dport: for (i = ARRAY_SIZE(cxl_switch_dport) - 1; i >= 0; i--) platform_device_unregister(cxl_switch_dport[i]); @@ -1482,11 +1503,10 @@ static __exit void cxl_test_exit(void) { int i; + cxl_mem_exit(); platform_device_unregister(cxl_acpi); - cxl_rch_exit(); - cxl_single_exit(); - for (i = ARRAY_SIZE(cxl_mem) - 1; i >= 0; i--) - platform_device_unregister(cxl_mem[i]); + cxl_rch_topo_exit(); + cxl_single_topo_exit(); for (i = ARRAY_SIZE(cxl_switch_dport) - 1; i >= 0; i--) platform_device_unregister(cxl_switch_dport[i]); for (i = ARRAY_SIZE(cxl_switch_uport) - 1; i >= 0; i--) diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c index ad5c4c18c5c6..71916e0e1546 100644 --- a/tools/testing/cxl/test/mem.c +++ b/tools/testing/cxl/test/mem.c @@ -1673,6 +1673,7 @@ static struct platform_driver cxl_mock_mem_driver = { .name = KBUILD_MODNAME, .dev_groups = cxl_mock_mem_groups, .groups = cxl_mock_mem_core_groups, + .probe_type = PROBE_PREFER_ASYNCHRONOUS, }, };