From patchwork Fri Feb 10 09:05:27 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 13135548
Subject: [PATCH v2 01/20] cxl/memdev: Fix endpoint port removal
From: Dan Williams
To: linux-cxl@vger.kernel.org
Cc: vishal.l.verma@intel.com, dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org
Date: Fri, 10 Feb 2023 01:05:27 -0800
Message-ID: <167601992789.1924368.8083994227892600608.stgit@dwillia2-xfh.jf.intel.com>
In-Reply-To: <167601992097.1924368.18291887895351917895.stgit@dwillia2-xfh.jf.intel.com>
References: <167601992097.1924368.18291887895351917895.stgit@dwillia2-xfh.jf.intel.com>
User-Agent: StGit/0.18-3-g996c

Testing of ram region support [1] stimulates a long-standing bug in
cxl_detach_ep() where some cxl_ep_remove() cleanup is skipped due to
inability to walk ports after dports have been unregistered. That
results in a failure to re-register a memdev after the port is
re-enabled, leading to a crash like the following:

    cxl_port_setup_targets: cxl region4: cxl_host_bridge.0:port4 iw: 1 ig: 256
    general protection fault, ...
    [..]
    RIP: 0010:cxl_region_setup_targets+0x897/0x9e0 [cxl_core]
    dev_name at include/linux/device.h:700
    (inlined by) cxl_port_setup_targets at drivers/cxl/core/region.c:1155
    (inlined by) cxl_region_setup_targets at drivers/cxl/core/region.c:1249
    [..]
    Call Trace:
     attach_target+0x39a/0x760 [cxl_core]
     ? __mutex_unlock_slowpath+0x3a/0x290
     cxl_add_to_region+0xb8/0x340 [cxl_core]
     ? lockdep_hardirqs_on+0x7d/0x100
     discover_region+0x4b/0x80 [cxl_port]
     ? __pfx_discover_region+0x10/0x10 [cxl_port]
     device_for_each_child+0x58/0x90
     cxl_port_probe+0x10e/0x130 [cxl_port]
     cxl_bus_probe+0x17/0x50 [cxl_core]

Change the port ancestry walk to be by depth rather than by dport. This
ensures that even if a port has unregistered its dports, a deferred
memdev cleanup will still be able to clean up the memdev's interest in
that port.

The parent_port->dev.driver check is only needed for determining if the
bottom-up removal beat the top-down removal, but cxl_ep_remove() can
always proceed.

Fixes: 2703c16c75ae ("cxl/core/port: Add switch port enumeration")
Link: http://lore.kernel.org/r/167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com [1]
Signed-off-by: Dan Williams
Reviewed-by: Vishal Verma
---
 drivers/cxl/core/memdev.c |    1 +
 drivers/cxl/core/port.c   |   58 +++++++++++++++++++++++++--------------------
 drivers/cxl/cxlmem.h      |    2 ++
 3 files changed, 35 insertions(+), 26 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index a74a93310d26..3a8bc2b06047 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -246,6 +246,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
 	if (rc < 0)
 		goto err;
 	cxlmd->id = rc;
+	cxlmd->depth = -1;
 
 	dev = &cxlmd->dev;
 	device_initialize(dev);
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 410c036c09fa..317bcf4dbd9d 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1207,6 +1207,7 @@ int cxl_endpoint_autoremove(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
 
 	get_device(&endpoint->dev);
 	dev_set_drvdata(dev, endpoint);
+	cxlmd->depth = endpoint->depth;
 	return devm_add_action_or_reset(dev, delete_endpoint, cxlmd);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_endpoint_autoremove, CXL);
@@ -1241,50 +1242,55 @@ static void reap_dports(struct cxl_port *port)
 	}
 }
 
+struct detach_ctx {
+	struct cxl_memdev *cxlmd;
+	int depth;
+};
+
+static int port_has_memdev(struct device *dev, const void *data)
+{
+	const struct detach_ctx *ctx = data;
+	struct cxl_port *port;
+
+	if (!is_cxl_port(dev))
+		return 0;
+
+	port = to_cxl_port(dev);
+	if (port->depth != ctx->depth)
+		return 0;
+
+	return !!cxl_ep_load(port, ctx->cxlmd);
+}
+
 static void cxl_detach_ep(void *data)
 {
 	struct cxl_memdev *cxlmd = data;
-	struct device *iter;
 
-	for (iter = &cxlmd->dev; iter; iter = grandparent(iter)) {
-		struct device *dport_dev = grandparent(iter);
+	for (int i = cxlmd->depth - 1; i >= 1; i--) {
 		struct cxl_port *port, *parent_port;
+		struct detach_ctx ctx = {
+			.cxlmd = cxlmd,
+			.depth = i,
+		};
+		struct device *dev;
 		struct cxl_ep *ep;
 		bool died = false;
 
-		if (!dport_dev)
-			break;
-
-		port = find_cxl_port(dport_dev, NULL);
-		if (!port)
-			continue;
-
-		if (is_cxl_root(port)) {
-			put_device(&port->dev);
+		dev = bus_find_device(&cxl_bus_type, NULL, &ctx,
+				      port_has_memdev);
+		if (!dev)
 			continue;
-		}
+		port = to_cxl_port(dev);
 
 		parent_port = to_cxl_port(port->dev.parent);
 		device_lock(&parent_port->dev);
-		if (!parent_port->dev.driver) {
-			/*
-			 * The bottom-up race to delete the port lost to a
-			 * top-down port disable, give up here, because the
-			 * parent_port ->remove() will have cleaned up all
-			 * descendants.
-			 */
-			device_unlock(&parent_port->dev);
-			put_device(&port->dev);
-			continue;
-		}
-
 		device_lock(&port->dev);
 		ep = cxl_ep_load(port, cxlmd);
 		dev_dbg(&cxlmd->dev, "disconnect %s from %s\n",
 			ep ? dev_name(ep->ep) : "", dev_name(&port->dev));
 		cxl_ep_remove(port, ep);
 		if (ep && !port->dead && xa_empty(&port->endpoints) &&
-		    !is_cxl_root(parent_port)) {
+		    !is_cxl_root(parent_port) && parent_port->dev.driver) {
 			/*
 			 * This was the last ep attached to a dynamically
 			 * enumerated port. Block new cxl_add_ep() and garbage
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index ab138004f644..c9da3c699a21 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -38,6 +38,7 @@
  * @cxl_nvb: coordinate removal of @cxl_nvd if present
  * @cxl_nvd: optional bridge to an nvdimm if the device supports pmem
  * @id: id number of this memdev instance.
+ * @depth: endpoint port depth
  */
 struct cxl_memdev {
 	struct device dev;
@@ -47,6 +48,7 @@ struct cxl_memdev {
 	struct cxl_nvdimm_bridge *cxl_nvb;
 	struct cxl_nvdimm *cxl_nvd;
 	int id;
+	int depth;
 };
 
 static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)