From patchwork Fri Sep 29 23:09:33 2023
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 13404864
Subject: [PATCH v2 1/4] cxl/pci: Remove unnecessary device reference management in sanitize work
From: Dan Williams
To: linux-cxl@vger.kernel.org
Cc: ira.weiny@intel.com
Date: Fri, 29 Sep 2023 16:09:33 -0700
Message-ID: <169602897366.904193.9449207727775648546.stgit@dwillia2-xfh.jf.intel.com>
In-Reply-To: <169602896768.904193.11292185494339980455.stgit@dwillia2-xfh.jf.intel.com>
References: <169602896768.904193.11292185494339980455.stgit@dwillia2-xfh.jf.intel.com>

Given that any particular put_device() could be the final put of the
device, the fact that there are usages of cxlds->dev after
put_device(cxlds->dev) is a red flag. Drop the reference counting since
the device is pinned by being registered and will not be unregistered
without triggering the driver + workqueue to shut down.
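To illustrate the lifetime argument (a minimal sketch with made-up
'example_*' names, not the driver's actual code): as long as driver
teardown cancels the delayed work before the device can be
unregistered, the work callback may dereference the device without
holding its own reference:

#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/workqueue.h>

/* illustrative sketch only; 'example_*' names are hypothetical */
struct example_state {
        struct device *dev;             /* pinned while the driver is bound */
        struct delayed_work poll_dwork;
};

static void example_poll(struct work_struct *work)
{
        struct example_state *st =
                container_of(work, struct example_state, poll_dwork.work);

        /* safe without get_device(): teardown cancels this work first */
        dev_dbg(st->dev, "background operation still running\n");
}

/* registered via devm_add_action_or_reset(), runs before unregistration */
static void example_teardown(void *data)
{
        struct example_state *st = data;

        cancel_delayed_work_sync(&st->poll_dwork);
}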
Signed-off-by: Dan Williams
Reviewed-by: Ira Weiny
Reviewed-by: Jonathan Cameron
Reviewed-by: Davidlohr Bueso
Reviewed-by: Dave Jiang
---
 drivers/cxl/pci.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 44a21ab7add5..aa1b3dd9e64c 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -152,8 +152,6 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
         mutex_lock(&mds->mbox_mutex);
         if (cxl_mbox_background_complete(cxlds)) {
                 mds->security.poll_tmo_secs = 0;
-                put_device(cxlds->dev);
-
                 if (mds->security.sanitize_node)
                         sysfs_notify_dirent(mds->security.sanitize_node);
 
@@ -296,9 +294,6 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
          */
         if (mbox_cmd->opcode == CXL_MBOX_OP_SANITIZE) {
                 if (mds->security.poll) {
-                        /* hold the device throughout */
-                        get_device(cxlds->dev);
-
                         /* give first timeout a second */
                         timeout = 1;
                         mds->security.poll_tmo_secs = timeout;

From patchwork Fri Sep 29 23:09:39 2023
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 13404865
Subject: [PATCH v2 2/4] cxl/pci: Cleanup 'sanitize' to always poll
From: Dan Williams
To: linux-cxl@vger.kernel.org
Cc: ira.weiny@intel.com
Date: Fri, 29 Sep 2023 16:09:39 -0700
Message-ID: <169602897906.904193.9057960720070253436.stgit@dwillia2-xfh.jf.intel.com>
In-Reply-To: <169602896768.904193.11292185494339980455.stgit@dwillia2-xfh.jf.intel.com>
References: <169602896768.904193.11292185494339980455.stgit@dwillia2-xfh.jf.intel.com>

In preparation for fixing the init/teardown of the 'sanitize' workqueue
and sysfs notification mechanism, arrange for cxl_mbox_sanitize_work()
to be the single location where the sysfs attribute is notified. With
that change there is no distinction between polled mode and interrupt
mode. All the interrupt does is accelerate the polling interval.

The change to check for "mds->security.sanitize_node" under the lock is
there to ensure that the interrupt, the work routine and the
setup/teardown code can all have a consistent view of the registered
notifier and the workqueue state. I.e. the expectation is that the
interrupt is live past the point that the sanitize sysfs attribute is
published, and it may race teardown, so it must be consulted under a
lock.

Lastly, some opportunistic replacements of
"queue_delayed_work(system_wq, ...)", which is just open coded
schedule_delayed_work(), are included.

Signed-off-by: Dan Williams
Reviewed-by: Ira Weiny
Reviewed-by: Jonathan Cameron
Reviewed-by: Dave Jiang
---
 drivers/cxl/core/memdev.c |  3 +-
 drivers/cxl/cxlmem.h      |  2 --
 drivers/cxl/pci.c         | 60 +++++++++++++++++++--------------------
 3 files changed, 26 insertions(+), 39 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 14b547c07f54..2a7a07f6d165 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -561,8 +561,7 @@ static void cxl_memdev_security_shutdown(struct device *dev)
         struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
         struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 
-        if (mds->security.poll)
-                cancel_delayed_work_sync(&mds->security.poll_dwork);
+        cancel_delayed_work_sync(&mds->security.poll_dwork);
 }
 
 static void cxl_memdev_shutdown(struct device *dev)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 706f8a6d1ef4..55f00ad17a77 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -360,7 +360,6 @@ struct cxl_fw_state {
  *
  * @state: state of last security operation
  * @enabled_cmds: All security commands enabled in the CEL
- * @poll: polling for sanitization is enabled, device has no mbox irq support
  * @poll_tmo_secs: polling timeout
  * @poll_dwork: polling work item
  * @sanitize_node: sanitation sysfs file to notify
@@ -368,7 +367,6 @@ struct cxl_fw_state {
 struct cxl_security_state {
         unsigned long state;
         DECLARE_BITMAP(enabled_cmds, CXL_SEC_ENABLED_MAX);
-        bool poll;
         int poll_tmo_secs;
         struct delayed_work poll_dwork;
         struct kernfs_node *sanitize_node;
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index aa1b3dd9e64c..ac4e434b0806 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -128,10 +128,10 @@ static irqreturn_t cxl_pci_mbox_irq(int irq, void *id)
         reg = readq(cxlds->regs.mbox + CXLDEV_MBOX_BG_CMD_STATUS_OFFSET);
         opcode = FIELD_GET(CXLDEV_MBOX_BG_CMD_COMMAND_OPCODE_MASK, reg);
         if (opcode == CXL_MBOX_OP_SANITIZE) {
+                mutex_lock(&mds->mbox_mutex);
                 if (mds->security.sanitize_node)
-                        sysfs_notify_dirent(mds->security.sanitize_node);
-
-                dev_dbg(cxlds->dev, "Sanitization operation ended\n");
+                        mod_delayed_work(system_wq, &mds->security.poll_dwork, 0);
+                mutex_unlock(&mds->mbox_mutex);
         } else {
                 /* short-circuit the wait in __cxl_pci_mbox_send_cmd() */
                 rcuwait_wake_up(&mds->mbox_wait);
@@ -160,8 +160,7 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
                 int timeout = mds->security.poll_tmo_secs + 10;
 
                 mds->security.poll_tmo_secs = min(15 * 60, timeout);
-                queue_delayed_work(system_wq, &mds->security.poll_dwork,
-                                   timeout * HZ);
+                schedule_delayed_work(&mds->security.poll_dwork, timeout * HZ);
         }
         mutex_unlock(&mds->mbox_mutex);
 }
@@ -293,15 +292,11 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
          * and allow userspace to poll(2) for completion.
          */
         if (mbox_cmd->opcode == CXL_MBOX_OP_SANITIZE) {
-                if (mds->security.poll) {
-                        /* give first timeout a second */
-                        timeout = 1;
-                        mds->security.poll_tmo_secs = timeout;
-                        queue_delayed_work(system_wq,
-                                           &mds->security.poll_dwork,
-                                           timeout * HZ);
-                }
-
+                /* give first timeout a second */
+                timeout = 1;
+                mds->security.poll_tmo_secs = timeout;
+                schedule_delayed_work(&mds->security.poll_dwork,
+                                      timeout * HZ);
                 dev_dbg(dev, "Sanitization operation started\n");
                 goto success;
         }
@@ -384,7 +379,9 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
         const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
         struct device *dev = cxlds->dev;
         unsigned long timeout;
+        int irq, msgnum;
         u64 md_status;
+        u32 ctrl;
 
         timeout = jiffies + mbox_ready_timeout * HZ;
         do {
@@ -432,33 +429,26 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
         dev_dbg(dev, "Mailbox payload sized %zu", mds->payload_size);
 
         rcuwait_init(&mds->mbox_wait);
+        INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);
 
-        if (cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) {
-                u32 ctrl;
-                int irq, msgnum;
-                struct pci_dev *pdev = to_pci_dev(cxlds->dev);
-
-                msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
-                irq = pci_irq_vector(pdev, msgnum);
-                if (irq < 0)
-                        goto mbox_poll;
-
-                if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
-                        goto mbox_poll;
+        /* background command interrupts are optional */
+        if ((cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) == 0)
+                return 0;
 
-                /* enable background command mbox irq support */
-                ctrl = readl(cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
-                ctrl |= CXLDEV_MBOX_CTRL_BG_CMD_IRQ;
-                writel(ctrl, cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
+        msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
+        irq = pci_irq_vector(to_pci_dev(cxlds->dev), msgnum);
+        if (irq < 0)
+                return 0;
 
+        if (cxl_request_irq(cxlds, irq, cxl_pci_mbox_irq, NULL))
                 return 0;
-        }
 
-mbox_poll:
-        mds->security.poll = true;
-        INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);
+        dev_dbg(cxlds->dev, "Mailbox interrupts enabled\n");
+        /* enable background command mbox irq support */
+        ctrl = readl(cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
+        ctrl |= CXLDEV_MBOX_CTRL_BG_CMD_IRQ;
+        writel(ctrl, cxlds->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
 
-        dev_dbg(cxlds->dev, "Mailbox interrupts are unsupported");
         return 0;
 }

From patchwork Fri Sep 29 23:09:44 2023
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 13404866
Subject: [PATCH v2 3/4] cxl/pci: Fix sanitize notifier setup
From: Dan Williams
To: linux-cxl@vger.kernel.org
Cc: Jonathan Cameron, Dave Jiang, Davidlohr Bueso, ira.weiny@intel.com
Date: Fri, 29 Sep 2023 16:09:44 -0700
Message-ID: <169602898447.904193.4454973423100628922.stgit@dwillia2-xfh.jf.intel.com>
In-Reply-To: <169602896768.904193.11292185494339980455.stgit@dwillia2-xfh.jf.intel.com>
References: <169602896768.904193.11292185494339980455.stgit@dwillia2-xfh.jf.intel.com>

Fix a race condition between the mailbox-background command interrupt
firing and the security-state sysfs attribute being removed. The race
is difficult to see due to the awkward placement of the
sanitize-notifier setup code and the multiple places the teardown calls
are made, cxl_memdev_security_init() and cxl_memdev_security_shutdown().

Unify setup in one place, cxl_sanitize_setup_notifier(). Arrange for
the paired cxl_sanitize_teardown_notifier() to safely quiet the
notifier and let the cxl_memdev + irq be unregistered later in the
flow. This is also needed as preparation for fixing a memdev
unregistration crash.
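The essence of the fix can be reduced to a sketch (hypothetical
'sketch_*' names, not the actual driver code): because the interrupt
path only consults the notifier under the mailbox mutex, teardown can
quiesce it by clearing the pointer under that same lock and then
flushing any in-flight work before the sysfs node and irq go away:

#include <linux/kernfs.h>
#include <linux/mutex.h>
#include <linux/sysfs.h>
#include <linux/workqueue.h>

/* illustrative sketch only; names are hypothetical */
struct sketch_state {
        struct mutex lock;                      /* stands in for mbox_mutex */
        struct kernfs_node *notify_node;        /* stands in for sanitize_node */
        struct delayed_work poll_dwork;
};

static void sketch_teardown_notifier(void *data)
{
        struct sketch_state *s = data;
        struct kernfs_node *node;

        mutex_lock(&s->lock);
        node = s->notify_node;          /* irq path tests this under the lock */
        s->notify_node = NULL;          /* later interrupts become no-ops */
        mutex_unlock(&s->lock);

        /* flush any invocation that raced the clearing above */
        cancel_delayed_work_sync(&s->poll_dwork);
        sysfs_put(node);                /* sysfs_put(NULL) is a no-op */
}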
Reported-by: Jonathan Cameron
Cc: Dave Jiang
Cc: Davidlohr Bueso
Fixes: 0c36b6ad436a ("cxl/mbox: Add sanitization handling machinery")
Signed-off-by: Dan Williams
Reviewed-by: Ira Weiny
Reviewed-by: Dave Jiang
---
 drivers/cxl/core/memdev.c | 42 ----------------------------------------
 drivers/cxl/pci.c         | 47 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 42 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 2a7a07f6d165..a950091e5640 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -556,20 +556,11 @@ void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 }
 EXPORT_SYMBOL_NS_GPL(clear_exclusive_cxl_commands, CXL);
 
-static void cxl_memdev_security_shutdown(struct device *dev)
-{
-        struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
-        struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
-
-        cancel_delayed_work_sync(&mds->security.poll_dwork);
-}
-
 static void cxl_memdev_shutdown(struct device *dev)
 {
         struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 
         down_write(&cxl_memdev_rwsem);
-        cxl_memdev_security_shutdown(dev);
         cxlmd->cxlds = NULL;
         up_write(&cxl_memdev_rwsem);
 }
@@ -1001,35 +992,6 @@ static const struct file_operations cxl_memdev_fops = {
         .llseek = noop_llseek,
 };
 
-static void put_sanitize(void *data)
-{
-        struct cxl_memdev_state *mds = data;
-
-        sysfs_put(mds->security.sanitize_node);
-}
-
-static int cxl_memdev_security_init(struct cxl_memdev *cxlmd)
-{
-        struct cxl_dev_state *cxlds = cxlmd->cxlds;
-        struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
-        struct device *dev = &cxlmd->dev;
-        struct kernfs_node *sec;
-
-        sec = sysfs_get_dirent(dev->kobj.sd, "security");
-        if (!sec) {
-                dev_err(dev, "sysfs_get_dirent 'security' failed\n");
-                return -ENODEV;
-        }
-        mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
-        sysfs_put(sec);
-        if (!mds->security.sanitize_node) {
-                dev_err(dev, "sysfs_get_dirent 'state' failed\n");
-                return -ENODEV;
-        }
-
-        return devm_add_action_or_reset(cxlds->dev, put_sanitize, mds);
- }
-
 struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
 {
         struct cxl_memdev *cxlmd;
@@ -1058,10 +1020,6 @@ struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
         if (rc)
                 goto err;
 
-        rc = cxl_memdev_security_init(cxlmd);
-        if (rc)
-                goto err;
-
         rc = devm_add_action_or_reset(cxlds->dev, cxl_memdev_unregister, cxlmd);
         if (rc)
                 return ERR_PTR(rc);
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index ac4e434b0806..b0023e479315 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -165,6 +165,49 @@ static void cxl_mbox_sanitize_work(struct work_struct *work)
         mutex_unlock(&mds->mbox_mutex);
 }
 
+static void cxl_sanitize_teardown_notifier(void *data)
+{
+        struct cxl_memdev_state *mds = data;
+        struct kernfs_node *state;
+
+        /*
+         * Prevent new irq triggered invocations of the workqueue and
+         * flush inflight invocations.
+         */
+        mutex_lock(&mds->mbox_mutex);
+        state = mds->security.sanitize_node;
+        mds->security.sanitize_node = NULL;
+        mutex_unlock(&mds->mbox_mutex);
+
+        cancel_delayed_work_sync(&mds->security.poll_dwork);
+        sysfs_put(state);
+}
+
+static int cxl_sanitize_setup_notifier(struct cxl_memdev *cxlmd)
+{
+        struct cxl_dev_state *cxlds = cxlmd->cxlds;
+        struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
+        struct device *dev = cxlds->dev;
+        struct kernfs_node *sec;
+
+        if (!test_bit(CXL_SEC_ENABLED_SANITIZE, mds->security.enabled_cmds))
+                return 0;
+
+        sec = sysfs_get_dirent(dev->kobj.sd, "security");
+        if (!sec) {
+                dev_err(dev, "sanitize notification setup failure\n");
+                return -ENOENT;
+        }
+        mds->security.sanitize_node = sysfs_get_dirent(sec, "state");
+        sysfs_put(sec);
+        if (!mds->security.sanitize_node) {
+                dev_err(dev, "sanitize notification setup failure\n");
+                return -ENOENT;
+        }
+
+        return devm_add_action_or_reset(dev, cxl_sanitize_teardown_notifier, mds);
+}
+
 /**
  * __cxl_pci_mbox_send_cmd() - Execute a mailbox command
  * @mds: The memory device driver data
@@ -875,6 +918,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
         if (rc)
                 return rc;
 
+        rc = cxl_sanitize_setup_notifier(cxlmd);
+        if (rc)
+                return rc;
+
         pmu_count = cxl_count_regblock(pdev, CXL_REGLOC_RBI_PMU);
         for (i = 0; i < pmu_count; i++) {
                 struct cxl_pmu_regs pmu_regs;

From patchwork Fri Sep 29 23:09:49 2023
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 13404867
Subject: [PATCH v2 4/4] cxl/mem: Fix shutdown order
From: Dan Williams
To: linux-cxl@vger.kernel.org
Cc: Ira Weiny, ira.weiny@intel.com
Date: Fri, 29 Sep 2023 16:09:49 -0700
Message-ID: <169602898991.904193.3059334392093961032.stgit@dwillia2-xfh.jf.intel.com>
In-Reply-To: <169602896768.904193.11292185494339980455.stgit@dwillia2-xfh.jf.intel.com>
References: <169602896768.904193.11292185494339980455.stgit@dwillia2-xfh.jf.intel.com>

Ira reports that removing cxl_mock_mem causes a crash with the
following trace:

 BUG: kernel NULL pointer dereference, address: 0000000000000044
 [..]
 RIP: 0010:cxl_region_decode_reset+0x7f/0x180 [cxl_core]
 [..]
 Call Trace:
  cxl_region_detach+0xe8/0x210 [cxl_core]
  cxl_decoder_kill_region+0x27/0x40 [cxl_core]
  cxld_unregister+0x29/0x40 [cxl_core]
  devres_release_all+0xb8/0x110
  device_unbind_cleanup+0xe/0x70
  device_release_driver_internal+0x1d2/0x210
  bus_remove_device+0xd7/0x150
  device_del+0x155/0x3e0
  device_unregister+0x13/0x60
  devm_release_action+0x4d/0x90
  ? __pfx_unregister_port+0x10/0x10 [cxl_core]
  delete_endpoint+0x121/0x130 [cxl_core]
  devres_release_all+0xb8/0x110
  device_unbind_cleanup+0xe/0x70
  device_release_driver_internal+0x1d2/0x210
  bus_remove_device+0xd7/0x150
  device_del+0x155/0x3e0
  ? lock_release+0x142/0x290
  cdev_device_del+0x15/0x50
  cxl_memdev_unregister+0x54/0x70 [cxl_core]

This crash is due to clearing out the cxl_memdev's driver context
(@cxlds) before the subsystem is done with it. It is ultimately caused
by the region(s) that this memdev is a member of being torn down while
still expecting to be able to de-reference @cxlds, like here:

static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
...
        if (cxlds->rcd)
                goto endpoint_reset;
...

Fix it by keeping the driver context valid until memdev-device
unregistration, and subsequently the entire stack of related
dependencies, unwinds.

Fixes: 9cc238c7a526 ("cxl/pci: Introduce cdevm_file_operations")
Reported-by: Ira Weiny
Signed-off-by: Dan Williams
Reviewed-by: Ira Weiny
Tested-by: Ira Weiny
Reviewed-by: Jonathan Cameron
Reviewed-by: Dave Jiang
Reviewed-by: Davidlohr Bueso
---
 drivers/cxl/core/memdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index a950091e5640..e78b5ead14fa 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -570,8 +570,8 @@ static void cxl_memdev_unregister(void *_cxlmd)
         struct cxl_memdev *cxlmd = _cxlmd;
         struct device *dev = &cxlmd->dev;
 
-        cxl_memdev_shutdown(dev);
         cdev_device_del(&cxlmd->cdev, dev);
+        cxl_memdev_shutdown(dev);
         put_device(dev);
 }
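For reference, the ordering rule restored by this patch can be sketched
as follows (hypothetical 'sketch_*' names standing in for
cxl_memdev_unregister() and friends, not the actual driver code):
remove user/subsystem visibility first, and only sever the driver
context once nothing left in the unwind can dereference it:

#include <linux/cdev.h>
#include <linux/device.h>

/* illustrative sketch only; names are hypothetical */
struct sketch_memdev {
        struct device dev;
        struct cdev cdev;
        void *driver_ctx;       /* stands in for cxlmd->cxlds */
};

static void sketch_shutdown(struct sketch_memdev *md)
{
        /* sever the driver context once nothing can reach it anymore */
        md->driver_ctx = NULL;
}

static void sketch_unregister(void *data)
{
        struct sketch_memdev *md = data;

        /*
         * Deleting the device can trigger devres unwinding (e.g. region
         * teardown) that still dereferences the driver context, so do it
         * while the context is intact.
         */
        cdev_device_del(&md->cdev, &md->dev);

        sketch_shutdown(md);

        /* drop the reference taken at registration */
        put_device(&md->dev);
}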