From patchwork Tue Apr 12 22:43:46 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tanjore Suresh
X-Patchwork-Id: 12811276
Date: Tue, 12 Apr 2022 15:43:46 -0700
In-Reply-To: <20220412224348.1038613-1-tansuresh@google.com>
Message-Id: <20220412224348.1038613-2-tansuresh@google.com>
References: <20220412224348.1038613-1-tansuresh@google.com>
X-Mailer: git-send-email 2.36.0.rc0.470.gd361397f0d-goog
Subject: [PATCH v2 1/3] driver core: Support asynchronous driver shutdown
From: Tanjore Suresh
To: Greg Kroah-Hartman, "Rafael J . Wysocki", Christoph Hellwig,
 Sagi Grimberg, Bjorn Helgaas
Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-pci@vger.kernel.org, Tanjore Suresh
X-Mailing-List: linux-pci@vger.kernel.org

Extend the bus driver interface with two additional entry points,
async_shutdown_start and async_shutdown_end, so that devices can
implement asynchronous shutdown. The existing synchronous shutdown
interface is unmodified and retained for backward compatibility.

Change the common device shutdown code so that devices can participate
in asynchronous shutdown. device_shutdown() calls async_shutdown_start
during its existing pass over the device list and queues the device on
a local list. After that pass, a second pass calls async_shutdown_end
on each queued device. A bus that does not provide
async_shutdown_start is shut down through its shutdown callback as
before.
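For illustration only, the two-pass control flow described above can be
modelled in userspace C roughly as follows. This is a sketch, not the
kernel code: model_bus, model_dev, model_device_shutdown() and the
example callbacks are hypothetical names, a fixed-size array stands in
for the kernel list, and the locking and reference counting that
device_shutdown() performs are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct bus_type and struct device. */
struct model_dev;

struct model_bus {
	void (*shutdown)(struct model_dev *dev);             /* synchronous */
	void (*async_shutdown_start)(struct model_dev *dev); /* should not wait */
	void (*async_shutdown_end)(struct model_dev *dev);   /* waits for completion */
};

struct model_dev {
	const struct model_bus *bus;
	int started;  /* set by async_shutdown_start */
	int finished; /* set by shutdown or async_shutdown_end */
};

#define MODEL_MAX_ASYNC 64

/*
 * Two-pass shutdown over devs[0..n-1]. Pass 1 starts (or synchronously
 * completes) every device. Pass 2 completes only the devices that were
 * started asynchronously. Returns how many devices took the async path.
 */
size_t model_device_shutdown(struct model_dev *devs, size_t n)
{
	struct model_dev *async_list[MODEL_MAX_ASYNC];
	size_t nr_async = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_dev *dev = &devs[i];

		if (dev->bus && dev->bus->async_shutdown_start) {
			/* Assumption of this model: bounded number of async devices. */
			assert(nr_async < MODEL_MAX_ASYNC);
			dev->bus->async_shutdown_start(dev);
			async_list[nr_async++] = dev;
		} else if (dev->bus && dev->bus->shutdown) {
			dev->bus->shutdown(dev);
		}
	}

	for (i = 0; i < nr_async; i++) {
		struct model_dev *dev = async_list[i];

		if (dev->bus->async_shutdown_end)
			dev->bus->async_shutdown_end(dev);
	}
	return nr_async;
}

/* Example callbacks, used only to exercise the model. */
static void model_sync_shutdown(struct model_dev *dev) { dev->finished = 1; }
static void model_async_start(struct model_dev *dev) { dev->started = 1; }
static void model_async_end(struct model_dev *dev) { dev->finished = dev->started; }

const struct model_bus model_sync_bus = { .shutdown = model_sync_shutdown };
const struct model_bus model_async_bus = {
	.async_shutdown_start = model_async_start,
	.async_shutdown_end = model_async_end,
};
```

As in the patch, a bus that provides async_shutdown_start takes
precedence over its shutdown callback, so a bus is expected to set one
or the other.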
Signed-off-by: Tanjore Suresh
---
 drivers/base/core.c        | 38 +++++++++++++++++++++++++++++++++++++-
 include/linux/device/bus.h | 12 ++++++++++++
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 3d6430eb0c6a..ba267ae70a22 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -4479,6 +4479,7 @@ EXPORT_SYMBOL_GPL(device_change_owner);
 void device_shutdown(void)
 {
 	struct device *dev, *parent;
+	LIST_HEAD(async_shutdown_list);
 
 	wait_for_device_probe();
 	device_block_probing();
@@ -4523,7 +4524,13 @@ void device_shutdown(void)
 				dev_info(dev, "shutdown_pre\n");
 			dev->class->shutdown_pre(dev);
 		}
-		if (dev->bus && dev->bus->shutdown) {
+		if (dev->bus && dev->bus->async_shutdown_start) {
+			if (initcall_debug)
+				dev_info(dev, "async_shutdown_start\n");
+			dev->bus->async_shutdown_start(dev);
+			list_add_tail(&dev->kobj.entry,
+				&async_shutdown_list);
+		} else if (dev->bus && dev->bus->shutdown) {
 			if (initcall_debug)
 				dev_info(dev, "shutdown\n");
 			dev->bus->shutdown(dev);
@@ -4543,6 +4550,35 @@ void device_shutdown(void)
 		spin_lock(&devices_kset->list_lock);
 	}
 	spin_unlock(&devices_kset->list_lock);
+
+	/*
+	 * Second pass spin for only devices, that have configured
+	 * Asynchronous shutdown.
+	 */
+	while (!list_empty(&async_shutdown_list)) {
+		dev = list_entry(async_shutdown_list.next, struct device,
+				kobj.entry);
+		parent = get_device(dev->parent);
+		get_device(dev);
+		/*
+		 * Make sure the device is off the list
+		 */
+		list_del_init(&dev->kobj.entry);
+		if (parent)
+			device_lock(parent);
+		device_lock(dev);
+		if (dev->bus && dev->bus->async_shutdown_end) {
+			if (initcall_debug)
+				dev_info(dev,
+				"async_shutdown_end called\n");
+			dev->bus->async_shutdown_end(dev);
+		}
+		device_unlock(dev);
+		if (parent)
+			device_unlock(parent);
+		put_device(dev);
+		put_device(parent);
+	}
 }
 
 /*
diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
index a039ab809753..f582c9d21515 100644
--- a/include/linux/device/bus.h
+++ b/include/linux/device/bus.h
@@ -49,6 +49,16 @@ struct fwnode_handle;
  *		will never get called until they do.
  * @remove:	Called when a device removed from this bus.
  * @shutdown:	Called at shut-down time to quiesce the device.
+ * @async_shutdown_start:	Called at the shutdown-time to start
+ *				the shutdown process on the device.
+ *				This entry point will be called only
+ *				when the bus driver has indicated it would
+ *				like to participate in asynchronous shutdown
+ *				completion.
+ * @async_shutdown_end:	Called at shutdown-time to complete the shutdown
+ *			process of the device. This entry point will be called
+ *			only when the bus driver has indicated it would like to
+ *			participate in the asynchronous shutdown completion.
  *
  * @online:	Called to put the device back online (after offlining it).
  * @offline:	Called to put the device offline for hot-removal. May fail.
@@ -93,6 +103,8 @@ struct bus_type {
 	void (*sync_state)(struct device *dev);
 	void (*remove)(struct device *dev);
 	void (*shutdown)(struct device *dev);
+	void (*async_shutdown_start)(struct device *dev);
+	void (*async_shutdown_end)(struct device *dev);
 
 	int (*online)(struct device *dev);
 	int (*offline)(struct device *dev);

From patchwork Tue Apr 12 22:43:47 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tanjore Suresh
X-Patchwork-Id: 12811291
Date: Tue, 12 Apr 2022 15:43:47 -0700
In-Reply-To: <20220412224348.1038613-2-tansuresh@google.com>
Message-Id: <20220412224348.1038613-3-tansuresh@google.com>
References: <20220412224348.1038613-1-tansuresh@google.com>
 <20220412224348.1038613-2-tansuresh@google.com>
X-Mailer: git-send-email 2.36.0.rc0.470.gd361397f0d-goog
Subject: [PATCH v2 2/3] PCI: Support asynchronous shutdown
From: Tanjore Suresh
To: Greg Kroah-Hartman, "Rafael J . Wysocki", Christoph Hellwig,
 Sagi Grimberg, Bjorn Helgaas
Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-pci@vger.kernel.org, Tanjore Suresh
X-Mailing-List: linux-pci@vger.kernel.org

Enhance the base PCI driver to support asynchronous shutdown.

Assume a device takes n seconds to shut down. If a machine is
populated with M such devices and shutdown is done synchronously, the
total time spent shutting down all the devices is M * n seconds. For
example, if each NVMe PCI controller takes 5 seconds to shut down and
a system has 16 such controllers, the system spends a total of
16 * 5 = 80 seconds shutting down its NVMe devices.

To reduce this time, implement an asynchronous shutdown interface.
struct pci_driver gains async_shutdown_start and async_shutdown_end
callbacks, and the PCI bus provides the matching bus-level entry points
in place of its .shutdown callback. A driver that only implements
.shutdown is still called synchronously from the start phase. Because
the shutdowns of participating devices can overlap, this can
significantly reduce the machine reboot time.

Signed-off-by: Tanjore Suresh
---
 drivers/pci/pci-driver.c | 20 ++++++++++++++++----
 include/linux/pci.h      |  4 ++++
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 4ceeb75fc899..63f49a8dff8e 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -501,16 +501,28 @@ static void pci_device_remove(struct device *dev)
 	pci_dev_put(pci_dev);
 }
 
-static void pci_device_shutdown(struct device *dev)
+static void pci_device_async_shutdown_start(struct device *dev)
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);
 	struct pci_driver *drv = pci_dev->driver;
 
 	pm_runtime_resume(dev);
 
-	if (drv && drv->shutdown)
+	if (drv && drv->async_shutdown_start)
+		drv->async_shutdown_start(pci_dev);
+	else if (drv && drv->shutdown)
 		drv->shutdown(pci_dev);
 
+}
+
+static void pci_device_async_shutdown_end(struct device *dev)
+{
+	struct pci_dev *pci_dev = to_pci_dev(dev);
+	struct pci_driver *drv = pci_dev->driver;
+
+	if (drv && drv->async_shutdown_end)
+		drv->async_shutdown_end(pci_dev);
+
 	/*
 	 * If this is a kexec reboot, turn off Bus Master bit on the
 	 * device to tell it to not continue to do DMA. Don't touch
@@ -521,7 +533,6 @@ static void pci_device_shutdown(struct device *dev)
 	if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
 		pci_clear_master(pci_dev);
 }
-
 #ifdef CONFIG_PM
 
 /* Auxiliary functions used for system resume and run-time resume. */
@@ -1625,7 +1636,8 @@ struct bus_type pci_bus_type = {
 	.uevent = pci_uevent,
 	.probe = pci_device_probe,
 	.remove = pci_device_remove,
-	.shutdown = pci_device_shutdown,
+	.async_shutdown_start = pci_device_async_shutdown_start,
+	.async_shutdown_end = pci_device_async_shutdown_end,
 	.dev_groups = pci_dev_groups,
 	.bus_groups = pci_bus_groups,
 	.drv_groups = pci_drv_groups,
diff --git a/include/linux/pci.h b/include/linux/pci.h
index b957eeb89c7a..bbdf7d52e87b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -881,6 +881,8 @@ struct module;
  *		Useful for enabling wake-on-lan (NIC) or changing
  *		the power state of a device before reboot.
  *		e.g. drivers/net/e100.c.
+ * @async_shutdown_start:	This starts the asynchronous shutdown
+ * @async_shutdown_end:	This completes the started asynchronous shutdown
  * @sriov_configure: Optional driver callback to allow configuration of
  *		number of VFs to enable via sysfs "sriov_numvfs" file.
  * @sriov_set_msix_vec_count: PF Driver callback to change number of MSI-X
@@ -905,6 +907,8 @@ struct pci_driver {
 	int (*suspend)(struct pci_dev *dev, pm_message_t state); /* Device suspended */
 	int (*resume)(struct pci_dev *dev); /* Device woken up */
 	void (*shutdown)(struct pci_dev *dev);
+	void (*async_shutdown_start)(struct pci_dev *dev);
+	void (*async_shutdown_end)(struct pci_dev *dev);
 	int (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
 	int (*sriov_set_msix_vec_count)(struct pci_dev *vf, int msix_vec_count); /* On PF */
 	u32 (*sriov_get_vf_total_msix)(struct pci_dev *pf);

From patchwork Tue Apr 12 22:43:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tanjore Suresh
X-Patchwork-Id: 12811292
Date: Tue, 12 Apr 2022 15:43:48 -0700
In-Reply-To: <20220412224348.1038613-3-tansuresh@google.com>
Message-Id: <20220412224348.1038613-4-tansuresh@google.com>
References: <20220412224348.1038613-1-tansuresh@google.com>
 <20220412224348.1038613-2-tansuresh@google.com>
 <20220412224348.1038613-3-tansuresh@google.com>
X-Mailer: git-send-email 2.36.0.rc0.470.gd361397f0d-goog
Subject: [PATCH v2 3/3] nvme: Add async shutdown support
From: Tanjore Suresh
To: Greg Kroah-Hartman, "Rafael J . Wysocki", Christoph Hellwig,
 Sagi Grimberg, Bjorn Helgaas
Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-pci@vger.kernel.org, Tanjore Suresh
X-Mailing-List: linux-pci@vger.kernel.org

Hook the NVMe PCI driver into the asynchronous shutdown mechanism set
up for PCI drivers by providing both start and end shutdown routines at
the pci_driver structure level.

The async_shutdown_start routine starts the shutdown and does not wait
for it to complete. The async_shutdown_end routine waits for the
shutdown to complete on each controller that this driver instance
controls. This speeds up shutdown on systems that host many
controllers.
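As a rough illustration of why splitting the start from the wait helps,
the sketch below models controllers on a simulated clock and compares
the two orderings. It is a userspace model under stated assumptions, not
driver code: model_ctrl, model_now and the model_* functions are
hypothetical, the start phase is treated as instantaneous (the real
driver does not guarantee this), and every controller is assumed to make
progress independently once started.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a controller; times are in simulated seconds. */
struct model_ctrl {
	unsigned long shutdown_secs; /* how long the hardware needs */
	unsigned long started_at;    /* simulated time of the shutdown request */
};

static unsigned long model_now; /* simulated clock */

/* Start phase: request shutdown and return immediately. */
void model_shutdown_start(struct model_ctrl *c)
{
	c->started_at = model_now;
}

/* End phase: advance the clock only if this controller is not done yet. */
void model_wait_for_shutdown(struct model_ctrl *c)
{
	unsigned long done_at = c->started_at + c->shutdown_secs;

	if (model_now < done_at)
		model_now = done_at;
}

/* Synchronous ordering: start and wait for each controller in turn. */
unsigned long model_total_sync(struct model_ctrl *ctrls, size_t n)
{
	size_t i;

	assert(ctrls != NULL || n == 0);
	model_now = 0;
	for (i = 0; i < n; i++) {
		model_shutdown_start(&ctrls[i]);
		model_wait_for_shutdown(&ctrls[i]);
	}
	return model_now;
}

/* Asynchronous ordering: start every controller, then wait for each. */
unsigned long model_total_async(struct model_ctrl *ctrls, size_t n)
{
	size_t i;

	assert(ctrls != NULL || n == 0);
	model_now = 0;
	for (i = 0; i < n; i++)
		model_shutdown_start(&ctrls[i]);
	for (i = 0; i < n; i++)
		model_wait_for_shutdown(&ctrls[i]);
	return model_now;
}
```

With 16 controllers that each need 5 seconds, the synchronous ordering
in this model takes 80 simulated seconds and the asynchronous ordering
takes 5, because the waits overlap instead of adding up.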
Signed-off-by: Tanjore Suresh
---
 drivers/nvme/host/core.c | 28 ++++++++++----
 drivers/nvme/host/nvme.h |  8 ++++
 drivers/nvme/host/pci.c  | 80 +++++++++++++++++++++++++---------------
 3 files changed, 80 insertions(+), 36 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 677fa4bf76d3..24b08789fd34 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2173,16 +2173,30 @@ EXPORT_SYMBOL_GPL(nvme_enable_ctrl);
 
 int nvme_shutdown_ctrl(struct nvme_ctrl *ctrl)
 {
-	unsigned long timeout = jiffies + (ctrl->shutdown_timeout * HZ);
-	u32 csts;
 	int ret;
 
+	ret = nvme_shutdown_ctrl_start(ctrl);
+	if (ret)
+		return ret;
+	return nvme_wait_for_shutdown_cmpl(ctrl);
+}
+EXPORT_SYMBOL_GPL(nvme_shutdown_ctrl);
+
+int nvme_shutdown_ctrl_start(struct nvme_ctrl *ctrl)
+{
+
 	ctrl->ctrl_config &= ~NVME_CC_SHN_MASK;
 	ctrl->ctrl_config |= NVME_CC_SHN_NORMAL;
 
-	ret = ctrl->ops->reg_write32(ctrl, NVME_REG_CC, ctrl->ctrl_config);
-	if (ret)
-		return ret;
+	return ctrl->ops->reg_write32(ctrl, NVME_REG_CC, ctrl->ctrl_config);
+}
+EXPORT_SYMBOL_GPL(nvme_shutdown_ctrl_start);
+
+int nvme_wait_for_shutdown_cmpl(struct nvme_ctrl *ctrl)
+{
+	unsigned long deadline = jiffies + (ctrl->shutdown_timeout * HZ);
+	u32 csts;
+	int ret;
 
 	while ((ret = ctrl->ops->reg_read32(ctrl, NVME_REG_CSTS, &csts)) == 0) {
 		if ((csts & NVME_CSTS_SHST_MASK) == NVME_CSTS_SHST_CMPLT)
@@ -2191,7 +2205,7 @@ int nvme_shutdown_ctrl(struct nvme_ctrl *ctrl)
 		msleep(100);
 		if (fatal_signal_pending(current))
 			return -EINTR;
-		if (time_after(jiffies, timeout)) {
+		if (time_after(jiffies, deadline)) {
 			dev_err(ctrl->device,
 				"Device shutdown incomplete; abort shutdown\n");
 			return -ENODEV;
@@ -2200,7 +2214,7 @@ int nvme_shutdown_ctrl(struct nvme_ctrl *ctrl)
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(nvme_shutdown_ctrl);
+EXPORT_SYMBOL_GPL(nvme_wait_for_shutdown_cmpl);
 
 static int nvme_configure_timestamp(struct nvme_ctrl *ctrl)
 {
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index f4b674a8ce20..87f5803ef577 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -170,6 +170,12 @@ enum {
 	NVME_REQ_USERCMD = (1 << 1),
 };
 
+enum shutdown_type {
+	DO_NOT_SHUTDOWN = 0,
+	SHUTDOWN_TYPE_SYNC = 1,
+	SHUTDOWN_TYPE_ASYNC = 2,
+};
+
 static inline struct nvme_request *nvme_req(struct request *req)
 {
 	return blk_mq_rq_to_pdu(req);
@@ -672,6 +678,8 @@ bool nvme_wait_reset(struct nvme_ctrl *ctrl);
 int nvme_disable_ctrl(struct nvme_ctrl *ctrl);
 int nvme_enable_ctrl(struct nvme_ctrl *ctrl);
 int nvme_shutdown_ctrl(struct nvme_ctrl *ctrl);
+int nvme_shutdown_ctrl_start(struct nvme_ctrl *ctrl);
+int nvme_wait_for_shutdown_cmpl(struct nvme_ctrl *ctrl);
 int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 		const struct nvme_ctrl_ops *ops, unsigned long quirks);
 void nvme_uninit_ctrl(struct nvme_ctrl *ctrl);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 2e98ac3f3ad6..e812509462e8 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -107,7 +107,7 @@ MODULE_PARM_DESC(noacpi, "disable acpi bios quirks");
 struct nvme_dev;
 struct nvme_queue;
 
-static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown);
+static void nvme_dev_disable(struct nvme_dev *dev, int shutdown_type);
 static bool __nvme_disable_io_queues(struct nvme_dev *dev, u8 opcode);
 
 /*
@@ -1357,7 +1357,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 	 */
 	if (nvme_should_reset(dev, csts)) {
 		nvme_warn_reset(dev, csts);
-		nvme_dev_disable(dev, false);
+		nvme_dev_disable(dev, DO_NOT_SHUTDOWN);
 		nvme_reset_ctrl(&dev->ctrl);
 		return BLK_EH_DONE;
 	}
@@ -1392,7 +1392,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 			 "I/O %d QID %d timeout, disable controller\n",
 			 req->tag, nvmeq->qid);
 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
-		nvme_dev_disable(dev, true);
+		nvme_dev_disable(dev, SHUTDOWN_TYPE_SYNC);
 		return BLK_EH_DONE;
 	case NVME_CTRL_RESETTING:
 		return BLK_EH_RESET_TIMER;
@@ -1410,7 +1410,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 			 "I/O %d QID %d timeout, reset controller\n",
 			 req->tag, nvmeq->qid);
 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
-		nvme_dev_disable(dev, false);
+		nvme_dev_disable(dev, DO_NOT_SHUTDOWN);
 		nvme_reset_ctrl(&dev->ctrl);
 
 		return BLK_EH_DONE;
@@ -1503,11 +1503,13 @@ static void nvme_suspend_io_queues(struct nvme_dev *dev)
 		nvme_suspend_queue(&dev->queues[i]);
 }
 
-static void nvme_disable_admin_queue(struct nvme_dev *dev, bool shutdown)
+static void nvme_disable_admin_queue(struct nvme_dev *dev, int shutdown_type)
 {
 	struct nvme_queue *nvmeq = &dev->queues[0];
 
-	if (shutdown)
+	if (shutdown_type == SHUTDOWN_TYPE_ASYNC)
+		nvme_shutdown_ctrl_start(&dev->ctrl);
+	else if (shutdown_type == SHUTDOWN_TYPE_SYNC)
 		nvme_shutdown_ctrl(&dev->ctrl);
 	else
 		nvme_disable_ctrl(&dev->ctrl);
@@ -2669,7 +2671,7 @@ static void nvme_pci_disable(struct nvme_dev *dev)
 	}
 }
 
-static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
+static void nvme_dev_disable(struct nvme_dev *dev, int shutdown_type)
 {
 	bool dead = true, freeze = false;
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
@@ -2691,14 +2693,14 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 	 * Give the controller a chance to complete all entered requests if
 	 * doing a safe shutdown.
 	 */
-	if (!dead && shutdown && freeze)
+	if (!dead && (shutdown_type != DO_NOT_SHUTDOWN) && freeze)
 		nvme_wait_freeze_timeout(&dev->ctrl, NVME_IO_TIMEOUT);
 
 	nvme_stop_queues(&dev->ctrl);
 
 	if (!dead && dev->ctrl.queue_count > 0) {
 		nvme_disable_io_queues(dev);
-		nvme_disable_admin_queue(dev, shutdown);
+		nvme_disable_admin_queue(dev, shutdown_type);
 	}
 	nvme_suspend_io_queues(dev);
 	nvme_suspend_queue(&dev->queues[0]);
@@ -2710,12 +2712,12 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 	blk_mq_tagset_wait_completed_request(&dev->tagset);
 	blk_mq_tagset_wait_completed_request(&dev->admin_tagset);
 
-	/*
-	 * The driver will not be starting up queues again if shutting down so
-	 * must flush all entered requests to their failed completion to avoid
-	 * deadlocking blk-mq hot-cpu notifier.
-	 */
-	if (shutdown) {
+	if (shutdown_type == SHUTDOWN_TYPE_SYNC) {
+		/*
+		 * The driver will not be starting up queues again if shutting down so
+		 * must flush all entered requests to their failed completion to avoid
+		 * deadlocking blk-mq hot-cpu notifier.
+		 */
 		nvme_start_queues(&dev->ctrl);
 		if (dev->ctrl.admin_q && !blk_queue_dying(dev->ctrl.admin_q))
 			nvme_start_admin_queue(&dev->ctrl);
@@ -2723,11 +2725,11 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 	mutex_unlock(&dev->shutdown_lock);
 }
 
-static int nvme_disable_prepare_reset(struct nvme_dev *dev, bool shutdown)
+static int nvme_disable_prepare_reset(struct nvme_dev *dev, int type)
 {
 	if (!nvme_wait_reset(&dev->ctrl))
 		return -EBUSY;
-	nvme_dev_disable(dev, shutdown);
+	nvme_dev_disable(dev, type);
 	return 0;
 }
 
@@ -2785,7 +2787,7 @@ static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
 	 */
 	nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
 	nvme_get_ctrl(&dev->ctrl);
-	nvme_dev_disable(dev, false);
+	nvme_dev_disable(dev, DO_NOT_SHUTDOWN);
 	nvme_kill_queues(&dev->ctrl);
 	if (!queue_work(nvme_wq, &dev->remove_work))
 		nvme_put_ctrl(&dev->ctrl);
@@ -2810,7 +2812,7 @@ static void nvme_reset_work(struct work_struct *work)
 	 * moving on.
 	 */
 	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
-		nvme_dev_disable(dev, false);
+		nvme_dev_disable(dev, DO_NOT_SHUTDOWN);
 	nvme_sync_queues(&dev->ctrl);
 
 	mutex_lock(&dev->shutdown_lock);
@@ -3151,7 +3153,7 @@ static void nvme_reset_prepare(struct pci_dev *pdev)
 	 * state as pci_dev device lock is held, making it impossible to race
 	 * with ->remove().
 	 */
-	nvme_disable_prepare_reset(dev, false);
+	nvme_disable_prepare_reset(dev, DO_NOT_SHUTDOWN);
 	nvme_sync_queues(&dev->ctrl);
 }
 
@@ -3163,13 +3165,32 @@ static void nvme_reset_done(struct pci_dev *pdev)
 	flush_work(&dev->ctrl.reset_work);
 }
 
-static void nvme_shutdown(struct pci_dev *pdev)
+static void nvme_async_shutdown_start(struct pci_dev *pdev)
 {
 	struct nvme_dev *dev = pci_get_drvdata(pdev);
 
-	nvme_disable_prepare_reset(dev, true);
+	nvme_disable_prepare_reset(dev, SHUTDOWN_TYPE_ASYNC);
 }
 
+static void nvme_async_shutdown_end(struct pci_dev *pdev)
+{
+	struct nvme_dev *dev = pci_get_drvdata(pdev);
+
+	mutex_lock(&dev->shutdown_lock);
+	nvme_wait_for_shutdown_cmpl(&dev->ctrl);
+
+	/*
+	 * The driver will not be starting up queues again if shutting down so
+	 * must flush all entered requests to their failed completion to avoid
+	 * deadlocking blk-mq hot-cpu notifier.
+	 */
+	nvme_start_queues(&dev->ctrl);
+	if (dev->ctrl.admin_q && !blk_queue_dying(dev->ctrl.admin_q))
+		nvme_start_admin_queue(&dev->ctrl);
+
+	mutex_unlock(&dev->shutdown_lock);
+
+}
 static void nvme_remove_attrs(struct nvme_dev *dev)
 {
 	if (dev->attrs_added)
@@ -3191,13 +3212,13 @@ static void nvme_remove(struct pci_dev *pdev)
 
 	if (!pci_device_is_present(pdev)) {
 		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DEAD);
-		nvme_dev_disable(dev, true);
+		nvme_dev_disable(dev, SHUTDOWN_TYPE_SYNC);
 	}
 
 	flush_work(&dev->ctrl.reset_work);
 	nvme_stop_ctrl(&dev->ctrl);
 	nvme_remove_namespaces(&dev->ctrl);
-	nvme_dev_disable(dev, true);
+	nvme_dev_disable(dev, SHUTDOWN_TYPE_SYNC);
 	nvme_remove_attrs(dev);
 	nvme_free_host_mem(dev);
 	nvme_dev_remove_admin(dev);
@@ -3259,7 +3280,7 @@ static int nvme_suspend(struct device *dev)
 	if (pm_suspend_via_firmware() || !ctrl->npss ||
 	    !pcie_aspm_enabled(pdev) ||
 	    (ndev->ctrl.quirks & NVME_QUIRK_SIMPLE_SUSPEND))
-		return nvme_disable_prepare_reset(ndev, true);
+		return nvme_disable_prepare_reset(ndev, SHUTDOWN_TYPE_SYNC);
 
 	nvme_start_freeze(ctrl);
 	nvme_wait_freeze(ctrl);
@@ -3302,7 +3323,7 @@ static int nvme_suspend(struct device *dev)
 		 * Clearing npss forces a controller reset on resume. The
 		 * correct value will be rediscovered then.
 		 */
-		ret = nvme_disable_prepare_reset(ndev, true);
+		ret = nvme_disable_prepare_reset(ndev, SHUTDOWN_TYPE_SYNC);
 		ctrl->npss = 0;
 	}
 unfreeze:
@@ -3314,7 +3335,7 @@ static int nvme_simple_suspend(struct device *dev)
 {
 	struct nvme_dev *ndev = pci_get_drvdata(to_pci_dev(dev));
 
-	return nvme_disable_prepare_reset(ndev, true);
+	return nvme_disable_prepare_reset(ndev, SHUTDOWN_TYPE_SYNC);
 }
 
 static int nvme_simple_resume(struct device *dev)
@@ -3351,7 +3372,7 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
 	case pci_channel_io_frozen:
 		dev_warn(dev->ctrl.device,
 			"frozen state error detected, reset controller\n");
-		nvme_dev_disable(dev, false);
+		nvme_dev_disable(dev, DO_NOT_SHUTDOWN);
 		return PCI_ERS_RESULT_NEED_RESET;
 	case pci_channel_io_perm_failure:
 		dev_warn(dev->ctrl.device,
@@ -3478,7 +3499,8 @@ static struct pci_driver nvme_driver = {
 	.id_table = nvme_id_table,
 	.probe = nvme_probe,
 	.remove = nvme_remove,
-	.shutdown = nvme_shutdown,
+	.async_shutdown_start = nvme_async_shutdown_start,
+	.async_shutdown_end = nvme_async_shutdown_end,
 #ifdef CONFIG_PM_SLEEP
 	.driver = {
 		.pm = &nvme_dev_pm_ops,