From patchwork Mon May 21 14:04:13 2018
X-Patchwork-Submitter: Keith Busch
X-Patchwork-Id: 10415393
Date: Mon, 21 May 2018 08:04:13 -0600
From: Keith Busch
To: Ming Lei
Cc: Jens Axboe, linux-block@vger.kernel.org, Laurence Oberman,
 Sagi Grimberg, James Smart, linux-nvme@lists.infradead.org,
 Keith Busch, Johannes Thumshirn, Christoph Hellwig
Subject: Re: [PATCH 1/6] nvme: Sync request queues on reset
Message-ID: <20180521140413.GA5528@localhost.localdomain>
References: <20180518163823.27820-1-keith.busch@intel.com>
 <20180518223210.GB18334@ming.t460p>
 <20180518234408.GA31749@localhost.localdomain>
 <20180519000141.GB19799@ming.t460p>
In-Reply-To: <20180519000141.GB19799@ming.t460p>
X-Mailing-List: linux-block@vger.kernel.org

On Sat, May 19, 2018 at 08:01:42AM +0800, Ming Lei wrote:
> > You keep saying that, but the controller state is global to the
> > controller. It doesn't matter which namespace request_queue started
> > the reset: every namespace's request_queue sees the RESETTING
> > controller state.
>
> When the timeouts come, the global RESETTING state may not have been
> updated yet, so not all of the timeouts may observe that state.

Even prior to the RESETTING state, every single command, no matter which
namespace or request_queue it came in on, is reclaimed by the driver
(sketched below). There *should* be no requests left to time out after
nvme_dev_disable is called, because the nvme driver has returned control
of all requests in the tagset to blk-mq.

In any case, if blk-mq decides it won't complete those requests, we can
just swap the order in the reset_work: sync first, then unconditionally
disable.
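To be concrete about "reclaimed": nvme_dev_disable() walks the tagsets
and cancels every started request. Roughly like this, paraphrasing from
pci.c/core.c rather than quoting the exact hunks:

	/* In nvme_dev_disable(): hand all outstanding requests back to blk-mq */
	blk_mq_tagset_busy_iter(&dev->tagset, nvme_cancel_request, &dev->ctrl);
	blk_mq_tagset_busy_iter(&dev->admin_tagset, nvme_cancel_request, &dev->ctrl);

	/* core.c: fail each request with an abort status and complete it */
	void nvme_cancel_request(struct request *req, void *data, bool reserved)
	{
		nvme_req(req)->status = NVME_SC_ABORT_REQ;
		blk_mq_complete_request(req);
	}

Once that iteration finishes, blk-mq owns every request again, which is
why the driver doesn't expect any further timeouts for them.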
Does the following snippet look more okay?

---

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 17a0190bd88f..42af077ee07a 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2307,11 +2307,14 @@ static void nvme_reset_work(struct work_struct *work)
 		goto out;
 
 	/*
-	 * If we're called to reset a live controller first shut it down before
-	 * moving on.
+	 * Ensure there is no timeout work in progress prior to forcefully
+	 * disabling the queue. There is no harm in disabling the device even
+	 * when it was already disabled, as this will forcefully reclaim any
+	 * IOs that are stuck due to blk-mq's timeout handling that prevents
+	 * timed out requests from completing.
 	 */
-	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
-		nvme_dev_disable(dev, false);
+	nvme_sync_queues(&dev->ctrl);
+	nvme_dev_disable(dev, false);
 
 	/*
 	 * Introduce CONNECTING state from nvme-fc/rdma transports to mark the
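For reference, nvme_sync_queues() above is the helper patch 1/6 of this
series introduces; the intent is to flush any pending timeout work on
every namespace queue before disabling. A minimal sketch, assuming it
iterates the namespace list under namespaces_rwsem (the exact locking in
the series may differ):

	void nvme_sync_queues(struct nvme_ctrl *ctrl)
	{
		struct nvme_ns *ns;

		down_read(&ctrl->namespaces_rwsem);
		list_for_each_entry(ns, &ctrl->namespaces, list)
			blk_sync_queue(ns->queue);	/* flushes q->timeout_work */
		up_read(&ctrl->namespaces_rwsem);
	}

Doing the sync before an unconditional nvme_dev_disable() means no
timeout handler can still be running by the time we reclaim the
requests, which is the ordering the new comment describes.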