From patchwork Sat Oct 10 00:57:00 2015
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 7365161
Subject: [PATCH v2 18/20] block: notify queue death confirmation
From: Dan Williams
To: linux-nvdimm@lists.01.org
Cc: Jens Axboe, linux-mm@kvack.org, hch@lst.de, linux-kernel@vger.kernel.org
Date: Fri, 09 Oct 2015 20:57:00 -0400
Message-ID: <20151010005700.17221.88874.stgit@dwillia2-desk3.jf.intel.com>
In-Reply-To: <20151010005522.17221.87557.stgit@dwillia2-desk3.jf.intel.com>
References: <20151010005522.17221.87557.stgit@dwillia2-desk3.jf.intel.com>
User-Agent: StGit/0.17.1-9-g687f

The pmem driver arranges for references to be taken against the queue
while pages it allocated via devm_memremap_pages() are in use.  At
shutdown time, before those pages can be deallocated, they need to be
truncated, unmapped, and guaranteed to be idle.  Scanning the pages to
initiate truncation can only be done once we are certain no new page
references will be taken.  Once the blk queue percpu_ref is confirmed
dead __get_dev_pagemap() will cease allowing new references and we can
reclaim these "device" pages.

Cc: Jens Axboe
Cc: Christoph Hellwig
Cc: Ross Zwisler
Signed-off-by: Dan Williams
---
 block/blk-core.c       |   12 +++++++++---
 block/blk-mq.c         |   19 +++++++++++++++----
 include/linux/blkdev.h |    4 +++-
 3 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 9b4d735cb5b8..74aaa208a8e9 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -516,6 +516,12 @@ void blk_set_queue_dying(struct request_queue *q)
 }
 EXPORT_SYMBOL_GPL(blk_set_queue_dying);
 
+void blk_wait_queue_dead(struct request_queue *q)
+{
+	wait_event(q->q_freeze_wq, q->q_usage_dead);
+}
+EXPORT_SYMBOL(blk_wait_queue_dead);
+
 /**
  * blk_cleanup_queue - shutdown a request queue
  * @q: request queue to shutdown
@@ -638,7 +644,7 @@ int blk_queue_enter(struct request_queue *q, gfp_t gfp)
 		if (!(gfp & __GFP_WAIT))
 			return -EBUSY;
 
-		ret = wait_event_interruptible(q->mq_freeze_wq,
+		ret = wait_event_interruptible(q->q_freeze_wq,
 				!atomic_read(&q->mq_freeze_depth) ||
 				blk_queue_dying(q));
 		if (blk_queue_dying(q))
@@ -658,7 +664,7 @@ static void blk_queue_usage_counter_release(struct percpu_ref *ref)
 	struct request_queue *q =
 		container_of(ref, struct request_queue, q_usage_counter);
 
-	wake_up_all(&q->mq_freeze_wq);
+	wake_up_all(&q->q_freeze_wq);
 }
 
 struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
@@ -720,7 +726,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 	q->bypass_depth = 1;
 	__set_bit(QUEUE_FLAG_BYPASS, &q->queue_flags);
 
-	init_waitqueue_head(&q->mq_freeze_wq);
+	init_waitqueue_head(&q->q_freeze_wq);
 
 	/*
 	 * Init percpu_ref in atomic mode so that it's faster to shutdown.
diff --git a/block/blk-mq.c b/block/blk-mq.c
index c371aeda2986..d52f9d91f5c1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -77,13 +77,23 @@ static void blk_mq_hctx_clear_pending(struct blk_mq_hw_ctx *hctx,
 	clear_bit(CTX_TO_BIT(hctx, ctx), &bm->word);
 }
 
+static void blk_confirm_queue_death(struct percpu_ref *ref)
+{
+	struct request_queue *q = container_of(ref, typeof(*q),
+			q_usage_counter);
+
+	q->q_usage_dead = 1;
+	wake_up_all(&q->q_freeze_wq);
+}
+
 void blk_mq_freeze_queue_start(struct request_queue *q)
 {
 	int freeze_depth;
 
 	freeze_depth = atomic_inc_return(&q->mq_freeze_depth);
 	if (freeze_depth == 1) {
-		percpu_ref_kill(&q->q_usage_counter);
+		percpu_ref_kill_and_confirm(&q->q_usage_counter,
+				blk_confirm_queue_death);
 		blk_mq_run_hw_queues(q, false);
 	}
 }
@@ -91,7 +101,7 @@ EXPORT_SYMBOL_GPL(blk_mq_freeze_queue_start);
 
 static void blk_mq_freeze_queue_wait(struct request_queue *q)
 {
-	wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter));
+	wait_event(q->q_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter));
 }
 
 /*
@@ -129,7 +139,8 @@ void blk_mq_unfreeze_queue(struct request_queue *q)
 	WARN_ON_ONCE(freeze_depth < 0);
 	if (!freeze_depth) {
 		percpu_ref_reinit(&q->q_usage_counter);
-		wake_up_all(&q->mq_freeze_wq);
+		q->q_usage_dead = 0;
+		wake_up_all(&q->q_freeze_wq);
 	}
 }
 EXPORT_SYMBOL_GPL(blk_mq_unfreeze_queue);
@@ -148,7 +159,7 @@ void blk_mq_wake_waiters(struct request_queue *q)
 	 * dying, we need to ensure that processes currently waiting on
 	 * the queue are notified as well.
 	 */
-	wake_up_all(&q->mq_freeze_wq);
+	wake_up_all(&q->q_freeze_wq);
 }
 
 bool blk_mq_can_queue(struct blk_mq_hw_ctx *hctx)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index fb3e6886c479..a1340654e360 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -427,6 +427,7 @@ struct request_queue {
 	 */
 	unsigned int		flush_flags;
 	unsigned int		flush_not_queueable:1;
+	unsigned int		q_usage_dead:1;
 	struct blk_flush_queue	*fq;
 
 	struct list_head	requeue_list;
@@ -449,7 +450,7 @@ struct request_queue {
 	struct throtl_data *td;
 #endif
 	struct rcu_head		rcu_head;
-	wait_queue_head_t	mq_freeze_wq;
+	wait_queue_head_t	q_freeze_wq;
 	struct percpu_ref	q_usage_counter;
 	struct list_head	all_q_node;
 
@@ -949,6 +950,7 @@ extern struct request_queue *blk_init_queue_node(request_fn_proc *rfn,
 extern struct request_queue *blk_init_queue(request_fn_proc *, spinlock_t *);
 extern struct request_queue *blk_init_allocated_queue(struct request_queue *,
 						      request_fn_proc *, spinlock_t *);
+extern void blk_wait_queue_dead(struct request_queue *q);
 extern void blk_cleanup_queue(struct request_queue *);
 extern void blk_queue_make_request(struct request_queue *, make_request_fn *);
 extern void blk_queue_bounce_limit(struct request_queue *, u64);
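
Illustrative note, not part of the patch: the ordering that the changelog
relies on (kill the reference, wait for the kill to be confirmed, only then
treat new references as refused) can be modeled in plain user-space C. The
sketch below is a single-threaded approximation under stated assumptions.
Every model_* name is hypothetical and exists only for this example, the
"grace period" is an explicit function call rather than RCU, and the check
in model_get_pagemap() is only a guess at the kind of test that
__get_dev_pagemap() performs, based on the changelog wording.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct percpu_ref; not the kernel type. */
struct model_ref;
typedef void (*model_confirm_fn)(struct model_ref *ref);

struct model_ref {
	bool dying;			/* set by the kill step */
	model_confirm_fn confirm;	/* pending confirmation callback */
};

/* Hypothetical stand-in for the request_queue fields the patch touches. */
struct model_queue {
	struct model_ref usage_counter;	/* models q->q_usage_counter */
	bool usage_dead;		/* models q->q_usage_dead */
};

#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Models blk_confirm_queue_death(): record that death is confirmed. */
static void model_confirm_queue_death(struct model_ref *ref)
{
	struct model_queue *q =
		model_container_of(ref, struct model_queue, usage_counter);

	q->usage_dead = true;
}

/* Models percpu_ref_kill_and_confirm(): mark dying, defer the callback. */
static void model_ref_kill_and_confirm(struct model_ref *ref,
				       model_confirm_fn fn)
{
	ref->dying = true;
	ref->confirm = fn;
}

/* Assumption: stands in for the point at which the kill is globally seen. */
static void model_grace_period_elapsed(struct model_ref *ref)
{
	if (ref->dying && ref->confirm) {
		model_confirm_fn fn = ref->confirm;

		ref->confirm = NULL;
		fn(ref);
	}
}

/* Assumed shape of the check: refuse new references once death is confirmed. */
static bool model_get_pagemap(struct model_queue *q)
{
	return !q->usage_dead;
}

/* Walk through the shutdown ordering; returns 0 if it behaves as described. */
int model_shutdown_demo(void)
{
	struct model_queue q = {
		.usage_counter = { .dying = false, .confirm = NULL },
		.usage_dead = false,
	};
	bool before_kill, before_confirm, after_confirm;

	before_kill = model_get_pagemap(&q);
	model_ref_kill_and_confirm(&q.usage_counter, model_confirm_queue_death);
	before_confirm = model_get_pagemap(&q);	/* death not yet confirmed */
	model_grace_period_elapsed(&q.usage_counter);
	after_confirm = model_get_pagemap(&q);

	assert(q.usage_dead);
	return (before_kill && before_confirm && !after_confirm) ? 0 : 1;
}
```

In this model, the moment after model_grace_period_elapsed() returns
corresponds to the point where blk_wait_queue_dead() would return in the
patch, which is when scanning and truncating the device pages becomes safe
according to the changelog. Calling model_shutdown_demo() from a main()
function is enough to exercise the sequence.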