From patchwork Fri Jun 14 13:13:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995263 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D36171515 for ; Fri, 14 Jun 2019 13:14:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C54E8286BE for ; Fri, 14 Jun 2019 13:14:11 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B99CF286C7; Fri, 14 Jun 2019 13:14:11 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 37A9A286BE for ; Fri, 14 Jun 2019 13:14:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728014AbfFNNOK (ORCPT ); Fri, 14 Jun 2019 09:14:10 -0400 Received: from mx2.suse.de ([195.135.220.15]:45250 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726874AbfFNNOJ (ORCPT ); Fri, 14 Jun 2019 09:14:09 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 976D6AB8C; Fri, 14 Jun 2019 13:14:07 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li , stable@vger.kernel.org, Tang Junhui Subject: [PATCH 01/29] bcache: Revert "bcache: fix high CPU occupancy during journal" Date: Fri, 14 Jun 2019 21:13:30 +0800 Message-Id: <20190614131358.2771-2-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This reverts commit c4dc2497d50d9c6fb16aa0d07b6a14f3b2adb1e0. This patch enlarges a race between normal btree flush code path and flush_btree_write(), which causes deadlock when journal space is exhausted. Reverts this patch makes the race window from 128 btree nodes to only 1 btree nodes. Fixes: c4dc2497d50d ("bcache: fix high CPU occupancy during journal") Signed-off-by: Coly Li Cc: stable@vger.kernel.org Cc: Tang Junhui --- drivers/md/bcache/bcache.h | 2 -- drivers/md/bcache/journal.c | 47 +++++++++++++++------------------------------ drivers/md/bcache/util.h | 2 -- 3 files changed, 15 insertions(+), 36 deletions(-) diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h index fdf75352e16a..e30a983a68cd 100644 --- a/drivers/md/bcache/bcache.h +++ b/drivers/md/bcache/bcache.h @@ -726,8 +726,6 @@ struct cache_set { #define BUCKET_HASH_BITS 12 struct hlist_head bucket_hash[1 << BUCKET_HASH_BITS]; - - DECLARE_HEAP(struct btree *, flush_btree); }; struct bbio { diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index 12dae9348147..621ebf5c62a5 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -391,12 +391,6 @@ int bch_journal_replay(struct cache_set *s, struct list_head *list) } /* Journalling */ -#define journal_max_cmp(l, r) \ - (fifo_idx(&c->journal.pin, btree_current_write(l)->journal) < \ - fifo_idx(&(c)->journal.pin, btree_current_write(r)->journal)) -#define journal_min_cmp(l, r) \ - (fifo_idx(&c->journal.pin, btree_current_write(l)->journal) > \ - fifo_idx(&(c)->journal.pin, btree_current_write(r)->journal)) static void btree_flush_write(struct cache_set *c) { @@ -404,35 +398,25 @@ static void btree_flush_write(struct cache_set *c) * Try to find the btree node with that references the oldest journal * entry, best is our current candidate and is locked if non NULL: */ - struct btree *b; - int i; + struct btree *b, *best; + unsigned int i; atomic_long_inc(&c->flush_write); - retry: - spin_lock(&c->journal.lock); - if (heap_empty(&c->flush_btree)) { - for_each_cached_btree(b, c, i) - if (btree_current_write(b)->journal) { - if (!heap_full(&c->flush_btree)) - heap_add(&c->flush_btree, b, - journal_max_cmp); - else if (journal_max_cmp(b, - heap_peek(&c->flush_btree))) { - c->flush_btree.data[0] = b; - heap_sift(&c->flush_btree, 0, - journal_max_cmp); - } + best = NULL; + + for_each_cached_btree(b, c, i) + if (btree_current_write(b)->journal) { + if (!best) + best = b; + else if (journal_pin_cmp(c, + btree_current_write(best)->journal, + btree_current_write(b)->journal)) { + best = b; } + } - for (i = c->flush_btree.used / 2 - 1; i >= 0; --i) - heap_sift(&c->flush_btree, i, journal_min_cmp); - } - - b = NULL; - heap_pop(&c->flush_btree, b, journal_min_cmp); - spin_unlock(&c->journal.lock); - + b = best; if (b) { mutex_lock(&b->write_lock); if (!btree_current_write(b)->journal) { @@ -870,8 +854,7 @@ int bch_journal_alloc(struct cache_set *c) j->w[0].c = c; j->w[1].c = c; - if (!(init_heap(&c->flush_btree, 128, GFP_KERNEL)) || - !(init_fifo(&j->pin, JOURNAL_PIN, GFP_KERNEL)) || + if (!(init_fifo(&j->pin, JOURNAL_PIN, GFP_KERNEL)) || !(j->w[0].data = (void *) __get_free_pages(GFP_KERNEL, JSET_BITS)) || !(j->w[1].data = (void *) __get_free_pages(GFP_KERNEL, JSET_BITS))) return -ENOMEM; diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h index 1fbced94e4cc..c029f7443190 100644 --- a/drivers/md/bcache/util.h +++ b/drivers/md/bcache/util.h @@ -113,8 +113,6 @@ do { \ #define heap_full(h) ((h)->used == (h)->size) -#define heap_empty(h) ((h)->used == 0) - #define DECLARE_FIFO(type, name) \ struct { \ size_t front, back, size, mask; \ From patchwork Fri Jun 14 13:13:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995265 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 41D0A1515 for ; Fri, 14 Jun 2019 13:14:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3422B286BF for ; Fri, 14 Jun 2019 13:14:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 28B6E286C4; Fri, 14 Jun 2019 13:14:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A768A286BF for ; Fri, 14 Jun 2019 13:14:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728086AbfFNNON (ORCPT ); Fri, 14 Jun 2019 09:14:13 -0400 Received: from mx2.suse.de ([195.135.220.15]:45312 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727998AbfFNNON (ORCPT ); Fri, 14 Jun 2019 09:14:13 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 37605AB8C; Fri, 14 Jun 2019 13:14:12 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li , stable@vger.kernel.org, Shenghui Wang Subject: [PATCH 02/29] bcache: Revert "bcache: free heap cache_set->flush_btree in bch_journal_free" Date: Fri, 14 Jun 2019 21:13:31 +0800 Message-Id: <20190614131358.2771-3-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This reverts commit 6268dc2c4703aabfb0b35681be709acf4c2826c6. This patch depends on commit c4dc2497d50d ("bcache: fix high CPU occupancy during journal") which is reverted in previous patch. So revert this one too. Fixes: 6268dc2c4703 ("bcache: free heap cache_set->flush_btree in bch_journal_free") Signed-off-by: Coly Li Cc: stable@vger.kernel.org Cc: Shenghui Wang --- drivers/md/bcache/journal.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index 621ebf5c62a5..f18c1e7b012c 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -839,7 +839,6 @@ void bch_journal_free(struct cache_set *c) free_pages((unsigned long) c->journal.w[1].data, JSET_BITS); free_pages((unsigned long) c->journal.w[0].data, JSET_BITS); free_fifo(&c->journal.pin); - free_heap(&c->flush_btree); } int bch_journal_alloc(struct cache_set *c) From patchwork Fri Jun 14 13:13:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995267 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5C63714BB for ; Fri, 14 Jun 2019 13:14:19 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4CEA4286B4 for ; Fri, 14 Jun 2019 13:14:19 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 41550286C2; Fri, 14 Jun 2019 13:14:19 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D7C34286B4 for ; Fri, 14 Jun 2019 13:14:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728094AbfFNNOS (ORCPT ); Fri, 14 Jun 2019 09:14:18 -0400 Received: from mx2.suse.de ([195.135.220.15]:45348 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727998AbfFNNOR (ORCPT ); Fri, 14 Jun 2019 09:14:17 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 57630AE1F; Fri, 14 Jun 2019 13:14:16 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 03/29] bcache: add code comments for journal_read_bucket() Date: Fri, 14 Jun 2019 21:13:32 +0800 Message-Id: <20190614131358.2771-4-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch adds more code comments in journal_read_bucket(), this is an effort to make the code to be more understandable. Signed-off-by: Coly Li --- drivers/md/bcache/journal.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index f18c1e7b012c..50c9a1e405e6 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -100,6 +100,20 @@ reread: left = ca->sb.bucket_size - offset; blocks = set_blocks(j, block_bytes(ca->set)); + /* + * Nodes in 'list' are in linear increasing order of + * i->j.seq, the node on head has the smallest (oldest) + * journal seq, the node on tail has the biggest + * (latest) journal seq. + */ + + /* + * Check from the oldest jset for last_seq. If + * i->j.seq < j->last_seq, it means the oldest jset + * in list is expired and useless, remove it from + * this list. Otherwise, j is a condidate jset for + * further following checks. + */ while (!list_empty(list)) { i = list_first_entry(list, struct journal_replay, list); @@ -109,13 +123,22 @@ reread: left = ca->sb.bucket_size - offset; kfree(i); } + /* iterate list in reverse order (from latest jset) */ list_for_each_entry_reverse(i, list, list) { if (j->seq == i->j.seq) goto next_set; + /* + * if j->seq is less than any i->j.last_seq + * in list, j is an expired and useless jset. + */ if (j->seq < i->j.last_seq) goto next_set; + /* + * 'where' points to first jset in list which + * is elder then j. + */ if (j->seq > i->j.seq) { where = &i->list; goto add; @@ -129,6 +152,7 @@ reread: left = ca->sb.bucket_size - offset; if (!i) return -ENOMEM; memcpy(&i->j, j, bytes); + /* Add to the location after 'where' points to */ list_add(&i->list, where); ret = 1; From patchwork Fri Jun 14 13:13:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995269 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3FCAE14BB for ; Fri, 14 Jun 2019 13:14:23 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 31A6C286BF for ; Fri, 14 Jun 2019 13:14:23 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 25105286BD; Fri, 14 Jun 2019 13:14:23 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C42EB286C0 for ; Fri, 14 Jun 2019 13:14:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728107AbfFNNOW (ORCPT ); Fri, 14 Jun 2019 09:14:22 -0400 Received: from mx2.suse.de ([195.135.220.15]:45386 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727998AbfFNNOV (ORCPT ); Fri, 14 Jun 2019 09:14:21 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id ABBF3AB8C; Fri, 14 Jun 2019 13:14:20 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 04/29] bcache: set largest seq to ja->seq[bucket_index] in journal_read_bucket() Date: Fri, 14 Jun 2019 21:13:33 +0800 Message-Id: <20190614131358.2771-5-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP In journal_read_bucket() when setting ja->seq[bucket_index], there might be potential case that a later non-maximum overwrites a better sequence number to ja->seq[bucket_index]. This patch adds a check to make sure that ja->seq[bucket_index] will be only set a new value if it is bigger then current value. Signed-off-by: Coly Li --- drivers/md/bcache/journal.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index 50c9a1e405e6..a9f1703a8514 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -156,7 +156,8 @@ reread: left = ca->sb.bucket_size - offset; list_add(&i->list, where); ret = 1; - ja->seq[bucket_index] = j->seq; + if (j->seq > ja->seq[bucket_index]) + ja->seq[bucket_index] = j->seq; next_set: offset += blocks * ca->sb.block_size; len -= blocks * ca->sb.block_size; From patchwork Fri Jun 14 13:13:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995271 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8F87A1515 for ; Fri, 14 Jun 2019 13:14:26 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 81007286A8 for ; Fri, 14 Jun 2019 13:14:26 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 75B76286C1; Fri, 14 Jun 2019 13:14:26 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1D6C2286A8 for ; Fri, 14 Jun 2019 13:14:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727998AbfFNNOZ (ORCPT ); Fri, 14 Jun 2019 09:14:25 -0400 Received: from mx2.suse.de ([195.135.220.15]:45408 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727983AbfFNNOZ (ORCPT ); Fri, 14 Jun 2019 09:14:25 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 38D06AB8C; Fri, 14 Jun 2019 13:14:24 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 05/29] bcache: remove retry_flush_write from struct cache_set Date: Fri, 14 Jun 2019 21:13:34 +0800 Message-Id: <20190614131358.2771-6-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP In struct cache_set, retry_flush_write is added for commit c4dc2497d50d ("bcache: fix high CPU occupancy during journal") which is reverted in previous patch. Now it is useless anymore, and this patch removes it from bcache code. Signed-off-by: Coly Li --- drivers/md/bcache/bcache.h | 1 - drivers/md/bcache/sysfs.c | 5 ----- 2 files changed, 6 deletions(-) diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h index e30a983a68cd..ae87cb01401e 100644 --- a/drivers/md/bcache/bcache.h +++ b/drivers/md/bcache/bcache.h @@ -706,7 +706,6 @@ struct cache_set { atomic_long_t reclaim; atomic_long_t flush_write; - atomic_long_t retry_flush_write; enum { ON_ERROR_UNREGISTER, diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c index 6cd44d3cf906..a67e1e57ea68 100644 --- a/drivers/md/bcache/sysfs.c +++ b/drivers/md/bcache/sysfs.c @@ -85,7 +85,6 @@ read_attribute(state); read_attribute(cache_read_races); read_attribute(reclaim); read_attribute(flush_write); -read_attribute(retry_flush_write); read_attribute(writeback_keys_done); read_attribute(writeback_keys_failed); read_attribute(io_errors); @@ -691,9 +690,6 @@ SHOW(__bch_cache_set) sysfs_print(flush_write, atomic_long_read(&c->flush_write)); - sysfs_print(retry_flush_write, - atomic_long_read(&c->retry_flush_write)); - sysfs_print(writeback_keys_done, atomic_long_read(&c->writeback_keys_done)); sysfs_print(writeback_keys_failed, @@ -910,7 +906,6 @@ static struct attribute *bch_cache_set_internal_files[] = { &sysfs_cache_read_races, &sysfs_reclaim, &sysfs_flush_write, - &sysfs_retry_flush_write, &sysfs_writeback_keys_done, &sysfs_writeback_keys_failed, From patchwork Fri Jun 14 13:13:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995273 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 69AE51515 for ; Fri, 14 Jun 2019 13:14:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5AD7A286B8 for ; Fri, 14 Jun 2019 13:14:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4F54B286C0; Fri, 14 Jun 2019 13:14:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7177E286BE for ; Fri, 14 Jun 2019 13:14:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728108AbfFNNOa (ORCPT ); Fri, 14 Jun 2019 09:14:30 -0400 Received: from mx2.suse.de ([195.135.220.15]:45444 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728034AbfFNNOa (ORCPT ); Fri, 14 Jun 2019 09:14:30 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 85C04AB8C; Fri, 14 Jun 2019 13:14:28 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 06/29] bcache: fix race in btree_flush_write() Date: Fri, 14 Jun 2019 21:13:35 +0800 Message-Id: <20190614131358.2771-7-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP There is a race between mca_reap(), btree_node_free() and journal code btree_flush_write(), which results very rare and strange deadlock or panic and are very hard to reproduce. Let me explain how the race happens. In btree_flush_write() one btree node with oldest journal pin is selected, then it is flushed to cache device, the select-and-flush is a two steps operation. Between these two steps, there are something may happen inside the race window, - The selected btree node was reaped by mca_reap() and allocated to other requesters for other btree node. - The slected btree node was selected, flushed and released by mca shrink callback bch_mca_scan(). When btree_flush_write() tries to flush the selected btree node, firstly b->write_lock is held by mutex_lock(). If the race happens and the memory of selected btree node is allocated to other btree node, if that btree node's write_lock is held already, a deadlock very probably happens here. A worse case is the memory of the selected btree node is released, then all references to this btree node (e.g. b->write_lock) will trigger NULL pointer deference panic. This race was introduced in commit cafe56359144 ("bcache: A block layer cache"), and enlarged by commit c4dc2497d50d ("bcache: fix high CPU occupancy during journal"), which selected 128 btree nodes and flushed them one-by-one in a quite long time period. Such race is not easy to reproduce before. On a Lenovo SR650 server with 48 Xeon cores, and configure 1 NVMe SSD as cache device, a MD raid0 device assembled by 3 NVMe SSDs as backing device, this race can be observed around every 10,000 times btree_flush_write() gets called. Both deadlock and kernel panic all happened as aftermath of the race. The idea of the fix is to add a btree flag BTREE_NODE_journal_flush. It is set when selecting btree nodes, and cleared after btree nodes flushed. Then when mca_reap() selects a btree node with this bit set, this btree node will be skipped. Since mca_reap() only reaps btree node without BTREE_NODE_journal_flush flag, such race is avoided. Once corner case should be noticed, that is btree_node_free(). It might be called in some error handling code path. For example the following code piece from btree_split(), 2149 err_free2: 2150 bkey_put(b->c, &n2->key); 2151 btree_node_free(n2); 2152 rw_unlock(true, n2); 2153 err_free1: 2154 bkey_put(b->c, &n1->key); 2155 btree_node_free(n1); 2156 rw_unlock(true, n1); At line 2151 and 2155, the btree node n2 and n1 are released without mac_reap(), so BTREE_NODE_journal_flush also needs to be checked here. If btree_node_free() is called directly in such error handling path, and the selected btree node has BTREE_NODE_journal_flush bit set, just wait for 1 jiffy and retry again. In this case this btree node won't be skipped, just retry until the BTREE_NODE_journal_flush bit cleared, and free the btree node memory. Wait for 1 jiffy inside btree_node_free() does not hurt too much performance here, the reasons are, - Error handling code path is not frequently executed, and the race inside this cold path should be very rare. If the very rare race happens in the cold code path, waiting 1 jiffy should be acceptible. - If bree_node_free() is called inside mca_reap(), it means the bit BTREE_NODE_journal_flush is checked already, no wait will happen here. Beside the above fix, the way to select flushing btree nodes is also changed in this patch. Let me explain what changes in this patch. - Use another spinlock journal.flush_write_lock to replace the very hot journal.lock. We don't have to use journal.lock here, selecting candidate btree nodes takes a lot of time, hold journal.lock here will block other jouranling threads and drop the overall I/O performance. - Only select flushing btree node from c->btree_cache list. When the machine has a large system memory, mca cache may have a huge number of cached btree nodes. Iterating all the cached nodes will take a lot of CPU time, and most of the nodes on c->btree_cache_freeable and c->btree_cache_freed lists are cleared and have need to flush. So only travel mca list c->btree_cache to select flushing btree node should be enough for most of the cases. - Don't iterate whole c->btree_cache list, only reversely select first BTREE_FLUSH_NR (32) btree nodes to flush. Iterate all btree nodes from c->btree_cache and select the oldest journal pin btree nodes consumes huge number of CPU cycles if the list is huge (push and pop a node into/out of a heap is expensive). The last several dirty btree nodes on the tail of c->btree_cache list are earlest allocated and cached btree nodes, they are relative to the oldest journal pin btree nodes. Therefore only flushing BTREE_FLUSH_NR btree nodes from tail of c->btree_cache probably includes the oldest journal pin btree nodes. In my testing, the above change decreases 50%+ CPU consumption when journal space is full. Some times IOPS drops to 0 for 5-8 seconds, comparing blocking I/O for 120+ seconds in previous code, this is much better. Maybe there is room to improve in future, but at this momment the fix looks fine and performs well in my testing. Signed-off-by: Coly Li --- drivers/md/bcache/btree.c | 15 +++++++- drivers/md/bcache/btree.h | 2 + drivers/md/bcache/journal.c | 93 ++++++++++++++++++++++++++++++++++----------- drivers/md/bcache/journal.h | 4 ++ 4 files changed, 90 insertions(+), 24 deletions(-) diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c index 773f5fdad25f..881dc238c7cb 100644 --- a/drivers/md/bcache/btree.c +++ b/drivers/md/bcache/btree.c @@ -656,6 +656,13 @@ static int mca_reap(struct btree *b, unsigned int min_order, bool flush) } mutex_lock(&b->write_lock); + /* don't reap btree node handling in btree_flush_write() */ + if (btree_node_journal_flush(b)) { + pr_debug("bnode %p is flushing by journal, ignore", b); + mutex_unlock(&b->write_lock); + goto out_unlock; + } + if (btree_node_dirty(b)) __bch_btree_node_write(b, &cl); mutex_unlock(&b->write_lock); @@ -1067,8 +1074,14 @@ static void btree_node_free(struct btree *b) BUG_ON(b == b->c->root); +retry: mutex_lock(&b->write_lock); - + if (btree_node_journal_flush(b)) { + mutex_unlock(&b->write_lock); + pr_debug("bnode %p journal_flush set, retry", b); + schedule_timeout_interruptible(1); + goto retry; + } if (btree_node_dirty(b)) btree_complete_write(b, btree_current_write(b)); clear_bit(BTREE_NODE_dirty, &b->flags); diff --git a/drivers/md/bcache/btree.h b/drivers/md/bcache/btree.h index d1c72ef64edf..76cfd121a486 100644 --- a/drivers/md/bcache/btree.h +++ b/drivers/md/bcache/btree.h @@ -158,11 +158,13 @@ enum btree_flags { BTREE_NODE_io_error, BTREE_NODE_dirty, BTREE_NODE_write_idx, + BTREE_NODE_journal_flush, }; BTREE_FLAG(io_error); BTREE_FLAG(dirty); BTREE_FLAG(write_idx); +BTREE_FLAG(journal_flush); static inline struct btree_write *btree_current_write(struct btree *b) { diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index a9f1703a8514..303ef3d1fbc6 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -419,41 +419,87 @@ int bch_journal_replay(struct cache_set *s, struct list_head *list) static void btree_flush_write(struct cache_set *c) { - /* - * Try to find the btree node with that references the oldest journal - * entry, best is our current candidate and is locked if non NULL: - */ - struct btree *b, *best; - unsigned int i; + struct btree *b, *btree_nodes[BTREE_FLUSH_NR]; + unsigned int i, n; + + if (c->journal.btree_flushing) + return; + + spin_lock(&c->journal.flush_write_lock); + if (c->journal.btree_flushing) { + spin_unlock(&c->journal.flush_write_lock); + return; + } + c->journal.btree_flushing = true; + spin_unlock(&c->journal.flush_write_lock); atomic_long_inc(&c->flush_write); -retry: - best = NULL; - - for_each_cached_btree(b, c, i) - if (btree_current_write(b)->journal) { - if (!best) - best = b; - else if (journal_pin_cmp(c, - btree_current_write(best)->journal, - btree_current_write(b)->journal)) { - best = b; - } + memset(btree_nodes, 0, sizeof(btree_nodes)); + n = 0; + + mutex_lock(&c->bucket_lock); + list_for_each_entry_reverse(b, &c->btree_cache, list) { + if (btree_node_journal_flush(b)) + pr_err("BUG: flush_write bit should not be set here!"); + + mutex_lock(&b->write_lock); + + if (!btree_node_dirty(b)) { + mutex_unlock(&b->write_lock); + continue; + } + + if (!btree_current_write(b)->journal) { + mutex_unlock(&b->write_lock); + continue; + } + + set_btree_node_journal_flush(b); + + mutex_unlock(&b->write_lock); + + btree_nodes[n++] = b; + if (n == BTREE_FLUSH_NR) + break; + } + mutex_unlock(&c->bucket_lock); + + for (i = 0; i < n; i++) { + b = btree_nodes[i]; + if (!b) { + pr_err("BUG: btree_nodes[%d] is NULL", i); + continue; + } + + /* safe to check without holding b->write_lock */ + if (!btree_node_journal_flush(b)) { + pr_err("BUG: bnode %p: journal_flush bit cleaned", b); + continue; } - b = best; - if (b) { mutex_lock(&b->write_lock); if (!btree_current_write(b)->journal) { mutex_unlock(&b->write_lock); - /* We raced */ - atomic_long_inc(&c->retry_flush_write); - goto retry; + pr_debug("bnode %p: written by others", b); + clear_bit(BTREE_NODE_journal_flush, &b->flags); + continue; + } + + if (!btree_node_dirty(b)) { + pr_debug("bnode %p: dirty bit cleaned by others", b); + clear_bit(BTREE_NODE_journal_flush, &b->flags); + mutex_unlock(&b->write_lock); + continue; } __bch_btree_node_write(b, NULL); + clear_bit(BTREE_NODE_journal_flush, &b->flags); mutex_unlock(&b->write_lock); } + + spin_lock(&c->journal.flush_write_lock); + c->journal.btree_flushing = false; + spin_unlock(&c->journal.flush_write_lock); } #define last_seq(j) ((j)->seq - fifo_used(&(j)->pin) + 1) @@ -871,6 +917,7 @@ int bch_journal_alloc(struct cache_set *c) struct journal *j = &c->journal; spin_lock_init(&j->lock); + spin_lock_init(&j->flush_write_lock); INIT_DELAYED_WORK(&j->work, journal_write_work); c->journal_delay_ms = 100; diff --git a/drivers/md/bcache/journal.h b/drivers/md/bcache/journal.h index 66f0facff84b..aeed791f05e7 100644 --- a/drivers/md/bcache/journal.h +++ b/drivers/md/bcache/journal.h @@ -103,6 +103,8 @@ struct journal_write { /* Embedded in struct cache_set */ struct journal { spinlock_t lock; + spinlock_t flush_write_lock; + bool btree_flushing; /* used when waiting because the journal was full */ struct closure_waitlist wait; struct closure io; @@ -154,6 +156,8 @@ struct journal_device { struct bio_vec bv[8]; }; +#define BTREE_FLUSH_NR 32 + #define journal_pin_cmp(c, l, r) \ (fifo_idx(&(c)->journal.pin, (l)) > fifo_idx(&(c)->journal.pin, (r))) From patchwork Fri Jun 14 13:13:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995275 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0F21014BB for ; Fri, 14 Jun 2019 13:14:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 01E02286B4 for ; Fri, 14 Jun 2019 13:14:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EAB06286BD; Fri, 14 Jun 2019 13:14:34 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 82EA4286BE for ; Fri, 14 Jun 2019 13:14:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728034AbfFNNOe (ORCPT ); Fri, 14 Jun 2019 09:14:34 -0400 Received: from mx2.suse.de ([195.135.220.15]:45518 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727766AbfFNNOd (ORCPT ); Fri, 14 Jun 2019 09:14:33 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id C09E8AB8C; Fri, 14 Jun 2019 13:14:32 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 07/29] bcache: add reclaimed_journal_buckets to struct cache_set Date: Fri, 14 Jun 2019 21:13:36 +0800 Message-Id: <20190614131358.2771-8-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Now we have counters for how many times jouranl is reclaimed, how many times cached dirty btree nodes are flushed, but we don't know how many jouranl buckets are really reclaimed. This patch adds reclaimed_journal_buckets into struct cache_set, this is an increasing only counter, to tell how many journal buckets are reclaimed since cache set runs. From all these three counters (reclaim, reclaimed_journal_buckets, flush_write), we can have idea how well current journal space reclaim code works. Signed-off-by: Coly Li --- drivers/md/bcache/bcache.h | 1 + drivers/md/bcache/journal.c | 1 + drivers/md/bcache/sysfs.c | 5 +++++ 3 files changed, 7 insertions(+) diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h index ae87cb01401e..732f9b6cb2bb 100644 --- a/drivers/md/bcache/bcache.h +++ b/drivers/md/bcache/bcache.h @@ -705,6 +705,7 @@ struct cache_set { atomic_long_t writeback_keys_failed; atomic_long_t reclaim; + atomic_long_t reclaimed_journal_buckets; atomic_long_t flush_write; enum { diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index 303ef3d1fbc6..d75efeaa4baa 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -614,6 +614,7 @@ static void journal_reclaim(struct cache_set *c) k->ptr[n++] = MAKE_PTR(0, bucket_to_sector(c, ca->sb.d[ja->cur_idx]), ca->sb.nr_this_dev); + atomic_long_inc(&c->reclaimed_journal_buckets); } if (n) { diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c index a67e1e57ea68..7a88ba4bfbfb 100644 --- a/drivers/md/bcache/sysfs.c +++ b/drivers/md/bcache/sysfs.c @@ -84,6 +84,7 @@ read_attribute(bset_tree_stats); read_attribute(state); read_attribute(cache_read_races); read_attribute(reclaim); +read_attribute(reclaimed_journal_buckets); read_attribute(flush_write); read_attribute(writeback_keys_done); read_attribute(writeback_keys_failed); @@ -687,6 +688,9 @@ SHOW(__bch_cache_set) sysfs_print(reclaim, atomic_long_read(&c->reclaim)); + sysfs_print(reclaimed_journal_buckets, + atomic_long_read(&c->reclaimed_journal_buckets)); + sysfs_print(flush_write, atomic_long_read(&c->flush_write)); @@ -905,6 +909,7 @@ static struct attribute *bch_cache_set_internal_files[] = { &sysfs_bset_tree_stats, &sysfs_cache_read_races, &sysfs_reclaim, + &sysfs_reclaimed_journal_buckets, &sysfs_flush_write, &sysfs_writeback_keys_done, &sysfs_writeback_keys_failed, From patchwork Fri Jun 14 13:13:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995277 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 62DB21515 for ; Fri, 14 Jun 2019 13:14:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5495F286A8 for ; Fri, 14 Jun 2019 13:14:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 48EFC286C0; Fri, 14 Jun 2019 13:14:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EC911286A8 for ; Fri, 14 Jun 2019 13:14:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728112AbfFNNOh (ORCPT ); Fri, 14 Jun 2019 09:14:37 -0400 Received: from mx2.suse.de ([195.135.220.15]:45606 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727766AbfFNNOh (ORCPT ); Fri, 14 Jun 2019 09:14:37 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 28ADFAB8C; Fri, 14 Jun 2019 13:14:36 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 08/29] bcache: fix return value error in bch_journal_read() Date: Fri, 14 Jun 2019 21:13:37 +0800 Message-Id: <20190614131358.2771-9-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When everything is OK in bch_journal_read(), finally the return value is returned by, return ret; which assumes ret will be 0 here. This assumption is wrong when all journal buckets as are full and filled with valid journal entries. In such cache the last location referencess read_bucket() sets 'ret' to 1, which means new jset added into jset list. The jset list is list 'journal' in caller run_cache_set(). Return 1 to run_cache_set() means something wrong and the cache set won't start, but indeed everything is OK. This patch changes the line at end of bch_journal_read() to directly return 0 since everything if verything is good. Then a bogus error is fixed. Signed-off-by: Coly Li --- drivers/md/bcache/journal.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index d75efeaa4baa..7beff8884dd0 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -293,7 +293,7 @@ int bch_journal_read(struct cache_set *c, struct list_head *list) struct journal_replay, list)->j.seq; - return ret; + return 0; #undef read_bucket } From patchwork Fri Jun 14 13:13:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995279 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 38A011515 for ; Fri, 14 Jun 2019 13:14:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2AE33286A8 for ; Fri, 14 Jun 2019 13:14:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1FABF286B3; Fri, 14 Jun 2019 13:14:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B0C07286A4 for ; Fri, 14 Jun 2019 13:14:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728118AbfFNNOl (ORCPT ); Fri, 14 Jun 2019 09:14:41 -0400 Received: from mx2.suse.de ([195.135.220.15]:45680 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727766AbfFNNOl (ORCPT ); Fri, 14 Jun 2019 09:14:41 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id B8749AB8C; Fri, 14 Jun 2019 13:14:39 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li , stable@vger.kernel.org Subject: [PATCH 09/29] Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()" Date: Fri, 14 Jun 2019 21:13:38 +0800 Message-Id: <20190614131358.2771-10-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This reverts commit 6147305c73e4511ca1a975b766b97a779d442567. Although this patch helps the failed bcache device to stop faster when too many I/O errors detected on corresponding cached device, setting CACHE_SET_IO_DISABLE bit to cache set c->flags was not a good idea. This operation will disable all I/Os on cache set, which means other attached bcache devices won't work neither. Without this patch, the failed bcache device can also be stopped eventually if internal I/O accomplished (e.g. writeback). Therefore here I revert it. Fixes: 6147305c73e4 ("bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()") Reported-by: Yong Li Signed-off-by: Coly Li Cc: stable@vger.kernel.org --- drivers/md/bcache/super.c | 17 ----------------- 1 file changed, 17 deletions(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 1b63ac876169..eaaa046fd95d 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1437,8 +1437,6 @@ int bch_flash_dev_create(struct cache_set *c, uint64_t size) bool bch_cached_dev_error(struct cached_dev *dc) { - struct cache_set *c; - if (!dc || test_bit(BCACHE_DEV_CLOSING, &dc->disk.flags)) return false; @@ -1449,21 +1447,6 @@ bool bch_cached_dev_error(struct cached_dev *dc) pr_err("stop %s: too many IO errors on backing device %s\n", dc->disk.disk->disk_name, dc->backing_dev_name); - /* - * If the cached device is still attached to a cache set, - * even dc->io_disable is true and no more I/O requests - * accepted, cache device internal I/O (writeback scan or - * garbage collection) may still prevent bcache device from - * being stopped. So here CACHE_SET_IO_DISABLE should be - * set to c->flags too, to make the internal I/O to cache - * device rejected and stopped immediately. - * If c is NULL, that means the bcache device is not attached - * to any cache set, then no CACHE_SET_IO_DISABLE bit to set. - */ - c = dc->disk.c; - if (c && test_and_set_bit(CACHE_SET_IO_DISABLE, &c->flags)) - pr_info("CACHE_SET_IO_DISABLE already set"); - bcache_device_stop(&dc->disk); return true; } From patchwork Fri Jun 14 13:13:39 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995281 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7DB1414BB for ; Fri, 14 Jun 2019 13:14:45 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6EDD6286B3 for ; Fri, 14 Jun 2019 13:14:45 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6368E286BD; Fri, 14 Jun 2019 13:14:45 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 03794286B3 for ; Fri, 14 Jun 2019 13:14:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727949AbfFNNOo (ORCPT ); Fri, 14 Jun 2019 09:14:44 -0400 Received: from mx2.suse.de ([195.135.220.15]:45714 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727766AbfFNNOo (ORCPT ); Fri, 14 Jun 2019 09:14:44 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 0CB1CAB8C; Fri, 14 Jun 2019 13:14:43 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 10/29] bcache: avoid flushing btree node in cache_set_flush() if io disabled Date: Fri, 14 Jun 2019 21:13:39 +0800 Message-Id: <20190614131358.2771-11-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When cache_set_flush() is called for too many I/O errors detected on cache device and the cache set is retiring, inside the function it doesn't make sense to flushing cached btree nodes from c->btree_cache because CACHE_SET_IO_DISABLE is set on c->flags already and all I/Os onto cache device will be rejected. This patch checks in cache_set_flush() that whether CACHE_SET_IO_DISABLE is set. If yes, then avoids to flush the cached btree nodes to reduce more time and make cache set retiring more faster. Signed-off-by: Coly Li --- drivers/md/bcache/super.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index eaaa046fd95d..da9d6a63b81a 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1553,13 +1553,17 @@ static void cache_set_flush(struct closure *cl) if (!IS_ERR_OR_NULL(c->root)) list_add(&c->root->list, &c->btree_cache); - /* Should skip this if we're unregistering because of an error */ - list_for_each_entry(b, &c->btree_cache, list) { - mutex_lock(&b->write_lock); - if (btree_node_dirty(b)) - __bch_btree_node_write(b, NULL); - mutex_unlock(&b->write_lock); - } + /* + * Avoid flushing cached nodes if cache set is retiring + * due to too many I/O errors detected. + */ + if (!test_bit(CACHE_SET_IO_DISABLE, &c->flags)) + list_for_each_entry(b, &c->btree_cache, list) { + mutex_lock(&b->write_lock); + if (btree_node_dirty(b)) + __bch_btree_node_write(b, NULL); + mutex_unlock(&b->write_lock); + } for_each_cache(ca, c, i) if (ca->alloc_thread) From patchwork Fri Jun 14 13:13:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995283 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3FBE91515 for ; Fri, 14 Jun 2019 13:14:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 32157286A4 for ; Fri, 14 Jun 2019 13:14:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 26C55286B8; Fri, 14 Jun 2019 13:14:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8662C286B4 for ; Fri, 14 Jun 2019 13:14:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728043AbfFNNOs (ORCPT ); Fri, 14 Jun 2019 09:14:48 -0400 Received: from mx2.suse.de ([195.135.220.15]:45800 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727766AbfFNNOs (ORCPT ); Fri, 14 Jun 2019 09:14:48 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 4267AAE07; Fri, 14 Jun 2019 13:14:47 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li , stable@vger.kernel.org Subject: [PATCH 11/29] bcache: ignore read-ahead request failure on backing device Date: Fri, 14 Jun 2019 21:13:40 +0800 Message-Id: <20190614131358.2771-12-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When md raid device (e.g. raid456) is used as backing device, read-ahead requests on a degrading and recovering md raid device might be failured immediately by md raid code, but indeed this md raid array can still be read or write for normal I/O requests. Therefore such failed read-ahead request are not real hardware failure. Further more, after degrading and recovering accomplished, read-ahead requests will be handled by md raid array again. For such condition, I/O failures of read-ahead requests don't indicate real health status (because normal I/O still be served), they should not be counted into I/O error counter dc->io_errors. Since there is no simple way to detect whether the backing divice is a md raid device, this patch simply ignores I/O failures for read-ahead bios on backing device, to avoid bogus backing device failure on a degrading md raid array. Suggested-and-tested-by: Thorsten Knabe Signed-off-by: Coly Li Cc: stable@vger.kernel.org --- drivers/md/bcache/io.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c index c25097968319..4d93f07f63e5 100644 --- a/drivers/md/bcache/io.c +++ b/drivers/md/bcache/io.c @@ -58,6 +58,18 @@ void bch_count_backing_io_errors(struct cached_dev *dc, struct bio *bio) WARN_ONCE(!dc, "NULL pointer of struct cached_dev"); + /* + * Read-ahead requests on a degrading and recovering md raid + * (e.g. raid6) device might be failured immediately by md + * raid code, which is not a real hardware media failure. So + * we shouldn't count failed REQ_RAHEAD bio to dc->io_errors. + */ + if (bio->bi_opf & REQ_RAHEAD) { + pr_warn_ratelimited("%s: Read-ahead I/O failed on backing device, ignore", + dc->backing_dev_name); + return; + } + errors = atomic_add_return(1, &dc->io_errors); if (errors < dc->error_limit) pr_err("%s: IO error on backing device, unrecoverable", From patchwork Fri Jun 14 13:13:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995285 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A443A1515 for ; Fri, 14 Jun 2019 13:14:52 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 91C2C286B8 for ; Fri, 14 Jun 2019 13:14:52 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 813F0286BD; Fri, 14 Jun 2019 13:14:52 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 37F39286A4 for ; Fri, 14 Jun 2019 13:14:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728119AbfFNNOv (ORCPT ); Fri, 14 Jun 2019 09:14:51 -0400 Received: from mx2.suse.de ([195.135.220.15]:45814 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727766AbfFNNOv (ORCPT ); Fri, 14 Jun 2019 09:14:51 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 601FCAE07; Fri, 14 Jun 2019 13:14:50 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 12/29] bcache: add io error counting in write_bdev_super_endio() Date: Fri, 14 Jun 2019 21:13:41 +0800 Message-Id: <20190614131358.2771-13-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When backing device super block is written by bch_write_bdev_super(), the bio complete callback write_bdev_super_endio() simply ignores I/O status. Indeed such write request also contribute to backing device health status if the request failed. This patch checkes bio->bi_status in write_bdev_super_endio(), if there is error, bch_count_backing_io_errors() will be called to count an I/O error to dc->io_errors. Signed-off-by: Coly Li --- drivers/md/bcache/super.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index da9d6a63b81a..877113b62b0f 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -197,7 +197,9 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev, static void write_bdev_super_endio(struct bio *bio) { struct cached_dev *dc = bio->bi_private; - /* XXX: error checking */ + + if (bio->bi_status) + bch_count_backing_io_errors(dc, bio); closure_put(&dc->sb_write); } From patchwork Fri Jun 14 13:13:42 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995287 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2874F14BB for ; Fri, 14 Jun 2019 13:14:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1913F285E1 for ; Fri, 14 Jun 2019 13:14:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0DACB286A8; Fri, 14 Jun 2019 13:14:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BB5F1286B4 for ; Fri, 14 Jun 2019 13:14:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728134AbfFNNO4 (ORCPT ); Fri, 14 Jun 2019 09:14:56 -0400 Received: from mx2.suse.de ([195.135.220.15]:45858 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727766AbfFNNOz (ORCPT ); Fri, 14 Jun 2019 09:14:55 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 5AE11AE07; Fri, 14 Jun 2019 13:14:54 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 13/29] bcache: remove "XXX:" comment line from run_cache_set() Date: Fri, 14 Jun 2019 21:13:42 +0800 Message-Id: <20190614131358.2771-14-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP In previous bcache patches for Linux v5.2, the failure code path of run_cache_set() is tested and fixed. So now the following comment line can be removed from run_cache_set(), /* XXX: test this, it's broken */ Signed-off-by: Coly Li --- drivers/md/bcache/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 877113b62b0f..3364b20567eb 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1946,7 +1946,7 @@ static int run_cache_set(struct cache_set *c) } closure_sync(&cl); - /* XXX: test this, it's broken */ + bch_cache_set_error(c, "%s", err); return -EIO; From patchwork Fri Jun 14 13:13:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995289 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6D4C91515 for ; Fri, 14 Jun 2019 13:15:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5B171285E1 for ; Fri, 14 Jun 2019 13:15:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4D63A286BE; Fri, 14 Jun 2019 13:15:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E4B6A286BD for ; Fri, 14 Jun 2019 13:15:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728023AbfFNNO7 (ORCPT ); Fri, 14 Jun 2019 09:14:59 -0400 Received: from mx2.suse.de ([195.135.220.15]:45890 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728121AbfFNNO6 (ORCPT ); Fri, 14 Jun 2019 09:14:58 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 67485AE1F; Fri, 14 Jun 2019 13:14:57 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 14/29] bcache: remove unnecessary prefetch() in bset_search_tree() Date: Fri, 14 Jun 2019 21:13:43 +0800 Message-Id: <20190614131358.2771-15-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP In function bset_search_tree(), when p >= t->size, t->tree[0] will be prefetched by the following code piece, 974 unsigned int p = n << 4; 975 976 p &= ((int) (p - t->size)) >> 31; 977 978 prefetch(&t->tree[p]); The purpose of the above code is to avoid a branch instruction, but when p >= t->size, prefetch(&t->tree[0]) has no positive performance contribution at all. This patch avoids the unncessary prefetch by only calling prefetch() when p < t->size. Signed-off-by: Coly Li --- drivers/md/bcache/bset.c | 16 ++-------------- 1 file changed, 2 insertions(+), 14 deletions(-) diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c index 8f07fa6e1739..aa2e4ab0fab9 100644 --- a/drivers/md/bcache/bset.c +++ b/drivers/md/bcache/bset.c @@ -960,22 +960,10 @@ static struct bset_search_iter bset_search_tree(struct bset_tree *t, unsigned int inorder, j, n = 1; do { - /* - * A bit trick here. - * If p < t->size, (int)(p - t->size) is a minus value and - * the most significant bit is set, right shifting 31 bits - * gets 1. If p >= t->size, the most significant bit is - * not set, right shifting 31 bits gets 0. - * So the following 2 lines equals to - * if (p >= t->size) - * p = 0; - * but a branch instruction is avoided. - */ unsigned int p = n << 4; - p &= ((int) (p - t->size)) >> 31; - - prefetch(&t->tree[p]); + if (p < t->size) + prefetch(&t->tree[p]); j = n; f = &t->tree[j]; From patchwork Fri Jun 14 13:13:44 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995291 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1A4231395 for ; Fri, 14 Jun 2019 13:15:04 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0AE6E286B8 for ; Fri, 14 Jun 2019 13:15:04 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id F370F286BE; Fri, 14 Jun 2019 13:15:03 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 90E9C286B8 for ; Fri, 14 Jun 2019 13:15:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728187AbfFNNPD (ORCPT ); Fri, 14 Jun 2019 09:15:03 -0400 Received: from mx2.suse.de ([195.135.220.15]:45934 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728121AbfFNNPC (ORCPT ); Fri, 14 Jun 2019 09:15:02 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 7E24BAE07; Fri, 14 Jun 2019 13:15:01 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Alexandru Ardelean , Coly Li Subject: [PATCH 15/29] bcache: use sysfs_match_string() instead of __sysfs_match_string() Date: Fri, 14 Jun 2019 21:13:44 +0800 Message-Id: <20190614131358.2771-16-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Alexandru Ardelean The arrays (of strings) that are passed to __sysfs_match_string() are static, so use sysfs_match_string() which does an implicit ARRAY_SIZE() over these arrays. Functionally, this doesn't change anything. The change is more cosmetic. It only shrinks the static arrays by 1 byte each. Signed-off-by: Alexandru Ardelean Signed-off-by: Coly Li --- drivers/md/bcache/sysfs.c | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-) diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c index 7a88ba4bfbfb..53f70f97241e 100644 --- a/drivers/md/bcache/sysfs.c +++ b/drivers/md/bcache/sysfs.c @@ -21,28 +21,24 @@ static const char * const bch_cache_modes[] = { "writethrough", "writeback", "writearound", - "none", - NULL + "none" }; /* Default is 0 ("auto") */ static const char * const bch_stop_on_failure_modes[] = { "auto", - "always", - NULL + "always" }; static const char * const cache_replacement_policies[] = { "lru", "fifo", - "random", - NULL + "random" }; static const char * const error_actions[] = { "unregister", - "panic", - NULL + "panic" }; write_attribute(attach); @@ -333,7 +329,7 @@ STORE(__cached_dev) bch_cached_dev_run(dc); if (attr == &sysfs_cache_mode) { - v = __sysfs_match_string(bch_cache_modes, -1, buf); + v = sysfs_match_string(bch_cache_modes, buf); if (v < 0) return v; @@ -344,7 +340,7 @@ STORE(__cached_dev) } if (attr == &sysfs_stop_when_cache_set_failed) { - v = __sysfs_match_string(bch_stop_on_failure_modes, -1, buf); + v = sysfs_match_string(bch_stop_on_failure_modes, buf); if (v < 0) return v; @@ -794,7 +790,7 @@ STORE(__bch_cache_set) 0, UINT_MAX); if (attr == &sysfs_errors) { - v = __sysfs_match_string(error_actions, -1, buf); + v = sysfs_match_string(error_actions, buf); if (v < 0) return v; @@ -1058,7 +1054,7 @@ STORE(__bch_cache) } if (attr == &sysfs_cache_replacement_policy) { - v = __sysfs_match_string(cache_replacement_policies, -1, buf); + v = sysfs_match_string(cache_replacement_policies, buf); if (v < 0) return v; From patchwork Fri Jun 14 13:13:45 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995293 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 945FD1395 for ; Fri, 14 Jun 2019 13:15:07 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 850E9286B8 for ; Fri, 14 Jun 2019 13:15:07 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7981A285E1; Fri, 14 Jun 2019 13:15:07 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EFCD8286BE for ; Fri, 14 Jun 2019 13:15:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728086AbfFNNPG (ORCPT ); Fri, 14 Jun 2019 09:15:06 -0400 Received: from mx2.suse.de ([195.135.220.15]:45972 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728189AbfFNNPG (ORCPT ); Fri, 14 Jun 2019 09:15:06 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 533D1AF8C; Fri, 14 Jun 2019 13:15:04 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 16/29] bcache: add return value check to bch_cached_dev_run() Date: Fri, 14 Jun 2019 21:13:45 +0800 Message-Id: <20190614131358.2771-17-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch adds return value check to bch_cached_dev_run(), now if there is error happens inside bch_cached_dev_run(), it can be catched. Signed-off-by: Coly Li --- drivers/md/bcache/bcache.h | 2 +- drivers/md/bcache/super.c | 33 ++++++++++++++++++++++++++------- drivers/md/bcache/sysfs.c | 7 +++++-- 3 files changed, 32 insertions(+), 10 deletions(-) diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h index 732f9b6cb2bb..013e35a9e317 100644 --- a/drivers/md/bcache/bcache.h +++ b/drivers/md/bcache/bcache.h @@ -1004,7 +1004,7 @@ int bch_flash_dev_create(struct cache_set *c, uint64_t size); int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c, uint8_t *set_uuid); void bch_cached_dev_detach(struct cached_dev *dc); -void bch_cached_dev_run(struct cached_dev *dc); +int bch_cached_dev_run(struct cached_dev *dc); void bcache_device_stop(struct bcache_device *d); void bch_cache_set_unregister(struct cache_set *c); diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 3364b20567eb..74eb18b1af40 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -910,7 +910,7 @@ static int cached_dev_status_update(void *arg) } -void bch_cached_dev_run(struct cached_dev *dc) +int bch_cached_dev_run(struct cached_dev *dc) { struct bcache_device *d = &dc->disk; char *buf = kmemdup_nul(dc->sb.label, SB_LABEL_SIZE, GFP_KERNEL); @@ -921,11 +921,14 @@ void bch_cached_dev_run(struct cached_dev *dc) NULL, }; + if (dc->io_disable) + return -EIO; + if (atomic_xchg(&dc->running, 1)) { kfree(env[1]); kfree(env[2]); kfree(buf); - return; + return -EBUSY; } if (!d->c && @@ -951,8 +954,11 @@ void bch_cached_dev_run(struct cached_dev *dc) kfree(buf); if (sysfs_create_link(&d->kobj, &disk_to_dev(d->disk)->kobj, "dev") || - sysfs_create_link(&disk_to_dev(d->disk)->kobj, &d->kobj, "bcache")) + sysfs_create_link(&disk_to_dev(d->disk)->kobj, + &d->kobj, "bcache")) { pr_debug("error creating sysfs link"); + return -ENOMEM; + } dc->status_update_thread = kthread_run(cached_dev_status_update, dc, "bcache_status_update"); @@ -961,6 +967,8 @@ void bch_cached_dev_run(struct cached_dev *dc) "continue to run without monitoring backing " "device status"); } + + return 0; } /* @@ -1056,6 +1064,7 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c, uint32_t rtime = cpu_to_le32((u32)ktime_get_real_seconds()); struct uuid_entry *u; struct cached_dev *exist_dc, *t; + int ret = 0; if ((set_uuid && memcmp(set_uuid, c->sb.set_uuid, 16)) || (!set_uuid && memcmp(dc->sb.set_uuid, c->sb.set_uuid, 16))) @@ -1165,7 +1174,12 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c, bch_sectors_dirty_init(&dc->disk); - bch_cached_dev_run(dc); + ret = bch_cached_dev_run(dc); + if (ret && (ret != -EBUSY)) { + up_write(&dc->writeback_lock); + return ret; + } + bcache_device_link(&dc->disk, c, "bdev"); atomic_inc(&c->attached_dev_nr); @@ -1292,6 +1306,7 @@ static int register_bdev(struct cache_sb *sb, struct page *sb_page, { const char *err = "cannot allocate memory"; struct cache_set *c; + int ret = -ENOMEM; bdevname(bdev, dc->backing_dev_name); memcpy(&dc->sb, sb, sizeof(struct cache_sb)); @@ -1321,14 +1336,18 @@ static int register_bdev(struct cache_sb *sb, struct page *sb_page, bch_cached_dev_attach(dc, c, NULL); if (BDEV_STATE(&dc->sb) == BDEV_STATE_NONE || - BDEV_STATE(&dc->sb) == BDEV_STATE_STALE) - bch_cached_dev_run(dc); + BDEV_STATE(&dc->sb) == BDEV_STATE_STALE) { + err = "failed to run cached device"; + ret = bch_cached_dev_run(dc); + if (ret) + goto err; + } return 0; err: pr_notice("error %s: %s", dc->backing_dev_name, err); bcache_device_stop(&dc->disk); - return -EIO; + return ret; } /* Flash only volumes */ diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c index 53f70f97241e..82a2fe8ac86b 100644 --- a/drivers/md/bcache/sysfs.c +++ b/drivers/md/bcache/sysfs.c @@ -325,8 +325,11 @@ STORE(__cached_dev) bch_cache_accounting_clear(&dc->accounting); if (attr == &sysfs_running && - strtoul_or_return(buf)) - bch_cached_dev_run(dc); + strtoul_or_return(buf)) { + v = bch_cached_dev_run(dc); + if (v) + return v; + } if (attr == &sysfs_cache_mode) { v = sysfs_match_string(bch_cache_modes, buf); From patchwork Fri Jun 14 13:13:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995295 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DB8CC1515 for ; Fri, 14 Jun 2019 13:15:09 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CC7CB285E1 for ; Fri, 14 Jun 2019 13:15:09 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BFCC3286BD; Fri, 14 Jun 2019 13:15:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5D877285E1 for ; Fri, 14 Jun 2019 13:15:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728189AbfFNNPI (ORCPT ); Fri, 14 Jun 2019 09:15:08 -0400 Received: from mx2.suse.de ([195.135.220.15]:46004 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728121AbfFNNPI (ORCPT ); Fri, 14 Jun 2019 09:15:08 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 86ED6AF7B; Fri, 14 Jun 2019 13:15:07 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 17/29] bcache: remove unncessary code in bch_btree_keys_init() Date: Fri, 14 Jun 2019 21:13:46 +0800 Message-Id: <20190614131358.2771-18-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Function bch_btree_keys_init() initializes b->set[].size and b->set[].data to zero. As the code comments indicates, these code indeed is unncessary, because both struct btree_keys and struct bset_tree are nested embedded into struct btree, when struct btree is filled with 0 bits by kzalloc() in mca_bucket_alloc(), b->set[].size and b->set[].data are initialized to 0 (a.k.a NULL) already. This patch removes the redundant code, and add comments in bch_btree_keys_init() and mca_bucket_alloc() to explain why it's safe. Signed-off-by: Coly Li --- drivers/md/bcache/bset.c | 15 ++++++--------- drivers/md/bcache/btree.c | 4 ++++ 2 files changed, 10 insertions(+), 9 deletions(-) diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c index aa2e4ab0fab9..eedaf0f3e3f0 100644 --- a/drivers/md/bcache/bset.c +++ b/drivers/md/bcache/bset.c @@ -347,22 +347,19 @@ EXPORT_SYMBOL(bch_btree_keys_alloc); void bch_btree_keys_init(struct btree_keys *b, const struct btree_keys_ops *ops, bool *expensive_debug_checks) { - unsigned int i; - b->ops = ops; b->expensive_debug_checks = expensive_debug_checks; b->nsets = 0; b->last_set_unwritten = 0; - /* XXX: shouldn't be needed */ - for (i = 0; i < MAX_BSETS; i++) - b->set[i].size = 0; /* - * Second loop starts at 1 because b->keys[0]->data is the memory we - * allocated + * struct btree_keys in embedded in struct btree, and struct + * bset_tree is embedded into struct btree_keys. They are all + * initialized as 0 by kzalloc() in mca_bucket_alloc(), and + * b->set[0].data is allocated in bch_btree_keys_alloc(), so we + * don't have to initiate b->set[].size and b->set[].data here + * any more. */ - for (i = 1; i < MAX_BSETS; i++) - b->set[i].data = NULL; } EXPORT_SYMBOL(bch_btree_keys_init); diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c index 881dc238c7cb..c0dd8fde37af 100644 --- a/drivers/md/bcache/btree.c +++ b/drivers/md/bcache/btree.c @@ -613,6 +613,10 @@ static void mca_data_alloc(struct btree *b, struct bkey *k, gfp_t gfp) static struct btree *mca_bucket_alloc(struct cache_set *c, struct bkey *k, gfp_t gfp) { + /* + * kzalloc() is necessary here for initialization, + * see code comments in bch_btree_keys_init(). + */ struct btree *b = kzalloc(sizeof(struct btree), gfp); if (!b) From patchwork Fri Jun 14 13:13:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995297 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A26B91515 for ; Fri, 14 Jun 2019 13:15:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 913F2286C0 for ; Fri, 14 Jun 2019 13:15:13 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 824BA286BF; Fri, 14 Jun 2019 13:15:13 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2E83F286A4 for ; Fri, 14 Jun 2019 13:15:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728121AbfFNNPM (ORCPT ); Fri, 14 Jun 2019 09:15:12 -0400 Received: from mx2.suse.de ([195.135.220.15]:46042 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728017AbfFNNPM (ORCPT ); Fri, 14 Jun 2019 09:15:12 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 5EBF2AE07; Fri, 14 Jun 2019 13:15:11 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 18/29] bcache: check CACHE_SET_IO_DISABLE in allocator code Date: Fri, 14 Jun 2019 21:13:47 +0800 Message-Id: <20190614131358.2771-19-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If CACHE_SET_IO_DISABLE of a cache set flag is set by too many I/O errors, currently allocator routines can still continue allocate space which may introduce inconsistent metadata state. This patch checkes CACHE_SET_IO_DISABLE bit in following allocator routines, - bch_bucket_alloc() - __bch_bucket_alloc_set() Once CACHE_SET_IO_DISABLE is set on cache set, the allocator routines may reject allocation request earlier to avoid potential inconsistent metadata. Signed-off-by: Coly Li --- drivers/md/bcache/alloc.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c index f8986effcb50..6f776823b9ba 100644 --- a/drivers/md/bcache/alloc.c +++ b/drivers/md/bcache/alloc.c @@ -393,6 +393,11 @@ long bch_bucket_alloc(struct cache *ca, unsigned int reserve, bool wait) struct bucket *b; long r; + + /* No allocation if CACHE_SET_IO_DISABLE bit is set */ + if (unlikely(test_bit(CACHE_SET_IO_DISABLE, &ca->set->flags))) + return -1; + /* fastpath */ if (fifo_pop(&ca->free[RESERVE_NONE], r) || fifo_pop(&ca->free[reserve], r)) @@ -484,6 +489,10 @@ int __bch_bucket_alloc_set(struct cache_set *c, unsigned int reserve, { int i; + /* No allocation if CACHE_SET_IO_DISABLE bit is set */ + if (unlikely(test_bit(CACHE_SET_IO_DISABLE, &c->flags))) + return -1; + lockdep_assert_held(&c->bucket_lock); BUG_ON(!n || n > c->caches_loaded || n > MAX_CACHES_PER_SET); From patchwork Fri Jun 14 13:13:48 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995299 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BCF3F1395 for ; Fri, 14 Jun 2019 13:15:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id ADC39286B3 for ; Fri, 14 Jun 2019 13:15:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A249D286C1; Fri, 14 Jun 2019 13:15:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 47236286B3 for ; Fri, 14 Jun 2019 13:15:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728117AbfFNNPP (ORCPT ); Fri, 14 Jun 2019 09:15:15 -0400 Received: from mx2.suse.de ([195.135.220.15]:46082 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727956AbfFNNPP (ORCPT ); Fri, 14 Jun 2019 09:15:15 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 6DB9CAF8C; Fri, 14 Jun 2019 13:15:14 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 19/29] bcache: check CACHE_SET_IO_DISABLE bit in bch_journal() Date: Fri, 14 Jun 2019 21:13:48 +0800 Message-Id: <20190614131358.2771-20-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When too many I/O errors happen on cache set and CACHE_SET_IO_DISABLE bit is set, bch_journal() may continue to work because the journaling bkey might be still in write set yet. The caller of bch_journal() may believe the journal still work but the truth is in-memory journal write set won't be written into cache device any more. This behavior may introduce potential inconsistent metadata status. This patch checks CACHE_SET_IO_DISABLE bit at the head of bch_journal(), if the bit is set, bch_journal() returns NULL immediately to notice caller to know journal does not work. Signed-off-by: Coly Li --- drivers/md/bcache/journal.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index 7beff8884dd0..87af7c6513c5 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -867,6 +867,10 @@ atomic_t *bch_journal(struct cache_set *c, struct journal_write *w; atomic_t *ret; + /* No journaling if CACHE_SET_IO_DISABLE set already */ + if (unlikely(test_bit(CACHE_SET_IO_DISABLE, &c->flags))) + return NULL; + if (!CACHE_SYNC(&c->sb)) return NULL; From patchwork Fri Jun 14 13:13:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995301 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4EFDC1395 for ; Fri, 14 Jun 2019 13:15:19 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4020F286BE for ; Fri, 14 Jun 2019 13:15:19 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 35066286BF; Fri, 14 Jun 2019 13:15:19 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D6076286C0 for ; Fri, 14 Jun 2019 13:15:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728127AbfFNNPS (ORCPT ); Fri, 14 Jun 2019 09:15:18 -0400 Received: from mx2.suse.de ([195.135.220.15]:46102 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727956AbfFNNPS (ORCPT ); Fri, 14 Jun 2019 09:15:18 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 1A85EAF7B; Fri, 14 Jun 2019 13:15:17 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 20/29] bcache: more detailed error message to bcache_device_link() Date: Fri, 14 Jun 2019 21:13:49 +0800 Message-Id: <20190614131358.2771-21-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch adds more accurate error message for specific ssyfs_create_link() call, to help debugging failure during bcache device start tup. Signed-off-by: Coly Li --- drivers/md/bcache/super.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 74eb18b1af40..1fa3f4e26d02 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -693,6 +693,7 @@ static void bcache_device_link(struct bcache_device *d, struct cache_set *c, { unsigned int i; struct cache *ca; + int ret; for_each_cache(ca, d->c, i) bd_link_disk_holder(ca->bdev, d->disk); @@ -700,9 +701,13 @@ static void bcache_device_link(struct bcache_device *d, struct cache_set *c, snprintf(d->name, BCACHEDEVNAME_SIZE, "%s%u", name, d->id); - WARN(sysfs_create_link(&d->kobj, &c->kobj, "cache") || - sysfs_create_link(&c->kobj, &d->kobj, d->name), - "Couldn't create device <-> cache set symlinks"); + ret = sysfs_create_link(&d->kobj, &c->kobj, "cache"); + if (ret < 0) + pr_err("Couldn't create device -> cache set symlink"); + + ret = sysfs_create_link(&c->kobj, &d->kobj, d->name); + if (ret < 0) + pr_err("Couldn't create cache set -> device symlink"); clear_bit(BCACHE_DEV_UNLINK_DONE, &d->flags); } From patchwork Fri Jun 14 13:13:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995303 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 021FD1515 for ; Fri, 14 Jun 2019 13:15:23 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E8E5628517 for ; Fri, 14 Jun 2019 13:15:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DDBBC285FB; Fri, 14 Jun 2019 13:15:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9589428607 for ; Fri, 14 Jun 2019 13:15:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728149AbfFNNPW (ORCPT ); Fri, 14 Jun 2019 09:15:22 -0400 Received: from mx2.suse.de ([195.135.220.15]:46172 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727956AbfFNNPW (ORCPT ); Fri, 14 Jun 2019 09:15:22 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id E2747AE07; Fri, 14 Jun 2019 13:15:20 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 21/29] bcache: add more error message in bch_cached_dev_attach() Date: Fri, 14 Jun 2019 21:13:50 +0800 Message-Id: <20190614131358.2771-22-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch adds more error message for attaching cached device, this is helpful to debug code failure during bache device start up. Signed-off-by: Coly Li --- drivers/md/bcache/super.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 1fa3f4e26d02..cf5673af3143 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1169,6 +1169,8 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c, down_write(&dc->writeback_lock); if (bch_cached_dev_writeback_start(dc)) { up_write(&dc->writeback_lock); + pr_err("Couldn't start writeback facilities for %s", + dc->disk.disk->disk_name); return -ENOMEM; } @@ -1182,6 +1184,8 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c, ret = bch_cached_dev_run(dc); if (ret && (ret != -EBUSY)) { up_write(&dc->writeback_lock); + pr_err("Couldn't run cached device %s", + dc->backing_dev_name); return ret; } From patchwork Fri Jun 14 13:13:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995305 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 208E61395 for ; Fri, 14 Jun 2019 13:15:26 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 12B1B2850D for ; Fri, 14 Jun 2019 13:15:26 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0790A28608; Fri, 14 Jun 2019 13:15:26 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A18EE2860C for ; Fri, 14 Jun 2019 13:15:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728082AbfFNNPZ (ORCPT ); Fri, 14 Jun 2019 09:15:25 -0400 Received: from mx2.suse.de ([195.135.220.15]:46194 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727956AbfFNNPZ (ORCPT ); Fri, 14 Jun 2019 09:15:25 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id D5515AE07; Fri, 14 Jun 2019 13:15:23 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 22/29] bcache: shrink btree node cache after bch_btree_check() Date: Fri, 14 Jun 2019 21:13:51 +0800 Message-Id: <20190614131358.2771-23-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When cache set starts, bch_btree_check() will check all bkeys on cache device by calculating the checksum. This operation will consume a huge number of system memory if there are a lot of data cached. Since bcache uses its own mca cache to maintain all its read-in btree nodes, and only releases the cache space when system memory manage code starts to shrink caches. There is will be a delay between bch_btree_check() returns and the bcache shrink code gets called, so following memory allocatiion might fail after bch_btree_check() finished. The most frequent one is failure of creating allocator kernel thread. This patch tries to proactively call bcache mca shrinker routine to release around 25% cache memory, to help following memory allocation to success. 'Around 25% cache memory' means when mca shrnker tries to release cache memory, it might have to skip some busy memory objects, so the result might be a few less than the expected amount. Signed-off-by: Coly Li --- drivers/md/bcache/super.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index cf5673af3143..4a6406b53de1 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1866,6 +1866,24 @@ static int run_cache_set(struct cache_set *c) if (bch_btree_check(c)) goto err; + /* + * bch_btree_check() may occupy too much system memory which + * will fail memory allocation operations in the following + * routines before kernel triggers memory shrinker call backs. + * Shrinking 25% mca cache memory proactively here to avoid + * potential memory allocation failure. + */ + if (!c->shrinker_disabled) { + struct shrink_control sc; + + sc.gfp_mask = GFP_KERNEL; + sc.nr_to_scan = + c->shrink.count_objects(&c->shrink, &sc) / 4; + pr_debug("try to shrink %lu (25%%) cached btree node", + sc.nr_to_scan); + c->shrink.scan_objects(&c->shrink, &sc); + } + bch_journal_mark(c, &journal); bch_initial_gc_finish(c); pr_debug("btree_check() done"); From patchwork Fri Jun 14 13:13:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995307 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 235711515 for ; Fri, 14 Jun 2019 13:15:29 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 15DCC285E1 for ; Fri, 14 Jun 2019 13:15:29 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0AD39285FB; Fri, 14 Jun 2019 13:15:29 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B226E28607 for ; Fri, 14 Jun 2019 13:15:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728049AbfFNNP2 (ORCPT ); Fri, 14 Jun 2019 09:15:28 -0400 Received: from mx2.suse.de ([195.135.220.15]:46216 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727918AbfFNNP2 (ORCPT ); Fri, 14 Jun 2019 09:15:28 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id D195EAF7B; Fri, 14 Jun 2019 13:15:26 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 23/29] bcache: improve error message in bch_cached_dev_run() Date: Fri, 14 Jun 2019 21:13:52 +0800 Message-Id: <20190614131358.2771-24-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch adds more error message in bch_cached_dev_run() to indicate the exact reason why an error value is returned. Please notice when printing out the "is running already" message, pr_info() is used here, because in this case also -EBUSY is returned, the bcache device can continue to attach to the cache devince and run, so it won't be an error level message in kernel message. Signed-off-by: Coly Li --- drivers/md/bcache/super.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 4a6406b53de1..026e2df358c3 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -926,13 +926,18 @@ int bch_cached_dev_run(struct cached_dev *dc) NULL, }; - if (dc->io_disable) + if (dc->io_disable) { + pr_err("I/O disabled on cached dev %s", + dc->backing_dev_name); return -EIO; + } if (atomic_xchg(&dc->running, 1)) { kfree(env[1]); kfree(env[2]); kfree(buf); + pr_info("cached dev %s is running already", + dc->backing_dev_name); return -EBUSY; } @@ -961,7 +966,7 @@ int bch_cached_dev_run(struct cached_dev *dc) if (sysfs_create_link(&d->kobj, &disk_to_dev(d->disk)->kobj, "dev") || sysfs_create_link(&disk_to_dev(d->disk)->kobj, &d->kobj, "bcache")) { - pr_debug("error creating sysfs link"); + pr_err("Couldn't create bcache dev <-> disk sysfs symlinks"); return -ENOMEM; } From patchwork Fri Jun 14 13:13:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995309 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0826F1515 for ; Fri, 14 Jun 2019 13:15:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EE42128515 for ; Fri, 14 Jun 2019 13:15:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E112C286A4; Fri, 14 Jun 2019 13:15:31 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8E24028515 for ; Fri, 14 Jun 2019 13:15:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728171AbfFNNPb (ORCPT ); Fri, 14 Jun 2019 09:15:31 -0400 Received: from mx2.suse.de ([195.135.220.15]:46242 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727918AbfFNNPa (ORCPT ); Fri, 14 Jun 2019 09:15:30 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id B0F28AF7B; Fri, 14 Jun 2019 13:15:29 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 24/29] bcache: make bset_search_tree() be more understandable Date: Fri, 14 Jun 2019 21:13:53 +0800 Message-Id: <20190614131358.2771-25-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The purpose of following code in bset_search_tree() is to avoid a branch instruction, 994 if (likely(f->exponent != 127)) 995 n = j * 2 + (((unsigned int) 996 (f->mantissa - 997 bfloat_mantissa(search, f))) >> 31); 998 else 999 n = (bkey_cmp(tree_to_bkey(t, j), search) > 0) 1000 ? j * 2 1001 : j * 2 + 1; This piece of code is not very clear to understand, even when I tried to add code comment for it, I made mistake. This patch removes the implict bit operation and uses explicit branch to calculate next location in binary tree search. Signed-off-by: Coly Li --- drivers/md/bcache/bset.c | 30 +++++++++++------------------- 1 file changed, 11 insertions(+), 19 deletions(-) diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c index eedaf0f3e3f0..2722e7f83de3 100644 --- a/drivers/md/bcache/bset.c +++ b/drivers/md/bcache/bset.c @@ -965,25 +965,17 @@ static struct bset_search_iter bset_search_tree(struct bset_tree *t, j = n; f = &t->tree[j]; - /* - * Similar bit trick, use subtract operation to avoid a branch - * instruction. - * - * n = (f->mantissa > bfloat_mantissa()) - * ? j * 2 - * : j * 2 + 1; - * - * We need to subtract 1 from f->mantissa for the sign bit trick - * to work - that's done in make_bfloat() - */ - if (likely(f->exponent != 127)) - n = j * 2 + (((unsigned int) - (f->mantissa - - bfloat_mantissa(search, f))) >> 31); - else - n = (bkey_cmp(tree_to_bkey(t, j), search) > 0) - ? j * 2 - : j * 2 + 1; + if (likely(f->exponent != 127)) { + if (f->mantissa >= bfloat_mantissa(search, f)) + n = j * 2; + else + n = j * 2 + 1; + } else { + if (bkey_cmp(tree_to_bkey(t, j), search) > 0) + n = j * 2; + else + n = j * 2 + 1; + } } while (n < t->size); inorder = to_inorder(j, t); From patchwork Fri Jun 14 13:13:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995311 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4F7DD1395 for ; Fri, 14 Jun 2019 13:15:34 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3DF3F28517 for ; Fri, 14 Jun 2019 13:15:34 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 32B6A2850D; Fri, 14 Jun 2019 13:15:34 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C7F4B2860C for ; Fri, 14 Jun 2019 13:15:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728157AbfFNNPd (ORCPT ); Fri, 14 Jun 2019 09:15:33 -0400 Received: from mx2.suse.de ([195.135.220.15]:46262 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727996AbfFNNPd (ORCPT ); Fri, 14 Jun 2019 09:15:33 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 12CA2AE07; Fri, 14 Jun 2019 13:15:32 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 25/29] bcache: add pendings_cleanup to stop pending bcache device Date: Fri, 14 Jun 2019 21:13:54 +0800 Message-Id: <20190614131358.2771-26-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If a bcache device is in dirty state and its cache set is not registered, this bcache device will not appear in /dev/bcache, and there is no way to stop it or remove the bcache kernel module. This is an as-designed behavior, but sometimes people has to reboot whole system to release or stop the pending backing device. This sysfs interface may remove such pending bcache devices when write anything into the sysfs file manually. Signed-off-by: Coly Li --- drivers/md/bcache/super.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 55 insertions(+) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 026e2df358c3..16ed617183f7 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -2291,9 +2291,13 @@ static int register_cache(struct cache_sb *sb, struct page *sb_page, static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr, const char *buffer, size_t size); +static ssize_t bch_pending_bdevs_cleanup(struct kobject *k, + struct kobj_attribute *attr, + const char *buffer, size_t size); kobj_attribute_write(register, register_bcache); kobj_attribute_write(register_quiet, register_bcache); +kobj_attribute_write(pendings_cleanup, bch_pending_bdevs_cleanup); static bool bch_is_open_backing(struct block_device *bdev) { @@ -2418,6 +2422,56 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr, goto out; } + +struct pdev { + struct list_head list; + struct cached_dev *dc; +}; + +static ssize_t bch_pending_bdevs_cleanup(struct kobject *k, + struct kobj_attribute *attr, + const char *buffer, + size_t size) +{ + LIST_HEAD(pending_devs); + ssize_t ret = size; + struct cached_dev *dc, *tdc; + struct pdev *pdev, *tpdev; + struct cache_set *c, *tc; + + mutex_lock(&bch_register_lock); + list_for_each_entry_safe(dc, tdc, &uncached_devices, list) { + pdev = kmalloc(sizeof(struct pdev), GFP_KERNEL); + if (!pdev) + break; + pdev->dc = dc; + list_add(&pdev->list, &pending_devs); + } + + list_for_each_entry_safe(pdev, tpdev, &pending_devs, list) { + list_for_each_entry_safe(c, tc, &bch_cache_sets, list) { + char *pdev_set_uuid = pdev->dc->sb.set_uuid; + char *set_uuid = c->sb.uuid; + + if (!memcmp(pdev_set_uuid, set_uuid, 16)) { + list_del(&pdev->list); + kfree(pdev); + break; + } + } + } + mutex_unlock(&bch_register_lock); + + list_for_each_entry_safe(pdev, tpdev, &pending_devs, list) { + pr_info("delete pdev %p", pdev); + list_del(&pdev->list); + bcache_device_stop(&pdev->dc->disk); + kfree(pdev); + } + + return ret; +} + static int bcache_reboot(struct notifier_block *n, unsigned long code, void *x) { if (code == SYS_DOWN || @@ -2536,6 +2590,7 @@ static int __init bcache_init(void) static const struct attribute *files[] = { &ksysfs_register.attr, &ksysfs_register_quiet.attr, + &ksysfs_pendings_cleanup.attr, NULL }; From patchwork Fri Jun 14 13:13:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995313 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B179B1395 for ; Fri, 14 Jun 2019 13:15:37 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A1CB1285FB for ; Fri, 14 Jun 2019 13:15:37 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 968242860C; Fri, 14 Jun 2019 13:15:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1124D285FB for ; Fri, 14 Jun 2019 13:15:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727996AbfFNNPg (ORCPT ); Fri, 14 Jun 2019 09:15:36 -0400 Received: from mx2.suse.de ([195.135.220.15]:46276 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727918AbfFNNPg (ORCPT ); Fri, 14 Jun 2019 09:15:36 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id D442FAE07; Fri, 14 Jun 2019 13:15:34 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 26/29] bcache: avoid a deadlock in bcache_reboot() Date: Fri, 14 Jun 2019 21:13:55 +0800 Message-Id: <20190614131358.2771-27-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP It is quite frequently to observe deadlock in bcache_reboot() happens and hang the system reboot process. The reason is, in bcache_reboot() when calling bch_cache_set_stop() and bcache_device_stop() the mutex bch_register_lock is held. But in the process to stop cache set and bcache device, bch_register_lock will be acquired again. If this mutex is held here, deadlock will happen inside the stopping process. The aftermath of the deadlock is, whole system reboot gets hung. The fix is to avoid holding bch_register_lock for the following loops in bcache_reboot(), list_for_each_entry_safe(c, tc, &bch_cache_sets, list) bch_cache_set_stop(c); list_for_each_entry_safe(dc, tdc, &uncached_devices, list) bcache_device_stop(&dc->disk); A module range variable 'bcache_is_reboot' is added, it sets to true in bcache_reboot(). In register_bcache(), if bcache_is_reboot is checked to be true, reject the registration by returning -EBUSY immediately. Signed-off-by: Coly Li --- drivers/md/bcache/super.c | 40 +++++++++++++++++++++++++++++++++++++++- drivers/md/bcache/sysfs.c | 26 ++++++++++++++++++++++++++ 2 files changed, 65 insertions(+), 1 deletion(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 16ed617183f7..2928adfea743 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -40,6 +40,7 @@ static const char invalid_uuid[] = { static struct kobject *bcache_kobj; struct mutex bch_register_lock; +bool bcache_is_reboot; LIST_HEAD(bch_cache_sets); static LIST_HEAD(uncached_devices); @@ -49,6 +50,7 @@ static wait_queue_head_t unregister_wait; struct workqueue_struct *bcache_wq; struct workqueue_struct *bch_journal_wq; + #define BTREE_MAX_PAGES (256 * 1024 / PAGE_SIZE) /* limitation of partitions number on single bcache device */ #define BCACHE_MINORS 128 @@ -2345,6 +2347,11 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr, if (!try_module_get(THIS_MODULE)) return -EBUSY; + /* For latest state of bcache_is_reboot */ + smp_mb(); + if (bcache_is_reboot) + return -EBUSY; + path = kstrndup(buffer, size, GFP_KERNEL); if (!path) goto err; @@ -2474,6 +2481,9 @@ static ssize_t bch_pending_bdevs_cleanup(struct kobject *k, static int bcache_reboot(struct notifier_block *n, unsigned long code, void *x) { + if (bcache_is_reboot) + return NOTIFY_DONE; + if (code == SYS_DOWN || code == SYS_HALT || code == SYS_POWER_OFF) { @@ -2486,19 +2496,45 @@ static int bcache_reboot(struct notifier_block *n, unsigned long code, void *x) mutex_lock(&bch_register_lock); + if (bcache_is_reboot) + goto out; + + /* New registration is rejected since now */ + bcache_is_reboot = true; + /* + * Make registering caller (if there is) on other CPU + * core know bcache_is_reboot set to true earlier + */ + smp_mb(); + if (list_empty(&bch_cache_sets) && list_empty(&uncached_devices)) goto out; + mutex_unlock(&bch_register_lock); + pr_info("Stopping all devices:"); + /* + * The reason bch_register_lock is not held to call + * bch_cache_set_stop() and bcache_device_stop() is to + * avoid potential deadlock during reboot, because cache + * set or bcache device stopping process will acqurie + * bch_register_lock too. + * + * We are safe here because bcache_is_reboot sets to + * true already, register_bcache() will reject new + * registration now. bcache_is_reboot also makes sure + * bcache_reboot() won't be re-entered on by other thread, + * so there is no race in following list iteration by + * list_for_each_entry_safe(). + */ list_for_each_entry_safe(c, tc, &bch_cache_sets, list) bch_cache_set_stop(c); list_for_each_entry_safe(dc, tdc, &uncached_devices, list) bcache_device_stop(&dc->disk); - mutex_unlock(&bch_register_lock); /* * Give an early chance for other kthreads and @@ -2626,6 +2662,8 @@ static int __init bcache_init(void) bch_debug_init(); closure_debug_init(); + bcache_is_reboot = false; + return 0; err: bcache_exit(); diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c index 82a2fe8ac86b..4ab15442cab5 100644 --- a/drivers/md/bcache/sysfs.c +++ b/drivers/md/bcache/sysfs.c @@ -16,6 +16,8 @@ #include #include +extern bool bcache_is_reboot; + /* Default is 0 ("writethrough") */ static const char * const bch_cache_modes[] = { "writethrough", @@ -267,6 +269,10 @@ STORE(__cached_dev) struct cache_set *c; struct kobj_uevent_env *env; + /* no user space access if system is rebooting */ + if (bcache_is_reboot) + return -EBUSY; + #define d_strtoul(var) sysfs_strtoul(var, dc->var) #define d_strtoul_nonzero(var) sysfs_strtoul_clamp(var, dc->var, 1, INT_MAX) #define d_strtoi_h(var) sysfs_hatoi(var, dc->var) @@ -407,6 +413,10 @@ STORE(bch_cached_dev) struct cached_dev *dc = container_of(kobj, struct cached_dev, disk.kobj); + /* no user space access if system is rebooting */ + if (bcache_is_reboot) + return -EBUSY; + mutex_lock(&bch_register_lock); size = __cached_dev_store(kobj, attr, buf, size); @@ -505,6 +515,10 @@ STORE(__bch_flash_dev) kobj); struct uuid_entry *u = &d->c->uuids[d->id]; + /* no user space access if system is rebooting */ + if (bcache_is_reboot) + return -EBUSY; + sysfs_strtoul(data_csum, d->data_csum); if (attr == &sysfs_size) { @@ -740,6 +754,10 @@ STORE(__bch_cache_set) struct cache_set *c = container_of(kobj, struct cache_set, kobj); ssize_t v; + /* no user space access if system is rebooting */ + if (bcache_is_reboot) + return -EBUSY; + if (attr == &sysfs_unregister) bch_cache_set_unregister(c); @@ -859,6 +877,10 @@ STORE(bch_cache_set_internal) { struct cache_set *c = container_of(kobj, struct cache_set, internal); + /* no user space access if system is rebooting */ + if (bcache_is_reboot) + return -EBUSY; + return bch_cache_set_store(&c->kobj, attr, buf, size); } @@ -1044,6 +1066,10 @@ STORE(__bch_cache) struct cache *ca = container_of(kobj, struct cache, kobj); ssize_t v; + /* no user space access if system is rebooting */ + if (bcache_is_reboot) + return -EBUSY; + if (attr == &sysfs_discard) { bool v = strtoul_or_return(buf); From patchwork Fri Jun 14 13:13:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995315 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0CD3B1515 for ; Fri, 14 Jun 2019 13:15:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F232F2844E for ; Fri, 14 Jun 2019 13:15:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E6E2F28608; Fri, 14 Jun 2019 13:15:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8756728610 for ; Fri, 14 Jun 2019 13:15:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728194AbfFNNPj (ORCPT ); Fri, 14 Jun 2019 09:15:39 -0400 Received: from mx2.suse.de ([195.135.220.15]:46292 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727918AbfFNNPi (ORCPT ); Fri, 14 Jun 2019 09:15:38 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id C5E27AE07; Fri, 14 Jun 2019 13:15:37 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 27/29] bcache: acquire bch_register_lock later in cached_dev_detach_finish() Date: Fri, 14 Jun 2019 21:13:56 +0800 Message-Id: <20190614131358.2771-28-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Now there is variable bcache_is_reboot to prevent device register or unregister during reboot, it is unncessary to still hold mutex lock bch_regsiter_lock before stopping writeback_rate_update kworker and writeback kthread. And if the stopping kworker or kthread holding bch_register_lock inside their routine (we used to have such problem in writeback thread, thanks to Junhui Wang fixed it), it is very easy to introduce deadlock during reboot/shutdown procedure. Therefore in this patch, the location to acquire bch_register_lock is moved to the location before calling calc_cached_dev_sectors(). Which is later then original location in cached_dev_detach_finish(). Signed-off-by: Coly Li --- drivers/md/bcache/super.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 2928adfea743..15916b3ec8bf 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1018,7 +1018,6 @@ static void cached_dev_detach_finish(struct work_struct *w) BUG_ON(!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags)); BUG_ON(refcount_read(&dc->count)); - mutex_lock(&bch_register_lock); if (test_and_clear_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags)) cancel_writeback_rate_update_dwork(dc); @@ -1034,6 +1033,8 @@ static void cached_dev_detach_finish(struct work_struct *w) bch_write_bdev_super(dc, &cl); closure_sync(&cl); + mutex_lock(&bch_register_lock); + calc_cached_dev_sectors(dc->disk.c); bcache_device_detach(&dc->disk); list_move(&dc->list, &uncached_devices); From patchwork Fri Jun 14 13:13:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995317 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 448071515 for ; Fri, 14 Jun 2019 13:15:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 352792850D for ; Fri, 14 Jun 2019 13:15:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2998828515; Fri, 14 Jun 2019 13:15:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7D129285E1 for ; Fri, 14 Jun 2019 13:15:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728061AbfFNNPn (ORCPT ); Fri, 14 Jun 2019 09:15:43 -0400 Received: from mx2.suse.de ([195.135.220.15]:46318 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727918AbfFNNPm (ORCPT ); Fri, 14 Jun 2019 09:15:42 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 1FFD0AE07; Fri, 14 Jun 2019 13:15:41 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 28/29] bcache: acquire bch_register_lock later in cached_dev_free() Date: Fri, 14 Jun 2019 21:13:57 +0800 Message-Id: <20190614131358.2771-29-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When enable lockdep engine, a lockdep warning can be observed when reboot or shutdown system, [ 3142.764557][ T1] bcache: bcache_reboot() Stopping all devices: [ 3142.776265][ T2649] [ 3142.777159][ T2649] ====================================================== [ 3142.780039][ T2649] WARNING: possible circular locking dependency detected [ 3142.782869][ T2649] 5.2.0-rc4-lp151.20-default+ #1 Tainted: G W [ 3142.785684][ T2649] ------------------------------------------------------ [ 3142.788479][ T2649] kworker/3:67/2649 is trying to acquire lock: [ 3142.790738][ T2649] 00000000aaf02291 ((wq_completion)bcache_writeback_wq){+.+.}, at: flush_workqueue+0x87/0x4c0 [ 3142.794678][ T2649] [ 3142.794678][ T2649] but task is already holding lock: [ 3142.797402][ T2649] 000000004fcf89c5 (&bch_register_lock){+.+.}, at: cached_dev_free+0x17/0x120 [bcache] [ 3142.801462][ T2649] [ 3142.801462][ T2649] which lock already depends on the new lock. [ 3142.801462][ T2649] [ 3142.805277][ T2649] [ 3142.805277][ T2649] the existing dependency chain (in reverse order) is: [ 3142.808902][ T2649] [ 3142.808902][ T2649] -> #2 (&bch_register_lock){+.+.}: [ 3142.812396][ T2649] __mutex_lock+0x7a/0x9d0 [ 3142.814184][ T2649] cached_dev_free+0x17/0x120 [bcache] [ 3142.816415][ T2649] process_one_work+0x2a4/0x640 [ 3142.818413][ T2649] worker_thread+0x39/0x3f0 [ 3142.820276][ T2649] kthread+0x125/0x140 [ 3142.822061][ T2649] ret_from_fork+0x3a/0x50 [ 3142.823965][ T2649] [ 3142.823965][ T2649] -> #1 ((work_completion)(&cl->work)#2){+.+.}: [ 3142.827244][ T2649] process_one_work+0x277/0x640 [ 3142.829160][ T2649] worker_thread+0x39/0x3f0 [ 3142.830958][ T2649] kthread+0x125/0x140 [ 3142.832674][ T2649] ret_from_fork+0x3a/0x50 [ 3142.834915][ T2649] [ 3142.834915][ T2649] -> #0 ((wq_completion)bcache_writeback_wq){+.+.}: [ 3142.838121][ T2649] lock_acquire+0xb4/0x1c0 [ 3142.840025][ T2649] flush_workqueue+0xae/0x4c0 [ 3142.842035][ T2649] drain_workqueue+0xa9/0x180 [ 3142.844042][ T2649] destroy_workqueue+0x17/0x250 [ 3142.846142][ T2649] cached_dev_free+0x52/0x120 [bcache] [ 3142.848530][ T2649] process_one_work+0x2a4/0x640 [ 3142.850663][ T2649] worker_thread+0x39/0x3f0 [ 3142.852464][ T2649] kthread+0x125/0x140 [ 3142.854106][ T2649] ret_from_fork+0x3a/0x50 [ 3142.855880][ T2649] [ 3142.855880][ T2649] other info that might help us debug this: [ 3142.855880][ T2649] [ 3142.859663][ T2649] Chain exists of: [ 3142.859663][ T2649] (wq_completion)bcache_writeback_wq --> (work_completion)(&cl->work)#2 --> &bch_register_lock [ 3142.859663][ T2649] [ 3142.865424][ T2649] Possible unsafe locking scenario: [ 3142.865424][ T2649] [ 3142.868022][ T2649] CPU0 CPU1 [ 3142.869885][ T2649] ---- ---- [ 3142.871751][ T2649] lock(&bch_register_lock); [ 3142.873379][ T2649] lock((work_completion)(&cl->work)#2); [ 3142.876399][ T2649] lock(&bch_register_lock); [ 3142.879727][ T2649] lock((wq_completion)bcache_writeback_wq); [ 3142.882064][ T2649] [ 3142.882064][ T2649] *** DEADLOCK *** [ 3142.882064][ T2649] [ 3142.885060][ T2649] 3 locks held by kworker/3:67/2649: [ 3142.887245][ T2649] #0: 00000000e774cdd0 ((wq_completion)events){+.+.}, at: process_one_work+0x21e/0x640 [ 3142.890815][ T2649] #1: 00000000f7df89da ((work_completion)(&cl->work)#2){+.+.}, at: process_one_work+0x21e/0x640 [ 3142.894884][ T2649] #2: 000000004fcf89c5 (&bch_register_lock){+.+.}, at: cached_dev_free+0x17/0x120 [bcache] [ 3142.898797][ T2649] [ 3142.898797][ T2649] stack backtrace: [ 3142.900961][ T2649] CPU: 3 PID: 2649 Comm: kworker/3:67 Tainted: G W 5.2.0-rc4-lp151.20-default+ #1 [ 3142.904789][ T2649] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018 [ 3142.909168][ T2649] Workqueue: events cached_dev_free [bcache] [ 3142.911422][ T2649] Call Trace: [ 3142.912656][ T2649] dump_stack+0x85/0xcb [ 3142.914181][ T2649] print_circular_bug+0x19a/0x1f0 [ 3142.916193][ T2649] __lock_acquire+0x16cd/0x1850 [ 3142.917936][ T2649] ? __lock_acquire+0x6a8/0x1850 [ 3142.919704][ T2649] ? lock_acquire+0xb4/0x1c0 [ 3142.921335][ T2649] ? find_held_lock+0x34/0xa0 [ 3142.923052][ T2649] lock_acquire+0xb4/0x1c0 [ 3142.924635][ T2649] ? flush_workqueue+0x87/0x4c0 [ 3142.926375][ T2649] flush_workqueue+0xae/0x4c0 [ 3142.928047][ T2649] ? flush_workqueue+0x87/0x4c0 [ 3142.929824][ T2649] ? drain_workqueue+0xa9/0x180 [ 3142.931686][ T2649] drain_workqueue+0xa9/0x180 [ 3142.933534][ T2649] destroy_workqueue+0x17/0x250 [ 3142.935787][ T2649] cached_dev_free+0x52/0x120 [bcache] [ 3142.937795][ T2649] process_one_work+0x2a4/0x640 [ 3142.939803][ T2649] worker_thread+0x39/0x3f0 [ 3142.941487][ T2649] ? process_one_work+0x640/0x640 [ 3142.943389][ T2649] kthread+0x125/0x140 [ 3142.944894][ T2649] ? kthread_create_worker_on_cpu+0x70/0x70 [ 3142.947744][ T2649] ret_from_fork+0x3a/0x50 [ 3142.970358][ T2649] bcache: bcache_device_free() bcache0 stopped Here is how the deadlock happens. 1) bcache_reboot() calls bcache_device_stop(), then inside bcache_device_stop() BCACHE_DEV_CLOSING bit is set on d->flags. Then closure_queue(&d->cl) is called to invoke cached_dev_flush(). 2) In cached_dev_flush(), cached_dev_free() is called by continu_at(). 3) In cached_dev_free(), when stopping the writeback kthread of the cached device by kthread_stop(), dc->writeback_thread will be waken up to quite the kthread while-loop, then cached_dev_put() is called in bch_writeback_thread(). 4) Calling cached_dev_put() in writeback kthread may drop dc->count to 0, then dc->detach kworker is scheduled, which is initialized as cached_dev_detach_finish(). 5) Inside cached_dev_detach_finish(), the last line of code is to call closure_put(&dc->disk.cl), which drops the last reference counter of closrure dc->disk.cl, then the callback cached_dev_flush() gets called. Now cached_dev_flush() is called for second time in the code path, the first time is in step 2). And again bch_register_lock will be acquired again, and a A-A lock (lockdep terminology) is happening. The root cause of the above A-A lock is in cached_dev_free(), mutex bch_register_lock is held before stopping writeback kthread and other kworkers. Fortunately now we have variable 'bcache_is_reboot', which may prevent device registration or unregistration during reboot/shutdown time, so it is unncessary to hold bch_register_lock such early now. This is how this patch fixes the reboot/shutdown time A-A lock issue: After moving mutex_lock(&bch_register_lock) to a later location where before atomic_read(&dc->running) in cached_dev_free(), such A-A lock problem can be solved without any reboot time registration race. Signed-off-by: Coly Li --- drivers/md/bcache/super.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 15916b3ec8bf..f376ba7e4d3f 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1223,8 +1223,6 @@ static void cached_dev_free(struct closure *cl) { struct cached_dev *dc = container_of(cl, struct cached_dev, disk.cl); - mutex_lock(&bch_register_lock); - if (test_and_clear_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags)) cancel_writeback_rate_update_dwork(dc); @@ -1235,6 +1233,8 @@ static void cached_dev_free(struct closure *cl) if (!IS_ERR_OR_NULL(dc->status_update_thread)) kthread_stop(dc->status_update_thread); + mutex_lock(&bch_register_lock); + if (atomic_read(&dc->running)) bd_unlink_disk_holder(dc->bdev, dc->disk.disk); bcache_device_free(&dc->disk); From patchwork Fri Jun 14 13:13:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 10995319 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B652E1395 for ; Fri, 14 Jun 2019 13:15:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A739328610 for ; Fri, 14 Jun 2019 13:15:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9BC5E286A4; Fri, 14 Jun 2019 13:15:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E8EFB28610 for ; Fri, 14 Jun 2019 13:15:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728067AbfFNNPs (ORCPT ); Fri, 14 Jun 2019 09:15:48 -0400 Received: from mx2.suse.de ([195.135.220.15]:46378 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727918AbfFNNPs (ORCPT ); Fri, 14 Jun 2019 09:15:48 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id EE780AE07; Fri, 14 Jun 2019 13:15:44 +0000 (UTC) From: Coly Li To: linux-bcache@vger.kernel.org Cc: linux-block@vger.kernel.org, Coly Li Subject: [PATCH 29/29] bcache: fix potential deadlock in cached_def_free() Date: Fri, 14 Jun 2019 21:13:58 +0800 Message-Id: <20190614131358.2771-30-colyli@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190614131358.2771-1-colyli@suse.de> References: <20190614131358.2771-1-colyli@suse.de> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When enable lockdep and reboot system with a writeback mode bcache device, the following potential deadlock warning is reported by lockdep engine. [ 101.536569][ T401] kworker/2:2/401 is trying to acquire lock: [ 101.538575][ T401] 00000000bbf6e6c7 ((wq_completion)bcache_writeback_wq){+.+.}, at: flush_workqueue+0x87/0x4c0 [ 101.542054][ T401] [ 101.542054][ T401] but task is already holding lock: [ 101.544587][ T401] 00000000f5f305b3 ((work_completion)(&cl->work)#2){+.+.}, at: process_one_work+0x21e/0x640 [ 101.548386][ T401] [ 101.548386][ T401] which lock already depends on the new lock. [ 101.548386][ T401] [ 101.551874][ T401] [ 101.551874][ T401] the existing dependency chain (in reverse order) is: [ 101.555000][ T401] [ 101.555000][ T401] -> #1 ((work_completion)(&cl->work)#2){+.+.}: [ 101.557860][ T401] process_one_work+0x277/0x640 [ 101.559661][ T401] worker_thread+0x39/0x3f0 [ 101.561340][ T401] kthread+0x125/0x140 [ 101.562963][ T401] ret_from_fork+0x3a/0x50 [ 101.564718][ T401] [ 101.564718][ T401] -> #0 ((wq_completion)bcache_writeback_wq){+.+.}: [ 101.567701][ T401] lock_acquire+0xb4/0x1c0 [ 101.569651][ T401] flush_workqueue+0xae/0x4c0 [ 101.571494][ T401] drain_workqueue+0xa9/0x180 [ 101.573234][ T401] destroy_workqueue+0x17/0x250 [ 101.575109][ T401] cached_dev_free+0x44/0x120 [bcache] [ 101.577304][ T401] process_one_work+0x2a4/0x640 [ 101.579357][ T401] worker_thread+0x39/0x3f0 [ 101.581055][ T401] kthread+0x125/0x140 [ 101.582709][ T401] ret_from_fork+0x3a/0x50 [ 101.584592][ T401] [ 101.584592][ T401] other info that might help us debug this: [ 101.584592][ T401] [ 101.588355][ T401] Possible unsafe locking scenario: [ 101.588355][ T401] [ 101.590974][ T401] CPU0 CPU1 [ 101.592889][ T401] ---- ---- [ 101.594743][ T401] lock((work_completion)(&cl->work)#2); [ 101.596785][ T401] lock((wq_completion)bcache_writeback_wq); [ 101.600072][ T401] lock((work_completion)(&cl->work)#2); [ 101.602971][ T401] lock((wq_completion)bcache_writeback_wq); [ 101.605255][ T401] [ 101.605255][ T401] *** DEADLOCK *** [ 101.605255][ T401] [ 101.608310][ T401] 2 locks held by kworker/2:2/401: [ 101.610208][ T401] #0: 00000000cf2c7d17 ((wq_completion)events){+.+.}, at: process_one_work+0x21e/0x640 [ 101.613709][ T401] #1: 00000000f5f305b3 ((work_completion)(&cl->work)#2){+.+.}, at: process_one_work+0x21e/0x640 [ 101.617480][ T401] [ 101.617480][ T401] stack backtrace: [ 101.619539][ T401] CPU: 2 PID: 401 Comm: kworker/2:2 Tainted: G W 5.2.0-rc4-lp151.20-default+ #1 [ 101.623225][ T401] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018 [ 101.627210][ T401] Workqueue: events cached_dev_free [bcache] [ 101.629239][ T401] Call Trace: [ 101.630360][ T401] dump_stack+0x85/0xcb [ 101.631777][ T401] print_circular_bug+0x19a/0x1f0 [ 101.633485][ T401] __lock_acquire+0x16cd/0x1850 [ 101.635184][ T401] ? __lock_acquire+0x6a8/0x1850 [ 101.636863][ T401] ? lock_acquire+0xb4/0x1c0 [ 101.638421][ T401] ? find_held_lock+0x34/0xa0 [ 101.640015][ T401] lock_acquire+0xb4/0x1c0 [ 101.641513][ T401] ? flush_workqueue+0x87/0x4c0 [ 101.643248][ T401] flush_workqueue+0xae/0x4c0 [ 101.644832][ T401] ? flush_workqueue+0x87/0x4c0 [ 101.646476][ T401] ? drain_workqueue+0xa9/0x180 [ 101.648303][ T401] drain_workqueue+0xa9/0x180 [ 101.649867][ T401] destroy_workqueue+0x17/0x250 [ 101.651503][ T401] cached_dev_free+0x44/0x120 [bcache] [ 101.653328][ T401] process_one_work+0x2a4/0x640 [ 101.655029][ T401] worker_thread+0x39/0x3f0 [ 101.656693][ T401] ? process_one_work+0x640/0x640 [ 101.658501][ T401] kthread+0x125/0x140 [ 101.660012][ T401] ? kthread_create_worker_on_cpu+0x70/0x70 [ 101.661985][ T401] ret_from_fork+0x3a/0x50 [ 101.691318][ T401] bcache: bcache_device_free() bcache0 stopped Here is how the above potential deadlock may happen in reboot/shutdown code path, 1) bcache_reboot() is called firstly in the reboot/shutdown code path, then in bcache_reboot(), bcache_device_stop() is called. 2) bcache_device_stop() sets BCACHE_DEV_CLOSING on d->falgs, then call closure_queue(&d->cl) to invoke cached_dev_flush(). And in turn cached_dev_flush() calls cached_dev_free() via closure_at() 3) In cached_dev_free(), after stopped writebach kthread dc->writeback_thread, the kwork dc->writeback_write_wq is stopping by destroy_workqueue(). 4) Inside destroy_workqueue(), drain_workqueue() is called. Inside drain_workqueue(), flush_workqueue() is called. Then wq->lockdep_map is acquired by lock_map_acquire() in flush_workqueue(). After the lock acquired the rest part of flush_workqueue() just wait for the workqueue to complete. 5) Now we look back at writeback thread routine bch_writeback_thread(), in the main while-loop, write_dirty() is called via continue_at() in read_dirty_submit(), which is called via continue_at() in while-loop level called function read_dirty(). Inside write_dirty() it may be re-called on workqueeu dc->writeback_write_wq via continue_at(). It means when the writeback kthread is stopped in cached_dev_free() there might be still one kworker queued on dc->writeback_write_wq to execute write_dirty() again. 6) Now this kworker is scheduled on dc->writeback_write_wq to run by process_one_work() (which is called by worker_thread()). Before calling the kwork routine, wq->lockdep_map is acquired. 7) But wq->lockdep_map is acquired already in step 4), so a A-A lock (lockdep terminology) scenario happens. Indeed on multiple cores syatem, the above deadlock is very rare to happen, just as the code comments in process_one_work() says, 2263 * AFAICT there is no possible deadlock scenario between the 2264 * flush_work() and complete() primitives (except for single-threaded 2265 * workqueues), so hiding them isn't a problem. But it is still good to fix such lockdep warning, even no one running bcache on single core system. The fix is simple. This patch solves the above potential deadlock by, - Do not destory workqueue dc->writeback_write_wq in cached_dev_free(). - Flush and destory dc->writeback_write_wq in writebach kthread routine bch_writeback_thread(), where after quit the thread main while-loop and before cached_dev_put() is called. By this fix, dc->writeback_write_wq will be stopped and destroy before the writeback kthread stopped, so the chance for a A-A locking on wq->lockdep_map is disappeared, such A-A deadlock won't happen any more. Signed-off-by: Coly Li --- drivers/md/bcache/super.c | 2 -- drivers/md/bcache/writeback.c | 4 ++++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index f376ba7e4d3f..06b4cc0cecce 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1228,8 +1228,6 @@ static void cached_dev_free(struct closure *cl) if (!IS_ERR_OR_NULL(dc->writeback_thread)) kthread_stop(dc->writeback_thread); - if (dc->writeback_write_wq) - destroy_workqueue(dc->writeback_write_wq); if (!IS_ERR_OR_NULL(dc->status_update_thread)) kthread_stop(dc->status_update_thread); diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c index 73f0efac2b9f..df0f4e5a051a 100644 --- a/drivers/md/bcache/writeback.c +++ b/drivers/md/bcache/writeback.c @@ -735,6 +735,10 @@ static int bch_writeback_thread(void *arg) } } + if (dc->writeback_write_wq) { + flush_workqueue(dc->writeback_write_wq); + destroy_workqueue(dc->writeback_write_wq); + } cached_dev_put(dc); wait_for_kthread_stop();