From patchwork Thu Jul 26 04:17:36 2018
X-Patchwork-Submitter: Coly Li
X-Patchwork-Id: 10545185
From: Coly Li <colyli@suse.de>
To: colyli@suse.de, axboe@kernel.dk, linux-bcache@vger.kernel.org
Cc: linux-block@vger.kernel.org, Tang Junhui
Subject: [PATCH 4/9] bcache: fix I/O significant decline while backend devices registering
Date: Thu, 26 Jul 2018 12:17:36 +0800
Message-Id: <20180726041741.1669-5-colyli@suse.de>
In-Reply-To: <20180726041741.1669-1-colyli@suse.de>
References: <20180726041741.1669-1-colyli@suse.de>

From: Tang Junhui

I attached several backing devices to the same cache set and produced
lots of dirty data by running small random writes for a long time. Then,
with I/O still running on the other cached devices, I stopped one cached
device and, after a while, registered it again. The running I/O on the
other cached devices dropped significantly, sometimes even to zero.

In the current code, bcache traverses every key and btree node to count
the registering device's dirty data while holding the btree read lock,
so writer threads cannot take the btree write lock. When the registering
device has many keys and btree nodes, this traversal can last several
seconds, during which write I/O on the other cached devices is blocked
and declines significantly.

With this patch, when a device registers to a cache set on which other
cached devices already have I/O running, we count the device's dirty
data incrementally, instead of blocking the other cached devices for the
whole traversal.

Patch v2: Rename some variables and macro names as Coly suggested.
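For readers less familiar with the btree map API, below is a simplified,
self-contained model of the incremental pattern described above. It is
only an illustration, not bcache code: keys[], other_io_inflight(),
BATCH_SIZE and dirty_init_state are stand-ins invented for the example
and do not correspond to real bcache symbols.

/*
 * Simplified model of the incremental traversal used by this patch:
 * walk in batches, and when other I/O is in flight, remember the
 * resume point, back off, and retry.
 */
#include <stdio.h>
#include <stdbool.h>

#define BATCH_SIZE 4	/* stands in for INIT_KEYS_EACH_TIME (500000) */

struct dirty_init_state {
	size_t next;	/* resume point, like op->start in the patch */
	size_t count;	/* keys visited so far, like op->count */
	size_t dirty;	/* accumulated dirty sectors */
};

/* Pretend "btree": each entry is the dirty-sector size of one key. */
static const size_t keys[] = { 8, 0, 16, 4, 0, 32, 8, 8, 0, 64 };
#define NKEYS (sizeof(keys) / sizeof(keys[0]))

/* Stand-in for atomic_read(&c->search_inflight). */
static bool other_io_inflight(void) { return true; }

/*
 * Walk keys from the saved resume point.  Returns true when done,
 * false when the walk yielded (the -EAGAIN case in the patch).
 */
static bool map_keys(struct dirty_init_state *st)
{
	while (st->next < NKEYS) {
		st->dirty += keys[st->next++];
		st->count++;
		/* Yield periodically if anyone else wants the tree. */
		if (other_io_inflight() && !(st->count % BATCH_SIZE))
			return false;
	}
	return true;
}

int main(void)
{
	struct dirty_init_state st = { 0 };

	/* Retry loop, like the do/while on -EAGAIN in the patch; the
	 * real code sleeps INIT_KEYS_SLEEP_MS between rounds. */
	while (!map_keys(&st))
		printf("yielded after %zu keys, dirty so far %zu\n",
		       st.count, st.dirty);

	printf("done: %zu dirty sectors across %zu keys\n",
	       st.dirty, st.count);
	return 0;
}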
Signed-off-by: Tang Junhui
Signed-off-by: Coly Li
---
 drivers/md/bcache/writeback.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 0d2a05074a81..912e969fedba 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -676,10 +676,14 @@ static int bch_writeback_thread(void *arg)
 }
 
 /* Init */
+#define INIT_KEYS_EACH_TIME	500000
+#define INIT_KEYS_SLEEP_MS	100
+
 struct sectors_dirty_init {
 	struct btree_op	op;
 	unsigned	inode;
+	size_t		count;
+	struct bkey	start;
 };
 
 static int sectors_dirty_init_fn(struct btree_op *_op, struct btree *b,
@@ -694,18 +698,37 @@ static int sectors_dirty_init_fn(struct btree_op *_op, struct btree *b,
 		bcache_dev_sectors_dirty_add(b->c, KEY_INODE(k),
 					     KEY_START(k), KEY_SIZE(k));
 
+	op->count++;
+	if (atomic_read(&b->c->search_inflight) &&
+	    !(op->count % INIT_KEYS_EACH_TIME)) {
+		bkey_copy_key(&op->start, k);
+		return -EAGAIN;
+	}
+
 	return MAP_CONTINUE;
 }
 
 void bch_sectors_dirty_init(struct bcache_device *d)
 {
 	struct sectors_dirty_init op;
+	int ret;
 
 	bch_btree_op_init(&op.op, -1);
 	op.inode = d->id;
-
-	bch_btree_map_keys(&op.op, d->c, &KEY(op.inode, 0, 0),
-			   sectors_dirty_init_fn, 0);
+	op.count = 0;
+	op.start = KEY(op.inode, 0, 0);
+
+	do {
+		ret = bch_btree_map_keys(&op.op, d->c, &op.start,
+					 sectors_dirty_init_fn, 0);
+		if (ret == -EAGAIN)
+			schedule_timeout_interruptible(
+				msecs_to_jiffies(INIT_KEYS_SLEEP_MS));
+		else if (ret < 0) {
+			pr_warn("sectors dirty init failed, ret=%d!", ret);
+			break;
+		}
+	} while (ret == -EAGAIN);
 }
 
 void bch_cached_dev_writeback_init(struct cached_dev *dc)
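Two details of the design are worth noting. The resume key saved by
bkey_copy_key() means each -EAGAIN round re-enters bch_btree_map_keys()
from roughly where the previous round stopped, rather than rescanning
from KEY(inode, 0, 0). And because the yield fires only when
search_inflight is nonzero, a cache set with no competing I/O still
initializes its dirty stats in one uninterrupted pass; the
INIT_KEYS_SLEEP_MS (100 ms) sleep between rounds adds latency only when
writers are actually waiting for the btree lock.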