From patchwork Mon Jan 8 20:21:23 2018
X-Patchwork-Submitter: Michael Lyle
X-Patchwork-Id: 10150563
From: Michael Lyle
To: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org
Cc: axboe@fb.com, Michael Lyle
Subject: [416 PATCH 06/13] bcache: writeback: properly order backing device IO
Date: Mon, 8 Jan 2018 12:21:23 -0800
Message-Id: <20180108202130.31303-7-mlyle@lyle.org>
In-Reply-To: <20180108202130.31303-1-mlyle@lyle.org>
References: <20180108202130.31303-1-mlyle@lyle.org>
List-ID: <linux-block.vger.kernel.org>

Writeback keys are presently iterated and dispatched for writeback in
order of the logical block address on the backing device.  Multiple keys
may be read from the cache device in parallel and then written back
(especially when the I/O is contiguous).  However, the existing code gave
no guarantee that the writes would be issued in LBA order, because the
reads from the cache device often complete out of order.  In turn, when
writing back quickly, the backing disk often has to seek backwards, which
slows writeback and increases utilization.

This patch introduces an ordering mechanism that guarantees that the
original dispatch order is maintained for the write half of the I/O.
Writeback performance is significantly improved when there are multiple
contiguous keys or high writeback rates.
Signed-off-by: Michael Lyle
Reviewed-by: Tang Junhui
Tested-by: Tang Junhui
---
 drivers/md/bcache/bcache.h    |  8 ++++++++
 drivers/md/bcache/writeback.c | 29 +++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 1784e50eb857..3be0fcc19b1f 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -330,6 +330,14 @@ struct cached_dev {
 
 	struct keybuf		writeback_keys;
 
+	/*
+	 * Order the write-half of writeback operations strongly in dispatch
+	 * order. (Maintain LBA order; don't allow reads completing out of
+	 * order to re-order the writes...)
+	 */
+	struct closure_waitlist writeback_ordering_wait;
+	atomic_t		writeback_sequence_next;
+
 	/* For tracking sequential IO */
 #define RECENT_IO_BITS	7
 #define RECENT_IO	(1 << RECENT_IO_BITS)
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 479095987f22..6e1d2fde43df 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -116,6 +116,7 @@ static unsigned writeback_delay(struct cached_dev *dc, unsigned sectors)
 struct dirty_io {
 	struct closure		cl;
 	struct cached_dev	*dc;
+	uint16_t		sequence;
 	struct bio		bio;
 };
 
@@ -194,6 +195,27 @@ static void write_dirty(struct closure *cl)
 {
 	struct dirty_io *io = container_of(cl, struct dirty_io, cl);
 	struct keybuf_key *w = io->bio.bi_private;
+	struct cached_dev *dc = io->dc;
+
+	uint16_t next_sequence;
+
+	if (atomic_read(&dc->writeback_sequence_next) != io->sequence) {
+		/* Not our turn to write; wait for a write to complete */
+		closure_wait(&dc->writeback_ordering_wait, cl);
+
+		if (atomic_read(&dc->writeback_sequence_next) == io->sequence) {
+			/*
+			 * Edge case-- it happened in indeterminate order
+			 * relative to when we were added to wait list..
+			 */
+			closure_wake_up(&dc->writeback_ordering_wait);
+		}
+
+		continue_at(cl, write_dirty, io->dc->writeback_write_wq);
+		return;
+	}
+
+	next_sequence = io->sequence + 1;
 
 	/*
 	 * IO errors are signalled using the dirty bit on the key.
@@ -211,6 +233,9 @@ static void write_dirty(struct closure *cl)
 		closure_bio_submit(&io->bio, cl);
 	}
 
+	atomic_set(&dc->writeback_sequence_next, next_sequence);
+	closure_wake_up(&dc->writeback_ordering_wait);
+
 	continue_at(cl, write_dirty_finish, io->dc->writeback_write_wq);
 }
 
@@ -242,7 +267,10 @@ static void read_dirty(struct cached_dev *dc)
 	int nk, i;
 	struct dirty_io *io;
 	struct closure cl;
+	uint16_t sequence = 0;
 
+	BUG_ON(!llist_empty(&dc->writeback_ordering_wait.list));
+	atomic_set(&dc->writeback_sequence_next, sequence);
 	closure_init_stack(&cl);
 
 	/*
@@ -303,6 +331,7 @@ static void read_dirty(struct cached_dev *dc)
 
 			w->private	= io;
 			io->dc		= dc;
+			io->sequence	= sequence++;
 
 			dirty_init(w);
 			bio_set_op_attrs(&io->bio, REQ_OP_READ, 0);