From patchwork Fri Jun 30 20:42:57 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Wheeler
X-Patchwork-Id: 9820461
From: bcache@lists.ewheeler.net
To: linux-block@vger.kernel.org
Cc: linux-bcache@vger.kernel.org, hch@infradead.org, axboe@kernel.dk,
 Eric Wheeler, Eric Wheeler
Subject: [PATCH 08/19] bcache: documentation for sysfs entries describing bcache cache hinting
Date: Fri, 30 Jun 2017 13:42:57 -0700
Message-Id: <1498855388-16990-8-git-send-email-bcache@lists.ewheeler.net>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1498855388-16990-1-git-send-email-bcache@lists.ewheeler.net>
References: <20170629134510.GA32385@infradead.org>
 <1498855388-16990-1-git-send-email-bcache@lists.ewheeler.net>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

From: Eric Wheeler

Signed-off-by: Eric Wheeler
---
 Documentation/bcache.txt | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/Documentation/bcache.txt b/Documentation/bcache.txt
index a9259b5..c78c012 100644
--- a/Documentation/bcache.txt
+++ b/Documentation/bcache.txt
@@ -133,6 +133,86 @@ the backing devices to passthrough mode.
    writeback mode). It currently doesn't do anything intelligent if it fails to
    read some of the dirty data, though.
 
+SSD LONGEVITY: PER-PROCESS CACHE HINTING WITH IO PRIORITY
+---------------------------------------------------------
+
+Processes can be assigned an IO priority using `ionice` and bcache will
+either try to writeback or bypass the cache based on the IO priority
+level assigned to the process and the configuration of the sysfs ioprio
+hints. If configured properly for your workload, this can both increase
+performance and reduce SSD wear (erase/write cycles).
+
+Having idle IOs bypass the cache can increase performance elsewhere
+since you probably don't care about their performance.
+In addition, this prevents idle IOs from promoting into (polluting) your
+cache and evicting blocks that are more important elsewhere.
+
+Default sysfs values:
+    2,7: ioprio_bypass is hinted for process IOs at-or-below best-effort-7.
+    0,0: ioprio_writeback hinting is disabled by default.
+
+Cache hinting is configured by writing 'class,level' pairs to sysfs.
+In this example, we write the following:
+
+    echo 2,7 > /sys/block/bcache0/bcache/ioprio_bypass
+    echo 2,0 > /sys/block/bcache0/bcache/ioprio_writeback
+
+Thus, processes with the following IO class (ionice -c) and level (-n)
+will behave as shown in this table:
+
+    (-c) IO Class     (-n) Class level    Action
+    -----------------------------------------------------
+    (1) Realtime      0-7                 Writeback
+    (2) Best-effort   0                   Writeback
+    (2) Best-effort   1-6                 Normal, as if hinting were disabled
+    (2) Best-effort   7                   Bypass cache
+    (3) Idle          n/a                 Bypass cache
+
+For processes at-or-below best-effort-7 (ionice -c2 -n7), the
+ioprio_bypass behavior is as follows:
+
+* Reads will come from the backing device and will not promote into
+  (pollute) your cache. If the block being read was already in the cache,
+  then it will be read from the cache (and remain cached).
+
+* If you are using writeback mode, then low-priority bypass-hinted writes
+  will go directly to the backing device. If the block was dirty in
+  cache, it will be invalidated there and written directly to the backing
+  device. If a high-priority task later writes the same block then it
+  will writeback so no performance is lost for write-after-write.
+
+  For read-after-bypassed-write, the block will be read from the backing
+  device (not cached) so there may be a miss penalty when a low-priority
+  process write bypasses the cache followed by a high-priority read that
+  would otherwise have hit. In practice, this is not an issue; to date,
+  no one has wanted low-priority writes and high-priority reads of the
+  same block.
+
+For processes in our example at-or-above best-effort-0 (ionice -c2 -n0),
+the ioprio_writeback behavior is as follows:
+
+* The writeback hint has no effect unless your 'cache_mode' is writeback.
+  Assuming writeback mode, all writes at this priority will writeback.
+  Of course this will increase SSD wear, so only use writeback hinting
+  if you need it.
+
+* Reads are unaffected by ioprio_writeback, except that read-after-write
+  will of course read from the cache.
+
+Linux assigns processes the best-effort class with a level of 4 if
+no IO priority is assigned. Thus, without `ionice` your processes will
+follow normal bcache should_writeback/should_bypass semantics as if the
+ioprio_writeback/ioprio_bypass sysfs flags were disabled.
+
+Also note that in order to be hinted by ioprio_writeback/ioprio_bypass,
+the process must have a valid ioprio setting as returned by
+get_task_io_context()->ioprio. Thus, a process without an IO context
+will be ignored by the ioprio_writeback/ioprio_bypass hints even if your
+sysfs hints specify that best-effort-4 should be flagged for bypass
+or writeback. If in doubt, explicitly set the process IO priority with
+`ionice`.
+
+See `man ionice` for more detail about per-process IO priority in Linux.
 HOWTO/COOKBOOK
 --------------
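
Illustration (not part of the patch): the class/level table in the
documentation above can be modelled with a small shell function. This is
only a minimal sketch. It assumes that hints are compared as ordered
(class, level) pairs, that a hint with class 0 means "disabled", and that
a process with class 0 has no valid ioprio and is therefore not hinted.
The name bcache_hint_action and its argument order are hypothetical, and
the kernel's actual comparison may differ.

```shell
#!/bin/sh
# Hypothetical helper, not part of bcache: predict the action described by
# the documentation table for a given process IO class and level.
# Usage: bcache_hint_action CLASS LEVEL BYPASS_HINT WRITEBACK_HINT
#   CLASS/LEVEL are the ionice -c / -n values (use level 0 for idle),
#   the hints are 'class,level' pairs as written to sysfs.
bcache_hint_action() (
    class=$1
    level=$2
    bypass_class=${3%,*}
    bypass_level=${3#*,}
    wb_class=${4%,*}
    wb_level=${4#*,}

    # Assumption: a process without a valid ioprio (class 0) is not hinted.
    if [ "$class" -eq 0 ]; then
        echo normal
        exit 0
    fi

    # Order (class, level) pairs so that a larger rank is a lower priority.
    rank=$(( class * 8 + level ))
    bypass_rank=$(( bypass_class * 8 + bypass_level ))
    wb_rank=$(( wb_class * 8 + wb_level ))

    if [ "$bypass_class" -ne 0 ] && [ "$rank" -ge "$bypass_rank" ]; then
        echo bypass        # at-or-below the bypass hint
    elif [ "$wb_class" -ne 0 ] && [ "$rank" -le "$wb_rank" ]; then
        echo writeback     # at-or-above the writeback hint
    else
        echo normal        # as if hinting were disabled
    fi
)

# Example with the hints from the documentation (bypass 2,7, writeback 2,0):
bcache_hint_action 2 4 2,7 2,0    # prints "normal"
bcache_hint_action 3 0 2,7 2,0    # prints "bypass"
```

To put a real job into the bypass range of that example configuration,
run it in the idle class, for instance `ionice -c3 <command>`.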