From patchwork Tue Oct 11 19:08:13 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Wheeler
X-Patchwork-Id: 9819145
In-Reply-To: <20170629134510.GA32385@infradead.org>
References: <20170629134510.GA32385@infradead.org>
From: Eric Wheeler
Date: Tue, 11 Oct 2016 12:08:13 -0700
Subject: [PATCH 08/19] bcache: documentation for sysfs entries describing bcache cache hinting
To: linux-block@vger.kernel.org
Message-Id: <20170629222108.013F5100131@ware.dreamhost.com>
X-Mailing-List: linux-block@vger.kernel.org

Signed-off-by: Eric Wheeler
---
 Documentation/bcache.txt | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/Documentation/bcache.txt b/Documentation/bcache.txt
index a9259b5..c78c012 100644
--- a/Documentation/bcache.txt
+++ b/Documentation/bcache.txt
@@ -133,6 +133,86 @@ the backing devices to passthrough mode.
     writeback mode). It currently doesn't do anything intelligent if it fails to
     read some of the dirty data, though.
 
+SSD LONGEVITY: PER-PROCESS CACHE HINTING WITH IO PRIORITY
+---------------------------------------------------------
+
+Processes can be assigned an IO priority using `ionice` and bcache will
+either try to writeback or bypass the cache based on the IO priority
+level assigned to the process and the configuration of the sysfs ioprio
+hints. If configured properly for your workload, this can both increase
+performance and reduce SSD wear (erase/write cycles).
+
+Having idle IOs bypass the cache can increase performance elsewhere
+since you probably don't care about their performance. In addition,
+this prevents idle IOs from promoting into (polluting) your cache and
+evicting blocks that are more important elsewhere.
+
+Default sysfs values:
+    2,7: ioprio_bypass is hinted for process IOs at-or-below best-effort-7.
+    0,0: ioprio_writeback hinting is disabled by default.
+
+Cache hinting is configured by writing 'class,level' pairs to sysfs.
+In this example, we write the following:
+
+    echo 2,7 > /sys/block/bcache0/bcache/ioprio_bypass
+    echo 2,0 > /sys/block/bcache0/bcache/ioprio_writeback
+
+Thus, processes with the following IO class (ionice -c) and level (-n)
+will then behave as shown in this table:
+
+  (-c) IO Class    (-n) Class level       Action
+  -----------------------------------------------------
+  (1) Realtime      0-7                   Writeback
+  (2) Best-effort     0                   Writeback
+  (2) Best-effort   1-6                   Normal, as if hinting were disabled
+  (2) Best-effort     7                   Bypass cache
+  (3) Idle          n/a                   Bypass cache
+
+For processes at-or-below best-effort-7 (ionice -c2 -n7), the
+ioprio_bypass behavior is as follows:
+
+* Reads will come from the backing device and will not promote into
+  (pollute) your cache. If the block being read was already in the cache,
+  then it will be read from the cache (and remain cached).
+
+* If you are using writeback mode, then low-priority bypass-hinted writes
+  will go directly to the backing device. If the block was dirty in
+  cache, it will cache-invalidate and write directly to the backing
+  device. If a high-priority task later writes the same block then it
+  will writeback so no performance is lost for write-after-write.
+
+  For read-after-bypassed-write, the block will be read from the backing
+  device (not cached) so there may be a miss penalty when a low-priority
+  process write bypasses the cache followed by a high-priority read that
+  would otherwise have hit. In practice, this is not an issue; to date,
+  no one has wanted low-priority writes and high-priority reads of the
+  same block.
+
+For processes in our example at-or-above best-effort-0 (ionice -c2 -n0),
+the ioprio_writeback behavior is as follows:
+
+* The writeback hint has no effect unless your 'cache_mode' is writeback.
+  Assuming writeback mode, all writes at this priority will writeback.
+  Of course this will increase SSD wear, so only use writeback hinting
+  if you need it.
+
+* Reads are unaffected by ioprio_writeback, except that read-after-write
+  will of course read from the cache.
+
+Linux assigns processes the best-effort class with a level of 4 if
+no IO priority is assigned. Thus, without `ionice` your processes will
+follow normal bcache should_writeback/should_bypass semantics as if the
+ioprio_writeback/ioprio_bypass sysfs flags were disabled.
+
+Also note that in order to be hinted by ioprio_writeback/ioprio_bypass,
+the process must have a valid ioprio setting as returned by
+get_task_io_context()->ioprio. Thus, a process without an IO context
+will be ignored by the ioprio_writeback/ioprio_bypass hints even if your
+sysfs hints specify that best-effort-4 should be flagged for bypass
+or writeback. If in doubt, explicitly set the process IO priority with
+`ionice`.
+
+See `man ionice` for more detail about per-process IO priority in Linux.
 HOWTO/COOKBOOK
 --------------
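
The class/level table in the patch can be checked with a small shell model.
This is a rough sketch only, not the code from this series. It assumes a
priority is compared as the packed value (class << 13) | level, which is how
Linux encodes ioprio values, that a hint of 0,0 means "disabled", and that
class 0 stands for a process with no valid ioprio. The names pack,
hint_action, BYPASS_HINT and WRITEBACK_HINT are made up for illustration.

```shell
#!/bin/sh
# Illustrative model of the hinting table; NOT the bcache implementation.
# Assumption: priorities compare as (class << 13) | level, and a hint of
# 0,0 means the hint is disabled.

BYPASS_HINT="2,7"      # value written to ioprio_bypass in the example
WRITEBACK_HINT="2,0"   # value written to ioprio_writeback in the example

# pack CLASS LEVEL -> prints the packed integer priority
pack() {
    echo $(( ($1 << 13) | $2 ))
}

# hint_action CLASS LEVEL -> prints Writeback, Bypass, or Normal
hint_action() {
    # Class 0 models "no valid ioprio": such a process is never hinted.
    if [ "$1" -eq 0 ]; then
        echo Normal
        return
    fi

    prio=$(pack "$1" "$2")
    bypass=$(pack "${BYPASS_HINT%,*}" "${BYPASS_HINT#*,}")
    writeback=$(pack "${WRITEBACK_HINT%,*}" "${WRITEBACK_HINT#*,}")

    if [ "$writeback" -ne 0 ] && [ "$prio" -le "$writeback" ]; then
        echo Writeback      # at-or-above the writeback hint
    elif [ "$bypass" -ne 0 ] && [ "$prio" -ge "$bypass" ]; then
        echo Bypass         # at-or-below the bypass hint
    else
        echo Normal         # falls between the two hints
    fi
}
```

For example, `hint_action 2 4` models a process left at the default
best-effort-4 priority, and `hint_action 3 0` models `ionice -c3` (the idle
class has no level, so 0 is passed as a placeholder).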