
[08/19] bcache: documentation for sysfs entries describing bcache cache hinting

Message ID 1498855388-16990-8-git-send-email-bcache@lists.ewheeler.net (mailing list archive)
State New, archived

Commit Message

Eric Wheeler June 30, 2017, 8:42 p.m. UTC
From: Eric Wheeler <git@linux.ewheeler.net>

Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net>
---
 Documentation/bcache.txt | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

Comments

Christoph Hellwig July 5, 2017, 6:27 p.m. UTC | #1
should go into the previous patch.

Patch

diff --git a/Documentation/bcache.txt b/Documentation/bcache.txt
index a9259b5..c78c012 100644
--- a/Documentation/bcache.txt
+++ b/Documentation/bcache.txt
@@ -133,6 +133,86 @@  the backing devices to passthrough mode.
    writeback mode). It currently doesn't do anything intelligent if it fails to
    read some of the dirty data, though.
 
+SSD LONGEVITY: PER-PROCESS CACHE HINTING WITH IO PRIORITY
+---------------------------------------------------------
+
+Processes can be assigned an IO priority using `ionice` and bcache will
+either try to writeback or bypass the cache based on the IO priority
+level assigned to the process and the configuration of the sysfs ioprio
+hints.  If configured properly for your workload, this can both increase
+performance and reduce SSD wear (erase/write cycles).
+
+Having idle IOs bypass the cache can increase performance elsewhere
+since you probably don't care about their performance.  In addition,
+this prevents idle IOs from promoting into (polluting) your cache and
+evicting blocks that are more important elsewhere.
+
+Default sysfs values:
+	2,7: ioprio_bypass is hinted for process IOs at-or-below best-effort-7.
+	0,0: ioprio_writeback hinting is disabled by default.
+
+Cache hinting is configured by writing 'class,level' pairs to sysfs.
+In this example, we write the following:
+
+    echo 2,7 > /sys/block/bcache0/bcache/ioprio_bypass
+    echo 2,0 > /sys/block/bcache0/bcache/ioprio_writeback
+
+Thus, processes with the following IO class (ionice -c) and level (-n)
+will behave as shown in this table:
+
+	(-c) IO Class    (-n) Class level       Action
+	-----------------------------------------------------
+	(1) Realtime      0-7                   Writeback
+	(2) Best-effort     0                   Writeback
+	(2) Best-effort   1-6                   Normal, as if hinting were disabled
+	(2) Best-effort     7                   Bypass cache
+	(3) Idle          n/a                   Bypass cache
+
+For processes at-or-below best-effort-7 (ionice -c2 -n7), the
+ioprio_bypass behavior is as follows:
+
+* Reads will come from the backing device and will not promote into
+  (pollute) your cache.  If the block being read was already in the cache,
+  then it will be read from the cache (and remain cached).
+
+* If you are using writeback mode, then low-priority bypass-hinted writes
+  will go directly to the backing device.  If the write was dirty in
+  cache, it will cache-invalidate and write directly to the backing
+  device.  If a high-priority task later writes the same block then it
+  will writeback so no performance is lost for write-after-write.
+
+  For read-after-bypassed-write, the block will be read from the backing
+  device (not cached) so there may be a miss penalty when a low-priority
+  process write bypasses the cache followed by a high-priority read that
+  would otherwise have hit.  In practice, this is not an issue; to date,
+  none have wanted low-priority writes and high-priority reads of the
+  same block.
+
+For processes in our example at-or-above best-effort-0 (ionice -c2 -n0),
+the ioprio_writeback behavior is as follows:
+
+* The writeback hint has no effect unless your 'cache_mode' is writeback.
+  Assuming writeback mode, all writes at this priority will writeback.
+  Of course this will increase SSD wear, so only use writeback hinting
+  if you need it.
+
+* Reads are unaffected by ioprio_writeback, except that read-after-write
+  will of course read from the cache.
+
+Linux assigns processes the best-effort class with a level of 4 if
+no IO priority is assigned.  Thus, without `ionice` your processes will
+follow normal bcache should_writeback/should_bypass semantics as if the
+ioprio_writeback/ioprio_bypass sysfs flags were disabled.
+
+Also note that in order to be hinted by ioprio_writeback/ioprio_bypass,
+the process must have a valid ioprio setting as returned by
+get_task_io_context()->ioprio. Thus, a process without an IO context
+will be ignored by the ioprio_writeback/ioprio_bypass hints even if your
+sysfs hints specify that best-effort-4 should be flagged for bypass
+or writeback.  If in doubt, explicitly set the process IO priority with
+`ionice`.
+
+See `man ionice` for more detail about per-process IO priority in Linux.
 
 HOWTO/COOKBOOK
 --------------
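
Editorial sketch (not part of the patch): the class/level table in the
documentation hunk above can be modelled with a small shell function.
This is only an illustration of the documented behaviour, not the kernel
implementation. The function names `ioprio_key` and `hint_action` are
hypothetical, and the way class and level are folded into one number is an
assumption chosen so that a larger number means a lower priority.

```shell
#!/bin/sh
# Hypothetical helper: fold a 'class,level' pair into one number.
# Assumption: a larger number means a lower IO priority.
ioprio_key() {
    echo $(( ${1%,*} * 8 + ${1#*,} ))
}

# hint_action CLASS LEVEL BYPASS WRITEBACK
# CLASS/LEVEL are the ionice -c / -n values of the process;
# BYPASS/WRITEBACK are 'class,level' strings as written to sysfs.
# Prints: bypass, writeback, or normal.
hint_action() {
    # A process with no IO class set is not hinted at all.
    if [ "$1" -eq 0 ]; then
        echo normal
        return
    fi

    proc=$(ioprio_key "$1,$2")
    bypass=$(ioprio_key "$3")
    writeback=$(ioprio_key "$4")

    # 0,0 means the hint is disabled (the documented writeback default).
    if [ "$bypass" -ne 0 ] && [ "$proc" -ge "$bypass" ]; then
        echo bypass        # at-or-below the bypass priority
    elif [ "$writeback" -ne 0 ] && [ "$proc" -le "$writeback" ]; then
        echo writeback     # at-or-above the writeback priority
    else
        echo normal        # as if hinting were disabled
    fi
}
```

For instance, `hint_action 2 7 2,7 2,0` models a process started with
`ionice -c2 -n7` under the example settings, and `hint_action 3 0 2,7 2,0`
models one started with `ionice -c3`.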