zram: Add a huge_idle writeback mode

Message ID 20220321145037.1024083-1-bgeffon@google.com (mailing list archive)
State New, archived

Commit Message

Brian Geffon March 21, 2022, 2:50 p.m. UTC
Today it's only possible to write back as a page, idle, or huge.
A user might want to writeback pages which are huge and idle first
as these idle pages do not require decompression and make a good
first pass for writeback.

Idle writeback specifically has the advantage that a refault is
unlikely given that the page has been swapped for some amount of
time without being refaulted.

Huge writeback has the advantage that you're guaranteed to get
the maximum benefit from a single page writeback, that is, you're
reclaiming one full page of memory. Pages which are compressed in
zram being written back result in some benefit which is always
less than a page size because of the fact that it was compressed.

This change allows for users to write back huge pages which are
also idle.

Signed-off-by: Brian Geffon <bgeffon@google.com>
---
 Documentation/admin-guide/blockdev/zram.rst |  6 ++++++
 drivers/block/zram/zram_drv.c               | 10 ++++++----
 2 files changed, 12 insertions(+), 4 deletions(-)
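
The selection logic described above can be sketched in user space as follows. This is a minimal illustration under stated assumptions, not the driver code: `struct slot`, `parse_mode()` and `slot_selected()` are hypothetical names, and plain `strcmp()` stands in for the kernel's `sysfs_streq()`, so a trailing newline is not tolerated here. The mode values mirror the defines in the patch.

```c
#include <stdbool.h>
#include <string.h>

/* Mode bits as defined by the patch: huge and idle can now be combined. */
#define PAGE_WRITEBACK 0
#define HUGE_WRITEBACK (1 << 0)
#define IDLE_WRITEBACK (1 << 1)

/* Hypothetical stand-in for the per-slot ZRAM_HUGE / ZRAM_IDLE flags. */
struct slot {
	bool huge;
	bool idle;
};

/*
 * Map a mode string to a bitmask. Returns -1 for anything else; the real
 * driver goes on to parse "page_index=" in that case, which is omitted here.
 */
static int parse_mode(const char *buf)
{
	if (!strcmp(buf, "idle"))
		return IDLE_WRITEBACK;
	if (!strcmp(buf, "huge"))
		return HUGE_WRITEBACK;
	if (!strcmp(buf, "huge_idle"))
		return IDLE_WRITEBACK | HUGE_WRITEBACK;
	return -1;
}

/*
 * A slot is skipped if any requested property is missing, so "huge_idle"
 * selects only slots that are both huge and idle.
 */
static bool slot_selected(int mode, const struct slot *s)
{
	if ((mode & IDLE_WRITEBACK) && !s->idle)
		return false;
	if ((mode & HUGE_WRITEBACK) && !s->huge)
		return false;
	return true;
}
```

With these definitions, a slot that is huge but not idle is selected by "huge" and rejected by both "idle" and "huge_idle". Testing bits with `&` instead of comparing with `==` is what lets the two existing checks compose without adding a third branch.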

Comments

Minchan Kim March 22, 2022, 9:13 p.m. UTC | #1
On Mon, Mar 21, 2022 at 07:50:37AM -0700, Brian Geffon wrote:
> Today it's only possible to write back as a page, idle, or huge.
> A user might want to writeback pages which are huge and idle first
> as these idle pages do not require decompression and make a good
> first pass for writeback.
> 
> Idle writeback specifically has the advantage that a refault is
> unlikely given that the page has been swapped for some amount of
> time without being refaulted.
> 
> Huge writeback has the advantage that you're guaranteed to get
> the maximum benefit from a single page writeback, that is, you're
> reclaiming one full page of memory. Pages which are compressed in
> zram being written back result in some benefit which is always
> less than a page size because of the fact that it was compressed.
> 
> This change allows for users to write back huge pages which are
> also idle.

Hey Brian,

I really want to add your explanation about the storage endurance
because it's a real issue.

So, couldn't you add the below to the description?

From your previous reply
"
we're trying to be very sensitive to our devices storage endurance,
for this reason we will have a fairly conservative writeback limit.
Given that, we want to make sure we're maximizing what lands on disk
while still minimizing the refault time. We could take the approach
where we always writeback huge pages but then we may result in very
quick refaults which would be a huge waste of time. So idle writeback
is a must for us and being able to writeback the pages which have
maximum value (huge) would be very useful
"

> 
> Signed-off-by: Brian Geffon <bgeffon@google.com>

Other than that, feel free to add my
Acked-by: Minchan Kim <minchan@kernel.org>

Thanks.
Brian Geffon March 22, 2022, 9:52 p.m. UTC | #2
On Tue, Mar 22, 2022 at 5:13 PM Minchan Kim <minchan@kernel.org> wrote:
>
> On Mon, Mar 21, 2022 at 07:50:37AM -0700, Brian Geffon wrote:
> > Today it's only possible to write back as a page, idle, or huge.
> > A user might want to writeback pages which are huge and idle first
> > as these idle pages do not require decompression and make a good
> > first pass for writeback.
> >
> > Idle writeback specifically has the advantage that a refault is
> > unlikely given that the page has been swapped for some amount of
> > time without being refaulted.
> >
> > Huge writeback has the advantage that you're guaranteed to get
> > the maximum benefit from a single page writeback, that is, you're
> > reclaiming one full page of memory. Pages which are compressed in
> > zram being written back result in some benefit which is always
> > less than a page size because of the fact that it was compressed.
> >
> > This change allows for users to write back huge pages which are
> > also idle.
>
> Hey Brian,
>
> I really want to add your explanation about the storage endurance
> because it's a real issue.
>
> So, couldn't you add the below to the description?

Sure thing.

>
> From your previous reply
> "
> we're trying to be very sensitive to our devices storage endurance,
> for this reason we will have a fairly conservative writeback limit.
> Given that, we want to make sure we're maximizing what lands on disk
> while still minimizing the refault time. We could take the approach
> where we always writeback huge pages but then we may result in very
> quick refaults which would be a huge waste of time. So idle writeback
> is a must for us and being able to writeback the pages which have
> maximum value (huge) would be very useful
> "
>
> >
> > Signed-off-by: Brian Geffon <bgeffon@google.com>
>
> Other than that, feel free to add my
> Acked-by: Minchan Kim <minchan@kernel.org>

Thanks Minchan.

>
> Thanks.
Patch

diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
index 3e11926a4df9..af1123bfaf92 100644
--- a/Documentation/admin-guide/blockdev/zram.rst
+++ b/Documentation/admin-guide/blockdev/zram.rst
@@ -343,6 +343,12 @@  Admin can request writeback of those idle pages at right timing via::
 
 With the command, zram writeback idle pages from memory to the storage.
 
+Additionally, if a user choose to writeback only huge and idle pages
+this can be accomplished with::
+
+        echo huge_idle > /sys/block/zramX/writeback
+
+
 If admin want to write a specific page in zram device to backing device,
 they could write a page index into the interface.
 
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index cb253d80d72b..f196902ae554 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -643,8 +643,8 @@  static int read_from_bdev_async(struct zram *zram, struct bio_vec *bvec,
 #define PAGE_WB_SIG "page_index="
 
 #define PAGE_WRITEBACK 0
-#define HUGE_WRITEBACK 1
-#define IDLE_WRITEBACK 2
+#define HUGE_WRITEBACK (1<<0)
+#define IDLE_WRITEBACK (1<<1)
 
 
 static ssize_t writeback_store(struct device *dev,
@@ -664,6 +664,8 @@  static ssize_t writeback_store(struct device *dev,
 		mode = IDLE_WRITEBACK;
 	else if (sysfs_streq(buf, "huge"))
 		mode = HUGE_WRITEBACK;
+	else if (sysfs_streq(buf, "huge_idle"))
+		mode = IDLE_WRITEBACK | HUGE_WRITEBACK;
 	else {
 		if (strncmp(buf, PAGE_WB_SIG, sizeof(PAGE_WB_SIG) - 1))
 			return -EINVAL;
@@ -725,10 +727,10 @@  static ssize_t writeback_store(struct device *dev,
 				zram_test_flag(zram, index, ZRAM_UNDER_WB))
 			goto next;
 
-		if (mode == IDLE_WRITEBACK &&
+		if (mode & IDLE_WRITEBACK &&
 			  !zram_test_flag(zram, index, ZRAM_IDLE))
 			goto next;
-		if (mode == HUGE_WRITEBACK &&
+		if (mode & HUGE_WRITEBACK &&
 			  !zram_test_flag(zram, index, ZRAM_HUGE))
 			goto next;
 		/*