diff mbox series

z3fold: encode object length in the handle

Message ID 20181025112821.0924423fb9ecc7918896ec2b@gmail.com (mailing list archive)
State New, archived
Headers show
Series z3fold: encode object length in the handle | expand

Commit Message

Vitaly Wool Oct. 25, 2018, 9:28 a.m. UTC
Reclaim and free can race on an object (which is basically ok) but
in order for reclaim to be able to  map "freed" object we need to
encode object length in the handle. handle_to_chunks() is thus
introduced to extract object length from a handle and use it during
mapping of the last object we couldn't correctly map before.

Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
---
 mm/z3fold.c | 48 +++++++++++++++++++++++++++++++++++-------------
 1 file changed, 35 insertions(+), 13 deletions(-)

Comments

Andrew Morton Oct. 25, 2018, 7:42 p.m. UTC | #1
On Thu, 25 Oct 2018 11:28:21 +0200 Vitaly Wool <vitalywool@gmail.com> wrote:

> Reclaim and free can race on an object (which is basically ok) but
> in order for reclaim to be able to  map "freed" object we need to
> encode object length in the handle. handle_to_chunks() is thus
> introduced to extract object length from a handle and use it during
> mapping of the last object we couldn't correctly map before.

What are the runtime effects of this change?

Thanks.
Vitaly Wool Oct. 29, 2018, 12:27 p.m. UTC | #2
Hi Andrew,

Den tors 25 okt. 2018 kl 21:42 skrev Andrew Morton <
akpm@linux-foundation.org>:

> On Thu, 25 Oct 2018 11:28:21 +0200 Vitaly Wool <vitalywool@gmail.com>
> wrote:
>
> > Reclaim and free can race on an object (which is basically ok) but
> > in order for reclaim to be able to  map "freed" object we need to
> > encode object length in the handle. handle_to_chunks() is thus
> > introduced to extract object length from a handle and use it during
> > mapping of the last object we couldn't correctly map before.
>
> What are the runtime effects of this change?
>

I haven't observed any adverse impact with this change used in zswap (and
in fact, this is a bugfix for zswap operation). There is a slight under 1%
impact when z3fold is used with ZRAM but since the support for ZRAM over
zpool is still out of tree, I take it doesn't matter at this point, right?

Best regards,
   Vitaly
<div dir="ltr">Hi Andrew,<br><div><br><div class="gmail_quote"><div dir="ltr">Den tors 25 okt. 2018 kl 21:42 skrev Andrew Morton &lt;<a href="mailto:akpm@linux-foundation.org">akpm@linux-foundation.org</a>&gt;:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Thu, 25 Oct 2018 11:28:21 +0200 Vitaly Wool &lt;<a href="mailto:vitalywool@gmail.com" target="_blank">vitalywool@gmail.com</a>&gt; wrote:<br>
<br>
&gt; Reclaim and free can race on an object (which is basically ok) but<br>
&gt; in order for reclaim to be able to  map &quot;freed&quot; object we need to<br>
&gt; encode object length in the handle. handle_to_chunks() is thus<br>
&gt; introduced to extract object length from a handle and use it during<br>
&gt; mapping of the last object we couldn&#39;t correctly map before.<br>
<br>
What are the runtime effects of this change?<br></blockquote><div><br></div><div>I haven&#39;t observed any adverse impact with this change used in zswap (and in fact, this is a bugfix for zswap operation). There is a slight under 1% impact when z3fold is used with ZRAM but since the support for ZRAM over zpool is still out of tree, I take it doesn&#39;t matter at this point, right?</div><div><br></div><div>Best regards,</div><div>   Vitaly<br></div></div></div></div>
Andrew Morton Nov. 7, 2018, 11 p.m. UTC | #3
On Mon, 29 Oct 2018 13:27:36 +0100 Vitaly Wool <vitalywool@gmail.com> wrote:

> Hi Andrew,
> 
> Den tors 25 okt. 2018 kl 21:42 skrev Andrew Morton <
> akpm@linux-foundation.org>:
> 
> > On Thu, 25 Oct 2018 11:28:21 +0200 Vitaly Wool <vitalywool@gmail.com>
> > wrote:
> >
> > > Reclaim and free can race on an object (which is basically ok) but
> > > in order for reclaim to be able to  map "freed" object we need to
> > > encode object length in the handle. handle_to_chunks() is thus
> > > introduced to extract object length from a handle and use it during
> > > mapping of the last object we couldn't correctly map before.
> >
> > What are the runtime effects of this change?
> >
> 
> I haven't observed any adverse impact with this change used in zswap (and
> in fact, this is a bugfix for zswap operation). There is a slight under 1%
> impact when z3fold is used with ZRAM but since the support for ZRAM over
> zpool is still out of tree, I take it doesn't matter at this point, right?
> 

I mean "runtime effects", not "run time effects" ;)

Apart from wishing to document this change fully, I'm trying to
understand which kernel version(s) need the fix.  To understand that, I
need to know the effect upon end-user-visible behaviour.  You say it
fixes a bug - please describe that bug: how it is triggered, what
effect is has, etc.

Also, any suggestions as to which kernel versions we should fix is
always welcome.
diff mbox series

Patch

diff --git a/mm/z3fold.c b/mm/z3fold.c
index 4b366d181f35..86359b565d45 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -99,6 +99,7 @@  struct z3fold_header {
 #define NCHUNKS		((PAGE_SIZE - ZHDR_SIZE_ALIGNED) >> CHUNK_SHIFT)
 
 #define BUDDY_MASK	(0x3)
+#define BUDDY_SHIFT	2
 
 /**
  * struct z3fold_pool - stores metadata for each z3fold pool
@@ -223,8 +224,17 @@  static unsigned long encode_handle(struct z3fold_header *zhdr, enum buddy bud)
 	unsigned long handle;
 
 	handle = (unsigned long)zhdr;
-	if (bud != HEADLESS)
-		handle += (bud + zhdr->first_num) & BUDDY_MASK;
+	if (bud != HEADLESS) {
+		unsigned short num_chunks = zhdr->first_chunks;
+
+		if (bud == MIDDLE)
+			num_chunks = zhdr->middle_chunks;
+		if (bud == LAST)
+			num_chunks = zhdr->last_chunks;
+
+		handle |= (bud + zhdr->first_num) & BUDDY_MASK;
+		handle |= (num_chunks << BUDDY_SHIFT);
+	}
 	return handle;
 }
 
@@ -234,6 +244,11 @@  static struct z3fold_header *handle_to_z3fold_header(unsigned long handle)
 	return (struct z3fold_header *)(handle & PAGE_MASK);
 }
 
+static unsigned short handle_to_chunks(unsigned long handle)
+{
+	return (handle & ~PAGE_MASK) >> BUDDY_SHIFT;
+}
+
 /*
  * (handle & BUDDY_MASK) < zhdr->first_num is possible in encode_handle
  *  but that doesn't matter. because the masking will result in the
@@ -732,7 +747,6 @@  static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
 			break;
 		case MIDDLE:
 			zhdr->middle_chunks = 0;
-			zhdr->start_middle = 0;
 			break;
 		case LAST:
 			zhdr->last_chunks = 0;
@@ -746,11 +760,14 @@  static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
 	}
 
 	if (bud == HEADLESS) {
-		spin_lock(&pool->lock);
-		list_del(&page->lru);
-		spin_unlock(&pool->lock);
-		free_z3fold_page(page);
-		atomic64_dec(&pool->pages_nr);
+		/* if a headless page is under reclaim, just leave */
+		if (!test_bit(UNDER_RECLAIM, &page->private)) {
+			spin_lock(&pool->lock);
+			list_del(&page->lru);
+			spin_unlock(&pool->lock);
+			free_z3fold_page(page);
+			atomic64_dec(&pool->pages_nr);
+		}
 		return;
 	}
 
@@ -836,20 +853,24 @@  static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
 		}
 		list_for_each_prev(pos, &pool->lru) {
 			page = list_entry(pos, struct page, lru);
+			zhdr = page_address(page);
 			if (test_bit(PAGE_HEADLESS, &page->private))
-				/* candidate found */
 				break;
 
-			zhdr = page_address(page);
-			if (!z3fold_page_trylock(zhdr))
+			if (!z3fold_page_trylock(zhdr)) {
+				zhdr = NULL;
 				continue; /* can't evict at this point */
+			}
 			kref_get(&zhdr->refcount);
 			list_del_init(&zhdr->buddy);
 			zhdr->cpu = -1;
-			set_bit(UNDER_RECLAIM, &page->private);
 			break;
 		}
 
+		if (!zhdr)
+			break;
+
+		set_bit(UNDER_RECLAIM, &page->private);
 		list_del_init(&page->lru);
 		spin_unlock(&pool->lock);
 
@@ -898,6 +919,7 @@  static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
 		if (test_bit(PAGE_HEADLESS, &page->private)) {
 			if (ret == 0) {
 				free_z3fold_page(page);
+				atomic64_dec(&pool->pages_nr);
 				return 0;
 			}
 			spin_lock(&pool->lock);
@@ -964,7 +986,7 @@  static void *z3fold_map(struct z3fold_pool *pool, unsigned long handle)
 		set_bit(MIDDLE_CHUNK_MAPPED, &page->private);
 		break;
 	case LAST:
-		addr += PAGE_SIZE - (zhdr->last_chunks << CHUNK_SHIFT);
+		addr += PAGE_SIZE - (handle_to_chunks(handle) << CHUNK_SHIFT);
 		break;
 	default:
 		pr_err("unknown buddy id %d\n", buddy);