[0/2] Refactor do_fault_around()

Message ID	cover.1679089214.git.lstoakes@gmail.com (mailing list archive)
Headers	show Return-Path: <owner-linux-mm@kvack.org> From: Lorenzo Stoakes <lstoakes@gmail.com> To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com>, Matthew Wilcox <willy@infradead.org>, Vlastimil Babka <vbabka@suse.cz>, Lorenzo Stoakes <lstoakes@gmail.com> Subject: [PATCH 0/2] Refactor do_fault_around() Date: Fri, 17 Mar 2023 21:58:24 +0000 Message-Id: <cover.1679089214.git.lstoakes@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: owner-linux-mm@kvack.org Precedence: bulk
Series	Refactor do_fault_around() \| expand [0/2] Refactor do_fault_around() [1/2] mm: refactor do_fault_around() [2/2] mm: pefer fault_around_pages to fault_around_bytes

Message ID

cover.1679089214.git.lstoakes@gmail.com (mailing list archive)

Headers

From: Lorenzo Stoakes <lstoakes@gmail.com>
To: linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Lorenzo Stoakes <lstoakes@gmail.com>
Subject: [PATCH 0/2] Refactor do_fault_around()
Date: Fri, 17 Mar 2023 21:58:24 +0000
Message-Id: <cover.1679089214.git.lstoakes@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Series

Refactor do_fault_around() | expand

Message

Lorenzo Stoakes March 17, 2023, 9:58 p.m. UTC

Refactor do_fault_around() to avoid bitwise tricks and arather difficult to
follow logic.  Additionally, prefer fault_around_pages to
fault_around_bytes as the operations are performed at a base page
granularity.

I have run this code against a small program I wrote which generates
significant input data and compares output with the original function to
ensure that it behaves the same as the old code across varying vmf, vma and
fault_around_pages inputs.

Lorenzo Stoakes (2):
  mm: refactor do_fault_around()
  mm: pefer fault_around_pages to fault_around_bytes

 mm/memory.c | 62 ++++++++++++++++++++++++++---------------------------
 1 file changed, 30 insertions(+), 32 deletions(-)

--
2.39.2

Comments

Andrew Morton March 17, 2023, 11:39 p.m. UTC | #1

On Fri, 17 Mar 2023 21:58:24 +0000 Lorenzo Stoakes <lstoakes@gmail.com> wrote:

> Refactor do_fault_around() to avoid bitwise tricks and arather difficult to
> follow logic.  Additionally, prefer fault_around_pages to
> fault_around_bytes as the operations are performed at a base page
> granularity.
> 
> I have run this code against a small program I wrote which generates
> significant input data and compares output with the original function to
> ensure that it behaves the same as the old code across varying vmf, vma and
> fault_around_pages inputs.

Well, what changes were you looking for in that testing? 
do_fault_around() could become a no-op and most tests wouldn't notice. 
What we'd be looking for to test these changes is performance
differences.

Perhaps one could add a tracepoint to do_fault_around()'s call to
->map_pages, check that the before-and-after traces are identical.


Or, if you're old school and lazy,

	if (!strcmp(current->comm, "name-of-my-test-program"))
		printk("name-of-my-test-program: %lu %lu\n",
			start_pgoff, end_pgoff)

then grep-n-diff the dmesg output.

Lorenzo Stoakes March 17, 2023, 11:48 p.m. UTC | #2

On Fri, Mar 17, 2023 at 04:39:36PM -0700, Andrew Morton wrote:
> On Fri, 17 Mar 2023 21:58:24 +0000 Lorenzo Stoakes <lstoakes@gmail.com> wrote:
>
> > Refactor do_fault_around() to avoid bitwise tricks and arather difficult to
> > follow logic.  Additionally, prefer fault_around_pages to
> > fault_around_bytes as the operations are performed at a base page
> > granularity.
> >
> > I have run this code against a small program I wrote which generates
> > significant input data and compares output with the original function to
> > ensure that it behaves the same as the old code across varying vmf, vma and
> > fault_around_pages inputs.
>
> Well, what changes were you looking for in that testing?

That there was no functional change between this refactoring and the existing
logic.

> do_fault_around() could become a no-op and most tests wouldn't notice.
> What we'd be looking for to test these changes is performance
> differences.
>
> Perhaps one could add a tracepoint to do_fault_around()'s call to
> ->map_pages, check that the before-and-after traces are identical.
>
>
> Or, if you're old school and lazy,
>
> 	if (!strcmp(current->comm, "name-of-my-test-program"))
> 		printk("name-of-my-test-program: %lu %lu\n",
> 			start_pgoff, end_pgoff)
>
> then grep-n-diff the dmesg output.

I am both old school and lazy, however I went so far as to literally copy/paste
the existing code and my speculative change to a userland program, generate a
whole host of random sensible input data and compare output data with this and
the original logic en masse.

This function is actually quite nice for testability as the input and output
variables are limited in scope, vmf->address, vmf->pgoff, vmf->vma->vm_start/end
+ vmf->vma->vm_pgoff and output start_pgoff, end_pgoff.

I could attach said program but it's some quite iffy C++ code that might horrify
small children and adults alike...

I am more than happy to do performance testing to see if there is any impact if
you require?

Andrew Morton March 17, 2023, 11:59 p.m. UTC | #3

On Fri, 17 Mar 2023 23:48:17 +0000 Lorenzo Stoakes <lstoakes@gmail.com> wrote:

> I went so far as to literally copy/paste
> the existing code and my speculative change to a userland program, generate a
> whole host of random sensible input data and compare output data with this and
> the original logic en masse.

Ah, great, all good, thanks.