
mm/memory.c: do_numa_page(): remove a redundant page table read

Message ID 20240228034151.459370-1-jhubbard@nvidia.com (mailing list archive)
State New
Headers show
Series mm/memory.c: do_numa_page(): remove a redundant page table read

Commit Message

John Hubbard Feb. 28, 2024, 3:41 a.m. UTC
do_numa_page() is reading from the same page table entry, twice, while
holding the page table lock: once while checking that the pte hasn't
changed, and again in order to modify the pte.

Instead, just read the pte once, and save it in the same old_pte
variable that already exists. This has no effect on behavior, other than
to provide a tiny potential improvement to performance, by avoiding the
redundant memory read (which the compiler cannot elide, due to
READ_ONCE()).

Also improve the associated comments nearby.

Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 mm/memory.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Comments

David Hildenbrand Feb. 28, 2024, 9:11 a.m. UTC | #1
On 28.02.24 04:41, John Hubbard wrote:
> do_numa_page() is reading from the same page table entry, twice, while
> holding the page table lock: once while checking that the pte hasn't
> changed, and again in order to modify the pte.
> 
> Instead, just read the pte once, and save it in the same old_pte
> variable that already exists. This has no effect on behavior, other than
> to provide a tiny potential improvement to performance, by avoiding the
> redundant memory read (which the compiler cannot elide, due to
> READ_ONCE()).
> 
> Also improve the associated comments nearby.
> 
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: David Hildenbrand <david@redhat.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>   mm/memory.c | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 0bfc8b007c01..df0711982901 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4928,18 +4928,18 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
>   	int flags = 0;
>   
>   	/*
> -	 * The "pte" at this point cannot be used safely without
> -	 * validation through pte_unmap_same(). It's of NUMA type but
> -	 * the pfn may be screwed if the read is non atomic.
> +	 * The pte cannot be used safely until we verify, while holding the page
> +	 * table lock, that its contents have not changed during fault handling.
>   	 */
>   	spin_lock(vmf->ptl);
> -	if (unlikely(!pte_same(ptep_get(vmf->pte), vmf->orig_pte))) {
> +	/* Read the live PTE from the page tables: */
> +	old_pte = ptep_get(vmf->pte);
> +
> +	if (unlikely(!pte_same(old_pte, vmf->orig_pte))) {
>   		pte_unmap_unlock(vmf->pte, vmf->ptl);
>   		goto out;
>   	}
>   
> -	/* Get the normal PTE  */
> -	old_pte = ptep_get(vmf->pte);
>   	pte = pte_modify(old_pte, vma->vm_page_prot);
>   
>   	/*

Reviewed-by: David Hildenbrand <david@redhat.com>

Ryan Roberts Feb. 29, 2024, 11:35 a.m. UTC | #2
On 28/02/2024 03:41, John Hubbard wrote:
> do_numa_page() is reading from the same page table entry, twice, while
> holding the page table lock: once while checking that the pte hasn't
> changed, and again in order to modify the pte.
> 
> Instead, just read the pte once, and save it in the same old_pte
> variable that already exists. This has no effect on behavior, other than
> to provide a tiny potential improvement to performance, by avoiding the
> redundant memory read (which the compiler cannot elide, due to
> READ_ONCE()).
> 
> Also improve the associated comments nearby.
> 
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: David Hildenbrand <david@redhat.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>

Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>

> ---
>  mm/memory.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 0bfc8b007c01..df0711982901 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4928,18 +4928,18 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
>  	int flags = 0;
>  
>  	/*
> -	 * The "pte" at this point cannot be used safely without
> -	 * validation through pte_unmap_same(). It's of NUMA type but
> -	 * the pfn may be screwed if the read is non atomic.
> +	 * The pte cannot be used safely until we verify, while holding the page
> +	 * table lock, that its contents have not changed during fault handling.
>  	 */
>  	spin_lock(vmf->ptl);
> -	if (unlikely(!pte_same(ptep_get(vmf->pte), vmf->orig_pte))) {
> +	/* Read the live PTE from the page tables: */
> +	old_pte = ptep_get(vmf->pte);
> +
> +	if (unlikely(!pte_same(old_pte, vmf->orig_pte))) {
>  		pte_unmap_unlock(vmf->pte, vmf->ptl);
>  		goto out;
>  	}
>  
> -	/* Get the normal PTE  */
> -	old_pte = ptep_get(vmf->pte);
>  	pte = pte_modify(old_pte, vma->vm_page_prot);
>  
>  	/*

Patch

diff --git a/mm/memory.c b/mm/memory.c
index 0bfc8b007c01..df0711982901 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4928,18 +4928,18 @@  static vm_fault_t do_numa_page(struct vm_fault *vmf)
 	int flags = 0;
 
 	/*
-	 * The "pte" at this point cannot be used safely without
-	 * validation through pte_unmap_same(). It's of NUMA type but
-	 * the pfn may be screwed if the read is non atomic.
+	 * The pte cannot be used safely until we verify, while holding the page
+	 * table lock, that its contents have not changed during fault handling.
 	 */
 	spin_lock(vmf->ptl);
-	if (unlikely(!pte_same(ptep_get(vmf->pte), vmf->orig_pte))) {
+	/* Read the live PTE from the page tables: */
+	old_pte = ptep_get(vmf->pte);
+
+	if (unlikely(!pte_same(old_pte, vmf->orig_pte))) {
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 		goto out;
 	}
 
-	/* Get the normal PTE  */
-	old_pte = ptep_get(vmf->pte);
 	pte = pte_modify(old_pte, vma->vm_page_prot);
 
 	/*