diff mbox

[1/2] mm: avoid spurious 'bad pmd' warning messages

Message ID 20170517171639.14501-1-ross.zwisler@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ross Zwisler May 17, 2017, 5:16 p.m. UTC
When the pmd_devmap() checks were added by:

commit 5c7fb56e5e3f ("mm, dax: dax-pmd vs thp-pmd vs hugetlbfs-pmd")

to add better support for DAX huge pages, they were all added to the end of
if() statements after existing pmd_trans_huge() checks.  So, things like:

-       if (pmd_trans_huge(*pmd))
+       if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd))

When further checks were added after pmd_trans_unstable() checks by:

commit 7267ec008b5c ("mm: postpone page table allocation until we have page
to map")

they were also added at the end of the conditional:

+       if (pmd_trans_unstable(fe->pmd) || pmd_devmap(*fe->pmd))

This ordering is fine for pmd_trans_huge(), but doesn't work for
pmd_trans_unstable().  This is because DAX huge pages trip the bad_pmd()
check inside of pmd_none_or_trans_huge_or_clear_bad() (called by
pmd_trans_unstable()), which prints out a warning and returns 1.  So, we do
end up doing the right thing, but only after spamming dmesg with suspicious
looking messages:

mm/pgtable-generic.c:39: bad pmd ffff8808daa49b88(84000001006000a5)

Reorder these checks so that pmd_devmap() is checked first, avoiding the
error messages.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Fixes: 7267ec008b5c ("mm: postpone page table allocation until we have page to map")
Cc: stable@vger.kernel.org
---
 mm/memory.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Dave Hansen May 17, 2017, 5:33 p.m. UTC | #1
On 05/17/2017 10:16 AM, Ross Zwisler wrote:
> @@ -3061,7 +3061,7 @@ static int pte_alloc_one_map(struct vm_fault *vmf)
>  	 * through an atomic read in C, which is what pmd_trans_unstable()
>  	 * provides.
>  	 */
> -	if (pmd_trans_unstable(vmf->pmd) || pmd_devmap(*vmf->pmd))
> +	if (pmd_devmap(*vmf->pmd) || pmd_trans_unstable(vmf->pmd))
>  		return VM_FAULT_NOPAGE;

I'm worried we are very unlikely to get this right in the future.  It's
totally not obvious what the ordering requirement is here.

Could we move pmd_devmap() and pmd_trans_unstable() into a helper that
gets the ordering right and also spells out the ordering requirement?
Ross Zwisler May 17, 2017, 6:23 p.m. UTC | #2
On Wed, May 17, 2017 at 10:33:58AM -0700, Dave Hansen wrote:
> On 05/17/2017 10:16 AM, Ross Zwisler wrote:
> > @@ -3061,7 +3061,7 @@ static int pte_alloc_one_map(struct vm_fault *vmf)
> >  	 * through an atomic read in C, which is what pmd_trans_unstable()
> >  	 * provides.
> >  	 */
> > -	if (pmd_trans_unstable(vmf->pmd) || pmd_devmap(*vmf->pmd))
> > +	if (pmd_devmap(*vmf->pmd) || pmd_trans_unstable(vmf->pmd))
> >  		return VM_FAULT_NOPAGE;
> 
> I'm worried we are very unlikely to get this right in the future.  It's
> totally not obvious what the ordering requirement is here.
> 
> Could we move pmd_devmap() and pmd_trans_unstable() into a helper that
> gets the ordering right and also spells out the ordering requirement?

Sure, I'll fix this for v2.

Thanks for the review.
Jan Kara May 22, 2017, 2:40 p.m. UTC | #3
On Wed 17-05-17 11:16:38, Ross Zwisler wrote:
> When the pmd_devmap() checks were added by:
> 
> commit 5c7fb56e5e3f ("mm, dax: dax-pmd vs thp-pmd vs hugetlbfs-pmd")
> 
> to add better support for DAX huge pages, they were all added to the end of
> if() statements after existing pmd_trans_huge() checks.  So, things like:
> 
> -       if (pmd_trans_huge(*pmd))
> +       if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd))
> 
> When further checks were added after pmd_trans_unstable() checks by:
> 
> commit 7267ec008b5c ("mm: postpone page table allocation until we have page
> to map")
> 
> they were also added at the end of the conditional:
> 
> +       if (pmd_trans_unstable(fe->pmd) || pmd_devmap(*fe->pmd))
> 
> This ordering is fine for pmd_trans_huge(), but doesn't work for
> pmd_trans_unstable().  This is because DAX huge pages trip the bad_pmd()
> check inside of pmd_none_or_trans_huge_or_clear_bad() (called by
> pmd_trans_unstable()), which prints out a warning and returns 1.  So, we do
> end up doing the right thing, but only after spamming dmesg with suspicious
> looking messages:
> 
> mm/pgtable-generic.c:39: bad pmd ffff8808daa49b88(84000001006000a5)
> 
> Reorder these checks so that pmd_devmap() is checked first, avoiding the
> error messages.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Fixes: commit 7267ec008b5c ("mm: postpone page table allocation until we have page to map")
> Cc: stable@vger.kernel.org

With the change requested by Dave this looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  mm/memory.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 6ff5d72..1ee269d 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3061,7 +3061,7 @@ static int pte_alloc_one_map(struct vm_fault *vmf)
>  	 * through an atomic read in C, which is what pmd_trans_unstable()
>  	 * provides.
>  	 */
> -	if (pmd_trans_unstable(vmf->pmd) || pmd_devmap(*vmf->pmd))
> +	if (pmd_devmap(*vmf->pmd) || pmd_trans_unstable(vmf->pmd))
>  		return VM_FAULT_NOPAGE;
>  
>  	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
> @@ -3690,7 +3690,7 @@ static int handle_pte_fault(struct vm_fault *vmf)
>  		vmf->pte = NULL;
>  	} else {
>  		/* See comment in pte_alloc_one_map() */
> -		if (pmd_trans_unstable(vmf->pmd) || pmd_devmap(*vmf->pmd))
> +		if (pmd_devmap(*vmf->pmd) || pmd_trans_unstable(vmf->pmd))
>  			return 0;
>  		/*
>  		 * A regular pmd is established and it can't morph into a huge
> -- 
> 2.9.4
>

Patch

diff --git a/mm/memory.c b/mm/memory.c
index 6ff5d72..1ee269d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3061,7 +3061,7 @@  static int pte_alloc_one_map(struct vm_fault *vmf)
 	 * through an atomic read in C, which is what pmd_trans_unstable()
 	 * provides.
 	 */
-	if (pmd_trans_unstable(vmf->pmd) || pmd_devmap(*vmf->pmd))
+	if (pmd_devmap(*vmf->pmd) || pmd_trans_unstable(vmf->pmd))
 		return VM_FAULT_NOPAGE;
 
 	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
@@ -3690,7 +3690,7 @@  static int handle_pte_fault(struct vm_fault *vmf)
 		vmf->pte = NULL;
 	} else {
 		/* See comment in pte_alloc_one_map() */
-		if (pmd_trans_unstable(vmf->pmd) || pmd_devmap(*vmf->pmd))
+		if (pmd_devmap(*vmf->pmd) || pmd_trans_unstable(vmf->pmd))
 			return 0;
 		/*
 		 * A regular pmd is established and it can't morph into a huge