
mm: replace is_zero_pfn with is_huge_zero_pmd for thp

Message ID 20190825200621.211494-1-yuzhao@google.com (mailing list archive)
State New, archived
Series mm: replace is_zero_pfn with is_huge_zero_pmd for thp

Commit Message

Yu Zhao Aug. 25, 2019, 8:06 p.m. UTC
For a THP mapped by a huge pmd, is_huge_zero_pmd() is the proper way to
check whether it maps the zero page.

The ptes are only filled with my_zero_pfn() when a zero thp pmd is
split, which is not the case in vm_normal_page_pmd():
pmd_trans_huge_lock() makes sure the pmd is still huge there, so
is_zero_pfn(pfn) can never match the huge zero page.

This is a trivial fix for /proc/pid/numa_maps, and AFAIK nobody has
complained about it.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 mm/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
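
For readers unfamiliar with the two helpers, here is a minimal standalone
sketch of why an is_zero_pfn()-style check can never recognize a huge zero
pmd: the 4K zero page and the huge zero page are two distinct pages with
different pfns. This is plain userspace C with made-up pfn values and
renamed helpers, not the kernel definitions (those live in
include/linux/pgtable.h and include/linux/huge_mm.h).

/* Simplified stand-ins for the two checks; the pfn values are hypothetical. */
#include <stdbool.h>
#include <stdio.h>

static unsigned long zero_pfn      = 100;	/* pfn of the 4K zero page */
static unsigned long huge_zero_pfn = 200;	/* pfn of the 2M huge zero page */

static bool my_is_zero_pfn(unsigned long pfn)
{
	return pfn == zero_pfn;			/* only matches the 4K zero page */
}

static bool my_is_huge_zero_pmd(unsigned long pmd_pfn)
{
	return pmd_pfn == huge_zero_pfn;	/* matches the huge zero page */
}

int main(void)
{
	/* A huge zero pmd: its pfn is that of the huge zero page. */
	unsigned long pfn = huge_zero_pfn;

	printf("is_zero_pfn-style check:      %d\n", my_is_zero_pfn(pfn));	/* 0, misses */
	printf("is_huge_zero_pmd-style check: %d\n", my_is_huge_zero_pmd(pfn));	/* 1, correct */
	return 0;
}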

Comments

Matthew Wilcox Aug. 26, 2019, 1:18 p.m. UTC | #1
Why did you not cc Gerald who wrote the patch?  You can't just
run get_maintainers.pl and call it good.

On Sun, Aug 25, 2019 at 02:06:21PM -0600, Yu Zhao wrote:
> For a THP mapped by a huge pmd, is_huge_zero_pmd() is the proper way to
> check whether it maps the zero page.
> 
> The ptes are only filled with my_zero_pfn() when a zero thp pmd is
> split, which is not the case in vm_normal_page_pmd():
> pmd_trans_huge_lock() makes sure the pmd is still huge there, so
> is_zero_pfn(pfn) can never match the huge zero page.
> 
> This is a trivial fix for /proc/pid/numa_maps, and AFAIK nobody has
> complained about it.
> 
> Signed-off-by: Yu Zhao <yuzhao@google.com>
> ---
>  mm/memory.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index e2bb51b6242e..ea3c74855b23 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -654,7 +654,7 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
>  
>  	if (pmd_devmap(pmd))
>  		return NULL;
> -	if (is_zero_pfn(pfn))
> +	if (is_huge_zero_pmd(pmd))
>  		return NULL;
>  	if (unlikely(pfn > highest_memmap_pfn))
>  		return NULL;
> -- 
> 2.23.0.187.g17f5b7556c-goog
>
Gerald Schaefer Aug. 26, 2019, 3:09 p.m. UTC | #2
On Mon, 26 Aug 2019 06:18:58 -0700
Matthew Wilcox <willy@infradead.org> wrote:

> Why did you not cc Gerald who wrote the patch?  You can't just
> run get_maintainers.pl and call it good.
> 
> On Sun, Aug 25, 2019 at 02:06:21PM -0600, Yu Zhao wrote:
> > For a THP mapped by a huge pmd, is_huge_zero_pmd() is the proper way to
> > check whether it maps the zero page.
> > 
> > The ptes are only filled with my_zero_pfn() when a zero thp pmd is
> > split, which is not the case in vm_normal_page_pmd():
> > pmd_trans_huge_lock() makes sure the pmd is still huge there, so
> > is_zero_pfn(pfn) can never match the huge zero page.
> > 
> > This is a trivial fix for /proc/pid/numa_maps, and AFAIK nobody has
> > complained about it.
> > 
> > Signed-off-by: Yu Zhao <yuzhao@google.com>
> > ---
> >  mm/memory.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index e2bb51b6242e..ea3c74855b23 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -654,7 +654,7 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
> >  
> >  	if (pmd_devmap(pmd))
> >  		return NULL;
> > -	if (is_zero_pfn(pfn))
> > +	if (is_huge_zero_pmd(pmd))
> >  		return NULL;
> >  	if (unlikely(pfn > highest_memmap_pfn))
> >  		return NULL;
> > -- 
> > 2.23.0.187.g17f5b7556c-goog
> >   

Looks good to me. The "_pmd" versions of can_gather_numa_stats() and
vm_normal_page() were introduced to avoid using pte_present/dirty() on
pmds; that aspect is not affected by this patch.

In fact, for vm_normal_page_pmd() I basically copied most of the code
from vm_normal_page(), including the is_zero_pfn(pfn) check, which does
look wrong to me now. Using is_huge_zero_pmd() should be correct.

Maybe the description could also mention the symptom of this bug?
I would assume that it affects anon/dirty accounting in gather_pte_stats(),
for huge mappings, if zero page mappings are not correctly recognized.

Regards,
Gerald
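
To illustrate the symptom described above, here is a rough standalone model
of the numa_maps accounting decision, with made-up names and numbers rather
than the actual fs/proc/task_mmu.c code: when the zero-page check fails to
recognize a huge zero pmd, the mapping gets charged a full huge page worth
of statistics instead of being skipped.

#include <stdbool.h>
#include <stdio.h>

#define HPAGE_NR_PAGES 512	/* 2M THP / 4K base pages on x86-64 */

struct numa_stats { unsigned long pages; unsigned long dirty; };

/* Model of the per-pmd accounting step: skip the mapping entirely when
 * the zero-page check recognizes it, otherwise charge a whole THP. */
static void account_huge_pmd(struct numa_stats *md, bool recognized_as_zero,
			     bool dirty)
{
	if (recognized_as_zero)
		return;
	md->pages += HPAGE_NR_PAGES;
	if (dirty)
		md->dirty += HPAGE_NR_PAGES;
}

int main(void)
{
	struct numa_stats before = { 0, 0 }, after = { 0, 0 };

	/* A clean mapping of the huge zero page:
	 * old is_zero_pfn() check  -> not recognized, charged anyway;
	 * new is_huge_zero_pmd()   -> recognized, skipped. */
	account_huge_pmd(&before, false, false);
	account_huge_pmd(&after, true, false);

	printf("old check:   pages=%lu\n", before.pages);	/* 512 */
	printf("fixed check: pages=%lu\n", after.pages);	/* 0 */
	return 0;
}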
Yu Zhao Sept. 4, 2019, 8:54 p.m. UTC | #3
On Mon, Aug 26, 2019 at 05:09:34PM +0200, Gerald Schaefer wrote:
> On Mon, 26 Aug 2019 06:18:58 -0700
> Matthew Wilcox <willy@infradead.org> wrote:
> 
> > Why did you not cc Gerald who wrote the patch?  You can't just
> > run get_maintainers.pl and call it good.
> > 
> > On Sun, Aug 25, 2019 at 02:06:21PM -0600, Yu Zhao wrote:
> > > For a THP mapped by a huge pmd, is_huge_zero_pmd() is the proper way to
> > > check whether it maps the zero page.
> > > 
> > > The ptes are only filled with my_zero_pfn() when a zero thp pmd is
> > > split, which is not the case in vm_normal_page_pmd():
> > > pmd_trans_huge_lock() makes sure the pmd is still huge there, so
> > > is_zero_pfn(pfn) can never match the huge zero page.
> > > 
> > > This is a trivial fix for /proc/pid/numa_maps, and AFAIK nobody has
> > > complained about it.
> > > 
> > > Signed-off-by: Yu Zhao <yuzhao@google.com>
> > > ---
> > >  mm/memory.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index e2bb51b6242e..ea3c74855b23 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -654,7 +654,7 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
> > >  
> > >  	if (pmd_devmap(pmd))
> > >  		return NULL;
> > > -	if (is_zero_pfn(pfn))
> > > +	if (is_huge_zero_pmd(pmd))
> > >  		return NULL;
> > >  	if (unlikely(pfn > highest_memmap_pfn))
> > >  		return NULL;
> > > -- 
> > > 2.23.0.187.g17f5b7556c-goog
> > >   
> 
> Looks good to me. The "_pmd" versions of can_gather_numa_stats() and
> vm_normal_page() were introduced to avoid using pte_present/dirty() on
> pmds; that aspect is not affected by this patch.
> 
> In fact, for vm_normal_page_pmd() I basically copied most of the code
> from vm_normal_page(), including the is_zero_pfn(pfn) check, which does
> look wrong to me now. Using is_huge_zero_pmd() should be correct.
> 
> Maybe the description could also mention the symptom of this bug?
> I would assume that it affects anon/dirty accounting in gather_pte_stats(),
> for huge mappings, if zero page mappings are not correctly recognized.

Hi, sorry for not copying you on the original email. I came across
this while I was looking at the code. I'm not aware of any symptom.
Thank you.

Patch

diff --git a/mm/memory.c b/mm/memory.c
index e2bb51b6242e..ea3c74855b23 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -654,7 +654,7 @@  struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
 
 	if (pmd_devmap(pmd))
 		return NULL;
-	if (is_zero_pfn(pfn))
+	if (is_huge_zero_pmd(pmd))
 		return NULL;
 	if (unlikely(pfn > highest_memmap_pfn))
 		return NULL;