[RFC] mm/thp: Update mm's MM_ANONPAGES stat in set_huge_zero_page()

Message ID 1620890438-9127-1-git-send-email-anshuman.khandual@arm.com (mailing list archive)
State New

Commit Message

Anshuman Khandual May 13, 2021, 7:20 a.m. UTC
Although the zero huge page is being shared across various processes, each
mapping needs to update its mm's MM_ANONPAGES stat by HPAGE_PMD_NR in order
to be consistent. This just updates the stats in set_huge_zero_page() after
the mapping gets created.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
Should it update MM_SHMEMPAGES instead? Applies on latest mainline.

 mm/huge_memory.c | 1 +
 1 file changed, 1 insertion(+)
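
For reference, the amount added per mapping, HPAGE_PMD_NR, is the number of base pages covered by one PMD-sized huge page. The kernel derives it as 1 << (HPAGE_PMD_SHIFT - PAGE_SHIFT). The user-space sketch below reproduces only that arithmetic; hpage_pmd_nr() is a hypothetical helper, and the shift values in the usage note are examples for two common configurations, not values taken from this patch.

```c
#include <assert.h>

/*
 * User-space sketch, not kernel code.
 * Mirrors HPAGE_PMD_NR = 1 << (HPAGE_PMD_SHIFT - PAGE_SHIFT).
 * hpage_pmd_nr() is a hypothetical name used only for illustration.
 * The caller must pass pmd_shift > page_shift.
 */
static unsigned long hpage_pmd_nr(unsigned int page_shift, unsigned int pmd_shift)
{
	return 1UL << (pmd_shift - page_shift);
}
```

With 4 KiB pages and 2 MiB PMD mappings (shifts 12 and 21), hpage_pmd_nr(12, 21) is 512, so each huge zero page mapping would add 512 to the mm's MM_ANONPAGES under this proposal.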

Comments

Zi Yan May 13, 2021, 2:12 p.m. UTC | #1
On 13 May 2021, at 3:20, Anshuman Khandual wrote:

> Although the zero huge page is being shared across various processes, each
> mapping needs to update its mm's MM_ANONPAGES stat by HPAGE_PMD_NR in order
> to be consistent. This just updates the stats in set_huge_zero_page() after
> the mapping gets created.

In addition, MM_ANONPAGES stats should be decreased at zap_huge_pmd() and
__split_huge_pmd_locked() when the zero huge page mapping is removed from
a process, right?

>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
> Should it update MM_SHMEMPAGES instead? Applies on latest mainline.

The zero huge page is added via do_huge_pmd_anonymous_page(), so I think
MM_ANONPAGES is appropriate.

>
>  mm/huge_memory.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 63ed6b25deaa..262703304807 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -706,6 +706,7 @@ static void set_huge_zero_page(pgtable_t pgtable, struct mm_struct *mm,
>  	if (pgtable)
>  		pgtable_trans_huge_deposit(mm, pmd, pgtable);
>  	set_pmd_at(mm, haddr, pmd, entry);
> +	add_mm_counter(mm, MM_ANONPAGES, HPAGE_PMD_NR);
>  	mm_inc_nr_ptes(mm);
>  }
>
> -- 
> 2.20.1


—
Best Regards,
Yan Zi
Yang Shi May 13, 2021, 4:50 p.m. UTC | #2
On Thu, May 13, 2021 at 12:20 AM Anshuman Khandual
<anshuman.khandual@arm.com> wrote:
>
> Although the zero huge page is being shared across various processes, each
> mapping needs to update its mm's MM_ANONPAGES stat by HPAGE_PMD_NR in order
> to be consistent. This just updates the stats in set_huge_zero_page() after
> the mapping gets created.

I don't get why MM_ANONPAGES needs to be inc'ed when huge zero page is
installed. This may cause inconsistency between some counters, for
example, MM_ANONPAGES may be much bigger than anon LRU.

MM_ANONPAGES should not be inc'ed unless a new page is allocated and
installed, right?

>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
> Should it update MM_SHMEMPAGES instead? Applies on latest mainline.
>
>  mm/huge_memory.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 63ed6b25deaa..262703304807 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -706,6 +706,7 @@ static void set_huge_zero_page(pgtable_t pgtable, struct mm_struct *mm,
>         if (pgtable)
>                 pgtable_trans_huge_deposit(mm, pmd, pgtable);
>         set_pmd_at(mm, haddr, pmd, entry);
> +       add_mm_counter(mm, MM_ANONPAGES, HPAGE_PMD_NR);
>         mm_inc_nr_ptes(mm);
>  }
>
> --
> 2.20.1
>
>
Yang Shi May 13, 2021, 4:59 p.m. UTC | #3
On Thu, May 13, 2021 at 9:50 AM Yang Shi <shy828301@gmail.com> wrote:
>
> On Thu, May 13, 2021 at 12:20 AM Anshuman Khandual
> <anshuman.khandual@arm.com> wrote:
> >
> > Although the zero huge page is being shared across various processes, each
> > mapping needs to update its mm's MM_ANONPAGES stat by HPAGE_PMD_NR in order
> > to be consistent. This just updates the stats in set_huge_zero_page() after
> > the mapping gets created.
>
> I don't get why MM_ANONPAGES needs to be inc'ed when huge zero page is
> installed. This may cause inconsistency between some counters, for
> example, MM_ANONPAGES may be much bigger than anon LRU.
>
> MM_ANONPAGES should not be inc'ed unless a new page is allocated and
> installed, right?

I just realized I mixed MM_ANONPAGES up with the global anon pages
counter. Take back my comment. Sorry for the confusion.
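
The distinction behind this retraction can be sketched in user space: a per-mm counter such as MM_ANONPAGES counts pages mapped by that mm, while a global counter tracks physical pages, so a shared page shows up once globally but once in every mm that maps it. The model below is an illustration only. struct mm_model, struct system_model, and both functions are hypothetical names, and HPAGE_PMD_NR_MODEL assumes 4 KiB pages with 2 MiB PMD mappings.

```c
#include <assert.h>

#define HPAGE_PMD_NR_MODEL 512L	/* assumption: 4 KiB pages, 2 MiB PMD */

struct mm_model {
	long anon_pages;	/* stands in for this mm's MM_ANONPAGES */
};

struct system_model {
	long pages_allocated;	/* stands in for a global physical-page count */
};

/* Map the shared huge zero page into one mm: nothing new is allocated. */
static void model_map_huge_zero(struct system_model *sys, struct mm_model *mm)
{
	(void)sys;		/* global count is deliberately left unchanged */
	mm->anon_pages += HPAGE_PMD_NR_MODEL;
}

/* Map it into n mms and return the sum of their per-mm counters. */
static long model_map_into_all(struct system_model *sys,
			       struct mm_model *mms, int n)
{
	long sum = 0;
	int i;

	for (i = 0; i < n; i++) {
		model_map_huge_zero(sys, &mms[i]);
		sum += mms[i].anon_pages;
	}
	return sum;
}
```

If the huge zero page was allocated once (pages_allocated == 512) and three mms map it, the per-mm counters sum to 1536 while the global count stays at 512. In this model that gap is expected rather than an inconsistency.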

>
> >
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Zi Yan <ziy@nvidia.com>
> > Cc: linux-mm@kvack.org
> > Cc: linux-kernel@vger.kernel.org
> > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> > ---
> > Should it update MM_SHMEMPAGES instead? Applies on latest mainline.
> >
> >  mm/huge_memory.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 63ed6b25deaa..262703304807 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -706,6 +706,7 @@ static void set_huge_zero_page(pgtable_t pgtable, struct mm_struct *mm,
> >         if (pgtable)
> >                 pgtable_trans_huge_deposit(mm, pmd, pgtable);
> >         set_pmd_at(mm, haddr, pmd, entry);
> > +       add_mm_counter(mm, MM_ANONPAGES, HPAGE_PMD_NR);
> >         mm_inc_nr_ptes(mm);
> >  }
> >
> > --
> > 2.20.1
> >
> >
Anshuman Khandual May 17, 2021, 3:51 a.m. UTC | #4
On 5/13/21 7:42 PM, Zi Yan wrote:
> On 13 May 2021, at 3:20, Anshuman Khandual wrote:
> 
>> Although the zero huge page is being shared across various processes, each
>> mapping needs to update its mm's MM_ANONPAGES stat by HPAGE_PMD_NR in order
>> to be consistent. This just updates the stats in set_huge_zero_page() after
>> the mapping gets created.
> 
> In addition, MM_ANONPAGES stats should be decreased at zap_huge_pmd() and

Right, would something like this work?

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 63ed6b2..776984d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1678,6 +1678,7 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
                        tlb_remove_page_size(tlb, pmd_page(orig_pmd), HPAGE_PMD_SIZE);
        } else if (is_huge_zero_pmd(orig_pmd)) {
                zap_deposited_table(tlb->mm, pmd);
+               add_mm_counter(tlb->mm, MM_ANONPAGES, -HPAGE_PMD_NR);
                spin_unlock(ptl);
                tlb_remove_page_size(tlb, pmd_page(orig_pmd), HPAGE_PMD_SIZE);
        } else {
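
Why the increment in set_huge_zero_page() needs a matching decrement on the zap path can be checked with a small user-space model. This is a sketch under stated assumptions, not kernel code: struct mm_model and the model_*() functions are hypothetical, HPAGE_PMD_NR_MODEL assumes 4 KiB pages with 2 MiB PMD mappings, and only the is_huge_zero_pmd() branch is modelled.

```c
#include <assert.h>

#define HPAGE_PMD_NR_MODEL 512L	/* assumption: 4 KiB pages, 2 MiB PMD */

struct mm_model {
	long anon_pages;	/* stands in for the mm's MM_ANONPAGES */
	long huge_zero_pmds;	/* huge zero page PMDs currently mapped */
};

/* Models set_huge_zero_page() with the proposed increment. */
static void model_set_huge_zero_page(struct mm_model *mm)
{
	mm->huge_zero_pmds++;
	mm->anon_pages += HPAGE_PMD_NR_MODEL;
}

/* Models the huge zero PMD branch of zap_huge_pmd(). */
static void model_zap_huge_zero_pmd(struct mm_model *mm, int with_decrement)
{
	mm->huge_zero_pmds--;
	if (with_decrement)
		mm->anon_pages -= HPAGE_PMD_NR_MODEL;
}

/* Counter value left after mapping and then zapping n huge zero PMDs. */
static long model_residue(int n, int with_decrement)
{
	struct mm_model mm = { 0, 0 };
	int i;

	for (i = 0; i < n; i++)
		model_set_huge_zero_page(&mm);
	for (i = 0; i < n; i++)
		model_zap_huge_zero_pmd(&mm, with_decrement);
	return mm.anon_pages;
}
```

In this model, model_residue(3, 1) returns 0, while model_residue(3, 0) leaves 1536 pages accounted to an mm that no longer maps anything.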

> __split_huge_pmd_locked() when the zero huge page mapping is removed from
> a process, right?

__split_huge_pmd_locked() calls __split_huge_zero_page_pmd(), which replaces
the zero huge page with multiple (HPAGE_PMD_NR) small zero pages. Why should
the mm's MM_ANONPAGES stat change when the mapping is still there, just in
normal pages now?
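
The argument can be put as arithmetic: one huge zero PMD covers HPAGE_PMD_NR base pages, and after the split the same range is covered by HPAGE_PMD_NR small zero page PTEs, so the number of mapped base pages is the same before and after. The user-space sketch below models only that claim. struct range_model and the model_*() functions are hypothetical, and HPAGE_PMD_NR_MODEL assumes 4 KiB pages with 2 MiB PMD mappings. How the PTE teardown path later accounts for small zero page mappings is not discussed in this thread and is not modelled here.

```c
#include <assert.h>

#define HPAGE_PMD_NR_MODEL 512L	/* assumption: 4 KiB pages, 2 MiB PMD */

struct range_model {
	int huge_zero_pmd;	/* 1 while mapped by a single huge zero PMD */
	long zero_ptes;		/* small zero page PTEs after a split */
};

/* Base pages covered by the range, whichever form the mapping takes. */
static long model_base_pages(const struct range_model *r)
{
	return r->huge_zero_pmd * HPAGE_PMD_NR_MODEL + r->zero_ptes;
}

/* Models __split_huge_zero_page_pmd(): one PMD becomes HPAGE_PMD_NR PTEs. */
static void model_split(struct range_model *r)
{
	if (!r->huge_zero_pmd)
		return;
	r->huge_zero_pmd = 0;
	r->zero_ptes = HPAGE_PMD_NR_MODEL;
}
```

Calling model_split() on a range that starts as { 1, 0 } leaves model_base_pages() at 512 both before and after, which is the sense in which the split does not change what the mm maps.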

> 
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: linux-mm@kvack.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>> Should it update MM_SHMEMPAGES instead? Applies on latest mainline.
> 
> The zero huge page is added via do_huge_pmd_anonymous_page(), so I think
> MM_ANONPAGES is appropriate.

Okay, sure.

> 
>>
>>  mm/huge_memory.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 63ed6b25deaa..262703304807 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -706,6 +706,7 @@ static void set_huge_zero_page(pgtable_t pgtable, struct mm_struct *mm,
>>  	if (pgtable)
>>  		pgtable_trans_huge_deposit(mm, pmd, pgtable);
>>  	set_pmd_at(mm, haddr, pmd, entry);
>> +	add_mm_counter(mm, MM_ANONPAGES, HPAGE_PMD_NR);
>>  	mm_inc_nr_ptes(mm);
>>  }
>>
>> -- 
>> 2.20.1
> 
> 
> —
> Best Regards,
> Yan Zi
>
Zi Yan May 17, 2021, 2:48 p.m. UTC | #5
On 16 May 2021, at 23:51, Anshuman Khandual wrote:

> On 5/13/21 7:42 PM, Zi Yan wrote:
>> On 13 May 2021, at 3:20, Anshuman Khandual wrote:
>>
>>> Although the zero huge page is being shared across various processes, each
>>> mapping needs to update its mm's MM_ANONPAGES stat by HPAGE_PMD_NR in order
>>> to be consistent. This just updates the stats in set_huge_zero_page() after
>>> the mapping gets created.
>>
>> In addition, MM_ANONPAGES stats should be decreased at zap_huge_pmd() and
>
> Right, would something like this work?
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 63ed6b2..776984d 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1678,6 +1678,7 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>                         tlb_remove_page_size(tlb, pmd_page(orig_pmd), HPAGE_PMD_SIZE);
>         } else if (is_huge_zero_pmd(orig_pmd)) {
>                 zap_deposited_table(tlb->mm, pmd);
> +               add_mm_counter(tlb->mm, MM_ANONPAGES, -HPAGE_PMD_NR);
>                 spin_unlock(ptl);
>                 tlb_remove_page_size(tlb, pmd_page(orig_pmd), HPAGE_PMD_SIZE);
>         } else {
>

LGTM.

>> __split_huge_pmd_locked() when the zero huge page mapping is removed from
>> a process, right?
>
> __split_huge_pmd_locked() calls __split_huge_zero_page_pmd(), which replaces
> the zero huge page with multiple (HPAGE_PMD_NR) small zero pages. Why should
> the mm's MM_ANONPAGES stat change when the mapping is still there, just in
> normal pages now?

Ah, you are right. I missed this part. No need to change __split_huge_pmd_locked().

>>
>>>
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: Zi Yan <ziy@nvidia.com>
>>> Cc: linux-mm@kvack.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> ---
>>> Should it update MM_SHMEMPAGES instead? Applies on latest mainline.
>>
>> The zero huge page is added via do_huge_pmd_anonymous_page(), so I think
>> MM_ANONPAGES is appropriate.
>
> Okay, sure.
>
>>
>>>
>>>  mm/huge_memory.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 63ed6b25deaa..262703304807 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -706,6 +706,7 @@ static void set_huge_zero_page(pgtable_t pgtable, struct mm_struct *mm,
>>>  	if (pgtable)
>>>  		pgtable_trans_huge_deposit(mm, pmd, pgtable);
>>>  	set_pmd_at(mm, haddr, pmd, entry);
>>> +	add_mm_counter(mm, MM_ANONPAGES, HPAGE_PMD_NR);
>>>  	mm_inc_nr_ptes(mm);
>>>  }
>>>
>>> -- 
>>> 2.20.1
>>
>>
>> —
>> Best Regards,
>> Yan Zi
>>


—
Best Regards,
Yan, Zi

Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 63ed6b25deaa..262703304807 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -706,6 +706,7 @@ static void set_huge_zero_page(pgtable_t pgtable, struct mm_struct *mm,
 	if (pgtable)
 		pgtable_trans_huge_deposit(mm, pmd, pgtable);
 	set_pmd_at(mm, haddr, pmd, entry);
+	add_mm_counter(mm, MM_ANONPAGES, HPAGE_PMD_NR);
 	mm_inc_nr_ptes(mm);
 }