
[v7,4/7] mm,hugetlb: Split prep_new_huge_page functionality

Message ID 20210413104747.12177-5-osalvador@suse.de (mailing list archive)
State New, archived
Series Make alloc_contig_range handle Hugetlb pages

Commit Message

Oscar Salvador April 13, 2021, 10:47 a.m. UTC
Currently, prep_new_huge_page() performs two tasks: it sets the right
state for a new hugetlb page, and it increases the hstate's counters to
account for the new page.

Let us split its functionality into two separate functions, decoupling
the handling of the counters from the initialization of a hugetlb page.
The outcome is __prep_new_huge_page(), which only initializes the page,
and __prep_account_new_huge_page(), which adds the new page to the
hstate's counters.

This allows us to set up a hugetlb page without having to worry about
the counters or locking. It will prove useful in the next patch.
prep_new_huge_page() still calls both functions.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 mm/hugetlb.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

Comments

Michal Hocko April 13, 2021, 1:24 p.m. UTC | #1
On Tue 13-04-21 12:47:44, Oscar Salvador wrote:
[...]
> +static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
> +{
> +	__prep_new_huge_page(page);
>  	spin_lock_irq(&hugetlb_lock);
> -	h->nr_huge_pages++;
> -	h->nr_huge_pages_node[nid]++;
> +	__prep_account_new_huge_page(h, nid);
>  	spin_unlock_irq(&hugetlb_lock);
>  }

Any reason to decouple the locking from the accounting?
Michal Hocko April 13, 2021, 1:26 p.m. UTC | #2
On Tue 13-04-21 15:24:32, Michal Hocko wrote:
> On Tue 13-04-21 12:47:44, Oscar Salvador wrote:
> [...]
> > +static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
> > +{
> > +	__prep_new_huge_page(page);
> >  	spin_lock_irq(&hugetlb_lock);
> > -	h->nr_huge_pages++;
> > -	h->nr_huge_pages_node[nid]++;
> > +	__prep_account_new_huge_page(h, nid);
> >  	spin_unlock_irq(&hugetlb_lock);
> >  }
> 
> Any reason to decouple the locking from the accounting?

OK, I spoke too soon. The next patch already has the locking around when
calling this. Please add a lockdep assert into the helper to make the
locking expectations more obvious.
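Something along these lines is what I have in mind (illustrative sketch
only):

	static void __prep_account_new_huge_page(struct hstate *h, int nid)
	{
		/* Callers are expected to hold hugetlb_lock */
		lockdep_assert_held(&hugetlb_lock);
		h->nr_huge_pages++;
		h->nr_huge_pages_node[nid]++;
	}
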

With that
Acked-by: Michal Hocko <mhocko@suse.com>
Mike Kravetz April 13, 2021, 9:33 p.m. UTC | #3
On 4/13/21 3:47 AM, Oscar Salvador wrote:
> Currently, prep_new_huge_page() performs two tasks: it sets the right
> state for a new hugetlb page, and it increases the hstate's counters to
> account for the new page.
> 
> Let us split its functionality into two separate functions, decoupling
> the handling of the counters from the initialization of a hugetlb page.
> The outcome is __prep_new_huge_page(), which only initializes the page,
> and __prep_account_new_huge_page(), which adds the new page to the
> hstate's counters.
> 
> This allows us to set up a hugetlb page without having to worry about
> the counters or locking. It will prove useful in the next patch.
> prep_new_huge_page() still calls both functions.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> ---
>  mm/hugetlb.c | 19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index e40d5fe5c63c..0607b2b71ac6 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1483,7 +1483,16 @@ void free_huge_page(struct page *page)
>  	}
>  }
>  
> -static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
> +/*
> + * Must be called with the hugetlb lock held
> + */
> +static void __prep_account_new_huge_page(struct hstate *h, int nid)
> +{
> +	h->nr_huge_pages++;
> +	h->nr_huge_pages_node[nid]++;

I would prefer if we also move setting the destructor to this routine.
	set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);

That way, PageHuge() will be false until it 'really' is a huge page.
If not, we could potentially go into that retry loop in
dissolve_free_huge_page or alloc_and_dissolve_huge_page in patch 5.
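
Roughly something like the below is what I have in mind (untested
sketch; the helper would then also need to take the page):

	static void __prep_account_new_huge_page(struct hstate *h,
						 struct page *page, int nid)
	{
		/* Only now does the page become PageHuge() */
		set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
		h->nr_huge_pages++;
		h->nr_huge_pages_node[nid]++;
	}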
Oscar Salvador April 14, 2021, 4:59 a.m. UTC | #4
On Tue, Apr 13, 2021 at 02:33:41PM -0700, Mike Kravetz wrote:
> > -static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
> > +/*
> > + * Must be called with the hugetlb lock held
> > + */
> > +static void __prep_account_new_huge_page(struct hstate *h, int nid)
> > +{
> > +	h->nr_huge_pages++;
> > +	h->nr_huge_pages_node[nid]++;
> 
> I would prefer if we also move setting the destructor to this routine.
> 	set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);

Uhm, but that is the routine that does the accounting, it feels wrong
here, plus...

> 
> That way, PageHuge() will be false until it 'really' is a huge page.
> If not, we could potentially go into that retry loop in
> dissolve_free_huge_page or alloc_and_dissolve_huge_page in patch 5.

...I do not follow here, could you please elaborate some more?
Unless I am missing something, behaviour should not be any different with this
patch.

Thanks
David Hildenbrand April 14, 2021, 12:15 p.m. UTC | #5
On 14.04.21 06:59, Oscar Salvador wrote:
> On Tue, Apr 13, 2021 at 02:33:41PM -0700, Mike Kravetz wrote:
>>> -static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
>>> +/*
>>> + * Must be called with the hugetlb lock held
>>> + */
>>> +static void __prep_account_new_huge_page(struct hstate *h, int nid)
>>> +{
>>> +	h->nr_huge_pages++;
>>> +	h->nr_huge_pages_node[nid]++;
>>
>> I would prefer if we also move setting the destructor to this routine.
>> 	set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
> 
> Uhm, but that is the routine that does the accounting, it feels wrong
> here, plus...

I agree. If we want the final activation separately, it might be better 
to have it as a separate step/function like __active_new_huge_page().
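
Just to sketch the idea (naming and exact placement to be decided):

	static void __active_new_huge_page(struct page *page)
	{
		/* Final activation: only now does PageHuge() become true */
		set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
	}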
Mike Kravetz April 14, 2021, 5:03 p.m. UTC | #6
On 4/13/21 9:59 PM, Oscar Salvador wrote:
> On Tue, Apr 13, 2021 at 02:33:41PM -0700, Mike Kravetz wrote:
>>> -static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
>>> +/*
>>> + * Must be called with the hugetlb lock held
>>> + */
>>> +static void __prep_account_new_huge_page(struct hstate *h, int nid)
>>> +{
>>> +	h->nr_huge_pages++;
>>> +	h->nr_huge_pages_node[nid]++;
>>
>> I would prefer if we also move setting the destructor to this routine.
>> 	set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
> 
> Uhm, but that is the routine that does the accounting, it feels wrong
> here, plus...
> 
>>
>> That way, PageHuge() will be false until it 'really' is a huge page.
>> If not, we could potentially go into that retry loop in
>> dissolve_free_huge_page or alloc_and_dissolve_huge_page in patch 5.
> 
> ...I do not follow here, could you please elaborate some more?
> Unless I am missing something, behaviour should not be any different with this
> patch.
> 

I was thinking of the time between the call to __prep_new_huge_page and
__prep_account_new_huge_page.  In that time, PageHuge() will be true but
the page is not yet fully managed as a hugetlb page.  My thought was
that isolation, migration, offline or any code that does pfn scanning
might see the page as PageHuge() (after taking the lock) and start to
process it.

Now I see that in patch 5 you call both __prep_new_huge_page and
__prep_account_new_huge_page with the lock held.  So, no issue.  Sorry.

I 'think' there may still be a potential race with the prep_new_huge_page
routine, but that existed before any of your changes.  It may also be
that the 'synchronization' at the pageblock level, which prevents some of
these pfn scanning operations from operating on the same pageblocks,
prevents this from ever happening.

Mostly thinking out loud.  Upon further thought, I have no objections to
this change.  Sorry for the noise.

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e40d5fe5c63c..0607b2b71ac6 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1483,7 +1483,16 @@  void free_huge_page(struct page *page)
 	}
 }
 
-static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
+/*
+ * Must be called with the hugetlb lock held
+ */
+static void __prep_account_new_huge_page(struct hstate *h, int nid)
+{
+	h->nr_huge_pages++;
+	h->nr_huge_pages_node[nid]++;
+}
+
+static void __prep_new_huge_page(struct page *page)
 {
 	INIT_LIST_HEAD(&page->lru);
 	set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
@@ -1491,9 +1500,13 @@  static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
 	set_hugetlb_cgroup(page, NULL);
 	set_hugetlb_cgroup_rsvd(page, NULL);
 	ClearHPageFreed(page);
+}
+
+static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
+{
+	__prep_new_huge_page(page);
 	spin_lock_irq(&hugetlb_lock);
-	h->nr_huge_pages++;
-	h->nr_huge_pages_node[nid]++;
+	__prep_account_new_huge_page(h, nid);
 	spin_unlock_irq(&hugetlb_lock);
 }