diff mbox series

[v7,06/15] mm/hugetlb: Disable freeing vmemmap if struct page size is not power of two

Message ID 20201130151838.11208-7-songmuchun@bytedance.com (mailing list archive)
State New, archived
Headers show
Series Free some vmemmap pages of hugetlb page | expand

Commit Message

Muchun Song Nov. 30, 2020, 3:18 p.m. UTC
We only can free the tail vmemmap pages of HugeTLB to the buddy allocator
when the size of struct page is a power of two.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 mm/hugetlb_vmemmap.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

David Hildenbrand Dec. 9, 2020, 9:57 a.m. UTC | #1
On 30.11.20 16:18, Muchun Song wrote:
> We only can free the tail vmemmap pages of HugeTLB to the buddy allocator
> when the size of struct page is a power of two.
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  mm/hugetlb_vmemmap.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index 51152e258f39..ad8fc61ea273 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -111,6 +111,11 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
>  	unsigned int nr_pages = pages_per_huge_page(h);
>  	unsigned int vmemmap_pages;
>  
> +	if (!is_power_of_2(sizeof(struct page))) {
> +		pr_info("disable freeing vmemmap pages for %s\n", h->name);

I'd just drop that pr_info(). Users are able to observe that it's
working (below), so they are able to identify that it's not working as well.

> +		return;
> +	}
> +
>  	vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
>  	/*
>  	 * The head page and the first tail page are not to be freed to buddy
> 

Please squash this patch into the enabling patch and add a comment
instead, like

/* We cannot optimize if a "struct page" crosses page boundaries. */
Muchun Song Dec. 9, 2020, 10:03 a.m. UTC | #2
On Wed, Dec 9, 2020 at 5:57 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 30.11.20 16:18, Muchun Song wrote:
> > We only can free the tail vmemmap pages of HugeTLB to the buddy allocator
> > when the size of struct page is a power of two.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > ---
> >  mm/hugetlb_vmemmap.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index 51152e258f39..ad8fc61ea273 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -111,6 +111,11 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
> >       unsigned int nr_pages = pages_per_huge_page(h);
> >       unsigned int vmemmap_pages;
> >
> > +     if (!is_power_of_2(sizeof(struct page))) {
> > +             pr_info("disable freeing vmemmap pages for %s\n", h->name);
>
> I'd just drop that pr_info(). Users are able to observe that it's
> working (below), so they are able to identify that it's not working as well.

The below is just a pr_debug. Do you suggest converting it to pr_info?

>
> > +             return;
> > +     }
> > +
> >       vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
> >       /*
> >        * The head page and the first tail page are not to be freed to buddy
> >
>
> Please squash this patch into the enabling patch and add a comment
> instead, like
>
> /* We cannot optimize if a "struct page" crosses page boundaries. */
>

Will do. Thanks.

> --
> Thanks,
>
> David / dhildenb
>
David Hildenbrand Dec. 9, 2020, 10:06 a.m. UTC | #3
On 09.12.20 11:03, Muchun Song wrote:
> On Wed, Dec 9, 2020 at 5:57 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 30.11.20 16:18, Muchun Song wrote:
>>> We only can free the tail vmemmap pages of HugeTLB to the buddy allocator
>>> when the size of struct page is a power of two.
>>>
>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>> ---
>>>  mm/hugetlb_vmemmap.c | 5 +++++
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>> index 51152e258f39..ad8fc61ea273 100644
>>> --- a/mm/hugetlb_vmemmap.c
>>> +++ b/mm/hugetlb_vmemmap.c
>>> @@ -111,6 +111,11 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
>>>       unsigned int nr_pages = pages_per_huge_page(h);
>>>       unsigned int vmemmap_pages;
>>>
>>> +     if (!is_power_of_2(sizeof(struct page))) {
>>> +             pr_info("disable freeing vmemmap pages for %s\n", h->name);
>>
>> I'd just drop that pr_info(). Users are able to observe that it's
>> working (below), so they are able to identify that it's not working as well.
> 
> The below is just a pr_debug. Do you suggest converting it to pr_info?

Good question. I wonder if users really have to know in most cases.
Maybe pr_debug() is good enough in environments where we want to debug
why stuff is not working as expected.
David Hildenbrand Dec. 9, 2020, 10:10 a.m. UTC | #4
On 09.12.20 11:06, David Hildenbrand wrote:
> On 09.12.20 11:03, Muchun Song wrote:
>> On Wed, Dec 9, 2020 at 5:57 PM David Hildenbrand <david@redhat.com> wrote:
>>>
>>> On 30.11.20 16:18, Muchun Song wrote:
>>>> We only can free the tail vmemmap pages of HugeTLB to the buddy allocator
>>>> when the size of struct page is a power of two.
>>>>
>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>>> ---
>>>>  mm/hugetlb_vmemmap.c | 5 +++++
>>>>  1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>>> index 51152e258f39..ad8fc61ea273 100644
>>>> --- a/mm/hugetlb_vmemmap.c
>>>> +++ b/mm/hugetlb_vmemmap.c
>>>> @@ -111,6 +111,11 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
>>>>       unsigned int nr_pages = pages_per_huge_page(h);
>>>>       unsigned int vmemmap_pages;
>>>>
>>>> +     if (!is_power_of_2(sizeof(struct page))) {
>>>> +             pr_info("disable freeing vmemmap pages for %s\n", h->name);
>>>
>>> I'd just drop that pr_info(). Users are able to observe that it's
>>> working (below), so they are able to identify that it's not working as well.
>>
>> The below is just a pr_debug. Do you suggest converting it to pr_info?
> 
> Good question. I wonder if users really have to know in most cases.
> Maybe pr_debug() is good enough in environments where we want to debug
> why stuff is not working as expected.
> 

Oh, another thought, can we glue availability of
HUGETLB_PAGE_FREE_VMEMMAP (or a new define based on the config and the
size of a stuct page) to the size of struct page somehow?

I mean, it's known at compile time that this will never work.
Muchun Song Dec. 9, 2020, 10:10 a.m. UTC | #5
On Wed, Dec 9, 2020 at 6:06 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 09.12.20 11:03, Muchun Song wrote:
> > On Wed, Dec 9, 2020 at 5:57 PM David Hildenbrand <david@redhat.com> wrote:
> >>
> >> On 30.11.20 16:18, Muchun Song wrote:
> >>> We only can free the tail vmemmap pages of HugeTLB to the buddy allocator
> >>> when the size of struct page is a power of two.
> >>>
> >>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> >>> ---
> >>>  mm/hugetlb_vmemmap.c | 5 +++++
> >>>  1 file changed, 5 insertions(+)
> >>>
> >>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> >>> index 51152e258f39..ad8fc61ea273 100644
> >>> --- a/mm/hugetlb_vmemmap.c
> >>> +++ b/mm/hugetlb_vmemmap.c
> >>> @@ -111,6 +111,11 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
> >>>       unsigned int nr_pages = pages_per_huge_page(h);
> >>>       unsigned int vmemmap_pages;
> >>>
> >>> +     if (!is_power_of_2(sizeof(struct page))) {
> >>> +             pr_info("disable freeing vmemmap pages for %s\n", h->name);
> >>
> >> I'd just drop that pr_info(). Users are able to observe that it's
> >> working (below), so they are able to identify that it's not working as well.
> >
> > The below is just a pr_debug. Do you suggest converting it to pr_info?
>
> Good question. I wonder if users really have to know in most cases.
> Maybe pr_debug() is good enough in environments where we want to debug
> why stuff is not working as expected.

When someone enables this feature via the boot cmdline, maybe he should
want to know whether this feature works. From this point of view, the pr_info
is necessary. Right?

>
> --
> Thanks,
>
> David / dhildenb
>
Muchun Song Dec. 9, 2020, 10:16 a.m. UTC | #6
On Wed, Dec 9, 2020 at 6:10 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 09.12.20 11:06, David Hildenbrand wrote:
> > On 09.12.20 11:03, Muchun Song wrote:
> >> On Wed, Dec 9, 2020 at 5:57 PM David Hildenbrand <david@redhat.com> wrote:
> >>>
> >>> On 30.11.20 16:18, Muchun Song wrote:
> >>>> We only can free the tail vmemmap pages of HugeTLB to the buddy allocator
> >>>> when the size of struct page is a power of two.
> >>>>
> >>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> >>>> ---
> >>>>  mm/hugetlb_vmemmap.c | 5 +++++
> >>>>  1 file changed, 5 insertions(+)
> >>>>
> >>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> >>>> index 51152e258f39..ad8fc61ea273 100644
> >>>> --- a/mm/hugetlb_vmemmap.c
> >>>> +++ b/mm/hugetlb_vmemmap.c
> >>>> @@ -111,6 +111,11 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
> >>>>       unsigned int nr_pages = pages_per_huge_page(h);
> >>>>       unsigned int vmemmap_pages;
> >>>>
> >>>> +     if (!is_power_of_2(sizeof(struct page))) {
> >>>> +             pr_info("disable freeing vmemmap pages for %s\n", h->name);
> >>>
> >>> I'd just drop that pr_info(). Users are able to observe that it's
> >>> working (below), so they are able to identify that it's not working as well.
> >>
> >> The below is just a pr_debug. Do you suggest converting it to pr_info?
> >
> > Good question. I wonder if users really have to know in most cases.
> > Maybe pr_debug() is good enough in environments where we want to debug
> > why stuff is not working as expected.
> >
>
> Oh, another thought, can we glue availability of
> HUGETLB_PAGE_FREE_VMEMMAP (or a new define based on the config and the
> size of a stuct page) to the size of struct page somehow?
>
> I mean, it's known at compile time that this will never work.

Good question. I also thought about this question. Just like the
macro SPINLOCK_SIZE does, we also can generate a new macro
to indicate the size of the struct page. :)

>
> --
> Thanks,
>
> David / dhildenb
>
Muchun Song Dec. 9, 2020, 3:13 p.m. UTC | #7
On Wed, Dec 9, 2020 at 6:10 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 09.12.20 11:06, David Hildenbrand wrote:
> > On 09.12.20 11:03, Muchun Song wrote:
> >> On Wed, Dec 9, 2020 at 5:57 PM David Hildenbrand <david@redhat.com> wrote:
> >>>
> >>> On 30.11.20 16:18, Muchun Song wrote:
> >>>> We only can free the tail vmemmap pages of HugeTLB to the buddy allocator
> >>>> when the size of struct page is a power of two.
> >>>>
> >>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> >>>> ---
> >>>>  mm/hugetlb_vmemmap.c | 5 +++++
> >>>>  1 file changed, 5 insertions(+)
> >>>>
> >>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> >>>> index 51152e258f39..ad8fc61ea273 100644
> >>>> --- a/mm/hugetlb_vmemmap.c
> >>>> +++ b/mm/hugetlb_vmemmap.c
> >>>> @@ -111,6 +111,11 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
> >>>>       unsigned int nr_pages = pages_per_huge_page(h);
> >>>>       unsigned int vmemmap_pages;
> >>>>
> >>>> +     if (!is_power_of_2(sizeof(struct page))) {
> >>>> +             pr_info("disable freeing vmemmap pages for %s\n", h->name);
> >>>
> >>> I'd just drop that pr_info(). Users are able to observe that it's
> >>> working (below), so they are able to identify that it's not working as well.
> >>
> >> The below is just a pr_debug. Do you suggest converting it to pr_info?
> >
> > Good question. I wonder if users really have to know in most cases.
> > Maybe pr_debug() is good enough in environments where we want to debug
> > why stuff is not working as expected.
> >
>
> Oh, another thought, can we glue availability of
> HUGETLB_PAGE_FREE_VMEMMAP (or a new define based on the config and the
> size of a stuct page) to the size of struct page somehow?
>
> I mean, it's known at compile time that this will never work.

I want to define a macro which indicates the size of the
struct page. There is place (kernel/bounds.c) where can
do similar things. When I added the following code in
that file.

        DEFINE(STRUCT_PAGE_SIZE, sizeof(struct page));

Then the compiler will output a message like:

       make[2]: Circular kernel/bounds.s <- include/generated/bounds.h
dependency dropped.

Then I realise that the size of the struct page also depends
on include/generated/bounds.h. But this file is not generated.

Hi David,

Do you have some idea about this?

>
> --
> Thanks,
>
> David / dhildenb
>
David Hildenbrand Dec. 9, 2020, 3:47 p.m. UTC | #8
On 09.12.20 16:13, Muchun Song wrote:
> On Wed, Dec 9, 2020 at 6:10 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 09.12.20 11:06, David Hildenbrand wrote:
>>> On 09.12.20 11:03, Muchun Song wrote:
>>>> On Wed, Dec 9, 2020 at 5:57 PM David Hildenbrand <david@redhat.com> wrote:
>>>>>
>>>>> On 30.11.20 16:18, Muchun Song wrote:
>>>>>> We only can free the tail vmemmap pages of HugeTLB to the buddy allocator
>>>>>> when the size of struct page is a power of two.
>>>>>>
>>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>>>>> ---
>>>>>>  mm/hugetlb_vmemmap.c | 5 +++++
>>>>>>  1 file changed, 5 insertions(+)
>>>>>>
>>>>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>>>>> index 51152e258f39..ad8fc61ea273 100644
>>>>>> --- a/mm/hugetlb_vmemmap.c
>>>>>> +++ b/mm/hugetlb_vmemmap.c
>>>>>> @@ -111,6 +111,11 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
>>>>>>       unsigned int nr_pages = pages_per_huge_page(h);
>>>>>>       unsigned int vmemmap_pages;
>>>>>>
>>>>>> +     if (!is_power_of_2(sizeof(struct page))) {
>>>>>> +             pr_info("disable freeing vmemmap pages for %s\n", h->name);
>>>>>
>>>>> I'd just drop that pr_info(). Users are able to observe that it's
>>>>> working (below), so they are able to identify that it's not working as well.
>>>>
>>>> The below is just a pr_debug. Do you suggest converting it to pr_info?
>>>
>>> Good question. I wonder if users really have to know in most cases.
>>> Maybe pr_debug() is good enough in environments where we want to debug
>>> why stuff is not working as expected.
>>>
>>
>> Oh, another thought, can we glue availability of
>> HUGETLB_PAGE_FREE_VMEMMAP (or a new define based on the config and the
>> size of a stuct page) to the size of struct page somehow?
>>
>> I mean, it's known at compile time that this will never work.
> 
> I want to define a macro which indicates the size of the
> struct page. There is place (kernel/bounds.c) where can
> do similar things. When I added the following code in
> that file.
> 
>         DEFINE(STRUCT_PAGE_SIZE, sizeof(struct page));
> 
> Then the compiler will output a message like:
> 

Hm, from what I understand you cannot use sizeof() in #if etc. So it
might not be possible after all. At least the compiler should optimize
code like

if (!is_power_of_2(sizeof(struct page))) {
	// either this
} else {
	// or that
}

that can never be reached
Muchun Song Dec. 9, 2020, 3:50 p.m. UTC | #9
On Wed, Dec 9, 2020 at 11:48 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 09.12.20 16:13, Muchun Song wrote:
> > On Wed, Dec 9, 2020 at 6:10 PM David Hildenbrand <david@redhat.com> wrote:
> >>
> >> On 09.12.20 11:06, David Hildenbrand wrote:
> >>> On 09.12.20 11:03, Muchun Song wrote:
> >>>> On Wed, Dec 9, 2020 at 5:57 PM David Hildenbrand <david@redhat.com> wrote:
> >>>>>
> >>>>> On 30.11.20 16:18, Muchun Song wrote:
> >>>>>> We only can free the tail vmemmap pages of HugeTLB to the buddy allocator
> >>>>>> when the size of struct page is a power of two.
> >>>>>>
> >>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> >>>>>> ---
> >>>>>>  mm/hugetlb_vmemmap.c | 5 +++++
> >>>>>>  1 file changed, 5 insertions(+)
> >>>>>>
> >>>>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> >>>>>> index 51152e258f39..ad8fc61ea273 100644
> >>>>>> --- a/mm/hugetlb_vmemmap.c
> >>>>>> +++ b/mm/hugetlb_vmemmap.c
> >>>>>> @@ -111,6 +111,11 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
> >>>>>>       unsigned int nr_pages = pages_per_huge_page(h);
> >>>>>>       unsigned int vmemmap_pages;
> >>>>>>
> >>>>>> +     if (!is_power_of_2(sizeof(struct page))) {
> >>>>>> +             pr_info("disable freeing vmemmap pages for %s\n", h->name);
> >>>>>
> >>>>> I'd just drop that pr_info(). Users are able to observe that it's
> >>>>> working (below), so they are able to identify that it's not working as well.
> >>>>
> >>>> The below is just a pr_debug. Do you suggest converting it to pr_info?
> >>>
> >>> Good question. I wonder if users really have to know in most cases.
> >>> Maybe pr_debug() is good enough in environments where we want to debug
> >>> why stuff is not working as expected.
> >>>
> >>
> >> Oh, another thought, can we glue availability of
> >> HUGETLB_PAGE_FREE_VMEMMAP (or a new define based on the config and the
> >> size of a stuct page) to the size of struct page somehow?
> >>
> >> I mean, it's known at compile time that this will never work.
> >
> > I want to define a macro which indicates the size of the
> > struct page. There is place (kernel/bounds.c) where can
> > do similar things. When I added the following code in
> > that file.
> >
> >         DEFINE(STRUCT_PAGE_SIZE, sizeof(struct page));
> >
> > Then the compiler will output a message like:
> >
>
> Hm, from what I understand you cannot use sizeof() in #if etc. So it
> might not be possible after all. At least the compiler should optimize
> code like
>
> if (!is_power_of_2(sizeof(struct page))) {
>         // either this
> } else {
>         // or that
> }
>
> that can never be reached

Got it. Thanks so much.

>
> --
> Thanks,
>
> David / dhildenb
>
diff mbox series

Patch

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 51152e258f39..ad8fc61ea273 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -111,6 +111,11 @@  void __init hugetlb_vmemmap_init(struct hstate *h)
 	unsigned int nr_pages = pages_per_huge_page(h);
 	unsigned int vmemmap_pages;
 
+	if (!is_power_of_2(sizeof(struct page))) {
+		pr_info("disable freeing vmemmap pages for %s\n", h->name);
+		return;
+	}
+
 	vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
 	/*
 	 * The head page and the first tail page are not to be freed to buddy