diff mbox series

[v13,02/12] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP

Message ID 20210117151053.24600-3-songmuchun@bytedance.com (mailing list archive)
State New, archived
Series Free some vmemmap pages of HugeTLB page

Commit Message

Muchun Song Jan. 17, 2021, 3:10 p.m. UTC
The HUGETLB_PAGE_FREE_VMEMMAP option is used to enable the freeing
of unnecessary vmemmap associated with HugeTLB pages. The config
option is introduced early so that supporting code can be written
to depend on the option. The initial version of the code only
provides support for x86-64.

Like other code which frees vmemmap, this config option depends on
HAVE_BOOTMEM_INFO_NODE. The routine register_page_bootmem_info() is
used to register bootmem info. Therefore, make sure
register_page_bootmem_info is enabled if HUGETLB_PAGE_FREE_VMEMMAP
is defined.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/x86/mm/init_64.c |  2 +-
 fs/Kconfig            | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)
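
To illustrate the guard change described above, here is a minimal user-space sketch, not the kernel code itself. It assumes a configuration where only CONFIG_HUGETLB_PAGE_FREE_VMEMMAP is set (simulated with a #define), and it uses hypothetical stand-ins (NR_FAKE_NODES, fake_register_node, register_page_bootmem_info_sketch) in place of the kernel's node iteration and registration helpers.

```c
#include <assert.h>

/* Assumption: simulate a .config that selects only the new option. */
#define CONFIG_HUGETLB_PAGE_FREE_VMEMMAP 1

/* Hypothetical stand-in for the set of online nodes. */
#define NR_FAKE_NODES 2

/* Counts how many nodes were "registered" by the sketch. */
static int registered_nodes;

/* Hypothetical stand-in for the per-node bootmem registration. */
static void fake_register_node(int nid)
{
	assert(nid >= 0 && nid < NR_FAKE_NODES);
	registered_nodes++;
}

/*
 * Same guard shape as the patched function: the body is compiled in
 * when either CONFIG_NUMA or CONFIG_HUGETLB_PAGE_FREE_VMEMMAP is defined.
 */
static void register_page_bootmem_info_sketch(void)
{
#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)
	int i;

	for (i = 0; i < NR_FAKE_NODES; i++)
		fake_register_node(i);
#endif
}
```

With the old `#ifdef CONFIG_NUMA` guard and CONFIG_NUMA unset, the function body would compile to nothing and no node would be registered, which is the case the patch is meant to cover.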

Comments

David Rientjes Jan. 24, 2021, 11:58 p.m. UTC | #1
On Sun, 17 Jan 2021, Muchun Song wrote:

> The HUGETLB_PAGE_FREE_VMEMMAP option is used to enable the freeing
> of unnecessary vmemmap associated with HugeTLB pages. The config
> option is introduced early so that supporting code can be written
> to depend on the option. The initial version of the code only
> provides support for x86-64.
> 
> Like other code which frees vmemmap, this config option depends on
> HAVE_BOOTMEM_INFO_NODE. The routine register_page_bootmem_info() is
> used to register bootmem info. Therefore, make sure
> register_page_bootmem_info is enabled if HUGETLB_PAGE_FREE_VMEMMAP
> is defined.
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
> Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---
>  arch/x86/mm/init_64.c |  2 +-
>  fs/Kconfig            | 18 ++++++++++++++++++
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 0a45f062826e..0435bee2e172 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1225,7 +1225,7 @@ static struct kcore_list kcore_vsyscall;
>  
>  static void __init register_page_bootmem_info(void)
>  {
> -#ifdef CONFIG_NUMA
> +#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)
>  	int i;
>  
>  	for_each_online_node(i)
> diff --git a/fs/Kconfig b/fs/Kconfig
> index 976e8b9033c4..e7c4c2a79311 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -245,6 +245,24 @@ config HUGETLBFS
>  config HUGETLB_PAGE
>  	def_bool HUGETLBFS
>  
> +config HUGETLB_PAGE_FREE_VMEMMAP
> +	def_bool HUGETLB_PAGE

I'm not sure I understand the rationale for providing this help text if 
this is def_bool depending on CONFIG_HUGETLB_PAGE.  Are you intending that 
this is actually configurable and we want to provide guidance to the admin 
on when to disable it (which it currently doesn't)?  If not, why have the 
help text?

> +	depends on X86_64
> +	depends on SPARSEMEM_VMEMMAP
> +	depends on HAVE_BOOTMEM_INFO_NODE
> +	help
> +	  The option HUGETLB_PAGE_FREE_VMEMMAP allows for the freeing of
> +	  some vmemmap pages associated with pre-allocated HugeTLB pages.
> +	  For example, on X86_64 6 vmemmap pages of size 4KB each can be
> +	  saved for each 2MB HugeTLB page.  4094 vmemmap pages of size 4KB
> +	  each can be saved for each 1GB HugeTLB page.
> +
> +	  When a HugeTLB page is allocated or freed, the vmemmap array
> +	  representing the range associated with the page will need to be
> +	  remapped.  When a page is allocated, vmemmap pages are freed
> +	  after remapping.  When a page is freed, previously discarded
> +	  vmemmap pages must be allocated before remapping.
> +
>  config MEMFD_CREATE
>  	def_bool TMPFS || HUGETLBFS
>

Randy Dunlap Jan. 25, 2021, 3:16 a.m. UTC | #2
On 1/24/21 3:58 PM, David Rientjes wrote:
> On Sun, 17 Jan 2021, Muchun Song wrote:
> 
>> The HUGETLB_PAGE_FREE_VMEMMAP option is used to enable the freeing
>> of unnecessary vmemmap associated with HugeTLB pages. The config
>> option is introduced early so that supporting code can be written
>> to depend on the option. The initial version of the code only
>> provides support for x86-64.
>>
>> Like other code which frees vmemmap, this config option depends on
>> HAVE_BOOTMEM_INFO_NODE. The routine register_page_bootmem_info() is
>> used to register bootmem info. Therefore, make sure
>> register_page_bootmem_info is enabled if HUGETLB_PAGE_FREE_VMEMMAP
>> is defined.
>>
>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>> Reviewed-by: Oscar Salvador <osalvador@suse.de>
>> Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
>> ---
>>  arch/x86/mm/init_64.c |  2 +-
>>  fs/Kconfig            | 18 ++++++++++++++++++
>>  2 files changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
>> index 0a45f062826e..0435bee2e172 100644
>> --- a/arch/x86/mm/init_64.c
>> +++ b/arch/x86/mm/init_64.c
>> @@ -1225,7 +1225,7 @@ static struct kcore_list kcore_vsyscall;
>>  
>>  static void __init register_page_bootmem_info(void)
>>  {
>> -#ifdef CONFIG_NUMA
>> +#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)
>>  	int i;
>>  
>>  	for_each_online_node(i)
>> diff --git a/fs/Kconfig b/fs/Kconfig
>> index 976e8b9033c4..e7c4c2a79311 100644
>> --- a/fs/Kconfig
>> +++ b/fs/Kconfig
>> @@ -245,6 +245,24 @@ config HUGETLBFS
>>  config HUGETLB_PAGE
>>  	def_bool HUGETLBFS
>>  
>> +config HUGETLB_PAGE_FREE_VMEMMAP
>> +	def_bool HUGETLB_PAGE
> 
> I'm not sure I understand the rationale for providing this help text if 
> this is def_bool depending on CONFIG_HUGETLB_PAGE.  Are you intending that 
> this is actually configurable and we want to provide guidance to the admin 
> on when to disable it (which it currently doesn't)?  If not, why have the 
> help text?

It's good for the (non-user) Kconfig symbol's meaning to be documented somewhere,
preferably such that one does not have to go digging thru git commit logs
to find it.

>> +	depends on X86_64
>> +	depends on SPARSEMEM_VMEMMAP
>> +	depends on HAVE_BOOTMEM_INFO_NODE
>> +	help
>> +	  The option HUGETLB_PAGE_FREE_VMEMMAP allows for the freeing of
>> +	  some vmemmap pages associated with pre-allocated HugeTLB pages.
>> +	  For example, on X86_64 6 vmemmap pages of size 4KB each can be
>> +	  saved for each 2MB HugeTLB page.  4094 vmemmap pages of size 4KB
>> +	  each can be saved for each 1GB HugeTLB page.
>> +
>> +	  When a HugeTLB page is allocated or freed, the vmemmap array
>> +	  representing the range associated with the page will need to be
>> +	  remapped.  When a page is allocated, vmemmap pages are freed
>> +	  after remapping.  When a page is freed, previously discarded
>> +	  vmemmap pages must be allocated before remapping.
>> +
>>  config MEMFD_CREATE
>>  	def_bool TMPFS || HUGETLBFS
>>  
>

Muchun Song Jan. 25, 2021, 4:06 a.m. UTC | #3
On Mon, Jan 25, 2021 at 7:58 AM David Rientjes <rientjes@google.com> wrote:
>
>
> On Sun, 17 Jan 2021, Muchun Song wrote:
>
> > The HUGETLB_PAGE_FREE_VMEMMAP option is used to enable the freeing
> > of unnecessary vmemmap associated with HugeTLB pages. The config
> > option is introduced early so that supporting code can be written
> > to depend on the option. The initial version of the code only
> > provides support for x86-64.
> >
> > Like other code which frees vmemmap, this config option depends on
> > HAVE_BOOTMEM_INFO_NODE. The routine register_page_bootmem_info() is
> > used to register bootmem info. Therefore, make sure
> > register_page_bootmem_info is enabled if HUGETLB_PAGE_FREE_VMEMMAP
> > is defined.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > Reviewed-by: Oscar Salvador <osalvador@suse.de>
> > Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
> > ---
> >  arch/x86/mm/init_64.c |  2 +-
> >  fs/Kconfig            | 18 ++++++++++++++++++
> >  2 files changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> > index 0a45f062826e..0435bee2e172 100644
> > --- a/arch/x86/mm/init_64.c
> > +++ b/arch/x86/mm/init_64.c
> > @@ -1225,7 +1225,7 @@ static struct kcore_list kcore_vsyscall;
> >
> >  static void __init register_page_bootmem_info(void)
> >  {
> > -#ifdef CONFIG_NUMA
> > +#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)
> >       int i;
> >
> >       for_each_online_node(i)
> > diff --git a/fs/Kconfig b/fs/Kconfig
> > index 976e8b9033c4..e7c4c2a79311 100644
> > --- a/fs/Kconfig
> > +++ b/fs/Kconfig
> > @@ -245,6 +245,24 @@ config HUGETLBFS
> >  config HUGETLB_PAGE
> >       def_bool HUGETLBFS
> >
> > +config HUGETLB_PAGE_FREE_VMEMMAP
> > +     def_bool HUGETLB_PAGE
>
> I'm not sure I understand the rationale for providing this help text if
> this is def_bool depending on CONFIG_HUGETLB_PAGE.  Are you intending that
> this is actually configurable and we want to provide guidance to the admin
> on when to disable it (which it currently doesn't)?  If not, why have the
> help text?

This is __not__ configurable. Seems like a comment to help others
understand this option. Like Randy said.

Thanks.

>
> > +     depends on X86_64
> > +     depends on SPARSEMEM_VMEMMAP
> > +     depends on HAVE_BOOTMEM_INFO_NODE
> > +     help
> > +       The option HUGETLB_PAGE_FREE_VMEMMAP allows for the freeing of
> > +       some vmemmap pages associated with pre-allocated HugeTLB pages.
> > +       For example, on X86_64 6 vmemmap pages of size 4KB each can be
> > +       saved for each 2MB HugeTLB page.  4094 vmemmap pages of size 4KB
> > +       each can be saved for each 1GB HugeTLB page.
> > +
> > +       When a HugeTLB page is allocated or freed, the vmemmap array
> > +       representing the range associated with the page will need to be
> > +       remapped.  When a page is allocated, vmemmap pages are freed
> > +       after remapping.  When a page is freed, previously discarded
> > +       vmemmap pages must be allocated before remapping.
> > +
> >  config MEMFD_CREATE
> >       def_bool TMPFS || HUGETLBFS
> >

Randy Dunlap Jan. 25, 2021, 4:08 a.m. UTC | #4
On 1/24/21 8:06 PM, Muchun Song wrote:
> On Mon, Jan 25, 2021 at 7:58 AM David Rientjes <rientjes@google.com> wrote:
>>
>>
>> On Sun, 17 Jan 2021, Muchun Song wrote:
>>
>>> The HUGETLB_PAGE_FREE_VMEMMAP option is used to enable the freeing
>>> of unnecessary vmemmap associated with HugeTLB pages. The config
>>> option is introduced early so that supporting code can be written
>>> to depend on the option. The initial version of the code only
>>> provides support for x86-64.
>>>
>>> Like other code which frees vmemmap, this config option depends on
>>> HAVE_BOOTMEM_INFO_NODE. The routine register_page_bootmem_info() is
>>> used to register bootmem info. Therefore, make sure
>>> register_page_bootmem_info is enabled if HUGETLB_PAGE_FREE_VMEMMAP
>>> is defined.
>>>
>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>> Reviewed-by: Oscar Salvador <osalvador@suse.de>
>>> Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
>>> ---
>>>  arch/x86/mm/init_64.c |  2 +-
>>>  fs/Kconfig            | 18 ++++++++++++++++++
>>>  2 files changed, 19 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
>>> index 0a45f062826e..0435bee2e172 100644
>>> --- a/arch/x86/mm/init_64.c
>>> +++ b/arch/x86/mm/init_64.c
>>> @@ -1225,7 +1225,7 @@ static struct kcore_list kcore_vsyscall;
>>>
>>>  static void __init register_page_bootmem_info(void)
>>>  {
>>> -#ifdef CONFIG_NUMA
>>> +#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)
>>>       int i;
>>>
>>>       for_each_online_node(i)
>>> diff --git a/fs/Kconfig b/fs/Kconfig
>>> index 976e8b9033c4..e7c4c2a79311 100644
>>> --- a/fs/Kconfig
>>> +++ b/fs/Kconfig
>>> @@ -245,6 +245,24 @@ config HUGETLBFS
>>>  config HUGETLB_PAGE
>>>       def_bool HUGETLBFS
>>>
>>> +config HUGETLB_PAGE_FREE_VMEMMAP
>>> +     def_bool HUGETLB_PAGE
>>
>> I'm not sure I understand the rationale for providing this help text if
>> this is def_bool depending on CONFIG_HUGETLB_PAGE.  Are you intending that
>> this is actually configurable and we want to provide guidance to the admin
>> on when to disable it (which it currently doesn't)?  If not, why have the
>> help text?
> 
> This is __not__ configurable. Seems like a comment to help others
> understand this option. Like Randy said.

Yes, it could be written with '#' (or "comment") comment syntax instead of as help text.

thanks.

>>
>>> +     depends on X86_64
>>> +     depends on SPARSEMEM_VMEMMAP
>>> +     depends on HAVE_BOOTMEM_INFO_NODE
>>> +     help
>>> +       The option HUGETLB_PAGE_FREE_VMEMMAP allows for the freeing of
>>> +       some vmemmap pages associated with pre-allocated HugeTLB pages.
>>> +       For example, on X86_64 6 vmemmap pages of size 4KB each can be
>>> +       saved for each 2MB HugeTLB page.  4094 vmemmap pages of size 4KB
>>> +       each can be saved for each 1GB HugeTLB page.
>>> +
>>> +       When a HugeTLB page is allocated or freed, the vmemmap array
>>> +       representing the range associated with the page will need to be
>>> +       remapped.  When a page is allocated, vmemmap pages are freed
>>> +       after remapping.  When a page is freed, previously discarded
>>> +       vmemmap pages must be allocated before remapping.
>>> +
>>>  config MEMFD_CREATE
>>>       def_bool TMPFS || HUGETLBFS
>>>
>

Muchun Song Jan. 25, 2021, 5:06 a.m. UTC | #5
On Mon, Jan 25, 2021 at 12:09 PM Randy Dunlap <rdunlap@infradead.org> wrote:
>
> On 1/24/21 8:06 PM, Muchun Song wrote:
> > On Mon, Jan 25, 2021 at 7:58 AM David Rientjes <rientjes@google.com> wrote:
> >>
> >>
> >> On Sun, 17 Jan 2021, Muchun Song wrote:
> >>
> >>> The HUGETLB_PAGE_FREE_VMEMMAP option is used to enable the freeing
> >>> of unnecessary vmemmap associated with HugeTLB pages. The config
> >>> option is introduced early so that supporting code can be written
> >>> to depend on the option. The initial version of the code only
> >>> provides support for x86-64.
> >>>
> >>> Like other code which frees vmemmap, this config option depends on
> >>> HAVE_BOOTMEM_INFO_NODE. The routine register_page_bootmem_info() is
> >>> used to register bootmem info. Therefore, make sure
> >>> register_page_bootmem_info is enabled if HUGETLB_PAGE_FREE_VMEMMAP
> >>> is defined.
> >>>
> >>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> >>> Reviewed-by: Oscar Salvador <osalvador@suse.de>
> >>> Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
> >>> ---
> >>>  arch/x86/mm/init_64.c |  2 +-
> >>>  fs/Kconfig            | 18 ++++++++++++++++++
> >>>  2 files changed, 19 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> >>> index 0a45f062826e..0435bee2e172 100644
> >>> --- a/arch/x86/mm/init_64.c
> >>> +++ b/arch/x86/mm/init_64.c
> >>> @@ -1225,7 +1225,7 @@ static struct kcore_list kcore_vsyscall;
> >>>
> >>>  static void __init register_page_bootmem_info(void)
> >>>  {
> >>> -#ifdef CONFIG_NUMA
> >>> +#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)
> >>>       int i;
> >>>
> >>>       for_each_online_node(i)
> >>> diff --git a/fs/Kconfig b/fs/Kconfig
> >>> index 976e8b9033c4..e7c4c2a79311 100644
> >>> --- a/fs/Kconfig
> >>> +++ b/fs/Kconfig
> >>> @@ -245,6 +245,24 @@ config HUGETLBFS
> >>>  config HUGETLB_PAGE
> >>>       def_bool HUGETLBFS
> >>>
> >>> +config HUGETLB_PAGE_FREE_VMEMMAP
> >>> +     def_bool HUGETLB_PAGE
> >>
> >> I'm not sure I understand the rationale for providing this help text if
> >> this is def_bool depending on CONFIG_HUGETLB_PAGE.  Are you intending that
> >> this is actually configurable and we want to provide guidance to the admin
> >> on when to disable it (which it currently doesn't)?  If not, why have the
> >> help text?
> >
> > This is __not__ configurable. Seems like a comment to help others
> > understand this option. Like Randy said.
>
> Yes, it could be written with '#' (or "comment") comment syntax instead of as help text.

Got it. I will update in the next version. Thanks.

>
> thanks.
>
> >>
> >>> +     depends on X86_64
> >>> +     depends on SPARSEMEM_VMEMMAP
> >>> +     depends on HAVE_BOOTMEM_INFO_NODE
> >>> +     help
> >>> +       The option HUGETLB_PAGE_FREE_VMEMMAP allows for the freeing of
> >>> +       some vmemmap pages associated with pre-allocated HugeTLB pages.
> >>> +       For example, on X86_64 6 vmemmap pages of size 4KB each can be
> >>> +       saved for each 2MB HugeTLB page.  4094 vmemmap pages of size 4KB
> >>> +       each can be saved for each 1GB HugeTLB page.
> >>> +
> >>> +       When a HugeTLB page is allocated or freed, the vmemmap array
> >>> +       representing the range associated with the page will need to be
> >>> +       remapped.  When a page is allocated, vmemmap pages are freed
> >>> +       after remapping.  When a page is freed, previously discarded
> >>> +       vmemmap pages must be allocated before remapping.
> >>> +
> >>>  config MEMFD_CREATE
> >>>       def_bool TMPFS || HUGETLBFS
> >>>
> >
>
>
> --
> ~Randy
>

David Rientjes Jan. 25, 2021, 6:47 p.m. UTC | #6
On Mon, 25 Jan 2021, Muchun Song wrote:

> > >> I'm not sure I understand the rationale for providing this help text if
> > >> this is def_bool depending on CONFIG_HUGETLB_PAGE.  Are you intending that
> > >> this is actually configurable and we want to provide guidance to the admin
> > >> on when to disable it (which it currently doesn't)?  If not, why have the
> > >> help text?
> > >
> > > This is __not__ configurable. Seems like a comment to help others
> > > understand this option. Like Randy said.
> >
> > Yes, it could be written with '#' (or "comment") comment syntax instead of as help text.
> 
> Got it. I will update in the next version. Thanks.
> 

I'm not sure that Kconfig is the right place to document functional 
behavior of the kernel, especially for non-configurable options.  Seems 
like this is already served by existing comments added by this patch 
series in the files where the description is helpful.

Miaohe Lin Jan. 26, 2021, 2:07 a.m. UTC | #7
Hi:
On 2021/1/17 23:10, Muchun Song wrote:
> The HUGETLB_PAGE_FREE_VMEMMAP option is used to enable the freeing
> of unnecessary vmemmap associated with HugeTLB pages. The config
> option is introduced early so that supporting code can be written
> to depend on the option. The initial version of the code only
> provides support for x86-64.
> 
> Like other code which frees vmemmap, this config option depends on
> HAVE_BOOTMEM_INFO_NODE. The routine register_page_bootmem_info() is
> used to register bootmem info. Therefore, make sure
> register_page_bootmem_info is enabled if HUGETLB_PAGE_FREE_VMEMMAP
> is defined.
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
> Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---
>  arch/x86/mm/init_64.c |  2 +-
>  fs/Kconfig            | 18 ++++++++++++++++++
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 0a45f062826e..0435bee2e172 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1225,7 +1225,7 @@ static struct kcore_list kcore_vsyscall;
>  
>  static void __init register_page_bootmem_info(void)
>  {
> -#ifdef CONFIG_NUMA
> +#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)
>  	int i;
>  
>  	for_each_online_node(i)
> diff --git a/fs/Kconfig b/fs/Kconfig
> index 976e8b9033c4..e7c4c2a79311 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -245,6 +245,24 @@ config HUGETLBFS
>  config HUGETLB_PAGE
>  	def_bool HUGETLBFS
>  
> +config HUGETLB_PAGE_FREE_VMEMMAP
> +	def_bool HUGETLB_PAGE
> +	depends on X86_64
> +	depends on SPARSEMEM_VMEMMAP
> +	depends on HAVE_BOOTMEM_INFO_NODE
> +	help
> +	  The option HUGETLB_PAGE_FREE_VMEMMAP allows for the freeing of
> +	  some vmemmap pages associated with pre-allocated HugeTLB pages.
> +	  For example, on X86_64 6 vmemmap pages of size 4KB each can be
> +	  saved for each 2MB HugeTLB page.  4094 vmemmap pages of size 4KB
> +	  each can be saved for each 1GB HugeTLB page.
> +
> +	  When a HugeTLB page is allocated or freed, the vmemmap array
> +	  representing the range associated with the page will need to be
> +	  remapped.  When a page is allocated, vmemmap pages are freed
> +	  after remapping.  When a page is freed, previously discarded
> +	  vmemmap pages must be allocated before remapping.
> +
>  config MEMFD_CREATE
>  	def_bool TMPFS || HUGETLBFS
>  
> 

LGTM. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Muchun Song Jan. 26, 2021, 2:45 a.m. UTC | #8
On Tue, Jan 26, 2021 at 2:47 AM David Rientjes <rientjes@google.com> wrote:
>
> On Mon, 25 Jan 2021, Muchun Song wrote:
>
> > > >> I'm not sure I understand the rationale for providing this help text if
> > > >> this is def_bool depending on CONFIG_HUGETLB_PAGE.  Are you intending that
> > > >> this is actually configurable and we want to provide guidance to the admin
> > > >> on when to disable it (which it currently doesn't)?  If not, why have the
> > > >> help text?
> > > >
> > > > This is __not__ configurable. Seems like a comment to help others
> > > > understand this option. Like Randy said.
> > >
> > > Yes, it could be written with '#' (or "comment") comment syntax instead of as help text.
> >
> > Got it. I will update in the next version. Thanks.
> >
>
> I'm not sure that Kconfig is the right place to document functional
> behavior of the kernel, especially for non-configurable options.  Seems
> like this is already served by existing comments added by this patch
> series in the files where the description is helpful.

OK. So do you mean just remove the help text here?

Thanks.

David Rientjes Jan. 26, 2021, 8:13 p.m. UTC | #9
On Tue, 26 Jan 2021, Muchun Song wrote:

> > I'm not sure that Kconfig is the right place to document functional
> > behavior of the kernel, especially for non-configurable options.  Seems
> > like this is already served by existing comments added by this patch
> > series in the files where the description is helpful.
> 
> OK. So do you mean just remove the help text here?
> 

Yeah, I'd suggest removing the help text from a non-configurable Kconfig 
option and double checking that its substance is available elsewhere (like 
the giant comment in mm/hugetlb_vmemmap.c).

Patch

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 0a45f062826e..0435bee2e172 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1225,7 +1225,7 @@  static struct kcore_list kcore_vsyscall;
 
 static void __init register_page_bootmem_info(void)
 {
-#ifdef CONFIG_NUMA
+#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)
 	int i;
 
 	for_each_online_node(i)
diff --git a/fs/Kconfig b/fs/Kconfig
index 976e8b9033c4..e7c4c2a79311 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -245,6 +245,24 @@  config HUGETLBFS
 config HUGETLB_PAGE
 	def_bool HUGETLBFS
 
+config HUGETLB_PAGE_FREE_VMEMMAP
+	def_bool HUGETLB_PAGE
+	depends on X86_64
+	depends on SPARSEMEM_VMEMMAP
+	depends on HAVE_BOOTMEM_INFO_NODE
+	help
+	  The option HUGETLB_PAGE_FREE_VMEMMAP allows for the freeing of
+	  some vmemmap pages associated with pre-allocated HugeTLB pages.
+	  For example, on X86_64 6 vmemmap pages of size 4KB each can be
+	  saved for each 2MB HugeTLB page.  4094 vmemmap pages of size 4KB
+	  each can be saved for each 1GB HugeTLB page.
+
+	  When a HugeTLB page is allocated or freed, the vmemmap array
+	  representing the range associated with the page will need to be
+	  remapped.  When a page is allocated, vmemmap pages are freed
+	  after remapping.  When a page is freed, previously discarded
+	  vmemmap pages must be allocated before remapping.
+
 config MEMFD_CREATE
 	def_bool TMPFS || HUGETLBFS
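
The savings quoted in the help text (6 vmemmap pages per 2MB HugeTLB page, 4094 per 1GB HugeTLB page) can be reproduced with a short calculation. The sketch below is only an illustration under stated assumptions: a 64-byte struct page (an assumed, typical x86-64 value that the patch does not state) and 2 vmemmap pages retained per HugeTLB page (inferred from the help text numbers, since 8 - 6 = 2 and 4096 - 4094 = 2). The names BASE_PAGE_SIZE, STRUCT_PAGE_SIZE, RESERVED_VMEMMAP_PAGES, vmemmap_pages_total and vmemmap_pages_saved are hypothetical and do not come from the series.

```c
#include <assert.h>

/* Assumptions for illustration only; see the text above. */
#define BASE_PAGE_SIZE		4096UL	/* 4KB base page on x86-64 */
#define STRUCT_PAGE_SIZE	64UL	/* assumed sizeof(struct page) */
#define RESERVED_VMEMMAP_PAGES	2UL	/* inferred: pages kept per HugeTLB page */

/* Number of 4KB vmemmap pages backing the struct pages of one HugeTLB page. */
static unsigned long vmemmap_pages_total(unsigned long huge_page_size)
{
	unsigned long nr_struct_pages;

	assert(huge_page_size % BASE_PAGE_SIZE == 0);
	nr_struct_pages = huge_page_size / BASE_PAGE_SIZE;

	return nr_struct_pages * STRUCT_PAGE_SIZE / BASE_PAGE_SIZE;
}

/* Number of vmemmap pages that could be freed for one HugeTLB page. */
static unsigned long vmemmap_pages_saved(unsigned long huge_page_size)
{
	unsigned long total = vmemmap_pages_total(huge_page_size);

	return total > RESERVED_VMEMMAP_PAGES ?
	       total - RESERVED_VMEMMAP_PAGES : 0;
}
```

For a 2MB page this gives 512 struct pages * 64 bytes = 32KB, i.e. 8 vmemmap pages, of which 6 can be freed. For a 1GB page it gives 262144 struct pages * 64 bytes = 16MB, i.e. 4096 vmemmap pages, of which 4094 can be freed.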