Message ID | 20201213154534.54826-3-songmuchun@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Series | Free some vmemmap pages of HugeTLB page |
On 12/13/20 7:45 AM, Muchun Song wrote:
> The purpose of introducing HUGETLB_PAGE_FREE_VMEMMAP is to configure
> whether to enable the feature of freeing unused vmemmap associated with
> HugeTLB pages. And this is just for dependency check. Now only support
> x86-64.
>
> Because this config depends on HAVE_BOOTMEM_INFO_NODE. And the function
> of the register_page_bootmem_info() is aimed to register bootmem info.
> So we should register bootmem info when this config is enabled.

Suggested commit message rewording?

The HUGETLB_PAGE_FREE_VMEMMAP option is used to enable the freeing of
unnecessary vmemmap associated with HugeTLB pages. The config option is
introduced early so that supporting code can be written to depend on the
option. The initial version of the code only provides support for x86-64.

Like other code which frees vmemmap, this config option depends on
HAVE_BOOTMEM_INFO_NODE. The routine register_page_bootmem_info() is used
to register bootmem info. Therefore, make sure register_page_bootmem_info
is enabled if HUGETLB_PAGE_FREE_VMEMMAP is defined.

>
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  arch/x86/mm/init_64.c |  2 +-
>  fs/Kconfig            | 15 +++++++++++++++
>  2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 0a45f062826e..0435bee2e172 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1225,7 +1225,7 @@ static struct kcore_list kcore_vsyscall;
>
>  static void __init register_page_bootmem_info(void)
>  {
> -#ifdef CONFIG_NUMA
> +#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)
>  	int i;
>
>  	for_each_online_node(i)
> diff --git a/fs/Kconfig b/fs/Kconfig
> index 976e8b9033c4..4c3a9c614983 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -245,6 +245,21 @@ config HUGETLBFS
>  config HUGETLB_PAGE
>  	def_bool HUGETLBFS
>
> +config HUGETLB_PAGE_FREE_VMEMMAP
> +	def_bool HUGETLB_PAGE
> +	depends on X86_64
> +	depends on SPARSEMEM_VMEMMAP
> +	depends on HAVE_BOOTMEM_INFO_NODE
> +	help
> +	  When using HUGETLB_PAGE_FREE_VMEMMAP, the system can save up some
> +	  memory from pre-allocated HugeTLB pages when they are not used.
> +	  6 pages per HugeTLB page of the pmd level mapping and (PAGE_SIZE - 2)
> +	  pages per HugeTLB page of the pud level mapping.
> +
> +	  When the pages are going to be used or freed up, the vmemmap array
> +	  representing that range needs to be remapped again and the pages
> +	  we discarded earlier need to be reallocated again.

I see the previous discussion with David about wording here. How about
leaving the functionality description general, and provide a specific
example for x86_64? As mentioned we can always update when new arch support
is added. Suggested text?

The option HUGETLB_PAGE_FREE_VMEMMAP allows for the freeing of
some vmemmap pages associated with pre-allocated HugeTLB pages.
For example, on X86_64, 6 vmemmap pages of size 4KB each can be
saved for each 2MB HugeTLB page. 4094 vmemmap pages of size 4KB
each can be saved for each 1GB HugeTLB page.

When a HugeTLB page is allocated or freed, the vmemmap array
representing the range associated with the page will need to be
remapped. When a page is allocated, vmemmap pages are freed
after remapping. When a page is freed, previously discarded
vmemmap pages must be allocated before remapping.
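
The 6 and 4094 figures in the suggested text can be reproduced with a little arithmetic. The sketch below is illustrative only: it assumes a 64-byte struct page (typical on x86_64 but not stated in the thread) and that 2 vmemmap pages stay mapped per HugeTLB page, which is what the quoted numbers imply (8 - 6 and 4096 - 4094). The macro and function names are hypothetical, not kernel identifiers.

```c
#include <assert.h>

#define BASE_PAGE_SIZE   4096UL /* 4KB base page on x86_64 */
#define STRUCT_PAGE_SIZE 64UL   /* ASSUMPTION: sizeof(struct page) */
#define VMEMMAP_KEPT     2UL    /* ASSUMPTION: vmemmap pages kept per HugeTLB page */

/* Number of 4KB vmemmap pages holding the struct pages of one HugeTLB page. */
static inline unsigned long vmemmap_pages_per_hpage(unsigned long hpage_size)
{
	assert(hpage_size % BASE_PAGE_SIZE == 0);

	/* One struct page per 4KB base page inside the HugeTLB page. */
	unsigned long nr_struct_pages = hpage_size / BASE_PAGE_SIZE;

	return nr_struct_pages * STRUCT_PAGE_SIZE / BASE_PAGE_SIZE;
}

/* Number of vmemmap pages that could be freed for one HugeTLB page. */
static inline unsigned long vmemmap_pages_saved(unsigned long hpage_size)
{
	unsigned long total = vmemmap_pages_per_hpage(hpage_size);

	return total > VMEMMAP_KEPT ? total - VMEMMAP_KEPT : 0;
}
```

With these assumptions, `vmemmap_pages_saved(2UL << 20)` evaluates to 6 and `vmemmap_pages_saved(1UL << 30)` to 4094, matching the x86_64 example in the proposed help text.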
On Wed, Dec 16, 2020 at 9:04 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 12/13/20 7:45 AM, Muchun Song wrote:
> > The purpose of introducing HUGETLB_PAGE_FREE_VMEMMAP is to configure
> > whether to enable the feature of freeing unused vmemmap associated with
> > HugeTLB pages. And this is just for dependency check. Now only support
> > x86-64.
> >
> > Because this config depends on HAVE_BOOTMEM_INFO_NODE. And the function
> > of the register_page_bootmem_info() is aimed to register bootmem info.
> > So we should register bootmem info when this config is enabled.
>
> Suggested commit message rewording?
>
> The HUGETLB_PAGE_FREE_VMEMMAP option is used to enable the freeing of
> unnecessary vmemmap associated with HugeTLB pages. The config option is
> introduced early so that supporting code can be written to depend on the
> option. The initial version of the code only provides support for x86-64.
>
> Like other code which frees vmemmap, this config option depends on
> HAVE_BOOTMEM_INFO_NODE. The routine register_page_bootmem_info() is used
> to register bootmem info. Therefore, make sure register_page_bootmem_info
> is enabled if HUGETLB_PAGE_FREE_VMEMMAP is defined.

Thanks, Mike. Will update.

>
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > ---
> >  arch/x86/mm/init_64.c |  2 +-
> >  fs/Kconfig            | 15 +++++++++++++++
> >  2 files changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> > index 0a45f062826e..0435bee2e172 100644
> > --- a/arch/x86/mm/init_64.c
> > +++ b/arch/x86/mm/init_64.c
> > @@ -1225,7 +1225,7 @@ static struct kcore_list kcore_vsyscall;
> >
> >  static void __init register_page_bootmem_info(void)
> >  {
> > -#ifdef CONFIG_NUMA
> > +#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)
> >  	int i;
> >
> >  	for_each_online_node(i)
> > diff --git a/fs/Kconfig b/fs/Kconfig
> > index 976e8b9033c4..4c3a9c614983 100644
> > --- a/fs/Kconfig
> > +++ b/fs/Kconfig
> > @@ -245,6 +245,21 @@ config HUGETLBFS
> >  config HUGETLB_PAGE
> >  	def_bool HUGETLBFS
> >
> > +config HUGETLB_PAGE_FREE_VMEMMAP
> > +	def_bool HUGETLB_PAGE
> > +	depends on X86_64
> > +	depends on SPARSEMEM_VMEMMAP
> > +	depends on HAVE_BOOTMEM_INFO_NODE
> > +	help
> > +	  When using HUGETLB_PAGE_FREE_VMEMMAP, the system can save up some
> > +	  memory from pre-allocated HugeTLB pages when they are not used.
> > +	  6 pages per HugeTLB page of the pmd level mapping and (PAGE_SIZE - 2)
> > +	  pages per HugeTLB page of the pud level mapping.
> > +
> > +	  When the pages are going to be used or freed up, the vmemmap array
> > +	  representing that range needs to be remapped again and the pages
> > +	  we discarded earlier need to be reallocated again.
>
> I see the previous discussion with David about wording here. How about
> leaving the functionality description general, and provide a specific
> example for x86_64? As mentioned we can always update when new arch support
> is added. Suggested text?

Good suggestion. Thanks.

>
> The option HUGETLB_PAGE_FREE_VMEMMAP allows for the freeing of
> some vmemmap pages associated with pre-allocated HugeTLB pages.
> For example, on X86_64, 6 vmemmap pages of size 4KB each can be
> saved for each 2MB HugeTLB page. 4094 vmemmap pages of size 4KB
> each can be saved for each 1GB HugeTLB page.
>
> When a HugeTLB page is allocated or freed, the vmemmap array
> representing the range associated with the page will need to be
> remapped. When a page is allocated, vmemmap pages are freed
> after remapping. When a page is freed, previously discarded
> vmemmap pages must be allocated before remapping.
>
> --
> Mike Kravetz
>
> > +
> >  config MEMFD_CREATE
> >  	def_bool TMPFS || HUGETLBFS
> >
On 12/15/20 5:03 PM, Mike Kravetz wrote:
> On 12/13/20 7:45 AM, Muchun Song wrote:
>> diff --git a/fs/Kconfig b/fs/Kconfig
>> index 976e8b9033c4..4c3a9c614983 100644
>> --- a/fs/Kconfig
>> +++ b/fs/Kconfig
>> @@ -245,6 +245,21 @@ config HUGETLBFS
>>  config HUGETLB_PAGE
>>  	def_bool HUGETLBFS
>>
>> +config HUGETLB_PAGE_FREE_VMEMMAP
>> +	def_bool HUGETLB_PAGE
>> +	depends on X86_64
>> +	depends on SPARSEMEM_VMEMMAP
>> +	depends on HAVE_BOOTMEM_INFO_NODE
>> +	help
>> +	  When using HUGETLB_PAGE_FREE_VMEMMAP, the system can save up some
>> +	  memory from pre-allocated HugeTLB pages when they are not used.
>> +	  6 pages per HugeTLB page of the pmd level mapping and (PAGE_SIZE - 2)
>> +	  pages per HugeTLB page of the pud level mapping.
>> +
>> +	  When the pages are going to be used or freed up, the vmemmap array
>> +	  representing that range needs to be remapped again and the pages
>> +	  we discarded earlier need to be reallocated again.
>
> I see the previous discussion with David about wording here. How about
> leaving the functionality description general, and provide a specific
> example for x86_64? As mentioned we can always update when new arch support
> is added. Suggested text?
>
> The option HUGETLB_PAGE_FREE_VMEMMAP allows for the freeing of
> some vmemmap pages associated with pre-allocated HugeTLB pages.
> For example, on X86_64, 6 vmemmap pages of size 4KB each can be
> saved for each 2MB HugeTLB page. 4094 vmemmap pages of size 4KB
> each can be saved for each 1GB HugeTLB page.
>
> When a HugeTLB page is allocated or freed, the vmemmap array
> representing the range associated with the page will need to be
> remapped. When a page is allocated, vmemmap pages are freed
> after remapping. When a page is freed, previously discarded
> vmemmap pages must be allocated before remapping.

Sorry, I am slowly coming up to speed with discussions when I was away.

It appears vmemmap is not being mapped with huge pages if the boot option
hugetlb_free_vmemmap is on. Is that correct?

If that is correct, we should document the trade-off of increased page
table pages needed to map vmemmap vs the savings from freeing struct page
pages. If a user/sysadmin only uses a small number of hugetlb pages (as
a percentage of system memory) they could end up using more memory with
hugetlb_free_vmemmap on as opposed to off. Perhaps, it should be part of
the documentation for hugetlb_free_vmemmap? If this is true, and people
think this should be documented, I can try to come up with something.
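
The trade-off raised here can be put into rough numbers. The following is a back-of-the-envelope sketch, not a description of the actual implementation. It assumes that with hugetlb_free_vmemmap on, the whole vmemmap is mapped with 4KB PTEs instead of 2MB PMD mappings, that struct page is 64 bytes, and that only the extra last-level page tables are counted. All names are hypothetical.

```c
#include <assert.h>

#define BASE_PAGE_SIZE      4096ULL
#define STRUCT_PAGE_SIZE    64ULL  /* ASSUMPTION: sizeof(struct page) */
#define PTES_PER_TABLE      512ULL /* x86_64: 4KB table of 8-byte entries */
#define SAVED_PER_2MB_HPAGE 6ULL   /* figure quoted in the thread */

/* Amount of RAM whose struct pages fit in the vmemmap mapped by one PTE table. */
static inline unsigned long long ram_per_pte_table(void)
{
	/* One PTE table maps 512 * 4KB = 2MB of vmemmap. */
	unsigned long long vmemmap_bytes = PTES_PER_TABLE * BASE_PAGE_SIZE;

	/* Each struct page in that vmemmap describes one 4KB page of RAM. */
	return vmemmap_bytes / STRUCT_PAGE_SIZE * BASE_PAGE_SIZE;
}

/* Extra 4KB page-table pages needed to map all of vmemmap with PTEs. */
static inline unsigned long long extra_page_table_pages(unsigned long long ram_bytes)
{
	unsigned long long per_table = ram_per_pte_table();

	assert(ram_bytes > 0);
	return (ram_bytes + per_table - 1) / per_table; /* round up */
}

/* Net 4KB pages gained (positive) or lost (negative) with the option on. */
static inline long long net_pages_saved(unsigned long long ram_bytes,
					unsigned long long nr_2mb_hpages)
{
	return (long long)(nr_2mb_hpages * SAVED_PER_2MB_HPAGE) -
	       (long long)extra_page_table_pages(ram_bytes);
}
```

Under these assumptions, one PTE table covers the vmemmap for 128MB of RAM, so a 64GB machine would pay 512 extra page-table pages. Reserving only 10 2MB HugeTLB pages would then be a net loss of 452 pages, while reserving 1000 would be a net gain of 5488, which is the kind of break-even a sysadmin would want documented.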
On Wed, Dec 16, 2020 at 11:45 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 12/15/20 5:03 PM, Mike Kravetz wrote:
> > On 12/13/20 7:45 AM, Muchun Song wrote:
> >> diff --git a/fs/Kconfig b/fs/Kconfig
> >> index 976e8b9033c4..4c3a9c614983 100644
> >> --- a/fs/Kconfig
> >> +++ b/fs/Kconfig
> >> @@ -245,6 +245,21 @@ config HUGETLBFS
> >>  config HUGETLB_PAGE
> >>  	def_bool HUGETLBFS
> >>
> >> +config HUGETLB_PAGE_FREE_VMEMMAP
> >> +	def_bool HUGETLB_PAGE
> >> +	depends on X86_64
> >> +	depends on SPARSEMEM_VMEMMAP
> >> +	depends on HAVE_BOOTMEM_INFO_NODE
> >> +	help
> >> +	  When using HUGETLB_PAGE_FREE_VMEMMAP, the system can save up some
> >> +	  memory from pre-allocated HugeTLB pages when they are not used.
> >> +	  6 pages per HugeTLB page of the pmd level mapping and (PAGE_SIZE - 2)
> >> +	  pages per HugeTLB page of the pud level mapping.
> >> +
> >> +	  When the pages are going to be used or freed up, the vmemmap array
> >> +	  representing that range needs to be remapped again and the pages
> >> +	  we discarded earlier need to be reallocated again.
> >
> > I see the previous discussion with David about wording here. How about
> > leaving the functionality description general, and provide a specific
> > example for x86_64? As mentioned we can always update when new arch support
> > is added. Suggested text?
> >
> > The option HUGETLB_PAGE_FREE_VMEMMAP allows for the freeing of
> > some vmemmap pages associated with pre-allocated HugeTLB pages.
> > For example, on X86_64, 6 vmemmap pages of size 4KB each can be
> > saved for each 2MB HugeTLB page. 4094 vmemmap pages of size 4KB
> > each can be saved for each 1GB HugeTLB page.
> >
> > When a HugeTLB page is allocated or freed, the vmemmap array
> > representing the range associated with the page will need to be
> > remapped. When a page is allocated, vmemmap pages are freed
> > after remapping. When a page is freed, previously discarded
> > vmemmap pages must be allocated before remapping.
>
> Sorry, I am slowly coming up to speed with discussions when I was away.
>
> It appears vmemmap is not being mapped with huge pages if the boot option
> hugetlb_free_vmemmap is on. Is that correct?

Right.

>
> If that is correct, we should document the trade-off of increased page
> table pages needed to map vmemmap vs the savings from freeing struct page
> pages. If a user/sysadmin only uses a small number of hugetlb pages (as
> a percentage of system memory) they could end up using more memory with
> hugetlb_free_vmemmap on as opposed to off. Perhaps, it should be part of
> the documentation for hugetlb_free_vmemmap? If this is true, and people

Right, it is better to document it around hugetlb_free_vmemmap.
This should be a part of patch #8. Thanks.

> think this should be documented, I can try to come up with something.
>
> --
> Mike Kravetz
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 0a45f062826e..0435bee2e172 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1225,7 +1225,7 @@ static struct kcore_list kcore_vsyscall;
 
 static void __init register_page_bootmem_info(void)
 {
-#ifdef CONFIG_NUMA
+#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)
 	int i;
 
 	for_each_online_node(i)
diff --git a/fs/Kconfig b/fs/Kconfig
index 976e8b9033c4..4c3a9c614983 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -245,6 +245,21 @@ config HUGETLBFS
 config HUGETLB_PAGE
 	def_bool HUGETLBFS
 
+config HUGETLB_PAGE_FREE_VMEMMAP
+	def_bool HUGETLB_PAGE
+	depends on X86_64
+	depends on SPARSEMEM_VMEMMAP
+	depends on HAVE_BOOTMEM_INFO_NODE
+	help
+	  When using HUGETLB_PAGE_FREE_VMEMMAP, the system can save up some
+	  memory from pre-allocated HugeTLB pages when they are not used.
+	  6 pages per HugeTLB page of the pmd level mapping and (PAGE_SIZE - 2)
+	  pages per HugeTLB page of the pud level mapping.
+
+	  When the pages are going to be used or freed up, the vmemmap array
+	  representing that range needs to be remapped again and the pages
+	  we discarded earlier need to be reallocated again.
+
 config MEMFD_CREATE
 	def_bool TMPFS || HUGETLBFS
 
The purpose of introducing HUGETLB_PAGE_FREE_VMEMMAP is to configure
whether to enable the feature of freeing unused vmemmap associated with
HugeTLB pages. And this is just for dependency check. Now only support
x86-64.

Because this config depends on HAVE_BOOTMEM_INFO_NODE. And the function
of the register_page_bootmem_info() is aimed to register bootmem info.
So we should register bootmem info when this config is enabled.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 arch/x86/mm/init_64.c |  2 +-
 fs/Kconfig            | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)
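
To illustrate why the guard in register_page_bootmem_info() has to change, here is a small standalone sketch of the preprocessor condition. It is not kernel code: the node count, the counter, and the stub function are hypothetical stand-ins, and the CONFIG_ macro is defined by hand to model a build with NUMA off and the new option on. With the original `#ifdef CONFIG_NUMA` guard, the loop body would be compiled out in that configuration.

```c
#include <assert.h>

/* Hand-defined stand-in for the Kconfig-generated macro (NUMA left undefined). */
#define CONFIG_HUGETLB_PAGE_FREE_VMEMMAP 1

#define NR_NODES_SKETCH 2 /* hypothetical number of online nodes */

static int nodes_registered; /* counts calls to the stub below */

/* Hypothetical stub standing in for per-node bootmem registration. */
static inline void register_node_stub(int nid)
{
	assert(nid >= 0 && nid < NR_NODES_SKETCH);
	nodes_registered++;
}

static inline void register_page_bootmem_info_sketch(void)
{
	/* Same condition as the patched guard: either option enables the loop. */
#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)
	int i;

	for (i = 0; i < NR_NODES_SKETCH; i++)
		register_node_stub(i);
#endif
}
```

Calling `register_page_bootmem_info_sketch()` once in this configuration registers every sketch node. Removing the `#define` at the top leaves the function body empty, which mirrors what a non-NUMA build would have done before the patch.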