Message ID: 20220509062703.64249-3-songmuchun@bytedance.com (mailing list archive)
State: New
Series: add hugetlb_optimize_vmemmap sysctl
On 09.05.22 08:27, Muchun Song wrote:
> Optimizing HugeTLB vmemmap pages is not compatible with allocating memmap on
> hot added memory. If "hugetlb_free_vmemmap=on" and
> "memory_hotplug.memmap_on_memory" are both passed on the kernel command line,
> optimizing hugetlb pages takes precedence.

Why?
On Thu, May 12, 2022 at 09:36:15AM +0200, David Hildenbrand wrote:
> On 09.05.22 08:27, Muchun Song wrote:
> > Optimizing HugeTLB vmemmap pages is not compatible with allocating memmap on
> > hot added memory. If "hugetlb_free_vmemmap=on" and
> > "memory_hotplug.memmap_on_memory" are both passed on the kernel command line,
> > optimizing hugetlb pages takes precedence.
>
> Why?
>

Because the two features are not compatible: hugetlb_free_vmemmap cannot
optimize vmemmap pages allocated from the alternative allocator (when
memory_hotplug.memmap_on_memory=1). So when the hugetlb_free_vmemmap feature
was introduced, I made hugetlb_free_vmemmap take precedence. BTW, I plan to
remove this restriction; I'll post it out ASAP.

Thanks.
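The precedence described here is currently enforced in mhp_supports_memmap_on_memory() in mm/memory_hotplug.c. A simplified sketch of that check is below; the condition mirrors the lines removed by the diff at the end of this page, and the helper-variable computations are abbreviated:

/* Simplified sketch of the existing check in mm/memory_hotplug.c. */
bool mhp_supports_memmap_on_memory(unsigned long size)
{
        unsigned long nr_vmemmap_pages = size / PAGE_SIZE;
        unsigned long vmemmap_size = nr_vmemmap_pages * sizeof(struct page);
        unsigned long remaining_size = size - vmemmap_size;

        /*
         * memmap_on_memory is silently ignored whenever the HugeTLB vmemmap
         * optimization is enabled, i.e. hugetlb_free_vmemmap takes precedence.
         */
        return memmap_on_memory &&
               !hugetlb_optimize_vmemmap_enabled() &&
               IS_ENABLED(CONFIG_MHP_MEMMAP_ON_MEMORY) &&
               size == memory_block_size_bytes() &&
               IS_ALIGNED(vmemmap_size, PMD_SIZE) &&
               IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT));
}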
On 12.05.22 14:50, Muchun Song wrote:
> On Thu, May 12, 2022 at 09:36:15AM +0200, David Hildenbrand wrote:
>> On 09.05.22 08:27, Muchun Song wrote:
>>> Optimizing HugeTLB vmemmap pages is not compatible with allocating memmap on
>>> hot added memory. If "hugetlb_free_vmemmap=on" and
>>> "memory_hotplug.memmap_on_memory" are both passed on the kernel command line,
>>> optimizing hugetlb pages takes precedence.
>>
>> Why?
>>
>
> Because the two features are not compatible: hugetlb_free_vmemmap cannot
> optimize vmemmap pages allocated from the alternative allocator (when
> memory_hotplug.memmap_on_memory=1). So when the hugetlb_free_vmemmap feature
> was introduced, I made hugetlb_free_vmemmap take precedence. BTW, I plan to
> remove this restriction; I'll post it out ASAP.

I was asking why the vmemmap optimization should take precedence.
memmap_on_memory makes it more likely that memory hotplug succeeds in
close-to-OOM situations -- which is IMHO more important than a vmemmap
optimization.

But anyhow, the proper approach should most probably be to simply not
mess with the vmemmap if we stumble over a vmemmap that's special due to
memmap_on_memory. I assume that's what you're talking about sending out.
On Thu, May 12, 2022 at 03:04:57PM +0200, David Hildenbrand wrote:
> On 12.05.22 14:50, Muchun Song wrote:
> > On Thu, May 12, 2022 at 09:36:15AM +0200, David Hildenbrand wrote:
> >> On 09.05.22 08:27, Muchun Song wrote:
> >>> Optimizing HugeTLB vmemmap pages is not compatible with allocating memmap on
> >>> hot added memory. If "hugetlb_free_vmemmap=on" and
> >>> "memory_hotplug.memmap_on_memory" are both passed on the kernel command line,
> >>> optimizing hugetlb pages takes precedence.
> >>
> >> Why?
> >>
> >
> > Because the two features are not compatible: hugetlb_free_vmemmap cannot
> > optimize vmemmap pages allocated from the alternative allocator (when
> > memory_hotplug.memmap_on_memory=1). So when the hugetlb_free_vmemmap feature
> > was introduced, I made hugetlb_free_vmemmap take precedence. BTW, I plan to
> > remove this restriction; I'll post it out ASAP.
>
> I was asking why the vmemmap optimization should take precedence.
> memmap_on_memory makes it more likely that memory hotplug succeeds in
> close-to-OOM situations -- which is IMHO more important than a vmemmap
> optimization.
>

I thought the users who enable hugetlb_free_vmemmap value memory
savings more, so I made that decision in commit 4bab4964a59f. Seems
I made a bad decision from your description.

> But anyhow, the proper approach should most probably be to simply not
> mess with the vmemmap if we stumble over a vmemmap that's special due to
> memmap_on_memory. I assume that's what you're talking about sending out.
>

I mean I want to have hugetlb_vmemmap.c check whether the section which
the HugeTLB pages belong to can be optimized, instead of making
hugetlb_free_vmemmap take precedence. E.g. if the section's vmemmap pages
are allocated from the added memory block itself, hugetlb_free_vmemmap will
refuse to optimize the vmemmap; otherwise, it does the optimization. Then
both kernel parameters are compatible. I have done those patches, but
haven't sent them out.

Thanks.
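A hypothetical sketch of the check described above, for illustration only: the patches themselves were not posted in this thread, and the SECTION_VMEMMAP_ON_MEMORY flag (imagined to be set on the section at hot-add time) is an assumption, not an existing kernel symbol.

/*
 * Hypothetical sketch only -- not the patches mentioned above. The idea is
 * that hugetlb_vmemmap.c refuses to touch a vmemmap that was allocated from
 * the hot-added memory block itself (memmap_on_memory) and optimizes as
 * usual otherwise.
 */
static bool vmemmap_optimizable(struct page *head)
{
        struct mem_section *ms = __pfn_to_section(page_to_pfn(head));

        /* vmemmap backed by the added block itself: leave it alone. */
        return !(ms->section_mem_map & SECTION_VMEMMAP_ON_MEMORY);
}

void hugetlb_vmemmap_free(struct hstate *h, struct page *head)
{
        if (!vmemmap_optimizable(head))
                return;

        /* ... existing remapping/freeing of the tail vmemmap pages ... */
}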
On 12.05.22 15:59, Muchun Song wrote:
> On Thu, May 12, 2022 at 03:04:57PM +0200, David Hildenbrand wrote:
>> On 12.05.22 14:50, Muchun Song wrote:
>>> On Thu, May 12, 2022 at 09:36:15AM +0200, David Hildenbrand wrote:
>>>> On 09.05.22 08:27, Muchun Song wrote:
>>>>> Optimizing HugeTLB vmemmap pages is not compatible with allocating memmap on
>>>>> hot added memory. If "hugetlb_free_vmemmap=on" and
>>>>> "memory_hotplug.memmap_on_memory" are both passed on the kernel command line,
>>>>> optimizing hugetlb pages takes precedence.
>>>>
>>>> Why?
>>>>
>>>
>>> Because the two features are not compatible: hugetlb_free_vmemmap cannot
>>> optimize vmemmap pages allocated from the alternative allocator (when
>>> memory_hotplug.memmap_on_memory=1). So when the hugetlb_free_vmemmap feature
>>> was introduced, I made hugetlb_free_vmemmap take precedence. BTW, I plan to
>>> remove this restriction; I'll post it out ASAP.
>>
>> I was asking why the vmemmap optimization should take precedence.
>> memmap_on_memory makes it more likely that memory hotplug succeeds in
>> close-to-OOM situations -- which is IMHO more important than a vmemmap
>> optimization.
>>
>
> I thought the users who enable hugetlb_free_vmemmap value memory
> savings more, so I made that decision in commit 4bab4964a59f. Seems
> I made a bad decision from your description.

Depends on the perspective I guess. :)

>
>> But anyhow, the proper approach should most probably be to simply not
>> mess with the vmemmap if we stumble over a vmemmap that's special due to
>> memmap_on_memory. I assume that's what you're talking about sending out.
>>
>
> I mean I want to have hugetlb_vmemmap.c check whether the section which
> the HugeTLB pages belong to can be optimized, instead of making
> hugetlb_free_vmemmap take precedence. E.g. if the section's vmemmap pages
> are allocated from the added memory block itself, hugetlb_free_vmemmap will
> refuse to optimize the vmemmap; otherwise, it does the optimization. Then
> both kernel parameters are compatible. I have done those patches, but
> haven't sent them out.

Yeah, that's exactly what I thought. How complicated are they? If they are
easy, can we just avoid this patch here and do it "properly"? :)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 111684878fd9..a6101ae402f9 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -42,14 +42,36 @@
 #include "internal.h"
 #include "shuffle.h"
 
+#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
+static int memmap_on_memory_set(const char *val, const struct kernel_param *kp)
+{
+        if (hugetlb_optimize_vmemmap_enabled())
+                return 0;
+        return param_set_bool(val, kp);
+}
+
+static const struct kernel_param_ops memmap_on_memory_ops = {
+        .flags  = KERNEL_PARAM_OPS_FL_NOARG,
+        .set    = memmap_on_memory_set,
+        .get    = param_get_bool,
+};
 
 /*
  * memory_hotplug.memmap_on_memory parameter
  */
 static bool memmap_on_memory __ro_after_init;
-#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
-module_param(memmap_on_memory, bool, 0444);
+module_param_cb(memmap_on_memory, &memmap_on_memory_ops, &memmap_on_memory, 0444);
 MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug");
+
+static inline bool mhp_memmap_on_memory(void)
+{
+        return memmap_on_memory;
+}
+#else
+static inline bool mhp_memmap_on_memory(void)
+{
+        return false;
+}
 #endif
 
 enum {
@@ -1263,9 +1285,7 @@ bool mhp_supports_memmap_on_memory(unsigned long size)
          * altmap as an alternative source of memory, and we do not exactly
          * populate a single PMD.
          */
-        return memmap_on_memory &&
-               !hugetlb_optimize_vmemmap_enabled() &&
-               IS_ENABLED(CONFIG_MHP_MEMMAP_ON_MEMORY) &&
+        return mhp_memmap_on_memory() &&
                size == memory_block_size_bytes() &&
                IS_ALIGNED(vmemmap_size, PMD_SIZE) &&
                IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT));
@@ -2083,7 +2103,7 @@ static int __ref try_remove_memory(u64 start, u64 size)
          * We only support removing memory added with MHP_MEMMAP_ON_MEMORY in
          * the same granularity it was added - a single memory block.
          */
-        if (memmap_on_memory) {
+        if (mhp_memmap_on_memory()) {
                 nr_vmemmap_pages = walk_memory_blocks(start, size, NULL,
                                                       get_nr_vmemmap_pages_cb);
                 if (nr_vmemmap_pages) {
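The new memmap_on_memory_set() callback keys off hugetlb_optimize_vmemmap_enabled(). Assuming that helper keeps the static-key form it has after the renames earlier in this series, it looks roughly like the following (simplified sketch; the exact config and key names are assumptions and may differ from the posted patches):

/*
 * Rough sketch of hugetlb_optimize_vmemmap_enabled(); the authoritative
 * definition lives in include/linux/page-flags.h.
 */
DECLARE_STATIC_KEY_MAYBE(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
                         hugetlb_optimize_vmemmap_key);

static __always_inline bool hugetlb_optimize_vmemmap_enabled(void)
{
        return static_branch_maybe(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
                                   &hugetlb_optimize_vmemmap_key);
}

With the setter in place, booting with both hugetlb_free_vmemmap=on and memory_hotplug.memmap_on_memory=1 leaves memmap_on_memory at its default of false, since memmap_on_memory_set() returns 0 without ever calling param_set_bool().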