
[v10,2/4] mm: memory_hotplug: override memmap_on_memory when hugetlb_free_vmemmap=on

Message ID 20220509062703.64249-3-songmuchun@bytedance.com (mailing list archive)
State New
Series add hugetlb_optimize_vmemmap sysctl

Commit Message

Muchun Song May 9, 2022, 6:27 a.m. UTC
Optimizing HugeTLB vmemmap pages is not compatible with allocating memmap on
hot added memory. If "hugetlb_free_vmemmap=on" and
"memory_hotplug.memmap_on_memory" are both passed on the kernel command line,
optimizing hugetlb pages takes precedence.  However, the global variable
memmap_on_memory will still be set to 1, even though we will not try to
allocate memmap on hot added memory.

Also introduce the mhp_memmap_on_memory() helper and move the definition of
"memmap_on_memory" into the scope of CONFIG_MHP_MEMMAP_ON_MEMORY.  In the
next patch, mhp_memmap_on_memory() will also be exported for use in
hugetlb_vmemmap.c.
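
For illustration (an expectation based on the description above, not captured
output): on a kernel built with CONFIG_MHP_MEMMAP_ON_MEMORY=y, booting with

	hugetlb_free_vmemmap=on memory_hotplug.memmap_on_memory=1

should leave memmap_on_memory disabled after this patch, because the new
parameter setter returns early while hugetlb_optimize_vmemmap_enabled() is
true; reading /sys/module/memory_hotplug/parameters/memmap_on_memory would
then be expected to report "N" rather than "Y".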

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 mm/memory_hotplug.c | 32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

Comments

David Hildenbrand May 12, 2022, 7:36 a.m. UTC | #1
On 09.05.22 08:27, Muchun Song wrote:
> Optimizing HugeTLB vmemmap pages is not compatible with allocating memmap on
> hot added memory. If "hugetlb_free_vmemmap=on" and
> "memory_hotplug.memmap_on_memory" are both passed on the kernel command line,
> optimizing hugetlb pages takes precedence. 

Why?
Muchun Song May 12, 2022, 12:50 p.m. UTC | #2
On Thu, May 12, 2022 at 09:36:15AM +0200, David Hildenbrand wrote:
> On 09.05.22 08:27, Muchun Song wrote:
> > Optimizing HugeTLB vmemmap pages is not compatible with allocating memmap on
> > hot added memory. If "hugetlb_free_vmemmap=on" and
> > "memory_hotplug.memmap_on_memory" are both passed on the kernel command line,
> > optimizing hugetlb pages takes precedence. 
> 
> Why?
>

Because the two features are not compatible: hugetlb_free_vmemmap cannot
optimize vmemmap pages allocated from the alternative allocator (when
memory_hotplug.memmap_on_memory=1).  So when the hugetlb_free_vmemmap
feature was introduced, I made it take precedence.  BTW, I have a plan
to remove this restriction and will post it out ASAP.

Thanks.
David Hildenbrand May 12, 2022, 1:04 p.m. UTC | #3
On 12.05.22 14:50, Muchun Song wrote:
> On Thu, May 12, 2022 at 09:36:15AM +0200, David Hildenbrand wrote:
>> On 09.05.22 08:27, Muchun Song wrote:
>>> Optimizing HugeTLB vmemmap pages is not compatible with allocating memmap on
>>> hot added memory. If "hugetlb_free_vmemmap=on" and
>>> "memory_hotplug.memmap_on_memory" are both passed on the kernel command line,
>>> optimizing hugetlb pages takes precedence. 
>>
>> Why?
>>
> 
> Because the two features are not compatible: hugetlb_free_vmemmap cannot
> optimize vmemmap pages allocated from the alternative allocator (when
> memory_hotplug.memmap_on_memory=1).  So when the hugetlb_free_vmemmap
> feature was introduced, I made it take precedence.  BTW, I have a plan
> to remove this restriction and will post it out ASAP.

I was asking why vmemmap optimization should take precedence.
memmap_on_memory makes memory hotplug more likely to succeed in
close-to-OOM situations -- which is IMHO more important than a vmemmap
optimization.

But anyhow, the proper approach should most probably be to simply not
mess with the vmemmap if we stumble over a vmemmap that's special due to
memmap_on_memory. I assume that's what you're talking about sending out.
Muchun Song May 12, 2022, 1:59 p.m. UTC | #4
On Thu, May 12, 2022 at 03:04:57PM +0200, David Hildenbrand wrote:
> On 12.05.22 14:50, Muchun Song wrote:
> > On Thu, May 12, 2022 at 09:36:15AM +0200, David Hildenbrand wrote:
> >> On 09.05.22 08:27, Muchun Song wrote:
> >>> Optimizing HugeTLB vmemmap pages is not compatible with allocating memmap on
> >>> hot added memory. If "hugetlb_free_vmemmap=on" and
> >>> "memory_hotplug.memmap_on_memory" are both passed on the kernel command line,
> >>> optimizing hugetlb pages takes precedence. 
> >>
> >> Why?
> >>
> > 
> > Because the two features are not compatible: hugetlb_free_vmemmap cannot
> > optimize vmemmap pages allocated from the alternative allocator (when
> > memory_hotplug.memmap_on_memory=1).  So when the hugetlb_free_vmemmap
> > feature was introduced, I made it take precedence.  BTW, I have a plan
> > to remove this restriction and will post it out ASAP.
> 
> I was asking why vmemmap optimization should take precedence.
> memmap_on_memory makes memory hotplug more likely to succeed in
> close-to-OOM situations -- which is IMHO more important than a vmemmap
> optimization.
>

I thought the users who enable hugetlb_free_vmemmap value memory
savings more, so I made that decision in commit 4bab4964a59f.  From
your description, it seems I made a bad decision.
 
> But anyhow, the proper approach should most probably be to simply not
> mess with the vmemmap if we stumble over a vmemmap that's special due to
> memmap_on_memory. I assume that's what you're talking about sending out.
>

I mean I want hugetlb_vmemmap.c to check whether the section that the
HugeTLB pages belong to can be optimized, instead of making
hugetlb_free_vmemmap take precedence.  E.g. if the section's vmemmap pages
are allocated from the added memory block itself, hugetlb_free_vmemmap will
refuse to optimize the vmemmap; otherwise, it will do the optimization.  Then
both kernel parameters are compatible.  I have done those patches, but
haven't sent them out.
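
Roughly along these lines, just as a sketch of the idea (the
vmemmap_in_self_hosted_memory() helper below is a placeholder name, not
something taken from those unposted patches):

static bool vmemmap_should_optimize(const struct page *head)
{
	if (!hugetlb_optimize_vmemmap_enabled())
		return false;

	/*
	 * If the vmemmap backing this HugeTLB page came from the hot-added
	 * memory block itself (memory_hotplug.memmap_on_memory), leave it
	 * untouched; otherwise it is safe to optimize.
	 */
	if (vmemmap_in_self_hosted_memory(head))
		return false;

	return true;
}

With a check like that, the optimization path simply skips such pages and
both parameters can remain enabled together.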

Thanks.
David Hildenbrand May 12, 2022, 4:38 p.m. UTC | #5
On 12.05.22 15:59, Muchun Song wrote:
> On Thu, May 12, 2022 at 03:04:57PM +0200, David Hildenbrand wrote:
>> On 12.05.22 14:50, Muchun Song wrote:
>>> On Thu, May 12, 2022 at 09:36:15AM +0200, David Hildenbrand wrote:
>>>> On 09.05.22 08:27, Muchun Song wrote:
>>>>> Optimizing HugeTLB vmemmap pages is not compatible with allocating memmap on
>>>>> hot added memory. If "hugetlb_free_vmemmap=on" and
>>>>> "memory_hotplug.memmap_on_memory" are both passed on the kernel command line,
>>>>> optimizing hugetlb pages takes precedence. 
>>>>
>>>> Why?
>>>>
>>>
>>> Because the two features are not compatible: hugetlb_free_vmemmap cannot
>>> optimize vmemmap pages allocated from the alternative allocator (when
>>> memory_hotplug.memmap_on_memory=1).  So when the hugetlb_free_vmemmap
>>> feature was introduced, I made it take precedence.  BTW, I have a plan
>>> to remove this restriction and will post it out ASAP.
>>
>> I was asking why vmemmap optimization should take precedence.
>> memmap_on_memory makes memory hotplug more likely to succeed in
>> close-to-OOM situations -- which is IMHO more important than a vmemmap
>> optimization.
>>
> 
> I thought the users who enable hugetlb_free_vmemmap value memory
> savings more, so I made that decision in commit 4bab4964a59f.  From
> your description, it seems I made a bad decision.

Depends on the perspective I guess. :)

>  
>> But anyhow, the proper approach should most probably be to simply not
>> mess with the vmemmap if we stumble over a vmemmap that's special due to
>> memmap_on_memory. I assume that's what you're talking about sending out.
>>
> 
> I mean I want hugetlb_vmemmap.c to check whether the section that the
> HugeTLB pages belong to can be optimized, instead of making
> hugetlb_free_vmemmap take precedence.  E.g. if the section's vmemmap pages
> are allocated from the added memory block itself, hugetlb_free_vmemmap will
> refuse to optimize the vmemmap; otherwise, it will do the optimization.  Then
> both kernel parameters are compatible.  I have done those patches, but
> haven't sent them out.

Yeah, that's exactly what I thought. How complicated are they? If they
are easy, can we just avoid this patch here and do it "properly"? :)

Patch

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 111684878fd9..a6101ae402f9 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -42,14 +42,36 @@ 
 #include "internal.h"
 #include "shuffle.h"
 
+#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
+static int memmap_on_memory_set(const char *val, const struct kernel_param *kp)
+{
+	if (hugetlb_optimize_vmemmap_enabled())
+		return 0;
+	return param_set_bool(val, kp);
+}
+
+static const struct kernel_param_ops memmap_on_memory_ops = {
+	.flags	= KERNEL_PARAM_OPS_FL_NOARG,
+	.set	= memmap_on_memory_set,
+	.get	= param_get_bool,
+};
 
 /*
  * memory_hotplug.memmap_on_memory parameter
  */
 static bool memmap_on_memory __ro_after_init;
-#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
-module_param(memmap_on_memory, bool, 0444);
+module_param_cb(memmap_on_memory, &memmap_on_memory_ops, &memmap_on_memory, 0444);
 MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug");
+
+static inline bool mhp_memmap_on_memory(void)
+{
+	return memmap_on_memory;
+}
+#else
+static inline bool mhp_memmap_on_memory(void)
+{
+	return false;
+}
 #endif
 
 enum {
@@ -1263,9 +1285,7 @@  bool mhp_supports_memmap_on_memory(unsigned long size)
 	 *       altmap as an alternative source of memory, and we do not exactly
 	 *       populate a single PMD.
 	 */
-	return memmap_on_memory &&
-	       !hugetlb_optimize_vmemmap_enabled() &&
-	       IS_ENABLED(CONFIG_MHP_MEMMAP_ON_MEMORY) &&
+	return mhp_memmap_on_memory() &&
 	       size == memory_block_size_bytes() &&
 	       IS_ALIGNED(vmemmap_size, PMD_SIZE) &&
 	       IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT));
@@ -2083,7 +2103,7 @@  static int __ref try_remove_memory(u64 start, u64 size)
 	 * We only support removing memory added with MHP_MEMMAP_ON_MEMORY in
 	 * the same granularity it was added - a single memory block.
 	 */
-	if (memmap_on_memory) {
+	if (mhp_memmap_on_memory()) {
 		nr_vmemmap_pages = walk_memory_blocks(start, size, NULL,
 						      get_nr_vmemmap_pages_cb);
 		if (nr_vmemmap_pages) {