diff mbox series

[v1,5/5] mm/memory_hotplug: print only with DEBUG_VM in online/offline_pages()

Message ID 20180816100628.26428-6-david@redhat.com (mailing list archive)
State New, archived
Headers show
Series mm/memory_hotplug: online/offline_pages refactorings | expand

Commit Message

David Hildenbrand Aug. 16, 2018, 10:06 a.m. UTC
Let's try to minimze the noise.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Oscar Salvador Aug. 17, 2018, 8:18 a.m. UTC | #1
>  failed_addition:
> +#ifdef CONFIG_DEBUG_VM
>  	pr_debug("online_pages [mem %#010llx-%#010llx] failed\n",
>  		 (unsigned long long) pfn << PAGE_SHIFT,
>  		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
> +#endif

I have never been sure about this.
IMO, if I fail to online pages, I want to know I failed.
I think that pr_err would be better than pr_debug and without CONFIG_DEBUG_VM.

But at least, if not, envolve it with a CONFIG_DEBUG_VM, but change pr_debug to pr_info.

> +#ifdef CONFIG_DEBUG_VM
>  	pr_debug("memory offlining [mem %#010llx-%#010llx] failed\n",
>  		 (unsigned long long) start_pfn << PAGE_SHIFT,
>  		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
> +#endif

Same goes here.

Thanks
Wei Yang Aug. 19, 2018, 12:34 p.m. UTC | #2
On Fri, Aug 17, 2018 at 10:18:53AM +0200, Oscar Salvador wrote:
>>  failed_addition:
>> +#ifdef CONFIG_DEBUG_VM
>>  	pr_debug("online_pages [mem %#010llx-%#010llx] failed\n",
>>  		 (unsigned long long) pfn << PAGE_SHIFT,
>>  		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
>> +#endif
>
>I have never been sure about this.
>IMO, if I fail to online pages, I want to know I failed.
>I think that pr_err would be better than pr_debug and without CONFIG_DEBUG_VM.
>
>But at least, if not, envolve it with a CONFIG_DEBUG_VM, but change pr_debug to pr_info.
>

I don't have a clear rule about these debug macro neither.

While when you look at the page related logs in calculate_node_totalpages(),
it is KERNEL_DEBUG level and without any config macro.

Maybe we should leave them at the same state?

>> +#ifdef CONFIG_DEBUG_VM
>>  	pr_debug("memory offlining [mem %#010llx-%#010llx] failed\n",
>>  		 (unsigned long long) start_pfn << PAGE_SHIFT,
>>  		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
>> +#endif
>
>Same goes here.
>
>Thanks
>-- 
>Oscar Salvador
>SUSE L3
David Hildenbrand Aug. 20, 2018, 9:45 a.m. UTC | #3
On 17.08.2018 10:18, Oscar Salvador wrote:
>>  failed_addition:
>> +#ifdef CONFIG_DEBUG_VM
>>  	pr_debug("online_pages [mem %#010llx-%#010llx] failed\n",
>>  		 (unsigned long long) pfn << PAGE_SHIFT,
>>  		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
>> +#endif
> 
> I have never been sure about this.
> IMO, if I fail to online pages, I want to know I failed.
> I think that pr_err would be better than pr_debug and without CONFIG_DEBUG_VM.

I consider both error messages only partially useful, as

1. They only catch a subset of actual failures the function handles.
   E.g. onlining will not report an error message if the memory notifier
   failed.
2. Onlining/Offlining is usually (with exceptions - e.g. onlining during
   add_memory) triggered from user space, where we present an error
   code. At any times, the actual state of the memory blocks can be
   observed by querying the state.

I would even vote for dropping the two error case messages completely.
At least I don't consider them very useful.

> 
> But at least, if not, envolve it with a CONFIG_DEBUG_VM, but change pr_debug to pr_info.
> 
>> +#ifdef CONFIG_DEBUG_VM
>>  	pr_debug("memory offlining [mem %#010llx-%#010llx] failed\n",
>>  		 (unsigned long long) start_pfn << PAGE_SHIFT,
>>  		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
>> +#endif
> 
> Same goes here.
> 
> Thanks
>
David Hildenbrand Aug. 20, 2018, 9:49 a.m. UTC | #4
On 20.08.2018 11:45, David Hildenbrand wrote:
> On 17.08.2018 10:18, Oscar Salvador wrote:
>>>  failed_addition:
>>> +#ifdef CONFIG_DEBUG_VM
>>>  	pr_debug("online_pages [mem %#010llx-%#010llx] failed\n",
>>>  		 (unsigned long long) pfn << PAGE_SHIFT,
>>>  		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
>>> +#endif
>>
>> I have never been sure about this.
>> IMO, if I fail to online pages, I want to know I failed.
>> I think that pr_err would be better than pr_debug and without CONFIG_DEBUG_VM.
> 
> I consider both error messages only partially useful, as
> 
> 1. They only catch a subset of actual failures the function handles.
>    E.g. onlining will not report an error message if the memory notifier
>    failed.

That statement was wrong. It is rather in offline_pages, errors from
start_isolate_page_range() are ignored.

> 2. Onlining/Offlining is usually (with exceptions - e.g. onlining during
>    add_memory) triggered from user space, where we present an error
>    code. At any times, the actual state of the memory blocks can be
>    observed by querying the state.
> 
> I would even vote for dropping the two error case messages completely.
> At least I don't consider them very useful.
> 
>>
>> But at least, if not, envolve it with a CONFIG_DEBUG_VM, but change pr_debug to pr_info.
>>
>>> +#ifdef CONFIG_DEBUG_VM
>>>  	pr_debug("memory offlining [mem %#010llx-%#010llx] failed\n",
>>>  		 (unsigned long long) start_pfn << PAGE_SHIFT,
>>>  		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
>>> +#endif
>>
>> Same goes here.
>>
>> Thanks
>>
> 
>
David Hildenbrand Aug. 20, 2018, 9:57 a.m. UTC | #5
On 19.08.2018 14:34, Wei Yang wrote:
> On Fri, Aug 17, 2018 at 10:18:53AM +0200, Oscar Salvador wrote:
>>>  failed_addition:
>>> +#ifdef CONFIG_DEBUG_VM
>>>  	pr_debug("online_pages [mem %#010llx-%#010llx] failed\n",
>>>  		 (unsigned long long) pfn << PAGE_SHIFT,
>>>  		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
>>> +#endif
>>
>> I have never been sure about this.
>> IMO, if I fail to online pages, I want to know I failed.
>> I think that pr_err would be better than pr_debug and without CONFIG_DEBUG_VM.
>>
>> But at least, if not, envolve it with a CONFIG_DEBUG_VM, but change pr_debug to pr_info.
>>
> 
> I don't have a clear rule about these debug macro neither.
> 
> While when you look at the page related logs in calculate_node_totalpages(),
> it is KERNEL_DEBUG level and without any config macro.
> 
> Maybe we should leave them at the same state?

I guess we can do that for the to debug messages.

When offlining memory right now:

:/# echo 0 > /sys/devices/system/memory/memory9/online
[   24.476207] Offlined Pages 32768
[   24.477200] remove from free list 48000 1024 50000
[   24.477896] remove from free list 48400 1024 50000
[   24.478584] remove from free list 48800 1024 50000
[   24.479454] remove from free list 48c00 1024 50000
[   24.480192] remove from free list 49000 1024 50000
[   24.480957] remove from free list 49400 1024 50000
[   24.481752] remove from free list 49800 1024 50000
[   24.482578] remove from free list 49c00 1024 50000
[   24.483302] remove from free list 4a000 1024 50000
[   24.484300] remove from free list 4a400 1024 50000
[   24.484902] remove from free list 4a800 1024 50000
[   24.485462] remove from free list 4ac00 1024 50000
[   24.486381] remove from free list 4b000 1024 50000
[   24.487108] remove from free list 4b400 1024 50000
[   24.487842] remove from free list 4b800 1024 50000
[   24.488610] remove from free list 4bc00 1024 50000
[   24.489548] remove from free list 4c000 1024 50000
[   24.490392] remove from free list 4c400 1024 50000
[   24.491224] remove from free list 4c800 1024 50000
...

While "remove from free list" is pr_info under CONFIG_DEBUG_VM,
"Offlined Pages ..." is pr_info without CONFIG_DEBUG_VM.
David Hildenbrand Aug. 20, 2018, 10:46 a.m. UTC | #6
On 16.08.2018 12:06, David Hildenbrand wrote:
> Let's try to minimze the noise.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/memory_hotplug.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index bbbd16f9d877..6fec2dc6a73d 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -966,9 +966,11 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
>  	return 0;
>  
>  failed_addition:
> +#ifdef CONFIG_DEBUG_VM
>  	pr_debug("online_pages [mem %#010llx-%#010llx] failed\n",
>  		 (unsigned long long) pfn << PAGE_SHIFT,
>  		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
> +#endif
>  	memory_notify(MEM_CANCEL_ONLINE, &arg);
>  	return ret;
>  }
> @@ -1660,7 +1662,9 @@ int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
>  	if (offlined_pages < 0)
>  		goto repeat;
> +#ifdef CONFIG_DEBUG_VM
>  	pr_info("Offlined Pages %ld\n", offlined_pages);
> +#endif
>  	/* Ok, all of our target is isolated.
>  	   We cannot do rollback at this point. */
>  	offline_isolated_pages(start_pfn, end_pfn);
> @@ -1695,9 +1699,11 @@ int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  	return 0;
>  
>  failed_removal:
> +#ifdef CONFIG_DEBUG_VM
>  	pr_debug("memory offlining [mem %#010llx-%#010llx] failed\n",
>  		 (unsigned long long) start_pfn << PAGE_SHIFT,
>  		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
> +#endif
>  	memory_notify(MEM_CANCEL_OFFLINE, &arg);
>  	/* pushback to free area */
>  	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
> 

I'll drop this patch for now, maybe the error messages are actually
useful when debugging a crashdump of a system without CONFIG_DEBUG_VM.
Wei Yang Aug. 20, 2018, 1:12 p.m. UTC | #7
On Mon, Aug 20, 2018 at 11:57:04AM +0200, David Hildenbrand wrote:
>On 19.08.2018 14:34, Wei Yang wrote:
>> On Fri, Aug 17, 2018 at 10:18:53AM +0200, Oscar Salvador wrote:
>>>>  failed_addition:
>>>> +#ifdef CONFIG_DEBUG_VM
>>>>  	pr_debug("online_pages [mem %#010llx-%#010llx] failed\n",
>>>>  		 (unsigned long long) pfn << PAGE_SHIFT,
>>>>  		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
>>>> +#endif
>>>
>>> I have never been sure about this.
>>> IMO, if I fail to online pages, I want to know I failed.
>>> I think that pr_err would be better than pr_debug and without CONFIG_DEBUG_VM.
>>>
>>> But at least, if not, envolve it with a CONFIG_DEBUG_VM, but change pr_debug to pr_info.
>>>
>> 
>> I don't have a clear rule about these debug macro neither.
>> 
>> While when you look at the page related logs in calculate_node_totalpages(),
>> it is KERNEL_DEBUG level and without any config macro.
>> 
>> Maybe we should leave them at the same state?
>
>I guess we can do that for the to debug messages.
>
>When offlining memory right now:
>
>:/# echo 0 > /sys/devices/system/memory/memory9/online
>[   24.476207] Offlined Pages 32768
>[   24.477200] remove from free list 48000 1024 50000
>[   24.477896] remove from free list 48400 1024 50000
>[   24.478584] remove from free list 48800 1024 50000
>[   24.479454] remove from free list 48c00 1024 50000
>[   24.480192] remove from free list 49000 1024 50000
>[   24.480957] remove from free list 49400 1024 50000
>[   24.481752] remove from free list 49800 1024 50000
>[   24.482578] remove from free list 49c00 1024 50000
>[   24.483302] remove from free list 4a000 1024 50000
>[   24.484300] remove from free list 4a400 1024 50000
>[   24.484902] remove from free list 4a800 1024 50000
>[   24.485462] remove from free list 4ac00 1024 50000
>[   24.486381] remove from free list 4b000 1024 50000
>[   24.487108] remove from free list 4b400 1024 50000
>[   24.487842] remove from free list 4b800 1024 50000
>[   24.488610] remove from free list 4bc00 1024 50000
>[   24.489548] remove from free list 4c000 1024 50000
>[   24.490392] remove from free list 4c400 1024 50000
>[   24.491224] remove from free list 4c800 1024 50000
>...
>
>While "remove from free list" is pr_info under CONFIG_DEBUG_VM,
>"Offlined Pages ..." is pr_info without CONFIG_DEBUG_VM.

Hmm... yes, don't see the logic between them.

>
>-- 
>
>Thanks,
>
>David / dhildenb
Pasha Tatashin Aug. 30, 2018, 10:36 p.m. UTC | #8
On 8/20/18 6:46 AM, David Hildenbrand wrote:
> On 16.08.2018 12:06, David Hildenbrand wrote:
>> Let's try to minimze the noise.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  mm/memory_hotplug.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index bbbd16f9d877..6fec2dc6a73d 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -966,9 +966,11 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
>>  	return 0;
>>  
>>  failed_addition:
>> +#ifdef CONFIG_DEBUG_VM
>>  	pr_debug("online_pages [mem %#010llx-%#010llx] failed\n",
>>  		 (unsigned long long) pfn << PAGE_SHIFT,
>>  		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
>> +#endif
>>  	memory_notify(MEM_CANCEL_ONLINE, &arg);
>>  	return ret;
>>  }
>> @@ -1660,7 +1662,9 @@ int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>>  	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
>>  	if (offlined_pages < 0)
>>  		goto repeat;
>> +#ifdef CONFIG_DEBUG_VM
>>  	pr_info("Offlined Pages %ld\n", offlined_pages);
>> +#endif
>>  	/* Ok, all of our target is isolated.
>>  	   We cannot do rollback at this point. */
>>  	offline_isolated_pages(start_pfn, end_pfn);
>> @@ -1695,9 +1699,11 @@ int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>>  	return 0;
>>  
>>  failed_removal:
>> +#ifdef CONFIG_DEBUG_VM
>>  	pr_debug("memory offlining [mem %#010llx-%#010llx] failed\n",
>>  		 (unsigned long long) start_pfn << PAGE_SHIFT,
>>  		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
>> +#endif
>>  	memory_notify(MEM_CANCEL_OFFLINE, &arg);
>>  	/* pushback to free area */
>>  	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
>>
> 
> I'll drop this patch for now, maybe the error messages are actually
> useful when debugging a crashdump of a system without CONFIG_DEBUG_VM.
> 

Yes, please drop it, no reason to reduce amount of these messages. They
are useful.

Thank you,
Pavel
diff mbox series

Patch

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index bbbd16f9d877..6fec2dc6a73d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -966,9 +966,11 @@  int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 	return 0;
 
 failed_addition:
+#ifdef CONFIG_DEBUG_VM
 	pr_debug("online_pages [mem %#010llx-%#010llx] failed\n",
 		 (unsigned long long) pfn << PAGE_SHIFT,
 		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
+#endif
 	memory_notify(MEM_CANCEL_ONLINE, &arg);
 	return ret;
 }
@@ -1660,7 +1662,9 @@  int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
 	if (offlined_pages < 0)
 		goto repeat;
+#ifdef CONFIG_DEBUG_VM
 	pr_info("Offlined Pages %ld\n", offlined_pages);
+#endif
 	/* Ok, all of our target is isolated.
 	   We cannot do rollback at this point. */
 	offline_isolated_pages(start_pfn, end_pfn);
@@ -1695,9 +1699,11 @@  int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 	return 0;
 
 failed_removal:
+#ifdef CONFIG_DEBUG_VM
 	pr_debug("memory offlining [mem %#010llx-%#010llx] failed\n",
 		 (unsigned long long) start_pfn << PAGE_SHIFT,
 		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
+#endif
 	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 	/* pushback to free area */
 	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);