[2/3] mm/memory_hotplug: fix potential permanent lru cache disable

Message ID 20210821094246.10149-3-linmiaohe@huawei.com (mailing list archive)
State New
Series Cleanup and fixups for memory hotplug

Commit Message

Miaohe Lin Aug. 21, 2021, 9:42 a.m. UTC
If offline_pages() fails after lru_cache_disable(), the error path never
calls lru_cache_enable(), so the LRU cache stays disabled permanently in
this case.

Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/memory_hotplug.c | 1 +
 1 file changed, 1 insertion(+)

Comments

HORIGUCHI NAOYA(堀口 直也) Aug. 23, 2021, 8:21 a.m. UTC | #1
On Sat, Aug 21, 2021 at 05:42:45PM +0800, Miaohe Lin wrote:
> If offline_pages failed after lru_cache_disable(), it forgot to do
> lru_cache_enable() in error path. So we would have lru cache disabled
> permanently in this case.
> 
> Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>

Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

Oscar Salvador Aug. 23, 2021, 9:15 a.m. UTC | #2
On 2021-08-21 11:42, Miaohe Lin wrote:
> If offline_pages failed after lru_cache_disable(), it forgot to do
> lru_cache_enable() in error path. So we would have lru cache disabled
> permanently in this case.
> 
> Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

Should this go to stable?
In case we fail to enable it again, we will bypass the pvec cache 
anytime we add a new page to the LRU which might lead to severe 
performance regression?

> ---
>  mm/memory_hotplug.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index d986d3791986..9fd0be32a281 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -2033,6 +2033,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>  	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
>  	memory_notify(MEM_CANCEL_OFFLINE, &arg);
>  failed_removal_pcplists_disabled:
> +	lru_cache_enable();
>  	zone_pcp_enable(zone);
>  failed_removal:
>  	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",

Miaohe Lin Aug. 23, 2021, 11:13 a.m. UTC | #3
On 2021/8/23 17:15, Oscar Salvador wrote:
> On 2021-08-21 11:42, Miaohe Lin wrote:
>> If offline_pages failed after lru_cache_disable(), it forgot to do
>> lru_cache_enable() in error path. So we would have lru cache disabled
>> permanently in this case.
>>
>> Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> 
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
> 

Many thanks for your review and reply. :)

> Should this go to stable?
> In case we fail to enable it again, we will bypass the pvec cache anytime we add a new page to the LRU which might lead to severe performance regression?
> 

Agree with you. I think this should go to stable too.

>> ---
>>  mm/memory_hotplug.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index d986d3791986..9fd0be32a281 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -2033,6 +2033,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>>      undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
>>      memory_notify(MEM_CANCEL_OFFLINE, &arg);
>>  failed_removal_pcplists_disabled:
>> +    lru_cache_enable();
>>      zone_pcp_enable(zone);
>>  failed_removal:
>>      pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
>

David Hildenbrand Aug. 23, 2021, 12:15 p.m. UTC | #4
On 21.08.21 11:42, Miaohe Lin wrote:
> If offline_pages failed after lru_cache_disable(), it forgot to do
> lru_cache_enable() in error path. So we would have lru cache disabled
> permanently in this case.
> 
> Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>   mm/memory_hotplug.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index d986d3791986..9fd0be32a281 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -2033,6 +2033,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>   	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
>   	memory_notify(MEM_CANCEL_OFFLINE, &arg);
>   failed_removal_pcplists_disabled:
> +	lru_cache_enable();
>   	zone_pcp_enable(zone);
>   failed_removal:
>   	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
> 

Reviewed-by: David Hildenbrand <david@redhat.com>

As mentioned, this should be backported to stable.

Patch

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index d986d3791986..9fd0be32a281 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -2033,6 +2033,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
 	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 failed_removal_pcplists_disabled:
+	lru_cache_enable();
 	zone_pcp_enable(zone);
 failed_removal:
 	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",