
[v3,0/5] HWpoison: further fixes and cleanups

Message ID 20200914101559.17103-1-osalvador@suse.de (mailing list archive)

Message

Oscar Salvador Sept. 14, 2020, 10:15 a.m. UTC
The important bit of this patchset is patch#1, which is a fix to take
HWPoison pages off the buddy freelists; leaving them there can put HWPoison
pages back in the game without anyone noticing. So fix it (we already did
that for soft_offline_page [1]).

The other patches are clean-ups and not that important, so if anything,
consider patch#1 for inclusion.

[1] https://patchwork.kernel.org/cover/11704083/
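
For anyone skimming the archive, the gist of patch#1 is a helper that, under
zone->lock, finds the buddy block containing the poisoned page, unlinks it,
and gives everything except the poisoned page back to the allocator. A
simplified sketch of the idea (a paraphrase for illustration, not the patch
itself; unlink_and_split_block() is a hypothetical stand-in for the actual
freelist surgery):

	/*
	 * Simplified sketch of the idea behind patch#1, not the patch itself.
	 * unlink_and_split_block() is a hypothetical stand-in for the real
	 * freelist manipulation.
	 */
	static bool take_page_off_buddy(struct page *page)
	{
		struct zone *zone = page_zone(page);
		unsigned long pfn = page_to_pfn(page);
		unsigned long flags;
		unsigned int order;
		bool ret = false;

		spin_lock_irqsave(&zone->lock, flags);
		for (order = 0; order < MAX_ORDER; order++) {
			/* Head of the order-aligned block containing 'page'. */
			struct page *head = page - (pfn & ((1UL << order) - 1));

			if (PageBuddy(head) && page_order(head) >= order) {
				/*
				 * Unlink the whole block; give every sub-block
				 * except the poisoned page back to the
				 * freelists.
				 */
				unlink_and_split_block(zone, head, page, order);
				ret = true;
				break;
			}
			if (page_count(head) > 0)
				break;	/* allocated, not a free buddy page */
		}
		spin_unlock_irqrestore(&zone->lock, flags);

		/* false: the page was never, or is no longer, a free buddy page. */
		return ret;
	}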

Thanks

Oscar Salvador (5):
  mm,hwpoison: take free pages off the buddy freelists
  mm,hwpoison: refactor madvise_inject_error
  mm,hwpoison: drain pcplists before bailing out for non-buddy
    zero-refcount page
  mm,hwpoison: drop unneeded pcplist draining
  mm,hwpoison: remove stale code

 mm/madvise.c        | 36 +++++++++++++-------------------
 mm/memory-failure.c | 50 +++++++++++++++++++++++++++++++++------------
 2 files changed, 51 insertions(+), 35 deletions(-)

Comments

Aristeu Rozanski Sept. 15, 2020, 9:22 p.m. UTC | #1
Hi Oscar, Naoya,

On Mon, Sep 14, 2020 at 12:15:54PM +0200, Oscar Salvador wrote:
> The important bit of this patchset is patch#1, which is a fix to take
> HWPoison pages off the buddy freelists; leaving them there can put HWPoison
> pages back in the game without anyone noticing. So fix it (we already did
> that for soft_offline_page [1]).
> 
> The other patches are clean-ups and not that important, so if anything,
> consider patch#1 for inclusion.
> 
> [1] https://patchwork.kernel.org/cover/11704083/

I found something strange with your and Naoya's hwpoison rework. We have a
customer with a testcase that basically does:

	p1 = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	p2 = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	madvise(p1, size, MADV_MERGEABLE);
	madvise(p2, size, MADV_MERGEABLE);

	memset(p1, 'a', size);
	memset(p2, 'a', size);

	madvise(p1, size, MADV_SOFT_OFFLINE);

	madvise(p1, size, MADV_UNMERGEABLE);
	madvise(p2, size, MADV_UNMERGEABLE);


where size is about 200,000 pages. It works on an x86_64 box (with and without
the hwpoison rework). On ppc64 boxes (tested on 3 different ones, each with at
least 250GB of memory) it fails to take a page off the buddy list in
page_handle_poison()/take_page_off_buddy(), and madvise(MADV_SOFT_OFFLINE)
returns -EBUSY. Without the hwpoison rework the test passes.
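
For reference, a self-contained version of the reproducer (my reconstruction
from the snippet above: the page count, includes, and error checks are my
additions, and MADV_SOFT_OFFLINE needs CONFIG_MEMORY_FAILURE plus
CAP_SYS_ADMIN):

	/* Reconstructed reproducer; constants and error checks added. */
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <unistd.h>

	#define NPAGES	200000UL

	int main(void)
	{
		size_t size = NPAGES * sysconf(_SC_PAGESIZE);
		char *p1, *p2;

		p1 = mmap(NULL, size, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		p2 = mmap(NULL, size, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p1 == MAP_FAILED || p2 == MAP_FAILED) {
			perror("mmap");
			return 1;
		}

		/* Let KSM consider (and merge) the identical pages. */
		madvise(p1, size, MADV_MERGEABLE);
		madvise(p2, size, MADV_MERGEABLE);
		memset(p1, 'a', size);
		memset(p2, 'a', size);

		/* This is the call that fails with -EBUSY on ppc64. */
		if (madvise(p1, size, MADV_SOFT_OFFLINE))
			perror("MADV_SOFT_OFFLINE");

		madvise(p1, size, MADV_UNMERGEABLE);
		madvise(p2, size, MADV_UNMERGEABLE);
		return 0;
	}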

Possibly related is that ppc64 takes a long time to run this test and,
according to perf, it spends most of the time clearing pages:

	  17.15%  ksm_poison  [kernel.kallsyms]  [k] copypage_power7
	  13.39%  ksm_poison  [kernel.kallsyms]  [k] clear_user_page
	   8.70%  ksm_poison  libc-2.28.so       [.] __memset_power8
	   8.63%  ksm_poison  [kernel.kallsyms]  [k] opal_return
	   6.04%  ksm_poison  [kernel.kallsyms]  [k] __opal_call
	   2.67%  ksm_poison  [kernel.kallsyms]  [k] opal_call
	   1.52%  ksm_poison  [kernel.kallsyms]  [k] _raw_spin_lock
	   1.45%  ksm_poison  [kernel.kallsyms]  [k] opal_flush_console
	   1.43%  ksm_poison  [unknown]          [k] 0x0000000030005138
	   1.43%  ksm_poison  [kernel.kallsyms]  [k] opal_console_write_buffer_space
	   1.26%  ksm_poison  [kernel.kallsyms]  [k] hvc_console_print
(...)

I've run these tests using mmotm and mmotm with this patchset on top.

Do you know what might be happening here?
Oscar Salvador Sept. 16, 2020, 7:27 a.m. UTC | #2
On Tue, Sep 15, 2020 at 05:22:22PM -0400, Aristeu Rozanski wrote:
> Hi Oscar, Naoya,

Hi Aristeu,

thanks for reporting this.

> I've run these tests using mmotm and mmotm with this patchset on top.

Could you please re-run the tests with the below patch applied, and
attach the logs here?

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 84a7f228af36..d7b6e7724e47 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -67,6 +67,7 @@ atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
 
 static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
 {
+	dump_page(page, "page_handle_poison");
 	if (release) {
 		put_page(page);
 		drain_all_pages(page_zone(page));
@@ -77,7 +78,7 @@ static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
 		 * Doing this check for free pages is also fine since dissolve_free_huge_page
 		 * returns 0 for non-hugetlb pages as well.
 		 */
-		if (dissolve_free_huge_page(page) || !take_page_off_buddy(page))
+		if (dissolve_free_huge_page(page) || !take_page_off_buddy(page)) {
 			/*
 			 * We could fail to take off the target page from buddy
 			 * for example due to racy page allocaiton, but that's
@@ -85,7 +86,9 @@ static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, boo
 			 * and if someone really want to use it, they should
 			 * take it.
 			 */
+			pr_info("%s: hugepage_or_freepage failed¸n", __func__);
 			return false;
+		}
 	}
 
 	SetPageHWPoison(page);
@@ -1858,8 +1861,11 @@ static int __soft_offline_page(struct page *page)
 		if (!ret) {
 			bool release = !huge;
 
-			if (!page_handle_poison(page, true, release))
+			if (!page_handle_poison(page, true, release)) {
+				pr_info("%s: page_handle_poison -EBUSY\n", __func__);
+				dump_page(page, "__soft_offline_page after migrate");
 				ret = -EBUSY;
+			}
 		} else {
 			if (!list_empty(&pagelist))
 				putback_movable_pages(&pagelist);
@@ -1872,6 +1878,7 @@ static int __soft_offline_page(struct page *page)
 	} else {
 		pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
 			pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
+		dump_page(page, "__soft_offline_page isolation failed");
 		ret = -EBUSY;
 	}
 	return ret;
@@ -1882,8 +1889,11 @@ static int soft_offline_in_use_page(struct page *page)
 	struct page *hpage = compound_head(page);
 
 	if (!PageHuge(page) && PageTransHuge(hpage))
-		if (try_to_split_thp_page(page, "soft offline") < 0)
+		if (try_to_split_thp_page(page, "soft offline") < 0) {
+			pr_info("%s: try_to_split_thp_page -EBUSY\n", __func__);
+			dump_page(page, "try_to_split_thp_page");
 			return -EBUSY;
+		}
 	return __soft_offline_page(page);
 }
 
@@ -1891,8 +1901,11 @@ static int soft_offline_free_page(struct page *page)
 {
 	int rc = 0;
 
-	if (!page_handle_poison(page, true, false))
+	if (!page_handle_poison(page, true, false)) {
+		pr_info("%s: page_handle_poison -EBUSY\n", __func__);
+		dump_page(page, "soft_offline_free_page");
 		rc = -EBUSY;
+	}
 
 	return rc;
 }

Thanks
HORIGUCHI NAOYA(堀口 直也) Sept. 16, 2020, 1:42 p.m. UTC | #3
On Tue, Sep 15, 2020 at 05:22:22PM -0400, Aristeu Rozanski wrote:
> Hi Oscar, Naoya,
> 
> On Mon, Sep 14, 2020 at 12:15:54PM +0200, Oscar Salvador wrote:
> > The important bit of this patchset is patch#1, which is a fix to take
> > HWPoison pages off the buddy freelists; leaving them there can put HWPoison
> > pages back in the game without anyone noticing. So fix it (we already did
> > that for soft_offline_page [1]).
> > 
> > The other patches are clean-ups and not that important, so if anything,
> > consider patch#1 for inclusion.
> > 
> > [1] https://patchwork.kernel.org/cover/11704083/
> 
> I found something strange with your and Naoya's hwpoison rework. We have a
> customer with a testcase that basically does:
> 
> 	p1 = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> 	p2 = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> 
> 	madvise(p1, size, MADV_MERGEABLE);
> 	madvise(p2, size, MADV_MERGEABLE);
> 
> 	memset(p1, 'a', size);
> 	memset(p2, 'a', size);
> 
> 	madvise(p1, size, MADV_SOFT_OFFLINE);
> 
> 	madvise(p1, size, MADV_UNMERGEABLE);
> 	madvise(p2, size, MADV_UNMERGEABLE);
> 
> 
> where size is about 200,000 pages. It works on an x86_64 box (with and without
> the hwpoison rework). On ppc64 boxes (tested on 3 different ones, each with at
> least 250GB of memory) it fails to take a page off the buddy list in
> page_handle_poison()/take_page_off_buddy(), and madvise(MADV_SOFT_OFFLINE)
> returns -EBUSY. Without the hwpoison rework the test passes.

I reproduced a similar -EBUSY on a small x86 VM, where it seems to me there is
a race between take_page_off_buddy() and page allocation.  Oscar's debug patch
shows the following kernel messages:

    [  627.357009] Soft offlining pfn 0x235018 at process virtual address 0x7fd112140000
    [  627.358747] __get_any_page: 0x235018 free buddy page
    [  627.359875] page:00000000038b52c9 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x1 pfn:0x235018
    [  627.362002] flags: 0x57ffe000000000()
    [  627.362841] raw: 0057ffe000000000 fffff84648d12688 ffff955abffd1dd0 0000000000000000
    [  627.364555] raw: 0000000000000001 0000000000000000 00000000ffffff7f 0000000000000000
    [  627.366258] page dumped because: page_handle_poison
    [  627.367357] page->mem_cgroup:ffff9559b6912000
    [  627.368342] page_handle_poison: hugepage_or_freepage failed\xb8n
    [  627.368344] soft_offline_free_page: page_handle_poison -EBUSY
    [  627.370901] page:00000000038b52c9 refcount:6 mapcount:3 mapping:000000001226bf89 index:0x2710 pfn:0x235018
    [  627.373048] aops:ext4_da_aops ino:c63f3 dentry name:"system.journal"
    [  627.374526] flags: 0x57ffe00000201c(uptodate|dirty|lru|private)
    [  627.375865] raw: 0057ffe00000201c fffff84648d300c8 ffff955ab8c3f020 ffff955aba5f4ee0
    [  627.377586] raw: 0000000000002710 ffff9559b811fc98 0000000500000002 ffff9559b6912000
    [  627.379308] page dumped because: soft_offline_free_page
    [  627.380480] page->mem_cgroup:ffff9559b6912000

    CPU 0                                CPU 1

    get_any_page // returns 0 (free buddy path)
      soft_offline_free_page
                                         the page is allocated
        page_handle_poison -> fail
          return -EBUSY

I'm still not sure why this issue was invisible before the rework patches,
but setting the migratetype to MIGRATE_ISOLATE during offlining could make
the behavior sensitive to timing.
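
Spelling the window out a bit more than the diagram above: get_any_page()
reaches its "free buddy page" verdict from page_count() without holding
zone->lock, so the verdict can go stale before take_page_off_buddy()
re-checks under the lock, which is also why the failure is benign and
retryable. An illustrative timeline (schematic only, not the exact call
chains):

	/*
	 * Schematic timeline, not the exact kernel call chains:
	 *
	 * CPU 0 (soft offline)                 CPU 1 (allocator)
	 *
	 * get_any_page()
	 *   page_count(page) == 0
	 *   -> "free buddy page", returns 0
	 *                                      allocates the page, taking it
	 *                                      off the buddy freelists
	 * soft_offline_free_page()
	 *   page_handle_poison()
	 *     take_page_off_buddy()
	 *       spin_lock(&zone->lock)
	 *       page no longer PageBuddy -> bail out
	 *       spin_unlock(&zone->lock)
	 *     -> false
	 * return -EBUSY
	 */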

Thanks,
Naoya Horiguchi
Aristeu Rozanski Sept. 16, 2020, 1:53 p.m. UTC | #4
Hi Oscar,

On Wed, Sep 16, 2020 at 09:27:02AM +0200, Oscar Salvador wrote:
> Could you please re-run the tests with the below patch applied, and
> attach the logs here?

here it is:
(removed previous dump_page() calls for other pages that didn't fail)

[  109.709342] Soft offlining pfn 0x3fb526 at process virtual address 0x7ffc7a180000
[  109.716969] page:00000000f367dde5 refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ffc7a18 pfn:0x3fb526
[  109.716978] anon flags: 0x3ffff80008000e(referenced|uptodate|dirty|swapbacked)
[  109.716988] raw: 003ffff80008000e 5deadbeef0000100 5deadbeef0000122 c000003fcd56d839
[  109.716997] raw: 000000007ffc7a18 0000000000000000 00000001ffffffff c000003fd42f5000
[  109.717005] page dumped because: page_handle_poison
[  109.717011] page->mem_cgroup:c000003fd42f5000
[  109.725882] page_handle_poison: hugepage_or_freepage failed
[  109.725885] __soft_offline_page: page_handle_poison -EBUSY
[  109.725898] page:00000000f367dde5 refcount:3 mapcount:0 mapping:00000000b43d73e6 index:0x58 pfn:0x3fb526
[  109.725951] aops:xfs_address_space_operations [xfs] ino:49f9c5f dentry name:"messages"
[  109.725958] flags: 0x3ffff800002008(dirty|private)
[  109.725963] raw: 003ffff800002008 5deadbeef0000100 5deadbeef0000122 c000003fb8b7eea8
[  109.725969] raw: 0000000000000058 c000003fdd94eb20 00000003ffffffff c000003fd3c42000
[  109.725975] page dumped because: __soft_offline_page after migrate
[  109.725980] page->mem_cgroup:c000003fd3c42000
Oscar Salvador Sept. 16, 2020, 2:06 p.m. UTC | #5
On Wed, Sep 16, 2020 at 01:42:15PM +0000, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Tue, Sep 15, 2020 at 05:22:22PM -0400, Aristeu Rozanski wrote:
 
> I reproduced a similar -EBUSY on a small x86 VM, where it seems to me there is
> a race between take_page_off_buddy() and page allocation.  Oscar's debug patch
> shows the following kernel messages:
> 
>     [  627.357009] Soft offlining pfn 0x235018 at process virtual address 0x7fd112140000
>     [  627.358747] __get_any_page: 0x235018 free buddy page
>     [  627.359875] page:00000000038b52c9 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x1 pfn:0x235018
>     [  627.362002] flags: 0x57ffe000000000()
>     [  627.362841] raw: 0057ffe000000000 fffff84648d12688 ffff955abffd1dd0 0000000000000000
>     [  627.364555] raw: 0000000000000001 0000000000000000 00000000ffffff7f 0000000000000000
>     [  627.366258] page dumped because: page_handle_poison
>     [  627.367357] page->mem_cgroup:ffff9559b6912000
>     [  627.368342] page_handle_poison: hugepage_or_freepage failed\xb8n
>     [  627.368344] soft_offline_free_page: page_handle_poison -EBUSY
>     [  627.370901] page:00000000038b52c9 refcount:6 mapcount:3 mapping:000000001226bf89 index:0x2710 pfn:0x235018
>     [  627.373048] aops:ext4_da_aops ino:c63f3 dentry name:"system.journal"
>     [  627.374526] flags: 0x57ffe00000201c(uptodate|dirty|lru|private)
>     [  627.375865] raw: 0057ffe00000201c fffff84648d300c8 ffff955ab8c3f020 ffff955aba5f4ee0
>     [  627.377586] raw: 0000000000002710 ffff9559b811fc98 0000000500000002 ffff9559b6912000
>     [  627.379308] page dumped because: soft_offline_free_page
>     [  627.380480] page->mem_cgroup:ffff9559b6912000
> 
>     CPU 0                                CPU 1
> 
>     get_any_page // returns 0 (free buddy path)
>       soft_offline_free_page
>                                          the page is allocated
>         page_handle_poison -> fail
>           return -EBUSY
> 
> I'm still not sure why this issue was invisible before the rework patches,
> but setting the migratetype to MIGRATE_ISOLATE during offlining could make
> the behavior sensitive to timing.

Well, this is very timing dependent.
AFAICS, before the rework patchset we could still race with an allocation,
as the page could have been allocated between get_any_page() and the call
to set_hwpoison_free_buddy_page(), which takes the zone->lock to close
that window.

Maybe we just happen to take longer now to reach take_page_off_buddy(), so
the race window is bigger.

AFAICS, this has nothing to do with MIGRATE_ISOLATE, because here we are
dealing with pages that are already free (part of the buddy system).

The only thing that comes to mind right off the bat is to do a "retry" in
soft_offline_page() in case soft_offline_free_page() returns -EBUSY, so we
can call get_any_page() again and try to handle the new type of page.
Something like (untested):

@@ -1923,6 +1977,7 @@ int soft_offline_page(unsigned long pfn, int flags)
 {
 	int ret;
 	struct page *page;
+	bool try_again = true;
 
 	if (!pfn_valid(pfn))
 		return -ENXIO;
@@ -1938,6 +1993,7 @@ int soft_offline_page(unsigned long pfn, int flags)
 		return 0;
 	}
 
+retry:
 	get_online_mems();
 	ret = get_any_page(page, pfn, flags);
 	put_online_mems();
@@ -1945,7 +2001,10 @@ int soft_offline_page(unsigned long pfn, int flags)
 	if (ret > 0)
 		ret = soft_offline_in_use_page(page);
 	else if (ret == 0)
-		ret = soft_offline_free_page(page);
+		if (soft_offline_free_page(page) && try_again) {
+			try_again = false;
+			goto retry;
+		}
 
 	return ret;
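
One caveat with the hunk as written: if the retried soft_offline_free_page()
fails as well, its return value is discarded and ret stays 0, so the failure
would be reported as success. A variant that propagates the error might look
like this (equally untested sketch):

	/* Untested variant: propagate the error if the retry fails too. */
	else if (ret == 0) {
		ret = soft_offline_free_page(page);
		if (ret && try_again) {
			try_again = false;
			goto retry;
		}
	}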
Oscar Salvador Sept. 16, 2020, 2:09 p.m. UTC | #6
On Wed, Sep 16, 2020 at 09:53:58AM -0400, Aristeu Rozanski wrote:
> Hi Oscar,

Thanks Aristeu,

> 
> On Wed, Sep 16, 2020 at 09:27:02AM +0200, Oscar Salvador wrote:
> > Could you please re-run the tests with the below patch applied, and
> > attach the logs here?
> 
> here it is:
> (removed previous dump_page() calls for other pages that didn't fail)
> 
> [  109.709342] Soft offlining pfn 0x3fb526 at process virtual address 0x7ffc7a180000
> [  109.716969] page:00000000f367dde5 refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ffc7a18 pfn:0x3fb526
> [  109.716978] anon flags: 0x3ffff80008000e(referenced|uptodate|dirty|swapbacked)
> [  109.716988] raw: 003ffff80008000e 5deadbeef0000100 5deadbeef0000122 c000003fcd56d839
> [  109.716997] raw: 000000007ffc7a18 0000000000000000 00000001ffffffff c000003fd42f5000
> [  109.717005] page dumped because: page_handle_poison
> [  109.717011] page->mem_cgroup:c000003fd42f5000
> [  109.725882] page_handle_poison: hugepage_or_freepage failed
> [  109.725885] __soft_offline_page: page_handle_poison -EBUSY
> [  109.725898] page:00000000f367dde5 refcount:3 mapcount:0 mapping:00000000b43d73e6 index:0x58 pfn:0x3fb526
> [  109.725951] aops:xfs_address_space_operations [xfs] ino:49f9c5f dentry name:"messages"
> [  109.725958] flags: 0x3ffff800002008(dirty|private)
> [  109.725963] raw: 003ffff800002008 5deadbeef0000100 5deadbeef0000122 c000003fb8b7eea8
> [  109.725969] raw: 0000000000000058 c000003fdd94eb20 00000003ffffffff c000003fd3c42000
> [  109.725975] page dumped because: __soft_offline_page after migrate
> [  109.725980] page->mem_cgroup:c000003fd3c42000

Can you try the other patch I posted in response to Naoya?

thanks
Aristeu Rozanski Sept. 16, 2020, 2:46 p.m. UTC | #7
Hi Oscar,

On Wed, Sep 16, 2020 at 04:09:30PM +0200, Oscar Salvador wrote:
> On Wed, Sep 16, 2020 at 09:53:58AM -0400, Aristeu Rozanski wrote:
> Can you try the other patch I posted in response to Naoya?

Same thing:

[  369.195056] Soft offlining pfn 0x3fb5bf at process virtual address 0x7ffc84350000
[  369.195073] page:000000002bb131e4 refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ffc8435 pfn:0x3fb5bf
[  369.195080] anon flags: 0x3ffff80008000e(referenced|uptodate|dirty|swapbacked)
[  369.202131] raw: 003ffff80008000e 5deadbeef0000100 5deadbeef0000122 c000003fda1c7431
[  369.202137] raw: 000000007ffc8435 0000000000000000 00000001ffffffff c000003fd63af000
[  369.202141] page dumped because: page_handle_poison
[  369.202145] page->mem_cgroup:c000003fd63af000
[  369.215055] page_handle_poison: hugepage_or_freepage failed�n
[  369.215057] __soft_offline_page: page_handle_poison -EBUSY
[  369.215068] page:000000002bb131e4 refcount:3 mapcount:0 mapping:00000000f6ca3f32 index:0x5c pfn:0x3fb5bf
[  369.215110] aops:xfs_address_space_operations [xfs] ino:49f9c5f dentry name:"messages"
[  369.215117] flags: 0x3ffff800002008(dirty|private)
[  369.215121] raw: 003ffff800002008 5deadbeef0000100 5deadbeef0000122 c000003fadd3daa8
[  369.215127] raw: 000000000000005c c000003fd9497c20 00000003ffffffff c000003fd1143000
[  369.215132] page dumped because: __soft_offline_page after migrate
[  369.215136] page->mem_cgroup:c000003fd1143000
Oscar Salvador Sept. 16, 2020, 4:30 p.m. UTC | #8
On 2020-09-16 16:46, Aristeu Rozanski wrote:
> Hi Oscar,
> 
> On Wed, Sep 16, 2020 at 04:09:30PM +0200, Oscar Salvador wrote:
>> On Wed, Sep 16, 2020 at 09:53:58AM -0400, Aristeu Rozanski wrote:
>> Can you try the other patch I posted in response to Naoya?
> 
> Same thing:
> 
> [  369.195056] Soft offlining pfn 0x3fb5bf at process virtual address
> 0x7ffc84350000
> [  369.195073] page:000000002bb131e4 refcount:1 mapcount:0
> mapping:0000000000000000 index:0x7ffc8435 pfn:0x3fb5bf
> [  369.195080] anon flags:
> 0x3ffff80008000e(referenced|uptodate|dirty|swapbacked)
> [  369.202131] raw: 003ffff80008000e 5deadbeef0000100 5deadbeef0000122
> c000003fda1c7431
> [  369.202137] raw: 000000007ffc8435 0000000000000000 00000001ffffffff
> c000003fd63af000
> [  369.202141] page dumped because: page_handle_poison
> [  369.202145] page->mem_cgroup:c000003fd63af000
> [  369.215055] page_handle_poison: hugepage_or_freepage failed�n
> [  369.215057] __soft_offline_page: page_handle_poison -EBUSY
> [  369.215068] page:000000002bb131e4 refcount:3 mapcount:0
> mapping:00000000f6ca3f32 index:0x5c pfn:0x3fb5bf
> [  369.215110] aops:xfs_address_space_operations [xfs] ino:49f9c5f
> dentry name:"messages"
> [  369.215117] flags: 0x3ffff800002008(dirty|private)
> [  369.215121] raw: 003ffff800002008 5deadbeef0000100 5deadbeef0000122
> c000003fadd3daa8
> [  369.215127] raw: 000000000000005c c000003fd9497c20 00000003ffffffff
> c000003fd1143000
> [  369.215132] page dumped because: __soft_offline_page after migrate
> [  369.215136] page->mem_cgroup:c000003fd1143000


Ok, this is something different.
The race you saw previously is kinda normal as there is a race window 
between spotting a freepage and taking it off the buddy freelists.
The retry patch should help there.

The issue you are seeing right here is due to the call to
page_handle_poison() in __soft_offline_page() being wrong: we pass
hugepage_or_freepage = true unconditionally.

Should be:
Oscar Salvador Sept. 16, 2020, 4:34 p.m. UTC | #9
On 2020-09-16 18:30, osalvador@suse.de wrote:
> On 2020-09-16 16:46, Aristeu Rozanski wrote:
>> Hi Oscar,
>> 
>> On Wed, Sep 16, 2020 at 04:09:30PM +0200, Oscar Salvador wrote:
>>> On Wed, Sep 16, 2020 at 09:53:58AM -0400, Aristeu Rozanski wrote:
>>> Can you try the other patch I posted in response to Naoya?
>> 
>> Same thing:
>> 
>> [  369.195056] Soft offlining pfn 0x3fb5bf at process virtual address
>> 0x7ffc84350000
>> [  369.195073] page:000000002bb131e4 refcount:1 mapcount:0
>> mapping:0000000000000000 index:0x7ffc8435 pfn:0x3fb5bf
>> [  369.195080] anon flags:
>> 0x3ffff80008000e(referenced|uptodate|dirty|swapbacked)
>> [  369.202131] raw: 003ffff80008000e 5deadbeef0000100 5deadbeef0000122
>> c000003fda1c7431
>> [  369.202137] raw: 000000007ffc8435 0000000000000000 00000001ffffffff
>> c000003fd63af000
>> [  369.202141] page dumped because: page_handle_poison
>> [  369.202145] page->mem_cgroup:c000003fd63af000
>> [  369.215055] page_handle_poison: hugepage_or_freepage failed�n
>> [  369.215057] __soft_offline_page: page_handle_poison -EBUSY
>> [  369.215068] page:000000002bb131e4 refcount:3 mapcount:0
>> mapping:00000000f6ca3f32 index:0x5c pfn:0x3fb5bf
>> [  369.215110] aops:xfs_address_space_operations [xfs] ino:49f9c5f
>> dentry name:"messages"
>> [  369.215117] flags: 0x3ffff800002008(dirty|private)
>> [  369.215121] raw: 003ffff800002008 5deadbeef0000100 5deadbeef0000122
>> c000003fadd3daa8
>> [  369.215127] raw: 000000000000005c c000003fd9497c20 00000003ffffffff
>> c000003fd1143000
>> [  369.215132] page dumped because: __soft_offline_page after migrate
>> [  369.215136] page->mem_cgroup:c000003fd1143000
> 
> 
> Ok, this is something different.
> The race you saw previously is kinda normal as there is a race window
> between spotting a freepage and taking it off the buddy freelists.
> The retry patch should help there.
> 
> The issue you are seeing right here is due to the call to
> page_handle_poison() in __soft_offline_page() being wrong: we pass
> hugepage_or_freepage = true unconditionally.
> 
> Should be:

Fat fingers, sorry:

Ok, this is something different.
The race you saw previously is kinda normal as there is a race window 
between spotting a freepage and taking it off the buddy freelists.
The retry patch should help there.

The issue you are seeing right here is due to the call to
page_handle_poison() in __soft_offline_page() being wrong: we pass
hugepage_or_freepage = true unconditionally.
I think it was caused during rebasing.
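
To see why this matters, recall the logic of page_handle_poison() from the
debug diff earlier in the thread (paraphrased here; the trailing bookkeeping
is elided):

	/* Paraphrased from the debug diff earlier in the thread. */
	static bool page_handle_poison(struct page *page,
				       bool hugepage_or_freepage, bool release)
	{
		if (release) {
			put_page(page);
			drain_all_pages(page_zone(page));
		}
		if (hugepage_or_freepage) {
			/*
			 * This branch insists the page is a free hugetlb or
			 * buddy page. A just-migrated base page can already
			 * have been reallocated (the xfs "messages" page in
			 * the dumps above), so forcing this path makes it
			 * fail and -EBUSY bubbles up.
			 */
			if (dissolve_free_huge_page(page) ||
			    !take_page_off_buddy(page))
				return false;
		}
		SetPageHWPoison(page);
		/* ... */
		return true;
	}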

Should be:

@@ -1858,8 +1903,11 @@ static int __soft_offline_page(struct page *page)
                 if (!ret) {
                         bool release = !huge;

-                       if (!page_handle_poison(page, true, release))
+                       if (!page_handle_poison(page, huge, release)) {
+                               pr_info("%s: page_handle_poison -EBUSY\n", __func__);
+                               dump_page(page, "__soft_offline_page after migrate");
                                 ret = -EBUSY;
+                       }

Could you try that on top please?

I am away from my laptop now but I will be taking a look later today.

thanks
Aristeu Rozanski Sept. 16, 2020, 5:58 p.m. UTC | #10
On Wed, Sep 16, 2020 at 06:34:52PM +0200, osalvador@suse.de wrote:
> Fat fingers, sorry:
> 
> Ok, this is something different.
> The race you saw previously is kinda normal as there is a race window 
> between spotting a freepage and taking it off the buddy freelists.
> The retry patch should help there.
> 
> The issue you are seeing right here is due to the call to
> page_handle_poison() in __soft_offline_page() being wrong: we pass
> hugepage_or_freepage = true unconditionally.
> I think it was caused during rebasing.
> 
> Should be:

Tests passed with this patch on top of the others.
Thanks!
Oscar Salvador Sept. 16, 2020, 6:12 p.m. UTC | #11
On 2020-09-16 19:58, Aristeu Rozanski wrote:
> On Wed, Sep 16, 2020 at 06:34:52PM +0200, osalvador@suse.de wrote:
>> Fat fingers, sorry:
>> 
>> Ok, this is something different.
>> The race you saw previously is kinda normal as there is a race window
>> between spotting a freepage and taking it off the buddy freelists.
>> The retry patch should help there.
>> 
>> The issue you are seeing right here is due to the call to
>> page_handle_poison() in __soft_offline_page() being wrong: we pass
>> hugepage_or_freepage = true unconditionally.
>> I think it was caused during rebasing.
>> 
>> Should be:
> 
> Tests passed with this patch on top of others.
> Thanks!

Thanks for giving it a shot!
I will be sending patches tomorrow morning.

Thanks