mbox series

[0/7] HWPoison: Refactor get page interface

Message ID 20201119105716.5962-1-osalvador@suse.de (mailing list archive)
Headers show
Series HWPoison: Refactor get page interface | expand

Message

Oscar Salvador Nov. 19, 2020, 10:57 a.m. UTC
Hi,

following up on previous fix-ups an refactors, this patchset simplifies
the get page interface and removes the MF_COUNT_INCREASED trick we have
for soft offline.

Please, note that this patchset is on top of [1] and [2].

This patchset does three things:

 1) Drops MF_COUNT_INCREASED trick
 2) Refactors get page interface
 3) Places a common entry for grabbin a page from both hard offline
    and soft offline guarded by zone_pcp_{disable/enable}, so we do not
    have to drain pcplists by ourself and retry again.

Note that the MF_COUNT_INCREASED trick was left because if get_hwpoison_page
races with put_page (e.g:)

CPU0                         CPU1
put_page (refcount decremented to 0)
 __put_single_page
  free_unref_page
   free_unref_page_prepare
    free_pcp_prepare
     free_pages_prepare                           soft_offline_page
     :page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP     get_any_page
                            			    get_hwpoison_page
   free_unref_page_commit
    free_one_page
     __free_one_page (place it in buddy)

get_hwpoison_page sees that page has a refcount of 0, but since it was not placed
in buddy yet we cannot really handle it.
We now have a sort of maximum passes in get_any_page, so in case we race
with either an allocation or a put_page, we retry again.

After an off-list discussion with Naoya, he agreed to proceed.

[1] https://patchwork.kernel.org/project/linux-mm/list/?series=364009
[2] https://patchwork.kernel.org/project/linux-mm/list/?series=381903

Naoya Horiguchi (3):
  mm,madvise: call soft_offline_page() without MF_COUNT_INCREASED
  mm,hwpoison: remove MF_COUNT_INCREASED
  mm,hwpoison: remove flag argument from soft offline functions

Oscar Salvador (4):
  mm,hwpoison: Refactor get_any_page
  mm,hwpoison: Drop pfn parameter
  mm,hwpoison: Disable pcplists before grabbing a refcount
  mm,hwpoison: Remove drain_all_pages from shake_page

 drivers/base/memory.c |   2 +-
 include/linux/mm.h    |   9 +--
 mm/madvise.c          |  19 +++--
 mm/memory-failure.c   | 168 +++++++++++++++++-------------------------
 4 files changed, 85 insertions(+), 113 deletions(-)

Comments

Qian Cai Dec. 2, 2020, 1:34 p.m. UTC | #1
On Thu, 2020-11-19 at 11:57 +0100, Oscar Salvador wrote:
> Hi,
> 
> following up on previous fix-ups an refactors, this patchset simplifies
> the get page interface and removes the MF_COUNT_INCREASED trick we have
> for soft offline.

Well, the madvise() EIO is back. I don't understand why we can't test it on a
NUMA system before posting this over and over again.

# git clone https://e.coding.net/cailca/linux/mm
# cd mm; make
# ./ranbug 1 
- start: migrate_huge_offline
- use NUMA nodes 0,3.
- mmap and free 8388608 bytes hugepages on node 0
- mmap and free 8388608 bytes hugepages on node 3
madvise: Input/output error

[ 1270.054919][ T7497] Soft offlining pfn 0x1958e00 at process virtual address 0x7f7d9ca00000
[ 1270.067318][ T7497] Soft offlining pfn 0x18d0600 at process virtual address 0x7f7d9c800000
[ 1270.078856][ T7497] Soft offlining pfn 0x1ac800 at process virtual address 0x7f7d9ca00000
[ 1270.091268][ T7497] Soft offlining pfn 0x1e10a00 at process virtual address 0x7f7d9c800000
[ 1270.101946][ T7497] Soft offlining pfn 0x18c800 at process virtual address 0x7f7d9ca00000
[ 1270.111678][ T7497] soft offline: 0x18c800: hugepage isolation failed: 0, page count 2, type bfffc00001000e (referenced|uptodate|dirty|head)
[ 1270.126133][ T7497] Soft offlining pfn 0x18b5400 at process virtual address 0x7f7d9c800000
[ 1270.136581][ T7497] Soft offlining pfn 0x211c00 at process virtual address 0x7f7d9ca00000
[ 1270.146214][ T7497] soft offline: 0x211c00: hugepage isolation failed: 0, page count 2, type bfffc00001000e (referenced|uptodate|dirty|head)
[ 1270.160624][ T7497] Soft offlining pfn 0x19bee00 at process virtual address 0x7f7d9c800000
[ 1270.170896][ T7497] Soft offlining pfn 0x1e21a00 at process virtual address 0x7f7d9ca00000
[ 1270.185011][ T7497] Soft offlining pfn 0x1fd1200 at process virtual address 0x7f7d9c800000
[ 1270.195341][ T7497] Soft offlining pfn 0x1882400 at process virtual address 0x7f7d9ca00000
[ 1270.480593][ T7497] Soft offlining pfn 0x18bc000 at process virtual address 0x7f7d9c800000
[ 1270.491961][ T7497] soft offline: 0x18bc000: hugepage isolation failed: 0, page count 2, type 3bfffc00001000e (referenced|uptodate|dirty|head)
[ 1270.506018][ T7497] Soft offlining pfn 0x1e76a00 at process virtual address 0x7f7d9c800000
[ 1270.590266][ T7497] Soft offlining pfn 0x1b3c00 at process virtual address 0x7f7d9ca00000
[ 1270.600207][ T7497] soft offline: 0x1b3c00: hugepage isolation failed: 0, page count 2, type bfffc00001000e (referenced|uptodate|dirty|head)
[ 1270.614316][ T7497] Soft offlining pfn 0x1882600 at process virtual address 0x7f7d9c800000
[ 1270.662427][ T7497] Soft offlining pfn 0x1b3c00 at process virtual address 0x7f7d9ca00000
[ 1270.744249][ T7497] Soft offlining pfn 0x18bc000 at process virtual address 0x7f7d9c800000
[ 1270.754314][ T7497] Soft offlining pfn 0x18d1200 at process virtual address 0x7f7d9ca00000
[ 1270.765204][ T7497] soft offline: 0x18d1200: hugepage isolation failed: 0, page count 2, type 3bfffc00001000e (referenced|uptodate|dirty|head)
[ 1270.816653][ T7497] Soft offlining pfn 0x18d0400 at process virtual address 0x7f7d9c800000
[ 1270.827049][ T7497] Soft offlining pfn 0x18d1200 at process virtual address 0x7f7d9ca00000
[ 1270.837997][ T7497] soft offline: 0x18d1200: hugepage isolation failed: 0, page count 2, type 3bfffc00001000e (referenced|uptodate|dirty|head)
[ 1270.852156][ T7497] Soft offlining pfn 0x186ca00 at process virtual address 0x7f7d9c800000
[ 1270.862350][ T7497] Soft offlining pfn 0x18d1200 at process virtual address 0x7f7d9ca00000
[ 1270.872922][ T7497] soft offline: 0x18d1200: hugepage isolation failed: 0, page count 2, type 3bfffc00001000e (referenced|uptodate|dirty|head)
[ 1270.887133][ T7497] Soft offlining pfn 0x18ac200 at process virtual address 0x7f7d9c800000
[ 1270.897450][ T7497] Soft offlining pfn 0x211c00 at process virtual address 0x7f7d9ca00000
[ 1270.907416][ T7497] soft offline: 0x211c00: hugepage isolation failed: 0, page count 2, type bfffc00001000e (referenced|uptodate|dirty|head)
[ 1270.921365][ T7497] Soft offlining pfn 0x1e1cc00 at process virtual address 0x7f7d9c800000
[ 1270.931700][ T7497] Soft offlining pfn 0x18c800 at process virtual address 0x7f7d9ca00000
[ 1270.941580][ T7497] soft offline: 0x18c800: hugepage isolation failed: 0, page count 2, type bfffc00001000e (referenced|uptodate|dirty|head)
[ 1270.955649][ T7497] Soft offlining pfn 0x1e6ae00 at process virtual address 0x7f7d9c800000
[ 1270.966063][ T7497] Soft offlining pfn 0x211c00 at process virtual address 0x7f7d9ca00000
[ 1270.975965][ T7497] soft offline: 0x211c00: hugepage isolation failed: 0, page count 2, type bfffc00001000e (referenced|uptodate|dirty|head)
[ 1270.990059][ T7497] Soft offlining pfn 0x1e72e00 at process virtual address 0x7f7d9c800000
[ 1271.000323][ T7497] Soft offlining pfn 0x18d1200 at process virtual address 0x7f7d9ca00000
[ 1271.011006][ T7497] soft offline: 0x18d1200: hugepage isolation failed: 0, page count 2, type 3bfffc00001000e (referenced|uptodate|dirty|head)
[ 1271.025152][ T7497] Soft offlining pfn 0x1e22200 at process virtual address 0x7f7d9c800000
[ 1271.035395][ T7497] Soft offlining pfn 0x18d1200 at process virtual address 0x7f7d9ca00000
[ 1271.045916][ T7497] soft offline: 0x18d1200: hugepage isolation failed: 0, page count 2, type 3bfffc00001000e (referenced|uptodate|dirty|head)
[ 1271.060159][ T7497] Soft offlining pfn 0x1e6fe00 at process virtual address 0x7f7d9c800000
[ 1271.070695][ T7497] Soft offlining pfn 0x18c800 at process virtual address 0x7f7d9ca00000
[ 1271.080596][ T7497] soft offline: 0x18c800: hugepage isolation failed: 0, page count 2, type bfffc00001000e (referenced|uptodate|dirty|head)
[ 1271.094725][ T7497] Soft offlining pfn 0x1968200 at process virtual address 0x7f7d9c800000
[ 1271.105006][ T7497] Soft offlining pfn 0x18d1200 at process virtual address 0x7f7d9ca00000
[ 1271.115567][ T7497] soft offline: 0x18d1200: hugepage isolation failed: 0, page count 2, type 3bfffc00001000e (referenced|uptodate|dirty|head)
[ 1271.129775][ T7497] Soft offlining pfn 0x1e1ae00 at process virtual address 0x7f7d9c800000
[ 1271.140285][ T7497] Soft offlining pfn 0x18c800 at process virtual address 0x7f7d9ca00000
[ 1271.150185][ T7497] soft offline: 0x18c800: hugepage isolation failed: 0, page count 2, type bfffc[ 1271.468115][ T7497] Soft offlining pfn 0x1de4600 at process virtual address 0x7f7d9c800000
[ 1271.479348][ T7497] Soft offlining pfn 0x145e00 at process virtual address 0x7f7d9ca00000
[ 1271.489928][ T7497] soft offline: 0x145e00: hugepage isolation 1271.538433][ T7497] Soft offlining pfn 0x1fae00 at process virtual address 0x7f7d9c800000
[ 1271.548880][ T7497] Soft offlining pfn 0x1995e00 at process virtual address 0x7f7d9ca00000
[ 1271.558877][ T7497] soft offline: 0x1995e00: hugepage isolation failed: 0, page count 2, type 3bfffc00001000e (referenced|uptodate|dirty|head)
[ 1271.573055][ T7497] Soft offlining pfn 0x221e00 at process virtual address 0x7f7d9c800000
[ 1271.583453][ T7497] Soft offlining pfn 0x1901800 at process virtual address 0x7f7d9ca00000
[ 1271.593440][ T7497] soft offline: 0x1901800: hugepage isolation failed: 0, page count 2, type 3bfffc00001000e (referenced|uptodate|dirty|head)
[ 1271.610005][ T7497] Soft offlining pfn 0x232400 at process virtual address 0x7f7d9c800000
[ 1271.620439][ T7497] Soft offlinin[ 1272.005890][ T7497] Soft offlining pfn 0x230e00 at process virtual address 0x7f7d9c800000
[ 1272.017226][ T7497] Soft offlining pfn 0x185fe00 at process virtual address 0x7f7d9ca00000
[ 1272.029194][ T7497] Soft offlining pfn 0x1f1400 at process virtual address 0x7f7d9c800000
[ 1272.040088][ T7497] Soft offlining pfn 0x1f9e00 at process virtual address 0x7f7d9ca00000
[ 1272.052415][ T7497] Soft offlining pfn 0x1885a00 at process virtual address 0x7f7d9c800000
[ 1272.062510][ T7497] Soft offlining pfn 0x18b6000 at process virtual address 0x7f7d9ca00000
[ 1272.071931][ T7497] soft_offline_page: 0x18b6000: unknown page type: 3bfffc000000000 ((%pG?))

> 
> Please, note that this patchset is on top of [1] and [2].
> 
> This patchset does three things:
> 
>  1) Drops MF_COUNT_INCREASED trick
>  2) Refactors get page interface
>  3) Places a common entry for grabbin a page from both hard offline
>     and soft offline guarded by zone_pcp_{disable/enable}, so we do not
>     have to drain pcplists by ourself and retry again.
> 
> Note that the MF_COUNT_INCREASED trick was left because if get_hwpoison_page
> races with put_page (e.g:)
> 
> CPU0                         CPU1
> put_page (refcount decremented to 0)
>  __put_single_page
>   free_unref_page
>    free_unref_page_prepare
>     free_pcp_prepare
>      free_pages_prepare                           soft_offline_page
>      :page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP     get_any_page
>                             			    get_hwpoison_page
>    free_unref_page_commit
>     free_one_page
>      __free_one_page (place it in buddy)
> 
> get_hwpoison_page sees that page has a refcount of 0, but since it was not
> placed
> in buddy yet we cannot really handle it.
> We now have a sort of maximum passes in get_any_page, so in case we race
> with either an allocation or a put_page, we retry again.
> 
> After an off-list discussion with Naoya, he agreed to proceed.
> 
> [1] https://patchwork.kernel.org/project/linux-mm/list/?series=364009
> [2] https://patchwork.kernel.org/project/linux-mm/list/?series=381903
> 
> Naoya Horiguchi (3):
>   mm,madvise: call soft_offline_page() without MF_COUNT_INCREASED
>   mm,hwpoison: remove MF_COUNT_INCREASED
>   mm,hwpoison: remove flag argument from soft offline functions
> 
> Oscar Salvador (4):
>   mm,hwpoison: Refactor get_any_page
>   mm,hwpoison: Drop pfn parameter
>   mm,hwpoison: Disable pcplists before grabbing a refcount
>   mm,hwpoison: Remove drain_all_pages from shake_page
> 
>  drivers/base/memory.c |   2 +-
>  include/linux/mm.h    |   9 +--
>  mm/madvise.c          |  19 +++--
>  mm/memory-failure.c   | 168 +++++++++++++++++-------------------------
>  4 files changed, 85 insertions(+), 113 deletions(-)
>
Oscar Salvador Dec. 2, 2020, 1:41 p.m. UTC | #2
On Wed, Dec 02, 2020 at 08:34:57AM -0500, Qian Cai wrote:
> On Thu, 2020-11-19 at 11:57 +0100, Oscar Salvador wrote:
> > Hi,
> > 
> > following up on previous fix-ups an refactors, this patchset simplifies
> > the get page interface and removes the MF_COUNT_INCREASED trick we have
> > for soft offline.
> 
> Well, the madvise() EIO is back. I don't understand why we can't test it on a
> NUMA system before posting this over and over again.
> 
> # git clone https://e.coding.net/cailca/linux/mm
> # cd mm; make
> # ./ranbug 1 
> - start: migrate_huge_offline
> - use NUMA nodes 0,3.
> - mmap and free 8388608 bytes hugepages on node 0
> - mmap and free 8388608 bytes hugepages on node 3
> madvise: Input/output error

I tried it out myself enlarging the window race artificially but I was not able
to get -EIO anymore.
But as Vlastimil pointed out in the respective patch, it is better to keep the
page pinned for madvise.

I am planning to re-post leaving out the patches that remove the pinning.

Anyway, thanks for the report.