Message ID | 20220228140245.24552-3-linmiaohe@huawei.com (mailing list archive)
---|---
State | New
Series | A few fixup patches for memory failure
On Mon, Feb 28, 2022 at 10:02:43PM +0800, Miaohe Lin wrote:
> The dirty swapcache page is still residing in the swap cache after it's
> hwpoisoned. So there is always one extra refcount for swap cache.

The diff seems fine at a glance, but let me ask a few questions to
understand the issue better.

- Is the behavior described above an effect of the recent change on shmem
  where a dirty pagecache page is pinned on hwpoison (commit a76054266661
  ("mm: shmem: don't truncate page if memory failure happens")), or do
  older kernels behave the same way?

- Is the behavior also true for normal anonymous pages (not shmem pages)?

I'm trying to test hwpoison hitting a dirty swapcache page, but in my
testing memory_failure() fails with the "hwpoison: unhandlable page"
warning at get_any_page(). So I'm still not sure that me_pagecache_dirty()
fixes any visible problem.

Thanks,
Naoya Horiguchi
On 2022/3/4 16:27, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Mon, Feb 28, 2022 at 10:02:43PM +0800, Miaohe Lin wrote:
>> The dirty swapcache page is still residing in the swap cache after it's
>> hwpoisoned. So there is always one extra refcount for swap cache.
>
> The diff seems fine at a glance, but let me ask a few questions to
> understand the issue better.
>
> - Is the behavior described above an effect of the recent change on shmem
>   where a dirty pagecache page is pinned on hwpoison (commit a76054266661
>   ("mm: shmem: don't truncate page if memory failure happens")), or do
>   older kernels behave the same way?
>
> - Is the behavior also true for normal anonymous pages (not shmem pages)?

The behavior described above is aimed at the swapcache, not the pagecache,
so it should be unrelated to the recent change on shmem.

What I try to fix here is that me_swapcache_dirty holds one extra pin via
the swap cache regardless of the return value of delete_from_lru_cache. We
should try to report a more accurate extra refcount for debugging purposes.

> I'm trying to test hwpoison hitting a dirty swapcache page, but in my
> testing memory_failure() fails with the "hwpoison: unhandlable page"

Maybe memory_failure() is racing with page reclaim where the page is
isolated?

> warning at get_any_page(). So I'm still not sure that me_pagecache_dirty()
> fixes any visible problem.

IIUC, me_pagecache_dirty can't do much unless the corresponding
address_space supports error_remove_page, which can truncate the dirty
pagecache page. But this may cause silent data loss. It's better to keep
the page in the pagecache until the file is truncated, hole punched or
removed, as commit a76054266661 pointed out.

Thanks.
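For context, the truncation path referred to above depends on the
address_space providing error_remove_page. The simplified sketch below is
illustrative only; the upstream truncate_error_page() has additional
private-page handling and diagnostics, so treat the exact return
conventions here as an assumption:

static int truncate_error_page(struct page *p, unsigned long pfn,
			       struct address_space *mapping)
{
	int ret = MF_FAILED;

	if (mapping->a_ops->error_remove_page) {
		/* Truncate the hwpoisoned page out of the file. */
		if (!mapping->a_ops->error_remove_page(mapping, p))
			ret = MF_RECOVERED;
	} else if (invalidate_inode_page(p)) {
		/* Without error_remove_page, only a clean, unmapped page
		 * can simply be invalidated. */
		ret = MF_RECOVERED;
	}

	return ret;
}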
On Mon, Mar 7, 2022 at 3:26 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
> On 2022/3/4 16:27, HORIGUCHI NAOYA(堀口 直也) wrote:
>> On Mon, Feb 28, 2022 at 10:02:43PM +0800, Miaohe Lin wrote:
>>> The dirty swapcache page is still residing in the swap cache after it's
>>> hwpoisoned. So there is always one extra refcount for swap cache.
>
> The behavior described above is aimed at the swapcache, not the pagecache,
> so it should be unrelated to the recent change on shmem.
>
> What I try to fix here is that me_swapcache_dirty holds one extra pin via
> the swap cache regardless of the return value of delete_from_lru_cache. We
> should try to report a more accurate extra refcount for debugging purposes.

I think you misunderstood the code. delete_from_lru_cache() returning 0
means the page was on the LRU and has now been successfully isolated from
it. Returning -EIO means the page was not on the LRU, so it should have at
least one extra pin on it.

So MF_DELAYED means there is no pin other than hwpoison and the swap cache,
which is expected; MF_FAILED means there might be extra pins.

has_extra_refcount() raises an error when there is an *unexpected*
refcount.
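For context, here is a simplified sketch of how the two helpers discussed
above behave, based on the semantics described in this thread; it is not a
verbatim copy of mm/memory-failure.c (memcg uncharging and the exact error
message formatting are elided or approximate):

static int delete_from_lru_cache(struct page *p)
{
	if (!isolate_lru_page(p)) {
		/* The page was on the LRU and is isolated now. */
		ClearPageActive(p);
		ClearPageUnevictable(p);
		/* Drop the extra reference taken by isolate_lru_page(). */
		put_page(p);
		return 0;
	}
	/* Not on the LRU: someone else likely holds a pin on the page. */
	return -EIO;
}

static bool has_extra_refcount(struct page_state *ps, struct page *p,
			       bool extra_pins)
{
	int count = page_count(p) - 1;	/* ignore the hwpoison reference */

	if (extra_pins)
		count -= 1;	/* one more pin is expected, e.g. swap cache */

	if (count > 0) {
		pr_err("%#lx: %s still referenced by %d users\n",
		       page_to_pfn(p), action_page_types[ps->type], count);
		return true;
	}
	return false;
}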
On 2022/3/8 4:14, Yang Shi wrote:
> On Mon, Mar 7, 2022 at 3:26 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
>> What I try to fix here is that me_swapcache_dirty holds one extra pin via
>> the swap cache regardless of the return value of delete_from_lru_cache. We
>> should try to report a more accurate extra refcount for debugging purposes.
>
> I think you misunderstood the code. delete_from_lru_cache() returning 0
> means the page was on the LRU and has now been successfully isolated from
> it. Returning -EIO means the page was not on the LRU, so it should have at
> least one extra pin on it.
>
> So MF_DELAYED means there is no pin other than hwpoison and the swap cache,
> which is expected; MF_FAILED means there might be extra pins.
>
> has_extra_refcount() raises an error when there is an *unexpected*
> refcount.

Many thanks for your explanation. It seems you're right. If the page is
held on the lru_pvecs when we try to do delete_from_lru_cache, and it is
drained to the LRU list after that (so its refcount might be 2 now), then
we might get the following complaint if extra_pins is always true:
"Memory failure: ... still referenced by 0 users\n"

But it seems the original code cannot report the correct reason either,
because if we retry, the page can be deleted from the LRU cache and we can
succeed now.

Anyway, many thanks for pointing this out.
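To make the accounting in this scenario concrete, an illustrative (not
authoritative) breakdown of the references involved:

/*
 * Illustrative accounting for the race described above (a dirty
 * swapcache page that happens to sit on a per-CPU LRU pagevec):
 *
 *   swap cache reference                    +1
 *   reference taken by memory_failure()     +1
 *   lru_pvecs (pagevec) reference           +1   -> page_count() == 3
 *
 * delete_from_lru_cache() fails with -EIO because the page sits on the
 * pagevec rather than on the LRU list, so ret becomes MF_FAILED even
 * though no third party really pins the page.  Once the pagevec is
 * drained the count drops back to 2; what has_extra_refcount() reports
 * depends on when that drain happens relative to the check.
 */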
On Tue, Mar 8, 2022 at 5:11 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
> Many thanks for your explanation. It seems you're right. If the page is
> held on the lru_pvecs when we try to do delete_from_lru_cache, and it is
> drained to the LRU list after that (so its refcount might be 2 now), then
> we might get the following complaint if extra_pins is always true:
> "Memory failure: ... still referenced by 0 users\n"
>
> But it seems the original code cannot report the correct reason either,
> because if we retry, the page can be deleted from the LRU cache and we can
> succeed now.

Retrying is OK, but it seems like overkill to me, IMHO.
On 2022/3/9 2:51, Yang Shi wrote:
> On Tue, Mar 8, 2022 at 5:11 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
>> But it seems the original code cannot report the correct reason either,
>> because if we retry, the page can be deleted from the LRU cache and we can
>> succeed now.
>
> Retrying is OK, but it seems like overkill to me, IMHO.

Anyway, it seems I misunderstood the code, so I will drop this patch.
Thanks for the comments.
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 0d7c58340a98..5f9503573263 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -984,7 +984,6 @@ static int me_pagecache_dirty(struct page_state *ps, struct page *p)
 static int me_swapcache_dirty(struct page_state *ps, struct page *p)
 {
 	int ret;
-	bool extra_pins = false;
 
 	ClearPageDirty(p);
 	/* Trigger EIO in shmem: */
@@ -993,10 +992,7 @@ static int me_swapcache_dirty(struct page_state *ps, struct page *p)
 	ret = delete_from_lru_cache(p) ? MF_FAILED : MF_DELAYED;
 	unlock_page(p);
 
-	if (ret == MF_DELAYED)
-		extra_pins = true;
-
-	if (has_extra_refcount(ps, p, extra_pins))
+	if (has_extra_refcount(ps, p, true))
 		ret = MF_FAILED;
 
 	return ret;
The dirty swapcache page is still residing in the swap cache after it's
hwpoisoned. So there is always one extra refcount for swap cache.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/memory-failure.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
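For reference, me_swapcache_dirty() as it would read with this patch
applied; this is a sketch assembled from the hunks above, and the
ClearPageUptodate() line is surrounding upstream context that is not
visible in the diff:

static int me_swapcache_dirty(struct page_state *ps, struct page *p)
{
	int ret;

	ClearPageDirty(p);
	/* Trigger EIO in shmem: */
	ClearPageUptodate(p);

	ret = delete_from_lru_cache(p) ? MF_FAILED : MF_DELAYED;
	unlock_page(p);

	/* The swap cache is always expected to hold one extra reference. */
	if (has_extra_refcount(ps, p, true))
		ret = MF_FAILED;

	return ret;
}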