From patchwork Fri Nov 5 20:41:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605611 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DBC4C433EF for ; Fri, 5 Nov 2021 20:41:11 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 57673611C0 for ; Fri, 5 Nov 2021 20:41:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 57673611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id EE239940078; Fri, 5 Nov 2021 16:41:10 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E925E940049; Fri, 5 Nov 2021 16:41:10 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D8066940078; Fri, 5 Nov 2021 16:41:10 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0249.hostedemail.com [216.40.44.249]) by kanga.kvack.org (Postfix) with ESMTP id C825B940049 for ; Fri, 5 Nov 2021 16:41:10 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 9048B82499A8 for ; Fri, 5 Nov 2021 20:41:10 +0000 (UTC) X-FDA: 78776046300.21.7DE6E72 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf25.hostedemail.com (Postfix) with ESMTP id 8ADD8B00018B for ; Fri, 5 Nov 2021 20:40:59 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 3531361279; Fri, 5 Nov 2021 20:41:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144868; bh=UsuK52LmUMrPFArYKzCNECASXKGEtg5HkHkL69bbWlg=; h=Date:From:To:Subject:In-Reply-To:From; b=D3ve0zjibpfQNun5KN88gC4YeqGtcoTVzo0vHGK0vON1DDzBbeY4QoK+y45JeZLH5 KA6KwNhv15abxYxq0oDKSbphHKpDDBMBZ50AwaKeyx2MSe65739UMEttLGlwmT4iFX nWLP77rLO/ZaSOdiAKxqFgsFiwOIQuZhgWINdEWw= Date: Fri, 05 Nov 2021 13:41:07 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, osalvador@suse.de, peterx@redhat.com, shy828301@gmail.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 124/262] mm: hwpoison: refactor refcount check handling Message-ID: <20211105204107.FgtW39TxN%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 8ADD8B00018B X-Stat-Signature: 77e7xy7ye91wwtgjs9gwjutgurseq5tc Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=D3ve0zji; spf=pass (imf25.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144859-363969 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yang Shi Subject: mm: hwpoison: refactor refcount check handling Memory failure will report failure if the page still has extra pinned refcount other than from hwpoison after the handler is done. Actually the check is not necessary for all handlers, so move the check into specific handlers. This would make the following keeping shmem page in page cache patch easier. There may be expected extra pin for some cases, for example, when the page is dirty and in swapcache. Link: https://lkml.kernel.org/r/20211020210755.23964-5-shy828301@gmail.com Signed-off-by: Yang Shi Signed-off-by: Naoya Horiguchi Suggested-by: Naoya Horiguchi Cc: Hugh Dickins Cc: Kirill A. Shutemov Cc: Matthew Wilcox Cc: Oscar Salvador Cc: Peter Xu Signed-off-by: Andrew Morton --- mm/memory-failure.c | 93 ++++++++++++++++++++++++++++-------------- 1 file changed, 64 insertions(+), 29 deletions(-) --- a/mm/memory-failure.c~mm-hwpoison-refactor-refcount-check-handling +++ a/mm/memory-failure.c @@ -807,12 +807,44 @@ static int truncate_error_page(struct pa return ret; } +struct page_state { + unsigned long mask; + unsigned long res; + enum mf_action_page_type type; + + /* Callback ->action() has to unlock the relevant page inside it. */ + int (*action)(struct page_state *ps, struct page *p); +}; + +/* + * Return true if page is still referenced by others, otherwise return + * false. + * + * The extra_pins is true when one extra refcount is expected. + */ +static bool has_extra_refcount(struct page_state *ps, struct page *p, + bool extra_pins) +{ + int count = page_count(p) - 1; + + if (extra_pins) + count -= 1; + + if (count > 0) { + pr_err("Memory failure: %#lx: %s still referenced by %d users\n", + page_to_pfn(p), action_page_types[ps->type], count); + return true; + } + + return false; +} + /* * Error hit kernel page. * Do nothing, try to be lucky and not touch this instead. For a few cases we * could be more sophisticated. */ -static int me_kernel(struct page *p, unsigned long pfn) +static int me_kernel(struct page_state *ps, struct page *p) { unlock_page(p); return MF_IGNORED; @@ -821,9 +853,9 @@ static int me_kernel(struct page *p, uns /* * Page in unknown state. Do nothing. */ -static int me_unknown(struct page *p, unsigned long pfn) +static int me_unknown(struct page_state *ps, struct page *p) { - pr_err("Memory failure: %#lx: Unknown page state\n", pfn); + pr_err("Memory failure: %#lx: Unknown page state\n", page_to_pfn(p)); unlock_page(p); return MF_FAILED; } @@ -831,7 +863,7 @@ static int me_unknown(struct page *p, un /* * Clean (or cleaned) page cache page. */ -static int me_pagecache_clean(struct page *p, unsigned long pfn) +static int me_pagecache_clean(struct page_state *ps, struct page *p) { int ret; struct address_space *mapping; @@ -868,9 +900,13 @@ static int me_pagecache_clean(struct pag * * Open: to take i_rwsem or not for this? Right now we don't. */ - ret = truncate_error_page(p, pfn, mapping); + ret = truncate_error_page(p, page_to_pfn(p), mapping); out: unlock_page(p); + + if (has_extra_refcount(ps, p, false)) + ret = MF_FAILED; + return ret; } @@ -879,7 +915,7 @@ out: * Issues: when the error hit a hole page the error is not properly * propagated. */ -static int me_pagecache_dirty(struct page *p, unsigned long pfn) +static int me_pagecache_dirty(struct page_state *ps, struct page *p) { struct address_space *mapping = page_mapping(p); @@ -923,7 +959,7 @@ static int me_pagecache_dirty(struct pag mapping_set_error(mapping, -EIO); } - return me_pagecache_clean(p, pfn); + return me_pagecache_clean(ps, p); } /* @@ -945,9 +981,10 @@ static int me_pagecache_dirty(struct pag * Clean swap cache pages can be directly isolated. A later page fault will * bring in the known good data from disk. */ -static int me_swapcache_dirty(struct page *p, unsigned long pfn) +static int me_swapcache_dirty(struct page_state *ps, struct page *p) { int ret; + bool extra_pins = false; ClearPageDirty(p); /* Trigger EIO in shmem: */ @@ -955,10 +992,17 @@ static int me_swapcache_dirty(struct pag ret = delete_from_lru_cache(p) ? MF_FAILED : MF_DELAYED; unlock_page(p); + + if (ret == MF_DELAYED) + extra_pins = true; + + if (has_extra_refcount(ps, p, extra_pins)) + ret = MF_FAILED; + return ret; } -static int me_swapcache_clean(struct page *p, unsigned long pfn) +static int me_swapcache_clean(struct page_state *ps, struct page *p) { int ret; @@ -966,6 +1010,10 @@ static int me_swapcache_clean(struct pag ret = delete_from_lru_cache(p) ? MF_FAILED : MF_RECOVERED; unlock_page(p); + + if (has_extra_refcount(ps, p, false)) + ret = MF_FAILED; + return ret; } @@ -975,7 +1023,7 @@ static int me_swapcache_clean(struct pag * - Error on hugepage is contained in hugepage unit (not in raw page unit.) * To narrow down kill region to one page, we need to break up pmd. */ -static int me_huge_page(struct page *p, unsigned long pfn) +static int me_huge_page(struct page_state *ps, struct page *p) { int res; struct page *hpage = compound_head(p); @@ -986,7 +1034,7 @@ static int me_huge_page(struct page *p, mapping = page_mapping(hpage); if (mapping) { - res = truncate_error_page(hpage, pfn, mapping); + res = truncate_error_page(hpage, page_to_pfn(p), mapping); unlock_page(hpage); } else { res = MF_FAILED; @@ -1004,6 +1052,9 @@ static int me_huge_page(struct page *p, } } + if (has_extra_refcount(ps, p, false)) + res = MF_FAILED; + return res; } @@ -1029,14 +1080,7 @@ static int me_huge_page(struct page *p, #define slab (1UL << PG_slab) #define reserved (1UL << PG_reserved) -static struct page_state { - unsigned long mask; - unsigned long res; - enum mf_action_page_type type; - - /* Callback ->action() has to unlock the relevant page inside it. */ - int (*action)(struct page *p, unsigned long pfn); -} error_states[] = { +static struct page_state error_states[] = { { reserved, reserved, MF_MSG_KERNEL, me_kernel }, /* * free pages are specially detected outside this table: @@ -1096,19 +1140,10 @@ static int page_action(struct page_state unsigned long pfn) { int result; - int count; /* page p should be unlocked after returning from ps->action(). */ - result = ps->action(p, pfn); + result = ps->action(ps, p); - count = page_count(p) - 1; - if (ps->action == me_swapcache_dirty && result == MF_DELAYED) - count--; - if (count > 0) { - pr_err("Memory failure: %#lx: %s still referenced by %d users\n", - pfn, action_page_types[ps->type], count); - result = MF_FAILED; - } action_result(pfn, ps->type, result); /* Could do more checks here if page looks ok */