From patchwork Sun Feb 13 02:37:40 2022
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 12744481
Date: Sat, 12 Feb 2022 21:37:40 -0500
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: kernel-team@fb.com, linux-mm@kvack.org, Miaohe Lin, Andrew Morton,
 Mel Gorman, Johannes Weiner, Matthew Wilcox
Subject: [PATCH v2] mm: clean up hwpoison page cache page in fault path
Message-ID: <20220212213740.423efcea@imladris.surriel.com>

Sometimes the page offlining code can leave behind a hwpoisoned clean
page cache page. This can lead to programs being killed over and over
and over again as they fault in the hwpoisoned page, get killed, and
then get re-spawned by whatever wanted to run them.

This is particularly embarrassing when the page was offlined due to
having too many corrected memory errors. Now we are killing tasks
due to them trying to access memory that probably isn't even corrupted.

This problem can be avoided by invalidating the page from the page
fault handler, which already has a branch for dealing with these
kinds of pages. With this patch we simply pretend the page fault
was successful if the page was invalidated, return to userspace,
incur another page fault, read in the file from disk (to a new
memory page), and then everything works again.

Signed-off-by: Rik van Riel
Reviewed-by: Miaohe Lin
Acked-by: Naoya Horiguchi
Reviewed-by: Oscar Salvador
---
v2: fix compiler warning found by kernel test robot

 mm/memory.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index c125c4969913..55270ea2a7c7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3871,11 +3871,16 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 		return ret;
 
 	if (unlikely(PageHWPoison(vmf->page))) {
-		if (ret & VM_FAULT_LOCKED)
+		vm_fault_t poisonret = VM_FAULT_HWPOISON;
+		if (ret & VM_FAULT_LOCKED) {
+			/* Retry if a clean page was removed from the cache. */
+			if (invalidate_inode_page(vmf->page))
+				poisonret = 0;
 			unlock_page(vmf->page);
+		}
 		put_page(vmf->page);
 		vmf->page = NULL;
-		return VM_FAULT_HWPOISON;
+		return poisonret;
 	}
 
 	if (unlikely(!(ret & VM_FAULT_LOCKED)))
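
For readers who want to trace the retry behaviour outside the kernel,
here is a rough userspace model of the control flow described in the
commit message. It is only a sketch, not kernel code: every model_*
name is hypothetical, the fault codes are arbitrary stand-ins, and it
assumes (as a simplification) that invalidation succeeds exactly when
the page is clean.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's fault return codes; values are arbitrary. */
enum model_fault { MODEL_FAULT_OK = 0, MODEL_FAULT_HWPOISON = 1 };

/* Toy page cache slot. This is not the kernel's struct page. */
struct model_page {
	bool present;	/* page is in the toy page cache */
	bool hwpoison;	/* page was marked hwpoisoned */
	bool dirty;	/* page has unwritten data */
};

/* Assumption: only a clean page can be dropped from the cache. */
static bool model_invalidate(struct model_page *page)
{
	if (page->dirty)
		return false;
	page->present = false;
	page->hwpoison = false;
	return true;
}

/* One fault attempt, mirroring the patched branch in __do_fault(). */
static enum model_fault model_do_fault(struct model_page *page)
{
	if (!page->present) {
		/* Models reading the file from disk into a new page. */
		page->present = true;
		page->hwpoison = false;
		page->dirty = false;
		return MODEL_FAULT_OK;
	}

	if (page->hwpoison) {
		enum model_fault poisonret = MODEL_FAULT_HWPOISON;

		/* Pretend the fault succeeded if the page was removed. */
		if (model_invalidate(page))
			poisonret = MODEL_FAULT_OK;
		return poisonret;
	}

	return MODEL_FAULT_OK;
}

/*
 * Models a userspace access: returns the number of faults taken before
 * the access succeeds, or -1 if the task would be killed.
 */
static int model_access(struct model_page *page)
{
	for (int faults = 1; faults <= 2; faults++) {
		if (model_do_fault(page) == MODEL_FAULT_HWPOISON)
			return -1;
		if (page->present && !page->hwpoison)
			return faults;
	}
	return -1;
}
```

In this model a clean hwpoisoned page costs the task one extra fault
instead of a kill, while a dirty hwpoisoned page still reports
MODEL_FAULT_HWPOISON, since its contents cannot be re-read from disk.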