From patchwork Tue Sep 14 18:37:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yang Shi X-Patchwork-Id: 12494421 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5465AC433F5 for ; Tue, 14 Sep 2021 18:37:35 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 073A8610E6 for ; Tue, 14 Sep 2021 18:37:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 073A8610E6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id A563F6B0072; Tue, 14 Sep 2021 14:37:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9DCB0900002; Tue, 14 Sep 2021 14:37:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 830886B0074; Tue, 14 Sep 2021 14:37:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0180.hostedemail.com [216.40.44.180]) by kanga.kvack.org (Postfix) with ESMTP id 6EA826B0072 for ; Tue, 14 Sep 2021 14:37:34 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 18BFC181D12FA for ; Tue, 14 Sep 2021 18:37:34 +0000 (UTC) X-FDA: 
78587037228.22.89EFBCD Received: from mail-pg1-f179.google.com (mail-pg1-f179.google.com [209.85.215.179]) by imf06.hostedemail.com (Postfix) with ESMTP id C6745801A8A7 for ; Tue, 14 Sep 2021 18:37:33 +0000 (UTC) Received: by mail-pg1-f179.google.com with SMTP id u18so99020pgf.0 for ; Tue, 14 Sep 2021 11:37:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=czAkFfe1mpXdw9DjxU44hTrmwIzU1RlZ33B1Yq/O+QA=; b=SpGtB4dQ2h7N0KeImNRd34t3WctAxXNt9m+PStBgb6Df0wowqg9AwiCdNjRFyDHjz6 4QCfJr4tjmMSfAy/ibh8kMct1O/0//1j4HrLK3FLl3D8jeUyIepPvCl1sKYS7KqRWG+L 3ZNxwpxvmv2Ry9WCJIx68yHGpg+6W28EzF9ZXPcgdGLIkF6ZGz5tCpL0N9JohCBSV/aC 3m8qkYXhwMt2oojpPg+yy+ArHhHL2IGinpyAjJjRxILpQdcMgCe7MX5UtUBtIwmJZ9Qz KEPP7HKpVqVUhsazy+Be6g+Ri8N0pAGajBMPW5Y/8kA6Jwd/UAb5odcRpoTk69sBD+v3 Uphg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=czAkFfe1mpXdw9DjxU44hTrmwIzU1RlZ33B1Yq/O+QA=; b=Cj3TiPuIr2yizJgwhfF8jaohCLTXmT7WN7ZR/0VsxDwxZH+0AApYojXkQMMNtEDinG ZkKKqKGdC4pbdPedpsPol1sSYaxnkhTIHc15PoztXKK9DY8azL0AE4agd4nwrJ7NqXA5 sQ/xxdhlISvJ1fy5V5kp2cxQDOumAGkMWYfVWb3yzrQiI9WKsJv7+XaOJ9EMQH6GCmCR vxbxASIAXMUkH+PzKNk87DkEjR3TCCsaqYxmz4JdkD+6U8E2SHpa4FgzsqG4m5w1pD+I 8mHfqSDzZ0L4VPl0j9wi2YsXOL06kJCuiWuN3o4ba0Tv0Gpwd9/JERO9iNjeVOieIjBx keMw== X-Gm-Message-State: AOAM531AmiYPFzXvTvG/gJL1EnOlJq6xm+I6u6gIStjpgdEl5qVTeI8y zBekjT466HeiFLGEkhNOjwg= X-Google-Smtp-Source: ABdhPJx0nfeSQkT4y1cLquZT7sl2PsodSVN3cA2O8jZYC3O6mntx+Hvq5N95ug0Oe5Z3QSgKGKLllg== X-Received: by 2002:a62:7dd3:0:b0:438:a22:a49c with SMTP id y202-20020a627dd3000000b004380a22a49cmr6119204pfc.44.1631644652890; Tue, 14 Sep 2021 11:37:32 -0700 (PDT) Received: from localhost.localdomain (c-73-93-239-127.hsd1.ca.comcast.net. 
[73.93.239.127]) by smtp.gmail.com with ESMTPSA id y3sm12003965pge.44.2021.09.14.11.37.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 14 Sep 2021 11:37:32 -0700 (PDT)
From: Yang Shi
To: naoya.horiguchi@nec.com, hughd@google.com, kirill.shutemov@linux.intel.com, willy@infradead.org, osalvador@suse.de, akpm@linux-foundation.org
Cc: shy828301@gmail.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/4] mm: filemap: check if any subpage is hwpoisoned for PMD page fault
Date: Tue, 14 Sep 2021 11:37:15 -0700
Message-Id: <20210914183718.4236-2-shy828301@gmail.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20210914183718.4236-1-shy828301@gmail.com>
References: <20210914183718.4236-1-shy828301@gmail.com>
MIME-Version: 1.0
X-Rspamd-Server: rspam04
X-Rspamd-Queue-Id: C6745801A8A7
X-Stat-Signature: h3w9zujko8xm8xickj3bkd961tpxprbn
Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=SpGtB4dQ; spf=pass (imf06.hostedemail.com: domain of shy828301@gmail.com designates 209.85.215.179 as permitted sender) smtp.mailfrom=shy828301@gmail.com; dmarc=pass (policy=none) header.from=gmail.com
X-HE-Tag: 1631644653-858208
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

When handling a shmem page fault, a THP with a corrupted subpage could be
PMD mapped if certain conditions are satisfied. But the kernel is supposed
to send SIGBUS when trying to map a hwpoisoned page.

There are two paths which may do a PMD map: fault around and regular
fault.

Before commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault()
codepaths") things were even worse in the fault around path. The THP
could be PMD mapped as long as the VMA fits, regardless of which subpage
is accessed and corrupted. After this commit, the THP could be PMD mapped
as long as the head page is not corrupted.

In the regular fault path the THP could be PMD mapped as long as the
corrupted page is not accessed and the VMA fits.

Fix the loophole by iterating over all subpages to check for a hwpoisoned
one when doing the PMD map; if any is found, just fall back to PTE map.
Such a THP can only be PTE mapped. Do the check in the icache flush loop
to avoid iterating over all subpages twice; the icache flush is actually
a no-op on most architectures.

Cc:
Signed-off-by: Yang Shi
---
 mm/filemap.c | 15 +++++++++------
 mm/memory.c  | 11 ++++++++++-
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index dae481293b5d..740b7afe159a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3195,12 +3195,14 @@ static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page)
 	}
 
 	if (pmd_none(*vmf->pmd) && PageTransHuge(page)) {
-	    vm_fault_t ret = do_set_pmd(vmf, page);
-	    if (!ret) {
-		/* The page is mapped successfully, reference consumed. */
-		unlock_page(page);
-		return true;
-	    }
+		vm_fault_t ret = do_set_pmd(vmf, page);
+		if (ret == VM_FAULT_FALLBACK)
+			goto out;
+		if (!ret) {
+			/* The page is mapped successfully, reference consumed. */
+			unlock_page(page);
+			return true;
+		}
 	}
 
 	if (pmd_none(*vmf->pmd)) {
@@ -3220,6 +3222,7 @@ static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page)
 		return true;
 	}
 
+out:
 	return false;
 }
 
diff --git a/mm/memory.c b/mm/memory.c
index 25fc46e87214..1765bf72ed16 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3920,8 +3920,17 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
 	if (unlikely(!pmd_none(*vmf->pmd)))
 		goto out;
 
-	for (i = 0; i < HPAGE_PMD_NR; i++)
+	for (i = 0; i < HPAGE_PMD_NR; i++) {
+		/*
+		 * Just back off if any subpage of a THP is corrupted, otherwise
+		 * the corrupted page may be mapped by PMD silently to escape the
+		 * check. This kind of THP can only be PTE mapped. Access to
+		 * the corrupted subpage should trigger SIGBUS as expected.
+		 */
+		if (PageHWPoison(page + i))
+			goto out;
 		flush_icache_page(vma, page + i);
+	}
 
 	entry = mk_huge_pmd(page, vma->vm_page_prot);
 	if (write)

From patchwork Tue Sep 14 18:37:16 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 12494423
From: Yang Shi
To: naoya.horiguchi@nec.com, hughd@google.com, kirill.shutemov@linux.intel.com, willy@infradead.org, osalvador@suse.de, akpm@linux-foundation.org
Cc: shy828301@gmail.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 2/4] mm: khugepaged: check if file page is on LRU after locking page
Date: Tue, 14 Sep 2021 11:37:16 -0700
Message-Id: <20210914183718.4236-3-shy828301@gmail.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20210914183718.4236-1-shy828301@gmail.com>
References: <20210914183718.4236-1-shy828301@gmail.com>

khugepaged does check whether the page is on the LRU, but it doesn't hold
the page lock when doing so, and it doesn't check again after taking the
page lock. So it may race with others, e.g. the reclaimer, migration,
etc. All of them isolate the page from the LRU, then lock the page, then
do something.

But the page could still pass the refcount check done by khugepaged and
proceed to collapse. Typically such a race is not fatal.
But if the page has been isolated from the LRU before khugepaged, it
likely means the page is not suitable for collapse for now.

The other, more fatal case is that the following patch will keep the
poisoned page in the page cache for shmem, so khugepaged may collapse a
poisoned page since the refcount check could pass. The 3 refcounts come
from:
  - hwpoison
  - page cache
  - khugepaged

Since the page is not on the LRU, no refcount is taken by LRU isolation.

This is definitely not expected. Checking whether the page is on the LRU
after taking the page lock helps serialize against the hwpoison handler.

But there is still a small race window between setting the hwpoison flag
and bumping the refcount in the hwpoison handler. It could be closed by
checking the hwpoison flag in khugepaged, however this race seems
unlikely to happen in real-life workloads. So just check the LRU flag for
now to avoid over-engineering.

Signed-off-by: Yang Shi
---
 mm/khugepaged.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 045cc579f724..bdc161dc27dc 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1808,6 +1808,12 @@ static void collapse_file(struct mm_struct *mm,
 			goto out_unlock;
 		}
 
+		/* The hwpoisoned page is off LRU but in page cache */
+		if (!PageLRU(page)) {
+			result = SCAN_PAGE_LRU;
+			goto out_unlock;
+		}
+
 		if (isolate_lru_page(page)) {
 			result = SCAN_DEL_PAGE_LRU;
 			goto out_unlock;

From patchwork Tue Sep 14 18:37:17 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 12494425
From: Yang Shi
To: naoya.horiguchi@nec.com, hughd@google.com, kirill.shutemov@linux.intel.com, willy@infradead.org, osalvador@suse.de, akpm@linux-foundation.org
Cc: shy828301@gmail.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 3/4] mm: shmem: don't truncate page if memory failure happens
Date: Tue, 14 Sep 2021 11:37:17 -0700
Message-Id: <20210914183718.4236-4-shy828301@gmail.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20210914183718.4236-1-shy828301@gmail.com>
References: <20210914183718.4236-1-shy828301@gmail.com>

The current behavior of memory failure is to truncate the page cache
regardless of whether the page is dirty or clean. If the page is dirty,
later accesses will get the obsolete data from disk without any
notification to the users. This may cause silent data loss. It is even
worse for shmem: since shmem is an in-memory filesystem, truncating the
page cache means discarding data blocks, and a later read would return
all zeros.

The right approach is to keep the corrupted page in the page cache, so
that any later access returns an error for syscalls or SIGBUS for page
faults, until the file is truncated, hole punched or removed.
Regular storage-backed filesystems would be more complicated, so this
patch focuses on shmem. This also unblocks support for soft offlining
shmem THP.

Signed-off-by: Yang Shi
---
 mm/memory-failure.c |  3 ++-
 mm/shmem.c          | 25 +++++++++++++++++++++++--
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 54879c339024..3e06cb9d5121 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1101,7 +1101,8 @@ static int page_action(struct page_state *ps, struct page *p,
 	result = ps->action(p, pfn);
 
 	count = page_count(p) - 1;
-	if (ps->action == me_swapcache_dirty && result == MF_DELAYED)
+	if ((ps->action == me_swapcache_dirty && result == MF_DELAYED) ||
+	    (ps->action == me_pagecache_dirty && result == MF_FAILED))
 		count--;
 	if (count > 0) {
 		pr_err("Memory failure: %#lx: %s still referenced by %d users\n",
diff --git a/mm/shmem.c b/mm/shmem.c
index 88742953532c..ec33f4f7173d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2456,6 +2456,7 @@ shmem_write_begin(struct file *file, struct address_space *mapping,
 	struct inode *inode = mapping->host;
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	pgoff_t index = pos >> PAGE_SHIFT;
+	int ret = 0;
 
 	/* i_rwsem is held by caller */
 	if (unlikely(info->seals & (F_SEAL_GROW |
@@ -2466,7 +2467,19 @@ shmem_write_begin(struct file *file, struct address_space *mapping,
 			return -EPERM;
 	}
 
-	return shmem_getpage(inode, index, pagep, SGP_WRITE);
+	ret = shmem_getpage(inode, index, pagep, SGP_WRITE);
+
+	if (!ret) {
+		if (*pagep) {
+			if (PageHWPoison(*pagep)) {
+				unlock_page(*pagep);
+				put_page(*pagep);
+				ret = -EIO;
+			}
+		}
+	}
+
+	return ret;
 }
 
 static int
@@ -2555,6 +2568,11 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 				unlock_page(page);
 		}
 
+		if (page && PageHWPoison(page)) {
+			error = -EIO;
+			break;
+		}
+
 		/*
 		 * We must evaluate after, since reads (unlike writes)
 		 * are called without i_rwsem protection against truncate
@@ -3782,7 +3800,6 @@ const struct address_space_operations shmem_aops = {
 #ifdef CONFIG_MIGRATION
 	.migratepage	= migrate_page,
 #endif
-	.error_remove_page = generic_error_remove_page,
 };
 EXPORT_SYMBOL(shmem_aops);
 
@@ -4193,6 +4210,10 @@ struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
 		page = ERR_PTR(error);
 	else
 		unlock_page(page);
+
+	if (PageHWPoison(page))
+		page = NULL;
+
 	return page;
 #else
 	/*

From patchwork Tue Sep 14 18:37:18 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 12494427
From: Yang Shi
To: naoya.horiguchi@nec.com, hughd@google.com, kirill.shutemov@linux.intel.com, willy@infradead.org, osalvador@suse.de, akpm@linux-foundation.org
Cc: shy828301@gmail.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 4/4] mm: hwpoison: handle non-anonymous THP correctly
Date: Tue, 14 Sep 2021 11:37:18 -0700
Message-Id: <20210914183718.4236-5-shy828301@gmail.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20210914183718.4236-1-shy828301@gmail.com>
References: <20210914183718.4236-1-shy828301@gmail.com>

Currently hwpoison doesn't handle non-anonymous THP, but since v4.8 THP
support for tmpfs and read-only file cache has been added. They could be
offlined by splitting the THP, just like anonymous THP.
Signed-off-by: Yang Shi
---
 mm/memory-failure.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 3e06cb9d5121..6f72aab8ec4a 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1150,13 +1150,16 @@ static int __get_hwpoison_page(struct page *page)
 
 	if (PageTransHuge(head)) {
 		/*
-		 * Non anonymous thp exists only in allocation/free time. We
-		 * can't handle such a case correctly, so let's give it up.
-		 * This should be better than triggering BUG_ON when kernel
-		 * tries to touch the "partially handled" page.
+		 * We can't handle allocating or freeing THPs, so let's give
+		 * it up. This should be better than triggering BUG_ON when
+		 * kernel tries to touch the "partially handled" page.
+		 *
+		 * page->mapping won't be initialized until the page is added
+		 * to rmap or page cache. Use this as an indicator for if
+		 * this is an instantiated page.
 		 */
-		if (!PageAnon(head)) {
-			pr_err("Memory failure: %#lx: non anonymous thp\n",
+		if (!head->mapping) {
+			pr_err("Memory failure: %#lx: non instantiated thp\n",
 				page_to_pfn(page));
 			return 0;
 		}
@@ -1415,12 +1418,12 @@ static int identify_page_state(unsigned long pfn, struct page *p,
 static int try_to_split_thp_page(struct page *page, const char *msg)
 {
 	lock_page(page);
-	if (!PageAnon(page) || unlikely(split_huge_page(page))) {
+	if (!page->mapping || unlikely(split_huge_page(page))) {
 		unsigned long pfn = page_to_pfn(page);
 
 		unlock_page(page);
-		if (!PageAnon(page))
-			pr_info("%s: %#lx: non anonymous thp\n", msg, pfn);
+		if (!page->mapping)
+			pr_info("%s: %#lx: not instantiated thp\n", msg, pfn);
 		else
 			pr_info("%s: %#lx: thp split failed\n", msg, pfn);
 		put_page(page);