From patchwork Sat Mar 12 07:46:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Miaohe Lin X-Patchwork-Id: 12778756 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92129C433F5 for ; Sat, 12 Mar 2022 07:47:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231265AbiCLHsM (ORCPT ); Sat, 12 Mar 2022 02:48:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60016 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231259AbiCLHsL (ORCPT ); Sat, 12 Mar 2022 02:48:11 -0500 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 11DC823405F; Fri, 11 Mar 2022 23:47:06 -0800 (PST) Received: from canpemm500002.china.huawei.com (unknown [172.30.72.53]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4KFvwL60rQzBrYL; Sat, 12 Mar 2022 15:45:06 +0800 (CST) Received: from huawei.com (10.175.124.27) by canpemm500002.china.huawei.com (7.192.104.244) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.21; Sat, 12 Mar 2022 15:47:04 +0800 From: Miaohe Lin To: , , , CC: , , , , , Subject: [PATCH v2 1/3] mm/memory-failure.c: fix race with changing page compound again Date: Sat, 12 Mar 2022 15:46:11 +0800 Message-ID: <20220312074613.4798-2-linmiaohe@huawei.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20220312074613.4798-1-linmiaohe@huawei.com> References: <20220312074613.4798-1-linmiaohe@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.124.27] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To canpemm500002.china.huawei.com (7.192.104.244) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org There is a race window where we got the compound_head, the hugetlb page could be freed to buddy, or even changed to another compound page just before we try to get hwpoison page. Think about the below race window: CPU 1 CPU 2 memory_failure_hugetlb struct page *head = compound_head(p); hugetlb page might be freed to buddy, or even changed to another compound page. get_hwpoison_page -- page is not what we want now... If this race happens, just bail out. Also MF_MSG_DIFFERENT_PAGE_SIZE is introduced to record this event. Signed-off-by: Miaohe Lin Acked-by: Naoya Horiguchi --- include/linux/mm.h | 1 + include/ras/ras_event.h | 1 + mm/memory-failure.c | 12 ++++++++++++ 3 files changed, 14 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index c9bada4096ac..ef98cff2b253 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3253,6 +3253,7 @@ enum mf_action_page_type { MF_MSG_BUDDY, MF_MSG_DAX, MF_MSG_UNSPLIT_THP, + MF_MSG_DIFFERENT_PAGE_SIZE, MF_MSG_UNKNOWN, }; diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h index d0337a41141c..1e694fd239b9 100644 --- a/include/ras/ras_event.h +++ b/include/ras/ras_event.h @@ -374,6 +374,7 @@ TRACE_EVENT(aer_event, EM ( MF_MSG_BUDDY, "free buddy page" ) \ EM ( MF_MSG_DAX, "dax page" ) \ EM ( MF_MSG_UNSPLIT_THP, "unsplit thp" ) \ + EM ( MF_MSG_DIFFERENT_PAGE_SIZE, "different page size" ) \ EMe ( MF_MSG_UNKNOWN, "unknown page" ) /* diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 5444a8ef4867..dabecd87ad3f 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -733,6 +733,7 @@ static const char * const action_page_types[] = { [MF_MSG_BUDDY] = "free buddy page", [MF_MSG_DAX] = "dax page", [MF_MSG_UNSPLIT_THP] = "unsplit thp", + [MF_MSG_DIFFERENT_PAGE_SIZE] = "different page size", [MF_MSG_UNKNOWN] = "unknown page", }; @@ -1534,6 +1535,17 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags) } lock_page(head); + + /** + * The page could have changed compound pages due to race window. + * If this happens just bail out. + */ + if (!PageHuge(p) || compound_head(p) != head) { + action_result(pfn, MF_MSG_DIFFERENT_PAGE_SIZE, MF_IGNORED); + res = -EBUSY; + goto out; + } + page_flags = head->flags; if (hwpoison_filter(p)) { From patchwork Sat Mar 12 07:46:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Miaohe Lin X-Patchwork-Id: 12778757 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68CA4C433F5 for ; Sat, 12 Mar 2022 07:47:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231294AbiCLHsS (ORCPT ); Sat, 12 Mar 2022 02:48:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60038 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231270AbiCLHsM (ORCPT ); Sat, 12 Mar 2022 02:48:12 -0500 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3021A23405E; Fri, 11 Mar 2022 23:47:07 -0800 (PST) Received: from canpemm500002.china.huawei.com (unknown [172.30.72.55]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4KFvs023Q2zcb2J; Sat, 12 Mar 2022 15:42:12 +0800 (CST) Received: from huawei.com (10.175.124.27) by canpemm500002.china.huawei.com (7.192.104.244) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.21; Sat, 12 Mar 2022 15:47:04 +0800 From: Miaohe Lin To: , , , CC: , , , , , Subject: [PATCH v2 2/3] mm/memory-failure.c: avoid calling invalidate_inode_page() with unexpected pages Date: Sat, 12 Mar 2022 15:46:12 +0800 Message-ID: <20220312074613.4798-3-linmiaohe@huawei.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20220312074613.4798-1-linmiaohe@huawei.com> References: <20220312074613.4798-1-linmiaohe@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.124.27] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To canpemm500002.china.huawei.com (7.192.104.244) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org Since commit 042c4f32323b ("mm/truncate: Inline invalidate_complete_page() into its one caller"), invalidate_inode_page() can invalidate the pages in the swap cache because the check of page->mapping != mapping is removed. But invalidate_inode_page() is not expected to deal with the pages in swap cache. Also non-lru movable page can reach here too. They're not page cache pages. Skip these pages by checking PageSwapCache and PageLRU. Signed-off-by: Miaohe Lin --- mm/memory-failure.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/memory-failure.c b/mm/memory-failure.c index dabecd87ad3f..2ff7dd2078c4 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -2190,7 +2190,7 @@ static int __soft_offline_page(struct page *page) return 0; } - if (!PageHuge(page)) + if (!PageHuge(page) && PageLRU(page) && !PageSwapCache(page)) /* * Try to invalidate first. This should work for * non dirty unmapped page cache pages. From patchwork Sat Mar 12 07:46:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Miaohe Lin X-Patchwork-Id: 12778758 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5B1BEC4332F for ; Sat, 12 Mar 2022 07:47:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231259AbiCLHsT (ORCPT ); Sat, 12 Mar 2022 02:48:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60044 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231277AbiCLHsN (ORCPT ); Sat, 12 Mar 2022 02:48:13 -0500 Received: from szxga03-in.huawei.com (szxga03-in.huawei.com [45.249.212.189]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A5834234062; Fri, 11 Mar 2022 23:47:07 -0800 (PST) Received: from canpemm500002.china.huawei.com (unknown [172.30.72.57]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4KFvtK34J0z9sZ5; Sat, 12 Mar 2022 15:43:21 +0800 (CST) Received: from huawei.com (10.175.124.27) by canpemm500002.china.huawei.com (7.192.104.244) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.21; Sat, 12 Mar 2022 15:47:05 +0800 From: Miaohe Lin To: , , , CC: , , , , , Subject: [PATCH v2 3/3] mm/memory-failure.c: make non-LRU movable pages unhandlable Date: Sat, 12 Mar 2022 15:46:13 +0800 Message-ID: <20220312074613.4798-4-linmiaohe@huawei.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20220312074613.4798-1-linmiaohe@huawei.com> References: <20220312074613.4798-1-linmiaohe@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.124.27] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To canpemm500002.china.huawei.com (7.192.104.244) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org We can not really handle non-LRU movable pages in memory failure. Typically they are balloon, zsmalloc, etc. Assuming we run into a base (4K) non-LRU movable page, we could reach as far as identify_page_state(), it should not fall into any category except me_unknown. For the non-LRU compound movable pages, they could be taken for transhuge pages but it's unexpected to split non-LRU movable pages using split_huge_page_to_list in memory_failure. So we could just simply make non-LRU movable pages unhandlable to avoid these possible nasty cases. Suggested-by: Yang Shi Signed-off-by: Miaohe Lin Acked-by: Naoya Horiguchi Reviewed-by: Yang Shi --- mm/memory-failure.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 2ff7dd2078c4..ba621c6823ed 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1177,12 +1177,18 @@ void ClearPageHWPoisonTakenOff(struct page *page) * does not return true for hugetlb or device memory pages, so it's assumed * to be called only in the context where we never have such pages. */ -static inline bool HWPoisonHandlable(struct page *page) +static inline bool HWPoisonHandlable(struct page *page, unsigned long flags) { - return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page); + bool movable = false; + + /* Soft offline could mirgate non-LRU movable pages */ + if ((flags & MF_SOFT_OFFLINE) && __PageMovable(page)) + movable = true; + + return movable || PageLRU(page) || is_free_buddy_page(page); } -static int __get_hwpoison_page(struct page *page) +static int __get_hwpoison_page(struct page *page, unsigned long flags) { struct page *head = compound_head(page); int ret = 0; @@ -1197,7 +1203,7 @@ static int __get_hwpoison_page(struct page *page) * for any unsupported type of page in order to reduce the risk of * unexpected races caused by taking a page refcount. */ - if (!HWPoisonHandlable(head)) + if (!HWPoisonHandlable(head, flags)) return -EBUSY; if (get_page_unless_zero(head)) { @@ -1222,7 +1228,7 @@ static int get_any_page(struct page *p, unsigned long flags) try_again: if (!count_increased) { - ret = __get_hwpoison_page(p); + ret = __get_hwpoison_page(p, flags); if (!ret) { if (page_count(p)) { /* We raced with an allocation, retry. */ @@ -1250,7 +1256,7 @@ static int get_any_page(struct page *p, unsigned long flags) } } - if (PageHuge(p) || HWPoisonHandlable(p)) { + if (PageHuge(p) || HWPoisonHandlable(p, flags)) { ret = 1; } else { /* @@ -2308,7 +2314,7 @@ int soft_offline_page(unsigned long pfn, int flags) retry: get_online_mems(); - ret = get_hwpoison_page(page, flags); + ret = get_hwpoison_page(page, flags | MF_SOFT_OFFLINE); put_online_mems(); if (ret > 0) {