From patchwork Fri Apr 8 13:53:21 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12806794
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 7D82DC433EF
	for ; Fri, 8 Apr 2022 13:53:34 +0000 (UTC)
Received: by kanga.kvack.org (Postfix)
	id E97CB6B0073; Fri, 8 Apr 2022 09:53:33 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40)
	id E47128D0001; Fri, 8 Apr 2022 09:53:33 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042)
	id D0F846B0075; Fri, 8 Apr 2022 09:53:33 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (relay.a.hostedemail.com [64.99.140.24])
	by kanga.kvack.org (Postfix) with ESMTP id C06A16B0073
	for ; Fri, 8 Apr 2022 09:53:33 -0400 (EDT)
Received: from smtpin01.hostedemail.com (a10.router.float.18 [10.200.18.1])
	by unirelay09.hostedemail.com (Postfix) with ESMTP id 8F7D225A20
	for ; Fri, 8 Apr 2022 13:53:33 +0000 (UTC)
X-FDA: 79333854306.01.BA523AD
Received: from out1.migadu.com (out1.migadu.com [91.121.223.63])
	by imf31.hostedemail.com (Postfix) with ESMTP id BD46020002
	for ; Fri, 8 Apr 2022 13:53:32 +0000 (UTC)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1;
	t=1649426011;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=mpc3qsHct94jtExg4iEwFSFMa0DJPcxp9unqRtEpQeM=;
	b=Jb0+t0DDYAIIN1WJOCujPJGYF87g1BDAM2a63+iky+mbUWt/Itp4YujYnfQFCfimz9yx4f
	pZuDHR1PyJJ7zodIK42qx3rNtqto0vpTIhTB3aCK1dxFQNdHDdGkn2SPMDlWHHy3/OKW3g
	pgrdwEN/sandS4QYwJ3FneLXDlIyLeg=
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, Mike Kravetz, Miaohe Lin, Yang Shi, Dan Carpenter,
	Naoya Horiguchi, linux-kernel@vger.kernel.org
Subject: [PATCH v8 1/3] mm/hwpoison: fix race between hugetlb free/demotion
 and memory_failure_hugetlb()
Date: Fri, 8 Apr 2022 22:53:21 +0900
Message-Id: <20220408135323.1559401-2-naoya.horiguchi@linux.dev>
In-Reply-To: <20220408135323.1559401-1-naoya.horiguchi@linux.dev>
References: <20220408135323.1559401-1-naoya.horiguchi@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
X-Migadu-Auth-User: linux.dev
X-Rspamd-Server: rspam05
X-Rspamd-Queue-Id: BD46020002
X-Stat-Signature: 5jdrcjkugcnthchyasswjna18dmajd8z
X-Rspam-User:
Authentication-Results: imf31.hostedemail.com;
	dkim=pass header.d=linux.dev header.s=key1 header.b=Jb0+t0DD;
	dmarc=pass (policy=none) header.from=linux.dev;
	spf=pass (imf31.hostedemail.com: domain of naoya.horiguchi@linux.dev
	designates 91.121.223.63 as permitted sender)
	smtp.mailfrom=naoya.horiguchi@linux.dev
X-HE-Tag: 1649426012-928590
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Naoya Horiguchi

There is a race condition between memory_failure_hugetlb() and hugetlb
free/demotion, which causes setting PageHWPoison flag on the wrong page.
One simple result is that the wrong processes can be killed, but another
(more serious) one is that the actual error is left unhandled, so nothing
prevents later access to it, and that might lead to more serious results
like consuming corrupted data.

Think about the race window below:

  CPU 1                                   CPU 2
  memory_failure_hugetlb
  struct page *head = compound_head(p);
                                          hugetlb page might be freed to
                                          buddy, or even changed to another
                                          compound page.

  get_hwpoison_page -- page is not what we want now...

The current code first does rough prechecks and then reconfirms them after
taking the refcount, but this was found to make the code overly
complicated, so move the prechecks into a single hugetlb_lock range.

A newly introduced function, try_memory_failure_hugetlb(), always takes
hugetlb_lock (even for non-hugetlb pages).  That can be improved, but
memory_failure() is rare in principle, so it should not be a big problem.

Fixes: 761ad8d7c7b5 ("mm: hwpoison: introduce memory_failure_hugetlb()")
Reported-by: Mike Kravetz
Signed-off-by: Naoya Horiguchi
Cc: stable@vger.kernel.org
Reviewed-by: Miaohe Lin
Reviewed-by: Mike Kravetz
---
ChangeLog v7 -> v8:
- move hwpoison_filter() within page locking.
ChangeLog v6 -> v7:
- Move lock_page() to try_memory_failure_hugetlb() (based on bug report
  from Dan Carpenter)
- Add Fixes: tag and CC to stable.
ChangeLog v5 -> v6:
- Moved racy precheck operations into hugetlb_lock (based on Mike's
  comment).
- rebased onto v5.18-rc1.
- dropped CC to stable.
ChangeLog v4 -> v5:
- call TestSetPageHWPoison() when page_handle_poison() fails.
- call TestSetPageHWPoison() for unhandlable cases (MF_MSG_UNKNOWN and
  MF_MSG_DIFFERENT_PAGE_SIZE).
- Set PageHWPoison on the head page only when the error page is surely
  a hugepage, otherwise set the flag on the raw page.
- rebased onto v5.17-rc8-mmotm-2022-03-16-17-42
ChangeLog v3 -> v4:
- squash with "mm/memory-failure.c: fix race with changing page
  compound again".
- update patch subject and description based on it.
ChangeLog v2 -> v3:
- rename the patch because page lock is not the primary factor to
  solve the reported issue.
- updated description in the same manner.
- call page_handle_poison() instead of __page_handle_poison() for
  free hugepage case.
- reorder put_page and unlock_page (thanks to Miaohe Lin)
ChangeLog v1 -> v2:
- pass subpage to get_hwpoison_huge_page() instead of head page.
- call compound_head() in hugetlb_lock to avoid race with hugetlb
  demotion/free.
---
 include/linux/hugetlb.h |   6 ++
 include/linux/mm.h      |   8 +++
 mm/hugetlb.c            |  10 +++
 mm/memory-failure.c     | 145 ++++++++++++++++++++++++++++------------
 4 files changed, 127 insertions(+), 42 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 53c1b6082a4c..ac2a1d758a80 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -169,6 +169,7 @@ long hugetlb_unreserve_pages(struct inode *inode, long start, long end,
 						long freed);
 bool isolate_huge_page(struct page *page, struct list_head *list);
 int get_hwpoison_huge_page(struct page *page, bool *hugetlb);
+int get_huge_page_for_hwpoison(unsigned long pfn, int flags);
 void putback_active_hugepage(struct page *page);
 void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason);
 void free_huge_page(struct page *page);
@@ -378,6 +379,11 @@ static inline int get_hwpoison_huge_page(struct page *page, bool *hugetlb)
 	return 0;
 }
 
+static inline int get_huge_page_for_hwpoison(unsigned long pfn, int flags)
+{
+	return 0;
+}
+
 static inline void putback_active_hugepage(struct page *page)
 {
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e34edb775334..9f44254af8ce 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3197,6 +3197,14 @@ extern int sysctl_memory_failure_recovery;
 extern void shake_page(struct page *p);
 extern atomic_long_t num_poisoned_pages __read_mostly;
 extern int soft_offline_page(unsigned long pfn, int flags);
+#ifdef CONFIG_MEMORY_FAILURE
+extern int __get_huge_page_for_hwpoison(unsigned long pfn, int flags);
+#else
+static inline int __get_huge_page_for_hwpoison(unsigned long pfn, int flags)
+{
+	return 0;
+}
+#endif
 
 #ifndef arch_memory_failure
 static inline int arch_memory_failure(unsigned long pfn, int flags)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f8ca7cca3c1a..3fc721789743 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6785,6 +6785,16 @@ int get_hwpoison_huge_page(struct page *page, bool *hugetlb)
 	return ret;
 }
 
+int get_huge_page_for_hwpoison(unsigned long pfn, int flags)
+{
+	int ret;
+
+	spin_lock_irq(&hugetlb_lock);
+	ret = __get_huge_page_for_hwpoison(pfn, flags);
+	spin_unlock_irq(&hugetlb_lock);
+	return ret;
+}
+
 void putback_active_hugepage(struct page *page)
 {
 	spin_lock_irq(&hugetlb_lock);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index dcb6bb9cf731..2020944398c9 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1498,50 +1498,113 @@ static int try_to_split_thp_page(struct page *page, const char *msg)
 	return 0;
 }
 
-static int memory_failure_hugetlb(unsigned long pfn, int flags)
+/*
+ * Called from hugetlb code with hugetlb_lock held.
+ *
+ * Return values:
+ *   0             - free hugepage
+ *   1             - in-use hugepage
+ *   2             - not a hugepage
+ *   -EBUSY        - the hugepage is busy (try to retry)
+ *   -EHWPOISON    - the hugepage is already hwpoisoned
+ */
+int __get_huge_page_for_hwpoison(unsigned long pfn, int flags)
+{
+	struct page *page = pfn_to_page(pfn);
+	struct page *head = compound_head(page);
+	int ret = 2;	/* fallback to normal page handling */
+	bool count_increased = false;
+
+	if (!PageHeadHuge(head))
+		goto out;
+
+	if (flags & MF_COUNT_INCREASED) {
+		ret = 1;
+		count_increased = true;
+	} else if (HPageFreed(head) || HPageMigratable(head)) {
+		ret = get_page_unless_zero(head);
+		if (ret)
+			count_increased = true;
+	} else {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	if (TestSetPageHWPoison(head)) {
+		ret = -EHWPOISON;
+		goto out;
+	}
+
+	return ret;
+out:
+	if (count_increased)
+		put_page(head);
+	return ret;
+}
+
+#ifdef CONFIG_HUGETLB_PAGE
+/*
+ * Taking refcount of hugetlb pages needs extra care about race conditions
+ * with basic operations like hugepage allocation/free/demotion.
+ * So some of prechecks for hwpoison (pinning, and testing/setting
+ * PageHWPoison) should be done in single hugetlb_lock range.
+ */
+static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb)
 {
-	struct page *p = pfn_to_page(pfn);
-	struct page *head = compound_head(p);
 	int res;
+	struct page *p = pfn_to_page(pfn);
+	struct page *head;
 	unsigned long page_flags;
+	bool retry = true;
 
-	if (TestSetPageHWPoison(head)) {
-		pr_err("Memory failure: %#lx: already hardware poisoned\n",
-		       pfn);
-		res = -EHWPOISON;
-		if (flags & MF_ACTION_REQUIRED)
+	*hugetlb = 1;
+retry:
+	res = get_huge_page_for_hwpoison(pfn, flags);
+	if (res == 2) { /* fallback to normal page handling */
+		*hugetlb = 0;
+		return 0;
+	} else if (res == -EHWPOISON) {
+		pr_err("Memory failure: %#lx: already hardware poisoned\n", pfn);
+		if (flags & MF_ACTION_REQUIRED) {
+			head = compound_head(p);
 			res = kill_accessing_process(current, page_to_pfn(head), flags);
+		}
 		return res;
+	} else if (res == -EBUSY) {
+		if (retry) {
+			retry = false;
+			goto retry;
+		}
+		action_result(pfn, MF_MSG_UNKNOWN, MF_IGNORED);
+		return res;
+	}
+
+	head = compound_head(p);
+	lock_page(head);
+
+	if (hwpoison_filter(p)) {
+		ClearPageHWPoison(head);
+		res = -EOPNOTSUPP;
+		goto out;
 	}
 
 	num_poisoned_pages_inc();
 
-	if (!(flags & MF_COUNT_INCREASED)) {
-		res = get_hwpoison_page(p, flags);
-		if (!res) {
-			lock_page(head);
-			if (hwpoison_filter(p)) {
-				if (TestClearPageHWPoison(head))
-					num_poisoned_pages_dec();
-				unlock_page(head);
-				return -EOPNOTSUPP;
-			}
-			unlock_page(head);
-			res = MF_FAILED;
-			if (__page_handle_poison(p)) {
-				page_ref_inc(p);
-				res = MF_RECOVERED;
-			}
-			action_result(pfn, MF_MSG_FREE_HUGE, res);
-			return res == MF_RECOVERED ? 0 : -EBUSY;
-		} else if (res < 0) {
-			action_result(pfn, MF_MSG_UNKNOWN, MF_IGNORED);
-			return -EBUSY;
+	/*
+	 * Handling free hugepage.  The possible race with hugepage allocation
+	 * or demotion can be prevented by PageHWPoison flag.
+	 */
+	if (res == 0) {
+		unlock_page(head);
+		res = MF_FAILED;
+		if (__page_handle_poison(p)) {
+			page_ref_inc(p);
+			res = MF_RECOVERED;
 		}
+		action_result(pfn, MF_MSG_FREE_HUGE, res);
+		return res == MF_RECOVERED ? 0 : -EBUSY;
 	}
 
-	lock_page(head);
-
 	/*
 	 * The page could have changed compound pages due to race window.
 	 * If this happens just bail out.
@@ -1554,14 +1617,6 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
 
 	page_flags = head->flags;
 
-	if (hwpoison_filter(p)) {
-		if (TestClearPageHWPoison(head))
-			num_poisoned_pages_dec();
-		put_page(p);
-		res = -EOPNOTSUPP;
-		goto out;
-	}
-
 	/*
 	 * TODO: hwpoison for pud-sized hugetlb doesn't work right now, so
 	 * simply disable it. In order to make it work properly, we need
@@ -1588,6 +1643,12 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
 	unlock_page(head);
 	return res;
 }
+#else
+static inline int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb)
+{
+	return 0;
+}
+#endif
 
 static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
 		struct dev_pagemap *pgmap)
@@ -1712,6 +1773,7 @@ int memory_failure(unsigned long pfn, int flags)
 	int res = 0;
 	unsigned long page_flags;
 	bool retry = true;
+	int hugetlb = 0;
 
 	if (!sysctl_memory_failure_recovery)
 		panic("Memory failure on page %lx", pfn);
@@ -1739,10 +1801,9 @@ int memory_failure(unsigned long pfn, int flags)
 	}
 
 try_again:
-	if (PageHuge(p)) {
-		res = memory_failure_hugetlb(pfn, flags);
+	res = try_memory_failure_hugetlb(pfn, flags, &hugetlb);
+	if (hugetlb)
 		goto unlock_mutex;
-	}
 
 	if (TestSetPageHWPoison(p)) {
 		pr_err("Memory failure: %#lx: already hardware poisoned\n",

From patchwork Fri Apr 8 13:53:22 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12806795
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17])
	by smtp.lore.kernel.org (Postfix) with ESMTP id ABF1AC433F5
	for ; Fri, 8 Apr 2022 13:53:37 +0000 (UTC)
Received: by kanga.kvack.org (Postfix)
	id 411B68D0002; Fri, 8 Apr 2022 09:53:37 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40)
	id 3BF198D0001; Fri, 8 Apr 2022 09:53:37 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042)
	id 260E58D0002; Fri, 8 Apr 2022 09:53:37 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from forelay.hostedemail.com (smtprelay0101.hostedemail.com [216.40.44.101])
	by kanga.kvack.org (Postfix) with ESMTP id 16C1B8D0001
	for ; Fri, 8 Apr 2022 09:53:37 -0400 (EDT)
Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251])
	by forelay01.hostedemail.com (Postfix) with ESMTP id BF555183B04A0
	for ; Fri, 8 Apr 2022 13:53:36 +0000 (UTC)
X-FDA: 79333854432.24.9BC944C
Received: from out1.migadu.com (out1.migadu.com [91.121.223.63])
	by imf07.hostedemail.com (Postfix) with ESMTP id 3EB5340008
	for ; Fri, 8 Apr 2022 13:53:36 +0000 (UTC)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1;
	t=1649426015;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=k5tFbU7BShyxeb1PXfEKBF6QTN+g39+2tO6/MCWhE0c=;
	b=uldKYj8wYfWzhPi+3mWtIYdcWHYJvIuyo9pyhadY1DxD+CHAWwh99kvT69HCH0FtdHMz4p
	DmGE/mw8ipjSFz4ig7FL0jCVxU0w9aUAfSVFqjg1kUJ3dZV0pjtDEzH7uRPcdOrpsn1Oi+
	32fBpogOy6U2CABsEBk5k8QuMihJDcM=
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, Mike Kravetz, Miaohe Lin, Yang Shi, Dan Carpenter,
	Naoya Horiguchi, linux-kernel@vger.kernel.org
Subject: [PATCH v8 2/3] mm/hwpoison: put page in already hwpoisoned case with
 MF_COUNT_INCREASED
Date: Fri, 8 Apr 2022 22:53:22 +0900
Message-Id: <20220408135323.1559401-3-naoya.horiguchi@linux.dev>
In-Reply-To: <20220408135323.1559401-1-naoya.horiguchi@linux.dev>
References: <20220408135323.1559401-1-naoya.horiguchi@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
X-Migadu-Auth-User: linux.dev
X-Stat-Signature: q9tdh7yzainqfjgbzaqt15xymxzom1c3
Authentication-Results: imf07.hostedemail.com;
	dkim=pass header.d=linux.dev header.s=key1 header.b=uldKYj8w;
	spf=pass (imf07.hostedemail.com: domain of naoya.horiguchi@linux.dev
	designates 91.121.223.63 as permitted sender)
	smtp.mailfrom=naoya.horiguchi@linux.dev;
	dmarc=pass (policy=none) header.from=linux.dev
X-Rspam-User:
X-Rspamd-Server: rspam02
X-Rspamd-Queue-Id: 3EB5340008
X-HE-Tag: 1649426016-17153
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Naoya Horiguchi

In already hwpoisoned case, memory_failure() is supposed to return with
releasing the page refcount taken for error handling.  But currently the
refcount is not released when called with MF_COUNT_INCREASED, which makes
page refcount inconsistent.
This should be rare and non-critical, but it might be inconvenient in
testing (unpoison doesn't work).

Suggested-by: Miaohe Lin
Signed-off-by: Naoya Horiguchi
Reviewed-by: Miaohe Lin
Reviewed-by: Mike Kravetz
---
 mm/memory-failure.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 2020944398c9..b2e32cdc3823 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1811,6 +1811,8 @@ int memory_failure(unsigned long pfn, int flags)
 		res = -EHWPOISON;
 		if (flags & MF_ACTION_REQUIRED)
 			res = kill_accessing_process(current, pfn, flags);
+		if (flags & MF_COUNT_INCREASED)
+			put_page(p);
 		goto unlock_mutex;
 	}
 

From patchwork Fri Apr 8 13:53:23 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12806796
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 19A1BC433FE
	for ; Fri, 8 Apr 2022 13:53:42 +0000 (UTC)
Received: by kanga.kvack.org (Postfix)
	id A7C1B8D0003; Fri, 8 Apr 2022 09:53:41 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40)
	id A03F78D0001; Fri, 8 Apr 2022 09:53:41 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042)
	id 8A4CB8D0003; Fri, 8 Apr 2022 09:53:41 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (relay.hostedemail.com [64.99.140.25])
	by kanga.kvack.org (Postfix) with ESMTP id 787D68D0001
	for ; Fri, 8 Apr 2022 09:53:41 -0400 (EDT)
Received: from smtpin01.hostedemail.com (a10.router.float.18 [10.200.18.1])
	by unirelay11.hostedemail.com (Postfix) with ESMTP id 52032811CD
	for ; Fri, 8 Apr 2022 13:53:41 +0000 (UTC)
X-FDA: 79333854642.01.B60FAF5
Received: from out1.migadu.com (out1.migadu.com [91.121.223.63])
	by imf23.hostedemail.com (Postfix) with ESMTP id C3042140003
	for ; Fri, 8 Apr 2022 13:53:40 +0000 (UTC)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1;
	t=1649426018;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=dYRCEMKWYu0UDhikG8Ly0BDPYPTrRS4FaVxa2Scm9pU=;
	b=D98jGQZJ/Q96aheRWk0f6JMhEQHX+yvMXktw/S+8PypGelMork+OaylW7lF5Rr6nNTDEmh
	92u10xKgkamt9Ep6joVsygAllJQQpN/LNRihRUL5Rp/SYTbe0aZZ1LT/53CCUoPLaPtATN
	5Rw90DCQuXRP1CXYkvnr1EBIa8yGhMU=
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, Mike Kravetz, Miaohe Lin, Yang Shi, Dan Carpenter,
	Naoya Horiguchi, linux-kernel@vger.kernel.org
Subject: [PATCH v8 3/3] Revert "mm/memory-failure.c: fix race with changing
 page compound again"
Date: Fri, 8 Apr 2022 22:53:23 +0900
Message-Id: <20220408135323.1559401-4-naoya.horiguchi@linux.dev>
In-Reply-To: <20220408135323.1559401-1-naoya.horiguchi@linux.dev>
References: <20220408135323.1559401-1-naoya.horiguchi@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
X-Migadu-Auth-User: linux.dev
X-Stat-Signature: y1rgdqtnhjg8edxtmjkhrairqymbmhkr
X-Rspamd-Server: rspam07
X-Rspamd-Queue-Id: C3042140003
Authentication-Results: imf23.hostedemail.com;
	dkim=pass header.d=linux.dev header.s=key1 header.b=D98jGQZJ;
	dmarc=pass (policy=none) header.from=linux.dev;
	spf=pass (imf23.hostedemail.com: domain of naoya.horiguchi@linux.dev
	designates 91.121.223.63 as permitted sender)
	smtp.mailfrom=naoya.horiguchi@linux.dev
X-Rspam-User:
X-HE-Tag: 1649426020-23730
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Naoya Horiguchi

Reverts commit 888af2701db7 ("mm/memory-failure.c: fix race with
changing page compound again") because now we fetch the page refcount
under hugetlb_lock in try_memory_failure_hugetlb() so that the race check
is no longer necessary.

Suggested-by: Miaohe Lin
Signed-off-by: Naoya Horiguchi
Reviewed-by: Miaohe Lin
Reviewed-by: Mike Kravetz
---
 include/linux/mm.h      |  1 -
 include/ras/ras_event.h |  1 -
 mm/memory-failure.c     | 11 -----------
 3 files changed, 13 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9f44254af8ce..d446e834a3e5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3251,7 +3251,6 @@ enum mf_action_page_type {
 	MF_MSG_BUDDY,
 	MF_MSG_DAX,
 	MF_MSG_UNSPLIT_THP,
-	MF_MSG_DIFFERENT_PAGE_SIZE,
 	MF_MSG_UNKNOWN,
 };
 
diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index 1e694fd239b9..d0337a41141c 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -374,7 +374,6 @@ TRACE_EVENT(aer_event,
 	EM ( MF_MSG_BUDDY, "free buddy page" ) \
 	EM ( MF_MSG_DAX, "dax page" ) \
 	EM ( MF_MSG_UNSPLIT_THP, "unsplit thp" ) \
-	EM ( MF_MSG_DIFFERENT_PAGE_SIZE, "different page size" ) \
 	EMe ( MF_MSG_UNKNOWN, "unknown page" )
 
 /*
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index b2e32cdc3823..e2674532678b 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -733,7 +733,6 @@ static const char * const action_page_types[] = {
 	[MF_MSG_BUDDY] = "free buddy page",
 	[MF_MSG_DAX] = "dax page",
 	[MF_MSG_UNSPLIT_THP] = "unsplit thp",
-	[MF_MSG_DIFFERENT_PAGE_SIZE] = "different page size",
 	[MF_MSG_UNKNOWN] = "unknown page",
 };
 
@@ -1605,16 +1604,6 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 		return res == MF_RECOVERED ? 0 : -EBUSY;
 	}
 
-	/*
-	 * The page could have changed compound pages due to race window.
-	 * If this happens just bail out.
-	 */
-	if (!PageHuge(p) || compound_head(p) != head) {
-		action_result(pfn, MF_MSG_DIFFERENT_PAGE_SIZE, MF_IGNORED);
-		res = -EBUSY;
-		goto out;
-	}
-
 	page_flags = head->flags;
 
 	/*

From patchwork Fri Apr 15 04:18:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12814243
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17])
	by smtp.lore.kernel.org (Postfix) with ESMTP id EABD8C433EF
	for ; Fri, 15 Apr 2022 04:18:59 +0000 (UTC)
Received: by kanga.kvack.org (Postfix)
	id EF2F66B0071; Fri, 15 Apr 2022 00:18:58 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40)
	id EA2C16B0073; Fri, 15 Apr 2022 00:18:58 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042)
	id D69F76B0074; Fri, 15 Apr 2022 00:18:58 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (relay.hostedemail.com [64.99.140.27])
	by kanga.kvack.org (Postfix) with ESMTP id C8B4A6B0071
	for ; Fri, 15 Apr 2022 00:18:58 -0400 (EDT)
Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1])
	by unirelay13.hostedemail.com (Postfix) with ESMTP id 7AE9460B98
	for ; Fri, 15 Apr 2022 04:18:58 +0000 (UTC)
X-FDA: 79357807956.25.5FAA62C
Received: from out2.migadu.com (out2.migadu.com [188.165.223.204])
	by imf15.hostedemail.com (Postfix) with ESMTP id E2A77A0003
	for ; Fri, 15 Apr 2022 04:18:57 +0000 (UTC)
Date: Fri, 15 Apr 2022 13:18:48 +0900
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1;
	t=1649996335;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 in-reply-to:in-reply-to:references:references;
	bh=rl3sJ0XRlQIg8Gg43fwrBWpDvE8dJjHgrOhVyk/MSC8=;
	b=aM+/T+HofxzaGmEWiEGC3Ssm5s2AGIukaI8iWQrdo4qauFuK+eJtUA/3zxmLBsKx+Lvnws
	sW8O7sprEvwrfP5JOXbu07F1kIFPN7M8akoerStcHSwtLlOaiFvDsHeuaqVt25M5eXS4em
	mC02Pw/6Q//DWiO06W7ilUoSVQ6hA8M=
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
From: Naoya Horiguchi
To: Miaohe Lin, Mike Kravetz, "linux-mm@kvack.org", Andrew Morton
Cc: Yang Shi, Dan Carpenter, naoya.horiguchi@nec.com,
	"linux-kernel@vger.kernel.org"
Subject: [PATCH 4/3] mm, hugetlb, hwpoison: separate branch for free and
 in-use hugepage
Message-ID: <20220415041848.GA3034499@ik1-406-35019.vs.sakura.ne.jp>
References: <20220408135323.1559401-1-naoya.horiguchi@linux.dev>
	<20220408135323.1559401-2-naoya.horiguchi@linux.dev>
	<5b665bcd-57f8-85ae-b0c4-c055875dbfff@oracle.com>
	<20e677e5-01aa-f8c0-0ce1-bf33da58b7ec@huawei.com>
	<20220415021233.GA3357039@hori.linux.bs1.fc.nec.co.jp>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <20220415021233.GA3357039@hori.linux.bs1.fc.nec.co.jp>
X-Migadu-Flow: FLOW_OUT
X-Migadu-Auth-User: linux.dev
X-Rspam-User:
X-Rspamd-Server: rspam11
X-Rspamd-Queue-Id: E2A77A0003
X-Stat-Signature: n83633583ps3aw81i4fkb985mn17ys7s
Authentication-Results: imf15.hostedemail.com;
	dkim=pass header.d=linux.dev header.s=key1 header.b="aM+/T+Ho";
	spf=pass (imf15.hostedemail.com: domain of naoya.horiguchi@linux.dev
	designates 188.165.223.204 as permitted sender)
	smtp.mailfrom=naoya.horiguchi@linux.dev;
	dmarc=pass (policy=none) header.from=linux.dev
X-HE-Tag: 1649996337-38308
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Naoya Horiguchi

We know that HPageFreed pages should have page refcount 0, so
get_page_unless_zero() always fails and returns 0.  So explicitly separate
the branch based on page state for minor optimization and better
readability.

Suggested-by: Mike Kravetz
Suggested-by: Miaohe Lin
Signed-off-by: Naoya Horiguchi
Reviewed-by: Mike Kravetz
Reviewed-by: Miaohe Lin
---
 mm/hugetlb.c        | 4 +++-
 mm/memory-failure.c | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e38cbfdf3e61..3638f166e554 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6786,7 +6786,9 @@ int get_hwpoison_huge_page(struct page *page, bool *hugetlb)
 	spin_lock_irq(&hugetlb_lock);
 	if (PageHeadHuge(page)) {
 		*hugetlb = true;
-		if (HPageFreed(page) || HPageMigratable(page))
+		if (HPageFreed(page))
+			ret = 0;
+		else if (HPageMigratable(page))
 			ret = get_page_unless_zero(page);
 		else
 			ret = -EBUSY;
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 5e3ad640f5bb..661079a37f29 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1517,7 +1517,9 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags)
 	if (flags & MF_COUNT_INCREASED) {
 		ret = 1;
 		count_increased = true;
-	} else if (HPageFreed(head) || HPageMigratable(head)) {
+	} else if (HPageFreed(head)) {
+		ret = 0;
+	} else if (HPageMigratable(head)) {
 		ret = get_page_unless_zero(head);
 		if (ret)
 			count_increased = true;
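
For readers following the series, the return-value contract of
__get_huge_page_for_hwpoison() as it stands after patch 4/3 can be
illustrated with a small user-space model.  This is only a sketch and not
kernel code: struct model_page, MODEL_MF_COUNT_INCREASED and
model_get_huge_page_for_hwpoison() are hypothetical names invented here,
the boolean fields merely stand in for PageHeadHuge(), HPageFreed(),
HPageMigratable() and PageHWPoison(), and the locking (hugetlb_lock) is
not modelled at all.  The EHWPOISON fallback value is likewise an
assumption for non-Linux builds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* assumption: fallback where errno.h lacks it */
#endif

#define MODEL_MF_COUNT_INCREASED 0x1	/* stand-in for MF_COUNT_INCREASED */

/* Hypothetical stand-in for the head page state tested under hugetlb_lock. */
struct model_page {
	bool head_huge;		/* models PageHeadHuge(head) */
	bool freed;		/* models HPageFreed(head) */
	bool migratable;	/* models HPageMigratable(head) */
	bool hwpoison;		/* models PageHWPoison(head) */
	int refcount;
};

/*
 * Returns 0 (free hugepage), 1 (in-use hugepage), 2 (not a hugepage),
 * -EBUSY or -EHWPOISON, mirroring the comment added in patch 1/3.
 */
int model_get_huge_page_for_hwpoison(struct model_page *head, int flags)
{
	int ret = 2;	/* fallback to normal page handling */
	bool count_increased = false;

	assert(head);

	if (!head->head_huge)
		goto out;

	if (flags & MODEL_MF_COUNT_INCREASED) {
		ret = 1;	/* the caller already holds a reference */
		count_increased = true;
	} else if (head->freed) {
		ret = 0;	/* free hugepage: no reference is taken */
	} else if (head->migratable) {
		/* models get_page_unless_zero() */
		if (head->refcount > 0) {
			head->refcount++;
			ret = 1;
			count_increased = true;
		} else {
			ret = 0;
		}
	} else {
		ret = -EBUSY;
		goto out;
	}

	/* models TestSetPageHWPoison() */
	if (head->hwpoison) {
		ret = -EHWPOISON;
		goto out;
	}
	head->hwpoison = true;
	return ret;
out:
	if (count_increased)
		head->refcount--;	/* models put_page() */
	return ret;
}
```

In this model a free hugepage is reported as 0 without touching the
refcount, an in-use migratable hugepage is pinned and reported as 1, and a
second call on the same page drops the extra reference again and reports
-EHWPOISON, which is the behaviour try_memory_failure_hugetlb() relies on.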