From patchwork Mon Jul 4 01:33:04 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12904546
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin,
 Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi,
 linux-kernel@vger.kernel.org
Subject: [mm-unstable PATCH v4 1/9] mm/hugetlb: check
 gigantic_page_runtime_supported() in return_unused_surplus_pages()
Date: Mon, 4 Jul 2022 10:33:04 +0900
Message-Id: <20220704013312.2415700-2-naoya.horiguchi@linux.dev>
In-Reply-To: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
References: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>

From: Naoya Horiguchi

I found a weird state of the 1GB hugepage pool, caused by the following
procedure:
- run a process reserving all free 1GB hugepages,
- shrink the free 1GB hugepage pool to zero (i.e. write 0 to
  /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages), then
- kill the reserving process.

After that, all the hugepages are free *and* surplus at the same time.

  $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
  3
  $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages
  3
  $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/resv_hugepages
  0
  $ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/surplus_hugepages
  3

This state is resolved by reserving and allocating the pages and then
freeing them again, so it does not seem to cause a serious problem.
But it is a little surprising (shrinking the pool suddenly fails).

This behavior is caused by the hstate_is_gigantic() check in
return_unused_surplus_pages().
This check was introduced long ago, in 2008, by commit aa888a74977a
("hugetlb: support larger than MAX_ORDER"), when gigantic pages were not
supposed to be allocated or freed at run-time. Now the kernel can support
runtime allocation/free, so let's check gigantic_page_runtime_supported()
together.

Signed-off-by: Naoya Horiguchi
---
v2 -> v3:
- Fixed typo in patch description,
- add !gigantic_page_runtime_supported() check instead of removing
  hstate_is_gigantic() check (suggested by Miaohe and Muchun)
- add a few more !gigantic_page_runtime_supported() checks in
  set_max_huge_pages() (by Mike).
---
 mm/hugetlb.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2a554f006255..bdc4499f324b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2432,8 +2432,7 @@ static void return_unused_surplus_pages(struct hstate *h,
 	/* Uncommit the reservation */
 	h->resv_huge_pages -= unused_resv_pages;
 
-	/* Cannot return gigantic pages currently */
-	if (hstate_is_gigantic(h))
+	if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
 		goto out;
 
 	/*
@@ -3315,7 +3314,8 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	 * the user tries to allocate gigantic pages but let the user free the
 	 * boottime allocated gigantic pages.
 	 */
-	if (hstate_is_gigantic(h) && !IS_ENABLED(CONFIG_CONTIG_ALLOC)) {
+	if (hstate_is_gigantic(h) && (!IS_ENABLED(CONFIG_CONTIG_ALLOC) ||
+				      !gigantic_page_runtime_supported())) {
 		if (count > persistent_huge_pages(h)) {
 			spin_unlock_irq(&hugetlb_lock);
 			mutex_unlock(&h->resize_lock);
@@ -3363,6 +3363,19 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 		goto out;
 	}
 
+	/*
+	 * We can not decrease gigantic pool size if runtime modification
+	 * is not supported.
+	 */
+	if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) {
+		if (count < persistent_huge_pages(h)) {
+			spin_unlock_irq(&hugetlb_lock);
+			mutex_unlock(&h->resize_lock);
+			NODEMASK_FREE(node_alloc_noretry);
+			return -EINVAL;
+		}
+	}
+
 	/*
 	 * Decrease the pool size
 	 * First return free pages to the buddy allocator (being careful

From patchwork Mon Jul 4 01:33:05 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12904547
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin,
 Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi,
 linux-kernel@vger.kernel.org
Subject: [mm-unstable PATCH v4 2/9] mm/hugetlb: separate path for hwpoison
 entry in copy_hugetlb_page_range()
Date: Mon, 4 Jul 2022 10:33:05 +0900
Message-Id: <20220704013312.2415700-3-naoya.horiguchi@linux.dev>
In-Reply-To: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
References: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>

From: Naoya Horiguchi

Originally, copy_hugetlb_page_range() handled migration entries and
hwpoisoned entries in a similar manner. But recently the related code path
gained more code for migration entries, and when
is_writable_migration_entry() was converted to
!is_readable_migration_entry(), hwpoison entries on source processes
started to be updated unexpectedly (which is legitimate for migration
entries, but not for hwpoison entries). This results in unexpected serious
issues such as a kernel panic when forking processes with hwpoison entries
in pmd.

Separate the if branch into one for hwpoison entries and one for migration
entries.
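The failure described above can be modelled in userspace. The following is
only a sketch under stated assumptions: the enum and every helper name are
hypothetical stand-ins, not kernel APIs, and the model reduces the fork-time
copy to the single decision of whether the source entry is rewritten.

```c
#include <stdbool.h>

/* Hypothetical kinds of non-present hugetlb entry used by this model. */
enum entry_kind {
	ENTRY_MIGRATION_READ,
	ENTRY_MIGRATION_WRITE,
	ENTRY_HWPOISON,
};

static bool is_migration(enum entry_kind k)
{
	return k == ENTRY_MIGRATION_READ || k == ENTRY_MIGRATION_WRITE;
}

static bool is_readable_migration(enum entry_kind k)
{
	return k == ENTRY_MIGRATION_READ;
}

/*
 * Combined branch: hwpoison and migration entries share one path.
 * A hwpoison entry is "not a readable migration entry", so the COW
 * downgrade also rewrites it in the source. Returns the entry that
 * would be installed in the child.
 */
static enum entry_kind copy_combined(enum entry_kind *src, bool cow)
{
	if (!is_readable_migration(*src) && cow)
		*src = ENTRY_MIGRATION_READ;
	return *src;
}

/*
 * Separate branches: a hwpoison entry is copied verbatim and the source
 * is left untouched; only migration entries are downgraded for COW.
 */
static enum entry_kind copy_separate(enum entry_kind *src, bool cow)
{
	if (*src == ENTRY_HWPOISON)
		return *src;
	if (is_migration(*src) && !is_readable_migration(*src) && cow)
		*src = ENTRY_MIGRATION_READ;
	return *src;
}
```

In this model, calling copy_combined() on a hwpoison entry with cow set
turns the source entry into a migration entry, while copy_separate()
leaves it as a hwpoison entry.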
Fixes: 6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
Signed-off-by: Naoya Horiguchi
Reviewed-by: Miaohe Lin
Reviewed-by: Mike Kravetz
Reviewed-by: Muchun Song
Cc: # 5.18
---
v3 -> v4:
- replace set_huge_swap_pte_at() with set_huge_pte_at()
---
 mm/hugetlb.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bdc4499f324b..ad621688370b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4803,8 +4803,13 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 			 * sharing with another vma.
 			 */
 			;
-		} else if (unlikely(is_hugetlb_entry_migration(entry) ||
-				    is_hugetlb_entry_hwpoisoned(entry))) {
+		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
+			bool uffd_wp = huge_pte_uffd_wp(entry);
+
+			if (!userfaultfd_wp(dst_vma) && uffd_wp)
+				entry = huge_pte_clear_uffd_wp(entry);
+			set_huge_pte_at(dst, addr, dst_pte, entry);
+		} else if (unlikely(is_hugetlb_entry_migration(entry))) {
 			swp_entry_t swp_entry = pte_to_swp_entry(entry);
 			bool uffd_wp = huge_pte_uffd_wp(entry);
 

From patchwork Mon Jul 4 01:33:06 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12904548
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin,
 Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi,
 linux-kernel@vger.kernel.org
Subject: [mm-unstable PATCH v4 3/9] mm/hugetlb: make pud_huge() and
 follow_huge_pud() aware of non-present pud entry
Date: Mon, 4 Jul 2022 10:33:06 +0900
Message-Id: <20220704013312.2415700-4-naoya.horiguchi@linux.dev>
In-Reply-To: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
References: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>

From: Naoya Horiguchi

follow_pud_mask() does not support non-present pud entries now. As far as
I tested on an x86_64 server, follow_pud_mask() still simply returns
no_page_table() for a non-present pud entry due to pud_bad(), so no severe
user-visible effect should happen. But generally we should call
follow_huge_pud() for a non-present pud entry of a 1GB hugetlb page.
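To make the non-present case concrete, here is a small userspace sketch of
the bit test involved. It is an illustration only: the MODEL_* macros and
model_* functions are hypothetical names, the flag positions follow the x86
layout (bit 0 for present, bit 7 for page size), and model_pud_none() is a
simplification that treats only an all-zero value as an empty entry.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86-style flag bits, redefined locally for this model. */
#define MODEL_PAGE_PRESENT (UINT64_C(1) << 0)
#define MODEL_PAGE_PSE     (UINT64_C(1) << 7)

/* Simplified stand-in for pud_none(): an all-zero entry is empty. */
static bool model_pud_none(uint64_t val)
{
	return val == 0;
}

/* PSE-only check: a non-present entry without the PSE bit is missed. */
static bool model_pud_huge_pse_only(uint64_t val)
{
	return (val & MODEL_PAGE_PSE) != 0;
}

/*
 * Non-present-aware check: anything that is not empty and not a plain
 * present pointer to a lower-level table counts as hugetlb related.
 */
static bool model_pud_huge(uint64_t val)
{
	return !model_pud_none(val) &&
	       (val & (MODEL_PAGE_PRESENT | MODEL_PAGE_PSE)) != MODEL_PAGE_PRESENT;
}
```

With these definitions, a value such as 0x1000 (not present, PSE clear,
chosen arbitrarily to stand for a migration or hwpoison entry) is rejected
by model_pud_huge_pse_only() but accepted by model_pud_huge(), while an
empty entry and a present non-huge entry are rejected by both.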
Update pud_huge() and follow_huge_pud() to handle non-present pud entries.
The changes are similar to previous work for pmd entries: commit
e66f17ff7177 ("mm/hugetlb: take page table lock in follow_huge_pmd()") and
commit cbef8478bee5 ("mm/hugetlb: pmd_huge() returns true for non-present
hugepage").

Signed-off-by: Naoya Horiguchi
Reviewed-by: Miaohe Lin
Reviewed-by: Mike Kravetz
---
v2 -> v3:
- fixed typos in subject and description,
- added comment on pud_huge(),
- added comment about fallback for hwpoisoned entry,
- updated initial check about FOLL_{PIN,GET} flags.
---
 arch/x86/mm/hugetlbpage.c |  8 +++++++-
 mm/hugetlb.c              | 32 ++++++++++++++++++++++++++++++--
 2 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 509408da0da1..6b3033845c6d 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -30,9 +30,15 @@ int pmd_huge(pmd_t pmd)
 		(pmd_val(pmd) & (_PAGE_PRESENT|_PAGE_PSE)) != _PAGE_PRESENT;
 }
 
+/*
+ * pud_huge() returns 1 if @pud is hugetlb related entry, that is normal
+ * hugetlb entry or non-present (migration or hwpoisoned) hugetlb entry.
+ * Otherwise, returns 0.
+ */
 int pud_huge(pud_t pud)
 {
-	return !!(pud_val(pud) & _PAGE_PSE);
+	return !pud_none(pud) &&
+		(pud_val(pud) & (_PAGE_PRESENT|_PAGE_PSE)) != _PAGE_PRESENT;
 }
 
 #ifdef CONFIG_HUGETLB_PAGE
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ad621688370b..66bb39e0fce8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6994,10 +6994,38 @@ struct page * __weak
 follow_huge_pud(struct mm_struct *mm, unsigned long address,
 		pud_t *pud, int flags)
 {
-	if (flags & (FOLL_GET | FOLL_PIN))
+	struct page *page = NULL;
+	spinlock_t *ptl;
+	pte_t pte;
+
+	if (WARN_ON_ONCE(flags & FOLL_PIN))
 		return NULL;
 
-	return pte_page(*(pte_t *)pud) + ((address & ~PUD_MASK) >> PAGE_SHIFT);
+retry:
+	ptl = huge_pte_lock(hstate_sizelog(PUD_SHIFT), mm, (pte_t *)pud);
+	if (!pud_huge(*pud))
+		goto out;
+	pte = huge_ptep_get((pte_t *)pud);
+	if (pte_present(pte)) {
+		page = pud_page(*pud) + ((address & ~PUD_MASK) >> PAGE_SHIFT);
+		if (WARN_ON_ONCE(!try_grab_page(page, flags))) {
+			page = NULL;
+			goto out;
+		}
+	} else {
+		if (is_hugetlb_entry_migration(pte)) {
+			spin_unlock(ptl);
+			__migration_entry_wait(mm, (pte_t *)pud, ptl);
+			goto retry;
+		}
+		/*
+		 * hwpoisoned entry is treated as no_page_table in
+		 * follow_page_mask().
+		 */
+	}
+out:
+	spin_unlock(ptl);
+	return page;
 }
 
 struct page * __weak

From patchwork Mon Jul 4 01:33:07 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12904549
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin,
 Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi,
 linux-kernel@vger.kernel.org
Subject: [mm-unstable PATCH v4 4/9] mm, hwpoison, hugetlb: support saving
 mechanism of raw error pages
Date: Mon, 4 Jul 2022 10:33:07 +0900
Message-Id: <20220704013312.2415700-5-naoya.horiguchi@linux.dev>
In-Reply-To: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
References: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>

From: Naoya Horiguchi

When handling a memory error on a hugetlb page, the error handler tries to
dissolve it and turn it into 4kB pages. If it is successfully dissolved,
the PageHWPoison flag is moved to the raw error page, so that's all right.
However, dissolving sometimes fails, and then the error page is left as a
hwpoisoned hugepage. It would be useful to retry dissolving it to save the
healthy pages, but that is not possible now because the information about
where the raw error pages are is lost.

Use the private field of a few tail pages to keep that information. The
code path that shrinks the hugepage pool uses this info to try a delayed
dissolve. In order to remember multiple errors in a hugepage, a
singly-linked list originating from the SUBPAGE_INDEX_HWPOISON-th tail
page is constructed. Only simple operations (adding an entry or clearing
all) are required and the list is assumed not to be very long, so this
simple data structure should be enough.
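The data structure described above can be sketched in userspace as follows.
This is a minimal illustration under stated assumptions, not the kernel
implementation: struct raw_err, struct hpage_model and both functions are
hypothetical names, subpages are identified by a plain index rather than a
struct page pointer, and malloc() stands in for the kernel allocator. The
unreliable flag models the case where an error could not be recorded.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* One recorded raw error subpage (hypothetical userspace stand-in). */
struct raw_err {
	struct raw_err *next;
	size_t subpage;		/* index of the poisoned subpage */
};

/* Per-hugepage tracking state in this model. */
struct hpage_model {
	struct raw_err *errors;	/* head of the singly-linked list */
	bool unreliable;	/* set once an error could not be recorded */
};

/*
 * Record a poisoned subpage. Returns true if it was newly recorded,
 * false if it was already known, tracking is already unreliable, or
 * the allocation failed (which marks the hugepage as unreliable).
 */
static bool record_raw_error(struct hpage_model *hp, size_t subpage)
{
	struct raw_err *e;

	if (hp->unreliable)
		return false;
	for (e = hp->errors; e; e = e->next)
		if (e->subpage == subpage)
			return false;

	e = malloc(sizeof(*e));
	if (!e) {
		hp->unreliable = true;
		return false;
	}
	e->subpage = subpage;
	e->next = hp->errors;
	hp->errors = e;
	return true;
}

/* Free every entry and return how many errors had been recorded. */
static size_t clear_raw_errors(struct hpage_model *hp)
{
	size_t n = 0;

	while (hp->errors) {
		struct raw_err *e = hp->errors;

		hp->errors = e->next;
		free(e);
		n++;
	}
	return n;
}
```

Only the two operations named above are provided: adding an entry (with a
linear duplicate check, which is acceptable while the list stays short) and
clearing the whole list.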
If we failed to save raw error info, the hwpoison hugepage has errors on unknown subpage, then this new saving mechanism does not work any more, so disable saving new raw error info and freeing hwpoison hugepages. Signed-off-by: Naoya Horiguchi --- v3 -> v4: - resolve conflict with "mm: hugetlb_vmemmap: improve hugetlb_vmemmap code readability", use hugetlb_vmemmap_restore() instead of hugetlb_vmemmap_alloc(). v2 -> v3: - remove duplicate "return ret" lines, - use GFP_ATOMIC instead of GFP_KERNEL, - introduce HPageRawHwpUnreliable pseudo flag (suggested by Muchun), - hugetlb_clear_page_hwpoison removes raw_hwp_page list even if HPageRawHwpUnreliable is true, (by Miaohe) v1 -> v2: - support hwpoison hugepage with multiple errors, - moved the new interface functions to mm/memory-failure.c, - define additional subpage index SUBPAGE_INDEX_HWPOISON_UNRELIABLE, - stop freeing/dissolving hwpoison hugepages with unreliable raw error info, - drop hugetlb_clear_page_hwpoison() in dissolve_free_huge_page() because that's done in update_and_free_page(), - move setting/clearing PG_hwpoison flag to the new interfaces, - checking already hwpoisoned or not on a subpage basis. 
ChangeLog since previous post on 4/27: - fixed typo in patch description (by Miaohe) - fixed config value in #ifdef statement (by Miaohe) - added sentences about "multiple hwpoison pages" scenario in patch description Signed-off-by: Naoya Horiguchi --- include/linux/hugetlb.h | 18 +++++++++- mm/hugetlb.c | 39 ++++++++++---------- mm/memory-failure.c | 80 +++++++++++++++++++++++++++++++++++++++-- 3 files changed, 114 insertions(+), 23 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index dce46d571575..29c4d0883d36 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -42,6 +42,9 @@ enum { SUBPAGE_INDEX_CGROUP, /* reuse page->private */ SUBPAGE_INDEX_CGROUP_RSVD, /* reuse page->private */ __MAX_CGROUP_SUBPAGE_INDEX = SUBPAGE_INDEX_CGROUP_RSVD, +#endif +#ifdef CONFIG_MEMORY_FAILURE + SUBPAGE_INDEX_HWPOISON, #endif __NR_USED_SUBPAGE, }; @@ -551,7 +554,7 @@ generic_hugetlb_get_unmapped_area(struct file *file, unsigned long addr, * Synchronization: Initially set after new page allocation with no * locking. When examined and modified during migration processing * (isolate, migrate, putback) the hugetlb_lock is held. - * HPG_temporary - - Set on a page that is temporarily allocated from the buddy + * HPG_temporary -- Set on a page that is temporarily allocated from the buddy * allocator. Typically used for migration target pages when no pages * are available in the pool. The hugetlb free page path will * immediately free pages with this flag set to the buddy allocator. @@ -561,6 +564,8 @@ generic_hugetlb_get_unmapped_area(struct file *file, unsigned long addr, * HPG_freed - Set when page is on the free lists. * Synchronization: hugetlb_lock held for examination and modification. * HPG_vmemmap_optimized - Set when the vmemmap pages of the page are freed. + * HPG_raw_hwp_unreliable - Set when the hugetlb page has a hwpoison sub-page + * that is not tracked by raw_hwp_page list. 
*/ enum hugetlb_page_flags { HPG_restore_reserve = 0, @@ -568,6 +573,7 @@ enum hugetlb_page_flags { HPG_temporary, HPG_freed, HPG_vmemmap_optimized, + HPG_raw_hwp_unreliable, __NR_HPAGEFLAGS, }; @@ -614,6 +620,7 @@ HPAGEFLAG(Migratable, migratable) HPAGEFLAG(Temporary, temporary) HPAGEFLAG(Freed, freed) HPAGEFLAG(VmemmapOptimized, vmemmap_optimized) +HPAGEFLAG(RawHwpUnreliable, raw_hwp_unreliable) #ifdef CONFIG_HUGETLB_PAGE @@ -796,6 +803,15 @@ extern int dissolve_free_huge_page(struct page *page); extern int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn); +#ifdef CONFIG_MEMORY_FAILURE +extern int hugetlb_clear_page_hwpoison(struct page *hpage); +#else +static inline int hugetlb_clear_page_hwpoison(struct page *hpage) +{ + return 0; +} +#endif + #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION #ifndef arch_hugetlb_migration_supported static inline bool arch_hugetlb_migration_supported(struct hstate *h) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 66bb39e0fce8..ccd470f0194c 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1535,17 +1535,15 @@ static void __update_and_free_page(struct hstate *h, struct page *page) if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) return; - if (hugetlb_vmemmap_restore(h, page)) { - spin_lock_irq(&hugetlb_lock); - /* - * If we cannot allocate vmemmap pages, just refuse to free the - * page and put the page back on the hugetlb free list and treat - * as a surplus page. - */ - add_hugetlb_page(h, page, true); - spin_unlock_irq(&hugetlb_lock); - return; - } + if (hugetlb_vmemmap_restore(h, page)) + goto fail; + + /* + * Move PageHWPoison flag from head page to the raw error pages, + * which makes any healthy subpages reusable. 
+ */ + if (unlikely(PageHWPoison(page) && hugetlb_clear_page_hwpoison(page))) + goto fail; for (i = 0; i < pages_per_huge_page(h); i++, subpage = mem_map_next(subpage, page, i)) { @@ -1566,6 +1564,16 @@ static void __update_and_free_page(struct hstate *h, struct page *page) } else { __free_pages(page, huge_page_order(h)); } + return; +fail: + spin_lock_irq(&hugetlb_lock); + /* + * If we cannot allocate vmemmap pages or cannot identify raw hwpoison + * subpages reliably, just refuse to free the page and put the page + * back on the hugetlb free list and treat as a surplus page. + */ + add_hugetlb_page(h, page, true); + spin_unlock_irq(&hugetlb_lock); } /* @@ -2109,15 +2117,6 @@ int dissolve_free_huge_page(struct page *page) */ rc = hugetlb_vmemmap_restore(h, head); if (!rc) { - /* - * Move PageHWPoison flag from head page to the raw - * error page, which makes any subpages rather than - * the error page reusable. - */ - if (PageHWPoison(head) && page != head) { - SetPageHWPoison(page); - ClearPageHWPoison(head); - } update_and_free_page(h, head, false); } else { spin_lock_irq(&hugetlb_lock); diff --git a/mm/memory-failure.c b/mm/memory-failure.c index c9931c676335..53bf7486a245 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1664,6 +1664,82 @@ int mf_dax_kill_procs(struct address_space *mapping, pgoff_t index, EXPORT_SYMBOL_GPL(mf_dax_kill_procs); #endif /* CONFIG_FS_DAX */ +/* + * Struct raw_hwp_page represents information about "raw error page", + * constructing singly linked list originated from ->private field of + * SUBPAGE_INDEX_HWPOISON-th tail page. 
+ */
+struct raw_hwp_page {
+	struct llist_node node;
+	struct page *page;
+};
+
+static inline struct llist_head *raw_hwp_list_head(struct page *hpage)
+{
+	return (struct llist_head *)&page_private(hpage + SUBPAGE_INDEX_HWPOISON);
+}
+
+static inline int hugetlb_set_page_hwpoison(struct page *hpage,
+					struct page *page)
+{
+	struct llist_head *head;
+	struct raw_hwp_page *raw_hwp;
+	struct llist_node *t, *tnode;
+	int ret;
+
+	/*
+	 * Once the hwpoison hugepage has lost reliable raw error info,
+	 * there is little meaning to keep additional error info precisely,
+	 * so skip to add additional raw error info.
+	 */
+	if (HPageRawHwpUnreliable(hpage))
+		return -EHWPOISON;
+	head = raw_hwp_list_head(hpage);
+	llist_for_each_safe(tnode, t, head->first) {
+		struct raw_hwp_page *p = container_of(tnode, struct raw_hwp_page, node);
+
+		if (p->page == page)
+			return -EHWPOISON;
+	}
+
+	ret = TestSetPageHWPoison(hpage) ? -EHWPOISON : 0;
+	/* the first error event will be counted in action_result(). */
+	if (ret)
+		num_poisoned_pages_inc();
+
+	raw_hwp = kmalloc(sizeof(struct raw_hwp_page), GFP_ATOMIC);
+	if (raw_hwp) {
+		raw_hwp->page = page;
+		llist_add(&raw_hwp->node, head);
+	} else {
+		/*
+		 * Failed to save raw error info. We no longer trace all
+		 * hwpoisoned subpages, and we need refuse to free/dissolve
+		 * this hwpoisoned hugepage.
+		 */
+		SetHPageRawHwpUnreliable(hpage);
+	}
+	return ret;
+}
+
+inline int hugetlb_clear_page_hwpoison(struct page *hpage)
+{
+	struct llist_head *head;
+	struct llist_node *t, *tnode;
+
+	if (!HPageRawHwpUnreliable(hpage))
+		ClearPageHWPoison(hpage);
+	head = raw_hwp_list_head(hpage);
+	llist_for_each_safe(tnode, t, head->first) {
+		struct raw_hwp_page *p = container_of(tnode, struct raw_hwp_page, node);
+
+		SetPageHWPoison(p->page);
+		kfree(p);
+	}
+	llist_del_all(head);
+	return 0;
+}
+
 /*
  * Called from hugetlb code with hugetlb_lock held.
  *
@@ -1698,7 +1774,7 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags)
 		goto out;
 	}

-	if (TestSetPageHWPoison(head)) {
+	if (hugetlb_set_page_hwpoison(head, page)) {
 		ret = -EHWPOISON;
 		goto out;
 	}
@@ -1751,7 +1827,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 	lock_page(head);

 	if (hwpoison_filter(p)) {
-		ClearPageHWPoison(head);
+		hugetlb_clear_page_hwpoison(head);
 		res = -EOPNOTSUPP;
 		goto out;
 	}

From patchwork Mon Jul 4 01:33:08 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12904550
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin, Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi, linux-kernel@vger.kernel.org
Subject: [mm-unstable PATCH v4 5/9] mm, hwpoison: make unpoison aware of raw error info in hwpoisoned hugepage
Date: Mon, 4 Jul 2022 10:33:08 +0900
Message-Id: <20220704013312.2415700-6-naoya.horiguchi@linux.dev>
In-Reply-To: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
References: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
MIME-Version: 1.0

From: Naoya Horiguchi

The raw error info list needs to be removed when a hwpoisoned hugetlb
page is unpoisoned, and the unpoison handler needs to know how many
errors there are in the target hugepage. So add them.

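The accounting idea can be sketched in userspace as follows. This is a
simplified model and not the kernel code: struct raw_err, struct
fake_hugepage, record_error(), free_errors(), unpoison() and the
poisoned_total counter are all hypothetical names standing in for the
raw_hwp_page list, free_raw_hwp_pages() and num_poisoned_pages, and the
kernel's own counting rules for the first error event differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for one raw error record. */
struct raw_err {
	struct raw_err *next;
	unsigned long pfn;	/* which subpage was hit */
};

/* Hypothetical stand-in for a hugepage carrying a list of raw errors. */
struct fake_hugepage {
	struct raw_err *errors;
};

static long poisoned_total;	/* models a global poisoned-page counter */

/* Record one error on the page; returns false on allocation failure. */
static bool record_error(struct fake_hugepage *hp, unsigned long pfn)
{
	struct raw_err *e = malloc(sizeof(*e));

	if (!e)
		return false;
	e->pfn = pfn;
	e->next = hp->errors;
	hp->errors = e;
	poisoned_total++;
	return true;
}

/* Free every record and report how many there were. */
static long free_errors(struct fake_hugepage *hp)
{
	long count = 0;
	struct raw_err *e = hp->errors;

	while (e) {
		struct raw_err *next = e->next;	/* save before freeing */

		free(e);
		e = next;
		count++;
	}
	hp->errors = NULL;
	return count;
}

/* Unpoison: drop the list, then subtract the number of freed records. */
static long unpoison(struct fake_hugepage *hp)
{
	long count = free_errors(hp);

	assert(count <= poisoned_total);
	poisoned_total -= count;
	return count;
}
```

The point of returning the count from the list teardown is that the
caller can adjust the global counter once, by the right amount, instead
of assuming exactly one error per unpoisoned page.
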
Signed-off-by: Naoya Horiguchi
---
 include/linux/swapops.h |  9 +++++++++
 mm/memory-failure.c     | 31 +++++++++++++++++++++++++------
 2 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index a01aeb3fcc0b..ddc98f96ad2c 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -498,6 +498,11 @@ static inline void num_poisoned_pages_dec(void)
 	atomic_long_dec(&num_poisoned_pages);
 }

+static inline void num_poisoned_pages_sub(long i)
+{
+	atomic_long_sub(i, &num_poisoned_pages);
+}
+
 #else

 static inline swp_entry_t make_hwpoison_entry(struct page *page)
@@ -518,6 +523,10 @@ static inline struct page *hwpoison_entry_to_page(swp_entry_t entry)
 static inline void num_poisoned_pages_inc(void)
 {
 }
+
+static inline void num_poisoned_pages_sub(long i)
+{
+}
 #endif

 static inline int non_swap_entry(swp_entry_t entry)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 53bf7486a245..6af2096d8ea0 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1722,22 +1722,33 @@ static inline int hugetlb_set_page_hwpoison(struct page *hpage,
 	return ret;
 }

-inline int hugetlb_clear_page_hwpoison(struct page *hpage)
+static inline long free_raw_hwp_pages(struct page *hpage, bool move_flag)
 {
 	struct llist_head *head;
 	struct llist_node *t, *tnode;
+	long count = 0;

-	if (!HPageRawHwpUnreliable(hpage))
-		ClearPageHWPoison(hpage);
 	head = raw_hwp_list_head(hpage);
 	llist_for_each_safe(tnode, t, head->first) {
 		struct raw_hwp_page *p = container_of(tnode, struct raw_hwp_page, node);

-		SetPageHWPoison(p->page);
+		if (move_flag)
+			SetPageHWPoison(p->page);
 		kfree(p);
+		count++;
 	}
 	llist_del_all(head);
-	return 0;
+	return count;
+}
+
+inline int hugetlb_clear_page_hwpoison(struct page *hpage)
+{
+	int ret = -EBUSY;
+
+	if (!HPageRawHwpUnreliable(hpage))
+		ret = !TestClearPageHWPoison(hpage);
+	free_raw_hwp_pages(hpage, true);
+	return ret;
 }

 /*
@@ -1882,6 +1893,9 @@ static inline int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *
 	return 0;
 }

+static inline void free_raw_hwp_pages(struct page *hpage, bool move_flag)
+{
+}
 #endif /* CONFIG_HUGETLB_PAGE */

 static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
@@ -2287,6 +2301,7 @@ int unpoison_memory(unsigned long pfn)
 	struct page *p;
 	int ret = -EBUSY;
 	int freeit = 0;
+	long count = 1;
 	static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
 					DEFAULT_RATELIMIT_BURST);

@@ -2334,6 +2349,8 @@ int unpoison_memory(unsigned long pfn)

 	ret = get_hwpoison_page(p, MF_UNPOISON);
 	if (!ret) {
+		if (PageHuge(p))
+			count = free_raw_hwp_pages(page, false);
 		ret = TestClearPageHWPoison(page) ? 0 : -EBUSY;
 	} else if (ret < 0) {
 		if (ret == -EHWPOISON) {
@@ -2342,6 +2359,8 @@ int unpoison_memory(unsigned long pfn)
 			unpoison_pr_info("Unpoison: failed to grab page %#lx\n",
 					 pfn, &unpoison_rs);
 	} else {
+		if (PageHuge(p))
+			count = free_raw_hwp_pages(page, false);
 		freeit = !!TestClearPageHWPoison(p);

 		put_page(page);
@@ -2354,7 +2373,7 @@ int unpoison_memory(unsigned long pfn)
 unlock_mutex:
 	mutex_unlock(&mf_mutex);
 	if (!ret || freeit) {
-		num_poisoned_pages_dec();
+		num_poisoned_pages_sub(count);
 		unpoison_pr_info("Unpoison: Software-unpoisoned page %#lx\n",
 				 page_to_pfn(p), &unpoison_rs);
 	}

From patchwork Mon Jul 4 01:33:09 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12904551
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin, Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi, linux-kernel@vger.kernel.org
Subject: [mm-unstable PATCH v4 6/9] mm, hwpoison: set PG_hwpoison for busy hugetlb pages
Date: Mon, 4 Jul 2022 10:33:09 +0900
Message-Id: <20220704013312.2415700-7-naoya.horiguchi@linux.dev>
In-Reply-To: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
References: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
MIME-Version: 1.0

From: Naoya Horiguchi

If memory_failure() fails to grab page refcount on a hugetlb page
because it's busy, it returns without setting PG_hwpoison on it.
This not only loses a chance of error containment, but also breaks the
rule that action_result() should be called only when memory_failure()
does any handling work (even if that's just setting PG_hwpoison).
This inconsistency could harm code maintainability.

So set PG_hwpoison and call hugetlb_set_page_hwpoison() for such a case.

Fixes: 405ce051236c ("mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()")
Signed-off-by: Naoya Horiguchi
Reviewed-by: Miaohe Lin
---
 include/linux/mm.h  | 1 +
 mm/memory-failure.c | 8 ++++----
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 433bde7dcbf2..22f2dfe41c99 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3235,6 +3235,7 @@ enum mf_flags {
 	MF_SOFT_OFFLINE = 1 << 3,
 	MF_UNPOISON = 1 << 4,
 	MF_SW_SIMULATED = 1 << 5,
+	MF_NO_RETRY = 1 << 6,
 };
 int mf_dax_kill_procs(struct address_space *mapping, pgoff_t index,
 		      unsigned long count, int mf_flags);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 6af2096d8ea0..4233b21328a5 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1782,7 +1782,8 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags)
 			count_increased = true;
 	} else {
 		ret = -EBUSY;
-		goto out;
+		if (!(flags & MF_NO_RETRY))
+			goto out;
 	}

 	if (hugetlb_set_page_hwpoison(head, page)) {
@@ -1810,7 +1811,6 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 	struct page *p = pfn_to_page(pfn);
 	struct page *head;
 	unsigned long page_flags;
-	bool retry = true;

 	*hugetlb = 1;
 retry:
@@ -1826,8 +1826,8 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 		}
 		return res;
 	} else if (res == -EBUSY) {
-		if (retry) {
-			retry = false;
+		if (!(flags & MF_NO_RETRY)) {
+			flags |= MF_NO_RETRY;
 			goto retry;
 		}
 		action_result(pfn, MF_MSG_UNKNOWN, MF_IGNORED);

From patchwork Mon Jul 4 01:33:10 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12904552
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin, Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi, linux-kernel@vger.kernel.org
Subject: [mm-unstable PATCH v4 7/9] mm, hwpoison: make __page_handle_poison returns int
Date: Mon, 4 Jul 2022 10:33:10 +0900
Message-Id: <20220704013312.2415700-8-naoya.horiguchi@linux.dev>
In-Reply-To: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
References: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
MIME-Version: 1.0

From: Naoya Horiguchi

__page_handle_poison() currently returns a bool that shows whether
take_page_off_buddy() has succeeded. But we will want to distinguish
another case, "dissolve has passed but taking off failed", by its
return value. So change the type of the return value to int.
No functional change.

Signed-off-by: Naoya Horiguchi
Reviewed-by: Miaohe Lin
---
v2 -> v3:
- move deleting "res = MF_FAILED" to the later patch. (by Miaohe)
---
 mm/memory-failure.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 4233b21328a5..c8939a39fbe6 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -71,7 +71,13 @@ atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);

 static bool hw_memory_failure __read_mostly = false;

-static bool __page_handle_poison(struct page *page)
+/*
+ * Return values:
+ *   1:   the page is dissolved (if needed) and taken off from buddy,
+ *   0:   the page is dissolved (if needed) and not taken off from buddy,
+ *   < 0: failed to dissolve.
+ */
+static int __page_handle_poison(struct page *page)
 {
 	int ret;

@@ -81,7 +87,7 @@ static bool __page_handle_poison(struct page *page)
 		ret = take_page_off_buddy(page);
 	zone_pcp_enable(page_zone(page));

-	return ret > 0;
+	return ret;
 }

 static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
@@ -91,7 +97,7 @@ static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, boo
 		 * Doing this check for free pages is also fine since dissolve_free_huge_page
 		 * returns 0 for non-hugetlb pages as well.
 		 */
-		if (!__page_handle_poison(page))
+		if (__page_handle_poison(page) <= 0)
 			/*
 			 * We could fail to take off the target page from buddy
 			 * for example due to racy page allocation, but that's
@@ -1086,7 +1092,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
 		 * subpages.
 		 */
 		put_page(hpage);
-		if (__page_handle_poison(p)) {
+		if (__page_handle_poison(p) > 0) {
 			page_ref_inc(p);
 			res = MF_RECOVERED;
 		}
@@ -1850,7 +1856,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 	if (res == 0) {
 		unlock_page(head);
 		res = MF_FAILED;
-		if (__page_handle_poison(p)) {
+		if (__page_handle_poison(p) > 0) {
 			page_ref_inc(p);
 			res = MF_RECOVERED;
 		}

From patchwork Mon Jul 4 01:33:11 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12904553
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin, Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi, linux-kernel@vger.kernel.org
Subject: [mm-unstable PATCH v4 8/9] mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage
Date: Mon, 4 Jul 2022 10:33:11 +0900
Message-Id: <20220704013312.2415700-9-naoya.horiguchi@linux.dev>
In-Reply-To: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
References: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
MIME-Version: 1.0

From: Naoya Horiguchi

Currently, if memory_failure() (modified to remove the blocking code by
a subsequent patch) is called on a page in a 1GB hugepage, memory error
handling fails and the raw error page is leaked.
The impact is small in production systems (just a single leaked 4kB
page), but this limits testability because unpoison doesn't work for
it. We can no longer create a 1GB hugepage on a 1GB physical address
range containing such leaked pages, which is inconvenient when testing
on small systems.

When a hwpoison page in a 1GB hugepage is handled, it's caught by the
PageHWPoison check in free_pages_prepare() because the 1GB hugepage is
broken down into raw error pages before coming to this point:

        if (unlikely(PageHWPoison(page)) && !order) {
                ...
                return false;
        }

Then, the page is not sent to buddy and the page refcount is left at 0.

Originally this check was supposed to work when the error page is freed
from page_handle_poison() (which is called from soft-offline), but now
we are opening another path to call it, so the callers of
__page_handle_poison() need to handle the case by considering the
return value 0 as success. Then the page refcount for hwpoison is
properly incremented, so unpoison works.

Signed-off-by: Naoya Horiguchi
Reviewed-by: Miaohe Lin
---
v2 -> v3:
- remove "res = MF_FAILED" in try_memory_failure_hugetlb (by Miaohe)
---
 mm/memory-failure.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index c8939a39fbe6..f095d55f40bc 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1084,7 +1084,6 @@ static int me_huge_page(struct page_state *ps, struct page *p)
 		res = truncate_error_page(hpage, page_to_pfn(p), mapping);
 		unlock_page(hpage);
 	} else {
-		res = MF_FAILED;
 		unlock_page(hpage);
 		/*
 		 * migration entry prevents later access on error hugepage,
@@ -1092,9 +1091,11 @@ static int me_huge_page(struct page_state *ps, struct page *p)
 		 * subpages.
 		 */
 		put_page(hpage);
-		if (__page_handle_poison(p) > 0) {
+		if (__page_handle_poison(p) >= 0) {
 			page_ref_inc(p);
 			res = MF_RECOVERED;
+		} else {
+			res = MF_FAILED;
 		}
 	}

@@ -1855,10 +1856,11 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 	 */
 	if (res == 0) {
 		unlock_page(head);
-		res = MF_FAILED;
-		if (__page_handle_poison(p) > 0) {
+		if (__page_handle_poison(p) >= 0) {
 			page_ref_inc(p);
 			res = MF_RECOVERED;
+		} else {
+			res = MF_FAILED;
 		}
 		action_result(pfn, MF_MSG_FREE_HUGE, res);
 		return res == MF_RECOVERED ? 0 : -EBUSY;

From patchwork Mon Jul 4 01:33:12 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12904554
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin, Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi, linux-kernel@vger.kernel.org
Subject: [mm-unstable PATCH v4 9/9] mm, hwpoison: enable memory error handling on 1GB hugepage
Date: Mon, 4 Jul 2022 10:33:12 +0900
Message-Id: <20220704013312.2415700-10-naoya.horiguchi@linux.dev>
In-Reply-To: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
References: <20220704013312.2415700-1-naoya.horiguchi@linux.dev>
MIME-Version: 1.0

From: Naoya Horiguchi

Now that the error handling code is prepared, remove the blocking code
and enable memory error handling on 1GB hugepages.

Signed-off-by: Naoya Horiguchi
Reviewed-by: Miaohe Lin
---
 include/linux/mm.h      |  1 -
 include/ras/ras_event.h |  1 -
 mm/memory-failure.c     | 16 ----------------
 3 files changed, 18 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 22f2dfe41c99..d084ce57c7a6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3288,7 +3288,6 @@ enum mf_action_page_type {
 	MF_MSG_DIFFERENT_COMPOUND,
 	MF_MSG_HUGE,
 	MF_MSG_FREE_HUGE,
-	MF_MSG_NON_PMD_HUGE,
 	MF_MSG_UNMAP_FAILED,
 	MF_MSG_DIRTY_SWAPCACHE,
 	MF_MSG_CLEAN_SWAPCACHE,
diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index d0337a41141c..cbd3ddd7c33d 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -360,7 +360,6 @@ TRACE_EVENT(aer_event,
 	EM ( MF_MSG_DIFFERENT_COMPOUND, "different compound page after locking" ) \
 	EM ( MF_MSG_HUGE, "huge page" ) \
 	EM ( MF_MSG_FREE_HUGE, "free huge page" ) \
-	EM ( MF_MSG_NON_PMD_HUGE, "non-pmd-sized huge page" ) \
 	EM ( MF_MSG_UNMAP_FAILED, "unmapping failed page" ) \
 	EM ( MF_MSG_DIRTY_SWAPCACHE, "dirty swapcache page" ) \
 	EM ( MF_MSG_CLEAN_SWAPCACHE, "clean swapcache page" ) \
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index f095d55f40bc..ba24b72b8764 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -765,7 +765,6 @@ static const char * const action_page_types[] = {
 	[MF_MSG_DIFFERENT_COMPOUND] = "different compound page after locking",
 	[MF_MSG_HUGE] = "huge page",
 	[MF_MSG_FREE_HUGE] = "free huge page",
-	[MF_MSG_NON_PMD_HUGE] = "non-pmd-sized huge page",
 	[MF_MSG_UNMAP_FAILED] = "unmapping failed page",
 	[MF_MSG_DIRTY_SWAPCACHE] = "dirty swapcache page",
 	[MF_MSG_CLEAN_SWAPCACHE] = "clean swapcache page",
@@ -1868,21 +1867,6 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 
 	page_flags = head->flags;
 
-	/*
-	 * TODO: hwpoison for pud-sized hugetlb doesn't work right now, so
-	 * simply disable it. In order to make it work properly, we need
-	 * make sure that:
-	 *  - conversion of a pud that maps an error hugetlb into hwpoison
-	 *    entry properly works, and
-	 *  - other mm code walking over page table is aware of pud-aligned
-	 *    hwpoison entries.
-	 */
-	if (huge_page_size(page_hstate(head)) > PMD_SIZE) {
-		action_result(pfn, MF_MSG_NON_PMD_HUGE, MF_IGNORED);
-		res = -EBUSY;
-		goto out;
-	}
-
 	if (!hwpoison_user_mappings(p, pfn, flags, head)) {
 		action_result(pfn, MF_MSG_UNMAP_FAILED, MF_IGNORED);
 		res = -EBUSY;
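
For readers following the last hunk, the effect of the removed size check can be sketched as a small userspace model. This is only an illustration and not kernel code: the names SKETCH_PMD_SIZE, SKETCH_PUD_SIZE, blocked_before_patch() and blocked_after_patch() are hypothetical, and the sizes assume x86-64 with 4 KiB base pages (2 MiB PMD, 1 GiB PUD); other architectures differ.

```c
#include <stdbool.h>

/* Assumed x86-64 sizes with 4 KiB base pages; not taken from the patch. */
#define SKETCH_PMD_SIZE (2UL << 20)	/* 2 MiB */
#define SKETCH_PUD_SIZE (1UL << 30)	/* 1 GiB */

/*
 * Hypothetical model of the check deleted by this patch: any hugepage
 * larger than PMD size was ignored and the handler returned -EBUSY.
 */
static bool blocked_before_patch(unsigned long hugepage_size)
{
	return hugepage_size > SKETCH_PMD_SIZE;
}

/*
 * Hypothetical model of the behaviour after the patch: the hugepage size
 * alone no longer short-circuits handling, so the flow continues to the
 * unmap step regardless of size.
 */
static bool blocked_after_patch(unsigned long hugepage_size)
{
	(void)hugepage_size;
	return false;
}
```

Under these assumptions, blocked_before_patch(SKETCH_PUD_SIZE) is true while blocked_after_patch(SKETCH_PUD_SIZE) is false, and 2 MiB hugepages are unaffected in both cases.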