From patchwork Wed Apr 20 21:00:09 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Zhuo, Qiuxu"
X-Patchwork-Id: 12820244
From: Qiuxu Zhuo
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Tony Luck, Dave Hansen,
    Andy Lutomirski, Peter Zijlstra, Andrew Morton, Naoya Horiguchi
Cc: Qiuxu Zhuo,
    x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)),
    linux-mm@kvack.org (open list:HWPOISON MEMORY FAILURE HANDLING),
    linux-kernel@vger.kernel.org (open list:X86 ARCHITECTURE (32-BIT AND 64-BIT))
Subject: [PATCH 1/1] x86/mm: Forbid the zero page once it has uncorrectable errors
Date: Wed, 20 Apr 2022 17:00:09 -0400
Message-Id: <20220420210009.65666-1-qiuxu.zhuo@intel.com>
X-Mailer: git-send-email 2.17.1

Accessing the zero page when it has uncorrectable errors causes unexpected
machine checks. So forbid user-space processes from using the zero page once
it has uncorrectable errors. Processes that have already mapped the zero page
with uncorrectable errors will be killed once they access it. New processes
will not use the zero page.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
1) Processes that have already mapped the zero page with uncorrectable
   errors could instead be recovered by attaching a new zeroed anonymous
   page. But that would require walking the page tables of all such
   processes to update the PTEs that point to the zero page, which looks
   like a big modification for a rare problem.

2) Validation tests that happen to pick a virtual address mapped to the
   zero page for error injection get themselves killed and cannot run
   again until the system is rebooted. To avoid injecting errors into the
   zero page, please refer to the patch:
   https://lore.kernel.org/all/20220419211921.2230752-1-tony.luck@intel.com/

 arch/x86/include/asm/pgtable.h | 3 +++
 arch/x86/kernel/cpu/mce/core.c | 6 ++++++
 arch/x86/mm/pgtable.c          | 2 ++
 mm/memory-failure.c            | 2 +-
 4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 62ab07e24aef..d4b8693452e5 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -55,6 +55,9 @@
 extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
 	__visible;
 #define ZERO_PAGE(vaddr) ((void)(vaddr),virt_to_page(empty_zero_page))
 
+extern bool __read_mostly forbids_zeropage;
+#define mm_forbids_zeropage(x)	forbids_zeropage
+
 extern spinlock_t pgd_lock;
 extern struct list_head pgd_list;
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 981496e6bc0e..5b3af27cc8fa 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -44,6 +44,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
@@ -1370,6 +1371,11 @@ static void queue_task_work(struct mce *m, char *msg, void (*func)(struct callback_head *))
 	if (count > 1)
 		return;
 
+	if (is_zero_pfn(current->mce_addr >> PAGE_SHIFT) && !forbids_zeropage) {
+		pr_err("Forbid user-space process from using zero page\n");
+		forbids_zeropage = true;
+	}
+
 	task_work_add(current, &current->mce_kill_me, TWA_RESUME);
 }
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 3481b35cb4ec..c0c56bce3acc 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -28,6 +28,8 @@ void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
 
 gfp_t __userpte_alloc_gfp = GFP_PGTABLE_USER | PGTABLE_HIGHMEM;
 
+bool __read_mostly forbids_zeropage;
+
 pgtable_t pte_alloc_one(struct mm_struct *mm)
 {
 	return __pte_alloc_one(mm, __userpte_alloc_gfp);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index dcb6bb9cf731..30ad7bdeb89f 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1744,7 +1744,7 @@ int memory_failure(unsigned long pfn, int flags)
 		goto unlock_mutex;
 	}
 
-	if (TestSetPageHWPoison(p)) {
+	if (TestSetPageHWPoison(p) || is_zero_pfn(pfn)) {
 		pr_err("Memory failure: %#lx: already hardware poisoned\n",
 			pfn);
 		res = -EHWPOISON;
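
For background on how user space ends up mapped to the zero page at all (the
situation the commit message and note 2 describe): a read-only fault on an
untouched anonymous mapping is normally resolved to the single shared zero
page, and the generic anonymous-fault path consults mm_forbids_zeropage()
(the hook this patch wires up for x86) before inserting it. The stand-alone
user-space sketch below is illustrative only and not part of the patch; the
file name and helper are hypothetical. It maps two anonymous pages, reads
them, and compares the backing PFNs via /proc/self/pagemap (PFNs are only
visible with CAP_SYS_ADMIN on recent kernels; otherwise they read back as 0).

/* zeropage-demo.c - hypothetical illustration, not part of this patch.
 *
 * Two untouched MAP_ANONYMOUS mappings that are only ever read are both
 * expected to be backed by the kernel's single shared zero page, so the
 * PFNs reported by /proc/self/pagemap should be identical.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

static uint64_t pfn_of(void *addr)
{
	long psz = sysconf(_SC_PAGESIZE);
	uint64_t entry = 0;
	int fd = open("/proc/self/pagemap", O_RDONLY);

	if (fd < 0)
		return 0;
	/* One 64-bit pagemap entry per virtual page; bits 0-54 hold the PFN. */
	pread(fd, &entry, sizeof(entry),
	      (off_t)((uintptr_t)addr / psz) * sizeof(entry));
	close(fd);
	return entry & ((1ULL << 55) - 1);
}

int main(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	volatile char sink;

	char *a = mmap(NULL, psz, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	char *b = mmap(NULL, psz, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (a == MAP_FAILED || b == MAP_FAILED)
		return 1;

	/* Read-only faults: both mappings should resolve to the zero page. */
	sink = a[0];
	sink = b[0];
	(void)sink;

	printf("pfn(a)=%#llx pfn(b)=%#llx%s\n",
	       (unsigned long long)pfn_of(a), (unsigned long long)pfn_of(b),
	       pfn_of(a) == pfn_of(b) ? " (same page)" : "");
	return 0;
}

With this patch applied and forbids_zeropage set after a machine check on the
zero page, new read faults would presumably be backed by freshly zeroed
anonymous pages instead, so the two PFNs above would no longer match.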