From patchwork Mon Oct 4 11:50:01 2021
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12533749
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, Alistair Popple, Peter Xu, Mike Kravetz, Konstantin Khlebnikov, Bin Wang, Naoya Horiguchi, linux-kernel@vger.kernel.org
Subject: [PATCH v1] mm, pagemap: expose hwpoison entry
Date: Mon, 4 Oct 2021 20:50:01 +0900
Message-Id: <20211004115001.1544259-1-naoya.horiguchi@linux.dev>

From: Naoya Horiguchi

A hwpoison entry is a non-present page table entry used to report
memory error events to userspace. An easy way to find out which
processes have hwpoison entries would help user processes take proper
action, but no such interface exists today. So make the pagemap
interface expose hwpoison entries to userspace.

Hwpoison entries for hugepages are also exposed by this patch.
The example below shows what pagemap reports when a memory error hits
a hugepage mapped into a process.

    $ ./page-types --no-summary --pid $PID --raw --list --addr 0x700000000+0x400
    voffset         offset  len     flags
    700000000       12fa00  1       ___U_______Ma__H_G_________________f_______1
    700000001       12fa01  1ff     ___________Ma___TG_________________f_______1
    700000200       12f800  1       __________B________X_______________f______w_
    700000201       12f801  1       ___________________X_______________f______w_   // memory failure hit this page
    700000202       12f802  1fe     __________B________X_______________f______w_

Entries with both the "X" flag (hwpoison flag) and the "w" flag (swap
flag) are considered hwpoison entries, so all pages in the 2MB range
are inaccessible from the process. The actual error location can be
found by running page-types in physical address mode.

    $ ./page-types --no-summary --addr 0x12f800+0x200 --raw --list
    offset  len     flags
    12f800  1       __________B_________________________________
    12f801  1       ___________________X________________________
    12f802  1fe     __________B_________________________________

Signed-off-by: Naoya Horiguchi
Reported-by: kernel test robot
---
 fs/proc/task_mmu.c      | 41 ++++++++++++++++++++++++++++++++---------
 include/linux/swapops.h | 13 +++++++++++++
 tools/vm/page-types.c   |  7 ++++++-
 3 files changed, 51 insertions(+), 10 deletions(-)

diff --git v5.15-rc3/fs/proc/task_mmu.c v5.15-rc3_patched/fs/proc/task_mmu.c
index cf25be3e0321..bfc4772a58fb 100644
--- v5.15-rc3/fs/proc/task_mmu.c
+++ v5.15-rc3_patched/fs/proc/task_mmu.c
@@ -1298,6 +1298,7 @@ struct pagemapread {
 #define PM_SOFT_DIRTY		BIT_ULL(55)
 #define PM_MMAP_EXCLUSIVE	BIT_ULL(56)
 #define PM_UFFD_WP		BIT_ULL(57)
+#define PM_HWPOISON		BIT_ULL(60)
 #define PM_FILE			BIT_ULL(61)
 #define PM_SWAP			BIT_ULL(62)
 #define PM_PRESENT		BIT_ULL(63)
@@ -1386,6 +1387,10 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
 		flags |= PM_SWAP;
 		if (is_pfn_swap_entry(entry))
 			page = pfn_swap_entry_to_page(entry);
+		if (is_hwpoison_entry(entry)) {
+			page = hwpoison_entry_to_page(entry);
+			flags |= PM_HWPOISON;
+		}
 	}
 
 	if (page && !PageAnon(page))
@@ -1505,34 +1510,52 @@ static int pagemap_hugetlb_range(pte_t *ptep, unsigned long hmask,
 	u64 flags = 0, frame = 0;
 	int err = 0;
 	pte_t pte;
+	struct page *page = NULL;
 
 	if (vma->vm_flags & VM_SOFTDIRTY)
 		flags |= PM_SOFT_DIRTY;
 
 	pte = huge_ptep_get(ptep);
 	if (pte_present(pte)) {
-		struct page *page = pte_page(pte);
-
-		if (!PageAnon(page))
-			flags |= PM_FILE;
-
-		if (page_mapcount(page) == 1)
-			flags |= PM_MMAP_EXCLUSIVE;
+		page = pte_page(pte);
 
 		flags |= PM_PRESENT;
 		if (pm->show_pfn)
 			frame = pte_pfn(pte) +
 				((addr & ~hmask) >> PAGE_SHIFT);
+	} else if (is_swap_pte(pte)) {
+		swp_entry_t entry = pte_to_swp_entry(pte);
+		unsigned long offset;
+
+		if (pm->show_pfn) {
+			offset = swp_offset(entry) +
+				((addr & ~hmask) >> PAGE_SHIFT);
+			frame = swp_type(entry) |
+				(offset << MAX_SWAPFILES_SHIFT);
+		}
+		flags |= PM_SWAP;
+		if (is_migration_entry(entry))
+			page = compound_head(pfn_swap_entry_to_page(entry));
+		if (is_hwpoison_entry(entry))
+			flags |= PM_HWPOISON;
 	}
 
+	if (page && !PageAnon(page))
+		flags |= PM_FILE;
+	if (page && page_mapcount(page) == 1)
+		flags |= PM_MMAP_EXCLUSIVE;
+
 	for (; addr != end; addr += PAGE_SIZE) {
 		pagemap_entry_t pme = make_pme(frame, flags);
 
 		err = add_to_pagemap(addr, &pme, pm);
 		if (err)
 			return err;
-		if (pm->show_pfn && (flags & PM_PRESENT))
-			frame++;
+		if (pm->show_pfn)
+			if (flags & PM_PRESENT)
+				frame++;
+			else if (flags & PM_SWAP)
+				frame += (1 << MAX_SWAPFILES_SHIFT);
 	}
 
 	cond_resched();
diff --git v5.15-rc3/include/linux/swapops.h v5.15-rc3_patched/include/linux/swapops.h
index d356ab4047f7..bb6141e5c069 100644
--- v5.15-rc3/include/linux/swapops.h
+++ v5.15-rc3_patched/include/linux/swapops.h
@@ -360,6 +360,14 @@ static inline unsigned long hwpoison_entry_to_pfn(swp_entry_t entry)
 	return swp_offset(entry);
 }
 
+static inline struct page *hwpoison_entry_to_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset(entry));
+
+	WARN_ON(!PageHWPoison(p));
+	return p;
+}
+
 static inline void num_poisoned_pages_inc(void)
 {
 	atomic_long_inc(&num_poisoned_pages);
@@ -382,6 +390,11 @@ static inline int is_hwpoison_entry(swp_entry_t swp)
 	return 0;
 }
 
+static inline struct page *hwpoison_entry_to_page(swp_entry_t entry)
+{
+	return NULL;
+}
+
 static inline void num_poisoned_pages_inc(void)
 {
 }
diff --git v5.15-rc3/tools/vm/page-types.c v5.15-rc3_patched/tools/vm/page-types.c
index b1ed76d9a979..483e417fda41 100644
--- v5.15-rc3/tools/vm/page-types.c
+++ v5.15-rc3_patched/tools/vm/page-types.c
@@ -53,6 +53,7 @@
 #define PM_SWAP_OFFSET(x)	(((x) & PM_PFRAME_MASK) >> MAX_SWAPFILES_SHIFT)
 #define PM_SOFT_DIRTY		(1ULL << 55)
 #define PM_MMAP_EXCLUSIVE	(1ULL << 56)
+#define PM_HWPOISON		(1ULL << 60)
 #define PM_FILE			(1ULL << 61)
 #define PM_SWAP			(1ULL << 62)
 #define PM_PRESENT		(1ULL << 63)
@@ -311,6 +312,8 @@ static unsigned long pagemap_pfn(uint64_t val)
 
 	if (val & PM_PRESENT)
 		pfn = PM_PFRAME(val);
+	else if (val & PM_SWAP)
+		pfn = PM_SWAP_OFFSET(val);
 	else
 		pfn = 0;
 
@@ -492,6 +495,8 @@ static uint64_t expand_overloaded_flags(uint64_t flags, uint64_t pme)
 		flags |= BIT(FILE);
 	if (pme & PM_SWAP)
 		flags |= BIT(SWAP);
+	if (pme & PM_HWPOISON)
+		flags |= BIT(HWPOISON);
 	if (pme & PM_MMAP_EXCLUSIVE)
 		flags |= BIT(MMAP_EXCLUSIVE);
 
@@ -742,7 +747,7 @@ static void walk_vma(unsigned long index, unsigned long count)
 			pfn = pagemap_pfn(buf[i]);
 			if (pfn)
 				walk_pfn(index + i, pfn, 1, buf[i]);
-			if (buf[i] & PM_SWAP)
+			else if (buf[i] & PM_SWAP)
 				walk_swap(index + i, buf[i]);
 		}
 
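
For readers who want to consume the new bit from their own tooling rather than through page-types, the following is a minimal userspace sketch of how a 64-bit pagemap entry could be decoded under the layout this patch proposes. It is not part of the patch. The helper names (pme_is_hwpoison_entry, pme_swap_type, pme_swap_offset) are hypothetical, PM_HWPOISON at bit 60 exists only with this patch applied, and the MAX_SWAPFILES_SHIFT and PM_PFRAME_BITS values are assumed to match the ones tools/vm/page-types.c uses.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit layout as defined by the patch. PM_HWPOISON (bit 60) is the bit
 * this patch proposes; it is an assumption on kernels without the patch.
 */
#define PM_PFRAME_BITS		55
#define PM_PFRAME_MASK		((UINT64_C(1) << PM_PFRAME_BITS) - 1)
#define MAX_SWAPFILES_SHIFT	5	/* assumed, as in tools/vm/page-types.c */
#define PM_HWPOISON		(UINT64_C(1) << 60)
#define PM_SWAP			(UINT64_C(1) << 62)
#define PM_PRESENT		(UINT64_C(1) << 63)

/* Hypothetical helper: non-present entry with both swap and hwpoison bits. */
static bool pme_is_hwpoison_entry(uint64_t pme)
{
	return !(pme & PM_PRESENT) && (pme & PM_SWAP) && (pme & PM_HWPOISON);
}

/* Hypothetical helper: low bits of the frame field hold the swap type. */
static unsigned int pme_swap_type(uint64_t pme)
{
	return (unsigned int)(pme & ((UINT64_C(1) << MAX_SWAPFILES_SHIFT) - 1));
}

/*
 * Hypothetical helper: remaining frame bits hold the swap offset, which
 * for a hwpoison entry is the PFN of the poisoned page.
 */
static uint64_t pme_swap_offset(uint64_t pme)
{
	return (pme & PM_PFRAME_MASK) >> MAX_SWAPFILES_SHIFT;
}
```

A caller would read one 8-byte entry per virtual page from /proc/PID/pagemap and pass it to pme_is_hwpoison_entry(). For the hugepage example in the commit message, the entry at voffset 700000201 would decode to offset 0x12f801. Note that the patch fills the frame field only when pm->show_pfn is set, so a reader without that privilege should expect the offset to be 0 while the flag bits remain usable.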