From patchwork Fri May 14 17:22:42 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12258427
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Andrew Morton, "Michael S. Tsirkin", Jason Wang,
 Alexey Dobriyan, Mike Rapoport, "Matthew Wilcox (Oracle)",
 Oscar Salvador, Michal Hocko, Roman Gushchin, Alex Shi, Steven Price,
 Mike Kravetz, Aili Yao, Jiri Bohac, "K. Y. Srinivasan", Haiyang Zhang,
 Stephen Hemminger, Wei Liu, Naoya Horiguchi,
 linux-hyperv@vger.kernel.org,
 virtualization@lists.linux-foundation.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Mike Rapoport
Subject: [PATCH v2 1/6] fs/proc/kcore: drop KCORE_REMAP and KCORE_OTHER
Date: Fri, 14 May 2021 19:22:42 +0200
Message-Id: <20210514172247.176750-2-david@redhat.com>
In-Reply-To: <20210514172247.176750-1-david@redhat.com>
References: <20210514172247.176750-1-david@redhat.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

Commit db779ef67ffe ("proc/kcore: Remove unused kclist_add_remap()")
removed the last user of KCORE_REMAP.

Commit 595dd46ebfc1 ("vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when
dumping vsyscall user page") removed the last user of KCORE_OTHER.

Let's drop both types. While at it, also drop vaddr in "struct
kcore_list", used by KCORE_REMAP only.

Reviewed-by: Mike Rapoport
Signed-off-by: David Hildenbrand
---
 fs/proc/kcore.c       | 7 ++-----
 include/linux/kcore.h | 3 ---
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 4d2e64e9016c..09f77d3c6e15 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -380,11 +380,8 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 			phdr->p_type = PT_LOAD;
 			phdr->p_flags = PF_R | PF_W | PF_X;
 			phdr->p_offset = kc_vaddr_to_offset(m->addr) + data_offset;
-			if (m->type == KCORE_REMAP)
-				phdr->p_vaddr = (size_t)m->vaddr;
-			else
-				phdr->p_vaddr = (size_t)m->addr;
-			if (m->type == KCORE_RAM || m->type == KCORE_REMAP)
+			phdr->p_vaddr = (size_t)m->addr;
+			if (m->type == KCORE_RAM)
 				phdr->p_paddr = __pa(m->addr);
 			else if (m->type == KCORE_TEXT)
 				phdr->p_paddr = __pa_symbol(m->addr);
diff --git a/include/linux/kcore.h b/include/linux/kcore.h
index da676cdbd727..86c0f1d18998 100644
--- a/include/linux/kcore.h
+++ b/include/linux/kcore.h
@@ -11,14 +11,11 @@ enum kcore_type {
 	KCORE_RAM,
 	KCORE_VMEMMAP,
 	KCORE_USER,
-	KCORE_OTHER,
-	KCORE_REMAP,
 };
 
 struct kcore_list {
 	struct list_head list;
 	unsigned long addr;
-	unsigned long vaddr;
 	size_t size;
 	int type;
 };

From patchwork Fri May 14 17:22:43 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12258429
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Subject: [PATCH v2 2/6] fs/proc/kcore: pfn_is_ram check only applies to
 KCORE_RAM
Date: Fri, 14 May 2021 19:22:43 +0200
Message-Id: <20210514172247.176750-3-david@redhat.com>
In-Reply-To: <20210514172247.176750-1-david@redhat.com>
References: <20210514172247.176750-1-david@redhat.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

Let's restructure the code, using switch-case, and checking pfn_is_ram()
only when we are dealing with KCORE_RAM.

Reviewed-by: Mike Rapoport
Signed-off-by: David Hildenbrand
---
 fs/proc/kcore.c | 35 +++++++++++++++++++++++++++--------
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 09f77d3c6e15..ed6fbb3bd50c 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -483,25 +483,36 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 				goto out;
 			}
 			m = NULL;	/* skip the list anchor */
-		} else if (!pfn_is_ram(__pa(start) >> PAGE_SHIFT)) {
-			if (clear_user(buffer, tsz)) {
-				ret = -EFAULT;
-				goto out;
-			}
-		} else if (m->type == KCORE_VMALLOC) {
+			goto skip;
+		}
+
+		switch (m->type) {
+		case KCORE_VMALLOC:
 			vread(buf, (char *)start, tsz);
 			/* we have to zero-fill user buffer even if no read */
 			if (copy_to_user(buffer, buf, tsz)) {
 				ret = -EFAULT;
 				goto out;
 			}
-		} else if (m->type == KCORE_USER) {
+			break;
+		case KCORE_USER:
 			/* User page is handled prior to normal kernel page: */
 			if (copy_to_user(buffer, (char *)start, tsz)) {
 				ret = -EFAULT;
 				goto out;
 			}
-		} else {
+			break;
+		case KCORE_RAM:
+			if (!pfn_is_ram(__pa(start) >> PAGE_SHIFT)) {
+				if (clear_user(buffer, tsz)) {
+					ret = -EFAULT;
+					goto out;
+				}
+				break;
+			}
+			fallthrough;
+		case KCORE_VMEMMAP:
+		case KCORE_TEXT:
 			if (kern_addr_valid(start)) {
 				/*
 				 * Using bounce buffer to bypass the
@@ -525,7 +536,15 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 					goto out;
 				}
 			}
+			break;
+		default:
+			pr_warn_once("Unhandled KCORE type: %d\n", m->type);
+			if (clear_user(buffer, tsz)) {
+				ret = -EFAULT;
+				goto out;
+			}
 		}
+skip:
 		buflen -= tsz;
 		*fpos += tsz;
 		buffer += tsz;

From patchwork Fri May 14 17:22:44 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12258431
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Subject: [PATCH v2 3/6] fs/proc/kcore: don't read offline sections,
 logically offline pages and hwpoisoned pages
Date: Fri, 14 May 2021 19:22:44 +0200
Message-Id: <20210514172247.176750-4-david@redhat.com>
In-Reply-To: <20210514172247.176750-1-david@redhat.com>
References: <20210514172247.176750-1-david@redhat.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

Let's avoid reading:

1) Offline memory sections: the content of offline memory sections is
   stale as the memory is effectively unused by the kernel. On s390x with
   standby memory, offline memory sections (belonging to offline storage
   increments) are not accessible. With virtio-mem and the hyper-v
   balloon, we can have unavailable memory chunks that should not be
   accessed inside offline memory sections. Last but not least, offline
   memory sections might contain hwpoisoned pages which we can no longer
   identify because the memmap is stale.

2) PG_offline pages: logically offline pages that are documented as
   "The content of these pages is effectively stale. Such pages should
   not be touched (read/write/dump/save) except by their owner.".
   Examples include pages inflated in a balloon or unavailable memory
   ranges inside hotplugged memory sections with virtio-mem or the
   hyper-v balloon.

3) PG_hwpoison pages: Reading pages marked as hwpoisoned can be fatal.
   As documented: "Accessing is not safe since it may cause another
   machine check. Don't touch!"

Introduce is_page_hwpoison(), adding a comment that it is inherently racy
but the best we can really do.

Reading /proc/kcore now performs similar checks as when reading
/proc/vmcore for kdump via makedumpfile: problematic pages are excluded.
It's also similar to hibernation code; however, we don't skip hwpoisoned
pages when processing pages in kernel/power/snapshot.c:saveable_page()
yet.

Note 1: we can race against memory offlining code, especially memory
going offline and getting unplugged: however, we will properly tear down
the identity mapping and handle faults gracefully when accessing this
memory from kcore code.

Note 2: we can race against drivers setting PageOffline() and turning
memory inaccessible in the hypervisor. We'll handle this in a follow-up
patch.

Reviewed-by: Mike Rapoport
Signed-off-by: David Hildenbrand
Reviewed-by: Oscar Salvador
---
 fs/proc/kcore.c            | 14 +++++++++++++-
 include/linux/page-flags.h | 12 ++++++++++++
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index ed6fbb3bd50c..92ff1e4436cb 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -465,6 +465,9 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 
 	m = NULL;
 	while (buflen) {
+		struct page *page;
+		unsigned long pfn;
+
 		/*
 		 * If this is the first iteration or the address is not within
 		 * the previous entry, search for a matching entry.
@@ -503,7 +506,16 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 			}
 			break;
 		case KCORE_RAM:
-			if (!pfn_is_ram(__pa(start) >> PAGE_SHIFT)) {
+			pfn = __pa(start) >> PAGE_SHIFT;
+			page = pfn_to_online_page(pfn);
+
+			/*
+			 * Don't read offline sections, logically offline pages
+			 * (e.g., inflated in a balloon), hwpoisoned pages,
+			 * and explicitly excluded physical ranges.
+			 */
+			if (!page || PageOffline(page) ||
+			    is_page_hwpoison(page) || !pfn_is_ram(pfn)) {
 				if (clear_user(buffer, tsz)) {
 					ret = -EFAULT;
 					goto out;
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 04a34c08e0a6..daed82744f4b 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -694,6 +694,18 @@ PAGEFLAG_FALSE(DoubleMap)
 	TESTSCFLAG_FALSE(DoubleMap)
 #endif
 
+/*
+ * Check if a page is currently marked HWPoisoned. Note that this check is
+ * best effort only and inherently racy: there is no way to synchronize with
+ * failing hardware.
+ */
+static inline bool is_page_hwpoison(struct page *page)
+{
+	if (PageHWPoison(page))
+		return true;
+	return PageHuge(page) && PageHWPoison(compound_head(page));
+}
+
 /*
  * For pages that are never mapped to userspace (and aren't PageSlab),
  * page_type may be used. Because it is initialised to -1, we invert the

From patchwork Fri May 14 17:22:45 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12258433
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Subject: [PATCH v2 4/6] mm: introduce
 page_offline_(begin|end|freeze|thaw) to synchronize setting PageOffline()
Date: Fri, 14 May 2021 19:22:45 +0200
Message-Id: <20210514172247.176750-5-david@redhat.com>
In-Reply-To: <20210514172247.176750-1-david@redhat.com>
References: <20210514172247.176750-1-david@redhat.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

A driver might set a page logically offline -- PageOffline() -- and
turn the page inaccessible in the hypervisor; after that, access to page
content can be fatal. One example is virtio-mem; while unplugged memory
-- marked as PageOffline() -- can currently be read in the hypervisor,
this will no longer be the case in the future; for example, when having
a virtio-mem device backed by huge pages in the hypervisor.

Some special PFN walkers -- i.e., /proc/kcore -- read content of random
pages after checking PageOffline(); however, these PFN walkers can race
with drivers that set PageOffline().

Let's introduce page_offline_(begin|end|freeze|thaw) for synchronizing.

page_offline_freeze()/page_offline_thaw() allows for a subsystem to
synchronize with such drivers, achieving that a page cannot be set
PageOffline() while frozen.

page_offline_begin()/page_offline_end() is used by drivers that care
about such races when setting a page PageOffline().

For simplicity, use a rwsem for now; neither drivers nor users are
performance sensitive.

Acked-by: Michal Hocko
Signed-off-by: David Hildenbrand
Reviewed-by: Mike Rapoport
Reviewed-by: Oscar Salvador
---
 include/linux/page-flags.h | 10 ++++++++++
 mm/util.c                  | 40 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index daed82744f4b..ea2df9a247b3 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -769,9 +769,19 @@ PAGE_TYPE_OPS(Buddy, buddy)
  * relies on this feature is aware that re-onlining the memory block will
  * require to re-set the pages PageOffline() and not giving them to the
  * buddy via online_page_callback_t.
+ *
+ * There are drivers that mark a page PageOffline() and do not expect any
+ * further access to page content. PFN walkers that read content of random
+ * pages should check PageOffline() and synchronize with such drivers using
+ * page_offline_freeze()/page_offline_thaw().
  */
 PAGE_TYPE_OPS(Offline, offline)
 
+extern void page_offline_freeze(void);
+extern void page_offline_thaw(void);
+extern void page_offline_begin(void);
+extern void page_offline_end(void);
+
 /*
  * Marks pages in use as page tables.
  */
diff --git a/mm/util.c b/mm/util.c
index a8bf17f18a81..a034525e7ba2 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1010,3 +1010,43 @@ void mem_dump_obj(void *object)
 }
 EXPORT_SYMBOL_GPL(mem_dump_obj);
 #endif
+
+/*
+ * A driver might set a page logically offline -- PageOffline() -- and
+ * turn the page inaccessible in the hypervisor; after that, access to page
+ * content can be fatal.
+ *
+ * Some special PFN walkers -- i.e., /proc/kcore -- read content of random
+ * pages after checking PageOffline(); however, these PFN walkers can race
+ * with drivers that set PageOffline().
+ *
+ * page_offline_freeze()/page_offline_thaw() allows for a subsystem to
+ * synchronize with such drivers, achieving that a page cannot be set
+ * PageOffline() while frozen.
+ *
+ * page_offline_begin()/page_offline_end() is used by drivers that care about
+ * such races when setting a page PageOffline().
+ */
+static DECLARE_RWSEM(page_offline_rwsem);
+
+void page_offline_freeze(void)
+{
+	down_read(&page_offline_rwsem);
+}
+
+void page_offline_thaw(void)
+{
+	up_read(&page_offline_rwsem);
+}
+
+void page_offline_begin(void)
+{
+	down_write(&page_offline_rwsem);
+}
+EXPORT_SYMBOL(page_offline_begin);
+
+void page_offline_end(void)
+{
+	up_write(&page_offline_rwsem);
+}
+EXPORT_SYMBOL(page_offline_end);

From patchwork Fri May 14 17:22:46 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12258435
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Subject: [PATCH v2 5/6] virtio-mem: use page_offline_(begin|end) when
 setting PageOffline()
Date: Fri, 14 May 2021 19:22:46 +0200
Message-Id: <20210514172247.176750-6-david@redhat.com>
In-Reply-To: <20210514172247.176750-1-david@redhat.com>
References: <20210514172247.176750-1-david@redhat.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

Let's properly use page_offline_(begin|end) to synchronize setting
PageOffline(), so that /proc/kcore can no longer access unplugged memory
regions.

Existing balloon implementations usually allow reading inflated memory;
doing so might result in unnecessary overhead in the hypervisor, which
is currently the case with virtio-mem.

For future virtio-mem use cases, it will be different when using shmem,
huge pages, !anonymous private mappings, ... as backing storage for a VM.
virtio-mem unplugged memory must no longer be accessed and access might
result in undefined behavior. There will be a virtio spec extension to
document this change, including a new feature flag indicating the
changed behavior. We really don't want to race against PFN walkers
reading random page content.

Acked-by: Michael S. Tsirkin
Signed-off-by: David Hildenbrand
Acked-by: Mike Rapoport
Reviewed-by: Oscar Salvador
---
 drivers/virtio/virtio_mem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 10ec60d81e84..dc2a2e2b2ff8 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -1065,6 +1065,7 @@ static int virtio_mem_memory_notifier_cb(struct notifier_block *nb,
 static void virtio_mem_set_fake_offline(unsigned long pfn,
 					unsigned long nr_pages, bool onlined)
 {
+	page_offline_begin();
 	for (; nr_pages--; pfn++) {
 		struct page *page = pfn_to_page(pfn);
 
@@ -1075,6 +1076,7 @@ static void virtio_mem_set_fake_offline(unsigned long pfn,
 			ClearPageReserved(page);
 		}
 	}
+	page_offline_end();
 }
 
 /*

From patchwork Fri May 14 17:22:47 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12258437
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Subject: [PATCH v2 6/6] fs/proc/kcore: use page_offline_(freeze|thaw)
Date: Fri, 14 May 2021 19:22:47 +0200
Message-Id: <20210514172247.176750-7-david@redhat.com>
In-Reply-To: <20210514172247.176750-1-david@redhat.com>
References: <20210514172247.176750-1-david@redhat.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

Let's properly synchronize with drivers that set PageOffline(). Thaw and
re-freeze every now and then, so drivers that want to set PageOffline()
can make progress.

Signed-off-by: David Hildenbrand
Acked-by: Mike Rapoport
Reviewed-by: Oscar Salvador
---
 fs/proc/kcore.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 92ff1e4436cb..982e694aae77 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -313,6 +313,7 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 {
 	char *buf = file->private_data;
 	size_t phdrs_offset, notes_offset, data_offset;
+	size_t page_offline_frozen = 1;
 	size_t phdrs_len, notes_len;
 	struct kcore_list *m;
 	size_t tsz;
@@ -322,6 +323,11 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 	int ret = 0;
 
 	down_read(&kclist_lock);
+	/*
+	 * Don't race against drivers that set PageOffline() and expect no
+	 * further page access.
+	 */
+	page_offline_freeze();
 
 	get_kcore_size(&nphdr, &phdrs_len, &notes_len, &data_offset);
 	phdrs_offset = sizeof(struct elfhdr);
@@ -480,6 +486,12 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 			}
 		}
 
+		if (page_offline_frozen++ % MAX_ORDER_NR_PAGES == 0) {
+			page_offline_thaw();
+			cond_resched();
+			page_offline_freeze();
+		}
+
 		if (&m->list == &kclist_head) {
 			if (clear_user(buffer, tsz)) {
 				ret = -EFAULT;
@@ -565,6 +577,7 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 	}
 
 out:
+	page_offline_thaw();
 	up_read(&kclist_lock);
 	if (ret)
 		return ret;