From patchwork Thu Sep 2 16:09:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12472023 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1A21C8303B for ; Thu, 2 Sep 2021 16:25:57 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B347B61166 for ; Thu, 2 Sep 2021 16:09:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B347B61166 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 4571B6B0071; Thu, 2 Sep 2021 12:09:31 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 405816B0072; Thu, 2 Sep 2021 12:09:31 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2CCE56B0073; Thu, 2 Sep 2021 12:09:31 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0168.hostedemail.com [216.40.44.168]) by kanga.kvack.org (Postfix) with ESMTP id 1A6BD6B0071 for ; Thu, 2 Sep 2021 12:09:31 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id CBAA8183459EC for ; Thu, 2 Sep 2021 16:09:30 +0000 (UTC) X-FDA: 78543118500.18.D860D0E Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf19.hostedemail.com (Postfix) with ESMTP id 7D196B0000AE for ; Thu, 2 Sep 2021 16:09:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1630598969; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=MaetS8LnMuKprqnjrvobw+SaaiElk101NGu5Z0Rv9V0=; b=DNSWm3HNUP4uxOI4PwzR+KPUFrrWUByFRvSwIgnIERjzURq/WwBrFkffelU3X2NKf70xzJ Am46MrUjowx/YxxPqUZ67eyqAXtDQpogycTdRl1BUGfDFXmEYCjawHgoXTpl0En4xtIzx5 FYY4clVmVTtU29QsSsPz5dAtgIBsre8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-143-z6gW9uNbPEOUCfPY9NX9Vw-1; Thu, 02 Sep 2021 12:09:28 -0400 X-MC-Unique: z6gW9uNbPEOUCfPY9NX9Vw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id F29BF801ADA; Thu, 2 Sep 2021 16:09:26 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.193.198]) by smtp.corp.redhat.com (Postfix) with ESMTP id DB9ED60853; Thu, 2 Sep 2021 16:09:23 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: David Hildenbrand , Arnd Bergmann , Greg Kroah-Hartman , "Michael S. Tsirkin" , Jason Wang , "Rafael J. Wysocki" , Andrew Morton , Dan Williams , Hanjun Guo , Andy Shevchenko , virtualization@lists.linux-foundation.org, linux-mm@kvack.org Subject: [PATCH v4 1/3] kernel/resource: clean up and optimize iomem_is_exclusive() Date: Thu, 2 Sep 2021 18:09:17 +0200 Message-Id: <20210902160919.25683-2-david@redhat.com> In-Reply-To: <20210902160919.25683-1-david@redhat.com> References: <20210902160919.25683-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=DNSWm3HN; spf=none (imf19.hostedemail.com: domain of david@redhat.com has no SPF policy when checking 216.205.24.124) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Stat-Signature: yjnsxkwrc3zrnr8ck8gechamsrso8snd X-Rspamd-Queue-Id: 7D196B0000AE X-Rspamd-Server: rspam04 X-HE-Tag: 1630598970-563633 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: We end up traversing subtrees of ranges we are not interested in; let's optimize this case, skipping such subtrees, cleaning up the function a bit. Reviewed-by: Dan Williams Signed-off-by: David Hildenbrand --- kernel/resource.c | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/kernel/resource.c b/kernel/resource.c index ca9f5198a01f..2999f57da38c 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -73,6 +73,18 @@ static struct resource *next_resource(struct resource *p) return p->sibling; } +static struct resource *next_resource_skip_children(struct resource *p) +{ + while (!p->sibling && p->parent) + p = p->parent; + return p->sibling; +} + +#define for_each_resource(_root, _p, _skip_children) \ + for ((_p) = (_root)->child; (_p); \ + (_p) = (_skip_children) ? next_resource_skip_children(_p) : \ + next_resource(_p)) + static void *r_next(struct seq_file *m, void *v, loff_t *pos) { struct resource *p = v; @@ -1712,10 +1724,9 @@ static int strict_iomem_checks; */ bool iomem_is_exclusive(u64 addr) { - struct resource *p = &iomem_resource; - bool err = false; - loff_t l; + bool skip_children = false, err = false; int size = PAGE_SIZE; + struct resource *p; if (!strict_iomem_checks) return false; @@ -1723,15 +1734,19 @@ bool iomem_is_exclusive(u64 addr) addr = addr & PAGE_MASK; read_lock(&resource_lock); - for (p = p->child; p ; p = r_next(NULL, p, &l)) { + for_each_resource(&iomem_resource, p, skip_children) { /* * We can probably skip the resources without * IORESOURCE_IO attribute? */ if (p->start >= addr + size) break; - if (p->end < addr) + if (p->end < addr) { + skip_children = true; continue; + } + skip_children = false; + /* * A resource is exclusive if IORESOURCE_EXCLUSIVE is set * or CONFIG_IO_STRICT_DEVMEM is enabled and the From patchwork Thu Sep 2 16:09:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12472025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1634C8303A for ; Thu, 2 Sep 2021 16:25:57 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BA119611AF for ; Thu, 2 Sep 2021 16:09:41 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org BA119611AF Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 6296A6B0072; Thu, 2 Sep 2021 12:09:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 600676B0073; Thu, 2 Sep 2021 12:09:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4A2B96B0074; Thu, 2 Sep 2021 12:09:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0174.hostedemail.com [216.40.44.174]) by kanga.kvack.org (Postfix) with ESMTP id 3ACB36B0072 for ; Thu, 2 Sep 2021 12:09:41 -0400 (EDT) Received: from smtpin40.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id F124E183432CE for ; Thu, 2 Sep 2021 16:09:40 +0000 (UTC) X-FDA: 78543118920.40.D6699BE Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf08.hostedemail.com (Postfix) with ESMTP id 702AD30000BE for ; Thu, 2 Sep 2021 16:09:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1630598979; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=35N7k3a81h8IFJDkaZOQjP2o+rAflN0TIT3sjhQeT4w=; b=FID3cffkh/7PRvLx6AXVZTvCckMK3rrwpZA1AVEg8AnKnoamI2yi9eOYEEKyzBpLU9CVdp CFZhlmNXmtZVdoN42cbqL4UOBYD/ZfAbmwAc2qv4bgZ8oLtW9IUFQUHEUfCNintf4Hmi46 xuJx0kt3E7iW8Hx2GPQro28VYhYoZz8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-483-gkxFaDnrPoCByd5TPmeHqg-1; Thu, 02 Sep 2021 12:09:38 -0400 X-MC-Unique: gkxFaDnrPoCByd5TPmeHqg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 1FAE06D5C2; Thu, 2 Sep 2021 16:09:37 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.193.198]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5D3B160853; Thu, 2 Sep 2021 16:09:27 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: David Hildenbrand , Arnd Bergmann , Greg Kroah-Hartman , "Michael S. Tsirkin" , Jason Wang , "Rafael J. Wysocki" , Andrew Morton , Dan Williams , Hanjun Guo , Andy Shevchenko , virtualization@lists.linux-foundation.org, linux-mm@kvack.org Subject: [PATCH v4 2/3] kernel/resource: disallow access to exclusive system RAM regions Date: Thu, 2 Sep 2021 18:09:18 +0200 Message-Id: <20210902160919.25683-3-david@redhat.com> In-Reply-To: <20210902160919.25683-1-david@redhat.com> References: <20210902160919.25683-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Rspamd-Queue-Id: 702AD30000BE Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=FID3cffk; dmarc=pass (policy=none) header.from=redhat.com; spf=none (imf08.hostedemail.com: domain of david@redhat.com has no SPF policy when checking 216.205.24.124) smtp.mailfrom=david@redhat.com X-Rspamd-Server: rspam01 X-Stat-Signature: cemi3xzh4dwtdibxyk4d4d9nj995rrt5 X-HE-Tag: 1630598980-783976 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: virtio-mem dynamically exposes memory inside a device memory region as system RAM to Linux, coordinating with the hypervisor which parts are actually "plugged" and consequently usable/accessible. On the one hand, the virtio-mem driver adds/removes whole memory blocks, creating/removing busy IORESOURCE_SYSTEM_RAM resources, on the other hand, it logically (un)plugs memory inside added memory blocks, dynamically either exposing them to the buddy or hiding them from the buddy and marking them PG_offline. In contrast to physical devices, like a DIMM, the virtio-mem driver is required to actually make use of any of the device-provided memory, because it performs the handshake with the hypervisor. virtio-mem memory cannot simply be access via /dev/mem without a driver. There is no safe way to: a) Access plugged memory blocks via /dev/mem, as they might contain unplugged holes or might get silently unplugged by the virtio-mem driver and consequently turned inaccessible. b) Access unplugged memory blocks via /dev/mem because the virtio-mem driver is required to make them actually accessible first. The virtio-spec states that unplugged memory blocks MUST NOT be written, and only selected unplugged memory blocks MAY be read. We want to make sure, this is the case in sane environments -- where the virtio-mem driver was loaded. We want to make sure that in a sane environment, nobody "accidentially" accesses unplugged memory inside the device managed region. For example, a user might spot a memory region in /proc/iomem and try accessing it via /dev/mem via gdb or dumping it via something else. By the time the mmap() happens, the memory might already have been removed by the virtio-mem driver silently: the mmap() would succeeed and user space might accidentially access unplugged memory. So once the driver was loaded and detected the device along the device-managed region, we just want to disallow any access via /dev/mem to it. In an ideal world, we would mark the whole region as busy ("owned by a driver") and exclude it; however, that would be wrong, as we don't really have actual system RAM at these ranges added to Linux ("busy system RAM"). Instead, we want to mark such ranges as "not actual busy system RAM but still soft-reserved and prepared by a driver for future use." Let's teach iomem_is_exclusive() to reject access to any range with "IORESOURCE_SYSTEM_RAM | IORESOURCE_EXCLUSIVE", even if not busy and even if "iomem=relaxed" is set. For now, there are no applicable ranges and we'll modify virtio-mem next to properly set IORESOURCE_EXCLUSIVE on the parent resource container it creates to contain all actual busy system RAM added via add_memory_driver_managed(). Reviewed-by: Dan Williams Signed-off-by: David Hildenbrand --- kernel/resource.c | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/kernel/resource.c b/kernel/resource.c index 2999f57da38c..5ad3eba619ba 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -1719,26 +1719,23 @@ static int strict_iomem_checks; #endif /* - * check if an address is reserved in the iomem resource tree - * returns true if reserved, false if not reserved. + * Check if an address is exclusive to the kernel and must not be mapped to + * user space, for example, via /dev/mem. + * + * Returns true if exclusive to the kernel, otherwise returns false. */ bool iomem_is_exclusive(u64 addr) { + const unsigned int exclusive_system_ram = IORESOURCE_SYSTEM_RAM | + IORESOURCE_EXCLUSIVE; bool skip_children = false, err = false; int size = PAGE_SIZE; struct resource *p; - if (!strict_iomem_checks) - return false; - addr = addr & PAGE_MASK; read_lock(&resource_lock); for_each_resource(&iomem_resource, p, skip_children) { - /* - * We can probably skip the resources without - * IORESOURCE_IO attribute? - */ if (p->start >= addr + size) break; if (p->end < addr) { @@ -1747,12 +1744,24 @@ bool iomem_is_exclusive(u64 addr) } skip_children = false; + /* + * IORESOURCE_SYSTEM_RAM resources are exclusive if + * IORESOURCE_EXCLUSIVE is set, even if they + * are not busy and even if "iomem=relaxed" is set. The + * responsible driver dynamically adds/removes system RAM within + * such an area and uncontrolled access is dangerous. + */ + if ((p->flags & exclusive_system_ram) == exclusive_system_ram) { + err = true; + break; + } + /* * A resource is exclusive if IORESOURCE_EXCLUSIVE is set * or CONFIG_IO_STRICT_DEVMEM is enabled and the * resource is busy. */ - if ((p->flags & IORESOURCE_BUSY) == 0) + if (!strict_iomem_checks || !(p->flags & IORESOURCE_BUSY)) continue; if (IS_ENABLED(CONFIG_IO_STRICT_DEVMEM) || p->flags & IORESOURCE_EXCLUSIVE) { From patchwork Thu Sep 2 16:09:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12472027 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E3ADFC87FCC for ; Thu, 2 Sep 2021 16:25:57 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 27F19611C6 for ; Thu, 2 Sep 2021 16:09:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 27F19611C6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id AE4516B0073; Thu, 2 Sep 2021 12:09:44 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A93ED6B0074; Thu, 2 Sep 2021 12:09:44 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9836C8D0001; Thu, 2 Sep 2021 12:09:44 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0242.hostedemail.com [216.40.44.242]) by kanga.kvack.org (Postfix) with ESMTP id 8B7386B0073 for ; Thu, 2 Sep 2021 12:09:44 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 3D1DD821FA2F for ; Thu, 2 Sep 2021 16:09:44 +0000 (UTC) X-FDA: 78543119088.29.28B2193 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf14.hostedemail.com (Postfix) with ESMTP id E3A9C6001980 for ; Thu, 2 Sep 2021 16:09:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1630598983; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0npxqTFAo/f4YDe14xZvE/iGq1A3ahPk7+KrbVC6uOo=; b=TotSYCCLw6mxcxZu8zjqeL8wgwi00rU7KEDFM2u0bl/VJPMRmeACIw+B5wTgY9XUymRwRg MiRb9Fe/SCVxhxL2qy91AWGzsVsgYS2KLEKuyM/QoEoe9LZvcg3U5k0xe91wOhvTEBopNH BIdF4EbZ0wBimUVEho5jak68vzp/zdA= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-324-uPdac5v3Nc6WMQcAN-9FTg-1; Thu, 02 Sep 2021 12:09:42 -0400 X-MC-Unique: uPdac5v3Nc6WMQcAN-9FTg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id AF10E801ADA; Thu, 2 Sep 2021 16:09:40 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.193.198]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7E76E60853; Thu, 2 Sep 2021 16:09:37 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: David Hildenbrand , Arnd Bergmann , Greg Kroah-Hartman , "Michael S. Tsirkin" , Jason Wang , "Rafael J. Wysocki" , Andrew Morton , Dan Williams , Hanjun Guo , Andy Shevchenko , virtualization@lists.linux-foundation.org, linux-mm@kvack.org Subject: [PATCH v4 3/3] virtio-mem: disallow mapping virtio-mem memory via /dev/mem Date: Thu, 2 Sep 2021 18:09:19 +0200 Message-Id: <20210902160919.25683-4-david@redhat.com> In-Reply-To: <20210902160919.25683-1-david@redhat.com> References: <20210902160919.25683-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=TotSYCCL; dmarc=pass (policy=none) header.from=redhat.com; spf=none (imf14.hostedemail.com: domain of david@redhat.com has no SPF policy when checking 170.10.133.124) smtp.mailfrom=david@redhat.com X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: E3A9C6001980 X-Stat-Signature: xhkmtnp9pns5zm8d9uftgh1yz6qkza5n X-HE-Tag: 1630598983-661984 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: We don't want user space to be able to map virtio-mem device memory directly (e.g., via /dev/mem) in order to have guarantees that in a sane setup we'll never accidentially access unplugged memory within the device-managed region of a virtio-mem device, just as required by the virtio-spec. As soon as the virtio-mem driver is loaded, the device region is visible in /proc/iomem via the parent device region. From that point on user space is aware of the device region and we want to disallow mapping anything inside that region (where we will dynamically (un)plug memory) until the driver has been unloaded cleanly and e.g., another driver might take over. By creating our parent IORESOURCE_SYSTEM_RAM resource with IORESOURCE_EXCLUSIVE, we will disallow any /dev/mem access to our device region until the driver was unloaded cleanly and removed the parent region. This will work even though only some memory blocks are actually currently added to Linux and appear as busy in the resource tree. So access to the region from user space is only possible a) if we don't load the virtio-mem driver. b) after unloading the virtio-mem driver cleanly. Don't build virtio-mem if access to /dev/mem cannot be restricticted -- if we have CONFIG_DEVMEM=y but CONFIG_STRICT_DEVMEM is not set. Reviewed-by: Dan Williams Signed-off-by: David Hildenbrand Acked-by: Michael S. Tsirkin --- drivers/virtio/Kconfig | 1 + drivers/virtio/virtio_mem.c | 4 +++- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig index ce1b3f6ec325..ff80cd03f1d1 100644 --- a/drivers/virtio/Kconfig +++ b/drivers/virtio/Kconfig @@ -101,6 +101,7 @@ config VIRTIO_MEM depends on MEMORY_HOTPLUG_SPARSE depends on MEMORY_HOTREMOVE depends on CONTIG_ALLOC + depends on !DEVMEM || STRICT_DEVMEM help This driver provides access to virtio-mem paravirtualized memory devices, allowing to hotplug and hotunplug memory. diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index b91bc810a87e..c2d93492cf0f 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -2523,8 +2523,10 @@ static int virtio_mem_create_resource(struct virtio_mem *vm) if (!name) return -ENOMEM; + /* Disallow mapping device memory via /dev/mem completely. */ vm->parent_resource = __request_mem_region(vm->addr, vm->region_size, - name, IORESOURCE_SYSTEM_RAM); + name, IORESOURCE_SYSTEM_RAM | + IORESOURCE_EXCLUSIVE); if (!vm->parent_resource) { kfree(name); dev_warn(&vm->vdev->dev, "could not reserve device region\n");