From patchwork Wed Feb 12 03:21:49 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Xu
X-Patchwork-Id: 13971019
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com,
 torvalds@linux-foundation.org, vbabka@suse.cz, lorenzo.stoakes@oracle.com,
 Liam.Howlett@Oracle.com, adhemerval.zanella@linaro.org, oleg@redhat.com,
 avagin@gmail.com, benjamin@sipsolutions.net
Cc: linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org,
 linux-mm@kvack.org, jorgelo@chromium.org, sroettger@google.com, hch@lst.de,
 ojeda@kernel.org, thomas.weissschuh@linutronix.de, adobriyan@gmail.com,
 johannes@sipsolutions.net, pedro.falcato@gmail.com, hca@linux.ibm.com,
 willy@infradead.org, anna-maria@linutronix.de, mark.rutland@arm.com,
 linus.walleij@linaro.org, Jason@zx2c4.com, deller@gmx.de,
 rdunlap@infradead.org, davem@davemloft.net, peterx@redhat.com,
 f.fainelli@gmail.com, gerg@kernel.org, dave.hansen@linux.intel.com,
 mingo@kernel.org, ardb@kernel.org, mhocko@suse.com, 42.hyeyoo@gmail.com,
 peterz@infradead.org, ardb@google.com, enh@google.com, rientjes@google.com,
 groeck@chromium.org, mpe@ellerman.id.au,
 aleksandr.mikhalitsyn@canonical.com, mike.rapoport@gmail.com, Jeff Xu
Subject: [RFC PATCH v5 1/7] mseal, system mappings: kernel config and header change
Date: Wed, 12 Feb 2025 03:21:49 +0000
Message-ID: <20250212032155.1276806-2-jeffxu@google.com>
X-Mailer: git-send-email 2.48.1.502.g6dc24dfdaf-goog
In-Reply-To: <20250212032155.1276806-1-jeffxu@google.com>
References: <20250212032155.1276806-1-jeffxu@google.com>

From: Jeff Xu

Provide infrastructure to mseal system mappings. Establish two kernel
configs (CONFIG_MSEAL_SYSTEM_MAPPINGS, ARCH_HAS_MSEAL_SYSTEM_MAPPINGS)
and a header file (userprocess.h) for future patches.

As discussed during mseal() upstream process [1], mseal() protects the
VMAs of a given virtual memory range against modifications, such as the
read/write (RW) and no-execute (NX) bits. For complete descriptions of
memory sealing, please see mseal.rst [2].

mseal() is useful for mitigating memory corruption issues where a
corrupted pointer is passed to a memory management system. For example,
such an attacker primitive can break control-flow integrity guarantees
since read-only memory that is supposed to be trusted can become
writable or .text pages can get remapped.

The system mappings are read-only, so memory sealing can protect them
from ever becoming writable or from being unmapped/remapped with
different attributes.

System mappings such as vdso, vvar, sigpage (arm), and vectors (arm)
are created by the kernel during program initialization, and could be
sealed after creation.

Unlike the aforementioned mappings, the uprobe mapping is not
established during program startup. However, its lifetime is the same
as the process's lifetime [3]. It could be sealed from creation.

The vsyscall on x86-64 uses a special address (0xffffffffff600000),
which is outside the mm managed range. This means mprotect, munmap, and
mremap won't work on the vsyscall. Since sealing doesn't enhance the
vsyscall's security, it is skipped in this patch. If we ever seal the
vsyscall, it is probably only for decorative purposes, i.e. showing the
'sl' flag in /proc/pid/smaps.

It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may
alter the system mappings during restore operations. UML (User Mode
Linux) and gVisor are also known to change the vdso/vvar mappings.
Consequently, this feature cannot be universally enabled across all
systems. As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default.

To support mseal of system mappings, architectures must define
CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS and update their special mappings
calls to pass the mseal flag.
Additionally, architectures must confirm they do not unmap/remap system
mappings during the process lifetime.

In this version, we've improved the handling of system mapping sealing
over previous versions. Instead of modifying the
_install_special_mapping function itself, which would affect all
architectures, we now call _install_special_mapping with a sealing flag
only within the specific architecture that requires it. This targeted
approach offers two key advantages: 1) It limits the code change's
impact to the necessary architectures, and 2) It aligns with the
software architecture by keeping the core memory management within the
mm layer, while delegating the decision of sealing system mappings to
the individual architecture, which is particularly relevant since
32-bit architectures never require sealing.

Additionally, this patch introduces a new header,
include/linux/userprocess.h, which contains the mseal_system_mappings()
function. This function helps the architecture determine if system
mapping sealing is enabled within the current kernel configuration. It
can be extended in the future to handle opt-in/out prctl for enabling
system mapping sealing at the process level or a kernel cmdline
feature.

A new header file was introduced because it was difficult to find the
best location for this function. Although include/linux/mm.h was
considered, this feature is more closely related to user processes than
core memory management. Additionally, future prctl or kernel cmd-line
implementations for this feature would not fit within the scope of core
memory management or mseal.c. This is relevant because if we add
unit-tests for mseal.c in the future, we would want to limit mseal.c's
dependencies to core memory management.

Prior to this patch series, we explored sealing special mappings from
userspace using glibc's dynamic linker. This approach revealed several
issues:
- The PT_LOAD header may report an incorrect length for vdso (smaller
  than its actual size). The dynamic linker, which relies on PT_LOAD
  information to determine mapping size, would then split and partially
  seal the vdso mapping. Since each architecture has its own vdso/vvar
  code, fixing this in the kernel would require going through each
  architecture. Our initial goal was to enable sealing readonly
  mappings, e.g. .text, across all architectures; sealing the vdso from
  the kernel at creation appears to be simpler than sealing it in glibc.
- The [vvar] mapping header only contains address information, not
  length information. Similar issues might exist for other special
  mappings.
- Mappings like uprobe are not covered by the dynamic linker, and there
  is no effective solution for them.

This feature's security enhancements will benefit ChromeOS, Android,
and other high security systems.

Testing:
This feature was tested on ChromeOS and Android for both x86-64 and
ARM64.
- Enable sealing and verify vdso/vvar, sigpage, vector are sealed
  properly, i.e. "sl" shown in the smaps for those mappings, and mremap
  is blocked.
- Pass various automation tests (e.g. pre-checkin) on ChromeOS and
  Android to ensure the sealing doesn't affect the functionality of
  Chromebook and Android phone.

I also tested the feature on Ubuntu on x86-64:
- With config disabled, vdso/vvar is not sealed.
- With config enabled, vdso/vvar is sealed, booting up Ubuntu is OK,
  and normal operations such as browsing the web and opening/editing
  docs are OK.

In addition, Benjamin Berg tested this on UML.

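The "sl" check described under Testing can be sketched as a small
userspace helper. This is only an illustration, not part of the patch:
the function name vmflags_has_sealed() is hypothetical, and it assumes
the caller has already read one "VmFlags:" line out of
/proc/<pid>/smaps for the mapping of interest (e.g. [vdso]).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if a "VmFlags:" line taken from
 * /proc/<pid>/smaps lists the two-letter "sl" (sealed) flag.
 * Lines that do not start with "VmFlags:" are reported as not sealed.
 */
bool vmflags_has_sealed(const char *line)
{
	static const char prefix[] = "VmFlags:";
	const char *p;

	assert(line != NULL);

	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;

	p = line + sizeof(prefix) - 1;
	while (*p) {
		size_t len;

		/* Skip separators between flag tokens. */
		while (*p == ' ' || *p == '\t' || *p == '\n')
			p++;

		/* Measure the next token and compare it with "sl". */
		len = strcspn(p, " \t\n");
		if (len == 2 && p[0] == 's' && p[1] == 'l')
			return true;
		p += len;
	}
	return false;
}
```

A test script could feed each VmFlags line of the [vdso], [vvar] and
sigpage entries through such a helper and expect true only when the
config is enabled.
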
Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1]
Link: Documentation/userspace-api/mseal.rst [2]
Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ [3]
Signed-off-by: Jeff Xu
---
 include/linux/userprocess.h | 18 ++++++++++++++++++
 init/Kconfig                | 18 ++++++++++++++++++
 security/Kconfig            | 18 ++++++++++++++++++
 3 files changed, 54 insertions(+)
 create mode 100644 include/linux/userprocess.h

diff --git a/include/linux/userprocess.h b/include/linux/userprocess.h
new file mode 100644
index 000000000000..bd11e2e972c5
--- /dev/null
+++ b/include/linux/userprocess.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_USER_PROCESS_H
+#define _LINUX_USER_PROCESS_H
+#include
+
+/*
+ * mseal of userspace process's system mappings.
+ */
+static inline unsigned long mseal_system_mappings(void)
+{
+#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
+	return VM_SEALED;
+#else
+	return 0;
+#endif
+}
+
+#endif
diff --git a/init/Kconfig b/init/Kconfig
index d0d021b3fa3b..892d2bcdf397 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1882,6 +1882,24 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
 config ARCH_HAS_MEMBARRIER_SYNC_CORE
 	bool
 
+config ARCH_HAS_MSEAL_SYSTEM_MAPPINGS
+	bool
+	help
+	  Control MSEAL_SYSTEM_MAPPINGS access based on architecture.
+
+	  A 64-bit kernel is required for the memory sealing feature.
+	  No specific hardware features from the CPU are needed.
+
+	  To enable this feature, the architecture needs to update their
+	  special mappings calls to include the sealing flag and confirm
+	  that it doesn't unmap/remap system mappings during the
+	  lifetime of the process. After the architecture enables this, a
+	  distribution can set CONFIG_MSEAL_SYSTEM_MAPPINGS to manage access
+	  to the feature.
+
+	  For complete descriptions of memory sealing, please see
+	  Documentation/userspace-api/mseal.rst
+
 config HAVE_PERF_EVENTS
 	bool
 	help
diff --git a/security/Kconfig b/security/Kconfig
index f10dbf15c294..bfb35fc7a3c6 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -51,6 +51,24 @@ config PROC_MEM_NO_FORCE
 
 endchoice
 
+config MSEAL_SYSTEM_MAPPINGS
+	bool "mseal system mappings"
+	depends on 64BIT
+	depends on ARCH_HAS_MSEAL_SYSTEM_MAPPINGS
+	depends on !CHECKPOINT_RESTORE
+	help
+	  Seal system mappings such as vdso, vvar, sigpage, uprobes, etc.
+
+	  A 64-bit kernel is required for the memory sealing feature.
+	  No specific hardware features from the CPU are needed.
+
+	  Note: CHECKPOINT_RESTORE, UML, gVisor are known to relocate or
+	  unmap system mappings, therefore this config can't be enabled
+	  universally.
+
+	  For complete descriptions of memory sealing, please see
+	  Documentation/userspace-api/mseal.rst
+
 config SECURITY
 	bool "Enable different security models"
 	depends on SYSFS
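
For readers following the series, the way an architecture is expected
to consume mseal_system_mappings() can be modelled in userspace. The
sketch below is an assumption-laden illustration, not kernel code:
every DEMO_/demo_ name is hypothetical, the flag values are
placeholders rather than the kernel's definitions, and the caller only
mirrors the idea from the commit message that arch code ORs the
helper's result into the flags it passes to _install_special_mapping.

```c
#include <assert.h>

/* Placeholder bit values for illustration; not the kernel's definitions. */
#define DEMO_VM_READ	0x1UL
#define DEMO_VM_EXEC	0x4UL
#define DEMO_VM_SEALED	0x100UL

/*
 * Stand-in for CONFIG_MSEAL_SYSTEM_MAPPINGS=y. Remove this define to
 * model a kernel built with the option disabled.
 */
#define DEMO_CONFIG_MSEAL_SYSTEM_MAPPINGS 1

/* Userspace model of the helper's shape: sealing flag or 0. */
unsigned long demo_mseal_system_mappings(void)
{
#ifdef DEMO_CONFIG_MSEAL_SYSTEM_MAPPINGS
	return DEMO_VM_SEALED;
#else
	return 0;
#endif
}

/*
 * Hypothetical arch-side caller: build the flags for a special mapping
 * and let the helper decide whether the sealing bit is added.
 */
unsigned long demo_vdso_vm_flags(void)
{
	unsigned long flags = DEMO_VM_READ | DEMO_VM_EXEC;

	/* The base flags are not expected to carry the sealing bit. */
	assert((flags & DEMO_VM_SEALED) == 0);

	flags |= demo_mseal_system_mappings();
	return flags;
}
```

Because the helper returns 0 when the option is off, the caller needs
no #ifdef of its own, which is one plausible reason for routing the
decision through a single inline function.
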