From patchwork Tue Dec 12 23:16:55 2023
X-Patchwork-Submitter: Jeff Xu
X-Patchwork-Id: 13490084
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 01/11] mseal: Add mseal syscall.
Date: Tue, 12 Dec 2023 23:16:55 +0000
Message-ID: <20231212231706.2680890-2-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>
MIME-Version: 1.0
From: Jeff Xu

The new mseal() is an architecture-independent syscall with the following signature:

  mseal(void *addr, size_t len, unsigned long types, unsigned long flags)

addr/len: the memory range. It must be contiguous, allocated memory, or else mseal() fails and no VMA is updated. For details on acceptable arguments, please refer to the comments in mseal.c; they are also covered by the selftest.

This patch adds three sealing types:

  MM_SEAL_BASE
  MM_SEAL_PROT_PKEY
  MM_SEAL_SEAL

MM_SEAL_BASE:
The base package includes the features common to all VMA sealing types. It prevents a sealed VMA from:

1> Being unmapped, moved, or shrunk via munmap() and mremap(); any of these can leave an empty space that could then be replaced by a VMA with a new set of attributes.
2> Having a different VMA moved or expanded into its location, via mremap().
3> Being modified via mmap(MAP_FIXED).
4> Being expanded via mremap(). Size expansion does not appear to pose any specific risk to sealed VMAs; it is included anyway because the use case is unclear. In any case, users can rely on merging to expand a sealed VMA.

MM_SEAL_BASE is the base on which the other sealing features depend: for instance, it probably does not make sense to seal PROT_PKEY without sealing BASE, so the kernel implicitly adds SEAL_BASE for SEAL_PROT_PKEY. (If an application wants to relax this in the future, the "flags" field in mseal() could be used to override the implicit SEAL_BASE.)

MM_SEAL_PROT_PKEY:
Seals the PROT and PKEY of the address range; in other words, mprotect() and pkey_mprotect() are denied if the memory is sealed with MM_SEAL_PROT_PKEY.

MM_SEAL_SEAL:
Denies adding any new seal to a VMA. The kernel remembers which seal types are applied, and the application does not need to repeat all existing seal types in the next mseal().
Once a seal type is applied, it cannot be unsealed; calling mseal() with an already-applied seal type is a no-op, not a failure.

Data structure:
Internally, vm_area_struct gains a new field, vm_seals, to store the seal bit masks. A new field is needed because the existing vm_flags field is full on 32-bit CPUs; vm_seals can be merged into vm_flags in the future if the size of vm_flags is ever expanded.

TODO: A sealed VMA does not merge with other VMAs in this patch; merging support will be added in a later patch.

Signed-off-by: Jeff Xu
---
 include/linux/mm.h        |  45 ++++++-
 include/linux/mm_types.h  |   7 ++
 include/linux/syscalls.h  |   2 +
 include/uapi/linux/mman.h |   4 +
 kernel/sys_ni.c           |   1 +
 mm/Kconfig                |   9 ++
 mm/Makefile               |   1 +
 mm/mmap.c                 |   3 +
 mm/mseal.c                | 257 ++++++++++++++++++++++++++++++++++++++
 9 files changed, 328 insertions(+), 1 deletion(-)
 create mode 100644 mm/mseal.c

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 19fc73b02c9f..3d1120570de5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include

 struct mempolicy;
 struct anon_vma;
@@ -257,9 +258,17 @@ extern struct rw_semaphore nommu_region_sem;
 extern unsigned int kobjsize(const void *objp);
 #endif

+/*
+ * MM_SEAL_ALL is all supported flags in mseal().
+ */
+#define MM_SEAL_ALL ( \
+	MM_SEAL_SEAL | \
+	MM_SEAL_BASE | \
+	MM_SEAL_PROT_PKEY)
+
 /*
  * vm_flags in vm_area_struct, see mm_types.h.
- * When changing, update also include/trace/events/mmflags.h
+ * When changing, update also include/trace/events/mmflags.h.
  */
 #define VM_NONE		0x00000000

@@ -3308,6 +3317,40 @@ static inline void mm_populate(unsigned long addr, unsigned long len)
 static inline void mm_populate(unsigned long addr, unsigned long len) {}
 #endif

+#ifdef CONFIG_MSEAL
+static inline bool check_vma_seals_mergeable(unsigned long vm_seals)
+{
+	/*
+	 * Set sealed VMA not mergeable with another VMA for now.
+	 * This will be changed in later commit to make sealed
+	 * VMA also mergeable.
+	 */
+	if (vm_seals & MM_SEAL_ALL)
+		return false;
+
+	return true;
+}
+
+/*
+ * return the valid sealing (after mask).
+ */
+static inline unsigned long vma_seals(struct vm_area_struct *vma)
+{
+	return (vma->vm_seals & MM_SEAL_ALL);
+}
+
+#else
+static inline bool check_vma_seals_mergeable(unsigned long vm_seals1)
+{
+	return true;
+}
+
+static inline unsigned long vma_seals(struct vm_area_struct *vma)
+{
+	return 0;
+}
+#endif
+
 /* These take the mm semaphore themselves */
 extern int __must_check vm_brk(unsigned long, unsigned long);
 extern int __must_check vm_brk_flags(unsigned long, unsigned long, unsigned long);

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 589f31ef2e84..052799173c86 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -687,6 +687,13 @@ struct vm_area_struct {
 	struct vma_numab_state *numab_state;	/* NUMA Balancing state */
 #endif
 	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
+#ifdef CONFIG_MSEAL
+	/*
+	 * bit masks for seal.
+	 * need this since vm_flags is full.
+	 */
+	unsigned long vm_seals;		/* seal flags, see mm.h. */
+#endif
 } __randomize_layout;

 #ifdef CONFIG_SCHED_MM_CID

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 0901af60d971..b1c766b74765 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -812,6 +812,8 @@ asmlinkage long sys_process_mrelease(int pidfd, unsigned int flags);
 asmlinkage long sys_remap_file_pages(unsigned long start, unsigned long size,
 			unsigned long prot, unsigned long pgoff,
 			unsigned long flags);
+asmlinkage long sys_mseal(unsigned long start, size_t len, unsigned long types,
+			unsigned long flags);
 asmlinkage long sys_mbind(unsigned long start, unsigned long len,
 			unsigned long mode,
 			const unsigned long __user *nmask,

diff --git a/include/uapi/linux/mman.h b/include/uapi/linux/mman.h
index a246e11988d5..f561652886c4 100644
--- a/include/uapi/linux/mman.h
+++ b/include/uapi/linux/mman.h
@@ -55,4 +55,8 @@ struct cachestat {
 	__u64 nr_recently_evicted;
 };

+#define MM_SEAL_SEAL		_BITUL(0)
+#define MM_SEAL_BASE		_BITUL(1)
+#define MM_SEAL_PROT_PKEY	_BITUL(2)
+
 #endif /* _UAPI_LINUX_MMAN_H */

diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 9db51ea373b0..716d64df522d 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -195,6 +195,7 @@ COND_SYSCALL(migrate_pages);
 COND_SYSCALL(move_pages);
 COND_SYSCALL(set_mempolicy_home_node);
 COND_SYSCALL(cachestat);
+COND_SYSCALL(mseal);

 COND_SYSCALL(perf_event_open);
 COND_SYSCALL(accept4);

diff --git a/mm/Kconfig b/mm/Kconfig
index 264a2df5ecf5..63972d476d19 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1258,6 +1258,15 @@ config LOCK_MM_AND_FIND_VMA
 	bool
 	depends on !STACK_GROWSUP

+config MSEAL
+	default n
+	bool "Enable mseal() system call"
+	depends on MMU
+	help
+	  Enable the virtual memory sealing.
+	  This feature allows sealing each virtual memory area separately with
+	  multiple sealing types.
+
 source "mm/damon/Kconfig"

 endmenu

diff --git a/mm/Makefile b/mm/Makefile
index ec65984e2ade..643d8518dac0 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -120,6 +120,7 @@ obj-$(CONFIG_PAGE_EXTENSION) += page_ext.o
 obj-$(CONFIG_PAGE_TABLE_CHECK) += page_table_check.o
 obj-$(CONFIG_CMA_DEBUGFS) += cma_debug.o
 obj-$(CONFIG_SECRETMEM) += secretmem.o
+obj-$(CONFIG_MSEAL) += mseal.o
 obj-$(CONFIG_CMA_SYSFS) += cma_sysfs.o
 obj-$(CONFIG_USERFAULTFD) += userfaultfd.o
 obj-$(CONFIG_IDLE_PAGE_TRACKING) += page_idle.o

diff --git a/mm/mmap.c b/mm/mmap.c
index 9e018d8dd7d6..42462c2a0c35 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -740,6 +740,9 @@ static inline bool is_mergeable_vma(struct vm_area_struct *vma,
 		return false;
 	if (!anon_vma_name_eq(anon_vma_name(vma), anon_name))
 		return false;
+	if (!check_vma_seals_mergeable(vma_seals(vma)))
+		return false;
+
 	return true;
 }

diff --git a/mm/mseal.c b/mm/mseal.c
new file mode 100644
index 000000000000..13bbe9ef5883
--- /dev/null
+++ b/mm/mseal.c
@@ -0,0 +1,257 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *  Implement mseal() syscall.
+ *
+ *  Copyright (c) 2023 Google, Inc.
+ *
+ *  Author: Jeff Xu
+ */
+
+#include
+#include
+#include
+#include
+#include "internal.h"
+
+static bool can_do_mseal(unsigned long types, unsigned long flags)
+{
+	/* check types is a valid bitmap. */
+	if (types & ~MM_SEAL_ALL)
+		return false;
+
+	/* flags isn't used for now. */
+	if (flags)
+		return false;
+
+	return true;
+}
+
+/*
+ * Check if a seal type can be added to VMA.
+ */
+static bool can_add_vma_seals(struct vm_area_struct *vma, unsigned long newSeals)
+{
+	/* When SEAL_MSEAL is set, reject if a new type of seal is added. */
+	if ((vma->vm_seals & MM_SEAL_SEAL) &&
+	    (newSeals & ~(vma_seals(vma))))
+		return false;
+
+	return true;
+}
+
+static int mseal_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
+		       struct vm_area_struct **prev, unsigned long start,
+		       unsigned long end, unsigned long addtypes)
+{
+	int ret = 0;
+
+	if (addtypes & ~(vma_seals(vma))) {
+		/*
+		 * Handle split at start and end.
+		 * For now sealed VMA doesn't merge with other VMAs.
+		 * This will be updated in later commit to make
+		 * sealed VMA also mergeable.
+		 */
+		if (start != vma->vm_start) {
+			ret = split_vma(vmi, vma, start, 1);
+			if (ret)
+				goto out;
+		}
+
+		if (end != vma->vm_end) {
+			ret = split_vma(vmi, vma, end, 0);
+			if (ret)
+				goto out;
+		}
+
+		vma->vm_seals |= addtypes;
+	}
+
+out:
+	*prev = vma;
+	return ret;
+}
+
+/*
+ * Check for do_mseal:
+ * 1> start is part of a valid vma.
+ * 2> end is part of a valid vma.
+ * 3> No gap (unallocated address) between start and end.
+ * 4> requested seal type can be added in given address range.
+ */
+static int check_mm_seal(unsigned long start, unsigned long end,
+			 unsigned long newtypes)
+{
+	struct vm_area_struct *vma;
+	unsigned long nstart = start;
+
+	VMA_ITERATOR(vmi, current->mm, start);
+
+	/* going through each vma to check. */
+	for_each_vma_range(vmi, vma, end) {
+		if (vma->vm_start > nstart)
+			/* unallocated memory found. */
+			return -ENOMEM;
+
+		if (!can_add_vma_seals(vma, newtypes))
+			return -EACCES;
+
+		if (vma->vm_end >= end)
+			return 0;
+
+		nstart = vma->vm_end;
+	}
+
+	return -ENOMEM;
+}
+
+/*
+ * Apply sealing.
+ */
+static int apply_mm_seal(unsigned long start, unsigned long end,
+			 unsigned long newtypes)
+{
+	unsigned long nstart, nend;
+	struct vm_area_struct *vma, *prev = NULL;
+	struct vma_iterator vmi;
+	int error = 0;
+
+	vma_iter_init(&vmi, current->mm, start);
+	vma = vma_find(&vmi, end);
+
+	prev = vma_prev(&vmi);
+	if (start > vma->vm_start)
+		prev = vma;
+
+	nstart = start;
+
+	/* going through each vma to update. */
+	for_each_vma_range(vmi, vma, end) {
+		nend = vma->vm_end;
+		if (nend > end)
+			nend = end;
+
+		error = mseal_fixup(&vmi, vma, &prev, nstart, nend, newtypes);
+		if (error)
+			break;
+
+		nstart = vma->vm_end;
+	}
+
+	return error;
+}
+
+/*
+ * mseal(2) seals the VM's metadata from
+ * selected syscalls.
+ *
+ * addr/len: VM address range.
+ *
+ * The address range given by addr/len must meet:
+ *	start (addr) must be in a valid VMA.
+ *	end (addr + len) must be in a valid VMA.
+ *	no gap (unallocated memory) between start and end.
+ *	start (addr) must be page aligned.
+ *
+ * len: len will be page aligned implicitly.
+ *
+ * types: bit mask for sealed syscalls.
+ *	MM_SEAL_BASE: prevents the VMA from:
+ *	1> Being unmapped, moved, or shrunk via munmap() and
+ *	   mremap(); any of these can leave an empty space that
+ *	   could then be replaced by a VMA with a new set of
+ *	   attributes.
+ *	2> Having a different vma moved or expanded into its
+ *	   location, via mremap().
+ *	3> Being modified via mmap(MAP_FIXED).
+ *	4> Being expanded via mremap(). Size expansion does not
+ *	   appear to pose any specific risk to sealed VMAs. It is
+ *	   included anyway because the use case is unclear. In any
+ *	   case, users can rely on merging to expand a sealed VMA.
+ *
+ *	MM_SEAL_PROT_PKEY:
+ *	Seals PROT and PKEY of the address range; in other words,
+ *	mprotect() and pkey_mprotect() will be denied if the memory
+ *	is sealed with MM_SEAL_PROT_PKEY.
+ *
+ *	MM_SEAL_SEAL:
+ *	MM_SEAL_SEAL denies adding a new seal to a VMA.
+ *
+ *	The kernel will remember which seal types are applied, and
+ *	the application doesn't need to repeat all existing seal
+ *	types in the next mseal(). Once a seal type is applied, it
+ *	can't be unsealed. Calling mseal() with an existing seal
+ *	type is a no-op, not a failure.
+ *
+ * flags: reserved.
+ *
+ * return values:
+ *	zero: success.
+ *	-EINVAL:
+ *		invalid seal type.
+ *		invalid input flags.
+ *		addr is not page aligned.
+ *		addr + len overflow.
+ *	-ENOMEM:
+ *		addr is not a valid address (not allocated).
+ *		end (addr + len) is not a valid address.
+ *		a gap (unallocated memory) between start and end.
+ *	-EACCES:
+ *		MM_SEAL_SEAL is set, adding a new seal is rejected.
+ *
+ * Note:
+ *	user can call mseal(2) multiple times to add new seal types.
+ *	adding an already added seal type is a no-op (no error).
+ *	adding a new seal type after MM_SEAL_SEAL will be rejected.
+ *	unseal() or removing a seal type is not supported.
+ */
+static int do_mseal(unsigned long start, size_t len_in, unsigned long types,
+		    unsigned long flags)
+{
+	int ret = 0;
+	unsigned long end;
+	struct mm_struct *mm = current->mm;
+	size_t len;
+
+	/* MM_SEAL_BASE is set when other seal types are set. */
+	if (types & MM_SEAL_PROT_PKEY)
+		types |= MM_SEAL_BASE;
+
+	if (!can_do_mseal(types, flags))
+		return -EINVAL;
+
+	start = untagged_addr(start);
+	if (!PAGE_ALIGNED(start))
+		return -EINVAL;
+
+	len = PAGE_ALIGN(len_in);
+	/* Check to see whether len was rounded up from small -ve to zero. */
+	if (len_in && !len)
+		return -EINVAL;
+
+	end = start + len;
+	if (end < start)
+		return -EINVAL;
+
+	if (end == start)
+		return 0;
+
+	if (mmap_write_lock_killable(mm))
+		return -EINTR;
+
+	ret = check_mm_seal(start, end, types);
+	if (ret)
+		goto out;
+
+	ret = apply_mm_seal(start, end, types);
+
+out:
+	mmap_write_unlock(current->mm);
+	return ret;
+}
+
+SYSCALL_DEFINE4(mseal, unsigned long, start, size_t, len, unsigned long, types,
+		unsigned long, flags)
+{
+	return do_mseal(start, len, types, flags);
+}

From patchwork Tue Dec 12 23:16:56 2023
X-Patchwork-Submitter: Jeff Xu
X-Patchwork-Id: 13490085
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 02/11] mseal: Wire up mseal syscall
Date: Tue, 12 Dec 2023 23:16:56 +0000
Message-ID: <20231212231706.2680890-3-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>
MIME-Version: 1.0
From: Jeff Xu

Wire up mseal syscall for all architectures.
Signed-off-by: Jeff Xu
---
 arch/alpha/kernel/syscalls/syscall.tbl      | 1 +
 arch/arm/tools/syscall.tbl                  | 1 +
 arch/arm64/include/asm/unistd.h             | 2 +-
 arch/arm64/include/asm/unistd32.h           | 2 ++
 arch/ia64/kernel/syscalls/syscall.tbl       | 1 +
 arch/m68k/kernel/syscalls/syscall.tbl       | 1 +
 arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 1 +
 arch/parisc/kernel/syscalls/syscall.tbl     | 1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    | 1 +
 arch/s390/kernel/syscalls/syscall.tbl       | 1 +
 arch/sh/kernel/syscalls/syscall.tbl         | 1 +
 arch/sparc/kernel/syscalls/syscall.tbl      | 1 +
 arch/x86/entry/syscalls/syscall_32.tbl      | 1 +
 arch/x86/entry/syscalls/syscall_64.tbl      | 1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     | 1 +
 include/uapi/asm-generic/unistd.h           | 5 ++++-
 19 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index b68f1f56b836..4de33b969009 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -496,3 +496,4 @@
 564	common	futex_wake			sys_futex_wake
 565	common	futex_wait			sys_futex_wait
 566	common	futex_requeue			sys_futex_requeue
+567	common	mseal				sys_mseal

diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 93d0d46cbb15..dacea023bb88 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -469,3 +469,4 @@
 454	common	futex_wake			sys_futex_wake
 455	common	futex_wait			sys_futex_wait
 456	common	futex_requeue			sys_futex_requeue
+457	common	mseal				sys_mseal

diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 531effca5f1f..298313d2e0af 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -39,7 +39,7 @@
 #define __ARM_NR_compat_set_tls	(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)

-#define __NR_compat_syscalls		457
+#define __NR_compat_syscalls		458
 #endif

 #define __ARCH_WANT_SYS_CLONE

diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index c453291154fd..015c80b14206 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -917,6 +917,8 @@ __SYSCALL(__NR_futex_wake, sys_futex_wake)
 __SYSCALL(__NR_futex_wait, sys_futex_wait)
 #define __NR_futex_requeue 456
 __SYSCALL(__NR_futex_requeue, sys_futex_requeue)
+#define __NR_mseal 457
+__SYSCALL(__NR_mseal, sys_mseal)

 /*
  * Please add new compat syscalls above this comment and update

diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index 81375ea78288..e8b40451693d 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -376,3 +376,4 @@
 454	common	futex_wake			sys_futex_wake
 455	common	futex_wait			sys_futex_wait
 456	common	futex_requeue			sys_futex_requeue
+457	common	mseal				sys_mseal

diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index f7f997a88bab..0da4a4dc1737 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -455,3 +455,4 @@
 454	common	futex_wake			sys_futex_wake
 455	common	futex_wait			sys_futex_wait
 456	common	futex_requeue			sys_futex_requeue
+457	common	mseal				sys_mseal

diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 2967ec26b978..ca8572222783 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -461,3 +461,4 @@
 454	common	futex_wake			sys_futex_wake
 455	common	futex_wait			sys_futex_wait
 456	common	futex_requeue			sys_futex_requeue
+457	common	mseal				sys_mseal

diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 383abb1713f4..4fd33623b7e8 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
b/arch/mips/kernel/syscalls/syscall_n32.tbl @@ -394,3 +394,4 @@ 454 n32 futex_wake sys_futex_wake 455 n32 futex_wait sys_futex_wait 456 n32 futex_requeue sys_futex_requeue +457 n32 mseal sys_mseal diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl index c9bd09ba905f..aaa6382781e0 100644 --- a/arch/mips/kernel/syscalls/syscall_n64.tbl +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl @@ -370,3 +370,4 @@ 454 n64 futex_wake sys_futex_wake 455 n64 futex_wait sys_futex_wait 456 n64 futex_requeue sys_futex_requeue +457 n64 mseal sys_mseal diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl index ba5ef6cea97a..bbdd6f151224 100644 --- a/arch/mips/kernel/syscalls/syscall_o32.tbl +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl @@ -443,3 +443,4 @@ 454 o32 futex_wake sys_futex_wake 455 o32 futex_wait sys_futex_wait 456 o32 futex_requeue sys_futex_requeue +457 o32 mseal sys_mseal diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl index 9f0f6df55361..8dda80555c7c 100644 --- a/arch/parisc/kernel/syscalls/syscall.tbl +++ b/arch/parisc/kernel/syscalls/syscall.tbl @@ -454,3 +454,4 @@ 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait 456 common futex_requeue sys_futex_requeue +457 common mseal sys_mseal diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl index 26fc41904266..d0aa97a669bc 100644 --- a/arch/powerpc/kernel/syscalls/syscall.tbl +++ b/arch/powerpc/kernel/syscalls/syscall.tbl @@ -542,3 +542,4 @@ 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait 456 common futex_requeue sys_futex_requeue +457 common mseal sys_mseal diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl index 31be90b241f7..228f100f8565 100644 --- a/arch/s390/kernel/syscalls/syscall.tbl +++ b/arch/s390/kernel/syscalls/syscall.tbl @@ -458,3 +458,4 @@ 
454 common futex_wake sys_futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait sys_futex_wait 456 common futex_requeue sys_futex_requeue sys_futex_requeue +457 common mseal sys_mseal sys_mseal diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl index 4bc5d488ab17..cf08ea4a7539 100644 --- a/arch/sh/kernel/syscalls/syscall.tbl +++ b/arch/sh/kernel/syscalls/syscall.tbl @@ -458,3 +458,4 @@ 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait 456 common futex_requeue sys_futex_requeue +457 common mseal sys_mseal diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl index 8404c8e50394..30796f78bdc2 100644 --- a/arch/sparc/kernel/syscalls/syscall.tbl +++ b/arch/sparc/kernel/syscalls/syscall.tbl @@ -501,3 +501,4 @@ 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait 456 common futex_requeue sys_futex_requeue +457 common mseal sys_mseal diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl index 31c48bc2c3d8..c4163b904714 100644 --- a/arch/x86/entry/syscalls/syscall_32.tbl +++ b/arch/x86/entry/syscalls/syscall_32.tbl @@ -460,3 +460,4 @@ 454 i386 futex_wake sys_futex_wake 455 i386 futex_wait sys_futex_wait 456 i386 futex_requeue sys_futex_requeue +457 i386 mseal sys_mseal diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl index a577bb27c16d..47fbc6ac0267 100644 --- a/arch/x86/entry/syscalls/syscall_64.tbl +++ b/arch/x86/entry/syscalls/syscall_64.tbl @@ -378,6 +378,7 @@ 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait 456 common futex_requeue sys_futex_requeue +457 common mseal sys_mseal # # Due to a historical design error, certain syscalls are numbered differently diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl index dd71ecce8b86..fe5f562f6493 100644 --- a/arch/xtensa/kernel/syscalls/syscall.tbl +++ 
b/arch/xtensa/kernel/syscalls/syscall.tbl @@ -426,3 +426,4 @@ 454 common futex_wake sys_futex_wake 455 common futex_wait sys_futex_wait 456 common futex_requeue sys_futex_requeue +457 common mseal sys_mseal diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h index d9e9cd13e577..1678245d8a2b 100644 --- a/include/uapi/asm-generic/unistd.h +++ b/include/uapi/asm-generic/unistd.h @@ -829,8 +829,11 @@ __SYSCALL(__NR_futex_wait, sys_futex_wait) #define __NR_futex_requeue 456 __SYSCALL(__NR_futex_requeue, sys_futex_requeue) +#define __NR_mseal 457 +__SYSCALL(__NR_mseal, sys_mseal) + #undef __NR_syscalls -#define __NR_syscalls 457 +#define __NR_syscalls 458 /* * 32 bit systems traditionally used different From patchwork Tue Dec 12 23:16:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Xu X-Patchwork-Id: 13490086 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CBBD0C4332F for ; Tue, 12 Dec 2023 23:17:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1227A6B03D6; Tue, 12 Dec 2023 18:17:17 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0AC4E6B03D7; Tue, 12 Dec 2023 18:17:16 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DEE6E6B03D8; Tue, 12 Dec 2023 18:17:16 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id C2FD36B03D6 for ; Tue, 12 Dec 2023 18:17:16 -0500 (EST) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id A023580B6B for ; Tue, 12 Dec 2023 23:17:16 +0000 (UTC) X-FDA: 
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 03/11] mseal: add can_modify_mm and can_modify_vma
Date: Tue, 12 Dec 2023 23:16:57 +0000
Message-ID: <20231212231706.2680890-4-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>
From: Jeff Xu

Two utilities to be used later.

can_modify_mm: checks sealing flags for a given memory range.
can_modify_vma: checks sealing flags for a given vma.
Signed-off-by: Jeff Xu
---
 include/linux/mm.h | 18 ++++++++++++++++++
 mm/mseal.c         | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3d1120570de5..2435acc1f44f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3339,6 +3339,12 @@ static inline unsigned long vma_seals(struct vm_area_struct *vma)
 	return (vma->vm_seals & MM_SEAL_ALL);
 }
 
+extern bool can_modify_mm(struct mm_struct *mm, unsigned long start,
+			  unsigned long end, unsigned long checkSeals);
+
+extern bool can_modify_vma(struct vm_area_struct *vma,
+			   unsigned long checkSeals);
+
 #else
 static inline bool check_vma_seals_mergeable(unsigned long vm_seals1)
 {
@@ -3349,6 +3355,18 @@ static inline unsigned long vma_seals(struct vm_area_struct *vma)
 {
 	return 0;
 }
+
+static inline bool can_modify_mm(struct mm_struct *mm, unsigned long start,
+				 unsigned long end, unsigned long checkSeals)
+{
+	return true;
+}
+
+static inline bool can_modify_vma(struct vm_area_struct *vma,
+				  unsigned long checkSeals)
+{
+	return true;
+}
 #endif
 
 /* These take the mm semaphore themselves */
diff --git a/mm/mseal.c b/mm/mseal.c
index 13bbe9ef5883..d12aa628ebdc 100644
--- a/mm/mseal.c
+++ b/mm/mseal.c
@@ -26,6 +26,44 @@ static bool can_do_mseal(unsigned long types, unsigned long flags)
 	return true;
 }
 
+/*
+ * Check if a vma is sealed for modification.
+ * Return true if modification is allowed.
+ */
+bool can_modify_vma(struct vm_area_struct *vma,
+		    unsigned long checkSeals)
+{
+	if (checkSeals & vma_seals(vma))
+		return false;
+
+	return true;
+}
+
+/*
+ * Check if the vmas of a memory range are allowed to be modified.
+ * The memory range can have a gap (unallocated memory).
+ * Return true if modification is allowed.
+ */
+bool can_modify_mm(struct mm_struct *mm, unsigned long start, unsigned long end,
+		   unsigned long checkSeals)
+{
+	struct vm_area_struct *vma;
+
+	VMA_ITERATOR(vmi, mm, start);
+
+	if (!checkSeals)
+		return true;
+
+	/* going through each vma to check. */
+	for_each_vma_range(vmi, vma, end) {
+		if (!can_modify_vma(vma, checkSeals))
+			return false;
+	}
+
+	/* Allow by default. */
+	return true;
+}
+
 /*
  * Check if a seal type can be added to VMA.
  */

From patchwork Tue Dec 12 23:16:58 2023
X-Patchwork-Submitter: Jeff Xu
X-Patchwork-Id: 13490087
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 04/11] mseal: add MM_SEAL_BASE
Date: Tue, 12 Dec 2023 23:16:58 +0000
Message-ID: <20231212231706.2680890-5-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

The base package includes the features common to all VMA sealing
types. It prevents sealed VMAs from:

1> Being unmapped, moved to another location, or shrunk via munmap()
   and mremap(): these can leave an empty space, which could then be
   replaced by a VMA with a new set of attributes.
2> Having a different VMA moved or expanded into the current location,
   via mremap().

3> Being modified via mmap(MAP_FIXED).

4> Size expansion via mremap() does not appear to pose any specific
   risk to sealed VMAs. It is included anyway because the use case is
   unclear. In any case, users can rely on merging to expand a sealed
   VMA.

We consider MM_SEAL_BASE the feature on which other sealing features
will depend. For instance, it probably does not make sense to seal
PROT_PKEY without sealing the BASE, so the kernel will implicitly add
SEAL_BASE for SEAL_PROT_PKEY. (If an application wants to relax this
in the future, the flags field in mseal() could be used to override
the behavior of implicitly adding SEAL_BASE.)

Signed-off-by: Jeff Xu
---
 mm/mmap.c   | 23 +++++++++++++++++++++++
 mm/mremap.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 42462c2a0c35..dbc557bd460c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1259,6 +1259,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		return -EEXIST;
 	}
 
+	/*
+	 * Check if the address range is sealed for do_mmap().
+	 * can_modify_mm assumes we have acquired the lock on MM.
+	 */
+	if (!can_modify_mm(mm, addr, addr + len, MM_SEAL_BASE))
+		return -EACCES;
+
 	if (prot == PROT_EXEC) {
 		pkey = execute_only_pkey(mm);
 		if (pkey < 0)
@@ -2632,6 +2639,14 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
 	if (end == start)
 		return -EINVAL;
 
+	/*
+	 * Check if memory is sealed before arch_unmap.
+	 * Prevent unmapping a sealed VMA.
+	 * can_modify_mm assumes we have acquired the lock on MM.
+	 */
+	if (!can_modify_mm(mm, start, end, MM_SEAL_BASE))
+		return -EACCES;
+
 	/* arch_unmap() might do unmaps itself. */
 	arch_unmap(mm, start, end);
 
@@ -3053,6 +3068,14 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
 {
 	struct mm_struct *mm = vma->vm_mm;
 
+	/*
+	 * Check if memory is sealed before arch_unmap.
+	 * Prevent unmapping a sealed VMA.
+	 * can_modify_mm assumes we have acquired the lock on MM.
+	 */
+	if (!can_modify_mm(mm, start, end, MM_SEAL_BASE))
+		return -EACCES;
+
 	arch_unmap(mm, start, end);
 	return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
 }
diff --git a/mm/mremap.c b/mm/mremap.c
index 382e81c33fc4..ff7429bfbbe1 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -835,7 +835,35 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
 	if ((mm->map_count + 2) >= sysctl_max_map_count - 3)
 		return -ENOMEM;
 
+	/*
+	 * We are in mremap_to(), which moves a VMA to another address.
+	 * Check if the src address is sealed; if so, reject.
+	 * In other words, prevent a sealed VMA from being moved to
+	 * another address.
+	 *
+	 * Place can_modify_mm here because mremap_to()
+	 * does its own checking for the address range, and we only
+	 * check the sealing after passing those checks.
+	 * can_modify_mm assumes we have acquired the lock on MM.
+	 */
+	if (!can_modify_mm(mm, addr, addr + old_len, MM_SEAL_BASE))
+		return -EACCES;
+
 	if (flags & MREMAP_FIXED) {
+		/*
+		 * We are in mremap_to(), which moves a VMA to another address.
+		 * Check if the dst address is sealed; if so, reject.
+		 * In other words, prevent moving a VMA onto a sealed VMA.
+		 *
+		 * Place can_modify_mm here because mremap_to() does its
+		 * own checking for the address, and we only check the sealing
+		 * after passing those checks.
+		 * can_modify_mm assumes we have acquired the lock on MM.
+		 */
+		if (!can_modify_mm(mm, new_addr, new_addr + new_len,
+				   MM_SEAL_BASE))
+			return -EACCES;
+
 		ret = do_munmap(mm, new_addr, new_len, uf_unmap_early);
 		if (ret)
 			goto out;
@@ -994,6 +1022,20 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 		goto out;
 	}
 
+	/*
+	 * This is the shrink/expand case (not mremap_to()).
+	 * Check if the src address is sealed; if so, reject.
+	 * In other words, prevent shrinking or expanding a sealed VMA.
+	 *
+	 * Place can_modify_mm here so we can keep the logic related to
+	 * shrink/expand together. Perhaps we can extract the below into
+	 * its own function in the future.
+	 */
+	if (!can_modify_mm(mm, addr, addr + old_len, MM_SEAL_BASE)) {
+		ret = -EACCES;
+		goto out;
+	}
+
 	/*
 	 * Always allow a shrinking remap: that just unmaps
 	 * the unnecessary pages..

From patchwork Tue Dec 12 23:16:59 2023
X-Patchwork-Submitter: Jeff Xu
X-Patchwork-Id: 13490088
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 05/11] mseal: add MM_SEAL_PROT_PKEY
Date: Tue, 12 Dec 2023 23:16:59 +0000
Message-ID: <20231212231706.2680890-6-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

Seal the PROT and PKEY of the address range; in other words, mprotect()
and pkey_mprotect() will be denied if the memory is sealed with
MM_SEAL_PROT_PKEY.

Signed-off-by: Jeff Xu
---
 mm/mprotect.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index b94fbb45d5c7..1527188b1e92 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -32,6 +32,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -753,6 +754,15 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 		}
 	}

+	/*
+	 * checking if PROT and PKEY is sealed.
+	 * can_modify_mm assumes we have acquired the lock on MM.
+	 */
+	if (!can_modify_mm(current->mm, start, end, MM_SEAL_PROT_PKEY)) {
+		error = -EACCES;
+		goto out;
+	}
+
 	prev = vma_prev(&vmi);
 	if (start > vma->vm_start)
 		prev = vma;
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 06/11] mseal: add sealing support for mmap
Date: Tue, 12 Dec 2023 23:17:00 +0000
Message-ID: <20231212231706.2680890-7-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>
From: Jeff Xu

Allow mmap() to set the sealing type when creating a mapping. This is
an optimization: it avoids making two system calls, one for mmap() and
one for mseal(). With this change, mmap() takes an input that specifies
the sealing type, so only one system call is needed.
This patch uses the "prot" field of mmap() to set the sealing. Three
sealing types are added to match the MM_SEAL_xyz types in mseal():

  PROT_SEAL_SEAL
  PROT_SEAL_BASE
  PROT_SEAL_PROT_PKEY

We also considered using MAP_SEAL_xyz in the "flags" field of mmap().
However, that field describes the type of mapping, such as
MAP_FIXED_NOREPLACE, so the "prot" field seems more appropriate for our
case.

It is worth noting that even though the sealing type is set via the
"prot" field in mmap(), we do not require it to be set again in the
"prot" field of a later mprotect() call. This is unlike the PROT_READ,
PROT_WRITE and PROT_EXEC bits, where e.g. a region becomes non-writable
if PROT_WRITE is not set in mprotect(). In other words, if you set
PROT_SEAL_PROT_PKEY in mmap(), you do not need to set it in mprotect().
In fact, with the current approach, mseal() is used to set sealing on
an existing VMA.

Signed-off-by: Jeff Xu
Suggested-by: Pedro Falcato
---
 arch/mips/kernel/vdso.c                | 10 +++-
 include/linux/mm.h                     | 63 +++++++++++++++++++++++++-
 include/uapi/asm-generic/mman-common.h | 13 ++++++
 mm/mmap.c                              | 25 ++++++++--
 4 files changed, 105 insertions(+), 6 deletions(-)

diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index f6d40e43f108..6d1103d36af1 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -98,11 +98,17 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 		return -EINTR;

 	if (IS_ENABLED(CONFIG_MIPS_FP_SUPPORT)) {
-		/* Map delay slot emulation page */
+		/*
+		 * Map delay slot emulation page.
+		 *
+		 * Note: passing vm_seals = 0
+		 * Don't support sealing for vdso for now.
+		 * This might change when we add sealing support for vdso.
+		 */
 		base = mmap_region(NULL, STACK_TOP, PAGE_SIZE,
 				   VM_READ | VM_EXEC |
 				   VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC,
-				   0, NULL);
+				   0, NULL, 0);
 		if (IS_ERR_VALUE(base)) {
 			ret = base;
 			goto out;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2435acc1f44f..5d3ee79f1438 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -266,6 +266,15 @@ extern unsigned int kobjsize(const void *objp);
 		MM_SEAL_BASE | \
 		MM_SEAL_PROT_PKEY)

+/*
+ * PROT_SEAL_ALL is all supported flags in mmap().
+ * See include/uapi/asm-generic/mman-common.h.
+ */
+#define PROT_SEAL_ALL ( \
+	PROT_SEAL_SEAL | \
+	PROT_SEAL_BASE | \
+	PROT_SEAL_PROT_PKEY)
+
 /*
  * vm_flags in vm_area_struct, see mm_types.h.
  * When changing, update also include/trace/events/mmflags.h.
@@ -3290,7 +3299,7 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-	struct list_head *uf);
+	struct list_head *uf, unsigned long vm_seals);
 extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
@@ -3339,12 +3348,47 @@ static inline unsigned long vma_seals(struct vm_area_struct *vma)
 	return (vma->vm_seals & MM_SEAL_ALL);
 }

+static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm_seals)
+{
+	vma->vm_seals |= vm_seals;
+}
+
 extern bool can_modify_mm(struct mm_struct *mm, unsigned long start,
 		unsigned long end, unsigned long checkSeals);

 extern bool can_modify_vma(struct vm_area_struct *vma,
 		unsigned long checkSeals);

+/*
+ * Convert prot field of mmap to vm_seals type.
+ */
+static inline unsigned long convert_mmap_seals(unsigned long prot)
+{
+	unsigned long seals = 0;
+
+	/*
+	 * set SEAL_PROT_PKEY implies SEAL_BASE.
+	 */
+	if (prot & PROT_SEAL_PROT_PKEY)
+		prot |= PROT_SEAL_BASE;
+
+	/*
+	 * The seal bits start from bit 26 of the "prot" field of mmap.
+	 * see comments in include/uapi/asm-generic/mman-common.h.
+	 */
+	seals = (prot & PROT_SEAL_ALL) >> PROT_SEAL_BIT_BEGIN;
+	return seals;
+}
+
+/*
+ * check input sealing type from the "prot" field of mmap().
+ * for CONFIG_MSEAL case, this always return 0 (successful).
+ */
+static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals)
+{
+	*vm_seals = convert_mmap_seals(prot);
+	return 0;
+}
 #else
 static inline bool check_vma_seals_mergeable(unsigned long vm_seals1)
 {
@@ -3367,6 +3411,23 @@ static inline bool can_modify_vma(struct vm_area_struct *vma,
 {
 	return true;
 }
+
+static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm_seals)
+{
+}
+
+/*
+ * check input sealing type from the "prot" field of mmap().
+ * For not CONFIG_MSEAL, if SEAL flag is set, it will return failure.
+ */
+static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals)
+{
+	if (prot & PROT_SEAL_ALL)
+		return -EINVAL;
+
+	return 0;
+}
+
 #endif

 /* These take the mm semaphore themselves */
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 6ce1f1ceb432..f07ad9e70b3a 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -17,6 +17,19 @@
 #define PROT_GROWSDOWN	0x01000000	/* mprotect flag: extend change to start of growsdown vma */
 #define PROT_GROWSUP	0x02000000	/* mprotect flag: extend change to end of growsup vma */

+/*
+ * The PROT_SEAL_XX defines memory sealing flags in the prot argument
+ * of mmap(). The bits currently take consecutive positions and match
+ * the same sequence as the MM_SEAL_XX type; this allows convert_mmap_seals()
+ * to convert prot to the MM_SEAL_XX type using bit operations.
+ * The include/uapi/linux/mman.h header file defines the MM_SEAL_XX type,
+ * which is used by the mseal() system call.
+ */
+#define PROT_SEAL_BIT_BEGIN	26
+#define PROT_SEAL_SEAL		_BITUL(PROT_SEAL_BIT_BEGIN)	/* 0x04000000 seal seal */
+#define PROT_SEAL_BASE		_BITUL(PROT_SEAL_BIT_BEGIN + 1)	/* 0x08000000 base for all sealing types */
+#define PROT_SEAL_PROT_PKEY	_BITUL(PROT_SEAL_BIT_BEGIN + 2)	/* 0x10000000 seal prot and pkey */
+
 /* 0x01 - 0x03 are defined in linux/mman.h */
 #define MAP_TYPE	0x0f	/* Mask for type of mapping */
 #define MAP_FIXED	0x10	/* Interpret addr exactly */
diff --git a/mm/mmap.c b/mm/mmap.c
index dbc557bd460c..3e1bf5a131b0 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1211,6 +1211,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 {
 	struct mm_struct *mm = current->mm;
 	int pkey = 0;
+	unsigned long vm_seals = 0;

 	*populate = 0;

@@ -1231,6 +1232,9 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 	if (flags & MAP_FIXED_NOREPLACE)
 		flags |= MAP_FIXED;

+	if (check_mmap_seals(prot, &vm_seals) < 0)
+		return -EINVAL;
+
 	if (!(flags & MAP_FIXED))
 		addr = round_hint_to_min(addr);

@@ -1381,7 +1385,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		vm_flags |= VM_NORESERVE;
 	}

-	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
+	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, vm_seals);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -2679,7 +2683,7 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf)
+		struct list_head *uf, unsigned long vm_seals)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma = NULL;
@@ -2723,7 +2727,13 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	next = vma_next(&vmi);
 	prev = vma_prev(&vmi);

-	if (vm_flags & VM_SPECIAL) {
+	/*
+	 * For now, sealed VMA doesn't merge with other VMA,
+	 * Will change this in later commit when we make sealed VMA
+	 * also mergeable.
+	 */
+	if ((vm_flags & VM_SPECIAL) ||
+	    (vm_seals & MM_SEAL_ALL)) {
 		if (prev)
 			vma_iter_next_range(&vmi);
 		goto cannot_expand;
@@ -2781,6 +2791,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	vma->vm_page_prot = vm_get_page_prot(vm_flags);
 	vma->vm_pgoff = pgoff;

+	update_vma_seals(vma, vm_seals);
+
 	if (file) {
 		if (vm_flags & VM_SHARED) {
 			error = mapping_map_writable(file->f_mapping);
@@ -2992,6 +3004,13 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 	if (pgoff + (size >> PAGE_SHIFT) < pgoff)
 		return ret;

+	/*
+	 * Do not support sealing in remap_file_pages.
+	 * sealing is set via mmap() and mseal().
+	 */
+	if (prot & PROT_SEAL_ALL)
+		return ret;
+
 	if (mmap_write_lock_killable(mm))
 		return -EINTR;
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 07/11] mseal: make sealed VMA mergeable.
Date: Tue, 12 Dec 2023 23:17:01 +0000
Message-ID: <20231212231706.2680890-8-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>
From: Jeff Xu

Add merge/split handling for the mlock/madvise/mprotect/mmap cases.
Make a sealed VMA mergeable with adjacent VMAs, so that we don't run
out of VMAs: there is a maximum number of VMAs per process.
Signed-off-by: Jeff Xu
Suggested-by: Jann Horn
---
 fs/userfaultfd.c   |  8 +++++---
 include/linux/mm.h | 31 +++++++++++++------------------
 mm/madvise.c       |  2 +-
 mm/mempolicy.c     |  2 +-
 mm/mlock.c         |  2 +-
 mm/mmap.c          | 44 +++++++++++++++++++++-----------------------
 mm/mprotect.c      |  2 +-
 mm/mremap.c        |  2 +-
 mm/mseal.c         | 23 ++++++++++++++++++-----
 9 files changed, 62 insertions(+), 54 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 56eaae9dac1a..8ebee7c1c6cf 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -926,7 +926,8 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
 				 new_flags, vma->anon_vma,
 				 vma->vm_file, vma->vm_pgoff,
 				 vma_policy(vma),
-				 NULL_VM_UFFD_CTX, anon_vma_name(vma));
+				 NULL_VM_UFFD_CTX, anon_vma_name(vma),
+				 vma_seals(vma));
 		if (prev) {
 			vma = prev;
 		} else {
@@ -1483,7 +1484,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 				 vma->anon_vma, vma->vm_file, pgoff,
 				 vma_policy(vma),
 				 ((struct vm_userfaultfd_ctx){ ctx }),
-				 anon_vma_name(vma));
+				 anon_vma_name(vma), vma_seals(vma));
 		if (prev) {
 			/* vma_merge() invalidated the mas */
 			vma = prev;
@@ -1668,7 +1669,8 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 		prev = vma_merge(&vmi, mm, prev, start, vma_end, new_flags,
 				 vma->anon_vma, vma->vm_file, pgoff,
 				 vma_policy(vma),
-				 NULL_VM_UFFD_CTX, anon_vma_name(vma));
+				 NULL_VM_UFFD_CTX, anon_vma_name(vma),
+				 vma_seals(vma));
 		if (prev) {
 			vma = prev;
 			goto next;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5d3ee79f1438..1f162bb5b38d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3243,7 +3243,7 @@ extern struct vm_area_struct *vma_merge(struct vma_iterator *vmi,
 	struct mm_struct *, struct vm_area_struct *prev, unsigned long addr,
 	unsigned long end, unsigned long vm_flags, struct anon_vma *,
 	struct file *, pgoff_t, struct mempolicy *, struct vm_userfaultfd_ctx,
-	struct anon_vma_name *);
+	struct anon_vma_name *, unsigned long vm_seals);
 extern struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *);
 extern int __split_vma(struct vma_iterator *vmi, struct vm_area_struct *,
 	unsigned long addr, int new_below);
@@ -3327,19 +3327,6 @@ static inline void mm_populate(unsigned long addr, unsigned long len) {}
 #endif

 #ifdef CONFIG_MSEAL
-static inline bool check_vma_seals_mergeable(unsigned long vm_seals)
-{
-	/*
-	 * Set sealed VMA not mergeable with another VMA for now.
-	 * This will be changed in later commit to make sealed
-	 * VMA also mergeable.
-	 */
-	if (vm_seals & MM_SEAL_ALL)
-		return false;
-
-	return true;
-}
-
 /*
  * return the valid sealing (after mask).
  */
@@ -3353,6 +3340,14 @@ static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm
 	vma->vm_seals |= vm_seals;
 }

+static inline bool check_vma_seals_mergeable(unsigned long vm_seals1, unsigned long vm_seals2)
+{
+	if ((vm_seals1 & MM_SEAL_ALL) != (vm_seals2 & MM_SEAL_ALL))
+		return false;
+
+	return true;
+}
+
 extern bool can_modify_mm(struct mm_struct *mm, unsigned long start,
 		unsigned long end, unsigned long checkSeals);

@@ -3390,14 +3385,14 @@ static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals)
 	return 0;
 }
 #else
-static inline bool check_vma_seals_mergeable(unsigned long vm_seals1)
+static inline unsigned long vma_seals(struct vm_area_struct *vma)
 {
-	return true;
+	return 0;
 }

-static inline unsigned long vma_seals(struct vm_area_struct *vma)
+static inline bool check_vma_seals_mergeable(unsigned long vm_seals1, unsigned long vm_seals2)
 {
-	return 0;
+	return true;
 }

 static inline bool can_modify_mm(struct mm_struct *mm, unsigned long start,
diff --git a/mm/madvise.c b/mm/madvise.c
index 4dded5d27e7e..e2d219a4b6ef 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -152,7 +152,7 @@ static int madvise_update_vma(struct vm_area_struct *vma,
 	pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT);
 	*prev = vma_merge(&vmi, mm, *prev, start, end, new_flags,
 			  vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma),
-			  vma->vm_userfaultfd_ctx, anon_name);
+			  vma->vm_userfaultfd_ctx, anon_name, vma_seals(vma));
 	if (*prev) {
 		vma = *prev;
 		goto success;
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index e52e3a0b8f2e..e70b69c64564 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -836,7 +836,7 @@ static int mbind_range(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	pgoff = vma->vm_pgoff + ((vmstart - vma->vm_start) >> PAGE_SHIFT);
 	merged = vma_merge(vmi, vma->vm_mm, *prev, vmstart, vmend, vma->vm_flags,
 			   vma->anon_vma, vma->vm_file, pgoff, new_pol,
-			   vma->vm_userfaultfd_ctx, anon_vma_name(vma));
+			   vma->vm_userfaultfd_ctx, anon_vma_name(vma), vma_seals(vma));
 	if (merged) {
 		*prev = merged;
 		return vma_replace_policy(merged, new_pol);
diff --git a/mm/mlock.c b/mm/mlock.c
index 06bdfab83b58..b537a2cbd337 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -428,7 +428,7 @@ static int mlock_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT);
 	*prev = vma_merge(vmi, mm, *prev, start, end, newflags, vma->anon_vma,
 			  vma->vm_file, pgoff, vma_policy(vma),
-			  vma->vm_userfaultfd_ctx, anon_vma_name(vma));
+			  vma->vm_userfaultfd_ctx, anon_vma_name(vma), vma_seals(vma));
 	if (*prev) {
 		vma = *prev;
 		goto success;
diff --git a/mm/mmap.c b/mm/mmap.c
index 3e1bf5a131b0..6da8d83f2e66 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -720,7 +720,8 @@ int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
 static inline bool is_mergeable_vma(struct vm_area_struct *vma,
 		struct file *file, unsigned long vm_flags,
 		struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
-		struct anon_vma_name *anon_name, bool may_remove_vma)
+		struct anon_vma_name *anon_name, bool may_remove_vma,
+		unsigned long vm_seals)
 {
 	/*
 	 * VM_SOFTDIRTY should not prevent from VMA merging, if we
@@ -740,7 +741,7 @@ static inline bool is_mergeable_vma(struct vm_area_struct *vma,
 		return false;
 	if (!anon_vma_name_eq(anon_vma_name(vma), anon_name))
 		return false;
-	if (!check_vma_seals_mergeable(vma_seals(vma)))
+	if (!check_vma_seals_mergeable(vma_seals(vma), vm_seals))
 		return false;

 	return true;
@@ -776,9 +777,10 @@ static bool can_vma_merge_before(struct vm_area_struct *vma, unsigned long vm_flags,
 		struct anon_vma *anon_vma, struct file *file, pgoff_t vm_pgoff,
 		struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
-		struct anon_vma_name *anon_name)
+		struct anon_vma_name *anon_name, unsigned long vm_seals)
 {
-	if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name, true) &&
+	if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx,
+			     anon_name, true, vm_seals) &&
 	    is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma)) {
 		if (vma->vm_pgoff == vm_pgoff)
 			return true;
@@ -799,9 +801,10 @@ static bool can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags,
 		struct anon_vma *anon_vma, struct file *file, pgoff_t vm_pgoff,
 		struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
-		struct anon_vma_name *anon_name)
+		struct anon_vma_name *anon_name, unsigned long vm_seals)
 {
-	if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name, false) &&
+	if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx,
+			     anon_name, false, vm_seals) &&
 	    is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma)) {
 		pgoff_t vm_pglen;
 		vm_pglen = vma_pages(vma);
@@ -869,7 +872,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 		struct anon_vma *anon_vma, struct file *file,
 		pgoff_t pgoff, struct mempolicy *policy,
 		struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
-		struct anon_vma_name *anon_name)
+		struct anon_vma_name *anon_name, unsigned long vm_seals)
 {
 	struct vm_area_struct *curr, *next, *res;
 	struct vm_area_struct *vma, *adjust, *remove, *remove2;
@@ -908,7 +911,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 	/* Can we merge the predecessor? */
 	if (addr == prev->vm_end && mpol_equal(vma_policy(prev), policy) &&
 	    can_vma_merge_after(prev, vm_flags, anon_vma, file,
-				pgoff, vm_userfaultfd_ctx, anon_name)) {
+				pgoff, vm_userfaultfd_ctx, anon_name, vm_seals)) {
 		merge_prev = true;
 		vma_prev(vmi);
 	}
@@ -917,7 +920,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
 	/* Can we merge the successor? */
 	if (next && mpol_equal(policy, vma_policy(next)) &&
 	    can_vma_merge_before(next, vm_flags, anon_vma, file, pgoff+pglen,
-				 vm_userfaultfd_ctx, anon_name)) {
+				 vm_userfaultfd_ctx, anon_name, vm_seals)) {
 		merge_next = true;
 	}
@@ -2727,13 +2730,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	next = vma_next(&vmi);
 	prev = vma_prev(&vmi);
-	/*
-	 * For now, sealed VMA doesn't merge with other VMA,
-	 * Will change this in later commit when we make sealed VMA
-	 * also mergeable.
-	 */
-	if ((vm_flags & VM_SPECIAL) ||
-	    (vm_seals & MM_SEAL_ALL)) {
+
+	if (vm_flags & VM_SPECIAL) {
 		if (prev)
 			vma_iter_next_range(&vmi);
 		goto cannot_expand;
@@ -2743,7 +2741,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	/* Check next */
 	if (next && next->vm_start == end && !vma_policy(next) &&
 	    can_vma_merge_before(next, vm_flags, NULL, file, pgoff+pglen,
-				 NULL_VM_UFFD_CTX, NULL)) {
+				 NULL_VM_UFFD_CTX, NULL, vm_seals)) {
 		merge_end = next->vm_end;
 		vma = next;
 		vm_pgoff = next->vm_pgoff - pglen;
@@ -2752,9 +2750,9 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	/* Check prev */
 	if (prev && prev->vm_end == addr && !vma_policy(prev) &&
 	    (vma ?
can_vma_merge_after(prev, vm_flags, vma->anon_vma, file, - pgoff, vma->vm_userfaultfd_ctx, NULL) : + pgoff, vma->vm_userfaultfd_ctx, NULL, vm_seals) : can_vma_merge_after(prev, vm_flags, NULL, file, pgoff, - NULL_VM_UFFD_CTX, NULL))) { + NULL_VM_UFFD_CTX, NULL, vm_seals))) { merge_start = prev->vm_start; vma = prev; vm_pgoff = prev->vm_pgoff; @@ -2822,7 +2820,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr, merge = vma_merge(&vmi, mm, prev, vma->vm_start, vma->vm_end, vma->vm_flags, NULL, vma->vm_file, vma->vm_pgoff, NULL, - NULL_VM_UFFD_CTX, NULL); + NULL_VM_UFFD_CTX, NULL, vma_seals(vma)); if (merge) { /* * ->mmap() can change vma->vm_file and fput @@ -3130,14 +3128,14 @@ static int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma, if (security_vm_enough_memory_mm(mm, len >> PAGE_SHIFT)) return -ENOMEM; - /* * Expand the existing vma if possible; Note that singular lists do not * occur after forking, so the expand will only happen on new VMAs. */ if (vma && vma->vm_end == addr && !vma_policy(vma) && can_vma_merge_after(vma, flags, NULL, NULL, - addr >> PAGE_SHIFT, NULL_VM_UFFD_CTX, NULL)) { + addr >> PAGE_SHIFT, NULL_VM_UFFD_CTX, NULL, + vma_seals(vma))) { vma_iter_config(vmi, vma->vm_start, addr + len); if (vma_iter_prealloc(vmi, vma)) goto unacct_fail; @@ -3380,7 +3378,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap, new_vma = vma_merge(&vmi, mm, prev, addr, addr + len, vma->vm_flags, vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma), - vma->vm_userfaultfd_ctx, anon_vma_name(vma)); + vma->vm_userfaultfd_ctx, anon_vma_name(vma), vma_seals(vma)); if (new_vma) { /* * Source vma may have been merged into new_vma diff --git a/mm/mprotect.c b/mm/mprotect.c index 1527188b1e92..a4c90e71607b 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -632,7 +632,7 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb, pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT); *pprev = vma_merge(vmi, mm, 
*pprev, start, end, newflags, vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma), - vma->vm_userfaultfd_ctx, anon_vma_name(vma)); + vma->vm_userfaultfd_ctx, anon_vma_name(vma), vma_seals(vma)); if (*pprev) { vma = *pprev; VM_WARN_ON((vma->vm_flags ^ newflags) & ~VM_SOFTDIRTY); diff --git a/mm/mremap.c b/mm/mremap.c index ff7429bfbbe1..357efd6b48b9 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -1098,7 +1098,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, vma = vma_merge(&vmi, mm, vma, extension_start, extension_end, vma->vm_flags, vma->anon_vma, vma->vm_file, extension_pgoff, vma_policy(vma), - vma->vm_userfaultfd_ctx, anon_vma_name(vma)); + vma->vm_userfaultfd_ctx, anon_vma_name(vma), vma_seals(vma)); if (!vma) { vm_unacct_memory(pages); ret = -ENOMEM; diff --git a/mm/mseal.c b/mm/mseal.c index d12aa628ebdc..3b90dce7d20e 100644 --- a/mm/mseal.c +++ b/mm/mseal.c @@ -7,8 +7,10 @@ * Author: Jeff Xu */ +#include #include #include +#include #include #include #include "internal.h" @@ -81,14 +83,25 @@ static int mseal_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma, struct vm_area_struct **prev, unsigned long start, unsigned long end, unsigned long addtypes) { + pgoff_t pgoff; int ret = 0; + unsigned long newtypes = vma_seals(vma) | addtypes; + + if (newtypes != vma_seals(vma)) { + /* + * Attempt to merge with prev and next vma. + */ + pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT); + *prev = vma_merge(vmi, vma->vm_mm, *prev, start, end, vma->vm_flags, + vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma), + vma->vm_userfaultfd_ctx, anon_vma_name(vma), newtypes); + if (*prev) { + vma = *prev; + goto out; + } - if (addtypes & ~(vma_seals(vma))) { /* * Handle split at start and end. - * For now sealed VMA doesn't merge with other VMAs. - * This will be updated in later commit to make - * sealed VMA also mergeable. 
*/ if (start != vma->vm_start) { ret = split_vma(vmi, vma, start, 1); @@ -102,7 +115,7 @@ static int mseal_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma, goto out; } - vma->vm_seals |= addtypes; + vma->vm_seals = newtypes; } out:
From patchwork Tue Dec 12 23:17:02 2023
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 08/11] mseal: add MM_SEAL_DISCARD_RO_ANON
Date: Tue, 12 Dec 2023 23:17:02 +0000
Message-ID: <20231212231706.2680890-9-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

Certain types of madvise() operations are destructive, such as MADV_DONTNEED, which can effectively alter region contents by discarding pages, especially when memory is anonymous.
The new MM_SEAL_DISCARD_RO_ANON seal blocks such operations when the memory is anonymous and the user has no write access to it. We do not think such sealing is useful for file-backed mappings, because the contents can be repopulated from the underlying mapped file. Nor do we think it is useful when the user can write to the memory, because then the attacker can also write. Signed-off-by: Jeff Xu Suggested-by: Jann Horn Suggested-by: Stephen Röttger --- include/linux/mm.h | 19 +++++-- include/uapi/asm-generic/mman-common.h | 2 + include/uapi/linux/mman.h | 1 + mm/madvise.c | 12 +++++ mm/mseal.c | 73 ++++++++++++++++++++++++-- 5 files changed, 98 insertions(+), 9 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 1f162bb5b38d..50dda474acc2 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -264,7 +264,8 @@ extern unsigned int kobjsize(const void *objp); #define MM_SEAL_ALL ( \ MM_SEAL_SEAL | \ MM_SEAL_BASE | \ - MM_SEAL_PROT_PKEY) + MM_SEAL_PROT_PKEY | \ + MM_SEAL_DISCARD_RO_ANON) /* * PROT_SEAL_ALL is all supported flags in mmap(). @@ -273,7 +274,8 @@ extern unsigned int kobjsize(const void *objp); #define PROT_SEAL_ALL ( \ PROT_SEAL_SEAL | \ PROT_SEAL_BASE | \ - PROT_SEAL_PROT_PKEY) + PROT_SEAL_PROT_PKEY | \ + PROT_SEAL_DISCARD_RO_ANON) /* * vm_flags in vm_area_struct, see mm_types.h. @@ -3354,6 +3356,9 @@ extern bool can_modify_mm(struct mm_struct *mm, unsigned long start, extern bool can_modify_vma(struct vm_area_struct *vma, unsigned long checkSeals); +extern bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start, + unsigned long end, int behavior); + /* * Convert prot field of mmap to vm_seals type. */ @@ -3362,9 +3367,9 @@ static inline unsigned long convert_mmap_seals(unsigned long prot) unsigned long seals = 0; /* - * set SEAL_PROT_PKEY implies SEAL_BASE. + * set SEAL_PROT_PKEY or SEAL_DISCARD_RO_ANON implies SEAL_BASE.
*/ - if (prot & PROT_SEAL_PROT_PKEY) + if (prot & (PROT_SEAL_PROT_PKEY | PROT_SEAL_DISCARD_RO_ANON)) prot |= PROT_SEAL_BASE; /* @@ -3407,6 +3412,12 @@ static inline bool can_modify_vma(struct vm_area_struct *vma, return true; } +static inline bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start, + unsigned long end, int behavior) +{ + return true; +} + static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm_seals) { } diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h index f07ad9e70b3a..bf503962409a 100644 --- a/include/uapi/asm-generic/mman-common.h +++ b/include/uapi/asm-generic/mman-common.h @@ -29,6 +29,8 @@ #define PROT_SEAL_SEAL _BITUL(PROT_SEAL_BIT_BEGIN) /* 0x04000000 seal seal */ #define PROT_SEAL_BASE _BITUL(PROT_SEAL_BIT_BEGIN + 1) /* 0x08000000 base for all sealing types */ #define PROT_SEAL_PROT_PKEY _BITUL(PROT_SEAL_BIT_BEGIN + 2) /* 0x10000000 seal prot and pkey */ +/* seal destructive madvise for non-writeable anonymous memory. */ +#define PROT_SEAL_DISCARD_RO_ANON _BITUL(PROT_SEAL_BIT_BEGIN + 3) /* 0x20000000 */ /* 0x01 - 0x03 are defined in linux/mman.h */ #define MAP_TYPE 0x0f /* Mask for type of mapping */ diff --git a/include/uapi/linux/mman.h b/include/uapi/linux/mman.h index f561652886c4..3872cc118c8a 100644 --- a/include/uapi/linux/mman.h +++ b/include/uapi/linux/mman.h @@ -58,5 +58,6 @@ struct cachestat { #define MM_SEAL_SEAL _BITUL(0) #define MM_SEAL_BASE _BITUL(1) #define MM_SEAL_PROT_PKEY _BITUL(2) +#define MM_SEAL_DISCARD_RO_ANON _BITUL(3) #endif /* _UAPI_LINUX_MMAN_H */ diff --git a/mm/madvise.c b/mm/madvise.c index e2d219a4b6ef..ff038e323779 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -1403,6 +1403,7 @@ int madvise_set_anon_name(struct mm_struct *mm, unsigned long start, * -EIO - an I/O error occurred while paging in data. * -EBADF - map exists, but area maps something that isn't a file. * -EAGAIN - a kernel resource was temporarily unavailable. 
+ * -EACCES - memory is sealed. */ int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int behavior) { @@ -1446,10 +1447,21 @@ int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int beh start = untagged_addr_remote(mm, start); end = start + len; + /* + * Check if the address range is sealed for do_madvise(). + * can_modify_mm_madv assumes we have acquired the lock on MM. + */ + if (!can_modify_mm_madv(mm, start, end, behavior)) { + error = -EACCES; + goto out; + } + blk_start_plug(&plug); error = madvise_walk_vmas(mm, start, end, behavior, madvise_vma_behavior); blk_finish_plug(&plug); + +out: if (write) mmap_write_unlock(mm); else diff --git a/mm/mseal.c b/mm/mseal.c index 3b90dce7d20e..294f48d33db6 100644 --- a/mm/mseal.c +++ b/mm/mseal.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include "internal.h" @@ -66,6 +67,55 @@ bool can_modify_mm(struct mm_struct *mm, unsigned long start, unsigned long end, return true; } +static bool is_madv_discard(int behavior) +{ + return behavior == MADV_FREE || behavior == MADV_DONTNEED || + behavior == MADV_DONTNEED_LOCKED || behavior == MADV_REMOVE || + behavior == MADV_DONTFORK || behavior == MADV_WIPEONFORK; +} + +static bool is_ro_anon(struct vm_area_struct *vma) +{ + /* check anonymous mapping. */ + if (vma->vm_file || vma->vm_flags & VM_SHARED) + return false; + + /* + * check for non-writable: + * PROT=RO, or PKRU is not writable. + */ + if (!(vma->vm_flags & VM_WRITE) || + !arch_vma_access_permitted(vma, true, false, false)) + return true; + + return false; +} + +/* + * Check if the vmas of a memory range are allowed to be modified by madvise. + * The memory range can have a gap (unallocated memory). + * Return true if it is allowed. + */ +bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start, unsigned long end, + int behavior) +{ + struct vm_area_struct *vma; + + VMA_ITERATOR(vmi, mm, start); + + if (!is_madv_discard(behavior)) + return true; + + /* Go through each vma to check.
*/ + for_each_vma_range(vmi, vma, end) + if (is_ro_anon(vma) && !can_modify_vma( + vma, MM_SEAL_DISCARD_RO_ANON)) + return false; + + /* Allow by default. */ + return true; +} + /* * Check if a seal type can be added to VMA. */ @@ -76,6 +126,12 @@ static bool can_add_vma_seals(struct vm_area_struct *vma, unsigned long newSeals (newSeals & ~(vma_seals(vma)))) return false; + /* + * For simplicity, we allow adding all sealing types during mmap or mseal. + * The actual sealing check happens later, at the particular action. + * E.g. for MM_SEAL_DISCARD_RO_ANON we always allow adding it; at the + * time of the madvise() call, we check whether the sealing conditions are met. + */ return true; } @@ -225,15 +281,22 @@ static int apply_mm_seal(unsigned long start, unsigned long end, * mprotect() and pkey_mprotect() will be denied if the memory is * sealed with MM_SEAL_PROT_PKEY. * - * The MM_SEAL_SEAL - * MM_SEAL_SEAL denies adding a new seal for an VMA. - * * The kernel will remember which seal types are applied, and the * application doesn’t need to repeat all existing seal types in * the next mseal(). Once a seal type is applied, it can’t be * unsealed. Call mseal() on an existing seal type is a no-action, * not a failure. * + * MM_SEAL_DISCARD_RO_ANON: block destructive madvise() + * behaviors, such as MADV_DONTNEED, which can effectively + * alter region contents by discarding pages. Such operations + * are blocked if the user doesn't have write access to the + * memory and the memory is anonymous. + * Setting this implies MM_SEAL_BASE is also set. + * + * The MM_SEAL_SEAL + * MM_SEAL_SEAL denies adding a new seal for a VMA. + * * flags: reserved. * * return values: @@ -264,8 +327,8 @@ static int do_mseal(unsigned long start, size_t len_in, unsigned long types, struct mm_struct *mm = current->mm; size_t len; - /* MM_SEAL_BASE is set when other seal types are set.
*/ - if (types & MM_SEAL_PROT_PKEY) + /* MM_SEAL_BASE is set when other seal types are set */ + if (types & (MM_SEAL_PROT_PKEY | MM_SEAL_DISCARD_RO_ANON)) types |= MM_SEAL_BASE; if (!can_do_mseal(types, flags))
From patchwork Tue Dec 12 23:17:03 2023
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 09/11] mseal: add MAP_SEALABLE to mmap()
Date: Tue, 12 Dec 2023 23:17:03 +0000
Message-ID: <20231212231706.2680890-10-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>

From: Jeff Xu

The MAP_SEALABLE flag is added to the flags field of mmap(). When present, it marks the map as sealable.
A map created without MAP_SEALABLE will not support sealing; in other words, mseal() will fail for such a map. Applications that don't use sealing will see no change in behavior. Those that need sealing support opt in by passing MAP_SEALABLE when creating the map. Signed-off-by: Jeff Xu --- include/linux/mm.h | 52 ++++++++++++++++++++++++-- include/linux/mm_types.h | 1 + include/uapi/asm-generic/mman-common.h | 1 + mm/mmap.c | 2 +- mm/mseal.c | 7 +++- 5 files changed, 57 insertions(+), 6 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 50dda474acc2..6f5dba9fbe21 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -267,6 +267,17 @@ extern unsigned int kobjsize(const void *objp); MM_SEAL_PROT_PKEY | \ MM_SEAL_DISCARD_RO_ANON) +/* define VM_SEALABLE in vm_seals of vm_area_struct. */ +#define VM_SEALABLE _BITUL(31) + +/* + * VM_SEALS_BITS_ALL marks the bits used for + * sealing in vm_seals of vm_area_struct. + */ +#define VM_SEALS_BITS_ALL ( \ + MM_SEAL_ALL | \ + VM_SEALABLE) + /* * PROT_SEAL_ALL is all supported flags in mmap(). * See include/uapi/asm-generic/mman-common.h. @@ -3330,9 +3341,17 @@ static inline void mm_populate(unsigned long addr, unsigned long len) {} #ifdef CONFIG_MSEAL /* - * return the valid sealing (after mask). + * return the valid sealing (after mask), this includes the sealable bit. */ static inline unsigned long vma_seals(struct vm_area_struct *vma) +{ + return (vma->vm_seals & VM_SEALS_BITS_ALL); +} + +/* + * return the enabled sealing type (after mask), without the sealable bit.
+ */ +static inline unsigned long vma_enabled_seals(struct vm_area_struct *vma) { return (vma->vm_seals & MM_SEAL_ALL); } @@ -3342,9 +3361,14 @@ static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm vma->vm_seals |= vm_seals; } +static inline bool is_vma_sealable(struct vm_area_struct *vma) +{ + return vma->vm_seals & VM_SEALABLE; +} + static inline bool check_vma_seals_mergeable(unsigned long vm_seals1, unsigned long vm_seals2) { - if ((vm_seals1 & MM_SEAL_ALL) != (vm_seals2 & MM_SEAL_ALL)) + if ((vm_seals1 & VM_SEALS_BITS_ALL) != (vm_seals2 & VM_SEALS_BITS_ALL)) return false; return true; @@ -3384,9 +3408,15 @@ static inline unsigned long convert_mmap_seals(unsigned long prot) * check input sealing type from the "prot" field of mmap(). * for CONFIG_MSEAL case, this always return 0 (successful). */ -static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals) +static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals, + unsigned long flags) { *vm_seals = convert_mmap_seals(prot); + if (*vm_seals) + /* setting one of MM_SEAL_XX means the map is sealable. */ + *vm_seals |= VM_SEALABLE; + else + *vm_seals |= (flags & MAP_SEALABLE) ? VM_SEALABLE:0; return 0; } #else @@ -3395,6 +3425,16 @@ static inline unsigned long vma_seals(struct vm_area_struct *vma) return 0; } +static inline unsigned long vma_enabled_seals(struct vm_area_struct *vma) +{ + return 0; +} + +static inline bool is_vma_sealable(struct vm_area_struct *vma) +{ + return false; +} + static inline bool check_vma_seals_mergeable(unsigned long vm_seals1, unsigned long vm_seals2) { return true; @@ -3426,11 +3466,15 @@ static inline void update_vma_seals(struct vm_area_struct *vma, unsigned long vm * check input sealing type from the "prot" field of mmap(). * For not CONFIG_MSEAL, if SEAL flag is set, it will return failure. 
*/ -static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals) +static inline int check_mmap_seals(unsigned long prot, unsigned long *vm_seals, + unsigned long flags) { if (prot & PROT_SEAL_ALL) return -EINVAL; + if (flags & MAP_SEALABLE) + return -EINVAL; + return 0; } diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 052799173c86..c9b04c545f39 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -691,6 +691,7 @@ struct vm_area_struct { /* * bit masks for seal. * need this since vm_flags is full. + * We could merge this into vm_flags if vm_flags ever get expanded. */ unsigned long vm_seals; /* seal flags, see mm.h. */ #endif diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h index bf503962409a..57ef4507c00b 100644 --- a/include/uapi/asm-generic/mman-common.h +++ b/include/uapi/asm-generic/mman-common.h @@ -47,6 +47,7 @@ #define MAP_UNINITIALIZED 0x4000000 /* For anonymous mmap, memory could be * uninitialized */ +#define MAP_SEALABLE 0x8000000 /* map is sealable. */ /* * Flags for mlock diff --git a/mm/mmap.c b/mm/mmap.c index 6da8d83f2e66..6e35e2070060 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1235,7 +1235,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr, if (flags & MAP_FIXED_NOREPLACE) flags |= MAP_FIXED; - if (check_mmap_seals(prot, &vm_seals) < 0) + if (check_mmap_seals(prot, &vm_seals, flags) < 0) return -EINVAL; if (!(flags & MAP_FIXED)) diff --git a/mm/mseal.c b/mm/mseal.c index 294f48d33db6..5d4cf71b497e 100644 --- a/mm/mseal.c +++ b/mm/mseal.c @@ -121,9 +121,13 @@ bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start, unsigned long */ static bool can_add_vma_seals(struct vm_area_struct *vma, unsigned long newSeals) { + /* if map is not sealable, reject. */ + if (!is_vma_sealable(vma)) + return false; + /* When SEAL_MSEAL is set, reject if a new type of seal is added. 
*/ if ((vma->vm_seals & MM_SEAL_SEAL) && - (newSeals & ~(vma_seals(vma)))) + (newSeals & ~(vma_enabled_seals(vma)))) return false; /* @@ -185,6 +189,7 @@ static int mseal_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma, * 2> end is part of a valid vma. * 3> No gap (unallocated address) between start and end. * 4> requested seal type can be added in given address range. + * 5> map is sealable. */ static int check_mm_seal(unsigned long start, unsigned long end, unsigned long newtypes) From patchwork Tue Dec 12 23:17:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Xu X-Patchwork-Id: 13490093 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C2A2C4167B for ; Tue, 12 Dec 2023 23:17:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 233926B03E7; Tue, 12 Dec 2023 18:17:25 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 1C2976B03EA; Tue, 12 Dec 2023 18:17:25 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DA4DA6B03EB; Tue, 12 Dec 2023 18:17:24 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id ACB2E6B03E7 for ; Tue, 12 Dec 2023 18:17:24 -0500 (EST) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 7AF5740596 for ; Tue, 12 Dec 2023 23:17:24 +0000 (UTC) X-FDA: 81559729608.24.270DE12 Received: from mail-oo1-f52.google.com (mail-oo1-f52.google.com [209.85.161.52]) by imf11.hostedemail.com (Postfix) with ESMTP id 8D2924000B for ; Tue, 12 Dec 2023 23:17:22 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; 
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org, torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com, linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 10/11] selftest mm/mseal memory sealing
Date: Tue, 12 Dec 2023 23:17:04 +0000
Message-ID: <20231212231706.2680890-11-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>
From: Jeff Xu

Add selftests for the memory sealing changes in mmap() and mseal().
Signed-off-by: Jeff Xu
---
 tools/testing/selftests/mm/.gitignore   |    1 +
 tools/testing/selftests/mm/Makefile     |    1 +
 tools/testing/selftests/mm/config       |    1 +
 tools/testing/selftests/mm/mseal_test.c | 2141 +++++++++++++++++++++++
 4 files changed, 2144 insertions(+)
 create mode 100644 tools/testing/selftests/mm/mseal_test.c

diff --git a/tools/testing/selftests/mm/.gitignore b/tools/testing/selftests/mm/.gitignore
index cdc9ce4426b9..f0f22a649985 100644
--- a/tools/testing/selftests/mm/.gitignore
+++ b/tools/testing/selftests/mm/.gitignore
@@ -43,3 +43,4 @@ mdwe_test
 gup_longterm
 mkdirty
 va_high_addr_switch
+mseal_test
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 6a9fc5693145..0c086cecc093 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -59,6 +59,7 @@ TEST_GEN_FILES += mlock2-tests
 TEST_GEN_FILES += mrelease_test
 TEST_GEN_FILES += mremap_dontunmap
 TEST_GEN_FILES += mremap_test
+TEST_GEN_FILES += mseal_test
 TEST_GEN_FILES += on-fault-limit
 TEST_GEN_FILES += thuge-gen
 TEST_GEN_FILES += transhuge-stress
diff --git a/tools/testing/selftests/mm/config b/tools/testing/selftests/mm/config
index be087c4bc396..cf2b8780e9b1 100644
--- a/tools/testing/selftests/mm/config
+++ b/tools/testing/selftests/mm/config
@@ -6,3 +6,4 @@ CONFIG_TEST_HMM=m
 CONFIG_GUP_TEST=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
 CONFIG_MEM_SOFT_DIRTY=y
+CONFIG_MSEAL=y
diff --git a/tools/testing/selftests/mm/mseal_test.c b/tools/testing/selftests/mm/mseal_test.c
new file mode 100644
index 000000000000..0692485d8b3c
--- /dev/null
+++ b/tools/testing/selftests/mm/mseal_test.c
@@ -0,0 +1,2141 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "../kselftest.h"
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+/*
+ * needed definitions for a manual build with gcc.
+ * gcc -I ../../../../usr/include -DDEBUG -O3 mseal_test.c -o mseal_test
+ */
+#ifndef MM_SEAL_SEAL
+#define MM_SEAL_SEAL 0x1
+#endif
+
+#ifndef MM_SEAL_BASE
+#define MM_SEAL_BASE 0x2
+#endif
+
+#ifndef MM_SEAL_PROT_PKEY
+#define MM_SEAL_PROT_PKEY 0x4
+#endif
+
+#ifndef MM_SEAL_DISCARD_RO_ANON
+#define MM_SEAL_DISCARD_RO_ANON 0x8
+#endif
+
+#ifndef MAP_SEALABLE
+#define MAP_SEALABLE 0x8000000
+#endif
+
+#ifndef PROT_SEAL_SEAL
+#define PROT_SEAL_SEAL 0x04000000
+#endif
+
+#ifndef PROT_SEAL_BASE
+#define PROT_SEAL_BASE 0x08000000
+#endif
+
+#ifndef PROT_SEAL_PROT_PKEY
+#define PROT_SEAL_PROT_PKEY 0x10000000
+#endif
+
+#ifndef PROT_SEAL_DISCARD_RO_ANON
+#define PROT_SEAL_DISCARD_RO_ANON 0x20000000
+#endif
+
+#ifndef PKEY_DISABLE_ACCESS
+# define PKEY_DISABLE_ACCESS	0x1
+#endif
+
+#ifndef PKEY_DISABLE_WRITE
+# define PKEY_DISABLE_WRITE	0x2
+#endif
+
+#ifndef PKEY_BITS_PER_PKEY
+#define PKEY_BITS_PER_PKEY 2
+#endif
+
+#ifndef PKEY_MASK
+#define PKEY_MASK (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE)
+#endif
+
+#ifndef DEBUG
+#define LOG_TEST_ENTER() {}
+#else
+#define LOG_TEST_ENTER() { ksft_print_msg("%s\n", __func__); }
+#endif
+
+#ifndef u64
+#define u64 unsigned long long
+#endif
+
+/*
+ * define sys_xyz wrappers to call syscalls directly.
+ */
+static int sys_mseal(void *start, size_t len, int types)
+{
+	int sret;
+
+	errno = 0;
+	sret = syscall(__NR_mseal, start, len, types, 0);
+	return sret;
+}
+
+int sys_mprotect(void *ptr, size_t size, unsigned long prot)
+{
+	int sret;
+
+	errno = 0;
+	sret = syscall(SYS_mprotect, ptr, size, prot);
+	return sret;
+}
+
+int sys_mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
+		unsigned long pkey)
+{
+	int sret;
+
+	errno = 0;
+	sret = syscall(__NR_pkey_mprotect, ptr, size, orig_prot, pkey);
+	return sret;
+}
+
+int sys_munmap(void *ptr, size_t size)
+{
+	int sret;
+
+	errno = 0;
+	sret = syscall(SYS_munmap, ptr, size);
+	return sret;
+}
+
+static int sys_madvise(void *start, size_t len, int types)
+{
+	int sret;
+
+	errno = 0;
+	sret = syscall(__NR_madvise, start, len, types);
+	return sret;
+}
+
+int sys_pkey_alloc(unsigned long flags, unsigned long init_val)
+{
+	int ret = syscall(SYS_pkey_alloc, flags, init_val);
+	return ret;
+}
+
+static inline unsigned int __read_pkey_reg(void)
+{
+	unsigned int eax, edx;
+	unsigned int ecx = 0;
+	unsigned int pkey_reg;
+
+	asm volatile(".byte 0x0f,0x01,0xee\n\t"
+		     : "=a" (eax), "=d" (edx)
+		     : "c" (ecx));
+	pkey_reg = eax;
+	return pkey_reg;
+}
+
+static inline void __write_pkey_reg(u64 pkey_reg)
+{
+	unsigned int eax = pkey_reg;
+	unsigned int ecx = 0;
+	unsigned int edx = 0;
+
+	asm volatile(".byte 0x0f,0x01,0xef\n\t"
+		     : : "a" (eax), "c" (ecx), "d" (edx));
+	assert(pkey_reg == __read_pkey_reg());
+}
+
+static inline unsigned long pkey_bit_position(int pkey)
+{
+	return pkey * PKEY_BITS_PER_PKEY;
+}
+
+static inline u64 set_pkey_bits(u64 reg, int pkey, u64 flags)
+{
+	unsigned long shift = pkey_bit_position(pkey);
+
+	/* mask out bits from pkey in old value */
+	reg &= ~((u64)PKEY_MASK << shift);
+	/* OR in new bits for pkey */
+	reg |= (flags & PKEY_MASK) << shift;
+	return reg;
+}
+
+static inline void set_pkey(int pkey, unsigned long pkey_value)
+{
+	unsigned long mask = (PKEY_DISABLE_ACCESS |
PKEY_DISABLE_WRITE); + u64 new_pkey_reg; + + assert(!(pkey_value & ~mask)); + new_pkey_reg = set_pkey_bits(__read_pkey_reg(), pkey, pkey_value); + __write_pkey_reg(new_pkey_reg); +} + +void setup_single_address(int size, void **ptrOut) +{ + void *ptr; + + ptr = mmap(NULL, size, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE | MAP_SEALABLE, -1, 0); + assert(ptr != (void *)-1); + *ptrOut = ptr; +} + +void setup_single_address_sealable(int size, void **ptrOut, bool sealable) +{ + void *ptr; + unsigned long mapflags = MAP_ANONYMOUS | MAP_PRIVATE; + + if (sealable) + mapflags |= MAP_SEALABLE; + + ptr = mmap(NULL, size, PROT_READ, mapflags, -1, 0); + assert(ptr != (void *)-1); + *ptrOut = ptr; +} + +void setup_single_address_rw_sealable(int size, void **ptrOut, bool sealable) +{ + void *ptr; + unsigned long mapflags = MAP_ANONYMOUS | MAP_PRIVATE; + + if (sealable) + mapflags |= MAP_SEALABLE; + + ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, mapflags, -1, 0); + assert(ptr != (void *)-1); + *ptrOut = ptr; +} + +void clean_single_address(void *ptr, int size) +{ + int ret; + + ret = munmap(ptr, size); + assert(!ret); +} + +void seal_mprotect_single_address(void *ptr, int size) +{ + int ret; + + ret = sys_mseal(ptr, size, MM_SEAL_PROT_PKEY); + assert(!ret); +} + +void seal_discard_ro_anon_single_address(void *ptr, int size) +{ + int ret; + + ret = sys_mseal(ptr, size, MM_SEAL_DISCARD_RO_ANON); + assert(!ret); +} + +static void test_seal_addseals(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + /* adding seal one by one */ + + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + ret = sys_mseal(ptr, size, MM_SEAL_PROT_PKEY); + assert(!ret); + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(!ret); +} + +static void test_seal_addseals_combined(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned 
long size = 4 * page_size; + + setup_single_address(size, &ptr); + + ret = sys_mseal(ptr, size, MM_SEAL_PROT_PKEY); + assert(!ret); + + /* adding multiple seals */ + ret = sys_mseal(ptr, size, + MM_SEAL_PROT_PKEY | MM_SEAL_BASE| + MM_SEAL_SEAL); + assert(!ret); + + /* not adding more seal type, so ok. */ + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + + /* not adding more seal type, so ok. */ + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(!ret); +} + +static void test_seal_addseals_reject(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + ret = sys_mseal(ptr, size, MM_SEAL_BASE | MM_SEAL_SEAL); + assert(!ret); + + /* MM_SEAL_SEAL is set, so not allow new seal type . */ + ret = sys_mseal(ptr, size, + MM_SEAL_PROT_PKEY | MM_SEAL_BASE | MM_SEAL_SEAL); + assert(ret < 0); +} + +static void test_seal_unmapped_start(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + // munmap 2 pages from ptr. + ret = sys_munmap(ptr, 2 * page_size); + assert(!ret); + + // mprotect will fail because 2 pages from ptr are unmapped. + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(ret < 0); + + // mseal will fail because 2 pages from ptr are unmapped. + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(ret < 0); + + ret = sys_mseal(ptr + 2 * page_size, 2 * page_size, MM_SEAL_SEAL); + assert(!ret); +} + +static void test_seal_unmapped_middle(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + // munmap 2 pages from ptr + page. + ret = sys_munmap(ptr + page_size, 2 * page_size); + assert(!ret); + + // mprotect will fail, since size is 4 pages. 
+ ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(ret < 0); + + // mseal will fail as well. + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(ret < 0); + + /* we still can add seal to the first page and last page*/ + ret = sys_mseal(ptr, page_size, MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(!ret); + + ret = sys_mseal(ptr + 3 * page_size, page_size, + MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(!ret); +} + +static void test_seal_unmapped_end(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + // unmap last 2 pages. + ret = sys_munmap(ptr + 2 * page_size, 2 * page_size); + assert(!ret); + + //mprotect will fail since last 2 pages are unmapped. + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(ret < 0); + + //mseal will fail as well. + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(ret < 0); + + /* The first 2 pages is not sealed, and can add seals */ + ret = sys_mseal(ptr, 2 * page_size, MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(!ret); +} + +static void test_seal_multiple_vmas(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + // use mprotect to split the vma into 3. + ret = sys_mprotect(ptr + page_size, 2 * page_size, + PROT_READ | PROT_WRITE); + assert(!ret); + + // mprotect will get applied to all 4 pages - 3 VMAs. + ret = sys_mprotect(ptr, size, PROT_READ); + assert(!ret); + + // use mprotect to split the vma into 3. + ret = sys_mprotect(ptr + page_size, 2 * page_size, + PROT_READ | PROT_WRITE); + assert(!ret); + + // mseal get applied to all 4 pages - 3 VMAs. + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(!ret); + + // verify additional seal type will fail after MM_SEAL_SEAL set. 
+ ret = sys_mseal(ptr, page_size, MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(ret < 0); + + ret = sys_mseal(ptr + page_size, 2 * page_size, + MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(ret < 0); + + ret = sys_mseal(ptr + 3 * page_size, page_size, + MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(ret < 0); +} + +static void test_seal_split_start(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + /* use mprotect to split at middle */ + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + /* seal the first page, this will split the VMA */ + ret = sys_mseal(ptr, page_size, MM_SEAL_SEAL); + assert(!ret); + + /* can't add seal to the first page */ + ret = sys_mseal(ptr, page_size, MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(ret < 0); + + /* add seal to the remain 3 pages */ + ret = sys_mseal(ptr + page_size, 3 * page_size, + MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(!ret); +} + +static void test_seal_split_end(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + /* use mprotect to split at middle */ + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + /* seal the last page */ + ret = sys_mseal(ptr + 3 * page_size, page_size, MM_SEAL_SEAL); + assert(!ret); + + /* adding seal to the last page is rejected. 
*/ + ret = sys_mseal(ptr + 3 * page_size, page_size, + MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(ret < 0); + + /* Adding seals to the first 3 pages */ + ret = sys_mseal(ptr, 3 * page_size, MM_SEAL_SEAL | MM_SEAL_PROT_PKEY); + assert(!ret); +} + +static void test_seal_invalid_input(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(8 * page_size, &ptr); + clean_single_address(ptr + 4 * page_size, 4 * page_size); + + /* invalid flag */ + ret = sys_mseal(ptr, size, 0x20); + assert(ret < 0); + + ret = sys_mseal(ptr, size, 0x31); + assert(ret < 0); + + ret = sys_mseal(ptr, size, 0x3F); + assert(ret < 0); + + /* unaligned address */ + ret = sys_mseal(ptr + 1, 2 * page_size, MM_SEAL_SEAL); + assert(ret < 0); + + /* length too big */ + ret = sys_mseal(ptr, 5 * page_size, MM_SEAL_SEAL); + assert(ret < 0); + + /* start is not in a valid VMA */ + ret = sys_mseal(ptr - page_size, 5 * page_size, MM_SEAL_SEAL); + assert(ret < 0); +} + +static void test_seal_zero_length(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + ret = sys_mprotect(ptr, 0, PROT_READ | PROT_WRITE); + assert(!ret); + + /* seal 0 length will be OK, same as mprotect */ + ret = sys_mseal(ptr, 0, MM_SEAL_PROT_PKEY); + assert(!ret); + + // verify the 4 pages are not sealed by previous call. + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(!ret); +} + +static void test_seal_twice(void) +{ + LOG_TEST_ENTER(); + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + setup_single_address(size, &ptr); + + ret = sys_mseal(ptr, size, MM_SEAL_PROT_PKEY); + assert(!ret); + + // apply the same seal will be OK. idempotent. 
+ ret = sys_mseal(ptr, size, MM_SEAL_PROT_PKEY); + assert(!ret); + + ret = sys_mseal(ptr, size, + MM_SEAL_PROT_PKEY | MM_SEAL_BASE | + MM_SEAL_SEAL); + assert(!ret); + + ret = sys_mseal(ptr, size, + MM_SEAL_PROT_PKEY | MM_SEAL_BASE | + MM_SEAL_SEAL); + assert(!ret); + + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(!ret); +} + +static void test_seal_mprotect(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + if (seal) + seal_mprotect_single_address(ptr, size); + + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_start_mprotect(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + if (seal) + seal_mprotect_single_address(ptr, page_size); + + // the first page is sealed. + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); + + // pages after the first page is not sealed. 
+ ret = sys_mprotect(ptr + page_size, page_size * 3, + PROT_READ | PROT_WRITE); + assert(!ret); +} + +static void test_seal_end_mprotect(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + if (seal) + seal_mprotect_single_address(ptr + page_size, 3 * page_size); + + /* first page is not sealed */ + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + /* last 3 page are sealed */ + ret = sys_mprotect(ptr + page_size, page_size * 3, + PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_mprotect_unalign_len(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + if (seal) + seal_mprotect_single_address(ptr, page_size * 2 - 1); + + // 2 pages are sealed. + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); + + ret = sys_mprotect(ptr + page_size * 2, page_size, + PROT_READ | PROT_WRITE); + assert(!ret); +} + +static void test_seal_mprotect_unalign_len_variant_2(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + if (seal) + seal_mprotect_single_address(ptr, page_size * 2 + 1); + + // 3 pages are sealed. 
+ ret = sys_mprotect(ptr, page_size * 3, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); + + ret = sys_mprotect(ptr + page_size * 3, page_size, + PROT_READ | PROT_WRITE); + assert(!ret); +} + +static void test_seal_mprotect_two_vma(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + /* use mprotect to split */ + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE); + assert(!ret); + + if (seal) + seal_mprotect_single_address(ptr, page_size * 4); + + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); + + ret = sys_mprotect(ptr + page_size * 2, page_size * 2, + PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_mprotect_two_vma_with_split(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + // use mprotect to split as two vma. + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE); + assert(!ret); + + // mseal can apply across 2 vma, also split them. + if (seal) + seal_mprotect_single_address(ptr + page_size, page_size * 2); + + // the first page is not sealed. + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + // the second page is sealed. + ret = sys_mprotect(ptr + page_size, page_size, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); + + // the third page is sealed. + ret = sys_mprotect(ptr + 2 * page_size, page_size, + PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); + + // the fouth page is not sealed. 
+ ret = sys_mprotect(ptr + 3 * page_size, page_size, + PROT_READ | PROT_WRITE); + assert(!ret); +} + +static void test_seal_mprotect_partial_mprotect(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + // seal one page. + if (seal) + seal_mprotect_single_address(ptr, page_size); + + // mprotect first 2 page will fail, since the first page are sealed. + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ | PROT_WRITE); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_mprotect_two_vma_with_gap(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + // use mprotect to split. + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + // use mprotect to split. + ret = sys_mprotect(ptr + 3 * page_size, page_size, + PROT_READ | PROT_WRITE); + assert(!ret); + + // use munmap to free two pages in the middle + ret = sys_munmap(ptr + page_size, 2 * page_size); + assert(!ret); + + // mprotect will fail, because there is a gap in the address. + // notes, internally mprotect still updated the first page. + ret = sys_mprotect(ptr, 4 * page_size, PROT_READ); + assert(ret < 0); + + // mseal will fail as well. + ret = sys_mseal(ptr, 4 * page_size, MM_SEAL_PROT_PKEY); + assert(ret < 0); + + // unlike mprotect, the first page is not sealed. + ret = sys_mprotect(ptr, page_size, PROT_READ); + assert(ret == 0); + + // the last page is not sealed. + ret = sys_mprotect(ptr + 3 * page_size, page_size, PROT_READ); + assert(ret == 0); +} + +static void test_seal_mprotect_split(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + //use mprotect to split. 
+ ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + // seal all 4 pages. + if (seal) { + ret = sys_mseal(ptr, 4 * page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + } + + // madvise is OK. + ret = sys_madvise(ptr, page_size * 2, MADV_WILLNEED); + assert(!ret); + + // mprotect is sealed. + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ); + if (seal) + assert(ret < 0); + else + assert(!ret); + + ret = sys_mprotect(ptr + 2 * page_size, 2 * page_size, PROT_READ); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_mprotect_merge(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + // use mprotect to split one page. + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE); + assert(!ret); + + // seal the first two pages. + if (seal) { + ret = sys_mseal(ptr, 2 * page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + } + + ret = sys_madvise(ptr, page_size, MADV_WILLNEED); + assert(!ret); + + // 2 pages are sealed. + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ); + if (seal) + assert(ret < 0); + else + assert(!ret); + + // the last 2 pages are not sealed. + ret = sys_mprotect(ptr + 2 * page_size, 2 * page_size, PROT_READ); + assert(ret == 0); +} + +static void test_seal_munmap(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + // 4 pages are sealed.
+ ret = sys_munmap(ptr, size); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +/* + * allocate 4 pages. + * use mprotect to split them into two VMAs. + * seal the whole range. + * munmap will fail on both VMAs. + */ +static void test_seal_munmap_two_vma(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + /* use mprotect to split */ + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE); + assert(!ret); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + ret = sys_munmap(ptr, page_size * 2); + if (seal) + assert(ret < 0); + else + assert(!ret); + + ret = sys_munmap(ptr + page_size, page_size * 2); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +/* + * allocate a VMA with 4 pages. + * munmap the middle 2 pages. + * sealing the whole 4 pages will fail because of the gap. + * note: none of the pages is sealed. + * munmap of the first page will be OK. + * munmap of the last page will be OK. + */ +static void test_seal_munmap_vma_with_gap(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + ret = sys_munmap(ptr + page_size, page_size * 2); + assert(!ret); + + if (seal) { + // can't have a gap in the middle. + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(ret < 0); + } + + ret = sys_munmap(ptr, page_size); + assert(!ret); + + ret = sys_munmap(ptr + page_size * 2, page_size); + assert(!ret); + + ret = sys_munmap(ptr, size); + assert(!ret); +} + +static void test_munmap_start_freed(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + // unmap the first page. + ret = sys_munmap(ptr, page_size); + assert(!ret); + + // seal the last 3 pages.
+ if (seal) { + ret = sys_mseal(ptr + page_size, 3 * page_size, MM_SEAL_BASE); + assert(!ret); + } + + // unmap starting from the first page. + ret = sys_munmap(ptr, size); + if (seal) { + assert(ret < 0); + + // use mprotect to verify the pages are not unmapped. + ret = sys_mprotect(ptr + page_size, 3 * page_size, PROT_READ); + assert(!ret); + } else + // note: this will be OK, even though the first page is + // already unmapped. + assert(!ret); +} + +static void test_munmap_end_freed(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + // unmap the last page. + ret = sys_munmap(ptr + page_size * 3, page_size); + assert(!ret); + + // seal the first 3 pages. + if (seal) { + ret = sys_mseal(ptr, 3 * page_size, MM_SEAL_BASE); + assert(!ret); + } + + // unmap all pages. + ret = sys_munmap(ptr, size); + if (seal) { + assert(ret < 0); + + // use mprotect to verify the pages are not unmapped. + ret = sys_mprotect(ptr, 3 * page_size, PROT_READ); + assert(!ret); + } else + assert(!ret); +} + +static void test_munmap_middle_freed(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + // unmap 2 pages in the middle. + ret = sys_munmap(ptr + page_size, page_size * 2); + assert(!ret); + + // seal the first page. + if (seal) { + ret = sys_mseal(ptr, page_size, MM_SEAL_BASE); + assert(!ret); + } + + // munmap all 4 pages. + ret = sys_munmap(ptr, size); + if (seal) { + assert(ret < 0); + + // use mprotect to verify the page is not unmapped.
+ ret = sys_mprotect(ptr, page_size, PROT_READ); + assert(!ret); + } else + assert(!ret); +} + +void test_seal_mremap_shrink(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + // shrink from 4 pages to 2 pages. + ret2 = mremap(ptr, size, 2 * page_size, 0, 0); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else { + assert(ret2 != MAP_FAILED); + } +} + +void test_seal_mremap_expand(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + // unmap the last 2 pages. + ret = sys_munmap(ptr + 2 * page_size, 2 * page_size); + assert(!ret); + + if (seal) { + ret = sys_mseal(ptr, 2 * page_size, MM_SEAL_BASE); + assert(!ret); + } + + // expand from 2 pages to 4 pages. + ret2 = mremap(ptr, 2 * page_size, 4 * page_size, 0, 0); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else { + assert(ret2 == ptr); + } +} + +void test_seal_mremap_move(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr, *newPtr; + unsigned long page_size = getpagesize(); + unsigned long size = page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + setup_single_address(size, &newPtr); + clean_single_address(newPtr, size); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + // move from ptr to a fixed address.
+ ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_FIXED, newPtr); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else { + assert(ret2 != MAP_FAILED); + } +} + +void test_seal_mmap_overwrite_prot(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + // use mmap to change protection. + ret2 = mmap(ptr, size, PROT_NONE, + MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else + assert(ret2 == ptr); +} + +void test_seal_mmap_expand(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 12 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + // unmap the last 4 pages. + ret = sys_munmap(ptr + 8 * page_size, 4 * page_size); + assert(!ret); + + if (seal) { + ret = sys_mseal(ptr, 8 * page_size, MM_SEAL_BASE); + assert(!ret); + } + + // use mmap to expand. + ret2 = mmap(ptr, size, PROT_READ, + MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else + assert(ret2 == ptr); +} + +void test_seal_mmap_shrink(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 12 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + // use mmap to shrink.
+ ret2 = mmap(ptr, 8 * page_size, PROT_READ, + MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else + assert(ret2 == ptr); +} + +void test_seal_mremap_shrink_fixed(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + void *newAddr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + setup_single_address(size, &newAddr); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + // mremap to move and shrink to fixed address + ret2 = mremap(ptr, size, 2 * page_size, MREMAP_MAYMOVE | MREMAP_FIXED, + newAddr); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else + assert(ret2 == newAddr); +} + +void test_seal_mremap_expand_fixed(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + void *newAddr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(page_size, &ptr); + setup_single_address(size, &newAddr); + + if (seal) { + ret = sys_mseal(newAddr, size, MM_SEAL_BASE); + assert(!ret); + } + + // mremap to move and expand to fixed address + ret2 = mremap(ptr, page_size, size, MREMAP_MAYMOVE | MREMAP_FIXED, + newAddr); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else + assert(ret2 == newAddr); +} + +void test_seal_mremap_move_fixed(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + void *newAddr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + setup_single_address(size, &newAddr); + + if (seal) { + ret = sys_mseal(newAddr, size, MM_SEAL_BASE); + assert(!ret); + } + + // mremap to move to fixed address + ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_FIXED, newAddr); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else + 
assert(ret2 == newAddr); +} + +void test_seal_mremap_move_fixed_zero(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + /* + * MREMAP_FIXED can move the mapping to address zero. + */ + ret2 = mremap(ptr, size, 2 * page_size, MREMAP_MAYMOVE | MREMAP_FIXED, + 0); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else { + assert(ret2 == 0); + } +} + +void test_seal_mremap_move_dontunmap(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + // mremap to move, and don't unmap the src addr. + ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_DONTUNMAP, 0); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else { + assert(ret2 != MAP_FAILED); + } +} + +void test_seal_mremap_move_dontunmap_anyaddr(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + setup_single_address(size, &ptr); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_BASE); + assert(!ret); + } + + /* + * 0xdeaddead should have no effect on the dest addr + * when MREMAP_DONTUNMAP is set.
+ */ + ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_DONTUNMAP, + 0xdeaddead); + if (seal) { + assert(ret2 == MAP_FAILED); + assert(errno == EACCES); + } else { + assert(ret2 != MAP_FAILED); + assert((long)ret2 != 0xdeaddead); + + } +} + +unsigned long get_vma_size(void *addr) +{ + FILE *maps; + char line[256]; + int size = 0; + uintptr_t addr_start, addr_end; + + maps = fopen("/proc/self/maps", "r"); + if (!maps) + return 0; + + while (fgets(line, sizeof(line), maps)) { + if (sscanf(line, "%lx-%lx", &addr_start, &addr_end) == 2) { + if (addr_start == (uintptr_t) addr) { + size = addr_end - addr_start; + break; + } + } + } + fclose(maps); + return size; +} + +void test_seal_mmap_seal_base(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + ptr = mmap(NULL, size, PROT_READ | PROT_SEAL_BASE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + assert(ptr != (void *)-1); + + ret = sys_munmap(ptr, size); + assert(ret < 0); + + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(!ret); + + ret = sys_mseal(ptr, size, MM_SEAL_PROT_PKEY); + assert(!ret); + + ret = sys_mprotect(ptr, size, PROT_READ); + assert(ret < 0); +} + +void test_seal_mmap_seal_mprotect(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + ptr = mmap(NULL, size, PROT_READ | PROT_SEAL_PROT_PKEY, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + assert(ptr != (void *)-1); + + ret = sys_munmap(ptr, size); + assert(ret < 0); + + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(ret < 0); +} + +void test_seal_mmap_seal_mseal(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + void *ret2; + + ptr = mmap(NULL, size, PROT_READ | PROT_SEAL_SEAL, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + assert(ptr != (void *)-1); + + ret = 
sys_mseal(ptr, size, MM_SEAL_BASE); + assert(ret < 0); + + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE); + assert(!ret); + + ret = sys_munmap(ptr, size); + assert(!ret); +} + +void test_seal_merge_and_split(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size; + int ret; + void *ret2; + + // (24 RO) + setup_single_address(24 * page_size, &ptr); + + // use mprotect(NONE) to set out boundary + // (1 NONE) (22 RO) (1 NONE) + ret = sys_mprotect(ptr, page_size, PROT_NONE); + assert(!ret); + ret = sys_mprotect(ptr + 23 * page_size, page_size, PROT_NONE); + assert(!ret); + size = get_vma_size(ptr + page_size); + assert(size == 22 * page_size); + + // use mseal to split from beginning + // (1 NONE) (1 RO_SBASE) (21 RO) (1 NONE) + ret = sys_mseal(ptr + page_size, page_size, MM_SEAL_BASE); + assert(!ret); + size = get_vma_size(ptr + page_size); + assert(size == page_size); + size = get_vma_size(ptr + 2 * page_size); + assert(size == 21 * page_size); + + // use mseal to split from the end. + // (1 NONE) (1 RO_SBASE) (20 RO) (1 RO_SBASE) (1 NONE) + ret = sys_mseal(ptr + 22 * page_size, page_size, MM_SEAL_BASE); + assert(!ret); + size = get_vma_size(ptr + 22 * page_size); + assert(size == page_size); + size = get_vma_size(ptr + 2 * page_size); + assert(size == 20 * page_size); + + // merge with prev. + // (1 NONE) (2 RO_SBASE) (19 RO) (1 RO_SBASE) (1 NONE) + ret = sys_mseal(ptr + 2 * page_size, page_size, MM_SEAL_BASE); + assert(!ret); + size = get_vma_size(ptr + page_size); + assert(size == 2 * page_size); + + // merge with after. 
+ // (1 NONE) (2 RO_SBASE) (18 RO) (2 RO_SBASES) (1 NONE) + ret = sys_mseal(ptr + 21 * page_size, page_size, MM_SEAL_BASE); + assert(!ret); + size = get_vma_size(ptr + 21 * page_size); + assert(size == 2 * page_size); + + // split from prev + // (1 NONE) (1 RO_SBASE) (2RO_SPROT) (17 RO) (2 RO_SBASES) (1 NONE) + ret = sys_mseal(ptr + 2 * page_size, 2 * page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + size = get_vma_size(ptr + 2 * page_size); + assert(size == 2 * page_size); + ret = sys_munmap(ptr + page_size, page_size); + assert(ret < 0); + ret = sys_mprotect(ptr + 2 * page_size, page_size, PROT_NONE); + assert(ret < 0); + + // split from next + // (1 NONE) (1 RO_SBASE) (2 RO_SPROT) (16 RO) (2 RO_SPROT) (1 RO_SBASES) (1 NONE) + ret = sys_mseal(ptr + 20 * page_size, 2 * page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + size = get_vma_size(ptr + 20 * page_size); + assert(size == 2 * page_size); + + // merge from middle of prev and middle of next. + // (1 NONE) (1 RW_SBASE) (20 RO_SPROT) (1 RW_SBASES) (1 NONE) + ret = sys_mseal(ptr + 3 * page_size, 18 * page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + size = get_vma_size(ptr + 2 * page_size); + assert(size == 20 * page_size); + + size = get_vma_size(ptr + 22 * page_size); + assert(size == page_size); + + size = get_vma_size(ptr + 23 * page_size); + assert(size == page_size); + + // Add split using SEAL_ALL + // (1 NONE) (1 RW_SBASE) (1 RO_SALL) (18 RO_SPROT) (1 RO_SALL) (1 RW_SBASES) (1 NONE) + ret = sys_mseal(ptr + 2 * page_size, page_size, + MM_SEAL_PROT_PKEY | MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + size = get_vma_size(ptr + 2 * page_size); + assert(size == 1 * page_size); + + ret = sys_mseal(ptr + 21 * page_size, page_size, + MM_SEAL_PROT_PKEY | MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + size = get_vma_size(ptr + 21 * page_size); + assert(size == 1 * page_size); + + // add a new seal type, and merge with next + // (1 NONE) (2 RO_SALL) (18 RO_SPROT) (2 RO_SALL) (1 NONE) + ret = sys_mprotect(ptr + page_size, 
page_size, PROT_READ); + assert(!ret); + ret = sys_mseal(ptr + page_size, page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + ret = sys_mseal(ptr + page_size, page_size, MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + size = get_vma_size(ptr + page_size); + assert(size == 2 * page_size); + + ret = sys_mprotect(ptr + 22 * page_size, page_size, PROT_READ); + assert(!ret); + ret = sys_mseal(ptr + 22 * page_size, page_size, MM_SEAL_PROT_PKEY); + assert(!ret); + ret = sys_mseal(ptr + 22 * page_size, page_size, MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + size = get_vma_size(ptr + 21 * page_size); + assert(size == 2 * page_size); +} + +void test_seal_mmap_merge(void) +{ + LOG_TEST_ENTER(); + + void *ptr, *ptr2; + unsigned long page_size = getpagesize(); + unsigned long size; + int ret; + void *ret2; + + // (24 RO) + setup_single_address(24 * page_size, &ptr); + + // use mprotect(NONE) to set the outer boundary. + // (1 NONE) (22 RO) (1 NONE) + ret = sys_mprotect(ptr, page_size, PROT_NONE); + assert(!ret); + ret = sys_mprotect(ptr + 23 * page_size, page_size, PROT_NONE); + assert(!ret); + size = get_vma_size(ptr + page_size); + assert(size == 22 * page_size); + + // use munmap to free 2 segments of memory. + // (1 NONE) (1 free) (20 RO) (1 free) (1 NONE) + ret = sys_munmap(ptr + page_size, page_size); + assert(!ret); + + ret = sys_munmap(ptr + 22 * page_size, page_size); + assert(!ret); + + // apply seal to the middle. + // (1 NONE) (1 free) (20 RO_SBASE) (1 free) (1 NONE) + ret = sys_mseal(ptr + 2 * page_size, 20 * page_size, MM_SEAL_BASE); + assert(!ret); + size = get_vma_size(ptr + 2 * page_size); + assert(size == 20 * page_size); + + // allocate a mapping at the beginning, and make sure it merges.
+ // (1 NONE) (21 RO_SBASE) (1 free) (1 NONE) + ptr2 = mmap(ptr + page_size, page_size, PROT_READ | PROT_SEAL_BASE, + MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + assert(ptr2 != (void *)-1); + size = get_vma_size(ptr + page_size); + assert(size == 21 * page_size); + + // allocate a mapping at the end, and make sure it merges. + // (1 NONE) (22 RO_SBASE) (1 NONE) + ptr2 = mmap(ptr + 22 * page_size, page_size, PROT_READ | PROT_SEAL_BASE, + MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + assert(ptr2 != (void *)-1); + size = get_vma_size(ptr + page_size); + assert(size == 22 * page_size); +} + +static void test_not_sealable(void) +{ + int ret; + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + + ptr = mmap(NULL, size, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + assert(ptr != (void *)-1); + + ret = sys_mseal(ptr, size, MM_SEAL_SEAL); + assert(ret < 0); +} + +static void test_merge_sealable(void) +{ + int ret; + void *ptr, *ptr2; + unsigned long page_size = getpagesize(); + unsigned long size; + + // (24 RO) + setup_single_address(24 * page_size, &ptr); + + // use mprotect(NONE) to set the outer boundary. + // (1 NONE) (22 RO) (1 NONE) + ret = sys_mprotect(ptr, page_size, PROT_NONE); + assert(!ret); + ret = sys_mprotect(ptr + 23 * page_size, page_size, PROT_NONE); + assert(!ret); + size = get_vma_size(ptr + page_size); + assert(size == 22 * page_size); + + // (1 NONE) (RO) (4 free) (17 RO) (1 NONE) + ret = sys_munmap(ptr + 2 * page_size, 4 * page_size); + assert(!ret); + size = get_vma_size(ptr + page_size); + assert(size == 1 * page_size); + size = get_vma_size(ptr + 6 * page_size); + assert(size == 17 * page_size); + + // (1 NONE) (RO) (1 free) (2 RO) (1 free) (17 RO) (1 NONE) + ptr2 = mmap(ptr + 3 * page_size, 2 * page_size, PROT_READ, + MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE | MAP_SEALABLE, -1, 0); + assert(ptr2 != (void *)-1); + size = get_vma_size(ptr + 3 * page_size); + assert(size == 2 * page_size); + + // (1 NONE) (RO) (1 free) (20 RO) (1 NONE)
+ ptr2 = mmap(ptr + 5 * page_size, 1 * page_size, PROT_READ, + MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE | MAP_SEALABLE, -1, 0); + assert(ptr2 != (void *)-1); + size = get_vma_size(ptr + 3 * page_size); + assert(size == 20 * page_size); + + // (1 NONE) (RO) (1 free) (19 RO) (1 RO_SB) (1 NONE) + ret = sys_mseal(ptr + 22 * page_size, page_size, MM_SEAL_BASE); + assert(!ret); + + // (1 NONE) (RO) (not sealable) (19 RO) (1 RO_SB) (1 NONE) + ptr2 = mmap(ptr + 2 * page_size, page_size, PROT_READ, + MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + assert(ptr2 != (void *)-1); + size = get_vma_size(ptr + page_size); + assert(size == page_size); + size = get_vma_size(ptr + 2 * page_size); + assert(size == page_size); +} + +static void test_seal_discard_ro_anon_on_rw(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address_rw_sealable(size, &ptr, seal); + assert(ptr != (void *)-1); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + } + + // sealing doesn't take effect on RW memory. + ret = sys_madvise(ptr, size, MADV_DONTNEED); + assert(!ret); + + // the base seal still applies. + ret = sys_munmap(ptr, size); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_discard_ro_anon_on_pkey(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + int pkey; + + setup_single_address_rw_sealable(size, &ptr, seal); + assert(ptr != (void *)-1); + + pkey = sys_pkey_alloc(0, 0); + assert(pkey > 0); + + ret = sys_mprotect_pkey((void *)ptr, size, PROT_READ | PROT_WRITE, pkey); + assert(!ret); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + } + + // sealing doesn't take effect if PKRU allows write.
+ set_pkey(pkey, 0); + ret = sys_madvise(ptr, size, MADV_DONTNEED); + assert(!ret); + + // sealing will take effect if PKRU denies write. + set_pkey(pkey, PKEY_DISABLE_WRITE); + ret = sys_madvise(ptr, size, MADV_DONTNEED); + if (seal) + assert(ret < 0); + else + assert(!ret); + + // the base seal still applies. + ret = sys_munmap(ptr, size); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_discard_ro_anon_on_filebacked(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + int fd; + unsigned long mapflags = MAP_PRIVATE; + + if (seal) + mapflags |= MAP_SEALABLE; + + fd = memfd_create("test", 0); + assert(fd > 0); + + ret = fallocate(fd, 0, 0, size); + assert(!ret); + + ptr = mmap(NULL, size, PROT_READ, mapflags, fd, 0); + assert(ptr != MAP_FAILED); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + } + + // sealing doesn't apply to file-backed mappings. + ret = sys_madvise(ptr, size, MADV_DONTNEED); + assert(!ret); + + ret = sys_munmap(ptr, size); + if (seal) + assert(ret < 0); + else + assert(!ret); + close(fd); +} + +static void test_seal_discard_ro_anon_on_shared(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + unsigned long mapflags = MAP_ANONYMOUS | MAP_SHARED; + + if (seal) + mapflags |= MAP_SEALABLE; + + ptr = mmap(NULL, size, PROT_READ, mapflags, -1, 0); + assert(ptr != (void *)-1); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + } + + // sealing doesn't apply to shared mappings.
+ ret = sys_madvise(ptr, size, MADV_DONTNEED); + assert(!ret); + + ret = sys_munmap(ptr, size); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_seal_discard_ro_anon_invalid_shared(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + int fd; + + fd = open("/proc/self/maps", O_RDONLY); + ptr = mmap(NULL, size, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, fd, 0); + assert(ptr != (void *)-1); + + if (seal) { + ret = sys_mseal(ptr, size, MM_SEAL_DISCARD_RO_ANON); + assert(!ret); + } + + ret = sys_madvise(ptr, size, MADV_DONTNEED); + assert(!ret); + + ret = sys_munmap(ptr, size); + assert(ret < 0); + close(fd); +} + +static void test_seal_discard_ro_anon(bool seal) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + setup_single_address(size, &ptr); + + if (seal) + seal_discard_ro_anon_single_address(ptr, size); + + ret = sys_madvise(ptr, size, MADV_DONTNEED); + if (seal) + assert(ret < 0); + else + assert(!ret); + + ret = sys_munmap(ptr, size); + if (seal) + assert(ret < 0); + else + assert(!ret); +} + +static void test_mmap_seal_discard_ro_anon(void) +{ + LOG_TEST_ENTER(); + void *ptr; + unsigned long page_size = getpagesize(); + unsigned long size = 4 * page_size; + int ret; + + ptr = mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_SEAL_DISCARD_RO_ANON, + MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + assert(ptr != (void *)-1); + + ret = sys_mprotect(ptr, size, PROT_READ); + assert(!ret); + + ret = sys_madvise(ptr, size, MADV_DONTNEED); + assert(ret < 0); + + ret = sys_munmap(ptr, size); + assert(ret < 0); +} + +bool seal_support(void) +{ + void *ptr; + unsigned long page_size = getpagesize(); + + ptr = mmap(NULL, page_size, PROT_READ | PROT_SEAL_BASE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + if (ptr == (void *) -1) + return false; + return true; +} + +bool pkey_supported(void) +{ + int 
pkey = sys_pkey_alloc(0, 0); + + if (pkey > 0) + return true; + return false; +} + +int main(int argc, char **argv) +{ + bool test_seal = seal_support(); + + if (!test_seal) { + ksft_print_msg("%s CONFIG_MSEAL might be disabled, skip test\n", __func__); + return 0; + } + + test_seal_invalid_input(); + test_seal_addseals(); + test_seal_addseals_combined(); + test_seal_addseals_reject(); + test_seal_unmapped_start(); + test_seal_unmapped_middle(); + test_seal_unmapped_end(); + test_seal_multiple_vmas(); + test_seal_split_start(); + test_seal_split_end(); + + test_seal_zero_length(); + test_seal_twice(); + + test_seal_mprotect(false); + test_seal_mprotect(true); + + test_seal_start_mprotect(false); + test_seal_start_mprotect(true); + + test_seal_end_mprotect(false); + test_seal_end_mprotect(true); + + test_seal_mprotect_unalign_len(false); + test_seal_mprotect_unalign_len(true); + + test_seal_mprotect_unalign_len_variant_2(false); + test_seal_mprotect_unalign_len_variant_2(true); + + test_seal_mprotect_two_vma(false); + test_seal_mprotect_two_vma(true); + + test_seal_mprotect_two_vma_with_split(false); + test_seal_mprotect_two_vma_with_split(true); + + test_seal_mprotect_partial_mprotect(false); + test_seal_mprotect_partial_mprotect(true); + + test_seal_mprotect_two_vma_with_gap(false); + test_seal_mprotect_two_vma_with_gap(true); + + test_seal_mprotect_merge(false); + test_seal_mprotect_merge(true); + + test_seal_mprotect_split(false); + test_seal_mprotect_split(true); + + test_seal_munmap(false); + test_seal_munmap(true); + test_seal_munmap_two_vma(false); + test_seal_munmap_two_vma(true); + test_seal_munmap_vma_with_gap(false); + test_seal_munmap_vma_with_gap(true); + + test_munmap_start_freed(false); + test_munmap_start_freed(true); + test_munmap_middle_freed(false); + test_munmap_middle_freed(true); + test_munmap_end_freed(false); + test_munmap_end_freed(true); + + test_seal_mremap_shrink(false); + test_seal_mremap_shrink(true); + test_seal_mremap_expand(false); 
+ test_seal_mremap_expand(true); + test_seal_mremap_move(false); + test_seal_mremap_move(true); + + test_seal_mremap_shrink_fixed(false); + test_seal_mremap_shrink_fixed(true); + test_seal_mremap_expand_fixed(false); + test_seal_mremap_expand_fixed(true); + test_seal_mremap_move_fixed(false); + test_seal_mremap_move_fixed(true); + test_seal_mremap_move_dontunmap(false); + test_seal_mremap_move_dontunmap(true); + test_seal_mremap_move_fixed_zero(false); + test_seal_mremap_move_fixed_zero(true); + test_seal_mremap_move_dontunmap_anyaddr(false); + test_seal_mremap_move_dontunmap_anyaddr(true); + test_seal_discard_ro_anon(false); + test_seal_discard_ro_anon(true); + test_seal_discard_ro_anon_on_rw(false); + test_seal_discard_ro_anon_on_rw(true); + test_seal_discard_ro_anon_on_shared(false); + test_seal_discard_ro_anon_on_shared(true); + test_seal_discard_ro_anon_on_filebacked(false); + test_seal_discard_ro_anon_on_filebacked(true); + test_seal_mmap_overwrite_prot(false); + test_seal_mmap_overwrite_prot(true); + test_seal_mmap_expand(false); + test_seal_mmap_expand(true); + test_seal_mmap_shrink(false); + test_seal_mmap_shrink(true); + + test_seal_mmap_seal_base(); + test_seal_mmap_seal_mprotect(); + test_seal_mmap_seal_mseal(); + test_mmap_seal_discard_ro_anon(); + test_seal_merge_and_split(); + test_seal_mmap_merge(); + + test_not_sealable(); + test_merge_sealable(); + + if (pkey_supported()) { + test_seal_discard_ro_anon_on_pkey(false); + test_seal_discard_ro_anon_on_pkey(true); + } + + ksft_print_msg("Done\n"); + return 0; +} From patchwork Tue Dec 12 23:17:05 2023
From: jeffxu@chromium.org
To: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com,
 sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org,
 torvalds@linux-foundation.org
Cc: jeffxu@google.com, jorgelo@chromium.org, groeck@chromium.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 linux-mm@kvack.org, pedro.falcato@gmail.com, dave.hansen@intel.com,
 linux-hardening@vger.kernel.org, deraadt@openbsd.org, Jeff Xu
Subject: [RFC PATCH v3 11/11] mseal:add documentation
Date: Tue, 12 Dec 2023 23:17:05 +0000
Message-ID: <20231212231706.2680890-12-jeffxu@chromium.org>
In-Reply-To: <20231212231706.2680890-1-jeffxu@chromium.org>
References: <20231212231706.2680890-1-jeffxu@chromium.org>
MIME-Version: 1.0

From: Jeff Xu

Add documentation for mseal().
Signed-off-by: Jeff Xu
---
 Documentation/userspace-api/mseal.rst | 189 ++++++++++++++++++++++++++
 1 file changed, 189 insertions(+)
 create mode 100644 Documentation/userspace-api/mseal.rst

diff --git a/Documentation/userspace-api/mseal.rst b/Documentation/userspace-api/mseal.rst
new file mode 100644
index 000000000000..651c618d0664
--- /dev/null
+++ b/Documentation/userspace-api/mseal.rst
@@ -0,0 +1,189 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+Introduction of mseal
+=====================
+
+:Author: Jeff Xu
+
+Modern CPUs support memory permissions such as the RW and NX bits. The
+memory permission feature improves the security stance against memory
+corruption bugs: an attacker can't just write to arbitrary memory and point
+the code to it; the memory has to be marked with the X bit, or else an
+exception will occur.
+
+Memory sealing additionally protects the mapping itself against
+modifications. This is useful to mitigate memory corruption issues where a
+corrupted pointer is passed to a memory management system. For example,
+such an attacker primitive can break control-flow integrity guarantees,
+since read-only memory that is supposed to be trusted can become writable,
+or .text pages can get remapped. Memory sealing can automatically be
+applied by the runtime loader to seal .text and .rodata pages, and
+applications can additionally seal security-critical data at runtime.
+
+A similar feature already exists in the XNU kernel with the
+VM_FLAGS_PERMANENT flag [1] and on OpenBSD with the mimmutable syscall [2].
+
+User API
+========
+Two system calls are involved in virtual memory sealing: ``mseal()`` and ``mmap()``.
+
+``mseal()``
+-----------
+
+``mseal()`` is an architecture-independent syscall with the following
+signature:
+
+``int mseal(void *addr, size_t len, unsigned long types, unsigned long flags)``
+
+**addr/len**: virtual memory address range.
+
+The address range set by ``addr``/``len`` must meet:
+ - start (addr) must be in a valid VMA.
+ - end (addr + len) must be in a valid VMA.
+ - no gap (unallocated memory) between start and end.
+ - start (addr) must be page aligned.
+
+The ``len`` will be page aligned implicitly by the kernel.
+
+**types**: bit mask specifying the sealing types:
+
+- ``MM_SEAL_BASE``: Prevents the VMA from:
+
+  Being unmapped, moved to another location, or shrunk, via munmap()
+  and mremap(); these operations can leave an empty space that can
+  then be replaced with a VMA carrying a new set of attributes.
+
+  Having a different VMA moved or expanded into its location, via
+  mremap().
+
+  Being modified via mmap(MAP_FIXED).
+
+  Size expansion via mremap() does not appear to pose any specific
+  risk to sealed VMAs. It is included anyway because the use case is
+  unclear. In any case, users can rely on merging to expand a sealed
+  VMA.
+
+  We consider MM_SEAL_BASE the base feature, on which other sealing
+  features depend. For instance, it probably does not make sense to
+  seal PROT_PKEY without sealing the BASE, so the kernel implicitly
+  adds SEAL_BASE for SEAL_PROT_PKEY. (If applications want to relax
+  this in the future, the ``flags`` field of mseal() could be used to
+  override this behavior.)
+
+- ``MM_SEAL_PROT_PKEY``:
+
+  Seals PROT and PKEY of the address range; in other words,
+  mprotect() and pkey_mprotect() will be denied if the memory is
+  sealed with MM_SEAL_PROT_PKEY.
+
+- ``MM_SEAL_DISCARD_RO_ANON``:
+
+  Certain types of madvise() operations are destructive [3], such
+  as MADV_DONTNEED, which can effectively alter region contents by
+  discarding pages, especially when the memory is anonymous. This
+  seal blocks such operations for anonymous memory which is not
+  writable to the user.
+
+- ``MM_SEAL_SEAL``:
+
+  Denies adding a new seal.
+
+**flags**: reserved for future use.
+
+**return values**:
+
+- ``0``:
+  - Success.
+
+- ``-EINVAL``:
+  - Invalid seal type.
+  - Invalid input flags.
+  - Start address is not page aligned.
+  - Address range (``addr`` + ``len``) overflow.
+
+- ``-ENOMEM``:
+  - ``addr`` is not a valid address (not allocated).
+  - End address (``addr`` + ``len``) is not a valid address.
+  - A gap (unallocated memory) exists between start and end.
+
+- ``-EACCES``:
+  - ``MM_SEAL_SEAL`` is set; adding a new seal is not allowed.
+  - Address range is not sealable, e.g. ``MAP_SEALABLE`` was not
+    set during ``mmap()``.
+
+**Note**:
+
+- Users can call mseal(2) multiple times to add new seal types.
+- Adding an already-added seal type is a no-op (no error).
+- unseal(), i.e. removing a seal type, is not supported.
+- In case of an error return, one can expect the memory range to be
+  unchanged.
+
+``mmap()``
+----------
+``void *mmap(void *addr, size_t length, int prot, int flags, int fd,
+off_t offset);``
+
+We made two changes (``prot`` and ``flags``) to ``mmap()`` related to
+memory sealing.
+
+**prot**:
+
+- ``PROT_SEAL_SEAL``
+- ``PROT_SEAL_BASE``
+- ``PROT_SEAL_PROT_PKEY``
+- ``PROT_SEAL_DISCARD_RO_ANON``
+
+These allow ``mmap()`` to set the sealing type when creating a mapping.
+This is useful as an optimization because it avoids having to make two
+system calls: one for ``mmap()`` and one for ``mseal()``.
+
+It's worth noting that even though the sealing type is set via the
+``prot`` field in ``mmap()``, we don't require it to be set in the ``prot``
+field of a later ``mprotect()`` call. This is unlike the ``PROT_READ``,
+``PROT_WRITE`` and ``PROT_EXEC`` bits, e.g. if ``PROT_WRITE`` is not set in
+``mprotect()``, it means that the region is not writable.
+
+**flags**
+
+The ``MAP_SEALABLE`` flag is added to the ``flags`` field of ``mmap()``.
+When present, it marks the map as sealable. A map created
+without ``MAP_SEALABLE`` will not support sealing; in other words,
+``mseal()`` will fail for such a map.
+
+Applications that don't care about sealing can expect their
+behavior unchanged.
For those that need sealing support, opt in
+by adding ``MAP_SEALABLE`` when creating the map.
+
+Use Case:
+=========
+- glibc:
+  The dynamic linker, while loading ELF executables, can apply sealing
+  to non-writable memory segments.
+
+- Chrome browser: protect some security-sensitive data structures.
+
+Additional notes:
+=================
+As Jann Horn pointed out in [3], there are still a few ways to write
+to RO memory, which is, in a way, by design. Those are not covered by
+``mseal()``. If applications want to block such cases, a sandboxing
+mechanism (such as seccomp, LSM, etc.) might be considered.
+
+Those cases are:
+
+- Write to read-only memory through the ``/proc/self/mem`` interface.
+
+- Write to read-only memory through ``ptrace`` (such as ``PTRACE_POKETEXT``).
+
+- ``userfaultfd()``.
+
+The idea that inspired this patch comes from Stephen Röttger's work on V8
+CFI [4]. The Chrome browser in ChromeOS will be the first user of this API.
+
+Reference:
+==========
+[1] https://github.com/apple-oss-distributions/xnu/blob/1031c584a5e37aff177559b9f69dbd3c8c3fd30a/osfmk/mach/vm_statistics.h#L274
+
+[2] https://man.openbsd.org/mimmutable.2
+
+[3] https://lore.kernel.org/lkml/CAG48ez3ShUYey+ZAFsU2i1RpQn0a5eOs2hzQ426FkcgnfUGLvA@mail.gmail.com
+
+[4] https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyXgeaRHo/edit#heading=h.bvaojj9fu6hc