From patchwork Fri Jun 21 20:14:59 2024
X-Patchwork-Submitter: "Kagan, Roman"
X-Patchwork-Id: 13708036
From: Roman Kagan
CC: Shuah Khan, Dragan Cvetic, Fares Mehanna, Alexander Graf, Derek Kiernan, Greg Kroah-Hartman, David Woodhouse, Andrew Morton, Arnd Bergmann
Subject: [PATCH RFC 1/3] mseal: expose interface to seal / unseal user memory ranges
Date: Fri, 21 Jun 2024 22:14:59 +0200
Message-ID: <20240621201501.1059948-2-rkagan@amazon.de>
In-Reply-To: <20240621201501.1059948-1-rkagan@amazon.de>
References: <20240621201501.1059948-1-rkagan@amazon.de>

From: Fares Mehanna

To make sure the kernel mm-local mapping is not touched by userspace,
seal the VMA before changing its protection for use by the kernel. This
guarantees that userspace can't unmap or alter the VMA while the kernel
is using it.

After the kernel is done with the secret memory, it unseals the VMA so
that it can be unmapped and freed.

The unseal operation is not exposed to userspace.

Signed-off-by: Fares Mehanna
Signed-off-by: Roman Kagan
---
 mm/internal.h |  7 +++++
 mm/mseal.c    | 81 ++++++++++++++++++++++++++++++++-------------------
 2 files changed, 58 insertions(+), 30 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index b2c75b12014e..5278989610f5 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1453,6 +1453,8 @@ bool can_modify_mm(struct mm_struct *mm, unsigned long start,
 		unsigned long end);
 bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start,
 		unsigned long end, int behavior);
+/* mm's mmap write lock must be taken before seal/unseal operation */
+int do_mseal(unsigned long start, unsigned long end, bool seal);
 #else
 static inline int can_do_mseal(unsigned long flags)
 {
@@ -1470,6 +1472,11 @@ static inline bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start,
 {
 	return true;
 }
+
+static inline int do_mseal(unsigned long start, unsigned long end, bool seal)
+{
+	return -EINVAL;
+}
 #endif
 
 #ifdef CONFIG_SHRINKER_DEBUG
diff --git a/mm/mseal.c b/mm/mseal.c
index bf783bba8ed0..331745ac7064 100644
--- a/mm/mseal.c
+++ b/mm/mseal.c
@@ -26,6 +26,11 @@ static inline void set_vma_sealed(struct vm_area_struct *vma)
 	vm_flags_set(vma, VM_SEALED);
 }
 
+static inline void clear_vma_sealed(struct vm_area_struct *vma)
+{
+	vm_flags_clear(vma, VM_SEALED);
+}
+
 /*
  * check if a vma is sealed for modification.
  * return true, if modification is allowed.
@@ -109,7 +114,7 @@ bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start, unsigned long
 
 static int mseal_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
 		struct vm_area_struct **prev, unsigned long start,
-		unsigned long end, vm_flags_t newflags)
+		unsigned long end, vm_flags_t newflags, bool seal)
 {
 	int ret = 0;
 	vm_flags_t oldflags = vma->vm_flags;
@@ -123,7 +128,10 @@ static int mseal_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
 		goto out;
 	}
 
-	set_vma_sealed(vma);
+	if (seal)
+		set_vma_sealed(vma);
+	else
+		clear_vma_sealed(vma);
 out:
 	*prev = vma;
 	return ret;
@@ -159,9 +167,9 @@ static int check_mm_seal(unsigned long start, unsigned long end)
 }
 
 /*
- * Apply sealing.
+ * Apply sealing / unsealing.
  */
-static int apply_mm_seal(unsigned long start, unsigned long end)
+static int apply_mm_seal(unsigned long start, unsigned long end, bool seal)
 {
 	unsigned long nstart;
 	struct vm_area_struct *vma, *prev;
@@ -183,11 +191,14 @@ static int apply_mm_seal(unsigned long start, unsigned long end)
 		unsigned long tmp;
 		vm_flags_t newflags;
 
-		newflags = vma->vm_flags | VM_SEALED;
+		if (seal)
+			newflags = vma->vm_flags | VM_SEALED;
+		else
+			newflags = vma->vm_flags & ~(VM_SEALED);
 		tmp = vma->vm_end;
 		if (tmp > end)
 			tmp = end;
-		error = mseal_fixup(&vmi, vma, &prev, nstart, tmp, newflags);
+		error = mseal_fixup(&vmi, vma, &prev, nstart, tmp, newflags, seal);
 		if (error)
 			return error;
 		nstart = vma_iter_end(&vmi);
@@ -196,6 +207,37 @@ static int apply_mm_seal(unsigned long start, unsigned long end)
 	return 0;
 }
 
+int do_mseal(unsigned long start, unsigned long end, bool seal)
+{
+	int ret;
+
+	if (end < start)
+		return -EINVAL;
+
+	if (end == start)
+		return 0;
+
+	/*
+	 * First pass, this helps to avoid
+	 * partial sealing in case of error in input address range,
+	 * e.g. ENOMEM error.
+	 */
+	ret = check_mm_seal(start, end);
+	if (ret)
+		goto out;
+
+	/*
+	 * Second pass, this should success, unless there are errors
+	 * from vma_modify_flags, e.g. merge/split error, or process
+	 * reaching the max supported VMAs, however, those cases shall
+	 * be rare.
+	 */
+	ret = apply_mm_seal(start, end, seal);
+
+out:
+	return ret;
+}
+
 /*
  * mseal(2) seals the VM's meta data from
  * selected syscalls.
@@ -248,7 +290,7 @@ static int apply_mm_seal(unsigned long start, unsigned long end)
  *
  *  unseal() is not supported.
  */
-static int do_mseal(unsigned long start, size_t len_in, unsigned long flags)
+static int __do_mseal(unsigned long start, size_t len_in, unsigned long flags)
 {
 	size_t len;
 	int ret = 0;
@@ -269,33 +311,12 @@ static int do_mseal(unsigned long start, size_t len_in, unsigned long flags)
 		return -EINVAL;
 
 	end = start + len;
-	if (end < start)
-		return -EINVAL;
-
-	if (end == start)
-		return 0;
 
 	if (mmap_write_lock_killable(mm))
 		return -EINTR;
 
-	/*
-	 * First pass, this helps to avoid
-	 * partial sealing in case of error in input address range,
-	 * e.g. ENOMEM error.
-	 */
-	ret = check_mm_seal(start, end);
-	if (ret)
-		goto out;
+	ret = do_mseal(start, end, true);
 
-	/*
-	 * Second pass, this should success, unless there are errors
-	 * from vma_modify_flags, e.g. merge/split error, or process
-	 * reaching the max supported VMAs, however, those cases shall
-	 * be rare.
-	 */
-	ret = apply_mm_seal(start, end);
-
-out:
 	mmap_write_unlock(current->mm);
 	return ret;
 }
@@ -303,5 +324,5 @@ static int do_mseal(unsigned long start, size_t len_in, unsigned long flags)
 SYSCALL_DEFINE3(mseal, unsigned long, start, size_t, len, unsigned long,
 		flags)
 {
-	return do_mseal(start, len, flags);
+	return __do_mseal(start, len, flags);
 }
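
For readers following the control flow rather than applying the patch, here is a minimal userspace sketch of the two-pass pattern that do_mseal() uses: validate the whole range first, then set or clear the seal flag. It is not kernel code. Every name prefixed toy_ or TOY_ is hypothetical and exists only in this sketch. The sorted array stands in for the mm's VMA tree, no locking is modeled, and, unlike the kernel, the sketch never splits a VMA: an entry that overlaps the range is flagged as a whole. Returning -ENOMEM for a gap is an assumption taken from the ENOMEM case named in the patch's first-pass comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_VM_SEALED 0x1UL	/* hypothetical stand-in for VM_SEALED */

/* Hypothetical, simplified stand-in for struct vm_area_struct. */
struct toy_vma {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	unsigned long flags;
};

/*
 * First pass: succeed only if [start, end) is covered without gaps.
 * Assumes vmas[] is sorted by address and entries do not overlap.
 */
static int toy_check_range(const struct toy_vma *vmas, size_t n,
			   unsigned long start, unsigned long end)
{
	unsigned long next = start;

	for (size_t i = 0; i < n && next < end; i++) {
		if (vmas[i].end <= next)
			continue;	/* entirely below the cursor */
		if (vmas[i].start > next)
			return -ENOMEM;	/* hole in the requested range */
		next = vmas[i].end;
	}
	return next >= end ? 0 : -ENOMEM;
}

/* Second pass: set or clear the flag on every overlapping entry. */
static void toy_apply(struct toy_vma *vmas, size_t n,
		      unsigned long start, unsigned long end, bool seal)
{
	assert(start < end);	/* the caller has already validated the range */

	for (size_t i = 0; i < n; i++) {
		if (vmas[i].end <= start || vmas[i].start >= end)
			continue;
		if (seal)
			vmas[i].flags |= TOY_VM_SEALED;
		else
			vmas[i].flags &= ~TOY_VM_SEALED;
	}
}

/* Same shape as do_mseal(start, end, seal) in the patch: check, then apply. */
int toy_do_mseal(struct toy_vma *vmas, size_t n,
		 unsigned long start, unsigned long end, bool seal)
{
	int ret;

	if (end < start)
		return -EINVAL;
	if (end == start)
		return 0;

	ret = toy_check_range(vmas, n, start, end);
	if (ret)
		return ret;	/* nothing has been modified yet */

	toy_apply(vmas, n, start, end, seal);
	return 0;
}
```

With three entries covering [0x1000, 0x2000), [0x2000, 0x3000) and [0x4000, 0x5000), sealing 0x1000..0x3000 succeeds, while a later unseal of 0x2000..0x5000 fails in the first pass because of the hole at 0x3000 and leaves the existing seal in place. That is the "no partial change" property the first pass is there to provide.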