From patchwork Thu Aug 11 20:13:40 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12941711
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Mike Rapoport, Andrew Morton, Mike Kravetz,
 peterx@redhat.com, Andrea Arcangeli, Nadav Amit, Axel Rasmussen
Subject: [PATCH] mm/uffd: Reset write protection when unregister with wp-mode
Date: Thu, 11 Aug 2022 16:13:40 -0400
Message-Id: <20220811201340.39342-1-peterx@redhat.com>
X-Mailer: git-send-email 2.32.0

This patch is motivated by a recent report and fix from David
Hildenbrand on hugetlb shared handling of write-protected pages [1].

With the reproducer provided in the commit message of [1], one can
leverage the uffd-wp lazy reset of ptes to trigger a hugetlb issue that
affects not only the attacker process but the whole system.

The lazy-reset mechanism of uffd-wp was used to make unregister faster.
It relies on the assumption that any leftover pgtable entries only
affect the process itself: the user should be aware of what it does, and
nothing outside the process should be affected.

That assumption turns out not to hold, and the leftover entries can
also be used to make some exploits easier.

So far there is no evidence that the lazy reset matters to any
userfaultfd user, because normally unregister happens only once for a
given range of memory during the lifetime of the process.

Considering all of the above, this patch proposes doing explicit pte
resets when unregistering a uffd region with write-protect mode enabled.

For the user, it should be the same as calling
ioctl(UFFDIO_WRITEPROTECT, wp=false) right before
ioctl(UFFDIO_UNREGISTER). So it may make unregister slower. From that
point of view it is a very slight ABI change, but hopefully nothing
breaks with it.
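
To illustrate the userspace equivalence described above, here is a
minimal sketch of the two-ioctl sequence. It is not part of the patch:
the helper name uffd_unprotect_and_unregister() is hypothetical, and it
assumes that uffd is an open userfaultfd descriptor and that the range
was registered earlier with UFFDIO_REGISTER_MODE_WP.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/*
 * Hypothetical helper, for illustration only: clear uffd-wp protection
 * on [addr, addr + len) and then unregister the same range.
 * Returns 0 on success, -1 with errno set by the failing ioctl.
 */
int uffd_unprotect_and_unregister(int uffd, void *addr, size_t len)
{
	struct uffdio_writeprotect wp = {
		.range = { .start = (unsigned long)addr, .len = len },
		/* UFFDIO_WRITEPROTECT_MODE_WP not set: remove protection */
		.mode = 0,
	};
	struct uffdio_range range = {
		.start = (unsigned long)addr,
		.len = len,
	};

	/* Step 1: the "wp=false" write-unprotect over the whole range */
	if (ioctl(uffd, UFFDIO_WRITEPROTECT, &wp) == -1)
		return -1;

	/* Step 2: unregister the range from the userfaultfd context */
	return ioctl(uffd, UFFDIO_UNREGISTER, &range);
}
```

With this patch applied, step 1 should become redundant for
write-protect registrations, since unregister performs the reset itself.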

Regarding the change itself: the core of the uffd write [un]protect
operation is moved into a separate function (uffd_wp_range()), which is
reused in the unregister code path.

Note that the new function does not check anything, e.g. ranges or
memory types, because those were already checked during the previous
UFFDIO_REGISTER, which would otherwise have failed. It also doesn't
check mmap_changing because we hold the mmap write lock anyway.

I added a Fixes tag pointing at the commit that introduced uffd-wp for
shmem+hugetlbfs, because that is the only issue reported so far and it
is the commit from which David's reproducer starts working (v5.19+).
But the whole idea applies not only to file memory but also to
anonymous memory. It's just that we don't need to fix anonymous memory
prior to v5.19 because there's no known way to exploit it.

IOW, this patch can also fix the issue reported in [1], as patch 2
does.

[1] https://lore.kernel.org/all/20220811103435.188481-3-david@redhat.com/

Fixes: b1f9e876862d ("mm/uffd: enable write protection for shmem & hugetlbfs")
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c              |  4 ++++
 include/linux/userfaultfd_k.h |  2 ++
 mm/userfaultfd.c              | 29 ++++++++++++++++++-----------
 3 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 698e768d5c3d..941ede31a9da 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1597,6 +1597,10 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 			wake_userfault(vma->vm_userfaultfd_ctx.ctx, &range);
 		}
 
+		/* Reset ptes for the whole vma range if wr-protected */
+		if (userfaultfd_wp(vma))
+			uffd_wp_range(mm, vma, start, vma_end - start, false);
+
 		new_flags = vma->vm_flags & ~__VM_UFFD_FLAGS;
 		prev = vma_merge(mm, prev, start, vma_end, new_flags,
 				 vma->anon_vma, vma->vm_file, vma->vm_pgoff,
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 732b522bacb7..e1b8a915e9e9 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -73,6 +73,8 @@ extern ssize_t mcopy_continue(struct mm_struct *dst_mm, unsigned long dst_start,
 extern int mwriteprotect_range(struct mm_struct *dst_mm,
 			       unsigned long start, unsigned long len,
 			       bool enable_wp, atomic_t *mmap_changing);
+extern void uffd_wp_range(struct mm_struct *dst_mm, struct vm_area_struct *vma,
+			  unsigned long start, unsigned long len, bool enable_wp);
 
 /* mm helpers */
 static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 07d3befc80e4..7327b2573f7c 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -703,14 +703,29 @@ ssize_t mcopy_continue(struct mm_struct *dst_mm, unsigned long start,
 		       mmap_changing, 0);
 }
 
+void uffd_wp_range(struct mm_struct *dst_mm, struct vm_area_struct *dst_vma,
+		   unsigned long start, unsigned long len, bool enable_wp)
+{
+	struct mmu_gather tlb;
+	pgprot_t newprot;
+
+	if (enable_wp)
+		newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE));
+	else
+		newprot = vm_get_page_prot(dst_vma->vm_flags);
+
+	tlb_gather_mmu(&tlb, dst_mm);
+	change_protection(&tlb, dst_vma, start, start + len, newprot,
+			  enable_wp ? MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE);
+	tlb_finish_mmu(&tlb);
+}
+
 int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
 			unsigned long len, bool enable_wp,
 			atomic_t *mmap_changing)
 {
 	struct vm_area_struct *dst_vma;
 	unsigned long page_mask;
-	struct mmu_gather tlb;
-	pgprot_t newprot;
 	int err;
 
 	/*
@@ -750,15 +765,7 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
 		goto out_unlock;
 	}
 
-	if (enable_wp)
-		newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE));
-	else
-		newprot = vm_get_page_prot(dst_vma->vm_flags);
-
-	tlb_gather_mmu(&tlb, dst_mm);
-	change_protection(&tlb, dst_vma, start, start + len, newprot,
-			  enable_wp ? MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE);
-	tlb_finish_mmu(&tlb);
+	uffd_wp_range(dst_mm, dst_vma, start, len, enable_wp);
 
 	err = 0;
 out_unlock: