From patchwork Tue Apr 27 16:12:54 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12226867
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Miaohe Lin, Mike Rapoport, Andrea Arcangeli, Hugh Dickins, peterx@redhat.com, Jerome Glisse, Mike Kravetz, Jason Gunthorpe, Matthew Wilcox, Andrew Morton, Axel Rasmussen, "Kirill A . Shutemov"
Subject: [PATCH v2 01/24] shmem/userfaultfd: Take care of UFFDIO_COPY_MODE_WP
Date: Tue, 27 Apr 2021 12:12:54 -0400
Message-Id: <20210427161317.50682-2-peterx@redhat.com>
In-Reply-To: <20210427161317.50682-1-peterx@redhat.com>
References: <20210427161317.50682-1-peterx@redhat.com>

Firstly, pass wp_copy into shmem_mfill_atomic_pte() through the stack.
Then apply the UFFD_WP bit properly when UFFDIO_COPY on shmem is issued
with UFFDIO_COPY_MODE_WP.

One thing to mention is that shmem_mfill_atomic_pte() needs to set the
dirty bit in the pte even if UFFDIO_COPY_MODE_WP is set.  The reason is
similar to dcf7fe9d8976 ("userfaultfd: shmem: UFFDIO_COPY: set the page
dirty if VM_WRITE is not set"), where we need to set the page as dirty
even if VM_WRITE is not there.  It's just that shmem can drop the pte at
any time later, and if it's not dirty the data will be dropped.  For
uffd-wp, that could lead to data loss if the dirty bit is not set.

Note that shmem_mfill_zeropage_pte() will always call
shmem_mfill_atomic_pte() with wp_copy==false because UFFDIO_ZEROPAGE does
not support UFFDIO_COPY_MODE_WP.

Signed-off-by: Peter Xu
---
 include/linux/shmem_fs.h |  5 +++--
 mm/shmem.c               | 27 ++++++++++++++++++++-------
 mm/userfaultfd.c         |  2 +-
 3 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index d82b6f3965885..a21eb25183d02 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -127,14 +127,15 @@ extern int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
 				  struct vm_area_struct *dst_vma,
 				  unsigned long dst_addr,
 				  unsigned long src_addr,
-				  struct page **pagep);
+				  struct page **pagep,
+				  bool wp_copy);
 extern int shmem_mfill_zeropage_pte(struct mm_struct *dst_mm,
 				    pmd_t *dst_pmd,
 				    struct vm_area_struct *dst_vma,
 				    unsigned long dst_addr);
 #else
 #define shmem_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma, dst_addr, \
-			       src_addr, pagep) ({ BUG(); 0; })
+			       src_addr, pagep, wp_copy) ({ BUG(); 0; })
 #define shmem_mfill_zeropage_pte(dst_mm, dst_pmd, dst_vma, \
 				 dst_addr) ({ BUG(); 0; })
 #endif
diff --git a/mm/shmem.c b/mm/shmem.c
index 26c76b13ad233..8fbf7680f044c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2360,7 +2360,8 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 				  unsigned long dst_addr,
 				  unsigned long src_addr,
 				  bool zeropage,
-				  struct page **pagep)
+				  struct page **pagep,
+				  bool wp_copy)
 {
 	struct inode *inode = file_inode(dst_vma->vm_file);
 	struct shmem_inode_info *info = SHMEM_I(inode);
@@ -2422,9 +2423,18 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 		goto out_release;
 
 	_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
-	if (dst_vma->vm_flags & VM_WRITE)
-		_dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
-	else {
+	if (dst_vma->vm_flags & VM_WRITE) {
+		if (wp_copy)
+			_dst_pte = pte_mkuffd_wp(pte_wrprotect(_dst_pte));
+		else
+			_dst_pte = pte_mkwrite(_dst_pte);
+		/*
+		 * Similar reason to set_page_dirty(), that we need to mark the
+		 * pte dirty even if wp_copy==true here, otherwise the pte and
+		 * its page could be dropped at anytime when e.g. swapped out.
+		 */
+		_dst_pte = pte_mkdirty(_dst_pte);
+	} else {
 		/*
 		 * We don't set the pte dirty if the vma has no
 		 * VM_WRITE permission, so mark the page dirty or it
@@ -2482,10 +2492,12 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm,
 			   struct vm_area_struct *dst_vma,
 			   unsigned long dst_addr,
 			   unsigned long src_addr,
-			   struct page **pagep)
+			   struct page **pagep,
+			   bool wp_copy)
 {
 	return shmem_mfill_atomic_pte(dst_mm, dst_pmd, dst_vma,
-				      dst_addr, src_addr, false, pagep);
+				      dst_addr, src_addr, false, pagep,
+				      wp_copy);
 }
 
 int shmem_mfill_zeropage_pte(struct mm_struct *dst_mm,
@@ -2496,7 +2508,8 @@ int shmem_mfill_zeropage_pte(struct mm_struct *dst_mm,
 	struct page *page = NULL;
 
 	return shmem_mfill_atomic_pte(dst_mm, dst_pmd, dst_vma,
-				      dst_addr, 0, true, &page);
+				      dst_addr, 0, true, &page,
+				      false);
 }
 
 #ifdef CONFIG_TMPFS
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index e14b3820c6a81..7adaebe222b8e 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -443,7 +443,7 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
 		if (!zeropage)
 			err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd,
 						     dst_vma, dst_addr,
-						     src_addr, page);
+						     src_addr, page, wp_copy);
 		else
 			err = shmem_mfill_zeropage_pte(dst_mm, dst_pmd,
 						       dst_vma, dst_addr);

From patchwork Tue Apr 27 16:12:55 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12226869
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Miaohe Lin, Mike Rapoport, Andrea Arcangeli, Hugh Dickins, peterx@redhat.com, Jerome Glisse, Mike Kravetz, Jason Gunthorpe, Matthew Wilcox, Andrew Morton, Axel Rasmussen, "Kirill A . Shutemov"
Subject: [PATCH v2 02/24] mm: Clear vmf->pte after pte_unmap_same() returns
Date: Tue, 27 Apr 2021 12:12:55 -0400
Message-Id: <20210427161317.50682-3-peterx@redhat.com>
In-Reply-To: <20210427161317.50682-1-peterx@redhat.com>
References: <20210427161317.50682-1-peterx@redhat.com>

pte_unmap_same() will always unmap the pte pointer.  After the unmap,
vmf->pte will not be valid any more.
We should clear it.

It was safe only because no one accesses vmf->pte after pte_unmap_same()
returns, since the only caller of pte_unmap_same() (so far) is
do_swap_page(), where vmf->pte will in most cases be overwritten very
soon.

pte_unmap_same() will be used in other places in follow-up patches, where
vmf->pte will not always be rewritten.  This patch enables us to call
functions like finish_fault(), because that will conditionally unmap the
pte by checking vmf->pte first.  Likewise, alloc_set_pte() will make sure
to allocate a new pte even after pte_unmap_same() has been called.

Since we'll need to modify vmf->pte, pass vmf directly into
pte_unmap_same(), which also avoids the long parameter list.

Reviewed-by: Miaohe Lin
Signed-off-by: Peter Xu
---
 mm/memory.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index ffda19542bc6d..955a0bb6b855c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2618,19 +2618,20 @@ EXPORT_SYMBOL_GPL(apply_to_existing_page_range);
  * proceeding (but do_wp_page is only called after already making such a check;
  * and do_anonymous_page can safely check later on).
  */
-static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
-				pte_t *page_table, pte_t orig_pte)
+static inline int pte_unmap_same(struct vm_fault *vmf)
 {
 	int same = 1;
 #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPTION)
 	if (sizeof(pte_t) > sizeof(unsigned long)) {
-		spinlock_t *ptl = pte_lockptr(mm, pmd);
+		spinlock_t *ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
 		spin_lock(ptl);
-		same = pte_same(*page_table, orig_pte);
+		same = pte_same(*vmf->pte, vmf->orig_pte);
 		spin_unlock(ptl);
 	}
 #endif
-	pte_unmap(page_table);
+	pte_unmap(vmf->pte);
+	/* After unmap of pte, the pointer is invalid now - clear it. */
+	vmf->pte = NULL;
 	return same;
 }
 
@@ -3319,7 +3320,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	vm_fault_t ret = 0;
 	void *shadow = NULL;
 
-	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
+	if (!pte_unmap_same(vmf))
 		goto out;
 
 	entry = pte_to_swp_entry(vmf->orig_pte);

From patchwork Tue Apr 27 16:12:56 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12226873
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Miaohe Lin, Mike Rapoport, Andrea Arcangeli, Hugh Dickins, peterx@redhat.com, Jerome Glisse, Mike Kravetz, Jason Gunthorpe, Matthew Wilcox, Andrew Morton, Axel Rasmussen, "Kirill A . Shutemov"
Subject: [PATCH v2 03/24] mm/userfaultfd: Introduce special pte for unmapped file-backed mem
Date: Tue, 27 Apr 2021 12:12:56 -0400
Message-Id: <20210427161317.50682-4-peterx@redhat.com>
In-Reply-To: <20210427161317.50682-1-peterx@redhat.com>
References: <20210427161317.50682-1-peterx@redhat.com>

This patch introduces a very special swap-like pte for file-backed
memories.

Currently it's defined for x86_64 only, but any arch that can properly
define the UFFD_WP_SWP_PTE_SPECIAL value as requested should conceptually
work too.

We will use this special pte to arm the ptes that got either unmapped or
swapped out for a file-backed region that was previously wr-protected.
This special pte can trigger a page fault just like swap entries do, as
long as the pte satisfies pte_none()==false && pte_present()==false.
Then we can revive the special pte into a normal pte backed by the page
cache.

This idea is greatly inspired by Hugh and Andrea in the discussion, which
is referenced in the links below.

The other idea (from Hugh) is that we use swp_type==1 and swp_offset==0
as the special pte.
The current solution (as pointed out by Andrea) is slightly preferred in
that we don't need any swp_entry_t knowledge at all to trap these
accesses.  Meanwhile, we also reuse _PAGE_SWP_UFFD_WP from the anonymous
swp entries.

This patch only introduces the special pte and its operators.  Nothing
uses them yet, so there is no functional change.

Link: https://lore.kernel.org/lkml/20201126222359.8120-1-peterx@redhat.com/
Link: https://lore.kernel.org/lkml/20201130230603.46187-1-peterx@redhat.com/
Suggested-by: Andrea Arcangeli
Suggested-by: Hugh Dickins
Signed-off-by: Peter Xu
---
 arch/x86/include/asm/pgtable.h     | 28 ++++++++++++++++++++++++++++
 include/asm-generic/pgtable_uffd.h |  3 +++
 include/linux/userfaultfd_k.h      | 21 +++++++++++++++++++++
 3 files changed, 52 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index a02c67291cfcb..379bae343dd16 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1329,6 +1329,34 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
 #endif
 
 #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+
+/*
+ * This is a very special swap-like pte that marks this pte as "wr-protected"
+ * by userfaultfd-wp.  It should only exist for file-backed memory where the
+ * page (previously got wr-protected) has been unmapped or swapped out.
+ *
+ * For anonymous memories, the userfaultfd-wp _PAGE_SWP_UFFD_WP bit is kept
+ * along with a real swp entry instead.
+ *
+ * Let's make some rules for this special pte:
+ *
+ * (1) pte_none()==false, so that it'll not trigger a missing page fault.
+ *
+ * (2) pte_present()==false, so that it's recognized as swap (is_swap_pte).
+ *
+ * (3) pte_swp_uffd_wp()==true, so it can be tested just like a swap pte that
+ *     contains a valid swap entry, so that we can check a swap pte always
+ *     using "is_swap_pte() && pte_swp_uffd_wp()" without caring about whether
+ *     there's one swap entry inside of the pte.
+ *
+ * (4) It should not be a valid swap pte anywhere, so that when we see this pte
+ *     we know it does not contain a swap entry.
+ *
+ * For x86, the simplest special pte which satisfies all of above should be the
+ * pte with only _PAGE_SWP_UFFD_WP bit set (where swp_type==swp_offset==0).
+ */
+#define UFFD_WP_SWP_PTE_SPECIAL __pte(_PAGE_SWP_UFFD_WP)
+
 static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
 {
 	return pte_set_flags(pte, _PAGE_SWP_UFFD_WP);
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
index 828966d4c2811..95e9811ce9d1f 100644
--- a/include/asm-generic/pgtable_uffd.h
+++ b/include/asm-generic/pgtable_uffd.h
@@ -2,6 +2,9 @@
 #define _ASM_GENERIC_PGTABLE_UFFD_H
 
 #ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+
+#define UFFD_WP_SWP_PTE_SPECIAL __pte(0)
+
 static __always_inline int pte_uffd_wp(pte_t pte)
 {
 	return 0;
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 794d1538b8bac..bc733512c6905 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -140,6 +140,17 @@ extern int userfaultfd_unmap_prep(struct vm_area_struct *vma,
 extern void userfaultfd_unmap_complete(struct mm_struct *mm,
 				       struct list_head *uf);
 
+static inline pte_t pte_swp_mkuffd_wp_special(struct vm_area_struct *vma)
+{
+	WARN_ON_ONCE(vma_is_anonymous(vma));
+	return UFFD_WP_SWP_PTE_SPECIAL;
+}
+
+static inline bool pte_swp_uffd_wp_special(pte_t pte)
+{
+	return pte_same(pte, UFFD_WP_SWP_PTE_SPECIAL);
+}
+
 #else /* CONFIG_USERFAULTFD */
 
 /* mm helpers */
@@ -229,6 +240,16 @@ static inline void userfaultfd_unmap_complete(struct mm_struct *mm,
 {
 }
 
+static inline pte_t pte_swp_mkuffd_wp_special(struct vm_area_struct *vma)
+{
+	return __pte(0);
+}
+
+static inline bool pte_swp_uffd_wp_special(pte_t pte)
+{
+	return false;
+}
+
 #endif /* CONFIG_USERFAULTFD */
 
 #endif /* _LINUX_USERFAULTFD_K_H */

From patchwork Tue Apr 27 16:12:57 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12226871
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Miaohe Lin, Mike Rapoport, Andrea Arcangeli, Hugh Dickins, peterx@redhat.com, Jerome Glisse, Mike Kravetz, Jason Gunthorpe, Matthew Wilcox, Andrew Morton, Axel Rasmussen, "Kirill A . Shutemov"
Subject: [PATCH v2 04/24] mm/swap: Introduce the idea of special swap ptes
Date: Tue, 27 Apr 2021 12:12:57 -0400
Message-Id: <20210427161317.50682-5-peterx@redhat.com>
In-Reply-To: <20210427161317.50682-1-peterx@redhat.com>
References: <20210427161317.50682-1-peterx@redhat.com>

We used to have special swap entries, like migration entries, hw-poison
entries, device private entries, etc.
Those "special swap entries" must first be swap entries, and their types are decided by swp_type(entry).

This patch introduces another idea called "special swap ptes". It is very easy to confuse these with "special swap entries", but a special swap pte should never contain a swap entry at all. That means it is illegal to call pte_to_swp_entry() upon a special swap pte.

Make the uffd-wp special pte the first special swap pte.

Before this patch, is_swap_pte()==true means one of the below:

  (a.1) The pte has a normal swap entry (non_swap_entry()==false). For example, when an anonymous page got swapped out.

  (a.2) The pte has a special swap entry (non_swap_entry()==true). For example, a migration entry, a hw-poison entry, etc.

After this patch, is_swap_pte()==true means one of the below, where case (b) is added:

  (a) The pte contains a swap entry.

    (a.1) The pte has a normal swap entry (non_swap_entry()==false). For example, when an anonymous page got swapped out.

    (a.2) The pte has a special swap entry (non_swap_entry()==true). For example, a migration entry, a hw-poison entry, etc.

  (b) The pte does not contain a swap entry at all (so it cannot be passed into pte_to_swp_entry()). For example, the uffd-wp special swap pte.

Teach the whole mm core about this new idea. It's done by introducing another helper called pte_has_swap_entry(), which stands for cases (a.1) and (a.2). Before this patch, it would be the same as is_swap_pte() because there was no special swap pte yet. Now for most of the previous uses of is_swap_pte() in mm core, we'll need to use the new helper pte_has_swap_entry() instead, to make sure we won't try to parse a swap entry from a special swap pte (which does not contain a swap entry at all!). We either handle the special swap pte, or it'll naturally use the default "else" paths.
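For readers following along, the classification described here can be sketched as a small userspace model. This is only an illustration under stated assumptions: the toy_* names, the toy_pte_t type and the bit layout are hypothetical and do not match any real architecture or the helpers in this series; only the relationship between the predicates follows the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: the encoding below is invented for illustration. */
typedef struct { uint64_t val; } toy_pte_t;

#define TOY_PTE_PRESENT          (1ULL << 0) /* hypothetical "present" bit */
#define TOY_PTE_UFFD_WP_SPECIAL  (1ULL << 1) /* hypothetical marker value, carries no swap entry */
#define TOY_SWP_ENTRY_SHIFT      2           /* hypothetical position of the swap entry */

static inline bool toy_pte_none(toy_pte_t pte)
{
	return pte.val == 0;
}

static inline bool toy_pte_present(toy_pte_t pte)
{
	return (pte.val & TOY_PTE_PRESENT) != 0;
}

/* True for cases (a.1), (a.2) and (b): anything neither none nor present. */
static inline bool toy_is_swap_pte(toy_pte_t pte)
{
	return !toy_pte_none(pte) && !toy_pte_present(pte);
}

/* Case (b): a marker value that does not encode a swap entry. */
static inline bool toy_is_swap_special_pte(toy_pte_t pte)
{
	return pte.val == TOY_PTE_UFFD_WP_SPECIAL;
}

/* Cases (a.1) and (a.2) only. */
static inline bool toy_pte_has_swap_entry(toy_pte_t pte)
{
	return toy_is_swap_pte(pte) && !toy_is_swap_special_pte(pte);
}

/* Build a swap pte; entry must be non-zero or the result is a none pte. */
static inline toy_pte_t toy_swp_entry_to_pte(uint64_t entry)
{
	toy_pte_t pte = { .val = entry << TOY_SWP_ENTRY_SHIFT };
	return pte;
}

/* Decoding is only legal when the pte really holds a swap entry. */
static inline uint64_t toy_pte_to_swp_entry(toy_pte_t pte)
{
	assert(toy_pte_has_swap_entry(pte));
	return pte.val >> TOY_SWP_ENTRY_SHIFT;
}
```

In this model a caller that previously tested toy_is_swap_pte() before decoding would now test toy_pte_has_swap_entry(), so that a special pte falls through to whatever "else" path the caller already has.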
Warn properly (e.g., in do_swap_page()) when we see a special swap pte - we should never call do_swap_page() upon those ptes, but just to bail out early if it happens. Signed-off-by: Peter Xu --- arch/arm64/kernel/mte.c | 2 +- fs/proc/task_mmu.c | 14 ++++++++------ include/linux/swapops.h | 39 ++++++++++++++++++++++++++++++++++++++- mm/gup.c | 2 +- mm/hmm.c | 2 +- mm/khugepaged.c | 11 ++++++++++- mm/madvise.c | 4 ++-- mm/memcontrol.c | 2 +- mm/memory.c | 7 +++++++ mm/migrate.c | 4 ++-- mm/mincore.c | 2 +- mm/mprotect.c | 2 +- mm/mremap.c | 2 +- mm/page_vma_mapped.c | 6 +++--- mm/swapfile.c | 2 +- 15 files changed, 78 insertions(+), 23 deletions(-) diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c index b3c70a612c7a9..ebe213cba9136 100644 --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -30,7 +30,7 @@ static void mte_sync_page_tags(struct page *page, pte_t *ptep, bool check_swap) { pte_t old_pte = READ_ONCE(*ptep); - if (check_swap && is_swap_pte(old_pte)) { + if (check_swap && pte_has_swap_entry(old_pte)) { swp_entry_t entry = pte_to_swp_entry(old_pte); if (!non_swap_entry(entry) && mte_restore_tags(entry, page)) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index fc9784544b241..4c95cc57a66a8 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -498,7 +498,7 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr, if (pte_present(*pte)) { page = vm_normal_page(vma, addr, *pte); - } else if (is_swap_pte(*pte)) { + } else if (pte_has_swap_entry(*pte)) { swp_entry_t swpent = pte_to_swp_entry(*pte); if (!non_swap_entry(swpent)) { @@ -518,8 +518,10 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr, page = migration_entry_to_page(swpent); else if (is_device_private_entry(swpent)) page = device_private_entry_to_page(swpent); - } else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && mss->check_shmem_swap - && pte_none(*pte))) { + } else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && + mss->check_shmem_swap && + /* Here swap 
special pte is the same as none pte */ + (pte_none(*pte) || is_swap_special_pte(*pte)))) { page = xa_load(&vma->vm_file->f_mapping->i_pages, linear_page_index(vma, addr)); if (xa_is_value(page)) @@ -691,7 +693,7 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask, if (pte_present(*pte)) { page = vm_normal_page(vma, addr, *pte); - } else if (is_swap_pte(*pte)) { + } else if (pte_has_swap_entry(*pte)) { swp_entry_t swpent = pte_to_swp_entry(*pte); if (is_migration_entry(swpent)) @@ -1075,7 +1077,7 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma, ptent = pte_wrprotect(old_pte); ptent = pte_clear_soft_dirty(ptent); ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent); - } else if (is_swap_pte(ptent)) { + } else if (pte_has_swap_entry(ptent)) { ptent = pte_swp_clear_soft_dirty(ptent); set_pte_at(vma->vm_mm, addr, pte, ptent); } @@ -1375,7 +1377,7 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm, page = vm_normal_page(vma, addr, pte); if (pte_soft_dirty(pte)) flags |= PM_SOFT_DIRTY; - } else if (is_swap_pte(pte)) { + } else if (pte_has_swap_entry(pte)) { swp_entry_t entry; if (pte_swp_soft_dirty(pte)) flags |= PM_SOFT_DIRTY; diff --git a/include/linux/swapops.h b/include/linux/swapops.h index 7dd57303bb0c3..7b7387d2892ff 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -5,6 +5,7 @@ #include #include #include +#include #ifdef CONFIG_MMU @@ -52,12 +53,48 @@ static inline pgoff_t swp_offset(swp_entry_t entry) return entry.val & SWP_OFFSET_MASK; } -/* check whether a pte points to a swap entry */ +/* + * is_swap_pte() returns true for three cases: + * + * (a) The pte contains a swap entry. + * + * (a.1) The pte has a normal swap entry (non_swap_entry()==false). For + * example, when an anonymous page got swapped out. + * + * (a.2) The pte has a special swap entry (non_swap_entry()==true). For + * example, a migration entry, a hw-poison entry, etc. 
+ * + * (b) The pte does not contain a swap entry at all (so it cannot be passed + * into pte_to_swp_entry()). For example, uffd-wp special swap pte. + */ static inline int is_swap_pte(pte_t pte) { return !pte_none(pte) && !pte_present(pte); } +/* + * A swap-like special pte should only be used as special marker to trigger a + * page fault. We should treat them similarly as pte_none() in most cases, + * except that it may contain some special information that can persist within + * the pte. Currently the only special swap pte is UFFD_WP_SWP_PTE_SPECIAL. + * + * Note: we should never call pte_to_swp_entry() upon a special swap pte, + * Because a swap special pte does not contain a swap entry! + */ +static inline bool is_swap_special_pte(pte_t pte) +{ + return pte_swp_uffd_wp_special(pte); +} + +/* + * Returns true if the pte contains a swap entry. This includes not only the + * normal swp entry case, but also for migration entries, etc. + */ +static inline bool pte_has_swap_entry(pte_t pte) +{ + return is_swap_pte(pte) && !is_swap_special_pte(pte); +} + /* * Convert the arch-dependent pte representation of a swp_entry_t into an * arch-independent swp_entry_t. 
diff --git a/mm/gup.c b/mm/gup.c index aa09535cf4d47..63a079e361a3d 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -473,7 +473,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, */ if (likely(!(flags & FOLL_MIGRATION))) goto no_page; - if (pte_none(pte)) + if (!pte_has_swap_entry(pte)) goto no_page; entry = pte_to_swp_entry(pte); if (!is_migration_entry(entry)) diff --git a/mm/hmm.c b/mm/hmm.c index 943cb2ba44423..4dba5debf1630 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -237,7 +237,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr, pte_t pte = *ptep; uint64_t pfn_req_flags = *hmm_pfn; - if (pte_none(pte)) { + if (pte_none(pte) || is_swap_special_pte(pte)) { required_fault = hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, 0); if (required_fault) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index ea74da3232ab6..e8b299aa32d06 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1019,7 +1019,7 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm, vmf.pte = pte_offset_map(pmd, address); vmf.orig_pte = *vmf.pte; - if (!is_swap_pte(vmf.orig_pte)) { + if (!pte_has_swap_entry(vmf.orig_pte)) { pte_unmap(vmf.pte); continue; } @@ -1246,6 +1246,15 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, _pte++, _address += PAGE_SIZE) { pte_t pteval = *_pte; if (is_swap_pte(pteval)) { + if (is_swap_special_pte(pteval)) { + /* + * Reuse SCAN_PTE_UFFD_WP. If there will be + * new users of is_swap_special_pte(), we'd + * better introduce a new result type. 
+ */ + result = SCAN_PTE_UFFD_WP; + goto out_unmap; + } if (++unmapped <= khugepaged_max_ptes_swap) { /* * Always be strict with uffd-wp diff --git a/mm/madvise.c b/mm/madvise.c index 01fef79ac761b..c77499d21aac9 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -202,7 +202,7 @@ static int swapin_walk_pmd_entry(pmd_t *pmd, unsigned long start, pte = *(orig_pte + ((index - start) / PAGE_SIZE)); pte_unmap_unlock(orig_pte, ptl); - if (pte_present(pte) || pte_none(pte)) + if (!pte_has_swap_entry(pte)) continue; entry = pte_to_swp_entry(pte); if (unlikely(non_swap_entry(entry))) @@ -594,7 +594,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr, for (; addr != end; pte++, addr += PAGE_SIZE) { ptent = *pte; - if (pte_none(ptent)) + if (pte_none(ptent) || is_swap_special_pte(ptent)) continue; /* * If the pte has swp_entry, just clear page table to diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 3004afb6d0901..f3f21ce908dd2 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5550,7 +5550,7 @@ static enum mc_target_type get_mctgt_type(struct vm_area_struct *vma, if (pte_present(ptent)) page = mc_handle_present_pte(vma, addr, ptent); - else if (is_swap_pte(ptent)) + else if (pte_has_swap_entry(ptent)) page = mc_handle_swap_pte(vma, ptent, &ent); else if (pte_none(ptent)) page = mc_handle_file_pte(vma, addr, ptent, &ent); diff --git a/mm/memory.c b/mm/memory.c index 955a0bb6b855c..235857ccfaa11 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3323,6 +3323,13 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) if (!pte_unmap_same(vmf)) goto out; + /* + * We should never call do_swap_page upon a swap special pte; just be + * safe to bail out if it happens. 
+ */ + if (WARN_ON_ONCE(is_swap_special_pte(vmf->orig_pte))) + goto out; + entry = pte_to_swp_entry(vmf->orig_pte); if (unlikely(non_swap_entry(entry))) { if (is_migration_entry(entry)) { diff --git a/mm/migrate.c b/mm/migrate.c index 6b37d00890ca5..415961ed7a6cb 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -287,7 +287,7 @@ void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep, spin_lock(ptl); pte = *ptep; - if (!is_swap_pte(pte)) + if (!pte_has_swap_entry(pte)) goto out; entry = pte_to_swp_entry(pte); @@ -2381,7 +2381,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, pte = *ptep; - if (pte_none(pte)) { + if (pte_none(pte) || is_swap_special_pte(pte)) { if (vma_is_anonymous(vma)) { mpfn = MIGRATE_PFN_MIGRATE; migrate->cpages++; diff --git a/mm/mincore.c b/mm/mincore.c index 9122676b54d67..5728c3e6473f0 100644 --- a/mm/mincore.c +++ b/mm/mincore.c @@ -121,7 +121,7 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end, for (; addr != end; ptep++, addr += PAGE_SIZE) { pte_t pte = *ptep; - if (pte_none(pte)) + if (pte_none(pte) || is_swap_special_pte(pte)) __mincore_unmapped_range(addr, addr + PAGE_SIZE, vma, vec); else if (pte_present(pte)) diff --git a/mm/mprotect.c b/mm/mprotect.c index 94188df1ee557..b3def0a102bf4 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -139,7 +139,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, } ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent); pages++; - } else if (is_swap_pte(oldpte)) { + } else if (pte_has_swap_entry(oldpte)) { swp_entry_t entry = pte_to_swp_entry(oldpte); pte_t newpte; diff --git a/mm/mremap.c b/mm/mremap.c index d22629ff8f3c0..67d2b84671a5a 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -124,7 +124,7 @@ static pte_t move_soft_dirty_pte(pte_t pte) #ifdef CONFIG_MEM_SOFT_DIRTY if (pte_present(pte)) pte = pte_mksoft_dirty(pte); - else if (is_swap_pte(pte)) + else if (pte_has_swap_entry(pte)) pte = pte_swp_mksoft_dirty(pte); #endif return 
pte; diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c index 86e3a3688d592..6b51759d9203f 100644 --- a/mm/page_vma_mapped.c +++ b/mm/page_vma_mapped.c @@ -36,7 +36,7 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw) * For more details on device private memory see HMM * (include/linux/hmm.h or mm/hmm.c). */ - if (is_swap_pte(*pvmw->pte)) { + if (pte_has_swap_entry(*pvmw->pte)) { swp_entry_t entry; /* Handle un-addressable ZONE_DEVICE memory */ @@ -89,7 +89,7 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw) if (pvmw->flags & PVMW_MIGRATION) { swp_entry_t entry; - if (!is_swap_pte(*pvmw->pte)) + if (!pte_has_swap_entry(*pvmw->pte)) return false; entry = pte_to_swp_entry(*pvmw->pte); @@ -97,7 +97,7 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw) return false; pfn = migration_entry_to_pfn(entry); - } else if (is_swap_pte(*pvmw->pte)) { + } else if (pte_has_swap_entry(*pvmw->pte)) { swp_entry_t entry; /* Handle un-addressable ZONE_DEVICE memory */ diff --git a/mm/swapfile.c b/mm/swapfile.c index 149e77454e3c5..8aa4be0746593 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1964,7 +1964,7 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd, si = swap_info[type]; pte = pte_offset_map(pmd, addr); do { - if (!is_swap_pte(*pte)) + if (!pte_has_swap_entry(*pte)) continue; entry = pte_to_swp_entry(*pte); From patchwork Tue Apr 27 16:12:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226875 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org 
[198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C47AC433B4 for ; Tue, 27 Apr 2021 16:13:38 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B880661151 for ; Tue, 27 Apr 2021 16:13:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B880661151 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 15B286B0073; Tue, 27 Apr 2021 12:13:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 135BF6B0074; Tue, 27 Apr 2021 12:13:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DDACE6B0075; Tue, 27 Apr 2021 12:13:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0025.hostedemail.com [216.40.44.25]) by kanga.kvack.org (Postfix) with ESMTP id C364C6B0074 for ; Tue, 27 Apr 2021 12:13:33 -0400 (EDT) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 8475E181AF5D7 for ; Tue, 27 Apr 2021 16:13:33 +0000 (UTC) X-FDA: 78078642306.04.3847119 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf28.hostedemail.com (Postfix) with ESMTP id 9968F2000247 for ; Tue, 27 Apr 2021 16:13:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540012; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=P2JkGq2vO8wj3u4HClWKqDzaqtOjDoD0JZSILmTLsRU=; b=Fgc5RyLivRQOMZAwbssuBg158Eriy4YswBwuWAs5aQv+7i6R9+yXSIcEjibuUdGORMv+Jw 
8aKSfitiW+kCohAe7A7CR96BSzBevI0ThxpaCmR6CA9LsFnd6p2uZp7lGO331xXPfw5D2r 4KBz32I0vAOnaK0ynd6+EcZ9PWHTjkI= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540012; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=P2JkGq2vO8wj3u4HClWKqDzaqtOjDoD0JZSILmTLsRU=; b=Fgc5RyLivRQOMZAwbssuBg158Eriy4YswBwuWAs5aQv+7i6R9+yXSIcEjibuUdGORMv+Jw 8aKSfitiW+kCohAe7A7CR96BSzBevI0ThxpaCmR6CA9LsFnd6p2uZp7lGO331xXPfw5D2r 4KBz32I0vAOnaK0ynd6+EcZ9PWHTjkI= Received: from mail-qv1-f72.google.com (mail-qv1-f72.google.com [209.85.219.72]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-393-B9MnVn2nN0K2wl6exVSBug-1; Tue, 27 Apr 2021 12:13:30 -0400 X-MC-Unique: B9MnVn2nN0K2wl6exVSBug-1 Received: by mail-qv1-f72.google.com with SMTP id y14-20020a0cf14e0000b029019ff951fd16so25248471qvl.12 for ; Tue, 27 Apr 2021 09:13:30 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=P2JkGq2vO8wj3u4HClWKqDzaqtOjDoD0JZSILmTLsRU=; b=UzwHTqmuVdE4R5BRssJucaNclnq6Q4sBbvtadseGMQNHSoPGg6cjVEg+d+EN5oWBb3 Bs7Nk6YcoG8XVRG4gnCcQZ2xDRXXXBiu2dFchN3K41gHKJmQ59BAowgQZEi8qH7IGvZb 6Ek1VFa+NT3wgkFaxGa4JDqLjgvV6eR30wiRfjG694KXysPkofA7v1TZCQM63O9Kh9dj LvZhqb5PuHSaTeXbRwvPLFI/fjr7J80wMuRe4nrztV/qHxo33jy9kfr3qLtWlNABvjfO zS0ii/BFVkd35AFQn5D7EhEPNbXe+zL1rFDn0Qa1kmGi6fwQU8uvKs1RvOA1t8en0gbf 2+Sw== X-Gm-Message-State: AOAM532VjKMzTtG69vNT7ZWBG3akkYmx2Yr1Hk5cKg9RB7KFUPqPo0+w GBNEDWrCS8E01767/UtuNOMbrbqKharGYAG2ugVxJrm97JKMGuUM024J4UJ2Qscd2z6jgsyxSB5 Qi/0JnlXk78BLmi33q2XO/km5rSpPLkGVGW18vTAqptWmmYsPElsBcyolD/Ue X-Received: by 2002:a37:b685:: with SMTP id g127mr6501930qkf.42.1619540009659; Tue, 27 Apr 2021 09:13:29 -0700 (PDT) 
X-Google-Smtp-Source: ABdhPJwx5mLIHNqVmAY+l6qskoKTYPSrLswCa0ymj3l43bBwGXFObHOSp0TR6zVhhrJneunIDW/6Kg== X-Received: by 2002:a37:b685:: with SMTP id g127mr6501882qkf.42.1619540009216; Tue, 27 Apr 2021 09:13:29 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:28 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . Shutemov" Subject: [PATCH v2 05/24] shmem/userfaultfd: Handle uffd-wp special pte in page fault handler Date: Tue, 27 Apr 2021 12:12:58 -0400 Message-Id: <20210427161317.50682-6-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 9968F2000247 X-Stat-Signature: n3axyktbhfjkwnzhu5yxnbr7gar5jmb4 Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf28; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=216.205.24.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540014-388535 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: File-backed memories are prone to unmap/swap so the ptes are always unstable. 
This could cause userfaultfd-wp information to be lost when such memory (for example, shmem) is unmapped or swapped out. To keep this information persistent, we will start to use the newly introduced swap-like special ptes to replace a null pte when those ptes are removed.

Prepare this by handling such a special pte first before it is applied. Here a new fault flag FAULT_FLAG_UFFD_WP is introduced. When this flag is set, it means the current fault is to resolve a page access (either read or write) to the uffd-wp special pte.

The handling of this special pte page fault is similar to a missing fault, but it should happen after the pte missing logic since the special pte is designed to be a swap-like pte. Meanwhile it should be handled before do_swap_page() so that the swap core logic won't be confused by seeing such an illegal swap pte.

This is a slow path of uffd-wp handling, because unmapping of wr-protected shmem ptes should be rare. So far it should only trigger in two conditions:

  (1) When trying to punch holes in shmem_fallocate(), there will be a pre-unmap optimization before evicting the page. That will create unmapped shmem ptes with wr-protected pages covered.

  (2) Swapping out of shmem pages.

Because of this, the page fault handling is also simplified: the wr-protect message is not sent in the first page fault. Instead the page is installed read-only, so the message is not generated until the next do_wp_page() call.

Disable fault-around for such a special page fault, because the newly introduced flag (FAULT_FLAG_UFFD_WP) only applies to the current pte rather than all the pages around it. Doing fault-around with the new flag could confuse all the other pages when installing ptes from the page cache on a cache hit.
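The ordering described above (none pte first, then the special pte, then real swap entries) and the read-only install can be sketched as a small userspace model. This is a hedged illustration, not the kernel code: the toy_* names, the enum and the bit layout are hypothetical, and the write-permission handling is simplified compared with what the fault path really checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: the encoding below is invented for illustration. */
typedef struct { uint64_t val; } toy_pte_t;

#define TOY_PTE_PRESENT          (1ULL << 0) /* hypothetical "present" bit */
#define TOY_PTE_WRITE            (1ULL << 1) /* hypothetical "writable" bit */
#define TOY_PTE_UFFD_WP          (1ULL << 2) /* hypothetical uffd-wp bit on a present pte */
#define TOY_PTE_UFFD_WP_SPECIAL  (1ULL << 3) /* hypothetical non-present marker value */

enum toy_fault_path {
	TOY_PATH_MISSING,         /* none pte: ordinary missing fault */
	TOY_PATH_UFFD_WP_SPECIAL, /* special pte: handled like missing, keeps uffd-wp */
	TOY_PATH_SWAP_PAGE,       /* pte holds a swap entry */
	TOY_PATH_PRESENT,         /* pte is present: other handlers apply */
};

/* Pick a path in the order the description gives. */
static inline enum toy_fault_path toy_handle_pte_fault(toy_pte_t orig)
{
	if (orig.val == 0)
		return TOY_PATH_MISSING;
	if (!(orig.val & TOY_PTE_PRESENT)) {
		/* Special ptes must be filtered out before swap handling. */
		if (orig.val == TOY_PTE_UFFD_WP_SPECIAL)
			return TOY_PATH_UFFD_WP_SPECIAL;
		return TOY_PATH_SWAP_PAGE;
	}
	return TOY_PATH_PRESENT;
}

/* Install a present pte over a none pte or over the special pte. */
static inline toy_pte_t toy_install_pte(toy_pte_t orig, bool write_fault)
{
	toy_pte_t entry = { .val = TOY_PTE_PRESENT };

	/* Only these two original values are expected on this path. */
	assert(orig.val == 0 || orig.val == TOY_PTE_UFFD_WP_SPECIAL);

	if (write_fault)
		entry.val |= TOY_PTE_WRITE;
	if (orig.val == TOY_PTE_UFFD_WP_SPECIAL) {
		/* Keep the page read-only and carry the uffd-wp bit over. */
		entry.val &= ~TOY_PTE_WRITE;
		entry.val |= TOY_PTE_UFFD_WP;
	}
	return entry;
}
```

In this model even a write fault on the special pte ends with a read-only, uffd-wp-marked pte, so the write-protect notification would come from the following write fault on the now-present pte.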
Signed-off-by: Peter Xu --- include/linux/userfaultfd_k.h | 11 +++++ mm/memory.c | 80 ++++++++++++++++++++++++++++++++--- 2 files changed, 86 insertions(+), 5 deletions(-) diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index bc733512c6905..fefebe6e96560 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -89,6 +89,17 @@ static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma) return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR); } +/* + * Don't do fault around for FAULT_FLAG_UFFD_WP because it means we want to + * recover a previously wr-protected pte. This flag is a per-pte information, + * so it could confuse all the pages around the current page when faulted in. + * Similar reason for MINOR mode faults. + */ +static inline bool uffd_disable_fault_around(struct vm_area_struct *vma) +{ + return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR); +} + static inline bool userfaultfd_missing(struct vm_area_struct *vma) { return vma->vm_flags & VM_UFFD_MISSING; diff --git a/mm/memory.c b/mm/memory.c index 235857ccfaa11..02db41bad3340 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3786,6 +3786,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr) { struct vm_area_struct *vma = vmf->vma; + bool uffd_wp = pte_swp_uffd_wp_special(vmf->orig_pte); bool write = vmf->flags & FAULT_FLAG_WRITE; bool prefault = vmf->address != addr; pte_t entry; @@ -3798,6 +3799,8 @@ void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr) if (write) entry = maybe_mkwrite(pte_mkdirty(entry), vma); + if (unlikely(uffd_wp)) + entry = pte_mkuffd_wp(pte_wrprotect(entry)); /* copy-on-write page */ if (write && !(vma->vm_flags & VM_SHARED)) { inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES); @@ -3865,8 +3868,12 @@ vm_fault_t finish_fault(struct vm_fault *vmf) vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, 
&vmf->ptl); ret = 0; - /* Re-check under ptl */ - if (likely(pte_none(*vmf->pte))) + + /* + * Re-check under ptl. Note: this will cover both none pte and + * uffd-wp-special swap pte + */ + if (likely(pte_same(*vmf->pte, vmf->orig_pte))) do_set_pte(vmf, page, vmf->address); else ret = VM_FAULT_NOPAGE; @@ -3970,9 +3977,21 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf) return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff); } +/* Return true if we should do read fault-around, false otherwise */ +static inline bool should_fault_around(struct vm_fault *vmf) +{ + /* No ->map_pages? No way to fault around... */ + if (!vmf->vma->vm_ops->map_pages) + return false; + + if (uffd_disable_fault_around(vmf->vma)) + return false; + + return fault_around_bytes >> PAGE_SHIFT > 1; +} + static vm_fault_t do_read_fault(struct vm_fault *vmf) { - struct vm_area_struct *vma = vmf->vma; vm_fault_t ret = 0; /* @@ -3980,7 +3999,7 @@ static vm_fault_t do_read_fault(struct vm_fault *vmf) * if page by the offset is not ready to be mapped (cold cache or * something). */ - if (vma->vm_ops->map_pages && fault_around_bytes >> PAGE_SHIFT > 1) { + if (should_fault_around(vmf)) { ret = do_fault_around(vmf); if (ret) return ret; @@ -4293,6 +4312,57 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud) return VM_FAULT_FALLBACK; } +static vm_fault_t uffd_wp_clear_special(struct vm_fault *vmf) +{ + vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd, + vmf->address, &vmf->ptl); + /* + * Be careful so that we will only recover a special uffd-wp pte into a + * none pte. Otherwise it means the pte could have changed, so retry. + */ + if (pte_swp_uffd_wp_special(*vmf->pte)) + pte_clear(vmf->vma->vm_mm, vmf->address, vmf->pte); + pte_unmap_unlock(vmf->pte, vmf->ptl); + return 0; +} + +/* + * This is actually a page-missing access, but with uffd-wp special pte + * installed. It means this pte was wr-protected before being unmapped. 
+ */ +static vm_fault_t uffd_wp_handle_special(struct vm_fault *vmf) +{ + /* Careful! vmf->pte unmapped after return */ + if (!pte_unmap_same(vmf)) + return 0; + + /* + * Just in case there're leftover special ptes even after the region + * got unregistered - we can simply clear them. + */ + if (unlikely(!userfaultfd_wp(vmf->vma) || vma_is_anonymous(vmf->vma))) + return uffd_wp_clear_special(vmf); + + /* + * Here we share most code with do_fault(), in which we can identify + * whether this is "none pte fault" or "uffd-wp-special fault" by + * checking the vmf->orig_pte. + */ + return do_fault(vmf); +} + +static vm_fault_t do_swap_pte(struct vm_fault *vmf) +{ + /* + * We need to handle special swap ptes before handling ptes that + * contain swap entries, always. + */ + if (unlikely(pte_swp_uffd_wp_special(vmf->orig_pte))) + return uffd_wp_handle_special(vmf); + + return do_swap_page(vmf); +} + /* * These routines also need to handle stuff like marking pages dirty * and/or accessed for architectures that don't do it in hardware (most @@ -4367,7 +4437,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) } if (!pte_present(vmf->orig_pte)) - return do_swap_page(vmf); + return do_swap_pte(vmf); if (pte_protnone(vmf->orig_pte) && vma_is_accessible(vmf->vma)) return do_numa_page(vmf); From patchwork Tue Apr 27 16:12:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226877 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 
9434DC433ED for ; Tue, 27 Apr 2021 16:13:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2286E613C2 for ; Tue, 27 Apr 2021 16:13:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2286E613C2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D908E6B0074; Tue, 27 Apr 2021 12:13:35 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D67176B0075; Tue, 27 Apr 2021 12:13:35 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BE25A6B0078; Tue, 27 Apr 2021 12:13:35 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0086.hostedemail.com [216.40.44.86]) by kanga.kvack.org (Postfix) with ESMTP id A0BFF6B0074 for ; Tue, 27 Apr 2021 12:13:35 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 5EF2F2C96 for ; Tue, 27 Apr 2021 16:13:35 +0000 (UTC) X-FDA: 78078642390.26.9963421 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf08.hostedemail.com (Postfix) with ESMTP id B9D2480192ED for ; Tue, 27 Apr 2021 16:13:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540014; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gIgjn5S8tiLrRboG7Na97CFc3ZndoZkUNAZThQ2rzRU=; b=SW9XZ6L2p71Wt8w5B3xG/3aM8wQ+SHrfjt/1XZLGIXG+baZSaOIiz8Wv7UduNw7ZNSe7fq nOr7o3G/ejFL2pLZ5rGfoYR2ctmI2KKKY45V+UfO0XN3/9GR48bvyrUDB50Fukn1btu0ch 
h5pwcmF16N+HhlOh9j5+G2B1n/Ilmfs= Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-80-D5aEe_7oMpqHzBP7VaXZTQ-1; Tue, 27 Apr 2021 12:13:32 -0400 X-MC-Unique: D5aEe_7oMpqHzBP7VaXZTQ-1 Received: by mail-qt1-f198.google.com with SMTP id r20-20020ac85c940000b02901bac34fa2eeso1691295qta.11 for ; Tue, 27 Apr 2021 09:13:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gIgjn5S8tiLrRboG7Na97CFc3ZndoZkUNAZThQ2rzRU=; b=Oa1FeaWMhkY7m2Matq9eUzUpTViE3PKl6idViRmLHmRT8vToEaKIX5T/y4DqzPDtgJ P4I2SS+H0LGb7ccgUVlK2XN2XCrUkW743sClHmKeqvF18k6Wb5o7C4EIcTZLjrLviNOM qyAzbS6Km4EviZQ1ECDof4Kqoa7yCpM3mBISaQmszgMkskXKBzuOAAxOBXcDowThD1GA dfj88TpRTEW5kTEIGi3B/ytAAFghfA8PlOJAmOQkLJs7AuK+kQ0sGsZrGtCG1mOHO+FG MlS8nkvGmccQMr/EUk2qsoklYgN/N1fLSgGZAMEMGqWRcijZ5cBBljvtX92bnRzrwRC4 S0YA== X-Gm-Message-State: AOAM531oPrv+sBfoWiHgVzVbrwFkggM8Pweb/Mq5g00afTXch9IqaqFi A4N//MNVgUcixt+95ONCtu1qlzSnbFxZh9KyqVYqP73st1EGtdBkCMs1E7XJWvzzVRrWHUvMuRc c/RVuVrWnVVnnKuCVLC+q8k9e5r1x65FosErK2F1CiibSU6b6kad7yWIbtqyi X-Received: by 2002:a05:620a:1387:: with SMTP id k7mr21581815qki.134.1619540011360; Tue, 27 Apr 2021 09:13:31 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyW/1PVlT3w2la6TsTdJYih8T6xfjtv6F8WrSt8l1jAsJMc097j8PN/4aDHkK+EGk9Dc9uj4w== X-Received: by 2002:a05:620a:1387:: with SMTP id k7mr21581769qki.134.1619540010984; Tue, 27 Apr 2021 09:13:30 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. 
[184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:30 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . Shutemov" Subject: [PATCH v2 06/24] mm: Drop first_index/last_index in zap_details Date: Tue, 27 Apr 2021 12:12:59 -0400 Message-Id: <20210427161317.50682-7-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: B9D2480192ED X-Stat-Signature: iu8egwaq8md8s1aww86tkyr7k8bjkxwk Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf08; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=170.10.133.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619539992-984075 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The first_index/last_index fields of zap_details are only used in unmap_mapping_range_tree(). Meanwhile, that function has a single caller, unmap_mapping_pages(). Instead of passing these two variables through the whole stack of page zapping code, remove them from zap_details and let them simply be parameters of unmap_mapping_range_tree(), which is inlined.
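For readers who want to check the index arithmetic in isolation, here is a minimal user-space sketch of it. It is not kernel code: pgoff_t is assumed to be unsigned long, and struct index_range, zap_index_range() and clamp_to_vma() are hypothetical names made up for this illustration. The first helper mirrors the start/nr handling in unmap_mapping_pages(), the second mirrors the zba/zea clamping done per VMA in unmap_mapping_range_tree().

```c
#include <assert.h>
#include <limits.h>

typedef unsigned long pgoff_t;	/* assumption: stand-in for the kernel typedef */

/* Hypothetical helper type: an inclusive range of page indexes. */
struct index_range {
	pgoff_t first;
	pgoff_t last;
};

/* Models how unmap_mapping_pages() derives first_index/last_index. */
static struct index_range zap_index_range(pgoff_t start, pgoff_t nr)
{
	struct index_range r = { .first = start, .last = start + nr - 1 };

	/* nr == 0 (or an overflowing nr) wraps below start: zap to end of file. */
	if (r.last < r.first)
		r.last = ULONG_MAX;
	return r;
}

/*
 * Models the zba/zea computation: restrict the range to the file pages
 * [vba, vea] covered by one VMA. The caller is assumed to pass only VMAs
 * that overlap the range, as the interval tree walk does.
 */
static struct index_range clamp_to_vma(struct index_range r,
				       pgoff_t vba, pgoff_t vea)
{
	assert(vba <= vea);
	if (r.first < vba)
		r.first = vba;
	if (r.last > vea)
		r.last = vea;
	return r;
}
```

With these helpers, zap_index_range(5, 0) covers everything from index 5 to ULONG_MAX, which matches the "0 to unmap to end of file" convention documented for @nr.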
Signed-off-by: Peter Xu --- include/linux/mm.h | 2 -- mm/memory.c | 20 ++++++++++---------- 2 files changed, 10 insertions(+), 12 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 84fb1697b20ff..9060b497f4d5c 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1707,8 +1707,6 @@ extern void user_shm_unlock(size_t, struct user_struct *); */ struct zap_details { struct address_space *check_mapping; /* Check page->mapping if set */ - pgoff_t first_index; /* Lowest page->index to unmap */ - pgoff_t last_index; /* Highest page->index to unmap */ }; struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr, diff --git a/mm/memory.c b/mm/memory.c index 02db41bad3340..bcbce803e6850 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3213,20 +3213,20 @@ static void unmap_mapping_range_vma(struct vm_area_struct *vma, } static inline void unmap_mapping_range_tree(struct rb_root_cached *root, + pgoff_t first_index, + pgoff_t last_index, struct zap_details *details) { struct vm_area_struct *vma; pgoff_t vba, vea, zba, zea; - vma_interval_tree_foreach(vma, root, - details->first_index, details->last_index) { - + vma_interval_tree_foreach(vma, root, first_index, last_index) { vba = vma->vm_pgoff; vea = vba + vma_pages(vma) - 1; - zba = details->first_index; + zba = first_index; if (zba < vba) zba = vba; - zea = details->last_index; + zea = last_index; if (zea > vea) zea = vea; @@ -3252,17 +3252,17 @@ static inline void unmap_mapping_range_tree(struct rb_root_cached *root, void unmap_mapping_pages(struct address_space *mapping, pgoff_t start, pgoff_t nr, bool even_cows) { + pgoff_t first_index = start, last_index = start + nr - 1; struct zap_details details = { }; details.check_mapping = even_cows ? 
NULL : mapping; - details.first_index = start; - details.last_index = start + nr - 1; - if (details.last_index < details.first_index) - details.last_index = ULONG_MAX; + if (last_index < first_index) + last_index = ULONG_MAX; i_mmap_lock_write(mapping); if (unlikely(!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))) - unmap_mapping_range_tree(&mapping->i_mmap, &details); + unmap_mapping_range_tree(&mapping->i_mmap, first_index, + last_index, &details); i_mmap_unlock_write(mapping); } From patchwork Tue Apr 27 16:13:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226881 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 80471C433ED for ; Tue, 27 Apr 2021 16:13:44 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 09B5D613E8 for ; Tue, 27 Apr 2021 16:13:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 09B5D613E8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A5BAE6B0078; Tue, 27 Apr 2021 12:13:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A32386B007B; Tue, 27 Apr 2021 12:13:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 85EFF6B007E; Tue, 27 Apr 2021 12:13:41 -0400 (EDT) 
X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0079.hostedemail.com [216.40.44.79]) by kanga.kvack.org (Postfix) with ESMTP id 575A46B0078 for ; Tue, 27 Apr 2021 12:13:41 -0400 (EDT) Received: from smtpin32.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 8FE258249980 for ; Tue, 27 Apr 2021 16:13:39 +0000 (UTC) X-FDA: 78078642558.32.9C97911 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf13.hostedemail.com (Postfix) with ESMTP id 70BE1E00012F for ; Tue, 27 Apr 2021 16:13:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540018; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=79hgS/3VF19uL8B1KnbV12ieWH4dDf98xJSUkbEku0E=; b=QWiyHIpQeU9yUDtlnPOQLS6smhXN6QzTOyZtpgBIEt39m3cKw9G+C9Crk5BeUi1K/GvWEP e6unBfNeJzeGKx0X0jlqUvhlT/t9yaJ+EopM1eUrPVsKXRCaOXLXqCZJ6myYvRJB1TniH3 08x6+mr00L7XlFUmVvp7Ugs/qslaqCc= Received: from mail-qv1-f70.google.com (mail-qv1-f70.google.com [209.85.219.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-238-2NesJNj5OviC8soMrO_G3Q-1; Tue, 27 Apr 2021 12:13:35 -0400 X-MC-Unique: 2NesJNj5OviC8soMrO_G3Q-1 Received: by mail-qv1-f70.google.com with SMTP id r18-20020a0ccc120000b02901a21aadacfcso24128738qvk.5 for ; Tue, 27 Apr 2021 09:13:34 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=79hgS/3VF19uL8B1KnbV12ieWH4dDf98xJSUkbEku0E=; b=DEeHc2p0ep91/hOFuw1JEwnXZAqjyMcRovBwj15aK5T8e5OOGYdxOW3g55Nspk/GJ4 Y0OTTvns/UVVtktH+Mhn72qBpn6nA2dVJas15uTf/f+/1uy9s0CqzcFZ5XIZyh91MFNB 
TKiuNrzzEWBGlZc504jUxMjyVMsgIbN3axgoak0qyogTqtbVnuhY+DTF+ts/xG7E35gV Cn/HKw/5jgT4RVOf3j1jybvyRAA08a9CVP2FAq2cdv84Ditu9TBh1+XQsALNaI2jQ13o Olrmeuf2vynP2z4TgpFUPx5pYt5s0nny5nYFMKVCIWJUbn42Q2/o/eVENmbAAq0w8URT CFKA== X-Gm-Message-State: AOAM531HZe8I1zekwinWL7zekQJTMwvRqHjanxTyEc0jw2N13hRFSGvp Z50umSo7lNUPIySoQwHc83CDXPaKiAc94Au/pbmbNrtEmXo2ULiMUS9agi8tazK1W7h1fQQ6UBa PZQZKv7qfjdSUF620VKz/90fDjWigKrUqk3YGUhFCmEErhTqSD0a7+v5lQVf4 X-Received: by 2002:a0c:ab12:: with SMTP id h18mr11741098qvb.33.1619540013536; Tue, 27 Apr 2021 09:13:33 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwuSIgswcmVwkMNc/apUchEgjfSAXm6CbKvjAhSKPYH54HMTCmY99nS8hMUF9WgKkHBYwHOQw== X-Received: by 2002:a0c:ab12:: with SMTP id h18mr11741039qvb.33.1619540013117; Tue, 27 Apr 2021 09:13:33 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:32 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . 
Shutemov" Subject: [PATCH v2 07/24] mm: Introduce zap_details.zap_flags Date: Tue, 27 Apr 2021 12:13:00 -0400 Message-Id: <20210427161317.50682-8-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Queue-Id: 70BE1E00012F X-Stat-Signature: dkbt9dj3zctgfx8fif6rxhh4ea6wbt3u X-Rspamd-Server: rspam02 Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf13; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=170.10.133.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540011-502515 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Instead of introducing a new variable for every new zap_details field, let's introduce a flags field that can encode true/false information. Let's first use it to clean up the only existing variable, check_mapping. The name "check_mapping" implies a boolean, but it actually stores the mapping itself, and is simply left unset when we don't want to check the mapping. To make things clearer, introduce the first zap flag, ZAP_FLAG_CHECK_MAPPING, so that we only check against the mapping if this bit is set. At the same time, rename check_mapping to zap_mapping and always set it. While at it, introduce a helper, zap_check_mapping_skip(), and use it in zap_pte_range(). Some old comments in zap_pte_range() have been removed because they were duplicated, and now that we have the ZAP_FLAG_CHECK_MAPPING flag, this information is easy to find by grepping for the flag.
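To make the intended semantics of the helper concrete, here is a small user-space model of it. This is only a sketch: struct address_space and struct page are reduced to hypothetical stand-ins, and the plain page->mapping comparison replaces the kernel's page_rmapping(page) call, which is an assumption made purely so the example compiles outside the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

#define ZAP_FLAG_CHECK_MAPPING	(1UL << 0)	/* BIT(0) in the kernel */

/* Hypothetical stand-ins for the kernel structures. */
struct address_space { int unused; };
struct page { struct address_space *mapping; };

struct zap_details {
	struct address_space *zap_mapping;	/* always set by the caller */
	unsigned long zap_flags;
};

/* Return true if zapping of this page should be skipped, false otherwise. */
static bool zap_check_mapping_skip(const struct zap_details *details,
				   const struct page *page)
{
	if (!details || !page)
		return false;

	/* Without the flag, zap_mapping is ignored and nothing is skipped. */
	if (!(details->zap_flags & ZAP_FLAG_CHECK_MAPPING))
		return false;

	/* The kernel compares against page_rmapping(page) here. */
	return details->zap_mapping != page->mapping;
}
```

In this model a page that belongs to a different mapping (for example a private COWed copy) is skipped only when ZAP_FLAG_CHECK_MAPPING is set; with zap_flags == 0 every page is zapped even though zap_mapping is still populated.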
It'll also make life easier when we want to e.g. pass in zap_flags into the callers like unmap_mapping_pages() (instead of adding new booleans besides the even_cows parameter). Signed-off-by: Peter Xu --- include/linux/mm.h | 19 ++++++++++++++++++- mm/memory.c | 31 ++++++++----------------------- 2 files changed, 26 insertions(+), 24 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 9060b497f4d5c..39c944bf7ed3a 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1702,13 +1702,30 @@ static inline bool can_do_mlock(void) { return false; } extern int user_shm_lock(size_t, struct user_struct *); extern void user_shm_unlock(size_t, struct user_struct *); +/* Whether to check page->mapping when zapping */ +#define ZAP_FLAG_CHECK_MAPPING BIT(0) + /* * Parameter block passed down to zap_pte_range in exceptional cases. */ struct zap_details { - struct address_space *check_mapping; /* Check page->mapping if set */ + struct address_space *zap_mapping; + unsigned long zap_flags; }; +/* Return true if skip zapping this page, false otherwise */ +static inline bool +zap_check_mapping_skip(struct zap_details *details, struct page *page) +{ + if (!details || !page) + return false; + + if (!(details->zap_flags & ZAP_FLAG_CHECK_MAPPING)) + return false; + + return details->zap_mapping != page_rmapping(page); +} + struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr, pte_t pte); struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr, diff --git a/mm/memory.c b/mm/memory.c index bcbce803e6850..94954436544f7 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1242,16 +1242,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, struct page *page; page = vm_normal_page(vma, addr, ptent); - if (unlikely(details) && page) { - /* - * unmap_shared_mapping_pages() wants to - * invalidate cache without truncating: - * unmap shared but keep private pages. 
- */ - if (details->check_mapping && - details->check_mapping != page_rmapping(page)) - continue; - } + if (unlikely(zap_check_mapping_skip(details, page))) + continue; ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); tlb_remove_tlb_entry(tlb, pte, addr); @@ -1283,17 +1275,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, if (is_device_private_entry(entry)) { struct page *page = device_private_entry_to_page(entry); - if (unlikely(details && details->check_mapping)) { - /* - * unmap_shared_mapping_pages() wants to - * invalidate cache without truncating: - * unmap shared but keep private pages. - */ - if (details->check_mapping != - page_rmapping(page)) - continue; - } - + if (unlikely(zap_check_mapping_skip(details, page))) + continue; pte_clear_not_present_full(mm, addr, pte, tlb->fullmm); rss[mm_counter(page)]--; page_remove_rmap(page, false); @@ -3253,9 +3236,11 @@ void unmap_mapping_pages(struct address_space *mapping, pgoff_t start, pgoff_t nr, bool even_cows) { pgoff_t first_index = start, last_index = start + nr - 1; - struct zap_details details = { }; + struct zap_details details = { .zap_mapping = mapping }; + + if (!even_cows) + details.zap_flags |= ZAP_FLAG_CHECK_MAPPING; - details.check_mapping = even_cows ? 
NULL : mapping; if (last_index < first_index) last_index = ULONG_MAX; From patchwork Tue Apr 27 16:13:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226879 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7313BC433B4 for ; Tue, 27 Apr 2021 16:13:42 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 07E4A600CC for ; Tue, 27 Apr 2021 16:13:41 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 07E4A600CC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 308476B0075; Tue, 27 Apr 2021 12:13:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2E0286B0078; Tue, 27 Apr 2021 12:13:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 097D86B007B; Tue, 27 Apr 2021 12:13:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0251.hostedemail.com [216.40.44.251]) by kanga.kvack.org (Postfix) with ESMTP id DDE1A6B0075 for ; Tue, 27 Apr 2021 12:13:40 -0400 (EDT) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id A1BFF45BC for ; Tue, 27 Apr 2021 16:13:40 +0000 
(UTC) X-FDA: 78078642600.27.383DC95 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf30.hostedemail.com (Postfix) with ESMTP id D0A87E000123 for ; Tue, 27 Apr 2021 16:13:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540019; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=y3TQVnHKUnw/sN50yOfbaMGsu/Nf7cRcOPipeW/bhgE=; b=EDafR28tHtITRvZHqrorq7MZmXphZQe4uxNvLYoztZv5H51dPsxBLilVhoWLFS3kOkYmmM r7gFv0UB8UAWxBQeaof4JVzJz6YdYQnZc4MExSA0lDbN8PqdZc1YB3COQoec3gZb09nJqK xpHPexoO6vFlF1io6DaKGfAG/LNgxi0= Received: from mail-qv1-f69.google.com (mail-qv1-f69.google.com [209.85.219.69]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-545-YSCBbruTPdugJMX_EQNcGA-1; Tue, 27 Apr 2021 12:13:37 -0400 X-MC-Unique: YSCBbruTPdugJMX_EQNcGA-1 Received: by mail-qv1-f69.google.com with SMTP id r18-20020a0ccc120000b02901a21aadacfcso24128783qvk.5 for ; Tue, 27 Apr 2021 09:13:35 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=y3TQVnHKUnw/sN50yOfbaMGsu/Nf7cRcOPipeW/bhgE=; b=VmLSF0NkMI1rc1LeHfg20LeDsiIhZLaATbADRN4CSx/q9BmMbGZq/vKWj6CyD62K2u nGG91l6TgRs/u1PULqep9VKHlgWb9x8vku8lB4kKNmie15gKWJQmQVJkZqWlMS/oBMev LonfhtYS46YdrsQBTd2IRI1KEKmlMXQeGpZkmBbfdC+pmSEnFbEaxRLtYssRiMuKXQo/ K+fJjV920PUtX8G3GrnFniv9N40N+8CNKNyTI6OMXgOyK7qnSYITrUhHwwe6VYqfHcJo awo7po1vYpsSiitbjLeTh6FplKS/ZoTRLvubaoXNxX5ixpamScbQX3fs60+mZtsvALsO xoJA== X-Gm-Message-State: AOAM532MHQAQB7dsMDbe8lBCF42/szx0J7AXe3v5/yABGX4Pd0jLfR0F 9C+PXzGmpGTBDtV9eStC5wzN0c7bj1WyhFWikyXaROUZSbiefZOho1BWw7q06uBF+TZ/CrRze+M 
XiiJS/wsgQYAQegjkwkT87ane7dyQIk0g3ZLnCP6rLkkhQz4Gb4YcrOdqDoS7 X-Received: by 2002:a0c:eacb:: with SMTP id y11mr24286787qvp.57.1619540014981; Tue, 27 Apr 2021 09:13:34 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzx90n+kQM1TV9Fi7Mcce4rO7UufmQT9+0nyJR2Yx3bhbVF1lrR4VAs6KWnUEZ+shz+GRUjDg== X-Received: by 2002:a0c:eacb:: with SMTP id y11mr24286748qvp.57.1619540014668; Tue, 27 Apr 2021 09:13:34 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:34 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . Shutemov" Subject: [PATCH v2 08/24] mm: Introduce ZAP_FLAG_SKIP_SWAP Date: Tue, 27 Apr 2021 12:13:01 -0400 Message-Id: <20210427161317.50682-9-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Queue-Id: D0A87E000123 X-Stat-Signature: eez68ny19j4up5qd6fhbnrxbdx75bq6i X-Rspamd-Server: rspam02 Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf30; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=216.205.24.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540000-545328 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Firstly, the comment in 
zap_pte_range() is misleading because the code checks details rather than details->check_mapping, so the comment does not match what the code does. It is also confusing, because it does not explain why passing in the details pointer means skipping all swap entries. A new user of zap_details could easily miss this unless they read all the way down to zap_pte_range(), because nothing at the zap_details definition mentions it, so swap entries could be erroneously skipped without anyone noticing. This partly reverts 3e8715fdc03e ("mm: drop zap_details::check_swap_entries"), but introduces a ZAP_FLAG_SKIP_SWAP flag, which means the opposite of the previous "details" parameter: the caller must explicitly set it to skip swap entries; otherwise swap entries will always be considered (which is still the common case here). Cc: Kirill A. Shutemov Signed-off-by: Peter Xu --- include/linux/mm.h | 12 ++++++++++++ mm/memory.c | 8 +++++--- 2 files changed, 17 insertions(+), 3 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 39c944bf7ed3a..2227e9107e53e 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1704,6 +1704,8 @@ extern void user_shm_unlock(size_t, struct user_struct *); /* Whether to check page->mapping when zapping */ #define ZAP_FLAG_CHECK_MAPPING BIT(0) +/* Whether to skip zapping swap entries */ +#define ZAP_FLAG_SKIP_SWAP BIT(1) /* * Parameter block passed down to zap_pte_range in exceptional cases.
@@ -1726,6 +1728,16 @@ zap_check_mapping_skip(struct zap_details *details, struct page *page) return details->zap_mapping != page_rmapping(page); } +/* Return true if skip swap entries, false otherwise */ +static inline bool +zap_skip_swap(struct zap_details *details) +{ + if (!details) + return false; + + return details->zap_flags & ZAP_FLAG_SKIP_SWAP; +} + struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr, pte_t pte); struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr, diff --git a/mm/memory.c b/mm/memory.c index 94954436544f7..5325c1c2cbd78 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1284,8 +1284,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, continue; } - /* If details->check_mapping, we leave swap entries. */ - if (unlikely(details)) + if (unlikely(zap_skip_swap(details))) continue; if (!non_swap_entry(entry)) @@ -3236,7 +3235,10 @@ void unmap_mapping_pages(struct address_space *mapping, pgoff_t start, pgoff_t nr, bool even_cows) { pgoff_t first_index = start, last_index = start + nr - 1; - struct zap_details details = { .zap_mapping = mapping }; + struct zap_details details = { + .zap_mapping = mapping, + .zap_flags = ZAP_FLAG_SKIP_SWAP, + }; if (!even_cows) details.zap_flags |= ZAP_FLAG_CHECK_MAPPING; From patchwork Tue Apr 27 16:13:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226883 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 
6171DC433B4 for ; Tue, 27 Apr 2021 16:13:46 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id EC0DB613C2 for ; Tue, 27 Apr 2021 16:13:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EC0DB613C2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D02A46B007B; Tue, 27 Apr 2021 12:13:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C7E936B0080; Tue, 27 Apr 2021 12:13:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 948B16B007D; Tue, 27 Apr 2021 12:13:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0081.hostedemail.com [216.40.44.81]) by kanga.kvack.org (Postfix) with ESMTP id 5DEB76B007B for ; Tue, 27 Apr 2021 12:13:41 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 1405745BC for ; Tue, 27 Apr 2021 16:13:41 +0000 (UTC) X-FDA: 78078642642.05.CE2C607 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf04.hostedemail.com (Postfix) with ESMTP id 06AEC13A for ; Tue, 27 Apr 2021 16:13:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540020; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LT2kIK8sipp9j2ir2CNI9ANhHohLPp0GAzeCrbDSG6A=; b=YN/6o76feKPa1BNtiuyrVVJtTQmyZmO2u6Orh+pctn3/qMFtrm2qBtqtZPQrDh0oJCjrMY ylKbAfT1u71Nzfg55TO8UdmVWgomchxY/CjXEEXtPUBiekwlrhEdyHxVDpTMe9a1u3oXia 
fdRIcf0OCstmde8oFioArqe6pWDS0qQ= Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com [209.85.222.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-62-yw8-yt4AOx-V9rKzSZHCbw-1; Tue, 27 Apr 2021 12:13:38 -0400 X-MC-Unique: yw8-yt4AOx-V9rKzSZHCbw-1 Received: by mail-qk1-f200.google.com with SMTP id g76-20020a379d4f0000b02902e40532d832so17024935qke.20 for ; Tue, 27 Apr 2021 09:13:38 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=LT2kIK8sipp9j2ir2CNI9ANhHohLPp0GAzeCrbDSG6A=; b=bXmtKdjoUCGkTJv+S/cxClPBie5ktgcJke7bL+m4Hc35iDeMJEdr+3FETni6jYDifG qXpLgGaNWUFohD2DjJ477cmWm+B1CfqdlACL9bZsSqRDDl+wOf4NiHJUwnq8dL32gjN4 CmdFxE/YE+IjADBZnbWGxuOrA7pXVA+f6B7j9R1Nb9NXNldc0vO9dhWsXpOxzmoW1zse +u1G2JdvvTVre0xIGCOOx4sgt3+LpSBezQTUUuYblzsGvXWf/3Ki7glqiQtFh/fbBN0c 7W14BbvyX3VCLRifystYwaClNz+Hk9k2dXZqNX5/RUi8Sr68Ot8VqSAPyzjAWXX15j4T OkaA== X-Gm-Message-State: AOAM531Ijjt2HcPB73YIQ2AhF22MiyGwuLvADl+M3q0i8eOzgvvs75XO E+XOfDVvcnIrMVaanPkq0iLu9NmdCMjxujM6AZjb3pZVCFNwwP3dftR+9sEXxHr92+B9sQLXKu4 EdqTGd2SNjRByWVKltrr0fy/Dvd7l62L4/h1JfAj5Bukw3L8qM4U6tWobZJvl X-Received: by 2002:a0c:bec3:: with SMTP id f3mr24412591qvj.49.1619540016946; Tue, 27 Apr 2021 09:13:36 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzBrjIhgyXc8ASqxD1DOx8zVV0xPaGMkLyYwUGO30ydezgxooODIclnDbUOAGjmE/pwb1KXEg== X-Received: by 2002:a0c:bec3:: with SMTP id f3mr24412535qvj.49.1619540016586; Tue, 27 Apr 2021 09:13:36 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. 
[184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:36 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . Shutemov" Subject: [PATCH v2 09/24] mm: Pass zap_flags into unmap_mapping_pages() Date: Tue, 27 Apr 2021 12:13:02 -0400 Message-Id: <20210427161317.50682-10-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 06AEC13A X-Stat-Signature: rz9mmi1spdefbk4rnocmwcawhsahyq93 Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf04; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=170.10.133.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540016-846726 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Give unmap_mapping_pages() more power by letting callers specify zap flags, so that they can pass in more information than "whether we'd also like to zap cow pages". With the new parameter, we can remove the even_cows parameter, because even_cows==false is equivalent to zap_flags==ZAP_FLAG_CHECK_MAPPING, while even_cows==true means passing no zap flags, i.e. zap_flags==0 (though most callers pass even_cows==false). No functional change intended.
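As a quick sanity check of the equivalence claimed above, the mapping from the old boolean to the new flags can be written out as a stand-alone sketch. The two helper names below, even_cows_to_zap_flags() and effective_zap_flags(), are hypothetical and do not exist in the kernel; they only spell out the conversion that unmap_mapping_range() does inline and the OR with ZAP_FLAG_SKIP_SWAP that unmap_mapping_pages() applies when filling zap_details.

```c
#include <stdbool.h>

#define ZAP_FLAG_CHECK_MAPPING	(1UL << 0)	/* BIT(0) in the kernel */
#define ZAP_FLAG_SKIP_SWAP	(1UL << 1)	/* BIT(1) in the kernel */

/*
 * Hypothetical helper: the conversion unmap_mapping_range() performs
 * inline when it still receives the legacy even_cows argument.
 */
static unsigned long even_cows_to_zap_flags(bool even_cows)
{
	return even_cows ? 0 : ZAP_FLAG_CHECK_MAPPING;
}

/*
 * Hypothetical helper: the flags that end up in zap_details.zap_flags.
 * unmap_mapping_pages() always adds ZAP_FLAG_SKIP_SWAP on top of
 * whatever the caller passed in.
 */
static unsigned long effective_zap_flags(unsigned long zap_flags)
{
	return zap_flags | ZAP_FLAG_SKIP_SWAP;
}
```

Under these assumptions, a converted call site that used to pass even_cows == false ends up with both ZAP_FLAG_CHECK_MAPPING and ZAP_FLAG_SKIP_SWAP set, which is the same state the previous patch built by hand, consistent with "no functional change intended".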
Signed-off-by: Peter Xu --- fs/dax.c | 10 ++++++---- include/linux/mm.h | 4 ++-- mm/khugepaged.c | 3 ++- mm/memory.c | 15 ++++++++------- mm/truncate.c | 11 +++++++---- 5 files changed, 25 insertions(+), 18 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 69216241392f2..20ca8d7d36ebb 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -517,7 +517,7 @@ static void *grab_mapping_entry(struct xa_state *xas, xas_unlock_irq(xas); unmap_mapping_pages(mapping, xas->xa_index & ~PG_PMD_COLOUR, - PG_PMD_NR, false); + PG_PMD_NR, ZAP_FLAG_CHECK_MAPPING); xas_reset(xas); xas_lock_irq(xas); } @@ -612,7 +612,8 @@ struct page *dax_layout_busy_page_range(struct address_space *mapping, * guaranteed to either see new references or prevent new * references from being established. */ - unmap_mapping_pages(mapping, start_idx, end_idx - start_idx + 1, 0); + unmap_mapping_pages(mapping, start_idx, end_idx - start_idx + 1, + ZAP_FLAG_CHECK_MAPPING); xas_lock_irq(&xas); xas_for_each(&xas, entry, end_idx) { @@ -743,9 +744,10 @@ static void *dax_insert_entry(struct xa_state *xas, /* we are replacing a zero page with block mapping */ if (dax_is_pmd_entry(entry)) unmap_mapping_pages(mapping, index & ~PG_PMD_COLOUR, - PG_PMD_NR, false); + PG_PMD_NR, ZAP_FLAG_CHECK_MAPPING); else /* pte entry */ - unmap_mapping_pages(mapping, index, 1, false); + unmap_mapping_pages(mapping, index, 1, + ZAP_FLAG_CHECK_MAPPING); } xas_reset(xas); diff --git a/include/linux/mm.h b/include/linux/mm.h index 2227e9107e53e..b8aa81a064a55 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1784,7 +1784,7 @@ extern int fixup_user_fault(struct mm_struct *mm, unsigned long address, unsigned int fault_flags, bool *unlocked); void unmap_mapping_pages(struct address_space *mapping, - pgoff_t start, pgoff_t nr, bool even_cows); + pgoff_t start, pgoff_t nr, unsigned long zap_flags); void unmap_mapping_range(struct address_space *mapping, loff_t const holebegin, loff_t const holelen, int even_cows); #else @@ -1804,7 +1804,7 @@ 
static inline int fixup_user_fault(struct mm_struct *mm, unsigned long address, return -EFAULT; } static inline void unmap_mapping_pages(struct address_space *mapping, - pgoff_t start, pgoff_t nr, bool even_cows) { } + pgoff_t start, pgoff_t nr, unsigned long zap_flags) { } static inline void unmap_mapping_range(struct address_space *mapping, loff_t const holebegin, loff_t const holelen, int even_cows) { } #endif diff --git a/mm/khugepaged.c b/mm/khugepaged.c index e8b299aa32d06..64a36cd375359 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1831,7 +1831,8 @@ static void collapse_file(struct mm_struct *mm, } if (page_mapped(page)) - unmap_mapping_pages(mapping, index, 1, false); + unmap_mapping_pages(mapping, index, 1, + ZAP_FLAG_CHECK_MAPPING); xas_lock_irq(&xas); xas_set(&xas, index); diff --git a/mm/memory.c b/mm/memory.c index 5325c1c2cbd78..189f60853a51d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3224,7 +3224,10 @@ static inline void unmap_mapping_range_tree(struct rb_root_cached *root, * @mapping: The address space containing pages to be unmapped. * @start: Index of first page to be unmapped. * @nr: Number of pages to be unmapped. 0 to unmap to end of file. - * @even_cows: Whether to unmap even private COWed pages. + * @zap_flags: Zap flags for the process. E.g., when ZAP_FLAG_CHECK_MAPPING is + * passed into it, we will only zap the pages that are in the same mapping + * specified in the @mapping parameter; otherwise we will not check mapping, + * IOW cow pages will be zapped too. * * Unmap the pages in this address space from any userspace process which * has them mmaped. Generally, you want to remove COWed pages as well when @@ -3232,17 +3235,14 @@ static inline void unmap_mapping_range_tree(struct rb_root_cached *root, * cache. 
*/ void unmap_mapping_pages(struct address_space *mapping, pgoff_t start, - pgoff_t nr, bool even_cows) + pgoff_t nr, unsigned long zap_flags) { pgoff_t first_index = start, last_index = start + nr - 1; struct zap_details details = { .zap_mapping = mapping, - .zap_flags = ZAP_FLAG_SKIP_SWAP, + .zap_flags = zap_flags | ZAP_FLAG_SKIP_SWAP, }; - if (!even_cows) - details.zap_flags |= ZAP_FLAG_CHECK_MAPPING; - if (last_index < first_index) last_index = ULONG_MAX; @@ -3284,7 +3284,8 @@ void unmap_mapping_range(struct address_space *mapping, hlen = ULONG_MAX - hba + 1; } - unmap_mapping_pages(mapping, hba, hlen, even_cows); + unmap_mapping_pages(mapping, hba, hlen, even_cows ? + 0 : ZAP_FLAG_CHECK_MAPPING); } EXPORT_SYMBOL(unmap_mapping_range); diff --git a/mm/truncate.c b/mm/truncate.c index 95af244b112a0..ba2cbe300e83e 100644 --- a/mm/truncate.c +++ b/mm/truncate.c @@ -172,7 +172,8 @@ truncate_cleanup_page(struct address_space *mapping, struct page *page) { if (page_mapped(page)) { unsigned int nr = thp_nr_pages(page); - unmap_mapping_pages(mapping, page->index, nr, false); + unmap_mapping_pages(mapping, page->index, nr, + ZAP_FLAG_CHECK_MAPPING); } if (page_has_private(page)) @@ -652,14 +653,15 @@ int invalidate_inode_pages2_range(struct address_space *mapping, * Zap the rest of the file in one hit. */ unmap_mapping_pages(mapping, index, - (1 + end - index), false); + (1 + end - index), + ZAP_FLAG_CHECK_MAPPING); did_range_unmap = 1; } else { /* * Just zap this page */ unmap_mapping_pages(mapping, index, - 1, false); + 1, ZAP_FLAG_CHECK_MAPPING); } } BUG_ON(page_mapped(page)); @@ -685,7 +687,8 @@ int invalidate_inode_pages2_range(struct address_space *mapping, * get remapped later. 
*/ if (dax_mapping(mapping)) { - unmap_mapping_pages(mapping, start, end - start + 1, false); + unmap_mapping_pages(mapping, start, end - start + 1, + ZAP_FLAG_CHECK_MAPPING); } out: cleancache_invalidate_inode(mapping); From patchwork Tue Apr 27 16:13:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226885 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0C2CC433ED for ; Tue, 27 Apr 2021 16:13:48 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 5E51A613E8 for ; Tue, 27 Apr 2021 16:13:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5E51A613E8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D67C86B007D; Tue, 27 Apr 2021 12:13:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D400E6B007E; Tue, 27 Apr 2021 12:13:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B46756B0080; Tue, 27 Apr 2021 12:13:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0058.hostedemail.com [216.40.44.58]) by kanga.kvack.org (Postfix) with ESMTP id 7FD8F6B007D for ; Tue, 27 Apr 2021 12:13:42 -0400 (EDT) Received: from smtpin07.hostedemail.com 
(10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 3FA903AA7 for ; Tue, 27 Apr 2021 16:13:42 +0000 (UTC) X-FDA: 78078642684.07.5013065 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf13.hostedemail.com (Postfix) with ESMTP id 08985E00010B for ; Tue, 27 Apr 2021 16:13:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540021; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YOidGBu67rggbabLyL6x5COAOqvbxp3OOia9m+3YHH4=; b=EWLI1NIIwidqTwEj1Xrlx/6KkxRMVINPaaP6HgF9sTGzgItP7A4MV2/4KF2U+L2Od4hFO5 4M6hQ4dc3/uh7lstV9aXFc1LJ6BRXtKFnTXelgh7yoC4GYSbT0cFyh5sG9Lds5bK79ChN8 shrz0HbLEymNT+xlZE4FJ89UhILL2mY= Received: from mail-qk1-f198.google.com (mail-qk1-f198.google.com [209.85.222.198]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-382-0QgkCCymMDuOkVROBX3l_Q-1; Tue, 27 Apr 2021 12:13:39 -0400 X-MC-Unique: 0QgkCCymMDuOkVROBX3l_Q-1 Received: by mail-qk1-f198.google.com with SMTP id h190-20020a3785c70000b02902e022511825so23253088qkd.7 for ; Tue, 27 Apr 2021 09:13:39 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=YOidGBu67rggbabLyL6x5COAOqvbxp3OOia9m+3YHH4=; b=MvdECkI9Pc9cueqb58SraQREvy6LDgQWmQNSbUYfdv1Lh2+2yCP3LGOBtCTJnJXBds E2zwlnmvgZYuu1pBo+Aj23om3KMTBj533QSxA4is78dxSZQ/SQxweXqSHfzwjrAfJX46 NZrfVmYvV1UAqLu3C+4H+DR+ExyxY93tMEH05+naqCdg4xufBORzv2pCXQVzKOH2ek21 1vxdkLshiRamv6satORyvnXSwhhT+sgghetMkyCfv+kwmja7rx+088nfv1SuhquqfaJK 6PeWDx2sgb6Gm8S69Uouinsz6156drqDenJ31iI6f8xzCXo5eXeWftJaTfirqQkWS35J uw8g== X-Gm-Message-State: 
AOAM530YDxgukExPOB4dSebuhq1JtXCi6DsRDEOkNh9nkeYPzL73U3a8 5ruFIB/3c5RNQT6Cw89RSpfFZ20Iw9ef3802llesNSawh9jZ3XMLa7+b5Zg8/DoYkPeAJ+mRJSH MBDRoOnSmlQWKy7K5ZOHNXoFefwswnTjycCsY3idE9vcQpsNwymklz/0IJB3C X-Received: by 2002:a05:622a:15c6:: with SMTP id d6mr21539287qty.172.1619540018415; Tue, 27 Apr 2021 09:13:38 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwPwsZSjtPtUeVC6HEWqXxrayaBTQUK6a/jnkm2j/6sjjPcAvNZHssIXgu3pSQdPgQjGRU+Iw== X-Received: by 2002:a05:622a:15c6:: with SMTP id d6mr21539233qty.172.1619540017940; Tue, 27 Apr 2021 09:13:37 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:37 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . 
Shutemov" Subject: [PATCH v2 10/24] shmem/userfaultfd: Persist uffd-wp bit across zapping for file-backed Date: Tue, 27 Apr 2021 12:13:03 -0400 Message-Id: <20210427161317.50682-11-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 08985E00010B X-Stat-Signature: d5h6a187b18w5f8ry4453yxewisc54hr Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf13; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=170.10.133.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540013-460088 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

File-backed memory is prone to being unmapped at any time. It means all information in the pte will be dropped, including the uffd-wp flag.

Since the uffd-wp info cannot be stored in page cache or swap cache, persist this wr-protect information by installing the special uffd-wp marker pte when we're going to unmap a uffd wr-protected pte. When the pte is accessed again, we will know it was previously wr-protected by recognizing the special pte.

Meanwhile, add a new flag ZAP_FLAG_DROP_FILE_UFFD_WP for the cases where we don't want to persist such information, for example, when destroying the whole vma or punching a hole in a shmem file. For the latter, we can only drop the uffd-wp bit when holding the page lock. It means the unmap_mapping_range() in shmem_fallocate() still needs to zap without ZAP_FLAG_DROP_FILE_UFFD_WP because that's still racy with the page faults.
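
For illustration only (this is not part of the patch), the zap-time decision described above can be modeled in userspace C. Only the ZAP_FLAG_* names are taken from this series; the model_* types and helpers, and the three-state pte, are assumptions made up for this sketch, and swap ptes are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag names as in the series; the values mirror BIT(0..2). */
#define ZAP_FLAG_CHECK_MAPPING      (1UL << 0)
#define ZAP_FLAG_SKIP_SWAP          (1UL << 1)
#define ZAP_FLAG_DROP_FILE_UFFD_WP  (1UL << 2)

/* Hypothetical stand-in for a pte; not the kernel's pte_t. */
enum model_pte_kind {
	MODEL_PTE_NONE,
	MODEL_PTE_PRESENT,
	MODEL_PTE_UFFD_WP_MARKER,
};

struct model_pte {
	enum model_pte_kind kind;
	bool uffd_wp;
};

/* Hypothetical stand-in for struct zap_details. */
struct model_zap_details {
	unsigned long zap_flags;
};

/* As in zap_drop_file_uffd_wp(): NULL details means "do not drop". */
static bool model_zap_drop_file_uffd_wp(const struct model_zap_details *details)
{
	if (!details)
		return false;
	return (details->zap_flags & ZAP_FLAG_DROP_FILE_UFFD_WP) != 0;
}

/* Zap one modeled pte and return what is left in its slot. */
static struct model_pte model_zap_one(struct model_pte old, bool vma_is_anon,
				      const struct model_zap_details *details)
{
	const struct model_pte none = { MODEL_PTE_NONE, false };
	const struct model_pte marker = { MODEL_PTE_UFFD_WP_MARKER, true };
	bool drop = model_zap_drop_file_uffd_wp(details);

	switch (old.kind) {
	case MODEL_PTE_UFFD_WP_MARKER:
		/* Model invariant: markers only exist for file-backed vmas. */
		assert(!vma_is_anon);
		/* A common unmap keeps the marker; only a forced drop clears it. */
		return drop ? none : marker;
	case MODEL_PTE_PRESENT:
		/* The pte is cleared first; a marker may then replace it. */
		if (!drop && !vma_is_anon && old.uffd_wp)
			return marker;
		return none;
	case MODEL_PTE_NONE:
		break;
	}
	return none;
}
```

In this model, zapping a wr-protected file-backed pte with NULL details leaves a marker behind, while passing ZAP_FLAG_DROP_FILE_UFFD_WP (as unmap_vmas() and truncate_cleanup_page() do in the diff below) leaves a none pte.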
Signed-off-by: Peter Xu --- include/linux/mm.h | 11 ++++++++++ include/linux/mm_inline.h | 43 +++++++++++++++++++++++++++++++++++++++ mm/memory.c | 42 +++++++++++++++++++++++++++++++++++++- mm/rmap.c | 8 ++++++++ mm/truncate.c | 8 +++++++- 5 files changed, 110 insertions(+), 2 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index b8aa81a064a55..d6790ab0cf575 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1706,6 +1706,8 @@ extern void user_shm_unlock(size_t, struct user_struct *); #define ZAP_FLAG_CHECK_MAPPING BIT(0) /* Whether to skip zapping swap entries */ #define ZAP_FLAG_SKIP_SWAP BIT(1) +/* Whether to completely drop uffd-wp entries for file-backed memory */ +#define ZAP_FLAG_DROP_FILE_UFFD_WP BIT(2) /* * Parameter block passed down to zap_pte_range in exceptional cases. @@ -1738,6 +1740,15 @@ zap_skip_swap(struct zap_details *details) return details->zap_flags & ZAP_FLAG_SKIP_SWAP; } +static inline bool +zap_drop_file_uffd_wp(struct zap_details *details) +{ + if (!details) + return false; + + return details->zap_flags & ZAP_FLAG_DROP_FILE_UFFD_WP; +} + struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr, pte_t pte); struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr, diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h index 355ea1ee32bd7..c29a6ef3a642a 100644 --- a/include/linux/mm_inline.h +++ b/include/linux/mm_inline.h @@ -4,6 +4,8 @@ #include #include +#include +#include /** * page_is_file_lru - should the page be on a file LRU or anon LRU? @@ -104,4 +106,45 @@ static __always_inline void del_page_from_lru_list(struct page *page, update_lru_size(lruvec, page_lru(page), page_zonenum(page), -thp_nr_pages(page)); } + +/* + * If this pte is wr-protected by uffd-wp in any form, arm the special pte to + * replace a none pte. NOTE! This should only be called when *pte is already + * cleared so we will never accidentally replace something valuable. 
Meanwhile + * none pte also means we are not demoting the pte so if tlb flushed then we + * don't need to do it again; otherwise if tlb flush is postponed then it's + * even better. + * + * Must be called with pgtable lock held. + */ +static inline void +pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr, + pte_t *pte, pte_t pteval) +{ +#ifdef CONFIG_USERFAULTFD + bool arm_uffd_pte = false; + + /* The current status of the pte should be "cleared" before calling */ + WARN_ON_ONCE(!pte_none(*pte)); + + if (vma_is_anonymous(vma)) + return; + + /* A uffd-wp wr-protected normal pte */ + if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval))) + arm_uffd_pte = true; + + /* + * A uffd-wp wr-protected swap pte. Note: this should even work for + * pte_swp_uffd_wp_special() too. + */ + if (unlikely(is_swap_pte(pteval) && pte_swp_uffd_wp(pteval))) + arm_uffd_pte = true; + + if (unlikely(arm_uffd_pte)) + set_pte_at(vma->vm_mm, addr, pte, + pte_swp_mkuffd_wp_special(vma)); +#endif +} + #endif diff --git a/mm/memory.c b/mm/memory.c index 189f60853a51d..872fb59192277 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -73,6 +73,7 @@ #include #include #include +#include #include @@ -1210,6 +1211,21 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma) return ret; } +/* + * This function makes sure that we'll replace the none pte with an uffd-wp + * swap special pte marker when necessary. Must be with the pgtable lock held. 
+ */ +static inline void +zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, + unsigned long addr, pte_t *pte, + struct zap_details *details, pte_t pteval) +{ + if (zap_drop_file_uffd_wp(details)) + return; + + pte_install_uffd_wp_if_needed(vma, addr, pte, pteval); +} + static unsigned long zap_pte_range(struct mmu_gather *tlb, struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, @@ -1247,6 +1263,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); tlb_remove_tlb_entry(tlb, pte, addr); + zap_install_uffd_wp_if_needed(vma, addr, pte, details, + ptent); if (unlikely(!page)) continue; @@ -1271,6 +1289,22 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, continue; } + /* + * If this is a special uffd-wp marker pte... Drop it only if + * enforced to do so. + */ + if (unlikely(is_swap_special_pte(ptent))) { + WARN_ON_ONCE(!pte_swp_uffd_wp_special(ptent)); + /* + * If this is a common unmap of ptes, keep this as is. + * Drop it only if this is a whole-vma destruction. 
+ */ + if (zap_drop_file_uffd_wp(details)) + ptep_get_and_clear_full(mm, addr, pte, + tlb->fullmm); + continue; + } + entry = pte_to_swp_entry(ptent); if (is_device_private_entry(entry)) { struct page *page = device_private_entry_to_page(entry); @@ -1281,6 +1315,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, rss[mm_counter(page)]--; page_remove_rmap(page, false); put_page(page); + zap_install_uffd_wp_if_needed(vma, addr, pte, details, + ptent); continue; } @@ -1298,6 +1334,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, if (unlikely(!free_swap_and_cache(entry))) print_bad_pte(vma, addr, ptent, NULL); pte_clear_not_present_full(mm, addr, pte, tlb->fullmm); + zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent); } while (pte++, addr += PAGE_SIZE, addr != end); add_mm_rss_vec(mm, rss); @@ -1497,12 +1534,15 @@ void unmap_vmas(struct mmu_gather *tlb, unsigned long end_addr) { struct mmu_notifier_range range; + struct zap_details details = { + .zap_flags = ZAP_FLAG_DROP_FILE_UFFD_WP, + }; mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm, start_addr, end_addr); mmu_notifier_invalidate_range_start(&range); for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next) - unmap_single_vma(tlb, vma, start_addr, end_addr, NULL); + unmap_single_vma(tlb, vma, start_addr, end_addr, &details); mmu_notifier_invalidate_range_end(&range); } diff --git a/mm/rmap.c b/mm/rmap.c index b0fc27e77d6d7..5e25c57164fcf 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -72,6 +72,7 @@ #include #include #include +#include #include @@ -1571,6 +1572,13 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, pteval = ptep_clear_flush(vma, address, pvmw.pte); } + /* + * Now the pte is cleared. If this is uffd-wp armed pte, we + * may want to replace a none pte with a marker pte if it's + * file-backed, so we don't lose the tracking information. 
+ */ + pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); + /* Move the dirty bit to the page. Now the pte is gone. */ if (pte_dirty(pteval)) set_page_dirty(page); diff --git a/mm/truncate.c b/mm/truncate.c index ba2cbe300e83e..65fed21e52bd0 100644 --- a/mm/truncate.c +++ b/mm/truncate.c @@ -173,7 +173,13 @@ truncate_cleanup_page(struct address_space *mapping, struct page *page) if (page_mapped(page)) { unsigned int nr = thp_nr_pages(page); unmap_mapping_pages(mapping, page->index, nr, - ZAP_FLAG_CHECK_MAPPING); + ZAP_FLAG_CHECK_MAPPING | + /* + * Now it's safe to drop uffd-wp because + * we're with page lock, and the page is + * being truncated. + */ + ZAP_FLAG_DROP_FILE_UFFD_WP); } if (page_has_private(page)) From patchwork Tue Apr 27 16:13:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226887 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C08DC433B4 for ; Tue, 27 Apr 2021 16:13:51 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DB400613E8 for ; Tue, 27 Apr 2021 16:13:50 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DB400613E8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 332416B007E; Tue, 27 Apr 2021 12:13:44 -0400 (EDT) Received: by 
kanga.kvack.org (Postfix, from userid 40) id 2BCC26B0080; Tue, 27 Apr 2021 12:13:44 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0E8BF6B0081; Tue, 27 Apr 2021 12:13:44 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0245.hostedemail.com [216.40.44.245]) by kanga.kvack.org (Postfix) with ESMTP id DA6846B007E for ; Tue, 27 Apr 2021 12:13:43 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 9DC248249980 for ; Tue, 27 Apr 2021 16:13:43 +0000 (UTC) X-FDA: 78078642726.11.05285D1 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf13.hostedemail.com (Postfix) with ESMTP id 6DE6AE000135 for ; Tue, 27 Apr 2021 16:13:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540022; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sFcTcWKJ6vbywtGp6iOAQR8zJIhwD0TyppO0sf4kPHY=; b=I4AEvx7eP+xOQx2VXfKGufbVIfGKE2b5bj7yvf1ZetzblU94yVSzIdOvRByLBw4zzAQzJV lm6Ve8aDhHDOCEY/pBFjNNvZAUXcRlXpXi4CnO5Hbzuq2B55u7OnulKJfC5+6rb2SkKQr0 E3yzj55sDAO90x9QE6QnU/ci5b6HVGQ= Received: from mail-qk1-f198.google.com (mail-qk1-f198.google.com [209.85.222.198]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-578-SSRYJSmlPJKfmqv7O0YCjw-1; Tue, 27 Apr 2021 12:13:41 -0400 X-MC-Unique: SSRYJSmlPJKfmqv7O0YCjw-1 Received: by mail-qk1-f198.google.com with SMTP id v7-20020a05620a0a87b02902e02f31812fso23309395qkg.6 for ; Tue, 27 Apr 2021 09:13:41 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=sFcTcWKJ6vbywtGp6iOAQR8zJIhwD0TyppO0sf4kPHY=; b=o1QaiMsKeM3NTt/Eale0kCAmBbjLrabHUiVwVSchMXHs7hoM4LcjZho5yhF3WEY0aE Ao/Rqw9Ix1QGMumOoW6kYXbVi8SkhHaj3HvF0MZepgiJP6vOFV2BE/WSoX2odGV97Fzm ePMXq2qrjfYMNJg60K8sJ/V6lrKHOMK/9V1+edTWqn17jQ05CRaZbdjmVZE0LOSI4ZyJ MuPN5wpFOuG/7A5T/jLT7ZQ+f6MsmnhJNtxCYA34dTVdVNbTm8jygMgF4bVPso12Akz5 3CMymFIXaCUuGEOG3057VtaziH4ql502Aj8QSH49kHasMELOQH05APtgcpwhCYkQ7aWc oMIA== X-Gm-Message-State: AOAM531mwKqST1aJKLrc1AzTivXN9atE2MukXlQy9vp7Um4SR3J1LtXj BkvpWX271dbVycpaPlrtqtJydYla8XKRfuG+J+3uANZnauUKwMUqX0DgbHotd4Gz3GUb2R8JGRE XYjn0qYGSkuEJoslU6m4Oj2Wc5NGfcef3XwTwaeQVsKaLkSLsDghP1Bi8jH+Y X-Received: by 2002:a37:906:: with SMTP id 6mr23571632qkj.234.1619540019870; Tue, 27 Apr 2021 09:13:39 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwizuNUF7IfETxrg3V3tIEOD3b3VLA2dvV5qF9HpChu8JMhRtFPP+WV7Rtoqm2bFWb4I5EFlQ== X-Received: by 2002:a37:906:: with SMTP id 6mr23571576qkj.234.1619540019405; Tue, 27 Apr 2021 09:13:39 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:38 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . 
Shutemov" Subject: [PATCH v2 11/24] shmem/userfaultfd: Allow wr-protect none pte for file-backed mem Date: Tue, 27 Apr 2021 12:13:04 -0400 Message-Id: <20210427161317.50682-12-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Stat-Signature: itwecr86o5pf6p9pcztw7ipcqtd465n9 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 6DE6AE000135 Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf13; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=170.10.133.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540015-122090 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

File-backed memory differs from anonymous memory in that even if the pte is missing, the data could still reside either in the file or in page/swap cache. So when wr-protecting a pte, we need to consider none ptes too.

We do that by installing the uffd-wp special swap pte as a marker. So when there's a future write to the pte, the fault handler will go the special path to first fault in the page as read-only, then report to the userfaultfd server with the wr-protect message.

On the other hand, when unprotecting a page, it's also possible that the pte got unmapped but replaced by the special uffd-wp marker. Then we'll need to be able to recover from a uffd-wp special swap pte into a none pte, so that the next access to the page will fault in correctly as usual, rather than sending a uffd-wp message.

Special care needs to be taken throughout the change_protection_range() process.
Since we now allow the user to wr-protect a none pte, we need to be able to pre-populate the page table entries if we see !anonymous && MM_CP_UFFD_WP requests; otherwise change_protection_range() will always skip when the pgtable entry does not exist.

Note that this patch only covers small pages (pte level) and does not cover transparent huge pages yet. But this will be a base for thps too.

Signed-off-by: Peter Xu --- mm/mprotect.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) diff --git a/mm/mprotect.c b/mm/mprotect.c index b3def0a102bf4..6b63e3544b470 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include #include @@ -176,6 +177,32 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, set_pte_at(vma->vm_mm, addr, pte, newpte); pages++; } + } else if (unlikely(is_swap_special_pte(oldpte))) { + if (uffd_wp_resolve && !vma_is_anonymous(vma) && + pte_swp_uffd_wp_special(oldpte)) { + /* + * This is uffd-wp special pte and we'd like to + * unprotect it. What we need to do is simply + * recover the pte into a none pte; the next + * page fault will fault in the page. + */ + pte_clear(vma->vm_mm, addr, pte); + pages++; + } + } else { + /* It must be an none page, or what else?.. */ + WARN_ON_ONCE(!pte_none(oldpte)); + if (unlikely(uffd_wp && !vma_is_anonymous(vma))) { + /* + * For file-backed mem, we need to be able to + * wr-protect even for a none pte! Because + * even if the pte is null, the page/swap cache + * could exist. + */ + set_pte_at(vma->vm_mm, addr, pte, + pte_swp_mkuffd_wp_special(vma)); + pages++; + } } } while (pte++, addr += PAGE_SIZE, addr != end); arch_leave_lazy_mmu_mode(); @@ -209,6 +236,25 @@ static inline int pmd_none_or_clear_bad_unless_trans_huge(pmd_t *pmd) return 0; } +/* + * File-backed vma allows uffd wr-protect upon none ptes, because even if pte + * is missing, page/swap cache could exist.
When that happens, the wr-protect + * information will be stored in the page table entries with the marker (e.g., + * PTE_SWP_UFFD_WP_SPECIAL). Prepare for that by always populating the page + * tables to pte level, so that we'll install the markers in change_pte_range() + * where necessary. + * + * Note that we only need to do this in pmd level, because if pmd does not + * exist, it means the whole range covered by the pmd entry (of a pud) does not + * contain any valid data but all zeros. Then nothing to wr-protect. + */ +#define change_protection_prepare(vma, pmd, addr, cp_flags) \ + do { \ + if (unlikely((cp_flags & MM_CP_UFFD_WP) && pmd_none(*pmd) && \ + !vma_is_anonymous(vma))) \ + WARN_ON_ONCE(pte_alloc(vma->vm_mm, pmd)); \ + } while (0) + static inline unsigned long change_pmd_range(struct vm_area_struct *vma, pud_t *pud, unsigned long addr, unsigned long end, pgprot_t newprot, unsigned long cp_flags) @@ -227,6 +273,8 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, next = pmd_addr_end(addr, end); + change_protection_prepare(vma, pmd, addr, cp_flags); + /* * Automatic NUMA balancing walks the tables with mmap_lock * held for read. 
It's possible a parallel update to occur From patchwork Tue Apr 27 16:13:05 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226889 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 81C38C433ED for ; Tue, 27 Apr 2021 16:13:53 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1046A613E7 for ; Tue, 27 Apr 2021 16:13:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1046A613E7 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 8FA856B0080; Tue, 27 Apr 2021 12:13:45 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 883E06B0081; Tue, 27 Apr 2021 12:13:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6D8E86B0082; Tue, 27 Apr 2021 12:13:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0047.hostedemail.com [216.40.44.47]) by kanga.kvack.org (Postfix) with ESMTP id 46DED6B0080 for ; Tue, 27 Apr 2021 12:13:45 -0400 (EDT) Received: from smtpin35.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id F40295DE6 for ; Tue, 27 Apr 2021 16:13:44 +0000 (UTC) X-FDA: 78078642810.35.2594227 Received: 
from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf22.hostedemail.com (Postfix) with ESMTP id F11B1C0007FA for ; Tue, 27 Apr 2021 16:13:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540024; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=bJUKYR7hr17ohgPo/T1v5XQHyXxi59fVxEqscfG7/f8=; b=gKu26c/ZJyIzZSeJvI543OTWahwg0KTt2cTjsJEcQetRNCpzLT8FAqKbu5jslYQqunzxMH A9iR4I2dwa6T/XsLFszynZyAlKDVoWwBnebEUpcB7GtA2lsgPQYVfTTyaq/G5aUzfBylQ/ XgUFyvoBtFbOnoaKLoeZGEpT/9v1Ghg= Received: from mail-qt1-f197.google.com (mail-qt1-f197.google.com [209.85.160.197]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-407-tatHiBwYMC2Ht318Ycsy7A-1; Tue, 27 Apr 2021 12:13:42 -0400 X-MC-Unique: tatHiBwYMC2Ht318Ycsy7A-1 Received: by mail-qt1-f197.google.com with SMTP id b8-20020a05622a0208b02901b5b18f4f91so20772727qtx.18 for ; Tue, 27 Apr 2021 09:13:42 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=bJUKYR7hr17ohgPo/T1v5XQHyXxi59fVxEqscfG7/f8=; b=iqlBlYV/vqOPBQ4j3J5v/QNtEaZkTLimm0cE38UiOpaDZeUBFRqFC1+sOaLT6Dc/4Y zE6PywcJUqzbhHBWDp8zIeWjSZGZ5oUcHEJVxkUBfh0YuLE72kH3dLx9jeFjWSk0ttUI ZWsKpnBrRU6lBiSD4dX+4iy3pmsOpfGSEKfp8crIrvILkuYSeIRvL53kXvutrhm/tzL4 YT06ldlO0bam00A/vs/XKQw7L3L3jhLZvyMVa85/EKjN3gPJxltKbq3v9kJFqwoxWP8w mZbSQNXkHcMjlPcmRj0obXWvxwp70zosYI4NGHONaqC83JRlsPD0hpT7N4FUXIoqcvUg 6C+w== X-Gm-Message-State: AOAM5330/hIE7Jx2Can4IBSHwb6df0/kfKQsOjuFjmnbQ4krtq8RbCsK 5HksyU5x+V4S+HQZ6sVuJaFT2VLG6fBsV+jD3+td03lgZJ3CdVWnPSk8UbGl9zJRYMh8w86CsMw cqM9tlBPh8au3A4qq9G9NBkhVsvjD1XbvkNcS/M/vAoAlHzaJ9l7iCtG7ng1y X-Received: by 
2002:a37:a510:: with SMTP id o16mr5597698qke.306.1619540021761; Tue, 27 Apr 2021 09:13:41 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyK0i3RBqZsx2OzwuTxGkQwSJmhVwmVOV/gMvcyErmc0haHIIYmda75zIZFKivMWbj/kuK6pA== X-Received: by 2002:a37:a510:: with SMTP id o16mr5597661qke.306.1619540021451; Tue, 27 Apr 2021 09:13:41 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:40 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . Shutemov" Subject: [PATCH v2 12/24] shmem/userfaultfd: Allows file-back mem to be uffd wr-protected on thps Date: Tue, 27 Apr 2021 12:13:05 -0400 Message-Id: <20210427161317.50682-13-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: F11B1C0007FA X-Stat-Signature: hceqskwthfexsqo64kesfik83n4gebp8 Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf22; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=216.205.24.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540017-293025 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: We don't have "huge" version of PTE_SWP_UFFD_WP_SPECIAL, instead when 
necessary we split the thp if the huge page was previously uffd wr-protected.

However, splitting the thp is not enough, because a file-backed thp is handled totally differently from anonymous thps: rather than doing a real split, the thp pmd will simply get dropped in __split_huge_pmd_locked().

That is definitely not enough when, e.g., a thp covers range [0, 2M) but we want to wr-protect a small page residing in the [4K, 8K) range, because after __split_huge_pmd() returns, there will be a none pmd.

Here we leverage the previously introduced change_protection_prepare() macro so that we'll populate the pmd with a pgtable page. Then change_pte_range() will do all the rest for us, e.g., install the uffd-wp swap special pte marker at any pte that we'd like to wr-protect, under the protection of the pgtable lock.

Signed-off-by: Peter Xu --- mm/mprotect.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/mm/mprotect.c b/mm/mprotect.c index 6b63e3544b470..51c954afa4069 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -296,8 +296,16 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, } if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) { - if (next - addr != HPAGE_PMD_SIZE) { + if (next - addr != HPAGE_PMD_SIZE || + /* Uffd wr-protecting a file-backed memory range */ + unlikely(!vma_is_anonymous(vma) && + (cp_flags & MM_CP_UFFD_WP))) { __split_huge_pmd(vma, pmd, addr, false, NULL); + /* + * For file-backed, the pmd could have been + * gone; still provide a pte pgtable if needed.
+ */ + change_protection_prepare(vma, pmd, addr, cp_flags); } else { int nr_ptes = change_huge_pmd(vma, pmd, addr, newprot, cp_flags); From patchwork Tue Apr 27 16:13:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226891 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C0EBC433B4 for ; Tue, 27 Apr 2021 16:13:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D3005613C2 for ; Tue, 27 Apr 2021 16:13:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D3005613C2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 63CAB6B0081; Tue, 27 Apr 2021 12:13:47 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 613B76B0082; Tue, 27 Apr 2021 12:13:47 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4B9A36B0083; Tue, 27 Apr 2021 12:13:47 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0046.hostedemail.com [216.40.44.46]) by kanga.kvack.org (Postfix) with ESMTP id 25EB66B0081 for ; Tue, 27 Apr 2021 12:13:47 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 
DA547180AD81D for ; Tue, 27 Apr 2021 16:13:46 +0000 (UTC) X-FDA: 78078642852.01.9EB091F Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf17.hostedemail.com (Postfix) with ESMTP id F03EE40001DE for ; Tue, 27 Apr 2021 16:13:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540026; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=MQo3igXQ2ZrlT78vmfcbMjgqdCn30eweEx4RKB8geEQ=; b=GXugMs41Lg2WhU6t70DJte53tdotvshl5v3R/VwcSN8gUs60tIovGCn3bCjw75AtsOJ35n BJfDBJIUJRikrILUqefm7zLbCTUWKPNOwBmsqT8vp9SpPqehUSNHfKnG132KEDPvfh7If0 n8f7bgWQLStuvpNe/FQzyUBlWLSKebk= Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com [209.85.222.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-333-AaNLft8GOHeUUTGtOC6-oA-1; Tue, 27 Apr 2021 12:13:44 -0400 X-MC-Unique: AaNLft8GOHeUUTGtOC6-oA-1 Received: by mail-qk1-f200.google.com with SMTP id g76-20020a379d4f0000b02902e40532d832so17025115qke.20 for ; Tue, 27 Apr 2021 09:13:44 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=MQo3igXQ2ZrlT78vmfcbMjgqdCn30eweEx4RKB8geEQ=; b=kIQVKKRBpiQDbP9lFyP29tgEBQhB18BEAzFSd47hym7JIxRUd5vPfTbBpRxsjlqr4a Nl7hhAZqnu3vGSzzFWs6fFIUrHkZOX9o3kCGd04WtwUbiRzUzQEcLbTf+BNxoWADic/p 5BbjUg1A1k8ytuxQDsHqJc9z4aXev1EdVVJTUVPLBsMs1w5XufK/keSjAVKBr3RLTzSR f6hP2f/K0j5137Z8nhHFBy916ODscPo7ATQ/6+p1yJM+0ZAK5PBz0rpjTaXgygXxalRH nLjWL929Dd8atVjqBiaIlr9AoJaXcz3dl9RjHCw79IqwIfMkJ3mWZ4ydGoCgxTvOP5bW DaTQ== X-Gm-Message-State: AOAM53329ElDwfrqcXD7fk39sBw+4La4t55921KtyogmsRF9tIesdI9A 
0bG/eoM6VDdZeDMNgmlXBL47wVnwKDdBcHxgP/QhHHnSoEiEFBtVBh9yUvOoRpe1ZlWFn4c6pRd fUkwKd5nsTB8fSwGjGKAt45w1UFSvBLOB7yYCiFVNLXx02k/Rh2SeS9No+wbe X-Received: by 2002:a37:ae85:: with SMTP id x127mr23480375qke.436.1619540023608; Tue, 27 Apr 2021 09:13:43 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwBvftm878KrfXNKO7yJY30UGBZnCImne2Q/icuY+oWvYS3f5pjudN4L4FNOQSPKT1Zf9Wuiw== X-Received: by 2002:a37:ae85:: with SMTP id x127mr23480325qke.436.1619540023259; Tue, 27 Apr 2021 09:13:43 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:42 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . 
Shutemov" Subject: [PATCH v2 13/24] shmem/userfaultfd: Handle the left-overed special swap ptes Date: Tue, 27 Apr 2021 12:13:06 -0400 Message-Id: <20210427161317.50682-14-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Queue-Id: F03EE40001DE X-Stat-Signature: em47mcby4cjsmkz4djbwq5tumcntgg7r X-Rspamd-Server: rspam02 Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf17; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=216.205.24.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540022-130948 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Note that the special uffd-wp swap pte can be left over even if the page under the pte got evicted. Normally when evict a page, we will unmap the ptes by walking through the reverse mapping. However we never tracked such information for the special swap ptes because they're not real mappings but just markers. So we need to take care of that when we see a marker but when it's actually meaningless (the page behind it got evicted). We have already taken care of that in e.g. alloc_set_pte() where we'll treat the special swap pte as pte_none() when necessary. However we need to also teach userfaultfd itself on either UFFDIO_COPY or handling page faults, so that everything will still work as expected. 
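For illustration only, not part of this patch: the install rule described
above can be sketched as a small userspace toy model. Every toy_* name and
the marker encoding below are assumptions made up for this sketch; the real
check lives in shmem_mfill_atomic_pte() and uses pte_none() and
pte_swp_uffd_wp_special() on real page table entries.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins, NOT the kernel's pte encoding. */
typedef uint64_t toy_pte_t;

#define TOY_PTE_NONE            ((toy_pte_t)0)
#define TOY_PTE_UFFD_WP_SPECIAL ((toy_pte_t)1 << 62)	/* hypothetical marker value */

static bool toy_pte_none(toy_pte_t pte)
{
	return pte == TOY_PTE_NONE;
}

static bool toy_pte_swp_uffd_wp_special(toy_pte_t pte)
{
	return pte == TOY_PTE_UFFD_WP_SPECIAL;
}

/*
 * Install new_pte into *dst the way a UFFDIO_COPY-like operation would:
 * an empty slot and a leftover uffd-wp marker are both acceptable
 * targets; anything else means a page is already there.
 */
static int toy_mfill_install(toy_pte_t *dst, toy_pte_t new_pte)
{
	assert(dst != NULL);

	if (!toy_pte_none(*dst) && !toy_pte_swp_uffd_wp_special(*dst))
		return -EEXIST;

	*dst = new_pte;		/* the leftover marker, if any, is dropped here */
	return 0;
}
```

In this model, installing over TOY_PTE_NONE or over the marker succeeds,
while installing over any other value returns -EEXIST and leaves the slot
untouched.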

Signed-off-by: Peter Xu
---
 fs/userfaultfd.c | 15 +++++++++++++++
 mm/shmem.c       | 13 ++++++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 5dd78238cc156..b34486a88b5f3 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -329,6 +329,21 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	 */
 	if (pte_none(*pte))
 		ret = true;
+	/*
+	 * We also treat the swap special uffd-wp pte as the pte_none() here.
+	 * This should in most cases be a missing event, as we never handle
+	 * wr-protect upon a special uffd-wp swap pte - it should first be
+	 * converted into a normal read request before handling wp.  It just
+	 * means the page/swap cache that backing this pte is gone, so this
+	 * special pte is leftover.
+	 *
+	 * We can't simply replace it with a none pte because we're not with
+	 * the pgtable lock here.  Instead of taking it and clearing the pte,
+	 * the easy way is to let UFFDIO_COPY understand this pte too when
+	 * trying to install a new page onto it.
+	 */
+	if (pte_swp_uffd_wp_special(*pte))
+		ret = true;
 	if (!pte_write(*pte) && (reason & VM_UFFD_WP))
 		ret = true;
 	pte_unmap(pte);
diff --git a/mm/shmem.c b/mm/shmem.c
index 8fbf7680f044c..a1f21736ad68e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2453,7 +2453,18 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 		goto out_release_unlock;
 
 	ret = -EEXIST;
-	if (!pte_none(*dst_pte))
+	/*
+	 * Besides the none pte, we also allow UFFDIO_COPY to install a pte
+	 * onto the uffd-wp swap special pte, because that pte should be the
+	 * same as a pte_none() just in that it contains wr-protect information
+	 * (which could only be dropped when unmap the memory).
+	 *
+	 * It's safe to drop that marker because we know this is part of a
+	 * MISSING fault, and the caller is very clear about this page missing
+	 * rather than wr-protected.  Then we're sure the wr-protect bit is
+	 * just a leftover so it's useless already.
+	 */
+	if (!pte_none(*dst_pte) && !pte_swp_uffd_wp_special(*dst_pte))
 		goto out_release_unlock;
 
 	lru_cache_add(page);

From patchwork Tue Apr 27 16:13:07 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12226893
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Miaohe Lin, Mike Rapoport, Andrea Arcangeli, Hugh Dickins,
 peterx@redhat.com, Jerome Glisse, Mike Kravetz, Jason Gunthorpe,
 Matthew Wilcox, Andrew Morton, Axel Rasmussen, "Kirill A . Shutemov"
Subject: [PATCH v2 14/24] shmem/userfaultfd: Pass over uffd-wp special swap pte when fork()
Date: Tue, 27 Apr 2021 12:13:07 -0400
Message-Id: <20210427161317.50682-15-peterx@redhat.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20210427161317.50682-1-peterx@redhat.com>
References: <20210427161317.50682-1-peterx@redhat.com>

The uffd-wp special swap pte should be handled similarly to other uffd-wp
wr-protected ptes: pass it over when the dst_vma has VM_UFFD_WP armed,
otherwise drop it.

Signed-off-by: Peter Xu
---
 mm/memory.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 872fb59192277..f1cdc613b5887 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -715,8 +715,21 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	unsigned long vm_flags = dst_vma->vm_flags;
 	pte_t pte = *src_pte;
 	struct page *page;
-	swp_entry_t entry = pte_to_swp_entry(pte);
+	swp_entry_t entry;
+
+	if (unlikely(is_swap_special_pte(pte))) {
+		/*
+		 * uffd-wp special swap pte is the only possibility for now.
+		 * If dst vma is registered with uffd-wp, copy it over.
+		 * Otherwise, ignore this pte as if it's a none pte would work.
+		 */
+		WARN_ON_ONCE(!pte_swp_uffd_wp_special(pte));
+		if (userfaultfd_wp(dst_vma))
+			set_pte_at(dst_mm, addr, dst_pte, pte);
+		return 0;
+	}
 
+	entry = pte_to_swp_entry(pte);
 	if (likely(!non_swap_entry(entry))) {
 		if (swap_duplicate(entry) < 0)
 			return entry.val;

From patchwork Tue Apr 27 16:13:08 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12226895
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Miaohe Lin, Mike Rapoport, Andrea Arcangeli, Hugh Dickins,
 peterx@redhat.com, Jerome Glisse, Mike Kravetz, Jason Gunthorpe,
 Matthew Wilcox, Andrew Morton, Axel Rasmussen, "Kirill A . Shutemov"
Subject: [PATCH v2 15/24] mm/hugetlb: Drop __unmap_hugepage_range definition from hugetlb.h
Date: Tue, 27 Apr 2021 12:13:08 -0400
Message-Id: <20210427161317.50682-16-peterx@redhat.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20210427161317.50682-1-peterx@redhat.com>
References: <20210427161317.50682-1-peterx@redhat.com>

Drop it from the header since it's only used in hugetlb.c.

Suggested-by: Mike Kravetz
Signed-off-by: Peter Xu
---
 include/linux/hugetlb.h | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index b92f25ccef588..eb134a75cad41 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -126,9 +126,6 @@ void __unmap_hugepage_range_final(struct mmu_gather *tlb,
 			  struct vm_area_struct *vma,
 			  unsigned long start, unsigned long end,
 			  struct page *ref_page);
-void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
-				unsigned long start, unsigned long end,
-				struct page *ref_page);
 void hugetlb_report_meminfo(struct seq_file *);
 int hugetlb_report_node_meminfo(char *buf, int len, int nid);
 void hugetlb_show_meminfo(void);
@@ -362,13 +359,6 @@ static inline void __unmap_hugepage_range_final(struct mmu_gather *tlb,
 	BUG();
 }
 
-static inline void __unmap_hugepage_range(struct mmu_gather *tlb,
-			struct vm_area_struct *vma, unsigned long start,
-			unsigned long end, struct page *ref_page)
-{
-	BUG();
-}
-
 static inline vm_fault_t hugetlb_fault(struct mm_struct *mm,
 			struct vm_area_struct *vma, unsigned long address,
 			unsigned int flags)

From patchwork Tue Apr 27 16:13:09 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12226897
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Miaohe Lin, Mike Rapoport, Andrea Arcangeli, Hugh Dickins,
 peterx@redhat.com, Jerome Glisse, Mike Kravetz, Jason Gunthorpe,
 Matthew Wilcox, Andrew Morton, Axel Rasmussen, "Kirill A . Shutemov"
Subject: [PATCH v2 16/24] hugetlb/userfaultfd: Hook page faults for uffd write protection
Date: Tue, 27 Apr 2021 12:13:09 -0400
Message-Id: <20210427161317.50682-17-peterx@redhat.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20210427161317.50682-1-peterx@redhat.com>
References: <20210427161317.50682-1-peterx@redhat.com>

Hook up hugetlb_fault() with the capability to handle userfaultfd-wp
faults.

We do this slightly earlier than hugetlb_cow() so that we can avoid taking
some extra locks that we definitely don't need.
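For illustration only, not part of this patch: the condition that routes a
hugetlb fault to userspace before the CoW path can be modelled in userspace
as a plain predicate. The struct and its field names are hypothetical, and
the sketch assumes that userfaultfd_pte_wp() means "the VMA is armed for
uffd-wp and the pte carries the uffd-wp bit".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy inputs; the field names are hypothetical, not kernel structures. */
struct toy_hugetlb_fault {
	bool vma_uffd_wp;	/* VMA is registered for uffd-wp */
	bool pte_uffd_wp;	/* uffd-wp bit is set in the huge pte */
	bool write_fault;	/* the fault is a write (FAULT_FLAG_WRITE) */
	bool pte_writable;	/* the huge pte already allows writes */
};

/* True when the fault should be reported to userspace as a uffd-wp event. */
static bool toy_should_report_uffd_wp(const struct toy_hugetlb_fault *f)
{
	assert(f != NULL);

	return f->vma_uffd_wp && f->pte_uffd_wp &&
	       f->write_fault && !f->pte_writable;
}
```

Only a write fault on a non-writable, uffd-wp-marked pte in an armed VMA
takes the early exit; every other combination falls through to the normal
fault handling.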
Reviewed-by: Mike Kravetz Signed-off-by: Peter Xu --- mm/hugetlb.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 629aa4c2259c8..8e234ee9a15e2 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4802,6 +4802,25 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, if (unlikely(!pte_same(entry, huge_ptep_get(ptep)))) goto out_ptl; + /* Handle userfault-wp first, before trying to lock more pages */ + if (userfaultfd_pte_wp(vma, huge_ptep_get(ptep)) && + (flags & FAULT_FLAG_WRITE) && !huge_pte_write(entry)) { + struct vm_fault vmf = { + .vma = vma, + .address = haddr, + .flags = flags, + }; + + spin_unlock(ptl); + if (pagecache_page) { + unlock_page(pagecache_page); + put_page(pagecache_page); + } + mutex_unlock(&hugetlb_fault_mutex_table[hash]); + i_mmap_unlock_read(mapping); + return handle_userfault(&vmf, VM_UFFD_WP); + } + /* * hugetlb_cow() requires page locks of pte_page(entry) and * pagecache_page, so here we need take the former one From patchwork Tue Apr 27 16:13:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226899 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 82CA5C433B4 for ; Tue, 27 Apr 2021 16:14:03 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1980261151 for ; Tue, 27 Apr 2021 16:14:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 
1980261151 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 1D1BE6B0092; Tue, 27 Apr 2021 12:13:56 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 0BD066B0096; Tue, 27 Apr 2021 12:13:55 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DB59B6B0095; Tue, 27 Apr 2021 12:13:55 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0103.hostedemail.com [216.40.44.103]) by kanga.kvack.org (Postfix) with ESMTP id B56AC6B0092 for ; Tue, 27 Apr 2021 12:13:55 -0400 (EDT) Received: from smtpin33.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 696618249980 for ; Tue, 27 Apr 2021 16:13:55 +0000 (UTC) X-FDA: 78078643230.33.9D22AAB Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf24.hostedemail.com (Postfix) with ESMTP id ACE17A000187 for ; Tue, 27 Apr 2021 16:13:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540034; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nXCghrNNWHRRCCV5QV6/ctMnnQcSJJc61Ke+xYUBfck=; b=bF5c07sI7zP1jr7pPZ9yO4dUzVoAzVsjK9nrWu0mLF7tST8kZ/Nwu9NAxSdt2uxjypUjVG dzXBtwkx9ml6WnmqeY2b1VlVA86IvYWIQ/ay67rpisW0zN/K7N+NKVr5KB+CB8w/c13tWK sJC9NKK9WSChbr6ZdJBC3ZFkXO2sEsg= Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-64-BzK41oP4N7yIR7RjCoQpBQ-1; Tue, 27 Apr 2021 12:13:51 -0400 X-MC-Unique: 
BzK41oP4N7yIR7RjCoQpBQ-1 Received: by mail-qt1-f198.google.com with SMTP id 10-20020ac8594a0000b02901b9f6ae286fso15562939qtz.23 for ; Tue, 27 Apr 2021 09:13:50 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=nXCghrNNWHRRCCV5QV6/ctMnnQcSJJc61Ke+xYUBfck=; b=ukuh+NHAaJKTBGdHV1CItpG3idoBiT7pI8YTtjI4RY5fp9QxmtbtNGX28f1chC328B hCJ7a60YUXvDDvP+8diiF2p54SmtDAp2X1QCVPZsRwFGsbTjw8S2YZqSBrV1uOqJ2VaS 0Tk9JgKDmEKRwRzrVjdPIwSQCJyv7OyyE4cttoaK4uMTSMwP9VwN/P5aCz82YB498swD yuOZtd9zWMpPbfJhbNs+ldJtZaau+D6XbxuWoKAjKogRNgun3CKCtkl4npD9+4y6uMxQ 1BnHILok6lZP7IoCNLbKlBjRbGAPgJINH/c0zj/OORNSjFe11w1xpm4xrAl/VpKiYiUa fxOg== X-Gm-Message-State: AOAM533/gFaP4accgaD49EyvsflCCiKW9FEeP/IGq/yXX4Gw2nZFVYys UU0d1aR5lshRZyJUBU4AYY/yW74aYZK7OxoD6FJas3lw1AyesPNkJ/0Gv92foOSi14lk7YOQw8J D9/5PViHYSUw+koMMJS3+gKGlP58WL/7lOmxbTrgPVkE8sRonKXQfcWS6DD+n X-Received: by 2002:a0c:fcc8:: with SMTP id i8mr12477272qvq.31.1619540030027; Tue, 27 Apr 2021 09:13:50 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxguoAyDMHYkFEYl63WpQWxcE8JjPIEI2Z6YlCr7tNl/+SaEdw+TPXCtDuapevxol/5mZyGrg== X-Received: by 2002:a0c:fcc8:: with SMTP id i8mr12477221qvq.31.1619540029712; Tue, 27 Apr 2021 09:13:49 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:49 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . 
Shutemov" Subject: [PATCH v2 17/24] hugetlb/userfaultfd: Take care of UFFDIO_COPY_MODE_WP Date: Tue, 27 Apr 2021 12:13:10 -0400 Message-Id: <20210427161317.50682-18-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Queue-Id: ACE17A000187 X-Stat-Signature: jmyq7x7qsk8idzipf1jqetsobz98xre3 X-Rspamd-Server: rspam02 Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf24; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=170.10.133.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540024-207681 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: First, pass the wp_copy variable into hugetlb_mcopy_atomic_pte() throughout the stack. Then, apply the UFFD_WP bit if UFFDIO_COPY_MODE_WP is set with UFFDIO_COPY. Introduce huge_pte_mkuffd_wp() for it. Hugetlb pages are only managed by hugetlbfs, so we're safe even without setting the dirty bit in the huge pte if the page is installed as read-only. However, we'd better still keep the dirty bit set for a read-only UFFDIO_COPY pte (when the UFFDIO_COPY_MODE_WP bit is set), not only to match what we do with shmem, but also because the page does contain dirty data that the kernel just copied from userspace.
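Illustration only, not part of the patch: a minimal userspace sketch of how the bits of the installed pte end up being chosen with this change. The TOY_* flags and toy_mcopy_pte_bits() are hypothetical stand-ins for the real pte helpers, and the model assumes nothing beyond the logic visible in the hugetlb_mcopy_atomic_pte() hunk of this patch.

```c
#include <stdbool.h>

/* Hypothetical bit flags standing in for real pte bits (illustration only). */
#define TOY_PTE_WRITE   (1u << 0)
#define TOY_PTE_DIRTY   (1u << 1)
#define TOY_PTE_YOUNG   (1u << 2)
#define TOY_PTE_UFFD_WP (1u << 3)

/*
 * Toy model of the bit selection in hugetlb_mcopy_atomic_pte().
 * vma_writable models (dst_vma->vm_flags & VM_WRITE).
 */
static unsigned int toy_mcopy_pte_bits(bool wp_copy, bool is_continue,
				       bool vm_shared, bool vma_writable)
{
	unsigned int bits = 0;
	bool writable;

	/* No write bit for CONTINUE on a non-shared VMA or for a wp copy. */
	if (wp_copy || (is_continue && !vm_shared))
		writable = false;
	else
		writable = vma_writable;

	if (writable)
		bits |= TOY_PTE_WRITE;

	/* Always dirty and young, even when the pte is installed read-only. */
	bits |= TOY_PTE_DIRTY | TOY_PTE_YOUNG;

	/* UFFDIO_COPY_MODE_WP additionally tags the pte as uffd-wp. */
	if (wp_copy)
		bits |= TOY_PTE_UFFD_WP;

	return bits;
}
```

In this model a wp copy always yields a read-only, dirty, uffd-wp pte regardless of VM_WRITE, which is the combination the paragraph above argues for.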
Signed-off-by: Peter Xu --- include/asm-generic/hugetlb.h | 5 +++++ include/linux/hugetlb.h | 6 ++++-- mm/hugetlb.c | 22 +++++++++++++++++----- mm/userfaultfd.c | 12 ++++++++---- 4 files changed, 34 insertions(+), 11 deletions(-) diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h index 8e1e6244a89da..548212eccbd61 100644 --- a/include/asm-generic/hugetlb.h +++ b/include/asm-generic/hugetlb.h @@ -27,6 +27,11 @@ static inline pte_t huge_pte_mkdirty(pte_t pte) return pte_mkdirty(pte); } +static inline pte_t huge_pte_mkuffd_wp(pte_t pte) +{ + return pte_mkuffd_wp(pte); +} + static inline pte_t huge_pte_modify(pte_t pte, pgprot_t newprot) { return pte_modify(pte, newprot); diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index eb134a75cad41..e38077918330f 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -138,7 +138,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, pte_t *dst_pte, unsigned long dst_addr, unsigned long src_addr, enum mcopy_atomic_mode mode, - struct page **pagep); + struct page **pagep, + bool wp_copy); #endif /* CONFIG_USERFAULTFD */ bool hugetlb_reserve_pages(struct inode *inode, long from, long to, struct vm_area_struct *vma, @@ -318,7 +319,8 @@ static inline int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, unsigned long dst_addr, unsigned long src_addr, enum mcopy_atomic_mode mode, - struct page **pagep) + struct page **pagep, + bool wp_copy) { BUG(); return 0; diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 8e234ee9a15e2..20ee8fdf6507d 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4884,7 +4884,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, unsigned long dst_addr, unsigned long src_addr, enum mcopy_atomic_mode mode, - struct page **pagep) + struct page **pagep, + bool wp_copy) { bool is_continue = (mode == MCOPY_ATOMIC_CONTINUE); struct address_space *mapping; @@ -4981,17 +4982,28 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, 
hugepage_add_new_anon_rmap(page, dst_vma, dst_addr); } - /* For CONTINUE on a non-shared VMA, don't set VM_WRITE for CoW. */ - if (is_continue && !vm_shared) + /* + * For either: (1) CONTINUE on a non-shared VMA, or (2) UFFDIO_COPY + * with wp flag set, don't set pte write bit. + */ + if (wp_copy || (is_continue && !vm_shared)) writable = 0; else writable = dst_vma->vm_flags & VM_WRITE; _dst_pte = make_huge_pte(dst_vma, page, writable); - if (writable) - _dst_pte = huge_pte_mkdirty(_dst_pte); + /* + * Always mark UFFDIO_COPY page dirty; note that this may not be + * extremely important for hugetlbfs for now since swapping is not + * supported, but we should still be clear in that this page cannot be + * thrown away at will, even if write bit not set. + */ + _dst_pte = huge_pte_mkdirty(_dst_pte); _dst_pte = pte_mkyoung(_dst_pte); + if (wp_copy) + _dst_pte = huge_pte_mkuffd_wp(_dst_pte); + set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte); (void)huge_ptep_set_access_flags(dst_vma, dst_addr, dst_pte, _dst_pte, diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 7adaebe222b8e..4f716838f1fdb 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -207,7 +207,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long src_start, unsigned long len, - enum mcopy_atomic_mode mode) + enum mcopy_atomic_mode mode, + bool wp_copy) { int vm_alloc_shared = dst_vma->vm_flags & VM_SHARED; int vm_shared = dst_vma->vm_flags & VM_SHARED; @@ -304,7 +305,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, } err = hugetlb_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma, - dst_addr, src_addr, mode, &page); + dst_addr, src_addr, mode, &page, + wp_copy); mutex_unlock(&hugetlb_fault_mutex_table[hash]); i_mmap_unlock_read(mapping); @@ -406,7 +408,8 @@ extern ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long src_start, unsigned long len, - enum 
mcopy_atomic_mode mode); + enum mcopy_atomic_mode mode, + bool wp_copy); #endif /* CONFIG_HUGETLB_PAGE */ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm, @@ -526,7 +529,8 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, */ if (is_vm_hugetlb_page(dst_vma)) return __mcopy_atomic_hugetlb(dst_mm, dst_vma, dst_start, - src_start, len, mcopy_mode); + src_start, len, mcopy_mode, + wp_copy); if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma)) goto out_unlock; From patchwork Tue Apr 27 16:13:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226901 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0724C433ED for ; Tue, 27 Apr 2021 16:14:05 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 5401B61151 for ; Tue, 27 Apr 2021 16:14:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5401B61151 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 616F76B0093; Tue, 27 Apr 2021 12:13:56 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5EF3E6B0095; Tue, 27 Apr 2021 12:13:56 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3F2906B0098; Tue, 27 Apr 
2021 12:13:56 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0074.hostedemail.com [216.40.44.74]) by kanga.kvack.org (Postfix) with ESMTP id 07BBE6B0093 for ; Tue, 27 Apr 2021 12:13:56 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id BC8B4824999B for ; Tue, 27 Apr 2021 16:13:55 +0000 (UTC) X-FDA: 78078643230.26.3DFEE6E Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf24.hostedemail.com (Postfix) with ESMTP id 199BCA000188 for ; Tue, 27 Apr 2021 16:13:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540034; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=X3STWoX+h+7H6Wj9KnSBfR38e0Bi1l3dmByZQ1yWWdw=; b=djoEQpyE0aUc6g44yy69QTMMh2oQR4sZ30AVsDnMgry09//yL/KFuHcNOvLr6Qrtq+nQ89 87VCvo5+EONVileXfVz0s2Bu4C7P+uNr3Iw0sijuzbO96cdZ7kTT+Hy5CMvHHZvMFNi7tM M49AHRmqszNl5KN/AnJ2F0eKjTExYBQ= Received: from mail-qv1-f71.google.com (mail-qv1-f71.google.com [209.85.219.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-123-grXZuT9PP2aKniIgqVkGvg-1; Tue, 27 Apr 2021 12:13:53 -0400 X-MC-Unique: grXZuT9PP2aKniIgqVkGvg-1 Received: by mail-qv1-f71.google.com with SMTP id f7-20020a0562141d27b029019a6fd0a183so26279398qvd.23 for ; Tue, 27 Apr 2021 09:13:53 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=X3STWoX+h+7H6Wj9KnSBfR38e0Bi1l3dmByZQ1yWWdw=; b=Rt4B3gDCWp5hFFCXB7miUjmAXBCkEGDgjDfAtBv8Lqc3he8ALdzSt+zEHsGwEgOWJM 
m1l3I3Bzg4QV4fHE7I/CKx53XcjxuBW2h4VuxF6Jau7UUw8ilHcLBZ1U6PEvn2A6VN4q fz9H7MOZOEjzoToV6VjAVz5p3RkyDEytyAPF0R3wVhgzzPILmmHOUpCQ5GpJ6hcPd4Lk zSP/RVFKiMQr25tMXF3s+QBC2tDdkO0kKL0YC4ZRHTrZpNySYtAdxl60VXLb5Mmccv/j o4c81mHBMSkyfqsyRIu4xqrdsWgQfWOr+PGWKolH8hu1/XbruUoWerofx71l9x9rZz3r Zi/g== X-Gm-Message-State: AOAM530VFlt49nOZrSmYRfo/veGK+UfR1hFe1le0EqsPyTpANPZd+yXM 4IXZyu5apW7D2ZDqH6Ntaq6HONB7kjXfY5+FPDSHX1kSpkAuh3kcxAhHUDTcQV0Omz9IbPuZJiu jiLjC2rvIAg8CCj2OdDAquHKGr3atbRH7bQuT8SpLPuEKQAfUlfiwK7ka72g2 X-Received: by 2002:ac8:7415:: with SMTP id p21mr4673989qtq.182.1619540032397; Tue, 27 Apr 2021 09:13:52 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwuCqNWpL5G+yOlcCNJxU5kGyB20g5K62154tFThtBHit8TA/uPr1LYDNPdz/VW8nA8Z0hNFQ== X-Received: by 2002:ac8:7415:: with SMTP id p21mr4673949qtq.182.1619540032082; Tue, 27 Apr 2021 09:13:52 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:50 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . 
Shutemov" Subject: [PATCH v2 18/24] hugetlb/userfaultfd: Handle UFFDIO_WRITEPROTECT Date: Tue, 27 Apr 2021 12:13:11 -0400 Message-Id: <20210427161317.50682-19-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 199BCA000188 X-Stat-Signature: k8tb5541wd776z3juqmcszbit37934w8 Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf24; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=216.205.24.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540024-703643 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This starts from passing cp_flags into hugetlb_change_protection() so hugetlb will be able to handle MM_CP_UFFD_WP[_RESOLVE] requests. huge_pte_clear_uffd_wp() is introduced to handle the case where the UFFDIO_WRITEPROTECT is requested upon migrating huge page entries. 
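Illustration only, not part of the patch: a toy userspace model of the two checks this patch adds, namely the uffd-wp update of a present huge pte and the huge page alignment check in mwriteprotect_range(). All toy_* names and TOY_* flags are hypothetical; the bits that the real code derives from newprot via huge_pte_modify() are deliberately ignored here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit flags standing in for real pte bits (illustration only). */
#define TOY_PTE_WRITE   (1u << 0)
#define TOY_PTE_UFFD_WP (1u << 1)

/*
 * Toy model of the present-pte branch in hugetlb_change_protection():
 * a wr-protect request sets uffd-wp and drops the write bit, a resolve
 * request only clears uffd-wp.
 */
static unsigned int toy_change_prot(unsigned int pte, bool uffd_wp,
				    bool uffd_wp_resolve)
{
	/* The two requests are mutually exclusive, cf. change_protection(). */
	assert(!(uffd_wp && uffd_wp_resolve));

	if (uffd_wp)
		pte = (pte & ~TOY_PTE_WRITE) | TOY_PTE_UFFD_WP;
	else if (uffd_wp_resolve)
		pte &= ~TOY_PTE_UFFD_WP;

	return pte;
}

/* Toy model of the alignment check added to mwriteprotect_range(). */
static bool toy_range_is_huge_aligned(unsigned long start, unsigned long len,
				      unsigned long huge_page_size)
{
	unsigned long page_mask = huge_page_size - 1;

	/* The mask trick only works for power-of-two page sizes. */
	assert(huge_page_size && !(huge_page_size & page_mask));

	return !((start & page_mask) || (len & page_mask));
}
```

With a 2MB huge page size, for example, a range starting at 0x201000 would be rejected in this model, matching the -EINVAL path in the patch.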
Reviewed-by: Mike Kravetz Signed-off-by: Peter Xu --- include/asm-generic/hugetlb.h | 5 +++++ include/linux/hugetlb.h | 6 ++++-- mm/hugetlb.c | 13 ++++++++++++- mm/mprotect.c | 3 ++- mm/userfaultfd.c | 8 ++++++++ 5 files changed, 31 insertions(+), 4 deletions(-) diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h index 548212eccbd61..181cdc3297e7b 100644 --- a/include/asm-generic/hugetlb.h +++ b/include/asm-generic/hugetlb.h @@ -32,6 +32,11 @@ static inline pte_t huge_pte_mkuffd_wp(pte_t pte) return pte_mkuffd_wp(pte); } +static inline pte_t huge_pte_clear_uffd_wp(pte_t pte) +{ + return pte_clear_uffd_wp(pte); +} + static inline pte_t huge_pte_modify(pte_t pte, pgprot_t newprot) { return pte_modify(pte, newprot); diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index e38077918330f..652660fd6ec8a 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -187,7 +187,8 @@ struct page *follow_huge_pgd(struct mm_struct *mm, unsigned long address, int pmd_huge(pmd_t pmd); int pud_huge(pud_t pud); unsigned long hugetlb_change_protection(struct vm_area_struct *vma, - unsigned long address, unsigned long end, pgprot_t newprot); + unsigned long address, unsigned long end, pgprot_t newprot, + unsigned long cp_flags); bool is_hugetlb_entry_migration(pte_t pte); void hugetlb_unshare_all_pmds(struct vm_area_struct *vma); @@ -349,7 +350,8 @@ static inline void move_hugetlb_state(struct page *oldpage, static inline unsigned long hugetlb_change_protection( struct vm_area_struct *vma, unsigned long address, - unsigned long end, pgprot_t newprot) + unsigned long end, pgprot_t newprot, + unsigned long cp_flags) { return 0; } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 20ee8fdf6507d..3cad5d7726614 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5222,7 +5222,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, } unsigned long hugetlb_change_protection(struct vm_area_struct *vma, - unsigned long address, 
unsigned long end, pgprot_t newprot) + unsigned long address, unsigned long end, + pgprot_t newprot, unsigned long cp_flags) { struct mm_struct *mm = vma->vm_mm; unsigned long start = address; @@ -5232,6 +5233,8 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, unsigned long pages = 0; bool shared_pmd = false; struct mmu_notifier_range range; + bool uffd_wp = cp_flags & MM_CP_UFFD_WP; + bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE; /* * In the case of shared PMDs, the area to flush could be beyond @@ -5272,6 +5275,10 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, make_migration_entry_read(&entry); newpte = swp_entry_to_pte(entry); + if (uffd_wp) + newpte = pte_swp_mkuffd_wp(newpte); + else if (uffd_wp_resolve) + newpte = pte_swp_clear_uffd_wp(newpte); set_huge_swap_pte_at(mm, address, ptep, newpte, huge_page_size(h)); pages++; @@ -5285,6 +5292,10 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, old_pte = huge_ptep_modify_prot_start(vma, address, ptep); pte = pte_mkhuge(huge_pte_modify(old_pte, newprot)); pte = arch_make_huge_pte(pte, vma, NULL, 0); + if (uffd_wp) + pte = huge_pte_mkuffd_wp(huge_pte_wrprotect(pte)); + else if (uffd_wp_resolve) + pte = huge_pte_clear_uffd_wp(pte); huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte); pages++; } diff --git a/mm/mprotect.c b/mm/mprotect.c index 51c954afa4069..fe5a5b96a61f9 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -416,7 +416,8 @@ unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, BUG_ON((cp_flags & MM_CP_UFFD_WP_ALL) == MM_CP_UFFD_WP_ALL); if (is_vm_hugetlb_page(vma)) - pages = hugetlb_change_protection(vma, start, end, newprot); + pages = hugetlb_change_protection(vma, start, end, newprot, + cp_flags); else pages = change_protection_range(vma, start, end, newprot, cp_flags); diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 4f716838f1fdb..ceb77ea24497e 100644 --- a/mm/userfaultfd.c +++ 
b/mm/userfaultfd.c @@ -653,6 +653,7 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start, unsigned long len, bool enable_wp, bool *mmap_changing) { struct vm_area_struct *dst_vma; + unsigned long page_mask; pgprot_t newprot; int err; @@ -689,6 +690,13 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start, if (!vma_is_anonymous(dst_vma)) goto out_unlock; + if (is_vm_hugetlb_page(dst_vma)) { + err = -EINVAL; + page_mask = vma_kernel_pagesize(dst_vma) - 1; + if ((start & page_mask) || (len & page_mask)) + goto out_unlock; + } + if (enable_wp) newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE)); else From patchwork Tue Apr 27 16:13:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226903 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DD174C433B4 for ; Tue, 27 Apr 2021 16:14:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 840CA61151 for ; Tue, 27 Apr 2021 16:14:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 840CA61151 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 3F24B6B0096; Tue, 27 Apr 2021 12:13:58 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3CAD86B0098; Tue, 27 Apr 2021 12:13:58 
-0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1CD5E6B0099; Tue, 27 Apr 2021 12:13:58 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id ECF606B0096 for ; Tue, 27 Apr 2021 12:13:57 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id A27825DC9 for ; Tue, 27 Apr 2021 16:13:57 +0000 (UTC) X-FDA: 78078643314.06.F95FBC5 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf25.hostedemail.com (Postfix) with ESMTP id 87BB36000123 for ; Tue, 27 Apr 2021 16:13:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540036; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Of2/jNcSu5bZyfPgDma9r/3WOH0gTfsq0jzNO09tuew=; b=DlOIkCKkOsCLheyOx/CN6SmMHAUEAiSNADAYF31PXS28pWZ3zVwvRSvu6+I6he6DWbnt9x YWTpXumnTf0xULqWbaiSfBq1vprHEuP6bHteJhR0eT4dpS9J55JulHiNixWRCmJibPbmQW tpxLOuO95iL2viBrkjvtghV3J5uJRH8= Received: from mail-qt1-f199.google.com (mail-qt1-f199.google.com [209.85.160.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-596-sMBRYV0XPOS7QqHHr6rvJQ-1; Tue, 27 Apr 2021 12:13:55 -0400 X-MC-Unique: sMBRYV0XPOS7QqHHr6rvJQ-1 Received: by mail-qt1-f199.google.com with SMTP id i7-20020ac84f470000b02901b944d49e13so15807300qtw.7 for ; Tue, 27 Apr 2021 09:13:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; 
bh=Of2/jNcSu5bZyfPgDma9r/3WOH0gTfsq0jzNO09tuew=; b=mtdR5Vg7qPr82nsXq+7yTs+yzJf6QHyeCw0FN84drTOpqHB0ummASESNG1vdQk2xgY iuxv3PewdvDHxvN1XT2KeQVZvhTKDbxCaRvA8SQRFzEQhS3JY/LEstb8cQkmFVfO9ZoU 7nsql6bzLExXeMnf6/dyn1MMb7wF+hmuBqS/33m/q58jhzO9eDnuf4GREQKX1kHwXLHl LtSrZlGXcqijz+LnpygM8YquNXhllbgBPc6sszzGo6JIEK588umLKY2ej3CM9QAI7sJA ebpxVWayRgBh7x39FDU9PIukvG+GYYRat3LyWW4G75rtr5duFLWlSy2QodwrtTk8BNmq y9ew== X-Gm-Message-State: AOAM530o6W1F7oLVlWApq9pSfCeduV+A2l5KxnMjyE6p2CW1LUOPb2Bk fEzwfyhZnAygDDjwnSsjbH0tZeZ6hFRJWkFBJGqeJEcR5THioJMqU197F7QdxQVmnKK8Azo7tdW 8lrYItBYILVeKmDVbUTPV+v3I+6+Kizds2erab/Nah7i/n4fzHEoqefmnXBkV X-Received: by 2002:ac8:7e8a:: with SMTP id w10mr21740812qtj.126.1619540034198; Tue, 27 Apr 2021 09:13:54 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwNseODxdO93YFWKSY+Dyt7Y8fBZc5YflnrNJdlFTqOIRJxaq9cLRBHJ2XKxA4N01e/IDcnqQ== X-Received: by 2002:ac8:7e8a:: with SMTP id w10mr21740763qtj.126.1619540033840; Tue, 27 Apr 2021 09:13:53 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:53 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . 
Shutemov" Subject: [PATCH v2 19/24] mm/hugetlb: Introduce huge version of special swap pte helpers Date: Tue, 27 Apr 2021 12:13:12 -0400 Message-Id: <20210427161317.50682-20-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 87BB36000123 X-Stat-Signature: oj4y5omyhby869acynwbshjqioyedrep Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf25; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=170.10.133.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540032-704319 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This is to let hugetlbfs be prepared to also recognize swap special ptes just like uffd-wp special swap ptes. Reviewed-by: Mike Kravetz Signed-off-by: Peter Xu --- mm/hugetlb.c | 24 ++++++++++++++++++++++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 3cad5d7726614..071a8429ea190 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -93,6 +93,26 @@ static inline bool subpool_is_free(struct hugepage_subpool *spool) return true; } +/* + * These are sister versions of is_swap_pte() and pte_has_swap_entry(). We + * need standalone ones because huge_pte_none() is handled differently from + * pte_none(). For more information, please refer to comments above + * is_swap_pte() and pte_has_swap_entry(). + * + * Here we directly reuse the pte level of swap special ptes, for example, the + * pte_swp_uffd_wp_special(). It just stands for a huge page rather than a + * small page for hugetlbfs pages. 
+ */ +static inline bool is_huge_swap_pte(pte_t pte) +{ + return !huge_pte_none(pte) && !pte_present(pte); +} + +static inline bool huge_pte_has_swap_entry(pte_t pte) +{ + return is_huge_swap_pte(pte) && !is_swap_special_pte(pte); +} + static inline void unlock_or_release_subpool(struct hugepage_subpool *spool, unsigned long irq_flags) { @@ -3885,7 +3905,7 @@ bool is_hugetlb_entry_migration(pte_t pte) { swp_entry_t swp; - if (huge_pte_none(pte) || pte_present(pte)) + if (!huge_pte_has_swap_entry(pte)) return false; swp = pte_to_swp_entry(pte); if (is_migration_entry(swp)) @@ -3898,7 +3918,7 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte) { swp_entry_t swp; - if (huge_pte_none(pte) || pte_present(pte)) + if (!huge_pte_has_swap_entry(pte)) return false; swp = pte_to_swp_entry(pte); if (is_hwpoison_entry(swp)) From patchwork Tue Apr 27 16:13:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226905 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 111AEC433B4 for ; Tue, 27 Apr 2021 16:14:10 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A17BE600CC for ; Tue, 27 Apr 2021 16:14:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A17BE600CC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by 
kanga.kvack.org (Postfix) id 8ACEF6B0099; Tue, 27 Apr 2021 12:14:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 882026B009A; Tue, 27 Apr 2021 12:14:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6126F6B009B; Tue, 27 Apr 2021 12:14:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0110.hostedemail.com [216.40.44.110]) by kanga.kvack.org (Postfix) with ESMTP id 397C36B0099 for ; Tue, 27 Apr 2021 12:14:00 -0400 (EDT) Received: from smtpin34.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id E3C025DF4 for ; Tue, 27 Apr 2021 16:13:59 +0000 (UTC) X-FDA: 78078643398.34.229DFE3 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf06.hostedemail.com (Postfix) with ESMTP id 8E8C1C0007F0 for ; Tue, 27 Apr 2021 16:14:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540039; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CpSgxtKNEYae4dyjeVp8AL4d3+foNARnUP4SjfgN4GM=; b=aYKR5nuqcGyqx+e6o89Zj8JDBjLM0TUUEv4zv1atB053ZtfOPxTm/ySLSmJB7eLgUQxVT6 9zsRd+00sCjOcawhL+oMhAQh/Xd9MAhjttj94HDTVWKB9dvNsB2h0y0umO3oNdTuYadoQm hQq68E/XrGewQbG62GDFdLLqaxi5w20= Received: from mail-qv1-f71.google.com (mail-qv1-f71.google.com [209.85.219.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-538-xOO_FrmmOvSPYVEUhnH0ZA-1; Tue, 27 Apr 2021 12:13:57 -0400 X-MC-Unique: xOO_FrmmOvSPYVEUhnH0ZA-1 Received: by mail-qv1-f71.google.com with SMTP id h88-20020a0c82610000b02901b70a2884e8so6351257qva.20 for ; Tue, 27 Apr 2021 09:13:56 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=CpSgxtKNEYae4dyjeVp8AL4d3+foNARnUP4SjfgN4GM=; b=QB5Dp6TKc9d/jDiOXxzVl48YYUv5GLwxjNOih4zioz/LgWrFwpVhN69iTHYYLvRmjv Pp0YeVIqotroPZhIa71+fSJhksfPGU2Cv/5qvTBBpyaHy1ZUjWIdxZej1NkiE6b+l6LG gXPPLMP6kkIrwOO9w/BHAHDVbQvEsQmW4Wd4PojD9lI6LaNyPQAM/JYpox6ZHRWqEFuz Hpg1/s3iWQGVKCFJZYlfy2HvRx6me/yyRe4pAKSpiO8A5ZGksVP7zWel58WwQJN+wMtZ m6H1LSD3nil8SQalkQsRyT4C1tMdr7UqZpCdKDOMLPoJ7E0Qygbk34VQQ4Hmx964+uXt bKqw== X-Gm-Message-State: AOAM531v7E5NJMnnOjUWhEeKpZ/B5VvmsakGkHV9uP8L/PeWuvidp2pn aaLb728eKovbzUTYWDUvKSk+f+tnGuCTgSRCR37i2bU8JTLP6pzQLxIOgfl1I30KMFCW1bjG8LQ 1A5DKZwU1zWwHyEZvs4lWpximqWd3jZTdGbnvRJD7GGd/qj9L5l7Y3/UgqEN4 X-Received: by 2002:ac8:5a07:: with SMTP id n7mr1426888qta.86.1619540035497; Tue, 27 Apr 2021 09:13:55 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxNFw9TlAPK5bR57wS+0NWmiHML0S2PBkzuVSnA+dUH2gMYuiWAD3BFtJ4U0QVvjjqXwtJ2nA== X-Received: by 2002:ac8:5a07:: with SMTP id n7mr1426842qta.86.1619540035151; Tue, 27 Apr 2021 09:13:55 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:54 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . 
Shutemov" Subject: [PATCH v2 20/24] hugetlb/userfaultfd: Handle uffd-wp special pte in hugetlb pf handler Date: Tue, 27 Apr 2021 12:13:13 -0400 Message-Id: <20210427161317.50682-21-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Stat-Signature: qbzdis63msbjb15ru638y6urdni1wmrp X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 8E8C1C0007F0 Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf06; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=170.10.133.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540042-858003 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Teach the hugetlb page fault code to understand the uffd-wp special pte. For example, when seeing such a pte we need to convert any write fault into a read one (which is fake - we'll retry the write later if so). Meanwhile, for handle_userfault() we need to make sure we wait for the special swap pte too, just like for a none pte. Note that we also need to teach UFFDIO_COPY about this special pte across the code path so that we can safely install a new page at this special pte as long as we know it's a stale entry.
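Illustration only, not part of the patch: a toy userspace model of the rules described above. The enum toy_pte_kind, struct toy_new_pte and the toy_* helpers are hypothetical; they only mirror the checks visible in the hugetlb_fault(), hugetlb_no_page() and hugetlb_mcopy_atomic_pte() hunks of this patch.

```c
#include <stdbool.h>

/* Hypothetical classification of a huge pte slot (illustration only). */
enum toy_pte_kind {
	TOY_KIND_NONE,			/* huge_pte_none() */
	TOY_KIND_UFFD_WP_SPECIAL,	/* pte_swp_uffd_wp_special() */
	TOY_KIND_PRESENT,		/* a mapped huge page */
	TOY_KIND_OTHER_SWAP,		/* migration or hwpoison entry */
};

struct toy_new_pte {
	bool write;
	bool uffd_wp;
};

/*
 * The special pte is treated like a none pte: the fault goes down the
 * hugetlb_no_page() path, handle_userfault() must wait on it, and
 * UFFDIO_COPY may install a page over it instead of failing with -EEXIST.
 */
static bool toy_slot_is_empty(enum toy_pte_kind kind)
{
	return kind == TOY_KIND_NONE || kind == TOY_KIND_UFFD_WP_SPECIAL;
}

/*
 * Toy model of the pte built in hugetlb_no_page(): writable only for a
 * shared writable VMA, and kept wr-protected if the old entry was the
 * uffd-wp special pte.
 */
static struct toy_new_pte toy_populate(enum toy_pte_kind old,
				       bool vm_write, bool vm_shared)
{
	struct toy_new_pte pte = {
		.write = vm_write && vm_shared,
		.uffd_wp = false,
	};

	if (old == TOY_KIND_UFFD_WP_SPECIAL) {
		pte.write = false;
		pte.uffd_wp = true;
	}

	return pte;
}
```

A write fault on a slot populated from the special pte therefore lands on a read-only, uffd-wp pte in this model, which is what makes the retried write trap into userfaultfd.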
Signed-off-by: Peter Xu --- fs/userfaultfd.c | 5 ++++- mm/hugetlb.c | 26 ++++++++++++++++++++------ mm/userfaultfd.c | 5 ++++- 3 files changed, 28 insertions(+), 8 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index b34486a88b5f3..a41e0631af512 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -245,8 +245,11 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx, /* * Lockless access: we're in a wait_event so it's ok if it * changes under us. + * + * Regarding uffd-wp special case, please refer to comments in + * userfaultfd_must_wait(). */ - if (huge_pte_none(pte)) + if (huge_pte_none(pte) || pte_swp_uffd_wp_special(pte)) ret = true; if (!huge_pte_write(pte) && (reason & VM_UFFD_WP)) ret = true; diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 071a8429ea190..d9ff7db14175d 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4529,7 +4529,8 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_area_struct *vma, static vm_fault_t hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma, struct address_space *mapping, pgoff_t idx, - unsigned long address, pte_t *ptep, unsigned int flags) + unsigned long address, pte_t *ptep, + pte_t old_pte, unsigned int flags) { struct hstate *h = hstate_vma(vma); vm_fault_t ret = VM_FAULT_SIGBUS; @@ -4653,7 +4654,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, ptl = huge_pte_lock(h, mm, ptep); ret = 0; - if (!huge_pte_none(huge_ptep_get(ptep))) + if (!pte_same(huge_ptep_get(ptep), old_pte)) goto backout; if (anon_rmap) { @@ -4663,6 +4664,12 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, page_dup_rmap(page, true); new_pte = make_huge_pte(vma, page, ((vma->vm_flags & VM_WRITE) && (vma->vm_flags & VM_SHARED))); + /* + * If this pte was previously wr-protected, keep it wr-protected even + * if populated. 
+ */ + if (unlikely(pte_swp_uffd_wp_special(old_pte))) + new_pte = huge_pte_wrprotect(huge_pte_mkuffd_wp(new_pte)); set_huge_pte_at(mm, haddr, ptep, new_pte); hugetlb_count_add(pages_per_huge_page(h), mm); @@ -4778,8 +4785,13 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, mutex_lock(&hugetlb_fault_mutex_table[hash]); entry = huge_ptep_get(ptep); - if (huge_pte_none(entry)) { - ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep, flags); + /* + * uffd-wp-special should be handled merely the same as pte none + * because it's basically a none pte with a special marker + */ + if (huge_pte_none(entry) || pte_swp_uffd_wp_special(entry)) { + ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep, + entry, flags); goto out_mutex; } @@ -4913,7 +4925,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, unsigned long size; int vm_shared = dst_vma->vm_flags & VM_SHARED; struct hstate *h = hstate_vma(dst_vma); - pte_t _dst_pte; + pte_t _dst_pte, cur_pte; spinlock_t *ptl; int ret; struct page *page; @@ -4991,8 +5003,10 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, if (idx >= size) goto out_release_unlock; + cur_pte = huge_ptep_get(dst_pte); ret = -EEXIST; - if (!huge_pte_none(huge_ptep_get(dst_pte))) + /* Please refer to shmem_mfill_atomic_pte() for uffd-wp special case */ + if (!huge_pte_none(cur_pte) && !pte_swp_uffd_wp_special(cur_pte)) goto out_release_unlock; if (vm_shared) { diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index ceb77ea24497e..2cd6ad5c3d8f8 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -274,6 +274,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, } while (src_addr < src_start + len) { + pte_t pteval; + BUG_ON(dst_addr >= dst_start + len); /* @@ -296,8 +298,9 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, goto out_unlock; } + pteval = huge_ptep_get(dst_pte); if (mode != MCOPY_ATOMIC_CONTINUE && - 
!huge_pte_none(huge_ptep_get(dst_pte))) { + !huge_pte_none(pteval) && !pte_swp_uffd_wp_special(pteval)) { err = -EEXIST; mutex_unlock(&hugetlb_fault_mutex_table[hash]); i_mmap_unlock_read(mapping); From patchwork Tue Apr 27 16:13:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226907 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52DB1C433ED for ; Tue, 27 Apr 2021 16:14:12 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E084D61158 for ; Tue, 27 Apr 2021 16:14:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E084D61158 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id F40276B009F; Tue, 27 Apr 2021 12:14:02 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id EC6EB6B00A0; Tue, 27 Apr 2021 12:14:02 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C7DAA6B00A1; Tue, 27 Apr 2021 12:14:02 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0108.hostedemail.com [216.40.44.108]) by kanga.kvack.org (Postfix) with ESMTP id A92F16B009F for ; Tue, 27 Apr 2021 12:14:02 -0400 (EDT) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com 
[10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 6C27A180AD822 for ; Tue, 27 Apr 2021 16:14:02 +0000 (UTC) X-FDA: 78078643524.12.10B8C58 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf30.hostedemail.com (Postfix) with ESMTP id 9C0F9E00012A for ; Tue, 27 Apr 2021 16:13:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540041; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=l1z1o5mbCVa78Hzla2EzzKhxkMDgZgiPiKG4mm1uLIY=; b=S8dwxmVXkEaUSf5V8eG6+HuRXktYgqn+7lyn8ak7QoGAP2Vpnn95aD0CJ9i5SelMl6WpMM r3sgAA3gM414gKMiKSl6O206vpx8YHaOib3CyKSboF4xZJWmu9EKseFJc06/8cRXdoSPWO 0gAWfIcp4k3oXBebfjp8qyZw8Vp9KZ4= Received: from mail-qt1-f199.google.com (mail-qt1-f199.google.com [209.85.160.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-41-YmWaluwFMyyHOfBdGFb3Gw-1; Tue, 27 Apr 2021 12:13:57 -0400 X-MC-Unique: YmWaluwFMyyHOfBdGFb3Gw-1 Received: by mail-qt1-f199.google.com with SMTP id x13-20020ac84d4d0000b02901a95d7c4bb5so23539247qtv.14 for ; Tue, 27 Apr 2021 09:13:57 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=l1z1o5mbCVa78Hzla2EzzKhxkMDgZgiPiKG4mm1uLIY=; b=G1K8EngY+tJU0hmsj6DXh3w3B7Wlu4lyzNf9p6bnr0TzKcAj26JJ2OxVC7wFcUgws+ iD2E592GAgZ4NvGbBPsIRnUYot33PMvLsBBTgqq6vB3ewOtBXhObBSjjO0CEeDYuMxfF BzNWPeZWfKcW7/EBXMwzhID4mdd9Wa3i/aB3Ay+ptXthUK2Y1/hDqbufxM1eWcsMDqA1 AfDgeikgp3n6wV3QePSPpK/+laaYu1ezib4tYP1qKpxoc41o6FH4LztyD3wYHJgS2Lo1 b1ZFI1V9GBcWxpSrpPMGgT1sWoUtmlzn9XHgF0UKO6TzwG0AmmOUISz6Nn4R+GtuiLKF S5lA== X-Gm-Message-State: AOAM5317ub3AIzW6rkUSk/qGAgtyUQ0/Mn6ABI9yO+ApRYDwPaagoJDi 
p6AsAaP2Xjl6j6aUvoCPWSmC3RLFHN6zbyQz+N47uvn2HPV0t+tfrMFXMF8QyjU61qgV9mEWF7B 6xaw+Qg3JcZkMQwnluuJn3hjo2vh75pcxSEEQg82HosT9myox++g1R6GSAYbu X-Received: by 2002:a05:620a:133c:: with SMTP id p28mr11069472qkj.3.1619540036890; Tue, 27 Apr 2021 09:13:56 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyBwc48GEDC8J2KNhfxo5GVkwB09tGhSlMBTREmsPdAdSwfXDt4d9f/uX9TJoG1RIhmgJavrQ== X-Received: by 2002:a05:620a:133c:: with SMTP id p28mr11069428qkj.3.1619540036590; Tue, 27 Apr 2021 09:13:56 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:55 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . 
Shutemov" Subject: [PATCH v2 21/24] hugetlb/userfaultfd: Allow wr-protect none ptes Date: Tue, 27 Apr 2021 12:13:14 -0400 Message-Id: <20210427161317.50682-22-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Queue-Id: 9C0F9E00012A X-Stat-Signature: jpprhehsso1mq5tgc7iy6gaerwraehr1 X-Rspamd-Server: rspam02 Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf30; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=216.205.24.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540022-642108 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Teach the hugetlbfs code to wr-protect none ptes, in case a page cache page exists for that pte. Meanwhile, we also need to be able to recognize a uffd-wp marker pte and remove it for uffd_wp_resolve. While at it, introduce a variable "psize" to replace all calls to the huge page size fetcher, huge_page_size(h).
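The per-pte transitions described above can be sketched in userspace as follows. This models the intent stated in the commit message, not the exact control flow of hugetlb_change_protection(). struct wp_pte and wp_change_one() are hypothetical names, and the wp and resolve arguments stand in for the uffd_wp and uffd_wp_resolve locals in that function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-state pte model; not the kernel's pte_t. */
enum wp_pte_kind { WP_PTE_NONE, WP_PTE_MARKER, WP_PTE_PRESENT };

struct wp_pte {
	enum wp_pte_kind kind;
	bool writable;
	bool uffd_wp;
};

/* Apply one protection change to a single modelled pte. */
struct wp_pte wp_change_one(struct wp_pte pte, bool wp, bool resolve)
{
	/* A single request either protects or resolves, never both. */
	assert(!(wp && resolve));

	switch (pte.kind) {
	case WP_PTE_NONE:
		/* none -> marker: remember the protection without a page */
		if (wp)
			pte.kind = WP_PTE_MARKER;
		break;
	case WP_PTE_MARKER:
		/* marker -> none: resolving drops the marker again */
		if (resolve)
			pte.kind = WP_PTE_NONE;
		break;
	case WP_PTE_PRESENT:
		if (wp) {
			pte.uffd_wp = true;
			pte.writable = false;
		} else if (resolve) {
			pte.uffd_wp = false;
		}
		break;
	}
	return pte;
}
```

Resolving a plain none pte is a no-op in this model, which matches the idea that only a previously installed marker needs to be removed.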
Reviewed-by: Mike Kravetz Signed-off-by: Peter Xu --- mm/hugetlb.c | 29 +++++++++++++++++++++++++---- 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d9ff7db14175d..fa9af9c893512 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5264,7 +5264,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, pte_t *ptep; pte_t pte; struct hstate *h = hstate_vma(vma); - unsigned long pages = 0; + unsigned long pages = 0, psize = huge_page_size(h); bool shared_pmd = false; struct mmu_notifier_range range; bool uffd_wp = cp_flags & MM_CP_UFFD_WP; @@ -5284,13 +5284,19 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, mmu_notifier_invalidate_range_start(&range); i_mmap_lock_write(vma->vm_file->f_mapping); - for (; address < end; address += huge_page_size(h)) { + for (; address < end; address += psize) { spinlock_t *ptl; - ptep = huge_pte_offset(mm, address, huge_page_size(h)); + ptep = huge_pte_offset(mm, address, psize); if (!ptep) continue; ptl = huge_pte_lock(h, mm, ptep); if (huge_pmd_unshare(mm, vma, &address, ptep)) { + /* + * When uffd-wp is enabled on the vma, unshare + * shouldn't happen at all. Warn about it if it + * happened due to some reason. + */ + WARN_ON_ONCE(uffd_wp || uffd_wp_resolve); pages++; spin_unlock(ptl); shared_pmd = true; @@ -5314,12 +5320,21 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, else if (uffd_wp_resolve) newpte = pte_swp_clear_uffd_wp(newpte); set_huge_swap_pte_at(mm, address, ptep, - newpte, huge_page_size(h)); + newpte, psize); pages++; } spin_unlock(ptl); continue; } + if (unlikely(is_swap_special_pte(pte))) { + WARN_ON_ONCE(!pte_swp_uffd_wp_special(pte)); + /* + * This is changing a non-present pte into a none pte, + * no need for huge_ptep_modify_prot_start/commit(). 
+ */ + if (uffd_wp_resolve) + huge_pte_clear(mm, address, ptep, psize); + } if (!huge_pte_none(pte)) { pte_t old_pte; @@ -5332,6 +5347,12 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, pte = huge_pte_clear_uffd_wp(pte); huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte); pages++; + } else { + /* None pte */ + if (unlikely(uffd_wp)) + /* Safe to modify directly (none->non-present). */ + set_huge_pte_at(mm, address, ptep, + pte_swp_mkuffd_wp_special(vma)); } spin_unlock(ptl); } From patchwork Tue Apr 27 16:13:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226909 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 87FD3C433B4 for ; Tue, 27 Apr 2021 16:14:14 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 12E9861151 for ; Tue, 27 Apr 2021 16:14:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 12E9861151 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 557B96B00A0; Tue, 27 Apr 2021 12:14:03 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 505886B00A1; Tue, 27 Apr 2021 12:14:03 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 32E2F6B00A3; Tue, 27 
Apr 2021 12:14:03 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0156.hostedemail.com [216.40.44.156]) by kanga.kvack.org (Postfix) with ESMTP id 08AE46B00A1 for ; Tue, 27 Apr 2021 12:14:03 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id B9B988249980 for ; Tue, 27 Apr 2021 16:14:02 +0000 (UTC) X-FDA: 78078643524.18.9C1A97D Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf06.hostedemail.com (Postfix) with ESMTP id 7942FC0007FE for ; Tue, 27 Apr 2021 16:14:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540041; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wphNT2xD9jf++zN0kBcInFhZz8ZsCYCIVGYPc7akVQI=; b=bpfHafxQggwtmjRJcC5804O5ktTQFCqfrzQO47grm8xd0MGwPXA9lhVRXETbgRoSbltYpH bty+t9LbezzcMgxZvkgMFrg7ana8nrzFhvDaIyvKdFyvZrYoaS2+XAeti1i2khN/rGz1Y4 mg/oIboMVqc4ZUVD7s3MAHGDvWIKMUo= Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-174-wMQsmuC9OB-13TXsjvofgg-1; Tue, 27 Apr 2021 12:14:00 -0400 X-MC-Unique: wMQsmuC9OB-13TXsjvofgg-1 Received: by mail-qt1-f198.google.com with SMTP id h12-20020ac8744c0000b02901ba644d864fso13454081qtr.8 for ; Tue, 27 Apr 2021 09:14:00 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=wphNT2xD9jf++zN0kBcInFhZz8ZsCYCIVGYPc7akVQI=; b=JgLSxsnbR9C5xgdC4uBipfvmKvZNXSxoXBN2FJE1XRMte0bAeEt1hDbIpPLxQymfiK 
EWkTlkxaeMOVOM+qFzS+q7TXoARiA9ZxKtroSDQoRM9oRCpbldbDWnVtXg2tMinKgmlY ejLSWns0cXXrckEzl+ylvOw8EnLY9cDWSisrrZ/8m3FjuK6oKk34ArmE4FFbKvr9PIpv u7L18Fadh8GqB8w60jzxGVAmkjGJUla2D2UW/4KJFGMme2qkYPOLe2lra/Bsi5r8bmQe 5ckXzxJihCE9oNwXdAVDBQk1GYPIL65AGKbtoB/S+2SuS3DT/8JjvFwNI9864DQyIYyP rYhQ== X-Gm-Message-State: AOAM531hg9TCdpdiMo1G4gcEcAW1AbDUCL/+MDWJS4x4gG+iO+i71tWs I9FOCjGf63epmgWYH8im14259jMRjA8ZmZ6hkAjmAabhyq9JqzCmWp5F7QNg5G+B01jivBeFYQV Mq7C1VQAVkIPh1/BjXy/tqzT4KoQinzaNvJ6G9vmS2/c8a9C/h13zXv8gwmuF X-Received: by 2002:ac8:110f:: with SMTP id c15mr22619080qtj.251.1619540038992; Tue, 27 Apr 2021 09:13:58 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxT4sMZbO5PuB5GIUmhLz1zl/3d5a+ZQmR9s8aJgEzvVt5VQlwUiwfZwbCp9M+hCjkWydKm3w== X-Received: by 2002:ac8:110f:: with SMTP id c15mr22619025qtj.251.1619540038585; Tue, 27 Apr 2021 09:13:58 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:57 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . 
Shutemov" Subject: [PATCH v2 22/24] hugetlb/userfaultfd: Only drop uffd-wp special pte if required Date: Tue, 27 Apr 2021 12:13:15 -0400 Message-Id: <20210427161317.50682-23-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 7942FC0007FE X-Stat-Signature: 5cno7rpm4e4fqkczfciqed968z7zx9px Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf06; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=170.10.133.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540045-323777 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: As with shmem uffd-wp special ptes, only drop the uffd-wp special swap pte if unmapping an entire vma or synchronized such that faults can not race with the unmap operation. This requires passing zap_flags all the way to the lowest level hugetlb unmap routine: __unmap_hugepage_range. In general, unmap calls originated in hugetlbfs code will pass the ZAP_FLAG_DROP_FILE_UFFD_WP flag as synchronization is in place to prevent faults. The exception is hole punch which will first unmap without any synchronization. Later when hole punch actually removes the page from the file, it will check to see if there was a subsequent fault and if so take the hugetlb fault mutex while unmapping again. This second unmap will pass in ZAP_FLAG_DROP_FILE_UFFD_WP. 
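As a rough illustration of the rule above, the choice of zap flags per call site can be modelled in userspace. The sketch rests on assumptions: the MODEL_* names are hypothetical, the bit value chosen for the flag is arbitrary (the real definition is not part of this patch), and only the call sites named in this series are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit value; the real definition is not shown in this patch. */
#define MODEL_ZAP_FLAG_DROP_FILE_UFFD_WP 0x1UL

/* Call sites that unmap a hugetlbfs range, as described in the message. */
enum model_unmap_site {
	MODEL_SITE_TRUNCATE,                /* synchronized against faults */
	MODEL_SITE_HOLE_PUNCH_FIRST_UNMAP,  /* no synchronization yet */
	MODEL_SITE_HOLE_PUNCH_PAGE_REMOVAL  /* hugetlb fault mutex held */
};

/* Which zap flags a given call site passes down. */
unsigned long model_zap_flags_for(enum model_unmap_site site)
{
	switch (site) {
	case MODEL_SITE_HOLE_PUNCH_FIRST_UNMAP:
		/* A fault may still race here, so the marker must stay. */
		return 0;
	case MODEL_SITE_TRUNCATE:
	case MODEL_SITE_HOLE_PUNCH_PAGE_REMOVAL:
		return MODEL_ZAP_FLAG_DROP_FILE_UFFD_WP;
	}
	assert(!"unknown unmap site");
	return 0;
}

/* The lowest-level unmap keeps the marker unless told to drop it. */
bool model_marker_survives(unsigned long zap_flags)
{
	return !(zap_flags & MODEL_ZAP_FLAG_DROP_FILE_UFFD_WP);
}
```

The point of the split is that the marker is dropped only where a concurrent fault cannot re-map the page cache page as writable.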
The core justification for whether to apply the ZAP_FLAG_DROP_FILE_UFFD_WP flag when unmapping a hugetlb range is (IMHO): we should never reach a state where a page fault could erroneously fault in a wr-protected page-cache page as writable, even for an extremely short period. That could happen if e.g. we pass ZAP_FLAG_DROP_FILE_UFFD_WP in hugetlbfs_punch_hole() when calling hugetlb_vmdelete_list(), because if a page fault triggers after that call and before the remove_inode_hugepages() right after it, the page cache can be mapped writable again in the small window, which can cause data corruption. Reviewed-by: Mike Kravetz Signed-off-by: Peter Xu --- fs/hugetlbfs/inode.c | 15 +++++++++------ include/linux/hugetlb.h | 8 +++++--- mm/hugetlb.c | 27 +++++++++++++++++++++------ mm/memory.c | 5 ++++- 4 files changed, 39 insertions(+), 16 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index a2a42335e8fd2..9b383c39756a5 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -399,7 +399,8 @@ static void remove_huge_page(struct page *page) } static void -hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end) +hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end, + unsigned long zap_flags) { struct vm_area_struct *vma; @@ -432,7 +433,7 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end) } unmap_hugepage_range(vma, vma->vm_start + v_offset, v_end, - NULL); + NULL, zap_flags); } } @@ -510,7 +511,8 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart, mutex_lock(&hugetlb_fault_mutex_table[hash]); hugetlb_vmdelete_list(&mapping->i_mmap, index * pages_per_huge_page(h), - (index + 1) * pages_per_huge_page(h)); + (index + 1) * pages_per_huge_page(h), + ZAP_FLAG_DROP_FILE_UFFD_WP); i_mmap_unlock_write(mapping); } @@ -576,7 +578,8 @@ static void hugetlb_vmtruncate(struct inode *inode, loff_t offset) i_mmap_lock_write(mapping); i_size_write(inode, 
offset); if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root)) - hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0); + hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0, + ZAP_FLAG_DROP_FILE_UFFD_WP); i_mmap_unlock_write(mapping); remove_inode_hugepages(inode, offset, LLONG_MAX); } @@ -609,8 +612,8 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len) i_mmap_lock_write(mapping); if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root)) hugetlb_vmdelete_list(&mapping->i_mmap, - hole_start >> PAGE_SHIFT, - hole_end >> PAGE_SHIFT); + hole_start >> PAGE_SHIFT, + hole_end >> PAGE_SHIFT, 0); i_mmap_unlock_write(mapping); remove_inode_hugepages(inode, hole_start, hole_end); inode_unlock(inode); diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 652660fd6ec8a..5fa84bbefa628 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -121,11 +121,12 @@ long follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *, unsigned long *, unsigned long *, long, unsigned int, int *); void unmap_hugepage_range(struct vm_area_struct *, - unsigned long, unsigned long, struct page *); + unsigned long, unsigned long, struct page *, + unsigned long); void __unmap_hugepage_range_final(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start, unsigned long end, - struct page *ref_page); + struct page *ref_page, unsigned long zap_flags); void hugetlb_report_meminfo(struct seq_file *); int hugetlb_report_node_meminfo(char *buf, int len, int nid); void hugetlb_show_meminfo(void); @@ -358,7 +359,8 @@ static inline unsigned long hugetlb_change_protection( static inline void __unmap_hugepage_range_final(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start, - unsigned long end, struct page *ref_page) + unsigned long end, struct page *ref_page, + unsigned long zap_flags) { BUG(); } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index fa9af9c893512..f73a236b5a835 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4096,7 +4096,7 @@ 
int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start, unsigned long end, - struct page *ref_page) + struct page *ref_page, unsigned long zap_flags) { struct mm_struct *mm = vma->vm_mm; unsigned long address; @@ -4148,6 +4148,19 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, continue; } + if (unlikely(is_swap_special_pte(pte))) { + WARN_ON_ONCE(!pte_swp_uffd_wp_special(pte)); + /* + * Only drop the special swap uffd-wp pte if + * e.g. unmapping a vma or punching a hole (with proper + * lock held so that concurrent page fault won't happen). + */ + if (zap_flags & ZAP_FLAG_DROP_FILE_UFFD_WP) + huge_pte_clear(mm, address, ptep, sz); + spin_unlock(ptl); + continue; + } + /* * Migrating hugepage or HWPoisoned hugepage is already * unmapped and its refcount is dropped, so just clear pte here. @@ -4199,9 +4212,10 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, void __unmap_hugepage_range_final(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start, - unsigned long end, struct page *ref_page) + unsigned long end, struct page *ref_page, + unsigned long zap_flags) { - __unmap_hugepage_range(tlb, vma, start, end, ref_page); + __unmap_hugepage_range(tlb, vma, start, end, ref_page, zap_flags); /* * Clear this flag so that x86's huge_pmd_share page_table_shareable @@ -4217,12 +4231,13 @@ void __unmap_hugepage_range_final(struct mmu_gather *tlb, } void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start, - unsigned long end, struct page *ref_page) + unsigned long end, struct page *ref_page, + unsigned long zap_flags) { struct mmu_gather tlb; tlb_gather_mmu(&tlb, vma->vm_mm); - __unmap_hugepage_range(&tlb, vma, start, end, ref_page); + __unmap_hugepage_range(&tlb, vma, start, end, ref_page, zap_flags); tlb_finish_mmu(&tlb); } @@ -4277,7 +4292,7 @@ static void 
unmap_ref_private(struct mm_struct *mm, struct vm_area_struct *vma, */ if (!is_vma_resv_set(iter_vma, HPAGE_RESV_OWNER)) unmap_hugepage_range(iter_vma, address, - address + huge_page_size(h), page); + address + huge_page_size(h), page, 0); } i_mmap_unlock_write(mapping); } diff --git a/mm/memory.c b/mm/memory.c index f1cdc613b5887..99741c9254c5b 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1515,8 +1515,11 @@ static void unmap_single_vma(struct mmu_gather *tlb, * safe to do nothing in this case. */ if (vma->vm_file) { + unsigned long zap_flags = details ? + details->zap_flags : 0; i_mmap_lock_write(vma->vm_file->f_mapping); - __unmap_hugepage_range_final(tlb, vma, start, end, NULL); + __unmap_hugepage_range_final(tlb, vma, start, end, + NULL, zap_flags); i_mmap_unlock_write(vma->vm_file->f_mapping); } } else From patchwork Tue Apr 27 16:13:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226911 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 600E6C43460 for ; Tue, 27 Apr 2021 16:14:16 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0C334600CC for ; Tue, 27 Apr 2021 16:14:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0C334600CC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by 
kanga.kvack.org (Postfix) id 220256B00A1; Tue, 27 Apr 2021 12:14:07 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 10BB96B00A4; Tue, 27 Apr 2021 12:14:07 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E02926B00A6; Tue, 27 Apr 2021 12:14:06 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0166.hostedemail.com [216.40.44.166]) by kanga.kvack.org (Postfix) with ESMTP id B57E66B00A4 for ; Tue, 27 Apr 2021 12:14:06 -0400 (EDT) Received: from smtpin35.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 6ECCA5DE9 for ; Tue, 27 Apr 2021 16:14:06 +0000 (UTC) X-FDA: 78078643692.35.CEAD747 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf21.hostedemail.com (Postfix) with ESMTP id 9FDCEE000128 for ; Tue, 27 Apr 2021 16:14:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540045; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pWfJd/H/T26cGeSLT9ABJVI8Det1WDmJfdLZOnVj/9Y=; b=bwsm8rlCDRmRkWoOk8M/OgoxPrdIwH31gPDE3nml4kTyCPKANHlpYOzlMdB5fF/3PYH9+Q qwIl18lNESYwS8VaIAAnIfAZatLXDRORx6N6NgSWcUJZlZCJBxQmi+X5yi27FJYeJ2SPoM W/mBhmDMQYwKeEtSnuDtBm2UKT2m6jg= Received: from mail-qv1-f71.google.com (mail-qv1-f71.google.com [209.85.219.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-330-v4Z_7boPNDOBhTMmIWbtdg-1; Tue, 27 Apr 2021 12:14:01 -0400 X-MC-Unique: v4Z_7boPNDOBhTMmIWbtdg-1 Received: by mail-qv1-f71.google.com with SMTP id l19-20020a0ce5130000b02901b6795e3304so6666720qvm.2 for ; Tue, 27 Apr 2021 09:14:01 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=pWfJd/H/T26cGeSLT9ABJVI8Det1WDmJfdLZOnVj/9Y=; b=dSszAYwhqenfuIy6c+cbshauFmoX3OSMiOZtcf8HSjNmeZYCyvaV30+LuPAXLUXkDK JgHLeLyLCSFs0O1XfPUpLpZAPmu2DnJ/TfJzGmpGXXguaAsxSmgVR9oEZevs2BEMj95V DtiSBF8vJDyYLYLn9X1GrNnYihRUs4NunyTkqHcnNhCj7oIsk1IJP3NDv/8DFksnc0oy TmOlD6f62pabz7juvduhnMSkq1ZDfPfRcHrCdPz1/3C8nDcbQypb70U3ueu+2CN5Dgf/ JrTw/KPsiNFsfWEkD2Sns57SP4Fv2pBtqpfEB4wuE9ywaH1S9kX/+1JvAV6JyslK+OzW NBCg== X-Gm-Message-State: AOAM533tGY6qPymzjQ7LLng/1ctr55/s9Y34CcGvu8IAhvIrW8Jx6je7 c4r/1dCOm4gEwi3HNF01/4Csx5EmgbY3aVAhvUIcc4AXL8k/CTmwcytDM8QyN/HsPK6hz5kQgsA RoIeydLl/D9AZtntEuwV8dO6d/jYgQenyhf8RqoP/YpxiY6a0tMOzJpaTUEiJ X-Received: by 2002:a05:620a:40c6:: with SMTP id g6mr7712197qko.226.1619540040616; Tue, 27 Apr 2021 09:14:00 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxaSB+v4nhW1pF8jFSUzXD1x5f/DDzv4u7/+Bwb8/4kHfWIDByTz/eIHQ9VVD180ealMkxeXQ== X-Received: by 2002:a05:620a:40c6:: with SMTP id g6mr7712153qko.226.1619540040220; Tue, 27 Apr 2021 09:14:00 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:59 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . 
Shutemov" Subject: [PATCH v2 23/24] mm/userfaultfd: Enable write protection for shmem & hugetlbfs Date: Tue, 27 Apr 2021 12:13:16 -0400 Message-Id: <20210427161317.50682-24-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 9FDCEE000128 X-Stat-Signature: isdmtb4wju4obdro4wuohpt6tpzp5dec Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf21; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=170.10.133.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540042-908034 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: All the necessary changes are now in place for both shmem and hugetlbfs. Turn on all the shmem/hugetlbfs switches for userfaultfd-wp. Now we can remove the flags parameter for vma_can_userfault() since it is not used any more. Meanwhile, we can expand UFFD_API_RANGE_IOCTLS_BASIC with _UFFDIO_WRITEPROTECT too, because all existing types now support write protection mode. Since vma_can_userfault() will be used elsewhere, move it into userfaultfd_k.h.
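The effect of turning on the switches can be sketched as a userspace model of the registration check once the write-protect restriction is gone. The names and bit values below are hypothetical stand-ins (the kernel's VM_UFFD_* values are not shown in this patch), and the vma type enum replaces the kernel's vma_is_anonymous(), is_vm_hugetlb_page() and vma_is_shmem() tests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, for illustration only. */
#define MODEL_VM_UFFD_MISSING 0x1UL
#define MODEL_VM_UFFD_WP      0x2UL
#define MODEL_VM_UFFD_MINOR   0x4UL

/* Hypothetical stand-in for the kernel's per-vma type tests. */
enum model_vma_type {
	MODEL_VMA_ANON,
	MODEL_VMA_HUGETLB,
	MODEL_VMA_SHMEM,
	MODEL_VMA_OTHER
};

/* Registration check with no write-protect special case left. */
bool model_vma_can_userfault(enum model_vma_type type, unsigned long vm_flags)
{
	/* Minor fault interception remains hugetlb-only in this series. */
	if ((vm_flags & MODEL_VM_UFFD_MINOR) && type != MODEL_VMA_HUGETLB)
		return false;

	return type == MODEL_VMA_ANON || type == MODEL_VMA_HUGETLB ||
	       type == MODEL_VMA_SHMEM;
}
```

With this shape, a write-protect registration succeeds for anonymous, hugetlb and shmem vmas alike, while minor mode is still rejected for anything but hugetlb.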
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c                 | 19 -------------------
 include/linux/userfaultfd_k.h    | 15 +++++++++++++++
 include/uapi/linux/userfaultfd.h |  7 +++++--
 mm/userfaultfd.c                 | 10 +++-------
 4 files changed, 23 insertions(+), 28 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index a41e0631af512..a436a1feb10db 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1275,25 +1275,6 @@ static __always_inline int validate_range(struct mm_struct *mm,
 	return 0;
 }
 
-static inline bool vma_can_userfault(struct vm_area_struct *vma,
-				     unsigned long vm_flags)
-{
-	/* FIXME: add WP support to hugetlbfs and shmem */
-	if (vm_flags & VM_UFFD_WP) {
-		if (is_vm_hugetlb_page(vma) || vma_is_shmem(vma))
-			return false;
-	}
-
-	if (vm_flags & VM_UFFD_MINOR) {
-		/* FIXME: Add minor fault interception for shmem. */
-		if (!is_vm_hugetlb_page(vma))
-			return false;
-	}
-
-	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
-	       vma_is_shmem(vma);
-}
-
 static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 				unsigned long arg)
 {
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index fefebe6e96560..95afd4814ab29 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 
 /* The set of all possible UFFD-related VM flags. */
 #define __VM_UFFD_FLAGS (VM_UFFD_MISSING | VM_UFFD_WP | VM_UFFD_MINOR)
@@ -132,6 +133,20 @@ static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 	return vma->vm_flags & __VM_UFFD_FLAGS;
 }
 
+static inline bool vma_can_userfault(struct vm_area_struct *vma,
+				     unsigned long vm_flags)
+{
+	if (vm_flags & VM_UFFD_MINOR) {
+		/* FIXME: Add minor fault interception for shmem. */
+		if (!is_vm_hugetlb_page(vma))
+			return false;
+	}
+
+	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
+	       vma_is_shmem(vma);
+}
+
+
 extern int dup_userfaultfd(struct vm_area_struct *, struct list_head *);
 extern void dup_userfaultfd_complete(struct list_head *);
 
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index bafbeb1a26245..298fbd4e2d1d3 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -31,7 +31,8 @@
 			   UFFD_FEATURE_MISSING_SHMEM |		\
 			   UFFD_FEATURE_SIGBUS |		\
 			   UFFD_FEATURE_THREAD_ID |		\
-			   UFFD_FEATURE_MINOR_HUGETLBFS)
+			   UFFD_FEATURE_MINOR_HUGETLBFS |	\
+			   UFFD_FEATURE_WP_HUGETLBFS_SHMEM)
 #define UFFD_API_IOCTLS				\
 	((__u64)1 << _UFFDIO_REGISTER |		\
 	 (__u64)1 << _UFFDIO_UNREGISTER |	\
@@ -45,7 +46,8 @@
 #define UFFD_API_RANGE_IOCTLS_BASIC		\
 	((__u64)1 << _UFFDIO_WAKE |		\
 	 (__u64)1 << _UFFDIO_COPY |		\
-	 (__u64)1 << _UFFDIO_CONTINUE)
+	 (__u64)1 << _UFFDIO_CONTINUE |		\
+	 (__u64)1 << _UFFDIO_WRITEPROTECT)
 
 /*
  * Valid ioctl command number range with this API is from 0x00 to
@@ -196,6 +198,7 @@ struct uffdio_api {
 #define UFFD_FEATURE_SIGBUS			(1<<7)
 #define UFFD_FEATURE_THREAD_ID			(1<<8)
 #define UFFD_FEATURE_MINOR_HUGETLBFS		(1<<9)
+#define UFFD_FEATURE_WP_HUGETLBFS_SHMEM		(1<<10)
 	__u64 features;
 
 	__u64 ioctls;
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 2cd6ad5c3d8f8..3930e56aaefd8 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -445,7 +445,6 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
 			err = mfill_zeropage_pte(dst_mm, dst_pmd,
 						 dst_vma, dst_addr);
 	} else {
-		VM_WARN_ON_ONCE(wp_copy);
 		if (!zeropage)
 			err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd,
 						     dst_vma, dst_addr,
@@ -682,15 +681,12 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
 
 	err = -ENOENT;
 	dst_vma = find_dst_vma(dst_mm, start, len);
-	/*
-	 * Make sure the vma is not shared, that the dst range is
-	 * both valid and fully within a single existing vma.
-	 */
-	if (!dst_vma || (dst_vma->vm_flags & VM_SHARED))
+
+	if (!dst_vma)
 		goto out_unlock;
 	if (!userfaultfd_wp(dst_vma))
 		goto out_unlock;
-	if (!vma_is_anonymous(dst_vma))
+	if (!vma_can_userfault(dst_vma, dst_vma->vm_flags))
 		goto out_unlock;
 
 	if (is_vm_hugetlb_page(dst_vma)) {

From patchwork Tue Apr 27 16:13:17 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12226913
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Miaohe Lin, Mike Rapoport, Andrea Arcangeli, Hugh Dickins, peterx@redhat.com, Jerome Glisse, Mike Kravetz, Jason Gunthorpe, Matthew Wilcox, Andrew Morton, Axel Rasmussen, "Kirill A . Shutemov"
Subject: [PATCH v2 24/24] userfaultfd/selftests: Enable uffd-wp for shmem/hugetlbfs
Date: Tue, 27 Apr 2021 12:13:17 -0400
Message-Id: <20210427161317.50682-25-peterx@redhat.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20210427161317.50682-1-peterx@redhat.com>
References: <20210427161317.50682-1-peterx@redhat.com>
MIME-Version: 1.0

Now that shmem and hugetlbfs support uffd-wp, the uffd-wp test can always
be turned on.

Define HUGETLB_EXPECTED_IOCTLS to avoid using UFFD_API_RANGE_IOCTLS_BASIC,
because UFFD_API_RANGE_IOCTLS_BASIC is normally a superset of the
capabilities, while a given test may not satisfy them all. E.g., when
hugetlb is registered without minor mode, we need to explicitly remove
_UFFDIO_CONTINUE. The same applies to uffd-wp: we need to explicitly
remove _UFFDIO_WRITEPROTECT if the range is not registered with uffd-wp.

In the long term, we may consider dropping the UFFD_API_* macros from the
uapi/linux/userfaultfd.h header entirely, because they make it easy for a
kernel header update to break userspace.

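The reasoning above about per-registration ioctl sets can be sketched as
follows. This is not the selftest's code, only one way a test could derive
the mask it expects: start from the ioctls that are always available and add
the optional ones per registration mode. The MODEL_* bit numbers are copied
here so the snippet stands alone; they are assumed to match the
_UFFDIO_* values in uapi/linux/userfaultfd.h, and the helper name is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the _UFFDIO_* ioctl numbers from the uapi header. */
#define MODEL_UFFDIO_WAKE		0x02
#define MODEL_UFFDIO_COPY		0x03
#define MODEL_UFFDIO_WRITEPROTECT	0x06
#define MODEL_UFFDIO_CONTINUE		0x07

/*
 * Hypothetical helper: build the ioctl mask a hugetlb range is expected
 * to report, given the modes it was registered with.
 */
static uint64_t model_hugetlb_expected_ioctls(bool wp_mode, bool minor_mode)
{
	/* Always available, like HUGETLB_EXPECTED_IOCTLS in the patch. */
	uint64_t ioctls = ((uint64_t)1 << MODEL_UFFDIO_WAKE) |
			  ((uint64_t)1 << MODEL_UFFDIO_COPY);

	/* Only expected when the range was registered with uffd-wp. */
	if (wp_mode)
		ioctls |= (uint64_t)1 << MODEL_UFFDIO_WRITEPROTECT;

	/* Only expected when the range was registered with minor mode. */
	if (minor_mode)
		ioctls |= (uint64_t)1 << MODEL_UFFDIO_CONTINUE;

	return ioctls;
}
```

Building the mask up from a minimal base, rather than masking bits out of
UFFD_API_RANGE_IOCTLS_BASIC, keeps the expectation stable if the kernel
header later adds more bits to that macro.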
Signed-off-by: Peter Xu
---
 tools/testing/selftests/vm/userfaultfd.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 6339aeaeeff8b..cfa6c0e960e6a 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -80,7 +80,7 @@ static int test_type;
 static volatile bool test_uffdio_copy_eexist = true;
 static volatile bool test_uffdio_zeropage_eexist = true;
 /* Whether to test uffd write-protection */
-static bool test_uffdio_wp = false;
+static bool test_uffdio_wp = true;
 /* Whether to test uffd minor faults */
 static bool test_uffdio_minor = false;
 
@@ -299,6 +299,9 @@ struct uffd_test_ops {
 				 (1 << _UFFDIO_ZEROPAGE) | \
 				 (1 << _UFFDIO_WRITEPROTECT))
 
+#define HUGETLB_EXPECTED_IOCTLS ((1 << _UFFDIO_WAKE) | \
+				 (1 << _UFFDIO_COPY))
+
 static struct uffd_test_ops anon_uffd_test_ops = {
 	.expected_ioctls = ANON_EXPECTED_IOCTLS,
 	.allocate_area	= anon_allocate_area,
@@ -314,7 +317,7 @@ static struct uffd_test_ops shmem_uffd_test_ops = {
 };
 
 static struct uffd_test_ops hugetlb_uffd_test_ops = {
-	.expected_ioctls = UFFD_API_RANGE_IOCTLS_BASIC & ~(1 << _UFFDIO_CONTINUE),
+	.expected_ioctls = HUGETLB_EXPECTED_IOCTLS,
 	.allocate_area	= hugetlb_allocate_area,
 	.release_pages	= hugetlb_release_pages,
 	.alias_mapping = hugetlb_alias_mapping,
@@ -1374,8 +1377,6 @@ static void set_test_type(const char *type)
 	if (!strcmp(type, "anon")) {
 		test_type = TEST_ANON;
 		uffd_test_ops = &anon_uffd_test_ops;
-		/* Only enable write-protect test for anonymous test */
-		test_uffdio_wp = true;
 	} else if (!strcmp(type, "hugetlb")) {
 		test_type = TEST_HUGETLB;
 		uffd_test_ops = &hugetlb_uffd_test_ops;