From patchwork Fri Jan 15 17:08:38 2021
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Mike Kravetz, peterx@redhat.com, Jerome Glisse,
    "Kirill A. Shutemov", Hugh Dickins, Axel Rasmussen, Matthew Wilcox,
    Andrew Morton, Andrea Arcangeli, Nadav Amit
Subject: [PATCH RFC 01/30] mm/thp: Simplify copying of huge zero page pmd when fork
Date: Fri, 15 Jan 2021 12:08:38 -0500
Message-Id: <20210115170907.24498-2-peterx@redhat.com>
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

The huge zero page is handled via a special path in copy_huge_pmd(), but it
should share most of the code with a normal THP page.  Share more code by
removing the special path.  The only leftover so far is the huge zero page
refcounting (mm_get_huge_zero_page()), because that is done separately with a
global counter.

This prepares for a future patch which will modify the huge pmd to be
installed, so that we don't need to duplicate that change explicitly for the
huge zero page case too.

Cc: Kirill A. Shutemov
Signed-off-by: Peter Xu
---
 mm/huge_memory.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ec2bb93f7431..b64ad1947900 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1058,17 +1058,13 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
          * a page table.
          */
         if (is_huge_zero_pmd(pmd)) {
-                struct page *zero_page;
                 /*
                  * get_huge_zero_page() will never allocate a new page here,
                  * since we already have a zero page to copy. It just takes a
                  * reference.
                  */
-                zero_page = mm_get_huge_zero_page(dst_mm);
-                set_huge_zero_page(pgtable, dst_mm, vma, addr, dst_pmd,
-                                zero_page);
-                ret = 0;
-                goto out_unlock;
+                mm_get_huge_zero_page(dst_mm);
+                goto out_zero_page;
         }
 
         src_page = pmd_page(pmd);
@@ -1094,6 +1090,7 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
         get_page(src_page);
         page_dup_rmap(src_page, true);
         add_mm_counter(dst_mm, MM_ANONPAGES, HPAGE_PMD_NR);
+out_zero_page:
         mm_inc_nr_ptes(dst_mm);
         pgtable_trans_huge_deposit(dst_mm, dst_pmd, pgtable);
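For readers following the series, the resulting control flow of copy_huge_pmd()
for the huge zero page looks roughly as below after this patch.  This is a
simplified sketch only (locking, error handling and the later uffd-wp changes
omitted), not a verbatim excerpt:

        if (is_huge_zero_pmd(pmd)) {
                /* Only the zero-page refcount remains special */
                mm_get_huge_zero_page(dst_mm);
                goto out_zero_page;
        }

        /* ... normal anonymous THP copy path ... */

out_zero_page:
        mm_inc_nr_ptes(dst_mm);
        pgtable_trans_huge_deposit(dst_mm, dst_pmd, pgtable);
        pmdp_set_wrprotect(src_mm, addr, src_pmd);
        pmd = pmd_mkold(pmd_wrprotect(pmd));
        set_pmd_at(dst_mm, addr, dst_pmd, pmd);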
From patchwork Fri Jan 15 17:08:39 2021
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Mike Kravetz, peterx@redhat.com, Jerome Glisse,
    "Kirill A. Shutemov", Hugh Dickins, Axel Rasmussen, Matthew Wilcox,
    Andrew Morton, Andrea Arcangeli, Nadav Amit
Subject: [PATCH RFC 02/30] mm/userfaultfd: Fix uffd-wp special cases for fork()
Date: Fri, 15 Jan 2021 12:08:39 -0500
Message-Id: <20210115170907.24498-3-peterx@redhat.com>
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

We tried to do something similar in b569a1760782 ("userfaultfd: wp: drop
_PAGE_UFFD_WP properly when fork") previously, but it did not get everything
right.  A few fixes around the code path:

1. We were referencing the VM_UFFD_WP vm_flags of the _old_ vma rather than
   the new vma.  That was overlooked in b569a1760782, so it won't work as
   expected.  Thanks to the recent rework of the fork code (7a4830c380f3a8b3),
   we can easily get the new vma now, so switch the checks to that.

2. Dropping the uffd-wp bit in copy_huge_pmd() could be wrong if the huge pmd
   is a migration huge pmd.  In that case, instead of pmd_uffd_wp(), we should
   use pmd_swp_uffd_wp().  The fix is simply to handle the two cases
   separately.

3. We forgot to carry over the uffd-wp bit for a write migration huge pmd
   entry.  This also happens in copy_huge_pmd(), where we convert a write huge
   migration entry into a read one.

4. In copy_nonpresent_pte(), drop uffd-wp if necessary for swap ptes.

5. In copy_present_page(), when COW is enforced during fork(), we also need to
   pass over the uffd-wp bit if VM_UFFD_WP is armed on the new vma and the pte
   to be copied has the uffd-wp bit set.

Remove the comment in copy_present_pte() about this.  Commenting only there
does not help much, while commenting everywhere would be overkill.  Let's
assume the commit messages will help.

Cc: Jerome Glisse
Cc: Mike Rapoport
Fixes: b569a1760782f3da03ff718d61f74163dea599ff
Signed-off-by: Peter Xu
---
 include/linux/huge_mm.h |  3 ++-
 mm/huge_memory.c        | 23 ++++++++++-------------
 mm/memory.c             | 24 +++++++++++++-----------
 3 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 0365aa97f8e7..77b8d2132c3a 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -10,7 +10,8 @@
 extern vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf);
 extern int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
                          pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr,
-                         struct vm_area_struct *vma);
+                         struct vm_area_struct *src_vma,
+                         struct vm_area_struct *dst_vma);
 extern void huge_pmd_set_accessed(struct vm_fault *vmf, pmd_t orig_pmd);
 extern int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm,
                          pud_t *dst_pud, pud_t *src_pud, unsigned long addr,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b64ad1947900..35d4acac6874 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -996,7 +996,7 @@ struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr,
 
 int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
                   pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr,
-                  struct vm_area_struct *vma)
+                  struct vm_area_struct *src_vma, struct vm_area_struct *dst_vma)
 {
         spinlock_t *dst_ptl, *src_ptl;
         struct page *src_page;
@@ -1005,7 +1005,7 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
         int ret = -ENOMEM;
 
         /* Skip if can be re-fill on fault */
-        if (!vma_is_anonymous(vma))
+        if (!vma_is_anonymous(dst_vma))
                 return 0;
 
         pgtable = pte_alloc_one(dst_mm);
@@ -1019,14 +1019,6 @@
         ret = -EAGAIN;
         pmd = *src_pmd;
 
-        /*
-         * Make sure the _PAGE_UFFD_WP bit is cleared if the new VMA
-         * does not have the VM_UFFD_WP, which means that the uffd
-         * fork event is not enabled.
-         */
-        if (!(vma->vm_flags & VM_UFFD_WP))
-                pmd = pmd_clear_uffd_wp(pmd);
-
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
         if (unlikely(is_swap_pmd(pmd))) {
                 swp_entry_t entry = pmd_to_swp_entry(pmd);
@@ -1037,11 +1029,15 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
                         pmd = swp_entry_to_pmd(entry);
                         if (pmd_swp_soft_dirty(*src_pmd))
                                 pmd = pmd_swp_mksoft_dirty(pmd);
+                        if (pmd_swp_uffd_wp(*src_pmd))
+                                pmd = pmd_swp_mkuffd_wp(pmd);
                         set_pmd_at(src_mm, addr, src_pmd, pmd);
                 }
                 add_mm_counter(dst_mm, MM_ANONPAGES, HPAGE_PMD_NR);
                 mm_inc_nr_ptes(dst_mm);
                 pgtable_trans_huge_deposit(dst_mm, dst_pmd, pgtable);
+                if (!(dst_vma->vm_flags & VM_UFFD_WP))
+                        pmd = pmd_swp_clear_uffd_wp(pmd);
                 set_pmd_at(dst_mm, addr, dst_pmd, pmd);
                 ret = 0;
                 goto out_unlock;
@@ -1077,13 +1073,13 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
          * best effort that the pinned pages won't be replaced by another
          * random page during the coming copy-on-write.
          */
-        if (unlikely(is_cow_mapping(vma->vm_flags) &&
+        if (unlikely(is_cow_mapping(src_vma->vm_flags) &&
                      atomic_read(&src_mm->has_pinned) &&
                      page_maybe_dma_pinned(src_page))) {
                 pte_free(dst_mm, pgtable);
                 spin_unlock(src_ptl);
                 spin_unlock(dst_ptl);
-                __split_huge_pmd(vma, src_pmd, addr, false, NULL);
+                __split_huge_pmd(src_vma, src_pmd, addr, false, NULL);
                 return -EAGAIN;
         }
 
@@ -1093,8 +1089,9 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 out_zero_page:
         mm_inc_nr_ptes(dst_mm);
         pgtable_trans_huge_deposit(dst_mm, dst_pmd, pgtable);
-
         pmdp_set_wrprotect(src_mm, addr, src_pmd);
+        if (!(dst_vma->vm_flags & VM_UFFD_WP))
+                pmd = pmd_clear_uffd_wp(pmd);
         pmd = pmd_mkold(pmd_wrprotect(pmd));
         set_pmd_at(dst_mm, addr, dst_pmd, pmd);
 
diff --git a/mm/memory.c b/mm/memory.c
index c48f8df6e502..d6d2873368e1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -696,10 +696,10 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
 
 static unsigned long
 copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
-                pte_t *dst_pte, pte_t *src_pte, struct vm_area_struct *vma,
-                unsigned long addr, int *rss)
+                pte_t *dst_pte, pte_t *src_pte, struct vm_area_struct *src_vma,
+                struct vm_area_struct *dst_vma, unsigned long addr, int *rss)
 {
-        unsigned long vm_flags = vma->vm_flags;
+        unsigned long vm_flags = dst_vma->vm_flags;
         pte_t pte = *src_pte;
         struct page *page;
         swp_entry_t entry = pte_to_swp_entry(pte);
@@ -768,6 +768,8 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
                         set_pte_at(src_mm, addr, src_pte, pte);
                 }
         }
+        if (!userfaultfd_wp(dst_vma))
+                pte = pte_swp_clear_uffd_wp(pte);
         set_pte_at(dst_mm, addr, dst_pte, pte);
         return 0;
 }
@@ -839,6 +841,9 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
         /* All done, just insert the new page copy in the child */
         pte = mk_pte(new_page, dst_vma->vm_page_prot);
         pte = maybe_mkwrite(pte_mkdirty(pte), dst_vma);
+        if (userfaultfd_wp(dst_vma) && pte_uffd_wp(*src_pte))
+                /* Uffd-wp needs to be delivered to dest pte as well */
+                pte = pte_wrprotect(pte_mkuffd_wp(pte));
         set_pte_at(dst_vma->vm_mm, addr, dst_pte, pte);
         return 0;
 }
@@ -888,12 +893,7 @@ copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
                 pte = pte_mkclean(pte);
         pte = pte_mkold(pte);
 
-        /*
-         * Make sure the _PAGE_UFFD_WP bit is cleared if the new VMA
-         * does not have the VM_UFFD_WP, which means that the uffd
-         * fork event is not enabled.
-         */
-        if (!(vm_flags & VM_UFFD_WP))
+        if (!(dst_vma->vm_flags & VM_UFFD_WP))
                 pte = pte_clear_uffd_wp(pte);
 
         set_pte_at(dst_vma->vm_mm, addr, dst_pte, pte);
@@ -968,7 +968,8 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
                 if (unlikely(!pte_present(*src_pte))) {
                         entry.val = copy_nonpresent_pte(dst_mm, src_mm,
                                                         dst_pte, src_pte,
-                                                        src_vma, addr, rss);
+                                                        src_vma, dst_vma,
+                                                        addr, rss);
                         if (entry.val)
                                 break;
                         progress += 8;
@@ -1046,7 +1047,8 @@ copy_pmd_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
                         int err;
                         VM_BUG_ON_VMA(next-addr != HPAGE_PMD_SIZE, src_vma);
                         err = copy_huge_pmd(dst_mm, src_mm,
-                                            dst_pmd, src_pmd, addr, src_vma);
+                                            dst_pmd, src_pmd, addr, src_vma,
+                                            dst_vma);
                         if (err == -ENOMEM)
                                 return -ENOMEM;
                         if (!err)
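For context on what "VM_UFFD_WP armed on the new vma" means in practice:
userspace arms wp mode on a range roughly as in the sketch below, and a later
fork() must then either carry the per-pte uffd-wp bits into the child (if the
child vma keeps VM_UFFD_WP via the uffd fork feature) or drop them, which is
exactly the distinction the fixes above are about.  This is an illustrative
sketch only, error handling omitted; it is not part of the patch:

#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

static void arm_uffd_wp(void *addr, size_t len)
{
        int uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
        struct uffdio_api api = { .api = UFFD_API };
        struct uffdio_register reg = {
                .range = { .start = (unsigned long)addr, .len = len },
                .mode  = UFFDIO_REGISTER_MODE_WP,
        };
        struct uffdio_writeprotect wp = {
                .range = { .start = (unsigned long)addr, .len = len },
                .mode  = UFFDIO_WRITEPROTECT_MODE_WP,
        };

        ioctl(uffd, UFFDIO_API, &api);          /* handshake */
        ioctl(uffd, UFFDIO_REGISTER, &reg);     /* sets VM_UFFD_WP on the vma */
        ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);  /* sets the per-pte uffd-wp bits */
}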
From patchwork Fri Jan 15 17:08:40 2021
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Mike Kravetz, peterx@redhat.com, Jerome Glisse,
    "Kirill A. Shutemov", Hugh Dickins, Axel Rasmussen, Matthew Wilcox,
    Andrew Morton, Andrea Arcangeli, Nadav Amit
Subject: [PATCH RFC 03/30] mm/userfaultfd: Fix a few thp pmd missing uffd-wp bit
Date: Fri, 15 Jan 2021 12:08:40 -0500
Message-Id: <20210115170907.24498-4-peterx@redhat.com>
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

These include:

1. When removing a migration pmd entry, keep the uffd-wp bit from the swap
   pte.  Note that we need to do this after setting the write bit, just in
   case we need to remove it again.

2. When changing a huge pmd and converting a write -> read migration entry,
   persist the same uffd-wp bit.

3. When converting a pmd to a swap entry, always drop the uffd-wp bit.

Signed-off-by: Peter Xu
---
 include/linux/swapops.h | 2 ++
 mm/huge_memory.c        | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index d9b7c9132c2f..7dd57303bb0c 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -258,6 +258,8 @@ static inline swp_entry_t pmd_to_swp_entry(pmd_t pmd)
 
         if (pmd_swp_soft_dirty(pmd))
                 pmd = pmd_swp_clear_soft_dirty(pmd);
+        if (pmd_swp_uffd_wp(pmd))
+                pmd = pmd_swp_clear_uffd_wp(pmd);
         arch_entry = __pmd_to_swp_entry(pmd);
         return swp_entry(__swp_type(arch_entry), __swp_offset(arch_entry));
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 35d4acac6874..4abc46e780a0 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1810,6 +1810,8 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
                         newpmd = swp_entry_to_pmd(entry);
                         if (pmd_swp_soft_dirty(*pmd))
                                 newpmd = pmd_swp_mksoft_dirty(newpmd);
+                        if (pmd_swp_uffd_wp(*pmd))
+                                newpmd = pmd_swp_mkuffd_wp(newpmd);
                         set_pmd_at(mm, addr, pmd, newpmd);
                 }
                 goto unlock;
@@ -2968,6 +2970,8 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
                 pmde = pmd_mksoft_dirty(pmde);
         if (is_write_migration_entry(entry))
                 pmde = maybe_pmd_mkwrite(pmde, vma);
+        if (pmd_swp_uffd_wp(*pvmw->pmd))
+                pmde = pmd_wrprotect(pmd_mkuffd_wp(pmde));
 
         flush_cache_range(vma, mmun_start, mmun_start + HPAGE_PMD_SIZE);
         if (PageAnon(new))
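To make point 1 above concrete: in the remove_migration_pmd() hunk the uffd-wp
bit is applied only after maybe_pmd_mkwrite() has had its say, because the
write bit may be set there and the carried-over uffd-wp bit must then
wr-protect the result again.  In sketch form (simplified from the hunk above):

        if (is_write_migration_entry(entry))
                pmde = maybe_pmd_mkwrite(pmde, vma);        /* may set write */
        if (pmd_swp_uffd_wp(*pvmw->pmd))
                pmde = pmd_wrprotect(pmd_mkuffd_wp(pmde));  /* wp wins again */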
From patchwork Fri Jan 15 17:08:41 2021
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Mike Kravetz, peterx@redhat.com, Jerome Glisse,
    "Kirill A. Shutemov", Hugh Dickins, Axel Rasmussen, Matthew Wilcox,
    Andrew Morton, Andrea Arcangeli, Nadav Amit
Subject: [PATCH RFC 04/30] shmem/userfaultfd: Take care of UFFDIO_COPY_MODE_WP
Date: Fri, 15 Jan 2021 12:08:41 -0500
Message-Id: <20210115170907.24498-5-peterx@redhat.com>
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

Firstly, pass wp_copy into shmem_mfill_atomic_pte() through the stack.  Then
apply the UFFD_WP bit properly when the UFFDIO_COPY on shmem is requested with
UFFDIO_COPY_MODE_WP.

One thing to mention is that shmem_mfill_atomic_pte() needs to set the dirty
bit in the pte even if UFFDIO_COPY_MODE_WP is set.  The reason is similar to
dcf7fe9d8976 ("userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE
is not set"), where we need to set the page dirty even if VM_WRITE is not
there: shmem can drop the pte any time later, and if the pte is not dirty the
data will be dropped.  For uffd-wp, that could lead to data loss without the
dirty bit set.

Note that shmem_mfill_zeropage_pte() will always call shmem_mfill_atomic_pte()
with wp_copy==false, because UFFDIO_ZEROPAGE does not support
UFFDIO_COPY_MODE_WP.

Signed-off-by: Peter Xu
---
 include/linux/shmem_fs.h |  5 +++--
 mm/shmem.c               | 26 +++++++++++++++++++-------
 mm/userfaultfd.c         |  2 +-
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index a5a5d1d4d7b1..9d6fc68a1e57 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -123,14 +123,15 @@ extern int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
                                   struct vm_area_struct *dst_vma,
                                   unsigned long dst_addr,
                                   unsigned long src_addr,
-                                  struct page **pagep);
+                                  struct page **pagep,
+                                  bool wp_copy);
 extern int shmem_mfill_zeropage_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
                                     struct vm_area_struct *dst_vma,
                                     unsigned long dst_addr);
 #else
 #define shmem_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma, dst_addr, \
-                               src_addr, pagep) ({ BUG(); 0; })
+                               src_addr, pagep, wp_copy) ({ BUG(); 0; })
 #define shmem_mfill_zeropage_pte(dst_mm, dst_pmd, dst_vma, \
                                  dst_addr) ({ BUG(); 0; })
 #endif
diff --git a/mm/shmem.c b/mm/shmem.c
index 537c137698f8..de45333626f7 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2363,7 +2363,8 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
                                   unsigned long dst_addr,
                                   unsigned long src_addr,
                                   bool zeropage,
-                                  struct page **pagep)
+                                  struct page **pagep,
+                                  bool wp_copy)
 {
         struct inode *inode = file_inode(dst_vma->vm_file);
         struct shmem_inode_info *info = SHMEM_I(inode);
@@ -2425,9 +2426,18 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
                 goto out_release;
 
         _dst_pte = mk_pte(page, dst_vma->vm_page_prot);
-        if (dst_vma->vm_flags & VM_WRITE)
-                _dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
-        else {
+        if (dst_vma->vm_flags & VM_WRITE) {
+                if (wp_copy)
+                        _dst_pte = pte_mkuffd_wp(pte_wrprotect(_dst_pte));
+                else
+                        _dst_pte = pte_mkwrite(_dst_pte);
+                /*
+                 * Similar reason to set_page_dirty(), that we need to mark the
+                 * pte dirty even if wp_copy==true here, otherwise the pte and
+                 * its page could be dropped at anytime when e.g. swapped out.
+                 */
+                _dst_pte = pte_mkdirty(_dst_pte);
+        } else {
                 /*
                  * We don't set the pte dirty if the vma has no
                  * VM_WRITE permission, so mark the page dirty or it
@@ -2485,10 +2495,12 @@ int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm,
                            struct vm_area_struct *dst_vma,
                            unsigned long dst_addr,
                            unsigned long src_addr,
-                           struct page **pagep)
+                           struct page **pagep,
+                           bool wp_copy)
 {
         return shmem_mfill_atomic_pte(dst_mm, dst_pmd, dst_vma,
-                                      dst_addr, src_addr, false, pagep);
+                                      dst_addr, src_addr, false, pagep,
+                                      wp_copy);
 }
 
 int shmem_mfill_zeropage_pte(struct mm_struct *dst_mm,
@@ -2499,7 +2511,7 @@ int shmem_mfill_zeropage_pte(struct mm_struct *dst_mm,
         struct page *page = NULL;
 
         return shmem_mfill_atomic_pte(dst_mm, dst_pmd, dst_vma,
-                                      dst_addr, 0, true, &page);
+                                      dst_addr, 0, true, &page, false);
 }
 
 #ifdef CONFIG_TMPFS
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 9a3d451402d7..6d4b3b7c7f9f 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -445,7 +445,7 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
                 if (!zeropage)
                         err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd,
                                                      dst_vma, dst_addr,
-                                                     src_addr, page);
+                                                     src_addr, page, wp_copy);
                 else
                         err = shmem_mfill_zeropage_pte(dst_mm, dst_pmd,
                                                        dst_vma, dst_addr);
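For reference, the wp_copy path added here corresponds to a userspace
UFFDIO_COPY that asks for UFFDIO_COPY_MODE_WP on a shmem-backed, wp-registered
range, so the copied page is installed write-protected in one step.  Minimal
sketch (assumes uffd is already registered over the faulting range; error
handling omitted; not part of the patch):

#include <linux/userfaultfd.h>
#include <stddef.h>
#include <sys/ioctl.h>

static long copy_page_wp(int uffd, void *dst, void *src, size_t len)
{
        struct uffdio_copy copy = {
                .dst  = (unsigned long)dst,
                .src  = (unsigned long)src,
                .len  = len,
                .mode = UFFDIO_COPY_MODE_WP,    /* install the pte wr-protected */
        };

        if (ioctl(uffd, UFFDIO_COPY, &copy))
                return -1;
        return copy.copy;                       /* bytes actually copied */
}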
From patchwork Fri Jan 15 17:08:42 2021
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Mike Kravetz, peterx@redhat.com, Jerome Glisse,
    "Kirill A. Shutemov", Hugh Dickins, Axel Rasmussen, Matthew Wilcox,
    Andrew Morton, Andrea Arcangeli, Nadav Amit
Subject: [PATCH RFC 05/30] mm: Clear vmf->pte after pte_unmap_same() returns
Date: Fri, 15 Jan 2021 12:08:42 -0500
Message-Id: <20210115170907.24498-6-peterx@redhat.com>
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

pte_unmap_same() will always unmap the pte pointer.  After the unmap,
vmf->pte will not be valid any more, so we should clear it.

It was only safe so far because no one accesses vmf->pte after
pte_unmap_same() returns: the only caller of pte_unmap_same() (so far) is
do_swap_page(), where vmf->pte will in most cases be overwritten very soon.

pte_unmap_same() will be used in other places in follow-up patches, so
vmf->pte will not always be re-written.  This patch enables us to call
functions like finish_fault(), which will conditionally unmap the pte by
checking vmf->pte first.  Similarly, alloc_set_pte() will make sure to
allocate a new pte even after calling pte_unmap_same().

Since we'll need to modify vmf->pte, pass vmf directly into
pte_unmap_same(), which also avoids the long parameter list.

Signed-off-by: Peter Xu
---
 mm/memory.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index d6d2873368e1..5ab3106cdd35 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2559,19 +2559,20 @@ EXPORT_SYMBOL_GPL(apply_to_existing_page_range);
  * proceeding (but do_wp_page is only called after already making such a check;
  * and do_anonymous_page can safely check later on).
  */
-static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
-                                pte_t *page_table, pte_t orig_pte)
+static inline int pte_unmap_same(struct vm_fault *vmf)
 {
         int same = 1;
 #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPTION)
         if (sizeof(pte_t) > sizeof(unsigned long)) {
-                spinlock_t *ptl = pte_lockptr(mm, pmd);
+                spinlock_t *ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
                 spin_lock(ptl);
-                same = pte_same(*page_table, orig_pte);
+                same = pte_same(*vmf->pte, vmf->orig_pte);
                 spin_unlock(ptl);
         }
 #endif
-        pte_unmap(page_table);
+        pte_unmap(vmf->pte);
+        /* After unmap of pte, the pointer is invalid now - clear it. */
+        vmf->pte = NULL;
         return same;
 }
 
@@ -3251,7 +3252,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
         vm_fault_t ret = 0;
         void *shadow = NULL;
 
-        if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
+        if (!pte_unmap_same(vmf))
                 goto out;
 
         entry = pte_to_swp_entry(vmf->orig_pte);
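The caller pattern the changelog refers to can be sketched as follows
(illustrative only, not a verbatim excerpt): once pte_unmap_same() clears
vmf->pte, later common code can key off the pointer instead of remembering
whether an unmap has already happened:

        if (!pte_unmap_same(vmf))
                goto out;

        /* ... helpers here may or may not re-populate vmf->pte ... */

        if (vmf->pte)                   /* only unmap if still mapped */
                pte_unmap(vmf->pte);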
From patchwork Fri Jan 15 17:08:43 2021
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Mike Kravetz, peterx@redhat.com, Jerome Glisse,
    "Kirill A. Shutemov", Hugh Dickins, Axel Rasmussen, Matthew Wilcox,
    Andrew Morton, Andrea Arcangeli, Nadav Amit
Subject: [PATCH RFC 06/30] mm/userfaultfd: Introduce special pte for unmapped file-backed mem
Date: Fri, 15 Jan 2021 12:08:43 -0500
Message-Id: <20210115170907.24498-7-peterx@redhat.com>
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

This patch introduces a very special swap-like pte for file-backed memories.
Currently it is only defined for x86_64, but any arch that can properly define
the UFFD_WP_SWP_PTE_SPECIAL value as requested should conceptually work too.

We will use this special pte to arm the ptes that got either unmapped or
swapped out for a file-backed region that was previously wr-protected.  This
special pte triggers a page fault just like a swap entry, as long as the page
fault satisfies pte_none()==false && pte_present()==false.  Then we can revive
the special pte into a normal pte backed by the page cache.

This idea is greatly inspired by Hugh and Andrea in the discussion, which is
referenced in the links below.

The other idea (from Hugh) is to use swp_type==1 and swp_offset==0 as the
special pte.  The current solution (as pointed out by Andrea) is slightly
preferred in that we don't need any swp_entry_t knowledge at all when trapping
these accesses.  Meanwhile, we also reuse _PAGE_SWP_UFFD_WP from the anonymous
swp entries.

This patch only introduces the special pte and its operators.  It is not yet
applied anywhere, so it brings no functional change.

Link: https://lore.kernel.org/lkml/20201126222359.8120-1-peterx@redhat.com/
Link: https://lore.kernel.org/lkml/20201130230603.46187-1-peterx@redhat.com/
Suggested-by: Andrea Arcangeli
Suggested-by: Hugh Dickins
Signed-off-by: Peter Xu
---
 arch/x86/include/asm/pgtable.h     | 28 ++++++++++++++++++++++++++++
 include/asm-generic/pgtable_uffd.h |  3 +++
 include/linux/userfaultfd_k.h      | 21 +++++++++++++++++++++
 3 files changed, 52 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index a02c67291cfc..379bae343dd1 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1329,6 +1329,34 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
 #endif
 
 #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+
+/*
+ * This is a very special swap-like pte that marks this pte as "wr-protected"
+ * by userfaultfd-wp.  It should only exist for file-backed memory where the
+ * page (previously got wr-protected) has been unmapped or swapped out.
+ *
+ * For anonymous memories, the userfaultfd-wp _PAGE_SWP_UFFD_WP bit is kept
+ * along with a real swp entry instead.
+ *
+ * Let's make some rules for this special pte:
+ *
+ * (1) pte_none()==false, so that it'll not trigger a missing page fault.
+ *
+ * (2) pte_present()==false, so that it's recognized as swap (is_swap_pte).
+ *
+ * (3) pte_swp_uffd_wp()==true, so it can be tested just like a swap pte that
+ *     contains a valid swap entry, so that we can check a swap pte always
+ *     using "is_swap_pte() && pte_swp_uffd_wp()" without caring about whether
+ *     there's one swap entry inside of the pte.
+ *
+ * (4) It should not be a valid swap pte anywhere, so that when we see this pte
+ *     we know it does not contain a swap entry.
+ *
+ * For x86, the simplest special pte which satisfies all of above should be the
+ * pte with only _PAGE_SWP_UFFD_WP bit set (where swp_type==swp_offset==0).
+ */
+#define UFFD_WP_SWP_PTE_SPECIAL __pte(_PAGE_SWP_UFFD_WP)
+
 static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
 {
         return pte_set_flags(pte, _PAGE_SWP_UFFD_WP);
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
index 828966d4c281..95e9811ce9d1 100644
--- a/include/asm-generic/pgtable_uffd.h
+++ b/include/asm-generic/pgtable_uffd.h
@@ -2,6 +2,9 @@
 #define _ASM_GENERIC_PGTABLE_UFFD_H
 
 #ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+
+#define UFFD_WP_SWP_PTE_SPECIAL __pte(0)
+
 static __always_inline int pte_uffd_wp(pte_t pte)
 {
         return 0;
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index a8e5f3ea9bb2..7d6071a65ded 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -98,6 +98,17 @@ extern int userfaultfd_unmap_prep(struct vm_area_struct *vma,
 extern void userfaultfd_unmap_complete(struct mm_struct *mm,
                                        struct list_head *uf);
 
+static inline pte_t pte_swp_mkuffd_wp_special(struct vm_area_struct *vma)
+{
+        WARN_ON_ONCE(vma_is_anonymous(vma));
+        return UFFD_WP_SWP_PTE_SPECIAL;
+}
+
+static inline bool pte_swp_uffd_wp_special(pte_t pte)
+{
+        return pte_same(pte, UFFD_WP_SWP_PTE_SPECIAL);
+}
+
 #else /* CONFIG_USERFAULTFD */
 
 /* mm helpers */
@@ -182,6 +193,16 @@ static inline void userfaultfd_unmap_complete(struct mm_struct *mm,
 {
 }
 
+static inline pte_t pte_swp_mkuffd_wp_special(struct vm_area_struct *vma)
+{
+        return __pte(0);
+}
+
+static inline bool pte_swp_uffd_wp_special(pte_t pte)
+{
+        return false;
+}
+
 #endif /* CONFIG_USERFAULTFD */
 
 #endif /* _LINUX_USERFAULTFD_K_H */
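One plausible way a fault path could consume these helpers in later patches of
the series (hypothetical sketch, not code introduced by this patch):

        /* Trap the special pte before parsing it as a swap entry. */
        if (pte_swp_uffd_wp_special(vmf->orig_pte)) {
                /*
                 * A previously wr-protected, file-backed page that has been
                 * unmapped or swapped out: report a uffd-wp fault rather
                 * than calling pte_to_swp_entry() on the pte.
                 */
                return handle_userfault(vmf, VM_UFFD_WP);
        }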
Received: by kanga.kvack.org (Postfix, from userid 63042) id 8691E8D01A1; Fri, 15 Jan 2021 12:09:27 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0236.hostedemail.com [216.40.44.236]) by kanga.kvack.org (Postfix) with ESMTP id 6E8188D019D for ; Fri, 15 Jan 2021 12:09:27 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 32D711803C4F6 for ; Fri, 15 Jan 2021 17:09:27 +0000 (UTC) X-FDA: 77708645574.13.nut70_420095127531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin13.hostedemail.com (Postfix) with ESMTP id EC11118086CA1 for ; Fri, 15 Jan 2021 17:09:26 +0000 (UTC) X-HE-Tag: nut70_420095127531 X-Filterd-Recvd-Size: 15328 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by imf45.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:09:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730565; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/1vlUeHhytX0Zn1XKXcct2jvLo89TqfY4pIBek2Nk6w=; b=X9sdlObl+XQ/0lVK7/JX1F9Qec/c6SyRvwrAyDR8ObLMLNh0VyecBmmsSxlmPTBMLX0y/B dBU4evy7mZCCsF6PV1eh79nA1eeL9M2sTBOsdPH5CKFbmHnMJeUZtSGuVZuKL28zTOVY48 e3tJfaM3jmhwEzPcVMzPhbgXp4N32Is= Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-286-nvnj7ygcOvGYrRkDbtJPVA-1; Fri, 15 Jan 2021 12:09:24 -0500 X-MC-Unique: nvnj7ygcOvGYrRkDbtJPVA-1 Received: by mail-qt1-f200.google.com with SMTP id b24so7851444qtt.22 for ; Fri, 15 Jan 2021 09:09:24 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/1vlUeHhytX0Zn1XKXcct2jvLo89TqfY4pIBek2Nk6w=; b=sfhLm5sYfe4hPiq0XaH0DnOGYRVKsGg8AZdqhgL5MLtq5njDFPirRXg6X2T02aNQ2o vdhuIgqOwGvrUZvgSluRRe7Y6kYnx9CM/k2qc7JPwEW8YYTR1xU/6yMfAgWNWx4Q3OUO 6/GLYoP0YD/jaQU/W2hTs8EHj7ZtmFZKHSAM6Yy8BFZXY86eXvhvE6JPX7edOk6UnIeb NS0uv5xeMfqA0ISVIDixmka0wybAOSET+GgBLn7qeQl3vCutrpqhpQNxQzPs4A75tRhR 8dn6hrC8BcQIJ6tpZPU6kPjjcoFJMLfQLf5CLfnyJYPu649P0JI2qwpdolF7bNneZTBK AIpg== X-Gm-Message-State: AOAM530j7RCk8ckFtg7dFUqW4xwZsnfORhd7wR+L7jJCKuc9ssyeMKUR Jx31Mb01BiCPhXDwoCGSGt8ns5yJ3gjRBWZc6GFaEOhQQRZvsQ7gnOkd6Rsfl1vuak70MHzJcha Lc1Cwi8J6Nmc= X-Received: by 2002:a37:a1d6:: with SMTP id k205mr13509294qke.384.1610730563101; Fri, 15 Jan 2021 09:09:23 -0800 (PST) X-Google-Smtp-Source: ABdhPJyUDTEd6KHwD8hcYvYDXopX+e8jSdEwnqhv+y4ihKk9o1h30UduZojRTJB6Qc+C6GSA3qFLSQ== X-Received: by 2002:a37:a1d6:: with SMTP id k205mr13509268qke.384.1610730562780; Fri, 15 Jan 2021 09:09:22 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.09.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:09:21 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . 
Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 07/30] mm/swap: Introduce the idea of special swap ptes Date: Fri, 15 Jan 2021 12:08:44 -0500 Message-Id: <20210115170907.24498-8-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: We used to have special swap entries, like migration entries, hw-poison entries, device private entries, etc. Those "special swap entries" reside in the range that they need to be at least swap entries first, and their types are decided by swp_type(entry). This patch introduces another idea called "special swap ptes". It's very easy to get confused against "special swap entries", but a speical swap pte should never contain a swap entry at all. It means, it's illegal to call pte_to_swp_entry() upon a special swap pte. Make the uffd-wp special pte to be the first special swap pte. Before this patch, is_swap_pte()==true means one of the below: (a.1) The pte has a normal swap entry (non_swap_entry()==false). For example, when an anonymous page got swapped out. (a.2) The pte has a special swap entry (non_swap_entry()==true). For example, a migration entry, a hw-poison entry, etc. After this patch, is_swap_pte()==true means one of the below, where case (b) is added: (a) The pte contains a swap entry. (a.1) The pte has a normal swap entry (non_swap_entry()==false). For example, when an anonymous page got swapped out. (a.2) The pte has a special swap entry (non_swap_entry()==true). For example, a migration entry, a hw-poison entry, etc. (b) The pte does not contain a swap entry at all (so it cannot be passed into pte_to_swp_entry()). For example, uffd-wp special swap pte. Teach the whole mm core about this new idea. It's done by introducing another helper called pte_has_swap_entry() which stands for case (a.1) and (a.2). Before this patch, it will be the same as is_swap_pte() because there's no special swap pte yet. Now for most of the previous use of is_swap_entry() in mm core, we'll need to use the new helper pte_has_swap_entry() instead, to make sure we won't try to parse a swap entry from a swap special pte (which does not contain a swap entry at all!). We either handle the swap special pte, or it'll naturally use the default "else" paths. Warn properly (e.g., in do_swap_page()) when we see a special swap pte - we should never call do_swap_page() upon those ptes, but just to bail out early if it happens. 
Signed-off-by: Peter Xu --- fs/proc/task_mmu.c | 14 ++++++++------ include/linux/swapops.h | 39 ++++++++++++++++++++++++++++++++++++++- mm/khugepaged.c | 11 ++++++++++- mm/memcontrol.c | 2 +- mm/memory.c | 7 +++++++ mm/migrate.c | 2 +- mm/mprotect.c | 2 +- mm/mremap.c | 2 +- mm/page_vma_mapped.c | 6 +++--- 9 files changed, 70 insertions(+), 15 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index ee5a235b3056..5286fd23bbf4 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -498,7 +498,7 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr, if (pte_present(*pte)) { page = vm_normal_page(vma, addr, *pte); - } else if (is_swap_pte(*pte)) { + } else if (pte_has_swap_entry(*pte)) { swp_entry_t swpent = pte_to_swp_entry(*pte); if (!non_swap_entry(swpent)) { @@ -518,8 +518,10 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr, page = migration_entry_to_page(swpent); else if (is_device_private_entry(swpent)) page = device_private_entry_to_page(swpent); - } else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && mss->check_shmem_swap - && pte_none(*pte))) { + } else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && + mss->check_shmem_swap && + /* Here swap special pte is the same as none pte */ + (pte_none(*pte) || is_swap_special_pte(*pte)))) { page = xa_load(&vma->vm_file->f_mapping->i_pages, linear_page_index(vma, addr)); if (xa_is_value(page)) @@ -688,7 +690,7 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask, if (pte_present(*pte)) { page = vm_normal_page(vma, addr, *pte); - } else if (is_swap_pte(*pte)) { + } else if (pte_has_swap_entry(*pte)) { swp_entry_t swpent = pte_to_swp_entry(*pte); if (is_migration_entry(swpent)) @@ -1053,7 +1055,7 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma, ptent = pte_wrprotect(old_pte); ptent = pte_clear_soft_dirty(ptent); ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent); - } else if (is_swap_pte(ptent)) { + } else if (pte_has_swap_entry(ptent)) { ptent = pte_swp_clear_soft_dirty(ptent); set_pte_at(vma->vm_mm, addr, pte, ptent); } @@ -1366,7 +1368,7 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm, page = vm_normal_page(vma, addr, pte); if (pte_soft_dirty(pte)) flags |= PM_SOFT_DIRTY; - } else if (is_swap_pte(pte)) { + } else if (pte_has_swap_entry(pte)) { swp_entry_t entry; if (pte_swp_soft_dirty(pte)) flags |= PM_SOFT_DIRTY; diff --git a/include/linux/swapops.h b/include/linux/swapops.h index 7dd57303bb0c..7b7387d2892f 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -5,6 +5,7 @@ #include #include #include +#include #ifdef CONFIG_MMU @@ -52,12 +53,48 @@ static inline pgoff_t swp_offset(swp_entry_t entry) return entry.val & SWP_OFFSET_MASK; } -/* check whether a pte points to a swap entry */ +/* + * is_swap_pte() returns true for three cases: + * + * (a) The pte contains a swap entry. + * + * (a.1) The pte has a normal swap entry (non_swap_entry()==false). For + * example, when an anonymous page got swapped out. + * + * (a.2) The pte has a special swap entry (non_swap_entry()==true). For + * example, a migration entry, a hw-poison entry, etc. + * + * (b) The pte does not contain a swap entry at all (so it cannot be passed + * into pte_to_swp_entry()). For example, uffd-wp special swap pte. + */ static inline int is_swap_pte(pte_t pte) { return !pte_none(pte) && !pte_present(pte); } +/* + * A swap-like special pte should only be used as special marker to trigger a + * page fault. 
We should treat them similarly as pte_none() in most cases, + * except that it may contain some special information that can persist within + * the pte. Currently the only special swap pte is UFFD_WP_SWP_PTE_SPECIAL. + * + * Note: we should never call pte_to_swp_entry() upon a special swap pte, + * Because a swap special pte does not contain a swap entry! + */ +static inline bool is_swap_special_pte(pte_t pte) +{ + return pte_swp_uffd_wp_special(pte); +} + +/* + * Returns true if the pte contains a swap entry. This includes not only the + * normal swp entry case, but also for migration entries, etc. + */ +static inline bool pte_has_swap_entry(pte_t pte) +{ + return is_swap_pte(pte) && !is_swap_special_pte(pte); +} + /* * Convert the arch-dependent pte representation of a swp_entry_t into an * arch-independent swp_entry_t. diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 4e3dff13eb70..20807163a25f 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1006,7 +1006,7 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm, for (; vmf.address < address + HPAGE_PMD_NR*PAGE_SIZE; vmf.pte++, vmf.address += PAGE_SIZE) { vmf.orig_pte = *vmf.pte; - if (!is_swap_pte(vmf.orig_pte)) + if (!pte_has_swap_entry(vmf.orig_pte)) continue; swapped_in++; ret = do_swap_page(&vmf); @@ -1238,6 +1238,15 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, _pte++, _address += PAGE_SIZE) { pte_t pteval = *_pte; if (is_swap_pte(pteval)) { + if (is_swap_special_pte(pteval)) { + /* + * Reuse SCAN_PTE_UFFD_WP. If there will be + * new users of is_swap_special_pte(), we'd + * better introduce a new result type. + */ + result = SCAN_PTE_UFFD_WP; + goto out_unmap; + } if (++unmapped <= khugepaged_max_ptes_swap) { /* * Always be strict with uffd-wp diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 29459a6ce1c7..3af43a218b8b 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5776,7 +5776,7 @@ static enum mc_target_type get_mctgt_type(struct vm_area_struct *vma, if (pte_present(ptent)) page = mc_handle_present_pte(vma, addr, ptent); - else if (is_swap_pte(ptent)) + else if (pte_has_swap_entry(ptent)) page = mc_handle_swap_pte(vma, ptent, &ent); else if (pte_none(ptent)) page = mc_handle_file_pte(vma, addr, ptent, &ent); diff --git a/mm/memory.c b/mm/memory.c index 5ab3106cdd35..394c2602dce7 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3255,6 +3255,13 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) if (!pte_unmap_same(vmf)) goto out; + /* + * We should never call do_swap_page upon a swap special pte; just be + * safe to bail out if it happens. 
+ */ + if (WARN_ON_ONCE(is_swap_special_pte(vmf->orig_pte))) + goto out; + entry = pte_to_swp_entry(vmf->orig_pte); if (unlikely(non_swap_entry(entry))) { if (is_migration_entry(entry)) { diff --git a/mm/migrate.c b/mm/migrate.c index 5795cb82e27c..8a5459859e17 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -318,7 +318,7 @@ void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep, spin_lock(ptl); pte = *ptep; - if (!is_swap_pte(pte)) + if (!pte_has_swap_entry(pte)) goto out; entry = pte_to_swp_entry(pte); diff --git a/mm/mprotect.c b/mm/mprotect.c index 56c02beb6041..e75bfe43cedd 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -139,7 +139,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, } ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent); pages++; - } else if (is_swap_pte(oldpte)) { + } else if (pte_has_swap_entry(oldpte)) { swp_entry_t entry = pte_to_swp_entry(oldpte); pte_t newpte; diff --git a/mm/mremap.c b/mm/mremap.c index 138abbae4f75..f736fcbe1247 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -106,7 +106,7 @@ static pte_t move_soft_dirty_pte(pte_t pte) #ifdef CONFIG_MEM_SOFT_DIRTY if (pte_present(pte)) pte = pte_mksoft_dirty(pte); - else if (is_swap_pte(pte)) + else if (pte_has_swap_entry(pte)) pte = pte_swp_mksoft_dirty(pte); #endif return pte; diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c index 5e77b269c330..c97884007232 100644 --- a/mm/page_vma_mapped.c +++ b/mm/page_vma_mapped.c @@ -36,7 +36,7 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw) * For more details on device private memory see HMM * (include/linux/hmm.h or mm/hmm.c). */ - if (is_swap_pte(*pvmw->pte)) { + if (pte_has_swap_entry(*pvmw->pte)) { swp_entry_t entry; /* Handle un-addressable ZONE_DEVICE memory */ @@ -88,7 +88,7 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw) if (pvmw->flags & PVMW_MIGRATION) { swp_entry_t entry; - if (!is_swap_pte(*pvmw->pte)) + if (!pte_has_swap_entry(*pvmw->pte)) return false; entry = pte_to_swp_entry(*pvmw->pte); @@ -96,7 +96,7 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw) return false; pfn = migration_entry_to_pfn(entry); - } else if (is_swap_pte(*pvmw->pte)) { + } else if (pte_has_swap_entry(*pvmw->pte)) { swp_entry_t entry; /* Handle un-addressable ZONE_DEVICE memory */ From patchwork Fri Jan 15 17:08:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023325 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4995EC433E0 for ; Fri, 15 Jan 2021 17:09:35 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E90E72333E for ; Fri, 15 Jan 2021 17:09:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E90E72333E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 164EC8D01A2; Fri, 15 Jan 2021 12:09:31 -0500 (EST) 
Received: by kanga.kvack.org (Postfix, from userid 40) id 117928D019D; Fri, 15 Jan 2021 12:09:31 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E85F08D01A2; Fri, 15 Jan 2021 12:09:30 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0176.hostedemail.com [216.40.44.176]) by kanga.kvack.org (Postfix) with ESMTP id D139E8D019D for ; Fri, 15 Jan 2021 12:09:30 -0500 (EST) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 8E91182E8E82 for ; Fri, 15 Jan 2021 17:09:30 +0000 (UTC) X-FDA: 77708645700.03.kite82_231858527531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin03.hostedemail.com (Postfix) with ESMTP id 6A2E428A4EB for ; Fri, 15 Jan 2021 17:09:30 +0000 (UTC) X-HE-Tag: kite82_231858527531 X-Filterd-Recvd-Size: 12632 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by imf21.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:09:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730569; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pU1W6eF+5acoXg0XrMV/kYBFP7oiRoIxFmy0y+UKd/k=; b=bJXlr1TM4mx/IaZ+VMsfSdL33JEx2sdigWRQB1TV4PdUcYqzWCXL0vhDmsAI/Vy2fG0u5W MUhR+L95iQ7uudEpjciNQv9dTphNNK7fqRIB6XeTTxFj3UhqPwp1u6l7G5dI/i38Jiu+Sr JMxZfmzX+q/qlUzVpYUsySSWI98jxEg= Received: from mail-qv1-f71.google.com (mail-qv1-f71.google.com [209.85.219.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-90-0PJG9U8UOfWhwdJ5mpOeoA-1; Fri, 15 Jan 2021 12:09:25 -0500 X-MC-Unique: 0PJG9U8UOfWhwdJ5mpOeoA-1 Received: by mail-qv1-f71.google.com with SMTP id h1so8237994qvr.7 for ; Fri, 15 Jan 2021 09:09:25 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=pU1W6eF+5acoXg0XrMV/kYBFP7oiRoIxFmy0y+UKd/k=; b=MRHYdiXkKe0q4r3ALlayL4hOE24KQhuij3tozRt5JdvbtmJEYGh5HXY3eRf6Xb89Dz qj/6X1ofimJ8Fxb03HyqYjdoPNjPA7xTNs8QmS3WgJuM4aQg2WALuyVVpLUghzNjnaZO jOm4hP4wsPJfTPDa01NWYzA31oSt7fmP5sTO0vcU1cSk4eSzpGLwZoQF3AcDrZqOhZyj yVmmBp6QdYTMFzov2qfwYpwWCIsOl5c3znju2GU+0K9BDugZbbcDtVq0Jzf0U2BIgKpk UqVgDb0fmlHEDLC1GGVXvK10luFcVTAWqp57gUfsi6oYb385SkZ5L0Zrv4MfPNMS+aFi QlaA== X-Gm-Message-State: AOAM533huQuiscTw4nor2S3sJ/Q2e0rKB5n1Flr9Ev2Dk6Nc5jLFoq/q X92zKrxm1GTUdz2ByepNUpARVFofKbmFlrVUAqQ906IUP/rOq8lnQ/EtzddxbDvsxhmnkXDSXC0 9FT10kPKxcP0= X-Received: by 2002:a0c:bd2b:: with SMTP id m43mr12934476qvg.32.1610730564948; Fri, 15 Jan 2021 09:09:24 -0800 (PST) X-Google-Smtp-Source: ABdhPJynSq2SBFF8P8zjh3jGjKduQkRr9w+ZGRCy7Dj1mcGpq2w7EdvE3LxZusU0Nl55C86yHtpjRA== X-Received: by 2002:a0c:bd2b:: with SMTP id m43mr12934439qvg.32.1610730564546; Fri, 15 Jan 2021 09:09:24 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.09.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:09:23 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . 
Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 08/30] shmem/userfaultfd: Handle uffd-wp special pte in page fault handler Date: Fri, 15 Jan 2021 12:08:45 -0500 Message-Id: <20210115170907.24498-9-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: File-backed memories are prone to unmap/swap so the ptes are always unstable. This could lead to userfaultfd-wp information got lost when unmapped or swapped out on such types of memory, for example, shmem. To keep such an information persistent, we will start to use the newly introduced swap-like special ptes to replace a null pte when those ptes were removed. Prepare this by handling such a special pte first before it is applied. Here a new fault flag FAULT_FLAG_UFFD_WP is introduced. When this flag is set, it means the current fault is to resolve a page access (either read or write) to the uffd-wp special pte. The handling of this special pte page fault is similar to missing fault, but it should happen after the pte missing logic since the special pte is designed to be a swap-like pte. Meanwhile it should be handled before do_swap_page() so that the swap core logic won't be confused to see such an illegal swap pte. This is a slow path of uffd-wp handling, because unmap of wr-protected shmem ptes should be rare. So far it should only trigger in two conditions: (1) When trying to punch holes in shmem_fallocate(), there will be a pre-unmap optimization before evicting the page. That will create unmapped shmem ptes with wr-protected pages covered. (2) Swapping out of shmem pages Because of this, the page fault handling is simplifed too by always assuming it's a read fault when calling do_fault(). When it's a write fault, it'll fault again when retry the page access, then do_wp_page() will handle the rest of message generation and delivery to the userfaultfd. Disable fault-around for such a special page fault, because the introduced new flag (FAULT_FLAG_UFFD_WP) only applies to current pte rather than all the pages around it. Doing fault-around with the new flag could confuse all the rest of pages when installing ptes from page cache when there's a cache hit. Signed-off-by: Peter Xu --- include/linux/mm.h | 2 + mm/memory.c | 107 +++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 105 insertions(+), 4 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index db6ae4d3fb4e..85d928764b64 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -426,6 +426,7 @@ extern pgprot_t protection_map[16]; * @FAULT_FLAG_REMOTE: The fault is not for current task/mm. * @FAULT_FLAG_INSTRUCTION: The fault was during an instruction fetch. * @FAULT_FLAG_INTERRUPTIBLE: The fault can be interrupted by non-fatal signals. + * @FAULT_FLAG_UFFD_WP: When install new page entries, set uffd-wp bit. 
* * About @FAULT_FLAG_ALLOW_RETRY and @FAULT_FLAG_TRIED: we can specify * whether we would allow page faults to retry by specifying these two @@ -456,6 +457,7 @@ extern pgprot_t protection_map[16]; #define FAULT_FLAG_REMOTE 0x80 #define FAULT_FLAG_INSTRUCTION 0x100 #define FAULT_FLAG_INTERRUPTIBLE 0x200 +#define FAULT_FLAG_UFFD_WP 0x400 /* * The default fault flags that should be used by most of the diff --git a/mm/memory.c b/mm/memory.c index 394c2602dce7..0b687f0be4d0 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3797,6 +3797,7 @@ static vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page) { struct vm_area_struct *vma = vmf->vma; + bool pte_changed, uffd_wp = vmf->flags & FAULT_FLAG_UFFD_WP; bool write = vmf->flags & FAULT_FLAG_WRITE; pte_t entry; vm_fault_t ret; @@ -3807,14 +3808,27 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page) return ret; } + /* + * Note: besides pte missing, FAULT_FLAG_UFFD_WP could also trigger + * this path where vmf->pte got released before reaching here. In that + * case, even if vmf->pte==NULL, the pte actually still contains the + * protection pte (by pte_swp_mkuffd_wp_special()). For that case, + * we'd also like to allocate a new pte like pte none, but check + * differently for changing pte. + */ if (!vmf->pte) { ret = pte_alloc_one_map(vmf); if (ret) return ret; } + if (unlikely(uffd_wp)) + pte_changed = !pte_swp_uffd_wp_special(*vmf->pte); + else + pte_changed = !pte_none(*vmf->pte); + /* Re-check under ptl */ - if (unlikely(!pte_none(*vmf->pte))) { + if (unlikely(pte_changed)) { update_mmu_tlb(vma, vmf->address, vmf->pte); return VM_FAULT_NOPAGE; } @@ -3824,6 +3838,11 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page) entry = pte_sw_mkyoung(entry); if (write) entry = maybe_mkwrite(pte_mkdirty(entry), vma); + if (uffd_wp) { + /* This should only be triggered by a read fault */ + WARN_ON_ONCE(write); + entry = pte_mkuffd_wp(pte_wrprotect(entry)); + } /* copy-on-write page */ if (write && !(vma->vm_flags & VM_SHARED)) { inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES); @@ -3997,9 +4016,27 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf) return ret; } +/* Return true if we should do read fault-around, false otherwise */ +static inline bool should_fault_around(struct vm_fault *vmf) +{ + /* No ->map_pages? No way to fault around... */ + if (!vmf->vma->vm_ops->map_pages) + return false; + + /* + * Don't do fault around for FAULT_FLAG_UFFD_WP because it means we + * want to recover a previously wr-protected pte. This flag is a + * per-pte information, so it could confuse all the pages around the + * current page when faulted in. Give up on that quickly. + */ + if (vmf->flags & FAULT_FLAG_UFFD_WP) + return false; + + return fault_around_bytes >> PAGE_SHIFT > 1; +} + static vm_fault_t do_read_fault(struct vm_fault *vmf) { - struct vm_area_struct *vma = vmf->vma; vm_fault_t ret = 0; /* @@ -4007,7 +4044,7 @@ static vm_fault_t do_read_fault(struct vm_fault *vmf) * if page by the offset is not ready to be mapped (cold cache or * something). 
*/ - if (vma->vm_ops->map_pages && fault_around_bytes >> PAGE_SHIFT > 1) { + if (should_fault_around(vmf)) { ret = do_fault_around(vmf); if (ret) return ret; @@ -4322,6 +4359,68 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud) return VM_FAULT_FALLBACK; } +static vm_fault_t uffd_wp_clear_special(struct vm_fault *vmf) +{ + vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd, + vmf->address, &vmf->ptl); + /* + * Be careful so that we will only recover a special uffd-wp pte into a + * none pte. Otherwise it means the pte could have changed, so retry. + */ + if (pte_swp_uffd_wp_special(*vmf->pte)) + pte_clear(vmf->vma->vm_mm, vmf->address, vmf->pte); + pte_unmap_unlock(vmf->pte, vmf->ptl); + return 0; +} + +/* + * This is actually a page-missing access, but with uffd-wp special pte + * installed. It means this pte was wr-protected before being unmapped. + */ +vm_fault_t uffd_wp_handle_special(struct vm_fault *vmf) +{ + /* Careful! vmf->pte unmapped after return */ + if (!pte_unmap_same(vmf)) + return 0; + + /* + * Just in case there're leftover special ptes even after the region + * got unregistered - we can simply clear them. + */ + if (unlikely(!userfaultfd_wp(vmf->vma) || vma_is_anonymous(vmf->vma))) + return uffd_wp_clear_special(vmf); + + /* + * Tell all the rest of the fault code: we're handling a special pte, + * always remember to arm the uffd-wp bit when intalling the new pte. + */ + vmf->flags |= FAULT_FLAG_UFFD_WP; + + /* + * Let's assume this is a read fault no matter what. If it is a real + * write access, it'll fault again into do_wp_page() where the message + * will be generated before the thread yields itself. + * + * Ideally we can also handle write immediately before return, but this + * should be a slow path already (pte unmapped), so be simple first. + */ + vmf->flags &= ~FAULT_FLAG_WRITE; + + return do_fault(vmf); +} + +static vm_fault_t do_swap_pte(struct vm_fault *vmf) +{ + /* + * We need to handle special swap ptes before handling ptes that + * contain swap entries, always. 
+ */ + if (unlikely(pte_swp_uffd_wp_special(vmf->orig_pte))) + return uffd_wp_handle_special(vmf); + + return do_swap_page(vmf); +} + /* * These routines also need to handle stuff like marking pages dirty * and/or accessed for architectures that don't do it in hardware (most @@ -4385,7 +4484,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) } if (!pte_present(vmf->orig_pte)) - return do_swap_page(vmf); + return do_swap_pte(vmf); if (pte_protnone(vmf->orig_pte) && vma_is_accessible(vmf->vma)) return do_numa_page(vmf); From patchwork Fri Jan 15 17:08:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023327 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1BCFDC433DB for ; Fri, 15 Jan 2021 17:09:38 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BF8C12333E for ; Fri, 15 Jan 2021 17:09:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org BF8C12333E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B009A8D01A3; Fri, 15 Jan 2021 12:09:32 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id AB6678D019D; Fri, 15 Jan 2021 12:09:32 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 840CA8D01A3; Fri, 15 Jan 2021 12:09:32 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0007.hostedemail.com [216.40.44.7]) by kanga.kvack.org (Postfix) with ESMTP id 699BC8D019D for ; Fri, 15 Jan 2021 12:09:32 -0500 (EST) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 111F383F4 for ; Fri, 15 Jan 2021 17:09:32 +0000 (UTC) X-FDA: 77708645784.12.face47_61150cb27531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin12.hostedemail.com (Postfix) with ESMTP id DF64F181070D7 for ; Fri, 15 Jan 2021 17:09:31 +0000 (UTC) X-HE-Tag: face47_61150cb27531 X-Filterd-Recvd-Size: 6694 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf33.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:09:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730570; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=a0JYz3HBZ34RwENyJxX7OSGITuFKgQMNHlEyQKe2PSI=; b=B5BzBF7h2ZQC+D6/C9oIeMk98o7SoryHYM9TMRjOZ/qx+70AySuozs/fCVN2XQ/76ra4PX TE8UVlTab5p/99rH7jjxOMEA+sn/KPKcxfby73IWrys4PvJvrMv7kOtUCf56yQ2ZsXQoOh ZTdbK3cF4Qj4BoUvT6OeGw1Z3Z/XyVI= Received: from mail-qk1-f199.google.com 
(mail-qk1-f199.google.com [209.85.222.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-425-euAPppEEMaOgqCp9xkfblQ-1; Fri, 15 Jan 2021 12:09:28 -0500 X-MC-Unique: euAPppEEMaOgqCp9xkfblQ-1 Received: by mail-qk1-f199.google.com with SMTP id f27so8614075qkh.0 for ; Fri, 15 Jan 2021 09:09:28 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=a0JYz3HBZ34RwENyJxX7OSGITuFKgQMNHlEyQKe2PSI=; b=l2MRR5RMn4yoNmkrfBCif/2BFaeBKLRboZ/4Sqjwem0oMRNzdruCDgZl4OtevFDnME xHqV2hTBLCMNSZUQb6NtIEZ9lWSa9B3TWeKiOoOgIKPG6QbX552WMW/ZoD6Q6ztVfCtR mZ9/r0r3tGBkjWWNTWrowUbqBMIy4QSYSekrjaDd2kuKjiuDHEAu9L2cwpxIw31k3R0o NFLDCQGHoW6X8lCxmB9E2s5bLKSB4hOY0rIHse2e8sQ3+cmGcXE4gPYKg9LYZU5/5a5R T/Tf45G2oA7TOufk7JpBbqwRyqfSKdVF1y6rtJY45lawC/PjElmrynXgVw/TlzWBglsV Ro5Q== X-Gm-Message-State: AOAM531zhq1hKsCPTnNSn8ebUPblYOmjeT03QB3TOEFTh/kj4sdi6r1u FVWyoQwTKwDJNo10kFLdbQe6yjilRM+d5L6b37YWRyHX/0rkbz0PCj2kjDN08zTD4m4o5oiftCT 7lvbGVN5kzow= X-Received: by 2002:a05:6214:533:: with SMTP id x19mr13215469qvw.20.1610730567637; Fri, 15 Jan 2021 09:09:27 -0800 (PST) X-Google-Smtp-Source: ABdhPJwjUXdkNSM6Ii2VUn7NnUVKuMeCISUxdtKhPGypjO2PwOsx14vwE6XLqQeJD6if0ENarzbWBQ== X-Received: by 2002:a05:6214:533:: with SMTP id x19mr13215453qvw.20.1610730567461; Fri, 15 Jan 2021 09:09:27 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.09.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:09:26 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 09/30] mm: Drop first_index/last_index in zap_details Date: Fri, 15 Jan 2021 12:08:46 -0500 Message-Id: <20210115170907.24498-10-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The first_index/last_index parameters in zap_details are actually only used in unmap_mapping_range_tree(). At the meantime, this function is only called by unmap_mapping_pages() once. Instead of passing these two variables through the whole stack of page zapping code, remove them from zap_details and let them simply be parameters of unmap_mapping_range_tree(), which is inlined. 
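As a quick illustration (not part of the patch; the numbers are made up) of what unmap_mapping_range_tree() does with the two indices once they arrive as plain parameters, consider a vma mapping file pages 100..199 and a request to unmap file pages 150..400; the per-vma clamping works out as in the sketch below.

#include <linux/types.h>	/* pgoff_t; sketch only, mirrors the loop in mm/memory.c */

static void clamp_example(void)
{
	pgoff_t first_index = 150, last_index = 400;	/* the zap request          */
	pgoff_t vba = 100;				/* vma->vm_pgoff            */
	pgoff_t vea = vba + 100 - 1;			/* vba + vma_pages(vma) - 1 */
	pgoff_t zba = first_index, zea = last_index;

	if (zba < vba)
		zba = vba;	/* stays 150: the request starts inside this vma */
	if (zea > vea)
		zea = vea;	/* clamped to 199: the request runs past this vma */

	/* unmap_mapping_range_vma() would then cover file pages 150..199 only */
}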
Signed-off-by: Peter Xu --- include/linux/mm.h | 2 -- mm/memory.c | 20 ++++++++++---------- 2 files changed, 10 insertions(+), 12 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 85d928764b64..faf9538c13b2 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1635,8 +1635,6 @@ extern void user_shm_unlock(size_t, struct user_struct *); */ struct zap_details { struct address_space *check_mapping; /* Check page->mapping if set */ - pgoff_t first_index; /* Lowest page->index to unmap */ - pgoff_t last_index; /* Highest page->index to unmap */ }; struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr, diff --git a/mm/memory.c b/mm/memory.c index 0b687f0be4d0..dd49dea276e3 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3145,20 +3145,20 @@ static void unmap_mapping_range_vma(struct vm_area_struct *vma, } static inline void unmap_mapping_range_tree(struct rb_root_cached *root, + pgoff_t first_index, + pgoff_t last_index, struct zap_details *details) { struct vm_area_struct *vma; pgoff_t vba, vea, zba, zea; - vma_interval_tree_foreach(vma, root, - details->first_index, details->last_index) { - + vma_interval_tree_foreach(vma, root, first_index, last_index) { vba = vma->vm_pgoff; vea = vba + vma_pages(vma) - 1; - zba = details->first_index; + zba = first_index; if (zba < vba) zba = vba; - zea = details->last_index; + zea = last_index; if (zea > vea) zea = vea; @@ -3184,17 +3184,17 @@ static inline void unmap_mapping_range_tree(struct rb_root_cached *root, void unmap_mapping_pages(struct address_space *mapping, pgoff_t start, pgoff_t nr, bool even_cows) { + pgoff_t first_index = start, last_index = start + nr - 1; struct zap_details details = { }; details.check_mapping = even_cows ? NULL : mapping; - details.first_index = start; - details.last_index = start + nr - 1; - if (details.last_index < details.first_index) - details.last_index = ULONG_MAX; + if (last_index < first_index) + last_index = ULONG_MAX; i_mmap_lock_write(mapping); if (unlikely(!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))) - unmap_mapping_range_tree(&mapping->i_mmap, &details); + unmap_mapping_range_tree(&mapping->i_mmap, first_index, + last_index, &details); i_mmap_unlock_write(mapping); } From patchwork Fri Jan 15 17:08:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023329 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA46EC433DB for ; Fri, 15 Jan 2021 17:09:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 88A562333E for ; Fri, 15 Jan 2021 17:09:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 88A562333E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 3B3478D019D; Fri, 15 Jan 2021 12:09:33 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2C4D08D01A4; Fri, 
15 Jan 2021 12:09:33 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 13A4B8D019D; Fri, 15 Jan 2021 12:09:33 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0026.hostedemail.com [216.40.44.26]) by kanga.kvack.org (Postfix) with ESMTP id EE0988D01A4 for ; Fri, 15 Jan 2021 12:09:32 -0500 (EST) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id B396A82EB555 for ; Fri, 15 Jan 2021 17:09:32 +0000 (UTC) X-FDA: 77708645784.07.tin86_12020b527531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin07.hostedemail.com (Postfix) with ESMTP id 88DC7180388F0 for ; Fri, 15 Jan 2021 17:09:32 +0000 (UTC) X-HE-Tag: tin86_12020b527531 X-Filterd-Recvd-Size: 8472 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf03.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:09:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730571; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7zNMLI73E9rvmIG628SfisA+2GLXaIOPoACKwF1DQC4=; b=TaRXdDUHhVus38f99186UyMSCsmz94ZVrgcgenijVzbA8uzmX4snsqg2i0rInZwQUlt16z DWj4nUnODp6nBe7lXZ448wmLxAjuvKs2omCvbf13SnNVB/WkWzOhQ+VmW5CFZHaNUJ+sJS eoFnz9uRVgx2pzXnzI5qDhArt1zLKLo= Received: from mail-qk1-f199.google.com (mail-qk1-f199.google.com [209.85.222.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-507-_Samc0W1Mda10T8bkJivlQ-1; Fri, 15 Jan 2021 12:09:29 -0500 X-MC-Unique: _Samc0W1Mda10T8bkJivlQ-1 Received: by mail-qk1-f199.google.com with SMTP id f27so8614148qkh.0 for ; Fri, 15 Jan 2021 09:09:29 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=7zNMLI73E9rvmIG628SfisA+2GLXaIOPoACKwF1DQC4=; b=b26zZXqyaCvJAuR25qVdFdo4Jax0M/OGSTaLvwz01YyExAGg0VG7txLKvIvRFDygny SGYFiXMvfdkzvICzkTUCkqMECbZApzDqnmsy5dvIfJonJfKVbWupcWmjNOnkU0JhZArD 60Wjbyjvy6/EnEjSmAkHNAH1VNlmGjE8qNlIagHkHVO2hg2OQSj9miBvll/9JUiLcMxy lk9dOWyUeVAI6QV0LCG9qzymYRMriHga+bRG4hZbaXy+zSHyBpIAKkmPDR2VtbSV8aMC lhM6N7hEipFM4YWe8Tt0uQNzCmZ87z25r8vB5e3qRYnGVMB423c4ONSp616zDpTaJsk3 /qSw== X-Gm-Message-State: AOAM5312A10KlC0p5SgcKEJ4+HIwTrdDAtBKTWEEwERshvnfHp3c9vB2 +lH+cBUki3kSDAZwhUtKpH5MJAUhYE4o62YV1NVAeP2f1hKjvrCGHVQ/iRJxoDi+lBBxcXvN1mJ xU7LkmcBhbuk= X-Received: by 2002:ae9:ebd5:: with SMTP id b204mr13052856qkg.195.1610730569373; Fri, 15 Jan 2021 09:09:29 -0800 (PST) X-Google-Smtp-Source: ABdhPJyUyy20s6HzED7NEtKWLEp73vXuBFzUuQgA1QzITMcyJTdxXNmC/+9kNX4RpSO4M94ywIGwUg== X-Received: by 2002:ae9:ebd5:: with SMTP id b204mr13052827qkg.195.1610730569089; Fri, 15 Jan 2021 09:09:29 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.09.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:09:28 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . 
Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 10/30] mm: Introduce zap_details.zap_flags Date: Fri, 15 Jan 2021 12:08:47 -0500 Message-Id: <20210115170907.24498-11-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Instead of trying to introduce one variable for every new zap_details fields, let's introduce a flag so that it can start to encode true/false informations. Let's start to use this flag first to clean up the only check_mapping variable. Firstly, the name "check_mapping" implies this is a "boolean", but actually it stores the mapping inside, just in a way that it won't be set if we don't want to check the mapping. To make things clearer, introduce the 1st zap flag ZAP_FLAG_CHECK_MAPPING, so that we only check against the mapping if this bit set. At the same time, we can rename check_mapping into zap_mapping and set it always. Since at it, introduce another helper zap_check_mapping_skip() and use it in zap_pte_range() properly. Some old comments have been removed in zap_pte_range() because they're duplicated, and since now we're with ZAP_FLAG_CHECK_MAPPING flag, it'll be very easy to grep this information by simply grepping the flag. It'll also make life easier when we want to e.g. pass in zap_flags into the callers like unmap_mapping_pages() (instead of adding new booleans besides the even_cows parameter). Signed-off-by: Peter Xu --- include/linux/mm.h | 19 ++++++++++++++++++- mm/memory.c | 31 ++++++++----------------------- 2 files changed, 26 insertions(+), 24 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index faf9538c13b2..2380e1df6a49 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1630,13 +1630,30 @@ static inline bool can_do_mlock(void) { return false; } extern int user_shm_lock(size_t, struct user_struct *); extern void user_shm_unlock(size_t, struct user_struct *); +/* Whether to check page->mapping when zapping */ +#define ZAP_FLAG_CHECK_MAPPING BIT(0) + /* * Parameter block passed down to zap_pte_range in exceptional cases. 
*/ struct zap_details { - struct address_space *check_mapping; /* Check page->mapping if set */ + struct address_space *zap_mapping; /* Check page->mapping if set */ + unsigned long zap_flags; /* Special flags for zapping */ }; +/* Return true if skip zapping this page, false otherwise */ +static inline bool +zap_check_mapping_skip(struct zap_details *details, struct page *page) +{ + if (!details || !page) + return false; + + if (!(details->zap_flags & ZAP_FLAG_CHECK_MAPPING)) + return false; + + return details->zap_mapping != page_rmapping(page); +} + struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr, pte_t pte); struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr, diff --git a/mm/memory.c b/mm/memory.c index dd49dea276e3..43d8641dbe18 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1226,16 +1226,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, struct page *page; page = vm_normal_page(vma, addr, ptent); - if (unlikely(details) && page) { - /* - * unmap_shared_mapping_pages() wants to - * invalidate cache without truncating: - * unmap shared but keep private pages. - */ - if (details->check_mapping && - details->check_mapping != page_rmapping(page)) - continue; - } + if (unlikely(zap_check_mapping_skip(details, page))) + continue; ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); tlb_remove_tlb_entry(tlb, pte, addr); @@ -1267,17 +1259,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, if (is_device_private_entry(entry)) { struct page *page = device_private_entry_to_page(entry); - if (unlikely(details && details->check_mapping)) { - /* - * unmap_shared_mapping_pages() wants to - * invalidate cache without truncating: - * unmap shared but keep private pages. - */ - if (details->check_mapping != - page_rmapping(page)) - continue; - } - + if (unlikely(zap_check_mapping_skip(details, page))) + continue; pte_clear_not_present_full(mm, addr, pte, tlb->fullmm); rss[mm_counter(page)]--; page_remove_rmap(page, false); @@ -3185,9 +3168,11 @@ void unmap_mapping_pages(struct address_space *mapping, pgoff_t start, pgoff_t nr, bool even_cows) { pgoff_t first_index = start, last_index = start + nr - 1; - struct zap_details details = { }; + struct zap_details details = { .zap_mapping = mapping }; + + if (!even_cows) + details.zap_flags |= ZAP_FLAG_CHECK_MAPPING; - details.check_mapping = even_cows ? 
NULL : mapping; if (last_index < first_index) last_index = ULONG_MAX; From patchwork Fri Jan 15 17:08:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023331 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BCA3BC433E0 for ; Fri, 15 Jan 2021 17:09:43 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6D84C2333E for ; Fri, 15 Jan 2021 17:09:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6D84C2333E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 11A258D01A5; Fri, 15 Jan 2021 12:09:38 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0CC9D8D01A4; Fri, 15 Jan 2021 12:09:38 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E0E8C8D01A5; Fri, 15 Jan 2021 12:09:37 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0101.hostedemail.com [216.40.44.101]) by kanga.kvack.org (Postfix) with ESMTP id C82098D01A4 for ; Fri, 15 Jan 2021 12:09:37 -0500 (EST) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 72A171818FCD3 for ; Fri, 15 Jan 2021 17:09:37 +0000 (UTC) X-FDA: 77708645994.17.fifth69_4105d4027531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin17.hostedemail.com (Postfix) with ESMTP id 3E99818103F2A for ; Fri, 15 Jan 2021 17:09:37 +0000 (UTC) X-HE-Tag: fifth69_4105d4027531 X-Filterd-Recvd-Size: 6823 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf47.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:09:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730576; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zJk71zqOOD6fILHtrWgSqq8ioKD+WcJBLRfGm+Kflhs=; b=UBYtPL2O2mK8ODK1gWMvLIz0nybOPg4LlPdtGbquO9QD5e3PgCdc7f6Qt2H17ZUHwSQ8Jx NsirZ1hhvq0BunFlQT7OlEe4Y8XyVI83NttG/HxXOtrEVe/gTMAwNbaa9ovN8zezaFIiXs gNQmSlT7QkYN0oK+uEUxyi/4ki6uUXI= Received: from mail-qk1-f199.google.com (mail-qk1-f199.google.com [209.85.222.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-359-XKfV1pzWMGi4C1e_XCEXjA-1; Fri, 15 Jan 2021 12:09:32 -0500 X-MC-Unique: XKfV1pzWMGi4C1e_XCEXjA-1 Received: by mail-qk1-f199.google.com with SMTP id 189so8626585qko.1 for ; Fri, 15 Jan 2021 09:09:32 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=zJk71zqOOD6fILHtrWgSqq8ioKD+WcJBLRfGm+Kflhs=; b=W0NbDyBJjwf12d+jswLFT6Lx3cFWT19grxccAQy+M0Oh/mNiWApvujTVSdAou6RG/M GcQVrDMv2sRODDuLHN/A72yg5zM4/cshf2uPr2yUTEzGP8YvMuKvwPk2r0IX4k40h5ed nyjh67a7eVKqC8at9NngjisTC8S5GPZchioOT12aT5k6+mZ4FdD5iT2jQVogIUWIjcxE WWica+HRYXwImhCtOQSu7tzwHaLoKg33LRuH0VRez0c/1j4M6rkJ1jHaFbFVStIS41ib fvRoIk8Mk/L7npAoRVDRpwGWAF669/omv+GQ5+ZMLXN3CjSY5S6NpB9p4BNoxlkwN1RG Piyg== X-Gm-Message-State: AOAM5322yy5cMVLavxxXxBo4HYFndebjJkkGuUJr1LKHWJvi6UvgGdo/ oK9Ft5qyQ/IIHynvakIrRUAs/UZWKLIKO7egpIZsjMiZ0O33x2z//JisK4dbYMhLdwyfwFqMki/ NJLDHceE2iZ0= X-Received: by 2002:ad4:58c2:: with SMTP id dh2mr13333735qvb.4.1610730571897; Fri, 15 Jan 2021 09:09:31 -0800 (PST) X-Google-Smtp-Source: ABdhPJx/ZGLVCHpITAQmxUytH5fcvrbscIS0jfSoJpLhoJN9VFYKioBq4OLm6JnuowYdUKczUKfJBw== X-Received: by 2002:ad4:58c2:: with SMTP id dh2mr13333709qvb.4.1610730571685; Fri, 15 Jan 2021 09:09:31 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.09.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:09:30 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 11/30] mm: Introduce ZAP_FLAG_SKIP_SWAP Date: Fri, 15 Jan 2021 12:08:48 -0500 Message-Id: <20210115170907.24498-12-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Firstly, the comment in zap_pte_range() is misleading because it checks against details rather than check_mappings, so it's against what the code did. Meanwhile, it's confusing too on not explaining why passing in the details pointer would mean to skip all swap entries. New user of zap_details could very possibly miss this fact if they don't read deep until zap_pte_range() because there's no comment at zap_details talking about it at all, so swap entries could be errornously skipped without being noticed. This partly reverts 3e8715fdc03e ("mm: drop zap_details::check_swap_entries"), but introduce ZAP_FLAG_SKIP_SWAP flag, which means the opposite of previous "details" parameter: the caller should explicitly set this to skip swap entries, otherwise swap entries will always be considered (which is still the major case here). Cc: Kirill A. 
Shutemov Signed-off-by: Peter Xu --- include/linux/mm.h | 12 ++++++++++++ mm/memory.c | 8 +++++--- 2 files changed, 17 insertions(+), 3 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 2380e1df6a49..0b1d04404275 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1632,6 +1632,8 @@ extern void user_shm_unlock(size_t, struct user_struct *); /* Whether to check page->mapping when zapping */ #define ZAP_FLAG_CHECK_MAPPING BIT(0) +/* Whether to skip zapping swap entries */ +#define ZAP_FLAG_SKIP_SWAP BIT(1) /* * Parameter block passed down to zap_pte_range in exceptional cases. @@ -1654,6 +1656,16 @@ zap_check_mapping_skip(struct zap_details *details, struct page *page) return details->zap_mapping != page_rmapping(page); } +/* Return true if skip swap entries, false otherwise */ +static inline bool +zap_skip_swap(struct zap_details *details) +{ + if (!details) + return false; + + return details->zap_flags & ZAP_FLAG_SKIP_SWAP; +} + struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr, pte_t pte); struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr, diff --git a/mm/memory.c b/mm/memory.c index 43d8641dbe18..873b2515e187 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1268,8 +1268,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, continue; } - /* If details->check_mapping, we leave swap entries. */ - if (unlikely(details)) + if (unlikely(zap_skip_swap(details))) continue; if (!non_swap_entry(entry)) @@ -3168,7 +3167,10 @@ void unmap_mapping_pages(struct address_space *mapping, pgoff_t start, pgoff_t nr, bool even_cows) { pgoff_t first_index = start, last_index = start + nr - 1; - struct zap_details details = { .zap_mapping = mapping }; + struct zap_details details = { + .zap_mapping = mapping, + .zap_flags = ZAP_FLAG_SKIP_SWAP, + }; if (!even_cows) details.zap_flags |= ZAP_FLAG_CHECK_MAPPING; From patchwork Fri Jan 15 17:08:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023337 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8008EC433E0 for ; Fri, 15 Jan 2021 17:09:52 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 278972333E for ; Fri, 15 Jan 2021 17:09:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 278972333E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id F212B8D01A8; Fri, 15 Jan 2021 12:09:42 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E80AB8D01A4; Fri, 15 Jan 2021 12:09:42 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D21DE8D01A8; Fri, 15 Jan 2021 12:09:42 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0222.hostedemail.com [216.40.44.222]) by 
kanga.kvack.org (Postfix) with ESMTP id BC6DF8D01A4 for ; Fri, 15 Jan 2021 12:09:42 -0500 (EST) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 8293E9407 for ; Fri, 15 Jan 2021 17:09:42 +0000 (UTC) X-FDA: 77708646204.06.doll36_540bfd027531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin06.hostedemail.com (Postfix) with ESMTP id 5A19A100A11C8 for ; Fri, 15 Jan 2021 17:09:42 +0000 (UTC) X-HE-Tag: doll36_540bfd027531 X-Filterd-Recvd-Size: 10579 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by imf38.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:09:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730576; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=owFz0J6uPi+bEZ5VVUVLD8wcRysHu0om3GiZkucTTw0=; b=Imyy+HwboiYruQhWPFcruKpSX2pUIjHlZWJuSUYV95/S8cCQRGrDLPklCbCCjwsAQRp7Qg 0bhqH0P8p0nZq+fFz3sLdfEwHS76RHQpw9EAGb1V8/qX8OKahSRVTQTgOBerj6wr28J/xx 4OtHY8YMbIBpg0HtMsph38uTMCQZw5I= Received: from mail-qv1-f69.google.com (mail-qv1-f69.google.com [209.85.219.69]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-546-NIvIIW_XPquESK9N3zYNnw-1; Fri, 15 Jan 2021 12:09:35 -0500 X-MC-Unique: NIvIIW_XPquESK9N3zYNnw-1 Received: by mail-qv1-f69.google.com with SMTP id j24so8265668qvg.8 for ; Fri, 15 Jan 2021 09:09:34 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=owFz0J6uPi+bEZ5VVUVLD8wcRysHu0om3GiZkucTTw0=; b=uVm2miTmsfKGWoDeXOB/bSK112S9RPiJIC2zVhc/wqAhrBH0j+40jC4z9NjLZNh8L2 rhmiGOzRm6sCBJ3b4PYuYURnd2VcD0O5nRqPMuaHQxlsQzvsgRKtRMRpXPXfESJ0K5CD moq0/iZ8iBOxlXGqSAbWZ6h2+WFixdHvQwgsMzeXg8Z1bY0qC2xJb4xtiWauFrhHgzjA oM8ftHxBNM9BTYfz5E2wSm4qj/k5To888mCPlW5p28oR7QVEOdZ5Kdv4eEXd43s6B/w6 et8LctfLV5waPZKmMefebkQLJsUeJ5iyxNoHXRgc9b6xL2aN9R1HNi6VQ635O6yHrDmQ aWzA== X-Gm-Message-State: AOAM53074T2pApwHXEGXoBdivWZjcDLRzA210HVGKc4JXqeG0WwTf25s nNd3uDqATj+o+4JlzSdM6rH1iMZe2HnSmqXoukOfPodRIa6XuDhdZpQTgZVlK8vWa0/5auaEAOq Q5E4bT8g+FVs= X-Received: by 2002:a05:6214:b82:: with SMTP id fe2mr13163938qvb.3.1610730573630; Fri, 15 Jan 2021 09:09:33 -0800 (PST) X-Google-Smtp-Source: ABdhPJxCJFBxV0A3cvs+bbtsL/5sswye4yXD71jk13j8IwVrx/u1KMNc/UbMaGnP6Gb/A3uNc8DhIA== X-Received: by 2002:a05:6214:b82:: with SMTP id fe2mr13163910qvb.3.1610730573340; Fri, 15 Jan 2021 09:09:33 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.09.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:09:32 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . 
Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 12/30] mm: Pass zap_flags into unmap_mapping_pages() Date: Fri, 15 Jan 2021 12:08:49 -0500 Message-Id: <20210115170907.24498-13-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Give unmap_mapping_pages() more power by allowing to specify a zap flag so that it can pass in more information than "whether we'd also like to zap cow pages". With the new flag, we can remove the even_cow parameter because even_cow==false equals to zap_flags==ZAP_FLAG_CHECK_MAPPING, while even_cow==true means a none zap flag to pass in (though in most cases we have had even_cow==false). No functional change intended. Signed-off-by: Peter Xu --- fs/dax.c | 10 ++++++---- include/linux/mm.h | 4 ++-- mm/khugepaged.c | 3 ++- mm/memory.c | 15 ++++++++------- mm/truncate.c | 11 +++++++---- 5 files changed, 25 insertions(+), 18 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 5b47834f2e1b..6a123c2bfc59 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -517,7 +517,7 @@ static void *grab_mapping_entry(struct xa_state *xas, xas_unlock_irq(xas); unmap_mapping_pages(mapping, xas->xa_index & ~PG_PMD_COLOUR, - PG_PMD_NR, false); + PG_PMD_NR, ZAP_FLAG_CHECK_MAPPING); xas_reset(xas); xas_lock_irq(xas); } @@ -612,7 +612,8 @@ struct page *dax_layout_busy_page_range(struct address_space *mapping, * guaranteed to either see new references or prevent new * references from being established. 
*/ - unmap_mapping_pages(mapping, start_idx, end_idx - start_idx + 1, 0); + unmap_mapping_pages(mapping, start_idx, end_idx - start_idx + 1, + ZAP_FLAG_CHECK_MAPPING); xas_lock_irq(&xas); xas_for_each(&xas, entry, end_idx) { @@ -743,9 +744,10 @@ static void *dax_insert_entry(struct xa_state *xas, /* we are replacing a zero page with block mapping */ if (dax_is_pmd_entry(entry)) unmap_mapping_pages(mapping, index & ~PG_PMD_COLOUR, - PG_PMD_NR, false); + PG_PMD_NR, ZAP_FLAG_CHECK_MAPPING); else /* pte entry */ - unmap_mapping_pages(mapping, index, 1, false); + unmap_mapping_pages(mapping, index, 1, + ZAP_FLAG_CHECK_MAPPING); } xas_reset(xas); diff --git a/include/linux/mm.h b/include/linux/mm.h index 0b1d04404275..57bb3d680844 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1710,7 +1710,7 @@ extern int fixup_user_fault(struct mm_struct *mm, unsigned long address, unsigned int fault_flags, bool *unlocked); void unmap_mapping_pages(struct address_space *mapping, - pgoff_t start, pgoff_t nr, bool even_cows); + pgoff_t start, pgoff_t nr, unsigned long zap_flags); void unmap_mapping_range(struct address_space *mapping, loff_t const holebegin, loff_t const holelen, int even_cows); #else @@ -1730,7 +1730,7 @@ static inline int fixup_user_fault(struct mm_struct *mm, unsigned long address, return -EFAULT; } static inline void unmap_mapping_pages(struct address_space *mapping, - pgoff_t start, pgoff_t nr, bool even_cows) { } + pgoff_t start, pgoff_t nr, unsigned long zap_flags) { } static inline void unmap_mapping_range(struct address_space *mapping, loff_t const holebegin, loff_t const holelen, int even_cows) { } #endif diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 20807163a25f..981d7abb09ef 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1817,7 +1817,8 @@ static void collapse_file(struct mm_struct *mm, } if (page_mapped(page)) - unmap_mapping_pages(mapping, index, 1, false); + unmap_mapping_pages(mapping, index, 1, + ZAP_FLAG_CHECK_MAPPING); xas_lock_irq(&xas); xas_set(&xas, index); diff --git a/mm/memory.c b/mm/memory.c index 873b2515e187..afe09fccdee1 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3156,7 +3156,10 @@ static inline void unmap_mapping_range_tree(struct rb_root_cached *root, * @mapping: The address space containing pages to be unmapped. * @start: Index of first page to be unmapped. * @nr: Number of pages to be unmapped. 0 to unmap to end of file. - * @even_cows: Whether to unmap even private COWed pages. + * @zap_flags: Zap flags for the process. E.g., when ZAP_FLAG_CHECK_MAPPING is + * passed into it, we will only zap the pages that are in the same mapping + * specified in the @mapping parameter; otherwise we will not check mapping, + * IOW cow pages will be zapped too. * * Unmap the pages in this address space from any userspace process which * has them mmaped. Generally, you want to remove COWed pages as well when @@ -3164,17 +3167,14 @@ static inline void unmap_mapping_range_tree(struct rb_root_cached *root, * cache. 
*/ void unmap_mapping_pages(struct address_space *mapping, pgoff_t start, - pgoff_t nr, bool even_cows) + pgoff_t nr, unsigned long zap_flags) { pgoff_t first_index = start, last_index = start + nr - 1; struct zap_details details = { .zap_mapping = mapping, - .zap_flags = ZAP_FLAG_SKIP_SWAP, + .zap_flags = zap_flags | ZAP_FLAG_SKIP_SWAP, }; - if (!even_cows) - details.zap_flags |= ZAP_FLAG_CHECK_MAPPING; - if (last_index < first_index) last_index = ULONG_MAX; @@ -3216,7 +3216,8 @@ void unmap_mapping_range(struct address_space *mapping, hlen = ULONG_MAX - hba + 1; } - unmap_mapping_pages(mapping, hba, hlen, even_cows); + unmap_mapping_pages(mapping, hba, hlen, even_cows ? + 0 : ZAP_FLAG_CHECK_MAPPING); } EXPORT_SYMBOL(unmap_mapping_range); diff --git a/mm/truncate.c b/mm/truncate.c index 960edf5803ca..dac66749e400 100644 --- a/mm/truncate.c +++ b/mm/truncate.c @@ -178,7 +178,8 @@ truncate_cleanup_page(struct address_space *mapping, struct page *page) { if (page_mapped(page)) { unsigned int nr = thp_nr_pages(page); - unmap_mapping_pages(mapping, page->index, nr, false); + unmap_mapping_pages(mapping, page->index, nr, + ZAP_FLAG_CHECK_MAPPING); } if (page_has_private(page)) @@ -750,14 +751,15 @@ int invalidate_inode_pages2_range(struct address_space *mapping, * Zap the rest of the file in one hit. */ unmap_mapping_pages(mapping, index, - (1 + end - index), false); + (1 + end - index), + ZAP_FLAG_CHECK_MAPPING); did_range_unmap = 1; } else { /* * Just zap this page */ unmap_mapping_pages(mapping, index, - 1, false); + 1, ZAP_FLAG_CHECK_MAPPING); } } BUG_ON(page_mapped(page)); @@ -783,7 +785,8 @@ int invalidate_inode_pages2_range(struct address_space *mapping, * get remapped later. */ if (dax_mapping(mapping)) { - unmap_mapping_pages(mapping, start, end - start + 1, false); + unmap_mapping_pages(mapping, start, end - start + 1, + ZAP_FLAG_CHECK_MAPPING); } out: cleancache_invalidate_inode(mapping); From patchwork Fri Jan 15 17:08:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023333 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A058DC433E0 for ; Fri, 15 Jan 2021 17:09:46 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4235F2333E for ; Fri, 15 Jan 2021 17:09:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4235F2333E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 4490E8D01A6; Fri, 15 Jan 2021 12:09:41 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3FB808D01A4; Fri, 15 Jan 2021 12:09:41 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 272EA8D01A6; Fri, 15 Jan 2021 12:09:41 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0038.hostedemail.com 
[216.40.44.38]) by kanga.kvack.org (Postfix) with ESMTP id 12DF28D01A4 for ; Fri, 15 Jan 2021 12:09:41 -0500 (EST) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id CE3F5381E for ; Fri, 15 Jan 2021 17:09:40 +0000 (UTC) X-FDA: 77708646120.03.night79_611332927531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin03.hostedemail.com (Postfix) with ESMTP id A96D328A4EA for ; Fri, 15 Jan 2021 17:09:40 +0000 (UTC) X-HE-Tag: night79_611332927531 X-Filterd-Recvd-Size: 12906 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf25.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:09:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730579; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=a5KLb2lATkQ/XNiv9F7KPoJriimXjMJy8UfLukuBk5Y=; b=MjrziuEcfTD5+mwPzX9AvSQeyVvuFntoKzPSHqsbf1BpCYGChOQmeskHKBys62MJFR65nZ eC/OxS85sYZxlHK5Ddf9e5TzKCmtBKJ33mY54YdNmjrEHW9O4A2EjiCU0szfSrWkFhp7Rv yabUMlmx5AktBUqY1WVZ5Njc9CcRL7Q= Received: from mail-qv1-f72.google.com (mail-qv1-f72.google.com [209.85.219.72]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-579-S8FE7ntOMpiS6sjb3Od3Hw-1; Fri, 15 Jan 2021 12:09:36 -0500 X-MC-Unique: S8FE7ntOMpiS6sjb3Od3Hw-1 Received: by mail-qv1-f72.google.com with SMTP id j24so8265757qvg.8 for ; Fri, 15 Jan 2021 09:09:36 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=a5KLb2lATkQ/XNiv9F7KPoJriimXjMJy8UfLukuBk5Y=; b=tQIK9+KoJ0EwrfzYrD3qeaahM6Hh2UXJYTbqzIIkHoN/W6Pg7mS7qayTzYSvclvAES 5gNE76EuLivsPOP5d0MKHKS+ycikHsKuNF+zwOyVFxZI0LJt6t099koi+dDXfJZnAacr NbBu8MHnHxZ01fiAoWKyw1deXXw8W4UeesvZgW4KmqStq0UEA01wOIl7IR64Ia43y5Zm 2Iv3AglY/pQEkqUU6ThDQgakqWXK/aA1cAmSqI5dfE9ZQd8qcNFQOIeSJibXvRyzTcfx a/e3F3cs1C2bacXTH0Lrb0DRgFszVuAFlTM0MO2A5qNtpded9tYMlMZbLOyCiFkkGSnq pxyA== X-Gm-Message-State: AOAM531l3ZSEDfxIi5sA64fUYtpeAGkUhp32Y97ByFVkCmaEhE4yGp/A Mh81fhxt0xyvk/8nxy7nIcXql2tHUz6iodxkmmufacr+V/dET8qhKJxDVQZBq1nUNA5k+vVXR1+ ubi5HZUvtme0= X-Received: by 2002:ac8:396d:: with SMTP id t42mr12482420qtb.273.1610730575865; Fri, 15 Jan 2021 09:09:35 -0800 (PST) X-Google-Smtp-Source: ABdhPJwx+fHE05WbMDp5unvCgzYCvNrqmnR+hBQDHIuSFam5gLX1Ken6kJ60Sbq11jUg42je7VpxZw== X-Received: by 2002:ac8:396d:: with SMTP id t42mr12482366qtb.273.1610730575395; Fri, 15 Jan 2021 09:09:35 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.09.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:09:34 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . 
Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 13/30] shmem/userfaultfd: Persist uffd-wp bit across zapping for file-backed Date: Fri, 15 Jan 2021 12:08:50 -0500 Message-Id: <20210115170907.24498-14-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: File-backed memory is prone to being unmapped at any time. It means all information in the pte will be dropped, including the uffd-wp flag. Since the uffd-wp info cannot be stored in page cache or swap cache, persist this wr-protect information by installing the special uffd-wp marker pte when we're going to unmap a uffd wr-protected pte. When the pte is accessed again, we will know it's previously wr-protected by recognizing the special pte. Meanwhile add a new flag ZAP_FLAG_DROP_FILE_UFFD_WP when we don't want to persist such an information. For example, when destroying the whole vma, or punching a hole in a shmem file. For the latter, we can only drop the uffd-wp bit when holding the page lock. It means the unmap_mapping_range() in shmem_fallocate() still reuqires to zap without ZAP_FLAG_DROP_FILE_UFFD_WP because that's still racy with the page faults. Signed-off-by: Peter Xu --- include/linux/mm.h | 11 ++++++++++ include/linux/mm_inline.h | 43 +++++++++++++++++++++++++++++++++++++++ mm/memory.c | 42 +++++++++++++++++++++++++++++++++++++- mm/rmap.c | 8 ++++++++ mm/truncate.c | 8 +++++++- 5 files changed, 110 insertions(+), 2 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 57bb3d680844..e4aba745be62 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1634,6 +1634,8 @@ extern void user_shm_unlock(size_t, struct user_struct *); #define ZAP_FLAG_CHECK_MAPPING BIT(0) /* Whether to skip zapping swap entries */ #define ZAP_FLAG_SKIP_SWAP BIT(1) +/* Whether to completely drop uffd-wp entries for file-backed memory */ +#define ZAP_FLAG_DROP_FILE_UFFD_WP BIT(2) /* * Parameter block passed down to zap_pte_range in exceptional cases. @@ -1666,6 +1668,15 @@ zap_skip_swap(struct zap_details *details) return details->zap_flags & ZAP_FLAG_SKIP_SWAP; } +static inline bool +zap_drop_file_uffd_wp(struct zap_details *details) +{ + if (!details) + return false; + + return details->zap_flags & ZAP_FLAG_DROP_FILE_UFFD_WP; +} + struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr, pte_t pte); struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr, diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h index 8fc71e9d7bb0..da4c710859e6 100644 --- a/include/linux/mm_inline.h +++ b/include/linux/mm_inline.h @@ -4,6 +4,8 @@ #include #include +#include +#include /** * page_is_file_lru - should the page be on a file LRU or anon LRU? @@ -125,4 +127,45 @@ static __always_inline enum lru_list page_lru(struct page *page) } return lru; } + +/* + * If this pte is wr-protected by uffd-wp in any form, arm the special pte to + * replace a none pte. NOTE! 
This should only be called when *pte is already + * cleared so we will never accidentally replace something valuable. Meanwhile + * none pte also means we are not demoting the pte so if tlb flushed then we + * don't need to do it again; otherwise if tlb flush is postponed then it's + * even better. + * + * Must be called with pgtable lock held. + */ +static inline void +pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr, + pte_t *pte, pte_t pteval) +{ +#ifdef CONFIG_USERFAULTFD + bool arm_uffd_pte = false; + + /* The current status of the pte should be "cleared" before calling */ + WARN_ON_ONCE(!pte_none(*pte)); + + if (vma_is_anonymous(vma)) + return; + + /* A uffd-wp wr-protected normal pte */ + if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval))) + arm_uffd_pte = true; + + /* + * A uffd-wp wr-protected swap pte. Note: this should even work for + * pte_swp_uffd_wp_special() too. + */ + if (unlikely(is_swap_pte(pteval) && pte_swp_uffd_wp(pteval))) + arm_uffd_pte = true; + + if (unlikely(arm_uffd_pte)) + set_pte_at(vma->vm_mm, addr, pte, + pte_swp_mkuffd_wp_special(vma)); +#endif +} + #endif diff --git a/mm/memory.c b/mm/memory.c index afe09fccdee1..f87b5a8a098e 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -73,6 +73,7 @@ #include #include #include +#include #include @@ -1194,6 +1195,21 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma) return ret; } +/* + * This function makes sure that we'll replace the none pte with an uffd-wp + * swap special pte marker when necessary. Must be with the pgtable lock held. + */ +static inline void +zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, + unsigned long addr, pte_t *pte, + struct zap_details *details, pte_t pteval) +{ + if (zap_drop_file_uffd_wp(details)) + return; + + pte_install_uffd_wp_if_needed(vma, addr, pte, pteval); +} + static unsigned long zap_pte_range(struct mmu_gather *tlb, struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, @@ -1231,6 +1247,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); tlb_remove_tlb_entry(tlb, pte, addr); + zap_install_uffd_wp_if_needed(vma, addr, pte, details, + ptent); if (unlikely(!page)) continue; @@ -1255,6 +1273,22 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, continue; } + /* + * If this is a special uffd-wp marker pte... Drop it only if + * enforced to do so. + */ + if (unlikely(is_swap_special_pte(ptent))) { + WARN_ON_ONCE(!pte_swp_uffd_wp_special(ptent)); + /* + * If this is a common unmap of ptes, keep this as is. + * Drop it only if this is a whole-vma destruction. 
+ */ + if (zap_drop_file_uffd_wp(details)) + ptep_get_and_clear_full(mm, addr, pte, + tlb->fullmm); + continue; + } + entry = pte_to_swp_entry(ptent); if (is_device_private_entry(entry)) { struct page *page = device_private_entry_to_page(entry); @@ -1265,6 +1299,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, rss[mm_counter(page)]--; page_remove_rmap(page, false); put_page(page); + zap_install_uffd_wp_if_needed(vma, addr, pte, details, + ptent); continue; } @@ -1282,6 +1318,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, if (unlikely(!free_swap_and_cache(entry))) print_bad_pte(vma, addr, ptent, NULL); pte_clear_not_present_full(mm, addr, pte, tlb->fullmm); + zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent); } while (pte++, addr += PAGE_SIZE, addr != end); add_mm_rss_vec(mm, rss); @@ -1481,12 +1518,15 @@ void unmap_vmas(struct mmu_gather *tlb, unsigned long end_addr) { struct mmu_notifier_range range; + struct zap_details details = { + .zap_flags = ZAP_FLAG_DROP_FILE_UFFD_WP, + }; mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm, start_addr, end_addr); mmu_notifier_invalidate_range_start(&range); for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next) - unmap_single_vma(tlb, vma, start_addr, end_addr, NULL); + unmap_single_vma(tlb, vma, start_addr, end_addr, &details); mmu_notifier_invalidate_range_end(&range); } diff --git a/mm/rmap.c b/mm/rmap.c index 31b29321adfe..f6cc0b9b1963 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -72,6 +72,7 @@ #include #include #include +#include #include @@ -1560,6 +1561,13 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, pteval = ptep_clear_flush(vma, address, pvmw.pte); } + /* + * Now the pte is cleared. If this is uffd-wp armed pte, we + * may want to replace a none pte with a marker pte if it's + * file-backed, so we don't lose the tracking information. + */ + pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); + /* Move the dirty bit to the page. Now the pte is gone. */ if (pte_dirty(pteval)) set_page_dirty(page); diff --git a/mm/truncate.c b/mm/truncate.c index dac66749e400..35df3b1d301e 100644 --- a/mm/truncate.c +++ b/mm/truncate.c @@ -179,7 +179,13 @@ truncate_cleanup_page(struct address_space *mapping, struct page *page) if (page_mapped(page)) { unsigned int nr = thp_nr_pages(page); unmap_mapping_pages(mapping, page->index, nr, - ZAP_FLAG_CHECK_MAPPING); + ZAP_FLAG_CHECK_MAPPING | + /* + * Now it's safe to drop uffd-wp because + * we're with page lock, and the page is + * being truncated. 
+ */ + ZAP_FLAG_DROP_FILE_UFFD_WP); } if (page_has_private(page))

From patchwork Fri Jan 15 17:08:51 2021
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12023335
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Mike Kravetz, peterx@redhat.com, Jerome Glisse, Kirill A. Shutemov, Hugh Dickins, Axel Rasmussen, Matthew Wilcox, Andrew Morton, Andrea Arcangeli, Nadav Amit
Subject: [PATCH RFC 14/30] shmem/userfaultfd: Allow wr-protect none pte for file-backed mem
Date: Fri, 15 Jan 2021 12:08:51 -0500
Message-Id: <20210115170907.24498-15-peterx@redhat.com>
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

File-backed memory differs from anonymous memory in that even if the pte is missing, the data could still reside either in the file or in the page/swap cache. So when wr-protecting a pte, we need to consider none ptes too. We do that by installing the uffd-wp special swap pte as a marker, so that a future write to the pte will take the special fault path, first fault in the page read-only, and then report to the userfaultfd server with the wr-protect message.

On the other hand, when unprotecting a page, it's also possible that the pte got unmapped and replaced by the special uffd-wp marker. Then we'll need to be able to recover the uffd-wp special swap pte back into a none pte, so that the next access to the page faults in as usual through the fault handler, rather than sending a uffd-wp message.

Special care needs to be taken throughout the change_protection_range() process. Since we now allow users to wr-protect a none pte, we need to be able to pre-populate the page table entries when we see !anonymous && MM_CP_UFFD_WP requests; otherwise change_protection_range() will always skip entries whose pgtable entry does not exist.

Note that this patch only covers the small pages (pte level) and does not yet cover transparent huge pages, but it will be the base for thps too.
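To make the effect concrete, here is a minimal userspace sketch (not part of the patch; error handling omitted) of what this change enables: wr-protecting a shmem range through userfaultfd even where no page has been faulted in yet, so the none ptes are covered as well. It assumes a kernel with this series applied, since registering shmem in UFFDIO_REGISTER_MODE_WP mode is not possible before it.

#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
	size_t len = 16 * 4096;
	int memfd = memfd_create("uffd-wp-demo", 0);
	char *addr;
	int uffd;

	ftruncate(memfd, len);
	addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, memfd, 0);

	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	struct uffdio_api api = { .api = UFFD_API };
	ioctl(uffd, UFFDIO_API, &api);

	struct uffdio_register reg = {
		.range = { .start = (unsigned long)addr, .len = len },
		.mode = UFFDIO_REGISTER_MODE_WP,
	};
	ioctl(uffd, UFFDIO_REGISTER, &reg);

	addr[0] = 1;	/* fault in only the first page */

	/*
	 * Wr-protect the whole range.  With this patch the none ptes of the
	 * untouched pages get the uffd-wp special marker, so a later write
	 * anywhere in the range still raises a wr-protect fault.
	 */
	struct uffdio_writeprotect wp = {
		.range = { .start = (unsigned long)addr, .len = len },
		.mode = UFFDIO_WRITEPROTECT_MODE_WP,
	};
	ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);

	return 0;
}

A later write to any of the untouched pages is then expected to generate a wr-protect message instead of completing silently.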
Signed-off-by: Peter Xu --- mm/mprotect.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) diff --git a/mm/mprotect.c b/mm/mprotect.c index e75bfe43cedd..c9390fd673fe 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include #include @@ -176,6 +177,32 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, set_pte_at(vma->vm_mm, addr, pte, newpte); pages++; } + } else if (unlikely(is_swap_special_pte(oldpte))) { + if (uffd_wp_resolve && !vma_is_anonymous(vma) && + pte_swp_uffd_wp_special(oldpte)) { + /* + * This is uffd-wp special pte and we'd like to + * unprotect it. What we need to do is simply + * recover the pte into a none pte; the next + * page fault will fault in the page. + */ + pte_clear(vma->vm_mm, addr, pte); + pages++; + } + } else { + /* It must be an none page, or what else?.. */ + WARN_ON_ONCE(!pte_none(oldpte)); + if (unlikely(uffd_wp && !vma_is_anonymous(vma))) { + /* + * For file-backed mem, we need to be able to + * wr-protect even for a none pte! Because + * even if the pte is null, the page/swap cache + * could exist. + */ + set_pte_at(vma->vm_mm, addr, pte, + pte_swp_mkuffd_wp_special(vma)); + pages++; + } } } while (pte++, addr += PAGE_SIZE, addr != end); arch_leave_lazy_mmu_mode(); @@ -209,6 +236,25 @@ static inline int pmd_none_or_clear_bad_unless_trans_huge(pmd_t *pmd) return 0; } +/* + * File-backed vma allows uffd wr-protect upon none ptes, because even if pte + * is missing, page/swap cache could exist. When that happens, the wr-protect + * information will be stored in the page table entries with the marker (e.g., + * PTE_SWP_UFFD_WP_SPECIAL). Prepare for that by always populating the page + * tables to pte level, so that we'll install the markers in change_pte_range() + * where necessary. + * + * Note that we only need to do this in pmd level, because if pmd does not + * exist, it means the whole range covered by the pmd entry (of a pud) does not + * contain any valid data but all zeros. Then nothing to wr-protect. + */ +#define change_protection_prepare(vma, pmd, addr, cp_flags) \ + do { \ + if (unlikely((cp_flags & MM_CP_UFFD_WP) && pmd_none(*pmd) && \ + !vma_is_anonymous(vma))) \ + WARN_ON_ONCE(pte_alloc(vma->vm_mm, pmd)); \ + } while (0) + static inline unsigned long change_pmd_range(struct vm_area_struct *vma, pud_t *pud, unsigned long addr, unsigned long end, pgprot_t newprot, unsigned long cp_flags) @@ -227,6 +273,8 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, next = pmd_addr_end(addr, end); + change_protection_prepare(vma, pmd, addr, cp_flags); + /* * Automatic NUMA balancing walks the tables with mmap_lock * held for read. 
It's possible a parallel update to occur

From patchwork Fri Jan 15 17:08:52 2021
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12023339
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Mike Kravetz, peterx@redhat.com, Jerome Glisse, Kirill A. Shutemov, Hugh Dickins, Axel Rasmussen, Matthew Wilcox, Andrew Morton, Andrea Arcangeli, Nadav Amit
Subject: [PATCH RFC 15/30] shmem/userfaultfd: Allows file-back mem to be uffd wr-protected on thps
Date: Fri, 15 Jan 2021 12:08:52 -0500
Message-Id: <20210115170907.24498-16-peterx@redhat.com>
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

We don't have a "huge" version of PTE_SWP_UFFD_WP_SPECIAL; instead, when necessary, we split the thp if the huge page was previously uffd wr-protected. However, splitting the thp is not enough, because file-backed thps are handled totally differently from anonymous thps: rather than doing a real split, the thp pmd simply gets dropped in __split_huge_pmd_locked(). That is not enough when, e.g., a thp covers the range [0, 2M) but we only want to wr-protect the small page residing in the [4K, 8K) range, because after __split_huge_pmd() returns there will be a none pmd.

Here we leverage the previously introduced change_protection_prepare() macro so that we'll populate the pmd with a pgtable page. Then change_pte_range() will do all the rest for us, e.g., install the uffd-wp swap special pte marker at any pte that we'd like to wr-protect, under the protection of the pgtable lock.
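As a rough sketch of the flow just described (simplified, not the literal kernel code; locking and error handling are omitted, and the function name below is made up for illustration), wr-protecting a sub-range of a file-backed thp ends up doing something like this:

/*
 * Simplified sketch of the path described above.  change_pmd_range()
 * splits the huge pmd and then re-creates a pte table so that
 * change_pte_range() has somewhere to put the uffd-wp markers.
 */
static void sketch_wrprotect_file_thp(struct vm_area_struct *vma, pmd_t *pmd,
				      unsigned long addr, unsigned long cp_flags)
{
	if (pmd_trans_huge(*pmd)) {
		/*
		 * For file-backed memory this does not produce pte entries:
		 * __split_huge_pmd_locked() simply drops the pmd, so
		 * pmd_none(*pmd) becomes true afterwards.
		 */
		__split_huge_pmd(vma, pmd, addr, false, NULL);
	}

	/*
	 * Re-populate a pte pgtable, e.g. when only [4K, 8K) of a thp that
	 * covered [0, 2M) should be wr-protected; otherwise there would be
	 * nothing left to attach the uffd-wp special markers to.
	 */
	if ((cp_flags & MM_CP_UFFD_WP) && !vma_is_anonymous(vma) &&
	    pmd_none(*pmd))
		WARN_ON_ONCE(pte_alloc(vma->vm_mm, pmd));

	/* change_pte_range() then installs the markers pte by pte. */
}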
Signed-off-by: Peter Xu --- mm/mprotect.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/mm/mprotect.c b/mm/mprotect.c index c9390fd673fe..055871322007 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -296,8 +296,16 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, } if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) { - if (next - addr != HPAGE_PMD_SIZE) { + if (next - addr != HPAGE_PMD_SIZE || + /* Uffd wr-protecting a file-backed memory range */ + unlikely(!vma_is_anonymous(vma) && + (cp_flags & MM_CP_UFFD_WP))) { __split_huge_pmd(vma, pmd, addr, false, NULL); + /* + * For file-backed, the pmd could have been + * gone; still provide a pte pgtable if needed. + */ + change_protection_prepare(vma, pmd, addr, cp_flags); } else { int nr_ptes = change_huge_pmd(vma, pmd, addr, newprot, cp_flags);

From patchwork Fri Jan 15 17:08:53 2021
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12023343
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Mike Kravetz, peterx@redhat.com, Jerome Glisse, Kirill A. Shutemov, Hugh Dickins, Axel Rasmussen, Matthew Wilcox, Andrew Morton, Andrea Arcangeli, Nadav Amit
Subject: [PATCH RFC 16/30] shmem/userfaultfd: Handle the left-overed special swap ptes
Date: Fri, 15 Jan 2021 12:08:53 -0500
Message-Id: <20210115170907.24498-17-peterx@redhat.com>
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

Note that the special uffd-wp swap pte can be left over even if the page under the pte got evicted. Normally, when evicting a page, we unmap its ptes by walking through the reverse mapping; however, we never tracked such information for the special swap ptes because they're not real mappings but just markers. So we need to take care of the case where we see a marker that is actually meaningless (the page behind it got evicted). We have already taken care of that in e.g. alloc_set_pte(), where we'll treat the special swap pte as pte_none() when necessary.
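In other words, once the backing page is gone the marker only preserves the wr-protect intent, and any "is something mapped here?" check must treat it like a none pte. A sketch of that rule follows (the helper below is illustrative only; the series open-codes this check at the relevant call sites):

static inline bool pte_none_or_uffd_wp_marker(pte_t pte)
{
	/* A real none pte: nothing mapped and nothing recorded. */
	if (pte_none(pte))
		return true;

	/*
	 * A leftover uffd-wp special swap pte: the page it described may
	 * have been evicted, so treat it as "nothing present" while the
	 * wr-protect intent is kept by the marker itself.
	 */
	if (is_swap_special_pte(pte) && pte_swp_uffd_wp_special(pte))
		return true;

	return false;
}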
However we need to also teach userfaultfd itself on either UFFDIO_COPY or handling page faults, so that everything will still work as expected. Signed-off-by: Peter Xu --- fs/userfaultfd.c | 15 +++++++++++++++ mm/shmem.c | 13 ++++++++++++- 2 files changed, 27 insertions(+), 1 deletion(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 000b457ad087..3537a43b69c9 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -331,6 +331,21 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx, */ if (pte_none(*pte)) ret = true; + /* + * We also treat the swap special uffd-wp pte as the pte_none() here. + * This should in most cases be a missing event, as we never handle + * wr-protect upon a special uffd-wp swap pte - it should first be + * converted into a normal read request before handling wp. It just + * means the page/swap cache that backing this pte is gone, so this + * special pte is leftover. + * + * We can't simply replace it with a none pte because we're not with + * the pgtable lock here. Instead of taking it and clearing the pte, + * the easy way is to let UFFDIO_COPY understand this pte too when + * trying to install a new page onto it. + */ + if (pte_swp_uffd_wp_special(*pte)) + ret = true; if (!pte_write(*pte) && (reason & VM_UFFD_WP)) ret = true; pte_unmap(pte); diff --git a/mm/shmem.c b/mm/shmem.c index de45333626f7..9947bcf92663 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -2456,7 +2456,18 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm, goto out_release_unlock; ret = -EEXIST; - if (!pte_none(*dst_pte)) + /* + * Besides the none pte, we also allow UFFDIO_COPY to install a pte + * onto the uffd-wp swap special pte, because that pte should be the + * same as a pte_none() just in that it contains wr-protect information + * (which could only be dropped when unmap the memory). + * + * It's safe to drop that marker because we know this is part of a + * MISSING fault, and the caller is very clear about this page missing + * rather than wr-protected. Then we're sure the wr-protect bit is + * just a leftover so it's useless already. 
+ */ + if (!pte_none(*dst_pte) && !pte_swp_uffd_wp_special(*dst_pte)) goto out_release_unlock; lru_cache_add(page);

From patchwork Fri Jan 15 17:08:54 2021
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12023341
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Mike Kravetz, peterx@redhat.com, Jerome Glisse, Kirill A. Shutemov, Hugh Dickins, Axel Rasmussen, Matthew Wilcox, Andrew Morton, Andrea Arcangeli, Nadav Amit
Subject: [PATCH RFC 17/30] shmem/userfaultfd: Pass over uffd-wp special swap pte when fork()
Date: Fri, 15 Jan 2021 12:08:54 -0500
Message-Id: <20210115170907.24498-18-peterx@redhat.com>
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

It should be handled like the other uffd-wp wr-protected ptes: pass it over when the dst_vma has VM_UFFD_WP armed; otherwise, drop it.

Signed-off-by: Peter Xu --- mm/memory.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/mm/memory.c b/mm/memory.c index f87b5a8a098e..59d56f57ba2c 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -703,8 +703,21 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, unsigned long vm_flags = dst_vma->vm_flags; pte_t pte = *src_pte; struct page *page; - swp_entry_t entry = pte_to_swp_entry(pte); + swp_entry_t entry; + + if (unlikely(is_swap_special_pte(pte))) { + /* + * uffd-wp special swap pte is the only possibility for now. + * If dst vma is registered with uffd-wp, copy it over. + * Otherwise, ignore this pte as if it's a none pte would work.
+ */ + WARN_ON_ONCE(!pte_swp_uffd_wp_special(pte)); + if (userfaultfd_wp(dst_vma)) + set_pte_at(dst_mm, addr, dst_pte, pte); + return 0; + } + entry = pte_to_swp_entry(pte); if (likely(!non_swap_entry(entry))) { if (swap_duplicate(entry) < 0) return entry.val;

From patchwork Fri Jan 15 17:08:55 2021
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12023345
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Mike Kravetz, peterx@redhat.com, Jerome Glisse, Kirill A. Shutemov, Hugh Dickins, Axel Rasmussen, Matthew Wilcox, Andrew Morton, Andrea Arcangeli, Nadav Amit
Subject: [PATCH RFC 18/30] hugetlb/userfaultfd: Hook page faults for uffd write protection
Date: Fri, 15 Jan 2021 12:08:55 -0500
Message-Id: <20210115170907.24498-19-peterx@redhat.com>
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

Hook up hugetlbfs_fault() with the capability to handle userfaultfd-wp faults. We do this slightly earlier than hugetlb_cow() so that we can avoid taking some extra locks that we definitely don't need.
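For context, here is a hedged sketch of the userspace side this hook ultimately reports to: a hypothetical monitor thread reads the wr-protect pagefault message raised via handle_userfault(&vmf, VM_UFFD_WP) and resolves it by clearing the write protection on the faulting huge page. Error handling is omitted, and hpage_size is whatever huge page size backs the registered VMA.

#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <unistd.h>

static void resolve_one_hugetlb_wp_fault(int uffd, unsigned long hpage_size)
{
	struct uffd_msg msg;

	if (read(uffd, &msg, sizeof(msg)) != sizeof(msg))
		return;
	if (msg.event != UFFD_EVENT_PAGEFAULT ||
	    !(msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP))
		return;

	/* Drop the write protection so the faulting writer can continue. */
	struct uffdio_writeprotect wp = {
		.range = {
			.start = msg.arg.pagefault.address & ~(hpage_size - 1),
			.len = hpage_size,
		},
		.mode = 0,	/* no _MODE_WP: un-protect and wake */
	};
	ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
}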
Signed-off-by: Peter Xu --- mm/hugetlb.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d029d938d26d..dcbbba53bd10 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4544,6 +4544,25 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, if (unlikely(!pte_same(entry, huge_ptep_get(ptep)))) goto out_ptl; + /* Handle userfault-wp first, before trying to lock more pages */ + if (userfaultfd_pte_wp(vma, huge_ptep_get(ptep)) && + (flags & FAULT_FLAG_WRITE) && !huge_pte_write(entry)) { + struct vm_fault vmf = { + .vma = vma, + .address = haddr, + .flags = flags, + }; + + spin_unlock(ptl); + if (pagecache_page) { + unlock_page(pagecache_page); + put_page(pagecache_page); + } + mutex_unlock(&hugetlb_fault_mutex_table[hash]); + i_mmap_unlock_read(mapping); + return handle_userfault(&vmf, VM_UFFD_WP); + } + /* * hugetlb_cow() requires page locks of pte_page(entry) and * pagecache_page, so here we need take the former one

From patchwork Fri Jan 15 17:08:56 2021
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12023347
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit
Subject: [PATCH RFC 19/30] hugetlb/userfaultfd: Take care of UFFDIO_COPY_MODE_WP
Date: Fri, 15 Jan 2021 12:08:56 -0500
Message-Id: <20210115170907.24498-20-peterx@redhat.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

Firstly, pass the wp_copy variable into hugetlb_mcopy_atomic_pte() throughout the stack. Then, apply the UFFD_WP bit if UFFDIO_COPY_MODE_WP is specified along with UFFDIO_COPY. Introduce huge_pte_mkuffd_wp() for it.

Note that, similar to how shmem was handled, we should keep setting the dirty bit even if UFFDIO_COPY_MODE_WP is provided, so that the core mm knows the page contains valid data and will never drop it.
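For illustration only (not part of this patch): with the wp_copy plumbing below, userspace can fill a missing hugetlb page and leave it write-protected in a single UFFDIO_COPY call. A minimal sketch, assuming `uffd` is a userfaultfd registered in missing+wp mode on the destination VMA, `src` is a staging buffer, and `hpage_size` covers whole huge pages; the helper name is made up and error handling is omitted.

/*
 * Sketch only: resolve a missing hugetlb page while atomically keeping it
 * write-protected.
 */
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

static long copy_huge_page_wp(int uffd, unsigned long dst,
			      unsigned long src, unsigned long hpage_size)
{
	struct uffdio_copy copy = {
		.dst  = dst,			/* huge-page aligned */
		.src  = src,
		.len  = hpage_size,		/* whole huge page(s) */
		.mode = UFFDIO_COPY_MODE_WP,	/* install as write-protected */
	};

	if (ioctl(uffd, UFFDIO_COPY, &copy))
		return -1;
	return copy.copy;	/* bytes actually copied */
}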
Signed-off-by: Peter Xu --- include/asm-generic/hugetlb.h | 5 +++++ include/linux/hugetlb.h | 6 ++++-- mm/hugetlb.c | 9 +++++++-- mm/userfaultfd.c | 12 ++++++++---- 4 files changed, 24 insertions(+), 8 deletions(-) diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h index 8e1e6244a89d..548212eccbd6 100644 --- a/include/asm-generic/hugetlb.h +++ b/include/asm-generic/hugetlb.h @@ -27,6 +27,11 @@ static inline pte_t huge_pte_mkdirty(pte_t pte) return pte_mkdirty(pte); } +static inline pte_t huge_pte_mkuffd_wp(pte_t pte) +{ + return pte_mkuffd_wp(pte); +} + static inline pte_t huge_pte_modify(pte_t pte, pgprot_t newprot) { return pte_modify(pte, newprot); diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index ebca2ef02212..bd061f7eedcb 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -138,7 +138,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, pte_t *dst_pte, struct vm_area_struct *dst_vma, unsigned long dst_addr, unsigned long src_addr, - struct page **pagep); + struct page **pagep, + bool wp_copy); int hugetlb_reserve_pages(struct inode *inode, long from, long to, struct vm_area_struct *vma, vm_flags_t vm_flags); @@ -313,7 +314,8 @@ static inline int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, struct vm_area_struct *dst_vma, unsigned long dst_addr, unsigned long src_addr, - struct page **pagep) + struct page **pagep, + bool wp_copy) { BUG(); return 0; diff --git a/mm/hugetlb.c b/mm/hugetlb.c index dcbbba53bd10..563b8f70537f 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4624,7 +4624,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, struct vm_area_struct *dst_vma, unsigned long dst_addr, unsigned long src_addr, - struct page **pagep) + struct page **pagep, + bool wp_copy) { struct address_space *mapping; pgoff_t idx; @@ -4717,8 +4718,12 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, } _dst_pte = make_huge_pte(dst_vma, page, dst_vma->vm_flags & VM_WRITE); - if (dst_vma->vm_flags & VM_WRITE) + if (dst_vma->vm_flags & VM_WRITE) { _dst_pte = huge_pte_mkdirty(_dst_pte); + if (wp_copy) + _dst_pte = huge_pte_mkuffd_wp( + huge_pte_wrprotect(_dst_pte)); + } _dst_pte = pte_mkyoung(_dst_pte); set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte); diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 6d4b3b7c7f9f..b00e5e6b8b8b 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -207,7 +207,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long src_start, unsigned long len, - bool zeropage) + bool zeropage, + bool wp_copy) { int vm_alloc_shared = dst_vma->vm_flags & VM_SHARED; int vm_shared = dst_vma->vm_flags & VM_SHARED; @@ -306,7 +307,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, } err = hugetlb_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma, - dst_addr, src_addr, &page); + dst_addr, src_addr, &page, + wp_copy); mutex_unlock(&hugetlb_fault_mutex_table[hash]); i_mmap_unlock_read(mapping); @@ -408,7 +410,8 @@ extern ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long src_start, unsigned long len, - bool zeropage); + bool zeropage, + bool wp_copy); #endif /* CONFIG_HUGETLB_PAGE */ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm, @@ -527,7 +530,8 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, */ if (is_vm_hugetlb_page(dst_vma)) return __mcopy_atomic_hugetlb(dst_mm, dst_vma, dst_start, - src_start, len, 
zeropage); + src_start, len, zeropage, + wp_copy); if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma)) goto out_unlock; From patchwork Fri Jan 15 17:08:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023349 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 526A8C433E0 for ; Fri, 15 Jan 2021 17:10:08 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 020172333E for ; Fri, 15 Jan 2021 17:10:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 020172333E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A17068D01AA; Fri, 15 Jan 2021 12:09:52 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 9C64F8D01AE; Fri, 15 Jan 2021 12:09:52 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 86A6D8D01AA; Fri, 15 Jan 2021 12:09:52 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0035.hostedemail.com [216.40.44.35]) by kanga.kvack.org (Postfix) with ESMTP id 6B64E8D01AE for ; Fri, 15 Jan 2021 12:09:52 -0500 (EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 26AAC1DFE for ; Fri, 15 Jan 2021 17:09:52 +0000 (UTC) X-FDA: 77708646624.18.rock89_580513327531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin18.hostedemail.com (Postfix) with ESMTP id E884A100EDBCA for ; Fri, 15 Jan 2021 17:09:51 +0000 (UTC) X-HE-Tag: rock89_580513327531 X-Filterd-Recvd-Size: 9107 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf30.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:09:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730591; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ACj7dAXdw9MZrmiH0lb9ADzAHOunb4bShuydpXqJ1bw=; b=Wsma3Csp5LMGvUYiZPDZUVfGMFf+t9CJ6cDOgTrw0GwtqqPxQD0yz7KOGpccVh7ZJrGc3h zZABgtOYs4YQMeXt/wEjXJrOM3Fq2CKEhg99mxrheRvFh/KbzNjtZ55ab0zxn6Vaxc2D6G WpnRLK6BlDLKVzgEUTNwVODLhDMUp1Y= Received: from mail-qv1-f72.google.com (mail-qv1-f72.google.com [209.85.219.72]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-445-mC5fRZaCPxW6ncyQwX-vHw-1; Fri, 15 Jan 2021 12:09:49 -0500 X-MC-Unique: mC5fRZaCPxW6ncyQwX-vHw-1 Received: by mail-qv1-f72.google.com with SMTP id h1so8238970qvr.7 for ; Fri, 15 Jan 2021 09:09:49 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ACj7dAXdw9MZrmiH0lb9ADzAHOunb4bShuydpXqJ1bw=; b=pW3kx4DFeMBSjDEkVY/oETBunehfIesw4NsN93lD4fOOaDB1lTuJV5U/d2Ev1RaP0E w++Y6yi7ieI3PyxCbylko84/S/UAxFDmkfNOJfhlquKDcUKQ9XJcWjIlDmHVdqvFJTy5 xjBCHjfXCeIkbHf44wuJSeNi/mTJGPpPYjjuP2ZrTl/zyBALX5toMI7xrzJh/tW62b7w Of34RqG5EABHeJMpSvDuj0oCCOZKRC53iqUos216VrVMiGol4+P5qQ71+CxU9Z6P3ReA Bt6MUJzvwpsGu/Rd3AGPUW9DakO1LrwpRscvYWPo2eTJiMpdNb6EdWxJwAs0Jwm07TeS AmAA== X-Gm-Message-State: AOAM530+yAYfxliSncIk+/TQ6M5hPo5J3F+6Hgt/z++6sGmvxVnig7rz VtleVVGyCkkSNrrYbBGF6SUQ/hShp06/zNz26k8sGylyi1q1xCtQNJPlXn5bMObtcUQ//9wfAZy 9mBOT/qosHko= X-Received: by 2002:a05:620a:145a:: with SMTP id i26mr13319234qkl.31.1610730589030; Fri, 15 Jan 2021 09:09:49 -0800 (PST) X-Google-Smtp-Source: ABdhPJym5MvHIxxRQ18MHwTBywIzvX4iFw2g80I1OMBrnkwSP/L9AX0DIkatK4Nke7iZJFu9Bek/gQ== X-Received: by 2002:a05:620a:145a:: with SMTP id i26mr13319205qkl.31.1610730588799; Fri, 15 Jan 2021 09:09:48 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.09.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:09:48 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 20/30] hugetlb/userfaultfd: Handle UFFDIO_WRITEPROTECT Date: Fri, 15 Jan 2021 12:08:57 -0500 Message-Id: <20210115170907.24498-21-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This starts from passing cp_flags into hugetlb_change_protection() so hugetlb will be able to handle MM_CP_UFFD_WP[_RESOLVE] requests. huge_pte_clear_uffd_wp() is introduced to handle the case where the UFFDIO_WRITEPROTECT is requested upon migrating huge page entries. 
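For illustration only (not part of this patch): the MM_CP_UFFD_WP[_RESOLVE] requests handled here are driven from the UFFDIO_WRITEPROTECT ioctl. A minimal userspace sketch of using it on a hugetlb range follows; per the check added to mwriteprotect_range() below, `start` and `len` must be huge-page aligned. The helper name is made up and error handling is omitted.

/*
 * Sketch only: write-protect or un-protect a range of a registered hugetlb
 * mapping.
 */
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

static int hugetlb_range_set_wp(int uffd, unsigned long start,
				unsigned long len, int enable)
{
	struct uffdio_writeprotect wp = {
		.range = { .start = start, .len = len },
		/* Set the WP mode bit to protect, clear it to resolve. */
		.mode  = enable ? UFFDIO_WRITEPROTECT_MODE_WP : 0,
	};

	return ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
}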
Signed-off-by: Peter Xu --- include/asm-generic/hugetlb.h | 5 +++++ include/linux/hugetlb.h | 6 ++++-- mm/hugetlb.c | 13 ++++++++++++- mm/mprotect.c | 3 ++- mm/userfaultfd.c | 8 ++++++++ 5 files changed, 31 insertions(+), 4 deletions(-) diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h index 548212eccbd6..181cdc3297e7 100644 --- a/include/asm-generic/hugetlb.h +++ b/include/asm-generic/hugetlb.h @@ -32,6 +32,11 @@ static inline pte_t huge_pte_mkuffd_wp(pte_t pte) return pte_mkuffd_wp(pte); } +static inline pte_t huge_pte_clear_uffd_wp(pte_t pte) +{ + return pte_clear_uffd_wp(pte); +} + static inline pte_t huge_pte_modify(pte_t pte, pgprot_t newprot) { return pte_modify(pte, newprot); diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index bd061f7eedcb..fe1dde0afbaf 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -185,7 +185,8 @@ struct page *follow_huge_pgd(struct mm_struct *mm, unsigned long address, int pmd_huge(pmd_t pmd); int pud_huge(pud_t pud); unsigned long hugetlb_change_protection(struct vm_area_struct *vma, - unsigned long address, unsigned long end, pgprot_t newprot); + unsigned long address, unsigned long end, pgprot_t newprot, + unsigned long cp_flags); bool is_hugetlb_entry_migration(pte_t pte); @@ -343,7 +344,8 @@ static inline void move_hugetlb_state(struct page *oldpage, static inline unsigned long hugetlb_change_protection( struct vm_area_struct *vma, unsigned long address, - unsigned long end, pgprot_t newprot) + unsigned long end, pgprot_t newprot, + unsigned long cp_flags) { return 0; } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 563b8f70537f..18b236bac6cd 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4938,7 +4938,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, #endif unsigned long hugetlb_change_protection(struct vm_area_struct *vma, - unsigned long address, unsigned long end, pgprot_t newprot) + unsigned long address, unsigned long end, + pgprot_t newprot, unsigned long cp_flags) { struct mm_struct *mm = vma->vm_mm; unsigned long start = address; @@ -4948,6 +4949,8 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, unsigned long pages = 0; bool shared_pmd = false; struct mmu_notifier_range range; + bool uffd_wp = cp_flags & MM_CP_UFFD_WP; + bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE; /* * In the case of shared PMDs, the area to flush could be beyond @@ -4988,6 +4991,10 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, make_migration_entry_read(&entry); newpte = swp_entry_to_pte(entry); + if (uffd_wp) + newpte = pte_swp_mkuffd_wp(newpte); + else if (uffd_wp_resolve) + newpte = pte_swp_clear_uffd_wp(newpte); set_huge_swap_pte_at(mm, address, ptep, newpte, huge_page_size(h)); pages++; @@ -5001,6 +5008,10 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, old_pte = huge_ptep_modify_prot_start(vma, address, ptep); pte = pte_mkhuge(huge_pte_modify(old_pte, newprot)); pte = arch_make_huge_pte(pte, vma, NULL, 0); + if (uffd_wp) + pte = huge_pte_mkuffd_wp(huge_pte_wrprotect(pte)); + else if (uffd_wp_resolve) + pte = huge_pte_clear_uffd_wp(pte); huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte); pages++; } diff --git a/mm/mprotect.c b/mm/mprotect.c index 055871322007..fce87ac99117 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -416,7 +416,8 @@ unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, BUG_ON((cp_flags & MM_CP_UFFD_WP_ALL) == MM_CP_UFFD_WP_ALL); if 
(is_vm_hugetlb_page(vma)) - pages = hugetlb_change_protection(vma, start, end, newprot); + pages = hugetlb_change_protection(vma, start, end, newprot, + cp_flags); else pages = change_protection_range(vma, start, end, newprot, cp_flags); diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index b00e5e6b8b8b..480d91b783d4 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -644,6 +644,7 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start, unsigned long len, bool enable_wp, bool *mmap_changing) { struct vm_area_struct *dst_vma; + unsigned long page_mask; pgprot_t newprot; int err; @@ -680,6 +681,13 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start, if (!vma_is_anonymous(dst_vma)) goto out_unlock; + if (is_vm_hugetlb_page(dst_vma)) { + err = -EINVAL; + page_mask = vma_kernel_pagesize(dst_vma) - 1; + if ((start & page_mask) || (len & page_mask)) + goto out_unlock; + } + if (enable_wp) newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE)); else From patchwork Fri Jan 15 17:08:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023351 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 34E0AC433E0 for ; Fri, 15 Jan 2021 17:10:11 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CB11D2333E for ; Fri, 15 Jan 2021 17:10:10 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CB11D2333E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 004848D01AF; Fri, 15 Jan 2021 12:09:55 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id EF4478D01AE; Fri, 15 Jan 2021 12:09:54 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DBBA48D01AF; Fri, 15 Jan 2021 12:09:54 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0144.hostedemail.com [216.40.44.144]) by kanga.kvack.org (Postfix) with ESMTP id C4BE68D01AE for ; Fri, 15 Jan 2021 12:09:54 -0500 (EST) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 8277E82EB55C for ; Fri, 15 Jan 2021 17:09:54 +0000 (UTC) X-FDA: 77708646708.20.flag11_281261127531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin20.hostedemail.com (Postfix) with ESMTP id 53C411810DA28 for ; Fri, 15 Jan 2021 17:09:54 +0000 (UTC) X-HE-Tag: flag11_281261127531 X-Filterd-Recvd-Size: 10767 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf34.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:09:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730593; 
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit
Subject: [PATCH RFC 21/30] hugetlb: Pass vma into huge_pte_alloc()
Date: Fri, 15 Jan 2021 12:08:58 -0500
Message-Id: <20210115170907.24498-22-peterx@redhat.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

This is preparatory work that allows the per-architecture huge_pte_alloc() to behave differently according to the attributes of the VMA it is operating on.
Signed-off-by: Peter Xu --- arch/arm64/mm/hugetlbpage.c | 2 +- arch/ia64/mm/hugetlbpage.c | 3 ++- arch/mips/mm/hugetlbpage.c | 4 ++-- arch/parisc/mm/hugetlbpage.c | 2 +- arch/powerpc/mm/hugetlbpage.c | 3 ++- arch/s390/mm/hugetlbpage.c | 2 +- arch/sh/mm/hugetlbpage.c | 2 +- arch/sparc/mm/hugetlbpage.c | 2 +- include/linux/hugetlb.h | 2 +- mm/hugetlb.c | 6 +++--- mm/userfaultfd.c | 2 +- 11 files changed, 16 insertions(+), 14 deletions(-) diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 55ecf6de9ff7..5b32ec888698 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -252,7 +252,7 @@ void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr, set_pte(ptep, pte); } -pte_t *huge_pte_alloc(struct mm_struct *mm, +pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned long sz) { pgd_t *pgdp; diff --git a/arch/ia64/mm/hugetlbpage.c b/arch/ia64/mm/hugetlbpage.c index b331f94d20ac..f993cb36c062 100644 --- a/arch/ia64/mm/hugetlbpage.c +++ b/arch/ia64/mm/hugetlbpage.c @@ -25,7 +25,8 @@ unsigned int hpage_shift = HPAGE_SHIFT_DEFAULT; EXPORT_SYMBOL(hpage_shift); pte_t * -huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz) +huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long addr, unsigned long sz) { unsigned long taddr = htlbpage_to_page(addr); pgd_t *pgd; diff --git a/arch/mips/mm/hugetlbpage.c b/arch/mips/mm/hugetlbpage.c index 77ffece9c270..c1d8f51c5255 100644 --- a/arch/mips/mm/hugetlbpage.c +++ b/arch/mips/mm/hugetlbpage.c @@ -21,8 +21,8 @@ #include #include -pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, - unsigned long sz) +pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long addr, unsigned long sz) { pgd_t *pgd; p4d_t *p4d; diff --git a/arch/parisc/mm/hugetlbpage.c b/arch/parisc/mm/hugetlbpage.c index d7ba014a7fbb..e141441bfa64 100644 --- a/arch/parisc/mm/hugetlbpage.c +++ b/arch/parisc/mm/hugetlbpage.c @@ -44,7 +44,7 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr, } -pte_t *huge_pte_alloc(struct mm_struct *mm, +pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned long sz) { pgd_t *pgd; diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index 36c3800769fb..2514884c0d20 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -106,7 +106,8 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, * At this point we do the placement change only for BOOK3S 64. This would * possibly work on other subarchs.
*/ -pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz) +pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long addr, unsigned long sz) { pgd_t *pg; p4d_t *p4; diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c index 3b5a4d25ca9b..da36d13ffc16 100644 --- a/arch/s390/mm/hugetlbpage.c +++ b/arch/s390/mm/hugetlbpage.c @@ -189,7 +189,7 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, return pte; } -pte_t *huge_pte_alloc(struct mm_struct *mm, +pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned long sz) { pgd_t *pgdp; diff --git a/arch/sh/mm/hugetlbpage.c b/arch/sh/mm/hugetlbpage.c index 220d7bc43d2b..999ab5916e69 100644 --- a/arch/sh/mm/hugetlbpage.c +++ b/arch/sh/mm/hugetlbpage.c @@ -21,7 +21,7 @@ #include #include -pte_t *huge_pte_alloc(struct mm_struct *mm, +pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned long sz) { pgd_t *pgd; diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c index ec423b5f17dd..ae06f7df9750 100644 --- a/arch/sparc/mm/hugetlbpage.c +++ b/arch/sparc/mm/hugetlbpage.c @@ -272,7 +272,7 @@ static unsigned long huge_tte_to_size(pte_t pte) return size; } -pte_t *huge_pte_alloc(struct mm_struct *mm, +pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned long sz) { pgd_t *pgd; diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index fe1dde0afbaf..7d4c5669e118 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -162,7 +162,7 @@ extern struct list_head huge_boot_pages; /* arch callbacks */ -pte_t *huge_pte_alloc(struct mm_struct *mm, +pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned long sz); pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, unsigned long sz); diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 18b236bac6cd..eb7cd0c7d6d2 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3767,7 +3767,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, src_pte = huge_pte_offset(src, addr, sz); if (!src_pte) continue; - dst_pte = huge_pte_alloc(dst, addr, sz); + dst_pte = huge_pte_alloc(dst, vma, addr, sz); if (!dst_pte) { ret = -ENOMEM; break; @@ -4484,7 +4484,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, */ mapping = vma->vm_file->f_mapping; i_mmap_lock_read(mapping); - ptep = huge_pte_alloc(mm, haddr, huge_page_size(h)); + ptep = huge_pte_alloc(mm, vma, haddr, huge_page_size(h)); if (!ptep) { i_mmap_unlock_read(mapping); return VM_FAULT_OOM; @@ -5407,7 +5407,7 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma, #endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */ #ifdef CONFIG_ARCH_WANT_GENERAL_HUGETLB -pte_t *huge_pte_alloc(struct mm_struct *mm, +pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned long sz) { pgd_t *pgd; diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 480d91b783d4..3d49b888e3e8 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -291,7 +291,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, mutex_lock(&hugetlb_fault_mutex_table[hash]); err = -ENOMEM; - dst_pte = huge_pte_alloc(dst_mm, dst_addr, vma_hpagesize); + dst_pte = huge_pte_alloc(dst_mm, dst_vma, dst_addr, vma_hpagesize); if (!dst_pte) { mutex_unlock(&hugetlb_fault_mutex_table[hash]); 
i_mmap_unlock_read(mapping); From patchwork Fri Jan 15 17:08:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023353 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 00721C433DB for ; Fri, 15 Jan 2021 17:10:14 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9D9532333E for ; Fri, 15 Jan 2021 17:10:13 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9D9532333E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 3AF3D8D01B0; Fri, 15 Jan 2021 12:09:58 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2C0288D01AE; Fri, 15 Jan 2021 12:09:58 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1AF188D01B0; Fri, 15 Jan 2021 12:09:58 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0210.hostedemail.com [216.40.44.210]) by kanga.kvack.org (Postfix) with ESMTP id F31A78D01AE for ; Fri, 15 Jan 2021 12:09:57 -0500 (EST) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id B4C87876D for ; Fri, 15 Jan 2021 17:09:57 +0000 (UTC) X-FDA: 77708646834.07.wheel50_3e10dc927531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin07.hostedemail.com (Postfix) with ESMTP id 924FD1818FCD3 for ; Fri, 15 Jan 2021 17:09:57 +0000 (UTC) X-HE-Tag: wheel50_3e10dc927531 X-Filterd-Recvd-Size: 8368 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf18.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730596; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ez9VaM0jxqRxRHrokBxOLqFRU+5IpsC9OSJDJDSHKdU=; b=J/Ozmtq4RRfqpvgwigfUJony8GrqnlEGWqtX/gRdtFLqKFbkfKQR9b8AuCOJNkGwvyeK2q l9GTktVkSoDRU0zrtkAfUYHRbcv9viJPvay++teCl5NU/PToPkyHY0GILR37u3DE43zzgR ytxXEe1pB0theULiNiQ2pXCMgOFyojY= Received: from mail-qt1-f197.google.com (mail-qt1-f197.google.com [209.85.160.197]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-501-pxG-Z7XCNmu5aaxsE1IJXg-1; Fri, 15 Jan 2021 12:09:55 -0500 X-MC-Unique: pxG-Z7XCNmu5aaxsE1IJXg-1 Received: by mail-qt1-f197.google.com with SMTP id b8so7861553qtr.18 for ; Fri, 15 Jan 2021 09:09:54 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; 
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit
Subject: [PATCH RFC 22/30] hugetlb/userfaultfd: Forbid huge pmd sharing when uffd enabled
Date: Fri, 15 Jan 2021 12:08:59 -0500
Message-Id: <20210115170907.24498-23-peterx@redhat.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20210115170907.24498-1-peterx@redhat.com>
References: <20210115170907.24498-1-peterx@redhat.com>

Huge pmd sharing can cause problems for userfaultfd. Userfaultfd runs its logic based on the special bits in page table entries, while huge pmd sharing may share page table entries across different address ranges. That can cause issues in either of these cases:

- When sharing huge pmd page tables for an uffd write protected range, the newly mapped huge pmd range will also be write protected unexpectedly, or,

- When we try to write protect a range of shared huge pmds, we will first do huge_pmd_unshare() in hugetlb_change_protection(); however, that also means the UFFDIO_WRITEPROTECT could be silently skipped for the shared region, which could lead to data loss.

While at it, a few other things are done altogether:

- Move want_pmd_share() from mm/hugetlb.c into linux/hugetlb.h, because that is definitely something that arch code would like to use too.

- ARM64 currently checks directly against CONFIG_ARCH_WANT_HUGE_PMD_SHARE when trying to share huge pmds. Switch it to the want_pmd_share() helper.
Signed-off-by: Peter Xu --- arch/arm64/mm/hugetlbpage.c | 3 +-- include/linux/hugetlb.h | 12 ++++++++++++ include/linux/userfaultfd_k.h | 9 +++++++++ mm/hugetlb.c | 5 ++--- 4 files changed, 24 insertions(+), 5 deletions(-) diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 5b32ec888698..1a8ce0facfe8 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -284,8 +284,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, */ ptep = pte_alloc_map(mm, pmdp, addr); } else if (sz == PMD_SIZE) { - if (IS_ENABLED(CONFIG_ARCH_WANT_HUGE_PMD_SHARE) && - pud_none(READ_ONCE(*pudp))) + if (want_pmd_share(vma) && pud_none(READ_ONCE(*pudp))) ptep = huge_pmd_share(mm, addr, pudp); else ptep = (pte_t *)pmd_alloc(mm, pudp, addr); diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 7d4c5669e118..27ada597a8e6 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -11,6 +11,7 @@ #include #include #include +#include struct ctl_table; struct user_struct; @@ -951,4 +952,15 @@ static inline __init void hugetlb_cma_check(void) } #endif +static inline bool want_pmd_share(struct vm_area_struct *vma) +{ +#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE + if (uffd_disable_huge_pmd_share(vma)) + return false; + return true; +#else + return false; +#endif +} + #endif /* _LINUX_HUGETLB_H */ diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 7d6071a65ded..7d14444862d4 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -52,6 +52,15 @@ static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma, return vma->vm_userfaultfd_ctx.ctx == vm_ctx.ctx; } +/* + * Never enable huge pmd sharing on uffd-wp registered vmas, because uffd-wp + * protect information is per pgtable entry. 
+ */ +static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma) +{ + return vma->vm_flags & VM_UFFD_WP; +} + static inline bool userfaultfd_missing(struct vm_area_struct *vma) { return vma->vm_flags & VM_UFFD_MISSING; diff --git a/mm/hugetlb.c b/mm/hugetlb.c index eb7cd0c7d6d2..dd2acb8b3f0f 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5386,7 +5386,7 @@ int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma, *addr = ALIGN(*addr, HPAGE_SIZE * PTRS_PER_PTE) - HPAGE_SIZE; return 1; } -#define want_pmd_share() (1) + #else /* !CONFIG_ARCH_WANT_HUGE_PMD_SHARE */ pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud) { @@ -5403,7 +5403,6 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma, unsigned long *start, unsigned long *end) { } -#define want_pmd_share() (0) #endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */ #ifdef CONFIG_ARCH_WANT_GENERAL_HUGETLB @@ -5425,7 +5424,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, pte = (pte_t *)pud; } else { BUG_ON(sz != PMD_SIZE); - if (want_pmd_share() && pud_none(*pud)) + if (want_pmd_share(vma) && pud_none(*pud)) pte = huge_pmd_share(mm, addr, pud); else pte = (pte_t *)pmd_alloc(mm, pud, addr); From patchwork Fri Jan 15 17:09:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023355 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92783C433E0 for ; Fri, 15 Jan 2021 17:10:16 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4D5BC2333E for ; Fri, 15 Jan 2021 17:10:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4D5BC2333E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A646F8D01B1; Fri, 15 Jan 2021 12:09:59 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A13C68D01AE; Fri, 15 Jan 2021 12:09:59 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 904108D01B1; Fri, 15 Jan 2021 12:09:59 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0226.hostedemail.com [216.40.44.226]) by kanga.kvack.org (Postfix) with ESMTP id 764F78D01AE for ; Fri, 15 Jan 2021 12:09:59 -0500 (EST) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 4078A801AD4B for ; Fri, 15 Jan 2021 17:09:59 +0000 (UTC) X-FDA: 77708646918.22.glove73_030c00627531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin22.hostedemail.com (Postfix) with ESMTP id 188851810DA3A for ; Fri, 15 Jan 2021 17:09:59 +0000 (UTC) X-HE-Tag: glove73_030c00627531 X-Filterd-Recvd-Size: 5825 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com 
[216.205.24.124]) by imf36.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:09:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730598; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xbapPq1ojGk8GME+3Q5hIrqVuXOJyakklI0BdH5xjl0=; b=AZ+6V318A0ZD0vAGxCuKkGBxei7iRJywlXJtUQZocXw+rQYW78inKk0wF/FdCiVa4nrKmc fa5tnmjrX07linCeGl3NDAvwJqS4q/rjXgW+F1WXgwZVenHcPhuxgOSCowt9i5tadpaDK/ rcd/9R5bcaPqFQC5cJQUjuWQAk9yN6k= Received: from mail-qv1-f71.google.com (mail-qv1-f71.google.com [209.85.219.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-339-FckXlN_hONOSIpBFPYxJew-1; Fri, 15 Jan 2021 12:09:55 -0500 X-MC-Unique: FckXlN_hONOSIpBFPYxJew-1 Received: by mail-qv1-f71.google.com with SMTP id l3so8258257qvr.10 for ; Fri, 15 Jan 2021 09:09:55 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=xbapPq1ojGk8GME+3Q5hIrqVuXOJyakklI0BdH5xjl0=; b=Q7F4obuxY+183W8hW51nstjqIpHkzZNpSoA42YC/hNa4It00I4QbvLxpgcDUE/7Pft zlaZohxuKNegipJkpFtlyPQQBLSQadtlDY1c3XR9B6nTh9m56TyxzKJn67cTxzpVD8jm DqnAT+ZyVmshUX2xwY0B1ovzVYTC2zBNncoEYgh53D+Ounh0yriaIGkp7bpsZNZhPoci FB6bQn6GMAEsJaMYYeDC2S5wQzSBP/29bR+LqUbB8LbMUjfrefPfydl6b8Nipu3ct3DS J4AeAw36aD6jOi3NBXvyLKpZSmvkRg34xSiB2GlHVppaTR5j3A68ObzvI7qG73lZo/HL cenQ== X-Gm-Message-State: AOAM532PqtbSr38mDA5F3Wx5lCVVwOc6PfW7onCN7q8pbyOBxOcbMIR0 HhJscuh3Peb98VeRXOYLiwMJsqWDXx5/l8JZAtWdyOsV251ws5P8YN+6AJ1s6MlbtCE8Pjggy49 cS2Zm/RGj86g= X-Received: by 2002:a37:d13:: with SMTP id 19mr13531587qkn.93.1610730595178; Fri, 15 Jan 2021 09:09:55 -0800 (PST) X-Google-Smtp-Source: ABdhPJykF/E1p80aUABfZGvXlMxL3xP2NWVwmUQdTiVuy3IF0dNtbXolMRMz6g1eiLbW1PrGID+lQw== X-Received: by 2002:a37:d13:: with SMTP id 19mr13531559qkn.93.1610730594964; Fri, 15 Jan 2021 09:09:54 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.09.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:09:54 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 23/30] mm/hugetlb: Introduce huge version of special swap pte helpers Date: Fri, 15 Jan 2021 12:09:00 -0500 Message-Id: <20210115170907.24498-24-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This is to let hugetlbfs be prepared to also recognize swap special ptes just like uffd-wp special swap ptes. 
Signed-off-by: Peter Xu --- mm/hugetlb.c | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index dd2acb8b3f0f..16a07f41880e 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -82,6 +82,25 @@ struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp; /* Forward declaration */ static int hugetlb_acct_memory(struct hstate *h, long delta); +/* + * These are sister versions of is_swap_pte() and pte_has_swap_entry(). We + * need standalone ones because huge_pte_none() is handled differently from + * pte_none(). For more information, please refer to comments above + * is_swap_pte() and pte_has_swap_entry(). + * + * Here we directly reuse the pte level of swap special ptes, for example, the + * pte_swp_uffd_wp_special(). It just stands for a huge page rather than a + * small page for hugetlbfs pages. + */ +static inline bool is_huge_swap_pte(pte_t pte) +{ + return !huge_pte_none(pte) && !pte_present(pte); +} +static inline bool huge_pte_has_swap_entry(pte_t pte) +{ + return is_huge_swap_pte(pte) && !is_swap_special_pte(pte); +} + static inline void unlock_or_release_subpool(struct hugepage_subpool *spool) { bool free = (spool->count == 0) && (spool->used_hpages == 0); @@ -3710,7 +3729,7 @@ bool is_hugetlb_entry_migration(pte_t pte) { swp_entry_t swp; - if (huge_pte_none(pte) || pte_present(pte)) + if (!huge_pte_has_swap_entry(pte)) return false; swp = pte_to_swp_entry(pte); if (is_migration_entry(swp)) @@ -3723,7 +3742,7 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte) { swp_entry_t swp; - if (huge_pte_none(pte) || pte_present(pte)) + if (!huge_pte_has_swap_entry(pte)) return false; swp = pte_to_swp_entry(pte); if (is_hwpoison_entry(swp)) From patchwork Fri Jan 15 17:09:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023357 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4AED7C433E0 for ; Fri, 15 Jan 2021 17:10:19 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 011D92333E for ; Fri, 15 Jan 2021 17:10:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 011D92333E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 0512A8D01AE; Fri, 15 Jan 2021 12:10:03 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id F1C828D01B2; Fri, 15 Jan 2021 12:10:02 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E42A98D01AE; Fri, 15 Jan 2021 12:10:02 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0049.hostedemail.com [216.40.44.49]) by kanga.kvack.org (Postfix) with ESMTP id C745F8D01AE for ; Fri, 15 Jan 2021 12:10:02 -0500 (EST) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) 
by forelay03.hostedemail.com (Postfix) with ESMTP id 8464982EB541 for ; Fri, 15 Jan 2021 17:10:02 +0000 (UTC) X-FDA: 77708647044.14.ship15_150348c27531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin14.hostedemail.com (Postfix) with ESMTP id 503BC18229835 for ; Fri, 15 Jan 2021 17:10:02 +0000 (UTC) X-HE-Tag: ship15_150348c27531 X-Filterd-Recvd-Size: 5323 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by imf03.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:10:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730601; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hOQ6aKF6Uf/Sgh+ZG/y5WuLjCjtq30JXrndJQraRJDw=; b=Nb94v2dLD9fMq1a/Uxf+Cf3znwtJ+9qAdGaoc22xWI+mO1EA4i5JQJmUTPyDoEku4e9CbJ itMLeMOdNyaipZLs2GCn5rVbR8GIQ+qPkXFdaJwES9b4snJcodaSvM6B/sqRjrpCIbM4Cb 0NOXB9g1kb5KnSB54FAUuepeKLjl2X4= Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com [209.85.222.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-596-9ti0mCZ1PP2GnFMo41fqaA-1; Fri, 15 Jan 2021 12:09:57 -0500 X-MC-Unique: 9ti0mCZ1PP2GnFMo41fqaA-1 Received: by mail-qk1-f200.google.com with SMTP id b206so8603654qkc.14 for ; Fri, 15 Jan 2021 09:09:57 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hOQ6aKF6Uf/Sgh+ZG/y5WuLjCjtq30JXrndJQraRJDw=; b=t9695cDwOou8pgghAUu1Y1o2xQzrHtllSZKvq6/L3kfgO+33yp0iI4VhVCVumGwcqB tLc0wKXjoD7qw+lk+R5VWnlJjQZK6gGYrlm/Deozj5RWABmLEtEH2kNY6crZx04ubNTk lFuHc6BMGHMSpVK8N25Esva0Yi+sweia/kv/iuusf8LZEaJqNRMcRZtNb4GJF8qJorlk +j1Hkqfclzk67ldNFzluLatyl7lbP1Rc7yaBIjW40J+TuRi9esKbIH30Ha1/5W4uARD/ gmnsVU99P9tcc9JYsnSrK9/c6hUJ6Ib6GAvZj6e0UcJPxy+HxGHc9fBm9nz+slB6VFs1 5u8w== X-Gm-Message-State: AOAM532E748oLbHa6s340q9XrUztZzkm0/yZR6ZUGzn1On15LEBiGIQc 2CGOU0wONc/98y7J0fpXAyJ6dGcZXF0sGRI728nJDexJUhdnsNe+vyq2KBbaHAOk2acMU9WOY80 9x2jjxJwnjis= X-Received: by 2002:ac8:38f6:: with SMTP id g51mr12588358qtc.79.1610730596920; Fri, 15 Jan 2021 09:09:56 -0800 (PST) X-Google-Smtp-Source: ABdhPJwgQut9yjXxi5Zbbwd9SLAQdXiEzbcIQIXaZFaxeESmgqxEn0NosiOIzyybd2fS/MF9pEL6CQ== X-Received: by 2002:ac8:38f6:: with SMTP id g51mr12588327qtc.79.1610730596666; Fri, 15 Jan 2021 09:09:56 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.09.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:09:56 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . 
Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 24/30] mm/hugetlb: Move flush_hugetlb_tlb_range() into hugetlb.h Date: Fri, 15 Jan 2021 12:09:01 -0500 Message-Id: <20210115170907.24498-25-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Prepare for it to be called outside of mm/hugetlb.c. Signed-off-by: Peter Xu --- include/linux/hugetlb.h | 8 ++++++++ mm/hugetlb.c | 8 -------- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 27ada597a8e6..8841d118f45b 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -963,4 +963,12 @@ static inline bool want_pmd_share(struct vm_area_struct *vma) #endif } +#ifndef __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE +/* + * ARCHes with special requirements for evicting HUGETLB backing TLB entries can + * implement this. + */ +#define flush_hugetlb_tlb_range(vma, addr, end) flush_tlb_range(vma, addr, end) +#endif + #endif /* _LINUX_HUGETLB_H */ diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 16a07f41880e..a971513cdff6 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4948,14 +4948,6 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, return i ? i : err; } -#ifndef __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE -/* - * ARCHes with special requirements for evicting HUGETLB backing TLB entries can - * implement this. 
- */ -#define flush_hugetlb_tlb_range(vma, addr, end) flush_tlb_range(vma, addr, end) -#endif - unsigned long hugetlb_change_protection(struct vm_area_struct *vma, unsigned long address, unsigned long end, pgprot_t newprot, unsigned long cp_flags) From patchwork Fri Jan 15 17:09:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023359 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AF08BC433E0 for ; Fri, 15 Jan 2021 17:10:21 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 66CD62333E for ; Fri, 15 Jan 2021 17:10:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 66CD62333E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id BECF88D01B3; Fri, 15 Jan 2021 12:10:05 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B2C768D01B2; Fri, 15 Jan 2021 12:10:05 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9766A8D01B3; Fri, 15 Jan 2021 12:10:05 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0228.hostedemail.com [216.40.44.228]) by kanga.kvack.org (Postfix) with ESMTP id 8080A8D01B2 for ; Fri, 15 Jan 2021 12:10:05 -0500 (EST) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 46367181D6067 for ; Fri, 15 Jan 2021 17:10:05 +0000 (UTC) X-FDA: 77708647170.23.songs80_060398027531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin23.hostedemail.com (Postfix) with ESMTP id 1C83DC711C for ; Fri, 15 Jan 2021 17:10:05 +0000 (UTC) X-HE-Tag: songs80_060398027531 X-Filterd-Recvd-Size: 7093 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by imf33.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:10:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730604; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7v8QDeNoG8OU/p7s9fC8RzrSPTHJdCM/pVMqrxY8yjM=; b=b8XsYxSdE+yPX879bXMNQLV7VZFL7hd17+ocHuTVmkrjGNRlCtRrIfudujpkbzBmbc/pr2 hsmppwdxvQf82t9KLz0nClsaogT12O6xDFU7MbMA9d6nXfKSlusJ0WtAdz8MWGefiaZmz/ 0T/UctK5eHKmalOXnhVXE0cCHU47WzY= Received: from mail-qk1-f197.google.com (mail-qk1-f197.google.com [209.85.222.197]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-470-gtLgVmfONYSNm8QkeGj7sA-1; Fri, 15 Jan 2021 12:09:59 -0500 X-MC-Unique: gtLgVmfONYSNm8QkeGj7sA-1 Received: by mail-qk1-f197.google.com with SMTP id g5so8595495qke.22 for ; Fri, 15 Jan 2021 09:09:59 
-0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=7v8QDeNoG8OU/p7s9fC8RzrSPTHJdCM/pVMqrxY8yjM=; b=aJ6h1U2lMaLzLoJbyamcj4Yw9C98mwE0X22UGqsAUqVq70R/q4/9KK2ZmFAxlO2PNT jZbPEuiXj+gx/CFPSsdGtWGhv8fhbX4ArhVVeP4Hzh1ELhu5txpTMwWAdln6Z6gYWGNT r6nE7xQHVaXCcQjklBIQcqJMWEaxLaKI332op47AqjmAL/VMNK+Emw8PcYuburbolipX B6QsFfmAahx0hUK/sC8AhIiq9yejFKEdJ8TneeA8b0cVBVks1y2NEQoHLjs1Ql17nyEN Xs0LaERhJ2of0A2uP6vyOacg5rxPcXXEyjj/UU+yY/vfvOgySC6f5MEbfMTmcJnK9eI5 p/vw== X-Gm-Message-State: AOAM530W0N2E1JHurJphJARf1CzPMbd4CWgDezRtedJpHJpOxJTvJ6YW y+UgN5UMQExzucvARPecVqtcTkFpmoi+Cb6e2lSuyndNSO57L6Knn3MwXB6CZVyuWT9qL6QXhtP SCT8EHmTBHIk= X-Received: by 2002:aed:3964:: with SMTP id l91mr1920024qte.32.1610730598760; Fri, 15 Jan 2021 09:09:58 -0800 (PST) X-Google-Smtp-Source: ABdhPJxBMs/Rjdvzmsm9NCdgZQkWUAdVthi7FvD5c8VyUB4WKoACNSo+uWAgz6uRc12xEta7mwRbgA== X-Received: by 2002:aed:3964:: with SMTP id l91mr1919997qte.32.1610730598498; Fri, 15 Jan 2021 09:09:58 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.09.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:09:57 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 25/30] hugetlb/userfaultfd: Unshare all pmds for hugetlbfs when register wp Date: Fri, 15 Jan 2021 12:09:02 -0500 Message-Id: <20210115170907.24498-26-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Huge pmd sharing for hugetlbfs is racy with userfaultfd-wp because userfaultfd-wp is always based on pgtable entries, so they cannot be shared. Walk the hugetlb range and unshare all such mappings if there is, right before UFFDIO_REGISTER will succeed and return to userspace. This will pair with want_pmd_share() in hugetlb code so that huge pmd sharing is completely disabled for userfaultfd-wp registered range. Signed-off-by: Peter Xu --- fs/userfaultfd.c | 43 ++++++++++++++++++++++++++++++++++++ include/linux/mmu_notifier.h | 1 + 2 files changed, 44 insertions(+) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 3537a43b69c9..3190dff39d6c 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include #include @@ -1198,6 +1199,45 @@ static ssize_t userfaultfd_read(struct file *file, char __user *buf, } } +/* + * This function will unconditionally remove all the shared pmd pgtable entries + * within the specific vma for a hugetlbfs memory range. 
+ */ +static void hugetlb_unshare_all_pmds(struct vm_area_struct *vma) +{ + struct hstate *h = hstate_vma(vma); + unsigned long sz = huge_page_size(h); + struct mm_struct *mm = vma->vm_mm; + struct mmu_notifier_range range; + unsigned long address; + spinlock_t *ptl; + pte_t *ptep; + + /* + * No need to call adjust_range_if_pmd_sharing_possible(), because + * we're going to operate on the whole vma + */ + mmu_notifier_range_init(&range, MMU_NOTIFY_HUGETLB_UNSHARE, + 0, vma, mm, vma->vm_start, vma->vm_end); + mmu_notifier_invalidate_range_start(&range); + i_mmap_lock_write(vma->vm_file->f_mapping); + for (address = vma->vm_start; address < vma->vm_end; address += sz) { + ptep = huge_pte_offset(mm, address, sz); + if (!ptep) + continue; + ptl = huge_pte_lock(h, mm, ptep); + huge_pmd_unshare(mm, vma, &address, ptep); + spin_unlock(ptl); + } + flush_hugetlb_tlb_range(vma, vma->vm_start, vma->vm_end); + i_mmap_unlock_write(vma->vm_file->f_mapping); + /* + * No need to call mmu_notifier_invalidate_range(), see + * Documentation/vm/mmu_notifier.rst. + */ + mmu_notifier_invalidate_range_end(&range); +} + static void __wake_userfault(struct userfaultfd_ctx *ctx, struct userfaultfd_wake_range *range) { @@ -1456,6 +1496,9 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx, vma->vm_flags = new_flags; vma->vm_userfaultfd_ctx.ctx = ctx; + if (is_vm_hugetlb_page(vma) && uffd_disable_huge_pmd_share(vma)) + hugetlb_unshare_all_pmds(vma); + skip: prev = vma; start = vma->vm_end; diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h index b8200782dede..ff50c8528113 100644 --- a/include/linux/mmu_notifier.h +++ b/include/linux/mmu_notifier.h @@ -51,6 +51,7 @@ enum mmu_notifier_event { MMU_NOTIFY_SOFT_DIRTY, MMU_NOTIFY_RELEASE, MMU_NOTIFY_MIGRATE, + MMU_NOTIFY_HUGETLB_UNSHARE, }; #define MMU_NOTIFIER_RANGE_BLOCKABLE (1 << 0) From patchwork Fri Jan 15 17:09:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023365 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DC5EBC433DB for ; Fri, 15 Jan 2021 17:10:28 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 885A7238EE for ; Fri, 15 Jan 2021 17:10:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 885A7238EE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 9A0B68D01B6; Fri, 15 Jan 2021 12:10:09 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 979B38D01B2; Fri, 15 Jan 2021 12:10:09 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 817248D01B6; Fri, 15 Jan 2021 12:10:09 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0194.hostedemail.com [216.40.44.194]) by kanga.kvack.org (Postfix) with ESMTP id 
6C86D8D01B2 for ; Fri, 15 Jan 2021 12:10:09 -0500 (EST) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 2672182EB5D3 for ; Fri, 15 Jan 2021 17:10:09 +0000 (UTC) X-FDA: 77708647338.15.uncle88_3612a7827531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin15.hostedemail.com (Postfix) with ESMTP id 569DA180AD5F6 for ; Fri, 15 Jan 2021 17:10:08 +0000 (UTC) X-HE-Tag: uncle88_3612a7827531 X-Filterd-Recvd-Size: 9460 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf44.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:10:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730607; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sMVLMMoC7/lVRGIgCdIMxtKXCeDrCmUB3i6hisp0Dww=; b=aFOBzwodlVCuIskLaC7ZC0huU/Udk51i90RW5BSktbCafosjQb5SWOlbIz+XlmsnFNGNVv jRdcs483SRg5wHgrZPbpI/9WRGZKuEB6CCm4c13yhV46ej8iYIGZJLEPgH/k7SPxZQQgku brQDzM5Jus+8qakBK+UbqcVQ8gnLBZg= Received: from mail-qv1-f70.google.com (mail-qv1-f70.google.com [209.85.219.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-523-5JbdgA9sN8ipWJPZq-lswA-1; Fri, 15 Jan 2021 12:10:06 -0500 X-MC-Unique: 5JbdgA9sN8ipWJPZq-lswA-1 Received: by mail-qv1-f70.google.com with SMTP id cc1so8278399qvb.3 for ; Fri, 15 Jan 2021 09:10:05 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=sMVLMMoC7/lVRGIgCdIMxtKXCeDrCmUB3i6hisp0Dww=; b=tFkOZtAMh4uK0kDZJPCWIGIKJFtaom7AnDuTSW4bqw8xyztPgz37fWwLtC037RgNfo OwcsCgfnYlx7gIKZ8ixz2Fnews6KsmK6AJOD1CnOdyfoK/qo7XD5cXCZybRw7+bAtbil vBp3zAAMW0Dx+jwuARvX1KiZdEevvJilj4bw44aujKelyd76gmmub4HSpZh9ywgC0bNY iXbGMVU4+r7XAGqGdsVTmeUDg8eI1rtkaCv0S5U0Q9SS//Nt4b+J7Ej0hdfgvIwZbv+X 40vn022MfX0A4091CwRnznI/0PCFHaw1zJemxR8HV9ZuIPFxAoI0yZJsQ8SRF20tcNoq za7Q== X-Gm-Message-State: AOAM531oTlm2mzxUQxdxCgD/nC5AnW/3WW8zQtppxSUqQrb0retQxS8Y mO8nUY3s1j+lSkzvgLhbIWL7RdYAigAtH34SwHSogmNYN269tn/t4AjppCCzXvChgMiqkYsEJNg 7o+MNzBRqxNg= X-Received: by 2002:a37:aa57:: with SMTP id t84mr13907624qke.348.1610730605445; Fri, 15 Jan 2021 09:10:05 -0800 (PST) X-Google-Smtp-Source: ABdhPJxpVqmDPrezOxqwa3bnjl7O2ntRnlxdZLk3dna2VNrZfC5BdmyIlUWV+uAZwcf84SZLYFbF3g== X-Received: by 2002:a37:aa57:: with SMTP id t84mr13907135qke.348.1610730600356; Fri, 15 Jan 2021 09:10:00 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.09.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:09:59 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . 
Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 26/30] hugetlb/userfaultfd: Handle uffd-wp special pte in hugetlb pf handler Date: Fri, 15 Jan 2021 12:09:03 -0500 Message-Id: <20210115170907.24498-27-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Teach the hugetlb page fault code to understand uffd-wp special pte. For example, when seeing such a pte we need to convert any write fault into a read one (which is fake - we'll retry the write later if so). Meanwhile, for handle_userfault() we'll need to make sure we must wait for the special swap pte too just like a none pte. Note that we also need to teach UFFDIO_COPY about this special pte across the code path so that we can safely install a new page at this special pte as long as we know it's a stall entry. Signed-off-by: Peter Xu --- fs/userfaultfd.c | 5 ++++- mm/hugetlb.c | 35 ++++++++++++++++++++++++++++------- mm/userfaultfd.c | 3 ++- 3 files changed, 34 insertions(+), 9 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 3190dff39d6c..3264ec46242b 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -248,8 +248,11 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx, /* * Lockless access: we're in a wait_event so it's ok if it * changes under us. + * + * Regarding uffd-wp special case, please refer to comments in + * userfaultfd_must_wait(). 
*/ - if (huge_pte_none(pte)) + if (huge_pte_none(pte) || pte_swp_uffd_wp_special(pte)) ret = true; if (!huge_pte_write(pte) && (reason & VM_UFFD_WP)) ret = true; diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a971513cdff6..9b9f71ec30e1 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4253,7 +4253,8 @@ int huge_add_to_page_cache(struct page *page, struct address_space *mapping, static vm_fault_t hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma, struct address_space *mapping, pgoff_t idx, - unsigned long address, pte_t *ptep, unsigned int flags) + unsigned long address, pte_t *ptep, + pte_t old_pte, unsigned int flags) { struct hstate *h = hstate_vma(vma); vm_fault_t ret = VM_FAULT_SIGBUS; @@ -4297,6 +4298,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, .vma = vma, .address = haddr, .flags = flags, + .orig_pte = old_pte, /* * Hard to debug if it ends up being * used by a callee that assumes @@ -4394,7 +4396,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, ptl = huge_pte_lock(h, mm, ptep); ret = 0; - if (!huge_pte_none(huge_ptep_get(ptep))) + if (!pte_same(huge_ptep_get(ptep), old_pte)) goto backout; if (anon_rmap) { @@ -4404,6 +4406,11 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, page_dup_rmap(page, true); new_pte = make_huge_pte(vma, page, ((vma->vm_flags & VM_WRITE) && (vma->vm_flags & VM_SHARED))); + if (unlikely(flags & FAULT_FLAG_UFFD_WP)) { + WARN_ON_ONCE(flags & FAULT_FLAG_WRITE); + /* We should have the write bit cleared already, but be safe */ + new_pte = huge_pte_wrprotect(huge_pte_mkuffd_wp(new_pte)); + } set_huge_pte_at(mm, haddr, ptep, new_pte); hugetlb_count_add(pages_per_huge_page(h), mm); @@ -4485,9 +4492,16 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, if (unlikely(is_hugetlb_entry_migration(entry))) { migration_entry_wait_huge(vma, mm, ptep); return 0; - } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) + } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) { return VM_FAULT_HWPOISON_LARGE | VM_FAULT_SET_HINDEX(hstate_index(h)); + } else if (unlikely(is_swap_special_pte(entry))) { + /* Must be a uffd-wp special swap pte */ + WARN_ON_ONCE(!pte_swp_uffd_wp_special(entry)); + flags |= FAULT_FLAG_UFFD_WP; + /* Emulate a read fault */ + flags &= ~FAULT_FLAG_WRITE; + } } /* @@ -4519,8 +4533,13 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, mutex_lock(&hugetlb_fault_mutex_table[hash]); entry = huge_ptep_get(ptep); - if (huge_pte_none(entry)) { - ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep, flags); + /* + * FAULT_FLAG_UFFD_WP should be handled merely the same as pte none + * because it's basically a none pte with a special marker + */ + if (huge_pte_none(entry) || pte_swp_uffd_wp_special(entry)) { + ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep, + entry, flags); goto out_mutex; } @@ -4651,7 +4670,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, unsigned long size; int vm_shared = dst_vma->vm_flags & VM_SHARED; struct hstate *h = hstate_vma(dst_vma); - pte_t _dst_pte; + pte_t _dst_pte, cur_pte; spinlock_t *ptl; int ret; struct page *page; @@ -4725,8 +4744,10 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, if (idx >= size) goto out_release_unlock; + cur_pte = huge_ptep_get(dst_pte); ret = -EEXIST; - if (!huge_pte_none(huge_ptep_get(dst_pte))) + /* Please refer to shmem_mfill_atomic_pte() for uffd-wp special case */ + if (!huge_pte_none(cur_pte) && !pte_swp_uffd_wp_special(cur_pte)) goto out_release_unlock; if 
(vm_shared) { diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 3d49b888e3e8..1dff5b9a2c26 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -300,7 +300,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, err = -EEXIST; dst_pteval = huge_ptep_get(dst_pte); - if (!huge_pte_none(dst_pteval)) { + if (!huge_pte_none(dst_pteval) && + !pte_swp_uffd_wp_special(dst_pteval)) { mutex_unlock(&hugetlb_fault_mutex_table[hash]); i_mmap_unlock_read(mapping); goto out_unlock; From patchwork Fri Jan 15 17:09:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023361 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E40EC433E0 for ; Fri, 15 Jan 2021 17:10:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BFE9A238EE for ; Fri, 15 Jan 2021 17:10:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org BFE9A238EE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 694268D01B4; Fri, 15 Jan 2021 12:10:08 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 645248D01B2; Fri, 15 Jan 2021 12:10:08 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 497208D01B4; Fri, 15 Jan 2021 12:10:08 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0201.hostedemail.com [216.40.44.201]) by kanga.kvack.org (Postfix) with ESMTP id 2937B8D01B2 for ; Fri, 15 Jan 2021 12:10:08 -0500 (EST) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id BDA94180AD5F6 for ; Fri, 15 Jan 2021 17:10:07 +0000 (UTC) X-FDA: 77708647254.02.kite05_120a17427531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin02.hostedemail.com (Postfix) with ESMTP id 97BF9100DAA0A for ; Fri, 15 Jan 2021 17:10:07 +0000 (UTC) X-HE-Tag: kite05_120a17427531 X-Filterd-Recvd-Size: 6772 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by imf33.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:10:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730606; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yrCU019uaMiIX4Y0l4Cvu/CTWpK4vi96Wr+NfUKCM+Y=; b=hD4vKDgTCZ+isd99xziIvVevTlZ/dxc/aT9L4yBHCOftN4acyXwF9Lljx3K4ySgEt/lpxl xlP0J1hXDQvcUVgBIJOm4YnJrialO4LKmkjEtLGgmtTX3fcI3RgKIt+xGTDMBh2q4KVAr9 ccwlsTQ+mzA/PfyiyAqYYNG7klpwrLo= Received: from mail-qt1-f199.google.com (mail-qt1-f199.google.com 
[209.85.160.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-388-vqs9zEOwN2CiT5kG31YSYg-1; Fri, 15 Jan 2021 12:10:03 -0500 X-MC-Unique: vqs9zEOwN2CiT5kG31YSYg-1 Received: by mail-qt1-f199.google.com with SMTP id h7so7860520qtn.21 for ; Fri, 15 Jan 2021 09:10:03 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=yrCU019uaMiIX4Y0l4Cvu/CTWpK4vi96Wr+NfUKCM+Y=; b=stOIr3q01jfxndNxYkuLEzHBihzDqbVoisnnPrnxBVIUHkvEFR1nBOK1U+MHxWN/Pp 4WYdmgr4K7uXiWJWsRLRTQglaLQHPBIozWDulvRFyQwIQcTeDUSt1hOiAWduUrZ4ZAo/ cxjNjDbBa0vxiQH98mCQHwGfbQL6ym7ILdaDpplXszwy8DFlQCmxHfreHN1wcGkaC91m nyt7zNkX2SVQvoUbgU7AHuX9YeVrZfaPVAKW3Gx6vVyG0XRF5x/niYmCkbXe1VmoRIye WeCeaNwjWAZ2g4Eq/vnGTXyCw+K7/TJTveP6bF+esJEKRdteJsKWrWGL/Gogv7Fz5Fq1 nrrA== X-Gm-Message-State: AOAM533TnrDoPwbm3BqMXjixeMqE4kk2mo163eC9JIdr0d3vI/MojGQR yDxCmlX+5zqb4/TGKeTEGh94iuDPM+ess/J2TW9ZMulLVQe/sh9bstABAl/h0RRs3BWeB4c5zjQ eMXcvGObjvdQ= X-Received: by 2002:a05:6214:4e2:: with SMTP id cl2mr12899053qvb.27.1610730602575; Fri, 15 Jan 2021 09:10:02 -0800 (PST) X-Google-Smtp-Source: ABdhPJwDGxBmzq+M9a/zQLYdo9stwTA7YMDDZpLptv18Gfz7OBlvn0/azI9YdYGAEij/PUHl4Ymv6w== X-Received: by 2002:a05:6214:4e2:: with SMTP id cl2mr12899037qvb.27.1610730602378; Fri, 15 Jan 2021 09:10:02 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.10.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:10:01 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 27/30] hugetlb/userfaultfd: Allow wr-protect none ptes Date: Fri, 15 Jan 2021 12:09:04 -0500 Message-Id: <20210115170907.24498-28-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Teach hugetlbfs code to wr-protect none ptes just in case the page cache existed for that pte. Meanwhile we also need to be able to recognize a uffd-wp marker pte and remove it for uffd_wp_resolve. Since at it, introduce a variable "psize" to replace all references to the huge page size fetcher. 
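For illustration, the per-pte decision this change adds to hugetlb_change_protection() boils down to the sketch below. Names follow this series (is_swap_special_pte(), pte_swp_mkuffd_wp_special(), etc.); it is a condensed view of the hunk that follows, not the literal kernel code:

static void hugetlb_wp_one_pte(struct vm_area_struct *vma, struct mm_struct *mm,
			       unsigned long address, pte_t *ptep,
			       unsigned long psize, bool uffd_wp,
			       bool uffd_wp_resolve)
{
	pte_t pte = huge_ptep_get(ptep);

	if (is_swap_special_pte(pte)) {
		/* A uffd-wp marker pte: only drop it when resolving wr-protect */
		if (uffd_wp_resolve)
			huge_pte_clear(mm, address, ptep, psize);
	} else if (huge_pte_none(pte) && uffd_wp) {
		/*
		 * None pte: install the marker so the page cache cannot be
		 * faulted in writable behind userfaultfd's back.
		 */
		set_huge_pte_at(mm, address, ptep,
				pte_swp_mkuffd_wp_special(vma));
	}
	/* Present ptes keep the huge_ptep_modify_prot_start/commit path */
}
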
Signed-off-by: Peter Xu --- mm/hugetlb.c | 29 +++++++++++++++++++++++++---- 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 9b9f71ec30e1..7959fb4b1633 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4978,7 +4978,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, pte_t *ptep; pte_t pte; struct hstate *h = hstate_vma(vma); - unsigned long pages = 0; + unsigned long pages = 0, psize = huge_page_size(h); bool shared_pmd = false; struct mmu_notifier_range range; bool uffd_wp = cp_flags & MM_CP_UFFD_WP; @@ -4998,13 +4998,19 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, mmu_notifier_invalidate_range_start(&range); i_mmap_lock_write(vma->vm_file->f_mapping); - for (; address < end; address += huge_page_size(h)) { + for (; address < end; address += psize) { spinlock_t *ptl; - ptep = huge_pte_offset(mm, address, huge_page_size(h)); + ptep = huge_pte_offset(mm, address, psize); if (!ptep) continue; ptl = huge_pte_lock(h, mm, ptep); if (huge_pmd_unshare(mm, vma, &address, ptep)) { + /* + * When uffd-wp is enabled on the vma, unshare + * shouldn't happen at all. Warn about it if it + * happened due to some reason. + */ + WARN_ON_ONCE(uffd_wp || uffd_wp_resolve); pages++; spin_unlock(ptl); shared_pmd = true; @@ -5028,12 +5034,21 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, else if (uffd_wp_resolve) newpte = pte_swp_clear_uffd_wp(newpte); set_huge_swap_pte_at(mm, address, ptep, - newpte, huge_page_size(h)); + newpte, psize); pages++; } spin_unlock(ptl); continue; } + if (unlikely(is_swap_special_pte(pte))) { + WARN_ON_ONCE(!pte_swp_uffd_wp_special(pte)); + /* + * This is changing a non-present pte into a none pte, + * no need for huge_ptep_modify_prot_start/commit(). + */ + if (uffd_wp_resolve) + huge_pte_clear(mm, address, ptep, psize); + } if (!huge_pte_none(pte)) { pte_t old_pte; @@ -5046,6 +5061,12 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, pte = huge_pte_clear_uffd_wp(pte); huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte); pages++; + } else { + /* None pte */ + if (unlikely(uffd_wp)) + /* Safe to modify directly (none->non-present). 
*/ + set_huge_pte_at(mm, address, ptep, + pte_swp_mkuffd_wp_special(vma)); } spin_unlock(ptl); } From patchwork Fri Jan 15 17:09:05 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023363 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 542A6C433DB for ; Fri, 15 Jan 2021 17:10:26 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 05736238EE for ; Fri, 15 Jan 2021 17:10:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 05736238EE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id EAA7A8D01B5; Fri, 15 Jan 2021 12:10:08 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E08B08D01B2; Fri, 15 Jan 2021 12:10:08 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C84288D01B5; Fri, 15 Jan 2021 12:10:08 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0092.hostedemail.com [216.40.44.92]) by kanga.kvack.org (Postfix) with ESMTP id B1EDB8D01B2 for ; Fri, 15 Jan 2021 12:10:08 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id E87C08E75 for ; Fri, 15 Jan 2021 17:10:07 +0000 (UTC) X-FDA: 77708647254.13.spoon82_5d175d527531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin13.hostedemail.com (Postfix) with ESMTP id C0D49181D02AD for ; Fri, 15 Jan 2021 17:10:07 +0000 (UTC) X-HE-Tag: spoon82_5d175d527531 X-Filterd-Recvd-Size: 11949 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf01.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:10:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730606; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=GCz2Z9Zgad3gqL1m2VPHTbAcPly4nG/hCXs5cYStaH0=; b=UAbfGnIhThxCZVE3B4ufHxJEBiyKHpsCzEkX+D253eFEJ++qubPA/uHkT1Uxqu/jKubfnY WaXMCid+87+7bzph7qI/BDo0HYlSojuhiOgfl4aPIgwXHav9V+zRAjnrvsnyk1SDY9CNFE Z2bg86yR6IlfO7RTS9YxDCT3Cl98Pq4= Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com [209.85.222.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-443-Fv_ftd7iMeavvr_Hf3ypag-1; Fri, 15 Jan 2021 12:10:05 -0500 X-MC-Unique: Fv_ftd7iMeavvr_Hf3ypag-1 Received: by mail-qk1-f200.google.com with SMTP id l138so8617689qke.4 for ; Fri, 15 Jan 2021 09:10:04 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=GCz2Z9Zgad3gqL1m2VPHTbAcPly4nG/hCXs5cYStaH0=; b=UzCrN4OsuzQoflptN6ghdH6fm6MvMHgbO8Jue3tBx4E/HiObf/wL6me72VrgY0rq+2 fKQNqOTvQGFIe5bkBIQABg4aEnIeEWcPCgb1OYWona/eIaKNcwwCOp7tqTht9vVeyy8j 7lgvpRHeJZOu8//OA8iGSuX8p7AFE19neBPRUF6EJRoWgesafdjTzIfoeqZ0QMlDmhRf 0N0MODsF1LxuE7wc8i52mHz5ARGvddLQWqRVkL+nBgqG8KWDqwkyUr1Vob2z1wchnivg G6z3n0TFCXGGErfapVZAmtb7aOy6HpQl1Lk1oTaGVNqqdssll4g64trjDiBjd7ae35Gy hzWg== X-Gm-Message-State: AOAM531kiJ2L48+mEd733+swmOQ6XJc7mNto8a9Y3SbaKU9RfApWvTbq RB4ZFtX5RissOiDuGVQcfAnixeotw3AdA0hAULU9sCWwMhyLFsP5GgNLaq6wrqcjPRo7aaMEJhb 2TUrXjur/WL4= X-Received: by 2002:a37:c0b:: with SMTP id 11mr13568851qkm.32.1610730604178; Fri, 15 Jan 2021 09:10:04 -0800 (PST) X-Google-Smtp-Source: ABdhPJxR8gnzle+XvdiEUHLdFl5e93T23UAPGki6VkBtygLlYVM2vhnQhPTVW8Iyu6c62gDqNjWM3w== X-Received: by 2002:a37:c0b:: with SMTP id 11mr13568805qkm.32.1610730603894; Fri, 15 Jan 2021 09:10:03 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.10.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:10:03 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 28/30] hugetlb/userfaultfd: Only drop uffd-wp special pte if required Date: Fri, 15 Jan 2021 12:09:05 -0500 Message-Id: <20210115170907.24498-29-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Just like what we've done with shmem uffd-wp special ptes, we shouldn't drop uffd-wp special swap pte for hugetlb too, only if we're going to unmap the whole vma, or we're punching a hole with safe locks held. For example, remove_inode_hugepages() is safe to drop uffd-wp ptes, because it has taken hugetlb fault mutex so that no concurrent page fault would trigger. While the call to hugetlb_vmdelete_list() in hugetlbfs_punch_hole() is not safe. That's why the previous call will be with ZAP_FLAG_DROP_FILE_UFFD_WP, while the latter one won't be able to. 
Signed-off-by: Peter Xu --- fs/hugetlbfs/inode.c | 15 +++++++++------ include/linux/hugetlb.h | 13 ++++++++----- mm/hugetlb.c | 27 +++++++++++++++++++++------ mm/memory.c | 5 ++++- 4 files changed, 42 insertions(+), 18 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index b5c109703daa..f9ff2ba5e47b 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -399,7 +399,8 @@ static void remove_huge_page(struct page *page) } static void -hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end) +hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end, + unsigned long zap_flags) { struct vm_area_struct *vma; @@ -432,7 +433,7 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end) } unmap_hugepage_range(vma, vma->vm_start + v_offset, v_end, - NULL); + NULL, zap_flags); } } @@ -513,7 +514,8 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart, mutex_lock(&hugetlb_fault_mutex_table[hash]); hugetlb_vmdelete_list(&mapping->i_mmap, index * pages_per_huge_page(h), - (index + 1) * pages_per_huge_page(h)); + (index + 1) * pages_per_huge_page(h), + ZAP_FLAG_DROP_FILE_UFFD_WP); i_mmap_unlock_write(mapping); } @@ -579,7 +581,8 @@ static int hugetlb_vmtruncate(struct inode *inode, loff_t offset) i_mmap_lock_write(mapping); i_size_write(inode, offset); if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root)) - hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0); + hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0, + ZAP_FLAG_DROP_FILE_UFFD_WP); i_mmap_unlock_write(mapping); remove_inode_hugepages(inode, offset, LLONG_MAX); return 0; @@ -613,8 +616,8 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len) i_mmap_lock_write(mapping); if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root)) hugetlb_vmdelete_list(&mapping->i_mmap, - hole_start >> PAGE_SHIFT, - hole_end >> PAGE_SHIFT); + hole_start >> PAGE_SHIFT, + hole_end >> PAGE_SHIFT, 0); i_mmap_unlock_write(mapping); remove_inode_hugepages(inode, hole_start, hole_end); inode_unlock(inode); diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 8841d118f45b..93f3c46439b2 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -121,14 +121,15 @@ long follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *, unsigned long *, unsigned long *, long, unsigned int, int *); void unmap_hugepage_range(struct vm_area_struct *, - unsigned long, unsigned long, struct page *); + unsigned long, unsigned long, struct page *, + unsigned long); void __unmap_hugepage_range_final(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start, unsigned long end, - struct page *ref_page); + struct page *ref_page, unsigned long zap_flags); void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start, unsigned long end, - struct page *ref_page); + struct page *ref_page, unsigned long zap_flags); void hugetlb_report_meminfo(struct seq_file *); int hugetlb_report_node_meminfo(char *buf, int len, int nid); void hugetlb_show_meminfo(void); @@ -353,14 +354,16 @@ static inline unsigned long hugetlb_change_protection( static inline void __unmap_hugepage_range_final(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start, - unsigned long end, struct page *ref_page) + unsigned long end, struct page *ref_page, + unsigned long zap_flags) { BUG(); } static inline void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start, - unsigned long 
end, struct page *ref_page) + unsigned long end, struct page *ref_page, + unsigned long zap_flags) { BUG(); } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 7959fb4b1633..731a26617673 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3864,7 +3864,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start, unsigned long end, - struct page *ref_page) + struct page *ref_page, unsigned long zap_flags) { struct mm_struct *mm = vma->vm_mm; unsigned long address; @@ -3916,6 +3916,19 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, continue; } + if (unlikely(is_swap_special_pte(pte))) { + WARN_ON_ONCE(!pte_swp_uffd_wp_special(pte)); + /* + * Only drop the special swap uffd-wp pte if + * e.g. unmapping a vma or punching a hole (with proper + * lock held so that concurrent page fault won't happen). + */ + if (zap_flags & ZAP_FLAG_DROP_FILE_UFFD_WP) + huge_pte_clear(mm, address, ptep, sz); + spin_unlock(ptl); + continue; + } + /* * Migrating hugepage or HWPoisoned hugepage is already * unmapped and its refcount is dropped, so just clear pte here. @@ -3967,9 +3980,10 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, void __unmap_hugepage_range_final(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start, - unsigned long end, struct page *ref_page) + unsigned long end, struct page *ref_page, + unsigned long zap_flags) { - __unmap_hugepage_range(tlb, vma, start, end, ref_page); + __unmap_hugepage_range(tlb, vma, start, end, ref_page, zap_flags); /* * Clear this flag so that x86's huge_pmd_share page_table_shareable @@ -3985,7 +3999,8 @@ void __unmap_hugepage_range_final(struct mmu_gather *tlb, } void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start, - unsigned long end, struct page *ref_page) + unsigned long end, struct page *ref_page, + unsigned long zap_flags) { struct mm_struct *mm; struct mmu_gather tlb; @@ -4004,7 +4019,7 @@ void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start, mm = vma->vm_mm; tlb_gather_mmu(&tlb, mm, tlb_start, tlb_end); - __unmap_hugepage_range(&tlb, vma, start, end, ref_page); + __unmap_hugepage_range(&tlb, vma, start, end, ref_page, zap_flags); tlb_finish_mmu(&tlb, tlb_start, tlb_end); } @@ -4059,7 +4074,7 @@ static void unmap_ref_private(struct mm_struct *mm, struct vm_area_struct *vma, */ if (!is_vma_resv_set(iter_vma, HPAGE_RESV_OWNER)) unmap_hugepage_range(iter_vma, address, - address + huge_page_size(h), page); + address + huge_page_size(h), page, 0); } i_mmap_unlock_write(mapping); } diff --git a/mm/memory.c b/mm/memory.c index 59d56f57ba2c..993ec7a7961a 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1499,8 +1499,11 @@ static void unmap_single_vma(struct mmu_gather *tlb, * safe to do nothing in this case. */ if (vma->vm_file) { + unsigned long zap_flags = details ? 
+ details->zap_flags : 0; i_mmap_lock_write(vma->vm_file->f_mapping); - __unmap_hugepage_range_final(tlb, vma, start, end, NULL); + __unmap_hugepage_range_final(tlb, vma, start, end, + NULL, zap_flags); i_mmap_unlock_write(vma->vm_file->f_mapping); } } else From patchwork Fri Jan 15 17:09:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023367 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA3A5C433DB for ; Fri, 15 Jan 2021 17:10:31 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6D5D32333E for ; Fri, 15 Jan 2021 17:10:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6D5D32333E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 030FD8D01B7; Fri, 15 Jan 2021 12:10:11 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id EABC18D01B2; Fri, 15 Jan 2021 12:10:10 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CD8228D01B7; Fri, 15 Jan 2021 12:10:10 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0089.hostedemail.com [216.40.44.89]) by kanga.kvack.org (Postfix) with ESMTP id B56518D01B2 for ; Fri, 15 Jan 2021 12:10:10 -0500 (EST) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 7D33882EB57E for ; Fri, 15 Jan 2021 17:10:10 +0000 (UTC) X-FDA: 77708647380.16.waste58_2a06ce027531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin16.hostedemail.com (Postfix) with ESMTP id 5A49A100E753F for ; Fri, 15 Jan 2021 17:10:10 +0000 (UTC) X-HE-Tag: waste58_2a06ce027531 X-Filterd-Recvd-Size: 8824 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf28.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:10:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730609; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Qc9c/gofLjGafrTcbLQ3n1guQABnxcjbcx/uBfZWm9I=; b=VryPhsAmlv3rEaDpOKdZH7pDUrhKfQ4blUb6dcxlhrMs08fupzAMfZ1AyvGbzMtr5b6Z7u FNXDGSh1+aA/y88Km1CZB2IlVh059F2pHd3wLhBZS4bGp0G8UneCkRTOLV2au6VPwiOVOe EpcYfHBr6IkgXRwh0TtEKvRW5Ix+ytU= Received: from mail-qt1-f197.google.com (mail-qt1-f197.google.com [209.85.160.197]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-277-1P9Ry2VfOvmblORL40_60w-1; Fri, 15 Jan 2021 12:10:08 -0500 X-MC-Unique: 1P9Ry2VfOvmblORL40_60w-1 Received: by mail-qt1-f197.google.com with SMTP id t7so7868919qtn.19 for ; Fri, 15 Jan 2021 
09:10:07 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Qc9c/gofLjGafrTcbLQ3n1guQABnxcjbcx/uBfZWm9I=; b=ddwWDUtb9t3Aqa94JSXKmNbE+cvIEjf0mWlP6ra9fHrHrbVr6PzDFPMWNj58GmJRqT qD10hxvV5dTc3BwN0LrbZ+lDKwDFyKabkxQxcFhFnGEJgee0B9LqdRJeHxgiIyfOR9AE F9MUYk9KNQ4KbYmn5iKmoQgQhfZxKGSMU5u6KVEFIFo2i+Dl3mw39NryWP6zHnqOB/IA tyww4FhMWoHyldv+/y4ZeJj8116xq1Kwah3nk8o43yIb72Fh3UkZm8yB1Szb90TopixA RQzikd7MALXLl9bH7Q8Z1IMO5/bgtclLmSo5FNBZF/VGtqjiM4PuusGKHt2oRC2n0dxi nOOA== X-Gm-Message-State: AOAM530S2l3+Daf5UmeWwDFE9bkg+DthKWiTeKhxfsbUme7ivREBACiv qqbZNMs1LKOTGov2ckK2XHXj2cy/KODRPmUBWbl6jGKOjAdk3HSnbikhTC0d9D+Nv8XkNbDKAXn aLjCMLguNPbQ= X-Received: by 2002:ac8:4553:: with SMTP id z19mr12589521qtn.278.1610730605743; Fri, 15 Jan 2021 09:10:05 -0800 (PST) X-Google-Smtp-Source: ABdhPJxoMgM51c/GHRmaGeJ+Zy2tSzT8SlyOfu7E1vA+zEc9W0Bb9NoJGtjyJUxMHvZEaMYWNpZWYw== X-Received: by 2002:ac8:4553:: with SMTP id z19mr12589487qtn.278.1610730605474; Fri, 15 Jan 2021 09:10:05 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id d123sm5187840qke.95.2021.01.15.09.10.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:10:04 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Mike Kravetz , peterx@redhat.com, Jerome Glisse , "Kirill A . Shutemov" , Hugh Dickins , Axel Rasmussen , Matthew Wilcox , Andrew Morton , Andrea Arcangeli , Nadav Amit Subject: [PATCH RFC 29/30] userfaultfd: Enable write protection for shmem & hugetlbfs Date: Fri, 15 Jan 2021 12:09:06 -0500 Message-Id: <20210115170907.24498-30-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: We've had all the necessary changes ready for both shmem and hugetlbfs. Turn on all the shmem/hugetlbfs switches for userfaultfd-wp. Now we can remove the flags parameter for vma_can_userfault() since not used any more. Meanwhile, we can expand UFFD_API_RANGE_IOCTLS_BASIC with _UFFDIO_WRITEPROTECT too because all existing types now support write protection mode. Since vma_can_userfault() will be used elsewhere, move into userfaultfd_k.h. 
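To show what this enables from userspace, here is a minimal sketch (error handling trimmed; assumes a kernel with this series applied and a page-aligned MAP_SHARED hugetlbfs or shmem mapping) that registers the range in wp mode and then write-protects it; the wp_range() helper is purely illustrative:

#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/userfaultfd.h>

static int wp_range(void *addr, size_t len)
{
	int uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	struct uffdio_api api = { .api = UFFD_API };
	struct uffdio_register reg = {
		.range = { .start = (unsigned long)addr, .len = len },
		.mode  = UFFDIO_REGISTER_MODE_WP,
	};
	struct uffdio_writeprotect wp = {
		.range = { .start = (unsigned long)addr, .len = len },
		.mode  = UFFDIO_WRITEPROTECT_MODE_WP,
	};

	if (uffd < 0 || ioctl(uffd, UFFDIO_API, &api))
		return -1;
	if (ioctl(uffd, UFFDIO_REGISTER, &reg))
		return -1;
	/* With this patch the wp ioctl is advertised for hugetlbfs/shmem too */
	if (!(reg.ioctls & ((__u64)1 << _UFFDIO_WRITEPROTECT)))
		return -1;
	return ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
}
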
Signed-off-by: Peter Xu --- fs/userfaultfd.c | 17 ++++------------- include/linux/userfaultfd_k.h | 7 +++++++ include/uapi/linux/userfaultfd.h | 3 ++- mm/userfaultfd.c | 10 +++------- 4 files changed, 16 insertions(+), 21 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 3264ec46242b..88ad90fc8539 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1307,15 +1307,6 @@ static __always_inline int validate_range(struct mm_struct *mm, return 0; } -static inline bool vma_can_userfault(struct vm_area_struct *vma, - unsigned long vm_flags) -{ - /* FIXME: add WP support to hugetlbfs and shmem */ - return vma_is_anonymous(vma) || - ((is_vm_hugetlb_page(vma) || vma_is_shmem(vma)) && - !(vm_flags & VM_UFFD_WP)); -} - static int userfaultfd_register(struct userfaultfd_ctx *ctx, unsigned long arg) { @@ -1394,7 +1385,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx, /* check not compatible vmas */ ret = -EINVAL; - if (!vma_can_userfault(cur, vm_flags)) + if (!vma_can_userfault(cur)) goto out_unlock; /* @@ -1453,7 +1444,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx, do { cond_resched(); - BUG_ON(!vma_can_userfault(vma, vm_flags)); + BUG_ON(!vma_can_userfault(vma)); BUG_ON(vma->vm_userfaultfd_ctx.ctx && vma->vm_userfaultfd_ctx.ctx != ctx); WARN_ON(!(vma->vm_flags & VM_MAYWRITE)); @@ -1602,7 +1593,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, * provides for more strict behavior to notice * unregistration errors. */ - if (!vma_can_userfault(cur, cur->vm_flags)) + if (!vma_can_userfault(cur)) goto out_unlock; found = true; @@ -1616,7 +1607,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, do { cond_resched(); - BUG_ON(!vma_can_userfault(vma, vma->vm_flags)); + BUG_ON(!vma_can_userfault(vma)); /* * Nothing to do: this vma is already registered into this diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 7d14444862d4..fd7031173949 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -16,6 +16,7 @@ #include #include #include +#include /* * CAREFUL: Check include/uapi/asm-generic/fcntl.h when defining @@ -88,6 +89,12 @@ static inline bool userfaultfd_armed(struct vm_area_struct *vma) return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP); } +static inline bool vma_can_userfault(struct vm_area_struct *vma) +{ + return vma_is_anonymous(vma) || vma_is_shmem(vma) || + is_vm_hugetlb_page(vma); +} + extern int dup_userfaultfd(struct vm_area_struct *, struct list_head *); extern void dup_userfaultfd_complete(struct list_head *); diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index e7e98bde221f..83bcd739de50 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -39,7 +39,8 @@ (__u64)1 << _UFFDIO_WRITEPROTECT) #define UFFD_API_RANGE_IOCTLS_BASIC \ ((__u64)1 << _UFFDIO_WAKE | \ - (__u64)1 << _UFFDIO_COPY) + (__u64)1 << _UFFDIO_COPY | \ + (__u64)1 << _UFFDIO_WRITEPROTECT) /* * Valid ioctl command number range with this API is from 0x00 to diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 1dff5b9a2c26..3ad52f01553b 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -445,7 +445,6 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm, err = mfill_zeropage_pte(dst_mm, dst_pmd, dst_vma, dst_addr); } else { - VM_WARN_ON_ONCE(wp_copy); if (!zeropage) err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr, @@ -671,15 +670,12 @@ int mwriteprotect_range(struct mm_struct 
*dst_mm, unsigned long start, err = -ENOENT; dst_vma = find_dst_vma(dst_mm, start, len); - /* - * Make sure the vma is not shared, that the dst range is - * both valid and fully within a single existing vma. - */ - if (!dst_vma || (dst_vma->vm_flags & VM_SHARED)) + + if (!dst_vma) goto out_unlock; if (!userfaultfd_wp(dst_vma)) goto out_unlock; - if (!vma_is_anonymous(dst_vma)) + if (!vma_can_userfault(dst_vma)) goto out_unlock; if (is_vm_hugetlb_page(dst_vma)) { From patchwork Fri Jan 15 17:12:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12023373 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 423C8C433DB for ; Fri, 15 Jan 2021 17:12:35 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E87F2238EE for ; Fri, 15 Jan 2021 17:12:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E87F2238EE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 836F78D01B8; Fri, 15 Jan 2021 12:12:34 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 7E65F8D01B2; Fri, 15 Jan 2021 12:12:34 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6FCD78D01B8; Fri, 15 Jan 2021 12:12:34 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0018.hostedemail.com [216.40.44.18]) by kanga.kvack.org (Postfix) with ESMTP id 57BF58D01B2 for ; Fri, 15 Jan 2021 12:12:34 -0500 (EST) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 1DD1D1812A458 for ; Fri, 15 Jan 2021 17:12:34 +0000 (UTC) X-FDA: 77708653428.23.patch36_161369627531 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin23.hostedemail.com (Postfix) with ESMTP id F3C4C37621 for ; Fri, 15 Jan 2021 17:12:33 +0000 (UTC) X-HE-Tag: patch36_161369627531 X-Filterd-Recvd-Size: 6238 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf10.hostedemail.com (Postfix) with ESMTP for ; Fri, 15 Jan 2021 17:12:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1610730752; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tEf8T0Go1KzAUeqBOwWH7qIZostwBrdc7d1MGMOL0rE=; b=ZVE/vkW++mCOaQhNFu9LM94wasXr17KLClYBCKAUTFy+p5mz5rFT/1Es/PBs9Fl9URV4vu gWIa9oatG4WvV7HcQUXsRPgK+ZX12hBPq6ok1qRVBrE2Y/zGmDTaDYXjebNX8gykSg/RXr 3YSeGZtsa3BLyPgPfX9CtIhelglxuqA= Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) (Using TLS) by relay.mimecast.com with 
ESMTP id us-mta-8-Y7UHIcxQPbuLq-TYfUc6Jg-1; Fri, 15 Jan 2021 12:12:31 -0500 X-MC-Unique: Y7UHIcxQPbuLq-TYfUc6Jg-1 Received: by mail-qt1-f200.google.com with SMTP id c14so7889367qtn.5 for ; Fri, 15 Jan 2021 09:12:30 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=tEf8T0Go1KzAUeqBOwWH7qIZostwBrdc7d1MGMOL0rE=; b=VsALxTAKjwDuxwJjnV3eISCP/NTBiLIGWJbHQ12020KYjN5WM3gfiyBJw0BappY8Ur PI6UxYnDDiQMFGmqxxkiJS4O4AG043S5ZNYJn+L6hitTzlzBhjcCbphXqukaih1Hg7GF DjwV1zp/i3gaOgdW6wsi2wlYKq+VSBR6toNiEvdLV+ESMn6XHTcxe55tyGhp1XQP1QPG nXMf20VxcSj3+58QDyCGVGS/2XnqEm4gBUF/lPtTyJEgrYv4boDs07RvkKxDqpUYSOOT k6OoWuaFM+D4De6R6+kwxkBPsktQ/sdez/vqDK+4Ms7yjU5kMOtwiC4xx7rpmZcZ0Rdr /K5g== X-Gm-Message-State: AOAM531O4LqQdrryOOYoJjH2ClqP46kOlbyaNlGrg8DazChniOJjYoeV ViqnQ0n5r4oRfrElA479PO/uvlM+Qq4ztHO3qqguiifk3J9YQlkAYn78WCw21vEXIdCG7qNoQeN xhJYFXvwxye8= X-Received: by 2002:ad4:4c03:: with SMTP id bz3mr12717675qvb.18.1610730748366; Fri, 15 Jan 2021 09:12:28 -0800 (PST) X-Google-Smtp-Source: ABdhPJyrBIMScW0Akbh1tP53btOa7oiIiWgBQH32e/IJAnpRq/urGVjQU2YQNcHoeHxeMdk+kjiB3w== X-Received: by 2002:ad4:4c03:: with SMTP id bz3mr12717658qvb.18.1610730748161; Fri, 15 Jan 2021 09:12:28 -0800 (PST) Received: from localhost.localdomain ([142.126.83.202]) by smtp.gmail.com with ESMTPSA id x134sm5418843qka.1.2021.01.15.09.12.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jan 2021 09:12:27 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Andrea Arcangeli , Hugh Dickins , Jerome Glisse , "Kirill A . Shutemov" , Mike Rapoport , Matthew Wilcox , peterx@redhat.com, Nadav Amit , Axel Rasmussen , Mike Kravetz , Andrew Morton Subject: [PATCH RFC 30/30] userfaultfd/selftests: Enable uffd-wp for shmem/hugetlbfs Date: Fri, 15 Jan 2021 12:12:26 -0500 Message-Id: <20210115171226.25127-1-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210115170907.24498-1-peterx@redhat.com> References: <20210115170907.24498-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: After we added support for shmem and hugetlbfs, we can turn uffd-wp test on always now. Signed-off-by: Peter Xu --- tools/testing/selftests/vm/userfaultfd.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c index c4425597769a..219251c194da 100644 --- a/tools/testing/selftests/vm/userfaultfd.c +++ b/tools/testing/selftests/vm/userfaultfd.c @@ -77,8 +77,8 @@ static int test_type; #define ALARM_INTERVAL_SECS 10 static volatile bool test_uffdio_copy_eexist = true; static volatile bool test_uffdio_zeropage_eexist = true; -/* Whether to test uffd write-protection */ -static bool test_uffdio_wp = false; +/* Whether to test uffd write-protection. 
Default is to test uffd-wp always */ +static bool test_uffdio_wp = true; static bool map_shared; static int huge_fd; @@ -295,9 +295,9 @@ struct uffd_test_ops { void (*alias_mapping)(__u64 *start, size_t len, unsigned long offset); }; -#define SHMEM_EXPECTED_IOCTLS ((1 << _UFFDIO_WAKE) | \ +#define HUGETLB_EXPECTED_IOCTLS ((1 << _UFFDIO_WAKE) | \ (1 << _UFFDIO_COPY) | \ - (1 << _UFFDIO_ZEROPAGE)) + (1 << _UFFDIO_WRITEPROTECT)) #define ANON_EXPECTED_IOCTLS ((1 << _UFFDIO_WAKE) | \ (1 << _UFFDIO_COPY) | \ @@ -312,14 +312,14 @@ static struct uffd_test_ops anon_uffd_test_ops = { }; static struct uffd_test_ops shmem_uffd_test_ops = { - .expected_ioctls = SHMEM_EXPECTED_IOCTLS, + .expected_ioctls = ANON_EXPECTED_IOCTLS, .allocate_area = shmem_allocate_area, .release_pages = shmem_release_pages, .alias_mapping = noop_alias_mapping, }; static struct uffd_test_ops hugetlb_uffd_test_ops = { - .expected_ioctls = UFFD_API_RANGE_IOCTLS_BASIC, + .expected_ioctls = HUGETLB_EXPECTED_IOCTLS, .allocate_area = hugetlb_allocate_area, .release_pages = hugetlb_release_pages, .alias_mapping = hugetlb_alias_mapping, @@ -1453,8 +1453,6 @@ static void set_test_type(const char *type) if (!strcmp(type, "anon")) { test_type = TEST_ANON; uffd_test_ops = &anon_uffd_test_ops; - /* Only enable write-protect test for anonymous test */ - test_uffdio_wp = true; } else if (!strcmp(type, "hugetlb")) { test_type = TEST_HUGETLB; uffd_test_ops = &hugetlb_uffd_test_ops;