From patchwork Mon Jul 18 12:01:59 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 12921680
From: Nadav Amit
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Mike Rapoport,
	Axel Rasmussen, Nadav Amit, Andrea Arcangeli, Andrew Cooper,
	Andy Lutomirski, Dave Hansen, David Hildenbrand, Peter Xu,
	Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin
Subject: [RFC PATCH 01/14] userfaultfd: set dirty and young on writeprotect
Date: Mon, 18 Jul 2022 05:01:59 -0700
Message-Id: <20220718120212.3180-2-namit@vmware.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>
References: <20220718120212.3180-1-namit@vmware.com>
From: Nadav Amit

When userfaultfd makes a PTE writable, it can now change the PTE
directly, in some cases, without triggering a page fault first. Yet,
doing so might leave the PTE that was write-unprotected as old and
clean. At least on x86, this would cause an overhead of more than 500
cycles when the PTE is first accessed.

Use MM_CP_WILL_NEED to set the PTE as young and dirty when userfaultfd
gets a hint that the page is likely to be used. Avoid changing the PTE
to young and dirty in other cases to avoid excessive writeback and
messing with the page reclamation logic.

Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
---
 include/linux/mm.h | 2 ++
 mm/mprotect.c      | 9 ++++++++-
 mm/userfaultfd.c   | 8 ++++++--
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9cc02a7e503b..4afd75ce5875 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1988,6 +1988,8 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma,
 /* Whether this change is for write protecting */
 #define MM_CP_UFFD_WP			(1UL << 2) /* do wp */
 #define MM_CP_UFFD_WP_RESOLVE		(1UL << 3) /* Resolve wp */
+/* Whether to try to mark entries as dirty as they are to be written */
+#define MM_CP_WILL_NEED			(1UL << 4)
 #define MM_CP_UFFD_WP_ALL		(MM_CP_UFFD_WP | \
 					 MM_CP_UFFD_WP_RESOLVE)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 996a97e213ad..34c2dfb68c42 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -82,6 +82,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 	bool prot_numa = cp_flags & MM_CP_PROT_NUMA;
 	bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
 	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
+	bool will_need = cp_flags & MM_CP_WILL_NEED;

 	tlb_change_page_size(tlb, PAGE_SIZE);

@@ -172,6 +173,9 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 				ptent = pte_clear_uffd_wp(ptent);
 			}

+			if (will_need)
+				ptent = pte_mkyoung(ptent);
+
 			/*
 			 * In some writable, shared mappings, we might want
 			 * to catch actual write access -- see
@@ -187,8 +191,11 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 			 */
 			if ((cp_flags & MM_CP_TRY_CHANGE_WRITABLE) &&
 			    !pte_write(ptent) &&
-			    can_change_pte_writable(vma, addr, ptent))
+			    can_change_pte_writable(vma, addr, ptent)) {
 				ptent = pte_mkwrite(ptent);
+				if (will_need)
+					ptent = pte_mkdirty(ptent);
+			}

 			ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
 			if (pte_needs_flush(oldpte, ptent))

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 954c6980b29f..e0492f5f06a0 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -749,6 +749,7 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
 	bool enable_wp = uffd_flags & UFFD_FLAGS_WP;
 	struct vm_area_struct *dst_vma;
 	unsigned long page_mask;
+	unsigned long cp_flags;
 	struct mmu_gather tlb;
 	pgprot_t newprot;
 	int err;
@@ -795,9 +796,12 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
 	else
 		newprot = vm_get_page_prot(dst_vma->vm_flags);

+	cp_flags = enable_wp ? MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE;
+	if (uffd_flags & (UFFD_FLAGS_ACCESS_LIKELY|UFFD_FLAGS_WRITE_LIKELY))
+		cp_flags |= MM_CP_WILL_NEED;
+
 	tlb_gather_mmu(&tlb, dst_mm);
-	change_protection(&tlb, dst_vma, start, start + len, newprot,
-			  enable_wp ? MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE);
+	change_protection(&tlb, dst_vma, start, start + len, newprot, cp_flags);
 	tlb_finish_mmu(&tlb);

 	err = 0;