From patchwork Sun Jan 31 00:11:13 2021
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 12057441
From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Andrea Arcangeli, Andrew Morton, Andy Lutomirski,
    Dave Hansen, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao,
    Nick Piggin, x86@kernel.org
Subject: [RFC 01/20] mm/tlb: fix fullmm semantics
Date: Sat, 30 Jan 2021 16:11:13 -0800
Message-Id: <20210131001132.3368247-2-namit@vmware.com>
In-Reply-To: <20210131001132.3368247-1-namit@vmware.com>
References: <20210131001132.3368247-1-namit@vmware.com>

From: Nadav Amit

fullmm in mmu_gather is supposed to indicate that the mm is being torn down
(e.g., on process exit) and can therefore allow certain optimizations.
However, tlb_finish_mmu() sets fullmm when what it actually wants to indicate
is that the TLB should be fully flushed.

Change tlb_finish_mmu() to set need_flush_all instead, and check this flag in
tlb_flush_mmu_tlbonly() when deciding whether a flush is needed.

Signed-off-by: Nadav Amit
Cc: Andrea Arcangeli
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Cc: x86@kernel.org
---
 include/asm-generic/tlb.h | 2 +-
 mm/mmu_gather.c           | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 2c68a545ffa7..eea113323468 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -420,7 +420,7 @@ static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
	 * these bits.
	 */
	if (!(tlb->freed_tables || tlb->cleared_ptes || tlb->cleared_pmds ||
-	      tlb->cleared_puds || tlb->cleared_p4ds))
+	      tlb->cleared_puds || tlb->cleared_p4ds || tlb->need_flush_all))
		return;

	tlb_flush(tlb);

diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 0dc7149b0c61..5a659d4e59eb 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -323,7 +323,7 @@ void tlb_finish_mmu(struct mmu_gather *tlb)
		 * On x86 non-fullmm doesn't yield significant difference
		 * against fullmm.
		 */
-		tlb->fullmm = 1;
+		tlb->need_flush_all = 1;
		__tlb_reset_range(tlb);
		tlb->freed_tables = 1;
	}
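As a quick illustration of where the new flag ends up mattering, here is a
sketch of the flush decision tlb_flush_mmu_tlbonly() makes after this patch.
The struct fields are the ones used in the hunks above; the helper itself is
invented for illustration and is not part of the patch.

/* Sketch only: flush if any range was recorded or a full flush was requested. */
static inline bool example_gather_needs_flush(const struct mmu_gather *tlb)
{
        return tlb->freed_tables || tlb->cleared_ptes || tlb->cleared_pmds ||
               tlb->cleared_puds || tlb->cleared_p4ds || tlb->need_flush_all;
}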
From patchwork Sun Jan 31 00:11:14 2021
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 12057443

From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Andrea Arcangeli, Andrew Morton, Andy Lutomirski,
    Dave Hansen, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao,
    Nick Piggin, x86@kernel.org
Subject: [RFC 02/20] mm/mprotect: use mmu_gather
Date: Sat, 30 Jan 2021 16:11:14 -0800
Message-Id: <20210131001132.3368247-3-namit@vmware.com>
In-Reply-To: <20210131001132.3368247-1-namit@vmware.com>
References: <20210131001132.3368247-1-namit@vmware.com>

From: Nadav Amit

change_pXX_range() currently does not use mmu_gather, but instead implements
its own deferred TLB flush scheme. This both complicates the code, as
developers need to be aware of different invalidation schemes, and prevents
opportunities to avoid TLB flushes or to perform them at finer granularity.

Use mmu_gather in change_pXX_range(). As the pages are not released, only
record the flushed range using tlb_flush_pXX_range().
Signed-off-by: Nadav Amit
Cc: Andrea Arcangeli
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Cc: x86@kernel.org
---
 include/linux/huge_mm.h |  3 ++-
 mm/huge_memory.c        |  4 +++-
 mm/mprotect.c           | 51 +++++++++++++++++++---------------------
 3 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 6a19f35f836b..6eff7f59a778 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -37,7 +37,8 @@ int zap_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma, pud_t *pud,
 bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
		   unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd);
 int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr,
-		    pgprot_t newprot, unsigned long cp_flags);
+		    pgprot_t newprot, unsigned long cp_flags,
+		    struct mmu_gather *tlb);

 vm_fault_t vmf_insert_pfn_pmd_prot(struct vm_fault *vmf, pfn_t pfn,
				   pgprot_t pgprot, bool write);

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9237976abe72..c345b8b06183 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1797,7 +1797,8 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
  *  - HPAGE_PMD_NR is protections changed and TLB flush necessary
  */
 int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
-		unsigned long addr, pgprot_t newprot, unsigned long cp_flags)
+		unsigned long addr, pgprot_t newprot, unsigned long cp_flags,
+		struct mmu_gather *tlb)
 {
	struct mm_struct *mm = vma->vm_mm;
	spinlock_t *ptl;
@@ -1885,6 +1886,7 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
		entry = pmd_clear_uffd_wp(entry);
	}
	ret = HPAGE_PMD_NR;
+	tlb_flush_pmd_range(tlb, addr, HPAGE_PMD_SIZE);
	set_pmd_at(mm, addr, pmd, entry);
	BUG_ON(vma_is_anonymous(vma) && !preserve_write && pmd_write(entry));
 unlock:

diff --git a/mm/mprotect.c b/mm/mprotect.c
index ab709023e9aa..632d5a677d3f 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -32,12 +32,13 @@
 #include
 #include
 #include
+#include

 #include "internal.h"

-static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
-		unsigned long addr, unsigned long end, pgprot_t newprot,
-		unsigned long cp_flags)
+static unsigned long change_pte_range(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr,
+		unsigned long end, pgprot_t newprot, unsigned long cp_flags)
 {
	pte_t *pte, oldpte;
	spinlock_t *ptl;
@@ -138,6 +139,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
				ptent = pte_mkwrite(ptent);
			}
			ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
+			tlb_flush_pte_range(tlb, addr, PAGE_SIZE);
			pages++;
		} else if (is_swap_pte(oldpte)) {
			swp_entry_t entry = pte_to_swp_entry(oldpte);
@@ -209,9 +211,9 @@ static inline int pmd_none_or_clear_bad_unless_trans_huge(pmd_t *pmd)
	return 0;
 }

-static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
-		pud_t *pud, unsigned long addr, unsigned long end,
-		pgprot_t newprot, unsigned long cp_flags)
+static inline unsigned long change_pmd_range(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, pud_t *pud, unsigned long addr,
+		unsigned long end, pgprot_t newprot, unsigned long cp_flags)
 {
	pmd_t *pmd;
	unsigned long next;
@@ -252,7 +254,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
				__split_huge_pmd(vma, pmd, addr, false, NULL);
			} else {
				int nr_ptes = change_huge_pmd(vma, pmd, addr,
-						newprot, cp_flags);
+						newprot, cp_flags, tlb);

				if (nr_ptes) {
					if (nr_ptes == HPAGE_PMD_NR) {
@@ -266,8 +268,8 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
			}
			/* fall through, the trans huge pmd just split */
		}
-		this_pages = change_pte_range(vma, pmd, addr, next, newprot,
-					      cp_flags);
+		this_pages = change_pte_range(tlb, vma, pmd, addr, next,
+					      newprot, cp_flags);
		pages += this_pages;
 next:
		cond_resched();
@@ -281,9 +283,9 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
	return pages;
 }

-static inline unsigned long change_pud_range(struct vm_area_struct *vma,
-		p4d_t *p4d, unsigned long addr, unsigned long end,
-		pgprot_t newprot, unsigned long cp_flags)
+static inline unsigned long change_pud_range(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, p4d_t *p4d, unsigned long addr,
+		unsigned long end, pgprot_t newprot, unsigned long cp_flags)
 {
	pud_t *pud;
	unsigned long next;
@@ -294,16 +296,16 @@ static inline unsigned long change_pud_range(struct vm_area_struct *vma,
		next = pud_addr_end(addr, end);
		if (pud_none_or_clear_bad(pud))
			continue;
-		pages += change_pmd_range(vma, pud, addr, next, newprot,
+		pages += change_pmd_range(tlb, vma, pud, addr, next, newprot,
					  cp_flags);
	} while (pud++, addr = next, addr != end);

	return pages;
 }

-static inline unsigned long change_p4d_range(struct vm_area_struct *vma,
-		pgd_t *pgd, unsigned long addr, unsigned long end,
-		pgprot_t newprot, unsigned long cp_flags)
+static inline unsigned long change_p4d_range(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, pgd_t *pgd, unsigned long addr,
+		unsigned long end, pgprot_t newprot, unsigned long cp_flags)
 {
	p4d_t *p4d;
	unsigned long next;
@@ -314,7 +316,7 @@ static inline unsigned long change_p4d_range(struct vm_area_struct *vma,
		next = p4d_addr_end(addr, end);
		if (p4d_none_or_clear_bad(p4d))
			continue;
-		pages += change_pud_range(vma, p4d, addr, next, newprot,
+		pages += change_pud_range(tlb, vma, p4d, addr, next, newprot,
					  cp_flags);
	} while (p4d++, addr = next, addr != end);

@@ -328,25 +330,22 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
	struct mm_struct *mm = vma->vm_mm;
	pgd_t *pgd;
	unsigned long next;
-	unsigned long start = addr;
	unsigned long pages = 0;
+	struct mmu_gather tlb;

	BUG_ON(addr >= end);
	pgd = pgd_offset(mm, addr);
-	flush_cache_range(vma, addr, end);
-	inc_tlb_flush_pending(mm);
+	tlb_gather_mmu(&tlb, mm);
+	tlb_start_vma(&tlb, vma);
	do {
		next = pgd_addr_end(addr, end);
		if (pgd_none_or_clear_bad(pgd))
			continue;
-		pages += change_p4d_range(vma, pgd, addr, next, newprot,
+		pages += change_p4d_range(&tlb, vma, pgd, addr, next, newprot,
					  cp_flags);
	} while (pgd++, addr = next, addr != end);
-
-	/* Only flush the TLB if we actually modified any entries: */
-	if (pages)
-		flush_tlb_range(vma, start, end);
-	dec_tlb_flush_pending(mm);
+	tlb_end_vma(&tlb, vma);
+	tlb_finish_mmu(&tlb);

	return pages;
 }
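For readers who have not used mmu_gather before, the lifecycle that
change_protection_range() adopts above follows a fixed pattern. The sketch
below is illustrative only; the tlb_* calls are the ones used in the diff,
while the wrapper function and its name are invented.

/* Sketch only: deferred TLB flushing with mmu_gather instead of open-coded
 * inc_tlb_flush_pending()/flush_tlb_range()/dec_tlb_flush_pending(). */
static void example_change_protection(struct vm_area_struct *vma,
                                      unsigned long start, unsigned long end)
{
        struct mmu_gather tlb;

        tlb_gather_mmu(&tlb, vma->vm_mm);       /* start gathering for this mm */
        tlb_start_vma(&tlb, vma);               /* per-VMA bookkeeping */

        /*
         * Walk the page tables from start to end; for each PTE whose
         * protection actually changes, record the range instead of flushing:
         *
         *      tlb_flush_pte_range(&tlb, addr, PAGE_SIZE);
         */

        tlb_end_vma(&tlb, vma);
        tlb_finish_mmu(&tlb);                   /* flush only what was recorded */
}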
From patchwork Sun Jan 31 00:11:15 2021
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 12057445
From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Andrea Arcangeli, Andrew Morton, Andy Lutomirski,
    Dave Hansen, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao,
    Nick Piggin, x86@kernel.org
Subject: [RFC 03/20] mm/mprotect: do not flush on permission promotion
Date: Sat, 30 Jan 2021 16:11:15 -0800
Message-Id: <20210131001132.3368247-4-namit@vmware.com>
In-Reply-To: <20210131001132.3368247-1-namit@vmware.com>
References: <20210131001132.3368247-1-namit@vmware.com>

From: Nadav Amit

Currently, using mprotect() or userfaultfd to unprotect a memory region
causes a TLB flush. At least on x86, no TLB flush is needed when protection
is promoted.

Add an arch-specific pte_may_need_flush() which tells whether a TLB flush is
needed based on the old PTE and the new one. Implement an x86
pte_may_need_flush().

For x86, besides the simple logic that PTE protection promotion or changes
of software bits do not require a flush, also add logic that considers the
dirty-bit. If the dirty-bit is clear and write-protect is set, no TLB flush
is needed, as x86 updates the dirty-bit atomically on write, and if the bit
is clear, the PTE is reread.

Signed-off-by: Nadav Amit
Cc: Andrea Arcangeli
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Cc: x86@kernel.org
---
 arch/x86/include/asm/tlbflush.h | 44 +++++++++++++++++++++++++++++
 include/asm-generic/tlb.h       |  4 +++
 mm/mprotect.c                   |  3 ++-
 3 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 8c87a2e0b660..a617dc0a9b06 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -255,6 +255,50 @@ static inline void arch_tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch,

 extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);

+static inline bool pte_may_need_flush(pte_t oldpte, pte_t newpte)
+{
+	const pteval_t ignore_mask = _PAGE_SOFTW1 | _PAGE_SOFTW2 |
+				     _PAGE_SOFTW3 | _PAGE_ACCESSED;
+	const pteval_t enable_mask = _PAGE_RW | _PAGE_DIRTY | _PAGE_GLOBAL;
+	pteval_t oldval = pte_val(oldpte);
+	pteval_t newval = pte_val(newpte);
+	pteval_t diff = oldval ^ newval;
+	pteval_t disable_mask = 0;
+
+	if (IS_ENABLED(CONFIG_X86_64) || IS_ENABLED(CONFIG_X86_PAE))
+		disable_mask = _PAGE_NX;
+
+	/* new is non-present: need only if old is present */
+	if (pte_none(newpte))
+		return !pte_none(oldpte);
+
+	/*
+	 * If, excluding the ignored bits, only RW and dirty are cleared and the
+	 * old PTE does not have the dirty-bit set, we can avoid a flush. This
+	 * is possible since x86 architecture set the dirty bit atomically while
+	 * it caches the PTE in the TLB.
+	 *
+	 * The condition considers any change to RW and dirty as not requiring
+	 * flush if the old PTE is not dirty or not writable for simplification
+	 * of the code and to consider (unlikely) cases of changing dirty-bit of
+	 * write-protected PTE.
+	 */
+	if (!(diff & ~(_PAGE_RW | _PAGE_DIRTY | ignore_mask)) &&
+	    (!(pte_dirty(oldpte) || !pte_write(oldpte))))
+		return false;
+
+	/*
+	 * Any change of PFN and any flag other than those that we consider
+	 * requires a flush (e.g., PAT, protection keys). To save flushes we do
+	 * not consider the access bit as it is considered by the kernel as
+	 * best-effort.
+	 */
+	return diff & ((oldval & enable_mask) |
+		       (newval & disable_mask) |
+		       ~(enable_mask | disable_mask | ignore_mask));
+}
+#define pte_may_need_flush pte_may_need_flush
+
 #endif /* !MODULE */

 #endif /* _ASM_X86_TLBFLUSH_H */

diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index eea113323468..c2deec0b6919 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -654,6 +654,10 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
	} while (0)
 #endif

+#ifndef pte_may_need_flush
+static inline bool pte_may_need_flush(pte_t oldpte, pte_t newpte) { return true; }
+#endif
+
 #endif /* CONFIG_MMU */

 #endif /* _ASM_GENERIC__TLB_H */

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 632d5a677d3f..b7473d2c9a1f 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -139,7 +139,8 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
				ptent = pte_mkwrite(ptent);
			}
			ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
-			tlb_flush_pte_range(tlb, addr, PAGE_SIZE);
+			if (pte_may_need_flush(oldpte, ptent))
+				tlb_flush_pte_range(tlb, addr, PAGE_SIZE);
			pages++;
		} else if (is_swap_pte(oldpte)) {
			swp_entry_t entry = pte_to_swp_entry(oldpte);
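To make the dirty-bit rule above concrete, here are a few example transitions
and the answer pte_may_need_flush() gives for them on x86. The snippet is
illustrative only (building the pte_t values is elided); the results follow
from the logic in the hunk above.

/* Sketch only: example decisions of the new helper. */
void example_pte_flush_decisions(pte_t rw_clean, pte_t ro_clean, pte_t rw_dirty)
{
        /* Write-protecting a clean, writable PTE: no flush needed, because
         * x86 sets the dirty-bit atomically and rereads the PTE first. */
        bool wrprotect_clean = pte_may_need_flush(rw_clean, ro_clean);  /* false */

        /* Write-protecting a dirty PTE: a flush is needed. */
        bool wrprotect_dirty = pte_may_need_flush(rw_dirty, ro_clean);  /* true */

        /* Permission promotion (read-only to read-write): no flush needed. */
        bool promote = pte_may_need_flush(ro_clean, rw_clean);          /* false */

        (void)wrprotect_clean; (void)wrprotect_dirty; (void)promote;
}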
From patchwork Sun Jan 31 00:11:16 2021
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 12057447

From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Andrea Arcangeli, Andrew Morton, Andy Lutomirski,
    Dave Hansen, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao,
    Nick Piggin, x86@kernel.org
Subject: [RFC 04/20] mm/mapping_dirty_helpers: use mmu_gather
Date: Sat, 30 Jan 2021 16:11:16 -0800
Message-Id: <20210131001132.3368247-5-namit@vmware.com>
In-Reply-To: <20210131001132.3368247-1-namit@vmware.com>
References: <20210131001132.3368247-1-namit@vmware.com>

From: Nadav Amit

Avoid open-coding mmu_gather for no reason. There is no apparent reason not
to use the existing mmu_gather interfaces.

Use the newly introduced pte_may_need_flush() to check whether a flush is
needed, and thereby avoid unnecessary flushes.
Signed-off-by: Nadav Amit
Cc: Andrea Arcangeli
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Cc: x86@kernel.org
---
 mm/mapping_dirty_helpers.c | 37 +++++++++++--------------------------
 1 file changed, 11 insertions(+), 26 deletions(-)

diff --git a/mm/mapping_dirty_helpers.c b/mm/mapping_dirty_helpers.c
index b59054ef2e10..2ce6cf431026 100644
--- a/mm/mapping_dirty_helpers.c
+++ b/mm/mapping_dirty_helpers.c
@@ -4,7 +4,7 @@
 #include
 #include
 #include
-#include
+#include

 /**
  * struct wp_walk - Private struct for pagetable walk callbacks
@@ -15,8 +15,7 @@
  */
 struct wp_walk {
	struct mmu_notifier_range range;
-	unsigned long tlbflush_start;
-	unsigned long tlbflush_end;
+	struct mmu_gather tlb;
	unsigned long total;
 };

@@ -42,9 +41,9 @@ static int wp_pte(pte_t *pte, unsigned long addr, unsigned long end,
		ptent = pte_wrprotect(old_pte);
		ptep_modify_prot_commit(walk->vma, addr, pte, old_pte, ptent);
		wpwalk->total++;
-		wpwalk->tlbflush_start = min(wpwalk->tlbflush_start, addr);
-		wpwalk->tlbflush_end = max(wpwalk->tlbflush_end,
-					   addr + PAGE_SIZE);
+
+		if (pte_may_need_flush(old_pte, ptent))
+			tlb_flush_pte_range(&wpwalk->tlb, addr, PAGE_SIZE);
	}

	return 0;
@@ -101,9 +100,7 @@ static int clean_record_pte(pte_t *pte, unsigned long addr,
		ptep_modify_prot_commit(walk->vma, addr, pte, old_pte, ptent);
		wpwalk->total++;
-		wpwalk->tlbflush_start = min(wpwalk->tlbflush_start, addr);
-		wpwalk->tlbflush_end = max(wpwalk->tlbflush_end,
-					   addr + PAGE_SIZE);
+		tlb_flush_pte_range(&wpwalk->tlb, addr, PAGE_SIZE);

		__set_bit(pgoff, cwalk->bitmap);
		cwalk->start = min(cwalk->start, pgoff);
@@ -184,20 +181,13 @@ static int wp_clean_pre_vma(unsigned long start, unsigned long end,
 {
	struct wp_walk *wpwalk = walk->private;

-	wpwalk->tlbflush_start = end;
-	wpwalk->tlbflush_end = start;
-
	mmu_notifier_range_init(&wpwalk->range, MMU_NOTIFY_PROTECTION_PAGE, 0,
				walk->vma, walk->mm, start, end);
	mmu_notifier_invalidate_range_start(&wpwalk->range);
	flush_cache_range(walk->vma, start, end);
-	/*
-	 * We're not using tlb_gather_mmu() since typically
-	 * only a small subrange of PTEs are affected, whereas
-	 * tlb_gather_mmu() records the full range.
-	 */
-	inc_tlb_flush_pending(walk->mm);
+	tlb_gather_mmu(&wpwalk->tlb, walk->mm);
+	tlb_start_vma(&wpwalk->tlb, walk->vma);

	return 0;
 }
@@ -212,15 +202,10 @@ static void wp_clean_post_vma(struct mm_walk *walk)
 {
	struct wp_walk *wpwalk = walk->private;

-	if (mm_tlb_flush_nested(walk->mm))
-		flush_tlb_range(walk->vma, wpwalk->range.start,
-				wpwalk->range.end);
-	else if (wpwalk->tlbflush_end > wpwalk->tlbflush_start)
-		flush_tlb_range(walk->vma, wpwalk->tlbflush_start,
-				wpwalk->tlbflush_end);
-
	mmu_notifier_invalidate_range_end(&wpwalk->range);
-	dec_tlb_flush_pending(walk->mm);
+
+	tlb_end_vma(&wpwalk->tlb, walk->vma);
+	tlb_finish_mmu(&wpwalk->tlb);
 }

 /*
From patchwork Sun Jan 31 00:11:17 2021
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 12057449

From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Mel Gorman, Andrea Arcangeli, Andrew Morton,
    Andy Lutomirski, Dave Hansen, Peter Zijlstra, Thomas Gleixner,
    Will Deacon, Yu Zhao, x86@kernel.org
Subject: [RFC 05/20] mm/tlb: move BATCHED_UNMAP_TLB_FLUSH to tlb.h
Date: Sat, 30 Jan 2021 16:11:17 -0800
Message-Id: <20210131001132.3368247-6-namit@vmware.com>
In-Reply-To: <20210131001132.3368247-1-namit@vmware.com>
References: <20210131001132.3368247-1-namit@vmware.com>

From: Nadav Amit

Arguably, tlb.h is the natural place for TLB-related code. In addition,
task_mmu needs to be able to call flush_tlb_batched_pending() and therefore
cannot (or should not) use mm/internal.h.

Move all the functions that are controlled by
CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH to tlb.h.
Signed-off-by: Nadav Amit
Cc: Mel Gorman
Cc: Andrea Arcangeli
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: x86@kernel.org
---
 include/asm-generic/tlb.h | 17 +++++++++++++++++
 mm/internal.h             | 16 ----------------
 mm/mremap.c               |  2 +-
 mm/vmscan.c               |  1 +
 4 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index c2deec0b6919..517c89398c83 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -658,6 +658,23 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
 static inline bool pte_may_need_flush(pte_t oldpte, pte_t newpte) { return true; }
 #endif

+#ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+void try_to_unmap_flush(void);
+void try_to_unmap_flush_dirty(void);
+void flush_tlb_batched_pending(struct mm_struct *mm);
+#else
+static inline void try_to_unmap_flush(void)
+{
+}
+static inline void try_to_unmap_flush_dirty(void)
+{
+}
+static inline void flush_tlb_batched_pending(struct mm_struct *mm)
+{
+}
+static inline void tlb_batch_init(void) { }
+#endif /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */
+
 #endif /* CONFIG_MMU */

 #endif /* _ASM_GENERIC__TLB_H */

diff --git a/mm/internal.h b/mm/internal.h
index 25d2b2439f19..d3860f9fbb83 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -585,22 +585,6 @@ struct tlbflush_unmap_batch;
  */
 extern struct workqueue_struct *mm_percpu_wq;

-#ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
-void try_to_unmap_flush(void);
-void try_to_unmap_flush_dirty(void);
-void flush_tlb_batched_pending(struct mm_struct *mm);
-#else
-static inline void try_to_unmap_flush(void)
-{
-}
-static inline void try_to_unmap_flush_dirty(void)
-{
-}
-static inline void flush_tlb_batched_pending(struct mm_struct *mm)
-{
-}
-#endif /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */
-
 extern const struct trace_print_flags pageflag_names[];
 extern const struct trace_print_flags vmaflag_names[];
 extern const struct trace_print_flags gfpflag_names[];

diff --git a/mm/mremap.c b/mm/mremap.c
index f554320281cc..57655d1b1031 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -26,7 +26,7 @@
 #include
 #include

-#include
+#include

 #include "internal.h"

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b1b574ad199d..ee144c359b41 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -52,6 +52,7 @@
 #include

 #include
+#include
 #include

 #include
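The practical effect of the move is that code outside mm/ can reach the
batched-unmap flush helpers through the tlb header instead of mm/internal.h.
Below is a sketch of such a caller, assuming the usual asm umbrella header;
the wrapper function and its name are invented, while
flush_tlb_batched_pending() is the real interface being moved.

#include <asm/tlb.h>    /* now provides the batched-unmap flush declarations */

/* Sketch only: make sure no deferred (batched) unmap flush is still pending
 * before inspecting or modifying PTEs of this mm. */
static void example_sync_batched_flush(struct mm_struct *mm)
{
        flush_tlb_batched_pending(mm);
}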
From patchwork Sun Jan 31 00:11:18 2021
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 12057451

From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Andrea Arcangeli, Andrew Morton, Andy Lutomirski,
    Dave Hansen, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao,
    Nick Piggin, x86@kernel.org
Subject: [RFC 06/20] fs/task_mmu: use mmu_gather interface of clear-soft-dirty
Date: Sat, 30 Jan 2021 16:11:18 -0800
Message-Id: <20210131001132.3368247-7-namit@vmware.com>
In-Reply-To: <20210131001132.3368247-1-namit@vmware.com>
References: <20210131001132.3368247-1-namit@vmware.com>

From: Nadav Amit

Use the mmu_gather interface in task_mmu instead of
{inc|dec}_tlb_flush_pending(). This allows the code to be consolidated and
avoids potential bugs.

Signed-off-by: Nadav Amit
Cc: Andrea Arcangeli
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Cc: x86@kernel.org
---
 fs/proc/task_mmu.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 3cec6fbef725..4cd048ffa0f6 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1032,8 +1032,25 @@ enum clear_refs_types {

 struct clear_refs_private {
	enum clear_refs_types type;
+	struct mmu_gather tlb;
 };

+static int tlb_pre_vma(unsigned long start, unsigned long end,
+		       struct mm_walk *walk)
+{
+	struct clear_refs_private *cp = walk->private;
+
+	tlb_start_vma(&cp->tlb, walk->vma);
+	return 0;
+}
+
+static void tlb_post_vma(struct mm_walk *walk)
+{
+	struct clear_refs_private *cp = walk->private;
+
+	tlb_end_vma(&cp->tlb, walk->vma);
+}
+
 #ifdef CONFIG_MEM_SOFT_DIRTY

 #define is_cow_mapping(flags) (((flags) & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE)

@@ -1140,6 +1157,7 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,
		/* Clear accessed and referenced bits. */
		pmdp_test_and_clear_young(vma, addr, pmd);
		test_and_clear_page_young(page);
+		tlb_flush_pmd_range(&cp->tlb, addr, HPAGE_PMD_SIZE);
		ClearPageReferenced(page);
 out:
		spin_unlock(ptl);
@@ -1155,6 +1173,7 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,

		if (cp->type == CLEAR_REFS_SOFT_DIRTY) {
			clear_soft_dirty(vma, addr, pte);
+			tlb_flush_pte_range(&cp->tlb, addr, PAGE_SIZE);
			continue;
		}

@@ -1168,6 +1187,7 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,

		/* Clear accessed and referenced bits. */
		ptep_test_and_clear_young(vma, addr, pte);
		test_and_clear_page_young(page);
+		tlb_flush_pte_range(&cp->tlb, addr, PAGE_SIZE);
		ClearPageReferenced(page);
	}
	pte_unmap_unlock(pte - 1, ptl);
@@ -1198,6 +1218,8 @@ static int clear_refs_test_walk(unsigned long start, unsigned long end,
 }

 static const struct mm_walk_ops clear_refs_walk_ops = {
+	.pre_vma		= tlb_pre_vma,
+	.post_vma		= tlb_post_vma,
	.pmd_entry		= clear_refs_pte_range,
	.test_walk		= clear_refs_test_walk,
 };
@@ -1248,6 +1270,7 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
			goto out_unlock;
		}

+		tlb_gather_mmu(&cp.tlb, mm);
		if (type == CLEAR_REFS_SOFT_DIRTY) {
			for (vma = mm->mmap; vma; vma = vma->vm_next) {
				if (!(vma->vm_flags & VM_SOFTDIRTY))
@@ -1256,7 +1279,6 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
				vma_set_page_prot(vma);
			}

-			inc_tlb_flush_pending(mm);
			mmu_notifier_range_init(&range, MMU_NOTIFY_SOFT_DIRTY,
						0, NULL, mm, 0, -1UL);
			mmu_notifier_invalidate_range_start(&range);
@@ -1265,10 +1287,9 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
					&cp);
		if (type == CLEAR_REFS_SOFT_DIRTY) {
			mmu_notifier_invalidate_range_end(&range);
-			flush_tlb_mm(mm);
-			dec_tlb_flush_pending(mm);
		}
 out_unlock:
+		tlb_finish_mmu(&cp.tlb);
		mmap_write_unlock(mm);
 out_mm:
		mmput(mm);
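A compressed, illustrative view of the control flow clear_refs_write() ends
up with: the tlb_* calls and the callback names are taken from the diff
above, while the wrapper function is invented and the page walk itself is
elided.

/* Sketch only: the gather brackets the whole page walk. */
static void example_clear_refs_flow(struct mm_struct *mm,
                                    struct clear_refs_private *cp)
{
        tlb_gather_mmu(&cp->tlb, mm);   /* replaces inc_tlb_flush_pending(mm) */

        /*
         * walk_page_range() invokes clear_refs_walk_ops:
         *   .pre_vma   -> tlb_start_vma(&cp->tlb, vma)
         *   .pmd_entry -> records ranges via tlb_flush_pte_range() /
         *                 tlb_flush_pmd_range()
         *   .post_vma  -> tlb_end_vma(&cp->tlb, vma)
         */

        tlb_finish_mmu(&cp->tlb);       /* replaces flush_tlb_mm(mm) +
                                           dec_tlb_flush_pending(mm) */
}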
From patchwork Sun Jan 31 00:11:19 2021
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 12057453

From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Andrea Arcangeli, Andrew Morton, Andy Lutomirski,
    Dave Hansen, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao,
    Nick Piggin, x86@kernel.org
Subject: [RFC 07/20] mm: move x86 tlb_gen to generic code
Date: Sat, 30 Jan 2021 16:11:19 -0800
Message-Id: <20210131001132.3368247-8-namit@vmware.com>
In-Reply-To: <20210131001132.3368247-1-namit@vmware.com>
References: <20210131001132.3368247-1-namit@vmware.com>

From: Nadav Amit

x86 currently has TLB-generation tracking logic that can be used by
additional architectures (as long as they implement some additional logic).
Extract the relevant pieces of code from x86 into generic TLB code.

This makes it possible to write the subsequent "fine granularity deferred
TLB flushes detection" patches without making them x86-specific.
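The tracking protocol itself is unchanged by the move; the comment the patch
relocates into struct mm_struct describes it. Below is a sketch of a flusher
following that protocol: inc_mm_tlb_gen() is the interface the patch moves to
generic code, while the wrapper function is invented for illustration.

/* Sketch only: change the page tables first, then bump tlb_gen, then flush. */
static void example_flush_protocol(struct mm_struct *mm)
{
        /* 1. The page-table changes have already been made by the caller. */

        /* 2. Bump the mm's TLB generation; this also acts as a full barrier
         *    that orders the PTE writes against the read of mm_cpumask(). */
        inc_mm_tlb_gen(mm);

        /* 3. Send the flush (on x86, the flush_tlb_mm_range()/IPI path);
         *    each CPU compares its per-ASID local generation with the mm's
         *    tlb_gen and catches up accordingly. */
}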
Signed-off-by: Nadav Amit Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: Nick Piggin Cc: x86@kernel.org --- arch/x86/Kconfig | 1 + arch/x86/include/asm/mmu.h | 10 -------- arch/x86/include/asm/mmu_context.h | 1 - arch/x86/include/asm/tlbflush.h | 18 -------------- arch/x86/mm/tlb.c | 8 +++--- drivers/firmware/efi/efi.c | 1 + include/linux/mm_types.h | 39 ++++++++++++++++++++++++++++++ init/Kconfig | 6 +++++ kernel/fork.c | 2 ++ mm/init-mm.c | 1 + mm/rmap.c | 9 ++++++- 11 files changed, 62 insertions(+), 34 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 591efb2476bc..6bd4d626a6b3 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -86,6 +86,7 @@ config X86 select ARCH_HAS_SYSCALL_WRAPPER select ARCH_HAS_UBSAN_SANITIZE_ALL select ARCH_HAS_DEBUG_WX + select ARCH_HAS_TLB_GENERATIONS select ARCH_HAVE_NMI_SAFE_CMPXCHG select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI select ARCH_MIGHT_HAVE_PC_PARPORT diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h index 5d7494631ea9..134454956c96 100644 --- a/arch/x86/include/asm/mmu.h +++ b/arch/x86/include/asm/mmu.h @@ -23,16 +23,6 @@ typedef struct { */ u64 ctx_id; - /* - * Any code that needs to do any sort of TLB flushing for this - * mm will first make its changes to the page tables, then - * increment tlb_gen, then flush. This lets the low-level - * flushing code keep track of what needs flushing. - * - * This is not used on Xen PV. - */ - atomic64_t tlb_gen; - #ifdef CONFIG_MODIFY_LDT_SYSCALL struct rw_semaphore ldt_usr_sem; struct ldt_struct *ldt; diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 27516046117a..e7597c642270 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -105,7 +105,6 @@ static inline int init_new_context(struct task_struct *tsk, mutex_init(&mm->context.lock); mm->context.ctx_id = atomic64_inc_return(&last_mm_ctx_id); - atomic64_set(&mm->context.tlb_gen, 0); #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS if (cpu_feature_enabled(X86_FEATURE_OSPKE)) { diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index a617dc0a9b06..2110b98026a7 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -235,24 +235,6 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long a) flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, PAGE_SHIFT, false); } -static inline u64 inc_mm_tlb_gen(struct mm_struct *mm) -{ - /* - * Bump the generation count. This also serves as a full barrier - * that synchronizes with switch_mm(): callers are required to order - * their read of mm_cpumask after their writes to the paging - * structures. - */ - return atomic64_inc_return(&mm->context.tlb_gen); -} - -static inline void arch_tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch, - struct mm_struct *mm) -{ - inc_mm_tlb_gen(mm); - cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm)); -} - extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch); static inline bool pte_may_need_flush(pte_t oldpte, pte_t newpte) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 569ac1d57f55..7ab21430be41 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -511,7 +511,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, * the TLB shootdown code. 
*/ smp_mb(); - next_tlb_gen = atomic64_read(&next->context.tlb_gen); + next_tlb_gen = atomic64_read(&next->tlb_gen); if (this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen) == next_tlb_gen) return; @@ -546,7 +546,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, */ if (next != &init_mm) cpumask_set_cpu(cpu, mm_cpumask(next)); - next_tlb_gen = atomic64_read(&next->context.tlb_gen); + next_tlb_gen = atomic64_read(&next->tlb_gen); choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush); @@ -618,7 +618,7 @@ void initialize_tlbstate_and_flush(void) { int i; struct mm_struct *mm = this_cpu_read(cpu_tlbstate.loaded_mm); - u64 tlb_gen = atomic64_read(&init_mm.context.tlb_gen); + u64 tlb_gen = atomic64_read(&init_mm.tlb_gen); unsigned long cr3 = __read_cr3(); /* Assert that CR3 already references the right mm. */ @@ -667,7 +667,7 @@ static void flush_tlb_func_common(const struct flush_tlb_info *f, */ struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm); u32 loaded_mm_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid); - u64 mm_tlb_gen = atomic64_read(&loaded_mm->context.tlb_gen); + u64 mm_tlb_gen = atomic64_read(&loaded_mm->tlb_gen); u64 local_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen); /* This code cannot presently handle being reentered. */ diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index df3f9bcab581..02a6a1c81576 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -62,6 +62,7 @@ struct mm_struct efi_mm = { .page_table_lock = __SPIN_LOCK_UNLOCKED(efi_mm.page_table_lock), .mmlist = LIST_HEAD_INIT(efi_mm.mmlist), .cpu_bitmap = { [BITS_TO_LONGS(NR_CPUS)] = 0}, + INIT_TLB_GENERATIONS }; struct workqueue_struct *efi_rts_wq; diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 0974ad501a47..2035ac319c2b 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -560,6 +560,17 @@ struct mm_struct { #ifdef CONFIG_IOMMU_SUPPORT u32 pasid; +#endif +#ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS + /* + * Any code that needs to do any sort of TLB flushing for this + * mm will first make its changes to the page tables, then + * increment tlb_gen, then flush. This lets the low-level + * flushing code keep track of what needs flushing. + * + * This is not used on Xen PV. + */ + atomic64_t tlb_gen; #endif } __randomize_layout; @@ -676,6 +687,34 @@ static inline bool mm_tlb_flush_nested(struct mm_struct *mm) return atomic_read(&mm->tlb_flush_pending) > 1; } +#ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS +static inline void init_mm_tlb_gen(struct mm_struct *mm) +{ + atomic64_set(&mm->tlb_gen, 0); +} + +static inline u64 inc_mm_tlb_gen(struct mm_struct *mm) +{ + /* + * Bump the generation count. This also serves as a full barrier + * that synchronizes with switch_mm(): callers are required to order + * their read of mm_cpumask after their writes to the paging + * structures. + */ + return atomic64_inc_return(&mm->tlb_gen); +} + +#define INIT_TLB_GENERATIONS \ + .tlb_gen = ATOMIC64_INIT(0), + +#else + +static inline void init_mm_tlb_gen(struct mm_struct *mm) { } + +#define INIT_TLB_GENERATION + +#endif + struct vm_fault; /** diff --git a/init/Kconfig b/init/Kconfig index b77c60f8b963..3d11a0f7c8cc 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -842,6 +842,12 @@ config ARCH_SUPPORTS_NUMA_BALANCING # and the refill costs are offset by the savings of sending fewer IPIs. 
config ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH bool + depends on ARCH_HAS_TLB_GENERATIONS + +# +# For architectures that track for each address space the TLB generation. +config ARCH_HAS_TLB_GENERATIONS + bool config CC_HAS_INT128 def_bool !$(cc-option,$(m64-flag) -D__SIZEOF_INT128__=0) && 64BIT diff --git a/kernel/fork.c b/kernel/fork.c index d66cd1014211..3e735a86ab2c 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1027,6 +1027,8 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p, RCU_INIT_POINTER(mm->exe_file, NULL); mmu_notifier_subscriptions_init(mm); init_tlb_flush_pending(mm); + init_mm_tlb_gen(mm); + #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS mm->pmd_huge_pte = NULL; #endif diff --git a/mm/init-mm.c b/mm/init-mm.c index 153162669f80..ef3a471f4de4 100644 --- a/mm/init-mm.c +++ b/mm/init-mm.c @@ -38,5 +38,6 @@ struct mm_struct init_mm = { .mmlist = LIST_HEAD_INIT(init_mm.mmlist), .user_ns = &init_user_ns, .cpu_bitmap = CPU_BITS_NONE, + INIT_TLB_GENERATIONS INIT_MM_CONTEXT(init_mm) }; diff --git a/mm/rmap.c b/mm/rmap.c index 08c56aaf72eb..9655e1fc328a 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -613,11 +613,18 @@ void try_to_unmap_flush_dirty(void) try_to_unmap_flush(); } +static inline void tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch, + struct mm_struct *mm) +{ + inc_mm_tlb_gen(mm); + cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm)); +} + static void set_tlb_ubc_flush_pending(struct mm_struct *mm, bool writable) { struct tlbflush_unmap_batch *tlb_ubc = ¤t->tlb_ubc; - arch_tlbbatch_add_mm(&tlb_ubc->arch, mm); + tlbbatch_add_mm(&tlb_ubc->arch, mm); tlb_ubc->flush_required = true; /* From patchwork Sun Jan 31 00:11:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Amit X-Patchwork-Id: 12057455 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5B715C433DB for ; Sun, 31 Jan 2021 00:16:20 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 037D864E19 for ; Sun, 31 Jan 2021 00:16:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 037D864E19 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id E8B3E6B0074; Sat, 30 Jan 2021 19:16:15 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id DCC786B0075; Sat, 30 Jan 2021 19:16:15 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BCCC56B0078; Sat, 30 Jan 2021 19:16:15 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0177.hostedemail.com [216.40.44.177]) by kanga.kvack.org (Postfix) with ESMTP id 95B416B0074 for ; Sat, 30 Jan 2021 19:16:15 -0500 (EST) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com 
[10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 5FCE4181AEF23 for ; Sun, 31 Jan 2021 00:16:15 +0000 (UTC) X-FDA: 77764153110.24.scene65_6312f89275b5 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin24.hostedemail.com (Postfix) with ESMTP id 3F95C1A4A5 for ; Sun, 31 Jan 2021 00:16:15 +0000 (UTC) X-HE-Tag: scene65_6312f89275b5 X-Filterd-Recvd-Size: 7717 Received: from mail-pf1-f173.google.com (mail-pf1-f173.google.com [209.85.210.173]) by imf06.hostedemail.com (Postfix) with ESMTP for ; Sun, 31 Jan 2021 00:16:14 +0000 (UTC) Received: by mail-pf1-f173.google.com with SMTP id e19so9062033pfh.6 for ; Sat, 30 Jan 2021 16:16:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=B1n9KuHbkwDteeyX+EGvgbidSs5/e5CI67AfWMUmp8s=; b=SKZEOydA0mMWkDa5c+uNDDq0+qMOl+Zb6dqieV/jCTFS++mV7yMbTyfUAEFSfwFcL2 ya1tZMpINcor6UQYYqa5YSPVwsgJwmtDtXjfoVUxQhTlq3wC7rn7uhgnvpiTAijnzQ5J DOjVqicU91j+tmfyiWaGWEcL9Kj/vkOFJdbPV/ZE8hP51yiruPoTE9yTgfFXptPlh9ok Iahtt+cLVVjWY3zHdcUenrf4+tyHlaNTzuZ36PebMzYas0Xz1Pr4A/ILg/aDGBYou02R vR0q3WWSXwG2iv4f3C53S0tpyjBf3PsMIYTG7KBudtFNHH6ySwoMHXUKRczItmaUieXX IFig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=B1n9KuHbkwDteeyX+EGvgbidSs5/e5CI67AfWMUmp8s=; b=PqkCvdvvlkjkn7VcUK9e9ZgfNHTRY9Wh/PE8q7bNYspV/Rek48+Sj6m0y0wOhlessq 1VPF71ZeWQeCyDY3XISnahc3HAi8VdxRXxZtAa+Wl0MVJ0KskK2C3qiu8gPJsjfwB+PZ AaRlEh+TBsI1DYxB93atzv3papZXazzllqrfN4y4cbCJeHGYYJwLJrHVLorOIaiSWju+ JYoDD5AnzrdyccsIRcHcz6RSRFjymfmfyDMWIU6lfkJqGZj0EmBlsFfOJMGmeWk6e39R SZPVWZXyiqCcO87O4KG/l1rqec2yowoY8EswfDIx6SmaV2fH+9FbCQdR4NafUhxjyYkL 91+g== X-Gm-Message-State: AOAM530SSMZPCCklngB9oQzzGj9DQcTc9tnBu785b3Fd/NT7E+k4T4mV GPm6lRhKKo9mNebVNydb621Zb4L4U5Y= X-Google-Smtp-Source: ABdhPJxF8xo5w6K8Nn+HFQfi/CPEUbKsBx5EUV2igQkFgijrTJKmcZ10by10k68OT6k9qdD59CoIoQ== X-Received: by 2002:a63:1261:: with SMTP id 33mr10809432pgs.213.1612052173517; Sat, 30 Jan 2021 16:16:13 -0800 (PST) Received: from sc2-haas01-esx0118.eng.vmware.com ([66.170.99.1]) by smtp.gmail.com with ESMTPSA id e12sm13127365pga.13.2021.01.30.16.16.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 30 Jan 2021 16:16:12 -0800 (PST) From: Nadav Amit X-Google-Original-From: Nadav Amit To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Andrea Arcangeli , Andrew Morton , Andy Lutomirski , Dave Hansen , Peter Zijlstra , Thomas Gleixner , Will Deacon , Yu Zhao , Nick Piggin , x86@kernel.org Subject: [RFC 08/20] mm: store completed TLB generation Date: Sat, 30 Jan 2021 16:11:20 -0800 Message-Id: <20210131001132.3368247-9-namit@vmware.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210131001132.3368247-1-namit@vmware.com> References: <20210131001132.3368247-1-namit@vmware.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nadav Amit To detect deferred TLB flushes in fine granularity, we need to keep track on the completed TLB flush generation for each mm. Add logic to track for each mm the tlb_gen_completed, which tracks the completed TLB generation. It is the arch responsibility to call mark_mm_tlb_gen_done() whenever a TLB flush is completed. 
Start the generation numbers from 1 instead of 0. This would allow later to detect whether flushes of a certain generation were completed. Signed-off-by: Nadav Amit Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: Nick Piggin Cc: x86@kernel.org --- arch/x86/mm/tlb.c | 10 ++++++++++ include/asm-generic/tlb.h | 33 +++++++++++++++++++++++++++++++++ include/linux/mm_types.h | 15 ++++++++++++++- 3 files changed, 57 insertions(+), 1 deletion(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 7ab21430be41..d17b5575531e 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -14,6 +14,7 @@ #include #include #include +#include #include "mm_internal.h" @@ -915,6 +916,9 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start, if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) flush_tlb_others(mm_cpumask(mm), info); + /* Update the completed generation */ + mark_mm_tlb_gen_done(mm, new_tlb_gen); + put_flush_tlb_info(); put_cpu(); } @@ -1147,6 +1151,12 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch) cpumask_clear(&batch->cpumask); + /* + * We cannot call mark_mm_tlb_gen_done() since we do not know which + * mm's should be flushed. This may lead to some unwarranted TLB + * flushes, but not to correction problems. + */ + put_cpu(); } diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index 517c89398c83..427bfcc6cdec 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -513,6 +513,39 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm } #endif +#ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS + +/* + * Helper function to update a generation to have a new value, as long as new + * value is greater or equal to gen. + */ +static inline void tlb_update_generation(atomic64_t *gen, u64 new_gen) +{ + u64 cur_gen = atomic64_read(gen); + + while (cur_gen < new_gen) { + u64 old_gen = atomic64_cmpxchg(gen, cur_gen, new_gen); + + /* Check if we succeeded in the cmpxchg */ + if (likely(cur_gen == old_gen)) + break; + + cur_gen = old_gen; + }; +} + + +static inline void mark_mm_tlb_gen_done(struct mm_struct *mm, u64 gen) +{ + /* + * Update the completed generation to the new generation if the new + * generation is greater than the previous one. + */ + tlb_update_generation(&mm->tlb_gen_completed, gen); +} + +#endif /* CONFIG_ARCH_HAS_TLB_GENERATIONS */ + /* * tlb_flush_{pte|pmd|pud|p4d}_range() adjust the tlb->start and tlb->end, * and set corresponding cleared_*. diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 2035ac319c2b..8a5eb4bfac59 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -571,6 +571,13 @@ struct mm_struct { * This is not used on Xen PV. */ atomic64_t tlb_gen; + + /* + * TLB generation which is guarnateed to be flushed, including + * all the PTE changes that were performed before tlb_gen was + * incremented. + */ + atomic64_t tlb_gen_completed; #endif } __randomize_layout; @@ -690,7 +697,13 @@ static inline bool mm_tlb_flush_nested(struct mm_struct *mm) #ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS static inline void init_mm_tlb_gen(struct mm_struct *mm) { - atomic64_set(&mm->tlb_gen, 0); + /* + * Start from generation of 1, so default generation 0 will be + * considered as flushed and would not be regarded as an outstanding + * deferred invalidation. 
+ */ + atomic64_set(&mm->tlb_gen, 1); + atomic64_set(&mm->tlb_gen_completed, 1); } static inline u64 inc_mm_tlb_gen(struct mm_struct *mm) From patchwork Sun Jan 31 00:11:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Amit X-Patchwork-Id: 12057457 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF435C433E0 for ; Sun, 31 Jan 2021 00:16:22 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 736E664E0F for ; Sun, 31 Jan 2021 00:16:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 736E664E0F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 95D9A6B0075; Sat, 30 Jan 2021 19:16:17 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 8E4C76B0078; Sat, 30 Jan 2021 19:16:17 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7611C6B007B; Sat, 30 Jan 2021 19:16:17 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0229.hostedemail.com [216.40.44.229]) by kanga.kvack.org (Postfix) with ESMTP id 5A0F66B0075 for ; Sat, 30 Jan 2021 19:16:17 -0500 (EST) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 20EC51EE6 for ; Sun, 31 Jan 2021 00:16:17 +0000 (UTC) X-FDA: 77764153194.08.corn94_5f0a75e275b5 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin08.hostedemail.com (Postfix) with ESMTP id F27861819E772 for ; Sun, 31 Jan 2021 00:16:16 +0000 (UTC) X-HE-Tag: corn94_5f0a75e275b5 X-Filterd-Recvd-Size: 11177 Received: from mail-pj1-f47.google.com (mail-pj1-f47.google.com [209.85.216.47]) by imf03.hostedemail.com (Postfix) with ESMTP for ; Sun, 31 Jan 2021 00:16:16 +0000 (UTC) Received: by mail-pj1-f47.google.com with SMTP id z9so234613pjl.5 for ; Sat, 30 Jan 2021 16:16:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=pXtsqL7OnePQSIl7PBrxGJpBqi17FMqIVecv4DEE9Io=; b=HWnMEPQTPZY8rmZlWoMLQUShrGKYPxiPXEDmXrKPxu0sd2fkt7QDryKz1vR9o9gy2L PzAGCR8eDGwEDrfJKTPwoPwYUerFd9mdQMZkfb9KWChY2xRYnDe0zwJPq3olbLm8UYuk SnwakZEKJflQiP9eOWgb4DoMzl+rgODt6VoYGsCXZymnrPrp14Gcb2SLe7+CVI7GCBkC e0iUy5+UeO1HcMiYxfaUn1bnFcooMMFDTmgn+nSKfXfweXwrvURkKEUAdfQCtSC/f2lv 13x3QazIzNf5V5d1S1ORV+z599hdqREURm5m/8pM8JAQRm5QrJacIB62AzP3Iysen/7H FOXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=pXtsqL7OnePQSIl7PBrxGJpBqi17FMqIVecv4DEE9Io=; 
b=I1293deecTUmTjBGYfR9J0Qct1JgBlXfyCjhp3ON1dQ4lMdN4l3DAyHww8B5VoFlGh VYUvVWIRQXA92bffi3fqF90/COLQNH2iSs0kSnNn4bJxGcJDOndfnEa6Yc6TAgrO+b4A 31nxn0iqVnKujX/G9Zt7a9LIPAlol+cKystPWRVsfNV+LOM81Wt8PJW+pMFhaUM3+VV2 oY+r1ggjFMvvD+QPzjv7qP5Hzl57VMxKnvEoVJs9nAonthLDT5wesFiR14mClFIuieY1 fa54sPkA81OmRGEduWWXt1GJSwaTt1oKTDf6073gJ3d/t0gMgbXt5go4aHjGs1JszXQ/ rxOQ== X-Gm-Message-State: AOAM533D5BhjVSjfpjOTzFGD6kdnUWRHC4Acn02UdT2RmV3wsKNk3Nmg wwfuU/3rFng5/lL8LJyK4Qnnw5WowPU= X-Google-Smtp-Source: ABdhPJy0UgFXnVAqlHfJHtpa7PxNh6RointC74k3ZkIMdLDA/33nXTOgUee/DDeu4NstAPwAedTUvA== X-Received: by 2002:a17:90a:7e94:: with SMTP id j20mr11015456pjl.218.1612052175121; Sat, 30 Jan 2021 16:16:15 -0800 (PST) Received: from sc2-haas01-esx0118.eng.vmware.com ([66.170.99.1]) by smtp.gmail.com with ESMTPSA id e12sm13127365pga.13.2021.01.30.16.16.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 30 Jan 2021 16:16:14 -0800 (PST) From: Nadav Amit X-Google-Original-From: Nadav Amit To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Andrea Arcangeli , Andrew Morton , Andy Lutomirski , Dave Hansen , Peter Zijlstra , Thomas Gleixner , Will Deacon , Yu Zhao , Nick Piggin , x86@kernel.org Subject: [RFC 09/20] mm: create pte/pmd_tlb_flush_pending() Date: Sat, 30 Jan 2021 16:11:21 -0800 Message-Id: <20210131001132.3368247-10-namit@vmware.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210131001132.3368247-1-namit@vmware.com> References: <20210131001132.3368247-1-namit@vmware.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nadav Amit In preparation for fine(r) granularity, introduce pte_tlb_flush_pending() and pmd_tlb_flush_pending(). Right now the function directs to mm_tlb_flush_pending(). Change pte_accessible() to provide the vma as well. No functional change. Next patches will use this information on architectures that use per-table deferred TLB tracking. Signed-off-by: Nadav Amit Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: Nick Piggin Cc: x86@kernel.org --- arch/arm/include/asm/pgtable.h | 4 +++- arch/arm64/include/asm/pgtable.h | 4 ++-- arch/sparc/include/asm/pgtable_64.h | 9 ++++++--- arch/sparc/mm/init_64.c | 2 +- arch/x86/include/asm/pgtable.h | 7 +++---- include/linux/mm_types.h | 10 ++++++++++ include/linux/pgtable.h | 2 +- mm/huge_memory.c | 2 +- mm/ksm.c | 2 +- mm/pgtable-generic.c | 2 +- 10 files changed, 29 insertions(+), 15 deletions(-) diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h index c02f24400369..59bcacc14dc3 100644 --- a/arch/arm/include/asm/pgtable.h +++ b/arch/arm/include/asm/pgtable.h @@ -190,7 +190,9 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd) #define pte_none(pte) (!pte_val(pte)) #define pte_present(pte) (pte_isset((pte), L_PTE_PRESENT)) #define pte_valid(pte) (pte_isset((pte), L_PTE_VALID)) -#define pte_accessible(mm, pte) (mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte)) +#define pte_accessible(vma, pte) \ + (pte_tlb_flush_pending(vma, pte) ? 
\ + pte_present(*pte) : pte_valid(*pte)) #define pte_write(pte) (pte_isclear((pte), L_PTE_RDONLY)) #define pte_dirty(pte) (pte_isset((pte), L_PTE_DIRTY)) #define pte_young(pte) (pte_isset((pte), L_PTE_YOUNG)) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 501562793ce2..f14f1e9dbc3e 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -126,8 +126,8 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]; * flag, since ptep_clear_flush_young() elides a DSB when invalidating the * TLB. */ -#define pte_accessible(mm, pte) \ - (mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte)) +#define pte_accessible(vma, pte) \ + (pte_tlb_flush_pending(vma, pte) ? pte_present(*pte) : pte_valid(*pte)) /* * p??_access_permitted() is true for valid user mappings (subject to the diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h index 550d3904de65..749efd9c49c9 100644 --- a/arch/sparc/include/asm/pgtable_64.h +++ b/arch/sparc/include/asm/pgtable_64.h @@ -673,9 +673,9 @@ static inline unsigned long pte_present(pte_t pte) } #define pte_accessible pte_accessible -static inline unsigned long pte_accessible(struct mm_struct *mm, pte_t a) +static inline unsigned long pte_accessible(struct vm_area_struct *vma, pte_t *a) { - return pte_val(a) & _PAGE_VALID; + return pte_val(*a) & _PAGE_VALID; } static inline unsigned long pte_special(pte_t pte) @@ -906,8 +906,11 @@ static void maybe_tlb_batch_add(struct mm_struct *mm, unsigned long vaddr, * * SUN4V NOTE: _PAGE_VALID is the same value in both the SUN4U * and SUN4V pte layout, so this inline test is fine. + * + * The vma is not propagated to this point, but it is not used by + * sparc's pte_accessible(). We therefore provide NULL. */ - if (likely(mm != &init_mm) && pte_accessible(mm, orig)) + if (likely(mm != &init_mm) && pte_accessible(NULL, ptep)) tlb_batch_add(mm, vaddr, ptep, orig, fullmm, hugepage_shift); } diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c index 182bb7bdaa0a..bda397aa9709 100644 --- a/arch/sparc/mm/init_64.c +++ b/arch/sparc/mm/init_64.c @@ -404,7 +404,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t * mm = vma->vm_mm; /* Don't insert a non-valid PTE into the TSB, we'll deadlock. 
*/ - if (!pte_accessible(mm, pte)) + if (!pte_accessible(vma, ptep)) return; spin_lock_irqsave(&mm->context.lock, flags); diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index a02c67291cfc..a0e069c15dbc 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -775,13 +775,12 @@ static inline int pte_devmap(pte_t a) #endif #define pte_accessible pte_accessible -static inline bool pte_accessible(struct mm_struct *mm, pte_t a) +static inline bool pte_accessible(struct vm_area_struct *vma, pte_t *a) { - if (pte_flags(a) & _PAGE_PRESENT) + if (pte_flags(*a) & _PAGE_PRESENT) return true; - if ((pte_flags(a) & _PAGE_PROTNONE) && - mm_tlb_flush_pending(mm)) + if ((pte_flags(*a) & _PAGE_PROTNONE) && pte_tlb_flush_pending(vma, a)) return true; return false; diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 8a5eb4bfac59..812ee0fd4c35 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -682,6 +682,16 @@ static inline bool mm_tlb_flush_pending(struct mm_struct *mm) return atomic_read(&mm->tlb_flush_pending); } +static inline bool pte_tlb_flush_pending(struct vm_area_struct *vma, pte_t *pte) +{ + return mm_tlb_flush_pending(vma->vm_mm); +} + +static inline bool pmd_tlb_flush_pending(struct vm_area_struct *vma, pmd_t *pmd) +{ + return mm_tlb_flush_pending(vma->vm_mm); +} + static inline bool mm_tlb_flush_nested(struct mm_struct *mm) { /* diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 8fcdfa52eb4b..e8bce53ca3e8 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -725,7 +725,7 @@ static inline void arch_swap_restore(swp_entry_t entry, struct page *page) #endif #ifndef pte_accessible -# define pte_accessible(mm, pte) ((void)(pte), 1) +# define pte_accessible(vma, pte) ((void)(pte), 1) #endif #ifndef flush_tlb_fix_spurious_fault diff --git a/mm/huge_memory.c b/mm/huge_memory.c index c345b8b06183..c4b7c00cc69c 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1514,7 +1514,7 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd) * We are not sure a pending tlb flush here is for a huge page * mapping or not. 
Hence use the tlb range variant */ - if (mm_tlb_flush_pending(vma->vm_mm)) { + if (pmd_tlb_flush_pending(vma, vmf->pmd)) { flush_tlb_range(vma, haddr, haddr + HPAGE_PMD_SIZE); /* * change_huge_pmd() released the pmd lock before diff --git a/mm/ksm.c b/mm/ksm.c index 9694ee2c71de..515acbffc283 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -1060,7 +1060,7 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page, if (pte_write(*pvmw.pte) || pte_dirty(*pvmw.pte) || (pte_protnone(*pvmw.pte) && pte_savedwrite(*pvmw.pte)) || - mm_tlb_flush_pending(mm)) { + pte_tlb_flush_pending(vma, pvmw.pte)) { pte_t entry; swapped = PageSwapCache(page); diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c index 9578db83e312..2ca66e269d33 100644 --- a/mm/pgtable-generic.c +++ b/mm/pgtable-generic.c @@ -93,7 +93,7 @@ pte_t ptep_clear_flush(struct vm_area_struct *vma, unsigned long address, struct mm_struct *mm = (vma)->vm_mm; pte_t pte; pte = ptep_get_and_clear(mm, address, ptep); - if (pte_accessible(mm, pte)) + if (pte_accessible(vma, ptep)) flush_tlb_page(vma, address); return pte; } From patchwork Sun Jan 31 00:11:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Amit X-Patchwork-Id: 12057459 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 325FEC433DB for ; Sun, 31 Jan 2021 00:16:25 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C2B2864E15 for ; Sun, 31 Jan 2021 00:16:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C2B2864E15 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 035E26B0078; Sat, 30 Jan 2021 19:16:19 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id F07996B007B; Sat, 30 Jan 2021 19:16:18 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CB87B6B007D; Sat, 30 Jan 2021 19:16:18 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0209.hostedemail.com [216.40.44.209]) by kanga.kvack.org (Postfix) with ESMTP id B5B936B0078 for ; Sat, 30 Jan 2021 19:16:18 -0500 (EST) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 84B87180AD830 for ; Sun, 31 Jan 2021 00:16:18 +0000 (UTC) X-FDA: 77764153236.15.plot34_0100603275b5 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin15.hostedemail.com (Postfix) with ESMTP id 63C241814B0C7 for ; Sun, 31 Jan 2021 00:16:18 +0000 (UTC) X-HE-Tag: plot34_0100603275b5 X-Filterd-Recvd-Size: 4860 Received: from mail-pf1-f172.google.com (mail-pf1-f172.google.com [209.85.210.172]) by imf36.hostedemail.com (Postfix) with ESMTP for ; Sun, 31 Jan 2021 00:16:17 +0000 (UTC) 
Received: by mail-pf1-f172.google.com with SMTP id t29so9063546pfg.11 for ; Sat, 30 Jan 2021 16:16:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=JWcbdOnCHMci4gKmimhTyGECgzNX5SMoV5I4t1mnNZ0=; b=kGy8P8tEiPGFpNiMdALVuLcPBAKB5uzpz08ljdIJlFrJ3V1V4HWknbD805k1tUI0zq 5bOA5gbhW77cDC3aRozEAUTAoZvPJeo+5njDt9+GhpD1RuGVjoAsMOzLlgAd3oPpcKsz oAhtOZqt784wu/H4bdG0X5Ad02xITuDE8bxQLlAkx8rMNdeCJb1j4DMgc28ObRF1MTU6 UJVw9bnewbKW+tGZYJpjcYbgwmGxrF6Fun7V30BY53BAfi+Q666601tb+6ZFxzdTN/8T +dsz79Iaipj0knSCa1Wgsxxzd2Ps/UWLt76tMMSW19kR3a4j4ApKxGhMZux70qA60K/g p+lQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=JWcbdOnCHMci4gKmimhTyGECgzNX5SMoV5I4t1mnNZ0=; b=DYwpHNdom8hG52UyZw1+c+YSr7ZXhkkHAbnFxQBn4mwk7O36rTDfUMluV/QBCPvlYB 0N5oiJQxXOyVUS5VjxqLxnXmOqsjvC6v7xbrcdTkcpyMDCHNi3rpMlvYW41E+/7kGIfg rHcGgr15sgbFG7BzIiuy30qZlFcP8qGUmisB16XUdEyaA+F2AcgDzQP+quNYjTKcokVe NYL13Gz1ON/MuwHUYxx1HNqSK4wq3mhMsSiN5ertCcc1p7gjKtYTUtIhK7DFTnz1h5SH Zh7Pw53ALTL6PWauVub/JTsejCcRvW/AWIjt5j8PWN36o4CWDmICsanEN3V0RpwzsgHn xkfw== X-Gm-Message-State: AOAM530/aEV7ae/Wm7ibcPUROcuZPEf+g8XGl5VVUfVLd/VhyR1H8OH5 /i/uO1G4qiqw7YJtV2VI3Wukkv+nfNk= X-Google-Smtp-Source: ABdhPJxQY7yUiw79HkzXnSgtQxnTcWCyetEnB5yzFncWaRDkeWefmPRQgZcLK7ZQuCV+QkDOmqmirQ== X-Received: by 2002:a62:8f96:0:b029:1b7:75a9:a8b7 with SMTP id n144-20020a628f960000b02901b775a9a8b7mr10430610pfd.28.1612052176690; Sat, 30 Jan 2021 16:16:16 -0800 (PST) Received: from sc2-haas01-esx0118.eng.vmware.com ([66.170.99.1]) by smtp.gmail.com with ESMTPSA id e12sm13127365pga.13.2021.01.30.16.16.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 30 Jan 2021 16:16:16 -0800 (PST) From: Nadav Amit X-Google-Original-From: Nadav Amit To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Andrea Arcangeli , Andrew Morton , Andy Lutomirski , Dave Hansen , Peter Zijlstra , Thomas Gleixner , Will Deacon , Nick Piggin , Yu Zhao Subject: [RFC 10/20] mm: add pte_to_page() Date: Sat, 30 Jan 2021 16:11:22 -0800 Message-Id: <20210131001132.3368247-11-namit@vmware.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210131001132.3368247-1-namit@vmware.com> References: <20210131001132.3368247-1-namit@vmware.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nadav Amit Add a pte_to_page(), which is similar to pmd_to_page, which will be used later. Inline pmd_to_page() as well. 
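As a rough illustration of the mask arithmetic that pte_to_page() and pmd_to_page() rely on, the user-space sketch below rounds an entry's address down to the start of the page-table page that holds it. The constants and names (DEMO_PTRS_PER_PTE, DEMO_PTE_SIZE, demo_pte_table_base, the sample address) are assumptions made up for the example; the real helper additionally converts the resulting virtual address into a struct page with virt_to_page(), as in the hunk below.

  #include <stdint.h>
  #include <stdio.h>

  #define DEMO_PTRS_PER_PTE 512                  /* x86-64-like value, assumed */
  #define DEMO_PTE_SIZE     sizeof(uint64_t)     /* one 8-byte entry, assumed */

  /*
   * Round a PTE pointer down to the start of the page-table page holding it;
   * this is the mask step that pte_to_page()/pmd_to_page() build on.
   */
  static uintptr_t demo_pte_table_base(uintptr_t pte_addr)
  {
          uintptr_t mask = ~(uintptr_t)(DEMO_PTRS_PER_PTE * DEMO_PTE_SIZE - 1);

          return pte_addr & mask;
  }

  int main(void)
  {
          uintptr_t table = 0x12345000u;          /* made-up, page-aligned */
          uintptr_t pte = table + 13 * DEMO_PTE_SIZE;

          printf("entry at %#lx lives in table page %#lx\n",
                 (unsigned long)pte, (unsigned long)demo_pte_table_base(pte));
          return 0;
  }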
Signed-off-by: Nadav Amit Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Nick Piggin Cc: Yu Zhao --- include/linux/mm.h | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index ecdf8a8cd6ae..d78a79fbb012 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2161,6 +2161,13 @@ static inline spinlock_t *ptlock_ptr(struct page *page) } #endif /* ALLOC_SPLIT_PTLOCKS */ +static inline struct page *pte_to_page(pte_t *pte) +{ + unsigned long mask = ~(PTRS_PER_PTE * sizeof(pte_t) - 1); + + return virt_to_page((void *)((unsigned long) pte & mask)); +} + static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pmd_t *pmd) { return ptlock_ptr(pmd_page(*pmd)); @@ -2246,7 +2253,7 @@ static inline void pgtable_pte_page_dtor(struct page *page) #if USE_SPLIT_PMD_PTLOCKS -static struct page *pmd_to_page(pmd_t *pmd) +static inline struct page *pmd_to_page(pmd_t *pmd) { unsigned long mask = ~(PTRS_PER_PMD * sizeof(pmd_t) - 1); return virt_to_page((void *)((unsigned long) pmd & mask)); From patchwork Sun Jan 31 00:11:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Amit X-Patchwork-Id: 12057461 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5421EC433DB for ; Sun, 31 Jan 2021 00:16:27 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id EAED764E0F for ; Sun, 31 Jan 2021 00:16:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EAED764E0F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 0A99E6B007B; Sat, 30 Jan 2021 19:16:21 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 036176B007D; Sat, 30 Jan 2021 19:16:20 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DCB666B007E; Sat, 30 Jan 2021 19:16:20 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0229.hostedemail.com [216.40.44.229]) by kanga.kvack.org (Postfix) with ESMTP id C22EE6B007B for ; Sat, 30 Jan 2021 19:16:20 -0500 (EST) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 944BA181AEF23 for ; Sun, 31 Jan 2021 00:16:20 +0000 (UTC) X-FDA: 77764153320.06.rub88_2e11d83275b5 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin06.hostedemail.com (Postfix) with ESMTP id 76D5110039B54 for ; Sun, 31 Jan 2021 00:16:20 +0000 (UTC) X-HE-Tag: rub88_2e11d83275b5 X-Filterd-Recvd-Size: 13297 Received: from mail-pl1-f175.google.com (mail-pl1-f175.google.com [209.85.214.175]) by imf04.hostedemail.com (Postfix) with ESMTP for ; 
Sun, 31 Jan 2021 00:16:19 +0000 (UTC) Received: by mail-pl1-f175.google.com with SMTP id j11so5313787plt.11 for ; Sat, 30 Jan 2021 16:16:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=cP3jkcKrTRH72l/nN753Ht9O0hI+z4GxWtjo9/TstQ0=; b=sqMbwBBCmtEWdXV8UsvISdoaOS2TT+Sq7SoU/b0WKpDqVNz8EtnPQlLhzpd3UsBzUx hQ5UoK2AYi5qpvyUA0nAMjyg3cwdUHmmivktrUhp+zKfzCYzQs9po/oAG8PwbCK877S8 IGtqaKTHzV7hpNgx2cLRgC5sPpRMt6+scT/Twg4rBw29fMP6x+JziDsjtzgt5f78Q4oX ysXVJPstkA7O2i959e5wXSosND85hEU1WA/jlpNhdPovBAHLO/+q2JfGw1O2uVEiqj9n bS79KY0ePgjgeEGMVFL0B8/3h3wpfvsvp7u4xgMnh8yGLkgR+BnoN3VTB+Q87BsWVoX9 ceCQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=cP3jkcKrTRH72l/nN753Ht9O0hI+z4GxWtjo9/TstQ0=; b=MxVVEz6QxB4krkL1ed8pZiqtlZiQc2nwXtpZScbz39TEYtEIghk8xd77D72ahlDTeq bGwtaWXO7rJef5pSJ68ciPfxmUfLEZQqL3RO8UT1ssVrRHQK6onDy49ULsFlx1673gY2 8vksCwH0m8CktSkjJEGjY5WjW585Yp63eCgU5h2MFq/VpWpNH8EeSGb/MEKW/K4X8wpc U3lY8zvRua/bdiya3Fp77KYXTWAqw+RXQqBBiVe944JmTbWGzDNzhn6F9ez1UH4/JSYQ NIeUwiv0K28ZAJfRM//jtXf30RC4XBkjYtfk7qhBcEss9mjeOpCdOqO2UBGXDXsyIfv+ zl+w== X-Gm-Message-State: AOAM5320ywC5+I0mMlQWPTc8oQP1G3L0rjRqacd9900UGbfcPlGLJsj3 Syt9MZdcV78qGj5iqhbnd+iQPHM3wkA= X-Google-Smtp-Source: ABdhPJwfaKAtwjR3StROIV1ECyYsNwj8eV5UZEKT4kWHE+4J7kAATUd8KQouQDX54qtt+0fmavw6Pw== X-Received: by 2002:a17:90b:350b:: with SMTP id ls11mr10705500pjb.166.1612052178479; Sat, 30 Jan 2021 16:16:18 -0800 (PST) Received: from sc2-haas01-esx0118.eng.vmware.com ([66.170.99.1]) by smtp.gmail.com with ESMTPSA id e12sm13127365pga.13.2021.01.30.16.16.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 30 Jan 2021 16:16:17 -0800 (PST) From: Nadav Amit X-Google-Original-From: Nadav Amit To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Andrea Arcangeli , Andrew Morton , Andy Lutomirski , Dave Hansen , Peter Zijlstra , Thomas Gleixner , Will Deacon , Yu Zhao , Nick Piggin , linux-csky@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, x86@kernel.org Subject: [RFC 11/20] mm/tlb: remove arch-specific tlb_start/end_vma() Date: Sat, 30 Jan 2021 16:11:23 -0800 Message-Id: <20210131001132.3368247-12-namit@vmware.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210131001132.3368247-1-namit@vmware.com> References: <20210131001132.3368247-1-namit@vmware.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nadav Amit Architecture-specific tlb_start_vma() and tlb_end_vma() seem unnecessary. They are currently used for: 1. Avoid per-VMA TLB flushes. This can be determined by introducing a new config option. 2. Avoid saving information on the vma that is being flushed. Saving this information, even for architectures that do not need it, is cheap and we will need it for per-VMA deferred TLB flushing. 3. Avoid calling flush_cache_range(). Remove the architecture specific tlb_start_vma() and tlb_end_vma() in the following manner, corresponding to the previous requirements: 1. 
Introduce a new config option - ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING - to allow architectures to define whether they want aggressive TLB flush batching (instead of flushing mappings of each VMA separately). 2. Save information on the vma regardless of architecture. Saving this information should have negligible overhead, and they will be needed for fine granularity TLB flushes. 3. flush_cache_range() is anyhow not defined for the architectures that implement tlb_start/end_vma(). No functional change intended. Signed-off-by: Nadav Amit Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: Nick Piggin Cc: linux-csky@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: x86@kernel.org --- arch/csky/Kconfig | 1 + arch/csky/include/asm/tlb.h | 12 ------------ arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/tlb.h | 2 -- arch/s390/Kconfig | 1 + arch/s390/include/asm/tlb.h | 3 --- arch/sparc/Kconfig | 1 + arch/sparc/include/asm/tlb_64.h | 2 -- arch/x86/Kconfig | 1 + arch/x86/include/asm/tlb.h | 3 --- include/asm-generic/tlb.h | 15 +++++---------- init/Kconfig | 8 ++++++++ 12 files changed, 18 insertions(+), 32 deletions(-) diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig index 89dd2fcf38fa..924ff5721240 100644 --- a/arch/csky/Kconfig +++ b/arch/csky/Kconfig @@ -8,6 +8,7 @@ config CSKY select ARCH_HAS_SYNC_DMA_FOR_DEVICE select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_QUEUED_RWLOCKS if NR_CPUS>2 + select ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING select ARCH_WANT_FRAME_POINTERS if !CPU_CK610 select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT select COMMON_CLK diff --git a/arch/csky/include/asm/tlb.h b/arch/csky/include/asm/tlb.h index fdff9b8d70c8..8130a5f09a6b 100644 --- a/arch/csky/include/asm/tlb.h +++ b/arch/csky/include/asm/tlb.h @@ -6,18 +6,6 @@ #include -#define tlb_start_vma(tlb, vma) \ - do { \ - if (!(tlb)->fullmm) \ - flush_cache_range(vma, (vma)->vm_start, (vma)->vm_end); \ - } while (0) - -#define tlb_end_vma(tlb, vma) \ - do { \ - if (!(tlb)->fullmm) \ - flush_tlb_range(vma, (vma)->vm_start, (vma)->vm_end); \ - } while (0) - #define tlb_flush(tlb) flush_tlb_mm((tlb)->mm) #include diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 107bb4319e0e..d9761b6f192a 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -151,6 +151,7 @@ config PPC select ARCH_USE_CMPXCHG_LOCKREF if PPC64 select ARCH_USE_QUEUED_RWLOCKS if PPC_QUEUED_SPINLOCKS select ARCH_USE_QUEUED_SPINLOCKS if PPC_QUEUED_SPINLOCKS + select ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING select ARCH_WANT_IPC_PARSE_VERSION select ARCH_WANT_IRQS_OFF_ACTIVATE_MM select ARCH_WANT_LD_ORPHAN_WARN diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h index 160422a439aa..880b7daf904e 100644 --- a/arch/powerpc/include/asm/tlb.h +++ b/arch/powerpc/include/asm/tlb.h @@ -19,8 +19,6 @@ #include -#define tlb_start_vma(tlb, vma) do { } while (0) -#define tlb_end_vma(tlb, vma) do { } while (0) #define __tlb_remove_tlb_entry __tlb_remove_tlb_entry #define tlb_flush tlb_flush diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index c72874f09741..5b3dc5ca9873 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -113,6 +113,7 @@ config S390 select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_CMPXCHG_LOCKREF select ARCH_WANTS_DYNAMIC_TASK_STRUCT + select ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING select ARCH_WANT_DEFAULT_BPF_JIT select ARCH_WANT_IPC_PARSE_VERSION select BUILDTIME_TABLE_SORT diff 
--git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h index 954fa8ca6cbd..03f31d59f97c 100644 --- a/arch/s390/include/asm/tlb.h +++ b/arch/s390/include/asm/tlb.h @@ -27,9 +27,6 @@ static inline void tlb_flush(struct mmu_gather *tlb); static inline bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, int page_size); -#define tlb_start_vma(tlb, vma) do { } while (0) -#define tlb_end_vma(tlb, vma) do { } while (0) - #define tlb_flush tlb_flush #define pte_free_tlb pte_free_tlb #define pmd_free_tlb pmd_free_tlb diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index c9c34dc52b7d..fb46e1b6f177 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -51,6 +51,7 @@ config SPARC select NEED_DMA_MAP_STATE select NEED_SG_DMA_LENGTH select SET_FS + select ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING config SPARC32 def_bool !64BIT diff --git a/arch/sparc/include/asm/tlb_64.h b/arch/sparc/include/asm/tlb_64.h index 779a5a0f0608..3037187482db 100644 --- a/arch/sparc/include/asm/tlb_64.h +++ b/arch/sparc/include/asm/tlb_64.h @@ -22,8 +22,6 @@ void smp_flush_tlb_mm(struct mm_struct *mm); void __flush_tlb_pending(unsigned long, unsigned long, unsigned long *); void flush_tlb_pending(void); -#define tlb_start_vma(tlb, vma) do { } while (0) -#define tlb_end_vma(tlb, vma) do { } while (0) #define tlb_flush(tlb) flush_tlb_pending() /* diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 6bd4d626a6b3..d56b0f5cb00c 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -101,6 +101,7 @@ config X86 select ARCH_USE_QUEUED_RWLOCKS select ARCH_USE_QUEUED_SPINLOCKS select ARCH_USE_SYM_ANNOTATIONS + select ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH select ARCH_WANT_DEFAULT_BPF_JIT if X86_64 select ARCH_WANTS_DYNAMIC_TASK_STRUCT diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h index 1bfe979bb9bc..580636cdc257 100644 --- a/arch/x86/include/asm/tlb.h +++ b/arch/x86/include/asm/tlb.h @@ -2,9 +2,6 @@ #ifndef _ASM_X86_TLB_H #define _ASM_X86_TLB_H -#define tlb_start_vma(tlb, vma) do { } while (0) -#define tlb_end_vma(tlb, vma) do { } while (0) - #define tlb_flush tlb_flush static inline void tlb_flush(struct mmu_gather *tlb); diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index 427bfcc6cdec..b97136b7010b 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -334,8 +334,8 @@ static inline void __tlb_reset_range(struct mmu_gather *tlb) #ifdef CONFIG_MMU_GATHER_NO_RANGE -#if defined(tlb_flush) || defined(tlb_start_vma) || defined(tlb_end_vma) -#error MMU_GATHER_NO_RANGE relies on default tlb_flush(), tlb_start_vma() and tlb_end_vma() +#if defined(tlb_flush) +#error MMU_GATHER_NO_RANGE relies on default tlb_flush() #endif /* @@ -362,10 +362,6 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm #ifndef tlb_flush -#if defined(tlb_start_vma) || defined(tlb_end_vma) -#error Default tlb_flush() relies on default tlb_start_vma() and tlb_end_vma() -#endif - /* * When an architecture does not provide its own tlb_flush() implementation * but does have a reasonably efficient flush_vma_range() implementation @@ -486,7 +482,6 @@ static inline unsigned long tlb_get_unmap_size(struct mmu_gather *tlb) * case where we're doing a full MM flush. When we're doing a munmap, * the vmas are adjusted to only cover the region to be torn down. 
*/ -#ifndef tlb_start_vma static inline void tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma) { if (tlb->fullmm) @@ -495,14 +490,15 @@ static inline void tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct * tlb_update_vma_flags(tlb, vma); flush_cache_range(vma, vma->vm_start, vma->vm_end); } -#endif -#ifndef tlb_end_vma static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma) { if (tlb->fullmm) return; + if (IS_ENABLED(CONFIG_ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING)) + return; + /* * Do a TLB flush and reset the range at VMA boundaries; this avoids * the ranges growing with the unused space between consecutive VMAs, @@ -511,7 +507,6 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm */ tlb_flush_mmu_tlbonly(tlb); } -#endif #ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS diff --git a/init/Kconfig b/init/Kconfig index 3d11a0f7c8cc..14a599a48738 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -849,6 +849,14 @@ config ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH config ARCH_HAS_TLB_GENERATIONS bool +# +# For architectures that prefer to batch TLB flushes aggressively, i.e., +# not to flush after changing or removing each VMA. The architecture must +# provide its own tlb_flush() function. +config ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING + bool + depends on !CONFIG_MMU_GATHER_NO_GATHER + config CC_HAS_INT128 def_bool !$(cc-option,$(m64-flag) -D__SIZEOF_INT128__=0) && 64BIT From patchwork Sun Jan 31 00:11:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Amit X-Patchwork-Id: 12057463 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 90F3AC433E6 for ; Sun, 31 Jan 2021 00:16:29 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1BD8964E15 for ; Sun, 31 Jan 2021 00:16:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1BD8964E15 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 7DA6C6B007D; Sat, 30 Jan 2021 19:16:22 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 6F1866B007E; Sat, 30 Jan 2021 19:16:22 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4F5B76B0080; Sat, 30 Jan 2021 19:16:22 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0043.hostedemail.com [216.40.44.43]) by kanga.kvack.org (Postfix) with ESMTP id 22F386B007D for ; Sat, 30 Jan 2021 19:16:22 -0500 (EST) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id E43DB3633 for ; Sun, 31 Jan 2021 00:16:21 +0000 (UTC) X-FDA: 77764153362.30.swim52_2505ea6275b5 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com 
[10.5.16.251]) by smtpin30.hostedemail.com (Postfix) with ESMTP id BFC17180B3C83 for ; Sun, 31 Jan 2021 00:16:21 +0000 (UTC) X-HE-Tag: swim52_2505ea6275b5 X-Filterd-Recvd-Size: 8290 Received: from mail-pg1-f176.google.com (mail-pg1-f176.google.com [209.85.215.176]) by imf40.hostedemail.com (Postfix) with ESMTP for ; Sun, 31 Jan 2021 00:16:21 +0000 (UTC) Received: by mail-pg1-f176.google.com with SMTP id o63so9417964pgo.6 for ; Sat, 30 Jan 2021 16:16:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=O8uSPE/HXC6aHbVUxVpHG/d195EuTH7J5Gm24sXeUII=; b=PwdIMepAzUWPJ56ktZqfTNN8uHV1c1HOobe4OORa4XxJPl9z4jD2tKvqxvoz0NN8Q7 vyLxvXY0eIDxnVZd66ZzKxalFo2pJE4Nj5hazn3Y1/ageaD2NuE9vFR0+KUNpYXZDmeq HUolGqpL0IJ8uww1/JcIOqWNR0896AbtuxsneTHFJm4E5OQ4HsdJ3pCEoxppW+6sQng9 gfJsDusUVhQ/bLlx6/0ttYoqj6WhBqI+LVBu1dBxrdfJZWgd/SbsJ6xlHnyqblqTMkjl bDFUXr8WllrSt+rdAR4J+fExRJPMaSbNwUeleG1XK+k6LHuMJzUxR10VFVXkN/Rp39YY ZRaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=O8uSPE/HXC6aHbVUxVpHG/d195EuTH7J5Gm24sXeUII=; b=IUjeeJAkprk2u8ncLFZ227MbkQLLJeh+5YY8Hz31PbW2lG+CNjZgt7EhgNlD3z/Qmp 2WkUZ/6FVeoF3OGr3a+HsvIkAX7xqm0VCOGxpfFTFQTXumH8Hhq/T6436sS3h3bffNU0 xERcctBGA2dM35Uk1vTBFODKxct7P+cWmGM00JgTwjTe1TLlTVEKkHkCi26oNeC1XDqi rh91xmlW6H10ouvUzA7h6OLimgj3jNP1gezp9djcbmo/eNPGKImDnv2phbZNNs1+UNXc OaOU+Y7YpVwyTZ8v6gwCXBHPqZWSZmJuMmJkU0HEaUiMNna3s3/8O2yT7iGSnm91qWaZ CX0A== X-Gm-Message-State: AOAM532mdtyJQlnsAo+JNNRmmmYlUTsP2zGYkE1rbgGyVuR0O32oY93d SvQXyR9xs/ZsOu8YT/2QocKopQyu63c= X-Google-Smtp-Source: ABdhPJxKdj8NOHZ5TdgXQQbvYo0DHw/nVRQlVpun7jqBLlHK6Vdma/V1nIaCJETf5QG8NYedmgWadQ== X-Received: by 2002:a63:f74f:: with SMTP id f15mr10935466pgk.186.1612052180090; Sat, 30 Jan 2021 16:16:20 -0800 (PST) Received: from sc2-haas01-esx0118.eng.vmware.com ([66.170.99.1]) by smtp.gmail.com with ESMTPSA id e12sm13127365pga.13.2021.01.30.16.16.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 30 Jan 2021 16:16:19 -0800 (PST) From: Nadav Amit X-Google-Original-From: Nadav Amit To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Andrea Arcangeli , Andrew Morton , Andy Lutomirski , Dave Hansen , Peter Zijlstra , Thomas Gleixner , Will Deacon , Yu Zhao , Nick Piggin , x86@kernel.org Subject: [RFC 12/20] mm/tlb: save the VMA that is flushed during tlb_start_vma() Date: Sat, 30 Jan 2021 16:11:24 -0800 Message-Id: <20210131001132.3368247-13-namit@vmware.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210131001132.3368247-1-namit@vmware.com> References: <20210131001132.3368247-1-namit@vmware.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nadav Amit Certain architectures need information about the vma that is about to be flushed. Currently, an artificial vma is constructed using the original vma infromation. Instead of saving the flags, record the vma during tlb_start_vma() and use this vma when calling flush_tlb_range(). Record the vma unconditionally as it would be needed for per-VMA deferred TLB flush tracking and the overhead of tracking it unconditionally should be negligible. 
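A small user-space sketch of the design change described above, assuming nothing beyond this commit message: instead of copying selected vm_flags into the gather structure, the vma pointer itself is recorded at tlb_start_vma() time and handed to the range flush, then cleared again at tlb_end_vma(). The demo_* structures and functions are invented for the illustration and do not exist in the kernel.

  #include <assert.h>
  #include <stddef.h>
  #include <stdio.h>

  /* Stand-ins for vm_area_struct and mmu_gather; all names are invented. */
  struct demo_vma {
          unsigned long vm_start, vm_end;
  };

  struct demo_gather {
          struct demo_vma *vma;   /* valid between start_vma() and end_vma() */
          unsigned long start, end;
  };

  static void demo_start_vma(struct demo_gather *tlb, struct demo_vma *vma)
  {
          tlb->vma = vma;         /* record the vma instead of copying flags */
  }

  static void demo_flush(struct demo_gather *tlb)
  {
          assert(tlb->vma != NULL);       /* models the patch's VM_BUG_ON(!tlb->vma) */
          printf("flush range %#lx-%#lx using vma %#lx-%#lx\n",
                 tlb->start, tlb->end, tlb->vma->vm_start, tlb->vma->vm_end);
  }

  static void demo_end_vma(struct demo_gather *tlb)
  {
          demo_flush(tlb);
          tlb->vma = NULL;        /* reset as a precaution, as the patch does */
  }

  int main(void)
  {
          struct demo_vma vma = { 0x400000, 0x600000 };
          struct demo_gather tlb = { .start = 0x400000, .end = 0x401000 };

          demo_start_vma(&tlb, &vma);
          demo_end_vma(&tlb);
          return 0;
  }

Keeping the pointer rather than a flags snapshot is what later allows per-VMA deferred-flush state to hang off the gather without reconstructing an artificial vma.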
Signed-off-by: Nadav Amit Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: Nick Piggin Cc: x86@kernel.org --- include/asm-generic/tlb.h | 56 +++++++++++++-------------------------- 1 file changed, 19 insertions(+), 37 deletions(-) diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index b97136b7010b..041be2ef4426 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -252,6 +252,13 @@ extern bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, struct mmu_gather { struct mm_struct *mm; + /* + * The current vma. This information is changing upon tlb_start_vma() + * and is therefore only valid between tlb_start_vma() and tlb_end_vma() + * calls. + */ + struct vm_area_struct *vma; + #ifdef CONFIG_MMU_GATHER_TABLE_FREE struct mmu_table_batch *batch; #endif @@ -283,12 +290,6 @@ struct mmu_gather { unsigned int cleared_puds : 1; unsigned int cleared_p4ds : 1; - /* - * tracks VM_EXEC | VM_HUGETLB in tlb_start_vma - */ - unsigned int vma_exec : 1; - unsigned int vma_huge : 1; - unsigned int batch_count; #ifndef CONFIG_MMU_GATHER_NO_GATHER @@ -352,10 +353,6 @@ static inline void tlb_flush(struct mmu_gather *tlb) flush_tlb_mm(tlb->mm); } -static inline void -tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma) { } - -#define tlb_end_vma tlb_end_vma static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma) { } #else /* CONFIG_MMU_GATHER_NO_RANGE */ @@ -364,7 +361,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm /* * When an architecture does not provide its own tlb_flush() implementation - * but does have a reasonably efficient flush_vma_range() implementation + * but does have a reasonably efficient flush_tlb_range() implementation * use that. */ static inline void tlb_flush(struct mmu_gather *tlb) @@ -372,38 +369,20 @@ static inline void tlb_flush(struct mmu_gather *tlb) if (tlb->fullmm || tlb->need_flush_all) { flush_tlb_mm(tlb->mm); } else if (tlb->end) { - struct vm_area_struct vma = { - .vm_mm = tlb->mm, - .vm_flags = (tlb->vma_exec ? VM_EXEC : 0) | - (tlb->vma_huge ? VM_HUGETLB : 0), - }; - - flush_tlb_range(&vma, tlb->start, tlb->end); + VM_BUG_ON(!tlb->vma); + flush_tlb_range(tlb->vma, tlb->start, tlb->end); } } static inline void -tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma) +tlb_update_vma(struct mmu_gather *tlb, struct vm_area_struct *vma) { - /* - * flush_tlb_range() implementations that look at VM_HUGETLB (tile, - * mips-4k) flush only large pages. - * - * flush_tlb_range() implementations that flush I-TLB also flush D-TLB - * (tile, xtensa, arm), so it's ok to just add VM_EXEC to an existing - * range. - * - * We rely on tlb_end_vma() to issue a flush, such that when we reset - * these values the batch is empty. 
- */ - tlb->vma_huge = is_vm_hugetlb_page(vma); - tlb->vma_exec = !!(vma->vm_flags & VM_EXEC); + tlb->vma = vma; } - #else static inline void -tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma) { } +tlb_update_vma(struct mmu_gather *tlb, struct vm_area_struct *vma) { } #endif @@ -487,17 +466,17 @@ static inline void tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct * if (tlb->fullmm) return; - tlb_update_vma_flags(tlb, vma); + tlb_update_vma(tlb, vma); flush_cache_range(vma, vma->vm_start, vma->vm_end); } static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma) { if (tlb->fullmm) - return; + goto out; if (IS_ENABLED(CONFIG_ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING)) - return; + goto out; /* * Do a TLB flush and reset the range at VMA boundaries; this avoids @@ -506,6 +485,9 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm * this. */ tlb_flush_mmu_tlbonly(tlb); +out: + /* Reset the VMA as a precaution. */ + tlb_update_vma(tlb, NULL); } #ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS From patchwork Sun Jan 31 00:11:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Amit X-Patchwork-Id: 12057465 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0DC80C433DB for ; Sun, 31 Jan 2021 00:16:32 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9933F64E15 for ; Sun, 31 Jan 2021 00:16:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9933F64E15 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 16F976B007E; Sat, 30 Jan 2021 19:16:24 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0D0DC6B0080; Sat, 30 Jan 2021 19:16:24 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E68916B0081; Sat, 30 Jan 2021 19:16:23 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0139.hostedemail.com [216.40.44.139]) by kanga.kvack.org (Postfix) with ESMTP id C83D56B007E for ; Sat, 30 Jan 2021 19:16:23 -0500 (EST) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 8E60982499A8 for ; Sun, 31 Jan 2021 00:16:23 +0000 (UTC) X-FDA: 77764153446.06.rice03_0c02d21275b5 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin06.hostedemail.com (Postfix) with ESMTP id 695431003A845 for ; Sun, 31 Jan 2021 00:16:23 +0000 (UTC) X-HE-Tag: rice03_0c02d21275b5 X-Filterd-Recvd-Size: 11100 Received: from mail-pj1-f51.google.com (mail-pj1-f51.google.com [209.85.216.51]) by imf28.hostedemail.com (Postfix) with ESMTP for ; Sun, 31 Jan 2021 00:16:22 +0000 (UTC) Received: by 
mail-pj1-f51.google.com with SMTP id gx1so8806123pjb.1 for ; Sat, 30 Jan 2021 16:16:22 -0800 (PST)
From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Andrea Arcangeli, Andrew Morton, Andy Lutomirski, Dave Hansen, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin, x86@kernel.org
Subject: [RFC 13/20] mm/tlb: introduce tlb_start_ptes() and tlb_end_ptes()
Date: Sat, 30 Jan 2021 16:11:25 -0800
Message-Id: <20210131001132.3368247-14-namit@vmware.com>
In-Reply-To: <20210131001132.3368247-1-namit@vmware.com>
References: <20210131001132.3368247-1-namit@vmware.com>
MIME-Version: 1.0

From: Nadav Amit

Introduce tlb_start_ptes() and tlb_end_ptes(), which are called before and after PTEs are updated and their TLB flushes are deferred. This will later be used for fine-granularity detection of deferred TLB flushes.

In the meantime, move flush_tlb_batched_pending() into tlb_start_ptes(). It was not called from mapping_dirty_helpers by wp_pte() and clean_record_pte(), which might be a bug.

No additional functional change is intended.
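For reference, a typical caller after this patch brackets its PTE updates as follows (a condensed sketch of the pattern in the mm/madvise.c and mm/mprotect.c hunks below; variable setup and the loop body are elided):

	pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
	tlb_start_ptes(tlb);		/* replaces flush_tlb_batched_pending(mm) */
	arch_enter_lazy_mmu_mode();
	for (; addr != end; pte++, addr += PAGE_SIZE) {
		/* read and update *pte, calling tlb_flush_pte_range() as needed */
	}
	arch_leave_lazy_mmu_mode();
	tlb_end_ptes(tlb);		/* closes the deferred-flush window */
	pte_unmap_unlock(pte - 1, ptl);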
Signed-off-by: Nadav Amit Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: Nick Piggin Cc: x86@kernel.org --- fs/proc/task_mmu.c | 2 ++ include/asm-generic/tlb.h | 18 ++++++++++++++++++ mm/madvise.c | 6 ++++-- mm/mapping_dirty_helpers.c | 15 +++++++++++++-- mm/memory.c | 2 ++ mm/mprotect.c | 3 ++- 6 files changed, 41 insertions(+), 5 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 4cd048ffa0f6..d0cce961fa5c 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1168,6 +1168,7 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr, return 0; pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl); + tlb_start_ptes(&cp->tlb); for (; addr != end; pte++, addr += PAGE_SIZE) { ptent = *pte; @@ -1190,6 +1191,7 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr, tlb_flush_pte_range(&cp->tlb, addr, PAGE_SIZE); ClearPageReferenced(page); } + tlb_end_ptes(&cp->tlb); pte_unmap_unlock(pte - 1, ptl); cond_resched(); return 0; diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index 041be2ef4426..10690763090a 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -58,6 +58,11 @@ * Defaults to flushing at tlb_end_vma() to reset the range; helps when * there's large holes between the VMAs. * + * - tlb_start_ptes() / tlb_end_ptes; makr the start / end of PTEs change. + * + * Does internal accounting to allow fine(r) granularity checks for + * pte_accessible() on certain configuration. + * * - tlb_remove_table() * * tlb_remove_table() is the basic primitive to free page-table directories @@ -373,6 +378,10 @@ static inline void tlb_flush(struct mmu_gather *tlb) flush_tlb_range(tlb->vma, tlb->start, tlb->end); } } +#endif + +#if __is_defined(tlb_flush) || \ + IS_ENABLED(CONFIG_ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING) static inline void tlb_update_vma(struct mmu_gather *tlb, struct vm_area_struct *vma) @@ -523,6 +532,15 @@ static inline void mark_mm_tlb_gen_done(struct mm_struct *mm, u64 gen) #endif /* CONFIG_ARCH_HAS_TLB_GENERATIONS */ +#define tlb_start_ptes(tlb) \ + do { \ + struct mmu_gather *_tlb = (tlb); \ + \ + flush_tlb_batched_pending(_tlb->mm); \ + } while (0) + +static inline void tlb_end_ptes(struct mmu_gather *tlb) { } + /* * tlb_flush_{pte|pmd|pud|p4d}_range() adjust the tlb->start and tlb->end, * and set corresponding cleared_*. 
diff --git a/mm/madvise.c b/mm/madvise.c index 0938fd3ad228..932c1c2eb9a3 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -392,7 +392,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd, #endif tlb_change_page_size(tlb, PAGE_SIZE); orig_pte = pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl); - flush_tlb_batched_pending(mm); + tlb_start_ptes(tlb); arch_enter_lazy_mmu_mode(); for (; addr < end; pte++, addr += PAGE_SIZE) { ptent = *pte; @@ -468,6 +468,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd, } arch_leave_lazy_mmu_mode(); + tlb_end_ptes(tlb); pte_unmap_unlock(orig_pte, ptl); if (pageout) reclaim_pages(&page_list); @@ -588,7 +589,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr, tlb_change_page_size(tlb, PAGE_SIZE); orig_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl); - flush_tlb_batched_pending(mm); + tlb_start_ptes(tlb); arch_enter_lazy_mmu_mode(); for (; addr != end; pte++, addr += PAGE_SIZE) { ptent = *pte; @@ -692,6 +693,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr, add_mm_counter(mm, MM_SWAPENTS, nr_swap); } arch_leave_lazy_mmu_mode(); + tlb_end_ptes(tlb); pte_unmap_unlock(orig_pte, ptl); cond_resched(); next: diff --git a/mm/mapping_dirty_helpers.c b/mm/mapping_dirty_helpers.c index 2ce6cf431026..063419ade304 100644 --- a/mm/mapping_dirty_helpers.c +++ b/mm/mapping_dirty_helpers.c @@ -6,6 +6,8 @@ #include #include +#include "internal.h" + /** * struct wp_walk - Private struct for pagetable walk callbacks * @range: Range for mmu notifiers @@ -36,7 +38,10 @@ static int wp_pte(pte_t *pte, unsigned long addr, unsigned long end, pte_t ptent = *pte; if (pte_write(ptent)) { - pte_t old_pte = ptep_modify_prot_start(walk->vma, addr, pte); + pte_t old_pte; + + tlb_start_ptes(&wpwalk->tlb); + old_pte = ptep_modify_prot_start(walk->vma, addr, pte); ptent = pte_wrprotect(old_pte); ptep_modify_prot_commit(walk->vma, addr, pte, old_pte, ptent); @@ -44,6 +49,7 @@ static int wp_pte(pte_t *pte, unsigned long addr, unsigned long end, if (pte_may_need_flush(old_pte, ptent)) tlb_flush_pte_range(&wpwalk->tlb, addr, PAGE_SIZE); + tlb_end_ptes(&wpwalk->tlb); } return 0; @@ -94,13 +100,18 @@ static int clean_record_pte(pte_t *pte, unsigned long addr, if (pte_dirty(ptent)) { pgoff_t pgoff = ((addr - walk->vma->vm_start) >> PAGE_SHIFT) + walk->vma->vm_pgoff - cwalk->bitmap_pgoff; - pte_t old_pte = ptep_modify_prot_start(walk->vma, addr, pte); + pte_t old_pte; + + tlb_start_ptes(&wpwalk->tlb); + + old_pte = ptep_modify_prot_start(walk->vma, addr, pte); ptent = pte_mkclean(old_pte); ptep_modify_prot_commit(walk->vma, addr, pte, old_pte, ptent); wpwalk->total++; tlb_flush_pte_range(&wpwalk->tlb, addr, PAGE_SIZE); + tlb_end_ptes(&wpwalk->tlb); __set_bit(pgoff, cwalk->bitmap); cwalk->start = min(cwalk->start, pgoff); diff --git a/mm/memory.c b/mm/memory.c index 9e8576a83147..929a93c50d9a 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1221,6 +1221,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, init_rss_vec(rss); start_pte = pte_offset_map_lock(mm, pmd, addr, &ptl); pte = start_pte; + tlb_start_ptes(tlb); flush_tlb_batched_pending(mm); arch_enter_lazy_mmu_mode(); do { @@ -1314,6 +1315,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, add_mm_rss_vec(mm, rss); arch_leave_lazy_mmu_mode(); + tlb_end_ptes(tlb); /* Do the actual TLB flush before dropping ptl */ if (force_flush) tlb_flush_mmu_tlbonly(tlb); diff --git a/mm/mprotect.c b/mm/mprotect.c index b7473d2c9a1f..1258bbe42ee1 100644 --- a/mm/mprotect.c +++ 
b/mm/mprotect.c @@ -70,7 +70,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb, atomic_read(&vma->vm_mm->mm_users) == 1) target_node = numa_node_id(); - flush_tlb_batched_pending(vma->vm_mm); + tlb_start_ptes(tlb); arch_enter_lazy_mmu_mode(); do { oldpte = *pte; @@ -182,6 +182,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb, } } while (pte++, addr += PAGE_SIZE, addr != end); arch_leave_lazy_mmu_mode(); + tlb_end_ptes(tlb); pte_unmap_unlock(pte - 1, ptl); return pages; From patchwork Sun Jan 31 00:11:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Amit X-Patchwork-Id: 12057467 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B1B40C433DB for ; Sun, 31 Jan 2021 00:16:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 50FBD64E0F for ; Sun, 31 Jan 2021 00:16:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 50FBD64E0F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D088A6B0080; Sat, 30 Jan 2021 19:16:25 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id C91DA6B0081; Sat, 30 Jan 2021 19:16:25 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9FB3C6B0082; Sat, 30 Jan 2021 19:16:25 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0254.hostedemail.com [216.40.44.254]) by kanga.kvack.org (Postfix) with ESMTP id 853096B0080 for ; Sat, 30 Jan 2021 19:16:25 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 49F973633 for ; Sun, 31 Jan 2021 00:16:25 +0000 (UTC) X-FDA: 77764153530.13.top39_1b15b5c275b5 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin13.hostedemail.com (Postfix) with ESMTP id 1BDF118140B69 for ; Sun, 31 Jan 2021 00:16:25 +0000 (UTC) X-HE-Tag: top39_1b15b5c275b5 X-Filterd-Recvd-Size: 8389 Received: from mail-pg1-f171.google.com (mail-pg1-f171.google.com [209.85.215.171]) by imf33.hostedemail.com (Postfix) with ESMTP for ; Sun, 31 Jan 2021 00:16:24 +0000 (UTC) Received: by mail-pg1-f171.google.com with SMTP id v19so9418394pgj.12 for ; Sat, 30 Jan 2021 16:16:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=6DfXOdul3BLGU90QLFVs7+YmYGOmb4ShgTcRCfB8SKk=; b=rzXYCCMt7lIoTbm1AX0NaK4JcpYl3mpYPtPPWccGVcYlicf90HqGe490fUvOVbLaq9 12W0/fh3QMZkRC77RVtEgkJeKfLWcQ5YxEg9HY9nrswO0onKA//7P1lgQwosWV2utIHO Xbijz+GGPfiAOV5rU95L/2S/PNnrGTYvKt8oE8m1V+VadGx3M99K26DAn0rtzFNQb6HV 
S7UzQ7r2hcEVdE8mdIGcS2ZQnc5ixnSi3g+R9Pg7Jf6oVqOLlqNM5nr3Nb2JeFL0Qpaq Jwh2IrMY6IH3aQo+FmuayEYLKT6MpJvyJKM6/U3ZvMadrbLjhsLefq26XwYQYoDopQ0S RiLw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=6DfXOdul3BLGU90QLFVs7+YmYGOmb4ShgTcRCfB8SKk=; b=EyLM6RuaJX5SkfEXEkQ7NCzmOOIX7WtzQGjPPO3G7CHJ9LI/rskyBFIkO9997yC3nu s+xnBTeKAggHS6tdjMcjBLaA/H2zVxZqtqOYJhq0xYGmnq+1PKXIqcCMSppmWEyq2HxN Ck4iAMwvfoHJapvSVrG77SaZeP3w/VEis54jIPwzDW3WMyPR4Gm9IdLkdSGd5pqayK5i jGJxf64f7uln3o2QhnXeQ/7B88fILU9px619ESLfmtrqz6Qt/KjQAMyDS5wslcKXUHFr mu8QZD5TsPuFaWRGqsBCrYSPglnQlwLjZWCRUnpGz87OmT7jkr3aBi1jhLbps3iQCeuq CNdg== X-Gm-Message-State: AOAM5310kEovJeytIEH7Lho4aWDPy+ZiHxZ2ONcFaCUhvqcj/agSx8uy QagmPMJtv6CGd/NDf0Y8D0g0tO89TRc= X-Google-Smtp-Source: ABdhPJzymehCQkc1AiOy4wnRhxKaB0Uby6rJCyGv65YnWd4ayACsGKdDOPYrtfb5ztEQ5gLeaTxgVA== X-Received: by 2002:a63:cc05:: with SMTP id x5mr10497845pgf.254.1612052183332; Sat, 30 Jan 2021 16:16:23 -0800 (PST) Received: from sc2-haas01-esx0118.eng.vmware.com ([66.170.99.1]) by smtp.gmail.com with ESMTPSA id e12sm13127365pga.13.2021.01.30.16.16.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 30 Jan 2021 16:16:22 -0800 (PST) From: Nadav Amit X-Google-Original-From: Nadav Amit To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Andrea Arcangeli , Andrew Morton , Andy Lutomirski , Dave Hansen , Peter Zijlstra , Thomas Gleixner , Will Deacon , Yu Zhao , Nick Piggin , x86@kernel.org Subject: [RFC 14/20] mm: move inc/dec_tlb_flush_pending() to mmu_gather.c Date: Sat, 30 Jan 2021 16:11:26 -0800 Message-Id: <20210131001132.3368247-15-namit@vmware.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210131001132.3368247-1-namit@vmware.com> References: <20210131001132.3368247-1-namit@vmware.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nadav Amit Reduce the chances that inc/dec_tlb_flush_pending() will be abused by moving them into mmu_gather.c, which is more of their natural place. This also allows to reduce the clutter on mm_types.h. Signed-off-by: Nadav Amit Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: Nick Piggin Cc: x86@kernel.org --- include/linux/mm_types.h | 54 ---------------------------------------- mm/mmu_gather.c | 54 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+), 54 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 812ee0fd4c35..676795dfd5d4 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -615,60 +615,6 @@ static inline void init_tlb_flush_pending(struct mm_struct *mm) atomic_set(&mm->tlb_flush_pending, 0); } -static inline void inc_tlb_flush_pending(struct mm_struct *mm) -{ - atomic_inc(&mm->tlb_flush_pending); - /* - * The only time this value is relevant is when there are indeed pages - * to flush. And we'll only flush pages after changing them, which - * requires the PTL. - * - * So the ordering here is: - * - * atomic_inc(&mm->tlb_flush_pending); - * spin_lock(&ptl); - * ... - * set_pte_at(); - * spin_unlock(&ptl); - * - * spin_lock(&ptl) - * mm_tlb_flush_pending(); - * .... 
- * spin_unlock(&ptl); - * - * flush_tlb_range(); - * atomic_dec(&mm->tlb_flush_pending); - * - * Where the increment if constrained by the PTL unlock, it thus - * ensures that the increment is visible if the PTE modification is - * visible. After all, if there is no PTE modification, nobody cares - * about TLB flushes either. - * - * This very much relies on users (mm_tlb_flush_pending() and - * mm_tlb_flush_nested()) only caring about _specific_ PTEs (and - * therefore specific PTLs), because with SPLIT_PTE_PTLOCKS and RCpc - * locks (PPC) the unlock of one doesn't order against the lock of - * another PTL. - * - * The decrement is ordered by the flush_tlb_range(), such that - * mm_tlb_flush_pending() will not return false unless all flushes have - * completed. - */ -} - -static inline void dec_tlb_flush_pending(struct mm_struct *mm) -{ - /* - * See inc_tlb_flush_pending(). - * - * This cannot be smp_mb__before_atomic() because smp_mb() simply does - * not order against TLB invalidate completion, which is what we need. - * - * Therefore we must rely on tlb_flush_*() to guarantee order. - */ - atomic_dec(&mm->tlb_flush_pending); -} - static inline bool mm_tlb_flush_pending(struct mm_struct *mm) { /* diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c index 5a659d4e59eb..13338c096cc6 100644 --- a/mm/mmu_gather.c +++ b/mm/mmu_gather.c @@ -249,6 +249,60 @@ void tlb_flush_mmu(struct mmu_gather *tlb) tlb_flush_mmu_free(tlb); } +static inline void inc_tlb_flush_pending(struct mm_struct *mm) +{ + atomic_inc(&mm->tlb_flush_pending); + /* + * The only time this value is relevant is when there are indeed pages + * to flush. And we'll only flush pages after changing them, which + * requires the PTL. + * + * So the ordering here is: + * + * atomic_inc(&mm->tlb_flush_pending); + * spin_lock(&ptl); + * ... + * set_pte_at(); + * spin_unlock(&ptl); + * + * spin_lock(&ptl) + * mm_tlb_flush_pending(); + * .... + * spin_unlock(&ptl); + * + * flush_tlb_range(); + * atomic_dec(&mm->tlb_flush_pending); + * + * Where the increment if constrained by the PTL unlock, it thus + * ensures that the increment is visible if the PTE modification is + * visible. After all, if there is no PTE modification, nobody cares + * about TLB flushes either. + * + * This very much relies on users (mm_tlb_flush_pending() and + * mm_tlb_flush_nested()) only caring about _specific_ PTEs (and + * therefore specific PTLs), because with SPLIT_PTE_PTLOCKS and RCpc + * locks (PPC) the unlock of one doesn't order against the lock of + * another PTL. + * + * The decrement is ordered by the flush_tlb_range(), such that + * mm_tlb_flush_pending() will not return false unless all flushes have + * completed. + */ +} + +static inline void dec_tlb_flush_pending(struct mm_struct *mm) +{ + /* + * See inc_tlb_flush_pending(). + * + * This cannot be smp_mb__before_atomic() because smp_mb() simply does + * not order against TLB invalidate completion, which is what we need. + * + * Therefore we must rely on tlb_flush_*() to guarantee order. 
+ */ + atomic_dec(&mm->tlb_flush_pending); +} + /** * tlb_gather_mmu - initialize an mmu_gather structure for page-table tear-down * @tlb: the mmu_gather structure to initialize From patchwork Sun Jan 31 00:11:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Amit X-Patchwork-Id: 12057469 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6BD26C433DB for ; Sun, 31 Jan 2021 00:16:37 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E69C364E0F for ; Sun, 31 Jan 2021 00:16:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E69C364E0F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 7D8266B0081; Sat, 30 Jan 2021 19:16:27 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 73CF26B0082; Sat, 30 Jan 2021 19:16:27 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4CD4E6B0083; Sat, 30 Jan 2021 19:16:27 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0040.hostedemail.com [216.40.44.40]) by kanga.kvack.org (Postfix) with ESMTP id 230C56B0081 for ; Sat, 30 Jan 2021 19:16:27 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id DF22C180AD830 for ; Sun, 31 Jan 2021 00:16:26 +0000 (UTC) X-FDA: 77764153572.13.shake05_1f172fa275b5 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin13.hostedemail.com (Postfix) with ESMTP id C1EDB18140B60 for ; Sun, 31 Jan 2021 00:16:26 +0000 (UTC) X-HE-Tag: shake05_1f172fa275b5 X-Filterd-Recvd-Size: 17414 Received: from mail-pf1-f172.google.com (mail-pf1-f172.google.com [209.85.210.172]) by imf27.hostedemail.com (Postfix) with ESMTP for ; Sun, 31 Jan 2021 00:16:26 +0000 (UTC) Received: by mail-pf1-f172.google.com with SMTP id b145so2281756pfb.4 for ; Sat, 30 Jan 2021 16:16:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=J0sNt0ybEAJJITGt3SZGOCmrQucNsG5Zn6uxnJSWEHU=; b=mGqXv/v3S2PT1kY3WmXQuXGhWNvnf3TFomZSCuxt1lRwxzy077QI4egWiVOxjrT4Ps WdBCHWZ9bM30L+PnMBFQYR5QoPUCONXwjHTb2CDztTXX0+Y/RsYWFOCAgJny0OVKA0dX urxWPZ18rfSnLOWT7SNjyGpNrbjABnWpvnaCkip4YW9JEZq9xoazTN5jOWU/yhyXXh0O ttiWKtPXS0Xf7UzsCDAJtZ7STa/yKkDu5EslnM+ftx0oR1V/jKH82cGUXaqyfGQmYWDD Z0N6Tkq2NOap6GymQUn4b7abeYc4N8WuMremtfN9pDd4PvmWnva8N+AbeWAlPjpEj52G M+0w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; 
bh=J0sNt0ybEAJJITGt3SZGOCmrQucNsG5Zn6uxnJSWEHU=;
From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Andrea Arcangeli, Andrew Morton, Andy Lutomirski, Dave Hansen, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, x86@kernel.org
Subject: [RFC 15/20] mm: detect deferred TLB flushes in vma granularity
Date: Sat, 30 Jan 2021 16:11:27 -0800
Message-Id: <20210131001132.3368247-16-namit@vmware.com>
In-Reply-To: <20210131001132.3368247-1-namit@vmware.com>
References: <20210131001132.3368247-1-namit@vmware.com>
MIME-Version: 1.0

From: Nadav Amit

Currently, deferred TLB flushes are detected at mm granularity: if there is any deferred TLB flush in the entire address space due to NUMA migration, pte_accessible() on x86 returns true, and ptep_clear_flush() requires a TLB flush. This happens even if the PTE resides in a completely different vma.

Recent code changes and possible future enhancements might require detecting deferred TLB flushes at a finer granularity. Finer-grained detection can also enable more aggressive TLB-flush deferring in the future.

Record in each vma the mm's TLB generation after the last deferred PTE/PMD change, while the page-table lock is still held. Increase the mm's generation before recording it, to indicate that a TLB flush is pending. Record in the mmu_gather struct the mm's TLB generation at the time the last TLB flush was deferred. Once the deferred TLB flush eventually takes place, use the TLB generation that is recorded in mmu_gather.

Deferred TLB flushes are detected by checking whether the mm's completed TLB generation is lower than or equal to the mm's TLB generation. Architectures that use the TLB generation logic are required to perform a full TLB flush if they detect that a new TLB flush request "skips" a generation (as the x86 code already does).

To indicate that a deferred TLB flush takes place, increase the mm's TLB generation after updating the PTEs. However, try to avoid increasing the mm's generation again on subsequent PTE updates, as doing so would lead to a full TLB flush once the deferred TLB flushes are performed (due to the "skipped" TLB generation). Therefore, if the mm's generation did not change after a subsequent PTE update, reuse the previous generation.
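The rule above for deciding when a new generation must be taken ends up looking roughly like this (condensed from the read_defer_tlb_flush_gen() helper added below):

	mm_gen = atomic64_read(&mm->tlb_gen);

	/*
	 * Either this is the first deferred flush of this mmu_gather, or
	 * another TLB flush was issued or completed since the last time a
	 * flush was deferred here. In the latter case a generation is
	 * skipped, which turns the eventual flush into a full one; this is
	 * safe and, under TLB-flush storms, usually cheaper anyway.
	 */
	if (mm_gen != tlb->defer_gen) {
		VM_BUG_ON(mm_gen < tlb->defer_gen);
		tlb->defer_gen = inc_mm_tlb_gen(mm);
	}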
As multiple updates of the vma generation can be performed concurrently, use atomic operations to ensure that the TLB generation as recorded in the vma is the last (most recent) one. Once a deferred TLB flush is eventually performed it might be redundant, if due to another TLB flush the deferred flush was performed (by doing a full TLB flush once detecting the "skipped" generation). This case can be detected if the deferred TLB generation, as recorded in mmu_gather was already completed. However, we do not record deferred PUD/P4D flushes, and freeing tables also requires a flush on cores in lazy TLB mode. In such cases a TLB flush is needed even if the mm's completed TLB generation indicates the flush was already "performed". Signed-off-by: Nadav Amit Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: x86@kernel.org --- arch/x86/include/asm/tlb.h | 18 ++++-- arch/x86/include/asm/tlbflush.h | 5 ++ arch/x86/mm/tlb.c | 14 ++++- include/asm-generic/tlb.h | 104 ++++++++++++++++++++++++++++++-- include/linux/mm_types.h | 19 ++++++ mm/mmap.c | 1 + mm/mmu_gather.c | 3 + 7 files changed, 150 insertions(+), 14 deletions(-) diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h index 580636cdc257..ecf538e6c6d5 100644 --- a/arch/x86/include/asm/tlb.h +++ b/arch/x86/include/asm/tlb.h @@ -9,15 +9,23 @@ static inline void tlb_flush(struct mmu_gather *tlb); static inline void tlb_flush(struct mmu_gather *tlb) { - unsigned long start = 0UL, end = TLB_FLUSH_ALL; unsigned int stride_shift = tlb_get_unmap_shift(tlb); - if (!tlb->fullmm && !tlb->need_flush_all) { - start = tlb->start; - end = tlb->end; + /* Perform full flush when needed */ + if (tlb->fullmm || tlb->need_flush_all) { + flush_tlb_mm_range(tlb->mm, 0, TLB_FLUSH_ALL, stride_shift, + tlb->freed_tables); + return; } - flush_tlb_mm_range(tlb->mm, start, end, stride_shift, tlb->freed_tables); + /* Check if flush was already performed */ + if (!tlb->freed_tables && !tlb->cleared_puds && + !tlb->cleared_p4ds && + atomic64_read(&tlb->mm->tlb_gen_completed) > tlb->defer_gen) + return; + + flush_tlb_mm_range_gen(tlb->mm, tlb->start, tlb->end, stride_shift, + tlb->freed_tables, tlb->defer_gen); } /* diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index 2110b98026a7..296a00545056 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -225,6 +225,11 @@ void flush_tlb_others(const struct cpumask *cpumask, : PAGE_SHIFT, false) extern void flush_tlb_all(void); + +extern void flush_tlb_mm_range_gen(struct mm_struct *mm, unsigned long start, + unsigned long end, unsigned int stride_shift, + bool freed_tables, u64 gen); + extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start, unsigned long end, unsigned int stride_shift, bool freed_tables); diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index d17b5575531e..48f4b56fc4a7 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -883,12 +883,11 @@ static inline void put_flush_tlb_info(void) #endif } -void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start, +void flush_tlb_mm_range_gen(struct mm_struct *mm, unsigned long start, unsigned long end, unsigned int stride_shift, - bool freed_tables) + bool freed_tables, u64 new_tlb_gen) { struct flush_tlb_info *info; - u64 new_tlb_gen; int cpu; cpu = get_cpu(); @@ -923,6 +922,15 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start, put_cpu(); } +void 
flush_tlb_mm_range(struct mm_struct *mm, unsigned long start, + unsigned long end, unsigned int stride_shift, + bool freed_tables) +{ + u64 new_tlb_gen = inc_mm_tlb_gen(mm); + + flush_tlb_mm_range_gen(mm, start, end, stride_shift, freed_tables, + new_tlb_gen); +} static void do_flush_tlb_all(void *info) { diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index 10690763090a..f25d2d955076 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -295,6 +295,11 @@ struct mmu_gather { unsigned int cleared_puds : 1; unsigned int cleared_p4ds : 1; + /* + * Whether a TLB flush was needed for PTEs in the current table + */ + unsigned int cleared_ptes_in_table : 1; + unsigned int batch_count; #ifndef CONFIG_MMU_GATHER_NO_GATHER @@ -305,6 +310,10 @@ struct mmu_gather { #ifdef CONFIG_MMU_GATHER_PAGE_SIZE unsigned int page_size; #endif + +#ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS + u64 defer_gen; +#endif #endif }; @@ -381,7 +390,8 @@ static inline void tlb_flush(struct mmu_gather *tlb) #endif #if __is_defined(tlb_flush) || \ - IS_ENABLED(CONFIG_ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING) + IS_ENABLED(CONFIG_ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING) || \ + IS_ENABLED(CONFIG_ARCH_HAS_TLB_GENERATIONS) static inline void tlb_update_vma(struct mmu_gather *tlb, struct vm_area_struct *vma) @@ -472,7 +482,8 @@ static inline unsigned long tlb_get_unmap_size(struct mmu_gather *tlb) */ static inline void tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma) { - if (tlb->fullmm) + if (IS_ENABLED(CONFIG_ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING) && + tlb->fullmm) return; tlb_update_vma(tlb, vma); @@ -530,16 +541,87 @@ static inline void mark_mm_tlb_gen_done(struct mm_struct *mm, u64 gen) tlb_update_generation(&mm->tlb_gen_completed, gen); } -#endif /* CONFIG_ARCH_HAS_TLB_GENERATIONS */ +static inline void read_defer_tlb_flush_gen(struct mmu_gather *tlb) +{ + struct mm_struct *mm = tlb->mm; + u64 mm_gen; + + /* + * Any change of PTE before calling __track_deferred_tlb_flush() must be + * performed using RMW atomic operation that provides a memory barriers, + * such as ptep_modify_prot_start(). The barrier ensure the PTEs are + * written before the current generation is read, synchronizing + * (implicitly) with flush_tlb_mm_range(). + */ + smp_mb__after_atomic(); + + mm_gen = atomic64_read(&mm->tlb_gen); + + /* + * This condition checks for both first deferred TLB flush and for other + * TLB pending or executed TLB flushes after the last table that we + * updated. In the latter case, we are going to skip a generation, which + * would lead to a full TLB flush. This should therefore not cause + * correctness issues, and should not induce overheads, since anyhow in + * TLB storms it is better to perform full TLB flush. + */ + if (mm_gen != tlb->defer_gen) { + VM_BUG_ON(mm_gen < tlb->defer_gen); + + tlb->defer_gen = inc_mm_tlb_gen(mm); + } +} + +/* + * Store the deferred TLB generation in the VMA + */ +static inline void store_deferred_tlb_gen(struct mmu_gather *tlb) +{ + tlb_update_generation(&tlb->vma->defer_tlb_gen, tlb->defer_gen); +} + +/* + * Track deferred TLB flushes for PTEs and PMDs to allow fine granularity checks + * whether a PTE is accessible. The TLB generation after the PTE is flushed is + * saved in the mmu_gather struct. Once a flush is performed, the geneartion is + * advanced. 
+ */ +static inline void track_defer_tlb_flush(struct mmu_gather *tlb) +{ + if (tlb->fullmm) + return; + + BUG_ON(!tlb->vma); + + read_defer_tlb_flush_gen(tlb); + store_deferred_tlb_gen(tlb); +} + +#define init_vma_tlb_generation(vma) \ + atomic64_set(&(vma)->defer_tlb_gen, 0) +#else +static inline void init_vma_tlb_generation(struct vm_area_struct *vma) { } +#endif #define tlb_start_ptes(tlb) \ do { \ struct mmu_gather *_tlb = (tlb); \ \ flush_tlb_batched_pending(_tlb->mm); \ + if (IS_ENABLED(CONFIG_ARCH_HAS_TLB_GENERATIONS)) \ + _tlb->cleared_ptes_in_table = 0; \ } while (0) -static inline void tlb_end_ptes(struct mmu_gather *tlb) { } +static inline void tlb_end_ptes(struct mmu_gather *tlb) +{ + if (!IS_ENABLED(CONFIG_ARCH_HAS_TLB_GENERATIONS)) + return; + + if (tlb->cleared_ptes_in_table) + track_defer_tlb_flush(tlb); + + tlb->cleared_ptes_in_table = 0; +} /* * tlb_flush_{pte|pmd|pud|p4d}_range() adjust the tlb->start and tlb->end, @@ -550,15 +632,25 @@ static inline void tlb_flush_pte_range(struct mmu_gather *tlb, { __tlb_adjust_range(tlb, address, size); tlb->cleared_ptes = 1; + + if (IS_ENABLED(CONFIG_ARCH_HAS_TLB_GENERATIONS)) + tlb->cleared_ptes_in_table = 1; } -static inline void tlb_flush_pmd_range(struct mmu_gather *tlb, +static inline void __tlb_flush_pmd_range(struct mmu_gather *tlb, unsigned long address, unsigned long size) { __tlb_adjust_range(tlb, address, size); tlb->cleared_pmds = 1; } +static inline void tlb_flush_pmd_range(struct mmu_gather *tlb, + unsigned long address, unsigned long size) +{ + __tlb_flush_pmd_range(tlb, address, size); + track_defer_tlb_flush(tlb); +} + static inline void tlb_flush_pud_range(struct mmu_gather *tlb, unsigned long address, unsigned long size) { @@ -649,7 +741,7 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb, #ifndef pte_free_tlb #define pte_free_tlb(tlb, ptep, address) \ do { \ - tlb_flush_pmd_range(tlb, address, PAGE_SIZE); \ + __tlb_flush_pmd_range(tlb, address, PAGE_SIZE); \ tlb->freed_tables = 1; \ __pte_free_tlb(tlb, ptep, address); \ } while (0) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 676795dfd5d4..bbe5d4a422f7 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -367,6 +367,9 @@ struct vm_area_struct { #endif #ifdef CONFIG_NUMA struct mempolicy *vm_policy; /* NUMA policy for the VMA */ +#endif +#ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS + atomic64_t defer_tlb_gen; /* Deferred TLB flushes generation */ #endif struct vm_userfaultfd_ctx vm_userfaultfd_ctx; } __randomize_layout; @@ -628,6 +631,21 @@ static inline bool mm_tlb_flush_pending(struct mm_struct *mm) return atomic_read(&mm->tlb_flush_pending); } +#ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS +static inline bool pte_tlb_flush_pending(struct vm_area_struct *vma, pte_t *pte) +{ + struct mm_struct *mm = vma->vm_mm; + + return atomic64_read(&vma->defer_tlb_gen) < atomic64_read(&mm->tlb_gen_completed); +} + +static inline bool pmd_tlb_flush_pending(struct vm_area_struct *vma, pmd_t *pmd) +{ + struct mm_struct *mm = vma->vm_mm; + + return atomic64_read(&vma->defer_tlb_gen) < atomic64_read(&mm->tlb_gen_completed); +} +#else /* CONFIG_ARCH_HAS_TLB_GENERATIONS */ static inline bool pte_tlb_flush_pending(struct vm_area_struct *vma, pte_t *pte) { return mm_tlb_flush_pending(vma->vm_mm); @@ -637,6 +655,7 @@ static inline bool pmd_tlb_flush_pending(struct vm_area_struct *vma, pmd_t *pmd) { return mm_tlb_flush_pending(vma->vm_mm); } +#endif /* CONFIG_ARCH_HAS_TLB_GENERATIONS */ static inline bool mm_tlb_flush_nested(struct 
mm_struct *mm) { diff --git a/mm/mmap.c b/mm/mmap.c index 90673febce6a..a81ef902e296 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -3337,6 +3337,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap, get_file(new_vma->vm_file); if (new_vma->vm_ops && new_vma->vm_ops->open) new_vma->vm_ops->open(new_vma); + init_vma_tlb_generation(new_vma); vma_link(mm, new_vma, prev, rb_link, rb_parent); *need_rmap_locks = false; } diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c index 13338c096cc6..0d554f2f92ac 100644 --- a/mm/mmu_gather.c +++ b/mm/mmu_gather.c @@ -329,6 +329,9 @@ static void __tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm, #endif tlb_table_init(tlb); +#ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS + tlb->defer_gen = 0; +#endif #ifdef CONFIG_MMU_GATHER_PAGE_SIZE tlb->page_size = 0; #endif From patchwork Sun Jan 31 00:11:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Amit X-Patchwork-Id: 12057471 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5BEFFC433DB for ; Sun, 31 Jan 2021 00:16:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D30AA64E18 for ; Sun, 31 Jan 2021 00:16:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D30AA64E18 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 2F3D36B0082; Sat, 30 Jan 2021 19:16:29 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2A6C36B0083; Sat, 30 Jan 2021 19:16:29 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 11DA96B0085; Sat, 30 Jan 2021 19:16:29 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0183.hostedemail.com [216.40.44.183]) by kanga.kvack.org (Postfix) with ESMTP id E712F6B0082 for ; Sat, 30 Jan 2021 19:16:28 -0500 (EST) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id B614E82499A8 for ; Sun, 31 Jan 2021 00:16:28 +0000 (UTC) X-FDA: 77764153656.17.sock73_33003f1275b5 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin17.hostedemail.com (Postfix) with ESMTP id 9622C180D0180 for ; Sun, 31 Jan 2021 00:16:28 +0000 (UTC) X-HE-Tag: sock73_33003f1275b5 X-Filterd-Recvd-Size: 18669 Received: from mail-pg1-f180.google.com (mail-pg1-f180.google.com [209.85.215.180]) by imf42.hostedemail.com (Postfix) with ESMTP for ; Sun, 31 Jan 2021 00:16:27 +0000 (UTC) Received: by mail-pg1-f180.google.com with SMTP id g15so9431078pgu.9 for ; Sat, 30 Jan 2021 16:16:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; 
bh=ElI7mxctjrmKx7eCNMFhwo+0oUmE+713UYT2vq30CCA=;
From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Andrea Arcangeli, Andrew Morton, Andy Lutomirski, Dave Hansen, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, x86@kernel.org
Subject: [RFC 16/20] mm/tlb: per-page table generation tracking
Date: Sat, 30 Jan 2021 16:11:28 -0800
Message-Id: <20210131001132.3368247-17-namit@vmware.com>
In-Reply-To: <20210131001132.3368247-1-namit@vmware.com>
References: <20210131001132.3368247-1-namit@vmware.com>
MIME-Version: 1.0

From: Nadav Amit

Detecting deferred TLB flushes per-VMA has two drawbacks:

1. It requires an atomic cmpxchg to record the mm's TLB generation at the time of the last TLB flush, as two deferred TLB flushes on the same VMA can race.

2. It might be too coarse-grained for large VMAs.

On 64-bit architectures that have spare space in the page-struct, we can resolve both drawbacks by recording the TLB generation of the last deferred flush in the page-struct of the page-table whose TLB flushes were deferred.

Introduce a new CONFIG_PER_TABLE_DEFERRED_FLUSHES config option. When it is enabled, record the deferred TLB flush generation in the page-struct, protected by the page-table lock.
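With the option enabled, the store and the check reduce to roughly the following (a condensed sketch of the include/asm-generic/tlb.h and include/linux/mm_types.h hunks below):

/*
 * Store side: the deferred generation lives in the page-table's struct page,
 * updated with the page-table lock held.
 */
static inline void store_deferred_tlb_gen(struct mmu_gather *tlb,
					  struct page *page)
{
	page->deferred_tlb_gen = tlb->defer_gen;
}

/*
 * Check side: a TLB flush is pending for this PTE if the generation recorded
 * in its page-table has not been completed yet.
 */
#define pte_tlb_flush_pending(vma, pte)					\
	((pte_to_page(pte))->deferred_tlb_gen <				\
	 atomic64_read(&(vma)->vm_mm->tlb_gen_completed))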
Signed-off-by: Nadav Amit Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: x86@kernel.org --- arch/x86/Kconfig | 1 + arch/x86/include/asm/pgtable.h | 23 ++++++------ fs/proc/task_mmu.c | 6 ++-- include/asm-generic/tlb.h | 65 ++++++++++++++++++++++++++-------- include/linux/mm.h | 13 +++++++ include/linux/mm_types.h | 22 ++++++++++++ init/Kconfig | 7 ++++ mm/huge_memory.c | 2 +- mm/mapping_dirty_helpers.c | 4 +-- mm/mprotect.c | 2 +- 10 files changed, 113 insertions(+), 32 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index d56b0f5cb00c..dfc6ee9dbe9c 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -250,6 +250,7 @@ config X86 select X86_FEATURE_NAMES if PROC_FS select PROC_PID_ARCH_STATUS if PROC_FS select MAPPING_DIRTY_HELPERS + select PER_TABLE_DEFERRED_FLUSHES if X86_64 imply IMA_SECURE_AND_OR_TRUSTED_BOOT if EFI config INSTRUCTION_DECODER diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index a0e069c15dbc..b380a849be90 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -774,17 +774,18 @@ static inline int pte_devmap(pte_t a) } #endif -#define pte_accessible pte_accessible -static inline bool pte_accessible(struct vm_area_struct *vma, pte_t *a) -{ - if (pte_flags(*a) & _PAGE_PRESENT) - return true; - - if ((pte_flags(*a) & _PAGE_PROTNONE) && pte_tlb_flush_pending(vma, a)) - return true; - - return false; -} +#define pte_accessible(vma, a) \ + ({ \ + pte_t *_a = (a); \ + bool r = false; \ + \ + if (pte_flags(*_a) & _PAGE_PRESENT) \ + r = true; \ + else \ + r = ((pte_flags(*_a) & _PAGE_PROTNONE) && \ + pte_tlb_flush_pending((vma), _a)); \ + r; \ + }) static inline int pmd_present(pmd_t pmd) { diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index d0cce961fa5c..00e116feb62c 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1157,7 +1157,7 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr, /* Clear accessed and referenced bits. */ pmdp_test_and_clear_young(vma, addr, pmd); test_and_clear_page_young(page); - tlb_flush_pmd_range(&cp->tlb, addr, HPAGE_PMD_SIZE); + tlb_flush_pmd_range(&cp->tlb, pmd, addr, HPAGE_PMD_SIZE); ClearPageReferenced(page); out: spin_unlock(ptl); @@ -1174,7 +1174,7 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr, if (cp->type == CLEAR_REFS_SOFT_DIRTY) { clear_soft_dirty(vma, addr, pte); - tlb_flush_pte_range(&cp->tlb, addr, PAGE_SIZE); + tlb_flush_pte_range(&cp->tlb, pte, addr, PAGE_SIZE); continue; } @@ -1188,7 +1188,7 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr, /* Clear accessed and referenced bits. 
*/ ptep_test_and_clear_young(vma, addr, pte); test_and_clear_page_young(page); - tlb_flush_pte_range(&cp->tlb, addr, PAGE_SIZE); + tlb_flush_pte_range(&cp->tlb, pte, addr, PAGE_SIZE); ClearPageReferenced(page); } tlb_end_ptes(&cp->tlb); diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index f25d2d955076..74dbb56d816d 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -310,10 +310,12 @@ struct mmu_gather { #ifdef CONFIG_MMU_GATHER_PAGE_SIZE unsigned int page_size; #endif - #ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS u64 defer_gen; #endif +#ifdef CONFIG_PER_TABLE_DEFERRED_FLUSHES + pte_t *last_pte; +#endif #endif }; @@ -572,21 +574,45 @@ static inline void read_defer_tlb_flush_gen(struct mmu_gather *tlb) } } +#ifndef CONFIG_PER_TABLE_DEFERRED_FLUSHES + /* - * Store the deferred TLB generation in the VMA + * Store the deferred TLB generation in the VMA or page-table for PTEs or PMDs */ -static inline void store_deferred_tlb_gen(struct mmu_gather *tlb) +static inline void store_deferred_tlb_gen(struct mmu_gather *tlb, + struct page *page) { tlb_update_generation(&tlb->vma->defer_tlb_gen, tlb->defer_gen); } +static inline void tlb_set_last_pte(struct mmu_gather *tlb, pte_t *pte) { } + +#else /* CONFIG_PER_TABLE_DEFERRED_FLUSHES */ + +/* + * Store the deferred TLB generation in the VMA + */ +static inline void store_deferred_tlb_gen(struct mmu_gather *tlb, + struct page *page) +{ + page->deferred_tlb_gen = tlb->defer_gen; +} + +static inline void tlb_set_last_pte(struct mmu_gather *tlb, pte_t *pte) +{ + tlb->last_pte = pte; +} + +#endif /* CONFIG_PER_TABLE_DEFERRED_FLUSHES */ + /* * Track deferred TLB flushes for PTEs and PMDs to allow fine granularity checks * whether a PTE is accessible. The TLB generation after the PTE is flushed is * saved in the mmu_gather struct. Once a flush is performed, the geneartion is * advanced. */ -static inline void track_defer_tlb_flush(struct mmu_gather *tlb) +static inline void track_defer_tlb_flush(struct mmu_gather *tlb, + struct page *page) { if (tlb->fullmm) return; @@ -594,7 +620,7 @@ static inline void track_defer_tlb_flush(struct mmu_gather *tlb) BUG_ON(!tlb->vma); read_defer_tlb_flush_gen(tlb); - store_deferred_tlb_gen(tlb); + store_deferred_tlb_gen(tlb, page); } #define init_vma_tlb_generation(vma) \ @@ -610,6 +636,7 @@ static inline void init_vma_tlb_generation(struct vm_area_struct *vma) { } flush_tlb_batched_pending(_tlb->mm); \ if (IS_ENABLED(CONFIG_ARCH_HAS_TLB_GENERATIONS)) \ _tlb->cleared_ptes_in_table = 0; \ + tlb_set_last_pte(_tlb, NULL); \ } while (0) static inline void tlb_end_ptes(struct mmu_gather *tlb) @@ -617,24 +644,31 @@ static inline void tlb_end_ptes(struct mmu_gather *tlb) if (!IS_ENABLED(CONFIG_ARCH_HAS_TLB_GENERATIONS)) return; +#ifdef CONFIG_PER_TABLE_DEFERRED_FLUSHES + if (tlb->last_pte) + track_defer_tlb_flush(tlb, pte_to_page(tlb->last_pte)); +#elif CONFIG_ARCH_HAS_TLB_GENERATIONS /* && !CONFIG_PER_TABLE_DEFERRED_FLUSHES */ if (tlb->cleared_ptes_in_table) - track_defer_tlb_flush(tlb); - + track_defer_tlb_flush(tlb, NULL); tlb->cleared_ptes_in_table = 0; +#endif /* CONFIG_PER_TABLE_DEFERRED_FLUSHES */ } /* * tlb_flush_{pte|pmd|pud|p4d}_range() adjust the tlb->start and tlb->end, * and set corresponding cleared_*. 
*/ -static inline void tlb_flush_pte_range(struct mmu_gather *tlb, +static inline void tlb_flush_pte_range(struct mmu_gather *tlb, pte_t *pte, unsigned long address, unsigned long size) { __tlb_adjust_range(tlb, address, size); tlb->cleared_ptes = 1; - if (IS_ENABLED(CONFIG_ARCH_HAS_TLB_GENERATIONS)) + if (IS_ENABLED(CONFIG_ARCH_HAS_TLB_GENERATIONS) && + !IS_ENABLED(CONFIG_PER_TABLE_DEFERRED_FLUSHES)) tlb->cleared_ptes_in_table = 1; + + tlb_set_last_pte(tlb, pte); } static inline void __tlb_flush_pmd_range(struct mmu_gather *tlb, @@ -644,11 +678,11 @@ static inline void __tlb_flush_pmd_range(struct mmu_gather *tlb, tlb->cleared_pmds = 1; } -static inline void tlb_flush_pmd_range(struct mmu_gather *tlb, +static inline void tlb_flush_pmd_range(struct mmu_gather *tlb, pmd_t *pmd, unsigned long address, unsigned long size) { __tlb_flush_pmd_range(tlb, address, size); - track_defer_tlb_flush(tlb); + track_defer_tlb_flush(tlb, pmd_to_page(pmd)); } static inline void tlb_flush_pud_range(struct mmu_gather *tlb, @@ -678,7 +712,8 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb, */ #define tlb_remove_tlb_entry(tlb, ptep, address) \ do { \ - tlb_flush_pte_range(tlb, address, PAGE_SIZE); \ + tlb_flush_pte_range(tlb, ptep, address, \ + PAGE_SIZE); \ __tlb_remove_tlb_entry(tlb, ptep, address); \ } while (0) @@ -686,7 +721,8 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb, do { \ unsigned long _sz = huge_page_size(h); \ if (_sz == PMD_SIZE) \ - tlb_flush_pmd_range(tlb, address, _sz); \ + tlb_flush_pmd_range(tlb, (pmd_t *)ptep, \ + address, _sz); \ else if (_sz == PUD_SIZE) \ tlb_flush_pud_range(tlb, address, _sz); \ __tlb_remove_tlb_entry(tlb, ptep, address); \ @@ -702,7 +738,8 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb, #define tlb_remove_pmd_tlb_entry(tlb, pmdp, address) \ do { \ - tlb_flush_pmd_range(tlb, address, HPAGE_PMD_SIZE); \ + tlb_flush_pmd_range(tlb, pmdp, address, \ + HPAGE_PMD_SIZE); \ __tlb_remove_pmd_tlb_entry(tlb, pmdp, address); \ } while (0) diff --git a/include/linux/mm.h b/include/linux/mm.h index d78a79fbb012..a8a5bf82bd03 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2208,11 +2208,21 @@ static inline void pgtable_init(void) pgtable_cache_init(); } +#ifdef CONFIG_PER_TABLE_DEFERRED_FLUSHES +static inline void page_table_tlb_gen_init(struct page *page) +{ + page->deferred_tlb_gen = 0; +} +#else /* CONFIG_PER_TABLE_DEFERRED_FLUSHES */ +static inline void page_table_tlb_gen_init(struct page *page) { } +#endif /* CONFIG_PER_TABLE_DEFERRED_FLUSHES */ + static inline bool pgtable_pte_page_ctor(struct page *page) { if (!ptlock_init(page)) return false; __SetPageTable(page); + page_table_tlb_gen_init(page); inc_lruvec_page_state(page, NR_PAGETABLE); return true; } @@ -2221,6 +2231,7 @@ static inline void pgtable_pte_page_dtor(struct page *page) { ptlock_free(page); __ClearPageTable(page); + page_table_tlb_gen_init(page); dec_lruvec_page_state(page, NR_PAGETABLE); } @@ -2308,6 +2319,7 @@ static inline bool pgtable_pmd_page_ctor(struct page *page) if (!pmd_ptlock_init(page)) return false; __SetPageTable(page); + page_table_tlb_gen_init(page); inc_lruvec_page_state(page, NR_PAGETABLE); return true; } @@ -2316,6 +2328,7 @@ static inline void pgtable_pmd_page_dtor(struct page *page) { pmd_ptlock_free(page); __ClearPageTable(page); + page_table_tlb_gen_init(page); dec_lruvec_page_state(page, NR_PAGETABLE); } diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index bbe5d4a422f7..cae9e8bbf8e6 100644 --- 
a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -148,6 +148,9 @@ struct page { pgtable_t pmd_huge_pte; /* protected by page->ptl */ unsigned long _pt_pad_2; /* mapping */ union { +#ifdef CONFIG_PER_TABLE_DEFERRED_FLUSHES + u64 deferred_tlb_gen; /* x86 non-pgd protected by page->ptl */ +#endif struct mm_struct *pt_mm; /* x86 pgds only */ atomic_t pt_frag_refcount; /* powerpc */ }; @@ -632,6 +635,7 @@ static inline bool mm_tlb_flush_pending(struct mm_struct *mm) } #ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS +#ifndef CONFIG_PER_TABLE_DEFERRED_FLUSHES static inline bool pte_tlb_flush_pending(struct vm_area_struct *vma, pte_t *pte) { struct mm_struct *mm = vma->vm_mm; @@ -645,6 +649,24 @@ static inline bool pmd_tlb_flush_pending(struct vm_area_struct *vma, pmd_t *pmd) return atomic64_read(&vma->defer_tlb_gen) < atomic64_read(&mm->tlb_gen_completed); } +#else /* CONFIG_PER_TABLE_DEFERRED_FLUSHES */ +#define pte_tlb_flush_pending(vma, pte) \ + ({ \ + struct mm_struct *mm = (vma)->vm_mm; \ + \ + (pte_to_page(pte))->deferred_tlb_gen < \ + atomic64_read(&mm->tlb_gen_completed); \ + }) + +#define pmd_tlb_flush_pending(vma, pmd) \ + ({ \ + struct mm_struct *mm = (vma)->vm_mm; \ + \ + (pmd_to_page(pmd))->deferred_tlb_gen < \ + atomic64_read(&mm->tlb_gen_completed); \ + }) + +#endif /* CONFIG_PER_TABLE_DEFERRED_FLUSHES */ #else /* CONFIG_ARCH_HAS_TLB_GENERATIONS */ static inline bool pte_tlb_flush_pending(struct vm_area_struct *vma, pte_t *pte) { diff --git a/init/Kconfig b/init/Kconfig index 14a599a48738..e0d8a9ea7dd0 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -857,6 +857,13 @@ config ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING bool depends on !CONFIG_MMU_GATHER_NO_GATHER +# +# For architectures that prefer to save deferred TLB generations in the +# page-table instead of the VMA. 
+config PER_TABLE_DEFERRED_FLUSHES + bool + depends on ARCH_HAS_TLB_GENERATIONS && 64BIT + config CC_HAS_INT128 def_bool !$(cc-option,$(m64-flag) -D__SIZEOF_INT128__=0) && 64BIT diff --git a/mm/huge_memory.c b/mm/huge_memory.c index c4b7c00cc69c..8f6c0e1a7ff7 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1886,7 +1886,7 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, entry = pmd_clear_uffd_wp(entry); } ret = HPAGE_PMD_NR; - tlb_flush_pmd_range(tlb, addr, HPAGE_PMD_SIZE); + tlb_flush_pmd_range(tlb, pmd, addr, HPAGE_PMD_SIZE); set_pmd_at(mm, addr, pmd, entry); BUG_ON(vma_is_anonymous(vma) && !preserve_write && pmd_write(entry)); unlock: diff --git a/mm/mapping_dirty_helpers.c b/mm/mapping_dirty_helpers.c index 063419ade304..923b8c0ec837 100644 --- a/mm/mapping_dirty_helpers.c +++ b/mm/mapping_dirty_helpers.c @@ -48,7 +48,7 @@ static int wp_pte(pte_t *pte, unsigned long addr, unsigned long end, wpwalk->total++; if (pte_may_need_flush(old_pte, ptent)) - tlb_flush_pte_range(&wpwalk->tlb, addr, PAGE_SIZE); + tlb_flush_pte_range(&wpwalk->tlb, pte, addr, PAGE_SIZE); tlb_end_ptes(&wpwalk->tlb); } @@ -110,7 +110,7 @@ static int clean_record_pte(pte_t *pte, unsigned long addr, ptep_modify_prot_commit(walk->vma, addr, pte, old_pte, ptent); wpwalk->total++; - tlb_flush_pte_range(&wpwalk->tlb, addr, PAGE_SIZE); + tlb_flush_pte_range(&wpwalk->tlb, pte, addr, PAGE_SIZE); tlb_end_ptes(&wpwalk->tlb); __set_bit(pgoff, cwalk->bitmap); diff --git a/mm/mprotect.c b/mm/mprotect.c index 1258bbe42ee1..c3aa3030f4d9 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -140,7 +140,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb, } ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent); if (pte_may_need_flush(oldpte, ptent)) - tlb_flush_pte_range(tlb, addr, PAGE_SIZE); + tlb_flush_pte_range(tlb, pte, addr, PAGE_SIZE); pages++; } else if (is_swap_pte(oldpte)) { swp_entry_t entry = pte_to_swp_entry(oldpte); From patchwork Sun Jan 31 00:11:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Amit X-Patchwork-Id: 12057473 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E7988C433E6 for ; Sun, 31 Jan 2021 00:16:42 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8D04964E15 for ; Sun, 31 Jan 2021 00:16:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8D04964E15 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 82B836B0083; Sat, 30 Jan 2021 19:16:30 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 787CA6B0085; Sat, 30 Jan 2021 19:16:30 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5DAE86B0087; Sat, 30 Jan 2021 19:16:30 -0500 (EST) X-Delivered-To: 
linux-mm@kvack.org From: Nadav Amit X-Google-Original-From: Nadav Amit To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Andrea Arcangeli , Andrew Morton , Andy Lutomirski , Dave Hansen , Peter Zijlstra , Thomas Gleixner , Will Deacon , Yu Zhao , x86@kernel.org Subject: [RFC 17/20] mm/tlb: updated completed deferred TLB flush conditionally Date: Sat, 30 Jan 2021 16:11:29 -0800 Message-Id: <20210131001132.3368247-18-namit@vmware.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210131001132.3368247-1-namit@vmware.com> References: <20210131001132.3368247-1-namit@vmware.com> MIME-Version: 1.0 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nadav Amit If all the deferred TLB flushes were completed, 
there is no need to update the completed TLB flush. This update requires an atomic cmpxchg, so we would like to skip it. To do so, save for each mm the last TLB generation in which TLB flushes were deferred. While saving this information requires another atomic cmpxchg, assume that deferred TLB flushes are less frequent than TLB flushes. Signed-off-by: Nadav Amit Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: x86@kernel.org --- include/asm-generic/tlb.h | 23 ++++++++++++++++++----- include/linux/mm_types.h | 5 +++++ 2 files changed, 23 insertions(+), 5 deletions(-) diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index 74dbb56d816d..a41af03fbede 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -536,6 +536,14 @@ static inline void tlb_update_generation(atomic64_t *gen, u64 new_gen) static inline void mark_mm_tlb_gen_done(struct mm_struct *mm, u64 gen) { + /* + * If we all the deferred TLB generations were completed, we can skip + * the update of tlb_gen_completed and save a few cycles on cmpxchg. + */ + if (atomic64_read(&mm->tlb_gen_deferred) == + atomic64_read(&mm->tlb_gen_completed)) + return; + /* * Update the completed generation to the new generation if the new * generation is greater than the previous one. @@ -546,7 +554,7 @@ static inline void mark_mm_tlb_gen_done(struct mm_struct *mm, u64 gen) static inline void read_defer_tlb_flush_gen(struct mmu_gather *tlb) { struct mm_struct *mm = tlb->mm; - u64 mm_gen; + u64 mm_gen, new_gen; /* * Any change of PTE before calling __track_deferred_tlb_flush() must be @@ -567,11 +575,16 @@ static inline void read_defer_tlb_flush_gen(struct mmu_gather *tlb) * correctness issues, and should not induce overheads, since anyhow in * TLB storms it is better to perform full TLB flush. */ - if (mm_gen != tlb->defer_gen) { - VM_BUG_ON(mm_gen < tlb->defer_gen); + if (mm_gen == tlb->defer_gen) + return; - tlb->defer_gen = inc_mm_tlb_gen(mm); - } + VM_BUG_ON(mm_gen < tlb->defer_gen); + + new_gen = inc_mm_tlb_gen(mm); + tlb->defer_gen = new_gen; + + /* Update mm->tlb_gen_deferred */ + tlb_update_generation(&mm->tlb_gen_deferred, new_gen); } #ifndef CONFIG_PER_TABLE_DEFERRED_FLUSHES diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index cae9e8bbf8e6..4122a9b8b56f 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -578,6 +578,11 @@ struct mm_struct { */ atomic64_t tlb_gen; + /* + * The last TLB generation which was deferred. 
+ */ + atomic64_t tlb_gen_deferred; + /* * TLB generation which is guarnateed to be flushed, including * all the PTE changes that were performed before tlb_gen was From patchwork Sun Jan 31 00:11:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Amit X-Patchwork-Id: 12057475 From: Nadav Amit X-Google-Original-From: Nadav Amit To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Andrea Arcangeli , Andrew Morton , Andy Lutomirski , Dave Hansen , Peter Zijlstra , Thomas Gleixner , Will Deacon , Yu Zhao , x86@kernel.org Subject: [RFC 18/20] mm: make mm_cpumask() volatile Date: Sat, 30 Jan 2021 16:11:30 -0800 Message-Id: <20210131001132.3368247-19-namit@vmware.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210131001132.3368247-1-namit@vmware.com> References: <20210131001132.3368247-1-namit@vmware.com> MIME-Version: 1.0 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nadav Amit mm_cpumask() is volatile: a bit might be turned on or off at any given moment, and it is not protected by any lock. While the kernel coding guidelines strongly discourage the use of volatile, not marking mm_cpumask() as volatile seems wrong. Cpumask and bitmap manipulation functions may work fine, as they are allowed to use either the new or the old value. Apparently they do, as no bugs were reported. However, the fact that mm_cpumask() is not volatile might lead to theoretical bugs due to compiler optimizations. For example, cpumask_next() uses _find_next_bit(). A compiler might add to _find_next_bit() invented loads that would cause __ffs() to run on a different value than the one read before. Consequently, if something like that happens, the result might be a CPU that was set in neither the old nor the new mask. I could not find what might go wrong in such a case, but it seems like an improper result. Mark the mm_cpumask() result as volatile and propagate the "volatile" qualifier wherever the compiler then requires it. 
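The concern is easiest to see in a tiny standalone example. The sketch below is illustration only (userspace C with invented helper names such as racy_next_bit(); it is not kernel code and not part of this patch). It contrasts a scan that takes one snapshot of a shared bitmap word with one that may read the word twice, which is the pattern an invented load would create in _find_next_bit().

/*
 * racy_next_bit() reads the shared word twice: if another thread changes
 * the word between the two loads, the bit number computed by
 * __builtin_ctzl() may match neither the old nor the new value.  As the
 * commit message argues, a compiler is free to turn one source-level read
 * of a non-volatile object into two actual loads, recreating this pattern.
 * snapshot_next_bit() takes a single snapshot and works on the copy.
 */
#include <stdio.h>

static unsigned long racy_next_bit(const unsigned long *word)
{
	if (*word)				/* first load */
		return __builtin_ctzl(*word);	/* second load may see a new value */
	return 8 * sizeof(*word);
}

static unsigned long snapshot_next_bit(const volatile unsigned long *word)
{
	unsigned long w = *word;		/* one load, then use the local copy */

	return w ? __builtin_ctzl(w) : 8 * sizeof(w);
}

int main(void)
{
	unsigned long mask = 1UL << 4;		/* pretend only CPU 4 is set */

	printf("%lu %lu\n", racy_next_bit(&mask), snapshot_next_bit(&mask));
	return 0;
}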
Signed-off-by: Nadav Amit Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: x86@kernel.org --- arch/arm/include/asm/bitops.h | 4 ++-- arch/x86/hyperv/mmu.c | 2 +- arch/x86/include/asm/paravirt_types.h | 2 +- arch/x86/include/asm/tlbflush.h | 2 +- arch/x86/mm/tlb.c | 4 ++-- arch/x86/xen/mmu_pv.c | 2 +- include/asm-generic/bitops/find.h | 8 ++++---- include/linux/bitmap.h | 16 +++++++-------- include/linux/cpumask.h | 28 +++++++++++++-------------- include/linux/mm_types.h | 4 ++-- include/linux/smp.h | 6 +++--- kernel/smp.c | 8 ++++---- lib/bitmap.c | 8 ++++---- lib/cpumask.c | 8 ++++---- lib/find_bit.c | 10 +++++----- 15 files changed, 56 insertions(+), 56 deletions(-) diff --git a/arch/arm/include/asm/bitops.h b/arch/arm/include/asm/bitops.h index c92e42a5c8f7..c8690c0ff15a 100644 --- a/arch/arm/include/asm/bitops.h +++ b/arch/arm/include/asm/bitops.h @@ -162,8 +162,8 @@ extern int _test_and_change_bit(int nr, volatile unsigned long * p); */ extern int _find_first_zero_bit_le(const unsigned long *p, unsigned size); extern int _find_next_zero_bit_le(const unsigned long *p, int size, int offset); -extern int _find_first_bit_le(const unsigned long *p, unsigned size); -extern int _find_next_bit_le(const unsigned long *p, int size, int offset); +extern int _find_first_bit_le(const volatile unsigned long *p, unsigned size); +extern int _find_next_bit_le(const volatile unsigned long *p, int size, int offset); /* * Big endian assembly bitops. nr = 0 -> byte 3 bit 0. diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c index 2c87350c1fb0..76ce8a0f19ef 100644 --- a/arch/x86/hyperv/mmu.c +++ b/arch/x86/hyperv/mmu.c @@ -52,7 +52,7 @@ static inline int fill_gva_list(u64 gva_list[], int offset, return gva_n - offset; } -static void hyperv_flush_tlb_others(const struct cpumask *cpus, +static void hyperv_flush_tlb_others(const volatile struct cpumask *cpus, const struct flush_tlb_info *info) { int cpu, vcpu, gva_n, max_gvas; diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h index b6b02b7c19cc..35b5696aedc7 100644 --- a/arch/x86/include/asm/paravirt_types.h +++ b/arch/x86/include/asm/paravirt_types.h @@ -201,7 +201,7 @@ struct pv_mmu_ops { void (*flush_tlb_user)(void); void (*flush_tlb_kernel)(void); void (*flush_tlb_one_user)(unsigned long addr); - void (*flush_tlb_others)(const struct cpumask *cpus, + void (*flush_tlb_others)(const volatile struct cpumask *cpus, const struct flush_tlb_info *info); void (*tlb_remove_table)(struct mmu_gather *tlb, void *table); diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index 296a00545056..a4e7c90d11a8 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -208,7 +208,7 @@ struct flush_tlb_info { void flush_tlb_local(void); void flush_tlb_one_user(unsigned long addr); void flush_tlb_one_kernel(unsigned long addr); -void flush_tlb_others(const struct cpumask *cpumask, +void flush_tlb_others(const volatile struct cpumask *cpumask, const struct flush_tlb_info *info); #ifdef CONFIG_PARAVIRT diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 48f4b56fc4a7..ba85d6bb4988 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -796,7 +796,7 @@ static bool tlb_is_not_lazy(int cpu, void *data) return !per_cpu(cpu_tlbstate.is_lazy, cpu); } -STATIC_NOPV void native_flush_tlb_others(const struct cpumask *cpumask, +STATIC_NOPV void native_flush_tlb_others(const volatile 
struct cpumask *cpumask, const struct flush_tlb_info *info) { count_vm_tlb_event(NR_TLB_REMOTE_FLUSH); @@ -824,7 +824,7 @@ STATIC_NOPV void native_flush_tlb_others(const struct cpumask *cpumask, (void *)info, 1, cpumask); } -void flush_tlb_others(const struct cpumask *cpumask, +void flush_tlb_others(const volatile struct cpumask *cpumask, const struct flush_tlb_info *info) { __flush_tlb_others(cpumask, info); diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c index cf2ade864c30..0f9e1ff1e388 100644 --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -1247,7 +1247,7 @@ static void xen_flush_tlb_one_user(unsigned long addr) preempt_enable(); } -static void xen_flush_tlb_others(const struct cpumask *cpus, +static void xen_flush_tlb_others(const volatile struct cpumask *cpus, const struct flush_tlb_info *info) { struct { diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h index 9fdf21302fdf..324078362ea1 100644 --- a/include/asm-generic/bitops/find.h +++ b/include/asm-generic/bitops/find.h @@ -12,8 +12,8 @@ * Returns the bit number for the next set bit * If no bits are set, returns @size. */ -extern unsigned long find_next_bit(const unsigned long *addr, unsigned long - size, unsigned long offset); +extern unsigned long find_next_bit(const volatile unsigned long *addr, + unsigned long size, unsigned long offset); #endif #ifndef find_next_and_bit @@ -27,8 +27,8 @@ extern unsigned long find_next_bit(const unsigned long *addr, unsigned long * Returns the bit number for the next set bit * If no bits are set, returns @size. */ -extern unsigned long find_next_and_bit(const unsigned long *addr1, - const unsigned long *addr2, unsigned long size, +extern unsigned long find_next_and_bit(const volatile unsigned long *addr1, + const volatile unsigned long *addr2, unsigned long size, unsigned long offset); #endif diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h index 70a932470b2d..769b7a98e12f 100644 --- a/include/linux/bitmap.h +++ b/include/linux/bitmap.h @@ -141,8 +141,8 @@ extern void __bitmap_shift_left(unsigned long *dst, const unsigned long *src, extern void bitmap_cut(unsigned long *dst, const unsigned long *src, unsigned int first, unsigned int cut, unsigned int nbits); -extern int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1, - const unsigned long *bitmap2, unsigned int nbits); +extern int __bitmap_and(unsigned long *dst, const volatile unsigned long *bitmap1, + const volatile unsigned long *bitmap2, unsigned int nbits); extern void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1, const unsigned long *bitmap2, unsigned int nbits); extern void __bitmap_xor(unsigned long *dst, const unsigned long *bitmap1, @@ -152,8 +152,8 @@ extern int __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1, extern void __bitmap_replace(unsigned long *dst, const unsigned long *old, const unsigned long *new, const unsigned long *mask, unsigned int nbits); -extern int __bitmap_intersects(const unsigned long *bitmap1, - const unsigned long *bitmap2, unsigned int nbits); +extern int __bitmap_intersects(const volatile unsigned long *bitmap1, + const volatile unsigned long *bitmap2, unsigned int nbits); extern int __bitmap_subset(const unsigned long *bitmap1, const unsigned long *bitmap2, unsigned int nbits); extern int __bitmap_weight(const unsigned long *bitmap, unsigned int nbits); @@ -278,8 +278,8 @@ extern void bitmap_to_arr32(u32 *buf, const unsigned long *bitmap, (const unsigned long *) (bitmap), (nbits)) #endif 
-static inline int bitmap_and(unsigned long *dst, const unsigned long *src1, - const unsigned long *src2, unsigned int nbits) +static inline int bitmap_and(unsigned long *dst, const volatile unsigned long *src1, + const volatile unsigned long *src2, unsigned int nbits) { if (small_const_nbits(nbits)) return (*dst = *src1 & *src2 & BITMAP_LAST_WORD_MASK(nbits)) != 0; @@ -359,8 +359,8 @@ static inline bool bitmap_or_equal(const unsigned long *src1, return !(((*src1 | *src2) ^ *src3) & BITMAP_LAST_WORD_MASK(nbits)); } -static inline int bitmap_intersects(const unsigned long *src1, - const unsigned long *src2, unsigned int nbits) +static inline int bitmap_intersects(const volatile unsigned long *src1, + const volatile unsigned long *src2, unsigned int nbits) { if (small_const_nbits(nbits)) return ((*src1 & *src2) & BITMAP_LAST_WORD_MASK(nbits)) != 0; diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h index 383684e30f12..3d7e418aa113 100644 --- a/include/linux/cpumask.h +++ b/include/linux/cpumask.h @@ -158,7 +158,7 @@ static inline unsigned int cpumask_last(const struct cpumask *srcp) } /* Valid inputs for n are -1 and 0. */ -static inline unsigned int cpumask_next(int n, const struct cpumask *srcp) +static inline unsigned int cpumask_next(int n, const volatile struct cpumask *srcp) { return n+1; } @@ -169,8 +169,8 @@ static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp) } static inline unsigned int cpumask_next_and(int n, - const struct cpumask *srcp, - const struct cpumask *andp) + const volatile struct cpumask *srcp, + const volatile struct cpumask *andp) { return n+1; } @@ -183,7 +183,7 @@ static inline unsigned int cpumask_next_wrap(int n, const struct cpumask *mask, } /* cpu must be a valid cpu, ie 0, so there's no other choice. 
*/ -static inline unsigned int cpumask_any_but(const struct cpumask *mask, +static inline unsigned int cpumask_any_but(const volatile struct cpumask *mask, unsigned int cpu) { return 1; @@ -235,7 +235,7 @@ static inline unsigned int cpumask_last(const struct cpumask *srcp) return find_last_bit(cpumask_bits(srcp), nr_cpumask_bits); } -unsigned int cpumask_next(int n, const struct cpumask *srcp); +unsigned int cpumask_next(int n, const volatile struct cpumask *srcp); /** * cpumask_next_zero - get the next unset cpu in a cpumask @@ -252,8 +252,8 @@ static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp) return find_next_zero_bit(cpumask_bits(srcp), nr_cpumask_bits, n+1); } -int cpumask_next_and(int n, const struct cpumask *, const struct cpumask *); -int cpumask_any_but(const struct cpumask *mask, unsigned int cpu); +int cpumask_next_and(int n, const volatile struct cpumask *, const volatile struct cpumask *); +int cpumask_any_but(const volatile struct cpumask *mask, unsigned int cpu); unsigned int cpumask_local_spread(unsigned int i, int node); int cpumask_any_and_distribute(const struct cpumask *src1p, const struct cpumask *src2p); @@ -335,7 +335,7 @@ extern int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool * @cpu: cpu number (< nr_cpu_ids) * @dstp: the cpumask pointer */ -static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp) +static inline void cpumask_set_cpu(unsigned int cpu, volatile struct cpumask *dstp) { set_bit(cpumask_check(cpu), cpumask_bits(dstp)); } @@ -351,7 +351,7 @@ static inline void __cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp) * @cpu: cpu number (< nr_cpu_ids) * @dstp: the cpumask pointer */ -static inline void cpumask_clear_cpu(int cpu, struct cpumask *dstp) +static inline void cpumask_clear_cpu(int cpu, volatile struct cpumask *dstp) { clear_bit(cpumask_check(cpu), cpumask_bits(dstp)); } @@ -368,7 +368,7 @@ static inline void __cpumask_clear_cpu(int cpu, struct cpumask *dstp) * * Returns 1 if @cpu is set in @cpumask, else returns 0 */ -static inline int cpumask_test_cpu(int cpu, const struct cpumask *cpumask) +static inline int cpumask_test_cpu(int cpu, const volatile struct cpumask *cpumask) { return test_bit(cpumask_check(cpu), cpumask_bits((cpumask))); } @@ -428,8 +428,8 @@ static inline void cpumask_clear(struct cpumask *dstp) * If *@dstp is empty, returns 0, else returns 1 */ static inline int cpumask_and(struct cpumask *dstp, - const struct cpumask *src1p, - const struct cpumask *src2p) + const volatile struct cpumask *src1p, + const volatile struct cpumask *src2p) { return bitmap_and(cpumask_bits(dstp), cpumask_bits(src1p), cpumask_bits(src2p), nr_cpumask_bits); @@ -521,8 +521,8 @@ static inline bool cpumask_or_equal(const struct cpumask *src1p, * @src1p: the first input * @src2p: the second input */ -static inline bool cpumask_intersects(const struct cpumask *src1p, - const struct cpumask *src2p) +static inline bool cpumask_intersects(const volatile struct cpumask *src1p, + const volatile struct cpumask *src2p) { return bitmap_intersects(cpumask_bits(src1p), cpumask_bits(src2p), nr_cpumask_bits); diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 4122a9b8b56f..5a9b8c417f23 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -611,9 +611,9 @@ static inline void mm_init_cpumask(struct mm_struct *mm) } /* Future-safe accessor for struct mm_struct's cpu_vm_mask. 
*/ -static inline cpumask_t *mm_cpumask(struct mm_struct *mm) +static inline volatile cpumask_t *mm_cpumask(struct mm_struct *mm) { - return (struct cpumask *)&mm->cpu_bitmap; + return (volatile struct cpumask *)&mm->cpu_bitmap; } struct mmu_gather; diff --git a/include/linux/smp.h b/include/linux/smp.h index 70c6f6284dcf..62b3456fec04 100644 --- a/include/linux/smp.h +++ b/include/linux/smp.h @@ -59,7 +59,7 @@ void on_each_cpu(smp_call_func_t func, void *info, int wait); * Call a function on processors specified by mask, which might include * the local one. */ -void on_each_cpu_mask(const struct cpumask *mask, smp_call_func_t func, +void on_each_cpu_mask(const volatile struct cpumask *mask, smp_call_func_t func, void *info, bool wait); /* @@ -71,7 +71,7 @@ void on_each_cpu_cond(smp_cond_func_t cond_func, smp_call_func_t func, void *info, bool wait); void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func, - void *info, bool wait, const struct cpumask *mask); + void *info, bool wait, const volatile struct cpumask *mask); int smp_call_function_single_async(int cpu, call_single_data_t *csd); @@ -118,7 +118,7 @@ extern void smp_cpus_done(unsigned int max_cpus); * Call a function on all other processors */ void smp_call_function(smp_call_func_t func, void *info, int wait); -void smp_call_function_many(const struct cpumask *mask, +void smp_call_function_many(const volatile struct cpumask *mask, smp_call_func_t func, void *info, bool wait); int smp_call_function_any(const struct cpumask *mask, diff --git a/kernel/smp.c b/kernel/smp.c index 1b6070bf97bb..fa6e080251bf 100644 --- a/kernel/smp.c +++ b/kernel/smp.c @@ -604,7 +604,7 @@ int smp_call_function_any(const struct cpumask *mask, } EXPORT_SYMBOL_GPL(smp_call_function_any); -static void smp_call_function_many_cond(const struct cpumask *mask, +static void smp_call_function_many_cond(const volatile struct cpumask *mask, smp_call_func_t func, void *info, bool wait, smp_cond_func_t cond_func) { @@ -705,7 +705,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask, * hardware interrupt handler or from a bottom half handler. Preemption * must be disabled when calling this function. */ -void smp_call_function_many(const struct cpumask *mask, +void smp_call_function_many(const volatile struct cpumask *mask, smp_call_func_t func, void *info, bool wait) { smp_call_function_many_cond(mask, func, info, wait, NULL); @@ -853,7 +853,7 @@ EXPORT_SYMBOL(on_each_cpu); * exception is that it may be used during early boot while * early_boot_irqs_disabled is set. */ -void on_each_cpu_mask(const struct cpumask *mask, smp_call_func_t func, +void on_each_cpu_mask(const volatile struct cpumask *mask, smp_call_func_t func, void *info, bool wait) { int cpu = get_cpu(); @@ -892,7 +892,7 @@ EXPORT_SYMBOL(on_each_cpu_mask); * from a hardware interrupt handler or from a bottom half handler. 
*/ void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func, - void *info, bool wait, const struct cpumask *mask) + void *info, bool wait, const volatile struct cpumask *mask) { int cpu = get_cpu(); diff --git a/lib/bitmap.c b/lib/bitmap.c index 75006c4036e9..6df7b13727d3 100644 --- a/lib/bitmap.c +++ b/lib/bitmap.c @@ -235,8 +235,8 @@ void bitmap_cut(unsigned long *dst, const unsigned long *src, } EXPORT_SYMBOL(bitmap_cut); -int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1, - const unsigned long *bitmap2, unsigned int bits) +int __bitmap_and(unsigned long *dst, const volatile unsigned long *bitmap1, + const volatile unsigned long *bitmap2, unsigned int bits) { unsigned int k; unsigned int lim = bits/BITS_PER_LONG; @@ -301,8 +301,8 @@ void __bitmap_replace(unsigned long *dst, } EXPORT_SYMBOL(__bitmap_replace); -int __bitmap_intersects(const unsigned long *bitmap1, - const unsigned long *bitmap2, unsigned int bits) +int __bitmap_intersects(const volatile unsigned long *bitmap1, + const volatile unsigned long *bitmap2, unsigned int bits) { unsigned int k, lim = bits/BITS_PER_LONG; for (k = 0; k < lim; ++k) diff --git a/lib/cpumask.c b/lib/cpumask.c index 35924025097b..28763b992beb 100644 --- a/lib/cpumask.c +++ b/lib/cpumask.c @@ -15,7 +15,7 @@ * * Returns >= nr_cpu_ids if no further cpus set. */ -unsigned int cpumask_next(int n, const struct cpumask *srcp) +unsigned int cpumask_next(int n, const volatile struct cpumask *srcp) { /* -1 is a legal arg here. */ if (n != -1) @@ -32,8 +32,8 @@ EXPORT_SYMBOL(cpumask_next); * * Returns >= nr_cpu_ids if no further cpus set in both. */ -int cpumask_next_and(int n, const struct cpumask *src1p, - const struct cpumask *src2p) +int cpumask_next_and(int n, const volatile struct cpumask *src1p, + const volatile struct cpumask *src2p) { /* -1 is a legal arg here. */ if (n != -1) @@ -51,7 +51,7 @@ EXPORT_SYMBOL(cpumask_next_and); * Often used to find any cpu but smp_processor_id() in a mask. * Returns >= nr_cpu_ids if no cpus set. */ -int cpumask_any_but(const struct cpumask *mask, unsigned int cpu) +int cpumask_any_but(const volatile struct cpumask *mask, unsigned int cpu) { unsigned int i; diff --git a/lib/find_bit.c b/lib/find_bit.c index f67f86fd2f62..08cd64aecc96 100644 --- a/lib/find_bit.c +++ b/lib/find_bit.c @@ -29,8 +29,8 @@ * searching it for one bits. * - The optional "addr2", which is anded with "addr1" if present. */ -static unsigned long _find_next_bit(const unsigned long *addr1, - const unsigned long *addr2, unsigned long nbits, +static unsigned long _find_next_bit(const volatile unsigned long *addr1, + const volatile unsigned long *addr2, unsigned long nbits, unsigned long start, unsigned long invert, unsigned long le) { unsigned long tmp, mask; @@ -74,7 +74,7 @@ static unsigned long _find_next_bit(const unsigned long *addr1, /* * Find the next set bit in a memory region. 
*/ -unsigned long find_next_bit(const unsigned long *addr, unsigned long size, +unsigned long find_next_bit(const volatile unsigned long *addr, unsigned long size, unsigned long offset) { return _find_next_bit(addr, NULL, size, offset, 0UL, 0); @@ -92,8 +92,8 @@ EXPORT_SYMBOL(find_next_zero_bit); #endif #if !defined(find_next_and_bit) -unsigned long find_next_and_bit(const unsigned long *addr1, - const unsigned long *addr2, unsigned long size, +unsigned long find_next_and_bit(const volatile unsigned long *addr1, + const volatile unsigned long *addr2, unsigned long size, unsigned long offset) { return _find_next_bit(addr1, addr2, size, offset, 0UL, 0); From patchwork Sun Jan 31 00:11:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Amit X-Patchwork-Id: 12057477 From: Nadav Amit X-Google-Original-From: Nadav Amit To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Mel Gorman , Andrea Arcangeli , Andrew Morton , Andy Lutomirski , Dave Hansen , Peter Zijlstra , Thomas Gleixner , Will Deacon , Yu Zhao , x86@kernel.org Subject: [RFC 19/20] lib/cpumask: introduce cpumask_atomic_or() Date: Sat, 30 Jan 2021 16:11:31 -0800 Message-Id: <20210131001132.3368247-20-namit@vmware.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210131001132.3368247-1-namit@vmware.com> References: <20210131001132.3368247-1-namit@vmware.com> MIME-Version: 1.0 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nadav Amit Introduce cpumask_atomic_or() and bitmap_atomic_or() to allow OR operations on cpumasks and bitmaps to be performed atomically. This will be used by the next patch. To be more efficient, skip the atomic operation when no changes are needed. 
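The idea can be sketched in plain C11 atomics as follows. This is illustration only, not the proposed kernel implementation (which operates on the kernel's bitmap layout and uses atomic64_or()/atomic_or()); the name bitmap_atomic_or_sketch() is invented. The point is the cheap plain load that lets the loop skip the read-modify-write whenever OR-ing would not change the destination word.

/*
 * OR the source bitmap into the destination word by word, but skip the
 * atomic read-modify-write whenever the OR would not change the
 * destination, since the plain load is much cheaper than the atomic op.
 */
#include <stdatomic.h>
#include <stdio.h>

#define WORDS 4

static void bitmap_atomic_or_sketch(_Atomic unsigned long *dst,
				    const unsigned long *src)
{
	for (int k = 0; k < WORDS; k++) {
		unsigned long s = src[k];

		/* No new bits for this word: avoid the expensive atomic op. */
		if (!(s & ~atomic_load(&dst[k])))
			continue;

		atomic_fetch_or(&dst[k], s);
	}
}

int main(void)
{
	_Atomic unsigned long dst[WORDS] = { 0x1, 0x0, 0xff, 0x0 };
	unsigned long src[WORDS]         = { 0x1, 0x2, 0x0f, 0x0 };

	bitmap_atomic_or_sketch(dst, src);
	for (int k = 0; k < WORDS; k++)
		printf("%lx ", atomic_load(&dst[k]));	/* prints: 1 2 ff 0 */
	printf("\n");
	return 0;
}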
Signed-off-by: Nadav Amit Cc: Mel Gorman Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: x86@kernel.org --- include/linux/bitmap.h | 5 +++++ include/linux/cpumask.h | 12 ++++++++++++ lib/bitmap.c | 25 +++++++++++++++++++++++++ 3 files changed, 42 insertions(+) diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h index 769b7a98e12f..c9a9b784b244 100644 --- a/include/linux/bitmap.h +++ b/include/linux/bitmap.h @@ -76,6 +76,7 @@ * bitmap_to_arr32(buf, src, nbits) Copy nbits from buf to u32[] dst * bitmap_get_value8(map, start) Get 8bit value from map at start * bitmap_set_value8(map, value, start) Set 8bit value to map at start + * bitmap_atomic_or(dst, src, nbits) *dst |= *src (atomically) * * Note, bitmap_zero() and bitmap_fill() operate over the region of * unsigned longs, that is, bits behind bitmap till the unsigned long @@ -577,6 +578,10 @@ static inline void bitmap_set_value8(unsigned long *map, unsigned long value, map[index] |= value << offset; } +extern void bitmap_atomic_or(volatile unsigned long *dst, + const volatile unsigned long *bitmap, unsigned int bits); + + #endif /* __ASSEMBLY__ */ #endif /* __LINUX_BITMAP_H */ diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h index 3d7e418aa113..0567d73a0192 100644 --- a/include/linux/cpumask.h +++ b/include/linux/cpumask.h @@ -699,6 +699,18 @@ static inline unsigned int cpumask_size(void) return BITS_TO_LONGS(nr_cpumask_bits) * sizeof(long); } +/** + * cpumask_atomic_or - *dstp |= *srcp (*dstp is set atomically) + * @dstp: the cpumask result (and source which is or'd) + * @srcp: the source input + */ +static inline void cpumask_atomic_or(volatile struct cpumask *dstp, + const volatile struct cpumask *srcp) +{ + bitmap_atomic_or(cpumask_bits(dstp), cpumask_bits(srcp), + nr_cpumask_bits); +} + /* * cpumask_var_t: struct cpumask for stack usage. * diff --git a/lib/bitmap.c b/lib/bitmap.c index 6df7b13727d3..50f1842ff891 100644 --- a/lib/bitmap.c +++ b/lib/bitmap.c @@ -1310,3 +1310,28 @@ void bitmap_to_arr32(u32 *buf, const unsigned long *bitmap, unsigned int nbits) EXPORT_SYMBOL(bitmap_to_arr32); #endif + +void bitmap_atomic_or(volatile unsigned long *dst, + const volatile unsigned long *bitmap, unsigned int bits) +{ + unsigned int k; + unsigned int nr = BITS_TO_LONGS(bits); + + for (k = 0; k < nr; k++) { + unsigned long src = bitmap[k]; + + /* + * Skip atomic operations when no bits are changed. Do not use + * bitmap[k] directly to avoid redundant loads as bitmap + * variable is volatile. 
+ */ + if (!(src & ~dst[k])) + continue; + + if (BITS_PER_LONG == 64) + atomic64_or(src, (atomic64_t*)&dst[k]); + else + atomic_or(src, (atomic_t*)&dst[k]); + } +} +EXPORT_SYMBOL(bitmap_atomic_or); From patchwork Sun Jan 31 00:11:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Amit X-Patchwork-Id: 12057479 From: Nadav Amit X-Google-Original-From: Nadav Amit To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Mel Gorman , Andrea Arcangeli , Andrew Morton , Andy Lutomirski , Dave Hansen , Peter Zijlstra , Thomas Gleixner , Will Deacon , Yu Zhao , x86@kernel.org Subject: [RFC 20/20] mm/rmap: avoid potential races Date: Sat, 30 Jan 2021 16:11:32 -0800 Message-Id: <20210131001132.3368247-21-namit@vmware.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210131001132.3368247-1-namit@vmware.com> References: <20210131001132.3368247-1-namit@vmware.com> MIME-Version: 1.0 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nadav Amit flush_tlb_batched_pending() appears to have a theoretical race: tlb_flush_batched is cleared after the TLB flush, and if another core calls set_tlb_ubc_flush_pending() in between and sets the pending-TLB-flush indication, that indication might be lost. Holding the page-table lock when SPLIT_LOCK is set cannot eliminate this race. The current batched TLB invalidation scheme therefore does not seem viable or easily repairable. Introduce a new scheme, in which a cpumask is maintained for pending batched TLB flushes. When a full TLB flush is performed, clear the corresponding bit on the CPU that performs the TLB flush. This scheme is only suitable for architectures that use IPIs for TLB shootdowns. As x86 is the only architecture that currently uses batched TLB flushes, this is not an issue. 
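The difference between the two schemes can be sketched as follows. This is a standalone illustration with invented names, not the patch itself: the real code uses a volatile cpumask_t, clears the bit with interrupts disabled in the flush path, and intersects the mask with mm_cpumask() in flush_tlb_batched_pending().

/*
 * Old scheme: one bool per mm; clearing it after the flush can swallow a
 * concurrent set_tlb_ubc_flush_pending() and lose the pending indication.
 *
 * New scheme (sketched here for at most 64 CPUs): a global mask of CPUs
 * that may still cache stale translations.  A CPU clears only its own bit,
 * and only when it actually performs a full flush, so a bit set
 * concurrently for another CPU cannot be wiped out by the clearing side.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static _Atomic unsigned long pending_cpumask;	/* bit n: CPU n may be stale */

/* Unmap path: record every CPU that may hold translations for this mm. */
static void mark_batched_flush_pending(unsigned long mm_cpus)
{
	atomic_fetch_or(&pending_cpumask, mm_cpus);
}

/* Flush path on @cpu: the full local flush also retires our pending bit. */
static void full_flush_on_cpu(int cpu)
{
	atomic_fetch_and(&pending_cpumask, ~(1UL << cpu));
	/* the architectural TLB flush for this CPU would happen here */
}

/* flush_tlb_batched_pending() analogue: is a flush still owed for this mm? */
static bool batched_flush_pending(unsigned long mm_cpus)
{
	return atomic_load(&pending_cpumask) & mm_cpus;
}

int main(void)
{
	mark_batched_flush_pending(0x6);	/* mm ran on CPUs 1 and 2 */
	full_flush_on_cpu(1);			/* CPU 1 flushes itself */
	printf("pending: %d\n", batched_flush_pending(0x6));	/* 1: CPU 2 left */
	full_flush_on_cpu(2);
	printf("pending: %d\n", batched_flush_pending(0x6));	/* 0: all done */
	return 0;
}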
Signed-off-by: Nadav Amit Cc: Mel Gorman Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: x86@kernel.org --- arch/x86/include/asm/tlbbatch.h | 15 ------------ arch/x86/include/asm/tlbflush.h | 2 +- arch/x86/mm/tlb.c | 18 ++++++++++----- include/linux/mm.h | 7 ++++++ include/linux/mm_types_task.h | 13 ----------- mm/rmap.c | 41 ++++++++++++++++----------------- 6 files changed, 40 insertions(+), 56 deletions(-) delete mode 100644 arch/x86/include/asm/tlbbatch.h diff --git a/arch/x86/include/asm/tlbbatch.h b/arch/x86/include/asm/tlbbatch.h deleted file mode 100644 index 1ad56eb3e8a8..000000000000 --- a/arch/x86/include/asm/tlbbatch.h +++ /dev/null @@ -1,15 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef _ARCH_X86_TLBBATCH_H -#define _ARCH_X86_TLBBATCH_H - -#include - -struct arch_tlbflush_unmap_batch { - /* - * Each bit set is a CPU that potentially has a TLB entry for one of - * the PFNs being flushed.. - */ - struct cpumask cpumask; -}; - -#endif /* _ARCH_X86_TLBBATCH_H */ diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index a4e7c90d11a8..0e681a565b78 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -240,7 +240,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long a) flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, PAGE_SHIFT, false); } -extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch); +extern void arch_tlbbatch_flush(void); static inline bool pte_may_need_flush(pte_t oldpte, pte_t newpte) { diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index ba85d6bb4988..f7304d45e6b9 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -760,8 +760,15 @@ static void flush_tlb_func_common(const struct flush_tlb_info *f, count_vm_tlb_events(NR_TLB_LOCAL_FLUSH_ONE, nr_invalidate); trace_tlb_flush(reason, nr_invalidate); } else { + int cpu = smp_processor_id(); + /* Full flush. 
*/ flush_tlb_local(); + + /* If there are batched TLB flushes, mark they are done */ + if (cpumask_test_cpu(cpu, &tlb_flush_batched_cpumask)) + cpumask_clear_cpu(cpu, &tlb_flush_batched_cpumask); + if (local) count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL); trace_tlb_flush(reason, TLB_FLUSH_ALL); @@ -1143,21 +1150,20 @@ static const struct flush_tlb_info full_flush_tlb_info = { .end = TLB_FLUSH_ALL, }; -void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch) +void arch_tlbbatch_flush(void) { int cpu = get_cpu(); - if (cpumask_test_cpu(cpu, &batch->cpumask)) { + if (cpumask_test_cpu(cpu, &tlb_flush_batched_cpumask)) { lockdep_assert_irqs_enabled(); local_irq_disable(); flush_tlb_func_local(&full_flush_tlb_info, TLB_LOCAL_SHOOTDOWN); local_irq_enable(); } - if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) - flush_tlb_others(&batch->cpumask, &full_flush_tlb_info); - - cpumask_clear(&batch->cpumask); + if (cpumask_any_but(&tlb_flush_batched_cpumask, cpu) < nr_cpu_ids) + flush_tlb_others(&tlb_flush_batched_cpumask, + &full_flush_tlb_info); /* * We cannot call mark_mm_tlb_gen_done() since we do not know which diff --git a/include/linux/mm.h b/include/linux/mm.h index a8a5bf82bd03..e4eeee985cf6 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3197,5 +3197,12 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping, extern int sysctl_nr_trim_pages; +#ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH +extern volatile cpumask_t tlb_flush_batched_cpumask; +void tlb_batch_init(void); +#else +static inline void tlb_batch_init(void) { } +#endif + #endif /* __KERNEL__ */ #endif /* _LINUX_MM_H */ diff --git a/include/linux/mm_types_task.h b/include/linux/mm_types_task.h index c1bc6731125c..742c542aaf3f 100644 --- a/include/linux/mm_types_task.h +++ b/include/linux/mm_types_task.h @@ -15,10 +15,6 @@ #include -#ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH -#include -#endif - #define USE_SPLIT_PTE_PTLOCKS (NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS) #define USE_SPLIT_PMD_PTLOCKS (USE_SPLIT_PTE_PTLOCKS && \ IS_ENABLED(CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK)) @@ -75,15 +71,6 @@ struct page_frag { /* Track pages that require TLB flushes */ struct tlbflush_unmap_batch { #ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH - /* - * The arch code makes the following promise: generic code can modify a - * PTE, then call arch_tlbbatch_add_mm() (which internally provides all - * needed barriers), then call arch_tlbbatch_flush(), and the entries - * will be flushed on all CPUs by the time that arch_tlbbatch_flush() - * returns. - */ - struct arch_tlbflush_unmap_batch arch; - /* True if a flush is needed. */ bool flush_required; diff --git a/mm/rmap.c b/mm/rmap.c index 9655e1fc328a..0d2ac5a72d19 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -586,6 +586,18 @@ void page_unlock_anon_vma_read(struct anon_vma *anon_vma) } #ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH + +/* + * TLB batching requires arch code to make the following promise: upon a full + * TLB flushes, the CPU that performs tlb_flush_batched_cpumask will clear + * tlb_flush_batched_cpumask atomically (i.e., during an IRQ or while interrupts + * are disabled). arch_tlbbatch_flush() is required to flush all the CPUs that + * are set in tlb_flush_batched_cpumask. + * + * This scheme is therefore only suitable for IPI-based TLB shootdowns. + */ +volatile cpumask_t tlb_flush_batched_cpumask = { 0 }; + /* * Flush TLB entries for recently unmapped pages from remote CPUs. 
It is * important if a PTE was dirty when it was unmapped that it's flushed @@ -599,7 +611,7 @@ void try_to_unmap_flush(void) if (!tlb_ubc->flush_required) return; - arch_tlbbatch_flush(&tlb_ubc->arch); + arch_tlbbatch_flush(); tlb_ubc->flush_required = false; tlb_ubc->writable = false; } @@ -613,27 +625,20 @@ void try_to_unmap_flush_dirty(void) try_to_unmap_flush(); } -static inline void tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch, - struct mm_struct *mm) +static inline void tlbbatch_add_mm(struct mm_struct *mm) { + cpumask_atomic_or(&tlb_flush_batched_cpumask, mm_cpumask(mm)); + inc_mm_tlb_gen(mm); - cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm)); } static void set_tlb_ubc_flush_pending(struct mm_struct *mm, bool writable) { struct tlbflush_unmap_batch *tlb_ubc = ¤t->tlb_ubc; - tlbbatch_add_mm(&tlb_ubc->arch, mm); + tlbbatch_add_mm(mm); tlb_ubc->flush_required = true; - /* - * Ensure compiler does not re-order the setting of tlb_flush_batched - * before the PTE is cleared. - */ - barrier(); - mm->tlb_flush_batched = true; - /* * If the PTE was dirty then it's best to assume it's writable. The * caller must use try_to_unmap_flush_dirty() or try_to_unmap_flush() @@ -679,16 +684,10 @@ static bool should_defer_flush(struct mm_struct *mm, enum ttu_flags flags) */ void flush_tlb_batched_pending(struct mm_struct *mm) { - if (data_race(mm->tlb_flush_batched)) { - flush_tlb_mm(mm); + if (!cpumask_intersects(mm_cpumask(mm), &tlb_flush_batched_cpumask)) + return; - /* - * Do not allow the compiler to re-order the clearing of - * tlb_flush_batched before the tlb is flushed. - */ - barrier(); - mm->tlb_flush_batched = false; - } + flush_tlb_mm(mm); } #else static void set_tlb_ubc_flush_pending(struct mm_struct *mm, bool writable)