From patchwork Mon Jul 10 08:39:11 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yicong Yang
X-Patchwork-Id: 13306475
From: Yicong Yang
CC: Barry Song <21cnbao@gmail.com>, Anshuman Khandual, Barry Song
Subject: [PATCH v10 1/4] mm/tlbbatch: Introduce arch_tlbbatch_should_defer()
Date: Mon, 10 Jul 2023 16:39:11 +0800
Message-ID: <20230710083914.18336-2-yangyicong@huawei.com>
In-Reply-To: <20230710083914.18336-1-yangyicong@huawei.com>
References: <20230710083914.18336-1-yangyicong@huawei.com>

From: Anshuman Khandual

The entire scheme of deferred TLB flush in the reclaim path rests on the
fact that the cost to refill TLB entries is less than the cost of flushing
out individual entries by sending IPIs to remote CPUs.
But architectures can have different ways to evaluate that. Hence, apart
from checking TTU_BATCH_FLUSH in the TTU flags, the rest of the decision
should be architecture specific.

Signed-off-by: Anshuman Khandual
[https://lore.kernel.org/linuxppc-dev/20171101101735.2318-2-khandual@linux.vnet.ibm.com/]
Signed-off-by: Yicong Yang
[Rebase and fix incorrect return value type]
Reviewed-by: Kefeng Wang
Reviewed-by: Anshuman Khandual
Reviewed-by: Barry Song
Reviewed-by: Xin Hao
Tested-by: Punit Agrawal
---
 arch/x86/include/asm/tlbflush.h | 12 ++++++++++++
 mm/rmap.c                       |  9 +--------
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 80450e1d5385..cf2a1de5d388 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -253,6 +253,18 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long a)
 	flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, PAGE_SHIFT, false);
 }
 
+static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
+{
+	bool should_defer = false;
+
+	/* If remote CPUs need to be flushed then defer batch the flush */
+	if (cpumask_any_but(mm_cpumask(mm), get_cpu()) < nr_cpu_ids)
+		should_defer = true;
+	put_cpu();
+
+	return should_defer;
+}
+
 static inline u64 inc_mm_tlb_gen(struct mm_struct *mm)
 {
 	/*
diff --git a/mm/rmap.c b/mm/rmap.c
index 0c0d8857dfce..6480e526c154 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -688,17 +688,10 @@ static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval)
  */
 static bool should_defer_flush(struct mm_struct *mm, enum ttu_flags flags)
 {
-	bool should_defer = false;
-
 	if (!(flags & TTU_BATCH_FLUSH))
 		return false;
 
-	/* If remote CPUs need to be flushed then defer batch the flush */
-	if (cpumask_any_but(mm_cpumask(mm), get_cpu()) < nr_cpu_ids)
-		should_defer = true;
-	put_cpu();
-
-	return should_defer;
+	return arch_tlbbatch_should_defer(mm);
 }
 
 /*
From patchwork Mon Jul 10 08:39:12 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yicong Yang
X-Patchwork-Id: 13306472
From: Yicong Yang
CC: Barry Song <21cnbao@gmail.com>, Barry Song, Nadav Amit, Mel Gorman
Subject: [PATCH v10 2/4] mm/tlbbatch: Rename and extend some functions
Date: Mon, 10 Jul 2023 16:39:12 +0800
Message-ID: <20230710083914.18336-3-yangyicong@huawei.com>
In-Reply-To: <20230710083914.18336-1-yangyicong@huawei.com>
References: <20230710083914.18336-1-yangyicong@huawei.com>

From: Barry Song

This patch does some preparation work to extend batched TLB flush to
arm64, including:
- Extend set_tlb_ubc_flush_pending() and arch_tlbbatch_add_mm() to accept
  an additional argument for the address; architectures like arm64 may
  need this for tlbi.
- Rename arch_tlbbatch_add_mm() to arch_tlbbatch_add_pending() to match
  its current function, since we don't need to handle the mm on
  architectures like arm64 and add_mm is not an appropriate name.
  add_pending makes sense for both: on x86 we're pending the TLB flush
  operations, while on arm64 we're pending the synchronize operations.

This intends no functional changes on x86.

Cc: Anshuman Khandual
Cc: Jonathan Corbet
Cc: Nadav Amit
Cc: Mel Gorman
Tested-by: Yicong Yang
Tested-by: Xin Hao
Tested-by: Punit Agrawal
Signed-off-by: Barry Song
Signed-off-by: Yicong Yang
Reviewed-by: Kefeng Wang
Reviewed-by: Xin Hao
Reviewed-by: Anshuman Khandual
---
 arch/x86/include/asm/tlbflush.h |  5 +++--
 include/linux/mm_types_task.h   |  4 ++--
 mm/rmap.c                       | 12 +++++++-----
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index cf2a1de5d388..1c7d3a36e16c 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -276,8 +276,9 @@ static inline u64 inc_mm_tlb_gen(struct mm_struct *mm)
 	return atomic64_inc_return(&mm->context.tlb_gen);
 }
 
-static inline void arch_tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch,
-					struct mm_struct *mm)
+static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
+					     struct mm_struct *mm,
+					     unsigned long uaddr)
 {
 	inc_mm_tlb_gen(mm);
 	cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm));
diff --git a/include/linux/mm_types_task.h b/include/linux/mm_types_task.h
index 5414b5c6a103..aa44fff8bb9d 100644
--- a/include/linux/mm_types_task.h
+++ b/include/linux/mm_types_task.h
@@ -52,8 +52,8 @@ struct tlbflush_unmap_batch {
 #ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
 	/*
 	 * The arch code makes the following promise: generic code can modify a
-	 * PTE, then call arch_tlbbatch_add_mm() (which internally provides all
-	 * needed barriers), then call arch_tlbbatch_flush(), and the entries
+	 * PTE, then call arch_tlbbatch_add_pending() (which internally provides
+	 * all needed barriers), then call arch_tlbbatch_flush(), and the entries
 	 * will be flushed on all CPUs by the time that arch_tlbbatch_flush()
 	 * returns.
 	 */
diff --git a/mm/rmap.c b/mm/rmap.c
index 6480e526c154..9699c6011b0e 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -642,7 +642,8 @@ void try_to_unmap_flush_dirty(void)
 #define TLB_FLUSH_BATCH_PENDING_LARGE			\
 	(TLB_FLUSH_BATCH_PENDING_MASK / 2)
 
-static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval)
+static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval,
+				      unsigned long uaddr)
 {
 	struct tlbflush_unmap_batch *tlb_ubc = &current->tlb_ubc;
 	int batch;
@@ -651,7 +652,7 @@ static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval)
 	if (!pte_accessible(mm, pteval))
 		return;
 
-	arch_tlbbatch_add_mm(&tlb_ubc->arch, mm);
+	arch_tlbbatch_add_pending(&tlb_ubc->arch, mm, uaddr);
 	tlb_ubc->flush_required = true;
 
 	/*
@@ -726,7 +727,8 @@ void flush_tlb_batched_pending(struct mm_struct *mm)
 	}
 }
 #else
-static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval)
+static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval,
+				      unsigned long uaddr)
 {
 }
 
@@ -1579,7 +1581,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 				 */
 				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
 
-				set_tlb_ubc_flush_pending(mm, pteval);
+				set_tlb_ubc_flush_pending(mm, pteval, address);
 			} else {
 				pteval = ptep_clear_flush(vma, address, pvmw.pte);
 			}
@@ -1962,7 +1964,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 				 */
 				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
 
-				set_tlb_ubc_flush_pending(mm, pteval);
+				set_tlb_ubc_flush_pending(mm, pteval, address);
 			} else {
 				pteval = ptep_clear_flush(vma, address, pvmw.pte);
 			}
From patchwork Mon Jul 10 08:39:13 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yicong Yang
X-Patchwork-Id: 13306474
From: Yicong Yang
CC: Barry Song <21cnbao@gmail.com>
Subject: [PATCH v10 3/4] mm/tlbbatch: Introduce arch_flush_tlb_batched_pending()
Date: Mon, 10 Jul 2023 16:39:13 +0800
Message-ID: <20230710083914.18336-4-yangyicong@huawei.com>
In-Reply-To: <20230710083914.18336-1-yangyicong@huawei.com>
References: <20230710083914.18336-1-yangyicong@huawei.com>

From: Yicong Yang

Currently we flush the mm in flush_tlb_batched_pending() to avoid a race
between reclaim, which unmaps pages with a batched TLB flush, and
mprotect/munmap/etc. Other architectures like arm64 may only need a
synchronization barrier (dsb) here rather than a full mm flush. So add
arch_flush_tlb_batched_pending() to allow an arch-specific implementation
here. This intends no functional changes on x86, since x86 still does a
full mm flush.

Signed-off-by: Yicong Yang
---
 arch/x86/include/asm/tlbflush.h | 5 +++++
 mm/rmap.c                       | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 1c7d3a36e16c..837e4a50281a 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -284,6 +284,11 @@ static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *b
 	cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm));
 }
 
+static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm)
+{
+	flush_tlb_mm(mm);
+}
+
 extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
 
 static inline bool pte_flags_need_flush(unsigned long oldflags,
diff --git a/mm/rmap.c b/mm/rmap.c
index 9699c6011b0e..3a16c91be7e2 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -717,7 +717,7 @@ void flush_tlb_batched_pending(struct mm_struct *mm)
 	int flushed = batch >> TLB_FLUSH_BATCH_FLUSHED_SHIFT;
 
 	if (pending != flushed) {
-		flush_tlb_mm(mm);
+		arch_flush_tlb_batched_pending(mm);
 		/*
 		 * If the new TLB flushing is pending during flushing, leave
 		 * mm->tlb_flush_batched as is, to avoid losing flushing.
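
The pending/flushed comparison that guards the new hook can be pictured
with a small user-space model. This is only a sketch under stated
assumptions, not kernel code: struct toy_mm, the toy_* and hook_* names
and the 16-bit TOY_FLUSHED_SHIFT are hypothetical, chosen to resemble the
TLB_FLUSH_BATCH_* names visible in the context lines above. The function
pointer stands in for arch_flush_tlb_batched_pending(), so the same
generic check can drive either a full flush (as on x86) or a barrier (as
arm64 may prefer).

```c
#include <assert.h>
#include <stddef.h>

/* Assumption of this sketch: the low 16 bits count pending batches and
 * the bits above record the pending count seen at the last flush. */
#define TOY_FLUSHED_SHIFT 16
#define TOY_PENDING_MASK  ((1 << TOY_FLUSHED_SHIFT) - 1)

/* Hypothetical stand-in for the fields of struct mm_struct used here. */
struct toy_mm {
	int tlb_flush_batched;	/* packed pending/flushed counters */
	int full_flushes;	/* times the "full mm flush" hook ran */
	int barriers;		/* times the "barrier only" hook ran */
};

typedef void (*toy_arch_hook_t)(struct toy_mm *mm);

/* Models an arch that needs a full mm flush. */
static inline void hook_full_flush(struct toy_mm *mm) { mm->full_flushes++; }

/* Models an arch that only needs a synchronization barrier. */
static inline void hook_barrier(struct toy_mm *mm) { mm->barriers++; }

/* Record one more batched (deferred) flush for this mm. */
static inline void toy_set_pending(struct toy_mm *mm)
{
	assert(mm != NULL);
	mm->tlb_flush_batched++;
}

/* Run the arch hook only if batches were added since the last flush.
 * Returns 1 if the hook ran, 0 otherwise. Single-threaded model, so a
 * plain store replaces the kernel's compare-and-exchange. */
static inline int toy_flush_batched_pending(struct toy_mm *mm,
					    toy_arch_hook_t hook)
{
	int batch = mm->tlb_flush_batched;
	int pending = batch & TOY_PENDING_MASK;
	int flushed = batch >> TOY_FLUSHED_SHIFT;

	if (pending == flushed)
		return 0;

	hook(mm);
	mm->tlb_flush_batched = pending | (pending << TOY_FLUSHED_SHIFT);
	return 1;
}
```

Calling toy_flush_batched_pending() twice in a row without a
toy_set_pending() in between runs the hook only the first time, which is
the behavior the generic code relies on regardless of which hook the
architecture supplies.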
From patchwork Mon Jul 10 08:39:14 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yicong Yang
X-Patchwork-Id: 13306473
From: Yicong Yang
CC: Barry Song <21cnbao@gmail.com>, Barry Song, Nadav Amit, Mel Gorman
Subject: [PATCH v10 4/4] arm64: support batched/deferred tlb shootdown during page reclamation/migration
Date: Mon, 10 Jul 2023 16:39:14 +0800
Message-ID: <20230710083914.18336-5-yangyicong@huawei.com>
In-Reply-To: <20230710083914.18336-1-yangyicong@huawei.com>
References: <20230710083914.18336-1-yangyicong@huawei.com>

From: Barry Song

On x86, batched and deferred tlb shootdown has led to a 90% performance
increase on tlb shootdown. On arm64, HW can do tlb shootdown without
software IPI.
But sync tlbi is still quite expensive.

Even running the simplest program that requires swapout can prove this
is true:

 #include <sys/types.h>
 #include <stdio.h>
 #include <sys/mman.h>
 #include <string.h>

 int main(void)
 {
 #define SIZE (1 * 1024 * 1024)
         volatile unsigned char *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
                                          MAP_SHARED | MAP_ANONYMOUS, -1, 0);

         memset((void *)p, 0x88, SIZE);

         for (int k = 0; k < 10000; k++) {
                 /* swap in */
                 for (int i = 0; i < SIZE; i += 4096) {
                         (void)p[i];
                 }

                 /* swap out */
                 madvise((void *)p, SIZE, MADV_PAGEOUT);
         }
 }

Perf result on Snapdragon 888 with 8 cores, using zRAM as the swap block
device:

 ~ # perf record taskset -c 4 ./a.out
 [ perf record: Woken up 10 times to write data ]
 [ perf record: Captured and wrote 2.297 MB perf.data (60084 samples) ]

 ~ # perf report
 # To display the perf.data header info, please use --header/--header-only options.
 #
 #
 # Total Lost Samples: 0
 #
 # Samples: 60K of event 'cycles'
 # Event count (approx.): 35706225414
 #
 # Overhead  Command  Shared Object      Symbol
 # ........  .......  .................  ......
 #
    21.07%  a.out    [kernel.kallsyms]  [k] _raw_spin_unlock_irq
     8.23%  a.out    [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     6.67%  a.out    [kernel.kallsyms]  [k] filemap_map_pages
     6.16%  a.out    [kernel.kallsyms]  [k] __zram_bvec_write
     5.36%  a.out    [kernel.kallsyms]  [k] ptep_clear_flush
     3.71%  a.out    [kernel.kallsyms]  [k] _raw_spin_lock
     3.49%  a.out    [kernel.kallsyms]  [k] memset64
     1.63%  a.out    [kernel.kallsyms]  [k] clear_page
     1.42%  a.out    [kernel.kallsyms]  [k] _raw_spin_unlock
     1.26%  a.out    [kernel.kallsyms]  [k] mod_zone_state.llvm.8525150236079521930
     1.23%  a.out    [kernel.kallsyms]  [k] xas_load
     1.15%  a.out    [kernel.kallsyms]  [k] zram_slot_lock

ptep_clear_flush() takes 5.36% CPU in the micro-benchmark swapping
in/out a page mapped by only one process.
If the page is mapped by multiple processes, typically more than 100 on
a phone, the overhead would be much higher, as we have to run the TLB
flush 100 times for one single page. In addition, TLB flush overhead
increases with the number of CPU cores due to the poor scalability of
TLB shootdown in HW, so ARM64 servers should expect much higher
overhead.

Further perf annotate shows that 95% of the CPU time of
ptep_clear_flush() is actually spent in the final dsb(), waiting for
the completion of the TLB flush. This gives us a very good chance to
leverage the existing batched TLB flush in the kernel. The minimum
modification is that we only send async tlbi in the first stage, and we
send dsb when we have to sync in the second stage.

With the above micro-benchmark, the elapsed time to finish the program
decreases by around 5%.

Typical elapsed time w/o patch:
 ~ # time taskset -c 4 ./a.out
 0.21user 14.34system 0:14.69elapsed
w/ patch:
 ~ # time taskset -c 4 ./a.out
 0.22user 13.45system 0:13.80elapsed

Also tested with the benchmark in the commit on a Kunpeng920 arm64
server and observed an improvement of around 12.5% with the command
`time ./swap_bench`.
        w/o             w/
real    0m13.460s       0m11.771s
user    0m0.248s        0m0.279s
sys     0m12.039s       0m11.458s

Originally a 16.99% overhead of ptep_clear_flush() was observed, which
has been eliminated by this patch:

[root@localhost yang]# perf record -- ./swap_bench && perf report
[...]
16.99%  swap_bench  [kernel.kallsyms]  [k] ptep_clear_flush

It has been tested on 4, 8 and 128 CPU platforms and is shown to be
beneficial on large systems, but may not bring an improvement on small
systems such as a 4 CPU platform. So make this depend on EXPERT at this
stage, pending tests on more small platforms.

This patch also improves the performance of page migration. Using
pmbench and migrating its pages between node 0 and node 1 100 times for
1G of memory, this patch decreases the time used by around 20% (before:
18.338318910 sec, after: 13.981866350 sec) and saves the time used by
ptep_clear_flush().
Cc: Anshuman Khandual
Cc: Jonathan Corbet
Cc: Nadav Amit
Cc: Mel Gorman
Tested-by: Yicong Yang
Tested-by: Xin Hao
Tested-by: Punit Agrawal
Signed-off-by: Barry Song
Signed-off-by: Yicong Yang
Reviewed-by: Kefeng Wang
Reviewed-by: Xin Hao
Reviewed-by: Anshuman Khandual
---
 .../features/vm/TLB/arch-support.txt          |  2 +-
 arch/arm64/Kconfig                            |  1 +
 arch/arm64/include/asm/tlbbatch.h             | 12 +++++
 arch/arm64/include/asm/tlbflush.h             | 48 +++++++++++++++++--
 4 files changed, 59 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm64/include/asm/tlbbatch.h

diff --git a/Documentation/features/vm/TLB/arch-support.txt b/Documentation/features/vm/TLB/arch-support.txt
index 7f049c251a79..76208db88f3b 100644
--- a/Documentation/features/vm/TLB/arch-support.txt
+++ b/Documentation/features/vm/TLB/arch-support.txt
@@ -9,7 +9,7 @@
     |       alpha: | TODO |
     |         arc: | TODO |
     |         arm: | TODO |
-    |       arm64: | N/A  |
+    |       arm64: |  ok  |
     |        csky: | TODO |
     |     hexagon: | TODO |
     |        ia64: | TODO |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7856c3a3e35a..f0ce8208c57f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -96,6 +96,7 @@ config ARM64
 	select ARCH_SUPPORTS_NUMA_BALANCING
 	select ARCH_SUPPORTS_PAGE_TABLE_CHECK
 	select ARCH_SUPPORTS_PER_VMA_LOCK
+	select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH if EXPERT
 	select ARCH_WANT_COMPAT_IPC_PARSE_VERSION if COMPAT
 	select ARCH_WANT_DEFAULT_BPF_JIT
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
diff --git a/arch/arm64/include/asm/tlbbatch.h b/arch/arm64/include/asm/tlbbatch.h
new file mode 100644
index 000000000000..fedb0b87b8db
--- /dev/null
+++ b/arch/arm64/include/asm/tlbbatch.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ARCH_ARM64_TLBBATCH_H
+#define _ARCH_ARM64_TLBBATCH_H
+
+struct arch_tlbflush_unmap_batch {
+	/*
+	 * For arm64, HW can do tlb shootdown, so we don't
+	 * need to record cpumask for sending IPI
+	 */
+};
+
+#endif /* _ARCH_ARM64_TLBBATCH_H */
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 412a3b9a3c25..4bb9cec62e26 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -254,17 +254,23 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
 	dsb(ish);
 }
 
-static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
-					 unsigned long uaddr)
+static inline void __flush_tlb_page_nosync(struct mm_struct *mm,
+					   unsigned long uaddr)
 {
 	unsigned long addr;
 
 	dsb(ishst);
-	addr = __TLBI_VADDR(uaddr, ASID(vma->vm_mm));
+	addr = __TLBI_VADDR(uaddr, ASID(mm));
 	__tlbi(vale1is, addr);
 	__tlbi_user(vale1is, addr);
 }
 
+static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
+					 unsigned long uaddr)
+{
+	return __flush_tlb_page_nosync(vma->vm_mm, uaddr);
+}
+
 static inline void flush_tlb_page(struct vm_area_struct *vma,
 				  unsigned long uaddr)
 {
@@ -272,6 +278,42 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
 	dsb(ish);
 }
 
+#ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+
+static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
+{
+#ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI
+	/*
+	 * TLB flush deferral is not required on systems, which are affected with
+	 * ARM64_WORKAROUND_REPEAT_TLBI, as __tlbi()/__tlbi_user() implementation
+	 * will have two consecutive TLBI instructions with a dsb(ish) in between
+	 * defeating the purpose (i.e save overall 'dsb ish' cost).
+	 */
+	if (unlikely(cpus_have_const_cap(ARM64_WORKAROUND_REPEAT_TLBI)))
+		return false;
+#endif
+	return true;
+}
+
+static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
+					     struct mm_struct *mm,
+					     unsigned long uaddr)
+{
+	__flush_tlb_page_nosync(mm, uaddr);
+}
+
+static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm)
+{
+	dsb(ish);
+}
+
+static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
+{
+	dsb(ish);
+}
+
+#endif /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */
+
 /*
  * This is meant to avoid soft lock-ups on large TLB flushing ranges and not
  * necessarily a performance improvement.