From patchwork Fri May 31 09:19:50 2024
From: Byungchul Park <byungchul@sk.com>
X-Patchwork-Id: 13681398
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com,
    vernhao@tencent.com, mgorman@techsingularity.net, hughd@google.com,
    willy@infradead.org, david@redhat.com, peterz@infradead.org,
    luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
    dave.hansen@linux.intel.com, rjgolo@gmail.com
Subject: [PATCH v11 01/12] x86/tlb: add APIs manipulating tlb batch's arch data
Date: Fri, 31 May 2024 18:19:50 +0900
Message-Id: <20240531092001.30428-2-byungchul@sk.com>
In-Reply-To: <20240531092001.30428-1-byungchul@sk.com>
References: <20240531092001.30428-1-byungchul@sk.com>

A new mechanism, LUF (Lazy Unmap Flush), defers tlb flush until folios
that have been unmapped and freed eventually get allocated again.  This
is safe for folios that had been mapped read-only and were then
unmapped, since the contents of those folios don't change while they
stay in pcp or buddy, so the data can still be read correctly through
the stale tlb entries.
This is a preparation for that mechanism, which needs to distinguish
read-only tlb entries by separating the tlb batch's arch data into two
parts, one for read-only entries and the other for writable ones, and
to merge the two when needed.  It also optimizes tlb shootdown by
skipping CPUs that have already performed the tlb flush needed since
the unmap.  To support this, add the APIs manipulating the arch data
for x86.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 arch/x86/include/asm/tlbflush.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 25726893c6f4..a14f77c5cdde 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -5,6 +5,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -293,6 +294,23 @@ static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm)

 extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);

+static inline void arch_tlbbatch_clear(struct arch_tlbflush_unmap_batch *batch)
+{
+	cpumask_clear(&batch->cpumask);
+}
+
+static inline void arch_tlbbatch_fold(struct arch_tlbflush_unmap_batch *bdst,
+				      struct arch_tlbflush_unmap_batch *bsrc)
+{
+	cpumask_or(&bdst->cpumask, &bdst->cpumask, &bsrc->cpumask);
+}
+
+static inline bool arch_tlbbatch_done(struct arch_tlbflush_unmap_batch *bdst,
+				      struct arch_tlbflush_unmap_batch *bsrc)
+{
+	return !cpumask_andnot(&bdst->cpumask, &bdst->cpumask, &bsrc->cpumask);
+}
+
 static inline bool pte_flags_need_flush(unsigned long oldflags,
					unsigned long newflags,
					bool ignore_access)
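The intent behind these three helpers can be modelled in a few lines of
plain user-space C.  The sketch below is illustrative only: the 64-bit
mask stands in for a cpumask, and the struct and helper names are
invented, not part of the patch.  It keeps one batch for read-only
unmaps and one for writable unmaps, then folds the read-only batch into
the other when a single flush has to cover both, which is the
split-and-merge usage the log describes.

	#include <stdint.h>
	#include <stdio.h>

	/* Toy model of arch_tlbflush_unmap_batch: CPUs still owing a flush. */
	struct batch { uint64_t cpumask; };

	static void batch_clear(struct batch *b)
	{
		b->cpumask = 0;				/* ~ arch_tlbbatch_clear() */
	}

	static void batch_fold(struct batch *dst, const struct batch *src)
	{
		dst->cpumask |= src->cpumask;		/* ~ cpumask_or() in arch_tlbbatch_fold() */
	}

	int main(void)
	{
		struct batch ro = { .cpumask = (1u << 1) | (1u << 3) };	/* read-only unmaps */
		struct batch rw = { .cpumask = (1u << 2) };		/* writable unmaps */

		/* Writable mappings can't tolerate stale entries: merge and flush once. */
		batch_fold(&rw, &ro);
		batch_clear(&ro);

		printf("CPUs to flush: %#llx\n", (unsigned long long)rw.cpumask);
		return 0;
	}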
From patchwork Fri May 31 09:19:51 2024
From: Byungchul Park <byungchul@sk.com>
X-Patchwork-Id: 13681400
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com,
    vernhao@tencent.com, mgorman@techsingularity.net, hughd@google.com,
    willy@infradead.org, david@redhat.com, peterz@infradead.org,
    luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
    dave.hansen@linux.intel.com, rjgolo@gmail.com
Subject: [PATCH v11 02/12] arm64: tlbflush: add APIs manipulating tlb batch's arch data
Date: Fri, 31 May 2024 18:19:51 +0900
Message-Id: <20240531092001.30428-3-byungchul@sk.com>
In-Reply-To: <20240531092001.30428-1-byungchul@sk.com>
References: <20240531092001.30428-1-byungchul@sk.com>

A new mechanism, LUF (Lazy Unmap Flush), defers tlb flush until folios
that have been unmapped and freed eventually get allocated again.  This
is safe for folios that had been mapped read-only and were then
unmapped, since the contents of those folios don't change while they
stay in pcp or buddy, so the data can still be read correctly through
the stale tlb entries.

This is a preparation for that mechanism, which needs to manipulate the
tlb batch's arch data.  Even though arm64 does nothing with that data,
any arch with CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH should provide
these APIs.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 arch/arm64/include/asm/tlbflush.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 95fbc8c05607..4fefc1f90304 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -354,6 +354,24 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 	dsb(ish);
 }

+static inline void arch_tlbbatch_clear(struct arch_tlbflush_unmap_batch *batch)
+{
+	/* nothing to do */
+}
+
+static inline void arch_tlbbatch_fold(struct arch_tlbflush_unmap_batch *bdst,
+				      struct arch_tlbflush_unmap_batch *bsrc)
+{
+	/* nothing to do */
+}
+
+static inline bool arch_tlbbatch_done(struct arch_tlbflush_unmap_batch *bdst,
+				      struct arch_tlbflush_unmap_batch *bsrc)
+{
+	/* The kernel can consider the tlb batch as always done. */
+	return true;
+}
+
 /*
  * This is meant to avoid soft lock-ups on large TLB flushing ranges and not
  * necessarily a performance improvement.
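To see why even no-op stubs are required, the sketch below models the
generic/arch split in ordinary user-space C.  Everything in it is
invented for illustration (the luf_try_skip_flush() name, the empty
arch_batch struct); it is not the kernel code, only the calling
pattern: generic code calls the arch hooks unconditionally, so an arch
that keeps no per-CPU flush state still has to provide trivial
implementations.

	#include <stdbool.h>
	#include <stdio.h>

	/* Arch side of the model: an arch that tracks nothing per CPU. */
	struct arch_batch { char unused; };

	static void arch_batch_clear(struct arch_batch *b)
	{
		(void)b;			/* nothing to do */
	}

	static bool arch_batch_done(struct arch_batch *dst, struct arch_batch *src)
	{
		(void)dst; (void)src;
		return true;			/* treat the batch as already flushed */
	}

	/* Generic side: built the same way for every arch. */
	static bool luf_try_skip_flush(struct arch_batch *pending, struct arch_batch *flushed)
	{
		return arch_batch_done(pending, flushed);
	}

	int main(void)
	{
		struct arch_batch pending = { 0 }, flushed = { 0 };

		if (luf_try_skip_flush(&pending, &flushed))
			puts("no shootdown needed");
		arch_batch_clear(&pending);
		return 0;
	}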
From patchwork Fri May 31 09:19:52 2024
From: Byungchul Park <byungchul@sk.com>
X-Patchwork-Id: 13681397
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com,
    vernhao@tencent.com, mgorman@techsingularity.net, hughd@google.com,
    willy@infradead.org, david@redhat.com, peterz@infradead.org,
    luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
    dave.hansen@linux.intel.com, rjgolo@gmail.com
Subject: [PATCH v11 03/12] riscv, tlb: add APIs manipulating tlb batch's arch data
Date: Fri, 31 May 2024 18:19:52 +0900
Message-Id: <20240531092001.30428-4-byungchul@sk.com>
In-Reply-To: <20240531092001.30428-1-byungchul@sk.com>
References: <20240531092001.30428-1-byungchul@sk.com>

A new mechanism, LUF (Lazy Unmap Flush), defers tlb flush until folios
that have been unmapped and freed eventually get allocated again.  This
is safe for folios that had been mapped read-only and were then
unmapped, since the contents of those folios don't change while they
stay in pcp or buddy, so the data can still be read correctly through
the stale tlb entries.
This is a preparation for that mechanism, which needs to distinguish
read-only tlb entries by separating the tlb batch's arch data into two
parts, one for read-only entries and the other for writable ones, and
to merge the two when needed.  It also optimizes tlb shootdown by
skipping CPUs that have already performed the tlb flush needed since
the unmap.  To support this, add the APIs manipulating the arch data
for riscv.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 arch/riscv/include/asm/tlbflush.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index 72e559934952..c0bfb9b2bf54 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -8,6 +8,7 @@
 #define _ASM_RISCV_TLBFLUSH_H

 #include
+#include
 #include
 #include
@@ -65,6 +66,26 @@ void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
 void arch_flush_tlb_batched_pending(struct mm_struct *mm);
 void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);

+static inline void arch_tlbbatch_clear(struct arch_tlbflush_unmap_batch *batch)
+{
+	cpumask_clear(&batch->cpumask);
+
+}
+
+static inline void arch_tlbbatch_fold(struct arch_tlbflush_unmap_batch *bdst,
+				      struct arch_tlbflush_unmap_batch *bsrc)
+{
+	cpumask_or(&bdst->cpumask, &bdst->cpumask, &bsrc->cpumask);
+
+}
+
+static inline bool arch_tlbbatch_done(struct arch_tlbflush_unmap_batch *bdst,
+				      struct arch_tlbflush_unmap_batch *bsrc)
+{
+	return !cpumask_andnot(&bdst->cpumask, &bdst->cpumask, &bsrc->cpumask);
+
+}
+
 extern unsigned long tlb_flush_all_threshold;
 #else /* CONFIG_MMU */
 #define local_flush_tlb_all()			do { } while (0)
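The shootdown-skipping idea mentioned in the log can also be modelled
in ordinary user-space C.  The sketch below is only an illustration (a
uint64_t standing in for a cpumask; batch/batch_done are invented
names): if every CPU a deferred batch still cares about has already
been flushed on someone else's behalf, the whole shootdown can be
skipped, which is what arch_tlbbatch_done() reports.

	#include <stdbool.h>
	#include <stdint.h>
	#include <stdio.h>

	struct batch { uint64_t cpumask; };	/* CPUs still owing a flush */

	/* ~ arch_tlbbatch_done(): drop CPUs already covered, report if none remain. */
	static bool batch_done(struct batch *dst, const struct batch *src)
	{
		dst->cpumask &= ~src->cpumask;
		return dst->cpumask == 0;
	}

	int main(void)
	{
		struct batch deferred = { .cpumask = (1u << 0) | (1u << 2) };
		struct batch flushed  = { .cpumask = (1u << 0) | (1u << 1) | (1u << 2) };

		if (batch_done(&deferred, &flushed))
			puts("all CPUs already flushed, skip the shootdown");
		else
			printf("still need to IPI CPUs %#llx\n",
			       (unsigned long long)deferred.cpumask);
		return 0;
	}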
From patchwork Fri May 31 09:19:53 2024
From: Byungchul Park <byungchul@sk.com>
X-Patchwork-Id: 13681401
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com,
    vernhao@tencent.com, mgorman@techsingularity.net, hughd@google.com,
    willy@infradead.org, david@redhat.com, peterz@infradead.org,
    luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
    dave.hansen@linux.intel.com, rjgolo@gmail.com
Subject: [PATCH v11 04/12] x86/tlb, riscv/tlb, mm/rmap: separate arch_tlbbatch_clear() out of arch_tlbbatch_flush()
Date: Fri, 31 May 2024 18:19:53 +0900
Message-Id: <20240531092001.30428-5-byungchul@sk.com>
In-Reply-To: <20240531092001.30428-1-byungchul@sk.com>
References: <20240531092001.30428-1-byungchul@sk.com>

A new mechanism, LUF (Lazy Unmap Flush), defers tlb flush until folios
that have been unmapped and freed eventually get allocated again.  This
is safe for folios that had been mapped read-only and were then
unmapped, since the contents of those folios don't change while they
stay in pcp or buddy, so the data can still be read correctly through
the stale tlb entries.

This is a preparation for that mechanism, which needs to avoid
redundant tlb flushes by manipulating the tlb batch's arch data.  To
achieve that, separate the part that clears the tlb batch's arch data
out of arch_tlbbatch_flush().
Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 arch/riscv/mm/tlbflush.c | 1 -
 arch/x86/mm/tlb.c        | 2 --
 mm/rmap.c                | 1 +
 3 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 9b6e86ce3867..36f996af6256 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -201,5 +201,4 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 {
 	__flush_tlb_range(&batch->cpumask, FLUSH_TLB_NO_ASID, 0,
 			  FLUSH_TLB_MAX_SIZE, PAGE_SIZE);
-	cpumask_clear(&batch->cpumask);
 }
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 44ac64f3a047..24bce69222cd 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1265,8 +1265,6 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 		local_irq_enable();
 	}

-	cpumask_clear(&batch->cpumask);
-
 	put_flush_tlb_info();
 	put_cpu();
 }
diff --git a/mm/rmap.c b/mm/rmap.c
index 52357d79917c..a65a94aada8d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -648,6 +648,7 @@ void try_to_unmap_flush(void)
 		return;

 	arch_tlbbatch_flush(&tlb_ubc->arch);
+	arch_tlbbatch_clear(&tlb_ubc->arch);
 	tlb_ubc->flush_required = false;
 	tlb_ubc->writable = false;
 }
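What this separation buys can be shown with a small user-space model
(all names below, batch_flush()/batch_clear() and the masks, are
invented for illustration): once flushing no longer clears the mask as
a side effect, a caller can flush, remember which CPUs were just
covered, and only then clear the batch explicitly, the way
try_to_unmap_flush() now does, and the way the later luf patches need
to.

	#include <stdint.h>
	#include <stdio.h>

	struct batch { uint64_t cpumask; };

	static void batch_flush(const struct batch *b)
	{
		/* stand-in for arch_tlbbatch_flush(): it no longer clears the mask */
		printf("flushing CPUs %#llx\n", (unsigned long long)b->cpumask);
	}

	static void batch_clear(struct batch *b)
	{
		b->cpumask = 0;				/* ~ arch_tlbbatch_clear() */
	}

	int main(void)
	{
		struct batch tlb_ubc = { .cpumask = (1u << 0) | (1u << 3) };
		struct batch covered = { .cpumask = 0 };

		batch_flush(&tlb_ubc);
		covered.cpumask |= tlb_ubc.cpumask;	/* remember what was just flushed */
		batch_clear(&tlb_ubc);			/* explicit, as in try_to_unmap_flush() */

		printf("covered so far: %#llx\n", (unsigned long long)covered.cpumask);
		return 0;
	}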
From patchwork Fri May 31 09:19:54 2024
From: Byungchul Park <byungchul@sk.com>
X-Patchwork-Id: 13681402
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com,
    vernhao@tencent.com, mgorman@techsingularity.net, hughd@google.com,
    willy@infradead.org, david@redhat.com, peterz@infradead.org,
    luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
    dave.hansen@linux.intel.com, rjgolo@gmail.com
Subject: [PATCH v11 05/12] mm: buddy: make room for a new variable, ugen, in struct page
Date: Fri, 31 May 2024 18:19:54 +0900
Message-Id: <20240531092001.30428-6-byungchul@sk.com>
In-Reply-To: <20240531092001.30428-1-byungchul@sk.com>
References: <20240531092001.30428-1-byungchul@sk.com>

Functionally, no change.  This is a preparation for the luf mechanism,
which tracks whether a tlb flush is still needed for each page residing
in buddy, using a generation number in struct page.

Fortunately, while a page is in buddy, the private field in struct page
is used only to store the page order, which ranges from 0 to
MAX_PAGE_ORDER and therefore fits in an unsigned short int.  So split
the field into two smaller ones, order and ugen, so that both can be
used in buddy at the same time.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/mm_types.h | 40 +++++++++++++++++++++++++++++++++-------
 mm/internal.h            |  4 ++--
 mm/page_alloc.c          | 13 ++++++++-----
 3 files changed, 43 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 24323c7d0bd4..37eb3000267c 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -108,13 +108,25 @@ struct page {
 			pgoff_t index;		/* Our offset within mapping. */
 			unsigned long share;	/* share count for fsdax */
 		};
-		/**
-		 * @private: Mapping-private opaque data.
-		 * Usually used for buffer_heads if PagePrivate.
-		 * Used for swp_entry_t if PageSwapCache.
-		 * Indicates order in the buddy system if PageBuddy.
-		 */
-		unsigned long private;
+		union {
+			/**
+			 * @private: Mapping-private opaque data.
+			 * Usually used for buffer_heads if PagePrivate.
+			 * Used for swp_entry_t if PageSwapCache.
+			 */
+			unsigned long private;
+			struct {
+				/*
+				 * Indicates order in the buddy system if PageBuddy.
+				 */
+				unsigned short int order;
+				/*
+				 * Tracks need of tlb flush used by luf,
+				 * which stands for lazy unmap flush.
+				 */
+				unsigned short int ugen;
+			};
+		};
 		};
 		struct {	/* page_pool used by netstack */
@@ -521,6 +533,20 @@ static inline void set_page_private(struct page *page, unsigned long private)
 	page->private = private;
 }

+#define page_buddy_order(page)		((page)->order)
+
+static inline void set_page_buddy_order(struct page *page, unsigned int order)
+{
+	page->order = (unsigned short int)order;
+}
+
+#define page_buddy_ugen(page)		((page)->ugen)
+
+static inline void set_page_buddy_ugen(struct page *page, unsigned short int ugen)
+{
+	page->ugen = ugen;
+}
+
 static inline void *folio_get_private(struct folio *folio)
 {
 	return folio->private;
diff --git a/mm/internal.h b/mm/internal.h
index bbec99cc9d9d..552e1061d36d 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -461,7 +461,7 @@ struct alloc_context {
 static inline unsigned int buddy_order(struct page *page)
 {
 	/* PageBuddy() must be checked by the caller */
-	return page_private(page);
+	return page_buddy_order(page);
 }

 /*
@@ -475,7 +475,7 @@ static inline unsigned int buddy_order(struct page *page)
  * times, potentially observing different values in the tests and the actual
  * use of the result.
  */
-#define buddy_order_unsafe(page)	READ_ONCE(page_private(page))
+#define buddy_order_unsafe(page)	READ_ONCE(page_buddy_order(page))
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b1e3eb5787de..ae57dd8718fe 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -565,9 +565,12 @@ void prep_compound_page(struct page *page, unsigned int order)
 	prep_compound_head(page, order);
 }

-static inline void set_buddy_order(struct page *page, unsigned int order)
+static inline void set_buddy_order_ugen(struct page *page,
+					unsigned int order,
+					unsigned short int ugen)
 {
-	set_page_private(page, order);
+	set_page_buddy_order(page, order);
+	set_page_buddy_ugen(page, ugen);
 	__SetPageBuddy(page);
 }

@@ -826,7 +829,7 @@ static inline void __free_one_page(struct page *page,
 	}

 done_merging:
-	set_buddy_order(page, order);
+	set_buddy_order_ugen(page, order, 0);

 	if (fpi_flags & FPI_TO_TAIL)
 		to_tail = true;
@@ -1336,7 +1339,7 @@ static inline void expand(struct zone *zone, struct page *page,
 			continue;

 		__add_to_free_list(&page[size], zone, high, migratetype, false);
-		set_buddy_order(&page[size], high);
+		set_buddy_order_ugen(&page[size], high, 0);
 		nr_added += size;
 	}
 	account_freepages(zone, nr_added, migratetype);
@@ -6801,7 +6804,7 @@ static void break_down_buddy_pages(struct zone *zone, struct page *page,
 			continue;

 		add_to_free_list(current_buddy, zone, high, migratetype, false);
-		set_buddy_order(current_buddy, high);
+		set_buddy_order_ugen(current_buddy, high, 0);
 	}
 }
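The layout trick itself can be checked outside the kernel.  The sketch
below is plain C11 and only mirrors the idea: toy_page is not the
kernel's struct page, and the helper is a copy of the pattern, not the
patched function.  The point is that the two unsigned shorts alias the
existing private word, so struct page does not grow.

	#include <assert.h>
	#include <stdio.h>

	struct toy_page {
		union {
			unsigned long private;		/* existing opaque field */
			struct {
				unsigned short order;	/* buddy order */
				unsigned short ugen;	/* luf unmap generation */
			};
		};
	};

	static void set_buddy_order_ugen(struct toy_page *p,
					 unsigned int order, unsigned short ugen)
	{
		p->order = (unsigned short)order;
		p->ugen = ugen;
	}

	int main(void)
	{
		struct toy_page p = { .private = 0 };

		/* the two shorts must fit in the original unsigned long */
		static_assert(2 * sizeof(unsigned short) <= sizeof(unsigned long),
			      "order + ugen must not grow the field");

		set_buddy_order_ugen(&p, 3, 42);
		printf("order=%u ugen=%u private=%#lx\n", p.order, p.ugen, p.private);
		return 0;
	}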
From patchwork Fri May 31 09:19:55 2024
From: Byungchul Park <byungchul@sk.com>
X-Patchwork-Id: 13681404
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com,
    vernhao@tencent.com, mgorman@techsingularity.net, hughd@google.com,
    willy@infradead.org, david@redhat.com, peterz@infradead.org,
    luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
    dave.hansen@linux.intel.com, rjgolo@gmail.com
Subject: [PATCH v11 06/12] mm: add folio_put_ugen() to deliver unmap generation number to pcp or buddy
Date: Fri, 31 May 2024 18:19:55 +0900
Message-Id: <20240531092001.30428-7-byungchul@sk.com>
In-Reply-To: <20240531092001.30428-1-byungchul@sk.com>
References: <20240531092001.30428-1-byungchul@sk.com>

Introduce a new API, folio_put_ugen(), to deliver the unmap generation
number to pcp or buddy; the luf mechanism will use it to track whether
a tlb flush is still needed for each page residing in pcp or buddy.
For now, the delivery works for the following call path, which is used
when releasing source folios during migration:

   folio_put_ugen()
      __folio_put_ugen()
         free_unref_page()
            free_unref_page_commit()
               free_one_page()
                  __free_one_page()

The generation number should be handed over properly when pages travel
between pcp and buddy, and the necessary handling must be done when a
page leaves pcp or buddy.  This patch doesn't include the actual tlb
flush on that exit path, which will be filled in by the main luf patch.
Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/mm.h    |  22 +++++++
 include/linux/sched.h |   1 +
 mm/compaction.c       |  10 +++
 mm/internal.h         |  71 ++++++++++++++++++++-
 mm/page_alloc.c       | 144 ++++++++++++++++++++++++++++++++++--------
 mm/page_isolation.c   |   6 ++
 mm/page_reporting.c   |  10 +++
 mm/swap.c             |  12 +++-
 8 files changed, 248 insertions(+), 28 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3aa1b6889bcc..54cb6316a76d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1316,6 +1316,7 @@ static inline struct folio *virt_to_folio(const void *x)
 }

 void __folio_put(struct folio *folio);
+void __folio_put_ugen(struct folio *folio, unsigned short int ugen);

 void put_pages_list(struct list_head *pages);

@@ -1508,6 +1509,27 @@ static inline void folio_put(struct folio *folio)
 		__folio_put(folio);
 }

+/**
+ * folio_put_ugen - Decrement the last reference count on a folio.
+ * @folio: The folio.
+ * @ugen: The unmap generation # of TLB flush that the folio requires.
+ *
+ * The folio's reference count should be one since the only user, folio
+ * migration code, calls folio_put_ugen() only when the folio has no
+ * reference else.  The memory will be released back to the page
+ * allocator and may be used by another allocation immediately.  Do not
+ * access the memory or the struct folio after calling folio_put_ugen().
+ *
+ * Context: May be called in process or interrupt context, but not in NMI
+ * context.  May be called while holding a spinlock.
+ */
+static inline void folio_put_ugen(struct folio *folio, unsigned short int ugen)
+{
+	if (WARN_ON(!folio_put_testzero(folio)))
+		return;
+	__folio_put_ugen(folio, ugen);
+}
+
 /**
  * folio_put_refs - Reduce the reference count on a folio.
  * @folio: The folio.
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 61591ac6eab6..ab5a2ed79b88 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1340,6 +1340,7 @@ struct task_struct {
 #endif

 	struct tlbflush_unmap_batch	tlb_ubc;
+	unsigned short int		ugen;

 	/* Cache last used pipe for splice(): */
 	struct pipe_inode_info		*splice_pipe;
diff --git a/mm/compaction.c b/mm/compaction.c
index e731d45befc7..13799fbb2a9a 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -701,6 +701,11 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 	if (locked)
 		spin_unlock_irqrestore(&cc->zone->lock, flags);

+	/*
+	 * Check and flush before using the isolated pages.
+	 */
+	check_flush_task_ugen();
+
 	/*
 	 * Be careful to not go outside of the pageblock.
 	 */
@@ -1673,6 +1678,11 @@ static void fast_isolate_freepages(struct compact_control *cc)

 		spin_unlock_irqrestore(&cc->zone->lock, flags);

+		/*
+		 * Check and flush before using the isolated pages.
+		 */
+		check_flush_task_ugen();
+
 		/* Skip fast search if enough freepages isolated */
 		if (cc->nr_freepages >= cc->nr_migratepages)
 			break;
diff --git a/mm/internal.h b/mm/internal.h
index 552e1061d36d..380ae980e4f9 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -661,7 +661,7 @@ extern bool free_pages_prepare(struct page *page, unsigned int order);

 extern int user_min_free_kbytes;

-void free_unref_page(struct page *page, unsigned int order);
+void free_unref_page(struct page *page, unsigned int order, unsigned short int ugen);
 void free_unref_folios(struct folio_batch *fbatch);

 extern void zone_pcp_reset(struct zone *zone);
@@ -1536,6 +1536,75 @@ static inline void shrinker_debugfs_remove(struct dentry *debugfs_entry,
 void workingset_update_node(struct xa_node *node);
 extern struct list_lru shadow_nodes;

+#if defined(CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH)
+static inline unsigned short int ugen_latest(unsigned short int a, unsigned short int b)
+{
+	if (!a || !b)
+		return a + b;
+
+	/*
+	 * The ugen is wrapped around so let's use this trick.
+	 */
+	if ((short int)(a - b) < 0)
+		return b;
+	else
+		return a;
+}
+
+static inline void update_task_ugen(unsigned short int ugen)
+{
+	current->ugen = ugen_latest(current->ugen, ugen);
+}
+
+static inline unsigned short int hand_over_task_ugen(void)
+{
+	unsigned short int ret = current->ugen;
+
+	current->ugen = 0;
+	return ret;
+}
+
+static inline void check_flush_task_ugen(void)
+{
+	/*
+	 * XXX: luf mechanism will handle this. For now, do nothing but
+	 * reset current's ugen to finalize this turn.
+	 */
+	current->ugen = 0;
+}
+
+/*
+ * Check the constraints of what luf currently supports.
+ */
+static inline bool can_luf_folio(struct folio *f)
+{
+	bool can_luf = true;
+
+	/*
+	 * XXX: Remove the constraint once luf handles zone device folio.
+	 */
+	can_luf = can_luf && likely(!folio_is_zone_device(f));
+
+	/*
+	 * XXX: Remove the constraint once luf handles hugetlb folio.
+	 */
+	can_luf = can_luf && likely(!folio_test_hugetlb(f));
+
+	/*
+	 * XXX: Remove the constraint once luf handles large folio.
+	 */
+	can_luf = can_luf && likely(!folio_test_large(f));
+
+	return can_luf;
+}
+#else /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */
+static inline unsigned short int ugen_latest(unsigned short int a, unsigned short int b) { return 0; }
+static inline void update_task_ugen(unsigned short int ugen) {}
+static inline unsigned short int hand_over_task_ugen(void) { return 0; }
+static inline void check_flush_task_ugen(void) {}
+static inline bool can_luf_folio(struct folio *f) { return false; }
+#endif
+
 struct unlink_vma_file_batch {
 	int count;
 	struct vm_area_struct *vmas[8];
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ae57dd8718fe..6fbbe45be5ae 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -688,6 +688,7 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon
 	if (page_reported(page))
 		__ClearPageReported(page);

+	update_task_ugen(page_buddy_ugen(page));
 	list_del(&page->buddy_list);
 	__ClearPageBuddy(page);
 	set_page_private(page, 0);
@@ -760,7 +761,7 @@ buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn,
 static inline void __free_one_page(struct page *page,
 		unsigned long pfn,
 		struct zone *zone, unsigned int order,
-		int migratetype, fpi_t fpi_flags)
+		int migratetype, fpi_t fpi_flags, unsigned short int ugen)
 {
 	struct capture_control *capc = task_capc(zone);
 	unsigned long buddy_pfn = 0;
@@ -775,12 +776,22 @@ static inline void __free_one_page(struct page *page,
 	VM_BUG_ON_PAGE(pfn & ((1 << order) - 1), page);
 	VM_BUG_ON_PAGE(bad_range(zone, page), page);

+	/*
+	 * Ensure private is zero before using it inside buddy.
+	 */
+	set_page_private(page, 0);
+
 	account_freepages(zone, 1 << order, migratetype);

 	while (order < MAX_PAGE_ORDER) {
 		int buddy_mt = migratetype;

 		if (compaction_capture(capc, page, order, migratetype)) {
+			/*
+			 * Capturer will check_flush_task_ugen() through
+			 * prep_new_page().
+			 */
+			update_task_ugen(ugen);
 			account_freepages(zone, -(1 << order), migratetype);
 			return;
 		}
@@ -811,6 +822,11 @@ static inline void __free_one_page(struct page *page,
 		if (page_is_guard(buddy))
 			clear_page_guard(zone, buddy, order);
 		else
+			/*
+			 * __del_page_from_free_list() updates current's
+			 * ugen that pairs with hand_over_task_ugen() below
+			 * in this function.
+			 */
 			__del_page_from_free_list(buddy, zone, order, buddy_mt);

 		if (unlikely(buddy_mt != migratetype)) {
@@ -829,7 +845,8 @@ static inline void __free_one_page(struct page *page,
 	}

 done_merging:
-	set_buddy_order_ugen(page, order, 0);
+	ugen = ugen_latest(ugen, hand_over_task_ugen());
+	set_buddy_order_ugen(page, order, ugen);

 	if (fpi_flags & FPI_TO_TAIL)
 		to_tail = true;
@@ -1040,6 +1057,11 @@ __always_inline bool free_pages_prepare(struct page *page,

 	VM_BUG_ON_PAGE(PageTail(page), page);

+	/*
+	 * Ensure private is zero before using it inside pcp.
+	 */
+	set_page_private(page, 0);
+
 	trace_mm_page_free(page, order);
 	kmsan_free_page(page, order);
@@ -1171,17 +1193,23 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 		do {
 			unsigned long pfn;
 			int mt;
+			unsigned short int ugen;

 			page = list_last_entry(list, struct page, pcp_list);
 			pfn = page_to_pfn(page);
 			mt = get_pfnblock_migratetype(page, pfn);

+			/*
+			 * pcp uses private to store ugen.
+			 */
+			ugen = page_private(page);
+
 			/* must delete to avoid corrupting pcp list */
 			list_del(&page->pcp_list);
 			count -= nr_pages;
 			pcp->count -= nr_pages;

-			__free_one_page(page, pfn, zone, order, mt, FPI_NONE);
+			__free_one_page(page, pfn, zone, order, mt, FPI_NONE, ugen);
 			trace_mm_page_pcpu_drain(page, order, mt);
 		} while (count > 0 && !list_empty(list));
 }
@@ -1191,14 +1219,14 @@ static void free_pcppages_bulk(struct zone *zone, int count,

 static void free_one_page(struct zone *zone, struct page *page,
 			  unsigned long pfn, unsigned int order,
-			  fpi_t fpi_flags)
+			  fpi_t fpi_flags, unsigned short int ugen)
 {
 	unsigned long flags;
 	int migratetype;

 	spin_lock_irqsave(&zone->lock, flags);
 	migratetype = get_pfnblock_migratetype(page, pfn);
-	__free_one_page(page, pfn, zone, order, migratetype, fpi_flags);
+	__free_one_page(page, pfn, zone, order, migratetype, fpi_flags, ugen);
 	spin_unlock_irqrestore(&zone->lock, flags);
 }

@@ -1211,7 +1239,7 @@ static void __free_pages_ok(struct page *page, unsigned int order,
 	if (!free_pages_prepare(page, order))
 		return;

-	free_one_page(zone, page, pfn, order, fpi_flags);
+	free_one_page(zone, page, pfn, order, fpi_flags, 0);

 	__count_vm_events(PGFREE, 1 << order);
 }
@@ -1476,6 +1504,10 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
							unsigned int alloc_flags)
 {
+	/*
+	 * Check and flush before using the pages.
+	 */
+	check_flush_task_ugen();
 	post_alloc_hook(page, order, gfp_flags);

 	if (order && (gfp_flags & __GFP_COMP))
@@ -1511,6 +1543,10 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
 		page = get_page_from_free_area(area, migratetype);
 		if (!page)
 			continue;
+		/*
+		 * del_page_from_free_list() updates current's ugen that
+		 * pairs with check_flush_task_ugen() in prep_new_page().
+		 */
 		del_page_from_free_list(page, zone, current_order, migratetype);
 		expand(zone, page, order, current_order, migratetype);
 		trace_mm_page_alloc_zone_locked(page, order, migratetype,
@@ -1673,7 +1709,8 @@ static unsigned long find_large_buddy(unsigned long start_pfn)

 /* Split a multi-block free page into its individual pageblocks */
 static void split_large_buddy(struct zone *zone, struct page *page,
-			      unsigned long pfn, int order)
+			      unsigned long pfn, int order,
+			      unsigned short int ugen)
 {
 	unsigned long end_pfn = pfn + (1 << order);

@@ -1686,7 +1723,7 @@ static void split_large_buddy(struct zone *zone, struct page *page,
 	while (pfn != end_pfn) {
 		int mt = get_pfnblock_migratetype(page, pfn);

-		__free_one_page(page, pfn, zone, pageblock_order, mt, FPI_NONE);
+		__free_one_page(page, pfn, zone, pageblock_order, mt, FPI_NONE, ugen);
 		pfn += pageblock_nr_pages;
 		page = pfn_to_page(pfn);
 	}
@@ -1728,22 +1765,34 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
 	if (pfn != start_pfn) {
 		struct page *buddy = pfn_to_page(pfn);
 		int order = buddy_order(buddy);
+		unsigned short int ugen;

+		/*
+		 * del_page_from_free_list() updates current's ugen that
+		 * pairs with the following hand_over_task_ugen().
+		 */
 		del_page_from_free_list(buddy, zone, order,
					get_pfnblock_migratetype(buddy, pfn));
+		ugen = hand_over_task_ugen();
 		set_pageblock_migratetype(page, migratetype);
-		split_large_buddy(zone, buddy, pfn, order);
+		split_large_buddy(zone, buddy, pfn, order, ugen);
 		return true;
 	}

 	/* We're the starting block of a larger buddy */
 	if (PageBuddy(page) && buddy_order(page) > pageblock_order) {
 		int order = buddy_order(page);
+		unsigned short int ugen;

+		/*
+		 * del_page_from_free_list() updates current's ugen that
+		 * pairs with the following hand_over_task_ugen().
+		 */
 		del_page_from_free_list(page, zone, order,
					get_pfnblock_migratetype(page, pfn));
+		ugen = hand_over_task_ugen();
 		set_pageblock_migratetype(page, migratetype);
-		split_large_buddy(zone, page, pfn, order);
+		split_large_buddy(zone, page, pfn, order, ugen);
 		return true;
 	}
 move:
@@ -1863,6 +1912,10 @@ steal_suitable_fallback(struct zone *zone, struct page *page,

 	/* Take ownership for orders >= pageblock_order */
 	if (current_order >= pageblock_order) {
+		/*
+		 * del_page_from_free_list() updates current's ugen that
+		 * pairs with check_flush_task_ugen() in prep_new_page().
+		 */
 		del_page_from_free_list(page, zone, current_order, block_type);
 		change_pageblock_range(page, current_order, start_type);
 		expand(zone, page, order, current_order, start_type);
@@ -1918,6 +1971,10 @@ steal_suitable_fallback(struct zone *zone, struct page *page,
 	}

 single_page:
+	/*
+	 * del_page_from_free_list() updates current's ugen that pairs
+	 * with check_flush_task_ugen() in prep_new_page().
+	 */
 	del_page_from_free_list(page, zone, current_order, block_type);
 	expand(zone, page, order, current_order, block_type);
 	return page;
@@ -2539,7 +2596,7 @@ static int nr_pcp_high(struct per_cpu_pages *pcp, struct zone *zone,

 static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
				   struct page *page, int migratetype,
-				   unsigned int order)
+				   unsigned int order, unsigned short int ugen)
 {
 	int high, batch;
 	int pindex;
@@ -2553,6 +2610,11 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 	pcp->alloc_factor >>= 1;
 	__count_vm_events(PGFREE, 1 << order);
 	pindex = order_to_pindex(migratetype, order);
+
+	/*
+	 * pcp uses private to store ugen.
+ */ + set_page_private(page, ugen); list_add(&page->pcp_list, &pcp->lists[pindex]); pcp->count += 1 << order; @@ -2588,7 +2650,8 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp, /* * Free a pcp page */ -void free_unref_page(struct page *page, unsigned int order) +void free_unref_page(struct page *page, unsigned int order, + unsigned short int ugen) { unsigned long __maybe_unused UP_flags; struct per_cpu_pages *pcp; @@ -2614,7 +2677,7 @@ void free_unref_page(struct page *page, unsigned int order) migratetype = get_pfnblock_migratetype(page, pfn); if (unlikely(migratetype >= MIGRATE_PCPTYPES)) { if (unlikely(is_migrate_isolate(migratetype))) { - free_one_page(page_zone(page), page, pfn, order, FPI_NONE); + free_one_page(page_zone(page), page, pfn, order, FPI_NONE, ugen); return; } migratetype = MIGRATE_MOVABLE; @@ -2624,10 +2687,10 @@ void free_unref_page(struct page *page, unsigned int order) pcp_trylock_prepare(UP_flags); pcp = pcp_spin_trylock(zone->per_cpu_pageset); if (pcp) { - free_unref_page_commit(zone, pcp, page, migratetype, order); + free_unref_page_commit(zone, pcp, page, migratetype, order, ugen); pcp_spin_unlock(pcp); } else { - free_one_page(zone, page, pfn, order, FPI_NONE); + free_one_page(zone, page, pfn, order, FPI_NONE, ugen); } pcp_trylock_finish(UP_flags); } @@ -2657,7 +2720,7 @@ void free_unref_folios(struct folio_batch *folios) */ if (!pcp_allowed_order(order)) { free_one_page(folio_zone(folio), &folio->page, - pfn, order, FPI_NONE); + pfn, order, FPI_NONE, 0); continue; } folio->private = (void *)(unsigned long)order; @@ -2693,7 +2756,7 @@ void free_unref_folios(struct folio_batch *folios) */ if (is_migrate_isolate(migratetype)) { free_one_page(zone, &folio->page, pfn, - order, FPI_NONE); + order, FPI_NONE, 0); continue; } @@ -2706,7 +2769,7 @@ void free_unref_folios(struct folio_batch *folios) if (unlikely(!pcp)) { pcp_trylock_finish(UP_flags); free_one_page(zone, &folio->page, pfn, - order, FPI_NONE); + order, FPI_NONE, 0); continue; } locked_zone = zone; @@ -2721,7 +2784,7 @@ void free_unref_folios(struct folio_batch *folios) trace_mm_page_free_batched(&folio->page); free_unref_page_commit(zone, pcp, &folio->page, migratetype, - order); + order, 0); } if (pcp) { @@ -2772,6 +2835,11 @@ int __isolate_free_page(struct page *page, unsigned int order) return 0; } + /* + * del_page_from_free_list() updates current's ugen. The user of + * the isolated page should check_flush_task_ugen() before using + * it. + */ del_page_from_free_list(page, zone, order, mt); /* @@ -2813,7 +2881,7 @@ void __putback_isolated_page(struct page *page, unsigned int order, int mt) /* Return isolated page to tail of freelist. */ __free_one_page(page, page_to_pfn(page), zone, order, mt, - FPI_SKIP_REPORT_NOTIFY | FPI_TO_TAIL); + FPI_SKIP_REPORT_NOTIFY | FPI_TO_TAIL, 0); } /* @@ -2956,6 +3024,11 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order, } page = list_first_entry(list, struct page, pcp_list); + + /* + * Pairs with check_flush_task_ugen() in prep_new_page(). 
+ */ + update_task_ugen(page_private(page)); list_del(&page->pcp_list); pcp->count -= 1 << order; } while (check_new_pages(page, order)); @@ -4782,11 +4855,11 @@ void __free_pages(struct page *page, unsigned int order) struct alloc_tag *tag = pgalloc_tag_get(page); if (put_page_testzero(page)) - free_unref_page(page, order); + free_unref_page(page, order, 0); else if (!head) { pgalloc_tag_sub_pages(tag, (1 << order) - 1); while (order-- > 0) - free_unref_page(page + (1 << order), order); + free_unref_page(page + (1 << order), order, 0); } } EXPORT_SYMBOL(__free_pages); @@ -4848,7 +4921,7 @@ void __page_frag_cache_drain(struct page *page, unsigned int count) VM_BUG_ON_PAGE(page_ref_count(page) == 0, page); if (page_ref_sub_and_test(page, count)) - free_unref_page(page, compound_order(page)); + free_unref_page(page, compound_order(page), 0); } EXPORT_SYMBOL(__page_frag_cache_drain); @@ -4889,7 +4962,7 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc, goto refill; if (unlikely(nc->pfmemalloc)) { - free_unref_page(page, compound_order(page)); + free_unref_page(page, compound_order(page), 0); goto refill; } @@ -4933,7 +5006,7 @@ void page_frag_free(void *addr) struct page *page = virt_to_head_page(addr); if (unlikely(put_page_testzero(page))) - free_unref_page(page, compound_order(page)); + free_unref_page(page, compound_order(page), 0); } EXPORT_SYMBOL(page_frag_free); @@ -6742,10 +6815,19 @@ void __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn) BUG_ON(!PageBuddy(page)); VM_WARN_ON(get_pageblock_migratetype(page) != MIGRATE_ISOLATE); order = buddy_order(page); + /* + * del_page_from_free_list() updates current's ugen that + * pairs with check_flush_task_ugen() below in this function. + */ del_page_from_free_list(page, zone, order, MIGRATE_ISOLATE); pfn += (1 << order); } spin_unlock_irqrestore(&zone->lock, flags); + + /* + * Check and flush before using it. + */ + check_flush_task_ugen(); } #endif @@ -6829,6 +6911,11 @@ bool take_page_off_buddy(struct page *page) int migratetype = get_pfnblock_migratetype(page_head, pfn_head); + /* + * del_page_from_free_list() updates current's + * ugen that pairs with check_flush_task_ugen() below + * in this function. + */ del_page_from_free_list(page_head, zone, page_order, migratetype); break_down_buddy_pages(zone, page_head, page, 0, @@ -6841,6 +6928,11 @@ bool take_page_off_buddy(struct page *page) break; } spin_unlock_irqrestore(&zone->lock, flags); + + /* + * Check and flush before using it. + */ + check_flush_task_ugen(); return ret; } @@ -6859,7 +6951,7 @@ bool put_page_back_buddy(struct page *page) int migratetype = get_pfnblock_migratetype(page, pfn); ClearPageHWPoisonTakenOff(page); - __free_one_page(page, pfn, zone, 0, migratetype, FPI_NONE); + __free_one_page(page, pfn, zone, 0, migratetype, FPI_NONE, 0); if (TestClearPageHWPoison(page)) { ret = true; } diff --git a/mm/page_isolation.c b/mm/page_isolation.c index 042937d5abe4..5823da60a621 100644 --- a/mm/page_isolation.c +++ b/mm/page_isolation.c @@ -260,6 +260,12 @@ static void unset_migratetype_isolate(struct page *page, int migratetype) zone->nr_isolate_pageblock--; out: spin_unlock_irqrestore(&zone->lock, flags); + + /* + * Check and flush for the pages that have been isolated. 
+ */ + if (isolated_page) + check_flush_task_ugen(); } static inline struct page * diff --git a/mm/page_reporting.c b/mm/page_reporting.c index e4c428e61d8c..4f94a3ea1b22 100644 --- a/mm/page_reporting.c +++ b/mm/page_reporting.c @@ -221,6 +221,11 @@ page_reporting_cycle(struct page_reporting_dev_info *prdev, struct zone *zone, /* release lock before waiting on report processing */ spin_unlock_irq(&zone->lock); + /* + * Check and flush before using the isolated pages. + */ + check_flush_task_ugen(); + /* begin processing pages in local list */ err = prdev->report(prdev, sgl, PAGE_REPORTING_CAPACITY); @@ -253,6 +258,11 @@ page_reporting_cycle(struct page_reporting_dev_info *prdev, struct zone *zone, spin_unlock_irq(&zone->lock); + /* + * Check and flush before using the isolated pages. + */ + check_flush_task_ugen(); + return err; } diff --git a/mm/swap.c b/mm/swap.c index dc205bdfbbd4..dae169b19ab9 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -125,10 +125,20 @@ void __folio_put(struct folio *folio) page_cache_release(folio); folio_undo_large_rmappable(folio); mem_cgroup_uncharge(folio); - free_unref_page(&folio->page, folio_order(folio)); + free_unref_page(&folio->page, folio_order(folio), 0); } EXPORT_SYMBOL(__folio_put); +void __folio_put_ugen(struct folio *folio, unsigned short int ugen) +{ + if (WARN_ON(!can_luf_folio(folio))) + return; + + page_cache_release(folio); + mem_cgroup_uncharge(folio); + free_unref_page(&folio->page, 0, ugen); +} + /** * put_pages_list() - release a list of pages * @pages: list of pages threaded on page->lru From patchwork Fri May 31 09:19:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Byungchul Park X-Patchwork-Id: 13681403 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4DDBEC27C4F for ; Fri, 31 May 2024 09:20:31 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0AB516B009F; Fri, 31 May 2024 05:20:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 00EAB6B00A1; Fri, 31 May 2024 05:20:20 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DA0556B00A3; Fri, 31 May 2024 05:20:20 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id AAB786B009F for ; Fri, 31 May 2024 05:20:20 -0400 (EDT) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 6418A1C2292 for ; Fri, 31 May 2024 09:20:20 +0000 (UTC) X-FDA: 82178145000.24.33162E8 Received: from invmail4.hynix.com (exvmail4.skhynix.com [166.125.252.92]) by imf11.hostedemail.com (Postfix) with ESMTP id 8D4B64000E for ; Fri, 31 May 2024 09:20:18 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf11.hostedemail.com: domain of byungchul@sk.com designates 166.125.252.92 as permitted sender) smtp.mailfrom=byungchul@sk.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1717147218; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:content-type: content-transfer-encoding:in-reply-to:in-reply-to: references:references; 
bh=x9Nc2jB8AR9hDzDeRW+9QLUyZ+e1BO5UpCyzvuvWD7Q=; b=i+/31sVpG99cJj5yY3zpktQIGAthuIDPGCl7plzYmFFvCdH3U9lbaRMWsYg0pvojAcHp35 nz3GNDD9GhIko//YbSXsr3kmeibZ9OTXjrdakO3M1W25xg/i0Jk+LJNktkF5rP6Q1Xi/Qe MVS1tAnsv7y54/KPohxBC1hgjKGlrzc= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf11.hostedemail.com: domain of byungchul@sk.com designates 166.125.252.92 as permitted sender) smtp.mailfrom=byungchul@sk.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1717147218; a=rsa-sha256; cv=none; b=ahB67eHOQNKAylj1JUzDDKw0zBDoWNNsClWfcHzeXmEzryhEf+VsmAXI+i09odXxMHJdQe kjBIkUPSJx0219cyCKTnD5VwCAVZvYMeZTuSspi9tKR+pM7YCm8fCr+bFe8IH3Y9uKceTZ /Ofbfte5qvZWNOHc+mLY014w8Dn/lm4= X-AuditID: a67dfc5b-d85ff70000001748-61-6659964c009f From: Byungchul Park To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com, vernhao@tencent.com, mgorman@techsingularity.net, hughd@google.com, willy@infradead.org, david@redhat.com, peterz@infradead.org, luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, rjgolo@gmail.com Subject: [PATCH v11 07/12] mm: add a parameter, unmap generation number, to free_unref_folios() Date: Fri, 31 May 2024 18:19:56 +0900 Message-Id: <20240531092001.30428-8-byungchul@sk.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20240531092001.30428-1-byungchul@sk.com> References: <20240531092001.30428-1-byungchul@sk.com> X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFnrGLMWRmVeSWpSXmKPExsXC9ZZnoa7PtMg0g0lThC3mrF/DZvF5wz82 ixcb2hktvq7/xWzx9FMfi8XlXXPYLO6t+c9qcX7XWlaLHUv3MVlcOrCAyeJ47wEmi/n3PrNZ bN40ldni+JSpjBa/fwAVn5w1mcVBwON7ax+Lx85Zd9k9Fmwq9di8Qstj8Z6XTB6bVnWyeWz6 NInd4925c+weJ2b8ZvGYdzLQ4/2+q2weW3/ZeTROvcbm8XmTXABfFJdNSmpOZllqkb5dAlfG 3D2rWAvuqlU86TzC0sC4U76LkZNDQsBE4vXki8ww9qmTT1hAbDYBdYkbN36CxUUEzCQOtv5h B7GZBe4ySRzoZwOxhQXiJWa8hoizCKhK7Pm3AyzOK2Aq8fXmCxaImfISqzccAJvDCTTnwN87 jCC2EFDNov+9QDYXUM17NolDF2dAHSEpcXDFDZYJjLwLGBlWMQpl5pXlJmbmmOhlVOZlVugl 5+duYgSG/rLaP9E7GD9dCD7EKMDBqMTDG1ARkSbEmlhWXJl7iFGCg1lJhPdXOlCINyWxsiq1 KD++qDQntfgQozQHi5I4r9G38hQhgfTEktTs1NSC1CKYLBMHp1QDo1NnzjGhvF3zvnLovXVh f3zizEvvR1q+Ztc/ZJrf+WnvEv2H89H07cGTmvdqPvo8T0lgmsws5hOVuxc9l5ZdLrSl64Xy m9Vm16a+8JtrP2XtNamir8fvXr5fuK72lrrQ7bPLzB1c/74yVrrQbyLwWLxlzcfKnjW+AtHF 7r+32+5Z7HN6QypjznklluKMREMt5qLiRAAo08rMeQIAAA== X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFjrLLMWRmVeSWpSXmKPExsXC5WfdrOszLTLNYNsjdos569ewWXze8I/N 4sWGdkaLr+t/MVs8/dTHYnF47klWi8u75rBZ3Fvzn9Xi/K61rBY7lu5jsrh0YAGTxfHeA0wW 8+99ZrPYvGkqs8XxKVMZLX7/ACo+OWsyi4Ogx/fWPhaPnbPusnss2FTqsXmFlsfiPS+ZPDat 6mTz2PRpErvHu3Pn2D1OzPjN4jHvZKDH+31X2TwWv/jA5LH1l51H49RrbB6fN8kF8Edx2aSk 5mSWpRbp2yVwZczds4q14K5axZPOIywNjDvluxg5OSQETCROnXzCAmKzCahL3LjxkxnEFhEw kzjY+ocdxGYWuMskcaCfDcQWFoiXmPEaIs4ioCqx598OsDivgKnE15svWCBmykus3nAAbA4n 0JwDf+8wgthCQDWL/vcyTmDkWsDIsIpRJDOvLDcxM8dUrzg7ozIvs0IvOT93EyMwkJfV/pm4 g/HLZfdDjAIcjEo8vAEVEWlCrIllxZW5hxglOJiVRHh/pQOFeFMSK6tSi/Lji0pzUosPMUpz sCiJ83qFpyYICaQnlqRmp6YWpBbBZJk4OKUaGCe+PnHw2aOZX14+ZYx7ZyVbVFf4OZ+dyUyp yIBv5k6HzPhNi1SlbYJtmqt9fz3KvLTp07+9fvsvifsdCe4N+2sbd3dldH9p1Cmf/nk9GueM 79YGzIg4apnc7r1mS3Tz5SV3dHQU7CZ/zl+7zlm/o1vxX/913jVrFh3c2fx71aOwOQ3H/6gK zFBiKc5INNRiLipOBAC//IG9YAIAAA== X-CFilter-Loop: Reflected X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 8D4B64000E X-Stat-Signature: wapyuo788569ampdyuumztdbnejuzby8 X-HE-Tag: 1717147218-482969 X-HE-Meta: 
U2FsdGVkX1/ojRRhYhG4j94S2xe605cz+7IfJKxmyEhdyvzbfDtId5Jknggzb5teh+RqbVGkAPdY8Df+a+GmAMyWcE+ADJWbJEDDgtvY+UV9Mlz75lWteh4ivHMHMTk0QhZtSvEr/AlaW6w70iv55hA4GwoDtBlDUFyA30X5CCGa295kyQjrKclAYq37K9mvIzlmxeAvzZe09iwseBV5btqJZs6tytMx7H4hLBPuSxSLdLdbyvuNDb8XQg16LJA83qAQfkul6+55RBcZOQTnt/a8N3HlMxuR4tUvCz2jYkLuB6vvJhGhwAsPMf39s8Dzhc+LC4L9oCQxJ+FSZrRgWld8DpHwPD3sRNRMuET4IwoEYBeutl6fkOh/pSo7LUQ5GUVMhEv7dJ3bQ32V1PTi37QqHVv/nJV0mAVmXZ/bussGzn9FxnxDuEnlMZop6p+ooRKgI5dqSOzaTurr+YYRFmdwYbDORYV8dKlAi247+5+8PrHHuhcO5FLIxpSNTGfd4ts6ug+e8TaED/o9s7udWB5rtOLrzz9La+8LGQ+k9wypjKkR1w7FrAKo0RrnuFBtUwH/9/A4vBf/AOibB7Yr+fqntL6YjL9bWlGvGghK7Pc9SElmTPwU/WHbs0XeaZW2vBrGgdiRMvfaTjqPzVNiCxKexrcAFtjwuvugY11bPUc/XkcRMD9NQOJtYw6Kn2xmlnxNmSXJqTwpnKJaJXiDvTgf9hbqPI/pyIjoIdLoMXOR7WyTO1Pb8aBmi9kUFZ+RXxy99zfkuZAAR36eVfyOL1FkWqkLnK2k3t3mqNSlWxn0ioB89oErqfKlpfiE1x5hzy88PD5CPv3eAxcr71f7obzNyO+1b0Q8stK9qPbdr0g1sy4qZ4aZG0AWkmCj8em+VND5ocgkeDv0EMf4Yc0L+Ohha/vlrh8D97QJVB2BKUCuL4h3fTSEUHJuwYNcaH2jDH6g9lB6gqVZ+sqhwAs ysWfRNbc WH7dZbQINXlwBt21Lff/vES6aneHwxXIXw3YlwvA0eii5G7E= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Unmap generation number is used by luf mechanism to track need of tlb flush for each page residing in pcp or buddy. The number should be delivered to pcp or buddy via free_unref_folios() that is for releasing folios that have been unmapped during reclaim in shrink_folio_list(). Signed-off-by: Byungchul Park --- mm/internal.h | 2 +- mm/page_alloc.c | 10 +++++----- mm/swap.c | 6 +++--- mm/vmscan.c | 8 ++++---- 4 files changed, 13 insertions(+), 13 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index 380ae980e4f9..dba6d0eb7b6d 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -662,7 +662,7 @@ extern bool free_pages_prepare(struct page *page, unsigned int order); extern int user_min_free_kbytes; void free_unref_page(struct page *page, unsigned int order, unsigned short int ugen); -void free_unref_folios(struct folio_batch *fbatch); +void free_unref_folios(struct folio_batch *fbatch, unsigned short int ugen); extern void zone_pcp_reset(struct zone *zone); extern void zone_pcp_disable(struct zone *zone); diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 6fbbe45be5ae..c9acb4da91e0 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2698,7 +2698,7 @@ void free_unref_page(struct page *page, unsigned int order, /* * Free a batch of folios */ -void free_unref_folios(struct folio_batch *folios) +void free_unref_folios(struct folio_batch *folios, unsigned short int ugen) { unsigned long __maybe_unused UP_flags; struct per_cpu_pages *pcp = NULL; @@ -2720,7 +2720,7 @@ void free_unref_folios(struct folio_batch *folios) */ if (!pcp_allowed_order(order)) { free_one_page(folio_zone(folio), &folio->page, - pfn, order, FPI_NONE, 0); + pfn, order, FPI_NONE, ugen); continue; } folio->private = (void *)(unsigned long)order; @@ -2756,7 +2756,7 @@ void free_unref_folios(struct folio_batch *folios) */ if (is_migrate_isolate(migratetype)) { free_one_page(zone, &folio->page, pfn, - order, FPI_NONE, 0); + order, FPI_NONE, ugen); continue; } @@ -2769,7 +2769,7 @@ void free_unref_folios(struct folio_batch *folios) if (unlikely(!pcp)) { pcp_trylock_finish(UP_flags); free_one_page(zone, &folio->page, pfn, - order, FPI_NONE, 0); + order, FPI_NONE, ugen); continue; } locked_zone = zone; @@ -2784,7 +2784,7 @@ void free_unref_folios(struct folio_batch *folios) trace_mm_page_free_batched(&folio->page); 
free_unref_page_commit(zone, pcp, &folio->page, migratetype, - order, 0); + order, ugen); } if (pcp) { diff --git a/mm/swap.c b/mm/swap.c index dae169b19ab9..67605bbfc95c 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -161,11 +161,11 @@ void put_pages_list(struct list_head *pages) /* LRU flag must be clear because it's passed using the lru */ if (folio_batch_add(&fbatch, folio) > 0) continue; - free_unref_folios(&fbatch); + free_unref_folios(&fbatch, 0); } if (fbatch.nr) - free_unref_folios(&fbatch); + free_unref_folios(&fbatch, 0); INIT_LIST_HEAD(pages); } EXPORT_SYMBOL(put_pages_list); @@ -1027,7 +1027,7 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs) folios->nr = j; mem_cgroup_uncharge_folios(folios); - free_unref_folios(folios); + free_unref_folios(folios, 0); } EXPORT_SYMBOL(folios_put_refs); diff --git a/mm/vmscan.c b/mm/vmscan.c index b9170f767353..15efe6f0edce 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1461,7 +1461,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, if (folio_batch_add(&free_folios, folio) == 0) { mem_cgroup_uncharge_folios(&free_folios); try_to_unmap_flush(); - free_unref_folios(&free_folios); + free_unref_folios(&free_folios, 0); } continue; @@ -1528,7 +1528,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, mem_cgroup_uncharge_folios(&free_folios); try_to_unmap_flush(); - free_unref_folios(&free_folios); + free_unref_folios(&free_folios, 0); list_splice(&ret_folios, folio_list); count_vm_events(PGACTIVATE, pgactivate); @@ -1868,7 +1868,7 @@ static unsigned int move_folios_to_lru(struct lruvec *lruvec, if (folio_batch_add(&free_folios, folio) == 0) { spin_unlock_irq(&lruvec->lru_lock); mem_cgroup_uncharge_folios(&free_folios); - free_unref_folios(&free_folios); + free_unref_folios(&free_folios, 0); spin_lock_irq(&lruvec->lru_lock); } @@ -1890,7 +1890,7 @@ static unsigned int move_folios_to_lru(struct lruvec *lruvec, if (free_folios.nr) { spin_unlock_irq(&lruvec->lru_lock); mem_cgroup_uncharge_folios(&free_folios); - free_unref_folios(&free_folios); + free_unref_folios(&free_folios, 0); spin_lock_irq(&lruvec->lru_lock); } From patchwork Fri May 31 09:19:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Byungchul Park X-Patchwork-Id: 13681405 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 838F7C27C50 for ; Fri, 31 May 2024 09:20:36 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 05CA76B00A2; Fri, 31 May 2024 05:20:22 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 00CDF6B00A3; Fri, 31 May 2024 05:20:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DA2AE6B00A4; Fri, 31 May 2024 05:20:21 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id B76E26B00A2 for ; Fri, 31 May 2024 05:20:21 -0400 (EDT) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 68E8F120CDD for ; Fri, 31 May 2024 09:20:21 +0000 (UTC) X-FDA: 82178145042.27.4182CC1 Received: from invmail4.hynix.com (exvmail4.skhynix.com [166.125.252.92]) by imf22.hostedemail.com (Postfix) with ESMTP id 
936CCC0016 for ; Fri, 31 May 2024 09:20:19 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf22.hostedemail.com: domain of byungchul@sk.com designates 166.125.252.92 as permitted sender) smtp.mailfrom=byungchul@sk.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1717147219; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:content-type: content-transfer-encoding:in-reply-to:in-reply-to: references:references; bh=ZCyuWiCSt+64WEjjRW+aCA/oEEYTlh28G/HkoubEy38=; b=wEvLZfMKHGq4+tMROS6QLzDIW+6wq629c/QzAr3sIIRoa38pMACdkBidHT2a2Fj19SUqT2 h3NGamw0RFRzb7EC+nz8L+ZvhGHNlZ37XNndXlGTE2tf1OPKZFNnQ646czKCqt3jXHuznE d5WF3Nx3PpVQZO/pfI6YiLuyuFn1czI= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf22.hostedemail.com: domain of byungchul@sk.com designates 166.125.252.92 as permitted sender) smtp.mailfrom=byungchul@sk.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1717147219; a=rsa-sha256; cv=none; b=IcrDcimIN1WvgJ07N/WnMO7lTZnCFnQTEQJjk3Y85D3buDyZrOHO+U9y5yfosY64G49GhP Io9Q7zHJVnlclDUFazOzqgPCxLL+IV34EGmwwNvmVjRtV5OdK7Yn1fvwlGNDnBKxjBZFLn xFq5BBbG//06I90q6dENdYsR+aBXbZc= X-AuditID: a67dfc5b-d85ff70000001748-66-6659964c5b63 From: Byungchul Park To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com, vernhao@tencent.com, mgorman@techsingularity.net, hughd@google.com, willy@infradead.org, david@redhat.com, peterz@infradead.org, luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, rjgolo@gmail.com Subject: [PATCH v11 08/12] mm/rmap: recognize read-only tlb entries during batched tlb flush Date: Fri, 31 May 2024 18:19:57 +0900 Message-Id: <20240531092001.30428-9-byungchul@sk.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20240531092001.30428-1-byungchul@sk.com> References: <20240531092001.30428-1-byungchul@sk.com> X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFnrKLMWRmVeSWpSXmKPExsXC9ZZnka7PtMg0g9ZmaYs569ewWXze8I/N 4sWGdkaLr+t/MVs8/dTHYnF51xw2i3tr/rNanN+1ltVix9J9TBaXDixgsjjee4DJYv69z2wW mzdNZbY4PmUqo8XvH0DFJ2dNZnEQ8Pje2sfisXPWXXaPBZtKPTav0PJYvOclk8emVZ1sHps+ TWL3eHfuHLvHiRm/WTzmnQz0eL/vKpvH1l92Ho1Tr7F5fN4kF8AXxWWTkpqTWZZapG+XwJXx bfd8poI7shXtxyexNzAekehi5OCQEDCReLiVsYuRE8yctn0VO4jNJqAucePGT2YQW0TATOJg 6x+wOLPAXSaJA/1sILawQIzEo33bwWwWAVWJntNtYDavgKnE2/s/2SFmykus3nAAbA4n0JwD f++A7RICqln0vxfI5gKqec8mcfz+WRaIBkmJgytusExg5F3AyLCKUSgzryw3MTPHRC+jMi+z Qi85P3cTIzDwl9X+id7B+OlC8CFGAQ5GJR7egIqINCHWxLLiytxDjBIczEoivL/SgUK8KYmV ValF+fFFpTmpxYcYpTlYlMR5jb6VpwgJpCeWpGanphakFsFkmTg4pRoYA2Oitgd+2sTw9Jj6 yuu/Ztw4ue9i7LXgkuCsosUdubeX9v5pVb0+8ea51WLfORzO53z28H5W+L5fsHdrx65/l6bd m7mu/6Bnz4oCX8ajRW5XPL6H/Ghmq1ryWDTz76t91949n7qhKuHp94lH15mXLDWPPHZn/7pr c83jWF8f8GJ/sF284sj74/OUWIozEg21mIuKEwHsDoh0eAIAAA== X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFjrLLMWRmVeSWpSXmKPExsXC5WfdrOszLTLNYO8tPos569ewWXze8I/N 4sWGdkaLr+t/MVs8/dTHYnF47klWi8u75rBZ3Fvzn9Xi/K61rBY7lu5jsrh0YAGTxfHeA0wW 8+99ZrPYvGkqs8XxKVMZLX7/ACo+OWsyi4Ogx/fWPhaPnbPusnss2FTqsXmFlsfiPS+ZPDat 6mTz2PRpErvHu3Pn2D1OzPjN4jHvZKDH+31X2TwWv/jA5LH1l51H49RrbB6fN8kF8Edx2aSk 5mSWpRbp2yVwZXzbPZ+p4I5sRfvxSewNjEckuhg5OSQETCSmbV/FDmKzCahL3LjxkxnEFhEw kzjY+gcszixwl0niQD8biC0sECPxaN92MJtFQFWi53QbmM0rYCrx9v5PdoiZ8hKrNxwAm8MJ NOfA3zuMILYQUM2i/72MExi5FjAyrGIUycwry03MzDHVK87OqMzLrNBLzs/dxAgM5GW1fybu YPxy2f0QowAHoxIPb0BFRJoQa2JZcWXuIUYJDmYlEd5f6UAh3pTEyqrUovz4otKc1OJDjNIc 
LErivF7hqQlCAumJJanZqakFqUUwWSYOTqkGRrO7b7ayVyxJYmDcwRJy9bDa/64XLV/WhnRt eDW7RJpTY97Mn4Wx84zdH5qUa1Z29f1ru1N35fbyk9xranRsOBhWHN5+0rm49C7byptvDuU+ OpS3xizJUqd8SaGcdpruagkO0xPTritXv7BOXTSx1WtLAeeeQLP/a9adMuWs3xrwbN/8Lx67 HyqxFGckGmoxFxUnAgB1VFSbYAIAAA== X-CFilter-Loop: Reflected X-Rspamd-Queue-Id: 936CCC0016 X-Stat-Signature: hbi8c7qcfgfzzxhjx3yj7uyk6donzwgt X-Rspam-User: X-Rspamd-Server: rspam04 X-HE-Tag: 1717147219-322052 X-HE-Meta: U2FsdGVkX1+J3jz7q3iVu4sR3zBgHxX5fdIQ+54vEc/gUZ1Tb/fkL9wLIVsJf7RQVmIjZLjzVTNPkmkic4minnXSnua698BtB4c1XvA6m48ty9OrlJcIJ2u3SSMvaYRdNmD2tZQNysZvtbTP30eKdWh0GnIQeIgpUTmj7CBrNZlx0mrQQ6wBj73uU2Cbmo6Fk8GTBZ26LrJdxNpY3fO8i5+d5BCkbYDhUBjyyI0i/wYa3q74tbyI646tt3iTcsdjEIWer6sQwt8bj43m5uURuNcwX3BSzpuURc+2ty84qg+T4hoQBzx6ta6qag+uyx29Q32xqtXaQ5B0vOxE3V+Ds8+t1l60Sk9d7RN4GUAd10Aw3b4434fs2B7o8PYke7BcaGYzHsN49SB8RrpLYONMUXM/xCPAPhTlLvXXmHdACxjlB2G/7CUPIIAdmbW86XCN1ARhpRvqqh6WR7Zt0U5tYevc6Vz64Oio4IVNCpOv7GBHv7EOMJyvp8S4U3oPUE4JUaUlccd1uDSH9CAPvwsBv/zuQraEzedY8PIkoRzbybKDnkt09BY+rSD2OyoO+j0aeX2gdIVdCKnuw6C3uRk9wY3Fu+BBiybwERs4wy42go3FA9BPT4cJtxgqu9fYBDei3Sjw4l1oWGIaEgFl9aJkn5bXDxE1is5KovAzXGvWorEwEPftx8K7snmSdi9k7NfX5gWNA+ADewHcvDV/CSfthd+WW+W4MiN69usQOM3scWLQvHeBZlDoDUTjBWdb67hY2vrx79sdcq+z8cgQ+GF3PkGTGxcd2x7/gapRDwP9m4lVeNIlV7AAmaAdYZ69PjVSdFAb5IY5vapcFqFrgNvUSAUS8qtkCUAnoSOK6yNGu/nIYwlQ2/lJ+vGSZfgcczKosuIGYkPwlBVI5FQ4fWP8U/7Y/EWzI74bqtWGlq40Cte47l2sbrxge5mUsByjmB3lRAgnyI6/Orsn/PFjfjc 8pQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Functionally, no change. This is a preparation for luf mechanism that requires to recognize read-only tlb entries and handle them in a different way. The newly introduced API in this patch, fold_ubc(), will be used by luf mechanism. 
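To illustrate the intended folding rule before the kernel diff below, here is a
minimal userspace sketch (not kernel code): struct ubc_model and its u64 mask
are stand-ins for struct tlbflush_unmap_batch and the arch-specific CPU mask,
and fold_ubc_model() mirrors what fold_ubc() does, i.e. accumulate the source
batch into the destination and then reset the source.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Userspace stand-in for struct tlbflush_unmap_batch; the u64 mask
 * models the arch-specific set of CPUs that may hold stale entries.
 */
struct ubc_model {
	uint64_t cpumask;
	bool flush_required;
	bool writable;
};

/* Same rule as fold_ubc(): accumulate src into dst, then reset src. */
static void fold_ubc_model(struct ubc_model *dst, struct ubc_model *src)
{
	if (!src->flush_required)
		return;

	dst->cpumask |= src->cpumask;
	dst->writable = dst->writable || src->writable;
	dst->flush_required = true;

	src->cpumask = 0;
	src->flush_required = false;
	src->writable = false;
}

int main(void)
{
	struct ubc_model tlb_ubc = { 0 }, tlb_ubc_ro = { 0 };

	/* Unmapping a read-only pte only marks the read-only batch. */
	tlb_ubc_ro.cpumask |= 1ULL << 2;
	tlb_ubc_ro.flush_required = true;

	/* Unmapping a writable (possibly dirty) pte marks the main batch. */
	tlb_ubc.cpumask |= 1ULL << 5;
	tlb_ubc.flush_required = true;
	tlb_ubc.writable = true;

	/* Right before a real flush, the read-only batch is merged in. */
	fold_ubc_model(&tlb_ubc, &tlb_ubc_ro);

	printf("main: mask=%#llx flush_required=%d writable=%d\n",
	       (unsigned long long)tlb_ubc.cpumask,
	       tlb_ubc.flush_required, tlb_ubc.writable);
	printf("ro:   flush_required=%d\n", tlb_ubc_ro.flush_required);
	return 0;
}

With this split, read-only unmaps keep accumulating in tlb_ubc_ro and are only
merged into tlb_ubc when an actual flush is about to be performed.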
Signed-off-by: Byungchul Park --- include/linux/sched.h | 1 + mm/internal.h | 4 ++++ mm/rmap.c | 34 ++++++++++++++++++++++++++++++++-- 3 files changed, 37 insertions(+), 2 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index ab5a2ed79b88..d9722c014157 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1340,6 +1340,7 @@ struct task_struct { #endif struct tlbflush_unmap_batch tlb_ubc; + struct tlbflush_unmap_batch tlb_ubc_ro; unsigned short int ugen; /* Cache last used pipe for splice(): */ diff --git a/mm/internal.h b/mm/internal.h index dba6d0eb7b6d..ca6fb5b2a640 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1124,6 +1124,7 @@ extern struct workqueue_struct *mm_percpu_wq; void try_to_unmap_flush(void); void try_to_unmap_flush_dirty(void); void flush_tlb_batched_pending(struct mm_struct *mm); +void fold_ubc(struct tlbflush_unmap_batch *dst, struct tlbflush_unmap_batch *src); #else static inline void try_to_unmap_flush(void) { @@ -1134,6 +1135,9 @@ static inline void try_to_unmap_flush_dirty(void) static inline void flush_tlb_batched_pending(struct mm_struct *mm) { } +static inline void fold_ubc(struct tlbflush_unmap_batch *dst, struct tlbflush_unmap_batch *src) +{ +} #endif /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */ extern const struct trace_print_flags pageflag_names[]; diff --git a/mm/rmap.c b/mm/rmap.c index a65a94aada8d..1a246788e867 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -634,6 +634,28 @@ struct anon_vma *folio_lock_anon_vma_read(struct folio *folio, } #ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH + +void fold_ubc(struct tlbflush_unmap_batch *dst, + struct tlbflush_unmap_batch *src) +{ + if (!src->flush_required) + return; + + /* + * Fold src to dst. + */ + arch_tlbbatch_fold(&dst->arch, &src->arch); + dst->writable = dst->writable || src->writable; + dst->flush_required = true; + + /* + * Reset src. + */ + arch_tlbbatch_clear(&src->arch); + src->flush_required = false; + src->writable = false; +} + /* * Flush TLB entries for recently unmapped pages from remote CPUs. 
It is * important if a PTE was dirty when it was unmapped that it's flushed @@ -643,7 +665,9 @@ struct anon_vma *folio_lock_anon_vma_read(struct folio *folio, void try_to_unmap_flush(void) { struct tlbflush_unmap_batch *tlb_ubc = ¤t->tlb_ubc; + struct tlbflush_unmap_batch *tlb_ubc_ro = ¤t->tlb_ubc_ro; + fold_ubc(tlb_ubc, tlb_ubc_ro); if (!tlb_ubc->flush_required) return; @@ -657,8 +681,9 @@ void try_to_unmap_flush(void) void try_to_unmap_flush_dirty(void) { struct tlbflush_unmap_batch *tlb_ubc = ¤t->tlb_ubc; + struct tlbflush_unmap_batch *tlb_ubc_ro = ¤t->tlb_ubc_ro; - if (tlb_ubc->writable) + if (tlb_ubc->writable || tlb_ubc_ro->writable) try_to_unmap_flush(); } @@ -675,13 +700,18 @@ void try_to_unmap_flush_dirty(void) static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval, unsigned long uaddr) { - struct tlbflush_unmap_batch *tlb_ubc = ¤t->tlb_ubc; + struct tlbflush_unmap_batch *tlb_ubc; int batch; bool writable = pte_dirty(pteval); if (!pte_accessible(mm, pteval)) return; + if (pte_write(pteval)) + tlb_ubc = ¤t->tlb_ubc; + else + tlb_ubc = ¤t->tlb_ubc_ro; + arch_tlbbatch_add_pending(&tlb_ubc->arch, mm, uaddr); tlb_ubc->flush_required = true; From patchwork Fri May 31 09:19:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Byungchul Park X-Patchwork-Id: 13681408 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1CDA8C25B75 for ; Fri, 31 May 2024 09:20:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C1CC96B00A5; Fri, 31 May 2024 05:20:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B7EC36B00A6; Fri, 31 May 2024 05:20:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 982FE6B00A7; Fri, 31 May 2024 05:20:23 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 588CC6B00A6 for ; Fri, 31 May 2024 05:20:23 -0400 (EDT) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 0B4754127E for ; Fri, 31 May 2024 09:20:23 +0000 (UTC) X-FDA: 82178145126.09.AD44D71 Received: from invmail4.hynix.com (exvmail4.skhynix.com [166.125.252.92]) by imf11.hostedemail.com (Postfix) with ESMTP id BB79E4000A for ; Fri, 31 May 2024 09:20:20 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf11.hostedemail.com: domain of byungchul@sk.com designates 166.125.252.92 as permitted sender) smtp.mailfrom=byungchul@sk.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1717147221; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:content-type: content-transfer-encoding:in-reply-to:in-reply-to: references:references; bh=7dQpbpbLlm7DFD9m0RapSFIbgpU8d7AdRKWiAxzHQUs=; b=h4u7gXe5kzXelP9tfCkiHEZ8YgGM5Aw++DfCXY39Ar5cX5dQ4wDw25ZnsHwZSp+bnDRO6M VSvLs+G46wD0S7/xRWbtSnyT2fhmIhh8p9MPH6ra7m4MSNogOeffciWDg8vyMz0SBWKEhq K2uCeOh0nJUv9JBpPSWQnkgYk99Aw6w= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf11.hostedemail.com: domain of byungchul@sk.com designates 166.125.252.92 as permitted sender) 
smtp.mailfrom=byungchul@sk.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1717147221; a=rsa-sha256; cv=none; b=OBv24fXB4FSxjXu3BZGLHdgwTnbPTchJh3Dc+yqLdGGioF/oBpayKrNzGwix46bXfS+o7E kCUesjMWlWaf984LmNeMsNhfudGKx/84zGZvgx767wvP/GDv1Y7PH7AFpSXIwf8N1FEHMe K97Wb/ZWLdyCm3IpMy8hIbw+oIrvvxs= X-AuditID: a67dfc5b-d85ff70000001748-6c-6659964c9a02 From: Byungchul Park To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com, vernhao@tencent.com, mgorman@techsingularity.net, hughd@google.com, willy@infradead.org, david@redhat.com, peterz@infradead.org, luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, rjgolo@gmail.com Subject: [PATCH v11 09/12] mm: implement LUF(Lazy Unmap Flush) defering tlb flush when folios get unmapped Date: Fri, 31 May 2024 18:19:58 +0900 Message-Id: <20240531092001.30428-10-byungchul@sk.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20240531092001.30428-1-byungchul@sk.com> References: <20240531092001.30428-1-byungchul@sk.com> X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFnrOLMWRmVeSWpSXmKPExsXC9ZZnoa7PtMg0g7VHFSzmrF/DZvF5wz82 ixcb2hktvq7/xWzx9FMfi8XlXXPYLO6t+c9qcX7XWlaLHUv3MVlcOrCAyeJ47wEmi/n3PrNZ bN40ldni+JSpjBa/fwAVn5w1mcVBwON7ax+Lx85Zd9k9Fmwq9di8Qstj8Z6XTB6bVnWyeWz6 NInd4925c+weJ2b8ZvGYdzLQ4/2+q2weW3/ZeTROvcbm8XmTXABfFJdNSmpOZllqkb5dAlfG 1r73zAWnHzNWfJ69kbWBsWcjYxcjJ4eEgInE5t/T2GDsxrMfweJsAuoSN278ZAaxRQTMJA62 /mEHsZkF7jJJHOgHqufgEBbIkpg5uQ4kzCKgKnF2+XxWEJsXqPzp9BssECPlJVZvOAA2hhMo fuDvHbDxQgKmEov+9wLZXEA1n9kkns1+C3WPpMTBFTdYJjDyLmBkWMUolJlXlpuYmWOil1GZ l1mhl5yfu4kRGPzLav9E72D8dCH4EKMAB6MSD29ARUSaEGtiWXFl7iFGCQ5mJRHeX+lAId6U xMqq1KL8+KLSnNTiQ4zSHCxK4rxG38pThATSE0tSs1NTC1KLYLJMHJxSDYy5rUv/fe/aYXx5 0emDfHyPH05JzWT+ccDN5cRavaWMJx+vMlq+fssMoRy3Wa9En/nMmaKwZXrUjSML+zcutFEL eqMouHuynNtTByUH2eMm1fpJSdqKvDquWzkN1Xeb7mqKMyiX/Cz2cpFcfOK5SovmQ5s3VUdW /nA+Uz+zRi6vKEOGN2WiCacSS3FGoqEWc1FxIgAutu26egIAAA== X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFjrHLMWRmVeSWpSXmKPExsXC5WfdrOszLTLNoPWUiMWc9WvYLD5v+Mdm 8WJDO6PF1/W/mC2efupjsTg89ySrxeVdc9gs7q35z2pxftdaVosdS/cxWVw6sIDJ4njvASaL +fc+s1ls3jSV2eL4lKmMFr9/ABWfnDWZxUHQ43trH4vHzll32T0WbCr12LxCy2PxnpdMHptW dbJ5bPo0id3j3blz7B4nZvxm8Zh3MtDj/b6rbB6LX3xg8tj6y86jceo1No/Pm+QC+KO4bFJS czLLUov07RK4Mrb2vWcuOP2YseLz7I2sDYw9Gxm7GDk5JARMJBrPfgSz2QTUJW7c+MkMYosI mEkcbP3DDmIzC9xlkjjQz9bFyMEhLJAlMXNyHUiYRUBV4uzy+awgNi9Q+dPpN1ggRspLrN5w AGwMJ1D8wN87YOOFBEwlFv3vZZzAyLWAkWEVo0hmXlluYmaOqV5xdkZlXmaFXnJ+7iZGYCgv q/0zcQfjl8vuhxgFOBiVeHgDKiLShFgTy4orcw8xSnAwK4nw/koHCvGmJFZWpRblxxeV5qQW H2KU5mBREuf1Ck9NEBJITyxJzU5NLUgtgskycXBKNTCe6zPIMjse4TZLifP4opKgevfOR8LT EpI26f13fj2Xc8OF4/d3V9fEmUs3dnccX985M//8IsaXYo6MPWal5hu0H0xeyDn1Zb2PbvBz 70eSGVkBJ08LuEs9nKY/fePq7Lu3dIsXy/t8POvx0Vl463x5fwbHw8+q0o/VnF64JNC131s9 bO7fhXVKLMUZiYZazEXFiQCi4bqRYQIAAA== X-CFilter-Loop: Reflected X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: BB79E4000A X-Stat-Signature: dwq36akjn14phx4yerytxiimyfjxdzkr X-HE-Tag: 1717147220-729173 X-HE-Meta: 
U2FsdGVkX1/VL3xvhHgzRkLqXyKN2ld1qhmUWKEaNq2L8jVHn73Jvu5w6wdOypl42u1sV/Y9DC+AzXA/hLdRA3Awm3NpIGrV1aDv7PioO3oxSTKH7t3C+LR+FgeSeWPfXJuYu0SR0aZboq2rhdXtDVwJ1ZuGSiqGJ4KkNlfldp21i27wGttlUiXxOsnIVjT4CbJObVF1OCSj5NedaV97iJUL10rbqtNijoXqNK/jjfPynckHUKWXh83bhkGvYgV02nSyeWZISi/6w+8/wpjozJvxdX7B4O1IZXZB2pdYbxzSqKFPJ0PrqlAdlG735tElAWtQW2lSaR1n9xvTIEmmuiuhGOB12K/rJiHqAjW5mEFtvbrLfn+WG5dRX4dA90tj/bbEyGxtkYbrBsLTvflIyUGDPVL0CKGMGtGf5yr2iFZnUlvmD3x3/ld/ZVop7OM8bJfg+5E/6rpC9XUb0brw9TDBDBmRrDiv3euv1JpJu2KKvonycdsrdRF3t+KMTf2gVdsqDXR0HOmP9P/nFQb3q2rrGOE7dXnY1pPI7J3uJ2gVBWAMqd+YsYMCguLgAlPtSztMzt/Rqr4aB/lsd8KJaBH+fqu1txrpt8cUOadQw5nTokjG8b7HOWY5j+pCr8J5BvWCCrTFaBeM1I1Rl4fnWCGc27CzWeR8bYUDdFiD+TSnde77tm7QjxqRw36Y+G4axYqZZivLjy+oeiWYm6dKDWQGs++v0kiCAD4cCxm70R9H+TPN4N9A9Bkb75bFpzgrhVHxkpekyrU4fR7ySSVBIgbuf6r0DICoNAC3FaugFLpb8dg6tmnaWCvBTwfbv4QpJfEMULXArKggnwZzPfqDEG7Pi8xMml02B9ZHDXXAhl10aEmT6GHcvnMjOtnRrcZkH6rkuV6spUELSmZb4cf0Vs4N4LO+kw1xCudILqpLsZFkfyVkCyHIHuvLAoLJKtkRba9977p24TwFBi2kV3r yoA== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: A new mechanism, LUF(Lazy Unmap Flush), defers tlb flush until folios that have been unmapped and freed, eventually get allocated again. It's safe for folios that had been mapped read-only and were unmapped, as long as the contents of the folios don't change while staying in pcp or buddy so we can still read the data through the stale tlb entries. tlb flush can be defered when folios get unmapped as long as it guarantees to perform tlb flush needed, before the folios actually become used, of course, only if all the corresponding ptes don't have write permission. Otherwise, the system will get messed up. To achieve that, for the folios that map only to non-writable tlb entries, prevent tlb flush during unmapping but perform it just before the folios actually become used, out of buddy or pcp. However, we should cancel the pending by LUF and perform the deferred TLB flush right away when: 1. a writable pte is newly set through fault handler 2. a file is updated 3. kasan needs poisoning on free 4. the kernel wants to init pages on free No matter what type of workload is used for performance evaluation, the result would be positive thanks to the unconditional reduction of tlb flushes, tlb misses and interrupts. For the test, I picked up one of the most popular and heavy workload, llama.cpp that is a LLM(Large Language Model) inference engine. The result would depend on memory latency and how often reclaim runs, which implies tlb miss overhead and how many times unmapping happens. In my system, the result shows: 1. tlb shootdown interrupts are reduced about 97%. 2. The test program runtime is reduced about 4.5%. The test environment and the result is like: Machine: bare metal, x86_64, Intel(R) Xeon(R) Gold 6430 CPU: 1 socket 64 core with hyper thread on Numa: 2 nodes (64 CPUs DRAM 42GB, no CPUs CXL expander 98GB) Config: swap off, numa balancing tiering on, demotion enabled The test set: llama.cpp/main -m $(70G_model1) -p "who are you?" -s 1 -t 15 -n 20 & llama.cpp/main -m $(70G_model2) -p "who are you?" -s 1 -t 15 -n 20 & llama.cpp/main -m $(70G_model3) -p "who are you?" -s 1 -t 15 -n 20 & wait where -t: nr of threads, -s: seed used to make the runtime stable, -n: nr of tokens that determines the runtime, -p: prompt to ask, -m: LLM model to use. 
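The per-run procedure described next can be expressed as a small driver script;
this is only a sketch, with MODEL1..MODEL3 standing in for the three 70G model
files above:

#!/bin/sh
# Sketch of the benchmark driver; MODEL1..MODEL3 are placeholders.
for run in 1 2 3 4 5; do
	# Drop the page cache so every run starts from the same state.
	echo 3 > /proc/sys/vm/drop_caches

	llama.cpp/main -m "$MODEL1" -p "who are you?" -s 1 -t 15 -n 20 &
	llama.cpp/main -m "$MODEL2" -p "who are you?" -s 1 -t 15 -n 20 &
	llama.cpp/main -m "$MODEL3" -p "who are you?" -s 1 -t 15 -n 20 &
	wait
done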
Run the test set 5 times successively with caches dropped every run via
'echo 3 > /proc/sys/vm/drop_caches'. Each inference prints its runtime
at the end of its run.

1. Runtime from the output of llama.cpp:

BEFORE
------
llama_print_timings: total time = 883450.54 ms / 24 tokens
llama_print_timings: total time = 861665.91 ms / 24 tokens
llama_print_timings: total time = 898079.02 ms / 24 tokens
llama_print_timings: total time = 879897.69 ms / 24 tokens
llama_print_timings: total time = 892360.75 ms / 24 tokens
llama_print_timings: total time = 884587.85 ms / 24 tokens
llama_print_timings: total time = 861023.19 ms / 24 tokens
llama_print_timings: total time = 900022.18 ms / 24 tokens
llama_print_timings: total time = 878771.88 ms / 24 tokens
llama_print_timings: total time = 889027.98 ms / 24 tokens
llama_print_timings: total time = 880783.90 ms / 24 tokens
llama_print_timings: total time = 856475.29 ms / 24 tokens
llama_print_timings: total time = 896842.21 ms / 24 tokens
llama_print_timings: total time = 878883.53 ms / 24 tokens
llama_print_timings: total time = 890122.10 ms / 24 tokens

AFTER
-----
llama_print_timings: total time = 871060.86 ms / 24 tokens
llama_print_timings: total time = 825609.53 ms / 24 tokens
llama_print_timings: total time = 836854.81 ms / 24 tokens
llama_print_timings: total time = 843147.99 ms / 24 tokens
llama_print_timings: total time = 831426.65 ms / 24 tokens
llama_print_timings: total time = 873939.23 ms / 24 tokens
llama_print_timings: total time = 826127.69 ms / 24 tokens
llama_print_timings: total time = 835489.26 ms / 24 tokens
llama_print_timings: total time = 842589.62 ms / 24 tokens
llama_print_timings: total time = 833700.66 ms / 24 tokens
llama_print_timings: total time = 875996.19 ms / 24 tokens
llama_print_timings: total time = 826401.73 ms / 24 tokens
llama_print_timings: total time = 839341.28 ms / 24 tokens
llama_print_timings: total time = 841075.10 ms / 24 tokens
llama_print_timings: total time = 835136.41 ms / 24 tokens

2.
tlb shootdowns from 'cat /proc/interrupts': BEFORE ------ TLB: 80911532 93691786 100296251 111062810 109769109 109862429 108968588 119175230 115779676 118377498 119325266 120300143 124514185 116697222 121068466 118031913 122660681 117494403 121819907 116960596 120936335 117217061 118630217 122322724 119595577 111693298 119232201 120030377 115334687 113179982 118808254 116353592 140987367 137095516 131724276 139742240 136501150 130428761 127585535 132483981 133430250 133756207 131786710 126365824 129812539 133850040 131742690 125142213 128572830 132234350 131945922 128417707 133355434 129972846 126331823 134050849 133991626 121129038 124637283 132830916 126875507 122322440 125776487 124340278 TLB shootdowns AFTER ----- TLB: 2121206 2615108 2983494 2911950 3055086 3092672 3204894 3346082 3286744 3307310 3357296 3315940 3428034 3112596 3143325 3185551 3186493 3322314 3330523 3339663 3156064 3272070 3296309 3198962 3332662 3315870 3234467 3353240 3281234 3300666 3345452 3173097 4009196 3932215 3898735 3726531 3717982 3671726 3728788 3724613 3799147 3691764 3620630 3684655 3666688 3393974 3448651 3487593 3446357 3618418 3671920 3712949 3575264 3715385 3641513 3630897 3691047 3630690 3504933 3662647 3629926 3443044 3832970 3548813 TLB shootdowns Signed-off-by: Byungchul Park --- include/linux/fs.h | 6 + include/linux/mm_types.h | 8 + include/linux/sched.h | 9 ++ mm/compaction.c | 2 +- mm/internal.h | 42 +++++- mm/memory.c | 39 ++++- mm/page_alloc.c | 17 ++- mm/rmap.c | 315 ++++++++++++++++++++++++++++++++++++++- 8 files changed, 420 insertions(+), 18 deletions(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index 0283cf366c2a..03683bf66031 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2872,6 +2872,12 @@ static inline void file_end_write(struct file *file) if (!S_ISREG(file_inode(file)->i_mode)) return; sb_end_write(file_inode(file)->i_sb); + + /* + * XXX: If needed, can be optimized by avoiding luf_flush() if + * the address space of the file has never been involved by luf. 
+ */ + luf_flush(); } /** diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 37eb3000267c..cd52c996e8aa 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -1223,6 +1223,14 @@ static inline unsigned int mm_cid_size(void) } #endif /* CONFIG_SCHED_MM_CID */ +#if defined(CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH) +void check_luf_flush(unsigned short int ugen); +void luf_flush(void); +#else +static inline void check_luf_flush(unsigned short int ugen) {} +static inline void luf_flush(void) {} +#endif + struct mmu_gather; extern void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm); extern void tlb_gather_mmu_fullmm(struct mmu_gather *tlb, struct mm_struct *mm); diff --git a/include/linux/sched.h b/include/linux/sched.h index d9722c014157..613ed175e5f2 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1341,8 +1341,17 @@ struct task_struct { struct tlbflush_unmap_batch tlb_ubc; struct tlbflush_unmap_batch tlb_ubc_ro; + struct tlbflush_unmap_batch tlb_ubc_luf; unsigned short int ugen; +#if defined(CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH) + /* + * whether all the mappings of a folio during unmap are read-only + * so that luf can work on the folio + */ + bool can_luf; +#endif + /* Cache last used pipe for splice(): */ struct pipe_inode_info *splice_pipe; diff --git a/mm/compaction.c b/mm/compaction.c index 13799fbb2a9a..4a75c56af0b0 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -1925,7 +1925,7 @@ static void compaction_free(struct folio *dst, unsigned long data) struct page *page = &dst->page; if (folio_put_testzero(dst)) { - free_pages_prepare(page, order); + free_pages_prepare(page, order, NULL); list_add(&dst->lru, &cc->freepages[order]); cc->nr_freepages += 1 << order; } diff --git a/mm/internal.h b/mm/internal.h index ca6fb5b2a640..b3d7a5e5f7e3 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -657,7 +657,8 @@ extern void prep_compound_page(struct page *page, unsigned int order); extern void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags); -extern bool free_pages_prepare(struct page *page, unsigned int order); +extern bool free_pages_prepare(struct page *page, unsigned int order, + unsigned short int *ugen); extern int user_min_free_kbytes; @@ -1541,6 +1542,36 @@ void workingset_update_node(struct xa_node *node); extern struct list_lru shadow_nodes; #if defined(CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH) +unsigned short int try_to_unmap_luf(void); + +/* + * Reset the indicator indicating there are no writable mappings at the + * beginning of every rmap traverse for unmap. luf can work only when + * all the mappings are read-only. + */ +static inline void can_luf_init(void) +{ + current->can_luf = true; +} + +/* + * Mark the folio is not applicable to luf once it found a writble or + * dirty pte during rmap traverse for unmap. + */ +static inline void can_luf_fail(void) +{ + current->can_luf = false; +} + +/* + * Check if all the mappings are read-only and read-only mappings even + * exist. + */ +static inline bool can_luf_test(void) +{ + return current->can_luf && current->tlb_ubc_ro.flush_required; +} + static inline unsigned short int ugen_latest(unsigned short int a, unsigned short int b) { if (!a || !b) @@ -1570,10 +1601,7 @@ static inline unsigned short int hand_over_task_ugen(void) static inline void check_flush_task_ugen(void) { - /* - * XXX: luf mechanism will handle this. For now, do nothing but - * reset current's ugen to finalize this turn. 
- */ + check_luf_flush(current->ugen); current->ugen = 0; } @@ -1602,6 +1630,10 @@ static inline bool can_luf_folio(struct folio *f) return can_luf; } #else /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */ +static inline unsigned short int try_to_unmap_luf(void) { return 0; } +static inline void can_luf_init(void) {} +static inline void can_luf_fail(void) {} +static inline bool can_luf_test(void) { return false; } static inline unsigned short int ugen_latest(unsigned short int a, unsigned short int b) { return 0; } static inline void update_task_ugen(unsigned short int ugen) {} static inline unsigned short int hand_over_task_ugen(void) { return 0; } diff --git a/mm/memory.c b/mm/memory.c index 100f54fc9e6c..12c9e87e489d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3011,6 +3011,15 @@ static inline int pte_unmap_same(struct vm_fault *vmf) return same; } +static bool need_luf_flush(struct vm_fault *vmf) +{ + if ((vmf->flags & FAULT_FLAG_ORIG_PTE_VALID) && + pte_write(vmf->orig_pte)) + return false; + + return pte_write(ptep_get(vmf->pte)); +} + /* * Return: * 0: copied succeeded @@ -3026,6 +3035,7 @@ static inline int __wp_page_copy_user(struct page *dst, struct page *src, struct vm_area_struct *vma = vmf->vma; struct mm_struct *mm = vma->vm_mm; unsigned long addr = vmf->address; + bool luf = false; if (likely(src)) { if (copy_mc_user_highpage(dst, src, addr, vma)) { @@ -3059,8 +3069,10 @@ static inline int __wp_page_copy_user(struct page *dst, struct page *src, * Other thread has already handled the fault * and update local tlb only */ - if (vmf->pte) + if (vmf->pte) { update_mmu_tlb(vma, addr, vmf->pte); + luf = need_luf_flush(vmf); + } ret = -EAGAIN; goto pte_unlock; } @@ -3084,8 +3096,10 @@ static inline int __wp_page_copy_user(struct page *dst, struct page *src, vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl); if (unlikely(!vmf->pte || !pte_same(ptep_get(vmf->pte), vmf->orig_pte))) { /* The PTE changed under us, update local tlb */ - if (vmf->pte) + if (vmf->pte) { update_mmu_tlb(vma, addr, vmf->pte); + luf = need_luf_flush(vmf); + } ret = -EAGAIN; goto pte_unlock; } @@ -3112,6 +3126,8 @@ static inline int __wp_page_copy_user(struct page *dst, struct page *src, pte_unmap_unlock(vmf->pte, vmf->ptl); pagefault_enable(); kunmap_local(kaddr); + if (luf) + luf_flush(); flush_dcache_page(dst); return ret; @@ -3446,6 +3462,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) } else if (vmf->pte) { update_mmu_tlb(vma, vmf->address, vmf->pte); pte_unmap_unlock(vmf->pte, vmf->ptl); + if (need_luf_flush(vmf)) + luf_flush(); } mmu_notifier_invalidate_range_end(&range); @@ -3501,6 +3519,8 @@ static vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf, struct folio *folio if (!pte_same(ptep_get(vmf->pte), vmf->orig_pte)) { update_mmu_tlb(vmf->vma, vmf->address, vmf->pte); pte_unmap_unlock(vmf->pte, vmf->ptl); + if (need_luf_flush(vmf)) + luf_flush(); return VM_FAULT_NOPAGE; } wp_page_reuse(vmf, folio); @@ -4469,6 +4489,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) vm_fault_t ret = 0; int nr_pages = 1; pte_t entry; + bool luf = false; /* File mapping without ->vm_ops ? 
*/ if (vma->vm_flags & VM_SHARED) @@ -4492,6 +4513,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) goto unlock; if (vmf_pte_changed(vmf)) { update_mmu_tlb(vma, vmf->address, vmf->pte); + luf = need_luf_flush(vmf); goto unlock; } ret = check_stable_address_space(vma->vm_mm); @@ -4536,9 +4558,11 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) goto release; if (nr_pages == 1 && vmf_pte_changed(vmf)) { update_mmu_tlb(vma, addr, vmf->pte); + luf = need_luf_flush(vmf); goto release; } else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) { update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages); + luf = need_luf_flush(vmf); goto release; } @@ -4570,6 +4594,8 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) unlock: if (vmf->pte) pte_unmap_unlock(vmf->pte, vmf->ptl); + if (luf) + luf_flush(); return ret; release: folio_put(folio); @@ -4796,6 +4822,7 @@ vm_fault_t finish_fault(struct vm_fault *vmf) vm_fault_t ret; bool is_cow = (vmf->flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED); + bool luf = false; /* Did we COW the page? */ if (is_cow) @@ -4841,10 +4868,14 @@ vm_fault_t finish_fault(struct vm_fault *vmf) ret = 0; } else { update_mmu_tlb(vma, vmf->address, vmf->pte); + luf = need_luf_flush(vmf); ret = VM_FAULT_NOPAGE; } pte_unmap_unlock(vmf->pte, vmf->ptl); + + if (luf) + luf_flush(); return ret; } @@ -5397,6 +5428,7 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud) static vm_fault_t handle_pte_fault(struct vm_fault *vmf) { pte_t entry; + bool luf = false; if (unlikely(pmd_none(*vmf->pmd))) { /* @@ -5440,6 +5472,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) entry = vmf->orig_pte; if (unlikely(!pte_same(ptep_get(vmf->pte), entry))) { update_mmu_tlb(vmf->vma, vmf->address, vmf->pte); + luf = need_luf_flush(vmf); goto unlock; } if (vmf->flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) { @@ -5469,6 +5502,8 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) } unlock: pte_unmap_unlock(vmf->pte, vmf->ptl); + if (luf) + luf_flush(); return 0; } diff --git a/mm/page_alloc.c b/mm/page_alloc.c index c9acb4da91e0..4007c9757c3f 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1048,7 +1048,7 @@ void kernel_init_pages(struct page *page, int numpages) } __always_inline bool free_pages_prepare(struct page *page, - unsigned int order) + unsigned int order, unsigned short int *ugen) { int bad = 0; bool skip_kasan_poison = should_skip_kasan_poison(page); @@ -1062,6 +1062,15 @@ __always_inline bool free_pages_prepare(struct page *page, */ set_page_private(page, 0); + /* + * The contents of the pages will be updated for some reasons. + * So we should give up luf. 
+ */ + if ((!skip_kasan_poison || init) && ugen && *ugen) { + check_luf_flush(*ugen); + *ugen = 0; + } + trace_mm_page_free(page, order); kmsan_free_page(page, order); @@ -1236,7 +1245,7 @@ static void __free_pages_ok(struct page *page, unsigned int order, unsigned long pfn = page_to_pfn(page); struct zone *zone = page_zone(page); - if (!free_pages_prepare(page, order)) + if (!free_pages_prepare(page, order, NULL)) return; free_one_page(zone, page, pfn, order, fpi_flags, 0); @@ -2664,7 +2673,7 @@ void free_unref_page(struct page *page, unsigned int order, return; } - if (!free_pages_prepare(page, order)) + if (!free_pages_prepare(page, order, &ugen)) return; /* @@ -2712,7 +2721,7 @@ void free_unref_folios(struct folio_batch *folios, unsigned short int ugen) unsigned int order = folio_order(folio); folio_undo_large_rmappable(folio); - if (!free_pages_prepare(&folio->page, order)) + if (!free_pages_prepare(&folio->page, order, &ugen)) continue; /* * Free orders not handled on the PCP directly to the diff --git a/mm/rmap.c b/mm/rmap.c index 1a246788e867..459d4d1631f0 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -634,6 +634,274 @@ struct anon_vma *folio_lock_anon_vma_read(struct folio *folio, } #ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH +static struct tlbflush_unmap_batch luf_ubc; +static DEFINE_SPINLOCK(luf_lock); + +/* + * Don't be zero to distinguish from invalid ugen, 0. + */ +static unsigned short int ugen_next(unsigned short int a) +{ + return a + 1 ?: a + 2; +} + +static bool ugen_before(unsigned short int a, unsigned short int b) +{ + return (short int)(a - b) < 0; +} + +/* + * Need to synchronize between tlb flush and managing pending CPUs in + * luf_ubc. Take a look at the following scenario, where CPU0 is in + * try_to_unmap_flush() and CPU1 is in migrate_pages_batch(): + * + * CPU0 CPU1 + * ---- ---- + * tlb flush + * unmap folios (needing tlb flush) + * add pending CPUs to luf_ubc + * <-- not performed tlb flush needed by + * the unmap above yet but the request + * will be cleared by CPU0 shortly. bug! + * clear the CPUs from luf_ubc + * + * The pending CPUs added in CPU1 should not be cleared from luf_ubc + * in CPU0 because the tlb flush for luf_ubc added in CPU1 has not + * been performed this turn. To avoid this, using 'on_flushing' + * variable, prevent adding pending CPUs to luf_ubc and give up luf + * mechanism if someone is in the middle of tlb flush, like: + * + * CPU0 CPU1 + * ---- ---- + * on_flushing++ + * tlb flush + * unmap folios (needing tlb flush) + * if on_flushing == 0: + * add pending CPUs to luf_ubc + * else: <-- hit + * give up luf mechanism + * clear the CPUs from luf_ubc + * on_flushing-- + * + * Only the following case would be allowed for luf mechanism to work: + * + * CPU0 CPU1 + * ---- ---- + * unmap folios (needing tlb flush) + * if on_flushing == 0: <-- hit + * add pending CPUs to luf_ubc + * else: + * give up luf mechanism + * on_flushing++ + * tlb flush + * clear the CPUs from luf_ubc + * on_flushing-- + */ +static int on_flushing; + +/* + * When more than one thread enter check_luf_flush() at the same + * time, each should wait for the request on progress to be done to + * avoid the following scenario, where the both CPUs are in + * check_luf_flush(): + * + * CPU0 CPU1 + * ---- ---- + * if !luf_ubc.flush_required: + * return + * luf_ubc.flush_required = false + * if !luf_ubc.flush_requied: <-- hit + * return <-- not performed tlb flush + * needed yet but return. bug! 
+ * luf_ubc.flush_required = false + * try_to_unmap_flush() + * finalize + * try_to_unmap_flush() <-- performs tlb flush needed + * finalize + * + * So it should be handled: + * + * CPU0 CPU1 + * ---- ---- + * atomically execute { + * if luf_on_flushing: + * wait for the completion + * return + * if !luf_ubc.flush_required: + * return + * luf_ubc.flush_required = false + * luf_on_flushing = true + * } + * atomically execute { + * if luf_on_flushing: <-- hit + * wait for the completion + * return <-- tlb flush needed is done + * if !luf_ubc.flush_requied: + * return + * luf_ubc.flush_required = false + * luf_on_flushing = true + * } + * + * try_to_unmap_flush() + * luf_on_flushing = false + * finalize + * try_to_unmap_flush() <-- performs tlb flush needed + * luf_on_flushing = false + * finalize + */ +static bool luf_on_flushing; + +/* + * Generation number for the current request of deferred tlb flush. + */ +static unsigned short int luf_gen; + +/* + * Generation number for the next request. + */ +static unsigned short int luf_gen_next = 1; + +/* + * Generation number for the latest request handled. + */ +static unsigned short int luf_gen_done; + +unsigned short int try_to_unmap_luf(void) +{ + struct tlbflush_unmap_batch *tlb_ubc = ¤t->tlb_ubc; + struct tlbflush_unmap_batch *tlb_ubc_luf = ¤t->tlb_ubc_luf; + unsigned long flags; + unsigned short int ugen; + + if (!spin_trylock_irqsave(&luf_lock, flags)) { + /* + * Give up luf mechanism. Just let tlb flush needed + * handled by try_to_unmap_flush() at the caller side. + */ + fold_ubc(tlb_ubc, tlb_ubc_luf); + return 0; + } + + if (on_flushing || luf_on_flushing) { + spin_unlock_irqrestore(&luf_lock, flags); + + /* + * Give up luf mechanism. Just let tlb flush needed + * handled by try_to_unmap_flush() at the caller side. + */ + fold_ubc(tlb_ubc, tlb_ubc_luf); + return 0; + } + + fold_ubc(&luf_ubc, tlb_ubc_luf); + ugen = luf_gen = luf_gen_next; + spin_unlock_irqrestore(&luf_lock, flags); + + return ugen; +} + +static bool rmap_flush_start(void) +{ + unsigned long flags; + + if (!spin_trylock_irqsave(&luf_lock, flags)) + return false; + + on_flushing++; + spin_unlock_irqrestore(&luf_lock, flags); + return true; +} + +static void rmap_flush_end(struct tlbflush_unmap_batch *batch) +{ + unsigned long flags; + + spin_lock_irqsave(&luf_lock, flags); + if (arch_tlbbatch_done(&luf_ubc.arch, &batch->arch)) { + luf_ubc.flush_required = false; + luf_ubc.writable = false; + } + on_flushing--; + spin_unlock_irqrestore(&luf_lock, flags); +} + +/* + * It must be guaranteed to have completed tlb flush requested on return. + */ +void check_luf_flush(unsigned short int ugen) +{ + struct tlbflush_unmap_batch *tlb_ubc = ¤t->tlb_ubc; + unsigned long flags; + + /* + * Nothing has been requested. We are done. + */ + if (!ugen) + return; +retry: + /* + * We can see a larger value than or equal to luf_gen_done, + * which means the tlb flush we need has been done. + */ + if (!ugen_before(READ_ONCE(luf_gen_done), ugen)) + return; + + spin_lock_irqsave(&luf_lock, flags); + + /* + * With luf_lock held, we might read luf_gen_done updated. + */ + if (ugen_next(luf_gen_done) != ugen) { + spin_unlock_irqrestore(&luf_lock, flags); + return; + } + + /* + * Others are already working for us. 
+ */ + if (luf_on_flushing) { + spin_unlock_irqrestore(&luf_lock, flags); + goto retry; + } + + if (!luf_ubc.flush_required) { + spin_unlock_irqrestore(&luf_lock, flags); + return; + } + + fold_ubc(tlb_ubc, &luf_ubc); + luf_gen_next = ugen_next(luf_gen); + luf_on_flushing = true; + spin_unlock_irqrestore(&luf_lock, flags); + + try_to_unmap_flush(); + + spin_lock_irqsave(&luf_lock, flags); + luf_on_flushing = false; + + /* + * luf_gen_done can be read by another with luf_lock not + * held so use WRITE_ONCE() to prevent tearing. + */ + WRITE_ONCE(luf_gen_done, ugen); + spin_unlock_irqrestore(&luf_lock, flags); +} + +void luf_flush(void) +{ + unsigned long flags; + unsigned short int ugen; + + /* + * Obtain the latest ugen number. + */ + spin_lock_irqsave(&luf_lock, flags); + ugen = luf_gen; + spin_unlock_irqrestore(&luf_lock, flags); + + check_luf_flush(ugen); +} +EXPORT_SYMBOL(luf_flush); void fold_ubc(struct tlbflush_unmap_batch *dst, struct tlbflush_unmap_batch *src) @@ -665,13 +933,18 @@ void fold_ubc(struct tlbflush_unmap_batch *dst, void try_to_unmap_flush(void) { struct tlbflush_unmap_batch *tlb_ubc = ¤t->tlb_ubc; - struct tlbflush_unmap_batch *tlb_ubc_ro = ¤t->tlb_ubc_ro; + struct tlbflush_unmap_batch *tlb_ubc_luf = ¤t->tlb_ubc_luf; + bool started; - fold_ubc(tlb_ubc, tlb_ubc_ro); + fold_ubc(tlb_ubc, tlb_ubc_luf); if (!tlb_ubc->flush_required) return; + started = rmap_flush_start(); arch_tlbbatch_flush(&tlb_ubc->arch); + if (started) + rmap_flush_end(tlb_ubc); + arch_tlbbatch_clear(&tlb_ubc->arch); tlb_ubc->flush_required = false; tlb_ubc->writable = false; @@ -681,9 +954,9 @@ void try_to_unmap_flush(void) void try_to_unmap_flush_dirty(void) { struct tlbflush_unmap_batch *tlb_ubc = ¤t->tlb_ubc; - struct tlbflush_unmap_batch *tlb_ubc_ro = ¤t->tlb_ubc_ro; + struct tlbflush_unmap_batch *tlb_ubc_luf = ¤t->tlb_ubc_luf; - if (tlb_ubc->writable || tlb_ubc_ro->writable) + if (tlb_ubc->writable || tlb_ubc_luf->writable) try_to_unmap_flush(); } @@ -707,9 +980,15 @@ static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval, if (!pte_accessible(mm, pteval)) return; - if (pte_write(pteval)) + if (pte_write(pteval)) { tlb_ubc = ¤t->tlb_ubc; - else + + /* + * luf cannot work with the folio once it found a + * writable or dirty mapping on it. 
+ */ + can_luf_fail(); + } else tlb_ubc = &current->tlb_ubc_ro; arch_tlbbatch_add_pending(&tlb_ubc->arch, mm, uaddr); @@ -2004,11 +2283,23 @@ void try_to_unmap(struct folio *folio, enum ttu_flags flags) .done = folio_not_mapped, .anon_lock = folio_lock_anon_vma_read, }; + struct tlbflush_unmap_batch *tlb_ubc = &current->tlb_ubc; + struct tlbflush_unmap_batch *tlb_ubc_ro = &current->tlb_ubc_ro; + struct tlbflush_unmap_batch *tlb_ubc_luf = &current->tlb_ubc_luf; + bool can_luf; + + can_luf_init(); if (flags & TTU_RMAP_LOCKED) rmap_walk_locked(folio, &rwc); else rmap_walk(folio, &rwc); + + can_luf = can_luf_folio(folio) && can_luf_test(); + if (can_luf) + fold_ubc(tlb_ubc_luf, tlb_ubc_ro); + else + fold_ubc(tlb_ubc, tlb_ubc_ro); } /* @@ -2353,6 +2644,10 @@ void try_to_migrate(struct folio *folio, enum ttu_flags flags) .done = folio_not_mapped, .anon_lock = folio_lock_anon_vma_read, }; + struct tlbflush_unmap_batch *tlb_ubc = &current->tlb_ubc; + struct tlbflush_unmap_batch *tlb_ubc_ro = &current->tlb_ubc_ro; + struct tlbflush_unmap_batch *tlb_ubc_luf = &current->tlb_ubc_luf; + bool can_luf; /* * Migration always ignores mlock and only supports TTU_RMAP_LOCKED and @@ -2377,10 +2672,18 @@ void try_to_migrate(struct folio *folio, enum ttu_flags flags) if (!folio_test_ksm(folio) && folio_test_anon(folio)) rwc.invalid_vma = invalid_migration_vma; + can_luf_init(); + if (flags & TTU_RMAP_LOCKED) rmap_walk_locked(folio, &rwc); else rmap_walk(folio, &rwc); + + can_luf = can_luf_folio(folio) && can_luf_test(); + if (can_luf) + fold_ubc(tlb_ubc_luf, tlb_ubc_ro); + else + fold_ubc(tlb_ubc, tlb_ubc_ro); } #ifdef CONFIG_DEVICE_PRIVATE
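The generation counters used above are 16 bits wide and are expected to wrap, which is why ugen_before() orders two values by their signed 16-bit difference rather than with a plain comparison, and why ugen_next() skips 0 so that 0 can stay reserved for "no request". A minimal user-space sketch, not kernel code, of the wrap-safe ordering, assuming the usual two's-complement narrowing the kernel builds with:

/*
 * Stand-alone illustration of the wrap-safe ordering used by
 * ugen_before() above; the assertions show that a value issued just
 * after the 0xffff wrap still compares as "later" than one issued
 * just before it.
 */
#include <assert.h>
#include <stdbool.h>

static bool ugen_before(unsigned short a, unsigned short b)
{
	return (short)(a - b) < 0;
}

int main(void)
{
	assert(ugen_before(10, 20));		/* plain case            */
	assert(!ugen_before(20, 10));
	assert(ugen_before(0xfff0, 0x0010));	/* issued before the wrap */
	assert(!ugen_before(0x0010, 0xfff0));	/* issued after the wrap  */
	return 0;
}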
From patchwork Fri May 31 09:19:59 2024
From: Byungchul Park
Subject: [PATCH v11 10/12] mm: separate move/undo parts from migrate_pages_batch()
Date: Fri, 31 May 2024 18:19:59 +0900
Message-Id: <20240531092001.30428-11-byungchul@sk.com>
In-Reply-To: <20240531092001.30428-1-byungchul@sk.com>
References: <20240531092001.30428-1-byungchul@sk.com>
X-HE-Meta:
U2FsdGVkX19BbZP45MGsou333lyfzc8iZeWlJ1utupD6b2+X9f0YwAOrB3fNkmNzXzvbDVXgP2h3HThNp9K2SrVuLufyVrND6wZs0aRE8h2+AKELYB65Lw0Nk6mCL07MY4VR8hKbZOYFMf42Idi1d+qRsLKSV98bOxlVgCg2lkCP/JXe3GXUjcBN5NK+WSWPaQDXetEALmAUL7Ny8iEGTJ2bpZe5NSEfQsOT8imDGaWxfcx5XwGHn1U/hS7RaOMPJrh+veMbfdUMOTCYK/Kvr8MhnvGbebamDkoe6qgT6ALBXrzj9UuaDX/KhlMAe03D+vdQGGFIVShF1VPrw63tlgEn8DD4P7T2GRPMEbWkoIytUIjYBncy2JkP9SY7Vj9UMFXWhyXYyypGGTPL3kKgQ8kyBf/sdiqit3Z0AUATRmiXGtHNv3Bco25Z89ofJbbiKaTlqBzgtGiPvrNhhgKGEWan4NTOBGKJ9mvgSdFwjhl7Ep8mT+USzxbxC58y1SF/Lp/W5luAF9yRJYzONAweCekTPh9kLa237U8wM16He2B88EqxCrepc6h1AxDw23UcMDcHZTJYB2CLj4EPAVGQvmwyTlNAsfz8MUTcwZKFnr5bRajwMg/NRCI6SCfJLXDvRV252FWbS9eqmcYAYvTKPax05SYvaZEPgHEIKGpfidqPO2hTYRDD5klm5lyIUX+tHvqiMMpaF3/bYs9egdcCZaU9FW5jHo+fmJSPoQnIOniVnPdTwa1tIYLcfDQJcQNe4/7mlbcSC4CvJ4Nx/GjGAhd9cDWggVZatZoYi7AL6Y3zGJ04O6b0JxGKnE/yZsUrLwljpmy1ptQJ5vnGhJAHxGnygypkqtpuTDeCJaPGIMHjuF9OURutp7oTzBE4g+sNHDuq7KshxNuwpq01Cih/TNnF6O1qAOrCtbbPgJKdci9V2LpcjiDyNF6ZNRXVO2GmE75Bc5Z2WZCNad/viaj LoDhcCbt fMIHl2yEvH79X4evzDw2toWf9LaCpSlMnVd20 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Functionally, no change. This is a preparation for luf mechanism that requires to use separated folio lists for its own handling during migration. Refactored migrate_pages_batch() so as to separate move/undo parts from migrate_pages_batch(). Signed-off-by: Byungchul Park --- mm/migrate.c | 134 +++++++++++++++++++++++++++++++-------------------- 1 file changed, 83 insertions(+), 51 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index e04b451c4289..6c22a1402923 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1584,6 +1584,81 @@ static int migrate_hugetlbs(struct list_head *from, new_folio_t get_new_folio, return nr_failed; } +static void migrate_folios_move(struct list_head *src_folios, + struct list_head *dst_folios, + free_folio_t put_new_folio, unsigned long private, + enum migrate_mode mode, int reason, + struct list_head *ret_folios, + struct migrate_pages_stats *stats, + int *retry, int *thp_retry, int *nr_failed, + int *nr_retry_pages) +{ + struct folio *folio, *folio2, *dst, *dst2; + bool is_thp; + int nr_pages; + int rc; + + dst = list_first_entry(dst_folios, struct folio, lru); + dst2 = list_next_entry(dst, lru); + list_for_each_entry_safe(folio, folio2, src_folios, lru) { + is_thp = folio_test_large(folio) && folio_test_pmd_mappable(folio); + nr_pages = folio_nr_pages(folio); + + cond_resched(); + + rc = migrate_folio_move(put_new_folio, private, + folio, dst, mode, + reason, ret_folios); + /* + * The rules are: + * Success: folio will be freed + * -EAGAIN: stay on the unmap_folios list + * Other errno: put on ret_folios list + */ + switch(rc) { + case -EAGAIN: + *retry += 1; + *thp_retry += is_thp; + *nr_retry_pages += nr_pages; + break; + case MIGRATEPAGE_SUCCESS: + stats->nr_succeeded += nr_pages; + stats->nr_thp_succeeded += is_thp; + break; + default: + *nr_failed += 1; + stats->nr_thp_failed += is_thp; + stats->nr_failed_pages += nr_pages; + break; + } + dst = dst2; + dst2 = list_next_entry(dst, lru); + } +} + +static void migrate_folios_undo(struct list_head *src_folios, + struct list_head *dst_folios, + free_folio_t put_new_folio, unsigned long private, + struct list_head *ret_folios) +{ + struct folio *folio, *folio2, *dst, *dst2; + + dst = list_first_entry(dst_folios, struct folio, lru); + dst2 = list_next_entry(dst, lru); + list_for_each_entry_safe(folio, folio2, 
src_folios, lru) { + int old_page_state = 0; + struct anon_vma *anon_vma = NULL; + + __migrate_folio_extract(dst, &old_page_state, &anon_vma); + migrate_folio_undo_src(folio, old_page_state & PAGE_WAS_MAPPED, + anon_vma, true, ret_folios); + list_del(&dst->lru); + migrate_folio_undo_dst(dst, true, put_new_folio, private); + dst = dst2; + dst2 = list_next_entry(dst, lru); + } +} + /* * migrate_pages_batch() first unmaps folios in the from list as many as * possible, then move the unmapped folios. @@ -1606,7 +1681,7 @@ static int migrate_pages_batch(struct list_head *from, int pass = 0; bool is_thp = false; bool is_large = false; - struct folio *folio, *folio2, *dst = NULL, *dst2; + struct folio *folio, *folio2, *dst = NULL; int rc, rc_saved = 0, nr_pages; LIST_HEAD(unmap_folios); LIST_HEAD(dst_folios); @@ -1765,42 +1840,11 @@ static int migrate_pages_batch(struct list_head *from, thp_retry = 0; nr_retry_pages = 0; - dst = list_first_entry(&dst_folios, struct folio, lru); - dst2 = list_next_entry(dst, lru); - list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) { - is_thp = folio_test_large(folio) && folio_test_pmd_mappable(folio); - nr_pages = folio_nr_pages(folio); - - cond_resched(); - - rc = migrate_folio_move(put_new_folio, private, - folio, dst, mode, - reason, ret_folios); - /* - * The rules are: - * Success: folio will be freed - * -EAGAIN: stay on the unmap_folios list - * Other errno: put on ret_folios list - */ - switch(rc) { - case -EAGAIN: - retry++; - thp_retry += is_thp; - nr_retry_pages += nr_pages; - break; - case MIGRATEPAGE_SUCCESS: - stats->nr_succeeded += nr_pages; - stats->nr_thp_succeeded += is_thp; - break; - default: - nr_failed++; - stats->nr_thp_failed += is_thp; - stats->nr_failed_pages += nr_pages; - break; - } - dst = dst2; - dst2 = list_next_entry(dst, lru); - } + /* Move the unmapped folios */ + migrate_folios_move(&unmap_folios, &dst_folios, + put_new_folio, private, mode, reason, + ret_folios, stats, &retry, &thp_retry, + &nr_failed, &nr_retry_pages); } nr_failed += retry; stats->nr_thp_failed += thp_retry; @@ -1809,20 +1853,8 @@ static int migrate_pages_batch(struct list_head *from, rc = rc_saved ? 
: nr_failed; out: /* Cleanup remaining folios */ - dst = list_first_entry(&dst_folios, struct folio, lru); - dst2 = list_next_entry(dst, lru); - list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) { - int old_page_state = 0; - struct anon_vma *anon_vma = NULL; - - __migrate_folio_extract(dst, &old_page_state, &anon_vma); - migrate_folio_undo_src(folio, old_page_state & PAGE_WAS_MAPPED, - anon_vma, true, ret_folios); - list_del(&dst->lru); - migrate_folio_undo_dst(dst, true, put_new_folio, private); - dst = dst2; - dst2 = list_next_entry(dst, lru); - } + migrate_folios_undo(&unmap_folios, &dst_folios, + put_new_folio, private, ret_folios); return rc; }
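The two helpers factored out above rely on an ordering invariant that is easy to miss in the diff: unmap_folios and dst_folios are filled in matching order, so the destination list can be advanced by hand while list_for_each_entry_safe() walks the source list. A stripped-down user-space sketch of that lockstep walk, with plain singly linked lists standing in for the kernel's folio lists and purely illustrative names:

#include <stdio.h>

struct node {
	int id;
	struct node *next;
};

/* Walk src and dst together; both lists must be in matching order. */
static void walk_pairs(struct node *src, struct node *dst)
{
	struct node *s;

	for (s = src; s; s = s->next, dst = dst->next)
		printf("move folio %d into dst %d\n", s->id, dst->id);
}

int main(void)
{
	struct node d1 = { 101, NULL }, d0 = { 100, &d1 };
	struct node s1 = { 1, NULL }, s0 = { 0, &s1 };

	walk_pairs(&s0, &d0);	/* 0 -> 100, 1 -> 101 */
	return 0;
}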
From patchwork Fri May 31 09:20:00 2024
From: Byungchul Park
Subject: [PATCH v11 11/12] mm, migrate: apply luf mechanism to unmapping during migration
Date: Fri, 31 May 2024 18:20:00 +0900
Message-Id: <20240531092001.30428-12-byungchul@sk.com>
In-Reply-To: <20240531092001.30428-1-byungchul@sk.com>
References: <20240531092001.30428-1-byungchul@sk.com>
Sender:
owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: A new mechanism, LUF(Lazy Unmap Flush), defers tlb flush until folios that have been unmapped and freed, eventually get allocated again. It's safe for folios that had been mapped read only and were unmapped, since the contents of the folios don't change while staying in pcp or buddy so we can still read the data through the stale tlb entries. Applied the mechanism to unmapping during migration. Signed-off-by: Byungchul Park --- include/linux/rmap.h | 2 +- mm/migrate.c | 56 ++++++++++++++++++++++++++++++++------------ mm/rmap.c | 9 ++++--- 3 files changed, 48 insertions(+), 19 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index bb53e5920b88..6aca569e342b 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -660,7 +660,7 @@ static inline int folio_try_share_anon_rmap_pmd(struct folio *folio, int folio_referenced(struct folio *, int is_locked, struct mem_cgroup *memcg, unsigned long *vm_flags); -void try_to_migrate(struct folio *folio, enum ttu_flags flags); +bool try_to_migrate(struct folio *folio, enum ttu_flags flags); void try_to_unmap(struct folio *, enum ttu_flags flags); int make_device_exclusive_range(struct mm_struct *mm, unsigned long start, diff --git a/mm/migrate.c b/mm/migrate.c index 6c22a1402923..6da8335cdf4c 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1067,7 +1067,8 @@ static void migrate_folio_undo_dst(struct folio *dst, bool locked, /* Cleanup src folio upon migration success */ static void migrate_folio_done(struct folio *src, - enum migrate_reason reason) + enum migrate_reason reason, + unsigned short int ugen) { /* * Compaction can migrate also non-LRU pages which are @@ -1078,8 +1079,12 @@ static void migrate_folio_done(struct folio *src, mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON + folio_is_file_lru(src), -folio_nr_pages(src)); - if (reason != MR_MEMORY_FAILURE) - /* We release the page in page_handle_poison. */ + /* We release the page in page_handle_poison. */ + if (reason == MR_MEMORY_FAILURE) + check_luf_flush(ugen); + else if (ugen) + folio_put_ugen(src, ugen); + else folio_put(src); } @@ -1087,7 +1092,8 @@ static void migrate_folio_done(struct folio *src, static int migrate_folio_unmap(new_folio_t get_new_folio, free_folio_t put_new_folio, unsigned long private, struct folio *src, struct folio **dstp, enum migrate_mode mode, - enum migrate_reason reason, struct list_head *ret) + enum migrate_reason reason, struct list_head *ret, + bool *can_luf) { struct folio *dst; int rc = -EAGAIN; @@ -1103,7 +1109,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio, folio_clear_unevictable(src); /* free_pages_prepare() will clear PG_isolated. */ list_del(&src->lru); - migrate_folio_done(src, reason); + migrate_folio_done(src, reason, 0); return MIGRATEPAGE_SUCCESS; } @@ -1220,7 +1226,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio, /* Establish migration ptes */ VM_BUG_ON_FOLIO(folio_test_anon(src) && !folio_test_ksm(src) && !anon_vma, src); - try_to_migrate(src, mode == MIGRATE_ASYNC ? TTU_BATCH_FLUSH : 0); + *can_luf = try_to_migrate(src, mode == MIGRATE_ASYNC ? 
TTU_BATCH_FLUSH : 0); old_page_state |= PAGE_WAS_MAPPED; } @@ -1248,7 +1254,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio, static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private, struct folio *src, struct folio *dst, enum migrate_mode mode, enum migrate_reason reason, - struct list_head *ret) + struct list_head *ret, unsigned short int ugen) { int rc; int old_page_state = 0; @@ -1302,7 +1308,7 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private, if (anon_vma) put_anon_vma(anon_vma); folio_unlock(src); - migrate_folio_done(src, reason); + migrate_folio_done(src, reason, ugen); return rc; out: @@ -1591,7 +1597,7 @@ static void migrate_folios_move(struct list_head *src_folios, struct list_head *ret_folios, struct migrate_pages_stats *stats, int *retry, int *thp_retry, int *nr_failed, - int *nr_retry_pages) + int *nr_retry_pages, unsigned short int ugen) { struct folio *folio, *folio2, *dst, *dst2; bool is_thp; @@ -1608,7 +1614,7 @@ static void migrate_folios_move(struct list_head *src_folios, rc = migrate_folio_move(put_new_folio, private, folio, dst, mode, - reason, ret_folios); + reason, ret_folios, ugen); /* * The rules are: * Success: folio will be freed @@ -1685,7 +1691,11 @@ static int migrate_pages_batch(struct list_head *from, int rc, rc_saved = 0, nr_pages; LIST_HEAD(unmap_folios); LIST_HEAD(dst_folios); + LIST_HEAD(unmap_folios_luf); + LIST_HEAD(dst_folios_luf); bool nosplit = (reason == MR_NUMA_MISPLACED); + unsigned short int ugen; + bool can_luf; VM_WARN_ON_ONCE(mode != MIGRATE_ASYNC && !list_empty(from) && !list_is_singular(from)); @@ -1748,9 +1758,11 @@ static int migrate_pages_batch(struct list_head *from, continue; } + can_luf = false; rc = migrate_folio_unmap(get_new_folio, put_new_folio, private, folio, &dst, mode, reason, - ret_folios); + ret_folios, &can_luf); + /* * The rules are: * Success: folio will be freed @@ -1796,7 +1808,8 @@ static int migrate_pages_batch(struct list_head *from, /* nr_failed isn't updated for not used */ stats->nr_thp_failed += thp_retry; rc_saved = rc; - if (list_empty(&unmap_folios)) + if (list_empty(&unmap_folios) && + list_empty(&unmap_folios_luf)) goto out; else goto move; @@ -1810,8 +1823,13 @@ static int migrate_pages_batch(struct list_head *from, stats->nr_thp_succeeded += is_thp; break; case MIGRATEPAGE_UNMAP: - list_move_tail(&folio->lru, &unmap_folios); - list_add_tail(&dst->lru, &dst_folios); + if (can_luf) { + list_move_tail(&folio->lru, &unmap_folios_luf); + list_add_tail(&dst->lru, &dst_folios_luf); + } else { + list_move_tail(&folio->lru, &unmap_folios); + list_add_tail(&dst->lru, &dst_folios); + } break; default: /* @@ -1831,6 +1849,8 @@ static int migrate_pages_batch(struct list_head *from, stats->nr_thp_failed += thp_retry; stats->nr_failed_pages += nr_retry_pages; move: + /* Should be before try_to_unmap_flush() */ + ugen = try_to_unmap_luf(); /* Flush TLBs for all unmapped folios */ try_to_unmap_flush(); @@ -1844,7 +1864,11 @@ static int migrate_pages_batch(struct list_head *from, migrate_folios_move(&unmap_folios, &dst_folios, put_new_folio, private, mode, reason, ret_folios, stats, &retry, &thp_retry, - &nr_failed, &nr_retry_pages); + &nr_failed, &nr_retry_pages, 0); + migrate_folios_move(&unmap_folios_luf, &dst_folios_luf, + put_new_folio, private, mode, reason, + ret_folios, stats, &retry, &thp_retry, + &nr_failed, &nr_retry_pages, ugen); } nr_failed += retry; stats->nr_thp_failed += thp_retry; @@ -1855,6 +1879,8 @@ static int 
migrate_pages_batch(struct list_head *from, /* Cleanup remaining folios */ migrate_folios_undo(&unmap_folios, &dst_folios, put_new_folio, private, ret_folios); + migrate_folios_undo(&unmap_folios_luf, &dst_folios_luf, + put_new_folio, private, ret_folios); return rc; } diff --git a/mm/rmap.c b/mm/rmap.c index 459d4d1631f0..b8b977278a1b 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -2635,8 +2635,9 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, * * Tries to remove all the page table entries which are mapping this folio and * replace them with special swap entries. Caller must hold the folio lock. + * Return true if all the mappings are read-only, otherwise false. */ -void try_to_migrate(struct folio *folio, enum ttu_flags flags) +bool try_to_migrate(struct folio *folio, enum ttu_flags flags) { struct rmap_walk_control rwc = { .rmap_one = try_to_migrate_one, @@ -2655,11 +2656,11 @@ void try_to_migrate(struct folio *folio, enum ttu_flags flags) */ if (WARN_ON_ONCE(flags & ~(TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD | TTU_SYNC | TTU_BATCH_FLUSH))) - return; + return false; if (folio_is_zone_device(folio) && (!folio_is_device_private(folio) && !folio_is_device_coherent(folio))) - return; + return false; /* * During exec, a temporary VMA is setup and later moved. @@ -2684,6 +2685,8 @@ void try_to_migrate(struct folio *folio, enum ttu_flags flags) fold_ubc(tlb_ubc_luf, tlb_ubc_ro); else fold_ubc(tlb_ubc, tlb_ubc_ro); + + return can_luf; } #ifdef CONFIG_DEVICE_PRIVATE
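A compressed, single-threaded user-space model of the contract the patches above rely on; the function names mirror the kernel ones, but the bodies are simplified stand-ins, not the implementations posted above. The unmap side defers the flush and hands back a non-zero ugen, and whoever makes the freed folio reusable must pass that ugen to check_luf_flush() first:

#include <stdio.h>

static unsigned short luf_gen;       /* latest deferred flush request  */
static unsigned short luf_gen_done;  /* latest request already flushed */

/* Defer the TLB flush for a batch of read-only unmaps; never returns 0. */
static unsigned short try_to_unmap_luf(void)
{
	return ++luf_gen ? luf_gen : ++luf_gen;
}

/* Complete the deferred flush before a freed folio gets reused. */
static void check_luf_flush(unsigned short ugen)
{
	if (!ugen)
		return;				/* nothing was deferred   */
	if ((short)(luf_gen_done - ugen) >= 0)
		return;				/* already flushed        */
	printf("flush stale TLB entries for ugen %d\n", (int)ugen);
	luf_gen_done = luf_gen;			/* all pending requests done */
}

int main(void)
{
	unsigned short ugen = try_to_unmap_luf();	/* unmap, defer flush */

	/* ...folio sits in pcp/buddy while stale TLB entries may remain... */
	check_luf_flush(ugen);				/* before reallocation */
	check_luf_flush(ugen);				/* second call is a no-op */
	return 0;
}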
From patchwork Fri May 31 09:20:01 2024
From: Byungchul Park
Subject: [PATCH v11 12/12] mm, vmscan: apply luf mechanism to unmapping during folio reclaim
Date: Fri, 31 May 2024 18:20:01 +0900
Message-Id: <20240531092001.30428-13-byungchul@sk.com>
In-Reply-To: <20240531092001.30428-1-byungchul@sk.com>
References: <20240531092001.30428-1-byungchul@sk.com>
X-HE-Meta:
U2FsdGVkX1/GsW4fDxEDM7WLlAknvcnMfs9MeY+LsZbAMl1w4guTKqevkFZ92FTXkm1lxpsNfyuigFE2D/R2M3zDAegEVuioC0hrXKUQPPEVY2uOue7Os9LkjYv1KH2CMD3MD2w4UedPJftM+FWovhaFOu6P2xeDGnZ9uZ1lkcKuUQnpO/NqFhj0Hi4xo3FkUb3WneVMqz1HvFfg/nPIiAH+FNIhIHyx+fzxTejKKYaPxhlLbDLrLJniKSqCLMDuih0ycfT6g/AnrjLG6tuIYsQ87wpJzHSdhZT/J7ZRdB0SiYr09vc1wS/7IEB6dDY2tTDlKodfqkx3kMU7RXNoxWl5KTMnKzLaU7Tn7BfHK0FR4EVYqHYoHdti/vgH+lZJIN4YhppU0nZkZUNmHmBlavxr6ZXPS6CqHavG/GTuRvzIQVKrWuTYAGNn9EvN+4NW9vhXX5kJMjd2oYrpDV7C7nRmQaRdrU3EFQakoHd/fmRbHHw59zkMk4/CH1utAWXABTXjqxwrSNAQXWW909xO4rTdbdDiY3fZ2u6fyNsaDTc2OTCDsBsxflGLgc9R/Y02i2sQRuFjP54aCNb8ma4olahzEd4C0nWrN929AQ060XrsHrwjtXM6CuG7w1nYCokIN97oqNoVm2G56GSlTu+V70xhzt6aZik9BVVAgl1fGa4wOsaophbQNOfB+N5yBDMKN8/B1y+EfQgJP8BtJos9aGtiH+BV+FB/loiiEfcFwsbUtBYvpBrHH60FEEiwEbcQpQBEaPdNt9nYJw11DD52pPwx7ZTzEjzXtfpuUGIPGeivYpzu6MInfvp/ewFeYQ3iVRLqdmHywhIsg3Rr/gs8dzA+TfTN5mFJNDs60FJMFNdnftAbqo9TIXgyulBaG/Huh6p7xqfhqCSnSWLV9dS0ufJqHcZR7bztXfBj9Z8eg9kL7+qtbxSeqyM5PBwRKXakrEk1mw5GxptGBu/wQ8N G0v69RmX sSVsjieGwtV/bLH/36mbPFvXlewIwq82Nzy56+sHNcSU4ovRMW9lM2SQCkujUrtBaFSWq1nTl/ttYHrGVqpEYCUIyfKipQHY6BMXO X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: A new mechanism, LUF(Lazy Unmap Flush), defers tlb flush until folios that have been unmapped and freed, eventually get allocated again. It's safe for folios that had been mapped read only and were unmapped, since the contents of the folios don't change while staying in pcp or buddy so we can still read the data through the stale tlb entries. Applied the mechanism to unmapping during folio reclaim. Signed-off-by: Byungchul Park --- include/linux/rmap.h | 5 +++-- mm/rmap.c | 5 ++++- mm/vmscan.c | 21 ++++++++++++++++++++- 3 files changed, 27 insertions(+), 4 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 6aca569e342b..9f3e66239f0a 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -661,7 +661,7 @@ int folio_referenced(struct folio *, int is_locked, struct mem_cgroup *memcg, unsigned long *vm_flags); bool try_to_migrate(struct folio *folio, enum ttu_flags flags); -void try_to_unmap(struct folio *, enum ttu_flags flags); +bool try_to_unmap(struct folio *, enum ttu_flags flags); int make_device_exclusive_range(struct mm_struct *mm, unsigned long start, unsigned long end, struct page **pages, @@ -770,8 +770,9 @@ static inline int folio_referenced(struct folio *folio, int is_locked, return 0; } -static inline void try_to_unmap(struct folio *folio, enum ttu_flags flags) +static inline bool try_to_unmap(struct folio *folio, enum ttu_flags flags) { + return false; } static inline int folio_mkclean(struct folio *folio) diff --git a/mm/rmap.c b/mm/rmap.c index b8b977278a1b..6f90c2adc4ae 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -2272,10 +2272,11 @@ static int folio_not_mapped(struct folio *folio) * Tries to remove all the page table entries which are mapping this * folio. It is the caller's responsibility to check if the folio is * still mapped if needed (use TTU_SYNC to prevent accounting races). + * Return true if all the mappings are read-only, otherwise false. * * Context: Caller must hold the folio lock. 
*/ -void try_to_unmap(struct folio *folio, enum ttu_flags flags) +bool try_to_unmap(struct folio *folio, enum ttu_flags flags) { struct rmap_walk_control rwc = { .rmap_one = try_to_unmap_one, @@ -2300,6 +2301,8 @@ void try_to_unmap(struct folio *folio, enum ttu_flags flags) fold_ubc(tlb_ubc_luf, tlb_ubc_ro); else fold_ubc(tlb_ubc, tlb_ubc_ro); + + return can_luf; } /* diff --git a/mm/vmscan.c b/mm/vmscan.c index 15efe6f0edce..d52a6e605183 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1034,14 +1034,17 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, struct reclaim_stat *stat, bool ignore_references) { struct folio_batch free_folios; + struct folio_batch free_folios_luf; LIST_HEAD(ret_folios); LIST_HEAD(demote_folios); unsigned int nr_reclaimed = 0; unsigned int pgactivate = 0; bool do_demote_pass; struct swap_iocb *plug = NULL; + unsigned short int ugen; folio_batch_init(&free_folios); + folio_batch_init(&free_folios_luf); memset(stat, 0, sizeof(*stat)); cond_resched(); do_demote_pass = can_demote(pgdat->node_id, sc); @@ -1053,6 +1056,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, enum folio_references references = FOLIOREF_RECLAIM; bool dirty, writeback; unsigned int nr_pages; + bool can_luf = false; cond_resched(); @@ -1295,7 +1299,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, if (folio_test_large(folio) && list_empty(&folio->_deferred_list)) flags |= TTU_SYNC; - try_to_unmap(folio, flags); + can_luf = try_to_unmap(folio, flags); if (folio_mapped(folio)) { stat->nr_unmap_fail += nr_pages; if (!was_swapbacked && @@ -1458,6 +1462,18 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, nr_reclaimed += nr_pages; folio_undo_large_rmappable(folio); + + if (can_luf) { + if (folio_batch_add(&free_folios_luf, folio) == 0) { + mem_cgroup_uncharge_folios(&free_folios_luf); + ugen = try_to_unmap_luf(); + if (!ugen) + try_to_unmap_flush(); + free_unref_folios(&free_folios_luf, ugen); + } + continue; + } + if (folio_batch_add(&free_folios, folio) == 0) { mem_cgroup_uncharge_folios(&free_folios); try_to_unmap_flush(); @@ -1527,8 +1543,11 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, pgactivate = stat->nr_activate[0] + stat->nr_activate[1]; mem_cgroup_uncharge_folios(&free_folios); + mem_cgroup_uncharge_folios(&free_folios_luf); + ugen = try_to_unmap_luf(); try_to_unmap_flush(); free_unref_folios(&free_folios, 0); + free_unref_folios(&free_folios_luf, ugen); list_splice(&ret_folios, folio_list); count_vm_events(PGACTIVATE, pgactivate);