From patchwork Fri Apr 4 21:06:57 2025
From: SeongJae Park <sj@kernel.org>
To: Andrew Morton
Cc: SeongJae Park, "Liam R. Howlett", David Hildenbrand, Lorenzo Stoakes, Rik van Riel, Shakeel Butt, Vlastimil Babka, kernel-team@meta.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v2 1/4] mm/madvise: define and use madvise_behavior struct for madvise_do_behavior()
Date: Fri, 4 Apr 2025 14:06:57 -0700
Message-Id: <20250404210700.2156-2-sj@kernel.org>
In-Reply-To: <20250404210700.2156-1-sj@kernel.org>
References: <20250404210700.2156-1-sj@kernel.org>

To
implement batched TLB flushes for MADV_DONTNEED[_LOCKED] and MADV_FREE, an mmu_gather object needs to be passed to the internal logic, in addition to the behavior integer.  Using a struct makes this easy, without increasing the number of parameters on all code paths toward the internal logic.  Define a struct for the purpose and use it on the code path that starts from madvise_do_behavior() and ends on madvise_dontneed_free().  Note that this also changes the madvise_walk_vmas() visitor type signature: its 'arg' type changes from 'unsigned long' to a pointer to the new struct.

Reviewed-by: Lorenzo Stoakes
Signed-off-by: SeongJae Park
---
 mm/madvise.c | 46 +++++++++++++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index b17f684322ad..8bcfdd995d18 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -48,6 +48,11 @@ struct madvise_walk_private {
 	bool pageout;
 };
 
+struct madvise_behavior {
+	int behavior;
+	struct mmu_gather *tlb;
+};
+
 /*
  * Any behaviour which results in changes to the vma->vm_flags needs to
  * take mmap_lock for writing. Others, which simply traverse vmas, need
@@ -893,12 +898,13 @@ static bool madvise_dontneed_free_valid_vma(struct vm_area_struct *vma,
 static long madvise_dontneed_free(struct vm_area_struct *vma,
 				  struct vm_area_struct **prev,
 				  unsigned long start, unsigned long end,
-				  int behavior)
+				  struct madvise_behavior *behavior)
 {
+	int action = behavior->behavior;
 	struct mm_struct *mm = vma->vm_mm;
 
 	*prev = vma;
-	if (!madvise_dontneed_free_valid_vma(vma, start, &end, behavior))
+	if (!madvise_dontneed_free_valid_vma(vma, start, &end, action))
 		return -EINVAL;
 
 	if (start == end)
@@ -915,8 +921,7 @@ static long madvise_dontneed_free(struct vm_area_struct *vma,
 			 * Potential end adjustment for hugetlb vma is OK as
 			 * the check below keeps end within vma.
 			 */
-			if (!madvise_dontneed_free_valid_vma(vma, start, &end,
-							     behavior))
+			if (!madvise_dontneed_free_valid_vma(vma, start, &end, action))
 				return -EINVAL;
 			if (end > vma->vm_end) {
 				/*
@@ -945,9 +950,9 @@ static long madvise_dontneed_free(struct vm_area_struct *vma,
 		VM_WARN_ON(start > end);
 	}
 
-	if (behavior == MADV_DONTNEED || behavior == MADV_DONTNEED_LOCKED)
+	if (action == MADV_DONTNEED || action == MADV_DONTNEED_LOCKED)
 		return madvise_dontneed_single_vma(vma, start, end);
-	else if (behavior == MADV_FREE)
+	else if (action == MADV_FREE)
 		return madvise_free_single_vma(vma, start, end);
 	else
 		return -EINVAL;
@@ -1249,8 +1254,10 @@ static long madvise_guard_remove(struct vm_area_struct *vma,
 static int madvise_vma_behavior(struct vm_area_struct *vma,
 				struct vm_area_struct **prev,
 				unsigned long start, unsigned long end,
-				unsigned long behavior)
+				void *behavior_arg)
 {
+	struct madvise_behavior *arg = behavior_arg;
+	int behavior = arg->behavior;
 	int error;
 	struct anon_vma_name *anon_name;
 	unsigned long new_flags = vma->vm_flags;
@@ -1270,7 +1277,7 @@ static int madvise_vma_behavior(struct vm_area_struct *vma,
 	case MADV_FREE:
 	case MADV_DONTNEED:
 	case MADV_DONTNEED_LOCKED:
-		return madvise_dontneed_free(vma, prev, start, end, behavior);
+		return madvise_dontneed_free(vma, prev, start, end, arg);
 	case MADV_NORMAL:
 		new_flags = new_flags & ~VM_RAND_READ & ~VM_SEQ_READ;
 		break;
@@ -1487,10 +1494,10 @@ static bool process_madvise_remote_valid(int behavior)
  */
 static int madvise_walk_vmas(struct mm_struct *mm, unsigned long start,
-		unsigned long end, unsigned long arg,
+		unsigned long end, void *arg,
 		int (*visit)(struct vm_area_struct *vma,
 				struct vm_area_struct **prev, unsigned long start,
-				unsigned long end, unsigned long arg))
+				unsigned long end, void *arg))
 {
 	struct vm_area_struct *vma;
 	struct vm_area_struct *prev;
@@ -1548,7 +1555,7 @@ int madvise_walk_vmas(struct mm_struct *mm, unsigned long start,
 static int madvise_vma_anon_name(struct vm_area_struct *vma,
 				 struct vm_area_struct **prev,
 				 unsigned long start, unsigned long end,
-				 unsigned long anon_name)
+				 void *anon_name)
 {
 	int error;
 
@@ -1557,7 +1564,7 @@ static int madvise_vma_anon_name(struct vm_area_struct *vma,
 		return -EBADF;
 
 	error = madvise_update_vma(vma, prev, start, end, vma->vm_flags,
-				   (struct anon_vma_name *)anon_name);
+				   anon_name);
 
 	/*
 	 * madvise() returns EAGAIN if kernel resources, such as
@@ -1589,7 +1596,7 @@ int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
 	if (end == start)
 		return 0;
 
-	return madvise_walk_vmas(mm, start, end, (unsigned long)anon_name,
+	return madvise_walk_vmas(mm, start, end, anon_name,
 			madvise_vma_anon_name);
 }
 #endif /* CONFIG_ANON_VMA_NAME */
@@ -1677,8 +1684,10 @@ static bool is_madvise_populate(int behavior)
 }
 
 static int madvise_do_behavior(struct mm_struct *mm,
-		unsigned long start, size_t len_in, int behavior)
+		unsigned long start, size_t len_in,
+		struct madvise_behavior *madv_behavior)
 {
+	int behavior = madv_behavior->behavior;
 	struct blk_plug plug;
 	unsigned long end;
 	int error;
@@ -1692,7 +1701,7 @@ static int madvise_do_behavior(struct mm_struct *mm,
 	if (is_madvise_populate(behavior))
 		error = madvise_populate(mm, start, end, behavior);
 	else
-		error = madvise_walk_vmas(mm, start, end, behavior,
+		error = madvise_walk_vmas(mm, start, end, madv_behavior,
 				madvise_vma_behavior);
 	blk_finish_plug(&plug);
 	return error;
@@ -1773,13 +1782,14 @@ static int madvise_do_behavior(struct mm_struct *mm,
 int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int behavior)
 {
 	int error;
+	struct madvise_behavior madv_behavior = {.behavior = behavior};
 
 	if (madvise_should_skip(start, len_in, behavior, &error))
 		return error;
 	error = madvise_lock(mm, behavior);
 	if (error)
 		return error;
-	error = madvise_do_behavior(mm, start, len_in, behavior);
+	error = madvise_do_behavior(mm, start, len_in, &madv_behavior);
 	madvise_unlock(mm, behavior);
 
 	return error;
@@ -1796,6 +1806,7 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
 {
 	ssize_t ret = 0;
 	size_t total_len;
+	struct madvise_behavior madv_behavior = {.behavior = behavior};
 
 	total_len = iov_iter_count(iter);
 
@@ -1811,7 +1822,8 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
 		if (madvise_should_skip(start, len_in, behavior, &error))
 			ret = error;
 		else
-			ret = madvise_do_behavior(mm, start, len_in, behavior);
+			ret = madvise_do_behavior(mm, start, len_in,
+					&madv_behavior);
 		/*
 		 * An madvise operation is attempting to restart the syscall,
 		 * but we cannot proceed as it would not be correct to repeat

From patchwork Fri Apr 4 21:06:58 2025
From: SeongJae Park <sj@kernel.org>
To: Andrew Morton
Cc: SeongJae Park, "Liam R. Howlett", David Hildenbrand, Lorenzo Stoakes, Rik van Riel, Shakeel Butt, Vlastimil Babka, kernel-team@meta.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v2 2/4] mm/madvise: batch tlb flushes for MADV_FREE
Date: Fri, 4 Apr 2025 14:06:58 -0700
Message-Id: <20250404210700.2156-3-sj@kernel.org>
In-Reply-To: <20250404210700.2156-1-sj@kernel.org>
References: <20250404210700.2156-1-sj@kernel.org>

MADV_FREE handling for [process_]madvise() flushes the TLB for each vma of each address range.  Update the logic to do the TLB flushes in a batched way.  Initialize an mmu_gather object from do_madvise() and vector_madvise(), the entry-level functions for madvise() and process_madvise() respectively, and pass the object to the per-vma work functions via the madvise_behavior struct.  Make the per-vma logic not flush the TLB on its own, but only record the TLB entries to flush in the received mmu_gather object.
Finally, the entry-level functions flush the gathered TLB entries for the entire user request, at once.

Signed-off-by: SeongJae Park
---
 mm/madvise.c | 59 +++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 47 insertions(+), 12 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 8bcfdd995d18..564095e381b2 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -799,12 +799,13 @@ static const struct mm_walk_ops madvise_free_walk_ops = {
 	.walk_lock = PGWALK_RDLOCK,
 };
 
-static int madvise_free_single_vma(struct vm_area_struct *vma,
-			unsigned long start_addr, unsigned long end_addr)
+static int madvise_free_single_vma(
+		struct madvise_behavior *behavior, struct vm_area_struct *vma,
+		unsigned long start_addr, unsigned long end_addr)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct mmu_notifier_range range;
-	struct mmu_gather tlb;
+	struct mmu_gather *tlb = behavior->tlb;
 
 	/* MADV_FREE works for only anon vma at the moment */
 	if (!vma_is_anonymous(vma))
@@ -820,17 +821,14 @@ static int madvise_free_single_vma(struct vm_area_struct *vma,
 				range.start, range.end);
 
 	lru_add_drain();
-	tlb_gather_mmu(&tlb, mm);
 	update_hiwater_rss(mm);
 
 	mmu_notifier_invalidate_range_start(&range);
-	tlb_start_vma(&tlb, vma);
+	tlb_start_vma(tlb, vma);
 	walk_page_range(vma->vm_mm, range.start, range.end,
-			&madvise_free_walk_ops, &tlb);
-	tlb_end_vma(&tlb, vma);
+			&madvise_free_walk_ops, tlb);
+	tlb_end_vma(tlb, vma);
 	mmu_notifier_invalidate_range_end(&range);
-	tlb_finish_mmu(&tlb);
-
 	return 0;
 }
 
@@ -953,7 +951,7 @@ static long madvise_dontneed_free(struct vm_area_struct *vma,
 	if (action == MADV_DONTNEED || action == MADV_DONTNEED_LOCKED)
 		return madvise_dontneed_single_vma(vma, start, end);
 	else if (action == MADV_FREE)
-		return madvise_free_single_vma(vma, start, end);
+		return madvise_free_single_vma(behavior, vma, start, end);
 	else
 		return -EINVAL;
 }
@@ -1626,6 +1624,29 @@ static void madvise_unlock(struct mm_struct *mm, int behavior)
 		mmap_read_unlock(mm);
 }
 
+static bool madvise_batch_tlb_flush(int behavior)
+{
+	switch (behavior) {
+	case MADV_FREE:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static void madvise_init_tlb(struct madvise_behavior *madv_behavior,
+		struct mm_struct *mm)
+{
+	if (madvise_batch_tlb_flush(madv_behavior->behavior))
+		tlb_gather_mmu(madv_behavior->tlb, mm);
+}
+
+static void madvise_finish_tlb(struct madvise_behavior *madv_behavior)
+{
+	if (madvise_batch_tlb_flush(madv_behavior->behavior))
+		tlb_finish_mmu(madv_behavior->tlb);
+}
+
 static bool is_valid_madvise(unsigned long start, size_t len_in, int behavior)
 {
 	size_t len;
@@ -1782,14 +1803,20 @@ static int madvise_do_behavior(struct mm_struct *mm,
 int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int behavior)
 {
 	int error;
-	struct madvise_behavior madv_behavior = {.behavior = behavior};
+	struct mmu_gather tlb;
+	struct madvise_behavior madv_behavior = {
+		.behavior = behavior,
+		.tlb = &tlb,
+	};
 
 	if (madvise_should_skip(start, len_in, behavior, &error))
 		return error;
 	error = madvise_lock(mm, behavior);
 	if (error)
 		return error;
+	madvise_init_tlb(&madv_behavior, mm);
 	error = madvise_do_behavior(mm, start, len_in, &madv_behavior);
+	madvise_finish_tlb(&madv_behavior);
 	madvise_unlock(mm, behavior);
 
 	return error;
@@ -1806,13 +1833,18 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
 {
 	ssize_t ret = 0;
 	size_t total_len;
-	struct madvise_behavior madv_behavior = {.behavior = behavior};
+	struct mmu_gather tlb;
+	struct madvise_behavior madv_behavior = {
+		.behavior = behavior,
+		.tlb = &tlb,
+	};
 
 	total_len = iov_iter_count(iter);
 
 	ret = madvise_lock(mm, behavior);
 	if (ret)
 		return ret;
+	madvise_init_tlb(&madv_behavior, mm);
 
 	while (iov_iter_count(iter)) {
 		unsigned long start = (unsigned long)iter_iov_addr(iter);
@@ -1841,14 +1873,17 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
 			}
 
 			/* Drop and reacquire lock to unwind race. */
+			madvise_finish_tlb(&madv_behavior);
 			madvise_unlock(mm, behavior);
 			madvise_lock(mm, behavior);
+			madvise_init_tlb(&madv_behavior, mm);
 			continue;
 		}
 		if (ret < 0)
 			break;
 		iov_iter_advance(iter, iter_iov_len(iter));
 	}
+	madvise_finish_tlb(&madv_behavior);
 	madvise_unlock(mm, behavior);
 
 	ret = (total_len - iov_iter_count(iter)) ? : ret;

From patchwork Fri Apr 4 21:06:59 2025
From: SeongJae Park <sj@kernel.org>
To: Andrew Morton
Cc: SeongJae Park, "Liam R. Howlett", David Hildenbrand, Lorenzo Stoakes, Rik van Riel, Shakeel Butt, Vlastimil Babka, kernel-team@meta.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v2 3/4] mm/memory: split non-tlb flushing part from zap_page_range_single()
Date: Fri, 4 Apr 2025 14:06:59 -0700
Message-Id: <20250404210700.2156-4-sj@kernel.org>
In-Reply-To: <20250404210700.2156-1-sj@kernel.org>
References: <20250404210700.2156-1-sj@kernel.org>

Some zap_page_range_single() callers, such as [process_]madvise() with MADV_DONTNEED[_LOCKED], cannot batch TLB flushes because zap_page_range_single() flushes the TLB on each invocation.  For such batched TLB flushing usage, split out the body of zap_page_range_single(), except for the mmu_gather object initialization and the flushing of the gathered TLB entries.  To avoid hugetlb page allocation failures from concurrent page faults, though, the TLB flush should be done before the hugetlb fault lock is released.  For the hugetlb vma case, do the flush and the unlock inside the split-out function, in that order.  Refer to commit 2820b0f09be9 ("hugetlbfs: close race between MADV_DONTNEED and page fault") for more details about the page allocation failure problem caused by concurrent faults.
Signed-off-by: SeongJae Park
---
 mm/memory.c | 49 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 39 insertions(+), 10 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 8669b2c981a5..8c9bbb1a008c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1989,36 +1989,65 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas,
 	mmu_notifier_invalidate_range_end(&range);
 }
 
-/**
- * zap_page_range_single - remove user pages in a given range
+/*
+ * notify_unmap_single_vma - remove user pages in a given range
+ * @tlb: pointer to the caller's struct mmu_gather
  * @vma: vm_area_struct holding the applicable pages
- * @address: starting address of pages to zap
- * @size: number of bytes to zap
+ * @address: starting address of pages to remove
+ * @size: number of bytes to remove
  * @details: details of shared cache invalidation
  *
- * The range must fit into one VMA.
+ * @tlb shouldn't be NULL.  The range must fit into one VMA.  If @vma is for
+ * hugetlb, @tlb is flushed and re-initialized by this function.
  */
-void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
+static void notify_unmap_single_vma(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, unsigned long address,
 		unsigned long size, struct zap_details *details)
 {
 	const unsigned long end = address + size;
 	struct mmu_notifier_range range;
-	struct mmu_gather tlb;
+
+	VM_WARN_ON_ONCE(!tlb);
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma->vm_mm,
 				address, end);
 	hugetlb_zap_begin(vma, &range.start, &range.end);
-	tlb_gather_mmu(&tlb, vma->vm_mm);
 	update_hiwater_rss(vma->vm_mm);
 	mmu_notifier_invalidate_range_start(&range);
 	/*
 	 * unmap 'address-end' not 'range.start-range.end' as range
 	 * could have been expanded for hugetlb pmd sharing.
 	 */
-	unmap_single_vma(&tlb, vma, address, end, details, false);
+	unmap_single_vma(tlb, vma, address, end, details, false);
 	mmu_notifier_invalidate_range_end(&range);
+	if (is_vm_hugetlb_page(vma)) {
+		/*
+		 * flush tlb and free resources before hugetlb_zap_end(), to
+		 * avoid concurrent page faults' allocation failure
+		 */
+		tlb_finish_mmu(tlb);
+		hugetlb_zap_end(vma, details);
+		tlb_gather_mmu(tlb, vma->vm_mm);
+	}
+}
+
+/**
+ * zap_page_range_single - remove user pages in a given range
+ * @vma: vm_area_struct holding the applicable pages
+ * @address: starting address of pages to zap
+ * @size: number of bytes to zap
+ * @details: details of shared cache invalidation
+ *
+ * The range must fit into one VMA.
+ */
+void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
+		unsigned long size, struct zap_details *details)
+{
+	struct mmu_gather tlb;
+
+	tlb_gather_mmu(&tlb, vma->vm_mm);
+	notify_unmap_single_vma(&tlb, vma, address, size, details);
 	tlb_finish_mmu(&tlb);
-	hugetlb_zap_end(vma, details);
 }
 
 /**

From patchwork Fri Apr 4 21:07:00 2025
kanga.kvack.org (Postfix) with ESMTP id 48C736B0095 for ; Fri, 4 Apr 2025 17:07:12 -0400 (EDT) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id A8F3DB9811 for ; Fri, 4 Apr 2025 21:07:13 +0000 (UTC) X-FDA: 83297596746.29.E64DF0A Received: from nyc.source.kernel.org (nyc.source.kernel.org [147.75.193.91]) by imf26.hostedemail.com (Postfix) with ESMTP id 2955F140005 for ; Fri, 4 Apr 2025 21:07:11 +0000 (UTC) Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=tfe5O0hn; dmarc=pass (policy=quarantine) header.from=kernel.org; spf=pass (imf26.hostedemail.com: domain of sj@kernel.org designates 147.75.193.91 as permitted sender) smtp.mailfrom=sj@kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1743800832; a=rsa-sha256; cv=none; b=RLdepIEKTdzBBjPAeaQkLtLITSsHNV2gTQIjtbUwhKzrFAljAh9Ig3Q1xUK4Qiqhe2xH8B ITBtjpzpL0Sh9KFUAdTV0mKf6rue74z/lwW37S25HmKnR01kJzNF3EYvf997WSZK+sVF3W 3gfYCaf6CeIC8hxLi8gHxnxSPmWMCNY= ARC-Authentication-Results: i=1; imf26.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=tfe5O0hn; dmarc=pass (policy=quarantine) header.from=kernel.org; spf=pass (imf26.hostedemail.com: domain of sj@kernel.org designates 147.75.193.91 as permitted sender) smtp.mailfrom=sj@kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1743800832; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=euXWFZsWtbqMUcYgP8fpgoYoMI4a4qO3i76QDzmrI2o=; b=56HqXv7rjR5xnqDqE74opReAEOjM+cMJX8tWbrN6eNcHDuiC1BBoihnqMPXuY4sc7ffwK5 of+JK0em8zdCaMn7mnJPSfDtP+NDjDaBCAom+UKBu/7gbOuOyirTQx6jYzFZW4W5iQF8Rn hDEZH8EVaAsn1V3iJ+FwiS27ydYw5gM= Received: from smtp.kernel.org 
(transwarp.subspace.kernel.org [100.75.92.58]) by nyc.source.kernel.org (Postfix) with ESMTP id BEBBCA468C0; Fri, 4 Apr 2025 21:01:42 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0CF3BC4CEDD; Fri, 4 Apr 2025 21:07:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1743800831; bh=PmO0QX64UlqAoz84TozSVHOKpk+4GGuBfZEHaifqBDg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=tfe5O0hnRe939EEJfRxDNGiQ2pY/sppUaKc+3ycE4PVz3K9kjCVQqNYTL2cUilCYI aOrXiWURyiVn/Su/0TO9rfP1GBglrQs9yjvvDxZxFQPWwtgiLTozL5CMbvStwljFzi 8AETsR/AgF2+QXn/dN3hkJP+Cjdr7wf77DY8hfk6iOcWfh5suzQ5VqVkMhZxbynnO5 zs0vUjCSXYyCbmTtu3R9xs1Eg4sVBkMnRhRAEJyY02L7Z3YkAF6i9HPOJr5tl6HVU5 sg0H9Y220WDutkwLm9uznN13Hk0Ei6e/kFjPaXS0LC/URd8bnydmoD1GLapOGrNtft L0meA5IVAPO3w== From: SeongJae Park To: Andrew Morton Cc: SeongJae Park , "Liam R.Howlett" , David Hildenbrand , Lorenzo Stoakes , Rik van Riel , Shakeel Butt , Vlastimil Babka , kernel-team@meta.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v2 4/4] mm/madvise: batch tlb flushes for MADV_DONTNEED[_LOCKED] Date: Fri, 4 Apr 2025 14:07:00 -0700 Message-Id: <20250404210700.2156-5-sj@kernel.org> X-Mailer: git-send-email 2.39.5 In-Reply-To: <20250404210700.2156-1-sj@kernel.org> References: <20250404210700.2156-1-sj@kernel.org> MIME-Version: 1.0 X-Rspamd-Queue-Id: 2955F140005 X-Stat-Signature: gscz8wdauq5hz3u37g69tzrbwadeg9nr X-Rspam-User: X-Rspamd-Server: rspam06 X-HE-Tag: 1743800831-851541 X-HE-Meta: 
U2FsdGVkX18EOOpzK1UNGDICxdXQmbk1Ld1ILzAPERUXUh5xHj1ua5w9OZJmWFvisgOL61kcd/67XmxpiGs5H95kAPF0t6IZaDi+/4wugFsIjCp2ePUxXJ49jDcy3tJoYC3pO3ImBcqrCwuVO4AR1iCM1qEHyj/tqSFo+d9s7yim4lRpS9zE1JiaWf27JVg1cJ3kuNxoYFK6JlqnAc6RuuGo4/AW5zud80g/3KcqB6XQZlKGyWcWxqJlz1BUeUuSOlk6S/2MoHg3EE777vliIA1k7vBqQD0mXmtqfHhtV6ZcwQlOV/n5YZ2n0PeW8vHUFhGLV7U5hwkWu3Y0MA7WPZRvztgYr032MoG/wEcXCyxxNa34fKYHma5bTSUtZ5vYn0KIEHEHtDtzNiz1FiOxero9fjvwneVxjdgX1INRaTGNpC8tZvaUYGgS9gXrVt5uTsXH4fmsRhk2L0lDELfjo+a0IKF/XhRJndd7gpQmcHgs7vJB+8eVsoyOWWImSB3BEvXpjTT0SBpwupOfMa0AwD2rccIepoONxP7/XSEnJAeUZsNHbGAe+tG71dZQHmxAOAIt9L3zLj6rraPgCSHaMc8cHHh1z7Ua36Z9pa2iX1ViYfXc/2Zd6Llcf5DYwLvKkNv2z8lb64F7j5+C0NZD76ciMoWSIp2zjPWSJPPlNGD97CR49pfgILFPcL19iQcPQzJRFTLm2bV5O/7wiEbcWXZzxbGcjmUSXsU2UfJuVMAnfnbyat38Pt5fDhe9VCl+XpYCy6zHCi/FW0cWl4c6ZP+7Ul9l/HRJ6CnHZEt5gm0x0cUTNIqEhpXBYOurx8uzku9oQeBSm6SeQaZh42Rx8jorSIWp1QsFClwCE6bccibN4PNI+4WZDgqECimhi0cPL61Qfdo/e1g8cHe+tQfHeb3M7ms3W5no9W4Ot+TEvsm7YbLul/hv0Amyv8jPhScMjzQ+08PYO7cW8wyjtDf +ShSoHDI 8AhQuJbLOh2Bj/vq5y+Roig/EDrbR/FLnwWGg2bPbr14Jv8qNoHrAxAPubglgO9ZawEDuM7QVoy5lXJj3WHD2QVvl0rYL4DsxTnbndoTz6Q5/91/pRbasNsR9b1iLP5qME3eA/VT6ZjXkFi3EnLKVIVXfqjVA7kU3xG9IrdwMxI0qXftYzvBgjmkl3IVy2frzFAfqZgMvUu+JjYENfHpfszOi8A46ElMOtcmZzhmf+A7+BdE0X18MEzz6V9oQ+da0kcFO7FbSJT6VGbYr35HNqPCXyw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Batch tlb flushes for MADV_DONTNEED[_LOCKED] for better efficiency, in a way that very similar to the tlb flushes batching for MADV_FREE. 
Signed-off-by: SeongJae Park
---
 mm/internal.h | 3 +++
 mm/madvise.c  | 9 ++++++---
 mm/memory.c   | 4 ++--
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index e9695baa5922..be0c46837e22 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -435,6 +435,9 @@ void unmap_page_range(struct mmu_gather *tlb,
 			     struct vm_area_struct *vma,
 			     unsigned long addr, unsigned long end,
 			     struct zap_details *details);
+void notify_unmap_single_vma(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, unsigned long addr,
+		unsigned long size, struct zap_details *details);
 
 int folio_unmap_invalidate(struct address_space *mapping, struct folio *folio,
 		gfp_t gfp);
diff --git a/mm/madvise.c b/mm/madvise.c
index 564095e381b2..c7ac32b4a371 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -851,7 +851,8 @@ static int madvise_free_single_vma(
  * An interface that causes the system to free clean pages and flush
  * dirty pages is already available as msync(MS_INVALIDATE).
  */
-static long madvise_dontneed_single_vma(struct vm_area_struct *vma,
+static long madvise_dontneed_single_vma(struct madvise_behavior *behavior,
+		struct vm_area_struct *vma,
 		unsigned long start, unsigned long end)
 {
 	struct zap_details details = {
@@ -859,7 +860,7 @@ static long madvise_dontneed_single_vma(struct vm_area_struct *vma,
 		.even_cows = true,
 	};
 
-	zap_page_range_single(vma, start, end - start, &details);
+	notify_unmap_single_vma(behavior->tlb, vma, start, end - start, &details);
 	return 0;
 }
 
@@ -949,7 +950,7 @@ static long madvise_dontneed_free(struct vm_area_struct *vma,
 	}
 
 	if (action == MADV_DONTNEED || action == MADV_DONTNEED_LOCKED)
-		return madvise_dontneed_single_vma(vma, start, end);
+		return madvise_dontneed_single_vma(behavior, vma, start, end);
 	else if (action == MADV_FREE)
 		return madvise_free_single_vma(behavior, vma, start, end);
 	else
@@ -1627,6 +1628,8 @@ static void madvise_unlock(struct mm_struct *mm, int behavior)
 static bool madvise_batch_tlb_flush(int behavior)
 {
 	switch (behavior) {
+	case MADV_DONTNEED:
+	case MADV_DONTNEED_LOCKED:
 	case MADV_FREE:
 		return true;
 	default:
diff --git a/mm/memory.c b/mm/memory.c
index 8c9bbb1a008c..6a01b73aa111 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1989,7 +1989,7 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas,
 	mmu_notifier_invalidate_range_end(&range);
 }
 
-/*
+/**
  * notify_unmap_single_vma - remove user pages in a given range
  * @tlb: pointer to the caller's struct mmu_gather
  * @vma: vm_area_struct holding the applicable pages
@@ -2000,7 +2000,7 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas,
 * @tlb shouldn't be NULL. The range must fit into one VMA. If @vma is for
 * hugetlb, @tlb is flushed and re-initialized by this function.
 */
-static void notify_unmap_single_vma(struct mmu_gather *tlb,
+void notify_unmap_single_vma(struct mmu_gather *tlb,
 		struct vm_area_struct *vma, unsigned long address,
 		unsigned long size, struct zap_details *details)
 {