From patchwork Thu Apr 10 00:00:19 2025
X-Patchwork-Submitter: SeongJae Park
X-Patchwork-Id: 14045657
From: SeongJae Park
To: Andrew Morton
Cc: SeongJae Park, "Liam R. Howlett", David Hildenbrand, Lorenzo Stoakes, Rik van Riel, Shakeel Butt, Vlastimil Babka, kernel-team@meta.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v3 1/4] mm/madvise: define and use madvise_behavior struct for madvise_do_behavior()
Date: Wed, 9 Apr 2025 17:00:19 -0700
Message-Id: <20250410000022.1901-2-sj@kernel.org>
In-Reply-To: <20250410000022.1901-1-sj@kernel.org>
References: <20250410000022.1901-1-sj@kernel.org>

To
implement batched tlb flushes for MADV_DONTNEED[_LOCKED] and MADV_FREE, an mmu_gather object needs to be passed to the internal logic in addition to the behavior integer. Using a struct makes this easy without increasing the number of parameters on every code path toward the internal logic. Define a struct for the purpose and use it on the code path that starts from madvise_do_behavior() and ends at madvise_dontneed_free(). Note that this also changes the madvise_walk_vmas() visitor type signature: its 'arg' type changes from 'unsigned long' to a pointer to the new struct.

Reviewed-by: Lorenzo Stoakes
Signed-off-by: SeongJae Park
---
 mm/madvise.c | 37 +++++++++++++++++++++++++------------
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index b17f684322ad..26fa868b41af 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -48,6 +48,11 @@ struct madvise_walk_private {
 	bool pageout;
 };
 
+struct madvise_behavior {
+	int behavior;
+	struct mmu_gather *tlb;
+};
+
 /*
  * Any behaviour which results in changes to the vma->vm_flags needs to
  * take mmap_lock for writing. Others, which simply traverse vmas, need
@@ -893,8 +898,9 @@ static bool madvise_dontneed_free_valid_vma(struct vm_area_struct *vma,
 static long madvise_dontneed_free(struct vm_area_struct *vma,
 				  struct vm_area_struct **prev,
 				  unsigned long start, unsigned long end,
-				  int behavior)
+				  struct madvise_behavior *madv_behavior)
 {
+	int behavior = madv_behavior->behavior;
 	struct mm_struct *mm = vma->vm_mm;
 
 	*prev = vma;
@@ -1249,8 +1255,10 @@ static long madvise_guard_remove(struct vm_area_struct *vma,
 static int madvise_vma_behavior(struct vm_area_struct *vma,
 				struct vm_area_struct **prev,
 				unsigned long start, unsigned long end,
-				unsigned long behavior)
+				void *behavior_arg)
 {
+	struct madvise_behavior *arg = behavior_arg;
+	int behavior = arg->behavior;
 	int error;
 	struct anon_vma_name *anon_name;
 	unsigned long new_flags = vma->vm_flags;
@@ -1270,7 +1278,7 @@ static int madvise_vma_behavior(struct vm_area_struct *vma,
 	case MADV_FREE:
 	case MADV_DONTNEED:
 	case MADV_DONTNEED_LOCKED:
-		return madvise_dontneed_free(vma, prev, start, end, behavior);
+		return madvise_dontneed_free(vma, prev, start, end, arg);
 	case MADV_NORMAL:
 		new_flags = new_flags & ~VM_RAND_READ & ~VM_SEQ_READ;
 		break;
@@ -1487,10 +1495,10 @@ static bool process_madvise_remote_valid(int behavior)
  */
 static
 int madvise_walk_vmas(struct mm_struct *mm, unsigned long start,
-		unsigned long end, unsigned long arg,
+		unsigned long end, void *arg,
 		int (*visit)(struct vm_area_struct *vma,
 				struct vm_area_struct **prev, unsigned long start,
-				unsigned long end, unsigned long arg))
+				unsigned long end, void *arg))
 {
 	struct vm_area_struct *vma;
 	struct vm_area_struct *prev;
@@ -1548,7 +1556,7 @@ int madvise_walk_vmas(struct mm_struct *mm, unsigned long start,
 static int madvise_vma_anon_name(struct vm_area_struct *vma,
 				 struct vm_area_struct **prev,
 				 unsigned long start, unsigned long end,
-				 unsigned long anon_name)
+				 void *anon_name)
 {
 	int error;
 
@@ -1557,7 +1565,7 @@ static int madvise_vma_anon_name(struct vm_area_struct *vma,
 		return -EBADF;
 
 	error = madvise_update_vma(vma, prev, start, end, vma->vm_flags,
-				   (struct anon_vma_name *)anon_name);
+				   anon_name);
 
 	/*
 	 * madvise() returns EAGAIN if kernel resources, such as
@@ -1589,7 +1597,7 @@ int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
 	if (end == start)
 		return 0;
 
-	return madvise_walk_vmas(mm, start, end, (unsigned long)anon_name,
+	return madvise_walk_vmas(mm, start, end, anon_name,
 			madvise_vma_anon_name);
 }
 #endif /* CONFIG_ANON_VMA_NAME */
@@ -1677,8 +1685,10 @@ static bool is_madvise_populate(int behavior)
 }
 
 static int madvise_do_behavior(struct mm_struct *mm,
-		unsigned long start, size_t len_in, int behavior)
+		unsigned long start, size_t len_in,
+		struct madvise_behavior *madv_behavior)
 {
+	int behavior = madv_behavior->behavior;
 	struct blk_plug plug;
 	unsigned long end;
 	int error;
@@ -1692,7 +1702,7 @@ static int madvise_do_behavior(struct mm_struct *mm,
 	if (is_madvise_populate(behavior))
 		error = madvise_populate(mm, start, end, behavior);
 	else
-		error = madvise_walk_vmas(mm, start, end, behavior,
+		error = madvise_walk_vmas(mm, start, end, madv_behavior,
 				madvise_vma_behavior);
 	blk_finish_plug(&plug);
 	return error;
@@ -1773,13 +1783,14 @@ static int madvise_do_behavior(struct mm_struct *mm,
 int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int behavior)
 {
 	int error;
+	struct madvise_behavior madv_behavior = {.behavior = behavior};
 
 	if (madvise_should_skip(start, len_in, behavior, &error))
 		return error;
 	error = madvise_lock(mm, behavior);
 	if (error)
 		return error;
-	error = madvise_do_behavior(mm, start, len_in, behavior);
+	error = madvise_do_behavior(mm, start, len_in, &madv_behavior);
 	madvise_unlock(mm, behavior);
 
 	return error;
@@ -1796,6 +1807,7 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
 {
 	ssize_t ret = 0;
 	size_t total_len;
+	struct madvise_behavior madv_behavior = {.behavior = behavior};
 
 	total_len = iov_iter_count(iter);
 
@@ -1811,7 +1823,8 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
 		if (madvise_should_skip(start, len_in, behavior, &error))
 			ret = error;
 		else
-			ret = madvise_do_behavior(mm, start, len_in, behavior);
+			ret = madvise_do_behavior(mm, start, len_in,
+					&madv_behavior);
 		/*
 		 * An madvise operation is attempting to restart the syscall,
 		 * but we cannot proceed as it would not be correct to repeat

From patchwork Thu Apr 10 00:00:20 2025
X-Patchwork-Submitter: SeongJae Park
X-Patchwork-Id: 14045658
From: SeongJae Park
To: Andrew Morton
Cc: SeongJae Park, "Liam R. Howlett", David Hildenbrand, Lorenzo Stoakes, Rik van Riel, Shakeel Butt, Vlastimil Babka, kernel-team@meta.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v3 2/4] mm/madvise: batch tlb flushes for MADV_FREE
Date: Wed, 9 Apr 2025 17:00:20 -0700
Message-Id: <20250410000022.1901-3-sj@kernel.org>
In-Reply-To: <20250410000022.1901-1-sj@kernel.org>
References: <20250410000022.1901-1-sj@kernel.org>

MADV_FREE handling for [process_]madvise() flushes the tlb for each vma of each address range. Update the logic to do the tlb flushes in a batched way. Initialize an mmu_gather object from do_madvise() and vector_madvise(), the entry-level functions for madvise() and process_madvise() respectively, and pass it down to the per-vma work via the madvise_behavior struct. Make the per-vma logic save the tlb entries to the received mmu_gather object instead of flushing them on its own. Finally, have the entry-level functions flush the tlb entries gathered for the entire user request at once.
Reviewed-by: Lorenzo Stoakes
Signed-off-by: SeongJae Park
---
 mm/madvise.c | 57 ++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 46 insertions(+), 11 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 26fa868b41af..951038a9f36f 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -799,12 +799,13 @@ static const struct mm_walk_ops madvise_free_walk_ops = {
 	.walk_lock		= PGWALK_RDLOCK,
 };
 
-static int madvise_free_single_vma(struct vm_area_struct *vma,
+static int madvise_free_single_vma(struct madvise_behavior *madv_behavior,
+		struct vm_area_struct *vma,
 			unsigned long start_addr, unsigned long end_addr)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct mmu_notifier_range range;
-	struct mmu_gather tlb;
+	struct mmu_gather *tlb = madv_behavior->tlb;
 
 	/* MADV_FREE works for only anon vma at the moment */
 	if (!vma_is_anonymous(vma))
@@ -820,17 +821,14 @@ static int madvise_free_single_vma(struct vm_area_struct *vma,
 				range.start, range.end);
 
 	lru_add_drain();
-	tlb_gather_mmu(&tlb, mm);
 	update_hiwater_rss(mm);
 
 	mmu_notifier_invalidate_range_start(&range);
-	tlb_start_vma(&tlb, vma);
+	tlb_start_vma(tlb, vma);
 	walk_page_range(vma->vm_mm, range.start, range.end,
-			&madvise_free_walk_ops, &tlb);
-	tlb_end_vma(&tlb, vma);
+			&madvise_free_walk_ops, tlb);
+	tlb_end_vma(tlb, vma);
 	mmu_notifier_invalidate_range_end(&range);
-	tlb_finish_mmu(&tlb);
-
 	return 0;
 }
 
@@ -954,7 +952,7 @@ static long madvise_dontneed_free(struct vm_area_struct *vma,
 	if (behavior == MADV_DONTNEED || behavior == MADV_DONTNEED_LOCKED)
 		return madvise_dontneed_single_vma(vma, start, end);
 	else if (behavior == MADV_FREE)
-		return madvise_free_single_vma(vma, start, end);
+		return madvise_free_single_vma(madv_behavior, vma, start, end);
 	else
 		return -EINVAL;
 }
@@ -1627,6 +1625,29 @@ static void madvise_unlock(struct mm_struct *mm, int behavior)
 		mmap_read_unlock(mm);
 }
 
+static bool madvise_batch_tlb_flush(int behavior)
+{
+	switch (behavior) {
+	case MADV_FREE:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static void madvise_init_tlb(struct madvise_behavior *madv_behavior,
+		struct mm_struct *mm)
+{
+	if (madvise_batch_tlb_flush(madv_behavior->behavior))
+		tlb_gather_mmu(madv_behavior->tlb, mm);
+}
+
+static void madvise_finish_tlb(struct madvise_behavior *madv_behavior)
+{
+	if (madvise_batch_tlb_flush(madv_behavior->behavior))
+		tlb_finish_mmu(madv_behavior->tlb);
+}
+
 static bool is_valid_madvise(unsigned long start, size_t len_in, int behavior)
 {
 	size_t len;
@@ -1783,14 +1804,20 @@ static int madvise_do_behavior(struct mm_struct *mm,
 int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int behavior)
 {
 	int error;
-	struct madvise_behavior madv_behavior = {.behavior = behavior};
+	struct mmu_gather tlb;
+	struct madvise_behavior madv_behavior = {
+		.behavior = behavior,
+		.tlb = &tlb,
+	};
 
 	if (madvise_should_skip(start, len_in, behavior, &error))
 		return error;
 	error = madvise_lock(mm, behavior);
 	if (error)
 		return error;
+	madvise_init_tlb(&madv_behavior, mm);
 	error = madvise_do_behavior(mm, start, len_in, &madv_behavior);
+	madvise_finish_tlb(&madv_behavior);
 	madvise_unlock(mm, behavior);
 
 	return error;
@@ -1807,13 +1834,18 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
 {
 	ssize_t ret = 0;
 	size_t total_len;
-	struct madvise_behavior madv_behavior = {.behavior = behavior};
+	struct mmu_gather tlb;
+	struct madvise_behavior madv_behavior = {
+		.behavior = behavior,
+		.tlb = &tlb,
+	};
 
 	total_len = iov_iter_count(iter);
 
 	ret = madvise_lock(mm, behavior);
 	if (ret)
 		return ret;
+	madvise_init_tlb(&madv_behavior, mm);
 
 	while (iov_iter_count(iter)) {
 		unsigned long start = (unsigned long)iter_iov_addr(iter);
@@ -1842,14 +1874,17 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
 			}
 
 			/* Drop and reacquire lock to unwind race. */
+			madvise_finish_tlb(&madv_behavior);
 			madvise_unlock(mm, behavior);
 			madvise_lock(mm, behavior);
+			madvise_init_tlb(&madv_behavior, mm);
 			continue;
 		}
 		if (ret < 0)
 			break;
 		iov_iter_advance(iter, iter_iov_len(iter));
 	}
+	madvise_finish_tlb(&madv_behavior);
 	madvise_unlock(mm, behavior);
 
 	ret = (total_len - iov_iter_count(iter)) ? : ret;

From patchwork Thu Apr 10 00:00:21 2025
X-Patchwork-Submitter: SeongJae Park
X-Patchwork-Id: 14045659
From: SeongJae Park
To: Andrew Morton
Cc: SeongJae Park, "Liam R. Howlett", David Hildenbrand, Lorenzo Stoakes, Rik van Riel, Shakeel Butt, Vlastimil Babka, kernel-team@meta.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v3 3/4] mm/memory: split non-tlb flushing part from zap_page_range_single()
Date: Wed, 9 Apr 2025 17:00:21 -0700
Message-Id: <20250410000022.1901-4-sj@kernel.org>
In-Reply-To: <20250410000022.1901-1-sj@kernel.org>
References: <20250410000022.1901-1-sj@kernel.org>

Some zap_page_range_single() callers, such as [process_]madvise() with MADV_DONTNEED[_LOCKED], cannot batch tlb flushes because zap_page_range_single() flushes the tlb for each invocation. Split out the body of zap_page_range_single(), except for the mmu_gather object initialization and the flushing of gathered tlb entries, for such batched tlb flushing usage. To avoid hugetlb page allocation failures from concurrent page faults, though, the tlb flush should be done before the hugetlb fault lock is released. For the hugetlb vma case, do the flush and the unlock inside the split-out function, in that order. Refer to commit 2820b0f09be9 ("hugetlbfs: close race between MADV_DONTNEED and page fault") for more details about the page allocation failure problem caused by concurrent faults.
Signed-off-by: SeongJae Park Reviewed-by: Lorenzo Stoakes --- mm/memory.c | 49 +++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 39 insertions(+), 10 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index fda6d6429a27..690695643dfb 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1998,36 +1998,65 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas, mmu_notifier_invalidate_range_end(&range); } -/** - * zap_page_range_single - remove user pages in a given range +/* + * zap_page_range_single_batched - remove user pages in a given range + * @tlb: pointer to the caller's struct mmu_gather * @vma: vm_area_struct holding the applicable pages - * @address: starting address of pages to zap - * @size: number of bytes to zap + * @address: starting address of pages to remove + * @size: number of bytes to remove * @details: details of shared cache invalidation * - * The range must fit into one VMA. + * @tlb shouldn't be NULL. The range must fit into one VMA. If @vma is for + * hugetlb, @tlb is flushed and re-initialized by this function. */ -void zap_page_range_single(struct vm_area_struct *vma, unsigned long address, +static void zap_page_range_single_batched(struct mmu_gather *tlb, + struct vm_area_struct *vma, unsigned long address, unsigned long size, struct zap_details *details) { const unsigned long end = address + size; struct mmu_notifier_range range; - struct mmu_gather tlb; + + VM_WARN_ON_ONCE(!tlb || tlb->mm != vma->vm_mm); mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma->vm_mm, address, end); hugetlb_zap_begin(vma, &range.start, &range.end); - tlb_gather_mmu(&tlb, vma->vm_mm); update_hiwater_rss(vma->vm_mm); mmu_notifier_invalidate_range_start(&range); /* * unmap 'address-end' not 'range.start-range.end' as range * could have been expanded for hugetlb pmd sharing. 
 	 */
-	unmap_single_vma(&tlb, vma, address, end, details, false);
+	unmap_single_vma(tlb, vma, address, end, details, false);
 	mmu_notifier_invalidate_range_end(&range);
+	if (is_vm_hugetlb_page(vma)) {
+		/*
+		 * flush tlb and free resources before hugetlb_zap_end(), to
+		 * avoid concurrent page faults' allocation failure.
+		 */
+		tlb_finish_mmu(tlb);
+		hugetlb_zap_end(vma, details);
+		tlb_gather_mmu(tlb, vma->vm_mm);
+	}
+}
+
+/**
+ * zap_page_range_single - remove user pages in a given range
+ * @vma: vm_area_struct holding the applicable pages
+ * @address: starting address of pages to zap
+ * @size: number of bytes to zap
+ * @details: details of shared cache invalidation
+ *
+ * The range must fit into one VMA.
+ */
+void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
+		unsigned long size, struct zap_details *details)
+{
+	struct mmu_gather tlb;
+
+	tlb_gather_mmu(&tlb, vma->vm_mm);
+	zap_page_range_single_batched(&tlb, vma, address, size, details);
 	tlb_finish_mmu(&tlb);
-	hugetlb_zap_end(vma, details);
 }
 
 /**

From patchwork Thu Apr 10 00:00:22 2025
X-Patchwork-Submitter: SeongJae Park
X-Patchwork-Id: 14045660
From: SeongJae Park
To: Andrew Morton
Cc: SeongJae Park, "Liam R.Howlett", David Hildenbrand, Lorenzo Stoakes,
 Rik van Riel, Shakeel Butt, Vlastimil Babka, kernel-team@meta.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v3 4/4] mm/madvise: batch tlb flushes for MADV_DONTNEED[_LOCKED]
Date: Wed, 9 Apr 2025 17:00:22 -0700
Message-Id: <20250410000022.1901-5-sj@kernel.org>
In-Reply-To: <20250410000022.1901-1-sj@kernel.org>
References: <20250410000022.1901-1-sj@kernel.org>

MADV_DONTNEED[_LOCKED] handling for [process_]madvise() flushes the tlb
for each vma of each address range.  Update the logic to do the tlb
flushes in a batched way.  Initialize an mmu_gather object from
do_madvise() and vector_madvise(), which are the entry-level functions
for [process_]madvise(), respectively, and pass those objects to the
functions for per-vma work via the madvise_behavior struct.  Make the
per-vma logic not flush the tlb on its own, but just save the tlb
entries to the received mmu_gather object.
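The batched pattern described above — one gather object initialized by the entry point, per-vma zaps that only record work, and a single flush at the end — can be sketched in userspace. This is not kernel code: `struct mock_gather`, `zap_range()`, and `gather_finish()` are hypothetical stand-ins for `struct mmu_gather`, `zap_page_range_single_batched()`, and `tlb_finish_mmu()`, modeling only the deferred-flush bookkeeping.

```c
/* Userspace sketch of the batched-flush pattern; names are illustrative. */
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_RANGES 64

struct mock_range { unsigned long start, end; };

struct mock_gather {
	struct mock_range ranges[MOCK_MAX_RANGES];
	size_t nr_ranges;         /* ranges recorded so far */
	unsigned int flush_count; /* how many (expensive) flushes ran */
};

/* analogous to tlb_gather_mmu(): the entry point sets up one batch */
static void gather_init(struct mock_gather *g)
{
	g->nr_ranges = 0;
	g->flush_count = 0;
}

/* analogous to the batched per-vma zap: record work, do not flush */
static void zap_range(struct mock_gather *g, unsigned long start,
		      unsigned long end)
{
	assert(g->nr_ranges < MOCK_MAX_RANGES);
	g->ranges[g->nr_ranges].start = start;
	g->ranges[g->nr_ranges].end = end;
	g->nr_ranges++;
}

/* analogous to tlb_finish_mmu(): one flush for the whole user request */
static void gather_finish(struct mock_gather *g)
{
	g->flush_count++; /* a real implementation would flush the tlb here */
}
```

The point of the sketch is the invariant: no matter how many ranges are zapped between `gather_init()` and `gather_finish()`, exactly one flush happens.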
For this internal logic change, make zap_page_range_single_batched()
non-static and use it directly from madvise_dontneed_single_vma().
Finally, the entry-level functions flush the tlb entries gathered for
the entire user request at once.

Signed-off-by: SeongJae Park
Reviewed-by: Lorenzo Stoakes
---
 mm/internal.h |  3 +++
 mm/madvise.c  | 11 ++++++++---
 mm/memory.c   |  4 ++--
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index ef92e88738fe..c5f9dd007215 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -435,6 +435,9 @@ void unmap_page_range(struct mmu_gather *tlb,
 			     struct vm_area_struct *vma,
 			     unsigned long addr, unsigned long end,
 			     struct zap_details *details);
+void zap_page_range_single_batched(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, unsigned long addr,
+		unsigned long size, struct zap_details *details);
 int folio_unmap_invalidate(struct address_space *mapping, struct folio *folio,
 			   gfp_t gfp);
 
diff --git a/mm/madvise.c b/mm/madvise.c
index 951038a9f36f..8433ac9b27e0 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -851,7 +851,8 @@ static int madvise_free_single_vma(struct madvise_behavior *madv_behavior,
  * An interface that causes the system to free clean pages and flush
  * dirty pages is already available as msync(MS_INVALIDATE).
  */
-static long madvise_dontneed_single_vma(struct vm_area_struct *vma,
+static long madvise_dontneed_single_vma(struct madvise_behavior *madv_behavior,
+		struct vm_area_struct *vma,
 		unsigned long start, unsigned long end)
 {
 	struct zap_details details = {
@@ -859,7 +860,8 @@ static long madvise_dontneed_single_vma(struct vm_area_struct *vma,
 		.even_cows = true,
 	};
 
-	zap_page_range_single(vma, start, end - start, &details);
+	zap_page_range_single_batched(
+			madv_behavior->tlb, vma, start, end - start, &details);
 	return 0;
 }
 
@@ -950,7 +952,8 @@ static long madvise_dontneed_free(struct vm_area_struct *vma,
 	}
 
 	if (behavior == MADV_DONTNEED || behavior == MADV_DONTNEED_LOCKED)
-		return madvise_dontneed_single_vma(vma, start, end);
+		return madvise_dontneed_single_vma(
+				madv_behavior, vma, start, end);
 	else if (behavior == MADV_FREE)
 		return madvise_free_single_vma(madv_behavior, vma, start, end);
 	else
@@ -1628,6 +1631,8 @@ static void madvise_unlock(struct mm_struct *mm, int behavior)
 static bool madvise_batch_tlb_flush(int behavior)
 {
 	switch (behavior) {
+	case MADV_DONTNEED:
+	case MADV_DONTNEED_LOCKED:
 	case MADV_FREE:
 		return true;
 	default:
diff --git a/mm/memory.c b/mm/memory.c
index 690695643dfb..559f3e194438 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1998,7 +1998,7 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas,
 	mmu_notifier_invalidate_range_end(&range);
 }
 
-/*
+/**
  * zap_page_range_single_batched - remove user pages in a given range
  * @tlb: pointer to the caller's struct mmu_gather
  * @vma: vm_area_struct holding the applicable pages
@@ -2009,7 +2009,7 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas,
  * @tlb shouldn't be NULL. The range must fit into one VMA. If @vma is for
  * hugetlb, @tlb is flushed and re-initialized by this function.
  */
-static void zap_page_range_single_batched(struct mmu_gather *tlb,
+void zap_page_range_single_batched(struct mmu_gather *tlb,
 		struct vm_area_struct *vma, unsigned long address,
 		unsigned long size, struct zap_details *details)
 {
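For reference, the user-visible contract this series optimizes (but does not change) can be checked from userspace: after madvise(MADV_DONTNEED) on an anonymous private mapping, the range reads back as zero-filled pages. The sketch below only exercises that documented semantic; it cannot observe the kernel-internal tlb batching, and the helper name `dontneed_reads_zero` is illustrative.

```c
#define _DEFAULT_SOURCE
#include <string.h>
#include <sys/mman.h>

/* returns 1 if an anon private page reads back zeroed after MADV_DONTNEED,
 * 0 if not, -1 on syscall failure */
static int dontneed_reads_zero(void)
{
	const size_t len = 4096;
	unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	memset(p, 0xaa, len);			/* dirty the page */
	if (madvise(p, len, MADV_DONTNEED)) {
		munmap(p, len);
		return -1;
	}
	int zeroed = (p[0] == 0 && p[len - 1] == 0);
	munmap(p, len);
	return zeroed;
}
```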