From patchwork Wed Jul 12 06:01:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13309583
From: Yin Fengwei
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, yuzhao@google.com, willy@infradead.org,
	david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
Cc: fengwei.yin@intel.com
Subject: [RFC PATCH v2 2/3] mm: handle large folio when large folio in VM_LOCKED VMA range
Date: Wed, 12 Jul 2023 14:01:43 +0800
Message-Id: <20230712060144.3006358-3-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230712060144.3006358-1-fengwei.yin@intel.com>
References: <20230712060144.3006358-1-fengwei.yin@intel.com>

If a large folio lies entirely within the range of a VM_LOCKED VMA, it
should be mlocked so that page reclaim does not pick it. Otherwise
reclaim may split the large folio and then mlock each page again.
Mlock this kind of large folio to prevent it from being picked by
page reclaim.

For a large folio which crosses the boundary of a VM_LOCKED VMA, it is
better not to mlock it. Then, if the system is under memory pressure,
this kind of large folio will be split and the pages outside the
VM_LOCKED VMA can be reclaimed.

Signed-off-by: Yin Fengwei
---
 mm/internal.h | 11 ++++++++---
 mm/rmap.c     | 34 +++++++++++++++++++++++++++-------
 2 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index c7dd15d8de3ef..776141de2797a 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -643,7 +643,8 @@ static inline void mlock_vma_folio(struct folio *folio,
 	 * still be set while VM_SPECIAL bits are added: so ignore it then.
 	 */
 	if (unlikely((vma->vm_flags & (VM_LOCKED|VM_SPECIAL)) == VM_LOCKED) &&
-	    (compound || !folio_test_large(folio)))
+	    (compound || !folio_test_large(folio) ||
+	    folio_in_range(folio, vma, vma->vm_start, vma->vm_end)))
 		mlock_folio(folio);
 }
 
@@ -651,8 +652,12 @@ void munlock_folio(struct folio *folio);
 static inline void munlock_vma_folio(struct folio *folio,
 			struct vm_area_struct *vma, bool compound)
 {
-	if (unlikely(vma->vm_flags & VM_LOCKED) &&
-	    (compound || !folio_test_large(folio)))
+	/*
+	 * To handle the case that a mlocked large folio is unmapped from VMA
+	 * piece by piece, allow munlock the large folio which is partially
+	 * mapped to VMA.
+	 */
+	if (unlikely(vma->vm_flags & VM_LOCKED))
 		munlock_folio(folio);
 }
 
diff --git a/mm/rmap.c b/mm/rmap.c
index 2668f5ea35342..455f415d8d9ca 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -803,6 +803,14 @@ struct folio_referenced_arg {
 	unsigned long vm_flags;
 	struct mem_cgroup *memcg;
 };
+
+static inline bool should_restore_mlock(struct folio *folio,
+		struct vm_area_struct *vma, bool pmd_mapped)
+{
+	return !folio_test_large(folio) ||
+			pmd_mapped || folio_within_vma(folio, vma);
+}
+
 /*
  * arg: folio_referenced_arg will be passed
  */
@@ -816,13 +824,25 @@ static bool folio_referenced_one(struct folio *folio,
 	while (page_vma_mapped_walk(&pvmw)) {
 		address = pvmw.address;
 
-		if ((vma->vm_flags & VM_LOCKED) &&
-		    (!folio_test_large(folio) || !pvmw.pte)) {
-			/* Restore the mlock which got missed */
-			mlock_vma_folio(folio, vma, !pvmw.pte);
-			page_vma_mapped_walk_done(&pvmw);
-			pra->vm_flags |= VM_LOCKED;
-			return false; /* To break the loop */
+		if (vma->vm_flags & VM_LOCKED) {
+			if (should_restore_mlock(folio, vma, !pvmw.pte)) {
+				/* Restore the mlock which got missed */
+				mlock_vma_folio(folio, vma, !pvmw.pte);
+				page_vma_mapped_walk_done(&pvmw);
+				pra->vm_flags |= VM_LOCKED;
+				return false; /* To break the loop */
+			} else {
+				/*
+				 * For large folio cross VMA boundaries, it's
+				 * expected to be picked by page reclaim. But
+				 * should skip reference of pages which are in
+				 * the range of VM_LOCKED vma. As page reclaim
+				 * should just count the reference of pages out
+				 * the range of VM_LOCKED vma.
+				 */
+				pra->mapcount--;
+				continue;
+			}
 		}
 
 		if (pvmw.pte) {
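
For reviewers who want to reason about the predicate in isolation, the decision made by should_restore_mlock() can be approximated in userspace. The following is only a sketch under stated assumptions: struct model_vma, struct model_folio, MODEL_PAGE_SIZE and every model_* helper are hypothetical stand-ins invented for illustration, not kernel types or APIs, and the range check is a simplified guess at what folio_within_vma() means (the folio's whole mapped range lies inside [vm_start, vm_end)).

```c
#include <stdbool.h>

/* Hypothetical page size for the model; the kernel uses PAGE_SIZE. */
#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical stand-in for the few VMA fields the predicate needs. */
struct model_vma {
	unsigned long vm_start;	/* first address of the VMA (inclusive) */
	unsigned long vm_end;	/* end address of the VMA (exclusive) */
};

/* Hypothetical stand-in for a folio mapped at a user address. */
struct model_folio {
	unsigned long start;	/* address where the folio's first page maps */
	unsigned long nr_pages;	/* 1 for a small folio, more for a large one */
};

/* Models folio_test_large(): more than one page. */
static bool model_folio_test_large(const struct model_folio *folio)
{
	return folio->nr_pages > 1;
}

/*
 * Assumed meaning of folio_within_vma(): every page of the folio is
 * mapped inside the VMA's address range.
 */
static bool model_folio_within_vma(const struct model_folio *folio,
				   const struct model_vma *vma)
{
	unsigned long end = folio->start + folio->nr_pages * MODEL_PAGE_SIZE;

	return folio->start >= vma->vm_start && end <= vma->vm_end;
}

/*
 * Mirrors the shape of should_restore_mlock() in the patch: restore the
 * mlock for small folios, PMD-mapped folios, and large folios that sit
 * fully inside the VMA. A large folio crossing the boundary is left
 * for page reclaim to split.
 */
static bool model_should_restore_mlock(const struct model_folio *folio,
				       const struct model_vma *vma,
				       bool pmd_mapped)
{
	return !model_folio_test_large(folio) ||
	       pmd_mapped || model_folio_within_vma(folio, vma);
}
```

In this model, a 4-page folio that starts two pages before vm_end fails the predicate unless it is PMD-mapped, which corresponds to the new else branch in folio_referenced_one() that decrements pra->mapcount and continues the walk.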