From patchwork Mon Mar 6 09:22:55 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13160757
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org,
    mike.kravetz@oracle.com, sidhartha.kumar@oracle.com, naoya.horiguchi@nec.com,
    jane.chu@oracle.com, david@redhat.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v3 1/5] rmap: move hugetlb try_to_unmap to dedicated function
Date: Mon, 6 Mar 2023 17:22:55 +0800
Message-Id: <20230306092259.3507807-2-fengwei.yin@intel.com>
In-Reply-To: <20230306092259.3507807-1-fengwei.yin@intel.com>
References: <20230306092259.3507807-1-fengwei.yin@intel.com>

This prepares for batched rmap updates of large folios. There is no need to
handle hugetlb inside the pte walk loop: handle the hugetlb case in a
dedicated function and bail out of the walk early.
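For orientation, the resulting loop in try_to_unmap_one() ends up shaped roughly like the sketch below. This is a simplified illustration, not the patch itself; locking and the non-hugetlb details are elided (see the full diff that follows).

	while (page_vma_mapped_walk(&pvmw)) {
		address = pvmw.address;

		if (folio_test_hugetlb(folio)) {
			/* One dedicated call handles the whole hugetlb folio. */
			ret = try_to_unmap_one_hugetlb(folio, vma, range,
						       pvmw, address, flags);

			/* No need to loop for hugetlb: end the walk early. */
			page_vma_mapped_walk_done(&pvmw);
			break;
		}

		/* Non-hugetlb folios continue with the per-pte unmap path. */
	}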
Signed-off-by: Yin Fengwei Reviewed-by: Mike Kravetz --- mm/rmap.c | 200 +++++++++++++++++++++++++++++++++--------------------- 1 file changed, 121 insertions(+), 79 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index ba901c416785..508d141dacc5 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1441,6 +1441,103 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, munlock_vma_folio(folio, vma, compound); } +static bool try_to_unmap_one_hugetlb(struct folio *folio, + struct vm_area_struct *vma, struct mmu_notifier_range range, + struct page_vma_mapped_walk pvmw, unsigned long address, + enum ttu_flags flags) +{ + struct mm_struct *mm = vma->vm_mm; + pte_t pteval; + bool ret = true, anon = folio_test_anon(folio); + + /* + * The try_to_unmap() is only passed a hugetlb page + * in the case where the hugetlb page is poisoned. + */ + VM_BUG_ON_FOLIO(!folio_test_hwpoison(folio), folio); + /* + * huge_pmd_unshare may unmap an entire PMD page. + * There is no way of knowing exactly which PMDs may + * be cached for this mm, so we must flush them all. + * start/end were already adjusted above to cover this + * range. + */ + flush_cache_range(vma, range.start, range.end); + + /* + * To call huge_pmd_unshare, i_mmap_rwsem must be + * held in write mode. Caller needs to explicitly + * do this outside rmap routines. + * + * We also must hold hugetlb vma_lock in write mode. + * Lock order dictates acquiring vma_lock BEFORE + * i_mmap_rwsem. We can only try lock here and fail + * if unsuccessful. + */ + if (!anon) { + VM_BUG_ON(!(flags & TTU_RMAP_LOCKED)); + if (!hugetlb_vma_trylock_write(vma)) { + ret = false; + goto out; + } + if (huge_pmd_unshare(mm, vma, address, pvmw.pte)) { + hugetlb_vma_unlock_write(vma); + flush_tlb_range(vma, + range.start, range.end); + mmu_notifier_invalidate_range(mm, + range.start, range.end); + /* + * The ref count of the PMD page was + * dropped which is part of the way map + * counting is done for shared PMDs. + * Return 'true' here. When there is + * no other sharing, huge_pmd_unshare + * returns false and we will unmap the + * actual page and drop map count + * to zero. + */ + goto out; + } + hugetlb_vma_unlock_write(vma); + } + pteval = huge_ptep_clear_flush(vma, address, pvmw.pte); + + /* + * Now the pte is cleared. If this pte was uffd-wp armed, + * we may want to replace a none pte with a marker pte if + * it's file-backed, so we don't lose the tracking info. + */ + pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); + + /* Set the dirty flag on the folio now the pte is gone. */ + if (pte_dirty(pteval)) + folio_mark_dirty(folio); + + /* Update high watermark before we lower rss */ + update_hiwater_rss(mm); + + /* Poisoned hugetlb folio with TTU_HWPOISON always cleared in flags */ + pteval = swp_entry_to_pte(make_hwpoison_entry(&folio->page)); + set_huge_pte_at(mm, address, pvmw.pte, pteval); + hugetlb_count_sub(folio_nr_pages(folio), mm); + + /* + * No need to call mmu_notifier_invalidate_range() it has be + * done above for all cases requiring it to happen under page + * table lock before mmu_notifier_invalidate_range_end() + * + * See Documentation/mm/mmu_notifier.rst + */ + page_remove_rmap(&folio->page, vma, folio_test_hugetlb(folio)); + /* No VM_LOCKED set in vma->vm_flags for hugetlb. So not + * necessary to call mlock_drain_local(). 
+ */ + folio_put(folio); + +out: + return ret; +} + /* * @arg: enum ttu_flags will be passed to this argument */ @@ -1504,86 +1601,37 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, break; } + address = pvmw.address; + if (folio_test_hugetlb(folio)) { + ret = try_to_unmap_one_hugetlb(folio, vma, range, + pvmw, address, flags); + + /* no need to loop for hugetlb */ + page_vma_mapped_walk_done(&pvmw); + break; + } + subpage = folio_page(folio, pte_pfn(*pvmw.pte) - folio_pfn(folio)); - address = pvmw.address; anon_exclusive = folio_test_anon(folio) && PageAnonExclusive(subpage); - if (folio_test_hugetlb(folio)) { - bool anon = folio_test_anon(folio); - - /* - * The try_to_unmap() is only passed a hugetlb page - * in the case where the hugetlb page is poisoned. - */ - VM_BUG_ON_PAGE(!PageHWPoison(subpage), subpage); + flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); + /* Nuke the page table entry. */ + if (should_defer_flush(mm, flags)) { /* - * huge_pmd_unshare may unmap an entire PMD page. - * There is no way of knowing exactly which PMDs may - * be cached for this mm, so we must flush them all. - * start/end were already adjusted above to cover this - * range. + * We clear the PTE but do not flush so potentially + * a remote CPU could still be writing to the folio. + * If the entry was previously clean then the + * architecture must guarantee that a clear->dirty + * transition on a cached TLB entry is written through + * and traps if the PTE is unmapped. */ - flush_cache_range(vma, range.start, range.end); + pteval = ptep_get_and_clear(mm, address, pvmw.pte); - /* - * To call huge_pmd_unshare, i_mmap_rwsem must be - * held in write mode. Caller needs to explicitly - * do this outside rmap routines. - * - * We also must hold hugetlb vma_lock in write mode. - * Lock order dictates acquiring vma_lock BEFORE - * i_mmap_rwsem. We can only try lock here and fail - * if unsuccessful. - */ - if (!anon) { - VM_BUG_ON(!(flags & TTU_RMAP_LOCKED)); - if (!hugetlb_vma_trylock_write(vma)) { - page_vma_mapped_walk_done(&pvmw); - ret = false; - break; - } - if (huge_pmd_unshare(mm, vma, address, pvmw.pte)) { - hugetlb_vma_unlock_write(vma); - flush_tlb_range(vma, - range.start, range.end); - mmu_notifier_invalidate_range(mm, - range.start, range.end); - /* - * The ref count of the PMD page was - * dropped which is part of the way map - * counting is done for shared PMDs. - * Return 'true' here. When there is - * no other sharing, huge_pmd_unshare - * returns false and we will unmap the - * actual page and drop map count - * to zero. - */ - page_vma_mapped_walk_done(&pvmw); - break; - } - hugetlb_vma_unlock_write(vma); - } - pteval = huge_ptep_clear_flush(vma, address, pvmw.pte); + set_tlb_ubc_flush_pending(mm, pte_dirty(pteval)); } else { - flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); - /* Nuke the page table entry. */ - if (should_defer_flush(mm, flags)) { - /* - * We clear the PTE but do not flush so potentially - * a remote CPU could still be writing to the folio. - * If the entry was previously clean then the - * architecture must guarantee that a clear->dirty - * transition on a cached TLB entry is written through - * and traps if the PTE is unmapped. 
- */ pteval = ptep_get_and_clear(mm, address, pvmw.pte); - - set_tlb_ubc_flush_pending(mm, pte_dirty(pteval)); - } else { - pteval = ptep_clear_flush(vma, address, pvmw.pte); - } + pteval = ptep_clear_flush(vma, address, pvmw.pte); } /* @@ -1602,14 +1650,8 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, if (PageHWPoison(subpage) && (flags & TTU_HWPOISON)) { pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); - if (folio_test_hugetlb(folio)) { - hugetlb_count_sub(folio_nr_pages(folio), mm); - set_huge_pte_at(mm, address, pvmw.pte, pteval); - } else { - dec_mm_counter(mm, mm_counter(&folio->page)); - set_pte_at(mm, address, pvmw.pte, pteval); - } - + dec_mm_counter(mm, mm_counter(&folio->page)); + set_pte_at(mm, address, pvmw.pte, pteval); } else if (pte_unused(pteval) && !userfaultfd_armed(vma)) { /* * The guest indicated that the page content is of no

From patchwork Mon Mar 6 09:22:56 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13160758
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org,
    mike.kravetz@oracle.com, sidhartha.kumar@oracle.com, naoya.horiguchi@nec.com,
    jane.chu@oracle.com, david@redhat.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v3 2/5] rmap: move page unmap operation to dedicated function
Date: Mon, 6 Mar 2023 17:22:56 +0800
Message-Id: <20230306092259.3507807-3-fengwei.yin@intel.com>
In-Reply-To: <20230306092259.3507807-1-fengwei.yin@intel.com>
References: <20230306092259.3507807-1-fengwei.yin@intel.com>

No functional change; this only moves the per-pte unmap work out of
try_to_unmap_one() into a dedicated function, try_to_unmap_one_page().
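As a reading aid, a simplified sketch of the reorganized loop body is shown below. It is an illustration only; error handling and the mmu_notifier range setup are elided (see the diff that follows).

	while (page_vma_mapped_walk(&pvmw)) {
		address = pvmw.address;
		subpage = folio_page(folio, pte_pfn(*pvmw.pte) - folio_pfn(folio));

		/* All of the formerly inline pte-clearing/swap-entry logic. */
		ret = try_to_unmap_one_page(folio, vma, range, pvmw,
					    address, flags);
		if (!ret)
			break;

		page_remove_rmap(subpage, vma, folio_test_hugetlb(folio));
		if (vma->vm_flags & VM_LOCKED)
			mlock_drain_local();
		folio_put(folio);
	}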
Signed-off-by: Yin Fengwei --- mm/rmap.c | 369 ++++++++++++++++++++++++++++-------------------------- 1 file changed, 194 insertions(+), 175 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 508d141dacc5..013643122d0c 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1538,17 +1538,204 @@ static bool try_to_unmap_one_hugetlb(struct folio *folio, return ret; } +static bool try_to_unmap_one_page(struct folio *folio, + struct vm_area_struct *vma, struct mmu_notifier_range range, + struct page_vma_mapped_walk pvmw, unsigned long address, + enum ttu_flags flags) +{ + bool anon_exclusive, ret = true; + struct page *subpage; + struct mm_struct *mm = vma->vm_mm; + pte_t pteval; + + subpage = folio_page(folio, + pte_pfn(*pvmw.pte) - folio_pfn(folio)); + anon_exclusive = folio_test_anon(folio) && + PageAnonExclusive(subpage); + + flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); + /* Nuke the page table entry. */ + if (should_defer_flush(mm, flags)) { + /* + * We clear the PTE but do not flush so potentially + * a remote CPU could still be writing to the folio. + * If the entry was previously clean then the + * architecture must guarantee that a clear->dirty + * transition on a cached TLB entry is written through + * and traps if the PTE is unmapped. + */ + pteval = ptep_get_and_clear(mm, address, pvmw.pte); + + set_tlb_ubc_flush_pending(mm, pte_dirty(pteval)); + } else { + pteval = ptep_clear_flush(vma, address, pvmw.pte); + } + + /* + * Now the pte is cleared. If this pte was uffd-wp armed, + * we may want to replace a none pte with a marker pte if + * it's file-backed, so we don't lose the tracking info. + */ + pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); + + /* Set the dirty flag on the folio now the pte is gone. */ + if (pte_dirty(pteval)) + folio_mark_dirty(folio); + + /* Update high watermark before we lower rss */ + update_hiwater_rss(mm); + + if (PageHWPoison(subpage) && !(flags & TTU_HWPOISON)) { + pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); + dec_mm_counter(mm, mm_counter(&folio->page)); + set_pte_at(mm, address, pvmw.pte, pteval); + } else if (pte_unused(pteval) && !userfaultfd_armed(vma)) { + /* + * The guest indicated that the page content is of no + * interest anymore. Simply discard the pte, vmscan + * will take care of the rest. + * A future reference will then fault in a new zero + * page. When userfaultfd is active, we must not drop + * this page though, as its main user (postcopy + * migration) will not expect userfaults on already + * copied pages. + */ + dec_mm_counter(mm, mm_counter(&folio->page)); + /* We have to invalidate as we cleared the pte */ + mmu_notifier_invalidate_range(mm, address, + address + PAGE_SIZE); + } else if (folio_test_anon(folio)) { + swp_entry_t entry = { .val = page_private(subpage) }; + pte_t swp_pte; + /* + * Store the swap location in the pte. + * See handle_pte_fault() ... 
+ */ + if (unlikely(folio_test_swapbacked(folio) != + folio_test_swapcache(folio))) { + WARN_ON_ONCE(1); + ret = false; + /* We have to invalidate as we cleared the pte */ + mmu_notifier_invalidate_range(mm, address, + address + PAGE_SIZE); + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + + /* MADV_FREE page check */ + if (!folio_test_swapbacked(folio)) { + int ref_count, map_count; + + /* + * Synchronize with gup_pte_range(): + * - clear PTE; barrier; read refcount + * - inc refcount; barrier; read PTE + */ + smp_mb(); + + ref_count = folio_ref_count(folio); + map_count = folio_mapcount(folio); + + /* + * Order reads for page refcount and dirty flag + * (see comments in __remove_mapping()). + */ + smp_rmb(); + + /* + * The only page refs must be one from isolation + * plus the rmap(s) (dropped by discard:). + */ + if (ref_count == 1 + map_count && + !folio_test_dirty(folio)) { + /* Invalidate as we cleared the pte */ + mmu_notifier_invalidate_range(mm, + address, address + PAGE_SIZE); + dec_mm_counter(mm, MM_ANONPAGES); + goto discard; + } + + /* + * If the folio was redirtied, it cannot be + * discarded. Remap the page to page table. + */ + set_pte_at(mm, address, pvmw.pte, pteval); + folio_set_swapbacked(folio); + ret = false; + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + + if (swap_duplicate(entry) < 0) { + set_pte_at(mm, address, pvmw.pte, pteval); + ret = false; + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + if (arch_unmap_one(mm, vma, address, pteval) < 0) { + swap_free(entry); + set_pte_at(mm, address, pvmw.pte, pteval); + ret = false; + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + + /* See page_try_share_anon_rmap(): clear PTE first. */ + if (anon_exclusive && + page_try_share_anon_rmap(subpage)) { + swap_free(entry); + set_pte_at(mm, address, pvmw.pte, pteval); + ret = false; + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + if (list_empty(&mm->mmlist)) { + spin_lock(&mmlist_lock); + if (list_empty(&mm->mmlist)) + list_add(&mm->mmlist, &init_mm.mmlist); + spin_unlock(&mmlist_lock); + } + dec_mm_counter(mm, MM_ANONPAGES); + inc_mm_counter(mm, MM_SWAPENTS); + swp_pte = swp_entry_to_pte(entry); + if (anon_exclusive) + swp_pte = pte_swp_mkexclusive(swp_pte); + if (pte_soft_dirty(pteval)) + swp_pte = pte_swp_mksoft_dirty(swp_pte); + if (pte_uffd_wp(pteval)) + swp_pte = pte_swp_mkuffd_wp(swp_pte); + set_pte_at(mm, address, pvmw.pte, swp_pte); + /* Invalidate as we cleared the pte */ + mmu_notifier_invalidate_range(mm, address, + address + PAGE_SIZE); + } else { + /* + * This is a locked file-backed folio, + * so it cannot be removed from the page + * cache and replaced by a new folio before + * mmu_notifier_invalidate_range_end, so no + * concurrent thread might update its page table + * to point at a new folio while a device is + * still using this folio. 
+ * + * See Documentation/mm/mmu_notifier.rst + */ + dec_mm_counter(mm, mm_counter_file(&folio->page)); + } + +discard: + return ret; +} + /* * @arg: enum ttu_flags will be passed to this argument */ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, unsigned long address, void *arg) { - struct mm_struct *mm = vma->vm_mm; DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0); - pte_t pteval; struct page *subpage; - bool anon_exclusive, ret = true; + bool ret = true; struct mmu_notifier_range range; enum ttu_flags flags = (enum ttu_flags)(long)arg; @@ -1613,179 +1800,11 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, subpage = folio_page(folio, pte_pfn(*pvmw.pte) - folio_pfn(folio)); - anon_exclusive = folio_test_anon(folio) && - PageAnonExclusive(subpage); - - flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); - /* Nuke the page table entry. */ - if (should_defer_flush(mm, flags)) { - /* - * We clear the PTE but do not flush so potentially - * a remote CPU could still be writing to the folio. - * If the entry was previously clean then the - * architecture must guarantee that a clear->dirty - * transition on a cached TLB entry is written through - * and traps if the PTE is unmapped. - */ - pteval = ptep_get_and_clear(mm, address, pvmw.pte); - - set_tlb_ubc_flush_pending(mm, pte_dirty(pteval)); - } else { - pteval = ptep_clear_flush(vma, address, pvmw.pte); - } - - /* - * Now the pte is cleared. If this pte was uffd-wp armed, - * we may want to replace a none pte with a marker pte if - * it's file-backed, so we don't lose the tracking info. - */ - pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); - - /* Set the dirty flag on the folio now the pte is gone. */ - if (pte_dirty(pteval)) - folio_mark_dirty(folio); - - /* Update high watermark before we lower rss */ - update_hiwater_rss(mm); - - if (PageHWPoison(subpage) && (flags & TTU_HWPOISON)) { - pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); - dec_mm_counter(mm, mm_counter(&folio->page)); - set_pte_at(mm, address, pvmw.pte, pteval); - } else if (pte_unused(pteval) && !userfaultfd_armed(vma)) { - /* - * The guest indicated that the page content is of no - * interest anymore. Simply discard the pte, vmscan - * will take care of the rest. - * A future reference will then fault in a new zero - * page. When userfaultfd is active, we must not drop - * this page though, as its main user (postcopy - * migration) will not expect userfaults on already - * copied pages. - */ - dec_mm_counter(mm, mm_counter(&folio->page)); - /* We have to invalidate as we cleared the pte */ - mmu_notifier_invalidate_range(mm, address, - address + PAGE_SIZE); - } else if (folio_test_anon(folio)) { - swp_entry_t entry = { .val = page_private(subpage) }; - pte_t swp_pte; - /* - * Store the swap location in the pte. - * See handle_pte_fault() ... 
- */ - if (unlikely(folio_test_swapbacked(folio) != - folio_test_swapcache(folio))) { - WARN_ON_ONCE(1); - ret = false; - /* We have to invalidate as we cleared the pte */ - mmu_notifier_invalidate_range(mm, address, - address + PAGE_SIZE); - page_vma_mapped_walk_done(&pvmw); - break; - } - - /* MADV_FREE page check */ - if (!folio_test_swapbacked(folio)) { - int ref_count, map_count; - - /* - * Synchronize with gup_pte_range(): - * - clear PTE; barrier; read refcount - * - inc refcount; barrier; read PTE - */ - smp_mb(); - - ref_count = folio_ref_count(folio); - map_count = folio_mapcount(folio); - - /* - * Order reads for page refcount and dirty flag - * (see comments in __remove_mapping()). - */ - smp_rmb(); - - /* - * The only page refs must be one from isolation - * plus the rmap(s) (dropped by discard:). - */ - if (ref_count == 1 + map_count && - !folio_test_dirty(folio)) { - /* Invalidate as we cleared the pte */ - mmu_notifier_invalidate_range(mm, - address, address + PAGE_SIZE); - dec_mm_counter(mm, MM_ANONPAGES); - goto discard; - } - - /* - * If the folio was redirtied, it cannot be - * discarded. Remap the page to page table. - */ - set_pte_at(mm, address, pvmw.pte, pteval); - folio_set_swapbacked(folio); - ret = false; - page_vma_mapped_walk_done(&pvmw); - break; - } - - if (swap_duplicate(entry) < 0) { - set_pte_at(mm, address, pvmw.pte, pteval); - ret = false; - page_vma_mapped_walk_done(&pvmw); - break; - } - if (arch_unmap_one(mm, vma, address, pteval) < 0) { - swap_free(entry); - set_pte_at(mm, address, pvmw.pte, pteval); - ret = false; - page_vma_mapped_walk_done(&pvmw); - break; - } + ret = try_to_unmap_one_page(folio, vma, + range, pvmw, address, flags); + if (!ret) + break; - /* See page_try_share_anon_rmap(): clear PTE first. */ - if (anon_exclusive && - page_try_share_anon_rmap(subpage)) { - swap_free(entry); - set_pte_at(mm, address, pvmw.pte, pteval); - ret = false; - page_vma_mapped_walk_done(&pvmw); - break; - } - if (list_empty(&mm->mmlist)) { - spin_lock(&mmlist_lock); - if (list_empty(&mm->mmlist)) - list_add(&mm->mmlist, &init_mm.mmlist); - spin_unlock(&mmlist_lock); - } - dec_mm_counter(mm, MM_ANONPAGES); - inc_mm_counter(mm, MM_SWAPENTS); - swp_pte = swp_entry_to_pte(entry); - if (anon_exclusive) - swp_pte = pte_swp_mkexclusive(swp_pte); - if (pte_soft_dirty(pteval)) - swp_pte = pte_swp_mksoft_dirty(swp_pte); - if (pte_uffd_wp(pteval)) - swp_pte = pte_swp_mkuffd_wp(swp_pte); - set_pte_at(mm, address, pvmw.pte, swp_pte); - /* Invalidate as we cleared the pte */ - mmu_notifier_invalidate_range(mm, address, - address + PAGE_SIZE); - } else { - /* - * This is a locked file-backed folio, - * so it cannot be removed from the page - * cache and replaced by a new folio before - * mmu_notifier_invalidate_range_end, so no - * concurrent thread might update its page table - * to point at a new folio while a device is - * still using this folio. 
- * - * See Documentation/mm/mmu_notifier.rst - */ - dec_mm_counter(mm, mm_counter_file(&folio->page)); - } -discard: /* * No need to call mmu_notifier_invalidate_range() it has be * done above for all cases requiring it to happen under page

From patchwork Mon Mar 6 09:22:57 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13160759
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org,
    mike.kravetz@oracle.com, sidhartha.kumar@oracle.com, naoya.horiguchi@nec.com,
    jane.chu@oracle.com, david@redhat.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v3 3/5] rmap: cleanup exit path of try_to_unmap_one_page()
Date: Mon, 6 Mar 2023 17:22:57 +0800
Message-Id: <20230306092259.3507807-4-fengwei.yin@intel.com>
In-Reply-To: <20230306092259.3507807-1-fengwei.yin@intel.com>
References: <20230306092259.3507807-1-fengwei.yin@intel.com>

Clean up the exit path of try_to_unmap_one_page() by removing duplicated
code. Move page_vma_mapped_walk_done() back to try_to_unmap_one(). Rename
subpage to page, as a folio has no concept of subpage.
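Condensed from the diff below, the exit paths of try_to_unmap_one_page() now funnel into shared labels instead of repeating the restore/unwind sequence at every failure site. This is a sketch only; the surrounding logic is elided.

	if (swap_duplicate(entry) < 0)
		goto exit_restore_pte;

	if (arch_unmap_one(mm, vma, address, pteval) < 0) {
		swap_free(entry);
		goto exit_restore_pte;
	}
	/* ... further failure cases use the same labels ... */

	discard:
		/* Success: account the unmapped page once, in one place. */
		dec_mm_counter(vma->vm_mm, mm_counter(&folio->page));
		return true;

	exit_restore_pte:
		/* Failure after the pte was cleared: put the original pte back. */
		set_pte_at(mm, address, pvmw.pte, pteval);
	exit:
		return false;

The caller, try_to_unmap_one(), now calls page_vma_mapped_walk_done() itself when the helper returns false.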
Signed-off-by: Yin Fengwei --- mm/rmap.c | 74 ++++++++++++++++++++++--------------------------------- 1 file changed, 30 insertions(+), 44 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 013643122d0c..836cfc13cf9d 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1528,7 +1528,7 @@ static bool try_to_unmap_one_hugetlb(struct folio *folio, * * See Documentation/mm/mmu_notifier.rst */ - page_remove_rmap(&folio->page, vma, folio_test_hugetlb(folio)); + page_remove_rmap(&folio->page, vma, true); /* No VM_LOCKED set in vma->vm_flags for hugetlb. So not * necessary to call mlock_drain_local(). */ @@ -1543,15 +1543,13 @@ static bool try_to_unmap_one_page(struct folio *folio, struct page_vma_mapped_walk pvmw, unsigned long address, enum ttu_flags flags) { - bool anon_exclusive, ret = true; - struct page *subpage; + bool anon_exclusive; + struct page *page; struct mm_struct *mm = vma->vm_mm; pte_t pteval; - subpage = folio_page(folio, - pte_pfn(*pvmw.pte) - folio_pfn(folio)); - anon_exclusive = folio_test_anon(folio) && - PageAnonExclusive(subpage); + page = folio_page(folio, pte_pfn(*pvmw.pte) - folio_pfn(folio)); + anon_exclusive = folio_test_anon(folio) && PageAnonExclusive(page); flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); /* Nuke the page table entry. */ @@ -1579,15 +1577,14 @@ static bool try_to_unmap_one_page(struct folio *folio, pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); /* Set the dirty flag on the folio now the pte is gone. */ - if (pte_dirty(pteval)) + if (pte_dirty(pteval) && !folio_test_dirty(folio)) folio_mark_dirty(folio); /* Update high watermark before we lower rss */ update_hiwater_rss(mm); - if (PageHWPoison(subpage) && !(flags & TTU_HWPOISON)) { - pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); - dec_mm_counter(mm, mm_counter(&folio->page)); + if (PageHWPoison(page) && !(flags & TTU_HWPOISON)) { + pteval = swp_entry_to_pte(make_hwpoison_entry(page)); set_pte_at(mm, address, pvmw.pte, pteval); } else if (pte_unused(pteval) && !userfaultfd_armed(vma)) { /* @@ -1600,12 +1597,11 @@ static bool try_to_unmap_one_page(struct folio *folio, * migration) will not expect userfaults on already * copied pages. */ - dec_mm_counter(mm, mm_counter(&folio->page)); /* We have to invalidate as we cleared the pte */ mmu_notifier_invalidate_range(mm, address, address + PAGE_SIZE); } else if (folio_test_anon(folio)) { - swp_entry_t entry = { .val = page_private(subpage) }; + swp_entry_t entry = { .val = page_private(page) }; pte_t swp_pte; /* * Store the swap location in the pte. @@ -1614,12 +1610,10 @@ static bool try_to_unmap_one_page(struct folio *folio, if (unlikely(folio_test_swapbacked(folio) != folio_test_swapcache(folio))) { WARN_ON_ONCE(1); - ret = false; /* We have to invalidate as we cleared the pte */ mmu_notifier_invalidate_range(mm, address, address + PAGE_SIZE); - page_vma_mapped_walk_done(&pvmw); - goto discard; + goto exit; } /* MADV_FREE page check */ @@ -1651,7 +1645,6 @@ static bool try_to_unmap_one_page(struct folio *folio, /* Invalidate as we cleared the pte */ mmu_notifier_invalidate_range(mm, address, address + PAGE_SIZE); - dec_mm_counter(mm, MM_ANONPAGES); goto discard; } @@ -1659,43 +1652,30 @@ static bool try_to_unmap_one_page(struct folio *folio, * If the folio was redirtied, it cannot be * discarded. Remap the page to page table. 
*/ - set_pte_at(mm, address, pvmw.pte, pteval); folio_set_swapbacked(folio); - ret = false; - page_vma_mapped_walk_done(&pvmw); - goto discard; + goto exit_restore_pte; } - if (swap_duplicate(entry) < 0) { - set_pte_at(mm, address, pvmw.pte, pteval); - ret = false; - page_vma_mapped_walk_done(&pvmw); - goto discard; - } + if (swap_duplicate(entry) < 0) + goto exit_restore_pte; + if (arch_unmap_one(mm, vma, address, pteval) < 0) { swap_free(entry); - set_pte_at(mm, address, pvmw.pte, pteval); - ret = false; - page_vma_mapped_walk_done(&pvmw); - goto discard; + goto exit_restore_pte; } /* See page_try_share_anon_rmap(): clear PTE first. */ - if (anon_exclusive && - page_try_share_anon_rmap(subpage)) { + if (anon_exclusive && page_try_share_anon_rmap(page)) { swap_free(entry); - set_pte_at(mm, address, pvmw.pte, pteval); - ret = false; - page_vma_mapped_walk_done(&pvmw); - goto discard; + goto exit_restore_pte; } + if (list_empty(&mm->mmlist)) { spin_lock(&mmlist_lock); if (list_empty(&mm->mmlist)) list_add(&mm->mmlist, &init_mm.mmlist); spin_unlock(&mmlist_lock); } - dec_mm_counter(mm, MM_ANONPAGES); inc_mm_counter(mm, MM_SWAPENTS); swp_pte = swp_entry_to_pte(entry); if (anon_exclusive) @@ -1706,8 +1686,7 @@ static bool try_to_unmap_one_page(struct folio *folio, swp_pte = pte_swp_mkuffd_wp(swp_pte); set_pte_at(mm, address, pvmw.pte, swp_pte); /* Invalidate as we cleared the pte */ - mmu_notifier_invalidate_range(mm, address, - address + PAGE_SIZE); + mmu_notifier_invalidate_range(mm, address, address + PAGE_SIZE); } else { /* * This is a locked file-backed folio, @@ -1720,11 +1699,16 @@ static bool try_to_unmap_one_page(struct folio *folio, * * See Documentation/mm/mmu_notifier.rst */ - dec_mm_counter(mm, mm_counter_file(&folio->page)); } discard: - return ret; + dec_mm_counter(vma->vm_mm, mm_counter(&folio->page)); + return true; + +exit_restore_pte: + set_pte_at(mm, address, pvmw.pte, pteval); +exit: + return false; } /* @@ -1802,8 +1786,10 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, pte_pfn(*pvmw.pte) - folio_pfn(folio)); ret = try_to_unmap_one_page(folio, vma, range, pvmw, address, flags); - if (!ret) + if (!ret) { + page_vma_mapped_walk_done(&pvmw); break; + } /* * No need to call mmu_notifier_invalidate_range() it has be @@ -1812,7 +1798,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, * * See Documentation/mm/mmu_notifier.rst */ - page_remove_rmap(subpage, vma, folio_test_hugetlb(folio)); + page_remove_rmap(subpage, vma, false); if (vma->vm_flags & VM_LOCKED) mlock_drain_local(); folio_put(folio); From patchwork Mon Mar 6 09:22:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yin Fengwei X-Patchwork-Id: 13160760 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC52AC6FD19 for ; Mon, 6 Mar 2023 09:22:39 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 48A756B007B; Mon, 6 Mar 2023 04:22:39 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 461A96B007D; Mon, 6 Mar 2023 04:22:39 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3298F6B007E; Mon, 6 Mar 2023 04:22:39 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com 
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org,
    mike.kravetz@oracle.com, sidhartha.kumar@oracle.com, naoya.horiguchi@nec.com,
    jane.chu@oracle.com, david@redhat.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v3 4/5] rmap: add folio_remove_rmap_range()
Date: Mon, 6 Mar 2023 17:22:58 +0800
Message-Id: <20230306092259.3507807-5-fengwei.yin@intel.com>
In-Reply-To: <20230306092259.3507807-1-fengwei.yin@intel.com>
References: <20230306092259.3507807-1-fengwei.yin@intel.com>

folio_remove_rmap_range() allows taking down the pte mappings of a specific
range of pages within a folio. Compared with page_remove_rmap(), it batches
the __lruvec_stat updates for large folios.

Signed-off-by: Yin Fengwei
---
 include/linux/rmap.h |  4 +++
 mm/rmap.c            | 58 +++++++++++++++++++++++++++++++++-----------
 2 files changed, 48 insertions(+), 14 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index b87d01660412..d2569b42e21a 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -200,6 +200,10 @@ void page_add_file_rmap(struct page *, struct vm_area_struct *,
 		bool compound);
 void page_remove_rmap(struct page *, struct vm_area_struct *,
 		bool compound);
+void folio_remove_rmap_range(struct folio *, struct page *,
+		unsigned int nr_pages, struct vm_area_struct *,
+		bool compound);
+
 void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *,
 		unsigned long address, rmap_t flags);

diff --git a/mm/rmap.c b/mm/rmap.c
index 836cfc13cf9d..bb3fcb8df579 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1355,23 +1355,25 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
 }
 
 /**
- * page_remove_rmap - take down pte mapping from a page
- * @page: page to remove mapping from
+ * folio_remove_rmap_range - take down pte mapping from a range of pages
+ * @folio: folio to remove mapping from
+ * @page: The first page to take down pte mapping
+ * @nr_pages: The number of pages which will be take down pte mapping
 * @vma: the vm area from which the mapping is removed
 * @compound: uncharge the page as compound or small page
 *
 * The caller needs to hold the pte lock.
*/ -void page_remove_rmap(struct page *page, struct vm_area_struct *vma, - bool compound) +void folio_remove_rmap_range(struct folio *folio, struct page *page, + unsigned int nr_pages, struct vm_area_struct *vma, + bool compound) { - struct folio *folio = page_folio(page); atomic_t *mapped = &folio->_nr_pages_mapped; - int nr = 0, nr_pmdmapped = 0; - bool last; + int nr = 0, nr_pmdmapped = 0, last; enum node_stat_item idx; - VM_BUG_ON_PAGE(compound && !PageHead(page), page); + VM_BUG_ON_FOLIO(compound && (nr_pages != folio_nr_pages(folio)), folio); + VM_BUG_ON_FOLIO(compound && (page != &folio->page), folio); /* Hugetlb pages are not counted in NR_*MAPPED */ if (unlikely(folio_test_hugetlb(folio))) { @@ -1382,12 +1384,16 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, /* Is page being unmapped by PTE? Is this its last map to be removed? */ if (likely(!compound)) { - last = atomic_add_negative(-1, &page->_mapcount); - nr = last; - if (last && folio_test_large(folio)) { - nr = atomic_dec_return_relaxed(mapped); - nr = (nr < COMPOUND_MAPPED); - } + do { + last = atomic_add_negative(-1, &page->_mapcount); + if (last && folio_test_large(folio)) { + last = atomic_dec_return_relaxed(mapped); + last = (last < COMPOUND_MAPPED); + } + + if (last) + nr++; + } while (page++, --nr_pages > 0); } else if (folio_test_pmd_mappable(folio)) { /* That test is redundant: it's for safety or to optimize out */ @@ -1441,6 +1447,30 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, munlock_vma_folio(folio, vma, compound); } +/** + * page_remove_rmap - take down pte mapping from a page + * @page: page to remove mapping from + * @vma: the vm area from which the mapping is removed + * @compound: uncharge the page as compound or small page + * + * The caller needs to hold the pte lock. 
+ */
+void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
+		bool compound)
+{
+	struct folio *folio = page_folio(page);
+	unsigned int nr_pages;
+
+	VM_BUG_ON_FOLIO(compound && (page != &folio->page), folio);
+
+	if (likely(!compound))
+		nr_pages = 1;
+	else
+		nr_pages = folio_nr_pages(folio);
+
+	folio_remove_rmap_range(folio, page, nr_pages, vma, compound);
+}
+
 static bool try_to_unmap_one_hugetlb(struct folio *folio,
 		struct vm_area_struct *vma, struct mmu_notifier_range range,
 		struct page_vma_mapped_walk pvmw, unsigned long address,
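A hedged usage sketch of the new interface: page_remove_rmap() above simply wraps it for a single page or a whole compound mapping, and a caller that has already established a contiguous range of ptes inside a large folio could batch the same way. The names first_page and nr below are illustrative, not from the patch; the pte lock must be held as usual.

	struct folio *folio = page_folio(first_page);

	/*
	 * Take down 'nr' pte mappings of the folio with one call, so the
	 * __lruvec_stat update is done once for the batch rather than once
	 * per page (per the commit message above).
	 */
	folio_remove_rmap_range(folio, first_page, nr, vma, false);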
From patchwork Mon Mar 6 09:22:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13160761
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org,
	mike.kravetz@oracle.com, sidhartha.kumar@oracle.com,
	naoya.horiguchi@nec.com, jane.chu@oracle.com, david@redhat.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v3 5/5] try_to_unmap_one: batched remove rmap, update folio refcount
Date: Mon, 6 Mar 2023 17:22:59 +0800
Message-Id: <20230306092259.3507807-6-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20230306092259.3507807-1-fengwei.yin@intel.com>
References: <20230306092259.3507807-1-fengwei.yin@intel.com>
MIME-Version: 1.0

If unmapping one page fails, or the vma walk will skip the next pte, or the
vma walk will end at the next pte, batch-remove the accumulated rmaps and
update the folio refcount at that point.
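
In other words, the reworked loop in try_to_unmap_one() accumulates the number
of successfully unmapped ptes and only flushes the rmap/refcount update when
the batch has to end: on an unmap failure, or when the walk is about to skip
ahead or finish (the pte lock is only guaranteed while the walk stays on the
current pte). A simplified sketch of the resulting control flow, with the
hugetlb path and error propagation omitted:

	while (page_vma_mapped_walk(&pvmw)) {
		if (!start)
			start = folio_page(folio,
					pte_pfn(*pvmw.pte) - folio_pfn(folio));

		if (!try_to_unmap_one_page(folio, vma, range, pvmw,
					   address, flags)) {
			/* Unmap failed: flush what was batched, stop the walk. */
			folio_remove_rmap_and_update_count(folio, start, vma, count);
			page_vma_mapped_walk_done(&pvmw);
			break;
		}
		count++;

		/* Flush before the walk skips ahead or ends (pte lock boundary). */
		if (pvmw_walk_skip_or_end_on_next(&pvmw)) {
			folio_remove_rmap_and_update_count(folio, start, vma, count);
			count = 0;
			start = NULL;
		}
	}
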
Signed-off-by: Yin Fengwei
---
 include/linux/rmap.h |  1 +
 mm/page_vma_mapped.c | 30 +++++++++++++++++++++++++++
 mm/rmap.c            | 48 ++++++++++++++++++++++++++++++++++----------
 3 files changed, 68 insertions(+), 11 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index d2569b42e21a..18193d1d5a8e 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -424,6 +424,7 @@ static inline void page_vma_mapped_walk_done(struct page_vma_mapped_walk *pvmw)
 }
 
 bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw);
+bool pvmw_walk_skip_or_end_on_next(struct page_vma_mapped_walk *pvmw);
 
 /*
  * Used by swapoff to help locate where page is expected in vma.
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 4e448cfbc6ef..19e997dfb5c6 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -291,6 +291,36 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 	return false;
 }
 
+/**
+ * pvmw_walk_skip_or_end_on_next - check if the next pte will be skipped
+ * by the walk or will end it
+ * @pvmw: pointer to struct page_vma_mapped_walk.
+ *
+ * This function can only be called with the correct pte lock held.
+ */
+bool pvmw_walk_skip_or_end_on_next(struct page_vma_mapped_walk *pvmw)
+{
+	unsigned long address = pvmw->address + PAGE_SIZE;
+
+	if (address >= vma_address_end(pvmw))
+		return true;
+
+	if ((address & (PMD_SIZE - PAGE_SIZE)) == 0)
+		return true;
+
+	if (pte_none(*pvmw->pte))
+		return true;
+
+	pvmw->pte++;
+	if (!check_pte(pvmw)) {
+		pvmw->pte--;
+		return true;
+	}
+	pvmw->pte--;
+
+	return false;
+}
+
 /**
  * page_mapped_in_vma - check whether a page is really mapped in a VMA
  * @page: the page to test
diff --git a/mm/rmap.c b/mm/rmap.c
index bb3fcb8df579..a64e9cbb52dd 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1741,6 +1741,26 @@ static bool try_to_unmap_one_page(struct folio *folio,
 	return false;
 }
 
+static void folio_remove_rmap_and_update_count(struct folio *folio,
+		struct page *start, struct vm_area_struct *vma, int count)
+{
+	if (count == 0)
+		return;
+
+	/*
+	 * No need to call mmu_notifier_invalidate_range(): it has been
+	 * done above for all cases requiring it to happen under page
+	 * table lock before mmu_notifier_invalidate_range_end()
+	 *
+	 * See Documentation/mm/mmu_notifier.rst
+	 */
+	folio_remove_rmap_range(folio, start, count, vma,
+					folio_test_hugetlb(folio));
+	if (vma->vm_flags & VM_LOCKED)
+		mlock_drain_local();
+	folio_ref_sub(folio, count);
+}
+
 /*
  * @arg: enum ttu_flags will be passed to this argument
  */
@@ -1748,10 +1768,11 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		     unsigned long address, void *arg)
 {
 	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);
-	struct page *subpage;
+	struct page *start = NULL;
 	bool ret = true;
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
+	int count = 0;
 
 	/*
	 * When racing against e.g. zap_pte_range() on another cpu,
@@ -1812,26 +1833,31 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			break;
 		}
 
-		subpage = folio_page(folio,
+		if (!start)
+			start = folio_page(folio,
 					pte_pfn(*pvmw.pte) - folio_pfn(folio));
 		ret = try_to_unmap_one_page(folio, vma,
 						range, pvmw, address, flags);
 		if (!ret) {
+			folio_remove_rmap_and_update_count(folio,
+						start, vma, count);
 			page_vma_mapped_walk_done(&pvmw);
 			break;
 		}
+		count++;
 
 		/*
-		 * No need to call mmu_notifier_invalidate_range() it has be
-		 * done above for all cases requiring it to happen under page
-		 * table lock before mmu_notifier_invalidate_range_end()
-		 *
-		 * See Documentation/mm/mmu_notifier.rst
+		 * If the next pte will be skipped by page_vma_mapped_walk(),
+		 * or the walk will end at it, batch-remove the rmap and update
+		 * the page refcount now. We can't do it after the walk returns
+		 * false because the pte lock will no longer be held.
 		 */
-		page_remove_rmap(subpage, vma, false);
-		if (vma->vm_flags & VM_LOCKED)
-			mlock_drain_local();
-		folio_put(folio);
+		if (pvmw_walk_skip_or_end_on_next(&pvmw)) {
+			folio_remove_rmap_and_update_count(folio,
+						start, vma, count);
+			count = 0;
+			start = NULL;
+		}
 	}
 
 	mmu_notifier_invalidate_range_end(&range);
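
[Rough worked example, not part of the patch: for a 16-page folio that is fully
pte-mapped within one pmd and entirely covered by the walk, with every unmap
succeeding, pvmw_walk_skip_or_end_on_next() only returns true after the last
pte, so the loop above flushes exactly once with count == 16. The net effect of
that single flush is approximately:]

	/* One batched flush instead of 16 page_remove_rmap()/folio_put() pairs. */
	folio_remove_rmap_range(folio, start, 16, vma, false);
	if (vma->vm_flags & VM_LOCKED)
		mlock_drain_local();
	folio_ref_sub(folio, 16);
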