From patchwork Mon Mar 13 12:45:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yin Fengwei X-Patchwork-Id: 13172439 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 90BB7C6FD19 for ; Mon, 13 Mar 2023 12:44:20 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DF57D6B0071; Mon, 13 Mar 2023 08:44:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DA6126B0072; Mon, 13 Mar 2023 08:44:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C470A6B0074; Mon, 13 Mar 2023 08:44:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id B64AA6B0071 for ; Mon, 13 Mar 2023 08:44:19 -0400 (EDT) Received: from smtpin06.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 6E6BBAAF67 for ; Mon, 13 Mar 2023 12:44:19 +0000 (UTC) X-FDA: 80563843038.06.35854D7 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by imf20.hostedemail.com (Postfix) with ESMTP id 38B831C0022 for ; Mon, 13 Mar 2023 12:44:15 +0000 (UTC) Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=h3ni4bsc; spf=pass (imf20.hostedemail.com: domain of fengwei.yin@intel.com designates 192.55.52.120 as permitted sender) smtp.mailfrom=fengwei.yin@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1678711457; a=rsa-sha256; cv=none; b=yAFm00BwrbReNpFhHzUxG1K5KMLWe0Xjk1o42AnH5TZoUtx+rhH0sNKR8vF7/OPGjNqFsb kNV56PjZq9RxTM/mWR4ty+tZgoXz89Y6MDpDiBwBNPuSRfr3Be0Ej4CCIPdcBPbYo2B8Zl 
OwbW0pgbqnzt6FhCcPQ7h9mb9bqULg4= ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=h3ni4bsc; spf=pass (imf20.hostedemail.com: domain of fengwei.yin@intel.com designates 192.55.52.120 as permitted sender) smtp.mailfrom=fengwei.yin@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1678711457; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=9eqD9hHexPc3y5JOE54deUYZ2SMWoKebySgUBABqJlI=; b=5bWYcfPZBDYAUtVrQt9MU60w2VHnGrZGjRXKbWFm50W7D4jGONeKXODoWgTRiPwWX3+PDG P0blHJHPKT1G0ctvKfsRgqnYA6YGjQ2FMJdlyVBIGwwrUhFv+9Se9pI5fS2VEpaNe4Fhhb d5dKsyNt1s+mHfSCr2wXTgQq9J1CUk8= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678711456; x=1710247456; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hUG1bUoFVWB/VZA5fzXRqhvSKxl9ESyzxRiFybJ+EPA=; b=h3ni4bscYG95ZE/NWJifE9/DjBjEKPI2iEvcfjIn5S2H2oez1xXmyKUe am9wkFr4wmWEWoFLKXEpw/hJL+GbN84WAgSBoGROA3MmUfEyRnDShojF1 AwnfESGxKoQt3hS5sH33vFx4jK6yPz+pDhut1CPZbxcethjnlf94F9jsn yDg4XTpCOMKIWc8PMTymOvdlO3VK7C/274VSHqH8oILIkmPrFQD/2kem9 Oa6gf5SSyWp69uWpsxZ56F4QIpqWF9lB6ndjopKavp6vImQw1QlpDrX54 S3oQ9HZEHaqnM3jl+FyQ2cU4s4NdfrX5FXCbj5QtUC/H3ACCJ1nAOzLcF w==; X-IronPort-AV: E=McAfee;i="6500,9779,10647"; a="335834155" X-IronPort-AV: E=Sophos;i="5.98,256,1673942400"; d="scan'208";a="335834155" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2023 05:44:14 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10647"; a="802432929" X-IronPort-AV: E=Sophos;i="5.98,256,1673942400"; d="scan'208";a="802432929" 
Received: from fyin-dev.sh.intel.com ([10.239.159.32]) by orsmga004.jf.intel.com with ESMTP; 13 Mar 2023 05:44:11 -0700 From: Yin Fengwei To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org, mike.kravetz@oracle.com, sidhartha.kumar@oracle.com, naoya.horiguchi@nec.com, jane.chu@oracle.com, david@redhat.com Cc: fengwei.yin@intel.com Subject: [PATCH v4 1/5] rmap: move hugetlb try_to_unmap to dedicated function Date: Mon, 13 Mar 2023 20:45:22 +0800 Message-Id: <20230313124526.1207490-2-fengwei.yin@intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230313124526.1207490-1-fengwei.yin@intel.com> References: <20230313124526.1207490-1-fengwei.yin@intel.com> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Queue-Id: 38B831C0022 X-Rspamd-Server: rspam01 X-Stat-Signature: jug4k76ymxrccg9uym1ghqa8y6szjphr X-HE-Tag: 1678711455-905590 X-HE-Meta: U2FsdGVkX19T0BY96JsNzDep45tBGwLQlwUTL5vcTWldBSvBzbRGgWw+r5J0NhSF4GWjDf+oSwjYItX5jMtdThUZOLiOgJrLqSE6GZ94s3twj2uyRgt9lKmfELBUE9xFOIRhe6n1QKcOBQVJj41bt41EZUmhArI7VN6urtO9rf3I5kYvl8Pq40CEGaHYKPhRET1GQZVDARapF8HMMfaALfemCv4+cubQJ0/ErRmmyEQOJ4vpjluiVmmxqyWbhcATG/Isydb4sCLgYI7+l0VDbj0rJ2HFc0nLCo/kjJkDEdnR4PauGIPT6XzVU6/f5KlwooHqxu3AaaDekd2Epgbzc2L8J1bTkV/fWsOMgj4hIXEw4XrJCu1qCfdvn1E296bUPRRggVDILF2K47TxuQTb+UzkIj6lVF9nF2tFYRHWm0qY2wN82WsXGUAYCEQZUdO4n5P9/fs1RZdRLfyuXIl/725eqIwm6gOcn3H9BwT9o/tE70BuG7ZviwSh8tqz8A3J8rFJYWD74LFmmG1VaXFi8zgYklnzDHzWSafWa4LJ2PEaVqb0fAM1hBhOfUW4B4Zf3c7S/R5qCogtJHhY/lLkJ1xP+IDHsvJlobboKn8l0EuW12pRrhfgEsuRrVC/6YugQ8zwbl0ZfXzeuxQfMZShFOQdG0j37xsRzSNdoY+aoJaRHoCbhVQ0bwOWPYA581py2pYHAHgX7UonSBfwgAS2BQTfjniFcArTctcXWx9yMoRzO0DanSNRRYuz5R5L8YKchVpnfeMx5QzDK1UAXsN31rBPxjVBdqX3V199mQSW1dm4d59G2yZ3QYJABgttjTgV31kGG5tTOXt/1dm1mukJo0vs6yE+3ke0qMUqY1OeXy8xs0QVhIx9/9QY1ekoAjGbJAeDhSNzz5EZabvtuZTweMyDy6luMx4Dyp62muoL6D/872VKT2K7hsKC8dtiqNosknaLqYkTfOeGsxEBfAA 1QeWGWh1 
Rtl9of70z7An9BPKJQ2dje/B2cSXBuSovemvle53MdizDULwnz8+9D63pfePHpAlWVRYHw0JR/TtZmBKp06nvWcjZ+fU9JY0YhY4SlCfHvGk156FQ9pyv8FqMN3zKyLIQZvYKiN69AQt9CLf0o/b/5iCjSqHzxHTx3PM0rq4grcuOiiGPLbv0LcrGK5Y+vjXUYJedFKQTRmmE7dBQMQCAFQyHbnw98fjnmaYn+cmvCbh95WUiAXbes7oS5A== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This prepares for the batched rmap update for large folios. There is no need to handle hugetlb in a loop: handle it once and bail out early. Signed-off-by: Yin Fengwei Reviewed-by: Mike Kravetz --- mm/rmap.c | 200 +++++++++++++++++++++++++++++++++--------------------- 1 file changed, 121 insertions(+), 79 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index ba901c416785..3a2e3ccb8031 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1441,6 +1441,103 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, munlock_vma_folio(folio, vma, compound); } +static bool try_to_unmap_one_hugetlb(struct folio *folio, + struct vm_area_struct *vma, struct mmu_notifier_range range, + struct page_vma_mapped_walk pvmw, unsigned long address, + enum ttu_flags flags) +{ + struct mm_struct *mm = vma->vm_mm; + pte_t pteval; + bool ret = true, anon = folio_test_anon(folio); + + /* + * The try_to_unmap() is only passed a hugetlb page + * in the case where the hugetlb page is poisoned. + */ + VM_BUG_ON_FOLIO(!folio_test_hwpoison(folio), folio); + /* + * huge_pmd_unshare may unmap an entire PMD page. + * There is no way of knowing exactly which PMDs may + * be cached for this mm, so we must flush them all. + * start/end were already adjusted in caller + * (try_to_unmap_one) to cover this range. + */ + flush_cache_range(vma, range.start, range.end); + + /* + * To call huge_pmd_unshare, i_mmap_rwsem must be + * held in write mode. Caller needs to explicitly + * do this outside rmap routines. + * + * We also must hold hugetlb vma_lock in write mode. 
+ * Lock order dictates acquiring vma_lock BEFORE + * i_mmap_rwsem. We can only try lock here and fail + * if unsuccessful. + */ + if (!anon) { + VM_BUG_ON(!(flags & TTU_RMAP_LOCKED)); + if (!hugetlb_vma_trylock_write(vma)) { + ret = false; + goto out; + } + if (huge_pmd_unshare(mm, vma, address, pvmw.pte)) { + hugetlb_vma_unlock_write(vma); + flush_tlb_range(vma, + range.start, range.end); + mmu_notifier_invalidate_range(mm, + range.start, range.end); + /* + * The ref count of the PMD page was + * dropped which is part of the way map + * counting is done for shared PMDs. + * Return 'true' here. When there is + * no other sharing, huge_pmd_unshare + * returns false and we will unmap the + * actual page and drop map count + * to zero. + */ + goto out; + } + hugetlb_vma_unlock_write(vma); + } + pteval = huge_ptep_clear_flush(vma, address, pvmw.pte); + + /* + * Now the pte is cleared. If this pte was uffd-wp armed, + * we may want to replace a none pte with a marker pte if + * it's file-backed, so we don't lose the tracking info. + */ + pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); + + /* Set the dirty flag on the folio now the pte is gone. */ + if (huge_pte_dirty(pteval)) + folio_mark_dirty(folio); + + /* Update high watermark before we lower rss */ + update_hiwater_rss(mm); + + /* Poisoned hugetlb folio with TTU_HWPOISON always cleared in flags */ + pteval = swp_entry_to_pte(make_hwpoison_entry(&folio->page)); + set_huge_pte_at(mm, address, pvmw.pte, pteval); + hugetlb_count_sub(folio_nr_pages(folio), mm); + + /* + * No need to call mmu_notifier_invalidate_range() it has be + * done above for all cases requiring it to happen under page + * table lock before mmu_notifier_invalidate_range_end() + * + * See Documentation/mm/mmu_notifier.rst + */ + page_remove_rmap(&folio->page, vma, true); + /* No VM_LOCKED set in vma->vm_flags for hugetlb. So not + * necessary to call mlock_drain_local(). 
+ */ + folio_put(folio); + +out: + return ret; +} + /* * @arg: enum ttu_flags will be passed to this argument */ @@ -1504,86 +1601,37 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, break; } + address = pvmw.address; + if (folio_test_hugetlb(folio)) { + ret = try_to_unmap_one_hugetlb(folio, vma, range, + pvmw, address, flags); + + /* no need to loop for hugetlb */ + page_vma_mapped_walk_done(&pvmw); + break; + } + subpage = folio_page(folio, pte_pfn(*pvmw.pte) - folio_pfn(folio)); - address = pvmw.address; anon_exclusive = folio_test_anon(folio) && PageAnonExclusive(subpage); - if (folio_test_hugetlb(folio)) { - bool anon = folio_test_anon(folio); - - /* - * The try_to_unmap() is only passed a hugetlb page - * in the case where the hugetlb page is poisoned. - */ - VM_BUG_ON_PAGE(!PageHWPoison(subpage), subpage); + flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); + /* Nuke the page table entry. */ + if (should_defer_flush(mm, flags)) { /* - * huge_pmd_unshare may unmap an entire PMD page. - * There is no way of knowing exactly which PMDs may - * be cached for this mm, so we must flush them all. - * start/end were already adjusted above to cover this - * range. + * We clear the PTE but do not flush so potentially + * a remote CPU could still be writing to the folio. + * If the entry was previously clean then the + * architecture must guarantee that a clear->dirty + * transition on a cached TLB entry is written through + * and traps if the PTE is unmapped. */ - flush_cache_range(vma, range.start, range.end); + pteval = ptep_get_and_clear(mm, address, pvmw.pte); - /* - * To call huge_pmd_unshare, i_mmap_rwsem must be - * held in write mode. Caller needs to explicitly - * do this outside rmap routines. - * - * We also must hold hugetlb vma_lock in write mode. - * Lock order dictates acquiring vma_lock BEFORE - * i_mmap_rwsem. We can only try lock here and fail - * if unsuccessful. 
- */ - if (!anon) { - VM_BUG_ON(!(flags & TTU_RMAP_LOCKED)); - if (!hugetlb_vma_trylock_write(vma)) { - page_vma_mapped_walk_done(&pvmw); - ret = false; - break; - } - if (huge_pmd_unshare(mm, vma, address, pvmw.pte)) { - hugetlb_vma_unlock_write(vma); - flush_tlb_range(vma, - range.start, range.end); - mmu_notifier_invalidate_range(mm, - range.start, range.end); - /* - * The ref count of the PMD page was - * dropped which is part of the way map - * counting is done for shared PMDs. - * Return 'true' here. When there is - * no other sharing, huge_pmd_unshare - * returns false and we will unmap the - * actual page and drop map count - * to zero. - */ - page_vma_mapped_walk_done(&pvmw); - break; - } - hugetlb_vma_unlock_write(vma); - } - pteval = huge_ptep_clear_flush(vma, address, pvmw.pte); + set_tlb_ubc_flush_pending(mm, pte_dirty(pteval)); } else { - flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); - /* Nuke the page table entry. */ - if (should_defer_flush(mm, flags)) { - /* - * We clear the PTE but do not flush so potentially - * a remote CPU could still be writing to the folio. - * If the entry was previously clean then the - * architecture must guarantee that a clear->dirty - * transition on a cached TLB entry is written through - * and traps if the PTE is unmapped. 
- */ - pteval = ptep_get_and_clear(mm, address, pvmw.pte); - - set_tlb_ubc_flush_pending(mm, pte_dirty(pteval)); - } else { - pteval = ptep_clear_flush(vma, address, pvmw.pte); - } + pteval = ptep_clear_flush(vma, address, pvmw.pte); } /* @@ -1602,14 +1650,8 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, if (PageHWPoison(subpage) && (flags & TTU_HWPOISON)) { pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); - if (folio_test_hugetlb(folio)) { - hugetlb_count_sub(folio_nr_pages(folio), mm); - set_huge_pte_at(mm, address, pvmw.pte, pteval); - } else { - dec_mm_counter(mm, mm_counter(&folio->page)); - set_pte_at(mm, address, pvmw.pte, pteval); - } - + dec_mm_counter(mm, mm_counter(&folio->page)); + set_pte_at(mm, address, pvmw.pte, pteval); } else if (pte_unused(pteval) && !userfaultfd_armed(vma)) { /* * The guest indicated that the page content is of no From patchwork Mon Mar 13 12:45:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yin Fengwei X-Patchwork-Id: 13172441 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21714C61DA4 for ; Mon, 13 Mar 2023 12:44:32 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AC25A6B0074; Mon, 13 Mar 2023 08:44:31 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A714D6B0075; Mon, 13 Mar 2023 08:44:31 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 912766B0078; Mon, 13 Mar 2023 08:44:31 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 80D416B0074 for ; Mon, 13 Mar 2023 08:44:31 -0400 (EDT) Received: from smtpin19.hostedemail.com 
references:mime-version:content-transfer-encoding; bh=+Cb4ASBAgHOPwudxgLxAly/MqqPV70H5weA/H5xS/qw=; b=njxQ5hWLr7OPvdgaoKhgM9u4byjzR+wrReSPva5AzVDFEWPh6D/RnluT 0EdEogq/Sj9c35pyXN9ZAYzDKE98I8wWffJCHcCf5ugK3v+oPFwBDe1bK yaLRguk0p9p+gxsTpeQyzafn2yW0wDaelokwiasnQ4NzCe6BUMX/urZjY Hn6GGsYuP5/dxJV7kU9NR4tawzAR8z5wb1Zwccpi4HBGTCj6wTcTUYNDz Kv8Ar7jYpIOY//d076sxbzWa9C3iMziWDPDrfRKNHDryH3KNt/SvyP0cQ m72R/N/TrzCRBYSGGn4hooHlWIS7016A/wDRQD4woigzjlc/WSYQI2bJ5 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10647"; a="399727470" X-IronPort-AV: E=Sophos;i="5.98,256,1673942400"; d="scan'208";a="399727470" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2023 05:44:27 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10647"; a="767683317" X-IronPort-AV: E=Sophos;i="5.98,256,1673942400"; d="scan'208";a="767683317" Received: from fyin-dev.sh.intel.com ([10.239.159.32]) by FMSMGA003.fm.intel.com with ESMTP; 13 Mar 2023 05:44:24 -0700 From: Yin Fengwei To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org, mike.kravetz@oracle.com, sidhartha.kumar@oracle.com, naoya.horiguchi@nec.com, jane.chu@oracle.com, david@redhat.com Cc: fengwei.yin@intel.com Subject: [PATCH v4 2/5] rmap: move page unmap operation to dedicated function Date: Mon, 13 Mar 2023 20:45:23 +0800 Message-Id: <20230313124526.1207490-3-fengwei.yin@intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230313124526.1207490-1-fengwei.yin@intel.com> References: <20230313124526.1207490-1-fengwei.yin@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 126DAC001F X-Stat-Signature: 73z1b7m4ucshf4mbtagdp1m8cyzdrduf X-Rspam-User: X-HE-Tag: 1678711468-939404 X-HE-Meta: 
U2FsdGVkX18j3s8MYdHoCvIArhDoD4et8DMOIPvV++YStQwjRqihZg+xhGOlfqwCRT99J+XCZC0iseeqZlDUJnqQOYBaB8/mZWVvPh9hUSFd+yZoTWMWVZanNBsAIfV4dZzkPtoyTxIygEQPwJGSArdypbt+ylHWFr7/HJDiYNsRavRjymwBQx6wh48Pu9RlfaeDBiCZ2GS/7P/iNhva3g4bCg2rhosqcADP7OFpr3tV/I2aOKSd4yEBpHJOaSRjYWvb2FZESvB5JhsPe5sO/kM5oBQQtXPBqn7hxi2qTu3qyLieqyJ4hRYi9cMMAZl54LuwJSRMKrWqaUPj93adlRMkSAjB/HQZqmnnKi+veVUaWThqC50B5iUkKPtuos6ZCJYx63X+lsWMcgqYdPoA+q2+oWU0xsFLSUeWfA03EIjVSSzgLKEyNP6u+73sgLD8hT5mZMxvsVkFcbkNmHsXNAjJxjwCuqowo2L+GSB4zK7iVs1rKHzoB3K2CP4ojlEE8VCoxB6OrN310dDQ+x40wL6Je+AHIBi9dmi9mRvySICfPTShsvnbrnS54RYTmmfLx+/kVCfote+/dO/oU8a94lbadUBq1TcoAbm3wEP645VZe/hYTiQz5NhY7q2JFb2k0DDUDqyRHBinHL7qvTyRvGundVsOBXyOHMZtj2BceKf5BI6GhK/kXYqbIe8hi6GV1nswUHLKszDY0n01m+mC0SSKYOmDu5djUeY9ch101X7s3JTxqrhrAqiVYRRuAE3UD6qmVZx4fP1HoK/smS1k6Ev/4WRNN5VegmTbyKTzLS2aM0rhZj7qlN7Tws1VAiWUqRjUL0546G2YGHNX/5yCzeayxUrRGqNBjnDbgOB9OCDTokcUY9ODGhmfKkCalykL4oYINAnYCEQi2JfVAWQEJxa4+2hXztmkzXwkOp6w7KY1La2n7YpcRdpVLzomoPKf2N9Q8tIAXCVsuiPmdzc HHpTa2rn CzRUVqy5XkmbHiJXNMEv7kH6VXbrPE953GjAU9UhjPBQyt22IWOP5Fm9XuXB/uiXLeRn+AOKCCFvmPFI20jvvFaBoBSzedNhHkZbGnbsL6kUJZ2JNpivFvHBDhCCRzC223FfRrQZc5a68Cmd54i6k4eb/Cnxux+mxBFWFnAP1eXAK+M4= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: No functional change. The code is only reorganized. 
Signed-off-by: Yin Fengwei --- mm/rmap.c | 369 ++++++++++++++++++++++++++++-------------------------- 1 file changed, 194 insertions(+), 175 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 3a2e3ccb8031..23eda671447a 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1538,17 +1538,204 @@ static bool try_to_unmap_one_hugetlb(struct folio *folio, return ret; } +static bool try_to_unmap_one_page(struct folio *folio, + struct vm_area_struct *vma, struct mmu_notifier_range range, + struct page_vma_mapped_walk pvmw, unsigned long address, + enum ttu_flags flags) +{ + bool anon_exclusive, ret = true; + struct page *subpage; + struct mm_struct *mm = vma->vm_mm; + pte_t pteval; + + subpage = folio_page(folio, + pte_pfn(*pvmw.pte) - folio_pfn(folio)); + anon_exclusive = folio_test_anon(folio) && + PageAnonExclusive(subpage); + + flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); + /* Nuke the page table entry. */ + if (should_defer_flush(mm, flags)) { + /* + * We clear the PTE but do not flush so potentially + * a remote CPU could still be writing to the folio. + * If the entry was previously clean then the + * architecture must guarantee that a clear->dirty + * transition on a cached TLB entry is written through + * and traps if the PTE is unmapped. + */ + pteval = ptep_get_and_clear(mm, address, pvmw.pte); + + set_tlb_ubc_flush_pending(mm, pte_dirty(pteval)); + } else { + pteval = ptep_clear_flush(vma, address, pvmw.pte); + } + + /* + * Now the pte is cleared. If this pte was uffd-wp armed, + * we may want to replace a none pte with a marker pte if + * it's file-backed, so we don't lose the tracking info. + */ + pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); + + /* Set the dirty flag on the folio now the pte is gone. 
*/ + if (pte_dirty(pteval)) + folio_mark_dirty(folio); + + /* Update high watermark before we lower rss */ + update_hiwater_rss(mm); + + if (PageHWPoison(subpage) && (flags & TTU_HWPOISON)) { + pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); + dec_mm_counter(mm, mm_counter(&folio->page)); + set_pte_at(mm, address, pvmw.pte, pteval); + } else if (pte_unused(pteval) && !userfaultfd_armed(vma)) { + /* + * The guest indicated that the page content is of no + * interest anymore. Simply discard the pte, vmscan + * will take care of the rest. + * A future reference will then fault in a new zero + * page. When userfaultfd is active, we must not drop + * this page though, as its main user (postcopy + * migration) will not expect userfaults on already + * copied pages. + */ + dec_mm_counter(mm, mm_counter(&folio->page)); + /* We have to invalidate as we cleared the pte */ + mmu_notifier_invalidate_range(mm, address, + address + PAGE_SIZE); + } else if (folio_test_anon(folio)) { + swp_entry_t entry = { .val = page_private(subpage) }; + pte_t swp_pte; + /* + * Store the swap location in the pte. + * See handle_pte_fault() ... + */ + if (unlikely(folio_test_swapbacked(folio) != + folio_test_swapcache(folio))) { + WARN_ON_ONCE(1); + ret = false; + /* We have to invalidate as we cleared the pte */ + mmu_notifier_invalidate_range(mm, address, + address + PAGE_SIZE); + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + + /* MADV_FREE page check */ + if (!folio_test_swapbacked(folio)) { + int ref_count, map_count; + + /* + * Synchronize with gup_pte_range(): + * - clear PTE; barrier; read refcount + * - inc refcount; barrier; read PTE + */ + smp_mb(); + + ref_count = folio_ref_count(folio); + map_count = folio_mapcount(folio); + + /* + * Order reads for page refcount and dirty flag + * (see comments in __remove_mapping()). + */ + smp_rmb(); + + /* + * The only page refs must be one from isolation + * plus the rmap(s) (dropped by discard:). 
+ */ + if (ref_count == 1 + map_count && + !folio_test_dirty(folio)) { + /* Invalidate as we cleared the pte */ + mmu_notifier_invalidate_range(mm, + address, address + PAGE_SIZE); + dec_mm_counter(mm, MM_ANONPAGES); + goto discard; + } + + /* + * If the folio was redirtied, it cannot be + * discarded. Remap the page to page table. + */ + set_pte_at(mm, address, pvmw.pte, pteval); + folio_set_swapbacked(folio); + ret = false; + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + + if (swap_duplicate(entry) < 0) { + set_pte_at(mm, address, pvmw.pte, pteval); + ret = false; + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + if (arch_unmap_one(mm, vma, address, pteval) < 0) { + swap_free(entry); + set_pte_at(mm, address, pvmw.pte, pteval); + ret = false; + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + + /* See page_try_share_anon_rmap(): clear PTE first. */ + if (anon_exclusive && + page_try_share_anon_rmap(subpage)) { + swap_free(entry); + set_pte_at(mm, address, pvmw.pte, pteval); + ret = false; + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + if (list_empty(&mm->mmlist)) { + spin_lock(&mmlist_lock); + if (list_empty(&mm->mmlist)) + list_add(&mm->mmlist, &init_mm.mmlist); + spin_unlock(&mmlist_lock); + } + dec_mm_counter(mm, MM_ANONPAGES); + inc_mm_counter(mm, MM_SWAPENTS); + swp_pte = swp_entry_to_pte(entry); + if (anon_exclusive) + swp_pte = pte_swp_mkexclusive(swp_pte); + if (pte_soft_dirty(pteval)) + swp_pte = pte_swp_mksoft_dirty(swp_pte); + if (pte_uffd_wp(pteval)) + swp_pte = pte_swp_mkuffd_wp(swp_pte); + set_pte_at(mm, address, pvmw.pte, swp_pte); + /* Invalidate as we cleared the pte */ + mmu_notifier_invalidate_range(mm, address, + address + PAGE_SIZE); + } else { + /* + * This is a locked file-backed folio, + * so it cannot be removed from the page + * cache and replaced by a new folio before + * mmu_notifier_invalidate_range_end, so no + * concurrent thread might update its page table + * to point at a new folio while a 
device is + * still using this folio. + * + * See Documentation/mm/mmu_notifier.rst + */ + dec_mm_counter(mm, mm_counter_file(&folio->page)); + } + +discard: + return ret; +} + /* * @arg: enum ttu_flags will be passed to this argument */ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, unsigned long address, void *arg) { - struct mm_struct *mm = vma->vm_mm; DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0); - pte_t pteval; struct page *subpage; - bool anon_exclusive, ret = true; + bool ret = true; struct mmu_notifier_range range; enum ttu_flags flags = (enum ttu_flags)(long)arg; @@ -1613,179 +1800,11 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, subpage = folio_page(folio, pte_pfn(*pvmw.pte) - folio_pfn(folio)); - anon_exclusive = folio_test_anon(folio) && - PageAnonExclusive(subpage); - - flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); - /* Nuke the page table entry. */ - if (should_defer_flush(mm, flags)) { - /* - * We clear the PTE but do not flush so potentially - * a remote CPU could still be writing to the folio. - * If the entry was previously clean then the - * architecture must guarantee that a clear->dirty - * transition on a cached TLB entry is written through - * and traps if the PTE is unmapped. - */ - pteval = ptep_get_and_clear(mm, address, pvmw.pte); - - set_tlb_ubc_flush_pending(mm, pte_dirty(pteval)); - } else { - pteval = ptep_clear_flush(vma, address, pvmw.pte); - } - - /* - * Now the pte is cleared. If this pte was uffd-wp armed, - * we may want to replace a none pte with a marker pte if - * it's file-backed, so we don't lose the tracking info. - */ - pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); - - /* Set the dirty flag on the folio now the pte is gone. 
*/ - if (pte_dirty(pteval)) - folio_mark_dirty(folio); - - /* Update high watermark before we lower rss */ - update_hiwater_rss(mm); - - if (PageHWPoison(subpage) && (flags & TTU_HWPOISON)) { - pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); - dec_mm_counter(mm, mm_counter(&folio->page)); - set_pte_at(mm, address, pvmw.pte, pteval); - } else if (pte_unused(pteval) && !userfaultfd_armed(vma)) { - /* - * The guest indicated that the page content is of no - * interest anymore. Simply discard the pte, vmscan - * will take care of the rest. - * A future reference will then fault in a new zero - * page. When userfaultfd is active, we must not drop - * this page though, as its main user (postcopy - * migration) will not expect userfaults on already - * copied pages. - */ - dec_mm_counter(mm, mm_counter(&folio->page)); - /* We have to invalidate as we cleared the pte */ - mmu_notifier_invalidate_range(mm, address, - address + PAGE_SIZE); - } else if (folio_test_anon(folio)) { - swp_entry_t entry = { .val = page_private(subpage) }; - pte_t swp_pte; - /* - * Store the swap location in the pte. - * See handle_pte_fault() ... - */ - if (unlikely(folio_test_swapbacked(folio) != - folio_test_swapcache(folio))) { - WARN_ON_ONCE(1); - ret = false; - /* We have to invalidate as we cleared the pte */ - mmu_notifier_invalidate_range(mm, address, - address + PAGE_SIZE); - page_vma_mapped_walk_done(&pvmw); - break; - } - - /* MADV_FREE page check */ - if (!folio_test_swapbacked(folio)) { - int ref_count, map_count; - - /* - * Synchronize with gup_pte_range(): - * - clear PTE; barrier; read refcount - * - inc refcount; barrier; read PTE - */ - smp_mb(); - - ref_count = folio_ref_count(folio); - map_count = folio_mapcount(folio); - - /* - * Order reads for page refcount and dirty flag - * (see comments in __remove_mapping()). - */ - smp_rmb(); - - /* - * The only page refs must be one from isolation - * plus the rmap(s) (dropped by discard:). 
-				 */
-				if (ref_count == 1 + map_count &&
-				    !folio_test_dirty(folio)) {
-					/* Invalidate as we cleared the pte */
-					mmu_notifier_invalidate_range(mm,
-						address, address + PAGE_SIZE);
-					dec_mm_counter(mm, MM_ANONPAGES);
-					goto discard;
-				}
-
-				/*
-				 * If the folio was redirtied, it cannot be
-				 * discarded. Remap the page to page table.
-				 */
-				set_pte_at(mm, address, pvmw.pte, pteval);
-				folio_set_swapbacked(folio);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
-			}
-
-			if (swap_duplicate(entry) < 0) {
-				set_pte_at(mm, address, pvmw.pte, pteval);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
-			}
-			if (arch_unmap_one(mm, vma, address, pteval) < 0) {
-				swap_free(entry);
-				set_pte_at(mm, address, pvmw.pte, pteval);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
-			}
+		ret = try_to_unmap_one_page(folio, vma,
+					range, pvmw, address, flags);
+		if (!ret)
+			break;
 
-			/* See page_try_share_anon_rmap(): clear PTE first. */
-			if (anon_exclusive &&
-			    page_try_share_anon_rmap(subpage)) {
-				swap_free(entry);
-				set_pte_at(mm, address, pvmw.pte, pteval);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
-			}
-			if (list_empty(&mm->mmlist)) {
-				spin_lock(&mmlist_lock);
-				if (list_empty(&mm->mmlist))
-					list_add(&mm->mmlist, &init_mm.mmlist);
-				spin_unlock(&mmlist_lock);
-			}
-			dec_mm_counter(mm, MM_ANONPAGES);
-			inc_mm_counter(mm, MM_SWAPENTS);
-			swp_pte = swp_entry_to_pte(entry);
-			if (anon_exclusive)
-				swp_pte = pte_swp_mkexclusive(swp_pte);
-			if (pte_soft_dirty(pteval))
-				swp_pte = pte_swp_mksoft_dirty(swp_pte);
-			if (pte_uffd_wp(pteval))
-				swp_pte = pte_swp_mkuffd_wp(swp_pte);
-			set_pte_at(mm, address, pvmw.pte, swp_pte);
-			/* Invalidate as we cleared the pte */
-			mmu_notifier_invalidate_range(mm, address,
-						      address + PAGE_SIZE);
-		} else {
-			/*
-			 * This is a locked file-backed folio,
-			 * so it cannot be removed from the page
-			 * cache and replaced by a new folio before
-			 * mmu_notifier_invalidate_range_end, so no
-			 * concurrent thread might update its page table
-			 * to point at a new folio while a device is
-			 * still using this folio.
-			 *
-			 * See Documentation/mm/mmu_notifier.rst
-			 */
-			dec_mm_counter(mm, mm_counter_file(&folio->page));
-		}
-discard:
 		/*
 		 * No need to call mmu_notifier_invalidate_range() it has be
 		 * done above for all cases requiring it to happen under page

From patchwork Mon Mar 13 12:45:24 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13172442
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org,
 mike.kravetz@oracle.com, sidhartha.kumar@oracle.com, naoya.horiguchi@nec.com,
 jane.chu@oracle.com, david@redhat.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v4 3/5] rmap: cleanup exit path of try_to_unmap_one_page()
Date: Mon, 13 Mar 2023 20:45:24 +0800
Message-Id: <20230313124526.1207490-4-fengwei.yin@intel.com>
In-Reply-To: <20230313124526.1207490-1-fengwei.yin@intel.com>
References: <20230313124526.1207490-1-fengwei.yin@intel.com>

Clean up the exit path of try_to_unmap_one_page() by removing
duplicated code. Move page_vma_mapped_walk_done() back to
try_to_unmap_one(). Rename subpage to page, as a folio has no
concept of a subpage.
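
As a rough userspace illustration of the exit-path shape described above (this is not kernel code; unmap_sketch(), enum fail_mode and the plain integers standing in for the pte and the rss counter are all hypothetical), the success path reaches a single counter update, while the failure paths share two labels and only one of them restores the pte:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical failure modes; the real function has several of each kind. */
enum fail_mode {
	FAIL_NONE,		/* everything succeeds */
	FAIL_KEEP_CLEARED,	/* fail, leave the pte cleared (like "goto exit") */
	FAIL_RESTORE		/* fail, put the old pte back (like "goto exit_restore_pte") */
};

/*
 * Userspace stand-in: *pte is a plain int, *rss a plain counter.
 * Returns true when the "page" was unmapped.
 */
bool unmap_sketch(int *pte, int *rss, enum fail_mode mode)
{
	int saved;

	assert(pte && rss);

	saved = *pte;	/* remember the old value, then clear the entry */
	*pte = 0;

	if (mode == FAIL_KEEP_CLEARED)
		goto exit;
	if (mode == FAIL_RESTORE)
		goto exit_restore_pte;

	*pte = -saved;	/* pretend to install a swap entry */

	/* Success: the counter is dropped in exactly one place. */
	(*rss)--;
	return true;

exit_restore_pte:
	*pte = saved;
exit:
	return false;
}
```

With this layout each cleanup step appears once, so a new failure case only has to pick the label that matches the state it leaves behind.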

Signed-off-by: Yin Fengwei
---
 mm/rmap.c | 72 ++++++++++++++++++++++---------------------------------
 1 file changed, 29 insertions(+), 43 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 23eda671447a..72fc8c559cd9 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1543,15 +1543,13 @@ static bool try_to_unmap_one_page(struct folio *folio,
 		struct page_vma_mapped_walk pvmw, unsigned long address,
 		enum ttu_flags flags)
 {
-	bool anon_exclusive, ret = true;
-	struct page *subpage;
+	bool anon_exclusive;
+	struct page *page;
 	struct mm_struct *mm = vma->vm_mm;
 	pte_t pteval;
 
-	subpage = folio_page(folio,
-			pte_pfn(*pvmw.pte) - folio_pfn(folio));
-	anon_exclusive = folio_test_anon(folio) &&
-		PageAnonExclusive(subpage);
+	page = folio_page(folio, pte_pfn(*pvmw.pte) - folio_pfn(folio));
+	anon_exclusive = folio_test_anon(folio) && PageAnonExclusive(page);
 
 	flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
 	/* Nuke the page table entry. */
@@ -1579,15 +1577,14 @@ static bool try_to_unmap_one_page(struct folio *folio,
 	pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval);
 
 	/* Set the dirty flag on the folio now the pte is gone. */
-	if (pte_dirty(pteval))
+	if (pte_dirty(pteval) && !folio_test_dirty(folio))
 		folio_mark_dirty(folio);
 
 	/* Update high watermark before we lower rss */
 	update_hiwater_rss(mm);
 
-	if (PageHWPoison(subpage) && !(flags & TTU_HWPOISON)) {
-		pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
-		dec_mm_counter(mm, mm_counter(&folio->page));
+	if (PageHWPoison(page) && !(flags & TTU_HWPOISON)) {
+		pteval = swp_entry_to_pte(make_hwpoison_entry(page));
 		set_pte_at(mm, address, pvmw.pte, pteval);
 	} else if (pte_unused(pteval) && !userfaultfd_armed(vma)) {
 		/*
@@ -1600,12 +1597,11 @@ static bool try_to_unmap_one_page(struct folio *folio,
 		 * migration) will not expect userfaults on already
 		 * copied pages.
 		 */
-		dec_mm_counter(mm, mm_counter(&folio->page));
 		/* We have to invalidate as we cleared the pte */
 		mmu_notifier_invalidate_range(mm, address,
 					      address + PAGE_SIZE);
 	} else if (folio_test_anon(folio)) {
-		swp_entry_t entry = { .val = page_private(subpage) };
+		swp_entry_t entry = { .val = page_private(page) };
 		pte_t swp_pte;
 		/*
 		 * Store the swap location in the pte.
@@ -1614,12 +1610,10 @@ static bool try_to_unmap_one_page(struct folio *folio,
 		if (unlikely(folio_test_swapbacked(folio) !=
 				folio_test_swapcache(folio))) {
 			WARN_ON_ONCE(1);
-			ret = false;
 			/* We have to invalidate as we cleared the pte */
 			mmu_notifier_invalidate_range(mm, address,
 						address + PAGE_SIZE);
-			page_vma_mapped_walk_done(&pvmw);
-			goto discard;
+			goto exit;
 		}
 
 		/* MADV_FREE page check */
@@ -1651,7 +1645,6 @@ static bool try_to_unmap_one_page(struct folio *folio,
 				/* Invalidate as we cleared the pte */
 				mmu_notifier_invalidate_range(mm,
 					address, address + PAGE_SIZE);
-				dec_mm_counter(mm, MM_ANONPAGES);
 				goto discard;
 			}
 
@@ -1659,43 +1652,30 @@ static bool try_to_unmap_one_page(struct folio *folio,
 			 * If the folio was redirtied, it cannot be
 			 * discarded. Remap the page to page table.
 			 */
-			set_pte_at(mm, address, pvmw.pte, pteval);
 			folio_set_swapbacked(folio);
-			ret = false;
-			page_vma_mapped_walk_done(&pvmw);
-			goto discard;
+			goto exit_restore_pte;
 		}
 
-		if (swap_duplicate(entry) < 0) {
-			set_pte_at(mm, address, pvmw.pte, pteval);
-			ret = false;
-			page_vma_mapped_walk_done(&pvmw);
-			goto discard;
-		}
+		if (swap_duplicate(entry) < 0)
+			goto exit_restore_pte;
+
 		if (arch_unmap_one(mm, vma, address, pteval) < 0) {
 			swap_free(entry);
-			set_pte_at(mm, address, pvmw.pte, pteval);
-			ret = false;
-			page_vma_mapped_walk_done(&pvmw);
-			goto discard;
+			goto exit_restore_pte;
 		}
 
 		/* See page_try_share_anon_rmap(): clear PTE first. */
-		if (anon_exclusive &&
-		    page_try_share_anon_rmap(subpage)) {
+		if (anon_exclusive && page_try_share_anon_rmap(page)) {
 			swap_free(entry);
-			set_pte_at(mm, address, pvmw.pte, pteval);
-			ret = false;
-			page_vma_mapped_walk_done(&pvmw);
-			goto discard;
+			goto exit_restore_pte;
 		}
+
 		if (list_empty(&mm->mmlist)) {
 			spin_lock(&mmlist_lock);
 			if (list_empty(&mm->mmlist))
 				list_add(&mm->mmlist, &init_mm.mmlist);
 			spin_unlock(&mmlist_lock);
 		}
-		dec_mm_counter(mm, MM_ANONPAGES);
 		inc_mm_counter(mm, MM_SWAPENTS);
 		swp_pte = swp_entry_to_pte(entry);
 		if (anon_exclusive)
@@ -1706,8 +1686,7 @@ static bool try_to_unmap_one_page(struct folio *folio,
 			swp_pte = pte_swp_mkuffd_wp(swp_pte);
 		set_pte_at(mm, address, pvmw.pte, swp_pte);
 		/* Invalidate as we cleared the pte */
-		mmu_notifier_invalidate_range(mm, address,
-					      address + PAGE_SIZE);
+		mmu_notifier_invalidate_range(mm, address, address + PAGE_SIZE);
 	} else {
 		/*
 		 * This is a locked file-backed folio,
@@ -1720,11 +1699,16 @@ static bool try_to_unmap_one_page(struct folio *folio,
 		 *
 		 * See Documentation/mm/mmu_notifier.rst
 		 */
-		dec_mm_counter(mm, mm_counter_file(&folio->page));
 	}
 
 discard:
-	return ret;
+	dec_mm_counter(vma->vm_mm, mm_counter(&folio->page));
+	return true;
+
+exit_restore_pte:
+	set_pte_at(mm, address, pvmw.pte, pteval);
+exit:
+	return false;
 }
 
 /*
@@ -1802,8 +1786,10 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 					pte_pfn(*pvmw.pte) - folio_pfn(folio));
 		ret = try_to_unmap_one_page(folio, vma,
 					range, pvmw, address, flags);
-		if (!ret)
+		if (!ret) {
+			page_vma_mapped_walk_done(&pvmw);
 			break;
+		}
 
 		/*
 		 * No need to call mmu_notifier_invalidate_range() it has be
@@ -1812,7 +1798,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		 *
 		 * See Documentation/mm/mmu_notifier.rst
 		 */
-		page_remove_rmap(subpage, vma, folio_test_hugetlb(folio));
+		page_remove_rmap(subpage, vma, false);
 		if (vma->vm_flags & VM_LOCKED)
 			mlock_drain_local();
 		folio_put(folio);

From patchwork Mon Mar 13 12:45:25 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13172443
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org,
 mike.kravetz@oracle.com, sidhartha.kumar@oracle.com, naoya.horiguchi@nec.com,
 jane.chu@oracle.com, david@redhat.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v4 4/5] rmap: add folio_remove_rmap_range()
Date: Mon, 13 Mar 2023 20:45:25 +0800
Message-Id: <20230313124526.1207490-5-fengwei.yin@intel.com>
In-Reply-To: <20230313124526.1207490-1-fengwei.yin@intel.com>
References: <20230313124526.1207490-1-fengwei.yin@intel.com>

folio_remove_rmap_range() takes down the pte mappings for a specific
range of pages in a folio. Compared to page_remove_rmap(), it batches
the __lruvec_stat updates for a large folio.

Signed-off-by: Yin Fengwei
---
 include/linux/rmap.h |  4 +++
 mm/rmap.c            | 58 +++++++++++++++++++++++++++++++++-----------
 2 files changed, 48 insertions(+), 14 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index b87d01660412..d2569b42e21a 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -200,6 +200,10 @@ void page_add_file_rmap(struct page *, struct vm_area_struct *,
 		bool compound);
 void page_remove_rmap(struct page *, struct vm_area_struct *,
 		bool compound);
 
+void folio_remove_rmap_range(struct folio *, struct page *,
+			unsigned int nr_pages, struct vm_area_struct *,
+			bool compound);
+
 void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *,
 		unsigned long address, rmap_t flags);
 
diff --git a/mm/rmap.c b/mm/rmap.c
index 72fc8c559cd9..bd5331dc9d44 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1355,23 +1355,25 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
 }
 
 /**
- * page_remove_rmap - take down pte mapping from a page
- * @page:	page to remove mapping from
+ * folio_remove_rmap_range - take down pte mapping from a range of pages
+ * @folio:	folio to remove mapping from
+ * @page:	The first page to take down pte mapping
+ * @nr_pages:	The number of pages whose pte mappings will be taken down
  * @vma:	the vm area from which the mapping is removed
  * @compound:	uncharge the page as compound or small page
  *
  * The caller needs to hold the pte lock.
  */
-void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
-		bool compound)
+void folio_remove_rmap_range(struct folio *folio, struct page *page,
+			unsigned int nr_pages, struct vm_area_struct *vma,
+			bool compound)
 {
-	struct folio *folio = page_folio(page);
 	atomic_t *mapped = &folio->_nr_pages_mapped;
-	int nr = 0, nr_pmdmapped = 0;
-	bool last;
+	int nr = 0, nr_pmdmapped = 0, last;
 	enum node_stat_item idx;
 
-	VM_BUG_ON_PAGE(compound && !PageHead(page), page);
+	VM_BUG_ON_FOLIO(compound && (nr_pages != folio_nr_pages(folio)), folio);
+	VM_BUG_ON_FOLIO(compound && (page != &folio->page), folio);
 
 	/* Hugetlb pages are not counted in NR_*MAPPED */
 	if (unlikely(folio_test_hugetlb(folio))) {
@@ -1382,12 +1384,16 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 
 	/* Is page being unmapped by PTE? Is this its last map to be removed? */
 	if (likely(!compound)) {
-		last = atomic_add_negative(-1, &page->_mapcount);
-		nr = last;
-		if (last && folio_test_large(folio)) {
-			nr = atomic_dec_return_relaxed(mapped);
-			nr = (nr < COMPOUND_MAPPED);
-		}
+		do {
+			last = atomic_add_negative(-1, &page->_mapcount);
+			if (last && folio_test_large(folio)) {
+				last = atomic_dec_return_relaxed(mapped);
+				last = (last < COMPOUND_MAPPED);
+			}
+
+			if (last)
+				nr++;
+		} while (page++, --nr_pages > 0);
 	} else if (folio_test_pmd_mappable(folio)) {
 		/* That test is redundant: it's for safety or to optimize out */
 
@@ -1441,6 +1447,30 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 	munlock_vma_folio(folio, vma, compound);
 }
 
+/**
+ * page_remove_rmap - take down pte mapping from a page
+ * @page:	page to remove mapping from
+ * @vma:	the vm area from which the mapping is removed
+ * @compound:	uncharge the page as compound or small page
+ *
+ * The caller needs to hold the pte lock.
+ */
+void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
+		bool compound)
+{
+	struct folio *folio = page_folio(page);
+	unsigned int nr_pages;
+
+	VM_BUG_ON_FOLIO(compound && (page != &folio->page), folio);
+
+	if (likely(!compound))
+		nr_pages = 1;
+	else
+		nr_pages = folio_nr_pages(folio);
+
+	folio_remove_rmap_range(folio, page, nr_pages, vma, compound);
+}
+
 static bool try_to_unmap_one_hugetlb(struct folio *folio,
 		struct vm_area_struct *vma, struct mmu_notifier_range range,
 		struct page_vma_mapped_walk pvmw, unsigned long address,

From patchwork Mon Mar 13 12:45:26 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13172444
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org,
 mike.kravetz@oracle.com, sidhartha.kumar@oracle.com, naoya.horiguchi@nec.com,
 jane.chu@oracle.com, david@redhat.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v4 5/5] try_to_unmap_one: batched remove rmap, update folio refcount
Date: Mon, 13 Mar 2023 20:45:26 +0800
Message-Id: <20230313124526.1207490-6-fengwei.yin@intel.com>
In-Reply-To: <20230313124526.1207490-1-fengwei.yin@intel.com>
References: <20230313124526.1207490-1-fengwei.yin@intel.com>

When unmapping one page fails, or when the vma walk will skip the
next pte or end at it, remove the rmap for the pages accumulated so
far in one batch and update the folio refcount once.
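
The batching rule in this commit message can be pictured with a small userspace sketch. It assumes nothing about the kernel beyond the description above: unmap_batched(), flush_batch(), enum step and struct flush_log are hypothetical names, and each array element stands in for the outcome of one page in the walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of one walk step; not a kernel type. */
enum step {
	STEP_OK,		/* page unmapped, next pte continues the run */
	STEP_OK_THEN_GAP,	/* page unmapped, next pte is skipped */
	STEP_FAIL		/* unmapping this page failed */
};

/* Records each flush so the batching can be inspected. */
struct flush_log {
	size_t nr_flushes;
	size_t pages_flushed;
};

/* One batched "remove rmap + drop refcount" for count pages. */
void flush_batch(struct flush_log *log, size_t count)
{
	if (count == 0)
		return;
	log->nr_flushes++;
	log->pages_flushed += count;
}

bool unmap_batched(const enum step *steps, size_t n, struct flush_log *log)
{
	size_t count = 0;

	assert(steps && log);

	for (size_t i = 0; i < n; i++) {
		if (steps[i] == STEP_FAIL) {
			/* Keep the work already done for earlier pages. */
			flush_batch(log, count);
			return false;
		}
		count++;
		/* The run ends when a gap follows or the walk is over. */
		if (steps[i] == STEP_OK_THEN_GAP || i + 1 == n) {
			flush_batch(log, count);
			count = 0;
		}
	}
	return true;
}
```

A run of five pages with one gap in the middle therefore costs two flushes instead of five, which is the saving the patch is after.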

Signed-off-by: Yin Fengwei
---
 include/linux/rmap.h |  1 +
 mm/page_vma_mapped.c | 30 +++++++++++++++++++++++++++
 mm/rmap.c            | 48 ++++++++++++++++++++++++++++++++++----------
 3 files changed, 68 insertions(+), 11 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index d2569b42e21a..18193d1d5a8e 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -424,6 +424,7 @@ static inline void page_vma_mapped_walk_done(struct page_vma_mapped_walk *pvmw)
 }
 
 bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw);
+bool pvmw_walk_skip_or_end_on_next(struct page_vma_mapped_walk *pvmw);
 
 /*
  * Used by swapoff to help locate where page is expected in vma.
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 4e448cfbc6ef..d5d69d02f08b 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -291,6 +291,36 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 	return false;
 }
 
+/**
+ * pvmw_walk_skip_or_end_on_next - check if next pte will be skipped or
+ *                                 end the walk
+ * @pvmw: pointer to struct page_vma_mapped_walk.
+ *
+ * This function can only be called with the correct pte lock held
+ */
+bool pvmw_walk_skip_or_end_on_next(struct page_vma_mapped_walk *pvmw)
+{
+	unsigned long address = pvmw->address + PAGE_SIZE;
+
+	if (address >= vma_address_end(pvmw))
+		return true;
+
+	if ((address & (PMD_SIZE - PAGE_SIZE)) == 0)
+		return true;
+
+	pvmw->pte++;
+	if (pte_none(*pvmw->pte))
+		return true;
+
+	if (!check_pte(pvmw)) {
+		pvmw->pte--;
+		return true;
+	}
+	pvmw->pte--;
+
+	return false;
+}
+
 /**
  * page_mapped_in_vma - check whether a page is really mapped in a VMA
  * @page: the page to test
diff --git a/mm/rmap.c b/mm/rmap.c
index bd5331dc9d44..60314c76df59 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1741,6 +1741,26 @@ static bool try_to_unmap_one_page(struct folio *folio,
 	return false;
 }
 
+static void folio_remove_rmap_and_update_count(struct folio *folio,
+		struct page *start, struct vm_area_struct *vma, int count)
+{
+	if (count == 0)
+		return;
+
+	/*
+	 * No need to call mmu_notifier_invalidate_range() it has be
+	 * done above for all cases requiring it to happen under page
+	 * table lock before mmu_notifier_invalidate_range_end()
+	 *
+	 * See Documentation/mm/mmu_notifier.rst
+	 */
+	folio_remove_rmap_range(folio, start, count, vma,
+					folio_test_hugetlb(folio));
+	if (vma->vm_flags & VM_LOCKED)
+		mlock_drain_local();
+	folio_ref_sub(folio, count);
+}
+
 /*
  * @arg: enum ttu_flags will be passed to this argument
  */
@@ -1748,10 +1768,11 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		     unsigned long address, void *arg)
 {
 	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);
-	struct page *subpage;
+	struct page *start = NULL;
 	bool ret = true;
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
+	int count = 0;
 
 	/*
 	 * When racing against e.g. zap_pte_range() on another cpu,
@@ -1812,26 +1833,31 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			break;
 		}
 
-		subpage = folio_page(folio,
+		if (!start)
+			start = folio_page(folio,
 					pte_pfn(*pvmw.pte) - folio_pfn(folio));
 		ret = try_to_unmap_one_page(folio, vma,
 					range, pvmw, address, flags);
 		if (!ret) {
+			folio_remove_rmap_and_update_count(folio,
+							start, vma, count);
 			page_vma_mapped_walk_done(&pvmw);
 			break;
 		}
+		count++;
 
 		/*
-		 * No need to call mmu_notifier_invalidate_range() it has be
-		 * done above for all cases requiring it to happen under page
-		 * table lock before mmu_notifier_invalidate_range_end()
-		 *
-		 * See Documentation/mm/mmu_notifier.rst
+		 * If the next pte will be skipped in page_vma_mapped_walk() or
+		 * the walk will end at it, remove the rmap in a batch and update
+		 * the folio refcount. We can't do it after page_vma_mapped_walk()
+		 * returns false because the pte lock will no longer be held.
 		 */
-		page_remove_rmap(subpage, vma, false);
-		if (vma->vm_flags & VM_LOCKED)
-			mlock_drain_local();
-		folio_put(folio);
+		if (pvmw_walk_skip_or_end_on_next(&pvmw)) {
+			folio_remove_rmap_and_update_count(folio,
+							start, vma, count);
+			count = 0;
+			start = NULL;
+		}
 	}
 
 	mmu_notifier_invalidate_range_end(&range);
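
The two address checks at the top of pvmw_walk_skip_or_end_on_next() can be tried in isolation with the userspace sketch below. It covers only the address arithmetic, not the pte_none() and check_pte() tests. The sizes are assumptions (4 KiB pages, 2 MiB covered by one PMD entry), and next_page_ends_batch() and the SKETCH_* macros are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed sizes: 4 KiB pages, 2 MiB mapped by one PMD entry. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PMD_SIZE  (512UL * SKETCH_PAGE_SIZE)

/*
 * True when the page after address cannot join the current batch
 * because it lies at or past end, or because it starts a new
 * PMD-sized region and therefore lives in a different page table.
 */
bool next_page_ends_batch(unsigned long address, unsigned long end)
{
	unsigned long next = address + SKETCH_PAGE_SIZE;

	/* The mask test below relies on page alignment. */
	assert((address & (SKETCH_PAGE_SIZE - 1)) == 0);

	if (next >= end)
		return true;

	/* next is page aligned, so the masked bits are zero only on a PMD boundary. */
	return (next & (SKETCH_PMD_SIZE - SKETCH_PAGE_SIZE)) == 0;
}
```

For example, with end at 0x400000 the page at 0x1FF000 closes a batch because 0x200000 begins a new PMD region, while the page at 0x1FE000 does not.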