From patchwork Tue Feb 28 12:23:04 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13154853
E=McAfee;i="6500,9779,10634"; a="317921175" X-IronPort-AV: E=Sophos;i="5.98,221,1673942400"; d="scan'208";a="317921175" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Feb 2023 04:22:14 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10634"; a="1003220723" X-IronPort-AV: E=Sophos;i="5.98,221,1673942400"; d="scan'208";a="1003220723" Received: from fyin-dev.sh.intel.com ([10.239.159.32]) by fmsmga005.fm.intel.com with ESMTP; 28 Feb 2023 04:21:58 -0800 From: Yin Fengwei To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org, sidhartha.kumar@oracle.com, mike.kravetz@oracle.com, jane.chu@oracle.com, naoya.horiguchi@nec.com Cc: fengwei.yin@intel.com Subject: [PATCH v2 1/5] rmap: move hugetlb try_to_unmap to dedicated function Date: Tue, 28 Feb 2023 20:23:04 +0800 Message-Id: <20230228122308.2972219-2-fengwei.yin@intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230228122308.2972219-1-fengwei.yin@intel.com> References: <20230228122308.2972219-1-fengwei.yin@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 4E4E1180018 X-Stat-Signature: btgt3w5digrh1y98kjyzcbbgkbb5jhoo X-Rspam-User: X-HE-Tag: 1677586946-223794 X-HE-Meta: U2FsdGVkX1/l2M6T7AmR5mLEUDGjG2C5xx2n1wt7UwlrsuRPWWIKVJxdzbFJLDgPwpYN+C54NlHFuQ1Ax26cZ3fHBF5401ffumvS4o8FIl/d5N7m5bQhU1lXRLO3Xa3qgp/apfylLSjYeJov6mPm4ZmkWKRch4FxfKKf1XZDRpGTUpyhHGqLIgx/SDqrELYrumzRIkuAOXC7HWkCH9pbsvY8o+ebuQ0dIkHp0dqzLzp6ZDt8+A9HobTlr4RZYfRUhflqfVcAIkSizQ2sCxQIVWrNJOrmsMfmxIgEo8PvAX8IoH4quBnDSlOlHXLkUuR9p0Guj0usf2YnVnY7rkLmuZ0Q+UH9i714aRhf2QIsvDwTtPGMfd/ZXmoOFcHciVw9ij09ABpR8hDKGaUZqqekmCReydZ+hWrf5ehRWK7ikPJC17oH0UlqhMp8yRAl/5q9v3yOCn16W09IwptjUpzodpYw9qRTCw6s8Hi5t+F+fKJelQh23D/eQMP1PDGJ5edZuYkF3Vai9NOR4tvnT5PTcPH08xNAVUtfTdFeoE0yfYb2xA4wBy2ZEr8mHRtmw1a3NPyGkUt9DaF2G9BBqJ10fAKK4qLaJVsJmPu1AiweAMPlrPDycm6+p8rmqeykoZMvr/zLrXDEZ6gH49/9X0A+C2FwitNtdauLaEcefly2OCYE9I0sDoJhQ1IyDBLedgcwBls65qJoU6Wpjr8zOeGmssZPJXqO2IjT4vHX6xHI+BgxWT7Cveix0EeChKhB5anExajO3TR84IZ9vtz0vNI67TxMTKoubIZTcOOlPSVwDL1xaUZ7KZmDAPkvOggpNAFCrajru81RCmm3SPTTpCQgVU2zdkiAPzpH4qZm5BOFVYjfUlgE+ZG1YgWSh9Swtktxb0ugLL0wMIP87OJM61F7QoEdpHg0DgihzckonCmchb845X3iFoQVmE06TEH8ugqSCRN8qNdPS83mbBu4JAK J4jnVuD7 V8In2MF40X0pPEGTtCqG3EGzvPoxtFpyK6zE7oJyZ4kR5pBJVP6HQGb+kR8KpDO/Zkt+7j3eWQV5JLNqMLJPcDSKOZ4l5wGqGfniSeSkl3FvSIxiGYwVpex8AgVedcyJmenjN6oWzdOXAbz5KuP6SiNtyBFABtmUWRO9PoRRrlR9xDSczwDngz0sHcTMt3XFYEcSlGwMuKxFZZbcr95gdcy8tsF+V/FjZSwAJ X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: It's to prepare the batched rmap update for large folio. No need to looped handle hugetlb. Just handle hugetlb and bail out early. 
Signed-off-by: Yin Fengwei --- mm/rmap.c | 200 +++++++++++++++++++++++++++++++++--------------------- 1 file changed, 121 insertions(+), 79 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 8632e02661ac..0f09518d6f30 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1443,6 +1443,103 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, munlock_vma_folio(folio, vma, compound); } +static bool try_to_unmap_one_hugetlb(struct folio *folio, + struct vm_area_struct *vma, struct mmu_notifier_range range, + struct page_vma_mapped_walk pvmw, unsigned long address, + enum ttu_flags flags) +{ + struct mm_struct *mm = vma->vm_mm; + pte_t pteval; + bool ret = true, anon = folio_test_anon(folio); + + /* + * The try_to_unmap() is only passed a hugetlb page + * in the case where the hugetlb page is poisoned. + */ + VM_BUG_ON_FOLIO(!folio_test_hwpoison(folio), folio); + /* + * huge_pmd_unshare may unmap an entire PMD page. + * There is no way of knowing exactly which PMDs may + * be cached for this mm, so we must flush them all. + * start/end were already adjusted above to cover this + * range. + */ + flush_cache_range(vma, range.start, range.end); + + /* + * To call huge_pmd_unshare, i_mmap_rwsem must be + * held in write mode. Caller needs to explicitly + * do this outside rmap routines. + * + * We also must hold hugetlb vma_lock in write mode. + * Lock order dictates acquiring vma_lock BEFORE + * i_mmap_rwsem. We can only try lock here and fail + * if unsuccessful. + */ + if (!anon) { + VM_BUG_ON(!(flags & TTU_RMAP_LOCKED)); + if (!hugetlb_vma_trylock_write(vma)) { + ret = false; + goto out; + } + if (huge_pmd_unshare(mm, vma, address, pvmw.pte)) { + hugetlb_vma_unlock_write(vma); + flush_tlb_range(vma, + range.start, range.end); + mmu_notifier_invalidate_range(mm, + range.start, range.end); + /* + * The ref count of the PMD page was + * dropped which is part of the way map + * counting is done for shared PMDs. + * Return 'true' here. When there is + * no other sharing, huge_pmd_unshare + * returns false and we will unmap the + * actual page and drop map count + * to zero. + */ + goto out; + } + hugetlb_vma_unlock_write(vma); + } + pteval = huge_ptep_clear_flush(vma, address, pvmw.pte); + + /* + * Now the pte is cleared. If this pte was uffd-wp armed, + * we may want to replace a none pte with a marker pte if + * it's file-backed, so we don't lose the tracking info. + */ + pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); + + /* Set the dirty flag on the folio now the pte is gone. */ + if (pte_dirty(pteval)) + folio_mark_dirty(folio); + + /* Update high watermark before we lower rss */ + update_hiwater_rss(mm); + + /* Poisoned hugetlb folio with TTU_HWPOISON always cleared in flags */ + pteval = swp_entry_to_pte(make_hwpoison_entry(&folio->page)); + set_huge_pte_at(mm, address, pvmw.pte, pteval); + hugetlb_count_sub(folio_nr_pages(folio), mm); + + /* + * No need to call mmu_notifier_invalidate_range() it has be + * done above for all cases requiring it to happen under page + * table lock before mmu_notifier_invalidate_range_end() + * + * See Documentation/mm/mmu_notifier.rst + */ + page_remove_rmap(&folio->page, vma, folio_test_hugetlb(folio)); + /* No VM_LOCKED set in vma->vm_flags for hugetlb. So not + * necessary to call mlock_drain_local(). 
+ */ + folio_put(folio); + +out: + return ret; +} + /* * @arg: enum ttu_flags will be passed to this argument */ @@ -1506,86 +1603,37 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, break; } + address = pvmw.address; + if (folio_test_hugetlb(folio)) { + ret = try_to_unmap_one_hugetlb(folio, vma, range, + pvmw, address, flags); + + /* no need to loop for hugetlb */ + page_vma_mapped_walk_done(&pvmw); + break; + } + subpage = folio_page(folio, pte_pfn(*pvmw.pte) - folio_pfn(folio)); - address = pvmw.address; anon_exclusive = folio_test_anon(folio) && PageAnonExclusive(subpage); - if (folio_test_hugetlb(folio)) { - bool anon = folio_test_anon(folio); - - /* - * The try_to_unmap() is only passed a hugetlb page - * in the case where the hugetlb page is poisoned. - */ - VM_BUG_ON_PAGE(!PageHWPoison(subpage), subpage); + flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); + /* Nuke the page table entry. */ + if (should_defer_flush(mm, flags)) { /* - * huge_pmd_unshare may unmap an entire PMD page. - * There is no way of knowing exactly which PMDs may - * be cached for this mm, so we must flush them all. - * start/end were already adjusted above to cover this - * range. + * We clear the PTE but do not flush so potentially + * a remote CPU could still be writing to the folio. + * If the entry was previously clean then the + * architecture must guarantee that a clear->dirty + * transition on a cached TLB entry is written through + * and traps if the PTE is unmapped. */ - flush_cache_range(vma, range.start, range.end); + pteval = ptep_get_and_clear(mm, address, pvmw.pte); - /* - * To call huge_pmd_unshare, i_mmap_rwsem must be - * held in write mode. Caller needs to explicitly - * do this outside rmap routines. - * - * We also must hold hugetlb vma_lock in write mode. - * Lock order dictates acquiring vma_lock BEFORE - * i_mmap_rwsem. We can only try lock here and fail - * if unsuccessful. - */ - if (!anon) { - VM_BUG_ON(!(flags & TTU_RMAP_LOCKED)); - if (!hugetlb_vma_trylock_write(vma)) { - page_vma_mapped_walk_done(&pvmw); - ret = false; - break; - } - if (huge_pmd_unshare(mm, vma, address, pvmw.pte)) { - hugetlb_vma_unlock_write(vma); - flush_tlb_range(vma, - range.start, range.end); - mmu_notifier_invalidate_range(mm, - range.start, range.end); - /* - * The ref count of the PMD page was - * dropped which is part of the way map - * counting is done for shared PMDs. - * Return 'true' here. When there is - * no other sharing, huge_pmd_unshare - * returns false and we will unmap the - * actual page and drop map count - * to zero. - */ - page_vma_mapped_walk_done(&pvmw); - break; - } - hugetlb_vma_unlock_write(vma); - } - pteval = huge_ptep_clear_flush(vma, address, pvmw.pte); + set_tlb_ubc_flush_pending(mm, pte_dirty(pteval)); } else { - flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); - /* Nuke the page table entry. */ - if (should_defer_flush(mm, flags)) { - /* - * We clear the PTE but do not flush so potentially - * a remote CPU could still be writing to the folio. - * If the entry was previously clean then the - * architecture must guarantee that a clear->dirty - * transition on a cached TLB entry is written through - * and traps if the PTE is unmapped. 
- */ - pteval = ptep_get_and_clear(mm, address, pvmw.pte); - - set_tlb_ubc_flush_pending(mm, pte_dirty(pteval)); - } else { - pteval = ptep_clear_flush(vma, address, pvmw.pte); - } + pteval = ptep_clear_flush(vma, address, pvmw.pte); } /* @@ -1604,14 +1652,8 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, if (PageHWPoison(subpage) && (flags & TTU_HWPOISON)) { pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); - if (folio_test_hugetlb(folio)) { - hugetlb_count_sub(folio_nr_pages(folio), mm); - set_huge_pte_at(mm, address, pvmw.pte, pteval); - } else { - dec_mm_counter(mm, mm_counter(&folio->page)); - set_pte_at(mm, address, pvmw.pte, pteval); - } - + dec_mm_counter(mm, mm_counter(&folio->page)); + set_pte_at(mm, address, pvmw.pte, pteval); } else if (pte_unused(pteval) && !userfaultfd_armed(vma)) { /* * The guest indicated that the page content is of no
From patchwork Tue Feb 28 12:23:05 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13154852
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org, sidhartha.kumar@oracle.com, mike.kravetz@oracle.com, jane.chu@oracle.com, naoya.horiguchi@nec.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v2 2/5] rmap: move page unmap operation to dedicated function
Date: Tue, 28 Feb 2023 20:23:05 +0800
Message-Id: <20230228122308.2972219-3-fengwei.yin@intel.com>
In-Reply-To: <20230228122308.2972219-1-fengwei.yin@intel.com>
References: <20230228122308.2972219-1-fengwei.yin@intel.com>

No functional change.
Just code reorganized. Signed-off-by: Yin Fengwei --- mm/rmap.c | 369 ++++++++++++++++++++++++++++-------------------------- 1 file changed, 194 insertions(+), 175 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 0f09518d6f30..987ab402392f 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1540,17 +1540,204 @@ static bool try_to_unmap_one_hugetlb(struct folio *folio, return ret; } +static bool try_to_unmap_one_page(struct folio *folio, + struct vm_area_struct *vma, struct mmu_notifier_range range, + struct page_vma_mapped_walk pvmw, unsigned long address, + enum ttu_flags flags) +{ + bool anon_exclusive, ret = true; + struct page *subpage; + struct mm_struct *mm = vma->vm_mm; + pte_t pteval; + + subpage = folio_page(folio, + pte_pfn(*pvmw.pte) - folio_pfn(folio)); + anon_exclusive = folio_test_anon(folio) && + PageAnonExclusive(subpage); + + flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); + /* Nuke the page table entry. */ + if (should_defer_flush(mm, flags)) { + /* + * We clear the PTE but do not flush so potentially + * a remote CPU could still be writing to the folio. + * If the entry was previously clean then the + * architecture must guarantee that a clear->dirty + * transition on a cached TLB entry is written through + * and traps if the PTE is unmapped. + */ + pteval = ptep_get_and_clear(mm, address, pvmw.pte); + + set_tlb_ubc_flush_pending(mm, pte_dirty(pteval)); + } else { + pteval = ptep_clear_flush(vma, address, pvmw.pte); + } + + /* + * Now the pte is cleared. If this pte was uffd-wp armed, + * we may want to replace a none pte with a marker pte if + * it's file-backed, so we don't lose the tracking info. + */ + pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); + + /* Set the dirty flag on the folio now the pte is gone. */ + if (pte_dirty(pteval)) + folio_mark_dirty(folio); + + /* Update high watermark before we lower rss */ + update_hiwater_rss(mm); + + if (PageHWPoison(subpage) && !(flags & TTU_HWPOISON)) { + pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); + dec_mm_counter(mm, mm_counter(&folio->page)); + set_pte_at(mm, address, pvmw.pte, pteval); + } else if (pte_unused(pteval) && !userfaultfd_armed(vma)) { + /* + * The guest indicated that the page content is of no + * interest anymore. Simply discard the pte, vmscan + * will take care of the rest. + * A future reference will then fault in a new zero + * page. When userfaultfd is active, we must not drop + * this page though, as its main user (postcopy + * migration) will not expect userfaults on already + * copied pages. + */ + dec_mm_counter(mm, mm_counter(&folio->page)); + /* We have to invalidate as we cleared the pte */ + mmu_notifier_invalidate_range(mm, address, + address + PAGE_SIZE); + } else if (folio_test_anon(folio)) { + swp_entry_t entry = { .val = page_private(subpage) }; + pte_t swp_pte; + /* + * Store the swap location in the pte. + * See handle_pte_fault() ... 
+ */ + if (unlikely(folio_test_swapbacked(folio) != + folio_test_swapcache(folio))) { + WARN_ON_ONCE(1); + ret = false; + /* We have to invalidate as we cleared the pte */ + mmu_notifier_invalidate_range(mm, address, + address + PAGE_SIZE); + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + + /* MADV_FREE page check */ + if (!folio_test_swapbacked(folio)) { + int ref_count, map_count; + + /* + * Synchronize with gup_pte_range(): + * - clear PTE; barrier; read refcount + * - inc refcount; barrier; read PTE + */ + smp_mb(); + + ref_count = folio_ref_count(folio); + map_count = folio_mapcount(folio); + + /* + * Order reads for page refcount and dirty flag + * (see comments in __remove_mapping()). + */ + smp_rmb(); + + /* + * The only page refs must be one from isolation + * plus the rmap(s) (dropped by discard:). + */ + if (ref_count == 1 + map_count && + !folio_test_dirty(folio)) { + /* Invalidate as we cleared the pte */ + mmu_notifier_invalidate_range(mm, + address, address + PAGE_SIZE); + dec_mm_counter(mm, MM_ANONPAGES); + goto discard; + } + + /* + * If the folio was redirtied, it cannot be + * discarded. Remap the page to page table. + */ + set_pte_at(mm, address, pvmw.pte, pteval); + folio_set_swapbacked(folio); + ret = false; + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + + if (swap_duplicate(entry) < 0) { + set_pte_at(mm, address, pvmw.pte, pteval); + ret = false; + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + if (arch_unmap_one(mm, vma, address, pteval) < 0) { + swap_free(entry); + set_pte_at(mm, address, pvmw.pte, pteval); + ret = false; + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + + /* See page_try_share_anon_rmap(): clear PTE first. */ + if (anon_exclusive && + page_try_share_anon_rmap(subpage)) { + swap_free(entry); + set_pte_at(mm, address, pvmw.pte, pteval); + ret = false; + page_vma_mapped_walk_done(&pvmw); + goto discard; + } + if (list_empty(&mm->mmlist)) { + spin_lock(&mmlist_lock); + if (list_empty(&mm->mmlist)) + list_add(&mm->mmlist, &init_mm.mmlist); + spin_unlock(&mmlist_lock); + } + dec_mm_counter(mm, MM_ANONPAGES); + inc_mm_counter(mm, MM_SWAPENTS); + swp_pte = swp_entry_to_pte(entry); + if (anon_exclusive) + swp_pte = pte_swp_mkexclusive(swp_pte); + if (pte_soft_dirty(pteval)) + swp_pte = pte_swp_mksoft_dirty(swp_pte); + if (pte_uffd_wp(pteval)) + swp_pte = pte_swp_mkuffd_wp(swp_pte); + set_pte_at(mm, address, pvmw.pte, swp_pte); + /* Invalidate as we cleared the pte */ + mmu_notifier_invalidate_range(mm, address, + address + PAGE_SIZE); + } else { + /* + * This is a locked file-backed folio, + * so it cannot be removed from the page + * cache and replaced by a new folio before + * mmu_notifier_invalidate_range_end, so no + * concurrent thread might update its page table + * to point at a new folio while a device is + * still using this folio. 
+ * + * See Documentation/mm/mmu_notifier.rst + */ + dec_mm_counter(mm, mm_counter_file(&folio->page)); + } + +discard: + return ret; +} + /* * @arg: enum ttu_flags will be passed to this argument */ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, unsigned long address, void *arg) { - struct mm_struct *mm = vma->vm_mm; DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0); - pte_t pteval; struct page *subpage; - bool anon_exclusive, ret = true; + bool ret = true; struct mmu_notifier_range range; enum ttu_flags flags = (enum ttu_flags)(long)arg; @@ -1615,179 +1802,11 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, subpage = folio_page(folio, pte_pfn(*pvmw.pte) - folio_pfn(folio)); - anon_exclusive = folio_test_anon(folio) && - PageAnonExclusive(subpage); - - flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); - /* Nuke the page table entry. */ - if (should_defer_flush(mm, flags)) { - /* - * We clear the PTE but do not flush so potentially - * a remote CPU could still be writing to the folio. - * If the entry was previously clean then the - * architecture must guarantee that a clear->dirty - * transition on a cached TLB entry is written through - * and traps if the PTE is unmapped. - */ - pteval = ptep_get_and_clear(mm, address, pvmw.pte); - - set_tlb_ubc_flush_pending(mm, pte_dirty(pteval)); - } else { - pteval = ptep_clear_flush(vma, address, pvmw.pte); - } - - /* - * Now the pte is cleared. If this pte was uffd-wp armed, - * we may want to replace a none pte with a marker pte if - * it's file-backed, so we don't lose the tracking info. - */ - pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); - - /* Set the dirty flag on the folio now the pte is gone. */ - if (pte_dirty(pteval)) - folio_mark_dirty(folio); - - /* Update high watermark before we lower rss */ - update_hiwater_rss(mm); - - if (PageHWPoison(subpage) && (flags & TTU_HWPOISON)) { - pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); - dec_mm_counter(mm, mm_counter(&folio->page)); - set_pte_at(mm, address, pvmw.pte, pteval); - } else if (pte_unused(pteval) && !userfaultfd_armed(vma)) { - /* - * The guest indicated that the page content is of no - * interest anymore. Simply discard the pte, vmscan - * will take care of the rest. - * A future reference will then fault in a new zero - * page. When userfaultfd is active, we must not drop - * this page though, as its main user (postcopy - * migration) will not expect userfaults on already - * copied pages. - */ - dec_mm_counter(mm, mm_counter(&folio->page)); - /* We have to invalidate as we cleared the pte */ - mmu_notifier_invalidate_range(mm, address, - address + PAGE_SIZE); - } else if (folio_test_anon(folio)) { - swp_entry_t entry = { .val = page_private(subpage) }; - pte_t swp_pte; - /* - * Store the swap location in the pte. - * See handle_pte_fault() ... 
- */ - if (unlikely(folio_test_swapbacked(folio) != - folio_test_swapcache(folio))) { - WARN_ON_ONCE(1); - ret = false; - /* We have to invalidate as we cleared the pte */ - mmu_notifier_invalidate_range(mm, address, - address + PAGE_SIZE); - page_vma_mapped_walk_done(&pvmw); - break; - } - - /* MADV_FREE page check */ - if (!folio_test_swapbacked(folio)) { - int ref_count, map_count; - - /* - * Synchronize with gup_pte_range(): - * - clear PTE; barrier; read refcount - * - inc refcount; barrier; read PTE - */ - smp_mb(); - - ref_count = folio_ref_count(folio); - map_count = folio_mapcount(folio); - - /* - * Order reads for page refcount and dirty flag - * (see comments in __remove_mapping()). - */ - smp_rmb(); - - /* - * The only page refs must be one from isolation - * plus the rmap(s) (dropped by discard:). - */ - if (ref_count == 1 + map_count && - !folio_test_dirty(folio)) { - /* Invalidate as we cleared the pte */ - mmu_notifier_invalidate_range(mm, - address, address + PAGE_SIZE); - dec_mm_counter(mm, MM_ANONPAGES); - goto discard; - } - - /* - * If the folio was redirtied, it cannot be - * discarded. Remap the page to page table. - */ - set_pte_at(mm, address, pvmw.pte, pteval); - folio_set_swapbacked(folio); - ret = false; - page_vma_mapped_walk_done(&pvmw); - break; - } - - if (swap_duplicate(entry) < 0) { - set_pte_at(mm, address, pvmw.pte, pteval); - ret = false; - page_vma_mapped_walk_done(&pvmw); - break; - } - if (arch_unmap_one(mm, vma, address, pteval) < 0) { - swap_free(entry); - set_pte_at(mm, address, pvmw.pte, pteval); - ret = false; - page_vma_mapped_walk_done(&pvmw); - break; - } + ret = try_to_unmap_one_page(folio, vma, + range, pvmw, address, flags); + if (!ret) + break; - /* See page_try_share_anon_rmap(): clear PTE first. */ - if (anon_exclusive && - page_try_share_anon_rmap(subpage)) { - swap_free(entry); - set_pte_at(mm, address, pvmw.pte, pteval); - ret = false; - page_vma_mapped_walk_done(&pvmw); - break; - } - if (list_empty(&mm->mmlist)) { - spin_lock(&mmlist_lock); - if (list_empty(&mm->mmlist)) - list_add(&mm->mmlist, &init_mm.mmlist); - spin_unlock(&mmlist_lock); - } - dec_mm_counter(mm, MM_ANONPAGES); - inc_mm_counter(mm, MM_SWAPENTS); - swp_pte = swp_entry_to_pte(entry); - if (anon_exclusive) - swp_pte = pte_swp_mkexclusive(swp_pte); - if (pte_soft_dirty(pteval)) - swp_pte = pte_swp_mksoft_dirty(swp_pte); - if (pte_uffd_wp(pteval)) - swp_pte = pte_swp_mkuffd_wp(swp_pte); - set_pte_at(mm, address, pvmw.pte, swp_pte); - /* Invalidate as we cleared the pte */ - mmu_notifier_invalidate_range(mm, address, - address + PAGE_SIZE); - } else { - /* - * This is a locked file-backed folio, - * so it cannot be removed from the page - * cache and replaced by a new folio before - * mmu_notifier_invalidate_range_end, so no - * concurrent thread might update its page table - * to point at a new folio while a device is - * still using this folio. 
- * - * See Documentation/mm/mmu_notifier.rst - */ - dec_mm_counter(mm, mm_counter_file(&folio->page)); - } -discard: /* * No need to call mmu_notifier_invalidate_range() it has be * done above for all cases requiring it to happen under page
From patchwork Tue Feb 28 12:23:06 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13154855
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org, sidhartha.kumar@oracle.com, mike.kravetz@oracle.com, jane.chu@oracle.com, naoya.horiguchi@nec.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v2 3/5] rmap: cleanup exit path of try_to_unmap_one_page()
Date: Tue, 28 Feb 2023 20:23:06 +0800
Message-Id: <20230228122308.2972219-4-fengwei.yin@intel.com>
In-Reply-To: <20230228122308.2972219-1-fengwei.yin@intel.com>
References: <20230228122308.2972219-1-fengwei.yin@intel.com>

Clean up the exit path of try_to_unmap_one_page() by removing duplicated code. Move page_vma_mapped_walk_done() back to try_to_unmap_one(). Rename subpage to page, as a folio has no concept of a subpage.
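The cleanup follows the usual centralized-exit pattern. The toy function below shows that structure in isolation; clear_entry(), reserve_swap(), restore_entry() and drop_counter() are made-up helpers used only for this sketch, not kernel functions:

#include <stdbool.h>
#include <stdio.h>

static bool clear_entry(void)   { puts("pte cleared");   return true; }
static bool reserve_swap(void)  { puts("swap reserved"); return false; }	/* pretend failure */
static void restore_entry(void) { puts("pte restored"); }
static void drop_counter(void)  { puts("rss counter dropped"); }

/*
 * One success path ("discard") and one failure path that restores the PTE
 * ("exit_restore_pte"), instead of repeating the restore/return sequence at
 * every early return as the old code did.
 */
static bool unmap_one(bool page_unused)
{
	if (!clear_entry())
		goto exit;
	if (page_unused)
		goto discard;		/* nothing to reserve, just account */
	if (!reserve_swap())
		goto exit_restore_pte;

discard:
	drop_counter();
	return true;

exit_restore_pte:
	restore_entry();
exit:
	return false;
}

int main(void)
{
	unmap_one(true);	/* takes the discard path */
	unmap_one(false);	/* fails and restores the PTE */
	return 0;
}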
Signed-off-by: Yin Fengwei --- mm/rmap.c | 74 ++++++++++++++++++++++--------------------------------- 1 file changed, 30 insertions(+), 44 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 987ab402392f..d243e557c6e4 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1530,7 +1530,7 @@ static bool try_to_unmap_one_hugetlb(struct folio *folio, * * See Documentation/mm/mmu_notifier.rst */ - page_remove_rmap(&folio->page, vma, folio_test_hugetlb(folio)); + page_remove_rmap(&folio->page, vma, true); /* No VM_LOCKED set in vma->vm_flags for hugetlb. So not * necessary to call mlock_drain_local(). */ @@ -1545,15 +1545,13 @@ static bool try_to_unmap_one_page(struct folio *folio, struct page_vma_mapped_walk pvmw, unsigned long address, enum ttu_flags flags) { - bool anon_exclusive, ret = true; - struct page *subpage; + bool anon_exclusive; + struct page *page; struct mm_struct *mm = vma->vm_mm; pte_t pteval; - subpage = folio_page(folio, - pte_pfn(*pvmw.pte) - folio_pfn(folio)); - anon_exclusive = folio_test_anon(folio) && - PageAnonExclusive(subpage); + page = folio_page(folio, pte_pfn(*pvmw.pte) - folio_pfn(folio)); + anon_exclusive = folio_test_anon(folio) && PageAnonExclusive(page); flush_cache_page(vma, address, pte_pfn(*pvmw.pte)); /* Nuke the page table entry. */ @@ -1581,15 +1579,14 @@ static bool try_to_unmap_one_page(struct folio *folio, pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); /* Set the dirty flag on the folio now the pte is gone. */ - if (pte_dirty(pteval)) + if (pte_dirty(pteval) && !folio_test_dirty(folio)) folio_mark_dirty(folio); /* Update high watermark before we lower rss */ update_hiwater_rss(mm); - if (PageHWPoison(subpage) && !(flags & TTU_HWPOISON)) { - pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); - dec_mm_counter(mm, mm_counter(&folio->page)); + if (PageHWPoison(page) && !(flags & TTU_HWPOISON)) { + pteval = swp_entry_to_pte(make_hwpoison_entry(page)); set_pte_at(mm, address, pvmw.pte, pteval); } else if (pte_unused(pteval) && !userfaultfd_armed(vma)) { /* @@ -1602,12 +1599,11 @@ static bool try_to_unmap_one_page(struct folio *folio, * migration) will not expect userfaults on already * copied pages. */ - dec_mm_counter(mm, mm_counter(&folio->page)); /* We have to invalidate as we cleared the pte */ mmu_notifier_invalidate_range(mm, address, address + PAGE_SIZE); } else if (folio_test_anon(folio)) { - swp_entry_t entry = { .val = page_private(subpage) }; + swp_entry_t entry = { .val = page_private(page) }; pte_t swp_pte; /* * Store the swap location in the pte. @@ -1616,12 +1612,10 @@ static bool try_to_unmap_one_page(struct folio *folio, if (unlikely(folio_test_swapbacked(folio) != folio_test_swapcache(folio))) { WARN_ON_ONCE(1); - ret = false; /* We have to invalidate as we cleared the pte */ mmu_notifier_invalidate_range(mm, address, address + PAGE_SIZE); - page_vma_mapped_walk_done(&pvmw); - goto discard; + goto exit; } /* MADV_FREE page check */ @@ -1653,7 +1647,6 @@ static bool try_to_unmap_one_page(struct folio *folio, /* Invalidate as we cleared the pte */ mmu_notifier_invalidate_range(mm, address, address + PAGE_SIZE); - dec_mm_counter(mm, MM_ANONPAGES); goto discard; } @@ -1661,43 +1654,30 @@ static bool try_to_unmap_one_page(struct folio *folio, * If the folio was redirtied, it cannot be * discarded. Remap the page to page table. 
*/ - set_pte_at(mm, address, pvmw.pte, pteval); folio_set_swapbacked(folio); - ret = false; - page_vma_mapped_walk_done(&pvmw); - goto discard; + goto exit_restore_pte; } - if (swap_duplicate(entry) < 0) { - set_pte_at(mm, address, pvmw.pte, pteval); - ret = false; - page_vma_mapped_walk_done(&pvmw); - goto discard; - } + if (swap_duplicate(entry) < 0) + goto exit_restore_pte; + if (arch_unmap_one(mm, vma, address, pteval) < 0) { swap_free(entry); - set_pte_at(mm, address, pvmw.pte, pteval); - ret = false; - page_vma_mapped_walk_done(&pvmw); - goto discard; + goto exit_restore_pte; } /* See page_try_share_anon_rmap(): clear PTE first. */ - if (anon_exclusive && - page_try_share_anon_rmap(subpage)) { + if (anon_exclusive && page_try_share_anon_rmap(page)) { swap_free(entry); - set_pte_at(mm, address, pvmw.pte, pteval); - ret = false; - page_vma_mapped_walk_done(&pvmw); - goto discard; + goto exit_restore_pte; } + if (list_empty(&mm->mmlist)) { spin_lock(&mmlist_lock); if (list_empty(&mm->mmlist)) list_add(&mm->mmlist, &init_mm.mmlist); spin_unlock(&mmlist_lock); } - dec_mm_counter(mm, MM_ANONPAGES); inc_mm_counter(mm, MM_SWAPENTS); swp_pte = swp_entry_to_pte(entry); if (anon_exclusive) @@ -1708,8 +1688,7 @@ static bool try_to_unmap_one_page(struct folio *folio, swp_pte = pte_swp_mkuffd_wp(swp_pte); set_pte_at(mm, address, pvmw.pte, swp_pte); /* Invalidate as we cleared the pte */ - mmu_notifier_invalidate_range(mm, address, - address + PAGE_SIZE); + mmu_notifier_invalidate_range(mm, address, address + PAGE_SIZE); } else { /* * This is a locked file-backed folio, @@ -1722,11 +1701,16 @@ static bool try_to_unmap_one_page(struct folio *folio, * * See Documentation/mm/mmu_notifier.rst */ - dec_mm_counter(mm, mm_counter_file(&folio->page)); } discard: - return ret; + dec_mm_counter(vma->vm_mm, mm_counter(&folio->page)); + return true; + +exit_restore_pte: + set_pte_at(mm, address, pvmw.pte, pteval); +exit: + return false; } /* @@ -1804,8 +1788,10 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, pte_pfn(*pvmw.pte) - folio_pfn(folio)); ret = try_to_unmap_one_page(folio, vma, range, pvmw, address, flags); - if (!ret) + if (!ret) { + page_vma_mapped_walk_done(&pvmw); break; + } /* * No need to call mmu_notifier_invalidate_range() it has be @@ -1814,7 +1800,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, * * See Documentation/mm/mmu_notifier.rst */ - page_remove_rmap(subpage, vma, folio_test_hugetlb(folio)); + page_remove_rmap(subpage, vma, false); if (vma->vm_flags & VM_LOCKED) mlock_drain_local(); folio_put(folio); From patchwork Tue Feb 28 12:23:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yin Fengwei X-Patchwork-Id: 13154856 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 85603C64EC7 for ; Tue, 28 Feb 2023 12:22:34 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 56CAC6B007B; Tue, 28 Feb 2023 07:22:31 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 47E4D6B0080; Tue, 28 Feb 2023 07:22:31 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 238966B007B; Tue, 28 Feb 2023 07:22:31 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com 
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org, sidhartha.kumar@oracle.com, mike.kravetz@oracle.com, jane.chu@oracle.com, naoya.horiguchi@nec.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v2 4/5] rmap: add folio_remove_rmap_range()
Date: Tue, 28 Feb 2023 20:23:07 +0800
Message-Id: <20230228122308.2972219-5-fengwei.yin@intel.com>
In-Reply-To: <20230228122308.2972219-1-fengwei.yin@intel.com>
References: <20230228122308.2972219-1-fengwei.yin@intel.com>

folio_remove_rmap_range() takes down the PTE mappings for a specific range of pages within a folio. Compared to page_remove_rmap(), it batches the __lruvec_stat updates for large folios.

Signed-off-by: Yin Fengwei --- include/linux/rmap.h | 4 +++ mm/rmap.c | 58 +++++++++++++++++++++++++++++++++----------- 2 files changed, 48 insertions(+), 14 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index b87d01660412..d2569b42e21a 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -200,6 +200,10 @@ void page_add_file_rmap(struct page *, struct vm_area_struct *, bool compound); void page_remove_rmap(struct page *, struct vm_area_struct *, bool compound); +void folio_remove_rmap_range(struct folio *, struct page *, + unsigned int nr_pages, struct vm_area_struct *, + bool compound); + void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *, unsigned long address, rmap_t flags); diff --git a/mm/rmap.c b/mm/rmap.c index d243e557c6e4..fc02a8f9c59c 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1357,23 +1357,25 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma, } /** - * page_remove_rmap - take down pte mapping from a page - * @page: page to remove mapping from + * folio_remove_rmap_range - take down pte mapping from a range of pages + * @folio: folio to remove mapping from + * @page: The first page to take down pte mapping + * @nr_pages: The number of pages which will be take down pte mapping * @vma: the vm area from which the mapping is removed * @compound: uncharge the page as compound or small page * * The caller needs to hold the pte lock.
*/ -void page_remove_rmap(struct page *page, struct vm_area_struct *vma, - bool compound) +void folio_remove_rmap_range(struct folio *folio, struct page *page, + unsigned int nr_pages, struct vm_area_struct *vma, + bool compound) { - struct folio *folio = page_folio(page); atomic_t *mapped = &folio->_nr_pages_mapped; - int nr = 0, nr_pmdmapped = 0; - bool last; + int nr = 0, nr_pmdmapped = 0, last; enum node_stat_item idx; - VM_BUG_ON_PAGE(compound && !PageHead(page), page); + VM_BUG_ON_FOLIO(compound && (nr_pages != folio_nr_pages(folio)), folio); + VM_BUG_ON_FOLIO(compound && (page != &folio->page), folio); /* Hugetlb pages are not counted in NR_*MAPPED */ if (unlikely(folio_test_hugetlb(folio))) { @@ -1384,12 +1386,16 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, /* Is page being unmapped by PTE? Is this its last map to be removed? */ if (likely(!compound)) { - last = atomic_add_negative(-1, &page->_mapcount); - nr = last; - if (last && folio_test_large(folio)) { - nr = atomic_dec_return_relaxed(mapped); - nr = (nr < COMPOUND_MAPPED); - } + do { + last = atomic_add_negative(-1, &page->_mapcount); + if (last && folio_test_large(folio)) { + last = atomic_dec_return_relaxed(mapped); + last = (last < COMPOUND_MAPPED); + } + + if (last) + nr++; + } while (page++, --nr_pages > 0); } else if (folio_test_pmd_mappable(folio)) { /* That test is redundant: it's for safety or to optimize out */ @@ -1443,6 +1449,30 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, munlock_vma_folio(folio, vma, compound); } +/** + * page_remove_rmap - take down pte mapping from a page + * @page: page to remove mapping from + * @vma: the vm area from which the mapping is removed + * @compound: uncharge the page as compound or small page + * + * The caller needs to hold the pte lock. 
+ */ +void page_remove_rmap(struct page *page, struct vm_area_struct *vma, + bool compound) +{ + struct folio *folio = page_folio(page); + unsigned int nr_pages; + + VM_BUG_ON_FOLIO(compound && (page != &folio->page), folio); + + if (likely(!compound)) + nr_pages = 1; + else + nr_pages = folio_nr_pages(folio); + + folio_remove_rmap_range(folio, page, nr_pages, vma, compound); +} + static bool try_to_unmap_one_hugetlb(struct folio *folio, struct vm_area_struct *vma, struct mmu_notifier_range range, struct page_vma_mapped_walk pvmw, unsigned long address, enum ttu_flags flags)
From patchwork Tue Feb 28 12:23:08 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13154854
From patchwork Tue Feb 28 12:23:08 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13154854
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org,
 sidhartha.kumar@oracle.com, mike.kravetz@oracle.com, jane.chu@oracle.com,
 naoya.horiguchi@nec.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v2 5/5] try_to_unmap_one: batched remove rmap, update folio refcount
Date: Tue, 28 Feb 2023 20:23:08 +0800
Message-Id: <20230228122308.2972219-6-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20230228122308.2972219-1-fengwei.yin@intel.com>
References: <20230228122308.2972219-1-fengwei.yin@intel.com>

If unmapping one page fails, or the vma walk will skip the next pte, or
the walk will end at the next pte, remove the batched rmap mappings and
update the folio refcount in one call.
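In other words, the per-pte rmap removal and refcount update are deferred and
flushed in batches while the pte lock is still held. The following condensed
sketch is not the patch itself (the mmu_notifier setup and error details are
omitted); it only shows the shape of the resulting loop, using the helpers
introduced by this series:

	/* condensed view of the batched loop in try_to_unmap_one() */
	while (page_vma_mapped_walk(&pvmw)) {
		if (!start)	/* remember the first subpage of this run */
			start = folio_page(folio,
					pte_pfn(*pvmw.pte) - folio_pfn(folio));

		if (!try_to_unmap_one_page(folio, vma, range, pvmw,
					   address, flags)) {
			/* unmap failed: flush what we have and stop */
			folio_remove_rmap_and_update_count(folio, start, vma, count);
			page_vma_mapped_walk_done(&pvmw);
			break;
		}
		count++;

		/* flush while the pte lock is still held */
		if (pvmw_walk_skip_or_end_on_next(&pvmw)) {
			folio_remove_rmap_and_update_count(folio, start, vma, count);
			count = 0;
			start = NULL;
		}
	}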
Signed-off-by: Yin Fengwei
---
 include/linux/rmap.h |  1 +
 mm/page_vma_mapped.c | 30 +++++++++++++++++++++++++++
 mm/rmap.c            | 48 ++++++++++++++++++++++++++++++++++----
 3 files changed, 68 insertions(+), 11 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index d2569b42e21a..18193d1d5a8e 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -424,6 +424,7 @@ static inline void page_vma_mapped_walk_done(struct page_vma_mapped_walk *pvmw)
 }
 
 bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw);
+bool pvmw_walk_skip_or_end_on_next(struct page_vma_mapped_walk *pvmw);
 
 /*
  * Used by swapoff to help locate where page is expected in vma.
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 4e448cfbc6ef..19e997dfb5c6 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -291,6 +291,36 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 	return false;
 }
 
+/**
+ * pvmw_walk_skip_or_end_on_next - check whether the next pte will be
+ * skipped or will end the walk
+ * @pvmw: pointer to struct page_vma_mapped_walk.
+ *
+ * This function can only be called with the correct pte lock held.
+ */
+bool pvmw_walk_skip_or_end_on_next(struct page_vma_mapped_walk *pvmw)
+{
+	unsigned long address = pvmw->address + PAGE_SIZE;
+
+	if (address >= vma_address_end(pvmw))
+		return true;
+
+	if ((address & (PMD_SIZE - PAGE_SIZE)) == 0)
+		return true;
+
+	if (pte_none(*pvmw->pte))
+		return true;
+
+	pvmw->pte++;
+	if (!check_pte(pvmw)) {
+		pvmw->pte--;
+		return true;
+	}
+	pvmw->pte--;
+
+	return false;
+}
+
 /**
  * page_mapped_in_vma - check whether a page is really mapped in a VMA
  * @page: the page to test
diff --git a/mm/rmap.c b/mm/rmap.c
index fc02a8f9c59c..a6ed95b89078 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1743,6 +1743,26 @@ static bool try_to_unmap_one_page(struct folio *folio,
 	return false;
 }
 
+static void folio_remove_rmap_and_update_count(struct folio *folio,
+		struct page *start, struct vm_area_struct *vma, int count)
+{
+	if (count == 0)
+		return;
+
+	/*
+	 * No need to call mmu_notifier_invalidate_range(): it has been
+	 * done above for all cases requiring it to happen under page
+	 * table lock before mmu_notifier_invalidate_range_end()
+	 *
+	 * See Documentation/mm/mmu_notifier.rst
+	 */
+	folio_remove_rmap_range(folio, start, count, vma,
+					folio_test_hugetlb(folio));
+	if (vma->vm_flags & VM_LOCKED)
+		mlock_drain_local();
+	folio_ref_sub(folio, count);
+}
+
 /*
  * @arg: enum ttu_flags will be passed to this argument
  */
@@ -1750,10 +1770,11 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		     unsigned long address, void *arg)
 {
 	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);
-	struct page *subpage;
+	struct page *start = NULL;
 	bool ret = true;
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
+	int count = 0;
 
 	/*
	 * When racing against e.g. zap_pte_range() on another cpu,
@@ -1814,26 +1835,31 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			break;
 		}
 
-		subpage = folio_page(folio,
+		if (!start)
+			start = folio_page(folio,
 					pte_pfn(*pvmw.pte) - folio_pfn(folio));
 		ret = try_to_unmap_one_page(folio, vma,
 						range, pvmw, address, flags);
 		if (!ret) {
+			folio_remove_rmap_and_update_count(folio,
+							start, vma, count);
 			page_vma_mapped_walk_done(&pvmw);
 			break;
 		}
+		count++;
 
 		/*
-		 * No need to call mmu_notifier_invalidate_range() it has be
-		 * done above for all cases requiring it to happen under page
-		 * table lock before mmu_notifier_invalidate_range_end()
-		 *
-		 * See Documentation/mm/mmu_notifier.rst
+		 * If the next pte will be skipped in page_vma_mapped_walk()
+		 * or the walk will end at it, do the batched rmap removal
+		 * and refcount update now: we can't do it after the walk
+		 * returns false because the pte lock is no longer held.
 		 */
-		page_remove_rmap(subpage, vma, false);
-		if (vma->vm_flags & VM_LOCKED)
-			mlock_drain_local();
-		folio_put(folio);
+		if (pvmw_walk_skip_or_end_on_next(&pvmw)) {
+			folio_remove_rmap_and_update_count(folio,
+							start, vma, count);
+			count = 0;
+			start = NULL;
+		}
 	}
 
 	mmu_notifier_invalidate_range_end(&range);
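For reference, the (address & (PMD_SIZE - PAGE_SIZE)) == 0 test in
pvmw_walk_skip_or_end_on_next() is what ends a batch at a page-table
boundary: once the next page-aligned address is PMD-aligned, its pte lives in
a different page table whose lock is not held. A small stand-alone
illustration, assuming the common 4KB-page / 2MB-PMD configuration (the
constants and addresses below are illustrative only, not kernel code):

	#include <stdio.h>

	#define PAGE_SIZE 0x1000UL
	#define PMD_SIZE  0x200000UL

	int main(void)
	{
		/* PMD_SIZE - PAGE_SIZE == 0x1ff000 masks the pte-index bits */
		unsigned long next = 0x7f0000200000UL; /* PMD-aligned: stop batching  */
		unsigned long same = 0x7f0000201000UL; /* same page table: keep going */

		printf("%d\n", (next & (PMD_SIZE - PAGE_SIZE)) == 0); /* prints 1 */
		printf("%d\n", (same & (PMD_SIZE - PAGE_SIZE)) == 0); /* prints 0 */
		return 0;
	}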