From patchwork Mon Aug 5 12:55:05 2024
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13753583
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
	muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
	zokeefe@google.com, rientjes@google.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [RFC PATCH v2 1/7] mm: pgtable: make pte_offset_map_nolock() return pmdval
Date: Mon, 5 Aug 2024 20:55:05 +0800

Make pte_offset_map_nolock() return the pmdval so that we can recheck
*pmd once the lock is taken. This is preparation for freeing empty PTE
pages; no functional changes are expected.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 Documentation/mm/split_page_table_lock.rst |  3 ++-
 arch/arm/mm/fault-armv.c                   |  2 +-
 arch/powerpc/mm/pgtable.c                  |  2 +-
 include/linux/mm.h                         |  4 ++--
 mm/filemap.c                               |  2 +-
 mm/khugepaged.c                            |  4 ++--
 mm/memory.c                                |  4 ++--
 mm/mremap.c                                |  2 +-
 mm/page_vma_mapped.c                       |  2 +-
 mm/pgtable-generic.c                       | 21 ++++++++++++---------
 mm/userfaultfd.c                           |  4 ++--
 mm/vmscan.c                                |  2 +-
 12 files changed, 28 insertions(+), 24 deletions(-)

diff --git a/Documentation/mm/split_page_table_lock.rst b/Documentation/mm/split_page_table_lock.rst
index e4f6972eb6c04..e6a47d57531cd 100644
--- a/Documentation/mm/split_page_table_lock.rst
+++ b/Documentation/mm/split_page_table_lock.rst
@@ -18,7 +18,8 @@ There are helpers to lock/unlock a table and other accessor functions:
 	pointer to its PTE table lock, or returns NULL if no PTE table;
  - pte_offset_map_nolock()
 	maps PTE, returns pointer to PTE with pointer to its PTE table
-	lock (not taken), or returns NULL if no PTE table;
+	lock (not taken) and the value of its pmd entry, or returns NULL
+	if no PTE table;
  - pte_offset_map()
 	maps PTE, returns pointer to PTE, or returns NULL if no PTE table;
  - pte_unmap()
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index 831793cd6ff94..db07e6a05eb6e 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -117,7 +117,7 @@ static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
 	 * must use the nested version. This also means we need to
 	 * open-code the spin-locking.
 	 */
-	pte = pte_offset_map_nolock(vma->vm_mm, pmd, address, &ptl);
+	pte = pte_offset_map_nolock(vma->vm_mm, pmd, NULL, address, &ptl);
 	if (!pte)
 		return 0;
 
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 7316396e452d8..9b67d2a1457ed 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -398,7 +398,7 @@ void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
 	 */
 	if (pmd_none(*pmd))
 		return;
-	pte = pte_offset_map_nolock(mm, pmd, addr, &ptl);
+	pte = pte_offset_map_nolock(mm, pmd, NULL, addr, &ptl);
 	BUG_ON(!pte);
 	assert_spin_locked(ptl);
 	pte_unmap(pte);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 43b40334e9b28..b1ef2afe620c5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2937,8 +2937,8 @@ static inline pte_t *pte_offset_map_lock(struct mm_struct *mm, pmd_t *pmd,
 	return pte;
 }
 
-pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd,
-			     unsigned long addr, spinlock_t **ptlp);
+pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdvalp,
+			     unsigned long addr, spinlock_t **ptlp);
 
 #define pte_unmap_unlock(pte, ptl)	do {		\
 	spin_unlock(ptl);				\
diff --git a/mm/filemap.c b/mm/filemap.c
index 67c3f5136db33..3285dffb64cf8 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3231,7 +3231,7 @@ static vm_fault_t filemap_fault_recheck_pte_none(struct vm_fault *vmf)
 	if (!(vmf->flags & FAULT_FLAG_ORIG_PTE_VALID))
 		return 0;
 
-	ptep = pte_offset_map_nolock(vma->vm_mm, vmf->pmd, vmf->address,
+	ptep = pte_offset_map_nolock(vma->vm_mm, vmf->pmd, NULL, vmf->address,
 				     &vmf->ptl);
 	if (unlikely(!ptep))
 		return VM_FAULT_NOPAGE;
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index cdd1d8655a76b..91b93259ee214 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1009,7 +1009,7 @@ static int __collapse_huge_page_swapin(struct mm_struct *mm,
 		};
 
 		if (!pte++) {
-			pte = pte_offset_map_nolock(mm, pmd, address, &ptl);
+			pte = pte_offset_map_nolock(mm, pmd, NULL, address, &ptl);
 			if (!pte) {
 				mmap_read_unlock(mm);
 				result = SCAN_PMD_NULL;
@@ -1598,7 +1598,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	if (userfaultfd_armed(vma) && !(vma->vm_flags & VM_SHARED))
 		pml = pmd_lock(mm, pmd);
 
-	start_pte = pte_offset_map_nolock(mm, pmd, haddr, &ptl);
+	start_pte = pte_offset_map_nolock(mm, pmd, NULL, haddr, &ptl);
 	if (!start_pte)		/* mmap_lock + page lock should prevent this */
 		goto abort;
 	if (!pml)
diff --git a/mm/memory.c b/mm/memory.c
index d6a9dcddaca4a..afd8a967fb953 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1108,7 +1108,7 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
 		ret = -ENOMEM;
 		goto out;
 	}
-	src_pte = pte_offset_map_nolock(src_mm, src_pmd, addr, &src_ptl);
+	src_pte = pte_offset_map_nolock(src_mm, src_pmd, NULL, addr, &src_ptl);
 	if (!src_pte) {
 		pte_unmap_unlock(dst_pte, dst_ptl);
 		/* ret == 0 */
@@ -5671,7 +5671,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 		 * it into a huge pmd: just retry later if so.
 		 */
 		vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, vmf->pmd,
-						 vmf->address, &vmf->ptl);
+						 NULL, vmf->address, &vmf->ptl);
 		if (unlikely(!vmf->pte))
 			return 0;
 		vmf->orig_pte = ptep_get_lockless(vmf->pte);
diff --git a/mm/mremap.c b/mm/mremap.c
index e7ae140fc6409..f672d0218a6fe 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -175,7 +175,7 @@ static int move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
 		err = -EAGAIN;
 		goto out;
 	}
-	new_pte = pte_offset_map_nolock(mm, new_pmd, new_addr, &new_ptl);
+	new_pte = pte_offset_map_nolock(mm, new_pmd, NULL, new_addr, &new_ptl);
 	if (!new_pte) {
 		pte_unmap_unlock(old_pte, old_ptl);
 		err = -EAGAIN;
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index ae5cc42aa2087..507701b7bcc1e 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -33,7 +33,7 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, spinlock_t **ptlp)
 	 * Though, in most cases, page lock already protects this.
 	 */
 	pvmw->pte = pte_offset_map_nolock(pvmw->vma->vm_mm, pvmw->pmd,
-					  pvmw->address, ptlp);
+					  NULL, pvmw->address, ptlp);
 	if (!pvmw->pte)
 		return false;
 
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index a78a4adf711ac..443e3b34434a5 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -305,7 +305,7 @@ pte_t *__pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp)
 	return NULL;
 }
 
-pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd,
+pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdvalp,
 			     unsigned long addr, spinlock_t **ptlp)
 {
 	pmd_t pmdval;
@@ -314,6 +314,8 @@ pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd,
 	pte = __pte_offset_map(pmd, addr, &pmdval);
 	if (likely(pte))
 		*ptlp = pte_lockptr(mm, &pmdval);
+	if (pmdvalp)
+		*pmdvalp = pmdval;
 	return pte;
 }
 
@@ -347,14 +349,15 @@ pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd,
  * and disconnected table. Until pte_unmap(pte) unmaps and rcu_read_unlock()s
  * afterwards.
  *
- * pte_offset_map_nolock(mm, pmd, addr, ptlp), above, is like pte_offset_map();
- * but when successful, it also outputs a pointer to the spinlock in ptlp - as
- * pte_offset_map_lock() does, but in this case without locking it. This helps
- * the caller to avoid a later pte_lockptr(mm, *pmd), which might by that time
- * act on a changed *pmd: pte_offset_map_nolock() provides the correct spinlock
- * pointer for the page table that it returns. In principle, the caller should
- * recheck *pmd once the lock is taken; in practice, no callsite needs that -
- * either the mmap_lock for write, or pte_same() check on contents, is enough.
+ * pte_offset_map_nolock(mm, pmd, pmdvalp, addr, ptlp), above, is like
+ * pte_offset_map(); but when successful, it also outputs a pointer to the
+ * spinlock in ptlp - as pte_offset_map_lock() does, but in this case without
+ * locking it. This helps the caller to avoid a later pte_lockptr(mm, *pmd),
+ * which might by that time act on a changed *pmd: pte_offset_map_nolock()
+ * provides the correct spinlock pointer for the page table that it returns.
+ * In principle, the caller should recheck *pmd once the lock is taken; but in
+ * most cases, either the mmap_lock for write, or pte_same() check on contents,
+ * is enough.
  *
  * Note that free_pgtables(), used after unmapping detached vmas, or when
  * exiting the whole mm, does not take page table lock before freeing a page
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 3b7715ecf292a..aa3c9cc51cc36 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1143,7 +1143,7 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 				src_addr, src_addr + PAGE_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 retry:
-	dst_pte = pte_offset_map_nolock(mm, dst_pmd, dst_addr, &dst_ptl);
+	dst_pte = pte_offset_map_nolock(mm, dst_pmd, NULL, dst_addr, &dst_ptl);
 
 	/* Retry if a huge pmd materialized from under us */
 	if (unlikely(!dst_pte)) {
@@ -1151,7 +1151,7 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 		goto out;
 	}
 
-	src_pte = pte_offset_map_nolock(mm, src_pmd, src_addr, &src_ptl);
+	src_pte = pte_offset_map_nolock(mm, src_pmd, NULL, src_addr, &src_ptl);
 
 	/*
 	 * We held the mmap_lock for reading so MADV_DONTNEED
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 31d13462571e6..b00cd560c0e43 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3378,7 +3378,7 @@ static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end,
 	DEFINE_MAX_SEQ(walk->lruvec);
 	int old_gen, new_gen = lru_gen_from_seq(max_seq);
 
-	pte = pte_offset_map_nolock(args->mm, pmd, start & PMD_MASK, &ptl);
+	pte = pte_offset_map_nolock(args->mm, pmd, NULL, start & PMD_MASK, &ptl);
 	if (!pte)
 		return false;
 	if (!spin_trylock(ptl)) {
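
(Illustrative aside, not part of the posted series: the recheck pattern that the
new pmdval output enables looks roughly like the sketch below. example_walk()
is a made-up caller; the same pattern appears in the khugepaged and pt_reclaim
hunks of patch 4.)

	static int example_walk(struct mm_struct *mm, pmd_t *pmd,
				unsigned long addr)
	{
		spinlock_t *ptl;
		pmd_t pmdval;
		pte_t *pte;

		pte = pte_offset_map_nolock(mm, pmd, &pmdval, addr, &ptl);
		if (!pte)
			return 0;

		spin_lock(ptl);
		/*
		 * The PTE page may have been freed and the pmd entry changed
		 * between mapping and locking, so recheck before using it.
		 */
		if (unlikely(!pmd_same(pmdval, pmdp_get_lockless(pmd)))) {
			pte_unmap_unlock(pte, ptl);
			return 0;	/* caller retries at a higher level */
		}

		/* ... operate on the PTE page under ptl ... */

		pte_unmap_unlock(pte, ptl);
		return 0;
	}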

From patchwork Mon Aug 5 12:55:06 2024
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13753584
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
	muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
	zokeefe@google.com, rientjes@google.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [RFC PATCH v2 2/7] mm: introduce CONFIG_PT_RECLAIM
Date: Mon, 5 Aug 2024 20:55:06 +0800
Message-Id: <7c726839e2610f1873d9fa2a7c60715796579d1a.1722861064.git.zhengqi.arch@bytedance.com>

This configuration variable will be used to build the code needed to
free empty user page table pages. The feature is not available on all
architectures yet, so a gating ARCH_SUPPORTS_PT_RECLAIM symbol is
needed; we can remove it once all architectures support this feature.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 mm/Kconfig | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index 3936fe4d26d91..c10741c54dcb1 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1278,6 +1278,20 @@ config NUMA_EMU
 	  into virtual nodes when booted with "numa=fake=N", where N is the
 	  number of nodes. This is only useful for debugging.
 
+config ARCH_SUPPORTS_PT_RECLAIM
+	def_bool n
+
+config PT_RECLAIM
+	bool "reclaim empty user page table pages"
+	default y
+	depends on ARCH_SUPPORTS_PT_RECLAIM && MMU && SMP
+	select MMU_GATHER_RCU_TABLE_FREE
+	help
+	  Try to reclaim empty user page table pages in paths other than the
+	  munmap and exit_mmap paths.
+
+	  Note: now only empty user PTE page table pages will be reclaimed.
+
 source "mm/damon/Kconfig"
 
 endmenu
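
(Illustrative aside, not part of this patch: an architecture opts in by
selecting the gate symbol from its own Kconfig. A hypothetical x86 enablement,
shown only to illustrate the pattern, would look like this fragment.)

	config X86
		def_bool y
		...
		select ARCH_SUPPORTS_PT_RECLAIM		if X86_64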

From patchwork Mon Aug 5 12:55:07 2024
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13753585
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
	muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
	zokeefe@google.com, rientjes@google.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [RFC PATCH v2 3/7] mm: pass address information to pmd_install()
Date: Mon, 5 Aug 2024 20:55:07 +0800
Message-Id: <095dc55b68ef4650e2eaf66ad7dd2feabe87f89e.1722861064.git.zhengqi.arch@bytedance.com>

The subsequent implementation of freeing empty page table pages needs
the address information to flush the TLB, so pass the address down to
pmd_install() in advance. No functional changes.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 include/linux/hugetlb.h |  2 +-
 include/linux/mm.h      |  9 +++++----
 mm/debug_vm_pgtable.c   |  2 +-
 mm/filemap.c            |  2 +-
 mm/gup.c                |  2 +-
 mm/internal.h           |  3 ++-
 mm/memory.c             | 15 ++++++++-------
 mm/migrate_device.c     |  2 +-
 mm/mprotect.c           |  8 ++++----
 mm/mremap.c             |  2 +-
 mm/userfaultfd.c        |  6 +++---
 11 files changed, 28 insertions(+), 25 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index a76db143bffee..fcdcef367fffe 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -189,7 +189,7 @@ static inline pte_t *pte_offset_huge(pmd_t *pmd, unsigned long address)
 static inline pte_t *pte_alloc_huge(struct mm_struct *mm, pmd_t *pmd,
 				    unsigned long address)
 {
-	return pte_alloc(mm, pmd) ? NULL : pte_offset_huge(pmd, address);
+	return pte_alloc(mm, pmd, address) ? NULL : pte_offset_huge(pmd, address);
 }
 #endif
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b1ef2afe620c5..f0b821dcb085b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2758,7 +2758,7 @@ static inline void mm_inc_nr_ptes(struct mm_struct *mm) {}
 static inline void mm_dec_nr_ptes(struct mm_struct *mm) {}
 #endif
 
-int __pte_alloc(struct mm_struct *mm, pmd_t *pmd);
+int __pte_alloc(struct mm_struct *mm, pmd_t *pmd, unsigned long addr);
 int __pte_alloc_kernel(pmd_t *pmd);
 
 #if defined(CONFIG_MMU)
@@ -2945,13 +2945,14 @@ pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdvalp,
 	pte_unmap(pte);					\
 } while (0)
 
-#define pte_alloc(mm, pmd) (unlikely(pmd_none(*(pmd))) && __pte_alloc(mm, pmd))
+#define pte_alloc(mm, pmd, addr)					\
+	(unlikely(pmd_none(*(pmd))) && __pte_alloc(mm, pmd, addr))
 
 #define pte_alloc_map(mm, pmd, address)			\
-	(pte_alloc(mm, pmd) ? NULL : pte_offset_map(pmd, address))
+	(pte_alloc(mm, pmd, address) ? NULL : pte_offset_map(pmd, address))
 
 #define pte_alloc_map_lock(mm, pmd, address, ptlp)	\
-	(pte_alloc(mm, pmd) ?			\
+	(pte_alloc(mm, pmd, address) ?			\
 			NULL : pte_offset_map_lock(mm, pmd, address, ptlp))
 
 #define pte_alloc_kernel(pmd, address)			\
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index e4969fb54da34..18375744e1845 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -1246,7 +1246,7 @@ static int __init init_args(struct pgtable_debug_args *args)
 	args->start_pmdp = pmd_offset(args->pudp, 0UL);
 	WARN_ON(!args->start_pmdp);
 
-	if (pte_alloc(args->mm, args->pmdp)) {
+	if (pte_alloc(args->mm, args->pmdp, args->vaddr)) {
 		pr_err("Failed to allocate pte entries\n");
 		ret = -ENOMEM;
 		goto error;
diff --git a/mm/filemap.c b/mm/filemap.c
index 3285dffb64cf8..efcb8ae3f235f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3453,7 +3453,7 @@ static bool filemap_map_pmd(struct vm_fault *vmf, struct folio *folio,
 	}
 
 	if (pmd_none(*vmf->pmd) && vmf->prealloc_pte)
-		pmd_install(mm, vmf->pmd, &vmf->prealloc_pte);
+		pmd_install(mm, vmf->pmd, vmf->address, &vmf->prealloc_pte);
 
 	return false;
 }
diff --git a/mm/gup.c b/mm/gup.c
index d19884e097fd2..53c3b73810150 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -972,7 +972,7 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
 		spin_unlock(ptl);
 		split_huge_pmd(vma, pmd, address);
 		/* If pmd was left empty, stuff a page table in there quickly */
-		return pte_alloc(mm, pmd) ? ERR_PTR(-ENOMEM) :
+		return pte_alloc(mm, pmd, address) ? ERR_PTR(-ENOMEM) :
 			follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
 	}
 	page = follow_huge_pmd(vma, address, pmd, flags, ctx);
diff --git a/mm/internal.h b/mm/internal.h
index 52f7fc4e8ac30..dfc992de01115 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -325,7 +325,8 @@ void folio_activate(struct folio *folio);
 void free_pgtables(struct mmu_gather *tlb, struct ma_state *mas,
 		   struct vm_area_struct *start_vma, unsigned long floor,
 		   unsigned long ceiling, bool mm_wr_locked);
-void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte);
+void pmd_install(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
+		 pgtable_t *pte);
 
 struct zap_details;
 void unmap_page_range(struct mmu_gather *tlb,
diff --git a/mm/memory.c b/mm/memory.c
index afd8a967fb953..fef1e425e4702 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -417,7 +417,8 @@ void free_pgtables(struct mmu_gather *tlb, struct ma_state *mas,
 	} while (vma);
 }
 
-void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
+void pmd_install(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
+		 pgtable_t *pte)
 {
 	spinlock_t *ptl = pmd_lock(mm, pmd);
 
@@ -443,13 +444,13 @@ void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
 	spin_unlock(ptl);
 }
 
-int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
+int __pte_alloc(struct mm_struct *mm, pmd_t *pmd, unsigned long addr)
 {
 	pgtable_t new = pte_alloc_one(mm);
 	if (!new)
 		return -ENOMEM;
 
-	pmd_install(mm, pmd, &new);
+	pmd_install(mm, pmd, addr, &new);
 	if (new)
 		pte_free(mm, new);
 	return 0;
@@ -2115,7 +2116,7 @@ static int insert_pages(struct vm_area_struct *vma, unsigned long addr,
 
 	/* Allocate the PTE if necessary; takes PMD lock once only. */
 	ret = -ENOMEM;
-	if (pte_alloc(mm, pmd))
+	if (pte_alloc(mm, pmd, addr))
 		goto out;
 
 	while (pages_to_write_in_pmd) {
@@ -4686,7 +4687,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	 * Use pte_alloc() instead of pte_alloc_map(), so that OOM can
 	 * be distinguished from a transient failure of pte_offset_map().
 	 */
-	if (pte_alloc(vma->vm_mm, vmf->pmd))
+	if (pte_alloc(vma->vm_mm, vmf->pmd, vmf->address))
 		return VM_FAULT_OOM;
 
 	/* Use the zero-page for reads */
@@ -5033,8 +5034,8 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 		}
 
 		if (vmf->prealloc_pte)
-			pmd_install(vma->vm_mm, vmf->pmd, &vmf->prealloc_pte);
-		else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd)))
+			pmd_install(vma->vm_mm, vmf->pmd, vmf->address, &vmf->prealloc_pte);
+		else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd, vmf->address)))
 			return VM_FAULT_OOM;
 	}
 
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 6d66dc1c6ffa0..e4d2e19e6611d 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -598,7 +598,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 		goto abort;
 	if (pmd_trans_huge(*pmdp) || pmd_devmap(*pmdp))
 		goto abort;
-	if (pte_alloc(mm, pmdp))
+	if (pte_alloc(mm, pmdp, addr))
 		goto abort;
 	if (unlikely(anon_vma_prepare(vma)))
 		goto abort;
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 37cf8d249405d..7b58db622f825 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -329,11 +329,11 @@ pgtable_populate_needed(struct vm_area_struct *vma, unsigned long cp_flags)
  * allocation failures during page faults by kicking OOM and returning
  * error.
  */
-#define change_pmd_prepare(vma, pmd, cp_flags)				\
+#define change_pmd_prepare(vma, pmd, addr, cp_flags)			\
 	({								\
 		long err = 0;						\
 		if (unlikely(pgtable_populate_needed(vma, cp_flags))) {	\
-			if (pte_alloc(vma->vm_mm, pmd))			\
+			if (pte_alloc(vma->vm_mm, pmd, addr))		\
 				err = -ENOMEM;				\
 		}							\
 		err;							\
@@ -374,7 +374,7 @@ static inline long change_pmd_range(struct mmu_gather *tlb,
 again:
 		next = pmd_addr_end(addr, end);
 
-		ret = change_pmd_prepare(vma, pmd, cp_flags);
+		ret = change_pmd_prepare(vma, pmd, addr, cp_flags);
 		if (ret) {
 			pages = ret;
 			break;
@@ -401,7 +401,7 @@ static inline long change_pmd_range(struct mmu_gather *tlb,
 				 * cleared; make sure pmd populated if
 				 * necessary, then fall-through to pte level.
 				 */
-				ret = change_pmd_prepare(vma, pmd, cp_flags);
+				ret = change_pmd_prepare(vma, pmd, addr, cp_flags);
 				if (ret) {
 					pages = ret;
 					break;
diff --git a/mm/mremap.c b/mm/mremap.c
index f672d0218a6fe..7723d11e77cd2 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -628,7 +628,7 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
 		}
 		if (pmd_none(*old_pmd))
 			continue;
-		if (pte_alloc(new_vma->vm_mm, new_pmd))
+		if (pte_alloc(new_vma->vm_mm, new_pmd, new_addr))
 			break;
 		if (move_ptes(vma, old_pmd, old_addr, old_addr + extent,
 			      new_vma, new_pmd, new_addr, need_rmap_locks) < 0)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index aa3c9cc51cc36..41d659bd2589c 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -796,7 +796,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 			break;
 		}
 		if (unlikely(pmd_none(dst_pmdval)) &&
-		    unlikely(__pte_alloc(dst_mm, dst_pmd))) {
+		    unlikely(__pte_alloc(dst_mm, dst_pmd, dst_addr))) {
 			err = -ENOMEM;
 			break;
 		}
@@ -1713,13 +1713,13 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start,
 			err = -ENOENT;
 			break;
 		}
-		if (unlikely(__pte_alloc(mm, src_pmd))) {
+		if (unlikely(__pte_alloc(mm, src_pmd, src_addr))) {
 			err = -ENOMEM;
 			break;
 		}
 	}
 
-	if (unlikely(pte_alloc(mm, dst_pmd))) {
+	if (unlikely(pte_alloc(mm, dst_pmd, dst_addr))) {
 		err = -ENOMEM;
 		break;
 	}
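
(Illustrative aside, not part of the posted series: the shape of the change at
a typical call site, extracted for clarity; compare the do_anonymous_page()
hunk above.)

	/* Before: the faulting address stopped at the caller. */
	if (pte_alloc(vma->vm_mm, vmf->pmd))
		return VM_FAULT_OOM;

	/*
	 * After: the address is threaded down to pmd_install(), so that
	 * the next patch can flush the TLB for this range before a new
	 * PTE page is installed.
	 */
	if (pte_alloc(vma->vm_mm, vmf->pmd, vmf->address))
		return VM_FAULT_OOM;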

From patchwork Mon Aug 5 12:55:08 2024
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13753586
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
	muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
	zokeefe@google.com, rientjes@google.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [RFC PATCH v2 4/7] mm: pgtable: try to reclaim empty PTE pages in zap_page_range_single()
Date: Mon, 5 Aug 2024 20:55:08 +0800
Message-Id: <9fb3dc75cb7f023750da2b4645fd098429deaad5.1722861064.git.zhengqi.arch@bytedance.com>

In order to pursue high performance, applications now mostly use
high-performance user-mode memory allocators, such as jemalloc or
tcmalloc. These memory allocators use madvise(MADV_DONTNEED or
MADV_FREE) to release physical memory, but neither MADV_DONTNEED nor
MADV_FREE releases page table memory, which may cause huge page table
memory usage.

The following is a memory usage snapshot of one process, which actually
happened on our server:

	VIRT:   55t
	RES:   590g
	VmPTE: 110g

In this case, most of the page table entries are empty. For such a PTE
page where all entries are empty, we can actually free it back to the
system for others to use.

As a first step, this commit attempts to synchronously free the empty
PTE pages in zap_page_range_single() (which MADV_DONTNEED etc. will
invoke). In order to reduce overhead, we only handle the cases with a
high probability of generating empty PTE pages; other cases are
filtered out, such as:

 - hugetlb vma (unsuitable)
 - userfaultfd_wp vma (may reinstall the pte entry)
 - writable private file mapping case (COW-ed anon page is not zapped)
 - etc.

For the userfaultfd_wp and private file mapping cases (and the
MADV_FREE case, of course), consider scanning and freeing empty PTE
pages asynchronously in the future.

The following code snippet can show the effect of the optimization:

	mmap 50G
	while (1) {
		for (; i < 1024 * 25; i++) {
			touch 2M memory
			madvise MADV_DONTNEED 2M
		}
	}

As we can see, the memory usage of VmPTE is reduced:

			before          after
	VIRT	      50.0 GB        50.0 GB
	RES	       3.1 MB         3.1 MB
	VmPTE	   102640 KB          240 KB
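
(Illustrative aside, not part of the posted series: a runnable approximation of
the test above. The 50G anonymous private mapping and the 2M touch +
MADV_DONTNEED loop come from the pseudocode; everything else is an assumption.
Watch VmPTE in /proc/<pid>/status while it runs.)

	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>

	#define MAP_SIZE	(50UL << 30)	/* 50G, as in the test above */
	#define CHUNK_SIZE	(2UL << 20)	/* 2M touch/MADV_DONTNEED steps */

	int main(void)
	{
		char *buf = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (buf == MAP_FAILED) {
			perror("mmap");
			return 1;
		}

		for (;;) {
			/* 1024 * 25 chunks of 2M cover the whole 50G mapping */
			for (unsigned long i = 0; i < 1024 * 25; i++) {
				char *chunk = buf + i * CHUNK_SIZE;

				memset(chunk, 1, CHUNK_SIZE);	/* touch 2M */
				madvise(chunk, CHUNK_SIZE, MADV_DONTNEED);
			}
		}
		return 0;
	}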

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 include/linux/pgtable.h |  14 +++++
 mm/Makefile             |   1 +
 mm/huge_memory.c        |   3 +
 mm/internal.h           |  14 +++++
 mm/khugepaged.c         |  30 +++++++--
 mm/memory.c             |   2 +
 mm/pt_reclaim.c         | 131 ++++++++++++++++++++++++++++++++++++++++
 7 files changed, 189 insertions(+), 6 deletions(-)
 create mode 100644 mm/pt_reclaim.c

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 2a6a3cccfc367..572343650eb0f 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -447,6 +447,20 @@ static inline void arch_check_zapped_pmd(struct vm_area_struct *vma,
 }
 #endif
 
+#ifndef arch_flush_tlb_before_set_huge_page
+static inline void arch_flush_tlb_before_set_huge_page(struct mm_struct *mm,
+						       unsigned long addr)
+{
+}
+#endif
+
+#ifndef arch_flush_tlb_before_set_pte_page
+static inline void arch_flush_tlb_before_set_pte_page(struct mm_struct *mm,
+						      unsigned long addr)
+{
+}
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address,
diff --git a/mm/Makefile b/mm/Makefile
index ab5ed56c5c033..8bec86469c1d5 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -145,3 +145,4 @@ obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
 obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
 obj-$(CONFIG_EXECMEM) += execmem.o
 obj-$(CONFIG_TMPFS_QUOTA) += shmem_quota.o
+obj-$(CONFIG_PT_RECLAIM) += pt_reclaim.o
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 697fcf89f975b..0afbb1e45cdac 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -999,6 +999,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 		folio_add_new_anon_rmap(folio, vma, haddr, RMAP_EXCLUSIVE);
 		folio_add_lru_vma(folio, vma);
 		pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
+		arch_flush_tlb_before_set_huge_page(vma->vm_mm, haddr);
 		set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
 		update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
 		add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
@@ -1066,6 +1067,7 @@ static void set_huge_zero_folio(pgtable_t pgtable, struct mm_struct *mm,
 	entry = mk_pmd(&zero_folio->page, vma->vm_page_prot);
 	entry = pmd_mkhuge(entry);
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
+	arch_flush_tlb_before_set_huge_page(mm, haddr);
 	set_pmd_at(mm, haddr, pmd, entry);
 	mm_inc_nr_ptes(mm);
 }
@@ -1173,6 +1175,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 		pgtable = NULL;
 	}
 
+	arch_flush_tlb_before_set_huge_page(mm, addr);
 	set_pmd_at(mm, addr, pmd, entry);
 	update_mmu_cache_pmd(vma, addr, pmd);
diff --git a/mm/internal.h b/mm/internal.h
index dfc992de01115..09bd1cee7a523 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1441,4 +1441,18 @@ static inline bool try_to_accept_memory(struct zone *zone, unsigned int order)
 }
 #endif /* CONFIG_UNACCEPTED_MEMORY */
 
+#ifdef CONFIG_PT_RECLAIM
+void try_to_reclaim_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			     unsigned long start_addr, unsigned long end_addr,
+			     struct zap_details *details);
+#else
+static inline void try_to_reclaim_pgtables(struct mmu_gather *tlb,
+					   struct vm_area_struct *vma,
+					   unsigned long start_addr,
+					   unsigned long end_addr,
+					   struct zap_details *details)
+{
+}
+#endif /* CONFIG_PT_RECLAIM */
+
 #endif	/* __MM_INTERNAL_H */
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 91b93259ee214..ffd3963b1c3d1 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1598,7 +1598,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	if (userfaultfd_armed(vma) && !(vma->vm_flags & VM_SHARED))
 		pml = pmd_lock(mm, pmd);
 
-	start_pte = pte_offset_map_nolock(mm, pmd, NULL, haddr, &ptl);
+	start_pte = pte_offset_map_nolock(mm, pmd, &pgt_pmd, haddr, &ptl);
 	if (!start_pte)		/* mmap_lock + page lock should prevent this */
 		goto abort;
 	if (!pml)
@@ -1606,6 +1606,11 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	else if (ptl != pml)
 		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
 
+	/* pmd entry may be changed by others */
+	if (unlikely(IS_ENABLED(CONFIG_PT_RECLAIM) && !pml &&
+		     !pmd_same(pgt_pmd, pmdp_get_lockless(pmd))))
+		goto abort;
+
 	/* step 2: clear page table and adjust rmap */
 	for (i = 0, addr = haddr, pte = start_pte;
 	     i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE, pte++) {
@@ -1651,6 +1656,11 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	/* step 4: remove empty page table */
 	if (!pml) {
 		pml = pmd_lock(mm, pmd);
+		if (unlikely(IS_ENABLED(CONFIG_PT_RECLAIM) &&
+			     !pmd_same(pgt_pmd, pmdp_get_lockless(pmd)))) {
+			spin_unlock(pml);
+			goto pmd_change;
+		}
 		if (ptl != pml)
 			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
 	}
@@ -1682,6 +1692,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 		pte_unmap_unlock(start_pte, ptl);
 	if (pml && pml != ptl)
 		spin_unlock(pml);
+pmd_change:
 	if (notified)
 		mmu_notifier_invalidate_range_end(&range);
 drop_folio:
@@ -1703,6 +1714,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		spinlock_t *pml;
 		spinlock_t *ptl;
 		bool skipped_uffd = false;
+		pte_t *pte;
 
 		/*
 		 * Check vma->anon_vma to exclude MAP_PRIVATE mappings that
@@ -1738,11 +1750,17 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 					addr, addr + HPAGE_PMD_SIZE);
 		mmu_notifier_invalidate_range_start(&range);
 
+		pte = pte_offset_map_nolock(mm, pmd, &pgt_pmd, addr, &ptl);
+		if (!pte)
+			goto skip;
+
 		pml = pmd_lock(mm, pmd);
-		ptl = pte_lockptr(mm, pmd);
 		if (ptl != pml)
 			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
 
+		if (unlikely(IS_ENABLED(CONFIG_PT_RECLAIM) &&
+			     !pmd_same(pgt_pmd, pmdp_get_lockless(pmd))))
+			goto unlock_skip;
 		/*
 		 * Huge page lock is still held, so normally the page table
 		 * must remain empty; and we have already skipped anon_vma
@@ -1758,11 +1776,11 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 			pgt_pmd = pmdp_collapse_flush(vma, addr, pmd);
 			pmdp_get_lockless_sync();
 		}
-
+unlock_skip:
+		pte_unmap_unlock(pte, ptl);
 		if (ptl != pml)
-			spin_unlock(ptl);
-		spin_unlock(pml);
-
+			spin_unlock(pml);
+skip:
 		mmu_notifier_invalidate_range_end(&range);
 
 		if (!skipped_uffd) {
diff --git a/mm/memory.c b/mm/memory.c
index fef1e425e4702..a8108451e4dac 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -423,6 +423,7 @@ void pmd_install(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
 	spinlock_t *ptl = pmd_lock(mm, pmd);
 
 	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
+		arch_flush_tlb_before_set_pte_page(mm, addr);
 		mm_inc_nr_ptes(mm);
 		/*
 		 * Ensure all pte setup (eg. pte page lock and page clearing) are
@@ -1931,6 +1932,7 @@ void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
 	 * could have been expanded for hugetlb pmd sharing.
 	 */
 	unmap_single_vma(&tlb, vma, address, end, details, false);
+	try_to_reclaim_pgtables(&tlb, vma, address, end, details);
 	mmu_notifier_invalidate_range_end(&range);
 	tlb_finish_mmu(&tlb);
 	hugetlb_zap_end(vma, details);
diff --git a/mm/pt_reclaim.c b/mm/pt_reclaim.c
new file mode 100644
index 0000000000000..e375e7f2059f8
--- /dev/null
+++ b/mm/pt_reclaim.c
@@ -0,0 +1,131 @@
+// SPDX-License-Identifier: GPL-2.0
+#include
+#include
+#include
+#include
+
+#include "internal.h"
+
+/*
+ * Locking:
+ *  - already held the mmap read lock to traverse the pgtable
+ *  - use pmd lock for clearing pmd entry
+ *  - use pte lock for checking empty PTE page, and release it after clearing
+ *    pmd entry, then we can capture the changed pmd in pte_offset_map_lock()
+ *    etc after holding this pte lock. Thanks to this, we don't need to hold
+ *    the rmap-related locks.
+ *  - users of pte_offset_map_lock() etc all expect the PTE page to be stable
+ *    by using rcu lock, so PTE pages should be freed by RCU.
+ */
+static int reclaim_pgtables_pmd_entry(pmd_t *pmd, unsigned long addr,
+				      unsigned long next, struct mm_walk *walk)
+{
+	struct mm_struct *mm = walk->mm;
+	struct mmu_gather *tlb = walk->private;
+	pte_t *start_pte, *pte;
+	pmd_t pmdval;
+	spinlock_t *pml = NULL, *ptl;
+	int i;
+
+	start_pte = pte_offset_map_nolock(mm, pmd, &pmdval, addr, &ptl);
+	if (!start_pte)
+		return 0;
+
+	pml = pmd_lock(mm, pmd);
+	if (ptl != pml)
+		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
+
+	if (unlikely(!pmd_same(pmdval, pmdp_get_lockless(pmd))))
+		goto out_ptl;
+
+	/* Check if it is empty PTE page */
+	for (i = 0, pte = start_pte; i < PTRS_PER_PTE; i++, pte++) {
+		if (!pte_none(ptep_get(pte)))
+			goto out_ptl;
+	}
+	pte_unmap(start_pte);
+
+	pmd_clear(pmd);
+	if (ptl != pml)
+		spin_unlock(ptl);
+	spin_unlock(pml);
+
+	/*
+	 * NOTE:
+	 * In order to reuse mmu_gather to batch flush tlb and free PTE pages,
+	 * here tlb is not flushed before pmd lock is unlocked. This may
+	 * result in the following two situations:
+	 *
+	 * 1) Userland can trigger page fault and fill a huge page, which will
+	 *    cause the existence of small size TLB and huge TLB for the same
+	 *    address.
+	 *
+	 * 2) Userland can also trigger page fault and fill a PTE page, which
+	 *    will cause the existence of two small size TLBs, but the PTE
+	 *    page they map are different.
+	 *
+	 * Some CPUs do not allow these, to solve this, we can define
+	 * arch_flush_tlb_before_set_{huge|pte}_page to detect this case and
+	 * flush TLB before filling a huge page or a PTE page in page fault
+	 * path.
+	 */
+	pte_free_tlb(tlb, pmd_pgtable(pmdval), addr);
+	mm_dec_nr_ptes(mm);
+
+	return 0;
+
+out_ptl:
+	pte_unmap_unlock(start_pte, ptl);
+	if (pml != ptl)
+		spin_unlock(pml);
+
+	return 0;
+}
+
+static const struct mm_walk_ops reclaim_pgtables_walk_ops = {
+	.pmd_entry = reclaim_pgtables_pmd_entry,
+	.walk_lock = PGWALK_RDLOCK,
+};
+
+void try_to_reclaim_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			     unsigned long start_addr, unsigned long end_addr,
+			     struct zap_details *details)
+{
+	unsigned long start = max(vma->vm_start, start_addr);
+	unsigned long end;
+
+	if (start >= vma->vm_end)
+		return;
+	end = min(vma->vm_end, end_addr);
+	if (end <= vma->vm_start)
+		return;
+
+	/* Skip hugetlb case */
+	if (is_vm_hugetlb_page(vma))
+		return;
+
+	/* Leave this to the THP path to handle */
+	if (vma->vm_flags & VM_HUGEPAGE)
+		return;
+
+	/* userfaultfd_wp case may reinstall the pte entry, also skip */
+	if (userfaultfd_wp(vma))
+		return;
+
+	/*
+	 * For private file mapping, the COW-ed page is an anon page, and it
+	 * will not be zapped. For simplicity, skip the all writable private
+	 * file mapping cases.
+	 */
+	if (details && !vma_is_anonymous(vma) &&
+	    !(vma->vm_flags & VM_MAYSHARE) &&
+	    (vma->vm_flags & VM_WRITE))
+		return;
+
+	start = ALIGN(start, PMD_SIZE);
+	end = ALIGN_DOWN(end, PMD_SIZE);
+	if (end - start < PMD_SIZE)
+		return;
+
+	walk_page_range_vma(vma, start, end, &reclaim_pgtables_walk_ops, tlb);
+}
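
The test loop in the commit message above, as a minimal runnable userspace
sketch. This is our illustration, not part of the series: the 2M step, the
25 * 1024 iteration count and the MAP_NORESERVE flag are assumptions chosen
to match the informal pseudocode. Watch VmPTE in /proc/<pid>/status while it
runs, with and without CONFIG_PT_RECLAIM:

#include <string.h>
#include <sys/mman.h>

#define STEP	(2UL << 20)		/* touch 2M per iteration */
#define NSTEPS	(1024UL * 25)		/* 25 * 1024 * 2M = 50G */

int main(void)
{
	/* mmap 50G */
	char *base = mmap(NULL, STEP * NSTEPS, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

	if (base == MAP_FAILED)
		return 1;

	while (1) {
		for (unsigned long i = 0; i < NSTEPS; i++) {
			char *p = base + i * STEP;

			memset(p, 1, STEP);		 /* touch 2M memory */
			madvise(p, STEP, MADV_DONTNEED); /* zap it again */
		}
	}
}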
From patchwork Mon Aug 5 12:55:09 2024
From: Qi Zheng
Subject: [RFC PATCH v2 5/7] x86: mm: free page table pages by RCU instead of semi RCU
Date: Mon, 5 Aug 2024 20:55:09 +0800
Message-Id: <9a3deedc55947030db20a5ef8aca7b2741df2d9d.1722861064.git.zhengqi.arch@bytedance.com>

Now, if CONFIG_MMU_GATHER_RCU_TABLE_FREE is selected, the page table pages
will be freed by semi RCU, that is:

 - batch table freeing: asynchronous free by RCU
 - single table freeing: IPI + synchronous free

In this way, the page table can be traversed locklessly by disabling IRQs
in paths such as fast GUP. But this is not enough to free the empty PTE
page table pages in paths other than the munmap and exit_mmap paths,
because an IPI cannot be synchronized with rcu_read_lock() in
pte_offset_map{_lock}() (a reader-side sketch follows the patch below).

In preparation for supporting empty PTE page table page reclamation, let
single table freeing also use RCU like batch table freeing. Then we can
also use pte_offset_map() etc. to prevent a PTE page from being freed.

Like pte_free_defer(), we can also safely use ptdesc->pt_rcu_head to free
the page table pages:

 - The pt_rcu_head is unioned with pt_list and pmd_huge_pte.

 - For pt_list, it is used to manage the PGD page in x86. Fortunately,
   tlb_remove_table() will not be used to free PGD pages, so it is safe
   to use pt_rcu_head.

 - For pmd_huge_pte, we will do zap_deposited_table() before freeing the
   PMD page, so it is also safe.
Signed-off-by: Qi Zheng
---
 arch/x86/include/asm/tlb.h | 19 +++++++++++++++++++
 arch/x86/kernel/paravirt.c |  7 +++++++
 arch/x86/mm/pgtable.c      | 10 +++++++++-
 mm/mmu_gather.c            |  9 ++++++++-
 4 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 580636cdc257b..e223b53a8b190 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -34,4 +34,23 @@ static inline void __tlb_remove_table(void *table)
 	free_page_and_swap_cache(table);
 }
 
+#ifdef CONFIG_PT_RECLAIM
+static inline void __tlb_remove_table_one_rcu(struct rcu_head *head)
+{
+	struct page *page;
+
+	page = container_of(head, struct page, rcu_head);
+	free_page_and_swap_cache(page);
+}
+
+static inline void __tlb_remove_table_one(void *table)
+{
+	struct page *page;
+
+	page = table;
+	call_rcu(&page->rcu_head, __tlb_remove_table_one_rcu);
+}
+#define __tlb_remove_table_one __tlb_remove_table_one
+#endif /* CONFIG_PT_RECLAIM */
+
 #endif /* _ASM_X86_TLB_H */
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 5358d43886adc..199b9a3813b4a 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -60,10 +60,17 @@ void __init native_pv_lock_init(void)
 		static_branch_disable(&virt_spin_lock_key);
 }
 
+#ifndef CONFIG_PT_RECLAIM
 static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
 {
 	tlb_remove_page(tlb, table);
 }
+#else
+static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
+{
+	tlb_remove_table(tlb, table);
+}
+#endif
 
 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index f5931499c2d6b..ea8522289c93d 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -19,12 +19,20 @@ EXPORT_SYMBOL(physical_mask);
 #endif
 
 #ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PT_RECLAIM
 static inline
 void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
 {
 	tlb_remove_page(tlb, table);
 }
-#endif
+#else
+static inline
+void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
+{
+	tlb_remove_table(tlb, table);
+}
+#endif /* !CONFIG_PT_RECLAIM */
+#endif /* !CONFIG_PARAVIRT */
 
 gfp_t __userpte_alloc_gfp = GFP_PGTABLE_USER | PGTABLE_HIGHMEM;
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 99b3e9408aa0f..d948479ca09e6 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -311,10 +311,17 @@ static inline void tlb_table_invalidate(struct mmu_gather *tlb)
 	}
 }
 
+#ifndef __tlb_remove_table_one
+static inline void __tlb_remove_table_one(void *table)
+{
+	__tlb_remove_table(table);
+}
+#endif
+
 static void tlb_remove_table_one(void *table)
 {
 	tlb_remove_table_sync_one();
-	__tlb_remove_table(table);
+	__tlb_remove_table_one(table);
 }
 
 static void tlb_table_flush(struct mmu_gather *tlb)
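
To make the commit message's point concrete, here is a kernel-style
reader-side sketch (our illustration, not code from the series; the
function name peek_pte() is hypothetical). A walker that maps a PTE page
via pte_offset_map() is protected only by an RCU read-side critical
section, which an IPI broadcast does not wait for; that is why single
table freeing must go through call_rcu() as above:

#include <linux/mm.h>
#include <linux/pgtable.h>

/* Read one PTE without taking the page table lock. */
static pte_t peek_pte(pmd_t *pmd, unsigned long addr)
{
	pte_t entry = __pte(0);
	pte_t *ptep;

	/*
	 * pte_offset_map() enters an RCU read-side critical section,
	 * so the PTE page must stay valid until the matching
	 * pte_unmap(). No IRQ disabling is involved, so only an
	 * RCU-deferred free of the table keeps this walk safe.
	 */
	ptep = pte_offset_map(pmd, addr);
	if (ptep) {
		entry = ptep_get(ptep);
		pte_unmap(ptep);
	}
	return entry;
}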
From patchwork Mon Aug 5 12:55:10 2024
From: Qi Zheng
Subject: [RFC PATCH v2 6/7] x86: mm: define arch_flush_tlb_before_set_huge_page
Date: Mon, 5 Aug 2024 20:55:10 +0800
Message-Id: <1c8bee0c868c1e67ea02a6fa49225b00503b5436.1722861064.git.zhengqi.arch@bytedance.com>

When we use mmu_gather to batch the TLB flush and free PTE pages, the TLB
is not flushed before the pmd lock is unlocked. This may result in the
following two situations:

1) Userland can trigger a page fault and fill a huge page, which will
   cause a small size TLB entry and a huge TLB entry to exist for the
   same address.
2) Userland can also trigger a page fault and fill a PTE page, which will
   cause two small size TLB entries to exist, but the PTE pages they map
   are different.

According to Intel's TLB Application note (317080), some x86 CPUs do not
allow case 1), so define arch_flush_tlb_before_set_huge_page to detect
and fix this issue (a condensed sketch of the zap side that this check
races against follows the patch below).

Signed-off-by: Qi Zheng
---
 arch/x86/include/asm/pgtable.h |  6 ++++++
 arch/x86/mm/pgtable.c          | 13 +++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index e39311a89bf47..f93d964ab6a3e 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1668,6 +1668,12 @@ void arch_check_zapped_pte(struct vm_area_struct *vma, pte_t pte);
 #define arch_check_zapped_pmd arch_check_zapped_pmd
 void arch_check_zapped_pmd(struct vm_area_struct *vma, pmd_t pmd);
 
+#ifdef CONFIG_PT_RECLAIM
+#define arch_flush_tlb_before_set_huge_page arch_flush_tlb_before_set_huge_page
+void arch_flush_tlb_before_set_huge_page(struct mm_struct *mm,
+					 unsigned long addr);
+#endif
+
 #ifdef CONFIG_XEN_PV
 #define arch_has_hw_nonleaf_pmd_young arch_has_hw_nonleaf_pmd_young
 static inline bool arch_has_hw_nonleaf_pmd_young(void)
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index ea8522289c93d..7e14cae819edd 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -934,3 +934,16 @@ void arch_check_zapped_pmd(struct vm_area_struct *vma, pmd_t pmd)
 	VM_WARN_ON_ONCE(!(vma->vm_flags & VM_SHADOW_STACK) &&
 			pmd_shstk(pmd));
 }
+
+#ifdef CONFIG_PT_RECLAIM
+void arch_flush_tlb_before_set_huge_page(struct mm_struct *mm,
+					 unsigned long addr)
+{
+	if (atomic_read(&mm->tlb_flush_pending)) {
+		unsigned long start = ALIGN_DOWN(addr, PMD_SIZE);
+		unsigned long end = start + PMD_SIZE;
+
+		flush_tlb_mm_range(mm, start, end, PAGE_SHIFT, false);
+	}
+}
+#endif /* CONFIG_PT_RECLAIM */
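
As a note on why the atomic_read(&mm->tlb_flush_pending) check above can
observe the zap in progress: the mmu_gather API brackets the deferred
flush with inc_tlb_flush_pending()/dec_tlb_flush_pending(). The following
condensed sketch of the zap side is our illustration, not code from the
series, and the function name is hypothetical:

#include <linux/mm.h>
#include <asm/tlb.h>

static void zap_side_sketch(struct mm_struct *mm, unsigned long start,
			    unsigned long end)
{
	struct mmu_gather tlb;

	tlb_gather_mmu(&tlb, mm);	/* calls inc_tlb_flush_pending(mm) */

	/*
	 * ... clear pmd entries and queue the freed PTE pages on &tlb;
	 * the TLB flush is deferred, so a racing huge page fault sees
	 * mm->tlb_flush_pending != 0 and flushes the 2M range first ...
	 */

	tlb_finish_mmu(&tlb);		/* flush TLB and free pages, then
					 * dec_tlb_flush_pending(mm) */
}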
From patchwork Mon Aug 5 12:55:11 2024
From: Qi Zheng
Subject: [RFC PATCH v2 7/7] x86: select ARCH_SUPPORTS_PT_RECLAIM if X86_64
Date: Mon, 5 Aug 2024 20:55:11 +0800
Message-Id: <0949a17afe11cf108b8a09a538c00c93cdc5507e.1722861064.git.zhengqi.arch@bytedance.com>

Now, x86 fully supports the CONFIG_PT_RECLAIM feature, and reclaiming PTE
pages is profitable only on 64-bit systems, so select
ARCH_SUPPORTS_PT_RECLAIM if X86_64.

Signed-off-by: Qi Zheng
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2f611ffd0c9a4..fff6a7e6ea1de 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -316,6 +316,7 @@ config X86
 	select FUNCTION_ALIGNMENT_4B
 	imply IMA_SECURE_AND_OR_TRUSTED_BOOT	if EFI
 	select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
+	select ARCH_SUPPORTS_PT_RECLAIM		if X86_64
 
 config INSTRUCTION_DECODER
 	def_bool y