From patchwork Mon Jul  1 08:46:42 2024
X-Patchwork-Submitter: Qi Zheng <zhengqi.arch@bytedance.com>
X-Patchwork-Id: 13717667
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
    muchun.song@linux.dev, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [RFC PATCH 1/7] mm: pgtable: make pte_offset_map_nolock() return pmdval
Date: Mon,  1 Jul 2024 16:46:42 +0800
Message-Id: <7f5233f9f612c7f58abf218852fb1042d764940b.1719570849.git.zhengqi.arch@bytedance.com>
Make pte_offset_map_nolock() return pmdval so that we can recheck the
*pmd once the lock is taken. This is a preparation for freeing empty
PTE pages; no functional changes are expected.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 Documentation/mm/split_page_table_lock.rst |  3 ++-
 arch/arm/mm/fault-armv.c                   |  2 +-
 arch/powerpc/mm/pgtable.c                  |  2 +-
 include/linux/mm.h                         |  4 ++--
 mm/filemap.c                               |  2 +-
 mm/khugepaged.c                            |  4 ++--
 mm/memory.c                                |  4 ++--
 mm/mremap.c                                |  2 +-
 mm/page_vma_mapped.c                       |  2 +-
 mm/pgtable-generic.c                       | 21 ++++++++++++---------
 mm/userfaultfd.c                           |  4 ++--
 mm/vmscan.c                                |  2 +-
 12 files changed, 28 insertions(+), 24 deletions(-)

diff --git a/Documentation/mm/split_page_table_lock.rst b/Documentation/mm/split_page_table_lock.rst
index e4f6972eb6c0..e6a47d57531c 100644
--- a/Documentation/mm/split_page_table_lock.rst
+++ b/Documentation/mm/split_page_table_lock.rst
@@ -18,7 +18,8 @@ There are helpers to lock/unlock a table and other accessor functions:
 	pointer to its PTE table lock, or returns NULL if no PTE table;
  - pte_offset_map_nolock()
 	maps PTE, returns pointer to PTE with pointer to its PTE table
-	lock (not taken), or returns NULL if no PTE table;
+	lock (not taken) and the value of its pmd entry, or returns NULL
+	if no PTE table;
  - pte_offset_map()
 	maps PTE, returns pointer to PTE, or returns NULL if no PTE table;
  - pte_unmap()
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index 2286c2ea60ec..3e4ed99b9330 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -117,7 +117,7 @@ static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
 	 * must use the nested version.  This also means we need to
 	 * open-code the spin-locking.
	 */
-	pte = pte_offset_map_nolock(vma->vm_mm, pmd, address, &ptl);
+	pte = pte_offset_map_nolock(vma->vm_mm, pmd, NULL, address, &ptl);
 	if (!pte)
 		return 0;
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 9e7ba9c3851f..ab0250f1b226 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -350,7 +350,7 @@ void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
 	 */
 	if (pmd_none(*pmd))
 		return;
-	pte = pte_offset_map_nolock(mm, pmd, addr, &ptl);
+	pte = pte_offset_map_nolock(mm, pmd, NULL, addr, &ptl);
 	BUG_ON(!pte);
 	assert_spin_locked(ptl);
 	pte_unmap(pte);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7d044e737dba..396bdc3b3726 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2979,8 +2979,8 @@ static inline pte_t *pte_offset_map_lock(struct mm_struct *mm, pmd_t *pmd,
 	return pte;
 }
 
-pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd,
-			     unsigned long addr, spinlock_t **ptlp);
+pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdvalp,
+			     unsigned long addr, spinlock_t **ptlp);
 
 #define pte_unmap_unlock(pte, ptl)	do {		\
 	spin_unlock(ptl);				\
diff --git a/mm/filemap.c b/mm/filemap.c
index 6835977ee99a..35bbba960447 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3231,7 +3231,7 @@ static vm_fault_t filemap_fault_recheck_pte_none(struct vm_fault *vmf)
 	if (!(vmf->flags & FAULT_FLAG_ORIG_PTE_VALID))
 		return 0;
 
-	ptep = pte_offset_map_nolock(vma->vm_mm, vmf->pmd, vmf->address,
+	ptep = pte_offset_map_nolock(vma->vm_mm, vmf->pmd, NULL, vmf->address,
 				     &vmf->ptl);
 	if (unlikely(!ptep))
 		return VM_FAULT_NOPAGE;
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 2e017585f813..7b7c858d5f99 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -989,7 +989,7 @@ static int __collapse_huge_page_swapin(struct mm_struct *mm,
 		};
 
 		if (!pte++) {
-			pte = pte_offset_map_nolock(mm, pmd, address, &ptl);
+			pte = pte_offset_map_nolock(mm, pmd, NULL, address, &ptl);
 			if (!pte) {
 				mmap_read_unlock(mm);
 				result = SCAN_PMD_NULL;
@@ -1578,7 +1578,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	if (userfaultfd_armed(vma) && !(vma->vm_flags & VM_SHARED))
 		pml = pmd_lock(mm, pmd);
 
-	start_pte = pte_offset_map_nolock(mm, pmd, haddr, &ptl);
+	start_pte = pte_offset_map_nolock(mm, pmd, NULL, haddr, &ptl);
 	if (!start_pte)		/* mmap_lock + page lock should prevent this */
 		goto abort;
 	if (!pml)
diff --git a/mm/memory.c b/mm/memory.c
index 0a769f34bbb2..1c9068b0b067 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1108,7 +1108,7 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
 		ret = -ENOMEM;
 		goto out;
 	}
-	src_pte = pte_offset_map_nolock(src_mm, src_pmd, addr, &src_ptl);
+	src_pte = pte_offset_map_nolock(src_mm, src_pmd, NULL, addr, &src_ptl);
 	if (!src_pte) {
 		pte_unmap_unlock(dst_pte, dst_ptl);
 		/* ret == 0 */
@@ -5507,7 +5507,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 	 * it into a huge pmd: just retry later if so.
	 */
 	vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, vmf->pmd,
-					 vmf->address, &vmf->ptl);
+					 NULL, vmf->address, &vmf->ptl);
 	if (unlikely(!vmf->pte))
 		return 0;
 	vmf->orig_pte = ptep_get_lockless(vmf->pte);
diff --git a/mm/mremap.c b/mm/mremap.c
index e7ae140fc640..f672d0218a6f 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -175,7 +175,7 @@ static int move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
 		err = -EAGAIN;
 		goto out;
 	}
-	new_pte = pte_offset_map_nolock(mm, new_pmd, new_addr, &new_ptl);
+	new_pte = pte_offset_map_nolock(mm, new_pmd, NULL, new_addr, &new_ptl);
 	if (!new_pte) {
 		pte_unmap_unlock(old_pte, old_ptl);
 		err = -EAGAIN;
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index ae5cc42aa208..507701b7bcc1 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -33,7 +33,7 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, spinlock_t **ptlp)
 	 * Though, in most cases, page lock already protects this.
 	 */
 	pvmw->pte = pte_offset_map_nolock(pvmw->vma->vm_mm, pvmw->pmd,
-					  pvmw->address, ptlp);
+					  NULL, pvmw->address, ptlp);
 	if (!pvmw->pte)
 		return false;
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index a78a4adf711a..443e3b34434a 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -305,7 +305,7 @@ pte_t *__pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp)
 	return NULL;
 }
 
-pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd,
+pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdvalp,
 			     unsigned long addr, spinlock_t **ptlp)
 {
 	pmd_t pmdval;
@@ -314,6 +314,8 @@ pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd,
 	pte = __pte_offset_map(pmd, addr, &pmdval);
 	if (likely(pte))
 		*ptlp = pte_lockptr(mm, &pmdval);
+	if (pmdvalp)
+		*pmdvalp = pmdval;
 	return pte;
 }
 
@@ -347,14 +349,15 @@ pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd,
  * and disconnected table.  Until pte_unmap(pte) unmaps and rcu_read_unlock()s
  * afterwards.
  *
- * pte_offset_map_nolock(mm, pmd, addr, ptlp), above, is like pte_offset_map();
- * but when successful, it also outputs a pointer to the spinlock in ptlp - as
- * pte_offset_map_lock() does, but in this case without locking it.  This helps
- * the caller to avoid a later pte_lockptr(mm, *pmd), which might by that time
- * act on a changed *pmd: pte_offset_map_nolock() provides the correct spinlock
- * pointer for the page table that it returns.  In principle, the caller should
- * recheck *pmd once the lock is taken; in practice, no callsite needs that -
- * either the mmap_lock for write, or pte_same() check on contents, is enough.
+ * pte_offset_map_nolock(mm, pmd, pmdvalp, addr, ptlp), above, is like
+ * pte_offset_map(); but when successful, it also outputs a pointer to the
+ * spinlock in ptlp - as pte_offset_map_lock() does, but in this case without
+ * locking it.  This helps the caller to avoid a later pte_lockptr(mm, *pmd),
+ * which might by that time act on a changed *pmd: pte_offset_map_nolock()
+ * provides the correct spinlock pointer for the page table that it returns.
+ * In principle, the caller should recheck *pmd once the lock is taken; but in
+ * most cases, either the mmap_lock for write, or a pte_same() check on
+ * contents, is enough.
  *
  * Note that free_pgtables(), used after unmapping detached vmas, or when
  * exiting the whole mm, does not take page table lock before freeing a page
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 8dedaec00486..61c1d228d239 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1143,7 +1143,7 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 				src_addr, src_addr + PAGE_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 retry:
-	dst_pte = pte_offset_map_nolock(mm, dst_pmd, dst_addr, &dst_ptl);
+	dst_pte = pte_offset_map_nolock(mm, dst_pmd, NULL, dst_addr, &dst_ptl);
 
 	/* Retry if a huge pmd materialized from under us */
 	if (unlikely(!dst_pte)) {
@@ -1151,7 +1151,7 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 		goto out;
 	}
 
-	src_pte = pte_offset_map_nolock(mm, src_pmd, src_addr, &src_ptl);
+	src_pte = pte_offset_map_nolock(mm, src_pmd, NULL, src_addr, &src_ptl);
 
 	/*
 	 * We held the mmap_lock for reading so MADV_DONTNEED
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3d4c681c6d40..c9a4cd31e6b4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3373,7 +3373,7 @@ static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end,
 	DEFINE_MAX_SEQ(walk->lruvec);
 	int old_gen, new_gen = lru_gen_from_seq(max_seq);
 
-	pte = pte_offset_map_nolock(args->mm, pmd, start & PMD_MASK, &ptl);
+	pte = pte_offset_map_nolock(args->mm, pmd, NULL, start & PMD_MASK, &ptl);
 	if (!pte)
 		return false;
 	if (!spin_trylock(ptl)) {
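With the new calling convention, a caller that cares about a concurrently
changed pmd entry records the pmd value at map time and rechecks it with
pmd_same() once the lock is taken. A minimal sketch of that pattern (the
function below is illustrative only, not part of this patch; patch 4 of this
series uses the same idiom in reclaim_pgtables_pmd_entry()):

	static void example_pte_walk(struct mm_struct *mm, pmd_t *pmd,
				     unsigned long addr)
	{
		spinlock_t *ptl;
		pmd_t pmdval;
		pte_t *pte;

		/* maps the PTE table, reporting its lock and pmd value */
		pte = pte_offset_map_nolock(mm, pmd, &pmdval, addr, &ptl);
		if (!pte)
			return;

		spin_lock(ptl);
		/* the PTE page may have been freed and the pmd entry
		 * repopulated while the lock was not held, so recheck it */
		if (unlikely(!pmd_same(pmdval, pmdp_get_lockless(pmd)))) {
			pte_unmap_unlock(pte, ptl);
			return;
		}
		/* ... safely operate on the PTE table under ptl ... */
		pte_unmap_unlock(pte, ptl);
	}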
From patchwork Mon Jul  1 08:46:43 2024
X-Patchwork-Submitter: Qi Zheng <zhengqi.arch@bytedance.com>
X-Patchwork-Id: 13717668
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
    muchun.song@linux.dev, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [RFC PATCH 2/7] mm: introduce CONFIG_PT_RECLAIM
Date: Mon,  1 Jul 2024 16:46:43 +0800
Message-Id: <58942ecf91fea0a62307e5ab848228142a1270ac.1719570849.git.zhengqi.arch@bytedance.com>
This configuration variable will be used to build the code needed to
free empty user page table pages. The feature is not yet available on
all architectures, so ARCH_SUPPORTS_PT_RECLAIM is needed as a gate; it
can be removed once all architectures support this feature.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 mm/Kconfig | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index 991fa9cf6137..7e2c87784d86 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1256,6 +1256,20 @@ config IOMMU_MM_DATA
 config EXECMEM
 	bool
 
+config ARCH_SUPPORTS_PT_RECLAIM
+	def_bool n
+
+config PT_RECLAIM
+	bool "reclaim empty user page table pages"
+	default y
+	depends on ARCH_SUPPORTS_PT_RECLAIM && MMU && SMP
+	select MMU_GATHER_RCU_TABLE_FREE
+	help
+	  Try to reclaim empty user page table pages in paths other than the
+	  munmap and exit_mmap paths.
+
+	  Note: currently only empty user PTE page table pages are reclaimed.
+ source "mm/damon/Kconfig" endmenu From patchwork Mon Jul 1 08:46:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13717669 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id EBBBFC30653 for ; Mon, 1 Jul 2024 08:48:20 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 85A696B00A7; Mon, 1 Jul 2024 04:48:20 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 80A4E6B00A8; Mon, 1 Jul 2024 04:48:20 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6AA6F6B00A9; Mon, 1 Jul 2024 04:48:20 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 4DDED6B00A7 for ; Mon, 1 Jul 2024 04:48:20 -0400 (EDT) Received: from smtpin01.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 036281C31E6 for ; Mon, 1 Jul 2024 08:48:19 +0000 (UTC) X-FDA: 82290557160.01.5C9A0F3 Received: from mail-oa1-f43.google.com (mail-oa1-f43.google.com [209.85.160.43]) by imf06.hostedemail.com (Postfix) with ESMTP id 3E66618000A for ; Mon, 1 Jul 2024 08:48:18 +0000 (UTC) Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=UZC6kGdr; spf=none (imf06.hostedemail.com: domain of zhengqi.arch@bytedance.com has no SPF policy when checking 209.85.160.43) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1719823687; a=rsa-sha256; cv=none; b=UFdDmugaUNzOMU8fsuCDuY1yJgEprWqgrRVvu00BvLS3YS2sycU5fOpsGU821NJOWOKtpz UQTp/nBm0Q8z2V2nmm6bhiTU8dEh9O4LUsAIjBqJ23/oUpi9e9NXN3BOJ7PfQcPkYIjRv2 MK+p6El0rOK6JzBwIO+yY7ZvoM/fFYU= ARC-Authentication-Results: i=1; imf06.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=UZC6kGdr; spf=none (imf06.hostedemail.com: domain of zhengqi.arch@bytedance.com has no SPF policy when checking 209.85.160.43) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1719823687; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=heRFBrzmiSFbf80Ldx88bwYUpXRT+UgzgGRtnFqgbfo=; b=Ee2/7b/p1wEF/TxB2XptbclshaL2CybrRCRBjgmT5YHnBTeo0iTjNpKHXLJPQQA5KVSBiI I4EGbtmKzb1CQPPiJTVD07jYc7yCbNu2ORVQet2hvmVplyZ0gVVbBSTmQQ1D8g3CkoLzeW FpHFlzF690aeV1LbmdWY34bv8ZF6UDM= Received: by mail-oa1-f43.google.com with SMTP id 586e51a60fabf-25989f87e20so416515fac.1 for ; Mon, 01 Jul 2024 01:48:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1719823697; x=1720428497; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=heRFBrzmiSFbf80Ldx88bwYUpXRT+UgzgGRtnFqgbfo=; b=UZC6kGdrfKhJF2f5fS5EWHDILOeRqPYONGMTUN+8GXvNBWnbcBsFakux/Z/wss3XzQ 
From patchwork Mon Jul  1 08:46:44 2024
X-Patchwork-Submitter: Qi Zheng <zhengqi.arch@bytedance.com>
X-Patchwork-Id: 13717669
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
    muchun.song@linux.dev, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [RFC PATCH 3/7] mm: pass address information to pmd_install()
Date: Mon,  1 Jul 2024 16:46:44 +0800
The subsequent implementation of freeing empty page table pages needs
the address information to flush the TLB, so pass the address to
pmd_install() in advance. No functional changes.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 include/linux/hugetlb.h |  2 +-
 include/linux/mm.h      |  9 +++++----
 mm/debug_vm_pgtable.c   |  2 +-
 mm/filemap.c            |  2 +-
 mm/gup.c                |  2 +-
 mm/internal.h           |  3 ++-
 mm/memory.c             | 15 ++++++++-------
 mm/migrate_device.c     |  2 +-
 mm/mprotect.c           |  8 ++++----
 mm/mremap.c             |  2 +-
 mm/userfaultfd.c        |  6 +++---
 11 files changed, 28 insertions(+), 25 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index a951c0d06061..55715eb5cb34 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -198,7 +198,7 @@ static inline pte_t *pte_offset_huge(pmd_t *pmd, unsigned long address)
 static inline pte_t *pte_alloc_huge(struct mm_struct *mm, pmd_t *pmd,
 				    unsigned long address)
 {
-	return pte_alloc(mm, pmd) ? NULL : pte_offset_huge(pmd, address);
+	return pte_alloc(mm, pmd, address) ? NULL : pte_offset_huge(pmd, address);
 }
 #endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 396bdc3b3726..880100a8b472 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2800,7 +2800,7 @@ static inline void mm_inc_nr_ptes(struct mm_struct *mm) {}
 static inline void mm_dec_nr_ptes(struct mm_struct *mm) {}
 #endif
 
-int __pte_alloc(struct mm_struct *mm, pmd_t *pmd);
+int __pte_alloc(struct mm_struct *mm, pmd_t *pmd, unsigned long addr);
 int __pte_alloc_kernel(pmd_t *pmd);
 
 #if defined(CONFIG_MMU)
@@ -2987,13 +2987,14 @@ pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdvalp,
 	pte_unmap(pte);					\
 } while (0)
 
-#define pte_alloc(mm, pmd) (unlikely(pmd_none(*(pmd))) && __pte_alloc(mm, pmd))
+#define pte_alloc(mm, pmd, addr)					\
+	(unlikely(pmd_none(*(pmd))) && __pte_alloc(mm, pmd, addr))
 
 #define pte_alloc_map(mm, pmd, address)			\
-	(pte_alloc(mm, pmd) ? NULL : pte_offset_map(pmd, address))
+	(pte_alloc(mm, pmd, address) ? NULL : pte_offset_map(pmd, address))
 
 #define pte_alloc_map_lock(mm, pmd, address, ptlp)	\
-	(pte_alloc(mm, pmd) ?				\
+	(pte_alloc(mm, pmd, address) ?			\
		 NULL : pte_offset_map_lock(mm, pmd, address, ptlp))
 
 #define pte_alloc_kernel(pmd, address)			\
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index e4969fb54da3..18375744e184 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -1246,7 +1246,7 @@ static int __init init_args(struct pgtable_debug_args *args)
 	args->start_pmdp = pmd_offset(args->pudp, 0UL);
 	WARN_ON(!args->start_pmdp);
 
-	if (pte_alloc(args->mm, args->pmdp)) {
+	if (pte_alloc(args->mm, args->pmdp, args->vaddr)) {
 		pr_err("Failed to allocate pte entries\n");
 		ret = -ENOMEM;
 		goto error;
diff --git a/mm/filemap.c b/mm/filemap.c
index 35bbba960447..d8b936d87eb4 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3453,7 +3453,7 @@ static bool filemap_map_pmd(struct vm_fault *vmf, struct folio *folio,
 	}
 
 	if (pmd_none(*vmf->pmd) && vmf->prealloc_pte)
-		pmd_install(mm, vmf->pmd, &vmf->prealloc_pte);
+		pmd_install(mm, vmf->pmd, vmf->address, &vmf->prealloc_pte);
 
 	return false;
 }
diff --git a/mm/gup.c b/mm/gup.c
index 8bea9ad80984..b87b1ea9d008 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1105,7 +1105,7 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
 		spin_unlock(ptl);
 		split_huge_pmd(vma, pmd, address);
 		/* If pmd was left empty, stuff a page table in there quickly */
-		return pte_alloc(mm, pmd) ? ERR_PTR(-ENOMEM) :
+		return pte_alloc(mm, pmd, address) ? ERR_PTR(-ENOMEM) :
 			follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
 	}
 	page = follow_huge_pmd(vma, address, pmd, flags, ctx);
diff --git a/mm/internal.h b/mm/internal.h
index 2ea9a88dcb95..1dfdad110a9a 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -320,7 +320,8 @@ void folio_activate(struct folio *folio);
 void free_pgtables(struct mmu_gather *tlb, struct ma_state *mas,
 		   struct vm_area_struct *start_vma, unsigned long floor,
 		   unsigned long ceiling, bool mm_wr_locked);
-void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte);
+void pmd_install(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
+		 pgtable_t *pte);
 
 struct zap_details;
 void unmap_page_range(struct mmu_gather *tlb,
diff --git a/mm/memory.c b/mm/memory.c
index 1c9068b0b067..09db2c97cc5c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -417,7 +417,8 @@ void free_pgtables(struct mmu_gather *tlb, struct ma_state *mas,
 	} while (vma);
 }
 
-void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
+void pmd_install(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
+		 pgtable_t *pte)
 {
 	spinlock_t *ptl = pmd_lock(mm, pmd);
 
@@ -443,13 +444,13 @@ void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
 	spin_unlock(ptl);
 }
 
-int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
+int __pte_alloc(struct mm_struct *mm, pmd_t *pmd, unsigned long addr)
 {
 	pgtable_t new = pte_alloc_one(mm);
 	if (!new)
 		return -ENOMEM;
 
-	pmd_install(mm, pmd, &new);
+	pmd_install(mm, pmd, addr, &new);
 	if (new)
 		pte_free(mm, new);
 	return 0;
@@ -2115,7 +2116,7 @@ static int insert_pages(struct vm_area_struct *vma, unsigned long addr,
 
 	/* Allocate the PTE if necessary; takes PMD lock once only. */
 	ret = -ENOMEM;
-	if (pte_alloc(mm, pmd))
+	if (pte_alloc(mm, pmd, addr))
 		goto out;
 
 	while (pages_to_write_in_pmd) {
@@ -4521,7 +4522,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	 * Use pte_alloc() instead of pte_alloc_map(), so that OOM can
 	 * be distinguished from a transient failure of pte_offset_map().
	 */
-	if (pte_alloc(vma->vm_mm, vmf->pmd))
+	if (pte_alloc(vma->vm_mm, vmf->pmd, vmf->address))
 		return VM_FAULT_OOM;
 
 	/* Use the zero-page for reads */
@@ -4868,8 +4869,8 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 		}
 
 		if (vmf->prealloc_pte)
-			pmd_install(vma->vm_mm, vmf->pmd, &vmf->prealloc_pte);
-		else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd)))
+			pmd_install(vma->vm_mm, vmf->pmd, vmf->address, &vmf->prealloc_pte);
+		else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd, vmf->address)))
 			return VM_FAULT_OOM;
 	}
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 6d66dc1c6ffa..e4d2e19e6611 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -598,7 +598,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 		goto abort;
 	if (pmd_trans_huge(*pmdp) || pmd_devmap(*pmdp))
 		goto abort;
-	if (pte_alloc(mm, pmdp))
+	if (pte_alloc(mm, pmdp, addr))
 		goto abort;
 	if (unlikely(anon_vma_prepare(vma)))
 		goto abort;
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 222ab434da54..1a1537ddffe4 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -330,11 +330,11 @@ pgtable_populate_needed(struct vm_area_struct *vma, unsigned long cp_flags)
  * allocation failures during page faults by kicking OOM and returning
  * error.
  */
-#define  change_pmd_prepare(vma, pmd, cp_flags)				\
+#define  change_pmd_prepare(vma, pmd, addr, cp_flags)			\
 	({								\
 		long err = 0;						\
 		if (unlikely(pgtable_populate_needed(vma, cp_flags))) {	\
-			if (pte_alloc(vma->vm_mm, pmd))			\
+			if (pte_alloc(vma->vm_mm, pmd, addr))		\
 				err = -ENOMEM;				\
 		}							\
 		err;							\
@@ -375,7 +375,7 @@ static inline long change_pmd_range(struct mmu_gather *tlb,
 again:
 		next = pmd_addr_end(addr, end);
 
-		ret = change_pmd_prepare(vma, pmd, cp_flags);
+		ret = change_pmd_prepare(vma, pmd, addr, cp_flags);
 		if (ret) {
 			pages = ret;
 			break;
@@ -402,7 +402,7 @@ static inline long change_pmd_range(struct mmu_gather *tlb,
 			 * cleared; make sure pmd populated if
 			 * necessary, then fall-through to pte level.
			 */
-			ret = change_pmd_prepare(vma, pmd, cp_flags);
+			ret = change_pmd_prepare(vma, pmd, addr, cp_flags);
 			if (ret) {
 				pages = ret;
 				break;
diff --git a/mm/mremap.c b/mm/mremap.c
index f672d0218a6f..7723d11e77cd 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -628,7 +628,7 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
 		}
 		if (pmd_none(*old_pmd))
 			continue;
-		if (pte_alloc(new_vma->vm_mm, new_pmd))
+		if (pte_alloc(new_vma->vm_mm, new_pmd, new_addr))
 			break;
 		if (move_ptes(vma, old_pmd, old_addr, old_addr + extent,
 			      new_vma, new_pmd, new_addr, need_rmap_locks) < 0)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 61c1d228d239..e1674580b54f 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -796,7 +796,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 			break;
 		}
 		if (unlikely(pmd_none(dst_pmdval)) &&
-		    unlikely(__pte_alloc(dst_mm, dst_pmd))) {
+		    unlikely(__pte_alloc(dst_mm, dst_pmd, dst_addr))) {
 			err = -ENOMEM;
 			break;
 		}
@@ -1713,13 +1713,13 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start,
 				err = -ENOENT;
 				break;
 			}
-			if (unlikely(__pte_alloc(mm, src_pmd))) {
+			if (unlikely(__pte_alloc(mm, src_pmd, src_addr))) {
 				err = -ENOMEM;
 				break;
 			}
 		}
-		if (unlikely(pte_alloc(mm, dst_pmd))) {
+		if (unlikely(pte_alloc(mm, dst_pmd, dst_addr))) {
 			err = -ENOMEM;
 			break;
 		}
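After this change, every path that populates a user PTE table passes the
faulting address down to pmd_install(), which is where a later patch in this
series hooks the TLB flush. A minimal sketch of the updated convention (the
caller below is hypothetical, for illustration only):

	/* populate the PTE table for addr if the pmd entry is still none */
	static vm_fault_t example_populate(struct vm_area_struct *vma,
					   pmd_t *pmd, unsigned long addr)
	{
		/* pte_alloc() now takes the address and forwards it to
		 * __pte_alloc(), which passes it on to pmd_install() */
		if (pte_alloc(vma->vm_mm, pmd, addr))
			return VM_FAULT_OOM;
		return 0;
	}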
From patchwork Mon Jul  1 08:46:45 2024
X-Patchwork-Submitter: Qi Zheng <zhengqi.arch@bytedance.com>
X-Patchwork-Id: 13717670
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
    muchun.song@linux.dev, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [RFC PATCH 4/7] mm: pgtable: try to reclaim empty PTE pages in zap_page_range_single()
Date: Mon,  1 Jul 2024 16:46:45 +0800
Message-Id: <09a7b82e61bc87849ca6bde35f98345d109817e2.1719570849.git.zhengqi.arch@bytedance.com>
In pursuit of high performance, applications mostly use high-performance
user-mode memory allocators such as jemalloc or tcmalloc. These
allocators release physical memory with madvise(MADV_DONTNEED or
MADV_FREE), but neither MADV_DONTNEED nor MADV_FREE releases page table
memory, which can lead to huge page table memory usage. The following is
a memory usage snapshot of one process, which actually happened on our
server:

	VIRT:  55t
	RES:   590g
	VmPTE: 110g

In this case, most of the page table entries are empty. A PTE page whose
entries are all empty can actually be freed back to the system for
others to use.

As a first step, this commit attempts to synchronously free empty PTE
pages in zap_page_range_single() (which MADV_DONTNEED and friends
invoke). To reduce overhead, we only handle the cases with a high
probability of generating empty PTE pages; the other cases are filtered
out, such as:

 - hugetlb vma (unsuitable)
 - userfaultfd_wp vma (may reinstall the pte entry)
 - writable private file mapping case (the COW-ed anon page is not zapped)
 - etc.

For the userfaultfd_wp and private file mapping cases (and of course the
MADV_FREE case), scanning and freeing empty PTE pages asynchronously is
left for future work.
The following code snippet shows the effect of the optimization:

	mmap 50G
	while (1) {
		for (; i < 1024 * 25; i++) {
			touch 2M memory
			madvise MADV_DONTNEED 2M
		}
	}

As we can see, the memory usage of VmPTE is reduced:

			before		after
	VIRT	       50.0 GB	      50.0 GB
	RES		3.1 MB	       3.1 MB
	VmPTE	     102640 KB	       240 KB

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 include/linux/pgtable.h |  14 +++++
 mm/Makefile             |   1 +
 mm/huge_memory.c        |   3 +
 mm/internal.h           |  14 +++++
 mm/khugepaged.c         |  22 ++++++-
 mm/memory.c             |   2 +
 mm/pt_reclaim.c         | 131 ++++++++++++++++++++++++++++++++++++++++
 7 files changed, 186 insertions(+), 1 deletion(-)
 create mode 100644 mm/pt_reclaim.c

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 2f32eaccf0b9..59e894f705a7 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -447,6 +447,20 @@ static inline void arch_check_zapped_pmd(struct vm_area_struct *vma,
 }
 #endif
 
+#ifndef arch_flush_tlb_before_set_huge_page
+static inline void arch_flush_tlb_before_set_huge_page(struct mm_struct *mm,
+						       unsigned long addr)
+{
+}
+#endif
+
+#ifndef arch_flush_tlb_before_set_pte_page
+static inline void arch_flush_tlb_before_set_pte_page(struct mm_struct *mm,
+						      unsigned long addr)
+{
+}
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address,
diff --git a/mm/Makefile b/mm/Makefile
index d2915f8c9dc0..3cb3c1f5d090 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -141,3 +141,4 @@ obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o
 obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
 obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
 obj-$(CONFIG_EXECMEM) += execmem.o
+obj-$(CONFIG_PT_RECLAIM) += pt_reclaim.o
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index c7ce28f6b7f3..444a1cdaf06d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -977,6 +977,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 		folio_add_new_anon_rmap(folio, vma, haddr, RMAP_EXCLUSIVE);
 		folio_add_lru_vma(folio, vma);
 		pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
+		arch_flush_tlb_before_set_huge_page(vma->vm_mm, haddr);
 		set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
 		update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
 		add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
@@ -1044,6 +1045,7 @@ static void set_huge_zero_folio(pgtable_t pgtable, struct mm_struct *mm,
 	entry = mk_pmd(&zero_folio->page, vma->vm_page_prot);
 	entry = pmd_mkhuge(entry);
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
+	arch_flush_tlb_before_set_huge_page(mm, haddr);
 	set_pmd_at(mm, haddr, pmd, entry);
 	mm_inc_nr_ptes(mm);
 }
@@ -1151,6 +1153,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 		pgtable = NULL;
 	}
 
+	arch_flush_tlb_before_set_huge_page(mm, addr);
 	set_pmd_at(mm, addr, pmd, entry);
 	update_mmu_cache_pmd(vma, addr, pmd);
diff --git a/mm/internal.h b/mm/internal.h
index 1dfdad110a9a..ac1fdd4681dc 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1579,4 +1579,18 @@ void unlink_file_vma_batch_init(struct unlink_vma_file_batch *);
 void unlink_file_vma_batch_add(struct unlink_vma_file_batch *, struct vm_area_struct *);
 void unlink_file_vma_batch_final(struct unlink_vma_file_batch *);
 
+#ifdef CONFIG_PT_RECLAIM
+void try_to_reclaim_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			     unsigned long start_addr, unsigned long end_addr,
+			     struct zap_details *details);
+#else
+static inline void try_to_reclaim_pgtables(struct mmu_gather *tlb,
+					   struct vm_area_struct *vma,
+					   unsigned long start_addr,
+					   unsigned long end_addr,
+					   struct zap_details *details)
+{
+}
+#endif /* CONFIG_PT_RECLAIM */
+
 #endif	/* __MM_INTERNAL_H */
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 7b7c858d5f99..63551077795d 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1578,7 +1578,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	if (userfaultfd_armed(vma) && !(vma->vm_flags & VM_SHARED))
 		pml = pmd_lock(mm, pmd);
 
-	start_pte = pte_offset_map_nolock(mm, pmd, NULL, haddr, &ptl);
+	start_pte = pte_offset_map_nolock(mm, pmd, &pgt_pmd, haddr, &ptl);
 	if (!start_pte)		/* mmap_lock + page lock should prevent this */
 		goto abort;
 	if (!pml)
@@ -1586,6 +1586,11 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	else if (ptl != pml)
 		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
 
+	/* pmd entry may be changed by others */
+	if (unlikely(IS_ENABLED(CONFIG_PT_RECLAIM) && !pml &&
+		     !pmd_same(pgt_pmd, pmdp_get_lockless(pmd))))
+		goto abort;
+
 	/* step 2: clear page table and adjust rmap */
 	for (i = 0, addr = haddr, pte = start_pte;
 	     i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE, pte++) {
@@ -1633,6 +1638,12 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 		pml = pmd_lock(mm, pmd);
 		if (ptl != pml)
 			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
+
+		if (unlikely(IS_ENABLED(CONFIG_PT_RECLAIM) &&
+			     !pmd_same(pgt_pmd, pmdp_get_lockless(pmd)))) {
+			spin_unlock(ptl);
+			goto unlock;
+		}
 	}
 	pgt_pmd = pmdp_collapse_flush(vma, haddr, pmd);
 	pmdp_get_lockless_sync();
@@ -1660,6 +1671,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	}
 	if (start_pte)
 		pte_unmap_unlock(start_pte, ptl);
+unlock:
 	if (pml && pml != ptl)
 		spin_unlock(pml);
 	if (notified)
@@ -1719,6 +1731,14 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		mmu_notifier_invalidate_range_start(&range);
 
 		pml = pmd_lock(mm, pmd);
+#ifdef CONFIG_PT_RECLAIM
+		/* check if the pmd is still valid */
+		if (check_pmd_still_valid(mm, addr, pmd) != SCAN_SUCCEED) {
+			spin_unlock(pml);
+			mmu_notifier_invalidate_range_end(&range);
+			continue;
+		}
+#endif
 		ptl = pte_lockptr(mm, pmd);
 		if (ptl != pml)
 			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
diff --git a/mm/memory.c b/mm/memory.c
index 09db2c97cc5c..b07d63767d93 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -423,6 +423,7 @@ void pmd_install(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
 	spinlock_t *ptl = pmd_lock(mm, pmd);
 
 	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
+		arch_flush_tlb_before_set_pte_page(mm, addr);
 		mm_inc_nr_ptes(mm);
 		/*
 		 * Ensure all pte setup (eg. pte page lock and page clearing) are
@@ -1931,6 +1932,7 @@ void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
 	 * could have been expanded for hugetlb pmd sharing.
 	 */
 	unmap_single_vma(&tlb, vma, address, end, details, false);
+	try_to_reclaim_pgtables(&tlb, vma, address, end, details);
 	mmu_notifier_invalidate_range_end(&range);
 	tlb_finish_mmu(&tlb);
 	hugetlb_zap_end(vma, details);
diff --git a/mm/pt_reclaim.c b/mm/pt_reclaim.c
new file mode 100644
index 000000000000..e375e7f2059f
--- /dev/null
+++ b/mm/pt_reclaim.c
@@ -0,0 +1,131 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/hugetlb.h>
+#include <linux/pagewalk.h>
+#include <linux/userfaultfd_k.h>
+#include <asm/tlb.h>
+
+#include "internal.h"
+
+/*
+ * Locking:
+ * - already held the mmap read lock to traverse the pgtable
+ * - use pmd lock for clearing pmd entry
+ * - use pte lock for checking empty PTE page, and release it after clearing
+ *   pmd entry, then we can capture the changed pmd in pte_offset_map_lock()
+ *   etc after holding this pte lock.
+ *   Thanks to this, we don't need to hold the rmap-related locks.
+ * - users of pte_offset_map_lock() etc all expect the PTE page to be stable by
+ *   using rcu lock, so PTE pages should be freed by RCU.
+ */
+static int reclaim_pgtables_pmd_entry(pmd_t *pmd, unsigned long addr,
+				      unsigned long next, struct mm_walk *walk)
+{
+	struct mm_struct *mm = walk->mm;
+	struct mmu_gather *tlb = walk->private;
+	pte_t *start_pte, *pte;
+	pmd_t pmdval;
+	spinlock_t *pml = NULL, *ptl;
+	int i;
+
+	start_pte = pte_offset_map_nolock(mm, pmd, &pmdval, addr, &ptl);
+	if (!start_pte)
+		return 0;
+
+	pml = pmd_lock(mm, pmd);
+	if (ptl != pml)
+		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
+
+	if (unlikely(!pmd_same(pmdval, pmdp_get_lockless(pmd))))
+		goto out_ptl;
+
+	/* Check if it is empty PTE page */
+	for (i = 0, pte = start_pte; i < PTRS_PER_PTE; i++, pte++) {
+		if (!pte_none(ptep_get(pte)))
+			goto out_ptl;
+	}
+	pte_unmap(start_pte);
+
+	pmd_clear(pmd);
+	if (ptl != pml)
+		spin_unlock(ptl);
+	spin_unlock(pml);
+
+	/*
+	 * NOTE:
+	 * In order to reuse mmu_gather to batch flush tlb and free PTE pages,
+	 * here tlb is not flushed before pmd lock is unlocked. This may
+	 * result in the following two situations:
+	 *
+	 * 1) Userland can trigger page fault and fill a huge page, which will
+	 *    cause the existence of small size TLB and huge TLB for the same
+	 *    address.
+	 *
+	 * 2) Userland can also trigger page fault and fill a PTE page, which
+	 *    will cause the existence of two small size TLBs, but the PTE
+	 *    pages they map are different.
+	 *
+	 * Some CPUs do not allow these; to solve this, we can define
+	 * arch_flush_tlb_before_set_{huge|pte}_page to detect this case and
+	 * flush TLB before filling a huge page or a PTE page in the page
+	 * fault path.
+	 */
+	pte_free_tlb(tlb, pmd_pgtable(pmdval), addr);
+	mm_dec_nr_ptes(mm);
+
+	return 0;
+
+out_ptl:
+	pte_unmap_unlock(start_pte, ptl);
+	if (pml != ptl)
+		spin_unlock(pml);
+
+	return 0;
+}
+
+static const struct mm_walk_ops reclaim_pgtables_walk_ops = {
+	.pmd_entry = reclaim_pgtables_pmd_entry,
+	.walk_lock = PGWALK_RDLOCK,
+};
+
+void try_to_reclaim_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			     unsigned long start_addr, unsigned long end_addr,
+			     struct zap_details *details)
+{
+	unsigned long start = max(vma->vm_start, start_addr);
+	unsigned long end;
+
+	if (start >= vma->vm_end)
+		return;
+	end = min(vma->vm_end, end_addr);
+	if (end <= vma->vm_start)
+		return;
+
+	/* Skip hugetlb case */
+	if (is_vm_hugetlb_page(vma))
+		return;
+
+	/* Leave this to the THP path to handle */
+	if (vma->vm_flags & VM_HUGEPAGE)
+		return;
+
+	/* userfaultfd_wp case may reinstall the pte entry, also skip */
+	if (userfaultfd_wp(vma))
+		return;
+
+	/*
+	 * For private file mapping, the COW-ed page is an anon page, and it
+	 * will not be zapped. For simplicity, skip all writable private
+	 * file mapping cases.
+	 */
+	if (details && !vma_is_anonymous(vma) &&
+	    !(vma->vm_flags & VM_MAYSHARE) &&
+	    (vma->vm_flags & VM_WRITE))
+		return;
+
+	start = ALIGN(start, PMD_SIZE);
+	end = ALIGN_DOWN(end, PMD_SIZE);
+	if (end - start < PMD_SIZE)
+		return;
+
+	walk_page_range_vma(vma, start, end, &reclaim_pgtables_walk_ops, tlb);
+}
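To see when this new path actually fires, consider a minimal userspace
scenario (an illustration written for this review, not part of the series;
error handling trimmed, and PMD_SIZE assumed to be 2 MiB as on x86_64). A
PMD-aligned, PMD-sized anonymous range whose pages are all zapped leaves
behind an empty PTE page, which madvise(MADV_DONTNEED) ->
zap_page_range_single() -> try_to_reclaim_pgtables() can now free:

#include <stdint.h>
#include <stddef.h>
#include <sys/mman.h>

#define PMD_SIZE (2UL << 20)	/* assumed: 2 MiB on x86_64 */

int main(void)
{
	/* Over-allocate so a PMD-aligned 2 MiB window certainly exists. */
	size_t len = 2 * PMD_SIZE;
	char *raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	char *buf;

	if (raw == MAP_FAILED)
		return 1;
	buf = (char *)(((uintptr_t)raw + PMD_SIZE - 1) & ~(PMD_SIZE - 1));

	/* Fault in every 4 KiB page: the PTE page is now fully populated. */
	for (size_t off = 0; off < PMD_SIZE; off += 4096)
		buf[off] = 1;

	/*
	 * Zap the whole PMD range. All 512 PTEs become pte_none(), so with
	 * CONFIG_PT_RECLAIM the now-empty PTE page can be reclaimed instead
	 * of lingering until munmap()/exit.
	 */
	madvise(buf, PMD_SIZE, MADV_DONTNEED);
	return 0;
}

Note also the effect of the final alignment in try_to_reclaim_pgtables():
only PMD ranges fully covered by [start, end) are walked, so zapping half a
PMD range never reaches the walker.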
From patchwork Mon Jul 1 08:46:46 2024
From: Qi Zheng
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Qi Zheng
Subject: [RFC PATCH 5/7] x86: mm: free page table pages by RCU instead of semi RCU
Date: Mon, 1 Jul 2024 16:46:46 +0800
Message-Id: <1a27215790293face83242cfd703e910aa0c5ce8.1719570849.git.zhengqi.arch@bytedance.com>
Now, if CONFIG_MMU_GATHER_RCU_TABLE_FREE is selected, the page table pages
will be freed by semi RCU, that is:

 - batch table freeing: asynchronous free by RCU
 - single table freeing: IPI + synchronous free

In this way, page tables can be traversed locklessly by disabling IRQs, in
paths such as fast GUP. But this is not enough to free empty PTE page table
pages in paths other than the munmap and exit_mmap paths, because the IPI
cannot be synchronized with rcu_read_lock() in pte_offset_map{_lock}().

In preparation for supporting reclamation of empty PTE page table pages, let
single tables also be freed by RCU like batch table freeing. Then we can also
use pte_offset_map() etc to prevent the PTE page from being freed.

Like pte_free_defer(), we can also safely use ptdesc->pt_rcu_head to free the
page table pages:

 - The pt_rcu_head is unioned with pt_list and pmd_huge_pte.

 - For pt_list, it is used to manage the PGD page in x86. Fortunately,
   tlb_remove_table() is not used to free PGD pages, so it is safe to use
   pt_rcu_head.

 - For pmd_huge_pte, we will do zap_deposited_table() before freeing the
   PMD page, so it is also safe.

Signed-off-by: Qi Zheng
---
 arch/x86/include/asm/tlb.h | 23 +++++++++++++++++++++++
 arch/x86/kernel/paravirt.c |  7 +++++++
 arch/x86/mm/pgtable.c      |  2 +-
 mm/mmu_gather.c            |  2 +-
 4 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 580636cdc257..9182db1e0264 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -34,4 +34,27 @@ static inline void __tlb_remove_table(void *table)
 	free_page_and_swap_cache(table);
 }
 
+#ifndef CONFIG_PT_RECLAIM
+static inline void __tlb_remove_table_one(void *table)
+{
+	free_page_and_swap_cache(table);
+}
+#else
+static inline void __tlb_remove_table_one_rcu(struct rcu_head *head)
+{
+	struct page *page;
+
+	page = container_of(head, struct page, rcu_head);
+	free_page_and_swap_cache(page);
+}
+
+static inline void __tlb_remove_table_one(void *table)
+{
+	struct page *page;
+
+	page = table;
+	call_rcu(&page->rcu_head, __tlb_remove_table_one_rcu);
+}
+#endif /* CONFIG_PT_RECLAIM */
+
 #endif /* _ASM_X86_TLB_H */
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 5358d43886ad..199b9a3813b4 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -60,10 +60,17 @@ void __init native_pv_lock_init(void)
 		static_branch_disable(&virt_spin_lock_key);
 }
 
+#ifndef CONFIG_PT_RECLAIM
 static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
 {
 	tlb_remove_page(tlb, table);
 }
+#else
+static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
+{
+	tlb_remove_table(tlb, table);
+}
+#endif
 
 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 93e54ba91fbf..cd5bf2157611 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -18,7 +18,7 @@ EXPORT_SYMBOL(physical_mask);
 #define PGTABLE_HIGHMEM 0
 #endif
 
-#ifndef CONFIG_PARAVIRT
+#if !defined(CONFIG_PARAVIRT) && !defined(CONFIG_PT_RECLAIM)
 static inline
 void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
 {
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 99b3e9408aa0..1a8f7b8781a2 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -314,7 +314,7 @@ static inline void tlb_table_invalidate(struct mmu_gather *tlb)
 static void tlb_remove_table_one(void *table)
 {
 	tlb_remove_table_sync_one();
-	__tlb_remove_table(table);
+	__tlb_remove_table_one(table);
 }
 
 static void tlb_table_flush(struct mmu_gather *tlb)
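The net effect of these hunks on x86, written out as the resulting call chain
(a summary of the code quoted above, not new code):

/*
 * Single-table free with CONFIG_PT_RECLAIM:
 *
 *   tlb_remove_table_one(table)
 *     tlb_remove_table_sync_one();   // IPI: still synchronizes
 *                                    // IRQ-disabled walkers (fast GUP)
 *     __tlb_remove_table_one(table);
 *       call_rcu(&page->rcu_head, __tlb_remove_table_one_rcu);
 *                                    // grace period: also synchronizes
 *                                    // rcu_read_lock() walkers
 *
 * So a lockless walker such as
 *
 *   rcu_read_lock();
 *   pte = pte_offset_map(pmd, addr);
 *   ...
 *   rcu_read_unlock();
 *
 * can no longer observe the PTE page being freed underneath it, which the
 * IPI alone could not guarantee.
 */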
From patchwork Mon Jul 1 08:46:47 2024
From: Qi Zheng
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Qi Zheng
Subject: [RFC PATCH 6/7] x86: mm: define arch_flush_tlb_before_set_huge_page
Date: Mon, 1 Jul 2024 16:46:47 +0800
Message-Id:
When we use mmu_gather to batch flush TLBs and free PTE pages, the TLB is not
flushed before the pmd lock is unlocked. This may result in the following two
situations:

1) Userland can trigger a page fault and fill a huge page, which will cause
   a small-size TLB entry and a huge TLB entry to exist for the same address.

2) Userland can also trigger a page fault and fill a PTE page, which will
   cause two small-size TLB entries to exist, but the PTE pages they map are
   different.

According to Intel's TLB Application note (317080), some x86 CPUs do not
allow case 1), so define arch_flush_tlb_before_set_huge_page to detect and
fix this issue.
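As a concrete illustration of case 1) (a sketch based on the descriptions in
this series; the reclaim side is the pt_reclaim.c code from the earlier
patch):

/*
 * CPU A (zap + PT reclaim)             CPU B (page fault)
 * ------------------------             ------------------
 * zap PTEs; PTE page becomes empty
 * pmd_clear() under pmd lock
 * spin_unlock(pmd lock)
 *                                      fault: pmd_none() is true
 *                                      install a huge pmd (THP)
 *                                      -> huge TLB entry created while
 *                                         stale 4K TLB entries remain
 * batched TLB flush in tlb_finish_mmu() (too late for such CPUs)
 *
 * arch_flush_tlb_before_set_huge_page() closes the window: if a TLB flush
 * is still pending for the mm, flush the PMD-sized range before the huge
 * pmd is installed.
 */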
Signed-off-by: Qi Zheng
---
 arch/x86/include/asm/pgtable.h |  6 ++++++
 arch/x86/mm/pgtable.c          | 13 +++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index e39311a89bf4..f93d964ab6a3 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1668,6 +1668,12 @@ void arch_check_zapped_pte(struct vm_area_struct *vma, pte_t pte);
 #define arch_check_zapped_pmd arch_check_zapped_pmd
 void arch_check_zapped_pmd(struct vm_area_struct *vma, pmd_t pmd);
 
+#ifdef CONFIG_PT_RECLAIM
+#define arch_flush_tlb_before_set_huge_page arch_flush_tlb_before_set_huge_page
+void arch_flush_tlb_before_set_huge_page(struct mm_struct *mm,
+					 unsigned long addr);
+#endif
+
 #ifdef CONFIG_XEN_PV
 #define arch_has_hw_nonleaf_pmd_young arch_has_hw_nonleaf_pmd_young
 static inline bool arch_has_hw_nonleaf_pmd_young(void)
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index cd5bf2157611..d037f7425f82 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -926,3 +926,16 @@ void arch_check_zapped_pmd(struct vm_area_struct *vma, pmd_t pmd)
 	VM_WARN_ON_ONCE(!(vma->vm_flags & VM_SHADOW_STACK) &&
 			pmd_shstk(pmd));
 }
+
+#ifdef CONFIG_PT_RECLAIM
+void arch_flush_tlb_before_set_huge_page(struct mm_struct *mm,
+					 unsigned long addr)
+{
+	if (atomic_read(&mm->tlb_flush_pending)) {
+		unsigned long start = ALIGN_DOWN(addr, PMD_SIZE);
+		unsigned long end = start + PMD_SIZE;
+
+		flush_tlb_mm_range(mm, start, end, PAGE_SHIFT, false);
+	}
+}
+#endif /* CONFIG_PT_RECLAIM */
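The caller of this hook is wired up elsewhere in the series and is not quoted
in this excerpt; the intended shape of a call site would presumably be
something like the following (hypothetical sketch; set_huge_pmd_example is a
made-up name, while arch_flush_tlb_before_set_huge_page and set_pmd_at are
the real interfaces):

static void set_huge_pmd_example(struct vm_area_struct *vma, pmd_t *pmd,
				 unsigned long haddr, pmd_t entry)
{
	/* Drop any stale 4K TLB entries for this range first */
	arch_flush_tlb_before_set_huge_page(vma->vm_mm, haddr);
	set_pmd_at(vma->vm_mm, haddr, pmd, entry);
}

Because the hook only flushes when mm->tlb_flush_pending is set, the common
fault path pays nothing beyond one atomic read.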
From patchwork Mon Jul 1 08:46:48 2024
From: Qi Zheng
To: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Qi Zheng
Subject: [RFC PATCH 7/7] x86: select ARCH_SUPPORTS_PT_RECLAIM if X86_64
Date: Mon, 1 Jul 2024 16:46:48 +0800
Message-Id: <0f3aacc9707da962398de71c127e7771c6798062.1719570849.git.zhengqi.arch@bytedance.com>

Now that x86 fully supports the CONFIG_PT_RECLAIM feature, and since
reclaiming PTE pages is only profitable on 64-bit systems, select
ARCH_SUPPORTS_PT_RECLAIM if X86_64.

Signed-off-by: Qi Zheng
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cbe5fac4b9dd..23ccd7c30adc 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -313,6 +313,7 @@ config X86
 	select FUNCTION_ALIGNMENT_4B
 	imply IMA_SECURE_AND_OR_TRUSTED_BOOT	if EFI
 	select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
+	select ARCH_SUPPORTS_PT_RECLAIM		if X86_64
 
 config INSTRUCTION_DECODER
 	def_bool y
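For completeness: the ARCH_SUPPORTS_PT_RECLAIM and PT_RECLAIM symbols
themselves are not defined in the patches quoted here; they would come from
the mm side of the series. A plausible shape for the mm/Kconfig entries,
stated purely as an assumption:

# Assumed mm/Kconfig entries, not quoted from the series:
config ARCH_SUPPORTS_PT_RECLAIM
	def_bool n

config PT_RECLAIM
	bool "reclaim empty user page table pages"
	depends on ARCH_SUPPORTS_PT_RECLAIM && MMU
	select MMU_GATHER_RCU_TABLE_FREE
	help
	  Try to reclaim empty user PTE pages in paths other than munmap
	  and exit_mmap, such as zap_page_range_single().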