From patchwork Wed Sep 4 08:40:15 2024
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13790095
From: Qi Zheng
To: david@redhat.com, hughd@google.com, willy@infradead.org,
 muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
 rppt@kernel.org, vishal.moola@gmail.com, peterx@redhat.com,
 ryan.roberts@arm.com, christophe.leroy2@cs-soprasteria.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
 Qi Zheng
Subject: [PATCH v3 07/14] mm: khugepaged: collapse_pte_mapped_thp() use
 pte_offset_map_rw_nolock()
Date: Wed, 4 Sep 2024 16:40:15 +0800
Message-Id: <20240904084022.32728-8-zhengqi.arch@bytedance.com>
In-Reply-To: <20240904084022.32728-1-zhengqi.arch@bytedance.com>
References: <20240904084022.32728-1-zhengqi.arch@bytedance.com>

In collapse_pte_mapped_thp(), we may modify the pte and pmd
entry after acquiring the ptl, so convert it to using
pte_offset_map_rw_nolock(). At this time, the pte_same() check is not
performed after the ptl is held, so we should get pgt_pmd and do a
pmd_same() check after the ptl is held.

For the case where the ptl is released first and then the pml is acquired,
the PTE page may have been freed, so we must do a pmd_same() check before
reacquiring the ptl.

Signed-off-by: Qi Zheng
---
 mm/khugepaged.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 6498721d4783a..a117d35f33aee 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1605,7 +1605,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	if (userfaultfd_armed(vma) && !(vma->vm_flags & VM_SHARED))
 		pml = pmd_lock(mm, pmd);
 
-	start_pte = pte_offset_map_nolock(mm, pmd, haddr, &ptl);
+	start_pte = pte_offset_map_rw_nolock(mm, pmd, haddr, &pgt_pmd, &ptl);
 	if (!start_pte)		/* mmap_lock + page lock should prevent this */
 		goto abort;
 	if (!pml)
@@ -1613,6 +1613,9 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	else if (ptl != pml)
 		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
 
+	if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd))))
+		goto abort;
+
 	/* step 2: clear page table and adjust rmap */
 	for (i = 0, addr = haddr, pte = start_pte;
 	     i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE, pte++) {
@@ -1658,6 +1661,16 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	/* step 4: remove empty page table */
 	if (!pml) {
 		pml = pmd_lock(mm, pmd);
+		/*
+		 * We called pte_unmap() and released the ptl before acquiring
+		 * the pml, which means we left the RCU critical section, so the
+		 * PTE page may have been freed. We must therefore do a
+		 * pmd_same() check before reacquiring the ptl.
+		 */
+		if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd)))) {
+			spin_unlock(pml);
+			goto pmd_change;
+		}
 		if (ptl != pml)
 			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
 	}
@@ -1689,6 +1702,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 		pte_unmap_unlock(start_pte, ptl);
 	if (pml && pml != ptl)
 		spin_unlock(pml);
+pmd_change:
 	if (notified)
 		mmu_notifier_invalidate_range_end(&range);
 drop_folio: