From patchwork Wed Sep 4 08:40:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13790197
From: Qi Zheng
To: david@redhat.com, hughd@google.com, willy@infradead.org,
 muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
 rppt@kernel.org, vishal.moola@gmail.com, peterx@redhat.com,
 ryan.roberts@arm.com, christophe.leroy2@cs-soprasteria.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
 Qi Zheng
Subject: [PATCH v3 07/14] mm: khugepaged: collapse_pte_mapped_thp() use pte_offset_map_rw_nolock()
Date: Wed, 4 Sep 2024 16:40:15 +0800
Message-Id: <20240904084022.32728-8-zhengqi.arch@bytedance.com>
In-Reply-To: <20240904084022.32728-1-zhengqi.arch@bytedance.com>
References: <20240904084022.32728-1-zhengqi.arch@bytedance.com>
MIME-Version: 1.0

In collapse_pte_mapped_thp(), we may modify the pte and pmd entry after
acquiring the ptl, so convert it to using pte_offset_map_rw_nolock().
Since pte_offset_map_rw_nolock() does not perform the pte_same() check
after the ptl is taken, we must record pgt_pmd and do a pmd_same() check
ourselves once the ptl is held.

For the case where the ptl is released first and the pml is acquired
afterwards, the PTE page may have been freed in the meantime, so we must
do the pmd_same() check before reacquiring the ptl.
Signed-off-by: Qi Zheng
---
 mm/khugepaged.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 6498721d4783a..a117d35f33aee 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1605,7 +1605,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	if (userfaultfd_armed(vma) && !(vma->vm_flags & VM_SHARED))
 		pml = pmd_lock(mm, pmd);
 
-	start_pte = pte_offset_map_nolock(mm, pmd, haddr, &ptl);
+	start_pte = pte_offset_map_rw_nolock(mm, pmd, haddr, &pgt_pmd, &ptl);
 	if (!start_pte)		/* mmap_lock + page lock should prevent this */
 		goto abort;
 	if (!pml)
@@ -1613,6 +1613,9 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	else if (ptl != pml)
 		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
 
+	if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd))))
+		goto abort;
+
 	/* step 2: clear page table and adjust rmap */
 	for (i = 0, addr = haddr, pte = start_pte;
 	     i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE, pte++) {
@@ -1658,6 +1661,16 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	/* step 4: remove empty page table */
 	if (!pml) {
 		pml = pmd_lock(mm, pmd);
+		/*
+		 * We called pte_unmap() and released the ptl before acquiring
+		 * the pml, which means we left the RCU critical section, so
+		 * the PTE page may have been freed. Therefore we must do the
+		 * pmd_same() check before reacquiring the ptl.
+		 */
+		if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd)))) {
+			spin_unlock(pml);
+			goto pmd_change;
+		}
 		if (ptl != pml)
 			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
 	}
@@ -1689,6 +1702,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	pte_unmap_unlock(start_pte, ptl);
 	if (pml && pml != ptl)
 		spin_unlock(pml);
+pmd_change:
 	if (notified)
 		mmu_notifier_invalidate_range_end(&range);
 drop_folio: