From patchwork Thu Aug 22 07:13:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13772831
From: Qi Zheng
To: david@redhat.com, hughd@google.com, willy@infradead.org,
 muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
 rppt@kernel.org, vishal.moola@gmail.com, peterx@redhat.com,
 ryan.roberts@arm.com, christophe.leroy2@cs-soprasteria.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
 Qi Zheng
Subject: [PATCH v2 07/14] mm: khugepaged: collapse_pte_mapped_thp() use
 pte_offset_map_rw_nolock()
Date: Thu, 22 Aug 2024 15:13:22 +0800
X-Mailer: git-send-email 2.24.3 (Apple Git-128)

In collapse_pte_mapped_thp(), we may modify the pte and pmd entry after
acquiring the ptl, so convert it to using pte_offset_map_rw_nolock().
At this point the write lock of mmap_lock is not held, and no pte_same()
check is performed after the ptl is taken, so we should get pgt_pmd and do
a pmd_same() check once the ptl is held.

For the case where the ptl is released first and the pml is acquired
afterwards, the PTE page may have been freed in between, so we must do a
pmd_same() check before reacquiring the ptl.

Signed-off-by: Qi Zheng
---
 mm/khugepaged.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 53bfa7f4b7f82..15d3f7f3c65f2 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1604,7 +1604,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	if (userfaultfd_armed(vma) && !(vma->vm_flags & VM_SHARED))
 		pml = pmd_lock(mm, pmd);
 
-	start_pte = pte_offset_map_nolock(mm, pmd, haddr, &ptl);
+	start_pte = pte_offset_map_rw_nolock(mm, pmd, haddr, &pgt_pmd, &ptl);
 	if (!start_pte)		/* mmap_lock + page lock should prevent this */
 		goto abort;
 	if (!pml)
@@ -1612,6 +1612,9 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	else if (ptl != pml)
 		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
 
+	if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd))))
+		goto abort;
+
 	/* step 2: clear page table and adjust rmap */
 	for (i = 0, addr = haddr, pte = start_pte;
 	     i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE, pte++) {
@@ -1657,6 +1660,16 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	/* step 4: remove empty page table */
 	if (!pml) {
 		pml = pmd_lock(mm, pmd);
+		/*
+		 * We called pte_unmap() and release the ptl before acquiring
+		 * the pml, which means we left the RCU critical section, so the
+		 * PTE page may have been freed, so we must do pmd_same() check
+		 * before reacquiring the ptl.
+		 */
+		if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd)))) {
+			spin_unlock(pml);
+			goto pmd_change;
+		}
 		if (ptl != pml)
 			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
 	}
@@ -1688,6 +1701,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 		pte_unmap_unlock(start_pte, ptl);
 	if (pml && pml != ptl)
 		spin_unlock(pml);
+pmd_change:
 	if (notified)
 		mmu_notifier_invalidate_range_end(&range);
 drop_folio: