From patchwork Tue Feb 11 07:26:25 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13969511
From: Qi Zheng
To: brauner@kernel.org, willy@infradead.org, ziy@nvidia.com,
 quwenruo.btrfs@gmx.com, david@redhat.com, jannh@google.com,
 akpm@linux-foundation.org, david@fromorbit.com, djwong@kernel.org,
 muchun.song@linux.dev
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, Qi Zheng, stable@vger.kernel.org
Subject: [PATCH] mm: pgtable: fix incorrect reclaim of non-empty PTE pages
Date: Tue, 11 Feb 2025 15:26:25 +0800
Message-Id: <20250211072625.89188-1-zhengqi.arch@bytedance.com>

In zap_pte_range(), if
the pte lock was released midway, the pte entries may be refilled with
physical pages by another thread, which may cause a non-empty PTE page to
be reclaimed and eventually cause the system to crash.

To fix it, fall back to the slow path in this case to recheck if all pte
entries are still none.

Fixes: 6375e95f381e ("mm: pgtable: reclaim empty PTE page in madvise(MADV_DONTNEED)")
Reported-by: Christian Brauner
Closes: https://lore.kernel.org/all/20250207-anbot-bankfilialen-acce9d79a2c7@brauner/
Reported-by: Qu Wenruo
Closes: https://lore.kernel.org/all/152296f3-5c81-4a94-97f3-004108fba7be@gmx.com/
Tested-by: Zi Yan
Cc: stable@vger.kernel.org
Signed-off-by: Qi Zheng
---
 mm/memory.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index a8196ae72e9ae..7c7193cb21248 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1721,7 +1721,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	pmd_t pmdval;
 	unsigned long start = addr;
 	bool can_reclaim_pt = reclaim_pt_is_enabled(start, end, details);
-	bool direct_reclaim = false;
+	bool direct_reclaim = true;
 	int nr;
 
 retry:
@@ -1736,8 +1736,10 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	do {
 		bool any_skipped = false;
 
-		if (need_resched())
+		if (need_resched()) {
+			direct_reclaim = false;
 			break;
+		}
 
 		nr = do_zap_pte_range(tlb, vma, pte, addr, end, details, rss,
 				      &force_flush, &force_break, &any_skipped);
@@ -1745,11 +1747,20 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			can_reclaim_pt = false;
 		if (unlikely(force_break)) {
 			addr += nr * PAGE_SIZE;
+			direct_reclaim = false;
 			break;
 		}
 	} while (pte += nr, addr += PAGE_SIZE * nr, addr != end);
 
-	if (can_reclaim_pt && addr == end)
+	/*
+	 * Fast path: try to hold the pmd lock and unmap the PTE page.
+	 *
+	 * If the pte lock was released midway (retry case), or if the attempt
+	 * to hold the pmd lock failed, then we need to recheck all pte entries
+	 * to ensure they are still none, thereby preventing the pte entries
+	 * from being repopulated by another thread.
+	 */
+	if (can_reclaim_pt && direct_reclaim && addr == end)
 		direct_reclaim = try_get_and_clear_pmd(mm, pmd, &pmdval);
 
 	add_mm_rss_vec(mm, rss);
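
For readers who want to see why the flag has to be sticky across the retry, the control flow above can be approximated in user space. This is only a rough sketch and not kernel code: `struct zap_model`, `zap_pass()`, `all_none()`, `may_free_table()` and `scenario_frees_table()` are hypothetical names, the 8-slot array stands in for a PTE page, and the pmd-lock step done by try_get_and_clear_pmd() is left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS 8	/* hypothetical table size, far smaller than a real PTE page */

struct zap_model {
	bool populated[NR_SLOTS];	/* true = slot maps a page */
	size_t next;			/* next slot to zap */
	bool direct_reclaim;		/* sticky: cleared once the lock is dropped */
};

/*
 * One locked pass over the table. Stops before slot `interrupt_at` to model
 * need_resched() or force_break; pass NR_SLOTS for an uninterrupted walk.
 * Returns true once the walk has reached the end of the table.
 */
static bool zap_pass(struct zap_model *m, size_t interrupt_at)
{
	assert(interrupt_at <= NR_SLOTS);

	while (m->next < NR_SLOTS) {
		if (m->next == interrupt_at) {
			/* The lock is about to be released: earlier slots may be refilled. */
			m->direct_reclaim = false;
			return false;
		}
		m->populated[m->next++] = false;
	}
	return true;
}

/* Slow-path recheck: the table may be freed only if every slot is still empty. */
static bool all_none(const struct zap_model *m)
{
	for (size_t i = 0; i < NR_SLOTS; i++)
		if (m->populated[i])
			return false;
	return true;
}

/* Decision after the walk completes (the pmd-lock attempt is not modelled). */
static bool may_free_table(const struct zap_model *m)
{
	return m->direct_reclaim || all_none(m);
}

/*
 * Runs one scenario. With `racing_refill` set, the walk is interrupted at
 * slot 4 and another thread repopulates slot 1 while the lock is dropped.
 */
static bool scenario_frees_table(bool racing_refill)
{
	struct zap_model m = { .next = 0, .direct_reclaim = true };

	for (size_t i = 0; i < NR_SLOTS; i++)
		m.populated[i] = true;

	if (racing_refill) {
		bool first = zap_pass(&m, 4);	/* interrupted, lock dropped */
		assert(!first);
		(void)first;
		m.populated[1] = true;		/* concurrent fault refills a zapped slot */
	}

	bool done = zap_pass(&m, NR_SLOTS);	/* walk (or retry) to the end */
	assert(done);
	(void)done;

	return may_free_table(&m);
}
```

In this model, scenario_frees_table(false) takes the fast path and reports that the table may be freed, while scenario_frees_table(true) falls back to the recheck, finds the refilled slot and keeps the table. Without the sticky flag, the second case would reach the end of the range and free a table that still maps a page, which is the failure the patch describes.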