From patchwork Sun Jun 18 06:57:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13283771
Date: Sun, 18 Jun 2023 06:57:56 +0000
X-Mailer: git-send-email 2.41.0.162.gfafddb0af9-goog
Message-ID: <20230618065756.1364399-1-yosryahmed@google.com>
Subject: [RFC PATCH 2/5] mm/mlock: fixup mlock_count during unmap
From: Yosry Ahmed
To: Andrew Morton
Cc: Yu Zhao, "Jan Alexander Steffens (heftig)", Steven Barrett,
 Brian Geffon, "T.J. Alumbaugh", Gaosheng Cui, Suren Baghdasaryan,
 "Matthew Wilcox (Oracle)", "Liam R. Howlett", David Hildenbrand,
 Jason Gunthorpe, Mathieu Desnoyers, David Howells, Hugh Dickins,
 Greg Thelen, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Yosry Ahmed
Sender: owner-linux-mm@kvack.org
Precedence: bulk

In the rare case where an mlocked order-0 folio is mapped 2^20 or more
times, munlock() can mistake the high mapcount for an mlock_count. As a
result PG_mlocked is not cleared, possibly leaving the folio stranded
as unevictable indefinitely.

To fix this, add a hook during unmapping to check whether the bits used
for mlock_count are all 0 while PG_mlocked is still set. In that case,
make sure to perform the missed munlock operation.
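The failure mode described above can be illustrated with a small userspace
model. This is only a sketch: the model_* names and struct model_folio are
hypothetical and do not exist in the kernel, the counter layout (mappings in
the low 20 bits, mlock_count above them) is assumed from the description, and
model_munlock() is a deliberate simplification of munlock(). Only
MLOCK_COUNT_SHIFT and MLOCK_COUNT_BIAS are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Constants as defined in the patch. */
#define MLOCK_COUNT_SHIFT 20
#define MLOCK_COUNT_BIAS (1U << MLOCK_COUNT_SHIFT)

/* Hypothetical userspace model of an order-0 folio, not the kernel struct. */
struct model_folio {
	unsigned int count;	/* low 20 bits: mappings; upper bits: mlock_count */
	bool mlocked;		/* stands in for PG_mlocked */
};

/* Upper bits of the shared counter, read the way the patch reads them. */
unsigned int model_mlock_count(const struct model_folio *f)
{
	return f->count >> MLOCK_COUNT_SHIFT;
}

/* Simplified munlock: clear the flag only if no mlock_count appears to remain. */
void model_munlock(struct model_folio *f)
{
	if (model_mlock_count(f) == 0)
		f->mlocked = false;
}

/* Drop one mapping, then apply the kind of check this patch adds on unmap. */
void model_unmap(struct model_folio *f)
{
	assert(f->count > 0);
	f->count--;
	if (f->mlocked && model_mlock_count(f) == 0)
		model_munlock(f);
}
```

In this model, a folio with count == MLOCK_COUNT_BIAS (2^20 mappings and no
real mlock_count) reads back an mlock_count of 1, so model_munlock() leaves
the flag set. The next model_unmap() brings the count below the bias, the
upper bits read 0, and the missed munlock is finally performed.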
Signed-off-by: Yosry Ahmed
---
 include/linux/mm.h |  4 ++++
 mm/mlock.c         | 18 +++++++++++++++++-
 mm/rmap.c          |  1 +
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3994580772b3..b341477a83e8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1050,6 +1050,7 @@ unsigned long vmalloc_to_pfn(const void *addr);
 extern bool is_vmalloc_addr(const void *x);
 extern int is_vmalloc_or_module_addr(const void *x);
 extern int folio_mlocked_mapcount(struct folio *folio);
+extern void folio_mlock_unmap_check(struct folio *folio);
 #else
 static inline bool is_vmalloc_addr(const void *x)
 {
@@ -1063,6 +1064,9 @@ static inline int folio_mlocked_mapcount(struct folio *folio)
 {
 	return 0;
 }
+static inline void folio_mlock_unmap_check(struct folio *folio)
+{
+}
 #endif
 
 /*
diff --git a/mm/mlock.c b/mm/mlock.c
index 5c5462627391..8261df11d6a6 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -66,7 +66,8 @@ EXPORT_SYMBOL(can_do_mlock);
  * (1) The mapcount will be incorrect (underestimated). It will be correct again
  *     once the number of mappings falls below MLOCK_COUNT_BIAS.
  * (2) munlock() can misinterpret the large number of mappings as an mlock_count
- *     and leave PG_mlocked set.
+ *     and leave PG_mlocked set. This will be fixed when the number of mappings
+ *     falls below MLOCK_COUNT_BIAS by folio_mlock_unmap_check().
  */
 #define MLOCK_COUNT_SHIFT 20
 #define MLOCK_COUNT_BIAS (1U << MLOCK_COUNT_SHIFT)
@@ -139,6 +140,21 @@ static int folio_mlock_count_dec(struct folio *folio)
 	return mlock_count - 1;
 }
 
+/*
+ * Call after decrementing the mapcount. If the mapcount previously overflowed
+ * beyond the lower 20 bits for an order-0 mlocked folio, munlock() may have
+ * mistakenly left the folio mlocked. Fix it here.
+ */
+void folio_mlock_unmap_check(struct folio *folio)
+{
+	int mapcount = atomic_read(&folio->_mapcount) + 1;
+	int mlock_count = mapcount >> MLOCK_COUNT_SHIFT;
+
+	if (unlikely(!folio_test_large(folio) && folio_test_mlocked(folio) &&
+		     mlock_count == 0))
+		munlock_folio(folio);
+}
+
 /*
  * Mlocked folios are marked with the PG_mlocked flag for efficient testing
  * in vmscan and, possibly, the fault path; and to support semi-accurate
diff --git a/mm/rmap.c b/mm/rmap.c
index 19392e090bec..02e558551f15 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1392,6 +1392,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 			nr = atomic_dec_return_relaxed(mapped);
 			nr = (nr < COMPOUND_MAPPED);
 		}
+		folio_mlock_unmap_check(folio);
 	} else if (folio_test_pmd_mappable(folio)) {
 		/* That test is redundant: it's for safety or to optimize out */