From patchwork Sat Feb 18 00:27:34 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145366
Message-ID: <20230218002819.1486479-2-jthoughton@google.com>
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Date: Sat, 18 Feb 2023 00:27:34 +0000
Subject: [PATCH v2 01/46] hugetlb: don't set PageUptodate for UFFDIO_CONTINUE
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
    Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert,
    Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin,
    Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, James Houghton

It would be bad if we actually set PageUptodate with UFFDIO_CONTINUE;
PageUptodate indicates that the page has been zeroed, and we don't want
to give a non-zeroed page to the user.

This change is being made now because UFFDIO_CONTINUEs on subpages
definitely shouldn't set this page flag on the head page.

Signed-off-by: James Houghton

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 07abcb6eb203..792cb2e67ce5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6256,7 +6256,16 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
	 * preceding stores to the page contents become visible before
	 * the set_pte_at() write.
	 */
-	__folio_mark_uptodate(folio);
+	if (!is_continue)
+		__folio_mark_uptodate(folio);
+	else if (!folio_test_uptodate(folio)) {
+		/*
+		 * This should never happen; HugeTLB pages are always Uptodate
+		 * as soon as they are allocated.
+		 */
+		ret = -EFAULT;
+		goto out_release_nounlock;
+	}

	/* Add shared, newly allocated pages to the page cache. */
	if (vm_shared && !is_continue) {

From patchwork Sat Feb 18 00:27:35 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145368
Message-ID: <20230218002819.1486479-3-jthoughton@google.com>
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Date: Sat, 18 Feb 2023 00:27:35 +0000
Subject: [PATCH v2 02/46] hugetlb: remove mk_huge_pte; it is unused
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
    Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert,
    Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin,
    Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, James Houghton

mk_huge_pte is unused and not necessary. pte_mkhuge is the appropriate
function to call to create a HugeTLB PTE (see
Documentation/mm/arch_pgtable_helpers.rst). It is being removed now to
avoid complicating the implementation of HugeTLB high-granularity
mapping.
Acked-by: Peter Xu
Acked-by: Mina Almasry
Reviewed-by: Mike Kravetz
Signed-off-by: James Houghton

diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h
index ccdbccfde148..c34893719715 100644
--- a/arch/s390/include/asm/hugetlb.h
+++ b/arch/s390/include/asm/hugetlb.h
@@ -77,11 +77,6 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
	set_huge_pte_at(mm, addr, ptep, pte_wrprotect(pte));
 }

-static inline pte_t mk_huge_pte(struct page *page, pgprot_t pgprot)
-{
-	return mk_pte(page, pgprot);
-}
-
 static inline int huge_pte_none(pte_t pte)
 {
	return pte_none(pte);
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index d7f6335d3999..be2e763e956f 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -5,11 +5,6 @@
 #include
 #include

-static inline pte_t mk_huge_pte(struct page *page, pgprot_t pgprot)
-{
-	return mk_pte(page, pgprot);
-}
-
 static inline unsigned long huge_pte_write(pte_t pte)
 {
	return pte_write(pte);
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index af59cc7bd307..fbbc53113473 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -925,7 +925,7 @@ static void __init hugetlb_basic_tests(struct pgtable_debug_args *args)
	 * as it was previously derived from a real kernel symbol.
	 */
	page = pfn_to_page(args->fixed_pmd_pfn);
-	pte = mk_huge_pte(page, args->page_prot);
+	pte = mk_pte(page, args->page_prot);

	WARN_ON(!huge_pte_dirty(huge_pte_mkdirty(pte)));
	WARN_ON(!huge_pte_write(huge_pte_mkwrite(huge_pte_wrprotect(pte))));
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 792cb2e67ce5..540cdf9570d3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4899,11 +4899,10 @@ static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page,
	unsigned int shift = huge_page_shift(hstate_vma(vma));

	if (writable) {
-		entry = huge_pte_mkwrite(huge_pte_mkdirty(mk_huge_pte(page,
-					 vma->vm_page_prot)));
+		entry = huge_pte_mkwrite(huge_pte_mkdirty(mk_pte(page,
+					 vma->vm_page_prot)));
	} else {
-		entry = huge_pte_wrprotect(mk_huge_pte(page,
-					   vma->vm_page_prot));
+		entry = huge_pte_wrprotect(mk_pte(page, vma->vm_page_prot));
	}
	entry = pte_mkyoung(entry);
	entry = arch_make_huge_pte(entry, shift, vma->vm_flags);

From patchwork Sat Feb 18 00:27:36 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145369
Message-ID: <20230218002819.1486479-4-jthoughton@google.com>
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Date: Sat, 18 Feb 2023 00:27:36 +0000
Subject: [PATCH v2 03/46] hugetlb: remove redundant pte_mkhuge in migration path
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
    Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert,
    Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin,
    Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, James Houghton

arch_make_huge_pte, which is called immediately following pte_mkhuge,
already makes the necessary changes to the PTE that pte_mkhuge would
have made.
The generic implementation of arch_make_huge_pte simply calls pte_mkhuge. Acked-by: Peter Xu Acked-by: Mina Almasry Reviewed-by: Mike Kravetz Signed-off-by: James Houghton diff --git a/mm/migrate.c b/mm/migrate.c index 37865f85df6d..d3964c414010 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -249,7 +249,6 @@ static bool remove_migration_pte(struct folio *folio, if (folio_test_hugetlb(folio)) { unsigned int shift = huge_page_shift(hstate_vma(vma)); - pte = pte_mkhuge(pte); pte = arch_make_huge_pte(pte, shift, vma->vm_flags); if (folio_test_anon(folio)) hugepage_add_anon_rmap(new, vma, pvmw.address, From patchwork Sat Feb 18 00:27:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13145370 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CB0C1C64ED6 for ; Sat, 18 Feb 2023 00:28:53 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3E6BC6B0078; Fri, 17 Feb 2023 19:28:50 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 34710280001; Fri, 17 Feb 2023 19:28:50 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 173256B007D; Fri, 17 Feb 2023 19:28:50 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 032A76B0078 for ; Fri, 17 Feb 2023 19:28:50 -0500 (EST) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id CCFCD1A073D for ; Sat, 18 Feb 2023 00:28:49 +0000 (UTC) X-FDA: 80478527178.29.188758E Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) by imf07.hostedemail.com (Postfix) 
with ESMTP id 17A7840004 for ; Sat, 18 Feb 2023 00:28:47 +0000 (UTC) Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b="n/AuNA+w"; spf=pass (imf07.hostedemail.com: domain of 3vxvwYwoKCNwHRFMSEFRMLEMMEJC.AMKJGLSV-KKIT8AI.MPE@flex--jthoughton.bounces.google.com designates 209.85.128.202 as permitted sender) smtp.mailfrom=3vxvwYwoKCNwHRFMSEFRMLEMMEJC.AMKJGLSV-KKIT8AI.MPE@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1676680128; a=rsa-sha256; cv=none; b=FNVG0DH52cwFJAcJ39yF6FuerYuw3DPAVq1ou8ECyJBDAcPRqVLo7GuflbeHkmHapWhXbu ZvDfoC9tdc/RP4VZFSSvDxbxTdU779LFLSdWnWgmzN8M/QUwhweRq+rJa0SpQqeml0pJG9 PcSi64ZkIipz+7HvoIhwWtMHjMHY/U4= ARC-Authentication-Results: i=1; imf07.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b="n/AuNA+w"; spf=pass (imf07.hostedemail.com: domain of 3vxvwYwoKCNwHRFMSEFRMLEMMEJC.AMKJGLSV-KKIT8AI.MPE@flex--jthoughton.bounces.google.com designates 209.85.128.202 as permitted sender) smtp.mailfrom=3vxvwYwoKCNwHRFMSEFRMLEMMEJC.AMKJGLSV-KKIT8AI.MPE@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1676680128; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=ywqgi26Brd11uCoWP0sbMHtbYsofzUY9ZAimSzTW2NE=; b=rxfu1FKYDfvb5V0KneM1vGmkLC1UeML34HUUzpn7FWDaH3bUveHZwAdXX1nMjbCSNhH4CM zFEwW6iRJNH6RWnGHIl0fJDOQ7ZlMAx9GoaYrch9rqhzciM50b0SFdZZzDuPMC7J1otyrH lM+fFeKUbgetBztsF0ddwHBg7BIc9Ng= Received: by mail-yw1-f202.google.com with SMTP id 00721157ae682-536566339d6so26924587b3.11 for ; Fri, 17 Feb 2023 16:28:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; 
s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ywqgi26Brd11uCoWP0sbMHtbYsofzUY9ZAimSzTW2NE=; b=n/AuNA+wao69AJqH2jxvia/RDl00z2MfIhcsqjJucnXsT0cN+rnMxxAJPhBDLk5ZSi MCiuvlZOLO6F7xC0qObRnm2Fw7Kk4xKcB8RUyTN9WchfsVg6IiD35+jkjOxG7TArgt04 nthdLjbi+YqMB//69CKeg7ZPl8QIrBQY7Ll3FExxGFCSIT1L0xNOUp19y9PDuIe+SxhZ B4Zyh5fsXW/P04RT85l9L0piLa0E//fpZzktUrEB+JJqehCEtNNVOMSTn6SN8kGNhUKz UMxeGr/Oj6oj5yYpspoF7jsafZJJiQT+dGrZV68rl4q3U6mgeh9r3mvqGZ+fe/wLSD9M jzxg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ywqgi26Brd11uCoWP0sbMHtbYsofzUY9ZAimSzTW2NE=; b=VgcNFikIlhQG2aDAJotu6eOnuN+iLj1vOZarhEbL1JOUnZQ4NFfrSoRe0Q0b4NupA0 whAXYu6/j3el8bjWDaS1TVZdnZYhk840zqvHuU8BuwZXvbST5vHPRtMQRffoY4B8IpBq 8W1ieUH9cDz+eCRg0WesKb/zvSKrS68ZSyzbl1CQFtPfwdPdh15RX1h85n3dsQLS6raH Wkpu0bGvz0sp9PfopOOUvD7wknuBOIRSzxmkJdsCbWPCQY/EsE1RVEoBrAdHaawEuMEr W1gFpOoYVcb0UXip6e2mpRAWuUBeHs53qTR+2t9YOllLz0BqYcomjQ+4pBkRXqAysZAj Spqw== X-Gm-Message-State: AO0yUKW1jTsnhNNc8fgvFc6dYkhZq0EUzz0rTud/YyIhA0y1CgRDW6kl iDorT/VRTkBDvYf1fIn5HL7JpR1f1CxD7NzE X-Google-Smtp-Source: AK7set91EOlgY73vi4WtH0/wZAAYWGqc3EHj1NmJd/UToA1QLT2ebGJE6OT31llBzsdBdICY2Qgx1JjnImasJfkA X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a5b:84:0:b0:902:5b5c:73f7 with SMTP id b4-20020a5b0084000000b009025b5c73f7mr14406ybp.12.1676680127282; Fri, 17 Feb 2023 16:28:47 -0800 (PST) Date: Sat, 18 Feb 2023 00:27:37 +0000 In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com> Mime-Version: 1.0 References: <20230218002819.1486479-1-jthoughton@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218002819.1486479-5-jthoughton@google.com> Subject: [PATCH v2 04/46] hugetlb: only adjust address 
ranges when VMAs want PMD sharing
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Currently this check is overly aggressive. For some userfaultfd VMAs, PMD sharing is disabled, yet we still widen the address range that is used for flushing TLBs and sending MMU notifiers. Fix this now, because HGM VMAs will also have PMD sharing disabled, yet would otherwise still have their flush ranges adjusted. Over-aggressively flushing TLBs and triggering MMU notifiers is particularly harmful with lots of high-granularity operations.
Acked-by: Peter Xu
Reviewed-by: Mike Kravetz
Signed-off-by: James Houghton
Acked-by: Mina Almasry

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 540cdf9570d3..08004371cfed 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6999,22 +6999,31 @@ static unsigned long page_table_shareable(struct vm_area_struct *svma,
 	return saddr;
 }
 
-bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr)
+static bool pmd_sharing_possible(struct vm_area_struct *vma)
 {
-	unsigned long start = addr & PUD_MASK;
-	unsigned long end = start + PUD_SIZE;
-
 #ifdef CONFIG_USERFAULTFD
 	if (uffd_disable_huge_pmd_share(vma))
 		return false;
 #endif
 	/*
-	 * check on proper vm_flags and page table alignment
+	 * Only shared VMAs can share PMDs.
 	 */
 	if (!(vma->vm_flags & VM_MAYSHARE))
 		return false;
 	if (!vma->vm_private_data) /* vma lock required for sharing */
 		return false;
+	return true;
+}
+
+bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr)
+{
+	unsigned long start = addr & PUD_MASK;
+	unsigned long end = start + PUD_SIZE;
+
+	/*
+	 * check on proper vm_flags and page table alignment
+	 */
+	if (!pmd_sharing_possible(vma))
+		return false;
 	if (!range_in_vma(vma, start, end))
 		return false;
 	return true;
@@ -7035,7 +7044,7 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
 	 * vma needs to span at least one aligned PUD size, and the range
 	 * must be at least partially within in.
 	 */
-	if (!(vma->vm_flags & VM_MAYSHARE) || !(v_end > v_start) ||
+	if (!pmd_sharing_possible(vma) || !(v_end > v_start) ||
 	    (*end <= v_start) || (*start >= v_end))
 		return;

From patchwork Sat Feb 18 00:27:38 2023
Date: Sat, 18 Feb 2023 00:27:38 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-6-jthoughton@google.com>
Subject: [PATCH v2 05/46] rmap: hugetlb: switch from page_dup_file_rmap to page_add_file_rmap
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr.
David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org
This only applies to file-backed HugeTLB, and it should be a no-op until high-granularity mapping is possible. Also update page_remove_rmap to support the eventual case where !compound && folio_test_hugetlb().

HugeTLB doesn't use LRU or mlock, so we avoid those bits. This also means we don't need to use subpage_mapcount; if we did, it would overflow with only a few mappings.

There is still one caller of page_dup_file_rmap left: copy_present_pte, and it is always called with compound=false in this case.

Signed-off-by: James Houghton

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 08004371cfed..6c008c9de80e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5077,7 +5077,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 		 * sleep during the process.
 		 */
 		if (!PageAnon(ptepage)) {
-			page_dup_file_rmap(ptepage, true);
+			page_add_file_rmap(ptepage, src_vma, true);
 		} else if (page_try_dup_anon_rmap(ptepage, true, src_vma)) {
 			pte_t src_pte_old = entry;
@@ -5910,7 +5910,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	if (anon_rmap)
 		hugepage_add_new_anon_rmap(folio, vma, haddr);
 	else
-		page_dup_file_rmap(&folio->page, true);
+		page_add_file_rmap(&folio->page, vma, true);
 	new_pte = make_huge_pte(vma, &folio->page, ((vma->vm_flags & VM_WRITE)
				&& (vma->vm_flags & VM_SHARED)));
 	/*
@@ -6301,7 +6301,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 		goto out_release_unlock;
 
 	if (folio_in_pagecache)
-		page_dup_file_rmap(&folio->page, true);
+		page_add_file_rmap(&folio->page, dst_vma, true);
 	else
 		hugepage_add_new_anon_rmap(folio, dst_vma, dst_addr);
diff --git a/mm/migrate.c b/mm/migrate.c
index d3964c414010..b0f87f19b536 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -254,7 +254,7 @@ static bool remove_migration_pte(struct folio *folio,
 			hugepage_add_anon_rmap(new, vma, pvmw.address, rmap_flags);
 		else
-			page_dup_file_rmap(new, true);
+			page_add_file_rmap(new, vma, true);
 		set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
 	} else
 #endif
diff --git a/mm/rmap.c b/mm/rmap.c
index 15ae24585fc4..c010d0af3a82 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1318,21 +1318,21 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
 	int nr = 0, nr_pmdmapped = 0;
 	bool first;
 
-	VM_BUG_ON_PAGE(compound && !PageTransHuge(page), page);
+	VM_BUG_ON_PAGE(compound && !PageTransHuge(page)
+		       && !folio_test_hugetlb(folio), page);
 
 	/* Is page being mapped by PTE? Is this its first map to be added? */
 	if (likely(!compound)) {
 		first = atomic_inc_and_test(&page->_mapcount);
 		nr = first;
-		if (first && folio_test_large(folio)) {
+		if (first && folio_test_large(folio)
+		    && !folio_test_hugetlb(folio)) {
 			nr = atomic_inc_return_relaxed(mapped);
 			nr = (nr < COMPOUND_MAPPED);
 		}
-	} else if (folio_test_pmd_mappable(folio)) {
-		/* That test is redundant: it's for safety or to optimize out */
-
+	} else {
 		first = atomic_inc_and_test(&folio->_entire_mapcount);
-		if (first) {
+		if (first && !folio_test_hugetlb(folio)) {
 			nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
 			if (likely(nr < COMPOUND_MAPPED + COMPOUND_MAPPED)) {
 				nr_pmdmapped = folio_nr_pages(folio);
@@ -1347,6 +1347,9 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
 		}
 	}
 
+	if (folio_test_hugetlb(folio))
+		return;
+
 	if (nr_pmdmapped)
 		__lruvec_stat_mod_folio(folio, folio_test_swapbacked(folio) ?
					NR_SHMEM_PMDMAPPED : NR_FILE_PMDMAPPED, nr_pmdmapped);
@@ -1376,8 +1379,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 	VM_BUG_ON_PAGE(compound && !PageHead(page), page);
 
 	/* Hugetlb pages are not counted in NR_*MAPPED */
-	if (unlikely(folio_test_hugetlb(folio))) {
-		/* hugetlb pages are always mapped with pmds */
+	if (unlikely(folio_test_hugetlb(folio)) && compound) {
 		atomic_dec(&folio->_entire_mapcount);
 		return;
 	}
@@ -1386,15 +1388,14 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 	if (likely(!compound)) {
 		last = atomic_add_negative(-1, &page->_mapcount);
 		nr = last;
-		if (last && folio_test_large(folio)) {
+		if (last && folio_test_large(folio)
+		    && !folio_test_hugetlb(folio)) {
 			nr = atomic_dec_return_relaxed(mapped);
 			nr = (nr < COMPOUND_MAPPED);
 		}
-	} else if (folio_test_pmd_mappable(folio)) {
-		/* That test is redundant: it's for safety or to optimize out */
-
+	} else {
 		last = atomic_add_negative(-1, &folio->_entire_mapcount);
-		if (last) {
+		if (last && !folio_test_hugetlb(folio)) {
 			nr = atomic_sub_return_relaxed(COMPOUND_MAPPED, mapped);
 			if (likely(nr < COMPOUND_MAPPED)) {
 				nr_pmdmapped = folio_nr_pages(folio);
@@ -1409,6 +1410,9 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 		}
 	}
 
+	if (folio_test_hugetlb(folio))
+		return;
+
 	if (nr_pmdmapped) {
 		if (folio_test_anon(folio))
 			idx = NR_ANON_THPS;
From patchwork Sat Feb 18 00:27:39 2023
Date: Sat, 18 Feb 2023 00:27:39 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-7-jthoughton@google.com>
Subject: [PATCH v2 06/46] hugetlb: add CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr.
David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org
This adds the Kconfig to enable or disable high-granularity mapping. Each architecture must explicitly opt-in to it (via ARCH_WANT_HUGETLB_HIGH_GRANULARITY_MAPPING), but when opted in, HGM will be enabled by default if HUGETLB_PAGE is enabled.

Signed-off-by: James Houghton

diff --git a/fs/Kconfig b/fs/Kconfig
index 2685a4d0d353..a072bbe3439a 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -246,6 +246,18 @@ config HUGETLBFS
 config HUGETLB_PAGE
 	def_bool HUGETLBFS
 
+config ARCH_WANT_HUGETLB_HIGH_GRANULARITY_MAPPING
+	bool
+
+config HUGETLB_HIGH_GRANULARITY_MAPPING
+	bool "HugeTLB high-granularity mapping support"
+	default n
+	depends on ARCH_WANT_HUGETLB_HIGH_GRANULARITY_MAPPING
+	help
+	  HugeTLB high-granularity mapping (HGM) allows userspace to issue
+	  UFFDIO_CONTINUE on HugeTLB mappings in PAGE_SIZE chunks.
+	  HGM is incompatible with the HugeTLB Vmemmap Optimization (HVO).
+
 #
 # Select this config option from the architecture Kconfig, if it is preferred
 # to enable the feature of HugeTLB Vmemmap Optimization (HVO).
@@ -257,6 +269,7 @@ config HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 	def_bool HUGETLB_PAGE
 	depends on ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 	depends on SPARSEMEM_VMEMMAP
+	depends on !HUGETLB_HIGH_GRANULARITY_MAPPING
 
 config HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON
 	bool "HugeTLB Vmemmap Optimization (HVO) defaults to on"
From patchwork Sat Feb 18 00:27:40 2023
Date: Sat, 18 Feb 2023 00:27:40 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-8-jthoughton@google.com>
Subject: [PATCH v2 07/46] mm: add VM_HUGETLB_HGM VMA flag
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish
Mishra, Naoya Horiguchi, Dr. David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org
a/Yg0A2eH7dGzPbPjzPfLKuTPuVvlKaYle0ASFzVHzDOB3zzSh9NzsKycmyXhgdzs89iUb5N2xupD7kqtfN9xaTG84jEDgLUqRzv9RHb4U5pPkDLe5Ix5Z9dzdNgnUHrzY0EbFHBikg/5KFf/fhQz3pv+YRV+XpaCf4wOb0ELlwMVbwUg+R7+/fbtQyB4YlnveVICQbv78vKQM/gHiUTnDjZn3x0yJC5EK4AIyEqjVfChU+5G2KLPLxv7hqH9+kJSqjzN5O8idBfl/L74JJrlCMhfkRitJfAlP1aEHBy7gCz1S71d4U4fPSVhk5DL/F+YWjUjFm0pMZZDf+HTQSfCiqNcgYZBk9wHtAnTEN7f6ySX3s+U1bFri/weJjdwdETeb3/aLOvZVxwRpi7TV2vnfX2dgkHlXx3bWWUDcyRBzz4aAIshThz1pXO+6AR7lkIY7FIdcStm8R5hwyIgL/6hCgVZN4Zfa7mj3QGGr34M5oPT2amUZBt3fePcARbBpL8D+TyiFCO4sHC7IY/vjrWNpSmD4PwoIcp2bP1vwnnNwf9RMjMhHJue09/QdgmKo77eEcn+HPXbL2UWyjyy01x4+ut67IIAOl3sKsHq+W+ZC8fYqLQ= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: VM_HUGETLB_HGM indicates that a HugeTLB VMA may contain high-granularity mappings. Its VmFlags string is "hm". Signed-off-by: James Houghton Acked-by: Mike Kravetz diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 6a96e1713fd5..77b72f42556a 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -711,6 +711,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR [ilog2(VM_UFFD_MINOR)] = "ui", #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */ +#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING + [ilog2(VM_HUGETLB_HGM)] = "hm", +#endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */ }; size_t i; diff --git a/include/linux/mm.h b/include/linux/mm.h index 2992a2d55aee..9d3216b4284a 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -383,6 +383,13 @@ extern unsigned int kobjsize(const void *objp); # define VM_UFFD_MINOR VM_NONE #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */ +#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING +# define VM_HUGETLB_HGM_BIT 38 +# define VM_HUGETLB_HGM BIT(VM_HUGETLB_HGM_BIT) /* HugeTLB high-granularity mapping */ +#else /* !CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */ +# define 
VM_HUGETLB_HGM VM_NONE +#endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */ + /* Bits set in the VMA until the stack is in its final location */ #define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ) diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h index 9db52bc4ce19..bceb960dbada 100644 --- a/include/trace/events/mmflags.h +++ b/include/trace/events/mmflags.h @@ -162,6 +162,12 @@ IF_HAVE_PG_SKIP_KASAN_POISON(PG_skip_kasan_poison, "skip_kasan_poison") # define IF_HAVE_UFFD_MINOR(flag, name) #endif +#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING +# define IF_HAVE_HUGETLB_HGM(flag, name) {flag, name}, +#else +# define IF_HAVE_HUGETLB_HGM(flag, name) +#endif + #define __def_vmaflag_names \ {VM_READ, "read" }, \ {VM_WRITE, "write" }, \ @@ -186,6 +192,7 @@ IF_HAVE_UFFD_MINOR(VM_UFFD_MINOR, "uffd_minor" ) \ {VM_ACCOUNT, "account" }, \ {VM_NORESERVE, "noreserve" }, \ {VM_HUGETLB, "hugetlb" }, \ +IF_HAVE_HUGETLB_HGM(VM_HUGETLB_HGM, "hugetlb_hgm" ) \ {VM_SYNC, "sync" }, \ __VM_ARCH_SPECIFIC_1 , \ {VM_WIPEONFORK, "wipeonfork" }, \ From patchwork Sat Feb 18 00:27:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13145374 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E8581C05027 for ; Sat, 18 Feb 2023 00:29:01 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 853FB280004; Fri, 17 Feb 2023 19:28:54 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 7B37D280002; Fri, 17 Feb 2023 19:28:54 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 67B64280004; Fri, 17 Feb 2023 19:28:54 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com 
Date: Sat, 18 Feb 2023 00:27:41 +0000
Message-ID: <20230218002819.1486479-9-jthoughton@google.com>
Subject: [PATCH v2 08/46] hugetlb: add HugeTLB HGM enablement helpers
From: James Houghton

hugetlb_hgm_eligible indicates that a VMA is eligible to have HGM
explicitly enabled via MADV_SPLIT, and hugetlb_hgm_enabled indicates
that HGM has been enabled.
Signed-off-by: James Houghton
Reviewed-by: Mina Almasry

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 7c977d234aba..efd2635a87f5 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1211,6 +1211,20 @@ static inline void hugetlb_unregister_node(struct node *node)
 }
 #endif	/* CONFIG_HUGETLB_PAGE */
 
+#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
+bool hugetlb_hgm_enabled(struct vm_area_struct *vma);
+bool hugetlb_hgm_eligible(struct vm_area_struct *vma);
+#else
+static inline bool hugetlb_hgm_enabled(struct vm_area_struct *vma)
+{
+	return false;
+}
+static inline bool hugetlb_hgm_eligible(struct vm_area_struct *vma)
+{
+	return false;
+}
+#endif
+
 static inline spinlock_t *huge_pte_lock(struct hstate *h,
 					struct mm_struct *mm, pte_t *pte)
 {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6c008c9de80e..0576dcc98044 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7004,6 +7004,10 @@ static bool pmd_sharing_possible(struct vm_area_struct *vma)
 #ifdef CONFIG_USERFAULTFD
 	if (uffd_disable_huge_pmd_share(vma))
 		return false;
+#endif
+#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
+	if (hugetlb_hgm_enabled(vma))
+		return false;
 #endif
 	/*
 	 * Only shared VMAs can share PMDs.
@@ -7267,6 +7271,18 @@ __weak unsigned long hugetlb_mask_last_page(struct hstate *h)
 
 #endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */
 
+#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
+bool hugetlb_hgm_eligible(struct vm_area_struct *vma)
+{
+	/* All shared VMAs may have HGM. */
+	return vma && (vma->vm_flags & VM_MAYSHARE);
+}
+bool hugetlb_hgm_enabled(struct vm_area_struct *vma)
+{
+	return vma && (vma->vm_flags & VM_HUGETLB_HGM);
+}
+#endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */
+
 /*
  * These functions are overwritable if your architecture needs its own
  * behavior.
From patchwork Sat Feb 18 00:27:42 2023

Date: Sat, 18 Feb 2023 00:27:42 +0000
Message-ID: <20230218002819.1486479-10-jthoughton@google.com>
Subject: [PATCH v2 09/46] mm: add MADV_SPLIT to enable HugeTLB HGM
From: James Houghton

Issuing madvise(MADV_SPLIT) on a HugeTLB address range will enable
HugeTLB HGM. The name MADV_SPLIT was chosen so that this API can be
applied to non-HugeTLB memory in the future, should such a use arise.

MADV_SPLIT makes several API changes for some syscalls on HugeTLB
address ranges:

1. UFFDIO_CONTINUE is allowed for MAP_SHARED VMAs at PAGE_SIZE
   alignment.
2. read()ing a page fault event from a userfaultfd will yield a
   PAGE_SIZE-rounded address, instead of a huge-page-size-rounded
   address (unless UFFD_FEATURE_EXACT_ADDRESS is used).

There is no way to disable the API changes that come with issuing
MADV_SPLIT. MADV_COLLAPSE can be used to collapse the high-granularity
page table mappings created via the functionality MADV_SPLIT enables.

For post-copy live migration, the expected use-case is:

1. mmap(MAP_SHARED, some_fd) primary mapping
2. mmap(MAP_SHARED, some_fd) alias mapping
3. MADV_SPLIT the primary mapping
4. UFFDIO_REGISTER/etc. the primary mapping
5. Copy memory contents into the alias mapping and UFFDIO_CONTINUE the
   corresponding PAGE_SIZE sections in the primary mapping.

More API changes may be added in the future.
Signed-off-by: James Houghton

diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
index 763929e814e9..7a26f3648b90 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -78,6 +78,8 @@
 #define MADV_COLLAPSE	25		/* Synchronous hugepage collapse */
 
+#define MADV_SPLIT	26		/* Enable hugepage high-granularity APIs */
+
 /* compatibility flags */
 #define MAP_FILE	0
diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
index c6e1fc77c996..f8a74a3a0928 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -105,6 +105,8 @@
 #define MADV_COLLAPSE	25		/* Synchronous hugepage collapse */
 
+#define MADV_SPLIT	26		/* Enable hugepage high-granularity APIs */
+
 /* compatibility flags */
 #define MAP_FILE	0
diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
index 68c44f99bc93..a6dc6a56c941 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -72,6 +72,8 @@
 #define MADV_COLLAPSE	25		/* Synchronous hugepage collapse */
 
+#define MADV_SPLIT	74		/* Enable hugepage high-granularity APIs */
+
 #define MADV_HWPOISON	100		/* poison a page for testing */
 #define MADV_SOFT_OFFLINE 101		/* soft offline page for testing */
diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
index 1ff0c858544f..f98a77c430a9 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -113,6 +113,8 @@
 #define MADV_COLLAPSE	25		/* Synchronous hugepage collapse */
 
+#define MADV_SPLIT	26		/* Enable hugepage high-granularity APIs */
+
 /* compatibility flags */
 #define MAP_FILE	0
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 6ce1f1ceb432..996e8ded092f 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -79,6 +79,8 @@
 #define MADV_COLLAPSE	25		/* Synchronous hugepage collapse */
 
+#define MADV_SPLIT	26		/* Enable hugepage high-granularity APIs */
+
 /* compatibility flags */
 #define MAP_FILE	0
diff --git a/mm/madvise.c b/mm/madvise.c
index c2202f51e9dd..8c004c678262 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1006,6 +1006,28 @@ static long madvise_remove(struct vm_area_struct *vma,
 	return error;
 }
 
+static int madvise_split(struct vm_area_struct *vma,
+			 unsigned long *new_flags)
+{
+#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
+	if (!is_vm_hugetlb_page(vma) || !hugetlb_hgm_eligible(vma))
+		return -EINVAL;
+
+	/*
+	 * PMD sharing doesn't work with HGM. If this MADV_SPLIT is on part
+	 * of a VMA, then we will split the VMA. Here, we're unsharing before
+	 * splitting because it's simpler, although we may be unsharing more
+	 * than we need.
+	 */
+	hugetlb_unshare_all_pmds(vma);
+
+	*new_flags |= VM_HUGETLB_HGM;
+	return 0;
+#else
+	return -EINVAL;
+#endif
+}
+
 /*
  * Apply an madvise behavior to a region of a vma.  madvise_update_vma
  * will handle splitting a vm area into separate areas, each area with its own
@@ -1084,6 +1106,11 @@ static int madvise_vma_behavior(struct vm_area_struct *vma,
 		break;
 	case MADV_COLLAPSE:
 		return madvise_collapse(vma, prev, start, end);
+	case MADV_SPLIT:
+		error = madvise_split(vma, &new_flags);
+		if (error)
+			goto out;
+		break;
 	}
 
 	anon_name = anon_vma_name(vma);
@@ -1178,6 +1205,9 @@ madvise_behavior_valid(int behavior)
 	case MADV_HUGEPAGE:
 	case MADV_NOHUGEPAGE:
 	case MADV_COLLAPSE:
+#endif
+#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
+	case MADV_SPLIT:
 #endif
 	case MADV_DONTDUMP:
 	case MADV_DODUMP:
@@ -1368,6 +1398,8 @@ int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
  *		transparent huge pages so the existing pages will not be
  *		coalesced into THP and new pages will not be allocated as THP.
  *  MADV_COLLAPSE - synchronously coalesce pages into new THP.
+ *  MADV_SPLIT - allow HugeTLB pages to be mapped at PAGE_SIZE. This allows
+ *		UFFDIO_CONTINUE to accept PAGE_SIZE-aligned regions.
  *  MADV_DONTDUMP - the application wants to prevent pages in the given range
  *		from being included in its core dump.
  *  MADV_DODUMP - cancel MADV_DONTDUMP: no longer exclude from core dump.

From patchwork Sat Feb 18 00:27:43 2023

Date: Sat, 18 Feb 2023 00:27:43 +0000
Message-ID: <20230218002819.1486479-11-jthoughton@google.com>
Subject: [PATCH v2 10/46] hugetlb: make huge_pte_lockptr take an explicit shift argument
From: James Houghton

This is needed to handle PTL locking with high-granularity mapping. We
won't always be using the PMD-level PTL even if we're using the 2M
hugepage hstate. It's possible that we're dealing with 4K PTEs, in
which case we need to lock the PTL for the 4K PTE.
Reviewed-by: Mina Almasry
Acked-by: Mike Kravetz
Signed-off-by: James Houghton

diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index cb2dcdb18f8e..035a0df47af0 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -261,7 +261,8 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 		psize = hstate_get_psize(h);
 #ifdef CONFIG_DEBUG_VM
-		assert_spin_locked(huge_pte_lockptr(h, vma->vm_mm, ptep));
+		assert_spin_locked(huge_pte_lockptr(huge_page_shift(h),
+						    vma->vm_mm, ptep));
 #endif
 #else
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index efd2635a87f5..a1ceb9417f01 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -958,12 +958,11 @@ static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask)
 	return modified_mask;
 }
 
-static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
+static inline spinlock_t *huge_pte_lockptr(unsigned int shift,
 					   struct mm_struct *mm, pte_t *pte)
 {
-	if (huge_page_size(h) == PMD_SIZE)
+	if (shift == PMD_SHIFT)
 		return pmd_lockptr(mm, (pmd_t *) pte);
-	VM_BUG_ON(huge_page_size(h) == PAGE_SIZE);
 	return &mm->page_table_lock;
 }
 
@@ -1173,7 +1172,7 @@ static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask)
 	return 0;
 }
 
-static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
+static inline spinlock_t *huge_pte_lockptr(unsigned int shift,
 					   struct mm_struct *mm, pte_t *pte)
 {
 	return &mm->page_table_lock;
@@ -1230,7 +1229,7 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h,
 {
 	spinlock_t *ptl;
 
-	ptl = huge_pte_lockptr(h, mm, pte);
+	ptl = huge_pte_lockptr(huge_page_shift(h), mm, pte);
 	spin_lock(ptl);
 	return ptl;
 }
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0576dcc98044..5ca9eae0ac42 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5017,7 +5017,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 		}
 
 		dst_ptl = huge_pte_lock(h, dst, dst_pte);
-		src_ptl = huge_pte_lockptr(h, src, src_pte);
+		src_ptl = huge_pte_lockptr(huge_page_shift(h), src, src_pte);
 		spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
 		entry = huge_ptep_get(src_pte);
again:
@@ -5098,7 +5098,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 			/* Install the new hugetlb folio if src pte stable */
 			dst_ptl = huge_pte_lock(h, dst, dst_pte);
-			src_ptl = huge_pte_lockptr(h, src, src_pte);
+			src_ptl = huge_pte_lockptr(huge_page_shift(h),
+						   src, src_pte);
 			spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
 			entry = huge_ptep_get(src_pte);
 			if (!pte_same(src_pte_old, entry)) {
@@ -5152,7 +5153,7 @@ static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr,
 	pte_t pte;
 
 	dst_ptl = huge_pte_lock(h, mm, dst_pte);
-	src_ptl = huge_pte_lockptr(h, mm, src_pte);
+	src_ptl = huge_pte_lockptr(huge_page_shift(h), mm, src_pte);
 
 	/*
 	 * We don't have to worry about the ordering of src and dst ptlocks
diff --git a/mm/migrate.c b/mm/migrate.c
index b0f87f19b536..9b4a7e75f6e6 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -363,7 +363,8 @@ void __migration_entry_wait_huge(struct vm_area_struct *vma,
 
 void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte)
 {
-	spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma), vma->vm_mm, pte);
+	spinlock_t *ptl = huge_pte_lockptr(huge_page_shift(hstate_vma(vma)),
+					   vma->vm_mm, pte);
 
 	__migration_entry_wait_huge(vma, pte, ptl);
 }

From patchwork Sat Feb 18 00:27:44 2023
Date: Sat, 18 Feb 2023 00:27:44 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-12-jthoughton@google.com>
Subject: [PATCH v2 11/46] hugetlb: add hugetlb_pte to track HugeTLB page table entries
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
After high-granularity mapping, page table entries for HugeTLB pages can
be of any size/type. (For example, we can have a 1G page mapped with a
mix of PMDs and PTEs.) This struct is to help keep track of a HugeTLB
PTE after we have done a page table walk.

Without this, we'd have to pass around the "size" of the PTE everywhere.
We effectively did this before; it could be fetched from the hstate,
which we pass around pretty much everywhere.

hugetlb_pte_present_leaf is included here as a helper function that will
be used frequently later on.

Signed-off-by: James Houghton
Reviewed-by: Mina Almasry
Acked-by: Mike Kravetz

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index a1ceb9417f01..eeacadf3272b 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -26,6 +26,25 @@ typedef struct { unsigned long pd; } hugepd_t;
 #define __hugepd(x) ((hugepd_t) { (x) })
 #endif
 
+enum hugetlb_level {
+	HUGETLB_LEVEL_PTE = 1,
+	/*
+	 * We always include PMD, PUD, and P4D in this enum definition so that,
+	 * when logged as an integer, we can easily tell which level it is.
+	 */
+	HUGETLB_LEVEL_PMD,
+	HUGETLB_LEVEL_PUD,
+	HUGETLB_LEVEL_P4D,
+	HUGETLB_LEVEL_PGD,
+};
+
+struct hugetlb_pte {
+	pte_t *ptep;
+	unsigned int shift;
+	enum hugetlb_level level;
+	spinlock_t *ptl;
+};
+
 #ifdef CONFIG_HUGETLB_PAGE
 
 #include
@@ -39,6 +58,20 @@ typedef struct { unsigned long pd; } hugepd_t;
  */
 #define __NR_USED_SUBPAGE 3
 
+static inline
+unsigned long hugetlb_pte_size(const struct hugetlb_pte *hpte)
+{
+	return 1UL << hpte->shift;
+}
+
+static inline
+unsigned long hugetlb_pte_mask(const struct hugetlb_pte *hpte)
+{
+	return ~(hugetlb_pte_size(hpte) - 1);
+}
+
+bool hugetlb_pte_present_leaf(const struct hugetlb_pte *hpte, pte_t pte);
+
 struct hugepage_subpool {
 	spinlock_t lock;
 	long count;
@@ -1234,6 +1267,45 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h,
 	return ptl;
 }
 
+static inline
+spinlock_t *hugetlb_pte_lockptr(struct hugetlb_pte *hpte)
+{
+	return hpte->ptl;
+}
+
+static inline
+spinlock_t *hugetlb_pte_lock(struct hugetlb_pte *hpte)
+{
+	spinlock_t *ptl = hugetlb_pte_lockptr(hpte);
+
+	spin_lock(ptl);
+	return ptl;
+}
+
+static inline
+void __hugetlb_pte_init(struct hugetlb_pte *hpte, pte_t *ptep,
+			unsigned int shift, enum hugetlb_level level,
+			spinlock_t *ptl)
+{
+	/*
+	 * If 'shift' indicates that this PTE is contiguous, then @ptep must
+	 * be the first pte of the contiguous bunch.
+	 */
+	hpte->ptl = ptl;
+	hpte->ptep = ptep;
+	hpte->shift = shift;
+	hpte->level = level;
+}
+
+static inline
+void hugetlb_pte_init(struct mm_struct *mm, struct hugetlb_pte *hpte,
+		      pte_t *ptep, unsigned int shift,
+		      enum hugetlb_level level)
+{
+	__hugetlb_pte_init(hpte, ptep, shift, level,
+			   huge_pte_lockptr(shift, mm, ptep));
+}
+
 #if defined(CONFIG_HUGETLB_PAGE) && defined(CONFIG_CMA)
 extern void __init hugetlb_cma_reserve(int order);
 #else
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5ca9eae0ac42..6c74adff43b6 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1269,6 +1269,35 @@ static bool vma_has_reserves(struct vm_area_struct *vma, long chg)
 	return false;
 }
 
+bool hugetlb_pte_present_leaf(const struct hugetlb_pte *hpte, pte_t pte)
+{
+	pgd_t pgd;
+	p4d_t p4d;
+	pud_t pud;
+	pmd_t pmd;
+
+	switch (hpte->level) {
+	case HUGETLB_LEVEL_PGD:
+		pgd = __pgd(pte_val(pte));
+		return pgd_present(pgd) && pgd_leaf(pgd);
+	case HUGETLB_LEVEL_P4D:
+		p4d = __p4d(pte_val(pte));
+		return p4d_present(p4d) && p4d_leaf(p4d);
+	case HUGETLB_LEVEL_PUD:
+		pud = __pud(pte_val(pte));
+		return pud_present(pud) && pud_leaf(pud);
+	case HUGETLB_LEVEL_PMD:
+		pmd = __pmd(pte_val(pte));
+		return pmd_present(pmd) && pmd_leaf(pmd);
+	case HUGETLB_LEVEL_PTE:
+		return pte_present(pte);
+	default:
+		WARN_ON_ONCE(1);
+		return false;
+	}
+}
+
+
 static void enqueue_hugetlb_folio(struct hstate *h, struct folio *folio)
 {
 	int nid = folio_nid(folio);
From patchwork Sat Feb 18 00:27:45 2023
Date: Sat, 18 Feb 2023 00:27:45 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-13-jthoughton@google.com>
Subject: [PATCH v2 12/46] hugetlb: add hugetlb_alloc_pmd and hugetlb_alloc_pte
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
These functions are used to allocate new PTEs below the hstate PTE. This
will be used by hugetlb_walk_step, which implements stepping forwards in
a HugeTLB high-granularity page table walk.

The reasons that we don't use the standard pmd_alloc/pte_alloc*
functions are:
 1) This prevents us from accidentally overwriting swap entries or
    attempting to use swap entries as present non-leaf PTEs (see
    pmd_alloc(); we assume that !pte_none means pte_present and
    non-leaf).
 2) Locking hugetlb PTEs can be different from locking regular PTEs.
    (Although, as implemented right now, locking is the same.)
 3) We can maintain compatibility with CONFIG_HIGHPTE. That is, HugeTLB
    HGM won't use HIGHPTE, but the kernel can still be built with it,
    and other mm code will use it.

When GENERAL_HUGETLB supports P4D-based hugepages, we will need to
implement hugetlb_pud_alloc to implement hugetlb_walk_step.

Signed-off-by: James Houghton

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index eeacadf3272b..9d839519c875 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -72,6 +72,11 @@ unsigned long hugetlb_pte_mask(const struct hugetlb_pte *hpte)
 
 bool hugetlb_pte_present_leaf(const struct hugetlb_pte *hpte, pte_t pte);
 
+pmd_t *hugetlb_alloc_pmd(struct mm_struct *mm, struct hugetlb_pte *hpte,
+			 unsigned long addr);
+pte_t *hugetlb_alloc_pte(struct mm_struct *mm, struct hugetlb_pte *hpte,
+			 unsigned long addr);
+
 struct hugepage_subpool {
 	spinlock_t lock;
 	long count;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6c74adff43b6..bb424cdf79e4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -483,6 +483,120 @@ static bool has_same_uncharge_info(struct file_region *rg,
 #endif
 }
 
+/*
+ * hugetlb_alloc_pmd -- Allocate or find a PMD beneath a PUD-level hpte.
+ *
+ * This is meant to be used to implement hugetlb_walk_step when one must
+ * step down to a PMD. Different architectures may implement hugetlb_walk_step
+ * differently, but hugetlb_alloc_pmd and hugetlb_alloc_pte are architecture-
+ * independent.
+ *
+ * Returns:
+ *	On success: the pointer to the PMD. This should be placed into a
+ *		    hugetlb_pte. @hpte is not changed.
+ *	ERR_PTR(-EINVAL): hpte is not PUD-level
+ *	ERR_PTR(-EEXIST): there is a non-leaf and non-empty PUD in @hpte
+ *	ERR_PTR(-ENOMEM): could not allocate the new PMD
+ */
+pmd_t *hugetlb_alloc_pmd(struct mm_struct *mm, struct hugetlb_pte *hpte,
+			 unsigned long addr)
+{
+	spinlock_t *ptl = hugetlb_pte_lockptr(hpte);
+	pmd_t *new;
+	pud_t *pudp;
+	pud_t pud;
+
+	if (hpte->level != HUGETLB_LEVEL_PUD)
+		return ERR_PTR(-EINVAL);
+
+	pudp = (pud_t *)hpte->ptep;
+retry:
+	pud = READ_ONCE(*pudp);
+	if (likely(pud_present(pud)))
+		return unlikely(pud_leaf(pud))
+			? ERR_PTR(-EEXIST)
+			: pmd_offset(pudp, addr);
+	else if (!pud_none(pud))
+		/*
+		 * Not present and not none means that a swap entry lives here,
+		 * and we can't get rid of it.
+		 */
+		return ERR_PTR(-EEXIST);
+
+	new = pmd_alloc_one(mm, addr);
+	if (!new)
+		return ERR_PTR(-ENOMEM);
+
+	spin_lock(ptl);
+	if (!pud_same(pud, *pudp)) {
+		spin_unlock(ptl);
+		pmd_free(mm, new);
+		goto retry;
+	}
+
+	mm_inc_nr_pmds(mm);
+	smp_wmb(); /* See comment in pmd_install() */
+	pud_populate(mm, pudp, new);
+	spin_unlock(ptl);
+	return pmd_offset(pudp, addr);
+}
+
+/*
+ * hugetlb_alloc_pte -- Allocate a PTE beneath a pmd_none PMD-level hpte.
+ *
+ * See the comment above hugetlb_alloc_pmd.
+ */
+pte_t *hugetlb_alloc_pte(struct mm_struct *mm, struct hugetlb_pte *hpte,
+			 unsigned long addr)
+{
+	spinlock_t *ptl = hugetlb_pte_lockptr(hpte);
+	pgtable_t new;
+	pmd_t *pmdp;
+	pmd_t pmd;
+
+	if (hpte->level != HUGETLB_LEVEL_PMD)
+		return ERR_PTR(-EINVAL);
+
+	pmdp = (pmd_t *)hpte->ptep;
+retry:
+	pmd = READ_ONCE(*pmdp);
+	if (likely(pmd_present(pmd)))
+		return unlikely(pmd_leaf(pmd))
+			? ERR_PTR(-EEXIST)
+			: pte_offset_kernel(pmdp, addr);
+	else if (!pmd_none(pmd))
+		/*
+		 * Not present and not none means that a swap entry lives here,
+		 * and we can't get rid of it.
+		 */
+		return ERR_PTR(-EEXIST);
+
+	/*
+	 * With CONFIG_HIGHPTE, calling `pte_alloc_one` directly may result
+	 * in page tables being allocated in high memory, needing a kmap to
+	 * access. Instead, we call __pte_alloc_one directly with
+	 * GFP_PGTABLE_USER to prevent these PTEs being allocated in high
+	 * memory.
+	 */
+	new = __pte_alloc_one(mm, GFP_PGTABLE_USER);
+	if (!new)
+		return ERR_PTR(-ENOMEM);
+
+	spin_lock(ptl);
+	if (!pmd_same(pmd, *pmdp)) {
+		spin_unlock(ptl);
+		pgtable_pte_page_dtor(new);
+		__free_page(new);
+		goto retry;
+	}
+
+	mm_inc_nr_ptes(mm);
+	smp_wmb(); /* See comment in pmd_install() */
+	pmd_populate(mm, pmdp, new);
+	spin_unlock(ptl);
+	return pte_offset_kernel(pmdp, addr);
+}
+
 static void coalesce_file_region(struct resv_map *resv, struct file_region *rg)
 {
 	struct file_region *nrg, *prg;
From patchwork Sat Feb 18 00:27:46 2023
Date: Sat, 18 Feb 2023 00:27:46 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-14-jthoughton@google.com>
Subject: [PATCH v2 13/46] hugetlb: add hugetlb_hgm_walk and hugetlb_walk_step
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
hugetlb_hgm_walk implements high-granularity page table walks for HugeTLB. It is safe to call on non-HGM-enabled VMAs; it will return immediately.

hugetlb_walk_step implements how we step forward in the walk. Architectures that don't use GENERAL_HUGETLB will need to provide their own implementation.

The broader API that should be used is hugetlb_full_walk[,alloc|,continue].
Signed-off-by: James Houghton diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 9d839519c875..726d581158b1 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -223,6 +223,14 @@ u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx); pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, pud_t *pud); +int hugetlb_full_walk(struct hugetlb_pte *hpte, struct vm_area_struct *vma, + unsigned long addr); +void hugetlb_full_walk_continue(struct hugetlb_pte *hpte, + struct vm_area_struct *vma, unsigned long addr); +int hugetlb_full_walk_alloc(struct hugetlb_pte *hpte, + struct vm_area_struct *vma, unsigned long addr, + unsigned long target_sz); + struct address_space *hugetlb_page_mapping_lock_write(struct page *hpage); extern int sysctl_hugetlb_shm_group; @@ -272,6 +280,8 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, unsigned long sz); unsigned long hugetlb_mask_last_page(struct hstate *h); +int hugetlb_walk_step(struct mm_struct *mm, struct hugetlb_pte *hpte, + unsigned long addr, unsigned long sz); int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, pte_t *ptep); void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma, @@ -1054,6 +1064,8 @@ void hugetlb_register_node(struct node *node); void hugetlb_unregister_node(struct node *node); #endif +enum hugetlb_level hpage_size_to_level(unsigned long sz); + #else /* CONFIG_HUGETLB_PAGE */ struct hstate {}; @@ -1246,6 +1258,11 @@ static inline void hugetlb_register_node(struct node *node) static inline void hugetlb_unregister_node(struct node *node) { } + +static inline enum hugetlb_level hpage_size_to_level(unsigned long sz) +{ + return HUGETLB_LEVEL_PTE; +} #endif /* CONFIG_HUGETLB_PAGE */ #ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 
bb424cdf79e4..810c05feb41f 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -97,6 +97,29 @@ static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma); static void hugetlb_unshare_pmds(struct vm_area_struct *vma, unsigned long start, unsigned long end); +/* + * hpage_size_to_level() - convert @sz to the corresponding page table level + * + * @sz must be less than or equal to a valid hugepage size. + */ +enum hugetlb_level hpage_size_to_level(unsigned long sz) +{ + /* + * We order the conditionals from smallest to largest to pick the + * smallest level when multiple levels have the same size (i.e., + * when levels are folded). + */ + if (sz < PMD_SIZE) + return HUGETLB_LEVEL_PTE; + if (sz < PUD_SIZE) + return HUGETLB_LEVEL_PMD; + if (sz < P4D_SIZE) + return HUGETLB_LEVEL_PUD; + if (sz < PGDIR_SIZE) + return HUGETLB_LEVEL_P4D; + return HUGETLB_LEVEL_PGD; +} + static inline bool subpool_is_free(struct hugepage_subpool *spool) { if (spool->count) @@ -7315,6 +7338,154 @@ bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr) } #endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */ +/* __hugetlb_hgm_walk - walks a high-granularity HugeTLB page table to resolve + * the page table entry for @addr. We might allocate new PTEs. + * + * @hpte must always be pointing at an hstate-level PTE or deeper. + * + * This function will never walk further if it encounters a PTE of a size + * less than or equal to @sz. + * + * @alloc determines what we do when we encounter an empty PTE. If false, + * we stop walking. If true and @sz is less than the current PTE's size, + * we make that PTE point to the next level down, going until @sz is the same + * as our current PTE. + * + * If @alloc is false and @sz is PAGE_SIZE, this function will always + * succeed, but that does not guarantee that hugetlb_pte_size(hpte) is @sz. + * + * Return: + * -ENOMEM if we couldn't allocate new PTEs. 
+ * -EEXIST if the caller wanted to walk further than a migration PTE, + * poison PTE, or a PTE marker. The caller needs to manually deal + * with this scenario. + * -EINVAL if called with invalid arguments (@sz invalid, @hpte not + * initialized). + * 0 otherwise. + * + * Even if this function fails, @hpte is guaranteed to always remain + * valid. + */ +static int __hugetlb_hgm_walk(struct mm_struct *mm, struct vm_area_struct *vma, + struct hugetlb_pte *hpte, unsigned long addr, + unsigned long sz, bool alloc) +{ + int ret = 0; + pte_t pte; + + if (WARN_ON_ONCE(sz < PAGE_SIZE)) + return -EINVAL; + + if (WARN_ON_ONCE(!hpte->ptep)) + return -EINVAL; + + while (hugetlb_pte_size(hpte) > sz && !ret) { + pte = huge_ptep_get(hpte->ptep); + if (!pte_present(pte)) { + if (!alloc) + return 0; + if (unlikely(!huge_pte_none(pte))) + return -EEXIST; + } else if (hugetlb_pte_present_leaf(hpte, pte)) + return 0; + ret = hugetlb_walk_step(mm, hpte, addr, sz); + } + + return ret; +} + +/* + * hugetlb_hgm_walk - Has the same behavior as __hugetlb_hgm_walk but will + * initialize @hpte with hstate-level PTE pointer @ptep. + */ +static int hugetlb_hgm_walk(struct hugetlb_pte *hpte, + pte_t *ptep, + struct vm_area_struct *vma, + unsigned long addr, + unsigned long target_sz, + bool alloc) +{ + struct hstate *h = hstate_vma(vma); + + hugetlb_pte_init(vma->vm_mm, hpte, ptep, huge_page_shift(h), + hpage_size_to_level(huge_page_size(h))); + return __hugetlb_hgm_walk(vma->vm_mm, vma, hpte, addr, target_sz, + alloc); +} + +/* + * hugetlb_full_walk_continue - continue a high-granularity page-table walk. + * + * If a user has a valid @hpte but knows that @hpte is not a leaf, they can + * attempt to continue walking by calling this function. + * + * This function will never fail, but @hpte might not change. + * + * If @hpte hasn't been initialized, then this function's behavior is + * undefined. 
+ */ +void hugetlb_full_walk_continue(struct hugetlb_pte *hpte, + struct vm_area_struct *vma, + unsigned long addr) +{ + /* __hugetlb_hgm_walk will never fail with these arguments. */ + WARN_ON_ONCE(__hugetlb_hgm_walk(vma->vm_mm, vma, hpte, addr, + PAGE_SIZE, false)); +} + +/* + * hugetlb_full_walk - do a high-granularity page-table walk; never allocate. + * + * This function can only fail if we find that the hstate-level PTE is not + * allocated. Callers can take advantage of this fact to skip address regions + * that cannot be mapped in that case. + * + * If this function succeeds, @hpte is guaranteed to be valid. + */ +int hugetlb_full_walk(struct hugetlb_pte *hpte, + struct vm_area_struct *vma, + unsigned long addr) +{ + struct hstate *h = hstate_vma(vma); + unsigned long sz = huge_page_size(h); + /* + * We must mask the address appropriately so that we pick up the first + * PTE in a contiguous group. + */ + pte_t *ptep = hugetlb_walk(vma, addr & huge_page_mask(h), sz); + + if (!ptep) + return -ENOMEM; + + /* hugetlb_hgm_walk will never fail with these arguments. */ + WARN_ON_ONCE(hugetlb_hgm_walk(hpte, ptep, vma, addr, PAGE_SIZE, false)); + return 0; +} + +/* + * hugetlb_full_walk_alloc - do a high-granularity walk, potentially allocate + * new PTEs. + */ +int hugetlb_full_walk_alloc(struct hugetlb_pte *hpte, + struct vm_area_struct *vma, + unsigned long addr, + unsigned long target_sz) +{ + struct hstate *h = hstate_vma(vma); + unsigned long sz = huge_page_size(h); + /* + * We must mask the address appropriately so that we pick up the first + * PTE in a contiguous group. 
+ */ + pte_t *ptep = huge_pte_alloc(vma->vm_mm, vma, addr & huge_page_mask(h), + sz); + + if (!ptep) + return -ENOMEM; + + return hugetlb_hgm_walk(hpte, ptep, vma, addr, target_sz, true); +} + #ifdef CONFIG_ARCH_WANT_GENERAL_HUGETLB pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned long sz) @@ -7382,6 +7553,48 @@ pte_t *huge_pte_offset(struct mm_struct *mm, return (pte_t *)pmd; } +/* + * hugetlb_walk_step() - Walk the page table one step to resolve the page + * (hugepage or subpage) entry at address @addr. + * + * @sz always points at the final target PTE size (e.g. PAGE_SIZE for the + * lowest level PTE). + * + * @hpte will always remain valid, even if this function fails. + * + * Architectures that implement this function must ensure that if @hpte does + * not change levels, then its PTL must also stay the same. + */ +int hugetlb_walk_step(struct mm_struct *mm, struct hugetlb_pte *hpte, + unsigned long addr, unsigned long sz) +{ + pte_t *ptep; + spinlock_t *ptl; + + switch (hpte->level) { + case HUGETLB_LEVEL_PUD: + ptep = (pte_t *)hugetlb_alloc_pmd(mm, hpte, addr); + if (IS_ERR(ptep)) + return PTR_ERR(ptep); + hugetlb_pte_init(mm, hpte, ptep, PMD_SHIFT, + HUGETLB_LEVEL_PMD); + break; + case HUGETLB_LEVEL_PMD: + ptep = hugetlb_alloc_pte(mm, hpte, addr); + if (IS_ERR(ptep)) + return PTR_ERR(ptep); + ptl = pte_lockptr(mm, (pmd_t *)hpte->ptep); + __hugetlb_pte_init(hpte, ptep, PAGE_SHIFT, + HUGETLB_LEVEL_PTE, ptl); + break; + default: + WARN_ONCE(1, "%s: got invalid level: %d (shift: %d)\n", + __func__, hpte->level, hpte->shift); + return -EINVAL; + } + return 0; +} + /* * Return a mask that can be used to update an address to the last huge * page in a page table page mapping size. 
Used to skip non-present

From patchwork Sat Feb 18 00:27:47 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145380
Date: Sat, 18 Feb 2023 00:27:47 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-15-jthoughton@google.com>
Subject: [PATCH v2 14/46] hugetlb: split PTE markers when doing HGM walks
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
Fix how UFFDIO_CONTINUE and UFFDIO_WRITEPROTECT interact in these two ways:
- UFFDIO_WRITEPROTECT no longer prevents a high-granularity UFFDIO_CONTINUE.
- UFFD-WP PTE markers installed with UFFDIO_WRITEPROTECT will be properly propagated when high-granularity UFFDIO_CONTINUEs are performed.

Note: UFFDIO_WRITEPROTECT is not yet permitted at PAGE_SIZE granularity.

Signed-off-by: James Houghton
Acked-by: Mike Kravetz

diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 810c05feb41f..f74183acc521 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -506,6 +506,30 @@ static bool has_same_uncharge_info(struct file_region *rg, #endif } +static void hugetlb_install_markers_pmd(pmd_t *pmdp, pte_marker marker) +{ + int i; + + for (i = 0; i < PTRS_PER_PMD; ++i) + /* + * WRITE_ONCE not needed because the pud hasn't been + * installed yet. + */ + pmdp[i] = __pmd(pte_val(make_pte_marker(marker))); +} + +static void hugetlb_install_markers_pte(pte_t *ptep, pte_marker marker) +{ + int i; + + for (i = 0; i < PTRS_PER_PTE; ++i) + /* + * WRITE_ONCE not needed because the pmd hasn't been + * installed yet. 
+ */ + ptep[i] = make_pte_marker(marker); +} + /* * hugetlb_alloc_pmd -- Allocate or find a PMD beneath a PUD-level hpte. * @@ -528,23 +552,32 @@ pmd_t *hugetlb_alloc_pmd(struct mm_struct *mm, struct hugetlb_pte *hpte, pmd_t *new; pud_t *pudp; pud_t pud; + bool is_marker; + pte_marker marker; if (hpte->level != HUGETLB_LEVEL_PUD) return ERR_PTR(-EINVAL); pudp = (pud_t *)hpte->ptep; retry: + is_marker = false; pud = READ_ONCE(*pudp); if (likely(pud_present(pud))) return unlikely(pud_leaf(pud)) ? ERR_PTR(-EEXIST) : pmd_offset(pudp, addr); - else if (!pud_none(pud)) + else if (!pud_none(pud)) { /* - * Not present and not none means that a swap entry lives here, - * and we can't get rid of it. + * Not present and not none means that a swap entry lives here. + * If it's a PTE marker, we can deal with it. If it's another + * swap entry, we don't attempt to split it. */ - return ERR_PTR(-EEXIST); + is_marker = is_pte_marker(__pte(pud_val(pud))); + if (!is_marker) + return ERR_PTR(-EEXIST); + + marker = pte_marker_get(pte_to_swp_entry(__pte(pud_val(pud)))); + } new = pmd_alloc_one(mm, addr); if (!new) @@ -557,6 +590,13 @@ pmd_t *hugetlb_alloc_pmd(struct mm_struct *mm, struct hugetlb_pte *hpte, goto retry; } + /* + * Install markers before PUD to avoid races with other + * page tables walks. + */ + if (is_marker) + hugetlb_install_markers_pmd(new, marker); + mm_inc_nr_pmds(mm); smp_wmb(); /* See comment in pmd_install() */ pud_populate(mm, pudp, new); @@ -576,23 +616,32 @@ pte_t *hugetlb_alloc_pte(struct mm_struct *mm, struct hugetlb_pte *hpte, pgtable_t new; pmd_t *pmdp; pmd_t pmd; + bool is_marker; + pte_marker marker; if (hpte->level != HUGETLB_LEVEL_PMD) return ERR_PTR(-EINVAL); pmdp = (pmd_t *)hpte->ptep; retry: + is_marker = false; pmd = READ_ONCE(*pmdp); if (likely(pmd_present(pmd))) return unlikely(pmd_leaf(pmd)) ? 
ERR_PTR(-EEXIST) : pte_offset_kernel(pmdp, addr); - else if (!pmd_none(pmd)) + else if (!pmd_none(pmd)) { /* - * Not present and not none means that a swap entry lives here, - * and we can't get rid of it. + * Not present and not none means that a swap entry lives here. + * If it's a PTE marker, we can deal with it. If it's another + * swap entry, we don't attempt to split it. */ - return ERR_PTR(-EEXIST); + is_marker = is_pte_marker(__pte(pmd_val(pmd))); + if (!is_marker) + return ERR_PTR(-EEXIST); + + marker = pte_marker_get(pte_to_swp_entry(__pte(pmd_val(pmd)))); + } /* * With CONFIG_HIGHPTE, calling `pte_alloc_one` directly may result @@ -613,6 +662,9 @@ pte_t *hugetlb_alloc_pte(struct mm_struct *mm, struct hugetlb_pte *hpte, goto retry; } + if (is_marker) + hugetlb_install_markers_pte(page_address(new), marker); + mm_inc_nr_ptes(mm); smp_wmb(); /* See comment in pmd_install() */ pmd_populate(mm, pmdp, new); @@ -7384,7 +7436,12 @@ static int __hugetlb_hgm_walk(struct mm_struct *mm, struct vm_area_struct *vma, if (!pte_present(pte)) { if (!alloc) return 0; - if (unlikely(!huge_pte_none(pte))) + /* + * In hugetlb_alloc_pmd and hugetlb_alloc_pte, + * we split PTE markers, so we can tolerate + * PTE markers here. 
+ */ + if (unlikely(!huge_pte_none_mostly(pte))) return -EEXIST; } else if (hugetlb_pte_present_leaf(hpte, pte)) return 0;

From patchwork Sat Feb 18 00:27:48 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145381
Date: Sat, 18 Feb 2023 00:27:48 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-16-jthoughton@google.com>
Subject: [PATCH v2 15/46] hugetlb: add make_huge_pte_with_shift
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton

This allows us to make huge PTEs at shifts other than the hstate shift, which will be necessary for high-granularity mappings.
Acked-by: Mike Kravetz
Signed-off-by: James Houghton
Reviewed-by: Mina Almasry
---
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f74183acc521..ed1d806020de 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5110,11 +5110,11 @@ const struct vm_operations_struct hugetlb_vm_ops = {
 	.pagesize = hugetlb_vm_op_pagesize,
 };
 
-static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page,
-			   int writable)
+static pte_t make_huge_pte_with_shift(struct vm_area_struct *vma,
+				      struct page *page, int writable,
+				      int shift)
 {
 	pte_t entry;
-	unsigned int shift = huge_page_shift(hstate_vma(vma));
 
 	if (writable) {
 		entry = huge_pte_mkwrite(huge_pte_mkdirty(mk_pte(page,
@@ -5128,6 +5128,14 @@ static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page,
 	return entry;
 }
 
+static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page,
+			   int writable)
+{
+	unsigned int shift = huge_page_shift(hstate_vma(vma));
+
+	return make_huge_pte_with_shift(vma, page, writable, shift);
+}
+
 static void set_huge_ptep_writable(struct vm_area_struct *vma,
 				   unsigned long address, pte_t *ptep)
 {

From patchwork Sat Feb 18 00:27:49 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145382
Date: Sat, 18 Feb 2023 00:27:49 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-17-jthoughton@google.com>
Subject: [PATCH v2 16/46] hugetlb: make default arch_make_huge_pte understand
 small mappings
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org

This is a simple change: don't create a "huge" PTE if we are making a
regular, PAGE_SIZE PTE.
All architectures that want to implement HGM likely need to be changed
in a similar way if they implement their own version of
arch_make_huge_pte.

Signed-off-by: James Houghton
Reviewed-by: Mike Kravetz
---
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 726d581158b1..b767b6889dea 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -899,7 +899,7 @@ static inline void arch_clear_hugepage_flags(struct page *page) { }
 static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift,
 				       vm_flags_t flags)
 {
-	return pte_mkhuge(entry);
+	return shift > PAGE_SHIFT ? pte_mkhuge(entry) : entry;
 }
 #endif
From patchwork Sat Feb 18 00:27:50 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145383
Date: Sat, 18 Feb 2023 00:27:50 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-18-jthoughton@google.com>
Subject: [PATCH v2 17/46] hugetlbfs: do a full walk to check if vma maps a page
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org

Because it is safe to do so, do a full high-granularity page table walk
to check if the page is mapped.

Signed-off-by: James Houghton
---
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index cfd09f95551b..c0ee69f0418e 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -386,17 +386,24 @@ static void hugetlb_delete_from_page_cache(struct folio *folio)
 static bool hugetlb_vma_maps_page(struct vm_area_struct *vma,
 				  unsigned long addr, struct page *page)
 {
-	pte_t *ptep, pte;
+	pte_t pte;
+	struct hugetlb_pte hpte;
 
-	ptep = hugetlb_walk(vma, addr, huge_page_size(hstate_vma(vma)));
-	if (!ptep)
+	if (hugetlb_full_walk(&hpte, vma, addr))
 		return false;
 
-	pte = huge_ptep_get(ptep);
+	pte = huge_ptep_get(hpte.ptep);
 	if (huge_pte_none(pte) || !pte_present(pte))
 		return false;
 
-	if (pte_page(pte) == page)
+	if (unlikely(!hugetlb_pte_present_leaf(&hpte, pte)))
+		/*
+		 * We raced with someone splitting us, and the only case
+		 * where this is impossible is when the pte was none.
+		 */
+		return false;
+
+	if (compound_head(pte_page(pte)) == page)
 		return true;
 
 	return false;
From patchwork Sat Feb 18 00:27:51 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145384
Date: Sat, 18 Feb 2023 00:27:51 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-19-jthoughton@google.com>
Subject: [PATCH v2 18/46] hugetlb: add HGM support to __unmap_hugepage_range
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org

Enlighten __unmap_hugepage_range to deal with high-granularity mappings.
This doesn't change its API; it still must be called with hugepage
alignment, but it will correctly unmap hugepages that have been mapped
at high granularity.

Eventually, functionality here can be expanded to allow users to call
MADV_DONTNEED on PAGE_SIZE-aligned sections of a hugepage, but that is
not done here.

Introduce hugetlb_remove_rmap to properly decrement mapcount for
high-granularity-mapped HugeTLB pages.
Signed-off-by: James Houghton diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index b46617207c93..31267471760e 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -598,9 +598,9 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb, __tlb_remove_tlb_entry(tlb, ptep, address); \ } while (0) -#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address) \ +#define tlb_remove_huge_tlb_entry(tlb, hpte, address) \ do { \ - unsigned long _sz = huge_page_size(h); \ + unsigned long _sz = hugetlb_pte_size(&hpte); \ if (_sz >= P4D_SIZE) \ tlb_flush_p4d_range(tlb, address, _sz); \ else if (_sz >= PUD_SIZE) \ @@ -609,7 +609,7 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb, tlb_flush_pmd_range(tlb, address, _sz); \ else \ tlb_flush_pte_range(tlb, address, _sz); \ - __tlb_remove_tlb_entry(tlb, ptep, address); \ + __tlb_remove_tlb_entry(tlb, hpte.ptep, address);\ } while (0) /** diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index b767b6889dea..1a1a71868dfd 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -160,6 +160,9 @@ struct hugepage_subpool *hugepage_new_subpool(struct hstate *h, long max_hpages, long min_hpages); void hugepage_put_subpool(struct hugepage_subpool *spool); +void hugetlb_remove_rmap(struct page *subpage, unsigned long shift, + struct hstate *h, struct vm_area_struct *vma); + void hugetlb_dup_vma_private(struct vm_area_struct *vma); void clear_vma_resv_huge_pages(struct vm_area_struct *vma); int hugetlb_sysctl_handler(struct ctl_table *, int, void *, size_t *, loff_t *); diff --git a/mm/hugetlb.c b/mm/hugetlb.c index ed1d806020de..ecf1a28dbaaa 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -120,6 +120,28 @@ enum hugetlb_level hpage_size_to_level(unsigned long sz) return HUGETLB_LEVEL_PGD; } +void hugetlb_remove_rmap(struct page *subpage, unsigned long shift, + struct hstate *h, struct vm_area_struct *vma) +{ + struct page *hpage = compound_head(subpage); + + if 
(shift == huge_page_shift(h)) { + VM_BUG_ON_PAGE(subpage != hpage, subpage); + page_remove_rmap(hpage, vma, true); + } else { + unsigned long nr_subpages = 1UL << (shift - PAGE_SHIFT); + struct page *final_page = &subpage[nr_subpages]; + + VM_BUG_ON_PAGE(HPageVmemmapOptimized(hpage), hpage); + /* + * Decrement the mapcount on each page that is getting + * unmapped. + */ + for (; subpage < final_page; ++subpage) + page_remove_rmap(subpage, vma, false); + } +} + static inline bool subpool_is_free(struct hugepage_subpool *spool) { if (spool->count) @@ -5466,10 +5488,10 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct { struct mm_struct *mm = vma->vm_mm; unsigned long address; - pte_t *ptep; + struct hugetlb_pte hpte; pte_t pte; spinlock_t *ptl; - struct page *page; + struct page *hpage, *subpage; struct hstate *h = hstate_vma(vma); unsigned long sz = huge_page_size(h); unsigned long last_addr_mask; @@ -5479,35 +5501,33 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct BUG_ON(start & ~huge_page_mask(h)); BUG_ON(end & ~huge_page_mask(h)); - /* - * This is a hugetlb vma, all the pte entries should point - * to huge page. 
- */ - tlb_change_page_size(tlb, sz); tlb_start_vma(tlb, vma); last_addr_mask = hugetlb_mask_last_page(h); address = start; - for (; address < end; address += sz) { - ptep = hugetlb_walk(vma, address, sz); - if (!ptep) { - address |= last_addr_mask; + + while (address < end) { + if (hugetlb_full_walk(&hpte, vma, address)) { + address = (address | last_addr_mask) + sz; continue; } - ptl = huge_pte_lock(h, mm, ptep); - if (huge_pmd_unshare(mm, vma, address, ptep)) { + ptl = hugetlb_pte_lock(&hpte); + if (hugetlb_pte_size(&hpte) == sz && + huge_pmd_unshare(mm, vma, address, hpte.ptep)) { spin_unlock(ptl); tlb_flush_pmd_range(tlb, address & PUD_MASK, PUD_SIZE); force_flush = true; address |= last_addr_mask; + address += sz; continue; } - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(hpte.ptep); + if (huge_pte_none(pte)) { spin_unlock(ptl); - continue; + goto next_hpte; } /* @@ -5523,24 +5543,35 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct */ if (pte_swp_uffd_wp_any(pte) && !(zap_flags & ZAP_FLAG_DROP_MARKER)) - set_huge_pte_at(mm, address, ptep, + set_huge_pte_at(mm, address, hpte.ptep, make_pte_marker(PTE_MARKER_UFFD_WP)); else - huge_pte_clear(mm, address, ptep, sz); + huge_pte_clear(mm, address, hpte.ptep, + hugetlb_pte_size(&hpte)); + spin_unlock(ptl); + goto next_hpte; + } + + if (unlikely(!hugetlb_pte_present_leaf(&hpte, pte))) { + /* + * We raced with someone splitting out from under us. + * Retry the walk. + */ spin_unlock(ptl); continue; } - page = pte_page(pte); + subpage = pte_page(pte); + hpage = compound_head(subpage); /* * If a reference page is supplied, it is because a specific * page is being unmapped, not a range. Ensure the page we * are about to unmap is the actual page of interest. 
*/ if (ref_page) { - if (page != ref_page) { + if (hpage != ref_page) { spin_unlock(ptl); - continue; + goto next_hpte; } /* * Mark the VMA as having unmapped its page so that @@ -5550,25 +5581,32 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct set_vma_resv_flags(vma, HPAGE_RESV_UNMAPPED); } - pte = huge_ptep_get_and_clear(mm, address, ptep); - tlb_remove_huge_tlb_entry(h, tlb, ptep, address); + pte = huge_ptep_get_and_clear(mm, address, hpte.ptep); + tlb_change_page_size(tlb, hugetlb_pte_size(&hpte)); + tlb_remove_huge_tlb_entry(tlb, hpte, address); if (huge_pte_dirty(pte)) - set_page_dirty(page); + set_page_dirty(hpage); /* Leave a uffd-wp pte marker if needed */ if (huge_pte_uffd_wp(pte) && !(zap_flags & ZAP_FLAG_DROP_MARKER)) - set_huge_pte_at(mm, address, ptep, + set_huge_pte_at(mm, address, hpte.ptep, make_pte_marker(PTE_MARKER_UFFD_WP)); - hugetlb_count_sub(pages_per_huge_page(h), mm); - page_remove_rmap(page, vma, true); + hugetlb_count_sub(hugetlb_pte_size(&hpte)/PAGE_SIZE, mm); + hugetlb_remove_rmap(subpage, hpte.shift, h, vma); spin_unlock(ptl); - tlb_remove_page_size(tlb, page, huge_page_size(h)); /* - * Bail out after unmapping reference page if supplied + * Lower the reference count on the head page. + */ + tlb_remove_page_size(tlb, hpage, sz); + /* + * Bail out after unmapping reference page if supplied, + * and there's only one PTE mapping this page. 
*/ - if (ref_page) + if (ref_page && hugetlb_pte_size(&hpte) == sz) break; +next_hpte: + address += hugetlb_pte_size(&hpte); } tlb_end_vma(tlb, vma); @@ -5846,7 +5884,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, /* Break COW or unshare */ huge_ptep_clear_flush(vma, haddr, ptep); mmu_notifier_invalidate_range(mm, range.start, range.end); - page_remove_rmap(old_page, vma, true); + hugetlb_remove_rmap(old_page, huge_page_shift(h), h, vma); hugepage_add_new_anon_rmap(new_folio, vma, haddr); set_huge_pte_at(mm, haddr, ptep, make_huge_pte(vma, &new_folio->page, !unshare)); From patchwork Sat Feb 18 00:27:52 2023 X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13145385
Date: Sat, 18 Feb 2023 00:27:52 +0000 In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com> References: <20230218002819.1486479-1-jthoughton@google.com> Message-ID: <20230218002819.1486479-20-jthoughton@google.com> Subject: [PATCH v2 19/46] hugetlb:
add HGM support to hugetlb_change_protection From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu , Andrew Morton Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Frank van der Linden , Jiaqi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
The main change here is to do a high-granularity walk and to pull the shift from the walk (not from the hstate). Signed-off-by: James Houghton diff --git a/mm/hugetlb.c b/mm/hugetlb.c index ecf1a28dbaaa..7321c6602d6f 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6900,15 +6900,15 @@ long hugetlb_change_protection(struct vm_area_struct *vma, { struct mm_struct *mm = vma->vm_mm; unsigned long start = address; - pte_t *ptep; pte_t pte; struct hstate *h = hstate_vma(vma); - long pages = 0, psize = huge_page_size(h); + long base_pages = 0, psize = huge_page_size(h); bool shared_pmd = false; struct mmu_notifier_range range; unsigned long last_addr_mask; bool uffd_wp = cp_flags & MM_CP_UFFD_WP; bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE; + struct hugetlb_pte hpte; /* * In the case of shared PMDs, the area to flush could be beyond @@ -6926,39 +6926,43 @@ long hugetlb_change_protection(struct vm_area_struct *vma, hugetlb_vma_lock_write(vma); i_mmap_lock_write(vma->vm_file->f_mapping); last_addr_mask = hugetlb_mask_last_page(h); - for (; address < end; address += psize) { + while (address <
end) { spinlock_t *ptl; - ptep = hugetlb_walk(vma, address, psize); - if (!ptep) { + if (hugetlb_full_walk(&hpte, vma, address)) { if (!uffd_wp) { - address |= last_addr_mask; + address = (address | last_addr_mask) + psize; continue; } /* * Userfaultfd wr-protect requires pgtable * pre-allocations to install pte markers. + * + * Use hugetlb_full_walk_alloc to allocate + * the hstate-level PTE. */ - ptep = huge_pte_alloc(mm, vma, address, psize); - if (!ptep) { - pages = -ENOMEM; + if (hugetlb_full_walk_alloc(&hpte, vma, + address, psize)) { + base_pages = -ENOMEM; break; } } - ptl = huge_pte_lock(h, mm, ptep); - if (huge_pmd_unshare(mm, vma, address, ptep)) { + + ptl = hugetlb_pte_lock(&hpte); + if (hugetlb_pte_size(&hpte) == psize && + huge_pmd_unshare(mm, vma, address, hpte.ptep)) { /* * When uffd-wp is enabled on the vma, unshare * shouldn't happen at all. Warn about it if it * happened due to some reason. */ WARN_ON_ONCE(uffd_wp || uffd_wp_resolve); - pages++; + base_pages += psize / PAGE_SIZE; spin_unlock(ptl); shared_pmd = true; - address |= last_addr_mask; + address = (address | last_addr_mask) + psize; continue; } - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(hpte.ptep); if (unlikely(is_hugetlb_entry_hwpoisoned(pte))) { /* Nothing to do. */ } else if (unlikely(is_hugetlb_entry_migration(pte))) { @@ -6974,7 +6978,7 @@ long hugetlb_change_protection(struct vm_area_struct *vma, entry = make_readable_migration_entry( swp_offset(entry)); newpte = swp_entry_to_pte(entry); - pages++; + base_pages += hugetlb_pte_size(&hpte) / PAGE_SIZE; } if (uffd_wp) @@ -6982,34 +6986,49 @@ long hugetlb_change_protection(struct vm_area_struct *vma, else if (uffd_wp_resolve) newpte = pte_swp_clear_uffd_wp(newpte); if (!pte_same(pte, newpte)) - set_huge_pte_at(mm, address, ptep, newpte); + set_huge_pte_at(mm, address, hpte.ptep, newpte); } else if (unlikely(is_pte_marker(pte))) { /* No other markers apply for now. 
*/ WARN_ON_ONCE(!pte_marker_uffd_wp(pte)); if (uffd_wp_resolve) /* Safe to modify directly (non-present->none). */ - huge_pte_clear(mm, address, ptep, psize); + huge_pte_clear(mm, address, hpte.ptep, + hugetlb_pte_size(&hpte)); } else if (!huge_pte_none(pte)) { pte_t old_pte; - unsigned int shift = huge_page_shift(hstate_vma(vma)); + unsigned int shift = hpte.shift; + + if (unlikely(!hugetlb_pte_present_leaf(&hpte, pte))) { + /* + * Someone split the PTE from under us, so retry + * the walk, + */ + spin_unlock(ptl); + continue; + } - old_pte = huge_ptep_modify_prot_start(vma, address, ptep); + old_pte = huge_ptep_modify_prot_start( + vma, address, hpte.ptep); pte = huge_pte_modify(old_pte, newprot); - pte = arch_make_huge_pte(pte, shift, vma->vm_flags); + pte = arch_make_huge_pte( + pte, shift, vma->vm_flags); if (uffd_wp) pte = huge_pte_mkuffd_wp(pte); else if (uffd_wp_resolve) pte = huge_pte_clear_uffd_wp(pte); - huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte); - pages++; + huge_ptep_modify_prot_commit( + vma, address, hpte.ptep, + old_pte, pte); + base_pages += hugetlb_pte_size(&hpte) / PAGE_SIZE; } else { /* None pte */ if (unlikely(uffd_wp)) /* Safe to modify directly (none->non-present). */ - set_huge_pte_at(mm, address, ptep, + set_huge_pte_at(mm, address, hpte.ptep, make_pte_marker(PTE_MARKER_UFFD_WP)); } spin_unlock(ptl); + address += hugetlb_pte_size(&hpte); } /* * Must flush TLB before releasing i_mmap_rwsem: x86's huge_pmd_unshare @@ -7032,7 +7051,7 @@ long hugetlb_change_protection(struct vm_area_struct *vma, hugetlb_vma_unlock_write(vma); mmu_notifier_invalidate_range_end(&range); - return pages > 0 ? (pages << h->order) : pages; + return base_pages; } /* Return true if reservation was successful, false otherwise. 
*/ From patchwork Sat Feb 18 00:27:53 2023 X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13145386
Date: Sat, 18 Feb 2023 00:27:53 +0000 In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com> References: <20230218002819.1486479-1-jthoughton@google.com> Message-ID: <20230218002819.1486479-21-jthoughton@google.com> Subject: [PATCH v2 20/46] hugetlb: add HGM support to follow_hugetlb_page From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu , Andrew Morton Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr .
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Frank van der Linden , Jiaqi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
Enable high-granularity mapping support in GUP. In case it is confusing: pfn_offset is the offset (in PAGE_SIZE units) of vaddr within the subpage that hpte points to.
Signed-off-by: James Houghton diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 7321c6602d6f..c26b040f4fb5 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6634,11 +6634,9 @@ static void record_subpages_vmas(struct page *page, struct vm_area_struct *vma, } static inline bool __follow_hugetlb_must_fault(struct vm_area_struct *vma, - unsigned int flags, pte_t *pte, + unsigned int flags, pte_t pteval, bool *unshare) { - pte_t pteval = huge_ptep_get(pte); - *unshare = false; if (is_swap_pte(pteval)) return true; @@ -6713,11 +6711,13 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, int err = -EFAULT, refs; while (vaddr < vma->vm_end && remainder) { - pte_t *pte; + pte_t *ptep, pte; spinlock_t *ptl = NULL; bool unshare = false; int absent; - struct page *page; + unsigned long pages_per_hpte; + struct page *page, *subpage; + struct hugetlb_pte hpte; /* * If we have a pending SIGKILL, don't keep faulting pages and @@ -6734,13 +6734,19 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, * each hugepage. We have to make sure we get the * first, for the page indexing below to work. * - * Note that page table lock is not held when pte is null. + * hugetlb_full_walk will mask the address appropriately. + * + * Note that page table lock is not held when ptep is null. 
*/ - pte = hugetlb_walk(vma, vaddr & huge_page_mask(h), - huge_page_size(h)); - if (pte) - ptl = huge_pte_lock(h, mm, pte); - absent = !pte || huge_pte_none(huge_ptep_get(pte)); + if (hugetlb_full_walk(&hpte, vma, vaddr)) { + ptep = NULL; + absent = true; + } else { + ptl = hugetlb_pte_lock(&hpte); + ptep = hpte.ptep; + pte = huge_ptep_get(ptep); + absent = huge_pte_none(pte); + } /* * When coredumping, it suits get_dump_page if we just return @@ -6751,13 +6757,21 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, */ if (absent && (flags & FOLL_DUMP) && !hugetlbfs_pagecache_present(h, vma, vaddr)) { - if (pte) + if (ptep) spin_unlock(ptl); hugetlb_vma_unlock_read(vma); remainder = 0; break; } + if (!absent && pte_present(pte) && + !hugetlb_pte_present_leaf(&hpte, pte)) { + /* We raced with someone splitting the PTE, so retry. */ + spin_unlock(ptl); + hugetlb_vma_unlock_read(vma); + continue; + } + /* * We need call hugetlb_fault for both hugepages under migration * (in which case hugetlb_fault waits for the migration,) and @@ -6773,7 +6787,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, vm_fault_t ret; unsigned int fault_flags = 0; - if (pte) + if (ptep) spin_unlock(ptl); hugetlb_vma_unlock_read(vma); @@ -6822,8 +6836,10 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, continue; } - pfn_offset = (vaddr & ~huge_page_mask(h)) >> PAGE_SHIFT; - page = pte_page(huge_ptep_get(pte)); + pfn_offset = (vaddr & ~hugetlb_pte_mask(&hpte)) >> PAGE_SHIFT; + subpage = pte_page(pte); + pages_per_hpte = hugetlb_pte_size(&hpte) / PAGE_SIZE; + page = compound_head(subpage); VM_BUG_ON_PAGE((flags & FOLL_PIN) && PageAnon(page) && !PageAnonExclusive(page), page); @@ -6833,22 +6849,22 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, * and skip the same_page loop below. 
*/ if (!pages && !vmas && !pfn_offset && - (vaddr + huge_page_size(h) < vma->vm_end) && - (remainder >= pages_per_huge_page(h))) { - vaddr += huge_page_size(h); - remainder -= pages_per_huge_page(h); - i += pages_per_huge_page(h); + (vaddr + hugetlb_pte_size(&hpte) < vma->vm_end) && + (remainder >= pages_per_hpte)) { + vaddr += hugetlb_pte_size(&hpte); + remainder -= pages_per_hpte; + i += pages_per_hpte; spin_unlock(ptl); hugetlb_vma_unlock_read(vma); continue; } /* vaddr may not be aligned to PAGE_SIZE */ - refs = min3(pages_per_huge_page(h) - pfn_offset, remainder, + refs = min3(pages_per_hpte - pfn_offset, remainder, (vma->vm_end - ALIGN_DOWN(vaddr, PAGE_SIZE)) >> PAGE_SHIFT); if (pages || vmas) - record_subpages_vmas(nth_page(page, pfn_offset), + record_subpages_vmas(nth_page(subpage, pfn_offset), vma, refs, likely(pages) ? pages + i : NULL, vmas ? vmas + i : NULL); From patchwork Sat Feb 18 00:27:54 2023 X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13145387
Date: Sat, 18 Feb 2023 00:27:54 +0000 In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com> References: <20230218002819.1486479-1-jthoughton@google.com> Message-ID: <20230218002819.1486479-22-jthoughton@google.com> Subject: [PATCH v2 21/46] hugetlb: add HGM support to hugetlb_follow_page_mask From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu , Andrew Morton Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Frank van der Linden , Jiaqi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
The change here is very simple: do a high-granularity walk.

Signed-off-by: James Houghton

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c26b040f4fb5..693332b7e186 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6655,11 +6655,10 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 				unsigned long address, unsigned int flags)
 {
 	struct hstate *h = hstate_vma(vma);
-	struct mm_struct *mm = vma->vm_mm;
-	unsigned long haddr = address & huge_page_mask(h);
 	struct page *page = NULL;
 	spinlock_t *ptl;
-	pte_t *pte, entry;
+	pte_t entry;
+	struct hugetlb_pte hpte;
 
 	/*
 	 * FOLL_PIN is not supported for follow_page(). Ordinary GUP goes via
@@ -6669,13 +6668,24 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 		return NULL;
 
 	hugetlb_vma_lock_read(vma);
-	pte = hugetlb_walk(vma, haddr, huge_page_size(h));
-	if (!pte)
+
+	if (hugetlb_full_walk(&hpte, vma, address))
 		goto out_unlock;
 
-	ptl = huge_pte_lock(h, mm, pte);
-	entry = huge_ptep_get(pte);
+retry:
+	ptl = hugetlb_pte_lock(&hpte);
+	entry = huge_ptep_get(hpte.ptep);
 	if (pte_present(entry)) {
+		if (unlikely(!hugetlb_pte_present_leaf(&hpte, entry))) {
+			/*
+			 * We raced with someone splitting from under us.
+			 * Keep walking to get to the real leaf.
+			 */
+			spin_unlock(ptl);
+			hugetlb_full_walk_continue(&hpte, vma, address);
+			goto retry;
+		}
+
 		page = pte_page(entry) +
 			((address & ~huge_page_mask(h)) >> PAGE_SHIFT);
 		/*

From patchwork Sat Feb 18 00:27:55 2023
Date: Sat, 18 Feb 2023 00:27:55 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-23-jthoughton@google.com>
Subject: [PATCH v2 22/46] hugetlb: add HGM support to copy_hugetlb_page_range
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr . David Alan Gilbert",
 "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin,
 Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, James Houghton
This allows fork() to work with high-granularity mappings. The page table
structure is copied such that partially mapped regions will remain partially
mapped in the same way for the new process.

A page's reference count is incremented for *each* portion of it that is
mapped in the page table. For example, if you have a PMD-mapped 1G page, the
reference count will be incremented by 512.

mapcount is handled similarly to THPs: if you are completely mapping a
hugepage, then the compound_mapcount is incremented. If you are mapping only
a part of it, the subpages that are getting mapped will have their mapcounts
incremented.
Signed-off-by: James Houghton

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 1a1a71868dfd..2fe1eb6897d4 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -162,6 +162,8 @@ void hugepage_put_subpool(struct hugepage_subpool *spool);
 
 void hugetlb_remove_rmap(struct page *subpage, unsigned long shift,
 			 struct hstate *h, struct vm_area_struct *vma);
+void hugetlb_add_file_rmap(struct page *subpage, unsigned long shift,
+			   struct hstate *h, struct vm_area_struct *vma);
 
 void hugetlb_dup_vma_private(struct vm_area_struct *vma);
 void clear_vma_resv_huge_pages(struct vm_area_struct *vma);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 693332b7e186..210c6f2b16a5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -141,6 +141,37 @@ void hugetlb_remove_rmap(struct page *subpage, unsigned long shift,
 			page_remove_rmap(subpage, vma, false);
 	}
 }
 
+/*
+ * hugetlb_add_file_rmap() - increment the mapcounts for file-backed hugetlb
+ * pages appropriately.
+ *
+ * For pages that are being mapped with their hstate-level PTE (e.g., a 1G page
+ * being mapped with a 1G PUD), then we increment the compound_mapcount for the
+ * head page.
+ *
+ * For pages that are being mapped with high-granularity, we increment the
+ * mapcounts for the individual subpages that are getting mapped.
+ */
+void hugetlb_add_file_rmap(struct page *subpage, unsigned long shift,
+			   struct hstate *h, struct vm_area_struct *vma)
+{
+	struct page *hpage = compound_head(subpage);
+
+	if (shift == huge_page_shift(h)) {
+		VM_BUG_ON_PAGE(subpage != hpage, subpage);
+		page_add_file_rmap(hpage, vma, true);
+	} else {
+		unsigned long nr_subpages = 1UL << (shift - PAGE_SHIFT);
+		struct page *final_page = &subpage[nr_subpages];
+
+		VM_BUG_ON_PAGE(HPageVmemmapOptimized(hpage), hpage);
+		/*
+		 * Increment the mapcount on each page that is getting mapped.
+		 */
+		for (; subpage < final_page; ++subpage)
+			page_add_file_rmap(subpage, vma, false);
+	}
+}
 
 static inline bool subpool_is_free(struct hugepage_subpool *spool)
 {
@@ -5210,7 +5241,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 			    struct vm_area_struct *src_vma)
 {
 	pte_t *src_pte, *dst_pte, entry;
-	struct page *ptepage;
+	struct hugetlb_pte src_hpte, dst_hpte;
+	struct page *ptepage, *hpage;
 	unsigned long addr;
 	bool cow = is_cow_mapping(src_vma->vm_flags);
 	struct hstate *h = hstate_vma(src_vma);
@@ -5238,18 +5270,24 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	}
 
 	last_addr_mask = hugetlb_mask_last_page(h);
-	for (addr = src_vma->vm_start; addr < src_vma->vm_end; addr += sz) {
+	addr = src_vma->vm_start;
+	while (addr < src_vma->vm_end) {
 		spinlock_t *src_ptl, *dst_ptl;
-		src_pte = hugetlb_walk(src_vma, addr, sz);
-		if (!src_pte) {
-			addr |= last_addr_mask;
+		unsigned long hpte_sz;
+
+		if (hugetlb_full_walk(&src_hpte, src_vma, addr)) {
+			addr = (addr | last_addr_mask) + sz;
 			continue;
 		}
-		dst_pte = huge_pte_alloc(dst, dst_vma, addr, sz);
-		if (!dst_pte) {
-			ret = -ENOMEM;
+
+		ret = hugetlb_full_walk_alloc(&dst_hpte, dst_vma, addr,
+					      hugetlb_pte_size(&src_hpte));
+		if (ret)
 			break;
-		}
+
+		src_pte = src_hpte.ptep;
+		dst_pte = dst_hpte.ptep;
+
+		hpte_sz = hugetlb_pte_size(&src_hpte);
 
 		/*
 		 * If the pagetables are shared don't copy or take references.
@@ -5259,13 +5297,14 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 		 * another vma. So page_count of ptep page is checked instead
 		 * to reliably determine whether pte is shared.
 		 */
-		if (page_count(virt_to_page(dst_pte)) > 1) {
-			addr |= last_addr_mask;
+		if (hugetlb_pte_size(&dst_hpte) == sz &&
+		    page_count(virt_to_page(dst_pte)) > 1) {
+			addr = (addr | last_addr_mask) + sz;
 			continue;
 		}
 
-		dst_ptl = huge_pte_lock(h, dst, dst_pte);
-		src_ptl = huge_pte_lockptr(huge_page_shift(h), src, src_pte);
+		dst_ptl = hugetlb_pte_lock(&dst_hpte);
+		src_ptl = hugetlb_pte_lockptr(&src_hpte);
 		spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
 		entry = huge_ptep_get(src_pte);
 again:
@@ -5309,10 +5348,15 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 			 */
 			if (userfaultfd_wp(dst_vma))
 				set_huge_pte_at(dst, addr, dst_pte, entry);
+		} else if (!hugetlb_pte_present_leaf(&src_hpte, entry)) {
+			/* Retry the walk. */
+			spin_unlock(src_ptl);
+			spin_unlock(dst_ptl);
+			continue;
 		} else {
-			entry = huge_ptep_get(src_pte);
 			ptepage = pte_page(entry);
-			get_page(ptepage);
+			hpage = compound_head(ptepage);
+			get_page(hpage);
 
 			/*
 			 * Failing to duplicate the anon rmap is a rare case
@@ -5324,13 +5368,34 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 			 * need to be without the pgtable locks since we could
 			 * sleep during the process.
 			 */
-			if (!PageAnon(ptepage)) {
-				page_add_file_rmap(ptepage, src_vma, true);
-			} else if (page_try_dup_anon_rmap(ptepage, true,
+			if (!PageAnon(hpage)) {
+				hugetlb_add_file_rmap(ptepage,
+						src_hpte.shift, h, src_vma);
+			}
+			/*
+			 * It is currently impossible to get anonymous HugeTLB
+			 * high-granularity mappings, so we use 'hpage' here.
+			 *
+			 * This will need to be changed when HGM support for
+			 * anon mappings is added.
+			 */
+			else if (page_try_dup_anon_rmap(hpage, true,
 							  src_vma)) {
 				pte_t src_pte_old = entry;
 				struct folio *new_folio;
 
+				/*
+				 * If we are mapped at high granularity, we
+				 * may end up allocating lots and lots of
+				 * hugepages when we only need one. Bail out
+				 * now.
+				 */
+				if (hugetlb_pte_size(&src_hpte) != sz) {
+					put_page(hpage);
+					ret = -EINVAL;
+					break;
+				}
+
 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
 				/* Do not use reserve as it's private owned */
@@ -5342,7 +5407,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 				}
 				copy_user_huge_page(&new_folio->page, ptepage,
 						    addr, dst_vma, npages);
-				put_page(ptepage);
+				put_page(hpage);
 
 				/* Install the new hugetlb folio if src pte stable */
 				dst_ptl = huge_pte_lock(h, dst, dst_pte);
@@ -5360,6 +5425,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 				hugetlb_install_folio(dst_vma, dst_pte, addr, new_folio);
 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
+				addr += hugetlb_pte_size(&src_hpte);
 				continue;
 			}
 
@@ -5376,10 +5442,13 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 			}
 
 			set_huge_pte_at(dst, addr, dst_pte, entry);
-			hugetlb_count_add(npages, dst);
+			hugetlb_count_add(
+					hugetlb_pte_size(&dst_hpte) / PAGE_SIZE,
+					dst);
 		}
 		spin_unlock(src_ptl);
 		spin_unlock(dst_ptl);
+		addr += hugetlb_pte_size(&src_hpte);
 	}
 
 	if (cow) {

From patchwork Sat Feb 18 00:27:56 2023
Date: Sat, 18 Feb 2023 00:27:56 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-24-jthoughton@google.com>
Subject: [PATCH v2 23/46] hugetlb: add HGM support to move_hugetlb_page_tables
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr . David Alan Gilbert",
 "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin,
 Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, James Houghton
This is very similar to the support that was added to
copy_hugetlb_page_range.
We simply do a high-granularity walk now, and most of the rest of the code
stays the same.

Signed-off-by: James Houghton

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 210c6f2b16a5..6c4678b7a07d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5461,16 +5461,16 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	return ret;
 }
 
-static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr,
-			  unsigned long new_addr, pte_t *src_pte, pte_t *dst_pte)
+static void move_hugetlb_pte(struct vm_area_struct *vma, unsigned long old_addr,
+			     unsigned long new_addr, struct hugetlb_pte *src_hpte,
+			     struct hugetlb_pte *dst_hpte)
 {
-	struct hstate *h = hstate_vma(vma);
 	struct mm_struct *mm = vma->vm_mm;
 	spinlock_t *src_ptl, *dst_ptl;
 	pte_t pte;
 
-	dst_ptl = huge_pte_lock(h, mm, dst_pte);
-	src_ptl = huge_pte_lockptr(huge_page_shift(h), mm, src_pte);
+	dst_ptl = hugetlb_pte_lock(dst_hpte);
+	src_ptl = hugetlb_pte_lockptr(src_hpte);
 
 	/*
 	 * We don't have to worry about the ordering of src and dst ptlocks
@@ -5479,8 +5479,8 @@ static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr,
 	if (src_ptl != dst_ptl)
 		spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
 
-	pte = huge_ptep_get_and_clear(mm, old_addr, src_pte);
-	set_huge_pte_at(mm, new_addr, dst_pte, pte);
+	pte = huge_ptep_get_and_clear(mm, old_addr, src_hpte->ptep);
+	set_huge_pte_at(mm, new_addr, dst_hpte->ptep, pte);
 
 	if (src_ptl != dst_ptl)
 		spin_unlock(src_ptl);
@@ -5498,9 +5498,9 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long old_end = old_addr + len;
 	unsigned long last_addr_mask;
-	pte_t *src_pte, *dst_pte;
 	struct mmu_notifier_range range;
 	bool shared_pmd = false;
+	struct hugetlb_pte src_hpte, dst_hpte;
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, old_addr,
 				old_end);
@@ -5516,28 +5516,35 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
 	/* Prevent race with file truncation */
 	hugetlb_vma_lock_write(vma);
 	i_mmap_lock_write(mapping);
-	for (; old_addr < old_end; old_addr += sz, new_addr += sz) {
-		src_pte = hugetlb_walk(vma, old_addr, sz);
-		if (!src_pte) {
-			old_addr |= last_addr_mask;
-			new_addr |= last_addr_mask;
+	while (old_addr < old_end) {
+		if (hugetlb_full_walk(&src_hpte, vma, old_addr)) {
+			/* The hstate-level PTE wasn't allocated. */
+			old_addr = (old_addr | last_addr_mask) + sz;
+			new_addr = (new_addr | last_addr_mask) + sz;
 			continue;
 		}
-		if (huge_pte_none(huge_ptep_get(src_pte)))
+
+		if (huge_pte_none(huge_ptep_get(src_hpte.ptep))) {
+			old_addr += hugetlb_pte_size(&src_hpte);
+			new_addr += hugetlb_pte_size(&src_hpte);
 			continue;
+		}
 
-		if (huge_pmd_unshare(mm, vma, old_addr, src_pte)) {
+		if (hugetlb_pte_size(&src_hpte) == sz &&
+		    huge_pmd_unshare(mm, vma, old_addr, src_hpte.ptep)) {
 			shared_pmd = true;
-			old_addr |= last_addr_mask;
-			new_addr |= last_addr_mask;
+			old_addr = (old_addr | last_addr_mask) + sz;
+			new_addr = (new_addr | last_addr_mask) + sz;
 			continue;
 		}
 
-		dst_pte = huge_pte_alloc(mm, new_vma, new_addr, sz);
-		if (!dst_pte)
+		if (hugetlb_full_walk_alloc(&dst_hpte, new_vma, new_addr,
+					    hugetlb_pte_size(&src_hpte)))
 			break;
 
-		move_huge_pte(vma, old_addr, new_addr, src_pte, dst_pte);
+		move_hugetlb_pte(vma, old_addr, new_addr, &src_hpte, &dst_hpte);
+		old_addr += hugetlb_pte_size(&src_hpte);
+		new_addr += hugetlb_pte_size(&src_hpte);
 	}
 
 	if (shared_pmd)

From patchwork Sat Feb 18 00:27:57 2023
Received: by kanga.kvack.org (Postfix, from userid 40) id 3A06C280002; Fri, 17 Feb 2023 19:29:10 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1C1B3280014; Fri, 17 Feb 2023 19:29:10 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 0827C280002 for ; Fri, 17 Feb 2023 19:29:10 -0500 (EST) Received: from smtpin12.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id D8A201A0511 for ; Sat, 18 Feb 2023 00:29:09 +0000 (UTC) X-FDA: 80478528018.12.96F9E5C Received: from mail-yb1-f202.google.com (mail-yb1-f202.google.com [209.85.219.202]) by imf02.hostedemail.com (Postfix) with ESMTP id 2004A80016 for ; Sat, 18 Feb 2023 00:29:07 +0000 (UTC) Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=L30EdOY7; spf=pass (imf02.hostedemail.com: domain of 30xvwYwoKCPAblZgmYZlgfYggYdW.Ugedafmp-eecnSUc.gjY@flex--jthoughton.bounces.google.com designates 209.85.219.202 as permitted sender) smtp.mailfrom=30xvwYwoKCPAblZgmYZlgfYggYdW.Ugedafmp-eecnSUc.gjY@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1676680148; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=WTnKSuVZCt+rhuzvxEm8L73SzZuvMukxFScaxZ9D8Co=; b=cj8CNgqZJ+GYf0o+XR06lN4sHAbf/Y5vmCyjA+rsJNntt2sNqAh2k2lqucpJjIBqRIxQSE 0q1PnePZ22zbUO5CnL9BkFl80en9y+dfwmYBybvPwhQLiiW8zBitFZfBP7g6MrsoYEmWuj xGftmargTfmXrq1imKLKh+u1mEspdok= ARC-Authentication-Results: i=1; imf02.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 
header.b=L30EdOY7; spf=pass (imf02.hostedemail.com: domain of 30xvwYwoKCPAblZgmYZlgfYggYdW.Ugedafmp-eecnSUc.gjY@flex--jthoughton.bounces.google.com designates 209.85.219.202 as permitted sender) smtp.mailfrom=30xvwYwoKCPAblZgmYZlgfYggYdW.Ugedafmp-eecnSUc.gjY@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1676680148; a=rsa-sha256; cv=none; b=u/zdbgDHGE7g26i3IUEQtcMUMlDWdTxa0lp4RowU3wRmvxzVChtl9+HpXxKPQglaA0M42V hzUUeVPxowgInCHKe+jnA71H9l6fQO4roEtV3NtuoDMBhQH4vKJrOj1GDqlfZkCJHOtffc BRQ+WIyxXLM+cLyT3zTvBmrbMFUK70Y= Received: by mail-yb1-f202.google.com with SMTP id b15-20020a252e4f000000b008ee1c76c25dso2086327ybn.11 for ; Fri, 17 Feb 2023 16:29:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=WTnKSuVZCt+rhuzvxEm8L73SzZuvMukxFScaxZ9D8Co=; b=L30EdOY7ZdwK//mLU+7411LW4r7AXP+dL/wo9+jXO/WcJ04CoJuWdXFBiiCSNGb46t sBaVLxi21hhAwKqlQ42PTpoktSV3w8ZCG6GtaV6zNLoCZsWyIozI7LT3SGds8vUSGtvL 8bn+t4XSOU3//vae5pDb5V2Wt0nqx+ASlUpTQxUcwOKp6auK0QYbvPxXLS//Whkg8Xep plK+8niTC21BXzXzAR2uTAn/GBxuv7yY3ZRhMgy6+GQbwBq7yNFIB3yLNd8/V4O+5B+l K1VKQkZL1YOcNbEpaCrjieJj1keaKiBGGAuUTdjO/zQEfWTY5Cm/wW0Vm3cp6E52Cyi6 zD3w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=WTnKSuVZCt+rhuzvxEm8L73SzZuvMukxFScaxZ9D8Co=; b=T5638QFz5nTduTCkFpVS8FtcIs3Axjia9qxgM4hkMrqsXdufv8RSlhvIkO/d9Sh9az SrgxIV60pvi/AJzdurqLCSpUztuwgeSNHymT6wS+G8AwTMPVTziZh0FnfOyPcF3c3fXO W1E3Arl92kscIicsJdsvqE1Mtnn8DLn6Lry8eipCtd/2qZc6DH7fRjjxG8Vg6mtgGgG6 t3nx9zci5n7mX/wznzySB8zvJGg9HU+WLx+BywAmgpSfcJwEIJEMyRTNDyE3p7pqMkEs pHVtG0v8lwrXncbyZH8izO0BP4YfZN+wk6hM8JkXIXmwHpDlV2Iax4f3MnCs37PQA+J7 V2yA== 
Date: Sat, 18 Feb 2023 00:27:57 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20230218002819.1486479-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog
Message-ID: <20230218002819.1486479-25-jthoughton@google.com>
Subject: [PATCH v2 24/46] hugetlb: add HGM support to hugetlb_fault and hugetlb_no_page
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
Update the page fault handler to support high-granularity page faults. While handling a page fault on a partially-mapped HugeTLB page, if the PTE we find with hugetlb_pte_walk is none, we replace it with a leaf-level PTE to map the page. To give some examples:

1. For a completely unmapped 1G page, it will be mapped with a 1G PUD.
2. For a 1G page that has its first 512M mapped, any faults on the unmapped sections will result in 2M PMDs mapping each unmapped 2M section.
3. For a 1G page that has only its first 4K mapped, a page fault on its second 4K section will get a 4K PTE to map it.

Unless high-granularity mappings are created via UFFDIO_CONTINUE, it is impossible for hugetlb_fault to create high-granularity mappings. This commit does not handle hugetlb_wp right now, nor does it handle HugeTLB page migration and swap entries.

The BUG_ON in huge_pte_alloc is removed, as it is no longer valid when HGM is possible: HGM can be disabled if the VMA lock cannot be allocated after a VMA is split, yet high-granularity mappings may still exist.

Signed-off-by: James Houghton

diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 6c4678b7a07d..86cd51beb02c 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -173,6 +173,18 @@ void hugetlb_add_file_rmap(struct page *subpage, unsigned long shift, } } +/* + * Find the subpage that corresponds to `addr` in `folio`. + */ +static struct page *hugetlb_find_subpage(struct hstate *h, struct folio *folio, + unsigned long addr) +{ + size_t idx = (addr & ~huge_page_mask(h))/PAGE_SIZE; + + BUG_ON(idx >= pages_per_huge_page(h)); + return folio_page(folio, idx); +} + static inline bool subpool_is_free(struct hugepage_subpool *spool) { if (spool->count) @@ -6072,14 +6084,14 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_area_struct *vma, * Recheck pte with pgtable lock.
Returns true if pte didn't change, or * false if pte changed or is changing. */ -static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, - pte_t *ptep, pte_t old_pte) +static bool hugetlb_pte_stable(struct hstate *h, struct hugetlb_pte *hpte, + pte_t old_pte) { spinlock_t *ptl; bool same; - ptl = huge_pte_lock(h, mm, ptep); - same = pte_same(huge_ptep_get(ptep), old_pte); + ptl = hugetlb_pte_lock(hpte); + same = pte_same(huge_ptep_get(hpte->ptep), old_pte); spin_unlock(ptl); return same; @@ -6088,7 +6100,7 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, static vm_fault_t hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma, struct address_space *mapping, pgoff_t idx, - unsigned long address, pte_t *ptep, + unsigned long address, struct hugetlb_pte *hpte, pte_t old_pte, unsigned int flags) { struct hstate *h = hstate_vma(vma); @@ -6096,10 +6108,12 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, int anon_rmap = 0; unsigned long size; struct folio *folio; + struct page *subpage; pte_t new_pte; spinlock_t *ptl; unsigned long haddr = address & huge_page_mask(h); bool new_folio, new_pagecache_folio = false; + unsigned long haddr_hgm = address & hugetlb_pte_mask(hpte); u32 hash = hugetlb_fault_mutex_hash(mapping, idx); /* @@ -6143,7 +6157,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, * never happen on the page after UFFDIO_COPY has * correctly installed the page and returned. */ - if (!hugetlb_pte_stable(h, mm, ptep, old_pte)) { + if (!hugetlb_pte_stable(h, hpte, old_pte)) { ret = 0; goto out; } @@ -6167,7 +6181,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, * here. Before returning error, get ptl and make * sure there really is no pte entry. 
*/ - if (hugetlb_pte_stable(h, mm, ptep, old_pte)) + if (hugetlb_pte_stable(h, hpte, old_pte)) ret = vmf_error(PTR_ERR(folio)); else ret = 0; @@ -6217,7 +6231,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, folio_unlock(folio); folio_put(folio); /* See comment in userfaultfd_missing() block above */ - if (!hugetlb_pte_stable(h, mm, ptep, old_pte)) { + if (!hugetlb_pte_stable(h, hpte, old_pte)) { ret = 0; goto out; } @@ -6242,30 +6256,46 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, vma_end_reservation(h, vma, haddr); } - ptl = huge_pte_lock(h, mm, ptep); + ptl = hugetlb_pte_lock(hpte); ret = 0; - /* If pte changed from under us, retry */ - if (!pte_same(huge_ptep_get(ptep), old_pte)) + /* + * If pte changed from under us, retry. + * + * When dealing with high-granularity-mapped PTEs, it's possible that + * a non-contiguous PTE within our contiguous PTE group gets populated, + * in which case, we need to retry here. This is NOT caught here, and + * will need to be addressed when HGM is supported for architectures + * that support contiguous PTEs. + */ + if (!pte_same(huge_ptep_get(hpte->ptep), old_pte)) goto backout; - if (anon_rmap) + subpage = hugetlb_find_subpage(h, folio, haddr_hgm); + + if (anon_rmap) { + VM_BUG_ON(&folio->page != subpage); hugepage_add_new_anon_rmap(folio, vma, haddr); + } else - page_add_file_rmap(&folio->page, vma, true); - new_pte = make_huge_pte(vma, &folio->page, ((vma->vm_flags & VM_WRITE) - && (vma->vm_flags & VM_SHARED))); + hugetlb_add_file_rmap(subpage, hpte->shift, h, vma); + + new_pte = make_huge_pte_with_shift(vma, subpage, + ((vma->vm_flags & VM_WRITE) + && (vma->vm_flags & VM_SHARED)), + hpte->shift); /* * If this pte was previously wr-protected, keep it wr-protected even * if populated. 
*/ if (unlikely(pte_marker_uffd_wp(old_pte))) new_pte = huge_pte_mkuffd_wp(new_pte); - set_huge_pte_at(mm, haddr, ptep, new_pte); + set_huge_pte_at(mm, haddr_hgm, hpte->ptep, new_pte); - hugetlb_count_add(pages_per_huge_page(h), mm); + hugetlb_count_add(hugetlb_pte_size(hpte) / PAGE_SIZE, mm); if ((flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) { + WARN_ON_ONCE(hugetlb_pte_size(hpte) != huge_page_size(h)); /* Optimization, do the COW without a second fault */ - ret = hugetlb_wp(mm, vma, address, ptep, flags, folio, ptl); + ret = hugetlb_wp(mm, vma, address, hpte->ptep, flags, folio, ptl); } spin_unlock(ptl); @@ -6322,17 +6352,19 @@ u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx) vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, unsigned int flags) { - pte_t *ptep, entry; + pte_t entry; spinlock_t *ptl; vm_fault_t ret; u32 hash; pgoff_t idx; - struct page *page = NULL; - struct folio *pagecache_folio = NULL; + struct page *subpage = NULL; + struct folio *pagecache_folio = NULL, *folio = NULL; struct hstate *h = hstate_vma(vma); struct address_space *mapping; int need_wait_lock = 0; unsigned long haddr = address & huge_page_mask(h); + unsigned long haddr_hgm; + struct hugetlb_pte hpte; /* * Serialize hugepage allocation and instantiation, so that we don't @@ -6346,26 +6378,26 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, /* * Acquire vma lock before calling huge_pte_alloc and hold - * until finished with ptep. This prevents huge_pmd_unshare from - * being called elsewhere and making the ptep no longer valid. + * until finished with hpte. This prevents huge_pmd_unshare from + * being called elsewhere and making the hpte no longer valid. 
*/ hugetlb_vma_lock_read(vma); - ptep = huge_pte_alloc(mm, vma, haddr, huge_page_size(h)); - if (!ptep) { + if (hugetlb_full_walk_alloc(&hpte, vma, address, 0)) { hugetlb_vma_unlock_read(vma); mutex_unlock(&hugetlb_fault_mutex_table[hash]); return VM_FAULT_OOM; } - entry = huge_ptep_get(ptep); + entry = huge_ptep_get(hpte.ptep); /* PTE markers should be handled the same way as none pte */ - if (huge_pte_none_mostly(entry)) + if (huge_pte_none_mostly(entry)) { /* * hugetlb_no_page will drop vma lock and hugetlb fault * mutex internally, which make us return immediately. */ - return hugetlb_no_page(mm, vma, mapping, idx, address, ptep, + return hugetlb_no_page(mm, vma, mapping, idx, address, &hpte, entry, flags); + } ret = 0; @@ -6386,7 +6418,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, * be released there. */ mutex_unlock(&hugetlb_fault_mutex_table[hash]); - migration_entry_wait_huge(vma, ptep); + migration_entry_wait_huge(vma, hpte.ptep); return 0; } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) ret = VM_FAULT_HWPOISON_LARGE | @@ -6394,6 +6426,10 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, goto out_mutex; } + if (!hugetlb_pte_present_leaf(&hpte, entry)) + /* We raced with someone splitting the entry. */ + goto out_mutex; + /* * If we are going to COW/unshare the mapping later, we examine the * pending reservations for this page now. This will ensure that any @@ -6413,14 +6449,17 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, pagecache_folio = filemap_lock_folio(mapping, idx); } - ptl = huge_pte_lock(h, mm, ptep); + ptl = hugetlb_pte_lock(&hpte); /* Check for a racing update before calling hugetlb_wp() */ - if (unlikely(!pte_same(entry, huge_ptep_get(ptep)))) + if (unlikely(!pte_same(entry, huge_ptep_get(hpte.ptep)))) goto out_ptl; + /* haddr_hgm is the base address of the region that hpte maps. 
*/ + haddr_hgm = address & hugetlb_pte_mask(&hpte); + /* Handle userfault-wp first, before trying to lock more pages */ - if (userfaultfd_wp(vma) && huge_pte_uffd_wp(huge_ptep_get(ptep)) && + if (userfaultfd_wp(vma) && huge_pte_uffd_wp(entry) && (flags & FAULT_FLAG_WRITE) && !huge_pte_write(entry)) { struct vm_fault vmf = { .vma = vma, @@ -6444,18 +6483,21 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, * pagecache_folio, so here we need take the former one * when page != pagecache_folio or !pagecache_folio. */ - page = pte_page(entry); - if (page_folio(page) != pagecache_folio) - if (!trylock_page(page)) { + subpage = pte_page(entry); + folio = page_folio(subpage); + if (folio != pagecache_folio) + if (!trylock_page(&folio->page)) { need_wait_lock = 1; goto out_ptl; } - get_page(page); + folio_get(folio); if (flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) { if (!huge_pte_write(entry)) { - ret = hugetlb_wp(mm, vma, address, ptep, flags, + WARN_ON_ONCE(hugetlb_pte_size(&hpte) != + huge_page_size(h)); + ret = hugetlb_wp(mm, vma, address, hpte.ptep, flags, pagecache_folio, ptl); goto out_put_page; } else if (likely(flags & FAULT_FLAG_WRITE)) { @@ -6463,13 +6505,13 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, } } entry = pte_mkyoung(entry); - if (huge_ptep_set_access_flags(vma, haddr, ptep, entry, + if (huge_ptep_set_access_flags(vma, haddr_hgm, hpte.ptep, entry, flags & FAULT_FLAG_WRITE)) - update_mmu_cache(vma, haddr, ptep); + update_mmu_cache(vma, haddr_hgm, hpte.ptep); out_put_page: - if (page_folio(page) != pagecache_folio) - unlock_page(page); - put_page(page); + if (folio != pagecache_folio) + folio_unlock(folio); + folio_put(folio); out_ptl: spin_unlock(ptl); @@ -6488,7 +6530,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, * here without taking refcount. 
*/ if (need_wait_lock) - wait_on_page_locked(page); + wait_on_page_locked(&folio->page); return ret; } @@ -7689,6 +7731,9 @@ int hugetlb_full_walk(struct hugetlb_pte *hpte, /* * hugetlb_full_walk_alloc - do a high-granularity walk, potentially allocate * new PTEs. + * + * If @target_sz is 0, then only attempt to allocate the hstate-level PTE and + * walk as far as we can go. */ int hugetlb_full_walk_alloc(struct hugetlb_pte *hpte, struct vm_area_struct *vma, @@ -7707,6 +7752,12 @@ int hugetlb_full_walk_alloc(struct hugetlb_pte *hpte, if (!ptep) return -ENOMEM; + if (!target_sz) { + WARN_ON_ONCE(hugetlb_hgm_walk(hpte, ptep, vma, addr, + PAGE_SIZE, false)); + return 0; + } + return hugetlb_hgm_walk(hpte, ptep, vma, addr, target_sz, true); } @@ -7735,7 +7786,6 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, pte = (pte_t *)pmd_alloc(mm, pud, addr); } } - BUG_ON(pte && pte_present(*pte) && !pte_huge(*pte)); return pte; }

From patchwork Sat Feb 18 00:27:58 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145391
Date: Sat, 18 Feb 2023 00:27:58 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20230218002819.1486479-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog
Message-ID: <20230218002819.1486479-26-jthoughton@google.com>
Subject: [PATCH v2 25/46] hugetlb: use struct hugetlb_pte for walk_hugetlb_range
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
The main change in this commit is to walk_hugetlb_range, to support walking HGM mappings; all walk_hugetlb_range callers must be updated to use the new API and take the correct action. Listing all the changes to the callers:

For s390, we simply BUILD_BUG_ON if HGM is enabled.

For smaps, shared_hugetlb (and private_hugetlb, although private mappings don't support HGM) may now not be divisible by the hugepage size. The appropriate changes have been made to support analyzing HGM PTEs.

For pagemap, we ignore non-leaf PTEs by treating them as if they were none PTEs. We can only end up with non-leaf PTEs if they had just been updated from a none PTE.

For show_numa_map, the challenge is that, if any part of a hugepage is mapped, we have to count that entire page exactly once, as the results are given in units of hugepages. To support HGM mappings, we keep track of the last page that we looked at. If the hugepage we are currently looking at is the same as the last one, it has been mapped at high granularity and we have already accounted for it.
For DAMON, we treat non-leaf PTEs as if they were blank, for the same reason as pagemap.

For hwpoison, we proactively update the logic to support the case when hpte is pointing to a subpage within the poisoned hugepage.

For queue_pages_hugetlb/migration, we ignore all HGM-enabled VMAs for now.

For mincore, we ignore non-leaf PTEs for the same reason as pagemap.

For mprotect/prot_none_hugetlb_entry, we retry the walk when we get a non-leaf PTE.

Signed-off-by: James Houghton

diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index 5a716bdcba05..e1d41caa8504 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -2629,14 +2629,20 @@ static int __s390_enable_skey_pmd(pmd_t *pmd, unsigned long addr, return 0; } -static int __s390_enable_skey_hugetlb(pte_t *pte, unsigned long addr, - unsigned long hmask, unsigned long next, +static int __s390_enable_skey_hugetlb(struct hugetlb_pte *hpte, + unsigned long addr, struct mm_walk *walk) { - pmd_t *pmd = (pmd_t *)pte; + pmd_t *pmd = (pmd_t *)hpte->ptep; unsigned long start, end; struct page *page = pmd_page(*pmd); + /* + * We don't support high-granularity mappings yet. If we did, the + * pmd_page() call above would be unsafe. + */ + BUILD_BUG_ON(IS_ENABLED(CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING)); + /* * The write check makes sure we do not set a key on shared * memory.
This is needed as the walker does not differentiate diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 77b72f42556a..2f293b5dabc0 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -731,27 +731,39 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) } #ifdef CONFIG_HUGETLB_PAGE -static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask, - unsigned long addr, unsigned long end, - struct mm_walk *walk) +static int smaps_hugetlb_range(struct hugetlb_pte *hpte, + unsigned long addr, + struct mm_walk *walk) { struct mem_size_stats *mss = walk->private; struct vm_area_struct *vma = walk->vma; struct page *page = NULL; + pte_t pte = huge_ptep_get(hpte->ptep); - if (pte_present(*pte)) { - page = vm_normal_page(vma, addr, *pte); - } else if (is_swap_pte(*pte)) { - swp_entry_t swpent = pte_to_swp_entry(*pte); + if (pte_present(pte)) { + /* We only care about leaf-level PTEs. */ + if (!hugetlb_pte_present_leaf(hpte, pte)) + /* + * The only case where hpte is not a leaf is that + * it was originally none, but it was split from + * under us. Since it was originally none, exclude it.
+ */ + return 0; + + page = vm_normal_page(vma, addr, pte); + } else if (is_swap_pte(pte)) { + swp_entry_t swpent = pte_to_swp_entry(pte); if (is_pfn_swap_entry(swpent)) page = pfn_swap_entry_to_page(swpent); } if (page) { - if (page_mapcount(page) >= 2 || hugetlb_pmd_shared(pte)) - mss->shared_hugetlb += huge_page_size(hstate_vma(vma)); + unsigned long sz = hugetlb_pte_size(hpte); + + if (page_mapcount(page) >= 2 || hugetlb_pmd_shared(hpte->ptep)) + mss->shared_hugetlb += sz; else - mss->private_hugetlb += huge_page_size(hstate_vma(vma)); + mss->private_hugetlb += sz; } return 0; } @@ -1569,22 +1581,31 @@ static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned long end, #ifdef CONFIG_HUGETLB_PAGE /* This function walks within one hugetlb entry in the single call */ -static int pagemap_hugetlb_range(pte_t *ptep, unsigned long hmask, - unsigned long addr, unsigned long end, +static int pagemap_hugetlb_range(struct hugetlb_pte *hpte, + unsigned long addr, struct mm_walk *walk) { struct pagemapread *pm = walk->private; struct vm_area_struct *vma = walk->vma; u64 flags = 0, frame = 0; int err = 0; - pte_t pte; + unsigned long hmask = hugetlb_pte_mask(hpte); + unsigned long end = addr + hugetlb_pte_size(hpte); + pte_t pte = huge_ptep_get(hpte->ptep); + struct page *page; if (vma->vm_flags & VM_SOFTDIRTY) flags |= PM_SOFT_DIRTY; - pte = huge_ptep_get(ptep); if (pte_present(pte)) { - struct page *page = pte_page(pte); + /* + * We raced with this PTE being split, which can only happen if + * it was blank before. Treat it as if it were blank.
+ */ + if (!hugetlb_pte_present_leaf(hpte, pte)) + return 0; + + page = pte_page(pte); if (!PageAnon(page)) flags |= PM_FILE; @@ -1865,10 +1886,16 @@ static struct page *can_gather_numa_stats_pmd(pmd_t pmd, } #endif +struct show_numa_map_private { + struct numa_maps *md; + struct page *last_page; +}; + static int gather_pte_stats(pmd_t *pmd, unsigned long addr, unsigned long end, struct mm_walk *walk) { - struct numa_maps *md = walk->private; + struct show_numa_map_private *priv = walk->private; + struct numa_maps *md = priv->md; struct vm_area_struct *vma = walk->vma; spinlock_t *ptl; pte_t *orig_pte; @@ -1880,6 +1907,7 @@ static int gather_pte_stats(pmd_t *pmd, unsigned long addr, struct page *page; page = can_gather_numa_stats_pmd(*pmd, vma, addr); + priv->last_page = page; if (page) gather_stats(page, md, pmd_dirty(*pmd), HPAGE_PMD_SIZE/PAGE_SIZE); @@ -1893,6 +1921,7 @@ static int gather_pte_stats(pmd_t *pmd, unsigned long addr, orig_pte = pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl); do { struct page *page = can_gather_numa_stats(*pte, vma, addr); + priv->last_page = page; if (!page) continue; gather_stats(page, md, pte_dirty(*pte), 1); @@ -1903,19 +1932,25 @@ static int gather_pte_stats(pmd_t *pmd, unsigned long addr, return 0; } #ifdef CONFIG_HUGETLB_PAGE -static int gather_hugetlb_stats(pte_t *pte, unsigned long hmask, - unsigned long addr, unsigned long end, struct mm_walk *walk) +static int gather_hugetlb_stats(struct hugetlb_pte *hpte, unsigned long addr, + struct mm_walk *walk) { - pte_t huge_pte = huge_ptep_get(pte); + struct show_numa_map_private *priv = walk->private; + pte_t huge_pte = huge_ptep_get(hpte->ptep); struct numa_maps *md; struct page *page; - if (!pte_present(huge_pte)) + if (!hugetlb_pte_present_leaf(hpte, huge_pte)) + return 0; + + page = compound_head(pte_page(huge_pte)); + if (priv->last_page == page) + /* we've already accounted for this page */ return 0; - page = pte_page(huge_pte); + priv->last_page = page; - md = 
walk->private;
+	md = priv->md;
 	gather_stats(page, md, pte_dirty(huge_pte), 1);
 	return 0;
 }
@@ -1945,9 +1980,15 @@ static int show_numa_map(struct seq_file *m, void *v)
 	struct file *file = vma->vm_file;
 	struct mm_struct *mm = vma->vm_mm;
 	struct mempolicy *pol;
+	char buffer[64];
 	int nid;
+	struct show_numa_map_private numa_map_private;
+
+	numa_map_private.md = md;
+	numa_map_private.last_page = NULL;
+
 	if (!mm)
 		return 0;
@@ -1977,7 +2018,7 @@ static int show_numa_map(struct seq_file *m, void *v)
 		seq_puts(m, " huge");
 
 	/* mmap_lock is held by m_start */
-	walk_page_vma(vma, &show_numa_ops, md);
+	walk_page_vma(vma, &show_numa_ops, &numa_map_private);
 
 	if (!md->pages)
 		goto out;
diff --git a/include/linux/pagewalk.h b/include/linux/pagewalk.h
index 27a6df448ee5..f4bddad615c2 100644
--- a/include/linux/pagewalk.h
+++ b/include/linux/pagewalk.h
@@ -3,6 +3,7 @@
 #define _LINUX_PAGEWALK_H
 
 #include
+#include
 
 struct mm_walk;
@@ -31,6 +32,10 @@ struct mm_walk;
  *			ptl after dropping the vma lock, or else revalidate
  *			those items after re-acquiring the vma lock and before
  *			accessing them.
+ *			In the presence of high-granularity hugetlb entries,
+ *			@hugetlb_entry is called only for leaf-level entries
+ *			(hstate-level entries are ignored if they are not
+ *			leaves).
  * @test_walk:		caller specific callback function to determine whether
  *			we walk over the current vma or not. Returning 0 means
  *			"do page table walk over the current vma", returning
@@ -58,9 +63,8 @@ struct mm_walk_ops {
 			 unsigned long next, struct mm_walk *walk);
 	int (*pte_hole)(unsigned long addr, unsigned long next,
 			int depth, struct mm_walk *walk);
-	int (*hugetlb_entry)(pte_t *pte, unsigned long hmask,
-			     unsigned long addr, unsigned long next,
-			     struct mm_walk *walk);
+	int (*hugetlb_entry)(struct hugetlb_pte *hpte,
+			     unsigned long addr, struct mm_walk *walk);
 	int (*test_walk)(unsigned long addr, unsigned long next,
 			 struct mm_walk *walk);
 	int (*pre_vma)(unsigned long start, unsigned long end,
diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index 1fec16d7263e..0f001950498a 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -330,11 +330,11 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
 }
 
 #ifdef CONFIG_HUGETLB_PAGE
-static void damon_hugetlb_mkold(pte_t *pte, struct mm_struct *mm,
+static void damon_hugetlb_mkold(struct hugetlb_pte *hpte, pte_t entry,
+				struct mm_struct *mm,
 				struct vm_area_struct *vma, unsigned long addr)
 {
 	bool referenced = false;
-	pte_t entry = huge_ptep_get(pte);
 	struct folio *folio = pfn_folio(pte_pfn(entry));
 
 	folio_get(folio);
@@ -342,12 +342,12 @@ static void damon_hugetlb_mkold(pte_t *pte, struct mm_struct *mm,
 	if (pte_young(entry)) {
 		referenced = true;
 		entry = pte_mkold(entry);
-		set_huge_pte_at(mm, addr, pte, entry);
+		set_huge_pte_at(mm, addr, hpte->ptep, entry);
 	}
 
 #ifdef CONFIG_MMU_NOTIFIER
 	if (mmu_notifier_clear_young(mm, addr,
-				     addr + huge_page_size(hstate_vma(vma))))
+				     addr + hugetlb_pte_size(hpte)))
 		referenced = true;
 #endif /* CONFIG_MMU_NOTIFIER */
@@ -358,20 +358,26 @@ static void damon_hugetlb_mkold(pte_t *pte, struct mm_struct *mm,
 	folio_put(folio);
 }
 
-static int damon_mkold_hugetlb_entry(pte_t *pte, unsigned long hmask,
-				     unsigned long addr, unsigned long end,
+static int damon_mkold_hugetlb_entry(struct hugetlb_pte *hpte,
+				     unsigned long addr,
 				     struct mm_walk *walk)
 {
-	struct hstate *h = hstate_vma(walk->vma);
 	spinlock_t *ptl;
 	pte_t entry;
 
-	ptl = huge_pte_lock(h, walk->mm, pte);
-	entry = huge_ptep_get(pte);
+	ptl = hugetlb_pte_lock(hpte);
+	entry = huge_ptep_get(hpte->ptep);
 	if (!pte_present(entry))
 		goto out;
 
-	damon_hugetlb_mkold(pte, walk->mm, walk->vma, addr);
+	if (!hugetlb_pte_present_leaf(hpte, entry))
+		/*
+		 * We raced with someone splitting a blank PTE. Treat this PTE
+		 * as if it were blank.
+		 */
+		goto out;
+
+	damon_hugetlb_mkold(hpte, entry, walk->mm, walk->vma, addr);
 
 out:
 	spin_unlock(ptl);
@@ -483,8 +489,8 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
 }
 
 #ifdef CONFIG_HUGETLB_PAGE
-static int damon_young_hugetlb_entry(pte_t *pte, unsigned long hmask,
-				     unsigned long addr, unsigned long end,
+static int damon_young_hugetlb_entry(struct hugetlb_pte *hpte,
+				     unsigned long addr,
 				     struct mm_walk *walk)
 {
 	struct damon_young_walk_private *priv = walk->private;
@@ -493,11 +499,18 @@ static int damon_young_hugetlb_entry(pte_t *pte, unsigned long hmask,
 	spinlock_t *ptl;
 	pte_t entry;
 
-	ptl = huge_pte_lock(h, walk->mm, pte);
-	entry = huge_ptep_get(pte);
+	ptl = hugetlb_pte_lock(hpte);
+	entry = huge_ptep_get(hpte->ptep);
 	if (!pte_present(entry))
 		goto out;
 
+	if (!hugetlb_pte_present_leaf(hpte, entry))
+		/*
+		 * We raced with someone splitting a blank PTE. Treat this PTE
+		 * as if it were blank.
+		 */
+		goto out;
+
 	folio = pfn_folio(pte_pfn(entry));
 	folio_get(folio);
diff --git a/mm/hmm.c b/mm/hmm.c
index 6a151c09de5e..d3e40cfdd4cb 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -468,8 +468,8 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
 #endif
 
 #ifdef CONFIG_HUGETLB_PAGE
-static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask,
-				      unsigned long start, unsigned long end,
+static int hmm_vma_walk_hugetlb_entry(struct hugetlb_pte *hpte,
+				      unsigned long start,
 				      struct mm_walk *walk)
 {
 	unsigned long addr = start, i, pfn;
@@ -479,16 +479,24 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask,
 	unsigned int required_fault;
 	unsigned long pfn_req_flags;
 	unsigned long cpu_flags;
+	unsigned long hmask = hugetlb_pte_mask(hpte);
+	unsigned int order = hpte->shift - PAGE_SHIFT;
+	unsigned long end = start + hugetlb_pte_size(hpte);
 	spinlock_t *ptl;
 	pte_t entry;
 
-	ptl = huge_pte_lock(hstate_vma(vma), walk->mm, pte);
-	entry = huge_ptep_get(pte);
+	ptl = hugetlb_pte_lock(hpte);
+	entry = huge_ptep_get(hpte->ptep);
+
+	if (!hugetlb_pte_present_leaf(hpte, entry)) {
+		spin_unlock(ptl);
+		return -EAGAIN;
+	}
 
 	i = (start - range->start) >> PAGE_SHIFT;
 	pfn_req_flags = range->hmm_pfns[i];
 	cpu_flags = pte_to_hmm_pfn_flags(range, entry) |
-		    hmm_pfn_flags_order(huge_page_order(hstate_vma(vma)));
+		    hmm_pfn_flags_order(order);
 	required_fault = hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags,
 					    cpu_flags);
 	if (required_fault) {
@@ -605,7 +613,7 @@ int hmm_range_fault(struct hmm_range *range)
 		 * in pfns. All entries < last in the pfn array are set to their
 		 * output, and all >= are still at their input values.
 		 */
-	} while (ret == -EBUSY);
+	} while (ret == -EBUSY || ret == -EAGAIN);
 	return ret;
 }
 EXPORT_SYMBOL(hmm_range_fault);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index a1ede7bdce95..0b37cbc6e8ae 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -676,6 +676,7 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
 		unsigned long poisoned_pfn, struct to_kill *tk)
 {
 	unsigned long pfn = 0;
+	unsigned long base_pages_poisoned = (1UL << shift) / PAGE_SIZE;
 
 	if (pte_present(pte)) {
 		pfn = pte_pfn(pte);
@@ -686,7 +687,8 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
 		pfn = swp_offset_pfn(swp);
 	}
 
-	if (!pfn || pfn != poisoned_pfn)
+	if (!pfn || pfn < poisoned_pfn ||
+	    pfn >= poisoned_pfn + base_pages_poisoned)
 		return 0;
 
 	set_to_kill(tk, addr, shift);
@@ -752,16 +754,15 @@ static int hwpoison_pte_range(pmd_t *pmdp, unsigned long addr,
 }
 
 #ifdef CONFIG_HUGETLB_PAGE
-static int hwpoison_hugetlb_range(pte_t *ptep, unsigned long hmask,
-				  unsigned long addr, unsigned long end,
-				  struct mm_walk *walk)
+static int hwpoison_hugetlb_range(struct hugetlb_pte *hpte,
+				  unsigned long addr,
+				  struct mm_walk *walk)
 {
 	struct hwp_walk *hwp = walk->private;
-	pte_t pte = huge_ptep_get(ptep);
-	struct hstate *h = hstate_vma(walk->vma);
+	pte_t pte = huge_ptep_get(hpte->ptep);
 
-	return check_hwpoisoned_entry(pte, addr, huge_page_shift(h),
-				      hwp->pfn, &hwp->tk);
+	return check_hwpoisoned_entry(pte, addr & hugetlb_pte_mask(hpte),
+				      hpte->shift, hwp->pfn, &hwp->tk);
 }
 #else
 #define hwpoison_hugetlb_range NULL
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index a256a241fd1d..0f91be88392b 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -558,8 +558,8 @@ static int queue_folios_pte_range(pmd_t *pmd, unsigned long addr,
 	return addr != end ? -EIO : 0;
 }
 
-static int queue_folios_hugetlb(pte_t *pte, unsigned long hmask,
-				unsigned long addr, unsigned long end,
+static int queue_folios_hugetlb(struct hugetlb_pte *hpte,
+				unsigned long addr,
 				struct mm_walk *walk)
 {
 	int ret = 0;
@@ -570,8 +570,12 @@ static int queue_folios_hugetlb(pte_t *pte, unsigned long hmask,
 	spinlock_t *ptl;
 	pte_t entry;
 
-	ptl = huge_pte_lock(hstate_vma(walk->vma), walk->mm, pte);
-	entry = huge_ptep_get(pte);
+	/* We don't migrate high-granularity HugeTLB mappings for now. */
+	if (hugetlb_hgm_enabled(walk->vma))
+		return -EINVAL;
+
+	ptl = hugetlb_pte_lock(hpte);
+	entry = huge_ptep_get(hpte->ptep);
 	if (!pte_present(entry))
 		goto unlock;
 	folio = pfn_folio(pte_pfn(entry));
@@ -608,7 +612,7 @@ static int queue_folios_hugetlb(pte_t *pte, unsigned long hmask,
 	 */
 	if (flags & (MPOL_MF_MOVE_ALL) ||
 	    (flags & MPOL_MF_MOVE && folio_estimated_sharers(folio) == 1 &&
-	     !hugetlb_pmd_shared(pte))) {
+	     !hugetlb_pmd_shared(hpte->ptep))) {
 		if (!isolate_hugetlb(folio, qp->pagelist) &&
 		    (flags & MPOL_MF_STRICT))
 			/*
diff --git a/mm/mincore.c b/mm/mincore.c
index a085a2aeabd8..0894965b3944 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -22,18 +22,29 @@
 #include
 #include "swap.h"
 
-static int mincore_hugetlb(pte_t *pte, unsigned long hmask, unsigned long addr,
-			   unsigned long end, struct mm_walk *walk)
+static int mincore_hugetlb(struct hugetlb_pte *hpte, unsigned long addr,
+			   struct mm_walk *walk)
 {
 #ifdef CONFIG_HUGETLB_PAGE
 	unsigned char present;
+	unsigned long end = addr + hugetlb_pte_size(hpte);
 	unsigned char *vec = walk->private;
+	pte_t pte = huge_ptep_get(hpte->ptep);
 
 	/*
 	 * Hugepages under user process are always in RAM and never
 	 * swapped out, but theoretically it needs to be checked.
 	 */
-	present = pte && !huge_pte_none(huge_ptep_get(pte));
+	present = !huge_pte_none(pte);
+
+	/*
+	 * If the pte is present but not a leaf, we raced with someone
+	 * splitting it. For someone to have split it, it must have been
+	 * huge_pte_none before, so treat it as such.
+	 */
+	if (pte_present(pte) && !hugetlb_pte_present_leaf(hpte, pte))
+		present = false;
+
 	for (; addr != end; vec++, addr += PAGE_SIZE)
 		*vec = present;
 	walk->private = vec;
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 1d4843c97c2a..61263ce9d925 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -564,12 +564,16 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
 		0 : -EACCES;
 }
 
-static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
-				   unsigned long addr, unsigned long next,
+static int prot_none_hugetlb_entry(struct hugetlb_pte *hpte,
+				   unsigned long addr,
 				   struct mm_walk *walk)
 {
-	return pfn_modify_allowed(pte_pfn(*pte), *(pgprot_t *)(walk->private)) ?
-		0 : -EACCES;
+	pte_t pte = huge_ptep_get(hpte->ptep);
+
+	if (!hugetlb_pte_present_leaf(hpte, pte))
+		return -EAGAIN;
+	return pfn_modify_allowed(pte_pfn(pte),
+				  *(pgprot_t *)(walk->private)) ? 0 : -EACCES;
 }
 
 static int prot_none_test(unsigned long addr, unsigned long next,
@@ -612,8 +616,10 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
 	    (newflags & VM_ACCESS_FLAGS) == 0) {
 		pgprot_t new_pgprot = vm_get_page_prot(newflags);
 
-		error = walk_page_range(current->mm, start, end,
-				&prot_none_walk_ops, &new_pgprot);
+		do {
+			error = walk_page_range(current->mm, start, end,
+					&prot_none_walk_ops, &new_pgprot);
+		} while (error == -EAGAIN);
 		if (error)
 			return error;
 	}
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index cb23f8a15c13..05ce242f8b7e 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -3,6 +3,7 @@
 #include
 #include
 #include
+#include
 
 /*
  * We want to know the real level where a entry is located ignoring any
@@ -296,20 +297,21 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end,
 	struct vm_area_struct *vma = walk->vma;
 	struct hstate *h = hstate_vma(vma);
 	unsigned long next;
-	unsigned long hmask = huge_page_mask(h);
-	unsigned long sz = huge_page_size(h);
-	pte_t *pte;
 	const struct mm_walk_ops *ops = walk->ops;
 	int err = 0;
+	struct hugetlb_pte hpte;
 
 	hugetlb_vma_lock_read(vma);
 	do {
-		next = hugetlb_entry_end(h, addr, end);
-		pte = hugetlb_walk(vma, addr & hmask, sz);
-		if (pte)
-			err = ops->hugetlb_entry(pte, hmask, addr, next, walk);
-		else if (ops->pte_hole)
-			err = ops->pte_hole(addr, next, -1, walk);
+		if (hugetlb_full_walk(&hpte, vma, addr)) {
+			next = hugetlb_entry_end(h, addr, end);
+			if (ops->pte_hole)
+				err = ops->pte_hole(addr, next, -1, walk);
+		} else {
+			err = ops->hugetlb_entry(
+				&hpte, addr, walk);
+			next = min(addr + hugetlb_pte_size(&hpte), end);
+		}
 		if (err)
 			break;
 	} while (addr = next, addr != end);

From patchwork Sat Feb 18 00:27:59 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145392
Date: Sat, 18 Feb 2023 00:27:59 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-27-jthoughton@google.com>
Subject: [PATCH v2 26/46] mm: rmap: provide pte_order in page_vma_mapped_walk
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu , Andrew Morton
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Frank van der Linden , Jiaqi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
page_vma_mapped_walk callers will need this information to know how
HugeTLB pages are mapped. pte_order only applies if pte is not NULL.
Signed-off-by: James Houghton

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index a4570da03e58..87a2c7f422bf 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -387,6 +387,7 @@ struct page_vma_mapped_walk {
 	pmd_t *pmd;
 	pte_t *pte;
 	spinlock_t *ptl;
+	unsigned int pte_order;
 	unsigned int flags;
 };
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 4e448cfbc6ef..08295b122ad6 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -16,6 +16,7 @@ static inline bool not_found(struct page_vma_mapped_walk *pvmw)
 static bool map_pte(struct page_vma_mapped_walk *pvmw)
 {
 	pvmw->pte = pte_offset_map(pvmw->pmd, pvmw->address);
+	pvmw->pte_order = 0;
 	if (!(pvmw->flags & PVMW_SYNC)) {
 		if (pvmw->flags & PVMW_MIGRATION) {
 			if (!is_swap_pte(*pvmw->pte))
@@ -177,6 +178,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 		if (!pvmw->pte)
 			return false;
 
+		pvmw->pte_order = huge_page_order(hstate);
 		pvmw->ptl = huge_pte_lock(hstate, mm, pvmw->pte);
 		if (!check_pte(pvmw))
 			return not_found(pvmw);
@@ -272,6 +274,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 			}
 			pte_unmap(pvmw->pte);
 			pvmw->pte = NULL;
+			pvmw->pte_order = 0;
 			goto restart;
 		}
 		pvmw->pte++;

From patchwork Sat Feb 18 00:28:00 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145393
Date: Sat, 18 Feb 2023 00:28:00 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-28-jthoughton@google.com>
Subject: [PATCH v2 27/46] mm: rmap: update try_to_{migrate,unmap} to handle mapcount for HGM
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu , Andrew Morton
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Frank van der Linden , Jiaqi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
Make use of the new pvmw->pte_order field to determine the size of the
PTE we're unmapping/migrating.
Signed-off-by: James Houghton

diff --git a/mm/migrate.c b/mm/migrate.c
index 9b4a7e75f6e6..616afcc40fdc 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -247,7 +247,7 @@ static bool remove_migration_pte(struct folio *folio,
 
 #ifdef CONFIG_HUGETLB_PAGE
 		if (folio_test_hugetlb(folio)) {
-			unsigned int shift = huge_page_shift(hstate_vma(vma));
+			unsigned int shift = pvmw.pte_order + PAGE_SHIFT;
 
 			pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
 			if (folio_test_anon(folio))
diff --git a/mm/rmap.c b/mm/rmap.c
index c010d0af3a82..0a019ae32f04 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1609,7 +1609,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		if (PageHWPoison(subpage) && !(flags & TTU_IGNORE_HWPOISON)) {
 			pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
 			if (folio_test_hugetlb(folio)) {
-				hugetlb_count_sub(folio_nr_pages(folio), mm);
+				hugetlb_count_sub(1UL << pvmw.pte_order, mm);
 				set_huge_pte_at(mm, address, pvmw.pte, pteval);
 			} else {
 				dec_mm_counter(mm, mm_counter(&folio->page));
@@ -1757,7 +1757,13 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		 *
 		 * See Documentation/mm/mmu_notifier.rst
 		 */
-		page_remove_rmap(subpage, vma, folio_test_hugetlb(folio));
+		if (folio_test_hugetlb(folio))
+			hugetlb_remove_rmap(subpage,
+					pvmw.pte_order + PAGE_SHIFT,
+					hstate_vma(vma), vma);
+		else
+			page_remove_rmap(subpage, vma, false);
+
 		if (vma->vm_flags & VM_LOCKED)
 			mlock_drain_local();
 		folio_put(folio);
@@ -2020,7 +2026,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 		} else if (PageHWPoison(subpage)) {
 			pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
 			if (folio_test_hugetlb(folio)) {
-				hugetlb_count_sub(folio_nr_pages(folio), mm);
+				hugetlb_count_sub(1L << pvmw.pte_order, mm);
 				set_huge_pte_at(mm, address, pvmw.pte, pteval);
 			} else {
 				dec_mm_counter(mm, mm_counter(&folio->page));
@@ -2112,7 +2118,12 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 		 *
 		 * See Documentation/mm/mmu_notifier.rst
 		 */
-		page_remove_rmap(subpage, vma, folio_test_hugetlb(folio));
+		if (folio_test_hugetlb(folio))
+			hugetlb_remove_rmap(subpage,
+					pvmw.pte_order + PAGE_SHIFT,
+					hstate_vma(vma), vma);
+		else
+			page_remove_rmap(subpage, vma, false);
 		if (vma->vm_flags & VM_LOCKED)
 			mlock_drain_local();
 		folio_put(folio);
@@ -2196,6 +2207,8 @@ static bool page_make_device_exclusive_one(struct folio *folio,
 				      args->owner);
 	mmu_notifier_invalidate_range_start(&range);
 
+	VM_BUG_ON_FOLIO(folio_test_hugetlb(folio), folio);
+
 	while (page_vma_mapped_walk(&pvmw)) {
 		/* Unexpected PMD-mapped THP? */
 		VM_BUG_ON_FOLIO(!pvmw.pte, folio);

From patchwork Sat Feb 18 00:28:01 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145394
(PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=1rCBHumdufOlTcJkqiypODYa1r7LDyl4S1gdc35TorQ=; b=YN0PytyiHYPIu8m+M2E6v1NUQCnHXY6ytZcGKXyuh7z/uqk1m7+gCvJsZq1js1+Nyk aWy2chcaY0oSjM7oplmnZTRau4yrS7eatEn1k9ey5wfeHKB+fHXyH3hpyjtsmTvEZjCC 943Dvm4iECKIUtQU6bNEzs0C/hZeHT7vfxnkuu/ImNucvgQ6NZzZ5BLuTjeDWoY/lNEN mppTbWX+d43SPhJRbQNNU06lgt5y4Yc251XxwreuYdKf65ASB8UzsacjLW/LKF5eAHUL AdfT5Xxv2SVRLajtE01G0LVCPjbrRrRShVjKlJ9kluNRT6aaGpxY/8iNFGTbZhsxLO7k +hCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=1rCBHumdufOlTcJkqiypODYa1r7LDyl4S1gdc35TorQ=; b=L3GsITZDlydjm4x6YR/81uR8Gr2AeLV77LbVSyhp6I0hE4xRjD6Z8KJGVmp3LHM5kq Fta572BP3taKATUx5Gt88MsrwJ92+fVE4LziSQhVPEzhJ/Ra4ENxXH8hY5qD2UxRcl+c D6hHUFrgZvGxgcswN9hV73lys8WYDXObCAdKWWg23WlbzyjGAxgkf3AQV9+L6F/A9f24 V4sJvQCnsLyRmuwattlALbQUFklLzJpQILaHCryZ3I8zz10AuRMiwy/mp5KTEfJce3DK 2hmc3/mxgbsPOylXD9Y1Sk/cditjES8mLbxGiFPosCHLfH+iWVtSV6KDxbP2qanm8uTO 2sXw== X-Gm-Message-State: AO0yUKVot7A8IMD2Ks+94G/HpPFGjJYiTEzIbsG0QoEy8vvGomPCsNoK FLCoYB6PENY4zSE2ukoQPadwToMc3avcDWfz X-Google-Smtp-Source: AK7set+LO+JnPnboVPieFkXX7WpathP3qjh8yyUr3ixLzVkV8bA9em/CG8b88d9+skAi+Y+zl4Jc6Pv1vnp/FebD X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a5b:910:0:b0:927:a3c1:b2de with SMTP id a16-20020a5b0910000000b00927a3c1b2demr200123ybq.7.1676680151721; Fri, 17 Feb 2023 16:29:11 -0800 (PST) Date: Sat, 18 Feb 2023 00:28:01 +0000 In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com> Mime-Version: 1.0 References: <20230218002819.1486479-1-jthoughton@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: 
<20230218002819.1486479-29-jthoughton@google.com> Subject: [PATCH v2 28/46] mm: rmap: in try_to_{migrate,unmap}, check head page for hugetlb page flags From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu , Andrew Morton Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Frank van der Linden , Jiaqi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 73D2FC0007 X-Stat-Signature: junxxkpuqcmdybe5smunbuk5cy111tto X-Rspam-User: X-HE-Tag: 1676680152-841058 X-HE-Meta: U2FsdGVkX18FfONQ1l+Lzua4iEEsbFeRuYaT8rRA6NHABiX7nEXmmvnc3UD+3S4FkoS/B7BkD3RKw2tMLOpSgRvd21usrAuvtaEGJOAe96NWF4EMRKbrzZyc7R8zo3Y56idj/+IfMjNnYvofqWmyX7VxKzRfMycr0NYgpYuS0W1HPPhHrxdQbKDYqwqQDrHoA7cs4Ly8HRTFAMgyajPuxZjC7NSRg+y4w+7mPg3BZjPhxQBenXemBnsu3pnh9Wzl4hVCGPmvQAecU05KcGmZKiPKs+AqfbWQGKDdlWiHYOdGmFhJymopE2UB74mC2AB0iqkKLCDCQSwtRL1l40ALyDb4fLi7232CcrLJDRXtGgM6sJWcP5JCADTrSHnTyikYbuNX/m+u54c9pkmziCp+yN3ZaL1GKBjiHzUe5Xd+OmBXdZNo8+ub0ARA2KsRA/M2ksHU71SQ0OYyPSPpbTH6wq6jEdTj/gtkBFXWvF0GWAWuYrjFCX7nE0X1ETkLcxUdtMg8JFzDAzNqAEKzhilIQXDVFktmZye2pDUMBFodYTYfEZqc+n0UF2kCClkQuXwi7FLfHBx+ej9MxWrCxrDsP2kDyaxCpFveVDe5aatlEVcyWqKO6dbR3l/jjFTDvtZFnch+QCfmLUQ9EOz0XAL0rKlD9VfcKkx6wPPQBAhqK6rupXi9TcX9q6eqOkRL2IV4QdGXr0RjZ+KQNTtiP4Lcs1jE2jDvToGUAw4kaUqvIpGRQhM7aVvc97Ek40lEX7wVXb8cDQFgTRBrjaAB/llEliM/pvlL1PID0sTaraRhVlcSGl6gXFgo23k+VOq5Q5tTbq6ada1AX8X6Ki9L7es8do+PsXKw1qPreBFtGesjC+JVdaKWz+pVBv85jXeTcF+RjT1wfSB1EXv02ntK++OwkGk5s4sGG3mkAFCY2iavjv+tWUjy0WTPyDfhQN7Vow0cLy9PiXJuiTVCiSMCxLO p5Ts/Ifu 
The main complication here is that HugeTLB pages have their poison status
stored in the head page as the HWPoison page flag. Because HugeTLB
high-granularity mapping can create PTEs that point to subpages instead of
always the head of a hugepage, we need to check the compound head's page
flags.

Signed-off-by: James Houghton

diff --git a/mm/rmap.c b/mm/rmap.c
index 0a019ae32f04..4908ede83173 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1456,10 +1456,11 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 	struct mm_struct *mm = vma->vm_mm;
 	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);
 	pte_t pteval;
-	struct page *subpage;
+	struct page *subpage, *page_flags_page;
 	bool anon_exclusive, ret = true;
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
+	bool page_poisoned;
 
 	/*
 	 * When racing against e.g. zap_pte_range() on another cpu,
@@ -1512,9 +1513,17 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		subpage = folio_page(folio,
 					pte_pfn(*pvmw.pte) - folio_pfn(folio));
+		/*
+		 * We check the page flags of HugeTLB pages by checking the
+		 * head page.
+		 */
+		page_flags_page = folio_test_hugetlb(folio)
+					? &folio->page
+					: subpage;
+		page_poisoned = PageHWPoison(page_flags_page);
 		address = pvmw.address;
 		anon_exclusive = folio_test_anon(folio) &&
-				 PageAnonExclusive(subpage);
+				 PageAnonExclusive(page_flags_page);
 
 		if (folio_test_hugetlb(folio)) {
 			bool anon = folio_test_anon(folio);
@@ -1523,7 +1532,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			 * The try_to_unmap() is only passed a hugetlb page
 			 * in the case where the hugetlb page is poisoned.
 			 */
-			VM_BUG_ON_PAGE(!PageHWPoison(subpage), subpage);
+			VM_BUG_ON_FOLIO(!page_poisoned, folio);
 			/*
 			 * huge_pmd_unshare may unmap an entire PMD page.
 			 * There is no way of knowing exactly which PMDs may
@@ -1606,7 +1615,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		/* Update high watermark before we lower rss */
 		update_hiwater_rss(mm);
 
-		if (PageHWPoison(subpage) && !(flags & TTU_IGNORE_HWPOISON)) {
+		if (page_poisoned && !(flags & TTU_IGNORE_HWPOISON)) {
 			pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
 			if (folio_test_hugetlb(folio)) {
 				hugetlb_count_sub(1UL << pvmw.pte_order, mm);
@@ -1632,7 +1641,9 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			mmu_notifier_invalidate_range(mm, address,
 						      address + PAGE_SIZE);
 		} else if (folio_test_anon(folio)) {
-			swp_entry_t entry = { .val = page_private(subpage) };
+			swp_entry_t entry = {
+				.val = page_private(page_flags_page)
+			};
 			pte_t swp_pte;
 			/*
 			 * Store the swap location in the pte.
@@ -1822,7 +1833,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 	struct mm_struct *mm = vma->vm_mm;
 	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);
 	pte_t pteval;
-	struct page *subpage;
+	struct page *subpage, *page_flags_page;
 	bool anon_exclusive, ret = true;
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
@@ -1902,9 +1913,16 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 			subpage = folio_page(folio,
 					pte_pfn(*pvmw.pte) - folio_pfn(folio));
 		}
+		/*
+		 * We check the page flags of HugeTLB pages by checking the
+		 * head page.
+		 */
+		page_flags_page = folio_test_hugetlb(folio)
+					? &folio->page
+					: subpage;
 		address = pvmw.address;
 		anon_exclusive = folio_test_anon(folio) &&
-				 PageAnonExclusive(subpage);
+				 PageAnonExclusive(page_flags_page);
 
 		if (folio_test_hugetlb(folio)) {
 			bool anon = folio_test_anon(folio);
@@ -2023,7 +2041,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 			 * No need to invalidate here it will synchronize on
 			 * against the special swap migration pte.
 			 */
-		} else if (PageHWPoison(subpage)) {
+		} else if (PageHWPoison(page_flags_page)) {
 			pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
 			if (folio_test_hugetlb(folio)) {
 				hugetlb_count_sub(1L << pvmw.pte_order, mm);

From patchwork Sat Feb 18 00:28:02 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145395
Date: Sat, 18 Feb 2023 00:28:02 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-30-jthoughton@google.com>
Subject: [PATCH v2 29/46] hugetlb: update page_vma_mapped to do high-granularity walks
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert,
 Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin,
 Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, James Houghton
Update the HugeTLB logic to look much more like the PTE-mapped THP logic:
when a caller invokes us in a loop, we update pvmw->address to walk to each
page table entry that could possibly map the hugepage containing pvmw->pfn.
Make use of the new pte_order field so callers know what size of PTE they
are getting.

The !pte failure case is changed to call not_found() instead of just
returning false. This should be a no-op, but if somehow the hstate-level
PTE were deallocated between iterations, not_found() must be called to
drop the locks.

Signed-off-by: James Houghton

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 08295b122ad6..03e8a4987272 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -133,7 +133,8 @@ static void step_forward(struct page_vma_mapped_walk *pvmw, unsigned long size)
  *
  * Returns true if the page is mapped in the vma. @pvmw->pmd and @pvmw->pte point
  * to relevant page table entries. @pvmw->ptl is locked. @pvmw->address is
- * adjusted if needed (for PTE-mapped THPs).
+ * adjusted if needed (for PTE-mapped THPs and high-granularity-mapped HugeTLB
+ * pages).
  *
  * If @pvmw->pmd is set but @pvmw->pte is not, you have found PMD-mapped page
  * (usually THP). For PTE-mapped THP, you should run page_vma_mapped_walk() in
@@ -165,23 +166,47 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 	if (unlikely(is_vm_hugetlb_page(vma))) {
 		struct hstate *hstate = hstate_vma(vma);
-		unsigned long size = huge_page_size(hstate);
-
-		/* The only possible mapping was handled on last iteration */
-		if (pvmw->pte)
-			return not_found(pvmw);
-		/*
-		 * All callers that get here will already hold the
-		 * i_mmap_rwsem. Therefore, no additional locks need to be
-		 * taken before calling hugetlb_walk().
-		 */
-		pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
-		if (!pvmw->pte)
-			return false;
+		struct hugetlb_pte hpte;
+		pte_t pteval;
+
+		end = (pvmw->address & huge_page_mask(hstate)) +
+			huge_page_size(hstate);
+
+		do {
+			if (pvmw->pte) {
+				if (pvmw->ptl)
+					spin_unlock(pvmw->ptl);
+				pvmw->ptl = NULL;
+				pvmw->address += PAGE_SIZE << pvmw->pte_order;
+				if (pvmw->address >= end)
+					return not_found(pvmw);
+			}
 
-		pvmw->pte_order = huge_page_order(hstate);
-		pvmw->ptl = huge_pte_lock(hstate, mm, pvmw->pte);
-		if (!check_pte(pvmw))
-			return not_found(pvmw);
+			/*
+			 * All callers that get here will already hold the
+			 * i_mmap_rwsem. Therefore, no additional locks need to
+			 * be taken before calling hugetlb_walk().
+			 */
+			if (hugetlb_full_walk(&hpte, vma, pvmw->address))
+				return not_found(pvmw);
+
+retry:
+			pvmw->pte = hpte.ptep;
+			pvmw->pte_order = hpte.shift - PAGE_SHIFT;
+			pvmw->ptl = hugetlb_pte_lock(&hpte);
+			pteval = huge_ptep_get(hpte.ptep);
+			if (pte_present(pteval) && !hugetlb_pte_present_leaf(
+						&hpte, pteval)) {
+				/*
+				 * Someone split from under us, so keep
+				 * walking.
+				 */
+				spin_unlock(pvmw->ptl);
+				hugetlb_full_walk_continue(&hpte, vma,
+							   pvmw->address);
+				goto retry;
+			}
+		} while (!check_pte(pvmw));
 		return true;
 	}

From patchwork Sat Feb 18 00:28:03 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145396
Date: Sat, 18 Feb 2023 00:28:03 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-31-jthoughton@google.com>
Subject: [PATCH v2 30/46] hugetlb: add high-granularity migration support
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert,
 Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin,
 Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, James Houghton
To prevent queueing a hugepage for migration multiple times, we use
last_folio to keep track of the last folio we saw in queue_pages_hugetlb();
if the folio we're looking at is last_folio, we skip it. For the
non-hugetlb cases, last_folio, although unused, is still updated so that
its meaning is consistent with the hugetlb case.
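The skip-duplicates idea above is simple enough to model in isolation. Below is a hedged user-space sketch (not kernel code; `queue_for_migration()` and the minimal `struct folio` are invented for illustration) of the last-seen-folio pattern: consecutive high-granularity PTEs that resolve to the same folio enqueue it only once.

```c
#include <stddef.h>

/* Model: each high-granularity PTE maps some folio, identified by a pointer. */
struct folio { int id; };

/*
 * Walk "ptes" (an array of folio pointers standing in for PTE lookups) and
 * queue each folio for migration at most once per consecutive run, mirroring
 * the qp->last_folio check in queue_pages_hugetlb(): if the folio we're
 * looking at is the one we just queued, skip it. NULL models a non-present
 * PTE. Returns the number of folios queued.
 */
size_t queue_for_migration(struct folio **ptes, size_t n, struct folio **queue)
{
	struct folio *last_folio = NULL;	/* qp->last_folio analogue */
	size_t queued = 0;

	for (size_t i = 0; i < n; i++) {
		struct folio *folio = ptes[i];

		if (folio == NULL)		/* !pte_present(): nothing mapped */
			continue;
		if (folio == last_folio)	/* queued via another HGM PTE */
			continue;

		queue[queued++] = folio;
		last_folio = folio;
	}
	return queued;
}
```

Note that, like the patch, this only deduplicates consecutive hits on the same folio: last_folio holds a single pointer, which is sufficient because the high-granularity PTEs mapping one hugepage are visited contiguously.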
Signed-off-by: James Houghton diff --git a/include/linux/swapops.h b/include/linux/swapops.h index 3a451b7afcb3..6ef80763e629 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -68,6 +68,8 @@ static inline bool is_pfn_swap_entry(swp_entry_t entry); +struct hugetlb_pte; + /* Clear all flags but only keep swp_entry_t related information */ static inline pte_t pte_swp_clear_flags(pte_t pte) { @@ -339,7 +341,8 @@ extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd, #ifdef CONFIG_HUGETLB_PAGE extern void __migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *ptep, spinlock_t *ptl); -extern void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte); +extern void migration_entry_wait_huge(struct vm_area_struct *vma, + struct hugetlb_pte *hpte); #endif /* CONFIG_HUGETLB_PAGE */ #else /* CONFIG_MIGRATION */ static inline swp_entry_t make_readable_migration_entry(pgoff_t offset) @@ -369,7 +372,8 @@ static inline void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd, #ifdef CONFIG_HUGETLB_PAGE static inline void __migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *ptep, spinlock_t *ptl) { } -static inline void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte) { } +static inline void migration_entry_wait_huge(struct vm_area_struct *vma, + struct hugetlb_pte *hpte) { } #endif /* CONFIG_HUGETLB_PAGE */ static inline int is_writable_migration_entry(swp_entry_t entry) { diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 86cd51beb02c..39f541b4a0a8 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6418,7 +6418,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, * be released there. 
*/ mutex_unlock(&hugetlb_fault_mutex_table[hash]); - migration_entry_wait_huge(vma, hpte.ptep); + migration_entry_wait_huge(vma, &hpte); return 0; } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) ret = VM_FAULT_HWPOISON_LARGE | diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 0f91be88392b..43e210181cce 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -424,6 +424,7 @@ struct queue_pages { unsigned long start; unsigned long end; struct vm_area_struct *first; + struct folio *last_folio; }; /* @@ -475,6 +476,7 @@ static int queue_folios_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr, flags = qp->flags; /* go to folio migration */ if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) { + qp->last_folio = folio; if (!vma_migratable(walk->vma) || migrate_folio_add(folio, qp->pagelist, flags)) { ret = 1; @@ -539,6 +541,8 @@ static int queue_folios_pte_range(pmd_t *pmd, unsigned long addr, break; } + qp->last_folio = folio; + /* * Do not abort immediately since there may be * temporary off LRU pages in the range. Still @@ -570,15 +574,22 @@ static int queue_folios_hugetlb(struct hugetlb_pte *hpte, spinlock_t *ptl; pte_t entry; - /* We don't migrate high-granularity HugeTLB mappings for now. */ - if (hugetlb_hgm_enabled(walk->vma)) - return -EINVAL; - ptl = hugetlb_pte_lock(hpte); entry = huge_ptep_get(hpte->ptep); if (!pte_present(entry)) goto unlock; - folio = pfn_folio(pte_pfn(entry)); + + if (!hugetlb_pte_present_leaf(hpte, entry)) { + ret = -EAGAIN; + goto unlock; + } + + folio = page_folio(pte_page(entry)); + + /* We already queued this page with another high-granularity PTE. 
*/ + if (folio == qp->last_folio) + goto unlock; + if (!queue_folio_required(folio, qp)) goto unlock; @@ -747,6 +758,7 @@ queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end, .start = start, .end = end, .first = NULL, + .last_folio = NULL, }; err = walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp); diff --git a/mm/migrate.c b/mm/migrate.c index 616afcc40fdc..b26169990532 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -196,6 +196,9 @@ static bool remove_migration_pte(struct folio *folio, /* pgoff is invalid for ksm pages, but they are never large */ if (folio_test_large(folio) && !folio_test_hugetlb(folio)) idx = linear_page_index(vma, pvmw.address) - pvmw.pgoff; + else if (folio_test_hugetlb(folio)) + idx = (pvmw.address & ~huge_page_mask(hstate_vma(vma)))/ + PAGE_SIZE; new = folio_page(folio, idx); #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION @@ -247,14 +250,16 @@ static bool remove_migration_pte(struct folio *folio, #ifdef CONFIG_HUGETLB_PAGE if (folio_test_hugetlb(folio)) { + struct page *hpage = folio_page(folio, 0); unsigned int shift = pvmw.pte_order + PAGE_SHIFT; pte = arch_make_huge_pte(pte, shift, vma->vm_flags); if (folio_test_anon(folio)) - hugepage_add_anon_rmap(new, vma, pvmw.address, + hugepage_add_anon_rmap(hpage, vma, pvmw.address, rmap_flags); else - page_add_file_rmap(new, vma, true); + hugetlb_add_file_rmap(new, shift, + hstate_vma(vma), vma); set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte); } else #endif @@ -270,7 +275,7 @@ static bool remove_migration_pte(struct folio *folio, mlock_drain_local(); trace_remove_migration_pte(pvmw.address, pte_val(pte), - compound_order(new)); + pvmw.pte_order); /* No need to invalidate - it was non-present before */ update_mmu_cache(vma, pvmw.address, pvmw.pte); @@ -361,12 +366,10 @@ void __migration_entry_wait_huge(struct vm_area_struct *vma, } } -void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte) +void migration_entry_wait_huge(struct vm_area_struct 
*vma, + struct hugetlb_pte *hpte) { - spinlock_t *ptl = huge_pte_lockptr(huge_page_shift(hstate_vma(vma)), - vma->vm_mm, pte); - - __migration_entry_wait_huge(vma, pte, ptl); + __migration_entry_wait_huge(vma, hpte->ptep, hpte->ptl); } #endif

From patchwork Sat Feb 18 00:28:04 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145397
Date: Sat, 18 Feb 2023 00:28:04 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-32-jthoughton@google.com>
Subject: [PATCH v2 31/46] hugetlb: sort hstates in hugetlb_init_hstates
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton

When using HugeTLB high-granularity mapping, we need to go through the supported hugepage sizes in decreasing order so that we pick the largest size that works. Consider the case where we're faulting in a 1G hugepage for the first time: we want hugetlb_fault/hugetlb_no_page to map it with a PUD. By going through the sizes in decreasing order, we will find that PUD_SIZE works before finding out that PMD_SIZE or PAGE_SIZE work too.

This commit also changes bootmem hugepages from storing hstate pointers directly to storing the hstate sizes. The hstate pointers used for boot-time-allocated hugepages become invalid after we sort the hstates. `gather_bootmem_prealloc`, called after the hstates have been sorted, now converts the size to the correct hstate.
Signed-off-by: James Houghton diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 2fe1eb6897d4..a344f9d9eba1 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -766,7 +766,7 @@ struct hstate { struct huge_bootmem_page { struct list_head list; - struct hstate *hstate; + unsigned long hstate_sz; }; int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list); diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 39f541b4a0a8..e20df8f6216e 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include @@ -49,6 +50,10 @@ int hugetlb_max_hstate __read_mostly; unsigned int default_hstate_idx; +/* + * After hugetlb_init_hstates is called, hstates will be sorted from largest + * to smallest. + */ struct hstate hstates[HUGE_MAX_HSTATE]; #ifdef CONFIG_CMA @@ -3464,7 +3469,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid) /* Put them into a private list first because mem_map is not up yet */ INIT_LIST_HEAD(&m->list); list_add(&m->list, &huge_boot_pages); - m->hstate = h; + m->hstate_sz = huge_page_size(h); return 1; } @@ -3479,7 +3484,7 @@ static void __init gather_bootmem_prealloc(void) list_for_each_entry(m, &huge_boot_pages, list) { struct page *page = virt_to_page(m); struct folio *folio = page_folio(page); - struct hstate *h = m->hstate; + struct hstate *h = size_to_hstate(m->hstate_sz); VM_BUG_ON(!hstate_is_gigantic(h)); WARN_ON(folio_ref_count(folio) != 1); @@ -3595,9 +3600,38 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h) kfree(node_alloc_noretry); } +static int compare_hstates_decreasing(const void *a, const void *b) +{ + unsigned long sz_a = huge_page_size((const struct hstate *)a); + unsigned long sz_b = huge_page_size((const struct hstate *)b); + + if (sz_a < sz_b) + return 1; + if (sz_a > sz_b) + return -1; + return 0; +} + +static void sort_hstates(void) +{ + unsigned long default_hstate_sz = huge_page_size(&default_hstate); + + /* 
Sort from largest to smallest. */ + sort(hstates, hugetlb_max_hstate, sizeof(*hstates), + compare_hstates_decreasing, NULL); + + /* + * We may have changed the location of the default hstate, so we need to + * update it. + */ + default_hstate_idx = hstate_index(size_to_hstate(default_hstate_sz)); + } + static void __init hugetlb_init_hstates(void) { - struct hstate *h, *h2; + struct hstate *h; + + sort_hstates(); for_each_hstate(h) { /* oversize hugepages were init'ed in early boot */ @@ -3616,13 +3650,8 @@ static void __init hugetlb_init_hstates(void) continue; if (hugetlb_cma_size && h->order <= HUGETLB_PAGE_ORDER) continue; - for_each_hstate(h2) { - if (h2 == h) - continue; - if (h2->order < h->order && - h2->order > h->demote_order) - h->demote_order = h2->order; - } + if (h - 1 >= &hstates[0]) + h->demote_order = huge_page_order(h - 1); } }

From patchwork Sat Feb 18 00:28:05 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145398
Date: Sat, 18 Feb 2023 00:28:05 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-33-jthoughton@google.com>
Subject: [PATCH v2 32/46] hugetlb: add for_each_hgm_shift
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
This is a helper macro to loop through all the usable page sizes for a high-granularity-enabled HugeTLB VMA. Given the VMA's hstate, it will loop, in descending order, through the page sizes that HugeTLB supports for this architecture. It always includes PAGE_SIZE.

This is done by looping through the hstates; however, there is no hstate for PAGE_SIZE. To handle this case, the loop intentionally goes out of bounds, and the out-of-bounds pointer is mapped to PAGE_SIZE.

Signed-off-by: James Houghton

diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e20df8f6216e..667e82b7a0ff 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -7941,6 +7941,24 @@ bool hugetlb_hgm_enabled(struct vm_area_struct *vma) { return vma && (vma->vm_flags & VM_HUGETLB_HGM); } +/* Should only be used by the for_each_hgm_shift macro. */ +static unsigned int __shift_for_hstate(struct hstate *h) +{ + /* If h is out of bounds, we have reached the end, so give PAGE_SIZE */ + if (h >= &hstates[hugetlb_max_hstate]) + return PAGE_SHIFT; + return huge_page_shift(h); +} + +/* + * Intentionally go out of bounds.
An out-of-bounds hstate will be converted to + PAGE_SIZE. + */ +#define for_each_hgm_shift(hstate, tmp_h, shift) \ + for ((tmp_h) = hstate; (shift) = __shift_for_hstate(tmp_h), \ + (tmp_h) <= &hstates[hugetlb_max_hstate]; \ + (tmp_h)++) + #endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */ /*

From patchwork Sat Feb 18 00:28:06 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145399
Date: Sat, 18 Feb 2023 00:28:06 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-34-jthoughton@google.com>
Subject: [PATCH v2 33/46] hugetlb: userfaultfd: add support for high-granularity UFFDIO_CONTINUE
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton

Changes here are similar to the changes made for hugetlb_no_page.

Pass vmf->real_address to userfaultfd_huge_must_wait because vmf->address may be rounded down to the hugepage size, and a high-granularity page table walk would look up the wrong PTE. Also change the call to userfaultfd_must_wait in the same way for consistency.

This commit introduces hugetlb_alloc_largest_pte, which is used to find the appropriate PTE size to map pages with UFFDIO_CONTINUE.

When MADV_SPLIT is provided, page fault events will report PAGE_SIZE-aligned addresses instead of huge_page_size(h)-aligned addresses, regardless of whether UFFD_FEATURE_EXACT_ADDRESS is used.
Signed-off-by: James Houghton diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 44d1ee429eb0..bb30001b63ba 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -252,17 +252,17 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx, unsigned long flags, unsigned long reason) { - pte_t *ptep, pte; + pte_t pte; bool ret = true; + struct hugetlb_pte hpte; mmap_assert_locked(ctx->mm); - ptep = hugetlb_walk(vma, address, vma_mmu_pagesize(vma)); - if (!ptep) + if (hugetlb_full_walk(&hpte, vma, address)) goto out; ret = false; - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(hpte.ptep); /* * Lockless access: we're in a wait_event so it's ok if it @@ -531,11 +531,11 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason) spin_unlock_irq(&ctx->fault_pending_wqh.lock); if (!is_vm_hugetlb_page(vma)) - must_wait = userfaultfd_must_wait(ctx, vmf->address, vmf->flags, - reason); + must_wait = userfaultfd_must_wait(ctx, vmf->real_address, + vmf->flags, reason); else must_wait = userfaultfd_huge_must_wait(ctx, vma, - vmf->address, + vmf->real_address, vmf->flags, reason); if (is_vm_hugetlb_page(vma)) hugetlb_vma_unlock_read(vma); diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index a344f9d9eba1..e0e51bb06112 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -201,7 +201,8 @@ unsigned long hugetlb_total_pages(void); vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, unsigned int flags); #ifdef CONFIG_USERFAULTFD -int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, pte_t *dst_pte, +int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, + struct hugetlb_pte *dst_hpte, struct vm_area_struct *dst_vma, unsigned long dst_addr, unsigned long src_addr, @@ -1272,16 +1273,31 @@ static inline enum hugetlb_level hpage_size_to_level(unsigned long sz) #ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING bool hugetlb_hgm_enabled(struct vm_area_struct *vma); +bool 
hugetlb_hgm_advised(struct vm_area_struct *vma); bool hugetlb_hgm_eligible(struct vm_area_struct *vma); +int hugetlb_alloc_largest_pte(struct hugetlb_pte *hpte, struct mm_struct *mm, + struct vm_area_struct *vma, unsigned long start, + unsigned long end); #else static inline bool hugetlb_hgm_enabled(struct vm_area_struct *vma) { return false; } +static inline bool hugetlb_hgm_advised(struct vm_area_struct *vma) +{ + return false; +} static inline bool hugetlb_hgm_eligible(struct vm_area_struct *vma) { return false; } +static inline +int hugetlb_alloc_largest_pte(struct hugetlb_pte *hpte, struct mm_struct *mm, + struct vm_area_struct *vma, unsigned long start, + unsigned long end) +{ + return -EINVAL; +} #endif static inline spinlock_t *huge_pte_lock(struct hstate *h, diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 667e82b7a0ff..a00b4ac07046 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6083,9 +6083,15 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_area_struct *vma, unsigned long reason) { u32 hash; + /* + * Don't use the hpage-aligned address if the user has explicitly + * enabled HGM. + */ + bool round_to_pagesize = hugetlb_hgm_advised(vma) && + reason == VM_UFFD_MINOR; struct vm_fault vmf = { .vma = vma, - .address = haddr, + .address = round_to_pagesize ? addr & PAGE_MASK : haddr, .real_address = addr, .flags = flags, @@ -6569,7 +6575,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, * modifications for huge pages. 
*/ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, - pte_t *dst_pte, + struct hugetlb_pte *dst_hpte, struct vm_area_struct *dst_vma, unsigned long dst_addr, unsigned long src_addr, @@ -6580,13 +6586,15 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, bool is_continue = (mode == MCOPY_ATOMIC_CONTINUE); struct hstate *h = hstate_vma(dst_vma); struct address_space *mapping = dst_vma->vm_file->f_mapping; - pgoff_t idx = vma_hugecache_offset(h, dst_vma, dst_addr); + unsigned long haddr = dst_addr & huge_page_mask(h); + pgoff_t idx = vma_hugecache_offset(h, dst_vma, haddr); unsigned long size; int vm_shared = dst_vma->vm_flags & VM_SHARED; pte_t _dst_pte; spinlock_t *ptl; int ret = -ENOMEM; struct folio *folio; + struct page *subpage; int writable; bool folio_in_pagecache = false; @@ -6601,12 +6609,12 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, * a non-missing case. Return -EEXIST. */ if (vm_shared && - hugetlbfs_pagecache_present(h, dst_vma, dst_addr)) { + hugetlbfs_pagecache_present(h, dst_vma, haddr)) { ret = -EEXIST; goto out; } - folio = alloc_hugetlb_folio(dst_vma, dst_addr, 0); + folio = alloc_hugetlb_folio(dst_vma, haddr, 0); if (IS_ERR(folio)) { ret = -ENOMEM; goto out; @@ -6622,13 +6630,13 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, /* Free the allocated folio which may have * consumed a reservation. */ - restore_reserve_on_error(h, dst_vma, dst_addr, folio); + restore_reserve_on_error(h, dst_vma, haddr, folio); folio_put(folio); /* Allocate a temporary folio to hold the copied * contents. 
*/ - folio = alloc_hugetlb_folio_vma(h, dst_vma, dst_addr); + folio = alloc_hugetlb_folio_vma(h, dst_vma, haddr); if (!folio) { ret = -ENOMEM; goto out; @@ -6642,14 +6650,14 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, } } else { if (vm_shared && - hugetlbfs_pagecache_present(h, dst_vma, dst_addr)) { + hugetlbfs_pagecache_present(h, dst_vma, haddr)) { put_page(*pagep); ret = -EEXIST; *pagep = NULL; goto out; } - folio = alloc_hugetlb_folio(dst_vma, dst_addr, 0); + folio = alloc_hugetlb_folio(dst_vma, haddr, 0); if (IS_ERR(folio)) { put_page(*pagep); ret = -ENOMEM; @@ -6697,7 +6705,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, folio_in_pagecache = true; } - ptl = huge_pte_lock(h, dst_mm, dst_pte); + ptl = hugetlb_pte_lock(dst_hpte); ret = -EIO; if (folio_test_hwpoison(folio)) @@ -6709,11 +6717,13 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, * page backing it, then access the page. */ ret = -EEXIST; - if (!huge_pte_none_mostly(huge_ptep_get(dst_pte))) + if (!huge_pte_none_mostly(huge_ptep_get(dst_hpte->ptep))) goto out_release_unlock; + subpage = hugetlb_find_subpage(h, folio, dst_addr); + if (folio_in_pagecache) - page_add_file_rmap(&folio->page, dst_vma, true); + hugetlb_add_file_rmap(subpage, dst_hpte->shift, h, dst_vma); else hugepage_add_new_anon_rmap(folio, dst_vma, dst_addr); @@ -6726,7 +6736,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, else writable = dst_vma->vm_flags & VM_WRITE; - _dst_pte = make_huge_pte(dst_vma, &folio->page, writable); + _dst_pte = make_huge_pte_with_shift(dst_vma, subpage, writable, + dst_hpte->shift); /* * Always mark UFFDIO_COPY page dirty; note that this may not be * extremely important for hugetlbfs for now since swapping is not @@ -6739,12 +6750,12 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, if (wp_copy) _dst_pte = huge_pte_mkuffd_wp(_dst_pte); - set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte); + set_huge_pte_at(dst_mm, dst_addr, dst_hpte->ptep, _dst_pte); - 
hugetlb_count_add(pages_per_huge_page(h), dst_mm); + hugetlb_count_add(hugetlb_pte_size(dst_hpte) / PAGE_SIZE, dst_mm); /* No need to invalidate - it was non-present before */ - update_mmu_cache(dst_vma, dst_addr, dst_pte); + update_mmu_cache(dst_vma, dst_addr, dst_hpte->ptep); spin_unlock(ptl); if (!is_continue) @@ -7941,6 +7952,18 @@ bool hugetlb_hgm_enabled(struct vm_area_struct *vma) { return vma && (vma->vm_flags & VM_HUGETLB_HGM); } +bool hugetlb_hgm_advised(struct vm_area_struct *vma) +{ + /* + * Right now, the only way for HGM to be enabled is if a user + * explicitly enables it via MADV_SPLIT, but in the future, there + * may be cases where it gets enabled automatically. + * + * Provide hugetlb_hgm_advised() now for call sites that care whether + * the user explicitly enabled HGM. + */ + return hugetlb_hgm_enabled(vma); +} /* Should only be used by the for_each_hgm_shift macro. */ static unsigned int __shift_for_hstate(struct hstate *h) { @@ -7959,6 +7982,38 @@ static unsigned int __shift_for_hstate(struct hstate *h) (tmp_h) <= &hstates[hugetlb_max_hstate]; \ (tmp_h)++) +/* + * Find the HugeTLB PTE that maps as much of [start, end) as possible with a + * single page table entry. It is returned in @hpte.
+ */ +int hugetlb_alloc_largest_pte(struct hugetlb_pte *hpte, struct mm_struct *mm, + struct vm_area_struct *vma, unsigned long start, + unsigned long end) +{ + struct hstate *h = hstate_vma(vma), *tmp_h; + unsigned int shift; + unsigned long sz; + int ret; + + for_each_hgm_shift(h, tmp_h, shift) { + sz = 1UL << shift; + + if (!IS_ALIGNED(start, sz) || start + sz > end) + continue; + goto found; + } + return -EINVAL; +found: + ret = hugetlb_full_walk_alloc(hpte, vma, start, sz); + if (ret) + return ret; + + if (hpte->shift > shift) + return -EEXIST; + + return 0; +} + #endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */ /* diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 53c3d916ff66..b56bc12f600e 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -320,14 +320,16 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, { int vm_shared = dst_vma->vm_flags & VM_SHARED; ssize_t err; - pte_t *dst_pte; unsigned long src_addr, dst_addr; long copied; struct page *page; - unsigned long vma_hpagesize; + unsigned long vma_hpagesize, target_pagesize; pgoff_t idx; u32 hash; struct address_space *mapping; + bool use_hgm = hugetlb_hgm_advised(dst_vma) && + mode == MCOPY_ATOMIC_CONTINUE; + struct hstate *h = hstate_vma(dst_vma); /* * There is no default zero huge page for all huge page sizes as @@ -345,12 +347,13 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, copied = 0; page = NULL; vma_hpagesize = vma_kernel_pagesize(dst_vma); + target_pagesize = use_hgm ? PAGE_SIZE : vma_hpagesize; /* - * Validate alignment based on huge page size + * Validate alignment based on the targeted page size. 
*/ err = -EINVAL; - if (dst_start & (vma_hpagesize - 1) || len & (vma_hpagesize - 1)) + if (dst_start & (target_pagesize - 1) || len & (target_pagesize - 1)) goto out_unlock; retry: @@ -381,13 +384,14 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, } while (src_addr < src_start + len) { + struct hugetlb_pte hpte; BUG_ON(dst_addr >= dst_start + len); /* * Serialize via vma_lock and hugetlb_fault_mutex. - * vma_lock ensures the dst_pte remains valid even - * in the case of shared pmds. fault mutex prevents - * races with other faulting threads. + * vma_lock ensures the hpte.ptep remains valid even + * in the case of shared pmds and page table collapsing. + * fault mutex prevents races with other faulting threads. */ idx = linear_page_index(dst_vma, dst_addr); mapping = dst_vma->vm_file->f_mapping; @@ -395,23 +399,28 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, mutex_lock(&hugetlb_fault_mutex_table[hash]); hugetlb_vma_lock_read(dst_vma); - err = -ENOMEM; - dst_pte = huge_pte_alloc(dst_mm, dst_vma, dst_addr, vma_hpagesize); - if (!dst_pte) { + if (use_hgm) + err = hugetlb_alloc_largest_pte(&hpte, dst_mm, dst_vma, + dst_addr, + dst_start + len); + else + err = hugetlb_full_walk_alloc(&hpte, dst_vma, dst_addr, + vma_hpagesize); + if (err) { hugetlb_vma_unlock_read(dst_vma); mutex_unlock(&hugetlb_fault_mutex_table[hash]); goto out_unlock; } if (mode != MCOPY_ATOMIC_CONTINUE && - !huge_pte_none_mostly(huge_ptep_get(dst_pte))) { + !huge_pte_none_mostly(huge_ptep_get(hpte.ptep))) { err = -EEXIST; hugetlb_vma_unlock_read(dst_vma); mutex_unlock(&hugetlb_fault_mutex_table[hash]); goto out_unlock; } - err = hugetlb_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma, + err = hugetlb_mcopy_atomic_pte(dst_mm, &hpte, dst_vma, dst_addr, src_addr, mode, &page, wp_copy); @@ -423,6 +432,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, if (unlikely(err == -ENOENT)) { 
mmap_read_unlock(dst_mm); BUG_ON(!page); + WARN_ON_ONCE(hpte.shift != huge_page_shift(h)); err = copy_huge_page_from_user(page, (const void __user *)src_addr, @@ -440,9 +450,9 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, BUG_ON(page); if (!err) { - dst_addr += vma_hpagesize; - src_addr += vma_hpagesize; - copied += vma_hpagesize; + dst_addr += hugetlb_pte_size(&hpte); + src_addr += hugetlb_pte_size(&hpte); + copied += hugetlb_pte_size(&hpte); if (fatal_signal_pending(current)) err = -EINTR; From patchwork Sat Feb 18 00:28:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13145400
Date: Sat, 18 Feb 2023 00:28:07 +0000 In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com> Mime-Version: 1.0 References: <20230218002819.1486479-1-jthoughton@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218002819.1486479-35-jthoughton@google.com> Subject: [PATCH v2 34/46] hugetlb: add MADV_COLLAPSE for hugetlb
From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu , Andrew Morton Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Frank van der Linden , Jiaqi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
This is a necessary extension to the UFFDIO_CONTINUE changes. When userspace finishes mapping an entire hugepage with UFFDIO_CONTINUE, the kernel has no mechanism to automatically collapse the page table to map the whole hugepage normally. We require userspace to inform us that they would like the mapping to be collapsed; they do this with MADV_COLLAPSE. If userspace has mapped only part of a hugepage with UFFDIO_CONTINUE, hugetlb_collapse will cause the requested range to be mapped as if it had already been UFFDIO_CONTINUE'd. The effects of any UFFDIO_WRITEPROTECT calls may be undone by a call to MADV_COLLAPSE for intersecting address ranges. This commit co-opts the madvise mode that was introduced to synchronously collapse THPs; the function that does THP collapsing has been renamed to madvise_collapse_thp. As with the rest of the high-granularity mapping support, MADV_COLLAPSE is only supported for shared VMAs right now. MADV_COLLAPSE for HugeTLB takes the mmap_lock for writing.
It is important that we check PageHWPoison before checking !HPageMigratable, as PageHWPoison implies !HPageMigratable. !PageHWPoison && !HPageMigratable means that the page has been isolated for migration. Signed-off-by: James Houghton diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 70bd867eba94..fa63a56ebaf0 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -218,9 +218,9 @@ void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud, int hugepage_madvise(struct vm_area_struct *vma, unsigned long *vm_flags, int advice); -int madvise_collapse(struct vm_area_struct *vma, - struct vm_area_struct **prev, - unsigned long start, unsigned long end); +int madvise_collapse_thp(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, unsigned long end); void vma_adjust_trans_huge(struct vm_area_struct *vma, unsigned long start, unsigned long end, long adjust_next); spinlock_t *__pmd_trans_huge_lock(pmd_t *pmd, struct vm_area_struct *vma); @@ -358,9 +358,9 @@ static inline int hugepage_madvise(struct vm_area_struct *vma, return -EINVAL; } -static inline int madvise_collapse(struct vm_area_struct *vma, - struct vm_area_struct **prev, - unsigned long start, unsigned long end) +static inline int madvise_collapse_thp(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, unsigned long end) { return -EINVAL; } diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index e0e51bb06112..6cd4ae08d84d 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -1278,6 +1278,8 @@ bool hugetlb_hgm_eligible(struct vm_area_struct *vma); int hugetlb_alloc_largest_pte(struct hugetlb_pte *hpte, struct mm_struct *mm, struct vm_area_struct *vma, unsigned long start, unsigned long end); +int hugetlb_collapse(struct mm_struct *mm, unsigned long start, + unsigned long end); #else static inline bool hugetlb_hgm_enabled(struct vm_area_struct *vma) { @@ -1298,6 +1300,12 @@ int 
hugetlb_alloc_largest_pte(struct hugetlb_pte *hpte, struct mm_struct *mm, { return -EINVAL; } +static inline +int hugetlb_collapse(struct mm_struct *mm, unsigned long start, + unsigned long end) +{ + return -EINVAL; +} #endif static inline spinlock_t *huge_pte_lock(struct hstate *h, diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a00b4ac07046..c4d189e5f1fd 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -8014,6 +8014,158 @@ int hugetlb_alloc_largest_pte(struct hugetlb_pte *hpte, struct mm_struct *mm, return 0; } +/* + * Collapse the address range from @start to @end to be mapped optimally. + * + * This is only valid for shared mappings. The main use case for this function + * is following UFFDIO_CONTINUE. If a user UFFDIO_CONTINUEs an entire hugepage + * by calling UFFDIO_CONTINUE once for each 4K region, the kernel doesn't know + * to collapse the mapping after the final UFFDIO_CONTINUE. Instead, we leave + * it up to userspace to tell us to do so, via MADV_COLLAPSE. + * + * Any holes in the mapping will be filled. If there is no page in the + * pagecache for a region we're collapsing, the PTEs will be cleared. + * + * If high-granularity PTEs are uffd-wp markers, those markers will be dropped. + */ +static int __hugetlb_collapse(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + struct hstate *h = hstate_vma(vma); + struct address_space *mapping = vma->vm_file->f_mapping; + struct mmu_notifier_range range; + struct mmu_gather tlb; + unsigned long curr = start; + int ret = 0; + struct folio *folio; + struct page *subpage; + pgoff_t idx; + bool writable = vma->vm_flags & VM_WRITE; + struct hugetlb_pte hpte; + pte_t entry; + spinlock_t *ptl; + + /* + * This is only supported for shared VMAs, because we need to look up + * the page to use for any PTEs we end up creating. + */ + if (!(vma->vm_flags & VM_MAYSHARE)) + return -EINVAL; + + /* If HGM is not enabled, there is nothing to collapse. 
*/ + if (!hugetlb_hgm_enabled(vma)) + return 0; + + tlb_gather_mmu(&tlb, mm); + + mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, start, end); + mmu_notifier_invalidate_range_start(&range); + + while (curr < end) { + ret = hugetlb_alloc_largest_pte(&hpte, mm, vma, curr, end); + if (ret) + goto out; + + entry = huge_ptep_get(hpte.ptep); + + /* + * There is no work to do if the PTE doesn't point to page + * tables. + */ + if (!pte_present(entry)) + goto next_hpte; + if (hugetlb_pte_present_leaf(&hpte, entry)) + goto next_hpte; + + idx = vma_hugecache_offset(h, vma, curr); + folio = filemap_get_folio(mapping, idx); + + if (folio && folio_test_hwpoison(folio)) { + /* + * Don't collapse a mapping to a page that is + * hwpoisoned. The entire page will be poisoned. + * + * When HugeTLB supports poisoning PAGE_SIZE bits of + * the hugepage, the logic here can be improved. + * + * Skip this page, and continue to collapse the rest + * of the mapping. + */ + folio_put(folio); + curr = (curr & huge_page_mask(h)) + huge_page_size(h); + continue; + } + + if (folio && !folio_test_hugetlb_migratable(folio)) { + /* + * Don't collapse a mapping to a page that is pending + * a migration. Migration swap entries may have been + * placed in the page table. + */ + ret = -EBUSY; + folio_put(folio); + goto out; + } + + /* + * Clear all the PTEs, and drop ref/mapcounts + * (on tlb_finish_mmu). + */ + __unmap_hugepage_range(&tlb, vma, curr, + curr + hugetlb_pte_size(&hpte), + NULL, + ZAP_FLAG_DROP_MARKER); + /* Free the PTEs.
*/ + hugetlb_free_pgd_range(&tlb, + curr, curr + hugetlb_pte_size(&hpte), + curr, curr + hugetlb_pte_size(&hpte)); + + ptl = hugetlb_pte_lock(&hpte); + + if (!folio) { + huge_pte_clear(mm, curr, hpte.ptep, + hugetlb_pte_size(&hpte)); + spin_unlock(ptl); + goto next_hpte; + } + + subpage = hugetlb_find_subpage(h, folio, curr); + entry = make_huge_pte_with_shift(vma, subpage, + writable, hpte.shift); + hugetlb_add_file_rmap(subpage, hpte.shift, h, vma); + set_huge_pte_at(mm, curr, hpte.ptep, entry); + spin_unlock(ptl); +next_hpte: + curr += hugetlb_pte_size(&hpte); + } +out: + mmu_notifier_invalidate_range_end(&range); + tlb_finish_mmu(&tlb); + + return ret; +} + +int hugetlb_collapse(struct mm_struct *mm, unsigned long start, + unsigned long end) +{ + int ret = 0; + struct vm_area_struct *vma; + + mmap_write_lock(mm); + while (start < end && !ret) { + vma = find_vma(mm, start); + if (!vma || !is_vm_hugetlb_page(vma)) { + ret = -EINVAL; + break; + } + ret = __hugetlb_collapse(mm, vma, start, + end < vma->vm_end ?
end : vma->vm_end); + start = vma->vm_end; + } + mmap_write_unlock(mm); + return ret; +} + #endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */ /* diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 8dbc39896811..58cda5020537 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -2750,8 +2750,8 @@ static int madvise_collapse_errno(enum scan_result r) } } -int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev, - unsigned long start, unsigned long end) +int madvise_collapse_thp(struct vm_area_struct *vma, struct vm_area_struct **prev, + unsigned long start, unsigned long end) { struct collapse_control *cc; struct mm_struct *mm = vma->vm_mm; diff --git a/mm/madvise.c b/mm/madvise.c index 8c004c678262..e121d135252a 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -1028,6 +1028,24 @@ static int madvise_split(struct vm_area_struct *vma, #endif } +static int madvise_collapse(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, unsigned long end) +{ + if (is_vm_hugetlb_page(vma)) { + struct mm_struct *mm = vma->vm_mm; + int ret; + + *prev = NULL; /* tell sys_madvise we dropped the mmap lock */ + mmap_read_unlock(mm); + ret = hugetlb_collapse(mm, start, end); + mmap_read_lock(mm); + return ret; + } + + return madvise_collapse_thp(vma, prev, start, end); +} + /* * Apply an madvise behavior to a region of a vma. 
madvise_update_vma * will handle splitting a vm area into separate areas, each area with its own @@ -1204,6 +1222,9 @@ madvise_behavior_valid(int behavior) #ifdef CONFIG_TRANSPARENT_HUGEPAGE case MADV_HUGEPAGE: case MADV_NOHUGEPAGE: +#endif +#if defined(CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING) || \ + defined(CONFIG_TRANSPARENT_HUGEPAGE) case MADV_COLLAPSE: #endif #ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING @@ -1397,7 +1418,8 @@ int madvise_set_anon_name(struct mm_struct *mm, unsigned long start, * MADV_NOHUGEPAGE - mark the given range as not worth being backed by * transparent huge pages so the existing pages will not be * coalesced into THP and new pages will not be allocated as THP. - * MADV_COLLAPSE - synchronously coalesce pages into new THP. + * MADV_COLLAPSE - synchronously coalesce pages into new THP, or, for HugeTLB + * pages, collapse the mapping. * MADV_SPLIT - allow HugeTLB pages to be mapped at PAGE_SIZE. This allows * UFFDIO_CONTINUE to accept PAGE_SIZE-aligned regions. 
* MADV_DONTDUMP - the application wants to prevent pages in the given range From patchwork Sat Feb 18 00:28:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13145401
Date: Sat, 18 Feb 2023 00:28:08 +0000 In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com> Mime-Version: 1.0 References: <20230218002819.1486479-1-jthoughton@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218002819.1486479-36-jthoughton@google.com> Subject: [PATCH v2 35/46] hugetlb: add check to prevent refcount overflow via HGM From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu , Andrew Morton Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Frank van der Linden , Jiaqi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
With high-granularity mappings, it becomes quite trivial for userspace to overflow a page's refcount or mapcount. It can be done like so:

1. Create a 1G hugetlbfs file with a single 1G page.
2. Create 8192 mappings of the file.
3. Use UFFDIO_CONTINUE to map every mapping entirely at 4K.

Each time step 3 is done for a mapping, the refcount and mapcount increase by 2^18 (512 * 512). Do that 2^13 (8192) times, and they reach 2^31.

To avoid this, WARN_ON_ONCE when the refcount goes negative. If this happens as the result of a page fault, return VM_FAULT_SIGBUS; if it happens as the result of a UFFDIO_CONTINUE, return -EFAULT.

We can also create too many mappings by fork()ing a lot with VMAs set up such that page tables must be copied at fork() time (e.g. if we have VM_UFFD_WP). Use try_get_page() in copy_hugetlb_page_range() to deal with this.

Signed-off-by: James Houghton

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c4d189e5f1fd..34368072dabe 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5397,7 +5397,10 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 		} else {
 			ptepage = pte_page(entry);
 			hpage = compound_head(ptepage);
-			get_page(hpage);
+			if (!try_get_page(hpage)) {
+				ret = -EFAULT;
+				break;
+			}

 			/*
 			 * Failing to duplicate the anon rmap is a rare case
@@ -6132,6 +6135,30 @@ static bool hugetlb_pte_stable(struct hstate *h, struct hugetlb_pte *hpte,
 	return same;
 }

+/*
+ * Like filemap_lock_folio, but check the refcount of the page afterwards to
+ * check if we are at risk of overflowing refcount back to 0.
+ *
+ * This should be used in places where refcount can easily be overflowed,
+ * like places that create high-granularity mappings.
+ */
+static struct folio *hugetlb_try_find_lock_folio(struct address_space *mapping,
+						 pgoff_t idx)
+{
+	struct folio *folio = filemap_lock_folio(mapping, idx);
+
+	/*
+	 * This check is very similar to the one in try_get_page().
+	 *
+	 * This check is inherently racy, so WARN_ON_ONCE() if this condition
+	 * ever occurs.
+	 */
+	if (WARN_ON_ONCE(folio && folio_ref_count(folio) <= 0))
+		return ERR_PTR(-EFAULT);
+
+	return folio;
+}
+
 static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 			struct vm_area_struct *vma,
 			struct address_space *mapping, pgoff_t idx,
@@ -6168,7 +6195,15 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	 * before we get page_table_lock.
 	 */
 	new_folio = false;
-	folio = filemap_lock_folio(mapping, idx);
+	folio = hugetlb_try_find_lock_folio(mapping, idx);
+	if (IS_ERR(folio)) {
+		/*
+		 * We don't want to invoke the OOM killer here, as we aren't
+		 * actually OOMing.
+		 */
+		ret = VM_FAULT_SIGBUS;
+		goto out;
+	}
 	if (!folio) {
 		size = i_size_read(mapping->host) >> huge_page_shift(h);
 		if (idx >= size)
@@ -6600,8 +6635,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,

 	if (is_continue) {
 		ret = -EFAULT;
-		folio = filemap_lock_folio(mapping, idx);
-		if (!folio)
+		folio = hugetlb_try_find_lock_folio(mapping, idx);
+		if (IS_ERR_OR_NULL(folio))
 			goto out;
 		folio_in_pagecache = true;
 	} else if (!*pagep) {

From patchwork Sat Feb 18 00:28:09 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145402
Date: Sat, 18 Feb 2023 00:28:09 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-37-jthoughton@google.com>
Subject: [PATCH v2 36/46] hugetlb: remove huge_pte_lock and huge_pte_lockptr
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org

They are replaced with hugetlb_pte_lock{,ptr}. Any callers that have not already been converted are never reached when HGM is in use, so we handle them by populating their hugetlb_ptes with the standard, hstate-sized huge PTEs.
Signed-off-by: James Houghton

diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 035a0df47af0..c90ac06dc8d9 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -258,11 +258,14 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,

 #ifdef CONFIG_PPC_BOOK3S_64
 	struct hstate *h = hstate_vma(vma);
+	struct hugetlb_pte hpte;

 	psize = hstate_get_psize(h);
 #ifdef CONFIG_DEBUG_VM
-	assert_spin_locked(huge_pte_lockptr(huge_page_shift(h),
-					    vma->vm_mm, ptep));
+	/* HGM is not supported for powerpc yet. */
+	hugetlb_pte_init(vma->vm_mm, &hpte, ptep, huge_page_shift(h),
+			 hpage_size_to_level(psize));
+	assert_spin_locked(hpte.ptl);
 #endif

 #else
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 6cd4ae08d84d..742e7f2cb170 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1012,14 +1012,6 @@ static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask)
 	return modified_mask;
 }

-static inline spinlock_t *huge_pte_lockptr(unsigned int shift,
-					   struct mm_struct *mm, pte_t *pte)
-{
-	if (shift == PMD_SHIFT)
-		return pmd_lockptr(mm, (pmd_t *) pte);
-	return &mm->page_table_lock;
-}
-
 #ifndef hugepages_supported
 /*
  * Some platform decide whether they support huge pages at boot
@@ -1228,12 +1220,6 @@ static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask)
 	return 0;
 }

-static inline spinlock_t *huge_pte_lockptr(unsigned int shift,
-					   struct mm_struct *mm, pte_t *pte)
-{
-	return &mm->page_table_lock;
-}
-
 static inline void hugetlb_count_init(struct mm_struct *mm)
 {
 }
@@ -1308,16 +1294,6 @@ int hugetlb_collapse(struct mm_struct *mm, unsigned long start,
 }
 #endif

-static inline spinlock_t *huge_pte_lock(struct hstate *h,
-					struct mm_struct *mm, pte_t *pte)
-{
-	spinlock_t *ptl;
-
-	ptl = huge_pte_lockptr(huge_page_shift(h), mm, pte);
-	spin_lock(ptl);
-	return ptl;
-}
-
 static inline
 spinlock_t *hugetlb_pte_lockptr(struct hugetlb_pte *hpte)
 {
@@ -1353,8 +1329,22 @@ void hugetlb_pte_init(struct mm_struct *mm, struct hugetlb_pte *hpte,
 		      pte_t *ptep, unsigned int shift,
 		      enum hugetlb_level level)
 {
-	__hugetlb_pte_init(hpte, ptep, shift, level,
-			   huge_pte_lockptr(shift, mm, ptep));
+	spinlock_t *ptl;
+
+	/*
+	 * For contiguous HugeTLB PTEs that can contain other HugeTLB PTEs
+	 * on the same level, the same PTL for both must be used.
+	 *
+	 * For some architectures that implement hugetlb_walk_step, this
+	 * version of hugetlb_pte_init() may not be correct to use for
+	 * high-granularity PTEs. Instead, call __hugetlb_pte_init()
+	 * directly.
+	 */
+	if (level == HUGETLB_LEVEL_PMD)
+		ptl = pmd_lockptr(mm, (pmd_t *) ptep);
+	else
+		ptl = &mm->page_table_lock;
+	__hugetlb_pte_init(hpte, ptep, shift, level, ptl);
 }

 #if defined(CONFIG_HUGETLB_PAGE) && defined(CONFIG_CMA)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 34368072dabe..e0a92e7c1755 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5454,9 +5454,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 			put_page(hpage);

 			/* Install the new hugetlb folio if src pte stable */
-			dst_ptl = huge_pte_lock(h, dst, dst_pte);
-			src_ptl = huge_pte_lockptr(huge_page_shift(h),
-						   src, src_pte);
+			dst_ptl = hugetlb_pte_lock(&dst_hpte);
+			src_ptl = hugetlb_pte_lockptr(&src_hpte);
 			spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
 			entry = huge_ptep_get(src_pte);
 			if (!pte_same(src_pte_old, entry)) {
@@ -7582,7 +7581,8 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 	unsigned long saddr;
 	pte_t *spte = NULL;
 	pte_t *pte;
-	spinlock_t *ptl;
+	struct hugetlb_pte hpte;
+	struct hstate *shstate;

 	i_mmap_lock_read(mapping);
 	vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
@@ -7603,7 +7603,11 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 	if (!spte)
 		goto out;

-	ptl = huge_pte_lock(hstate_vma(vma), mm, spte);
+	shstate = hstate_vma(svma);
+
+	hugetlb_pte_init(mm, &hpte, spte, huge_page_shift(shstate),
+			 hpage_size_to_level(huge_page_size(shstate)));
+	spin_lock(hpte.ptl);
 	if (pud_none(*pud)) {
 		pud_populate(mm, pud,
 			     (pmd_t *)((unsigned long)spte & PAGE_MASK));
@@ -7611,7 +7615,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 	} else {
 		put_page(virt_to_page(spte));
 	}
-	spin_unlock(ptl);
+	spin_unlock(hpte.ptl);
 out:
 	pte = (pte_t *)pmd_alloc(mm, pud, addr);
 	i_mmap_unlock_read(mapping);
@@ -8315,6 +8319,7 @@ static void hugetlb_unshare_pmds(struct vm_area_struct *vma,
 	unsigned long address;
 	spinlock_t *ptl;
 	pte_t *ptep;
+	struct hugetlb_pte hpte;

 	if (!(vma->vm_flags & VM_MAYSHARE))
 		return;
@@ -8336,7 +8341,10 @@ static void hugetlb_unshare_pmds(struct vm_area_struct *vma,
 		ptep = hugetlb_walk(vma, address, sz);
 		if (!ptep)
 			continue;
-		ptl = huge_pte_lock(h, mm, ptep);
+
+		hugetlb_pte_init(mm, &hpte, ptep, huge_page_shift(h),
+				 hpage_size_to_level(sz));
+		ptl = hugetlb_pte_lock(&hpte);
 		huge_pmd_unshare(mm, vma, address, ptep);
 		spin_unlock(ptl);
 	}

From patchwork Sat Feb 18 00:28:10 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145403
Date: Sat, 18 Feb 2023 00:28:10 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-38-jthoughton@google.com>
Subject: [PATCH v2 37/46] hugetlb: replace make_huge_pte with make_huge_pte_with_shift
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org

This removes the old definition of make_huge_pte and renames make_huge_pte_with_shift to make_huge_pte, so the shift must now always be given explicitly. All callsites are cleaned up.
Signed-off-by: James Houghton

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e0a92e7c1755..4c9b3c5379b2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5204,9 +5204,9 @@ const struct vm_operations_struct hugetlb_vm_ops = {
 	.pagesize = hugetlb_vm_op_pagesize,
 };

-static pte_t make_huge_pte_with_shift(struct vm_area_struct *vma,
-				      struct page *page, int writable,
-				      int shift)
+static pte_t make_huge_pte(struct vm_area_struct *vma,
+			   struct page *page, int writable,
+			   int shift)
 {
 	pte_t entry;

@@ -5222,14 +5222,6 @@ static pte_t make_huge_pte_with_shift(struct vm_area_struct *vma,
 	return entry;
 }

-static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page,
-			   int writable)
-{
-	unsigned int shift = huge_page_shift(hstate_vma(vma));
-
-	return make_huge_pte_with_shift(vma, page, writable, shift);
-}
-
 static void set_huge_ptep_writable(struct vm_area_struct *vma,
 				   unsigned long address, pte_t *ptep)
 {
@@ -5272,7 +5264,9 @@ hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long add
 {
 	__folio_mark_uptodate(new_folio);
 	hugepage_add_new_anon_rmap(new_folio, vma, addr);
-	set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, &new_folio->page, 1));
+	set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(
+				vma, &new_folio->page, 1,
+				huge_page_shift(hstate_vma(vma))));
 	hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
 	folio_set_hugetlb_migratable(new_folio);
 }
@@ -6006,7 +6000,8 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 		hugetlb_remove_rmap(old_page, huge_page_shift(h), h, vma);
 		hugepage_add_new_anon_rmap(new_folio, vma, haddr);
 		set_huge_pte_at(mm, haddr, ptep,
-				make_huge_pte(vma, &new_folio->page, !unshare));
+				make_huge_pte(vma, &new_folio->page, !unshare,
+					      huge_page_shift(h)));
 		folio_set_hugetlb_migratable(new_folio);
 		/* Make the old page be freed below */
 		new_folio = page_folio(old_page);
@@ -6348,7 +6343,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	else
 		hugetlb_add_file_rmap(subpage, hpte->shift, h, vma);

-	new_pte = make_huge_pte_with_shift(vma, subpage,
+	new_pte = make_huge_pte(vma, subpage,
 			((vma->vm_flags & VM_WRITE)
 			 && (vma->vm_flags & VM_SHARED)),
 			hpte->shift);
@@ -6770,8 +6765,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 	else
 		writable = dst_vma->vm_flags & VM_WRITE;

-	_dst_pte = make_huge_pte_with_shift(dst_vma, subpage, writable,
-					    dst_hpte->shift);
+	_dst_pte = make_huge_pte(dst_vma, subpage, writable, dst_hpte->shift);
 	/*
 	 * Always mark UFFDIO_COPY page dirty; note that this may not be
 	 * extremely important for hugetlbfs for now since swapping is not
@@ -8169,8 +8163,7 @@ static int __hugetlb_collapse(struct mm_struct *mm, struct vm_area_struct *vma,
 	}

 	subpage = hugetlb_find_subpage(h, folio, curr);
-	entry = make_huge_pte_with_shift(vma, subpage,
-					 writable, hpte.shift);
+	entry = make_huge_pte(vma, subpage, writable, hpte.shift);
 	hugetlb_add_file_rmap(subpage, hpte.shift, h, vma);
 	set_huge_pte_at(mm, curr, hpte.ptep, entry);
 	spin_unlock(ptl);

From patchwork Sat Feb 18 00:28:11 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145404
kanga.kvack.org (Postfix) with ESMTP id 1A28C280002 for ; Fri, 17 Feb 2023 19:29:25 -0500 (EST) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id F13331405E5 for ; Sat, 18 Feb 2023 00:29:24 +0000 (UTC) X-FDA: 80478528648.20.605552F Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) by imf23.hostedemail.com (Postfix) with ESMTP id 37895140005 for ; Sat, 18 Feb 2023 00:29:23 +0000 (UTC) Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=eXxtIitd; spf=pass (imf23.hostedemail.com: domain of 34hvwYwoKCAEkuipvhiupohpphmf.dpnmjovy-nnlwbdl.psh@flex--jthoughton.bounces.google.com designates 209.85.219.201 as permitted sender) smtp.mailfrom=34hvwYwoKCAEkuipvhiupohpphmf.dpnmjovy-nnlwbdl.psh@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1676680163; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=ep+t2caOQSahkPTOUAWt22QPS6ka0wL1iktlvruOFtk=; b=XAKdgPqjYrierxcPFbKjlAnxEf0XU1oY5q2Q0pgKJIA6P3I94HVer9qVD1CCm7zSIAV0l0 +45HHOfgfVyS4oyojJlaIb4zHw4Sa9yDbITCLREx5iycDPOTc53jDV2qIFyP9sVTwivJrd Y6CVSTdn1aNyAH0nuEqLjiIrpCF/LBw= ARC-Authentication-Results: i=1; imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=eXxtIitd; spf=pass (imf23.hostedemail.com: domain of 34hvwYwoKCAEkuipvhiupohpphmf.dpnmjovy-nnlwbdl.psh@flex--jthoughton.bounces.google.com designates 209.85.219.201 as permitted sender) smtp.mailfrom=34hvwYwoKCAEkuipvhiupohpphmf.dpnmjovy-nnlwbdl.psh@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; 
d=hostedemail.com; t=1676680163; a=rsa-sha256; cv=none; b=6l7h+A0bHew8+8k4XVcFwbA2+GN45eB6WaIcLplo8T6V1jT/zHDwToEc2uzicPD4sy0AYA iLJ30fZAJNJOmbLm2ucTk6OElEzcc4XjPHY10pn4tMDc6CqTbZHGi1y93nxvAwSAIQ5ik9 0ZQQ6D6rIfPVU7Za6A06pvTRBeKWs7A= Received: by mail-yb1-f201.google.com with SMTP id e83-20020a25e756000000b0086349255277so2437406ybh.8 for ; Fri, 17 Feb 2023 16:29:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ep+t2caOQSahkPTOUAWt22QPS6ka0wL1iktlvruOFtk=; b=eXxtIitdVZvzFBf04daw9PPgRtXdmihKwPkQjcig6vMYYMtp+LOPKmFy/GNPGDZBuZ hUMbzRNTCrNmLeDYHlIEgRctvtpQ9FpU/o7UPh1ZIp/HRvg6DXU0nPANTJe2xRrjJGwp WbMuaqtCTStJZiCZ1A+9pK76quqdRz7D7vwYcnbCxzAH8ZqKG92mdTDkW+Ff5KgL16Oc bfQ+hpZtyfxk3DBK5nZlhD8pUxO/J77oxCvJiRMYzZvWPl+L1gHyATsIUcMQzvI17Xkm FGAjE5aD9xBtLlj157PrQ56C2U6XXydzea3z40X2BAfKNbG/jjnHVSZCkfPzbPOV1U8m sBLQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ep+t2caOQSahkPTOUAWt22QPS6ka0wL1iktlvruOFtk=; b=u11+o3EBurE+lbT+6D32opfrzR2pozQtbasdD9MrH4qqtZ3v6VoE8t2d/A9RgQwMEB Jsa9lA4vxD2Xf/Fu/r5k4hTsDp6LKaSjinDCP2/oBg0xg6FornLQ9O6xp4nCYQhKT9gG blg2jTdehDcoHN+H+mZ0CPeOYGRlU9zDAg1tGZZlPSDUQeRHKS5lz+21re+nxji9LRg3 /Py/OZYLLNgeUySNfiDtRQgpU+TD5A5rO0GkF9CSpukzJVhBcODo5zr6Hde+QJ35jgRt tJODvZCSGIYCeY+ASlQc4CBrf+bb+TGLaP5KxnteIi17Ld96Jmk95+VMmHKi+D3ZsQOw Lcjw== X-Gm-Message-State: AO0yUKX4ga4P54C38PeokyfyBgFriO0O9/skExy3QCn9zQ1M79iKZ7tJ mPHI4upJHsmVdzigphioyGc9Tin6UPB5J3EQ X-Google-Smtp-Source: AK7set+tBwsrzsGfYGkr6mqGYtCrd2pD30kKOdycyIYEofwc08SH08MLRIHONTv2mYLxvGH/o6jI+3RKVfH+ffxg X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a5b:d4b:0:b0:8da:6dc5:ca06 with SMTP id 
Date: Sat, 18 Feb 2023 00:28:11 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20230218002819.1486479-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog
Message-ID: <20230218002819.1486479-39-jthoughton@google.com>
Subject: [PATCH v2 38/46] mm: smaps: add stats for HugeTLB mapping size
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
When the kernel is compiled with CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING,
smaps may provide HugetlbPudMapped, HugetlbPmdMapped, and HugetlbPteMapped.
Levels that are folded are not reported.
Signed-off-by: James Houghton

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 2f293b5dabc0..1ced7300f8cd 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -412,6 +412,15 @@ struct mem_size_stats {
 	unsigned long swap;
 	unsigned long shared_hugetlb;
 	unsigned long private_hugetlb;
+#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
+#ifndef __PAGETABLE_PUD_FOLDED
+	unsigned long hugetlb_pud_mapped;
+#endif
+#ifndef __PAGETABLE_PMD_FOLDED
+	unsigned long hugetlb_pmd_mapped;
+#endif
+	unsigned long hugetlb_pte_mapped;
+#endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */
 	u64 pss;
 	u64 pss_anon;
 	u64 pss_file;
@@ -731,6 +740,33 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 }
 
 #ifdef CONFIG_HUGETLB_PAGE
+
+#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
+static void smaps_hugetlb_hgm_account(struct mem_size_stats *mss,
+				      struct hugetlb_pte *hpte)
+{
+	unsigned long size = hugetlb_pte_size(hpte);
+
+	switch (hpte->level) {
+#ifndef __PAGETABLE_PUD_FOLDED
+	case HUGETLB_LEVEL_PUD:
+		mss->hugetlb_pud_mapped += size;
+		break;
+#endif
+#ifndef __PAGETABLE_PMD_FOLDED
+	case HUGETLB_LEVEL_PMD:
+		mss->hugetlb_pmd_mapped += size;
+		break;
+#endif
+	case HUGETLB_LEVEL_PTE:
+		mss->hugetlb_pte_mapped += size;
+		break;
+	default:
+		break;
+	}
+}
+#endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */
+
 static int smaps_hugetlb_range(struct hugetlb_pte *hpte,
 			       unsigned long addr,
 			       struct mm_walk *walk)
@@ -764,6 +800,9 @@ static int smaps_hugetlb_range(struct hugetlb_pte *hpte,
 			mss->shared_hugetlb += sz;
 		else
 			mss->private_hugetlb += sz;
+#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
+		smaps_hugetlb_hgm_account(mss, hpte);
+#endif
 	}
 	return 0;
 }
@@ -833,38 +872,47 @@ static void smap_gather_stats(struct vm_area_struct *vma,
 static void __show_smap(struct seq_file *m, const struct mem_size_stats *mss,
 		bool rollup_mode)
 {
-	SEQ_PUT_DEC("Rss: ", mss->resident);
-	SEQ_PUT_DEC(" kB\nPss: ", mss->pss >> PSS_SHIFT);
-	SEQ_PUT_DEC(" kB\nPss_Dirty: ",
-		    mss->pss_dirty >> PSS_SHIFT);
+	SEQ_PUT_DEC("Rss: ", mss->resident);
+	SEQ_PUT_DEC(" kB\nPss: ", mss->pss >> PSS_SHIFT);
+	SEQ_PUT_DEC(" kB\nPss_Dirty: ", mss->pss_dirty >> PSS_SHIFT);
 	if (rollup_mode) {
 		/*
 		 * These are meaningful only for smaps_rollup, otherwise two of
 		 * them are zero, and the other one is the same as Pss.
 		 */
-		SEQ_PUT_DEC(" kB\nPss_Anon: ",
+		SEQ_PUT_DEC(" kB\nPss_Anon: ",
 			mss->pss_anon >> PSS_SHIFT);
-		SEQ_PUT_DEC(" kB\nPss_File: ",
+		SEQ_PUT_DEC(" kB\nPss_File: ",
 			mss->pss_file >> PSS_SHIFT);
-		SEQ_PUT_DEC(" kB\nPss_Shmem: ",
+		SEQ_PUT_DEC(" kB\nPss_Shmem: ",
 			mss->pss_shmem >> PSS_SHIFT);
 	}
-	SEQ_PUT_DEC(" kB\nShared_Clean: ", mss->shared_clean);
-	SEQ_PUT_DEC(" kB\nShared_Dirty: ", mss->shared_dirty);
-	SEQ_PUT_DEC(" kB\nPrivate_Clean: ", mss->private_clean);
-	SEQ_PUT_DEC(" kB\nPrivate_Dirty: ", mss->private_dirty);
-	SEQ_PUT_DEC(" kB\nReferenced: ", mss->referenced);
-	SEQ_PUT_DEC(" kB\nAnonymous: ", mss->anonymous);
-	SEQ_PUT_DEC(" kB\nLazyFree: ", mss->lazyfree);
-	SEQ_PUT_DEC(" kB\nAnonHugePages: ", mss->anonymous_thp);
-	SEQ_PUT_DEC(" kB\nShmemPmdMapped: ", mss->shmem_thp);
-	SEQ_PUT_DEC(" kB\nFilePmdMapped: ", mss->file_thp);
-	SEQ_PUT_DEC(" kB\nShared_Hugetlb: ", mss->shared_hugetlb);
-	seq_put_decimal_ull_width(m, " kB\nPrivate_Hugetlb: ",
+	SEQ_PUT_DEC(" kB\nShared_Clean: ", mss->shared_clean);
+	SEQ_PUT_DEC(" kB\nShared_Dirty: ", mss->shared_dirty);
+	SEQ_PUT_DEC(" kB\nPrivate_Clean: ", mss->private_clean);
+	SEQ_PUT_DEC(" kB\nPrivate_Dirty: ", mss->private_dirty);
+	SEQ_PUT_DEC(" kB\nReferenced: ", mss->referenced);
+	SEQ_PUT_DEC(" kB\nAnonymous: ", mss->anonymous);
+	SEQ_PUT_DEC(" kB\nLazyFree: ", mss->lazyfree);
+	SEQ_PUT_DEC(" kB\nAnonHugePages: ", mss->anonymous_thp);
+	SEQ_PUT_DEC(" kB\nShmemPmdMapped: ", mss->shmem_thp);
+	SEQ_PUT_DEC(" kB\nFilePmdMapped: ", mss->file_thp);
+	SEQ_PUT_DEC(" kB\nShared_Hugetlb: ", mss->shared_hugetlb);
+	seq_put_decimal_ull_width(m, " kB\nPrivate_Hugetlb: ",
 			mss->private_hugetlb >> 10, 7);
-	SEQ_PUT_DEC(" kB\nSwap: ", mss->swap);
-	SEQ_PUT_DEC(" kB\nSwapPss: ",
+#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
+#ifndef __PAGETABLE_PUD_FOLDED
+	SEQ_PUT_DEC(" kB\nHugetlbPudMapped: ", mss->hugetlb_pud_mapped);
+#endif
+#ifndef __PAGETABLE_PMD_FOLDED
+	SEQ_PUT_DEC(" kB\nHugetlbPmdMapped: ", mss->hugetlb_pmd_mapped);
+#endif
+	SEQ_PUT_DEC(" kB\nHugetlbPteMapped: ", mss->hugetlb_pte_mapped);
+#endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */
+	SEQ_PUT_DEC(" kB\nSwap: ", mss->swap);
+	SEQ_PUT_DEC(" kB\nSwapPss: ",
 		    mss->swap_pss >> PSS_SHIFT);
-	SEQ_PUT_DEC(" kB\nLocked: ",
+	SEQ_PUT_DEC(" kB\nLocked: ",
 		    mss->pss_locked >> PSS_SHIFT);
 	seq_puts(m, " kB\n");
 }
@@ -880,18 +928,18 @@ static int show_smap(struct seq_file *m, void *v)
 
 	show_map_vma(m, vma);
 
-	SEQ_PUT_DEC("Size: ", vma->vm_end - vma->vm_start);
-	SEQ_PUT_DEC(" kB\nKernelPageSize: ", vma_kernel_pagesize(vma));
-	SEQ_PUT_DEC(" kB\nMMUPageSize: ", vma_mmu_pagesize(vma));
+	SEQ_PUT_DEC("Size: ", vma->vm_end - vma->vm_start);
+	SEQ_PUT_DEC(" kB\nKernelPageSize: ", vma_kernel_pagesize(vma));
+	SEQ_PUT_DEC(" kB\nMMUPageSize: ", vma_mmu_pagesize(vma));
 	seq_puts(m, " kB\n");
 
 	__show_smap(m, &mss, false);
 
-	seq_printf(m, "THPeligible: %d\n",
+	seq_printf(m, "THPeligible: %d\n",
 		   hugepage_vma_check(vma, vma->vm_flags, true, false, true));
 
 	if (arch_pkeys_enabled())
-		seq_printf(m, "ProtectionKey: %8u\n", vma_pkey(vma));
+		seq_printf(m, "ProtectionKey: %8u\n", vma_pkey(vma));
 
 	show_smap_vma_flags(m, vma);
 
 	return 0;

From patchwork Sat Feb 18 00:28:12 2023
Date: Sat, 18 Feb 2023 00:28:12 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20230218002819.1486479-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog
Message-ID: <20230218002819.1486479-40-jthoughton@google.com>
Subject: [PATCH v2 39/46] hugetlb: x86: enable high-granularity mapping for x86_64
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
Now that HGM is fully supported for GENERAL_HUGETLB, we can enable it for
x86_64. We can only enable it for 64-bit architectures because the VM flag
VM_HUGETLB_HGM uses a high bit.

The x86 KVM MMU already handles HugeTLB HGM pages properly: it does a page
table walk to determine which mapping size to use in the second-stage page
table, instead of, for example, checking vma_mmu_pagesize as arm64 does.

Signed-off-by: James Houghton

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3604074a878b..fde9ba1dd8d7 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -126,6 +126,7 @@ config X86
 	select ARCH_WANT_GENERAL_HUGETLB
 	select ARCH_WANT_HUGE_PMD_SHARE
 	select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP if X86_64
+	select ARCH_WANT_HUGETLB_HIGH_GRANULARITY_MAPPING if X86_64
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANTS_THP_SWAP if X86_64
 	select ARCH_HAS_PARANOID_L1D_FLUSH
From patchwork Sat Feb 18 00:28:13 2023

Date: Sat, 18 Feb 2023 00:28:13 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20230218002819.1486479-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog
Message-ID: <20230218002819.1486479-41-jthoughton@google.com>
Subject: [PATCH v2 40/46] docs: hugetlb: update hugetlb and userfaultfd admin-guides with HGM info
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
Include information about how MADV_SPLIT should be used to enable
high-granularity UFFDIO_CONTINUE operations, and about how MADV_COLLAPSE
should be used to collapse the mappings once they are fully populated.
Signed-off-by: James Houghton

diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
index a969a2c742b2..c6eaef785609 100644
--- a/Documentation/admin-guide/mm/hugetlbpage.rst
+++ b/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -454,6 +454,10 @@ errno set to EINVAL or exclude hugetlb pages that extend beyond the length if
 not hugepage aligned.  For example, munmap(2) will fail if memory is backed by
 a hugetlb page and the length is smaller than the hugepage size.
 
+It is possible for users to map HugeTLB pages at a higher granularity than
+normal using HugeTLB high-granularity mapping (HGM). For example, when using 1G
+pages on x86, a user could map that page with 4K PTEs, 2M PMDs, or a
+combination of the two. See Documentation/admin-guide/mm/userfaultfd.rst.
 
 Examples
 ========
diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
index 83f31919ebb3..cc496a307ea2 100644
--- a/Documentation/admin-guide/mm/userfaultfd.rst
+++ b/Documentation/admin-guide/mm/userfaultfd.rst
@@ -169,7 +169,13 @@ like to do to resolve it:
   the page cache). Userspace has the option of modifying the page's
   contents before resolving the fault. Once the contents are correct
   (modified or not), userspace asks the kernel to map the page and let the
-  faulting thread continue with ``UFFDIO_CONTINUE``.
+  faulting thread continue with ``UFFDIO_CONTINUE``. If this is done at the
+  base-page size in a transparent-hugepage-eligible VMA or in a HugeTLB VMA
+  (requires ``MADV_SPLIT``), then userspace may want to use
+  ``MADV_COLLAPSE`` when a hugepage is fully populated to inform the kernel
+  that it may be able to collapse the mapping. ``MADV_COLLAPSE`` will undo
+  the effect of any ``UFFDIO_WRITEPROTECT`` calls on the collapsed address
+  range.
 Notes:

From patchwork Sat Feb 18 00:28:14 2023
Date: Sat, 18 Feb 2023 00:28:14 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20230218002819.1486479-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog
Message-ID: <20230218002819.1486479-42-jthoughton@google.com>
Subject: [PATCH v2 41/46] docs: proc: include information about HugeTLB HGM
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
Include the updates that have been made to smaps, specifically, the
addition of Hugetlb[Pud,Pmd,Pte]Mapped.

Signed-off-by: James Houghton

diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index e224b6d5b642..1d2a1cd1fe6a 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -447,29 +447,32 @@ Memory Area, or VMA) there is a series of lines such as the following::

    08048000-080bc000 r-xp 00000000 03:02 13130          /bin/bash

-    Size:               1084 kB
-    KernelPageSize:        4 kB
-    MMUPageSize:           4 kB
-    Rss:                 892 kB
-    Pss:                 374 kB
-    Pss_Dirty:             0 kB
-    Shared_Clean:        892 kB
-    Shared_Dirty:          0 kB
-    Private_Clean:         0 kB
-    Private_Dirty:         0 kB
-    Referenced:          892 kB
-    Anonymous:             0 kB
-    LazyFree:              0 kB
-    AnonHugePages:         0 kB
-    ShmemPmdMapped:        0 kB
-    Shared_Hugetlb:        0 kB
-    Private_Hugetlb:       0 kB
-    Swap:                  0 kB
-    SwapPss:               0 kB
-    KernelPageSize:        4 kB
-    MMUPageSize:           4 kB
-    Locked:                0 kB
-    THPeligible:           0
+    Size:                1084 kB
+    KernelPageSize:         4 kB
+    MMUPageSize:            4 kB
+    Rss:                  892 kB
+    Pss:                  374 kB
+    Pss_Dirty:              0 kB
+    Shared_Clean:         892 kB
+    Shared_Dirty:           0 kB
+    Private_Clean:          0 kB
+    Private_Dirty:          0 kB
+    Referenced:           892 kB
+    Anonymous:              0 kB
+    LazyFree:               0 kB
+    AnonHugePages:          0 kB
+    ShmemPmdMapped:         0 kB
+    Shared_Hugetlb:         0 kB
+    Private_Hugetlb:        0 kB
+    HugetlbPudMapped:       0 kB
+    HugetlbPmdMapped:       0 kB
+    HugetlbPteMapped:       0 kB
+    Swap:                   0 kB
+    SwapPss:                0 kB
+    KernelPageSize:         4 kB
+    MMUPageSize:            4 kB
+    Locked:                 0 kB
+    THPeligible:            0
     VmFlags: rd ex mr mw me dw

 The first of these lines shows the same information as is displayed for the
@@ -510,10 +513,15 @@ implementation. If this is not desirable please file a bug report.
 "ShmemPmdMapped" shows the ammount of shared (shmem/tmpfs) memory backed by
 huge pages.

-"Shared_Hugetlb" and "Private_Hugetlb" show the ammounts of memory backed by
+"Shared_Hugetlb" and "Private_Hugetlb" show the amounts of memory backed by
 hugetlbfs page which is *not* counted in "RSS" or "PSS" field for historical
 reasons. And these are not included in {Shared,Private}_{Clean,Dirty} field.

+If the kernel was compiled with ``CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING``,
+"HugetlbPudMapped", "HugetlbPmdMapped", and "HugetlbPteMapped" may appear and
+show the amount of HugeTLB memory mapped with PUDs, PMDs, and PTEs respectively.
+Folded levels won't appear. See Documentation/admin-guide/mm/hugetlbpage.rst.
+
 "Swap" shows how much would-be-anonymous memory is also used, but out on swap.

 For shmem mappings, "Swap" includes also the size of the mapped (and not
From patchwork Sat Feb 18 00:28:15 2023
Date: Sat, 18 Feb 2023 00:28:15 +0000
Message-ID: <20230218002819.1486479-43-jthoughton@google.com>
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Subject: [PATCH v2 42/46] selftests/mm: add HugeTLB HGM to userfaultfd selftest
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert,
 Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin,
 Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, James Houghton
This test case behaves similarly to the regular shared HugeTLB
configuration, except that it uses 4K pages instead of hugepages, and that
we ignore the UFFDIO_COPY tests, as UFFDIO_CONTINUE is the only ioctl that
supports PAGE_SIZE-aligned regions.

This doesn't test MADV_COLLAPSE. Other tests are added later to exercise
MADV_COLLAPSE.

Signed-off-by: James Houghton

diff --git a/tools/testing/selftests/mm/userfaultfd.c b/tools/testing/selftests/mm/userfaultfd.c
index 7f22844ed704..681c5c5f863b 100644
--- a/tools/testing/selftests/mm/userfaultfd.c
+++ b/tools/testing/selftests/mm/userfaultfd.c
@@ -73,9 +73,10 @@ static unsigned long nr_cpus, nr_pages, nr_pages_per_cpu, page_size, hpage_size;
 #define BOUNCE_POLL		(1<<3)
 static int bounces;

-#define TEST_ANON	1
-#define TEST_HUGETLB	2
-#define TEST_SHMEM	3
+#define TEST_ANON	1
+#define TEST_HUGETLB	2
+#define TEST_HUGETLB_HGM	3
+#define TEST_SHMEM	4
 static int test_type;

 #define UFFD_FLAGS	(O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY)
@@ -93,6 +94,8 @@ static volatile bool test_uffdio_zeropage_eexist = true;
 static bool test_uffdio_wp = true;
 /* Whether to test uffd minor faults */
 static bool test_uffdio_minor = false;
+static bool test_uffdio_copy = true;
+
 static bool map_shared;
 static int mem_fd;
 static unsigned long long *count_verify;
@@ -151,7 +154,7 @@ static void usage(void)
	fprintf(stderr, "\nUsage: ./userfaultfd <test type> <MiB> <bounces> "
		"[hugetlbfs_file]\n\n");
	fprintf(stderr, "Supported <test type>: anon, hugetlb, "
-		"hugetlb_shared, shmem\n\n");
+		"hugetlb_shared, hugetlb_shared_hgm, shmem\n\n");
	fprintf(stderr, "'Test mods' can be joined to the test type string with a ':'. "
		"Supported mods:\n");
	fprintf(stderr, "\tsyscall - Use userfaultfd(2) (default)\n");
@@ -167,6 +170,11 @@ static void usage(void)
	exit(1);
 }

+static bool test_is_hugetlb(void)
+{
+	return test_type == TEST_HUGETLB || test_type == TEST_HUGETLB_HGM;
+}
+
 #define _err(fmt, ...) \
	do { \
		int ret = errno; \
@@ -381,7 +389,7 @@ static struct uffd_test_ops *uffd_test_ops;

 static inline uint64_t uffd_minor_feature(void)
 {
-	if (test_type == TEST_HUGETLB && map_shared)
+	if (test_is_hugetlb() && map_shared)
		return UFFD_FEATURE_MINOR_HUGETLBFS;
	else if (test_type == TEST_SHMEM)
		return UFFD_FEATURE_MINOR_SHMEM;
@@ -393,7 +401,7 @@ static uint64_t get_expected_ioctls(uint64_t mode)
 {
	uint64_t ioctls = UFFD_API_RANGE_IOCTLS;

-	if (test_type == TEST_HUGETLB)
+	if (test_is_hugetlb())
		ioctls &= ~(1 << _UFFDIO_ZEROPAGE);

	if (!((mode & UFFDIO_REGISTER_MODE_WP) && test_uffdio_wp))
@@ -500,13 +508,16 @@ static void uffd_test_ctx_clear(void)
 static void uffd_test_ctx_init(uint64_t features)
 {
	unsigned long nr, cpu;
+	uint64_t enabled_features = features;

	uffd_test_ctx_clear();

	uffd_test_ops->allocate_area((void **)&area_src, true);
	uffd_test_ops->allocate_area((void **)&area_dst, false);

-	userfaultfd_open(&features);
+	userfaultfd_open(&enabled_features);
+	if ((enabled_features & features) != features)
+		err("couldn't enable all features");

	count_verify = malloc(nr_pages * sizeof(unsigned long long));
	if (!count_verify)
@@ -726,13 +737,16 @@ static void uffd_handle_page_fault(struct uffd_msg *msg,
				   struct uffd_stats *stats)
 {
	unsigned long offset;
+	unsigned long address;

	if (msg->event != UFFD_EVENT_PAGEFAULT)
		err("unexpected msg event %u", msg->event);

+	address = msg->arg.pagefault.address;
+
	if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP) {
		/* Write protect page faults */
-		wp_range(uffd, msg->arg.pagefault.address, page_size, false);
+		wp_range(uffd, address, page_size, false);
		stats->wp_faults++;
	} else if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_MINOR) {
		uint8_t *area;
@@ -751,11 +765,10 @@ static void uffd_handle_page_fault(struct uffd_msg *msg,
		 */
		area = (uint8_t *)(area_dst +
-				   ((char *)msg->arg.pagefault.address -
-				    area_dst_alias));
+				   ((char *)address - area_dst_alias));
		for (b = 0; b < page_size; ++b)
			area[b] = ~area[b];
-		continue_range(uffd, msg->arg.pagefault.address, page_size);
+		continue_range(uffd, address, page_size);
		stats->minor_faults++;
	} else {
		/*
@@ -782,7 +795,7 @@ static void uffd_handle_page_fault(struct uffd_msg *msg,
		if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE)
			err("unexpected write fault");

-		offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst;
+		offset = (char *)address - area_dst;
		offset &= ~(page_size-1);

		if (copy_page(uffd, offset))
@@ -1192,6 +1205,12 @@ static int userfaultfd_events_test(void)
	char c;
	struct uffd_stats stats = { 0 };

+	if (!test_uffdio_copy) {
+		printf("Skipping userfaultfd events test "
+		       "(test_uffdio_copy=false)\n");
+		return 0;
+	}
+
	printf("testing events (fork, remap, remove): ");
	fflush(stdout);
@@ -1245,6 +1264,12 @@ static int userfaultfd_sig_test(void)
	char c;
	struct uffd_stats stats = { 0 };

+	if (!test_uffdio_copy) {
+		printf("Skipping userfaultfd signal test "
+		       "(test_uffdio_copy=false)\n");
+		return 0;
+	}
+
	printf("testing signal delivery: ");
	fflush(stdout);
@@ -1329,6 +1354,11 @@ static int userfaultfd_minor_test(void)

	uffd_test_ctx_init(uffd_minor_feature());

+	if (test_type == TEST_HUGETLB_HGM)
+		/* Enable high-granularity userfaultfd ioctls for HugeTLB */
+		if (madvise(area_dst_alias, nr_pages * page_size, MADV_SPLIT))
+			err("MADV_SPLIT failed");
+
	uffdio_register.range.start = (unsigned long)area_dst_alias;
	uffdio_register.range.len = nr_pages * page_size;
	uffdio_register.mode = UFFDIO_REGISTER_MODE_MINOR;
@@ -1538,6 +1568,12 @@ static int userfaultfd_stress(void)
	pthread_attr_init(&attr);
	pthread_attr_setstacksize(&attr, 16*1024*1024);

+	if (!test_uffdio_copy) {
+		printf("Skipping userfaultfd stress test "
+		       "(test_uffdio_copy=false)\n");
+		bounces = 0;
+	}
+
	while (bounces--) {
		printf("bounces: %d, mode:", bounces);
		if (bounces & BOUNCE_RANDOM)
@@ -1696,6 +1732,16 @@ static void set_test_type(const char *type)
		uffd_test_ops = &hugetlb_uffd_test_ops;
		/* Minor faults require shared hugetlb; only enable here. */
		test_uffdio_minor = true;
+	} else if (!strcmp(type, "hugetlb_shared_hgm")) {
+		map_shared = true;
+		test_type = TEST_HUGETLB_HGM;
+		uffd_test_ops = &hugetlb_uffd_test_ops;
+		/*
+		 * HugeTLB HGM only changes UFFDIO_CONTINUE, so don't test
+		 * UFFDIO_COPY.
+		 */
+		test_uffdio_minor = true;
+		test_uffdio_copy = false;
	} else if (!strcmp(type, "shmem")) {
		map_shared = true;
		test_type = TEST_SHMEM;
@@ -1731,6 +1777,7 @@ static void parse_test_type_arg(const char *raw_type)
		err("Unsupported test: %s", raw_type);

	if (test_type == TEST_HUGETLB)
+		/* TEST_HUGETLB_HGM gets small pages. */
		page_size = hpage_size;
	else
		page_size = sysconf(_SC_PAGE_SIZE);
@@ -1813,22 +1860,29 @@ int main(int argc, char **argv)
		nr_cpus = x < y ? x : y;
	}
	nr_pages_per_cpu = bytes / page_size / nr_cpus;
+	if (test_type == TEST_HUGETLB_HGM)
+		/*
+		 * `page_size` refers to the page_size we can use in
+		 * UFFDIO_CONTINUE. We still need nr_pages to be appropriately
+		 * aligned, so align it here.
+		 */
+		nr_pages_per_cpu -= nr_pages_per_cpu % (hpage_size / page_size);
	if (!nr_pages_per_cpu) {
		_err("invalid MiB");
		usage();
	}
+	nr_pages = nr_pages_per_cpu * nr_cpus;

	bounces = atoi(argv[3]);
	if (bounces <= 0) {
		_err("invalid bounces");
		usage();
	}
-	nr_pages = nr_pages_per_cpu * nr_cpus;

-	if (test_type == TEST_SHMEM || test_type == TEST_HUGETLB) {
+	if (test_type == TEST_SHMEM || test_is_hugetlb()) {
		unsigned int memfd_flags = 0;

-		if (test_type == TEST_HUGETLB)
+		if (test_is_hugetlb())
			memfd_flags = MFD_HUGETLB;

		mem_fd = memfd_create(argv[0], memfd_flags);
		if (mem_fd < 0)
From patchwork Sat Feb 18 00:28:16 2023
Date: Sat, 18 Feb 2023 00:28:16 +0000
Message-ID: <20230218002819.1486479-44-jthoughton@google.com>
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Subject: [PATCH v2 43/46] KVM: selftests: add HugeTLB HGM to KVM demand paging selftest
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert,
 Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin,
 Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, James Houghton
91E+c4s3/XMWDB/WlfYO6JV028y6GimI7K0Bef6W9Dz30R5oAit/vSEQuLYd2Mq8GfpwenLEGP1jtl8jKIjb7ELPg8MoWpAL2US8wWrT619+RrWJvA4CF5q+YYS9cZyhgsIp9ToVpzpsdp/O4tyt30AkG6oRu+/awaoF9NVE2PSLpkwEQHSFSRuyGx6qf5lqoMd4Iey3hbMzdMS2Sk9MJBlNWWxHgRNNGaHr36QGBQLQgTcWWD9Lu8tFlZUjFIZYGFZhE+tl8Pu6TlbG8KsxdXTMgxpe00SVMogh77x9BlIpFQECZfSLtbH/VrpOG/eMtJRVvOnKYxwlig+QdGWmnVeXob7fuUKX89TLEEsG2jkLQ8cDjP9GedAQUyLCNc5KZ2Ttev6s2DlPsRiqioWn3JYXi+BljWf08W3IBZtzk9tPdOZyOfGbx8XzHlLCTWLFXCAUi40XL5Rf7+Smc/LHi+8belqb99xj8Z6R8HZcCxt/28dDbLGVN1e5p8qmGZBvrdh1ws0fBCjmMBgVoALhDjCKB8OgC5DFmP3AnFjce5ZrlT3XLREHyllcnq1OLTgIxFW1C2m9SV/T5aXH2ced2Aa15+CJx/Uj1eldPj5FwntuU04IyIkZtMstBSFdRle+cCRVz66+NMDT0vT7ytk5cQoY3AgW/2hr7INyU X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This test exercises the GUP paths for HGM. MADV_COLLAPSE is not tested. Signed-off-by: James Houghton diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index b0e1fc4de9e2..e534f9c927bf 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -170,7 +170,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) uffd_descs[i] = uffd_setup_demand_paging( p->uffd_mode, p->uffd_delay, vcpu_hva, vcpu_args->pages * memstress_args.guest_page_size, - &handle_uffd_page_request); + p->src_type, &handle_uffd_page_request); } } diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h index 80d6416f3012..a2106c19a614 100644 --- a/tools/testing/selftests/kvm/include/test_util.h +++ b/tools/testing/selftests/kvm/include/test_util.h @@ -103,6 +103,7 @@ enum vm_mem_backing_src_type { VM_MEM_SRC_ANONYMOUS_HUGETLB_16GB, VM_MEM_SRC_SHMEM, VM_MEM_SRC_SHARED_HUGETLB, + VM_MEM_SRC_SHARED_HUGETLB_HGM, NUM_SRC_TYPES, }; @@ -121,6 +122,7 @@ size_t get_def_hugetlb_pagesz(void); const 
struct vm_mem_backing_src_alias *vm_mem_backing_src_alias(uint32_t i); size_t get_backing_src_pagesz(uint32_t i); bool is_backing_src_hugetlb(uint32_t i); +bool is_backing_src_shared_hugetlb(enum vm_mem_backing_src_type src_type); void backing_src_help(const char *flag); enum vm_mem_backing_src_type parse_backing_src_type(const char *type_name); long get_run_delay(void); diff --git a/tools/testing/selftests/kvm/include/userfaultfd_util.h b/tools/testing/selftests/kvm/include/userfaultfd_util.h index 877449c34592..d91528a58245 100644 --- a/tools/testing/selftests/kvm/include/userfaultfd_util.h +++ b/tools/testing/selftests/kvm/include/userfaultfd_util.h @@ -26,9 +26,9 @@ struct uffd_desc { pthread_t thread; }; -struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, - void *hva, uint64_t len, - uffd_handler_t handler); +struct uffd_desc *uffd_setup_demand_paging( + int uffd_mode, useconds_t delay, void *hva, uint64_t len, + enum vm_mem_backing_src_type src_type, uffd_handler_t handler); void uffd_stop_demand_paging(struct uffd_desc *uffd); diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 56d5ea949cbb..b9c398dc295d 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -981,7 +981,7 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm, region->fd = -1; if (backing_src_is_shared(src_type)) region->fd = kvm_memfd_alloc(region->mmap_size, - src_type == VM_MEM_SRC_SHARED_HUGETLB); + is_backing_src_shared_hugetlb(src_type)); region->mmap_start = mmap(NULL, region->mmap_size, PROT_READ | PROT_WRITE, diff --git a/tools/testing/selftests/kvm/lib/test_util.c b/tools/testing/selftests/kvm/lib/test_util.c index 5c22fa4c2825..712a0878932e 100644 --- a/tools/testing/selftests/kvm/lib/test_util.c +++ b/tools/testing/selftests/kvm/lib/test_util.c @@ -271,6 +271,13 @@ const struct vm_mem_backing_src_alias *vm_mem_backing_src_alias(uint32_t i) */ .flag = 
MAP_SHARED, }, + [VM_MEM_SRC_SHARED_HUGETLB_HGM] = { + /* + * Identical to shared_hugetlb except for the name. + */ + .name = "shared_hugetlb_hgm", + .flag = MAP_SHARED, + }, }; _Static_assert(ARRAY_SIZE(aliases) == NUM_SRC_TYPES, "Missing new backing src types?"); @@ -289,6 +296,7 @@ size_t get_backing_src_pagesz(uint32_t i) switch (i) { case VM_MEM_SRC_ANONYMOUS: case VM_MEM_SRC_SHMEM: + case VM_MEM_SRC_SHARED_HUGETLB_HGM: return getpagesize(); case VM_MEM_SRC_ANONYMOUS_THP: return get_trans_hugepagesz(); @@ -305,6 +313,12 @@ bool is_backing_src_hugetlb(uint32_t i) return !!(vm_mem_backing_src_alias(i)->flag & MAP_HUGETLB); } +bool is_backing_src_shared_hugetlb(enum vm_mem_backing_src_type src_type) +{ + return src_type == VM_MEM_SRC_SHARED_HUGETLB || + src_type == VM_MEM_SRC_SHARED_HUGETLB_HGM; +} + static void print_available_backing_src_types(const char *prefix) { int i; diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c index 92cef20902f1..3c7178d6c4f4 100644 --- a/tools/testing/selftests/kvm/lib/userfaultfd_util.c +++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c @@ -25,6 +25,10 @@ #ifdef __NR_userfaultfd +#ifndef MADV_SPLIT +#define MADV_SPLIT 26 +#endif + static void *uffd_handler_thread_fn(void *arg) { struct uffd_desc *uffd_desc = (struct uffd_desc *)arg; @@ -108,9 +112,9 @@ static void *uffd_handler_thread_fn(void *arg) return NULL; } -struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, - void *hva, uint64_t len, - uffd_handler_t handler) +struct uffd_desc *uffd_setup_demand_paging( + int uffd_mode, useconds_t delay, void *hva, uint64_t len, + enum vm_mem_backing_src_type src_type, uffd_handler_t handler) { struct uffd_desc *uffd_desc; bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR); @@ -140,6 +144,10 @@ struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, "ioctl UFFDIO_API failed: %" PRIu64, (uint64_t)uffdio_api.api); + if 
(src_type == VM_MEM_SRC_SHARED_HUGETLB_HGM) + TEST_ASSERT(!madvise(hva, len, MADV_SPLIT), + "Could not enable HGM"); + uffdio_register.range.start = (uint64_t)hva; uffdio_register.range.len = len; uffdio_register.mode = uffd_mode;

From patchwork Sat Feb 18 00:28:17 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145410
Date: Sat, 18 Feb 2023 00:28:17 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-45-jthoughton@google.com>
Subject: [PATCH v2 44/46] selftests/mm: add anon and shared hugetlb to migration test
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
Shared HugeTLB mappings are migrated best-effort. Sometimes, due to being unable to grab the VMA lock for writing, migration may just randomly fail. To allow for that, we allow retries. Signed-off-by: James Houghton diff --git a/tools/testing/selftests/mm/migration.c b/tools/testing/selftests/mm/migration.c index 1cec8425e3ca..21577a84d7e4 100644 --- a/tools/testing/selftests/mm/migration.c +++ b/tools/testing/selftests/mm/migration.c @@ -13,6 +13,7 @@ #include #include #include +#include #define TWOMEG (2<<20) #define RUNTIME (60) @@ -59,11 +60,12 @@ FIXTURE_TEARDOWN(migration) free(self->pids); } -int migrate(uint64_t *ptr, int n1, int n2) +int migrate(uint64_t *ptr, int n1, int n2, int retries) { int ret, tmp; int status = 0; struct timespec ts1, ts2; + int failed = 0; if (clock_gettime(CLOCK_MONOTONIC, &ts1)) return -1; @@ -78,6 +80,9 @@ int migrate(uint64_t *ptr, int n1, int n2) ret = move_pages(0, 1, (void **) &ptr, &n2, &status, MPOL_MF_MOVE_ALL); if (ret) { + if (++failed < retries) + continue; + if (ret > 0) printf("Didn't migrate %d pages\n", ret); else @@ -88,6 +93,7 @@ int migrate(uint64_t *ptr, int n1, int n2) tmp = n2; n2 = n1; n1 = tmp; + failed = 0; } return 0; @@ -128,7 +134,7 @@ TEST_F_TIMEOUT(migration, private_anon, 2*RUNTIME) if (pthread_create(&self->threads[i], NULL, access_mem, ptr)) perror("Couldn't create thread"); - ASSERT_EQ(migrate(ptr, self->n1, self->n2), 0); + ASSERT_EQ(migrate(ptr, self->n1, self->n2, 1), 0); for (i = 0; i < self->nthreads - 1; i++) ASSERT_EQ(pthread_cancel(self->threads[i]), 0); } @@ -158,7 +164,7 @@ TEST_F_TIMEOUT(migration, shared_anon, 2*RUNTIME) self->pids[i] = pid; } - ASSERT_EQ(migrate(ptr, self->n1, self->n2), 0); + ASSERT_EQ(migrate(ptr, self->n1, self->n2, 1), 0); for (i = 0; i < self->nthreads - 1; i++) ASSERT_EQ(kill(self->pids[i], SIGTERM), 0); } @@ -185,9 +191,78 @@ TEST_F_TIMEOUT(migration,
private_anon_thp, 2*RUNTIME) if (pthread_create(&self->threads[i], NULL, access_mem, ptr)) perror("Couldn't create thread"); - ASSERT_EQ(migrate(ptr, self->n1, self->n2), 0); + ASSERT_EQ(migrate(ptr, self->n1, self->n2, 1), 0); + for (i = 0; i < self->nthreads - 1; i++) + ASSERT_EQ(pthread_cancel(self->threads[i]), 0); +} + +/* + * Tests the anon hugetlb migration entry paths. + */ +TEST_F_TIMEOUT(migration, private_anon_hugetlb, 2*RUNTIME) +{ + uint64_t *ptr; + int i; + + if (self->nthreads < 2 || self->n1 < 0 || self->n2 < 0) + SKIP(return, "Not enough threads or NUMA nodes available"); + + ptr = mmap(NULL, TWOMEG, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0); + if (ptr == MAP_FAILED) + SKIP(return, "Could not allocate hugetlb pages"); + + memset(ptr, 0xde, TWOMEG); + for (i = 0; i < self->nthreads - 1; i++) + if (pthread_create(&self->threads[i], NULL, access_mem, ptr)) + perror("Couldn't create thread"); + + ASSERT_EQ(migrate(ptr, self->n1, self->n2, 1), 0); for (i = 0; i < self->nthreads - 1; i++) ASSERT_EQ(pthread_cancel(self->threads[i]), 0); } +/* + * Tests the shared hugetlb migration entry paths. 
+ */ +TEST_F_TIMEOUT(migration, shared_hugetlb, 2*RUNTIME) +{ + uint64_t *ptr; + int i; + int fd; + unsigned long sz; + struct statfs filestat; + + if (self->nthreads < 2 || self->n1 < 0 || self->n2 < 0) + SKIP(return, "Not enough threads or NUMA nodes available"); + + fd = memfd_create("tmp_hugetlb", MFD_HUGETLB); + if (fd < 0) + SKIP(return, "Couldn't create hugetlb memfd"); + + if (fstatfs(fd, &filestat) < 0) + SKIP(return, "Couldn't fstatfs hugetlb file"); + + sz = filestat.f_bsize; + + if (ftruncate(fd, sz)) + SKIP(return, "Couldn't allocate hugetlb pages"); + ptr = mmap(NULL, sz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if (ptr == MAP_FAILED) + SKIP(return, "Could not map hugetlb pages"); + + memset(ptr, 0xde, sz); + for (i = 0; i < self->nthreads - 1; i++) + if (pthread_create(&self->threads[i], NULL, access_mem, ptr)) + perror("Couldn't create thread"); + + ASSERT_EQ(migrate(ptr, self->n1, self->n2, 10), 0); + for (i = 0; i < self->nthreads - 1; i++) { + ASSERT_EQ(pthread_cancel(self->threads[i]), 0); + pthread_join(self->threads[i], NULL); + } + ftruncate(fd, 0); + close(fd); +} + TEST_HARNESS_MAIN

From patchwork Sat Feb 18 00:28:18 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145411
Date: Sat, 18 Feb 2023 00:28:18 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-46-jthoughton@google.com>
Subject: [PATCH v2 45/46] selftests/mm: add hugetlb HGM test to migration selftest
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
This is mostly the same as the shared HugeTLB case, but instead of mapping the page with a regular page fault, we map it with lots of UFFDIO_CONTINUE operations.
We also verify that the contents haven't changed after the migration; they would have changed if the post-migration PTEs pointed to the wrong page. Signed-off-by: James Houghton diff --git a/tools/testing/selftests/mm/migration.c b/tools/testing/selftests/mm/migration.c index 21577a84d7e4..1fb3607accab 100644 --- a/tools/testing/selftests/mm/migration.c +++ b/tools/testing/selftests/mm/migration.c @@ -14,12 +14,21 @@ #include #include #include +#include +#include +#include +#include +#include #define TWOMEG (2<<20) #define RUNTIME (60) #define ALIGN(x, a) (((x) + (a - 1)) & (~((a) - 1))) +#ifndef MADV_SPLIT +#define MADV_SPLIT 26 +#endif + FIXTURE(migration) { pthread_t *threads; @@ -265,4 +274,141 @@ TEST_F_TIMEOUT(migration, shared_hugetlb, 2*RUNTIME) close(fd); } +#ifdef __NR_userfaultfd +static int map_at_high_granularity(char *mem, size_t length) +{ + int i; + int ret; + int uffd = syscall(__NR_userfaultfd, 0); + struct uffdio_api api; + struct uffdio_register reg; + int pagesize = getpagesize(); + + if (uffd < 0) { + perror("couldn't create uffd"); + return uffd; + } + + api.api = UFFD_API; + api.features = 0; + + ret = ioctl(uffd, UFFDIO_API, &api); + if (ret || api.api != UFFD_API) { + perror("UFFDIO_API failed"); + goto out; + } + + if (madvise(mem, length, MADV_SPLIT) == -1) { + perror("MADV_SPLIT failed"); + goto out; + } + + reg.range.start = (unsigned long)mem; + reg.range.len = length; + + reg.mode = UFFDIO_REGISTER_MODE_MISSING | UFFDIO_REGISTER_MODE_MINOR; + + ret = ioctl(uffd, UFFDIO_REGISTER, &reg); + if (ret) { + perror("UFFDIO_REGISTER failed"); + goto out; + } + + /* UFFDIO_CONTINUE each 4K segment of the 2M page.
*/ + for (i = 0; i < length/pagesize; ++i) { + struct uffdio_continue cont; + + cont.range.start = (unsigned long long)mem + i * pagesize; + cont.range.len = pagesize; + cont.mode = 0; + ret = ioctl(uffd, UFFDIO_CONTINUE, &cont); + if (ret) { + fprintf(stderr, "UFFDIO_CONTINUE failed " + "for %llx -> %llx: %d\n", + cont.range.start, + cont.range.start + cont.range.len, + errno); + goto out; + } + } + ret = 0; +out: + close(uffd); + return ret; +} +#else +static int map_at_high_granularity(char *mem, size_t length) +{ + fprintf(stderr, "Userfaultfd missing\n"); + return -1; +} +#endif /* __NR_userfaultfd */ + +/* + * Tests the high-granularity hugetlb migration entry paths. + */ +TEST_F_TIMEOUT(migration, shared_hugetlb_hgm, 2*RUNTIME) +{ + uint64_t *ptr; + int i; + int fd; + unsigned long sz; + struct statfs filestat; + + if (self->nthreads < 2 || self->n1 < 0 || self->n2 < 0) + SKIP(return, "Not enough threads or NUMA nodes available"); + + fd = memfd_create("tmp_hugetlb", MFD_HUGETLB); + if (fd < 0) + SKIP(return, "Couldn't create hugetlb memfd"); + + if (fstatfs(fd, &filestat) < 0) + SKIP(return, "Couldn't fstatfs hugetlb file"); + + sz = filestat.f_bsize; + + if (ftruncate(fd, sz)) + SKIP(return, "Couldn't allocate hugetlb pages"); + + if (fallocate(fd, 0, 0, sz) < 0) { + perror("fallocate failed"); + SKIP(return, "fallocate failed"); + } + + ptr = mmap(NULL, sz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if (ptr == MAP_FAILED) + SKIP(return, "Could not allocate hugetlb pages"); + + /* + * We have to map_at_high_granularity before we memset, otherwise + * memset will map everything at the hugepage size. + */ + if (map_at_high_granularity((char *)ptr, sz) < 0) + SKIP(return, "Could not map HugeTLB range at high granularity"); + + /* Populate the page we're migrating. 
+ */ + for (i = 0; i < sz/sizeof(*ptr); ++i) + ptr[i] = i; + + for (i = 0; i < self->nthreads - 1; i++) + if (pthread_create(&self->threads[i], NULL, access_mem, ptr)) + perror("Couldn't create thread"); + + ASSERT_EQ(migrate(ptr, self->n1, self->n2, 10), 0); + for (i = 0; i < self->nthreads - 1; i++) { + ASSERT_EQ(pthread_cancel(self->threads[i]), 0); + pthread_join(self->threads[i], NULL); + } + + /* Check that the contents didn't change. */ + for (i = 0; i < sz/sizeof(*ptr); ++i) { + ASSERT_EQ(ptr[i], i); + if (ptr[i] != i) + break; + } + + ftruncate(fd, 0); + close(fd); +} + TEST_HARNESS_MAIN

From patchwork Sat Feb 18 00:28:19 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145412
Date: Sat, 18 Feb 2023 00:28:19 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-47-jthoughton@google.com>
Subject: [PATCH v2 46/46] selftests/mm: add
HGM UFFDIO_CONTINUE and hwpoison tests
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
Test that high-granularity CONTINUEs at all sizes work (exercising contiguous PTE sizes for arm64, when support is added). Also test that collapse works and hwpoison works correctly (although we aren't yet testing high-granularity poison). This test uses UFFD_FEATURE_EVENT_FORK + UFFD_REGISTER_MODE_WP to force the kernel to copy page tables on fork(), exercising the changes to copy_hugetlb_page_range(). Also test that UFFDIO_WRITEPROTECT doesn't prevent UFFDIO_CONTINUE from behaving properly (in other words, that HGM walks treat UFFD-WP markers like blank PTEs in the appropriate cases). We also test that the uffd-wp PTE markers are preserved properly.
Signed-off-by: James Houghton

 create mode 100644 tools/testing/selftests/mm/hugetlb-hgm.c

diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index d90cdc06aa59..920baccccb9e 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -36,6 +36,7 @@ TEST_GEN_FILES += compaction_test
 TEST_GEN_FILES += gup_test
 TEST_GEN_FILES += hmm-tests
 TEST_GEN_FILES += hugetlb-madvise
+TEST_GEN_FILES += hugetlb-hgm
 TEST_GEN_FILES += hugepage-mmap
 TEST_GEN_FILES += hugepage-mremap
 TEST_GEN_FILES += hugepage-shm
diff --git a/tools/testing/selftests/mm/hugetlb-hgm.c b/tools/testing/selftests/mm/hugetlb-hgm.c
new file mode 100644
index 000000000000..4c27a6a11818
--- /dev/null
+++ b/tools/testing/selftests/mm/hugetlb-hgm.c
@@ -0,0 +1,608 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Test uncommon cases in HugeTLB high-granularity mapping:
+ *  1. Test all supported high-granularity page sizes (with MADV_COLLAPSE).
+ *  2. Test MADV_HWPOISON behavior.
+ *  3. Test interaction with UFFDIO_WRITEPROTECT.
+ */
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h>
+#include <pthread.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <linux/magic.h>
+#include <linux/memfd.h>
+#include <linux/userfaultfd.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/statfs.h>
+#include <sys/syscall.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+#define PAGE_SIZE 4096
+#define PAGE_MASK (~(PAGE_SIZE - 1))
+
+#ifndef MADV_COLLAPSE
+#define MADV_COLLAPSE 25
+#endif
+
+#ifndef MADV_SPLIT
+#define MADV_SPLIT 26
+#endif
+
+#define PREFIX " ... "
+#define ERROR_PREFIX " !!! "
+
+static void *sigbus_addr;
+bool was_mceerr;
+bool got_sigbus;
+bool expecting_sigbus;
+
+enum test_status {
+	TEST_PASSED = 0,
+	TEST_FAILED = 1,
+	TEST_SKIPPED = 2,
+};
+
+static char *status_to_str(enum test_status status)
+{
+	switch (status) {
+	case TEST_PASSED:
+		return "TEST_PASSED";
+	case TEST_FAILED:
+		return "TEST_FAILED";
+	case TEST_SKIPPED:
+		return "TEST_SKIPPED";
+	default:
+		return "TEST_???";
+	}
+}
+
+static int userfaultfd(int flags)
+{
+	return syscall(__NR_userfaultfd, flags);
+}
+
+static int map_range(int uffd, char *addr, uint64_t length)
+{
+	struct uffdio_continue cont = {
+		.range = (struct uffdio_range) {
+			.start = (uint64_t)addr,
+			.len = length,
+		},
+		.mode = 0,
+		.mapped = 0,
+	};
+
+	if (ioctl(uffd, UFFDIO_CONTINUE, &cont) < 0) {
+		perror(ERROR_PREFIX "UFFDIO_CONTINUE failed");
+		return -1;
+	}
+	return 0;
+}
+
+static int userfaultfd_writeprotect(int uffd, char *addr, uint64_t length,
+				    bool protect)
+{
+	struct uffdio_writeprotect wp = {
+		.range = (struct uffdio_range) {
+			.start = (uint64_t)addr,
+			.len = length,
+		},
+		.mode = UFFDIO_WRITEPROTECT_MODE_DONTWAKE,
+	};
+
+	if (protect)
+		wp.mode = UFFDIO_WRITEPROTECT_MODE_WP;
+
+	printf(PREFIX "UFFDIO_WRITEPROTECT: %p -> %p (%sprotected)\n", addr,
+	       addr + length, protect ? "" : "un");
+
+	if (ioctl(uffd, UFFDIO_WRITEPROTECT, &wp) < 0) {
+		perror(ERROR_PREFIX "UFFDIO_WRITEPROTECT failed");
+		return -1;
+	}
+	return 0;
+}
+
+static int check_equal(char *mapping, size_t length, char value)
+{
+	size_t i;
+
+	for (i = 0; i < length; ++i)
+		if (mapping[i] != value) {
+			printf(ERROR_PREFIX "mismatch at %p (%d != %d)\n",
+			       &mapping[i], mapping[i], value);
+			return -1;
+		}
+
+	return 0;
+}
+
+static int test_continues(int uffd, char *primary_map, char *secondary_map,
+			  size_t len, bool verify)
+{
+	size_t offset = 0;
+	unsigned char iter = 0;
+	unsigned long pagesize = getpagesize();
+	uint64_t size;
+
+	for (size = len/2; size >= pagesize;
+			offset += size, size /= 2) {
+		iter++;
+		memset(secondary_map + offset, iter, size);
+		printf(PREFIX "UFFDIO_CONTINUE: %p -> %p = %d%s\n",
+		       primary_map + offset,
+		       primary_map + offset + size,
+		       iter,
+		       verify ? " (and verify)" : "");
+		if (map_range(uffd, primary_map + offset, size))
+			return -1;
+		if (verify && check_equal(primary_map + offset, size, iter))
+			return -1;
+	}
+	return 0;
+}
+
+static int verify_contents(char *map, size_t len, bool last_page_zero)
+{
+	size_t offset = 0;
+	int i = 0;
+	uint64_t size;
+
+	for (size = len/2; size > PAGE_SIZE; offset += size, size /= 2)
+		if (check_equal(map + offset, size, ++i))
+			return -1;
+
+	if (last_page_zero)
+		if (check_equal(map + len - PAGE_SIZE, PAGE_SIZE, 0))
+			return -1;
+
+	return 0;
+}
+
+static int test_collapse(char *primary_map, size_t len, bool verify)
+{
+	int ret = 0;
+
+	printf(PREFIX "collapsing %p -> %p\n", primary_map, primary_map + len);
+	if (madvise(primary_map, len, MADV_COLLAPSE) < 0) {
+		perror(ERROR_PREFIX "collapse failed");
+		return -1;
+	}
+
+	if (verify) {
+		printf(PREFIX "verifying %p -> %p\n", primary_map,
+		       primary_map + len);
+		ret = verify_contents(primary_map, len, true);
+	}
+	return ret;
+}
+
+static void sigbus_handler(int signo, siginfo_t *info, void *context)
+{
+	if (!expecting_sigbus)
+		printf(ERROR_PREFIX "unexpected sigbus: %p\n", info->si_addr);
+
+	got_sigbus = true;
+	was_mceerr = info->si_code == BUS_MCEERR_AR;
+	sigbus_addr = info->si_addr;
+
+	pthread_exit(NULL);
+}
+
+static void *access_mem(void *addr)
+{
+	volatile char *ptr = addr;
+
+	/*
+	 * Do a write without changing memory contents, as other routines will
+	 * need to verify that mapping contents haven't changed.
+	 *
+	 * We do a write so that we trigger uffd-wp SIGBUSes. To test that we
+	 * get HWPOISON SIGBUSes, we would only need to read.
+	 */
+	*ptr = *ptr;
+	return NULL;
+}
+
+static int test_sigbus(char *addr, bool poison)
+{
+	int ret;
+	pthread_t pthread;
+
+	sigbus_addr = (void *)0xBADBADBAD;
+	was_mceerr = false;
+	got_sigbus = false;
+	expecting_sigbus = true;
+	ret = pthread_create(&pthread, NULL, &access_mem, addr);
+	if (ret) {
+		printf(ERROR_PREFIX "failed to create thread: %s\n",
+		       strerror(ret));
+		goto out;
+	}
+
+	pthread_join(pthread, NULL);
+
+	ret = -1;
+	if (!got_sigbus)
+		printf(ERROR_PREFIX "didn't get a SIGBUS: %p\n", addr);
+	else if (sigbus_addr != addr)
+		printf(ERROR_PREFIX "got incorrect sigbus address: %p vs %p\n",
+		       sigbus_addr, addr);
+	else if (poison && !was_mceerr)
+		printf(ERROR_PREFIX "didn't get an MCEERR?\n");
+	else
+		ret = 0;
+out:
+	expecting_sigbus = false;
+	return ret;
+}
+
+static void *read_from_uffd_thd(void *arg)
+{
+	int uffd = *(int *)arg;
+	struct uffd_msg msg;
+
+	/* opened without O_NONBLOCK */
+	if (read(uffd, &msg, sizeof(msg)) != sizeof(msg))
+		printf(ERROR_PREFIX "reading uffd failed\n");
+
+	return NULL;
+}
+
+static int read_event_from_uffd(int *uffd, pthread_t *pthread)
+{
+	int ret = 0;
+
+	ret = pthread_create(pthread, NULL, &read_from_uffd_thd, (void *)uffd);
+	if (ret) {
+		printf(ERROR_PREFIX "failed to create thread: %s\n",
+		       strerror(ret));
+		return ret;
+	}
+	return 0;
+}
+
+static int test_sigbus_range(char *primary_map, size_t len, bool hwpoison)
+{
+	const unsigned long pagesize = getpagesize();
+	const int num_checks = 512;
+	unsigned long bytes_per_check = len/num_checks;
+	int i;
+
+	printf(PREFIX "checking that we can't access "
+	       "(%d addresses within %p -> %p)\n",
+	       num_checks, primary_map, primary_map + len);
+
+	if (pagesize > bytes_per_check)
+		bytes_per_check = pagesize;
+
+	for (i = 0; i < len; i += bytes_per_check)
+		if (test_sigbus(primary_map + i, hwpoison) < 0)
+			return 1;
+	/* check very last byte, because we left it unmapped */
+	if (test_sigbus(primary_map + len - 1, hwpoison))
+		return 1;
+
+	return 0;
+}
+
+static enum test_status test_hwpoison(char *primary_map, size_t len)
+{
+	printf(PREFIX "poisoning %p -> %p\n", primary_map, primary_map + len);
+	if (madvise(primary_map, len, MADV_HWPOISON) < 0) {
+		perror(ERROR_PREFIX "MADV_HWPOISON failed");
+		return TEST_SKIPPED;
+	}
+
+	return test_sigbus_range(primary_map, len, true)
+		? TEST_FAILED : TEST_PASSED;
+}
+
+static int test_fork(int uffd, char *primary_map, size_t len)
+{
+	int status;
+	int ret = 0;
+	pid_t pid;
+	pthread_t uffd_thd;
+
+	/*
+	 * UFFD_FEATURE_EVENT_FORK will put a fork event on the userfaultfd,
+	 * which we must read; otherwise we block fork(). Set up a thread to
+	 * read that event now.
+	 *
+	 * Page fault events should result in a SIGBUS, so we expect only a
+	 * single event from the uffd (the fork event).
+	 */
+	if (read_event_from_uffd(&uffd, &uffd_thd))
+		return -1;
+
+	pid = fork();
+
+	if (!pid) {
+		/*
+		 * Because we have UFFDIO_REGISTER_MODE_WP and
+		 * UFFD_FEATURE_EVENT_FORK, the page tables should be copied
+		 * exactly.
+		 *
+		 * Check that everything except the last 4K has correct
+		 * contents, and then check that the last 4K gets a SIGBUS.
+		 */
+		printf(PREFIX "child validating...\n");
+		ret = verify_contents(primary_map, len, false) ||
+			test_sigbus(primary_map + len - 1, false);
+		exit(ret ? 1 : 0);
+	} else {
+		/* wait for the child to finish. */
+		waitpid(pid, &status, 0);
+		ret = WEXITSTATUS(status);
+		if (!ret) {
+			printf(PREFIX "parent validating...\n");
+			/* Same check as the child. */
+			ret = verify_contents(primary_map, len, false) ||
+				test_sigbus(primary_map + len - 1, false);
+		}
+	}
+
+	pthread_join(uffd_thd, NULL);
+	return ret;
+
+}
+
+static int uffd_register(int uffd, char *primary_map, unsigned long len,
+			 int mode)
+{
+	struct uffdio_register reg;
+
+	reg.range.start = (unsigned long)primary_map;
+	reg.range.len = len;
+	reg.mode = mode;
+
+	reg.ioctls = 0;
+	return ioctl(uffd, UFFDIO_REGISTER, &reg);
+}
+
+enum test_type {
+	TEST_DEFAULT,
+	TEST_UFFDWP,
+	TEST_HWPOISON
+};
+
+static enum test_status
+test_hgm(int fd, size_t hugepagesize, size_t len, enum test_type type)
+{
+	int uffd;
+	char *primary_map, *secondary_map;
+	struct uffdio_api api;
+	struct sigaction new, old;
+	enum test_status status = TEST_SKIPPED;
+	bool hwpoison = type == TEST_HWPOISON;
+	bool uffd_wp = type == TEST_UFFDWP;
+	bool verify = type == TEST_DEFAULT;
+	int register_args;
+
+	if (ftruncate(fd, len) < 0) {
+		perror(ERROR_PREFIX "ftruncate failed");
+		return status;
+	}
+
+	uffd = userfaultfd(O_CLOEXEC);
+	if (uffd < 0) {
+		perror(ERROR_PREFIX "uffd not created");
+		return status;
+	}
+
+	primary_map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	if (primary_map == MAP_FAILED) {
+		perror(ERROR_PREFIX "mmap for primary mapping failed");
+		goto close_uffd;
+	}
+	secondary_map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	if (secondary_map == MAP_FAILED) {
+		perror(ERROR_PREFIX "mmap for secondary mapping failed");
+		goto unmap_primary;
+	}
+
+	printf(PREFIX "primary mapping: %p\n", primary_map);
+	printf(PREFIX "secondary mapping: %p\n", secondary_map);
+
+	api.api = UFFD_API;
+	api.features = UFFD_FEATURE_SIGBUS | UFFD_FEATURE_EXACT_ADDRESS |
+		UFFD_FEATURE_EVENT_FORK;
+	if (ioctl(uffd, UFFDIO_API, &api) == -1) {
+		perror(ERROR_PREFIX "UFFDIO_API failed");
+		goto out;
+	}
+
+	if (madvise(primary_map, len, MADV_SPLIT)) {
+		perror(ERROR_PREFIX "MADV_SPLIT failed");
+		goto out;
+	}
+
+	/*
+	 * Register with UFFDIO_REGISTER_MODE_WP to force fork() to copy page
+	 * tables (also need UFFD_FEATURE_EVENT_FORK, which we have).
+	 */
+	register_args = UFFDIO_REGISTER_MODE_MISSING | UFFDIO_REGISTER_MODE_WP;
+	if (!uffd_wp)
+		/*
+		 * If we're testing UFFDIO_WRITEPROTECT, then we don't want
+		 * minor faults. With minor faults enabled, we'll get SIGBUSes
+		 * for any minor fault, whereas without minor faults enabled,
+		 * writes will verify that uffd-wp PTE markers were installed
+		 * properly.
+		 */
+		register_args |= UFFDIO_REGISTER_MODE_MINOR;
+
+	if (uffd_register(uffd, primary_map, len, register_args)) {
+		perror(ERROR_PREFIX "UFFDIO_REGISTER failed");
+		goto out;
+	}
+
+	new.sa_sigaction = &sigbus_handler;
+	new.sa_flags = SA_SIGINFO;
+	if (sigaction(SIGBUS, &new, &old) < 0) {
+		perror(ERROR_PREFIX "could not set up SIGBUS handler");
+		goto out;
+	}
+
+	status = TEST_FAILED;
+
+	if (uffd_wp) {
+		/*
+		 * Install uffd-wp PTE markers now. They should be preserved
+		 * as we split the mappings with UFFDIO_CONTINUE later.
+		 */
+		if (userfaultfd_writeprotect(uffd, primary_map, len, true))
+			goto done;
+		/* Verify that we really are write-protected. */
+		if (test_sigbus(primary_map, false))
+			goto done;
+	}
+
+	/*
+	 * Main piece of the test: map primary_map at all the possible
+	 * page sizes, starting at the hugepage size and going down to
+	 * PAGE_SIZE. This leaves the final PAGE_SIZE piece of the mapping
+	 * unmapped.
+	 */
+	if (test_continues(uffd, primary_map, secondary_map, len, verify))
+		goto done;
+
+	/*
+	 * Verify that MADV_HWPOISON is able to properly poison the entire
+	 * mapping.
+	 */
+	if (hwpoison) {
+		enum test_status new_status = test_hwpoison(primary_map, len);
+
+		if (new_status != TEST_PASSED) {
+			status = new_status;
+			goto done;
+		}
+	}
+
+	if (uffd_wp) {
+		/*
+		 * Check that the uffd-wp marker we installed initially still
+		 * exists in the unmapped 4K piece at the end of the mapping.
+		 *
+		 * test_sigbus() will do a write. When this happens:
+		 * 1. The page fault handler will find the uffd-wp marker and
+		 *    create a read-only PTE.
+		 * 2. The memory access is retried, and the page fault handler
+		 *    will find that a write was attempted in a UFFD_WP VMA
+		 *    where a RO mapping exists, so SIGBUS
+		 *    (we have UFFD_FEATURE_SIGBUS).
+		 *
+		 * We only check the final page because UFFDIO_CONTINUE will
+		 * have cleared the write-protection on all the other pieces
+		 * of the mapping.
+		 */
+		printf(PREFIX "verifying that we can't write to final page\n");
+		if (test_sigbus(primary_map + len - 1, false))
+			goto done;
+	}
+
+	if (!hwpoison)
+		/*
+		 * test_fork() will verify memory contents. We can't do
+		 * that if memory has been poisoned.
+		 */
+		if (test_fork(uffd, primary_map, len))
+			goto done;
+
+	/*
+	 * Check that MADV_COLLAPSE functions properly. That is:
+	 *  - the PAGE_SIZE hole we had is no longer unmapped.
+	 *  - poisoned regions are still poisoned.
+	 *
+	 * Verify the data is correct if we haven't poisoned.
+	 */
+	if (test_collapse(primary_map, len, !hwpoison))
+		goto done;
+	/*
+	 * Verify that memory is still poisoned.
+	 */
+	if (hwpoison && test_sigbus_range(primary_map, len, true))
+		goto done;
+
+	status = TEST_PASSED;
+
+done:
+	if (ftruncate(fd, 0) < 0) {
+		perror(ERROR_PREFIX "ftruncate back to 0 failed");
+		status = TEST_FAILED;
+	}
+
+out:
+	munmap(secondary_map, len);
+unmap_primary:
+	munmap(primary_map, len);
+close_uffd:
+	close(uffd);
+	return status;
+}
+
+int main(void)
+{
+	int fd;
+	struct statfs file_stat;
+	size_t hugepagesize;
+	size_t len;
+	enum test_status status;
+	int ret = 0;
+
+	fd = memfd_create("hugetlb_tmp", MFD_HUGETLB);
+	if (fd < 0) {
+		perror(ERROR_PREFIX "could not open hugetlbfs file");
+		return -1;
+	}
+
+	memset(&file_stat, 0, sizeof(file_stat));
+	if (fstatfs(fd, &file_stat)) {
+		perror(ERROR_PREFIX "fstatfs failed");
+		goto close;
+	}
+	if (file_stat.f_type != HUGETLBFS_MAGIC) {
+		printf(ERROR_PREFIX "not hugetlbfs file\n");
+		goto close;
+	}
+
+	hugepagesize = file_stat.f_bsize;
+	len = 2 * hugepagesize;
+
+	printf("HGM regular test...\n");
+	status = test_hgm(fd, hugepagesize, len, TEST_DEFAULT);
+	printf("HGM regular test: %s\n", status_to_str(status));
+	if (status == TEST_FAILED)
+		ret = -1;
+
+	printf("HGM uffd-wp test...\n");
+	status = test_hgm(fd, hugepagesize, len, TEST_UFFDWP);
+	printf("HGM uffd-wp test: %s\n", status_to_str(status));
+	if (status == TEST_FAILED)
+		ret = -1;
+
+	printf("HGM hwpoison test...\n");
+	status = test_hgm(fd, hugepagesize, len, TEST_HWPOISON);
+	printf("HGM hwpoison test: %s\n", status_to_str(status));
+	if (status == TEST_FAILED)
+		ret = -1;
+close:
+	close(fd);
+
+	return ret;
+}