From patchwork Thu Jan 5 10:17:59 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13089636
Date: Thu, 5 Jan 2023 10:17:59 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-2-jthoughton@google.com>
Subject: [PATCH 01/46] hugetlb: don't set PageUptodate for UFFDIO_CONTINUE
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
    Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert,
    Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin,
    Yang Shi, Andrew Morton, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, James Houghton
It would be bad if we actually set PageUptodate with UFFDIO_CONTINUE;
PageUptodate indicates that the page has been zeroed, and we don't want
to give a non-zeroed page to the user.

The reason this change is being made now is that UFFDIO_CONTINUEs on
subpages definitely shouldn't set this page flag on the head page.

Signed-off-by: James Houghton
---
 mm/hugetlb.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b39b74e0591a..b061e31c1fb8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6229,7 +6229,16 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 	 * preceding stores to the page contents become visible before
 	 * the set_pte_at() write.
 	 */
-	__SetPageUptodate(page);
+	if (!is_continue)
+		__SetPageUptodate(page);
+	else if (!PageUptodate(page)) {
+		/*
+		 * This should never happen; HugeTLB pages are always Uptodate
+		 * as soon as they are allocated.
+		 */
+		ret = -EFAULT;
+		goto out_release_nounlock;
+	}

 	/* Add shared, newly allocated pages to the page cache. */
 	if (vm_shared && !is_continue) {
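For context, UFFDIO_CONTINUE resolves a minor fault by mapping a page that is
already present in the page cache (typically populated through a second
mapping of the same hugetlbfs file), so the page is expected to be Uptodate
already rather than freshly zeroed here. A minimal, hypothetical userspace
sketch, assuming a userfaultfd already registered with
UFFDIO_REGISTER_MODE_MINOR and omitting error handling:

#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <string.h>

/*
 * Hypothetical helper: resolve a minor fault at 'addr' for 'len' bytes.
 * The contents at this offset were already written through another
 * mapping of the same hugetlbfs file, so the page-cache page is valid
 * and must not be treated as a freshly zeroed page.
 */
static void resolve_minor_fault(int uffd, unsigned long addr,
				unsigned long len)
{
	struct uffdio_continue cont;

	memset(&cont, 0, sizeof(cont));
	cont.range.start = addr;
	cont.range.len = len;
	cont.mode = 0;		/* or UFFDIO_CONTINUE_MODE_DONTWAKE */

	ioctl(uffd, UFFDIO_CONTINUE, &cont);
}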
From patchwork Thu Jan 5 10:18:00 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13089638
Date: Thu, 5 Jan 2023 10:18:00 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-3-jthoughton@google.com>
Subject: [PATCH 02/46] hugetlb: remove mk_huge_pte; it is unused
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
    Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert,
    Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin,
    Yang Shi, Andrew Morton, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, James Houghton
mk_huge_pte is unused and not necessary. pte_mkhuge is the appropriate
function to call to create a HugeTLB PTE (see
Documentation/mm/arch_pgtable_helpers.rst). It is being removed now to
avoid complicating the implementation of HugeTLB high-granularity mapping.
Acked-by: Peter Xu
Acked-by: Mina Almasry
Reviewed-by: Mike Kravetz
Signed-off-by: James Houghton
---
 arch/s390/include/asm/hugetlb.h | 5 -----
 include/asm-generic/hugetlb.h   | 5 -----
 mm/debug_vm_pgtable.c           | 2 +-
 mm/hugetlb.c                    | 7 +++----
 4 files changed, 4 insertions(+), 15 deletions(-)

diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h
index ccdbccfde148..c34893719715 100644
--- a/arch/s390/include/asm/hugetlb.h
+++ b/arch/s390/include/asm/hugetlb.h
@@ -77,11 +77,6 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 	set_huge_pte_at(mm, addr, ptep, pte_wrprotect(pte));
 }
 
-static inline pte_t mk_huge_pte(struct page *page, pgprot_t pgprot)
-{
-	return mk_pte(page, pgprot);
-}
-
 static inline int huge_pte_none(pte_t pte)
 {
 	return pte_none(pte);
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index d7f6335d3999..be2e763e956f 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -5,11 +5,6 @@
 #include <linux/swap.h>
 #include <linux/swapops.h>
 
-static inline pte_t mk_huge_pte(struct page *page, pgprot_t pgprot)
-{
-	return mk_pte(page, pgprot);
-}
-
 static inline unsigned long huge_pte_write(pte_t pte)
 {
 	return pte_write(pte);
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index c631ade3f1d2..643cce3493cc 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -900,7 +900,7 @@ static void __init hugetlb_basic_tests(struct pgtable_debug_args *args)
 	 * as it was previously derived from a real kernel symbol.
 	 */
 	page = pfn_to_page(args->fixed_pmd_pfn);
-	pte = mk_huge_pte(page, args->page_prot);
+	pte = mk_pte(page, args->page_prot);
 
 	WARN_ON(!huge_pte_dirty(huge_pte_mkdirty(pte)));
 	WARN_ON(!huge_pte_write(huge_pte_mkwrite(huge_pte_wrprotect(pte))));
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b061e31c1fb8..7e9793b602ac 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4870,11 +4870,10 @@ static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page,
 	unsigned int shift = huge_page_shift(hstate_vma(vma));
 
 	if (writable) {
-		entry = huge_pte_mkwrite(huge_pte_mkdirty(mk_huge_pte(page,
-					 vma->vm_page_prot)));
+		entry = huge_pte_mkwrite(huge_pte_mkdirty(mk_pte(page,
+					 vma->vm_page_prot)));
 	} else {
-		entry = huge_pte_wrprotect(mk_huge_pte(page,
-					   vma->vm_page_prot));
+		entry = huge_pte_wrprotect(mk_pte(page, vma->vm_page_prot));
 	}
 	entry = pte_mkyoung(entry);
 	entry = arch_make_huge_pte(entry, shift, vma->vm_flags);
From patchwork Thu Jan 5 10:18:01 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13089639
Date: Thu, 5 Jan 2023 10:18:01 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-4-jthoughton@google.com>
Subject: [PATCH 03/46] hugetlb: remove redundant pte_mkhuge in migration path
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
    Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert,
    Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin,
    Yang Shi, Andrew Morton, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, James Houghton

arch_make_huge_pte, which is called immediately following pte_mkhuge,
already makes the necessary changes to the PTE that pte_mkhuge would have.
The generic implementation of arch_make_huge_pte simply calls pte_mkhuge.
Acked-by: Peter Xu
Acked-by: Mina Almasry
Reviewed-by: Mike Kravetz
Signed-off-by: James Houghton
---
 mm/migrate.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 494b3753fda9..b5032c3e940a 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -246,7 +246,6 @@ static bool remove_migration_pte(struct folio *folio,
 		if (folio_test_hugetlb(folio)) {
 			unsigned int shift = huge_page_shift(hstate_vma(vma));
 
-			pte = pte_mkhuge(pte);
 			pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
 			if (folio_test_anon(folio))
 				hugepage_add_anon_rmap(new, vma, pvmw.address,
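For reference, the generic fallback used when an architecture does not
provide its own arch_make_huge_pte() is essentially a call to pte_mkhuge(),
roughly like the sketch below (paraphrased, not a quote of the header):

/*
 * Sketch of the generic fallback (architectures may override this):
 * it simply applies pte_mkhuge(), which is why the explicit
 * pte_mkhuge() call removed above was redundant.
 */
static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift,
				       vm_flags_t flags)
{
	return pte_mkhuge(entry);
}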
From patchwork Thu Jan 5 10:18:02 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13089640
Date: Thu, 5 Jan 2023 10:18:02 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-5-jthoughton@google.com>
Subject: [PATCH 04/46] hugetlb: only adjust address ranges when VMAs want PMD sharing
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
    Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert,
    Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin,
    Yang Shi, Andrew Morton, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, James Houghton
Currently this check is overly aggressive. For some userfaultfd VMAs,
PMD sharing is disabled, yet we still widen the address range that is
used for flushing TLBs and sending MMU notifiers. This change is made
now because HGM VMAs also have sharing disabled, yet they would still
have their flush ranges adjusted. Over-aggressively flushing TLBs and
triggering MMU notifiers is particularly harmful with lots of
high-granularity operations.

Acked-by: Peter Xu
Reviewed-by: Mike Kravetz
Signed-off-by: James Houghton
---
 mm/hugetlb.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 7e9793b602ac..99fadd7680ec 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6961,22 +6961,31 @@ static unsigned long page_table_shareable(struct vm_area_struct *svma,
 	return saddr;
 }
 
-bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr)
+static bool pmd_sharing_possible(struct vm_area_struct *vma)
 {
-	unsigned long start = addr & PUD_MASK;
-	unsigned long end = start + PUD_SIZE;
-
 #ifdef CONFIG_USERFAULTFD
 	if (uffd_disable_huge_pmd_share(vma))
 		return false;
 #endif
 	/*
-	 * check on proper vm_flags and page table alignment
+	 * Only shared VMAs can share PMDs.
 	 */
 	if (!(vma->vm_flags & VM_MAYSHARE))
 		return false;
 	if (!vma->vm_private_data)	/* vma lock required for sharing */
 		return false;
+	return true;
+}
+
+bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr)
+{
+	unsigned long start = addr & PUD_MASK;
+	unsigned long end = start + PUD_SIZE;
+	/*
+	 * check on proper vm_flags and page table alignment
+	 */
+	if (!pmd_sharing_possible(vma))
+		return false;
 	if (!range_in_vma(vma, start, end))
 		return false;
 	return true;
@@ -6997,7 +7006,7 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
 	 * vma needs to span at least one aligned PUD size, and the range
 	 * must be at least partially within in.
 	 */
-	if (!(vma->vm_flags & VM_MAYSHARE) || !(v_end > v_start) ||
+	if (!pmd_sharing_possible(vma) || !(v_end > v_start) ||
 	    (*end <= v_start) || (*start >= v_end))
 		return;
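To make the widening concrete, here is a small worked example of the PUD
rounding that want_pmd_share() performs, using hypothetical x86-64 values
where PUD_SIZE is 1 GiB:

unsigned long addr  = 0x40230000UL;	/* faulting address (example) */
unsigned long start = addr & PUD_MASK;	/* 0x40000000, 1 GiB aligned  */
unsigned long end   = start + PUD_SIZE;	/* 0x80000000                 */

/*
 * Before this patch, a shared VMA whose PMD sharing was disabled (for
 * example by userfaultfd) could still have its flush/notifier range
 * widened to [start, end); with this patch the widening only happens
 * when pmd_sharing_possible() reports that sharing can actually occur.
 */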
From patchwork Thu Jan 5 10:18:03 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13089641
Date: Thu, 5 Jan 2023 10:18:03 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-6-jthoughton@google.com>
Subject: [PATCH 05/46] hugetlb: add CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
    Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert,
    Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin,
    Yang Shi, Andrew Morton, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, James Houghton
This adds a Kconfig option to enable or disable high-granularity mapping.
Each architecture must explicitly opt in to it (via
ARCH_WANT_HUGETLB_HIGH_GRANULARITY_MAPPING), but when opted in, HGM will
be enabled by default if HUGETLB_PAGE is enabled.

Signed-off-by: James Houghton
---
 fs/Kconfig | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/Kconfig b/fs/Kconfig
index 2685a4d0d353..ce2567946016 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -267,6 +267,13 @@ config HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON
 	  enable HVO by default. It can be disabled via hugetlb_free_vmemmap=off
 	  (boot command line) or hugetlb_optimize_vmemmap (sysctl).
 
+config ARCH_WANT_HUGETLB_HIGH_GRANULARITY_MAPPING
+	bool
+
+config HUGETLB_HIGH_GRANULARITY_MAPPING
+	def_bool HUGETLB_PAGE
+	depends on ARCH_WANT_HUGETLB_HIGH_GRANULARITY_MAPPING
+
 config MEMFD_CREATE
 	def_bool TMPFS || HUGETLBFS
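For illustration only (no architecture is opted in by this patch), an
architecture would enable HGM by selecting the new symbol from its own
Kconfig, along these lines:

# Hypothetical arch/<arch>/Kconfig hunk, shown only as an example
config X86_64
	def_bool y
	select ARCH_WANT_HUGETLB_HIGH_GRANULARITY_MAPPING

With that in place, CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING defaults to the
value of HUGETLB_PAGE on that architecture, as the def_bool above specifies.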
From patchwork Thu Jan 5 10:18:04 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13089642
Date: Thu, 5 Jan 2023 10:18:04 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-7-jthoughton@google.com>
Subject: [PATCH 06/46] mm: add VM_HUGETLB_HGM VMA flag
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
    Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert,
    Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin,
    Yang Shi, Andrew Morton, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, James Houghton
VM_HUGETLB_HGM indicates that a HugeTLB VMA may contain high-granularity
mappings. Its VmFlags string in /proc/<pid>/smaps is "hm".
Signed-off-by: James Houghton
---
 fs/proc/task_mmu.c             | 3 +++
 include/linux/mm.h             | 7 +++++++
 include/trace/events/mmflags.h | 7 +++++++
 3 files changed, 17 insertions(+)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index e35a0398db63..41b5509bde0e 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -711,6 +711,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR
 		[ilog2(VM_UFFD_MINOR)]	= "ui",
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */
+#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
+		[ilog2(VM_HUGETLB_HGM)]	= "hm",
+#endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */
 	};
 	size_t i;
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c37f9330f14e..738b3605f80e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -372,6 +372,13 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_UFFD_MINOR		VM_NONE
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */
 
+#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
+# define VM_HUGETLB_HGM_BIT	38
+# define VM_HUGETLB_HGM		BIT(VM_HUGETLB_HGM_BIT)	/* HugeTLB high-granularity mapping */
+#else /* !CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */
+# define VM_HUGETLB_HGM		VM_NONE
+#endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */
+
 /* Bits set in the VMA until the stack is in its final location */
 #define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ)
 
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index 412b5a46374c..88ce04b2ff69 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -163,6 +163,12 @@ IF_HAVE_PG_SKIP_KASAN_POISON(PG_skip_kasan_poison, "skip_kasan_poison")
 # define IF_HAVE_UFFD_MINOR(flag, name)
 #endif
 
+#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
+# define IF_HAVE_HUGETLB_HGM(flag, name) {flag, name},
+#else
+# define IF_HAVE_HUGETLB_HGM(flag, name)
+#endif
+
 #define __def_vmaflag_names						\
 	{VM_READ,			"read"		},		\
 	{VM_WRITE,			"write"		},		\
@@ -187,6 +193,7 @@ IF_HAVE_UFFD_MINOR(VM_UFFD_MINOR,	"uffd_minor"	)		\
 	{VM_ACCOUNT,			"account"	},		\
 	{VM_NORESERVE,			"noreserve"	},		\
 	{VM_HUGETLB,			"hugetlb"	},		\
+IF_HAVE_HUGETLB_HGM(VM_HUGETLB_HGM,	"hugetlb_hgm"	)		\
 	{VM_SYNC,			"sync"		},		\
 	__VM_ARCH_SPECIFIC_1				,		\
 	{VM_WIPEONFORK,			"wipeonfork"	},		\
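To show the intended use, a hypothetical helper (name and placement are
illustrative, not taken from this series) could gate HGM behaviour on the
new flag. Because VM_HUGETLB_HGM is defined as VM_NONE when the config
option is off, the check folds to false on non-HGM kernels:

/*
 * Hypothetical helper, not part of this patch. VM_HUGETLB_HGM is
 * VM_NONE (0) when CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING is disabled,
 * so this check constant-folds to false on kernels without HGM.
 */
static inline bool vma_has_hgm(struct vm_area_struct *vma)
{
	return !!(vma->vm_flags & VM_HUGETLB_HGM);
}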
From patchwork Thu Jan 5 10:18:05 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13089685
Date: Thu, 5 Jan 2023 10:18:05 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-8-jthoughton@google.com>
Subject: [PATCH 07/46] hugetlb: rename __vma_shareable_flags_pmd to __vma_has_hugetlb_vma_lock
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton

Previously, if the hugetlb VMA lock was present, that meant that the VMA was PMD-shareable. Now it is possible that the VMA lock is allocated but the VMA is not PMD-shareable: if the VMA is a high-granularity VMA. It is possible for a high-granularity VMA not to have a VMA lock; in this case, MADV_COLLAPSE will not be able to collapse the mappings.
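As a side note (illustration only, not part of the patch): after this change, the presence of the lock only tells a caller that locking is needed for safe page table walks, not why the lock exists. A caller-side view of the renamed helper, which appears in the first hunk below, might read:

	if (__vma_has_hugetlb_vma_lock(vma)) {
		/*
		 * VM_MAYSHARE is set and vm_private_data holds the lock:
		 * the VMA may be PMD-shareable, HGM-collapsible, or both,
		 * so take vma_lock->rw_sema around page table walks.
		 */
	} else {
		/*
		 * No lock was allocated: PMD sharing is not possible, and
		 * if this is a high-granularity VMA, MADV_COLLAPSE will
		 * not be able to collapse its mappings.
		 */
	}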
Signed-off-by: James Houghton --- include/linux/hugetlb.h | 15 ++++++++++----- mm/hugetlb.c | 16 ++++++++-------- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index b6b10101bea7..aa49fd8cb47c 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -1235,7 +1235,8 @@ bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr); #define flush_hugetlb_tlb_range(vma, addr, end) flush_tlb_range(vma, addr, end) #endif -static inline bool __vma_shareable_lock(struct vm_area_struct *vma) +static inline bool +__vma_has_hugetlb_vma_lock(struct vm_area_struct *vma) { return (vma->vm_flags & VM_MAYSHARE) && vma->vm_private_data; } @@ -1252,13 +1253,17 @@ hugetlb_walk(struct vm_area_struct *vma, unsigned long addr, unsigned long sz) struct hugetlb_vma_lock *vma_lock = vma->vm_private_data; /* - * If pmd sharing possible, locking needed to safely walk the - * hugetlb pgtables. More information can be found at the comment - * above huge_pte_offset() in the same file. + * If the VMA has the hugetlb vma lock (PMD sharable or HGM + * collapsible), locking needed to safely walk the hugetlb pgtables. + * More information can be found at the comment above huge_pte_offset() + * in the same file. + * + * This doesn't do a full high-granularity walk, so we are concerned + * only with PMD unsharing. * * NOTE: lockdep_is_held() is only defined with CONFIG_LOCKDEP. */ - if (__vma_shareable_lock(vma)) + if (__vma_has_hugetlb_vma_lock(vma)) WARN_ON_ONCE(!lockdep_is_held(&vma_lock->rw_sema) && !lockdep_is_held( &vma->vm_file->f_mapping->i_mmap_rwsem)); diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 99fadd7680ec..2f86fedef283 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -260,7 +260,7 @@ static inline struct hugepage_subpool *subpool_vma(struct vm_area_struct *vma) */ void hugetlb_vma_lock_read(struct vm_area_struct *vma) { - if (__vma_shareable_lock(vma)) { + if (__vma_has_hugetlb_vma_lock(vma)) { struct hugetlb_vma_lock *vma_lock = vma->vm_private_data; down_read(&vma_lock->rw_sema); @@ -269,7 +269,7 @@ void hugetlb_vma_lock_read(struct vm_area_struct *vma) void hugetlb_vma_unlock_read(struct vm_area_struct *vma) { - if (__vma_shareable_lock(vma)) { + if (__vma_has_hugetlb_vma_lock(vma)) { struct hugetlb_vma_lock *vma_lock = vma->vm_private_data; up_read(&vma_lock->rw_sema); @@ -278,7 +278,7 @@ void hugetlb_vma_unlock_read(struct vm_area_struct *vma) void hugetlb_vma_lock_write(struct vm_area_struct *vma) { - if (__vma_shareable_lock(vma)) { + if (__vma_has_hugetlb_vma_lock(vma)) { struct hugetlb_vma_lock *vma_lock = vma->vm_private_data; down_write(&vma_lock->rw_sema); @@ -287,7 +287,7 @@ void hugetlb_vma_lock_write(struct vm_area_struct *vma) void hugetlb_vma_unlock_write(struct vm_area_struct *vma) { - if (__vma_shareable_lock(vma)) { + if (__vma_has_hugetlb_vma_lock(vma)) { struct hugetlb_vma_lock *vma_lock = vma->vm_private_data; up_write(&vma_lock->rw_sema); @@ -298,7 +298,7 @@ int hugetlb_vma_trylock_write(struct vm_area_struct *vma) { struct hugetlb_vma_lock *vma_lock = vma->vm_private_data; - if (!__vma_shareable_lock(vma)) + if (!__vma_has_hugetlb_vma_lock(vma)) return 1; return down_write_trylock(&vma_lock->rw_sema); @@ -306,7 +306,7 @@ int hugetlb_vma_trylock_write(struct vm_area_struct *vma) void hugetlb_vma_assert_locked(struct vm_area_struct *vma) { - if (__vma_shareable_lock(vma)) { + if (__vma_has_hugetlb_vma_lock(vma)) { struct hugetlb_vma_lock *vma_lock = vma->vm_private_data; 
lockdep_assert_held(&vma_lock->rw_sema); @@ -338,7 +338,7 @@ static void __hugetlb_vma_unlock_write_put(struct hugetlb_vma_lock *vma_lock) static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma) { - if (__vma_shareable_lock(vma)) { + if (__vma_has_hugetlb_vma_lock(vma)) { struct hugetlb_vma_lock *vma_lock = vma->vm_private_data; __hugetlb_vma_unlock_write_put(vma_lock); @@ -350,7 +350,7 @@ static void hugetlb_vma_lock_free(struct vm_area_struct *vma) /* * Only present in sharable vmas. */ - if (!vma || !__vma_shareable_lock(vma)) + if (!vma || !__vma_has_hugetlb_vma_lock(vma)) return; if (vma->vm_private_data) { From patchwork Thu Jan 5 10:18:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089643 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D06FBC53210 for ; Thu, 5 Jan 2023 10:19:07 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6D47F900007; Thu, 5 Jan 2023 05:19:07 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 65ADC8E0001; Thu, 5 Jan 2023 05:19:07 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4AD23900007; Thu, 5 Jan 2023 05:19:07 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 387408E0001 for ; Thu, 5 Jan 2023 05:19:07 -0500 (EST) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 0D3D7120D2E for ; Thu, 5 Jan 2023 10:19:07 +0000 (UTC) X-FDA: 80320347534.30.D766BBB Received: from mail-yb1-f202.google.com (mail-yb1-f202.google.com [209.85.219.202]) by imf26.hostedemail.com (Postfix) with ESMTP id 842D4140010 for ; Thu, 5 Jan 2023 10:19:05 +0000 (UTC) Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=UFJAvfKJ; spf=pass (imf26.hostedemail.com: domain of 3GaS2YwoKCGAHRFMSEFRMLEMMEJC.AMKJGLSV-KKIT8AI.MPE@flex--jthoughton.bounces.google.com designates 209.85.219.202 as permitted sender) smtp.mailfrom=3GaS2YwoKCGAHRFMSEFRMLEMMEJC.AMKJGLSV-KKIT8AI.MPE@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672913945; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=GMmONeDZhRsfRYenu3fsdwuxvPajRN7/ApHlNDbguPQ=; b=IYNN35GdhHt6fK9EiPdnlrIGr1h90Mp11PpF78HD+BVKCkb2PZ14Y4r/YX7Oz9ACajX1Gf PtocjbSOyPJHsZlkagFSRzbF07GRLmd2ldQujEgW9A/EFaZKULEahCRUTost7Tb/I3Al4C cwRee5dnvWb0aaEAVRLUbYFaHSgszTI= ARC-Authentication-Results: i=1; imf26.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=UFJAvfKJ; spf=pass (imf26.hostedemail.com: domain of 3GaS2YwoKCGAHRFMSEFRMLEMMEJC.AMKJGLSV-KKIT8AI.MPE@flex--jthoughton.bounces.google.com designates 209.85.219.202 as permitted sender) smtp.mailfrom=3GaS2YwoKCGAHRFMSEFRMLEMMEJC.AMKJGLSV-KKIT8AI.MPE@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com 
Date: Thu, 5 Jan 2023 10:18:06 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-9-jthoughton@google.com>
Subject: [PATCH 08/46] hugetlb: add HugeTLB HGM enablement helpers
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr .
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 842D4140010 X-Rspam-User: X-Stat-Signature: qcxf347nyddfqswcfz55cga9eaxccs19 X-HE-Tag: 1672913945-595524 X-HE-Meta: U2FsdGVkX19ydvy3JgNXAKQlGVCZYRQJmXCYX3W9fvVr1AxsfoCMLWHeeGaU78KuNjfWcvnGq3h0y1zj8mlz78TzAtr2k8Hh7RTTlxQPZh02LPXwyehrKYlTDehuvE7rka3OUiDRAgt87QQx/g5KoQPlGlhUsWS7UbeCkySv1E0dmCwhwgKjEbHBCpgXg7aq3+wO3h7vaDD1Pj4AbV6jLUt6G8EDreZ4BfQ4pg1fU013DthTa0/fZo9my/4gd1eM5dUiwrOKn1TBaFMgNZ9MBTPuFP4oTIWvltCj6Re7nMq/9eY19G0e1L30z4dMFlcoY5YFmR85svAnw/PWMCMMPrRihMIyu2Yz/5LSmvJxh4sc+dhtDI82LqxPsYF45qxX6dQ00mgsLg7OR0SXPuKByo7gM+Zv82yujs0G6vMH3eShJfQUwvgF9aZRak/oeCzTrr5dEoPWWuDKwWVp6tcbipOfF5LQD/NL+VQQ2+luv3pe0vz/Fem1tDmAILwcLLtdsxafSMjGS3bCEjlDAXkm+a5kdkB7O4aKP3hFFhIfl/CcRj15LXP+ELaGMKpd8J9LMEEdY7VD8x6O+wcP3w8/4QL2cAdhHd8nC46ER61Z+jGLobyNT9KmthBMozkuKO++TPkyHYhMBUMguPdiDJHS5A1iYdRgWAqyS01UAs+NYvzmOkg/+vV3pt/Ytd0mL6+4tKisukyYIS6Sx40YWqCDloIrWsO74eQ25w3ALSbMunFHdkYaLC9Nj9+3jaYGCzn9TjvZFEtalGSzKxjusLlHAo2gaY0mKga41eNAxTc18u6SPMKc28uDNcat0anH0mSAt5c/DkuyCJI9YLG4Pjlb40ug4xOGjIjFX/rSgmQ2B2jnLP6mGAy84epEbDpncjFRP2wI3E1aV3st67IEJ5E6dqRk0+0AQBb9FI/rdB95ik6tuX7K45dCdRNm7iHNwVdlKTZPZWF6rTxYv2Gkvdq JOE6qy2/ xitr/p5RLU6Q2LnMJkUhTbZNLRjsQgmp2hcbxx3ruWFVsgF3ZbOVZTIQeKjI8rnE8pbxSmyIgsscHYQHnlmtFeKpeho8Oan/ay+i5ukOhkY4PBleN2dl5yCUcB4xjYaxhR6yFEGDh4V2kz5Ba8aTonTWNdHat+XQ8dr8PgIGQ9oZspdcKsbd/itA8kdsjoOMjTWEse80edWbdm2ROkzteh/D+lwF0KQHPTtLf5UIg8mTVz8xVpG6OZ4gOMYuCKP+1XXRTPUjmWhuc+dgSW7gANrl5gzVaJwZhTMKKh82Ah2GIbdjgiPiBkEFaLNSl2jMQA3Kc17DTW94WZy42c/oSRa4Icq3Ez8tmP+1hg43ZGFg2OXWBdWEkA/reIAtxF9ghadhFGJurjMw2fDs= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: hugetlb_hgm_eligible indicates that a VMA is eligible to have HGM explicitly enabled via MADV_SPLIT, and hugetlb_hgm_enabled indicates that HGM has been enabled. Signed-off-by: James Houghton --- include/linux/hugetlb.h | 14 ++++++++++++++ mm/hugetlb.c | 23 +++++++++++++++++++++++ 2 files changed, 37 insertions(+) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index aa49fd8cb47c..8713d9c4f86c 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -1207,6 +1207,20 @@ static inline void hugetlb_unregister_node(struct node *node) } #endif /* CONFIG_HUGETLB_PAGE */ +#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING +bool hugetlb_hgm_enabled(struct vm_area_struct *vma); +bool hugetlb_hgm_eligible(struct vm_area_struct *vma); +#else +static inline bool hugetlb_hgm_enabled(struct vm_area_struct *vma) +{ + return false; +} +static inline bool hugetlb_hgm_eligible(struct vm_area_struct *vma) +{ + return false; +} +#endif + static inline spinlock_t *huge_pte_lock(struct hstate *h, struct mm_struct *mm, pte_t *pte) { diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 2f86fedef283..d27fe05d5ef6 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6966,6 +6966,10 @@ static bool pmd_sharing_possible(struct vm_area_struct *vma) #ifdef CONFIG_USERFAULTFD if (uffd_disable_huge_pmd_share(vma)) return false; +#endif +#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING + if (hugetlb_hgm_enabled(vma)) + return false; #endif /* * Only shared VMAs can share PMDs. 
@@ -7229,6 +7233,25 @@ __weak unsigned long hugetlb_mask_last_page(struct hstate *h) #endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */ +#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING +bool hugetlb_hgm_eligible(struct vm_area_struct *vma) +{ + /* + * All shared VMAs may have HGM. + * + * HGM requires using the VMA lock, which only exists for shared VMAs. + * To make HGM work for private VMAs, we would need to use another + * scheme to prevent collapsing/splitting from invalidating other + * threads' page table walks. + */ + return vma && (vma->vm_flags & VM_MAYSHARE); +} +bool hugetlb_hgm_enabled(struct vm_area_struct *vma) +{ + return vma && (vma->vm_flags & VM_HUGETLB_HGM); +} +#endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */ + /* * These functions are overwritable if your architecture needs its own * behavior. From patchwork Thu Jan 5 10:18:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089644 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0B16C3DA7D for ; Thu, 5 Jan 2023 10:19:09 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4D798940007; Thu, 5 Jan 2023 05:19:09 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 486738E0001; Thu, 5 Jan 2023 05:19:09 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 301BB940007; Thu, 5 Jan 2023 05:19:09 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 166808E0001 for ; Thu, 5 Jan 2023 05:19:09 -0500 (EST) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id E995A140217 for ; Thu, 5 Jan 2023 10:19:08 +0000 (UTC) X-FDA: 80320347576.20.6CBAF7B Received: from mail-vs1-f73.google.com (mail-vs1-f73.google.com [209.85.217.73]) by imf08.hostedemail.com (Postfix) with ESMTP id 5897516000A for ; Thu, 5 Jan 2023 10:19:07 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b="mO/o/8j5"; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf08.hostedemail.com: domain of 3GqS2YwoKCGEISGNTFGSNMFNNFKD.BNLKHMTW-LLJU9BJ.NQF@flex--jthoughton.bounces.google.com designates 209.85.217.73 as permitted sender) smtp.mailfrom=3GqS2YwoKCGEISGNTFGSNMFNNFKD.BNLKHMTW-LLJU9BJ.NQF@flex--jthoughton.bounces.google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672913947; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=wyTgKoUPaqLvPlKgkL6dENYZlNeSHJ5X+BJXc4mR0WE=; b=sLs05vGye4+n1qa40z7+TnoQyP+0ILQUmNu7QOsyNBp6YcfeSoXg8l2gsM508scuzujb69 d84t6s3wz0BoosCSg7OxgeviONd1aIKP+yXlVqHjS+Y116LG8ZrWt+GnEy6R229t8NaaXw 7451PIhSLnaGwPitF8jsm9MIE/WcHTQ= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b="mO/o/8j5"; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf08.hostedemail.com: domain of 
Date: Thu, 5 Jan 2023 10:18:07 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-10-jthoughton@google.com>
Subject: [PATCH 09/46] mm: add MADV_SPLIT to enable HugeTLB HGM
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 5897516000A X-Stat-Signature: r6h81aa7q1t7797yfda7kc5aacozk3es X-HE-Tag: 1672913947-645801 X-HE-Meta: U2FsdGVkX18k/yrqFfjC57J9hpy+D9eU5MCX3zlt9ugqr7mCpDCo/EDMxXhExR28Dlhi9Hrfl/iuC4B09a76yKpR6Yhi5tt0GcFsu4FOop4rQeK7WoQFIa460UeI80t10JdD1qulFEHeRhtV2IV1g/SITqohDlkn7T3X8feymdtcL1rF6Dm4sV9757fLDvAjE8m1lAJdsa0uS29zJ6PgilmnBxBk5F57tST/fUC6bFrW9Sp/0heWdojGrME3SIwuEvbZ1hC4BjSa/iyNZzTHvC9OnLh3xol4/VH1rhxJ9UpRA77Jl2ZGpNY0Jt1ZGxtv15HCC72S2BQDloEXWGdmrAMDyqfk4Q0eWoHyloojG6I2c8rKpskKm393UQ3oSDjs0Ql57oj/V00B52jVCA0jx6cHUVaqZOY4RGWHtfI3Tqx3i2yjPm6eLyE5xcCrDIGpEHhI7tFwgeoPa25lLo8Yv6+8zsagYx6j/R52CRaIjH9QoWlhaGZGCAMqi+SR1p0nPMIN0BpnIhaUc7TfUrWtWXQt5bHNyfVSeinv77TT2+CyBUiLHoritQ7NxYmjfqp9QfTt1mIv1nsHNX6TZSdh0LK3QGWpDJR2DVfB6ujlVsE84l+GehkHfReHSbrdbB4vH1tNLXpRV2uK+E08eRK1s54kdY13t1aLg2lFSaAoh/NrYkW2dUsvOSDI9xcfYvQFhgoGPPFpZQMfZH8PtvIkqMI1L4GS3OJfP5vMRe+XtUq5O4Merc9GWeGzFKAucMG0QL2P3f2aYnD8Mkgl0wgaQlQfb2KMuqspCbcSlcqmkS9bYEPE7AtFZuHneQmvOQWVTGzArV7rQCls/K4TUeuoF3ycgFce11ytBkRzBqzSuZj1HSfSP9lIBgAxM9W80OUalmzFQ+WVdPrEpeKxJQ8DIEyj82s+K5PjYiSkon8qE8QFhfRzxfE+5S/9YMHpJQRPdHAP0SirwAQhpFtNcl5 +zcUtMqI xWRlVUIBz8UUrS0vk5XFz5ax78d/RWZR9UGDQ3YPTVbsovE/ofM6G0I167hCv/Tni46jD+6/xfDVPASH8LWREhWhblqd2+aovPe8GURcCStO9VgNlVZIFWt6/qDrNf3dHOOEXHcrCTBdLYvfssGbEKNJVf9JyctX7jzf1yVZTL7RgfR41UqySjJ4/5M2rS3D+j5VMqQiXiv0MnixagriUKlk3eWM5XriiWZZsJr7x+H42JXpKogPZYa8Ulg7VUrC9MMdKTIltAxkHYIRhyGBSudfQpHnmxLuYvmBbGRUaMhWGWSbdzBnk0R37SYgCBgd0a2hYOrY4JseHsJUz82h234c3aYcMfDk+uhNQThJCOyTfDlsfz5YM9PXSLHFGjARsM+Qh5a8RXwSevGMyQgPyZwLJ8sw1Fwp7nu5LxOVRdssLdiiPijVVHpATWsolHCRr3km/ X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Issuing ioctl(MADV_SPLIT) on a HugeTLB address range will enable HugeTLB HGM. MADV_SPLIT was chosen for the name so that this API can be applied to non-HugeTLB memory in the future, if such an application is to arise. MADV_SPLIT provides several API changes for some syscalls on HugeTLB address ranges: 1. UFFDIO_CONTINUE is allowed for MAP_SHARED VMAs at PAGE_SIZE alignment. 2. read()ing a page fault event from a userfaultfd will yield a PAGE_SIZE-rounded address, instead of a huge-page-size-rounded address (unless UFFD_FEATURE_EXACT_ADDRESS is used). There is no way to disable the API changes that come with issuing MADV_SPLIT. MADV_COLLAPSE can be used to collapse high-granularity page table mappings that come from the extended functionality that comes with using MADV_SPLIT. For post-copy live migration, the expected use-case is: 1. mmap(MAP_SHARED, some_fd) primary mapping 2. mmap(MAP_SHARED, some_fd) alias mapping 3. MADV_SPLIT the primary mapping 4. UFFDIO_REGISTER/etc. the primary mapping 5. Copy memory contents into alias mapping and UFFDIO_CONTINUE the corresponding PAGE_SIZE sections in the primary mapping. More API changes may be added in the future. 
Signed-off-by: James Houghton --- arch/alpha/include/uapi/asm/mman.h | 2 ++ arch/mips/include/uapi/asm/mman.h | 2 ++ arch/parisc/include/uapi/asm/mman.h | 2 ++ arch/xtensa/include/uapi/asm/mman.h | 2 ++ include/linux/hugetlb.h | 2 ++ include/uapi/asm-generic/mman-common.h | 2 ++ mm/hugetlb.c | 3 +-- mm/madvise.c | 26 ++++++++++++++++++++++++++ 8 files changed, 39 insertions(+), 2 deletions(-) diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h index 763929e814e9..7a26f3648b90 100644 --- a/arch/alpha/include/uapi/asm/mman.h +++ b/arch/alpha/include/uapi/asm/mman.h @@ -78,6 +78,8 @@ #define MADV_COLLAPSE 25 /* Synchronous hugepage collapse */ +#define MADV_SPLIT 26 /* Enable hugepage high-granularity APIs */ + /* compatibility flags */ #define MAP_FILE 0 diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h index c6e1fc77c996..f8a74a3a0928 100644 --- a/arch/mips/include/uapi/asm/mman.h +++ b/arch/mips/include/uapi/asm/mman.h @@ -105,6 +105,8 @@ #define MADV_COLLAPSE 25 /* Synchronous hugepage collapse */ +#define MADV_SPLIT 26 /* Enable hugepage high-granularity APIs */ + /* compatibility flags */ #define MAP_FILE 0 diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h index 68c44f99bc93..a6dc6a56c941 100644 --- a/arch/parisc/include/uapi/asm/mman.h +++ b/arch/parisc/include/uapi/asm/mman.h @@ -72,6 +72,8 @@ #define MADV_COLLAPSE 25 /* Synchronous hugepage collapse */ +#define MADV_SPLIT 74 /* Enable hugepage high-granularity APIs */ + #define MADV_HWPOISON 100 /* poison a page for testing */ #define MADV_SOFT_OFFLINE 101 /* soft offline page for testing */ diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h index 1ff0c858544f..f98a77c430a9 100644 --- a/arch/xtensa/include/uapi/asm/mman.h +++ b/arch/xtensa/include/uapi/asm/mman.h @@ -113,6 +113,8 @@ #define MADV_COLLAPSE 25 /* Synchronous hugepage collapse */ +#define MADV_SPLIT 26 /* Enable hugepage high-granularity APIs */ + /* compatibility flags */ #define MAP_FILE 0 diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 8713d9c4f86c..16fc3e381801 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -109,6 +109,8 @@ struct hugetlb_vma_lock { struct vm_area_struct *vma; }; +void hugetlb_vma_lock_alloc(struct vm_area_struct *vma); + extern struct resv_map *resv_map_alloc(void); void resv_map_release(struct kref *ref); diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h index 6ce1f1ceb432..996e8ded092f 100644 --- a/include/uapi/asm-generic/mman-common.h +++ b/include/uapi/asm-generic/mman-common.h @@ -79,6 +79,8 @@ #define MADV_COLLAPSE 25 /* Synchronous hugepage collapse */ +#define MADV_SPLIT 26 /* Enable hugepage high-granularity APIs */ + /* compatibility flags */ #define MAP_FILE 0 diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d27fe05d5ef6..5bd53ae8ca4b 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -92,7 +92,6 @@ struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp; /* Forward declaration */ static int hugetlb_acct_memory(struct hstate *h, long delta); static void hugetlb_vma_lock_free(struct vm_area_struct *vma); -static void hugetlb_vma_lock_alloc(struct vm_area_struct *vma); static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma); static inline bool subpool_is_free(struct hugepage_subpool *spool) @@ -361,7 +360,7 @@ static void hugetlb_vma_lock_free(struct vm_area_struct *vma) } } -static void 
hugetlb_vma_lock_alloc(struct vm_area_struct *vma) +void hugetlb_vma_lock_alloc(struct vm_area_struct *vma) { struct hugetlb_vma_lock *vma_lock; diff --git a/mm/madvise.c b/mm/madvise.c index 025be3517af1..04ee28992e52 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -1011,6 +1011,24 @@ static long madvise_remove(struct vm_area_struct *vma, return error; } +static int madvise_split(struct vm_area_struct *vma, + unsigned long *new_flags) +{ + if (!is_vm_hugetlb_page(vma) || !hugetlb_hgm_eligible(vma)) + return -EINVAL; + /* + * Attempt to allocate the VMA lock again. If it isn't allocated, + * MADV_COLLAPSE won't work. + */ + hugetlb_vma_lock_alloc(vma); + + /* PMD sharing doesn't work with HGM. */ + hugetlb_unshare_all_pmds(vma); + + *new_flags |= VM_HUGETLB_HGM; + return 0; +} + /* * Apply an madvise behavior to a region of a vma. madvise_update_vma * will handle splitting a vm area into separate areas, each area with its own @@ -1089,6 +1107,11 @@ static int madvise_vma_behavior(struct vm_area_struct *vma, break; case MADV_COLLAPSE: return madvise_collapse(vma, prev, start, end); + case MADV_SPLIT: + error = madvise_split(vma, &new_flags); + if (error) + goto out; + break; } anon_name = anon_vma_name(vma); @@ -1183,6 +1206,9 @@ madvise_behavior_valid(int behavior) case MADV_HUGEPAGE: case MADV_NOHUGEPAGE: case MADV_COLLAPSE: +#endif +#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING + case MADV_SPLIT: #endif case MADV_DONTDUMP: case MADV_DODUMP: From patchwork Thu Jan 5 10:18:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089645 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 60294C54E76 for ; Thu, 5 Jan 2023 10:19:11 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id F0F168E0001; Thu, 5 Jan 2023 05:19:10 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id EBFF4940008; Thu, 5 Jan 2023 05:19:10 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D3A9F8E0005; Thu, 5 Jan 2023 05:19:10 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id C467D8E0001 for ; Thu, 5 Jan 2023 05:19:10 -0500 (EST) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 788F91C64F9 for ; Thu, 5 Jan 2023 10:19:10 +0000 (UTC) X-FDA: 80320347660.10.E7039C7 Received: from mail-vk1-f202.google.com (mail-vk1-f202.google.com [209.85.221.202]) by imf19.hostedemail.com (Postfix) with ESMTP id E2F591A0010 for ; Thu, 5 Jan 2023 10:19:08 +0000 (UTC) Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=kKVOVndA; spf=pass (imf19.hostedemail.com: domain of 3HKS2YwoKCGMKUIPVHIUPOHPPHMF.DPNMJOVY-NNLWBDL.PSH@flex--jthoughton.bounces.google.com designates 209.85.221.202 as permitted sender) smtp.mailfrom=3HKS2YwoKCGMKUIPVHIUPOHPPHMF.DPNMJOVY-NNLWBDL.PSH@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672913948; h=from:from:sender:reply-to:subject:subject:date:date: 
Date: Thu, 5 Jan 2023 10:18:08 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-11-jthoughton@google.com>
Subject: [PATCH 10/46] hugetlb: make huge_pte_lockptr take an explicit shift argument
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: E2F591A0010 X-Stat-Signature: ykzka14hw5x3jbur53c9fm71ofwktnsu X-Rspam-User: X-HE-Tag: 1672913948-70558 X-HE-Meta: U2FsdGVkX1+0wctXVB0bTeO168WjzpYMDuxaWrvgd0S+ONVyPhSIe8nUUU9OmA7Dt84PvmErPu1iSpcZ/VKEQuRqZT1dww9wxs1PPwTJYBzOoO78wezvfwquWGewUHMieUnPB54M4OLcrVASGtFQhNjfGjBdapIECyZcvnjtK+hjBDImiTAYZvqpnfIkolZfHy1ngZ15pe8Zz7N9FeP9t/6KxLpi73gGfLHnTF0t/V74jcuZrI9FzyHjjLwWBQeledZq42FBDZxL2LNbO1aU4I/pqAhE/BR74nFv6NUEgjGSpCZ+BPXMLge6/h3OqZtS6DelMMUbvUfDM3Tci56408jf354/GDZ5J/W0abvB3mr7dJp7CuQIDar3J0EynyvTTuVU5xfMZ1pUKdawXh/JGFhWZifvZ1KDEbBW3ien9RQ9UFFrJ/Hle5CYgNCTVpireGeazyWHM09rXwiHf1wtJdRqgV5SN6sf5z4SDGXtjPaG03+bPLDoBPwQFfWQCqSrmHjUrqKsWryS18QXjpsbmGocgCoeXRaIGMsiUp5Go8lnFE3G9e4VCywzJ8NDuEUw651KukE3EBAi5mx96vNd0jK4KAusIGcEdWEz8sKq6DvX8q4sQAsWMcDqLAj0u8nVasli+98hwJAM1C43BH4QlRTYbp4lBGmzqsVX/V36VMcu6KocVuWyqDVTd9qaXnvXEmSf9q1zfiX8FSSXvG6rgoq3F5lQNvzT5durX+aVf/p6XSbLiyaq7Jovq8FPfDp0E4jr+9ZFrsZG0Sfdt74vtRYhOzrX1RafQalWD/EajoNcsLgK0iaEqqyANsVnQ+i4yww/9IELu9llQzIfX8xwWObYEDpTFWdIN1zeFMHIutND1oaVDHSndltol7uM0j0JCFpp4xr3o/uCPQ7MFRzyhzf3wh5aRwNxf1FoTsO2M0DDjoaU9SIHESIXSTWOLzMx+Smib0EJxTh3tU8hbiu ayB34rnU 11TVpvv0fodgC+8AtiMNMSumpS+Hep0rG8IHpgW0hMdKlENLSROvzw3GBObSaRfYMHUCPAB8HaDvTEyHmNhfkUlIQEYY1n1/mv60oLRbbOfhITgEuAMyk5GCHl9AeyiZ6lInG8FYcg42Djyh3NuBaPDhA0QtXtnTcv6M7iRYhiidgjlqZaK0edteOS2O5guWtpsclSIgkGlRi5TzMCErNvA0zUONZ41TIuPu3FjY2IONBp7BLKI2/vzxDc8bfwoMoXm4pUMxuKSL6zQEUt51g7hiIkBsc6WVYN07nQcbnJMb8uR/hbiyiVYz3+kwwhhdLpByBcBS4fPWATku6VLnReboDkna2oFjZO36gqMFeg7GaSjSxS/bf9N2CG34OpS5Q37ghhyuiNsXT+O12IvByJR4XwpeCXI44bXJAHMvYXjr4HZvb8333JC0TqupI/8hx+7qV X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This is needed to handle PTL locking with high-granularity mapping. We won't always be using the PMD-level PTL even if we're using the 2M hugepage hstate. It's possible that we're dealing with 4K PTEs, in which case, we need to lock the PTL for the 4K PTE. 
Reviewed-by: Mina Almasry Acked-by: Mike Kravetz Signed-off-by: James Houghton --- arch/powerpc/mm/pgtable.c | 3 ++- include/linux/hugetlb.h | 9 ++++----- mm/hugetlb.c | 7 ++++--- mm/migrate.c | 3 ++- 4 files changed, 12 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index cb2dcdb18f8e..035a0df47af0 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -261,7 +261,8 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma, psize = hstate_get_psize(h); #ifdef CONFIG_DEBUG_VM - assert_spin_locked(huge_pte_lockptr(h, vma->vm_mm, ptep)); + assert_spin_locked(huge_pte_lockptr(huge_page_shift(h), + vma->vm_mm, ptep)); #endif #else diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 16fc3e381801..3f098363cd6e 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -956,12 +956,11 @@ static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask) return modified_mask; } -static inline spinlock_t *huge_pte_lockptr(struct hstate *h, +static inline spinlock_t *huge_pte_lockptr(unsigned int shift, struct mm_struct *mm, pte_t *pte) { - if (huge_page_size(h) == PMD_SIZE) + if (shift == PMD_SHIFT) return pmd_lockptr(mm, (pmd_t *) pte); - VM_BUG_ON(huge_page_size(h) == PAGE_SIZE); return &mm->page_table_lock; } @@ -1171,7 +1170,7 @@ static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask) return 0; } -static inline spinlock_t *huge_pte_lockptr(struct hstate *h, +static inline spinlock_t *huge_pte_lockptr(unsigned int shift, struct mm_struct *mm, pte_t *pte) { return &mm->page_table_lock; @@ -1228,7 +1227,7 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h, { spinlock_t *ptl; - ptl = huge_pte_lockptr(h, mm, pte); + ptl = huge_pte_lockptr(huge_page_shift(h), mm, pte); spin_lock(ptl); return ptl; } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 5bd53ae8ca4b..4db38dc79d0e 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4987,7 +4987,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, } dst_ptl = huge_pte_lock(h, dst, dst_pte); - src_ptl = huge_pte_lockptr(h, src, src_pte); + src_ptl = huge_pte_lockptr(huge_page_shift(h), src, src_pte); spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING); entry = huge_ptep_get(src_pte); again: @@ -5068,7 +5068,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, /* Install the new huge page if src pte stable */ dst_ptl = huge_pte_lock(h, dst, dst_pte); - src_ptl = huge_pte_lockptr(h, src, src_pte); + src_ptl = huge_pte_lockptr(huge_page_shift(h), + src, src_pte); spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING); entry = huge_ptep_get(src_pte); if (!pte_same(src_pte_old, entry)) { @@ -5122,7 +5123,7 @@ static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr, pte_t pte; dst_ptl = huge_pte_lock(h, mm, dst_pte); - src_ptl = huge_pte_lockptr(h, mm, src_pte); + src_ptl = huge_pte_lockptr(huge_page_shift(h), mm, src_pte); /* * We don't have to worry about the ordering of src and dst ptlocks diff --git a/mm/migrate.c b/mm/migrate.c index b5032c3e940a..832f639fc49a 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -360,7 +360,8 @@ void __migration_entry_wait_huge(struct vm_area_struct *vma, void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte) { - spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma), vma->vm_mm, pte); + spinlock_t *ptl = huge_pte_lockptr(huge_page_shift(hstate_vma(vma)), + vma->vm_mm, pte); __migration_entry_wait_huge(vma, pte, ptl); } From 
patchwork Thu Jan 5 10:18:09 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13089646
Date: Thu, 5 Jan 2023 10:18:09 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-12-jthoughton@google.com>
Subject: [PATCH 11/46] hugetlb: add hugetlb_pte to track HugeTLB page table entries
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr .
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspamd-Queue-Id: 41CCA1C0014 X-Rspamd-Server: rspam09 X-Rspam-User: X-Stat-Signature: qrgdk736cg4pin4yh9z65gzxhzsowmt6 X-HE-Tag: 1672913950-588659 X-HE-Meta: U2FsdGVkX1+22A9UQDJ2t92ttjG/Ui2yOowcOkJSE8ddUdaIjZgrONc1ncAiQUjQPfz9EyiKe6j+qijotuFmxJjd9uqk9NwsAVqdgrDdvztWKffSWG+ZJwZleAUl3leTlP4XBnFkZeCBkZ3jcbn7hX69TIVV46odS6MJsLFBtmdN6aIVW+oPYWhj+XBhTlbfUoWPtVOcdD3SVt8d9iRvp5rXAKBTZ1kdFnm62AJC0s8bbOQ4TMn0uRbGrzObVhU+tbb+RCefD+s7UsPS3v1vrf36Y499fqmFWlh8wAOnuLWodAcfcZDoegcY7GRwi6bfB8K5e2qVJPtROrnDBnp2RhoU5L8QSpcqU6jRYptw8L4xaCt3uyNUxArJFVSfMrQwS0SwHY/Lz1bIfTit3+8jgiNOjS6V1K/nfm8oiy/5GwDDMeCNiqg1qKnGWAxSVP4VnvLp2tNnyU/rMT7rmNGjMZYvtwdK3rXp5T3rzp/jQTI77PC0BRnTROOo6VACo6u4Sby0x26WGg1M3+GyPeS1wqrHzoA8OQJE+fMlKxIefy50CmB1+Leq9R4X7NQH3SaM12N3Bw3dzHQhNgY0ibHCAoaVMC0HsEfDZAP18Mpy6FbTU3N5fYLNwpMIV0U5TLDjC7oD9WKag+qmcDDbEEehRUSSZuvtC/1R1DorJBLIi6rgJ1SS1SdM8zON2x7KIRzIaKGejrbSoJrJny0fXyrsIhuJWRQdeabP+0IFgLokxte80tAM1DMitMR2gEexKI6SZ0Enip2hW4bAbXVC6Yx5pgceBClee5VW7TLMJNw2DcRbVX8hcjjvaXJVhHZGZEj0gRr/A9R5uWEdgpCVdfaIKuKLUqezSSoVY0GReS2rNrw7P4iKUq3w/sJtCag1iLR3WFcQ5L+oZsXAmUw/GQDwqwroT8GBD3hEhWfklnb0ZctF45nqE/vffPt0GDLx8ebIFA2P6+CyU1/FX9H4FDC bsR/sgiF YwGTrQvVL12R++vmNxUh+PFkUX0RSOCGh6TAgKsKWqJt5wgR5zpMAQaobkvUwXZYg5Uorb+sbCNvUY9syDgU7LCgJly8X6qQFJt4SsPfTMLbB2pPspZFjSwg8IDYU20BsthLSMGloxbH/eL6r2D+7ttfS3GI8dot23jwNTxA5CutJCiVUHxbwMo9C3kxmhENzKuoa25n1Fw6XuWWj1mphPH/Qhnet1Z+WWATy8Gft0/CaiucGr7Npc8jm55Nms2chPHyX2d+F86vmmnVvnEfBKPlIvj1p8KlO5XJv/3IflizR8dvq/0q5BUOBdhuAV/3e7D3MT1bLjNiB3AU8lTNJ02mlLk+uhHtZKSWJAHzqvd+VP+4NvLfn46wstcj4yO120z2wZtPuNYSZIpbhuHZ7qEdiBYj67CP0R1A0Cv25LAV8aPa3cO9jS4C9/i7J2mHKp3D6 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: After high-granularity mapping, page table entries for HugeTLB pages can be of any size/type. (For example, we can have a 1G page mapped with a mix of PMDs and PTEs.) This struct is to help keep track of a HugeTLB PTE after we have done a page table walk. Without this, we'd have to pass around the "size" of the PTE everywhere. We effectively did this before; it could be fetched from the hstate, which we pass around pretty much everywhere. hugetlb_pte_present_leaf is included here as a helper function that will be used frequently later on. Signed-off-by: James Houghton --- include/linux/hugetlb.h | 72 +++++++++++++++++++++++++++++++++++++++++ mm/hugetlb.c | 29 +++++++++++++++++ 2 files changed, 101 insertions(+) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 3f098363cd6e..bf441d8a1b52 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -38,6 +38,54 @@ typedef struct { unsigned long pd; } hugepd_t; */ #define __NR_USED_SUBPAGE 3 +enum hugetlb_level { + HUGETLB_LEVEL_PTE = 1, + /* + * We always include PMD, PUD, and P4D in this enum definition so that, + * when logged as an integer, we can easily tell which level it is. 
+ */ + HUGETLB_LEVEL_PMD, + HUGETLB_LEVEL_PUD, + HUGETLB_LEVEL_P4D, + HUGETLB_LEVEL_PGD, +}; + +struct hugetlb_pte { + pte_t *ptep; + unsigned int shift; + enum hugetlb_level level; + spinlock_t *ptl; +}; + +static inline +void __hugetlb_pte_populate(struct hugetlb_pte *hpte, pte_t *ptep, + unsigned int shift, enum hugetlb_level level, + spinlock_t *ptl) +{ + /* + * If 'shift' indicates that this PTE is contiguous, then @ptep must + * be the first pte of the contiguous bunch. + */ + hpte->ptl = ptl; + hpte->ptep = ptep; + hpte->shift = shift; + hpte->level = level; +} + +static inline +unsigned long hugetlb_pte_size(const struct hugetlb_pte *hpte) +{ + return 1UL << hpte->shift; +} + +static inline +unsigned long hugetlb_pte_mask(const struct hugetlb_pte *hpte) +{ + return ~(hugetlb_pte_size(hpte) - 1); +} + +bool hugetlb_pte_present_leaf(const struct hugetlb_pte *hpte, pte_t pte); + struct hugepage_subpool { spinlock_t lock; long count; @@ -1232,6 +1280,30 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h, return ptl; } +static inline +spinlock_t *hugetlb_pte_lockptr(struct hugetlb_pte *hpte) +{ + return hpte->ptl; +} + +static inline +spinlock_t *hugetlb_pte_lock(struct hugetlb_pte *hpte) +{ + spinlock_t *ptl = hugetlb_pte_lockptr(hpte); + + spin_lock(ptl); + return ptl; +} + +static inline +void hugetlb_pte_populate(struct mm_struct *mm, struct hugetlb_pte *hpte, + pte_t *ptep, unsigned int shift, + enum hugetlb_level level) +{ + __hugetlb_pte_populate(hpte, ptep, shift, level, + huge_pte_lockptr(shift, mm, ptep)); +} + #if defined(CONFIG_HUGETLB_PAGE) && defined(CONFIG_CMA) extern void __init hugetlb_cma_reserve(int order); #else diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 4db38dc79d0e..2d83a2c359a2 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1266,6 +1266,35 @@ static bool vma_has_reserves(struct vm_area_struct *vma, long chg) return false; } +bool hugetlb_pte_present_leaf(const struct hugetlb_pte *hpte, pte_t pte) +{ + pgd_t pgd; + p4d_t p4d; + pud_t pud; + pmd_t pmd; + + switch (hpte->level) { + case HUGETLB_LEVEL_PGD: + pgd = __pgd(pte_val(pte)); + return pgd_present(pgd) && pgd_leaf(pgd); + case HUGETLB_LEVEL_P4D: + p4d = __p4d(pte_val(pte)); + return p4d_present(p4d) && p4d_leaf(p4d); + case HUGETLB_LEVEL_PUD: + pud = __pud(pte_val(pte)); + return pud_present(pud) && pud_leaf(pud); + case HUGETLB_LEVEL_PMD: + pmd = __pmd(pte_val(pte)); + return pmd_present(pmd) && pmd_leaf(pmd); + case HUGETLB_LEVEL_PTE: + return pte_present(pte); + default: + WARN_ON_ONCE(1); + return false; + } +} + + static void enqueue_hugetlb_folio(struct hstate *h, struct folio *folio) { int nid = folio_nid(folio); From patchwork Thu Jan 5 10:18:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089647 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6BAA4C3DA7A for ; Thu, 5 Jan 2023 10:19:14 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 090CF94000A; Thu, 5 Jan 2023 05:19:14 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 04BF4940008; Thu, 5 Jan 2023 05:19:13 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DAEE294000A; Thu, 5 Jan 2023 05:19:13 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from 
vyStH+azhL6h6cF7W0vsvSqA1sS0u4aeaBuRCOhct9aVSl1WZ+ZOfP5noSdAVTBNh9vw daGi1PMwNOweO57v6y4PMY/cS6nWrBVouaeg/49fkDv2RcKaKad73ZuY0gfsOiUeTb1Z orwQ== X-Gm-Message-State: AFqh2kpl3PmzrfuttkBeeX82tv4DYmVWA4ikLFnvakddAjey1QocxRzB 74jrNOJQpVGcQsw2j+3nYeehwi7043TOfwhd X-Google-Smtp-Source: AMrXdXvYDZWYkFpM6Bavubt5Vuec9lD7DHb7hBD7HW0U1DxPG0tR0i3mLWWTosVSiS0INVOOwY+3Lu37HfMNvQD7 X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a25:c054:0:b0:725:2e78:3fd3 with SMTP id c81-20020a25c054000000b007252e783fd3mr3300724ybf.41.1672913951341; Thu, 05 Jan 2023 02:19:11 -0800 (PST) Date: Thu, 5 Jan 2023 10:18:10 +0000 In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com> Mime-Version: 1.0 References: <20230105101844.1893104-1-jthoughton@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230105101844.1893104-13-jthoughton@google.com> Subject: [PATCH 12/46] hugetlb: add hugetlb_alloc_pmd and hugetlb_alloc_pte From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 1423440008 X-Stat-Signature: hk7kccy1m1jyt93k1wd9jns4omhsdmd7 X-HE-Tag: 1672913951-910758 X-HE-Meta: U2FsdGVkX1/gWcBa64L8QYKUTBh+U6TFHOj17NOrpf/BRACEH8DkVXdiKHh5nj7s5VnBVh8uzI7+BMlSMvD7fL9gKQyy/8Fp/GYkXhqW0rd9Xb3m/I2gPbAFHebnUZEPkucU0Y/iszKJpiuLgWnxPnXfUNYRGoHqV7bAZSDBEt+WoKtEX4w8MIvS+FVa4nJXIoRcDyAXowXtA5vwPT4W90eKUV0oAswQotsVuQY71f2daRUVICZXZkTOWND4u62QMhj0dBERgAkrL4NAgLG9ChrMU+yn84OVdJ+WKhfTuhP3bEtAHkMFTuIVULIQah7bwJHjj+dtomet1GJxQg+q4qS+a9OcKAmwQVodf7PPfHTxmPvTTmXuEz7AQZeoWeQsgYrAnDVNNMQAvQU9856b70ahCLOhULFRy8qLlYeXUrqWlkYxwnhEEfXuMmqrKsC0DuL8u98m55+7RICwg9CzykRhLwskjPj1Gfz5Tld4YthFnnD+I5riV1H4K2hXqJVydTU0MFUdrH7S61KZcQwnXkTVAo1Qn8YL16znzluahkK/1BDTmiBWW0h1OeIPbwe2fyVueGzEipbhcDWU3/nKut+SjTQmm4TyO0SldOzw/0WvncfZOp2qknDL/sdsAfo0/cY5axxEakfUtq+xOu3HzZpGqhY2rusrj4Y9vnL4VrCHjHNm2kE4MWVekN95wpzBglAFQDMgZkfdrweC4rloPQoziKSmC6gElFGRcRKzfESNj0pDWvc7Isb9WWYH9qgwEwdiixX4KPFQTXybXpWY4vht3qQPaCBCOmuudpv/R9RxRmKRnXP/B3l1XES/F4wTKvSUxWgWl+ywMOHOIp/r4tI7O/x9+mnE5JnB3FoHIHQCNI4+YPDt5v3D6GMO1+EBpgTpXtn5ENGM19PXLrkfY8xWnFIlL4yp4ruRHFnvxAmSjIAvhuQrsP5BscIKMp0qPoJXStUNn5xJOnbQgTB p4IPSrcP Z6aOAoUU1rZ3X8meqVI8twnDRoJpW+gkKUkFM+9xRfj3wGql+Kx3P6PMx3zhiOHavSvoYtdUezwiv7umb0xZB+AVvETa8G+Xms2SY0tCty/BJ7MhbYkDf9sCn5e2L6lZu7jfK1eRixIrbj8kbuDGJpsrOmsZwCSiygITntW44AXtDxCrVIo+LmendXi5lNQuISidDn1fCeedl65tD6AIXdpE+N+obWpvN9hv6FFndiFVtjQ4pLokBo/egUD2FG0mISh0fillcZvgfd0NdnXM1SvSRd40jUG3LygAKAIeK7WTVrnExmjyEfOWIF52xKyJLTuaGwqh0FQrAlLAkq7/1lljj1/GYi0P5y5sPdV/dc2WKxDX9NQASUC6T7ZbfZzWr8cicrjSKr67Ogg6hAN7WZp0Kzmy5p0W7lIR3PAmUkdXtmuMB1QwjeegGI8CUb2nl4e1P X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: These functions are used to allocate new PTEs below the hstate PTE. This will be used by hugetlb_walk_step, which implements stepping forwards in a HugeTLB high-granularity page table walk. 
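To illustrate the intended calling convention (a sketch only: example_step_to_pmd is a made-up name, and the real consumer is hugetlb_walk_step, added later in this series), a walker would consume the ERR_PTR-style return value and then repoint its hugetlb_pte at the new level:

static int example_step_to_pmd(struct mm_struct *mm, struct hugetlb_pte *hpte,
			       unsigned long addr)
{
	pmd_t *pmdp = hugetlb_alloc_pmd(mm, hpte, addr);

	if (IS_ERR(pmdp))
		return PTR_ERR(pmdp);	/* -EINVAL, -EEXIST, or -ENOMEM */

	/* Continue the walk one level down, at the PMD we just found/made. */
	hugetlb_pte_populate(mm, hpte, (pte_t *)pmdp, PMD_SHIFT,
			     HUGETLB_LEVEL_PMD);
	return 0;
}
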
The reasons that we don't use the standard pmd_alloc/pte_alloc* functions are: 1) This prevents us from accidentally overwriting swap entries or attempting to use swap entries as present non-leaf PTEs (see pmd_alloc(); we assume that !pte_none means pte_present and non-leaf). 2) Locking hugetlb PTEs can be different from regular PTEs. (Although, as implemented right now, locking is the same.) 3) We can maintain compatibility with CONFIG_HIGHPTE. That is, HugeTLB HGM won't use HIGHPTE, but the kernel can still be built with it, and other mm code will use it. When GENERAL_HUGETLB supports P4D-based hugepages, we will need to implement hugetlb_pud_alloc to implement hugetlb_walk_step. Signed-off-by: James Houghton --- include/linux/hugetlb.h | 5 ++ mm/hugetlb.c | 114 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 119 insertions(+) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index bf441d8a1b52..ad9d19f0d1b9 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -86,6 +86,11 @@ unsigned long hugetlb_pte_mask(const struct hugetlb_pte *hpte) bool hugetlb_pte_present_leaf(const struct hugetlb_pte *hpte, pte_t pte); +pmd_t *hugetlb_alloc_pmd(struct mm_struct *mm, struct hugetlb_pte *hpte, + unsigned long addr); +pte_t *hugetlb_alloc_pte(struct mm_struct *mm, struct hugetlb_pte *hpte, + unsigned long addr); + struct hugepage_subpool { spinlock_t lock; long count; diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 2d83a2c359a2..2160cbaf3311 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -480,6 +480,120 @@ static bool has_same_uncharge_info(struct file_region *rg, #endif } +/* + * hugetlb_alloc_pmd -- Allocate or find a PMD beneath a PUD-level hpte. + * + * This is meant to be used to implement hugetlb_walk_step when one must + * step down to a PMD. Different architectures may implement hugetlb_walk_step + * differently, but hugetlb_alloc_pmd and hugetlb_alloc_pte are architecture- + * independent. + * + * Returns: + * On success: the pointer to the PMD. This should be placed into a + * hugetlb_pte. @hpte is not changed. + * ERR_PTR(-EINVAL): hpte is not PUD-level + * ERR_PTR(-EEXIST): there is a non-leaf and non-empty PUD in @hpte + * ERR_PTR(-ENOMEM): could not allocate the new PMD + */ +pmd_t *hugetlb_alloc_pmd(struct mm_struct *mm, struct hugetlb_pte *hpte, + unsigned long addr) +{ + spinlock_t *ptl = hugetlb_pte_lockptr(hpte); + pmd_t *new; + pud_t *pudp; + pud_t pud; + + if (hpte->level != HUGETLB_LEVEL_PUD) + return ERR_PTR(-EINVAL); + + pudp = (pud_t *)hpte->ptep; +retry: + pud = READ_ONCE(*pudp); + if (likely(pud_present(pud))) + return unlikely(pud_leaf(pud)) + ? ERR_PTR(-EEXIST) + : pmd_offset(pudp, addr); + else if (!pud_none(pud)) + /* + * Not present and not none means that a swap entry lives here, + * and we can't get rid of it. + */ + return ERR_PTR(-EEXIST); + + new = pmd_alloc_one(mm, addr); + if (!new) + return ERR_PTR(-ENOMEM); + + spin_lock(ptl); + if (!pud_same(pud, *pudp)) { + spin_unlock(ptl); + pmd_free(mm, new); + goto retry; + } + + mm_inc_nr_pmds(mm); + smp_wmb(); /* See comment in pmd_install() */ + pud_populate(mm, pudp, new); + spin_unlock(ptl); + return pmd_offset(pudp, addr); +} + +/* + * hugetlb_alloc_pte -- Allocate a PTE beneath a pmd_none PMD-level hpte. + * + * See the comment above hugetlb_alloc_pmd. 
+ */ +pte_t *hugetlb_alloc_pte(struct mm_struct *mm, struct hugetlb_pte *hpte, + unsigned long addr) +{ + spinlock_t *ptl = hugetlb_pte_lockptr(hpte); + pgtable_t new; + pmd_t *pmdp; + pmd_t pmd; + + if (hpte->level != HUGETLB_LEVEL_PMD) + return ERR_PTR(-EINVAL); + + pmdp = (pmd_t *)hpte->ptep; +retry: + pmd = READ_ONCE(*pmdp); + if (likely(pmd_present(pmd))) + return unlikely(pmd_leaf(pmd)) + ? ERR_PTR(-EEXIST) + : pte_offset_kernel(pmdp, addr); + else if (!pmd_none(pmd)) + /* + * Not present and not none means that a swap entry lives here, + * and we can't get rid of it. + */ + return ERR_PTR(-EEXIST); + + /* + * With CONFIG_HIGHPTE, calling `pte_alloc_one` directly may result + * in page tables being allocated in high memory, needing a kmap to + * access. Instead, we call __pte_alloc_one directly with + * GFP_PGTABLE_USER to prevent these PTEs being allocated in high + * memory. + */ + new = __pte_alloc_one(mm, GFP_PGTABLE_USER); + if (!new) + return ERR_PTR(-ENOMEM); + + spin_lock(ptl); + if (!pmd_same(pmd, *pmdp)) { + spin_unlock(ptl); + pgtable_pte_page_dtor(new); + __free_page(new); + goto retry; + } + + mm_inc_nr_ptes(mm); + smp_wmb(); /* See comment in pmd_install() */ + pmd_populate(mm, pmdp, new); + spin_unlock(ptl); + return pte_offset_kernel(pmdp, addr); +} + static void coalesce_file_region(struct resv_map *resv, struct file_region *rg) { struct file_region *nrg, *prg; From patchwork Thu Jan 5 10:18:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089648 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4061AC3DA7D for ; Thu, 5 Jan 2023 10:19:16 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C54FD94000B; Thu, 5 Jan 2023 05:19:15 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BDC06940008; Thu, 5 Jan 2023 05:19:15 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A532994000B; Thu, 5 Jan 2023 05:19:15 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 9644E940008 for ; Thu, 5 Jan 2023 05:19:15 -0500 (EST) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 7186E1A0D50 for ; Thu, 5 Jan 2023 10:19:15 +0000 (UTC) X-FDA: 80320347870.14.589C9A1 Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) by imf04.hostedemail.com (Postfix) with ESMTP id D03A74000A for ; Thu, 5 Jan 2023 10:19:13 +0000 (UTC) Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=FAFveYt9; spf=pass (imf04.hostedemail.com: domain of 3IaS2YwoKCGgPZNUaMNZUTMUUMRK.IUSROTad-SSQbGIQ.UXM@flex--jthoughton.bounces.google.com designates 209.85.219.201 as permitted sender) smtp.mailfrom=3IaS2YwoKCGgPZNUaMNZUTMUUMRK.IUSROTad-SSQbGIQ.UXM@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1672913953; a=rsa-sha256; cv=none; b=i2c9BTfB6kOQNUZ/G6wYwDWU/JIggpnSr2b21q2ZBGNf9TebuYld+KEHLehR781ONw3DGT LAT7qywyK9uB7vrw1w+8WNbX1EPEtPCEcJhPVfzebtDeV1fExJAN3z+mXnyHyT4tN6ayKt 
fkh9gM9juMZJulp/YSnX3boC9VLPxXk= ARC-Authentication-Results: i=1; imf04.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=FAFveYt9; spf=pass (imf04.hostedemail.com: domain of 3IaS2YwoKCGgPZNUaMNZUTMUUMRK.IUSROTad-SSQbGIQ.UXM@flex--jthoughton.bounces.google.com designates 209.85.219.201 as permitted sender) smtp.mailfrom=3IaS2YwoKCGgPZNUaMNZUTMUUMRK.IUSROTad-SSQbGIQ.UXM@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672913953; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=2Vt58mLGL5M8YpW+S8eBdfZehWuEGnsVXCCJx2houLg=; b=WQuPbXgkGu8SlG31MAz79SDJv14eEv0loeQnFJi3GHCcvk5RTDAc6oc270tRKVg0jiTprD gmGuAlhe/OTRnVXYaNtmGAYWOewKjggAlWGW+6Slj+ShYQOpSGrP2VMvUiQq/DAqCQB4HL cM9/bkMm/U9qjh5TOWyQHHvDfI2tQOU= Received: by mail-yb1-f201.google.com with SMTP id a4-20020a5b0004000000b006fdc6aaec4fso36882665ybp.20 for ; Thu, 05 Jan 2023 02:19:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=2Vt58mLGL5M8YpW+S8eBdfZehWuEGnsVXCCJx2houLg=; b=FAFveYt9nBF0eFfaA1d0Tv7HE5xVwqFCRHNy2MmHG8kjRiH1IJ9npxQOlM0OPPr9x0 SOVsSNTwUscRwlguA9nGCrhEAEgibLkiSxGoSBRrXyiTEZUYuMRPxRxyfuuinve75ZiS b63qtgYDL7WfBK69glJ4RMOKf3QPMUJGqEe/PQ3lN23K5qmqgGr6+syabeOfq80mm59A aqLVtsA1nxHBiLCtySePm+DxyeSOS0NdLf5oXMQa6tQk1Fla5YNZga5tHOyH+tx+EP0G l5qkwogI70U832FjytD4NOveDXyTFLX+2z3zmQ8We0Ne3ob7c9NV3FWN7gNIpfjzemkn +pgg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=2Vt58mLGL5M8YpW+S8eBdfZehWuEGnsVXCCJx2houLg=; b=WmWEs/E02uidqTlhXYB8+agdKYzSgyF63jkAXiwcSxOMSC6l4zBw9tCqSIaGIQ2ufn UftvRgW8tP+BHOZ437SRcVwMiQSrNO89z6flI7blfFziipnVIpysZ5g8n6M/C4CCgkfY MCdNJXXAW4LvvQU9P0nnEjkaJLSkxCzAGjnz/vYhS23lBQlj8TNfjXb1LpTBT3Ze7sfz 9nrdKQOaVmCJRXJqXb75zoypSSjwzu+m9I8S0zQ4xKHRHmLDLdlUAe3I+0m0Onvw5oHD xplFq4x7hKWoGy/ad7cXWDdlMyeO+HXTXTQhBQQQIlL3KagWK/1e9YgEHrkAZISJ9ZU2 NHFw== X-Gm-Message-State: AFqh2krbHxMRSsvEbkdE2LJX4m5Mpzzb2XlXso4xEhbLX/F3HTcJX2k1 tCkRhuATNtbKz3ezgqPhbTlTy6yopF6gZjtJ X-Google-Smtp-Source: AMrXdXsG407xQfqrIOR0gOSXoyOo4VMzkb16AaBSb+fslYRZShBcxhwqLEDLjbewemEoWrf7Hfx25vDRkIb8v9+b X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a25:850e:0:b0:6f8:42d8:2507 with SMTP id w14-20020a25850e000000b006f842d82507mr6177141ybk.110.1672913953310; Thu, 05 Jan 2023 02:19:13 -0800 (PST) Date: Thu, 5 Jan 2023 10:18:11 +0000 In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com> Mime-Version: 1.0 References: <20230105101844.1893104-1-jthoughton@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230105101844.1893104-14-jthoughton@google.com> Subject: [PATCH 13/46] hugetlb: add hugetlb_hgm_walk and hugetlb_walk_step From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . 
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: D03A74000A X-Stat-Signature: knbrsqc5dj648zgfk199s5xks51ufadh X-HE-Tag: 1672913953-31932 X-HE-Meta: U2FsdGVkX1/Sz3hDbx7IZBUhnYOJKeyoGzGgugo/m3Lm37FliL5kYOBDBsP5rc/lMkCSraWLjU6hHP/Qnm9YAbXitKoF3UFH5+cu9geD9fswbVzhXFDaPsGR3HsnWRnu0KYbMwOEbaNfGCoqfWdw2L9XLvjSwuJ6X4k2HtVAiqJViMAdGdk86Is1Ip52DhvU0U+vi97VFz2Gqq2gXtuyfyFEHeulxz3BCgzoWwe19vgcFC7YMsSqPETzRqHa09bYYf1XIzsU8jgcuQj+eqb6XqMBrGqJltjUUQEQyfwy23n4WCswXQ78K1qSgbZeDI9c6c+6/5Z0PBWBdLVNXzLqNw68s6TffbY4vcyNUp/WvQV+3Rs7YE3K/gHB6Nr4xfjk0tYn7S9GqoDVXL4RLwrs4PnyhnXxAQpUSGSGFUH0ESu0KtM99v5eyInOmyVcVsQMwzu7HAfKdtl0yXQ8U6TA3CwQidGhVslj2RZWd8yg9yZt4LWWU3Dz1N16LUZpu4oa2tUUrATOcSy2GlIklYH4tvCzdCGSSN7ak/8Ri8DW9AcPoviFpHuLlZPEoiz09P6lBJXwzs48n3KtCxaj8K3cFGJC5qAB9TmYKA8izIXCPMz1+MILyMXDUajqvlrn95lObiVHTz9WC13hxtI5bSdYIX9hymxm70X4EDZccN8dXrYCDnSKtlwwNnDit+bJbTrSPviOQZj1Dm2lWrVE+SQWIWwOtpfakWZanEAGf8ZzBu30QirctWUMjkSDVc3vt2ukyw6rDuntra4xg8nvVMKJGsA3UszTtW9YquMS1IeCAS4JYP/xJvSSo1zCobbbgkjvq636jIKNSlDPQPqk9fOi1+wYmNFUkCq3ntIO1z6MypWtL1Skcp4ZpvkNzBwJjB4ytfLRWnE36DbXlMQhS6WLAZzQUu85aKoJZ1xvsl5qqpqiBnXsv3qKwwJgS42AQtnYioIMYx4sFxvGzcK5GHC UKeo4rYH 7NN2b1TS69xU6eK6ecxtE8Mw0dnYWHlxqQLNi1b0Dz5PPAVkH+6tVKz0raTxCy4x58oINumDZ1y2Fl4UYkAGrymde9Qo239q+uvXnZqeaoDjchh3F3dwla75vs0+QVY1Gpf/B7km+LJ5n13UegnHj6m6gtSRUeODSn6qiPPlgfW4r26iNANweigAF6QbMvUSI0Jp9T5RUZ/l/SYN+5Uw5HYbHkH+j9ED+d+Jv7+L/VNSvk6zgAeE7gi9U8HKympid68yFA5LIAmI4MpbNRr889zTu9dIn1AgA8JEj08jP3Xv4b9BLIyDZQZ33qfudsaVwFfl56uEcECfZ61SZcuFxJMJjOyrxCQxy2I24FdqZKz79dpVLkm1wO+zlJTeCGOqEwXJIJZOC8PBuw9OyR3MEdZ/hbYHhBBN++/fIMJgROxLcZf2P7UvglyIuFODf2HH7RAYE X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: hugetlb_hgm_walk implements high-granularity page table walks for HugeTLB. It is safe to call on non-HGM enabled VMAs; it will return immediately. hugetlb_walk_step implements how we step forwards in the walk. For architectures that don't use GENERAL_HUGETLB, they will need to provide their own implementation. 
Signed-off-by: James Houghton --- include/linux/hugetlb.h | 35 +++++-- mm/hugetlb.c | 213 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 242 insertions(+), 6 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index ad9d19f0d1b9..2fcd8f313628 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -239,6 +239,14 @@ u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx); pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, pud_t *pud); +int hugetlb_full_walk(struct hugetlb_pte *hpte, struct vm_area_struct *vma, + unsigned long addr); +void hugetlb_full_walk_continue(struct hugetlb_pte *hpte, + struct vm_area_struct *vma, unsigned long addr); +int hugetlb_full_walk_alloc(struct hugetlb_pte *hpte, + struct vm_area_struct *vma, unsigned long addr, + unsigned long target_sz); + struct address_space *hugetlb_page_mapping_lock_write(struct page *hpage); extern int sysctl_hugetlb_shm_group; @@ -288,6 +296,8 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, unsigned long sz); unsigned long hugetlb_mask_last_page(struct hstate *h); +int hugetlb_walk_step(struct mm_struct *mm, struct hugetlb_pte *hpte, + unsigned long addr, unsigned long sz); int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, pte_t *ptep); void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma, @@ -1067,6 +1077,8 @@ void hugetlb_register_node(struct node *node); void hugetlb_unregister_node(struct node *node); #endif +enum hugetlb_level hpage_size_to_level(unsigned long sz); + #else /* CONFIG_HUGETLB_PAGE */ struct hstate {}; @@ -1259,6 +1271,11 @@ static inline void hugetlb_register_node(struct node *node) static inline void hugetlb_unregister_node(struct node *node) { } + +static inline enum hugetlb_level hpage_size_to_level(unsigned long sz) +{ + return HUGETLB_LEVEL_PTE; +} #endif /* CONFIG_HUGETLB_PAGE */ #ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING @@ -1333,12 +1350,8 @@ __vma_has_hugetlb_vma_lock(struct vm_area_struct *vma) return (vma->vm_flags & VM_MAYSHARE) && vma->vm_private_data; } -/* - * Safe version of huge_pte_offset() to check the locks. See comments - * above huge_pte_offset(). - */ -static inline pte_t * -hugetlb_walk(struct vm_area_struct *vma, unsigned long addr, unsigned long sz) +static inline void +hugetlb_walk_lock_check(struct vm_area_struct *vma) { #if defined(CONFIG_HUGETLB_PAGE) && \ defined(CONFIG_ARCH_WANT_HUGE_PMD_SHARE) && defined(CONFIG_LOCKDEP) @@ -1360,6 +1373,16 @@ hugetlb_walk(struct vm_area_struct *vma, unsigned long addr, unsigned long sz) !lockdep_is_held( &vma->vm_file->f_mapping->i_mmap_rwsem)); #endif +} + +/* + * Safe version of huge_pte_offset() to check the locks. See comments + * above huge_pte_offset(). 
+ */ +static inline pte_t * +hugetlb_walk(struct vm_area_struct *vma, unsigned long addr, unsigned long sz) +{ + hugetlb_walk_lock_check(vma); return huge_pte_offset(vma->vm_mm, addr, sz); } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 2160cbaf3311..aa8e59cbca69 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -94,6 +94,29 @@ static int hugetlb_acct_memory(struct hstate *h, long delta); static void hugetlb_vma_lock_free(struct vm_area_struct *vma); static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma); +/* + * hpage_size_to_level() - convert @sz to the corresponding page table level + * + * @sz must be less than or equal to a valid hugepage size. + */ +enum hugetlb_level hpage_size_to_level(unsigned long sz) +{ + /* + * We order the conditionals from smallest to largest to pick the + * smallest level when multiple levels have the same size (i.e., + * when levels are folded). + */ + if (sz < PMD_SIZE) + return HUGETLB_LEVEL_PTE; + if (sz < PUD_SIZE) + return HUGETLB_LEVEL_PMD; + if (sz < P4D_SIZE) + return HUGETLB_LEVEL_PUD; + if (sz < PGDIR_SIZE) + return HUGETLB_LEVEL_P4D; + return HUGETLB_LEVEL_PGD; +} + static inline bool subpool_is_free(struct hugepage_subpool *spool) { if (spool->count) @@ -7276,6 +7299,153 @@ bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr) } #endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */ +/* hugetlb_hgm_walk - walks a high-granularity HugeTLB page table to resolve + * the page table entry for @addr. We might allocate new PTEs. + * + * @hpte must always be pointing at an hstate-level PTE or deeper. + * + * This function will never walk further if it encounters a PTE of a size + * less than or equal to @sz. + * + * @alloc determines what we do when we encounter an empty PTE. If false, + * we stop walking. If true and @sz is less than the current PTE's size, + * we make that PTE point to the next level down, going until @sz is the same + * as our current PTE. + * + * If @alloc is false and @sz is PAGE_SIZE, this function will always + * succeed, but that does not guarantee that hugetlb_pte_size(hpte) is @sz. + * + * Return: + * -ENOMEM if we couldn't allocate new PTEs. + * -EEXIST if the caller wanted to walk further than a migration PTE, + * poison PTE, or a PTE marker. The caller needs to manually deal + * with this scenario. + * -EINVAL if called with invalid arguments (@sz invalid, @hpte not + * initialized). + * 0 otherwise. + * + * Even if this function fails, @hpte is guaranteed to always remain + * valid. + */ +static int hugetlb_hgm_walk(struct mm_struct *mm, struct vm_area_struct *vma, + struct hugetlb_pte *hpte, unsigned long addr, + unsigned long sz, bool alloc) +{ + int ret = 0; + pte_t pte; + + if (WARN_ON_ONCE(sz < PAGE_SIZE)) + return -EINVAL; + + if (WARN_ON_ONCE(!hpte->ptep)) + return -EINVAL; + + /* We have the same synchronization requirements as hugetlb_walk. 
*/ + hugetlb_walk_lock_check(vma); + + while (hugetlb_pte_size(hpte) > sz && !ret) { + pte = huge_ptep_get(hpte->ptep); + if (!pte_present(pte)) { + if (!alloc) + return 0; + if (unlikely(!huge_pte_none(pte))) + return -EEXIST; + } else if (hugetlb_pte_present_leaf(hpte, pte)) + return 0; + ret = hugetlb_walk_step(mm, hpte, addr, sz); + } + + return ret; +} + +static int hugetlb_hgm_walk_uninit(struct hugetlb_pte *hpte, + pte_t *ptep, + struct vm_area_struct *vma, + unsigned long addr, + unsigned long target_sz, + bool alloc) +{ + struct hstate *h = hstate_vma(vma); + + hugetlb_pte_populate(vma->vm_mm, hpte, ptep, huge_page_shift(h), + hpage_size_to_level(huge_page_size(h))); + return hugetlb_hgm_walk(vma->vm_mm, vma, hpte, addr, target_sz, + alloc); +} + +/* + * hugetlb_full_walk_continue - continue a high-granularity page-table walk. + * + * If a user has a valid @hpte but knows that @hpte is not a leaf, they can + * attempt to continue walking by calling this function. + * + * This function may never fail, but @hpte might not change. + * + * If @hpte is not valid, then this function is a no-op. + */ +void hugetlb_full_walk_continue(struct hugetlb_pte *hpte, + struct vm_area_struct *vma, + unsigned long addr) +{ + /* hugetlb_hgm_walk will never fail with these arguments. */ + WARN_ON_ONCE(hugetlb_hgm_walk(vma->vm_mm, vma, hpte, addr, + PAGE_SIZE, false)); +} + +/* + * hugetlb_full_walk - do a high-granularity page-table walk; never allocate. + * + * This function can only fail if we find that the hstate-level PTE is not + * allocated. Callers can take advantage of this fact to skip address regions + * that cannot be mapped in that case. + * + * If this function succeeds, @hpte is guaranteed to be valid. + */ +int hugetlb_full_walk(struct hugetlb_pte *hpte, + struct vm_area_struct *vma, + unsigned long addr) +{ + struct hstate *h = hstate_vma(vma); + unsigned long sz = huge_page_size(h); + /* + * We must mask the address appropriately so that we pick up the first + * PTE in a contiguous group. + */ + pte_t *ptep = hugetlb_walk(vma, addr & huge_page_mask(h), sz); + + if (!ptep) + return -ENOMEM; + + /* hugetlb_hgm_walk_uninit will never fail with these arguments. */ + WARN_ON_ONCE(hugetlb_hgm_walk_uninit(hpte, ptep, vma, addr, + PAGE_SIZE, false)); + return 0; +} + +/* + * hugetlb_full_walk_alloc - do a high-granularity walk, potentially allocate + * new PTEs. + */ +int hugetlb_full_walk_alloc(struct hugetlb_pte *hpte, + struct vm_area_struct *vma, + unsigned long addr, + unsigned long target_sz) +{ + struct hstate *h = hstate_vma(vma); + unsigned long sz = huge_page_size(h); + /* + * We must mask the address appropriately so that we pick up the first + * PTE in a contiguous group. + */ + pte_t *ptep = huge_pte_alloc(vma->vm_mm, vma, addr & huge_page_mask(h), + sz); + + if (!ptep) + return -ENOMEM; + + return hugetlb_hgm_walk_uninit(hpte, ptep, vma, addr, target_sz, true); +} + #ifdef CONFIG_ARCH_WANT_GENERAL_HUGETLB pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned long sz) @@ -7343,6 +7513,49 @@ pte_t *huge_pte_offset(struct mm_struct *mm, return (pte_t *)pmd; } +/* + * hugetlb_walk_step() - Walk the page table one step to resolve the page + * (hugepage or subpage) entry at address @addr. + * + * @sz always points at the final target PTE size (e.g. PAGE_SIZE for the + * lowest level PTE). + * + * @hpte will always remain valid, even if this function fails. 
+ * + * Architectures that implement this function must ensure that if @hpte does + * not change levels, then its PTL must also stay the same. + */ +int hugetlb_walk_step(struct mm_struct *mm, struct hugetlb_pte *hpte, + unsigned long addr, unsigned long sz) +{ + pte_t *ptep; + spinlock_t *ptl; + + switch (hpte->level) { + case HUGETLB_LEVEL_PUD: + ptep = (pte_t *)hugetlb_alloc_pmd(mm, hpte, addr); + if (IS_ERR(ptep)) + return PTR_ERR(ptep); + hugetlb_pte_populate(mm, hpte, ptep, PMD_SHIFT, + HUGETLB_LEVEL_PMD); + break; + case HUGETLB_LEVEL_PMD: + ptep = hugetlb_alloc_pte(mm, hpte, addr); + if (IS_ERR(ptep)) + return PTR_ERR(ptep); + ptl = pte_lockptr(mm, (pmd_t *)hpte->ptep); + __hugetlb_pte_populate(hpte, ptep, PAGE_SHIFT, + HUGETLB_LEVEL_PTE, ptl); + hpte->ptl = ptl; + break; + default: + WARN_ONCE(1, "%s: got invalid level: %d (shift: %d)\n", + __func__, hpte->level, hpte->shift); + return -EINVAL; + } + return 0; +} + /* * Return a mask that can be used to update an address to the last huge * page in a page table page mapping size. Used to skip non-present From patchwork Thu Jan 5 10:18:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089688 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 60CF5C3DA7A for ; Thu, 5 Jan 2023 10:26:47 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id F0A418E0005; Thu, 5 Jan 2023 05:26:46 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id EBA338E0002; Thu, 5 Jan 2023 05:26:46 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D83C98E0005; Thu, 5 Jan 2023 05:26:46 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id C22AB8E0002 for ; Thu, 5 Jan 2023 05:26:46 -0500 (EST) Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 93838140B83 for ; Thu, 5 Jan 2023 10:26:46 +0000 (UTC) X-FDA: 80320366812.11.DB469F6 Received: from mail-oa1-f73.google.com (mail-oa1-f73.google.com [209.85.160.73]) by imf08.hostedemail.com (Postfix) with ESMTP id 07C3616000B for ; Thu, 5 Jan 2023 10:26:44 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=b1RVVWEc; spf=pass (imf08.hostedemail.com: domain of 3I6S2YwoKCGoRbPWcOPbWVOWWOTM.KWUTQVcf-UUSdIKS.WZO@flex--jthoughton.bounces.google.com designates 209.85.160.73 as permitted sender) smtp.mailfrom=3I6S2YwoKCGoRbPWcOPbWVOWWOTM.KWUTQVcf-UUSdIKS.WZO@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672914405; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=TZkjgoDFSdPqMdf6PVi66Nkhma8WC4z5Ws9uGEsRfvU=; b=LlAn+3/r5iquwLJbbm53jYjikvk5fvUzclbUEtycormekgPhYMhRPkGhmqEd5zqSofPluz rmWnxSj9vfKYeJ41AWyzFayd2Ni5gS5DD3D8LO+80h3Z40mPW5zaOCZl8uOXc+nX8IvJfV +QFIJLHLfYyWZ6M4UiwZHfKvyTyOZao= 
ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=b1RVVWEc; spf=pass (imf08.hostedemail.com: domain of 3I6S2YwoKCGoRbPWcOPbWVOWWOTM.KWUTQVcf-UUSdIKS.WZO@flex--jthoughton.bounces.google.com designates 209.85.160.73 as permitted sender) smtp.mailfrom=3I6S2YwoKCGoRbPWcOPbWVOWWOTM.KWUTQVcf-UUSdIKS.WZO@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1672914405; a=rsa-sha256; cv=none; b=3qQsX5owXiz6S/5PDnAl4QmH5u93o3wbZF6BMDDkHWo9Y4VY4XcE/qPo0ZN/FdARwAHRj7 vTQPMQpdEkAZJROCdnJVktenXSpIYHZ+xJExBiWgUYrL7TOk5iPYP6mQ9BnLY52Nr2J3/f RVkCPm0m4NhKrprzPbUc6z+/I9734AE= Received: by mail-oa1-f73.google.com with SMTP id 586e51a60fabf-14fdcb3381fso11696265fac.0 for ; Thu, 05 Jan 2023 02:26:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=TZkjgoDFSdPqMdf6PVi66Nkhma8WC4z5Ws9uGEsRfvU=; b=b1RVVWEcn7xCiOOjSjy13xNZs7I3m2tPgL+vaICF4A006PbkCzbik5RH+bvlc5t1Qx smy7s9lD0NeMHyXR/guX74nirJ4mhSP/b5OsHYtrukajSYuo8SN51XOlUZveqfcIBxTi pCfvLZWOwBGymc/6OMx4GvgiaEZ8gEeK8Ch/w/fxFKNdFChMKGauZEpNnT4CrnGxewHT oGPT+PC4YdlyU5cZzBSC43pYPeJidbn+PHc6Uj+0smcgsOefwqWpVuhCtBoBfzEfEMFV upTK7E8QZhz98epzC3+PkHQg4sEDxGGdEgPVD9h872OYNbYRLMjyEnUxFS/EyfSg2GYD aEIQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=TZkjgoDFSdPqMdf6PVi66Nkhma8WC4z5Ws9uGEsRfvU=; b=H5JdTyH8yxHNKgfQ1TMl9MxGI3t5SeRf7UZA6EwTj9+rUUbFPccLzR39D1bcaguoui /TgUGwUotrL1GaHuyF1NLk6loUfggKHXDxxVZSE9bcG1xIyY3EVqayxtIJdOWCJkafmk 0XU3BbbzEcbzL6i5Lce0h5rK7DaEjmp/UDsG21i1qESE+YrInnDzXsvoMqcy8zl4Uxid Imi0o/gxb9jmzI3Li8yUBX3G/MWFxWROA0OoRQ5aX6XXKc1ySCx94VIVIhcisFkPbTt2 oGCNKOTYtWvfs5ZEsr32u+xG5irSonDUNFnYJ3Wi2Nw1VqFainUxBgtrXz1Uhp2uF7cI BY2w== X-Gm-Message-State: AFqh2krSZty02+TLbkEKrTeDtpk0/4Q7UXda/k2wI3OLWJNe62uP2e1y 981+/cSOVMF7tSGcXONMeBwsQaaNlIUwWoaW X-Google-Smtp-Source: AMrXdXvdfTkOB9SHQm3CQqLDUm9d4RcxTmzh/cSUAY1Ri2gZNPJRC9o4+adxaYDY9ZV7J44JX0JcbEjpnhe2Rp/q X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a25:f41:0:b0:6f9:bd92:e428 with SMTP id 62-20020a250f41000000b006f9bd92e428mr4815135ybp.28.1672913955309; Thu, 05 Jan 2023 02:19:15 -0800 (PST) Date: Thu, 5 Jan 2023 10:18:12 +0000 In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com> Mime-Version: 1.0 References: <20230105101844.1893104-1-jthoughton@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230105101844.1893104-15-jthoughton@google.com> Subject: [PATCH 14/46] hugetlb: add make_huge_pte_with_shift From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . 
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 07C3616000B X-Stat-Signature: t6sekef3b6ogxxs6mdidheyei5jwidjf X-HE-Tag: 1672914404-342540 X-HE-Meta: U2FsdGVkX1+R+9BNL9qJKKnv9BBmTVUP6N1/GOL/3+gg/TtvVpbou+CzGU9YeZ0zKe1olyG/r0ng27EpdNro1lIQSVwmOXJvF1NBx8vve+yizwuq3DO2llKmRL+hu1IhuKH5hC9Vrt2u9wz1yCtc28rKb3elJbm+QZ9wPp+4/rf3L2y7MwRw8Oz/8lZ2d61bpKifxgLodyvCJEDC6Fk38U2nVC0Y6R9t9NW92y4BiVEY21h4WGq1filS+9E2zkmL1q6TVmoBYR/5X5Cz30gjFInb/l8j+zlvgESj1igEOr9//4QaRtBn7NJRTtJ8IkPE9e74nOctark7jCRbFvMO63cghpvI0s9TGrRB4hSnYK4S05mpwfFP6mhYiVq4Cgb9A9z73VMhBqETH+AogX3H893SLFsLrXfTOq2PAMQicy4F3/w4Gi797unhxF2Xm0HpZVLc4aXro8J+anXqEiuAXycmGWDbFSm5KV7tYE5whCs4Dj2JVmyGEC9T/T2Up/98iv+1iQCkmZFuE0AX9ihRQ98foqlar7hJ0VWmQVa7nrcyNKad+YvLc711j2SzONozIMWFjK21Gn6Fd6Qrda2YF7MRawNKVfLmvlY9PW1L5KW4mqvZgabiAICyseIPWCOhnGJ9chCw4xqFkjXpyDcoV5JuCQjsSpNsjVjjB1kUR3b6JXsySfeEXFr/xBOWqPg0cplDkcyf30E6FOU5cqZHvKY2FApVEhK485CWL3T/rhFSeBGoFJKtBT58f/RuS0IIihawfnrYWreHJSEaIqtKc0VPMenPb+qIblDRl3Lm1U+RiP5Be4ORwQ75mIxlK1ZoWgtVf+ykniN7Xb23zhPPj9pPBzl1ii6eGYAXpzDJQmiV0FTnFXCyZ6sz7jac8tST1NcZdngDGJNT+vcUyZfV5Hd/5M0l8xnFwks5wRUuqVknsne7ftIOSFd6l5kp5/1lsTi3taUUsmUQTMDYZsq jRkjd4T9 nu/NVpk2UZMS3+LhDznVUYnIy7D0yv/NlZo3jjVQsP+zM6JEtVSk6XBN36MbmlPg1jwUwC9mJmqk6flrEf5Oasot8oAd8jKSnhVftZ7Il4hr0r+BLYlpCREB8kkzE9Z9XgYaWrAYhBXLCZqyvcgDB2voKMlejGaiQhU7jYPt1juR/uwe7MsxCii4ZMvNrE1FdPgZIP+Uk9VXuOtCh7V6MMGLqsoUCZwzdhpw/7nGT92b68Pq3iT6e6CkiBDxkAy6Csx5cSkdjfYd6VLEvBmQwZlK+Dak+sJJd2T5j7phpyLMddIYQ+VFoSY/9mr4sD9Gw23yk2ukw8ygTff11/5cwpwOZAXEK8dZ9T6J84s47NJ0ofHcmGrXk0VYcv22D8Jx4B8dvdw/SiWzf8xlNEbQ15FKZ6Ai2p0MtFiHu5vbWfTprEerR+Nm5BrOBT6gH7hVlDvBK X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This allows us to make huge PTEs at shifts other than the hstate shift, which will be necessary for high-granularity mappings. 
Acked-by: Mike Kravetz Signed-off-by: James Houghton --- mm/hugetlb.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index aa8e59cbca69..3a75833d7aba 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5028,11 +5028,11 @@ const struct vm_operations_struct hugetlb_vm_ops = { .pagesize = hugetlb_vm_op_pagesize, }; -static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page, - int writable) +static pte_t make_huge_pte_with_shift(struct vm_area_struct *vma, + struct page *page, int writable, + int shift) { pte_t entry; - unsigned int shift = huge_page_shift(hstate_vma(vma)); if (writable) { entry = huge_pte_mkwrite(huge_pte_mkdirty(mk_pte(page, @@ -5046,6 +5046,14 @@ static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page, return entry; } +static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page, + int writable) +{ + unsigned int shift = huge_page_shift(hstate_vma(vma)); + + return make_huge_pte_with_shift(vma, page, writable, shift); +} + static void set_huge_ptep_writable(struct vm_area_struct *vma, unsigned long address, pte_t *ptep) { From patchwork Thu Jan 5 10:18:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089649 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A9FBC3DA7A for ; Thu, 5 Jan 2023 10:19:20 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DFD4794000C; Thu, 5 Jan 2023 05:19:19 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id DAE00940008; Thu, 5 Jan 2023 05:19:19 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C770094000C; Thu, 5 Jan 2023 05:19:19 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id B47A4940008 for ; Thu, 5 Jan 2023 05:19:19 -0500 (EST) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 968F1A0D04 for ; Thu, 5 Jan 2023 10:19:19 +0000 (UTC) X-FDA: 80320348038.20.AA3C863 Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) by imf23.hostedemail.com (Postfix) with ESMTP id 0F08E140009 for ; Thu, 5 Jan 2023 10:19:17 +0000 (UTC) Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=DiSQp6yj; spf=pass (imf23.hostedemail.com: domain of 3JaS2YwoKCGwTdRYeQRdYXQYYQVO.MYWVSXeh-WWUfKMU.YbQ@flex--jthoughton.bounces.google.com designates 209.85.128.202 as permitted sender) smtp.mailfrom=3JaS2YwoKCGwTdRYeQRdYXQYYQVO.MYWVSXeh-WWUfKMU.YbQ@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672913958; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=xdV3IYpedDoNM7qUi7kHNJzHPxVnQ9DxeM3hKKmcY9U=; b=6xLu8EoczjSrCxfyOiziLYIsKZa8v56k93uEmZjbP3Y4zac1ezt5Louy2SGieBOW18B1yJ 
GYMZUqigEktQQ3o8yFL0xQaZRMOKCWqi2Gj9A3aJLiEXoje1mM0FHCesQq4IUd+7zl/XdW LITA8dlh4bCvvEcqA6rEqS14aIqYD+8= ARC-Authentication-Results: i=1; imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=DiSQp6yj; spf=pass (imf23.hostedemail.com: domain of 3JaS2YwoKCGwTdRYeQRdYXQYYQVO.MYWVSXeh-WWUfKMU.YbQ@flex--jthoughton.bounces.google.com designates 209.85.128.202 as permitted sender) smtp.mailfrom=3JaS2YwoKCGwTdRYeQRdYXQYYQVO.MYWVSXeh-WWUfKMU.YbQ@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1672913958; a=rsa-sha256; cv=none; b=GKj43OAOJQU2Ws5r7qCit3Y5LkmffzdjyngMp9sdUMrJuusb20HxE653Z7fItgZHG65Dey dhNH2L6aN55cP0QT78XLUgHzYGtgTPiPK3IydAw/t3i18q6RSzm5wkQ8lozPpehrEL1cq9 IBEwDPWQ1mOrt6dGdrVzTxjHXhXZQe4= Received: by mail-yw1-f202.google.com with SMTP id 00721157ae682-460ab8a327eso378495117b3.23 for ; Thu, 05 Jan 2023 02:19:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=xdV3IYpedDoNM7qUi7kHNJzHPxVnQ9DxeM3hKKmcY9U=; b=DiSQp6yj/Jffb5ePxq8rLXfWEn0QKWirG+DC/Z+7i72DYO+nh0hgIQIWulypFbtBD0 CNXALjMtPUpaNYD+K9/Wtem86znsi9S0XhhVza6MwF9b0Ah41Zx62xgpk0EqpvxL45Ez twNhGjN8owfKEsoU0pxg3/ZHYJoT3u5nVxsT0weZJKuU5bHZEEigDexGz6ip4UrCMr3u 3a/kheO+Wz2ofksSCLNRFL/BR4OqsTuwZPgS6RFEg5mlXviTOOl8DGf/P5TUOJfu8Kz5 jKgiNh/96tLer8BEZqipSxoSlnT5zdxZozrvn3a/BmqSquCXcaUJoYHENLG2wGTwjIMH H8MA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=xdV3IYpedDoNM7qUi7kHNJzHPxVnQ9DxeM3hKKmcY9U=; b=HkxwIxRKIEqc64n5g5HEQIiv2UyD/7i5zGidh915cugxft59kKLMiRJdQ0kQpoK7E2 VNAaYuVwjsfQnCG6NLbw10VnsNpBm4YRzbIpJXQ3qQCgRWhl/Avts+8rhJLQAQVP44Bd ZnlefDiUe3RZ1i8bDZpKvVpQT8QfABXCptu1CY9n967nsnf1DOx6FgGcfSHyvGFDVO5D pGXj+/AJi0Hc2iys8a0SsDqQhrFiD2T0BKdclsRYacFgq5jnnKZLyIcKh7juXShrYUqS VJT2/jNXpthjandcRy3drlAOltp5o9oUMYO8QloA4uy7HBfxz7H3NyVBFnTpnil1r1w0 117A== X-Gm-Message-State: AFqh2kodCuw6Ju6HQR8QtZz3RoQmQFuNTbBxLuoZ1/k9Bbr3oMljYpzh lbpvayIfD6OIfVNAEtSiZiBmPvwZqjyq0QGz X-Google-Smtp-Source: AMrXdXueaa4dFGcoO+ZQv8KF8N7hesWfET2q0B/yE9XBfhLadDTwF6Vz6PrsKAZSbr/jbmJmEVRWPYHXjsmbmqDZ X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a81:490c:0:b0:41f:702d:7883 with SMTP id w12-20020a81490c000000b0041f702d7883mr5495262ywa.22.1672913957258; Thu, 05 Jan 2023 02:19:17 -0800 (PST) Date: Thu, 5 Jan 2023 10:18:13 +0000 In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com> Mime-Version: 1.0 References: <20230105101844.1893104-1-jthoughton@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230105101844.1893104-16-jthoughton@google.com> Subject: [PATCH 15/46] hugetlb: make default arch_make_huge_pte understand small mappings From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . 
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 0F08E140009 X-Stat-Signature: a66w8go37fe3jwo64kdet6ctohmc8gqm X-HE-Tag: 1672913957-705781 X-HE-Meta: U2FsdGVkX19WiJkPjkSi8Q20WKU84Sd6OdhZ1KxczKmcvlsDYwg0D3IfCGU+6r8wWZi4bB1I295KqEjE2WRpJsPxzBjUhuK1BuikQl+fmHlJm0GsHSF0h4k/PaQq81hWuDJZ0PreNIWzoNXkH/ERrb/0lw0UySApq7Ym/j98QqlgSuxVMIQA9565mVbrSJ8wR5uT2N+Ej0XvSst0TMbBKpIwqfcVgStBXdKSZtYoy6lE8DLJfzQ3IqgCHAiyyPXk6VnQujL3xMaJsmS2q+MVjWQBz9oDw9EvTzi/kQ6L58aQsfLkkgKV4JAXYDOeJSF9CtQrm4ub6AVoOY8cegXDNLbkYZWmFWmM9zx5Z1LFGyUcyVBayFAUyGBZ37enbGtsIForOt1einJDPjs5jEeWFUR8OWYhVgtZCuYoZEvf+1GNs/RPoNDGPPNaTGt9N+Jsu1gFJmHOGw7PM3ERFFjzE3Y4egyWYbA7m2435f9/J+L3W6YES+jDEpefvLzszV2XCvlK2fGSNk27zndZE09ZpJEMLBDzjHEqagox829R37EFLlgZr0KTa/wCIbwdfWQ232qTHuMBRsC0LBVsBd0iX7gXDi7oZ2dOTmXDjyt5Lk/V44kf1Glh9LT8T8vrZwwUDCcAurzJitxrQEqSFWkK/lSINAj+N40erIz08FOvMR9/8tOFjnFMFBmrvcDn5A9z+U+J0xvL1wrCE0KZXnDMoI5kJI6G8bxyuQ86E7fkGr/t9TuviLHpunIogKO0AjX5lCDC5t/7yHmuq+yNR0REGPkSfoYl4XBeOT5Ir1P2/75w/W6dwENqwyJc+xkOJ4Kk4qqD/CNmq4B/1s4bYkymad/MIncueFnLJ1NqD+J637G2oSf2ezTR1wm1wx5WlMTIYi7heYd9g7xPr3PBX2cRZHmIrawGw0FInNyF9qzcgcd2kMTmkK4M5ijXYx6GVjGFqYQ3CU03rHYEnONhbP5 ymdmthow vDE8GPe5JtWyg+CArirjHujPbmLyEUPcAjUuV1v7Ebx8QD9ydacQ7oh6l7LRoBtClgXiDMYX6uDwu2pqcZpCWoXD9TlOGc1ithMMNc4K9NXpo+x43tT1pU/Ld8VGa7ETO/jDgw1sQ4ubpRQt8bYqF9QEyEickhttaSfmVhSbD3pNo94Nd2LZGLVuSGDQeu56LkhkSozs5khbCIpU3GYWJlMDIq6emBkzA/46w45DnQIeohUysVWzRgnZf87B7QBkLQSOzDM9hwLOrTo0Zp3HPSNeoc25L+mntnk2ltcs+0QEp+nVgGZtd/3zNbKARiA4pWvX5OnquZw/PhO+3wSUxVKUp3bB714OnF3g40HC3U3jDUSLElsOzhHzC+UjLM+fj3lFV5noFGBBovW/XJhq+l0u7o1rSm/s/F2CsX/QwwPzU5AGzEZlGGbSPeow3gE+P1nWa X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This is a simple change: don't create a "huge" PTE if we are making a regular, PAGE_SIZE PTE. All architectures that want to implement HGM likely need to be changed in a similar way if they implement their own version of arch_make_huge_pte. Signed-off-by: James Houghton --- include/linux/hugetlb.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 2fcd8f313628..b7cf45535d64 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -912,7 +912,7 @@ static inline void arch_clear_hugepage_flags(struct page *page) { } static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags) { - return pte_mkhuge(entry); + return shift > PAGE_SHIFT ? 
pte_mkhuge(entry) : entry; } #endif From patchwork Thu Jan 5 10:18:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089650 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 64C6EC3DA7D for ; Thu, 5 Jan 2023 10:19:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id F107A94000D; Thu, 5 Jan 2023 05:19:21 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id EC2D5940008; Thu, 5 Jan 2023 05:19:21 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D396294000D; Thu, 5 Jan 2023 05:19:21 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id C18A1940008 for ; Thu, 5 Jan 2023 05:19:21 -0500 (EST) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 906981206CA for ; Thu, 5 Jan 2023 10:19:21 +0000 (UTC) X-FDA: 80320348122.14.65FBD7D Received: from mail-yb1-f202.google.com (mail-yb1-f202.google.com [209.85.219.202]) by imf22.hostedemail.com (Postfix) with ESMTP id EE842C0007 for ; Thu, 5 Jan 2023 10:19:19 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=XMejQEid; spf=pass (imf22.hostedemail.com: domain of 3J6S2YwoKCG4VfTagSTfaZSaaSXQ.OaYXUZgj-YYWhMOW.adS@flex--jthoughton.bounces.google.com designates 209.85.219.202 as permitted sender) smtp.mailfrom=3J6S2YwoKCG4VfTagSTfaZSaaSXQ.OaYXUZgj-YYWhMOW.adS@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672913960; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=JBa/YNOWHc8BdkByQMj3j16NgHnHDUWMHmpwThmwYW8=; b=S9qdij6CR4WfikYLjuSGZhEr4ffiO2eb0DkUpFj2skjLFWx5KF8P0E4TGhzq87puicCNCx thM8gbEsJqZ9OCpk16VezRfjB4rcCYFZjCxju4DESF+e1WBZeNgdm0mk46sBZJP+KL64hd NvIM3s9vrhECoFrJCBzDCKNnaUL4Cu8= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=XMejQEid; spf=pass (imf22.hostedemail.com: domain of 3J6S2YwoKCG4VfTagSTfaZSaaSXQ.OaYXUZgj-YYWhMOW.adS@flex--jthoughton.bounces.google.com designates 209.85.219.202 as permitted sender) smtp.mailfrom=3J6S2YwoKCG4VfTagSTfaZSaaSXQ.OaYXUZgj-YYWhMOW.adS@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1672913960; a=rsa-sha256; cv=none; b=ReCV4NxLCHMsjiojtswjmDngEARmsUha8g4mgqr4wLc+wM7h8yKUSqkuGNf3VQlRG9wROH 0gy9e+7HL8xP1LWlOdE7HgA7A+QujLpJvxP2ZJ0vHjDGKV6W0Hinpp//ioHyruCbIY8NJH EdXENJ/lWyAvLUB6Ck5Ml9jv1u0r+BE= Received: by mail-yb1-f202.google.com with SMTP id g9-20020a25bdc9000000b0073727a20239so36033645ybk.4 for ; Thu, 05 Jan 2023 02:19:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to 
:date:from:to:cc:subject:date:message-id:reply-to; bh=JBa/YNOWHc8BdkByQMj3j16NgHnHDUWMHmpwThmwYW8=; b=XMejQEid82qJD40DqqAVmKO7KaK2I6fqGVaQFn3qa+xsUhE5GeK5QsvXmIgXrlYAYp ZglFLDAuM74Ul8HnkiBVKndyeFl7TfYD6lrTc+G+Y0y5590vBgUhLji2pkV/uo0dTtba JOhfbgAT9W5Ih3mvcLPzzoGXyJzqEfRKoNqybN3iBy+HEU29ORkAZ56QrvlR8wMgJBPv /fiJ3Qjur/kHePLpxQf6mexQ2zmKFEUxMYH6n8AvNHRLmu3tqS42gPfHnw2VZCW+sun6 KjrZlQt7j8nYKl5hgQmUrLHKrYGRJUYde6DMPUl9jm0rRYEyRoygWOPbOjS00/ls6W8J sVQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=JBa/YNOWHc8BdkByQMj3j16NgHnHDUWMHmpwThmwYW8=; b=rIsMDpEQYkSM8VHUYybLIgekNergExXwLJWQL5qnApE5gWP5olt5We+D0NqpwdcLyf ty+bpSPGlcR+zV99ZwNPbl70DSygziTKZ//YCAT1ECZIggWG1NNK+ii0xBoY5eSdWZYz i7UPUGUeDNgcH5JB4JHk45nNnj+DLFvoyTzp96rYCIepSNwoOdp9H8gH8R5DFS4OgraT 8CNUjyrfykB6o6VXqtYwR2BG/++LRK4PZeaK+ueIPEBuGGJpArYK7EOnMjArVuL+IE0g ZIh9KDxyCiPEo4Yhx00HhsSQQ5B3v9iCkUZpN3yjgZdPAm0082NBbD/8OraqNzHAyZDE TeuA== X-Gm-Message-State: AFqh2kohWlGlRnWu/1kD4Nq3tE2JBYck9kujm7cE3KAKYQwn/rV9BGDt /UoCIq1E3Wpz0RP3XD/nug1JtUVr9pplPeVx X-Google-Smtp-Source: AMrXdXvV0r/wfkEbgjE+pvWMQzkSz+gLHIZu+cb6EHqmZiUJTMqvVQaEsHt/42vL01/hmp945pzEpgFSvftYQi7k X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a05:6902:b14:b0:6fc:c88a:1c6d with SMTP id ch20-20020a0569020b1400b006fcc88a1c6dmr5728371ybb.486.1672913959168; Thu, 05 Jan 2023 02:19:19 -0800 (PST) Date: Thu, 5 Jan 2023 10:18:14 +0000 In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com> Mime-Version: 1.0 References: <20230105101844.1893104-1-jthoughton@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230105101844.1893104-17-jthoughton@google.com> Subject: [PATCH 16/46] hugetlbfs: do a full walk to check if vma maps a page From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . 
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspamd-Queue-Id: EE842C0007 X-Stat-Signature: erhesis88rxhr7ha1qzkbzixe86ussq8 X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1672913959-477191 X-HE-Meta: U2FsdGVkX1+QHCdOy0f6OOuD1ZpHo9Bf94aAKyv4HHv3Qqvcve4K2WjSM+NOyPlyAvbZFfCvHOKwo+RbYzx6ZTG5dAn/jxEtlKKiYQp2mAIIaXj3I31Bcs636Y/9kGqmqZgLcX87lmRaf7x6GNWPxCZEgwQutp8jlCWW1ofgIanZGyIb45ooqboWZ8FMR/jo2N6NJWZyBTsvkH1Gxn3Uoekmtgc4d+OhKq7SbK1LfXppOrLHkb+blwYT9rQvjwfeCf7hF7qEnRb8wxPCZi5nrmpW1qarvQLGlw1kwQQuQXDNYKrB2pXtij9gq3D0DatsZ+M6fqiTEx5DHgCmtGK1Kr3FlNL0ceWQtb60eQyJhKnPlvs+QPcMShhvBZLo6wsh1kguf+BrYFRAIEMahdaS6QWjnN4depwSj0K4s3Fpw5bEPMB/jwDr8rTqHVavuiCqYEEm0dxxKTgqNxJf3H9guTo9a/KbbbIqEbX/CCQZaapD3t6oqCAHfZzYU3Y6U2yr+/4U6Ih7vpd13clk4tVKHxxHjGRt63pV2mmEzLvAYcP1SKo+KctWjQCHDw5oHLEG9DIcKc0Tco4HyvFPILhibvFJHrAS4VaSgGuDAJjW5JUL1FXpASsB38hEZHX3d7wJnK+ejZ+uqceIfKr9BWrectZ0qfz05ungVSvA9dNzDdIeHsXkyObzw1eOUKGvdIP9v7COadv1cz6jntF5U6eHplfBWhaRWmSrV4zc8Bs09zqUcym7Pj/+oW7X4kQ5+X2CSLZJLGwlYk2YsvErmPOKbzJfJ1i4NNl6AHdb/mpBnL81QEuZwjZTWRN1fkxEfSHNoWaDPDmpcSLzqEDlclunXohbsCHcxVAQQiPL3PiEYr7X68ABmFtnby5Lge6KPnYp4cju9t5km0DC/iyesTubyr6gq0fZEuhfxpqPDgoATriCmmhAS6idg27CV+rH/hbuDsoEBAlBvGi/a4niSuY 7p5XpYFK kzuqT4rknyExp+kZ7uD/bFg2ksNtHhNr71o418q64OF+H5vg9Qsr368i55BZ9y+LxiAkKOy6Gr8OoWPXVK0jiQvVuLBbUvD78ODyYlZzUtVZB07EkhGeBCZb9KSWBjsBKy2soT+1rqBki0bT1g0wnyKmUFHOdBYNHfvO+4b0lcsfuWxqgRZQGG2yS3gI8P1jD6gpC7Moiw8NGpVfFwRzJVy9J0vi8XqB7ibQ7HZt2r3QdDWKhyI+TTpfes/RIxA4wfahW2jnhfcwbhPhTJJhFtnlLok2/TFmUwO7hQD0dgB2q3n1+OWVEN1E0/llV/Ufwn5aK/L+l6Q0GIU/0rvP5Qm2YgHmAoZ0t3kpTTJ5ikZD64fH7U2qWTebr6Y3FptGLiFdsuguN3Qtg9oc= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Because it is safe to do so, we can do a full high-granularity page table walk to check if the page is mapped. If it were not safe to do so, we could bail out early in the case of a high-granularity mapped PTE, indicating that the page could have been mapped. Signed-off-by: James Houghton --- fs/hugetlbfs/inode.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index 48f1a8ad2243..d34ce79da595 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -386,17 +386,24 @@ static void hugetlb_delete_from_page_cache(struct folio *folio) static bool hugetlb_vma_maps_page(struct vm_area_struct *vma, unsigned long addr, struct page *page) { - pte_t *ptep, pte; + pte_t pte; + struct hugetlb_pte hpte; - ptep = hugetlb_walk(vma, addr, huge_page_size(hstate_vma(vma))); - if (!ptep) + if (hugetlb_full_walk(&hpte, vma, addr)) return false; - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(hpte.ptep); if (huge_pte_none(pte) || !pte_present(pte)) return false; - if (pte_page(pte) == page) + if (unlikely(!hugetlb_pte_present_leaf(&hpte, pte))) + /* + * We raced with someone splitting us, and the only case + * where this is impossible is when the pte was none. 
+	 */
+		return false;
+
+	if (compound_head(pte_page(pte)) == page)
 		return true;
 	return false;

From patchwork Thu Jan 5 10:18:15 2023
From: James Houghton
Date: Thu, 5 Jan 2023 10:18:15 +0000
Subject: [PATCH 17/46] hugetlb: make unmapping compatible with high-granularity mappings
Message-ID: <20230105101844.1893104-18-jthoughton@google.com>
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>

Enlighten __unmap_hugepage_range to deal with high-granularity mappings.
This doesn't change its API; it still must be called with hugepage
alignment, but it will correctly unmap hugepages that have been mapped
at high granularity.

The rules for mapcount and refcount here are:
1. Refcount and mapcount are tracked on the head page.
2. Each page table mapping into some of an hpage will increase that
   hpage's mapcount and refcount by 1.

Eventually, functionality here can be expanded to allow users to call
MADV_DONTNEED on PAGE_SIZE-aligned sections of a hugepage, but that is
not done here.
Signed-off-by: James Houghton --- include/asm-generic/tlb.h | 6 ++-- mm/hugetlb.c | 74 ++++++++++++++++++++++++--------------- 2 files changed, 48 insertions(+), 32 deletions(-) diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index b46617207c93..31267471760e 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -598,9 +598,9 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb, __tlb_remove_tlb_entry(tlb, ptep, address); \ } while (0) -#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address) \ +#define tlb_remove_huge_tlb_entry(tlb, hpte, address) \ do { \ - unsigned long _sz = huge_page_size(h); \ + unsigned long _sz = hugetlb_pte_size(&hpte); \ if (_sz >= P4D_SIZE) \ tlb_flush_p4d_range(tlb, address, _sz); \ else if (_sz >= PUD_SIZE) \ @@ -609,7 +609,7 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb, tlb_flush_pmd_range(tlb, address, _sz); \ else \ tlb_flush_pte_range(tlb, address, _sz); \ - __tlb_remove_tlb_entry(tlb, ptep, address); \ + __tlb_remove_tlb_entry(tlb, hpte.ptep, address);\ } while (0) /** diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 3a75833d7aba..dfd6c1491ac3 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5384,10 +5384,10 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct { struct mm_struct *mm = vma->vm_mm; unsigned long address; - pte_t *ptep; + struct hugetlb_pte hpte; pte_t pte; spinlock_t *ptl; - struct page *page; + struct page *hpage, *subpage; struct hstate *h = hstate_vma(vma); unsigned long sz = huge_page_size(h); unsigned long last_addr_mask; @@ -5397,35 +5397,33 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct BUG_ON(start & ~huge_page_mask(h)); BUG_ON(end & ~huge_page_mask(h)); - /* - * This is a hugetlb vma, all the pte entries should point - * to huge page. - */ - tlb_change_page_size(tlb, sz); tlb_start_vma(tlb, vma); last_addr_mask = hugetlb_mask_last_page(h); address = start; - for (; address < end; address += sz) { - ptep = hugetlb_walk(vma, address, sz); - if (!ptep) { - address |= last_addr_mask; + + while (address < end) { + if (hugetlb_full_walk(&hpte, vma, address)) { + address = (address | last_addr_mask) + sz; continue; } - ptl = huge_pte_lock(h, mm, ptep); - if (huge_pmd_unshare(mm, vma, address, ptep)) { + ptl = hugetlb_pte_lock(&hpte); + if (hugetlb_pte_size(&hpte) == sz && + huge_pmd_unshare(mm, vma, address, hpte.ptep)) { spin_unlock(ptl); tlb_flush_pmd_range(tlb, address & PUD_MASK, PUD_SIZE); force_flush = true; address |= last_addr_mask; + address += sz; continue; } - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(hpte.ptep); + if (huge_pte_none(pte)) { spin_unlock(ptl); - continue; + goto next_hpte; } /* @@ -5441,24 +5439,35 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct */ if (pte_swp_uffd_wp_any(pte) && !(zap_flags & ZAP_FLAG_DROP_MARKER)) - set_huge_pte_at(mm, address, ptep, + set_huge_pte_at(mm, address, hpte.ptep, make_pte_marker(PTE_MARKER_UFFD_WP)); else - huge_pte_clear(mm, address, ptep, sz); + huge_pte_clear(mm, address, hpte.ptep, + hugetlb_pte_size(&hpte)); + spin_unlock(ptl); + goto next_hpte; + } + + if (unlikely(!hugetlb_pte_present_leaf(&hpte, pte))) { + /* + * We raced with someone splitting out from under us. + * Retry the walk. 
+			 */
 			spin_unlock(ptl);
 			continue;
 		}
-		page = pte_page(pte);
+		subpage = pte_page(pte);
+		hpage = compound_head(subpage);
 		/*
 		 * If a reference page is supplied, it is because a specific
 		 * page is being unmapped, not a range. Ensure the page we
 		 * are about to unmap is the actual page of interest.
 		 */
 		if (ref_page) {
-			if (page != ref_page) {
+			if (hpage != ref_page) {
 				spin_unlock(ptl);
-				continue;
+				goto next_hpte;
 			}
 			/*
 			 * Mark the VMA as having unmapped its page so that
@@ -5468,25 +5477,32 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
 			set_vma_resv_flags(vma, HPAGE_RESV_UNMAPPED);
 		}
-		pte = huge_ptep_get_and_clear(mm, address, ptep);
-		tlb_remove_huge_tlb_entry(h, tlb, ptep, address);
+		pte = huge_ptep_get_and_clear(mm, address, hpte.ptep);
+		tlb_change_page_size(tlb, hugetlb_pte_size(&hpte));
+		tlb_remove_huge_tlb_entry(tlb, hpte, address);
 		if (huge_pte_dirty(pte))
-			set_page_dirty(page);
+			set_page_dirty(hpage);
 		/* Leave a uffd-wp pte marker if needed */
 		if (huge_pte_uffd_wp(pte) &&
 		    !(zap_flags & ZAP_FLAG_DROP_MARKER))
-			set_huge_pte_at(mm, address, ptep,
+			set_huge_pte_at(mm, address, hpte.ptep,
 					make_pte_marker(PTE_MARKER_UFFD_WP));
-		hugetlb_count_sub(pages_per_huge_page(h), mm);
-		page_remove_rmap(page, vma, true);
+		hugetlb_count_sub(hugetlb_pte_size(&hpte)/PAGE_SIZE, mm);
+		page_remove_rmap(hpage, vma, true);
 		spin_unlock(ptl);
-		tlb_remove_page_size(tlb, page, huge_page_size(h));
 		/*
-		 * Bail out after unmapping reference page if supplied
+		 * Lower the reference count on the head page.
+		 */
+		tlb_remove_page_size(tlb, hpage, sz);
+		/*
+		 * Bail out after unmapping reference page if supplied,
+		 * and there's only one PTE mapping this page.
 		 */
-		if (ref_page)
+		if (ref_page && hugetlb_pte_size(&hpte) == sz)
 			break;
+next_hpte:
+		address += hugetlb_pte_size(&hpte);
 	}
 	tlb_end_vma(tlb, vma);

From patchwork Thu Jan 5 10:18:16 2023
From: James Houghton
Date: Thu, 5 Jan 2023 10:18:16 +0000
Subject: [PATCH 18/46] hugetlb: add HGM support for hugetlb_change_protection
Message-ID: <20230105101844.1893104-19-jthoughton@google.com>
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>

The main change here is to do a high-granularity walk and to pull the
shift from the walk (not from the hstate).
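For intuition, the conversion pattern repeated throughout this series replaces
a fixed-stride loop over hstate-sized entries with a loop whose stride is
whatever size the walk actually found at that address. A userspace-only toy
model of that control flow (not part of the patch; the sizes and the
entry_size() helper are invented, where the real code gets the size from
hugetlb_pte_size()):

#include <stdio.h>

#define EXAMPLE_HPAGE_SIZE (2UL << 20)	/* assume 2 MiB hugepages */
#define EXAMPLE_PAGE_SIZE  (4UL << 10)	/* assume 4 KiB base pages */

/* Pretend the second half of the range was split to 4 KiB granularity. */
static unsigned long entry_size(unsigned long addr, unsigned long len)
{
	return addr < len / 2 ? EXAMPLE_HPAGE_SIZE : EXAMPLE_PAGE_SIZE;
}

int main(void)
{
	unsigned long len = 4 * EXAMPLE_HPAGE_SIZE;
	unsigned long addr = 0, entries = 0;

	/* Old shape: for (addr = 0; addr < len; addr += EXAMPLE_HPAGE_SIZE) */
	while (addr < len) {
		unsigned long sz = entry_size(addr, len);	/* from the walk */

		/* ... operate on [addr, addr + sz) at this granularity ... */
		entries++;
		addr += sz;	/* step by what is actually mapped here */
	}
	printf("visited %lu entries\n", entries);	/* 2 + 1024 = 1026 */
	return 0;
}

The split second half is visited in 1024 steps rather than 2, which is why the
shift now has to come from the walk rather than from the hstate.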
Signed-off-by: James Houghton --- mm/hugetlb.c | 59 +++++++++++++++++++++++++++++++++------------------- 1 file changed, 38 insertions(+), 21 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index dfd6c1491ac3..73672d806172 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6798,15 +6798,15 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, { struct mm_struct *mm = vma->vm_mm; unsigned long start = address; - pte_t *ptep; pte_t pte; struct hstate *h = hstate_vma(vma); - unsigned long pages = 0, psize = huge_page_size(h); + unsigned long base_pages = 0, psize = huge_page_size(h); bool shared_pmd = false; struct mmu_notifier_range range; unsigned long last_addr_mask; bool uffd_wp = cp_flags & MM_CP_UFFD_WP; bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE; + struct hugetlb_pte hpte; /* * In the case of shared PMDs, the area to flush could be beyond @@ -6824,28 +6824,30 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, hugetlb_vma_lock_write(vma); i_mmap_lock_write(vma->vm_file->f_mapping); last_addr_mask = hugetlb_mask_last_page(h); - for (; address < end; address += psize) { + while (address < end) { spinlock_t *ptl; - ptep = hugetlb_walk(vma, address, psize); - if (!ptep) { - address |= last_addr_mask; + + if (hugetlb_full_walk(&hpte, vma, address)) { + address = (address | last_addr_mask) + psize; continue; } - ptl = huge_pte_lock(h, mm, ptep); - if (huge_pmd_unshare(mm, vma, address, ptep)) { + + ptl = hugetlb_pte_lock(&hpte); + if (hugetlb_pte_size(&hpte) == psize && + huge_pmd_unshare(mm, vma, address, hpte.ptep)) { /* * When uffd-wp is enabled on the vma, unshare * shouldn't happen at all. Warn about it if it * happened due to some reason. */ WARN_ON_ONCE(uffd_wp || uffd_wp_resolve); - pages++; + base_pages += psize / PAGE_SIZE; spin_unlock(ptl); shared_pmd = true; - address |= last_addr_mask; + address = (address | last_addr_mask) + psize; continue; } - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(hpte.ptep); if (unlikely(is_hugetlb_entry_hwpoisoned(pte))) { /* Nothing to do. */ } else if (unlikely(is_hugetlb_entry_migration(pte))) { @@ -6861,7 +6863,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, entry = make_readable_migration_entry( swp_offset(entry)); newpte = swp_entry_to_pte(entry); - pages++; + base_pages += hugetlb_pte_size(&hpte) / PAGE_SIZE; } if (uffd_wp) @@ -6869,34 +6871,49 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, else if (uffd_wp_resolve) newpte = pte_swp_clear_uffd_wp(newpte); if (!pte_same(pte, newpte)) - set_huge_pte_at(mm, address, ptep, newpte); + set_huge_pte_at(mm, address, hpte.ptep, newpte); } else if (unlikely(is_pte_marker(pte))) { /* No other markers apply for now. */ WARN_ON_ONCE(!pte_marker_uffd_wp(pte)); if (uffd_wp_resolve) /* Safe to modify directly (non-present->none). 
 			 */
-				huge_pte_clear(mm, address, ptep, psize);
+				huge_pte_clear(mm, address, hpte.ptep,
+					       hugetlb_pte_size(&hpte));
 		} else if (!huge_pte_none(pte)) {
 			pte_t old_pte;
-			unsigned int shift = huge_page_shift(hstate_vma(vma));
+			unsigned int shift = hpte.shift;
-			old_pte = huge_ptep_modify_prot_start(vma, address, ptep);
+			if (unlikely(!hugetlb_pte_present_leaf(&hpte, pte))) {
+				/*
+				 * Someone split the PTE from under us, so retry
+				 * the walk.
+				 */
+				spin_unlock(ptl);
+				continue;
+			}
+
+			old_pte = huge_ptep_modify_prot_start(
+					vma, address, hpte.ptep);
 			pte = huge_pte_modify(old_pte, newprot);
-			pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
+			pte = arch_make_huge_pte(
+					pte, shift, vma->vm_flags);
 			if (uffd_wp)
 				pte = huge_pte_mkuffd_wp(pte);
 			else if (uffd_wp_resolve)
 				pte = huge_pte_clear_uffd_wp(pte);
-			huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte);
-			pages++;
+			huge_ptep_modify_prot_commit(
+					vma, address, hpte.ptep,
+					old_pte, pte);
+			base_pages += hugetlb_pte_size(&hpte) / PAGE_SIZE;
 		} else {
 			/* None pte */
 			if (unlikely(uffd_wp))
 				/* Safe to modify directly (none->non-present). */
-				set_huge_pte_at(mm, address, ptep,
+				set_huge_pte_at(mm, address, hpte.ptep,
 						make_pte_marker(PTE_MARKER_UFFD_WP));
 		}
 		spin_unlock(ptl);
+		address += hugetlb_pte_size(&hpte);
 	}
 	/*
 	 * Must flush TLB before releasing i_mmap_rwsem: x86's huge_pmd_unshare
@@ -6919,7 +6936,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	hugetlb_vma_unlock_write(vma);
 	mmu_notifier_invalidate_range_end(&range);
-	return pages << h->order;
+	return base_pages;
 }
 /* Return true if reservation was successful, false otherwise. */

From patchwork Thu Jan 5 10:18:17 2023
From: James Houghton
Date: Thu, 5 Jan 2023 10:18:17 +0000
Subject: [PATCH 19/46] hugetlb: add HGM support for follow_hugetlb_page
Message-ID: <20230105101844.1893104-20-jthoughton@google.com>
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
This enables high-granularity mapping support in GUP. In case it is
confusing, pfn_offset is the offset (in PAGE_SIZE units) that vaddr
points to within the subpage that hpte points to.

Signed-off-by: James Houghton
---
 mm/hugetlb.c | 59 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 37 insertions(+), 22 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 73672d806172..30fea414d9ee 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6532,11 +6532,9 @@ static void record_subpages_vmas(struct page *page, struct vm_area_struct *vma,
 }
 
 static inline bool __follow_hugetlb_must_fault(struct vm_area_struct *vma,
-					       unsigned int flags, pte_t *pte,
+					       unsigned int flags, pte_t pteval,
 					       bool *unshare)
 {
-	pte_t pteval = huge_ptep_get(pte);
-
 	*unshare = false;
 	if (is_swap_pte(pteval))
 		return true;
@@ -6611,11 +6609,13 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	int err = -EFAULT, refs;
 
 	while (vaddr < vma->vm_end && remainder) {
-		pte_t *pte;
+		pte_t *ptep, pte;
 		spinlock_t *ptl = NULL;
 		bool unshare = false;
 		int absent;
-		struct page *page;
+		unsigned long pages_per_hpte;
+		struct page *page, *subpage;
+		struct hugetlb_pte hpte;
 
 		/*
 		 * If we have a pending SIGKILL, don't keep faulting pages and
@@ -6632,13 +6632,19 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		 * each hugepage.
We have to make sure we get the * first, for the page indexing below to work. * - * Note that page table lock is not held when pte is null. + * hugetlb_full_walk will mask the address appropriately. + * + * Note that page table lock is not held when ptep is null. */ - pte = hugetlb_walk(vma, vaddr & huge_page_mask(h), - huge_page_size(h)); - if (pte) - ptl = huge_pte_lock(h, mm, pte); - absent = !pte || huge_pte_none(huge_ptep_get(pte)); + if (hugetlb_full_walk(&hpte, vma, vaddr)) { + ptep = NULL; + absent = true; + } else { + ptl = hugetlb_pte_lock(&hpte); + ptep = hpte.ptep; + pte = huge_ptep_get(ptep); + absent = huge_pte_none(pte); + } /* * When coredumping, it suits get_dump_page if we just return @@ -6649,13 +6655,20 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, */ if (absent && (flags & FOLL_DUMP) && !hugetlbfs_pagecache_present(h, vma, vaddr)) { - if (pte) + if (ptep) spin_unlock(ptl); hugetlb_vma_unlock_read(vma); remainder = 0; break; } + if (!absent && pte_present(pte) && + !hugetlb_pte_present_leaf(&hpte, pte)) { + /* We raced with someone splitting the PTE, so retry. */ + spin_unlock(ptl); + continue; + } + /* * We need call hugetlb_fault for both hugepages under migration * (in which case hugetlb_fault waits for the migration,) and @@ -6671,7 +6684,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, vm_fault_t ret; unsigned int fault_flags = 0; - if (pte) + if (ptep) spin_unlock(ptl); hugetlb_vma_unlock_read(vma); @@ -6720,8 +6733,10 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, continue; } - pfn_offset = (vaddr & ~huge_page_mask(h)) >> PAGE_SHIFT; - page = pte_page(huge_ptep_get(pte)); + pfn_offset = (vaddr & ~hugetlb_pte_mask(&hpte)) >> PAGE_SHIFT; + subpage = pte_page(pte); + pages_per_hpte = hugetlb_pte_size(&hpte) / PAGE_SIZE; + page = compound_head(subpage); VM_BUG_ON_PAGE((flags & FOLL_PIN) && PageAnon(page) && !PageAnonExclusive(page), page); @@ -6731,22 +6746,22 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, * and skip the same_page loop below. */ if (!pages && !vmas && !pfn_offset && - (vaddr + huge_page_size(h) < vma->vm_end) && - (remainder >= pages_per_huge_page(h))) { - vaddr += huge_page_size(h); - remainder -= pages_per_huge_page(h); - i += pages_per_huge_page(h); + (vaddr + pages_per_hpte < vma->vm_end) && + (remainder >= pages_per_hpte)) { + vaddr += pages_per_hpte; + remainder -= pages_per_hpte; + i += pages_per_hpte; spin_unlock(ptl); hugetlb_vma_unlock_read(vma); continue; } /* vaddr may not be aligned to PAGE_SIZE */ - refs = min3(pages_per_huge_page(h) - pfn_offset, remainder, + refs = min3(pages_per_hpte - pfn_offset, remainder, (vma->vm_end - ALIGN_DOWN(vaddr, PAGE_SIZE)) >> PAGE_SHIFT); if (pages || vmas) - record_subpages_vmas(nth_page(page, pfn_offset), + record_subpages_vmas(nth_page(subpage, pfn_offset), vma, refs, likely(pages) ? pages + i : NULL, vmas ? 
					 vmas + i : NULL);

From patchwork Thu Jan 5 10:18:18 2023
From: James Houghton
Date: Thu, 5 Jan 2023 10:18:18 +0000
Subject: [PATCH 20/46] hugetlb: add HGM support for hugetlb_follow_page_mask
Message-ID: <20230105101844.1893104-21-jthoughton@google.com>
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 592AEA0006 X-Stat-Signature: 6xrg8wu8i8oy6t58jx6hmhsp4bcq4aty X-Rspam-User: X-HE-Tag: 1672913965-449754 X-HE-Meta: U2FsdGVkX18Jq3F0xwnH9mLt5R0giRXl5hN4DmmSNni6fjoCkY6cLYrU8fbtjP1zAKkPN6CTudsQ1zu8Dansv65ScYjWZFwJt8aCAkcu5NUKPWIj9WtZVuUqf9oOD6630WFiqnvg1+SUQBhkkVgnfvlKIC6BGUdDYLaCkvwOHc4t5ve6qpT1XUs3TcBuj+k8TGzSzRN/MZ6TrJX7x8lMCLqt1dJjiByZWf5I4qUkssDFMdN/QYCDI3WD7JD04Jky/oLETUPCWEhI7Xm75vZY1oreNrQnsczphST9bffUKuObJi6k7skDfQkNvB2uI3BK8TOwE8M1cJzpnx24le+T/dJzGYGObFxx9GvIWRYEuLMllOdbz4lAhKMPo6jHMtLWDnxP5o8PNjCG4GDtxE/ccdP6fwO2mVtq/nXposIck2Q00KTi9mgUI7xhc+O0FPZ9tCmhBc5OI2GA0eo/+i8gHh+W7mwLVVPCtHM/cLHCq6mGXnQIYxd/7wkpjf88pUBAuOHdM5uqFGDC1kS5ahHiSFIs1tR4bAOeI8Qe6JaPhIqzIzMkvd4Fg3e478ovcUSugDHXbUsnYFjzb7f86gq2+PJU4EvQyHfMfSgK0skRVSj3OPFwSKgDzwDXZXoHnLsT6GuRanoN6P1gpaz0CHO9+A45fUguFoesXc7fI+WZd8QVhGBLKmxsuEqL4g1eTdX7L4rY/TvHUdsFJwZmlMHadHNvdNp7yH+cTs6Vlo29p2MybBYkDKk/KthL92mQIJmaOh8+OR45pmJ/SGz7ClC4XbUM6S9NzPELMeA+L4m27JnjTIagXy9YH9YVncfDJ19WQacQ5f0eFxHkLlAeh7Hr7WaPIKWuWCBVZEyboc9Ug9qNJfjAMHdLwR4nI2GmM+tG802ZNlEgBRzN6n57gtokUstpuyGtymrmEeTtuC9y0GScnOnBIcRSRDGfv/pxlmtjJc34UdiWTd1GYt5/Ihp Sle/AwMq 9LPat3CgcGyAbWxrHEhLiThH7e7xgX7SK/F4c/sFKLz2rjqALXyVhs2CJYpbspkeYSO5PGVJivg15WkYgV1USR2kLdNy2LhfSsHnRJ6VI1QMfRMoreyhqEahgF1j4r6f783Ldvb1tQVUTAdOeaNDTWyv3vmIyepEgZgfqxGIj/v4fIELUsms908/biPef/1Lkb//0xDu3hWYy+53dSJ0K6xB6ZvfbvI7oOqaHdM5FHhZUxRFxxJDfM9S1rkjwi6+oPjx0i+vya6+8pGxC42B4j3GxnNq8VVUMH7mWXHuPxzMvawSIYUtVQ6D3kzomXLa3uluP3HqD33S4I35WFtv67JZ8eU5V4lDpJYwxtIELDlLj0Qy9pkXwAUIlXn0cmrPPP2o7+F9ux9/1FkyE7d19fLVB1UmpSmMs6HdpfwRKRIYqyudau4l4Q7cbGAaiSl1eMY90 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The change here is very simple: do a high-granularity walk. Signed-off-by: James Houghton --- mm/hugetlb.c | 24 +++++++++++++++++------- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 30fea414d9ee..718572444a73 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6553,11 +6553,10 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma, unsigned long address, unsigned int flags) { struct hstate *h = hstate_vma(vma); - struct mm_struct *mm = vma->vm_mm; - unsigned long haddr = address & huge_page_mask(h); struct page *page = NULL; spinlock_t *ptl; - pte_t *pte, entry; + pte_t entry; + struct hugetlb_pte hpte; /* * FOLL_PIN is not supported for follow_page(). Ordinary GUP goes via @@ -6567,13 +6566,24 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma, return NULL; hugetlb_vma_lock_read(vma); - pte = hugetlb_walk(vma, haddr, huge_page_size(h)); - if (!pte) + + if (hugetlb_full_walk(&hpte, vma, address)) goto out_unlock; - ptl = huge_pte_lock(h, mm, pte); - entry = huge_ptep_get(pte); +retry: + ptl = hugetlb_pte_lock(&hpte); + entry = huge_ptep_get(hpte.ptep); if (pte_present(entry)) { + if (unlikely(!hugetlb_pte_present_leaf(&hpte, entry))) { + /* + * We raced with someone splitting from under us. + * Keep walking to get to the real leaf. 
+ */ + spin_unlock(ptl); + hugetlb_full_walk_continue(&hpte, vma, address); + goto retry; + } + page = pte_page(entry) + ((address & ~huge_page_mask(h)) >> PAGE_SHIFT); /* From patchwork Thu Jan 5 10:18:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089656 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E452C54E76 for ; Thu, 5 Jan 2023 10:19:30 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C51F4940012; Thu, 5 Jan 2023 05:19:28 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BDA13940008; Thu, 5 Jan 2023 05:19:28 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A04CE940012; Thu, 5 Jan 2023 05:19:28 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 8C7CC940008 for ; Thu, 5 Jan 2023 05:19:28 -0500 (EST) Received: from smtpin03.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 63F651206CA for ; Thu, 5 Jan 2023 10:19:28 +0000 (UTC) X-FDA: 80320348416.03.5BC8A28 Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) by imf14.hostedemail.com (Postfix) with ESMTP id BEF24100014 for ; Thu, 5 Jan 2023 10:19:26 +0000 (UTC) Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=awstklLG; spf=pass (imf14.hostedemail.com: domain of 3LaS2YwoKCHQblZgmYZlgfYggYdW.Ugedafmp-eecnSUc.gjY@flex--jthoughton.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3LaS2YwoKCHQblZgmYZlgfYggYdW.Ugedafmp-eecnSUc.gjY@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672913966; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=dzlak4wlRH3uJBMFvzVwTlLZdbkzJfi0hjjO9/IWItc=; b=qrU/VPgzZDMMKE7PIGt4wVOgHSyShQGacwwGMXtHjyqTbt72dJJcq/qfQqnW8bDJm53IP7 J7bLvOnz7pUMAP79agnA4Y6H4Wr3toKSLVAvqNpjcR9PdVla0tzN4IqjadNFKpfy+kHEWk 7xpTpaZf8aKuxkwWymJtGEMW8oTrrXo= ARC-Authentication-Results: i=1; imf14.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=awstklLG; spf=pass (imf14.hostedemail.com: domain of 3LaS2YwoKCHQblZgmYZlgfYggYdW.Ugedafmp-eecnSUc.gjY@flex--jthoughton.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3LaS2YwoKCHQblZgmYZlgfYggYdW.Ugedafmp-eecnSUc.gjY@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1672913966; a=rsa-sha256; cv=none; b=GiphzF1iPYo6umI68i3kO3cDFpFgoQF+p+Jf005IyflstqkeXMlg1c+TtNrmm1Mi1emmsJ lnnrXxEWroOvO0d1VerXedpeM+1tcaFXBmFs5mrgpUMOkbH0CWt88MDVZVtKBj/8uCdnTf 76Bwn4ftVYZN8mBC/Gxz33ls/jcS/4k= Received: by mail-yw1-f201.google.com with SMTP id 00721157ae682-46eb8a5a713so313719617b3.1 for ; Thu, 05 Jan 2023 02:19:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; 
s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=dzlak4wlRH3uJBMFvzVwTlLZdbkzJfi0hjjO9/IWItc=; b=awstklLGib5NLIDq4Z9cXiLXngXaTfD1T2MU8qCmmFIg7CftwwRRmxHAX6D7bWf5Gk eP1XAJ27BxVhrMSTPS4OjW8E1bBJRXlppA2RWBZmOpuN5hpHa1nhxolbVg5N3XX7fR/i uxhvgdxFosl+f9OSWK0aUkrSsfbYoSIksFEwk1rsCnqG3hCxFs6mirU8CHZeMTGjuX2e Q0rjB5hCGfPsTJ4KAHBF7XDazDkHbd8tLKzKTBPDcEeSaSx12m8GRvkvEisrZ27K202+ A0uBeYtWLg95Whts+TekNEMT7WDGdYBbfWNEqStD8GqMAKLEzVcmfqPLMlDyrJLjx5tS jt3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=dzlak4wlRH3uJBMFvzVwTlLZdbkzJfi0hjjO9/IWItc=; b=FiQLcRr77cSyDz7zJom8NFaHOZFSXpAdvdfrkTBELn+MnttiZaK9qC7aZbXDiHSs/c FfW7wz6h2+cmEccN4FywzHJIi1n+z2WRj6EyCKxzQH4WcVvUFsYDxRbUpPhkXfbXi7Y4 LRr/LiK+qRxaaTS5glnSSHLLLziKS7qyavRXiKVTy0g/OQZXwVQsmvPI4nUn0qATFfcP roJCN2cw821QEa61fmiH/ff6USOae/I0RJX0WBd91rO8TS4yokARSIguEN69p4XWXW++ LXnHPBWMNFyWmtSbpoDju9luZvhJ18k5Rj/oqhuwNJ7a/5aR16uVkF9mHHrOlZ7qQQ3F 9d1Q== X-Gm-Message-State: AFqh2kpXUrlSijShZyIs2zgvDikzrf3XRG6FGCz+2rkRkc2RCA1EVKWC 6BdntweXOs9H5e42wstbktiaTnzOXrYUg2Eo X-Google-Smtp-Source: AMrXdXsBFsRq3QLOKoFHYSJZW7aNqeecSHB6zHpYfaCR9hJb67tR/XUiwWFNgVZLgKPrv3NfBrXd+RFTPmxskVNT X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a05:6902:13cc:b0:75f:65da:a046 with SMTP id y12-20020a05690213cc00b0075f65daa046mr5651223ybu.357.1672913965974; Thu, 05 Jan 2023 02:19:25 -0800 (PST) Date: Thu, 5 Jan 2023 10:18:19 +0000 In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com> Mime-Version: 1.0 References: <20230105101844.1893104-1-jthoughton@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230105101844.1893104-22-jthoughton@google.com> Subject: [PATCH 21/46] hugetlb: use struct hugetlb_pte for walk_hugetlb_range From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . 
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Stat-Signature: gruxpeui55sep6tznnmpd8rbsr55o95f X-Rspam-User: X-Rspamd-Queue-Id: BEF24100014 X-Rspamd-Server: rspam06 X-HE-Tag: 1672913966-780264 X-HE-Meta: U2FsdGVkX1+aKKQdAug/INlABSEzXJpGnLPJ6dz9sfVnfBQT4L/itrwQp6J4uoxTgM34+ARh/OzXvnEyiHSrIkiEAS5KExLZn+P4uGXFS+BumIUh8KcV9wqyU3ARCeaZCk3vmhZsinnqu/fAssyfvx7moITq1JZsfI4/mL/eK8Qoyf5o1n8Tr29bsmgs2Rf1yLU/ZOfCKZrJkW2Uoj7mO1m/64YEXXCVqtijqSWjJ+v19D+Zx38aMV61ckClFbppyk63kKKfkflgkQvVh6ORfbVyxsLPHJOScfIpjpROWVf26VsIWuowReJDwkwq0hz9AYL0wekPAK/d+B9DgZDrR3Jrg57uFSQq1cnGUrTyo7yZpd/4jyqdaBfRnZ2BV+dv5Jvsmwg/7hvF+paI/GJV2eUsEMg1Zc8WB57lDMTiiA9vACPQSvExAFET9z0JIKY3i6O4uMKxdEYEIFe3VU+fi3HXqiduoITdpkOnOSxPhnzHTfX6j6+1aU1qSaI9DIIyxwuyezxnnA90qD7+htwdiDDDUh3xhMAii2sm8urR10QJqskgtWV0p9kjpOqXHZ/L8M7S87NY4ze7ypCBS3pKprm8P6f7o2AB5XqywkJy1OgaGlDlvnzjdpPCUmSy7esTyN/lHAr9E98hBKpHODYD7tU1aRJ2YoyHMvuMV8ueFkElvqHe8b63RJsNU0yhTzsQK2EktV2R+SHwySVwCOk70BvfRJXklX4nV23mNofewuwiH8j/Dku3Y+kWUIde2M/s8sF7yEIdmtFtMiPDGGz163SDQlLi5NatoWtAKbmaUie+1Rwrjwg0Z2wEHoFews4DzvFQ1D9jUHcFf9kn+2l2eEE5JwNGrdWC7CtmAAs6kbGMXWCQ9eRS4TCUrs5cWCSPcoGSNd12ub3hijRDUFOBnjw0n52gzyrG/7YrsIZ6sXwJsRTjg8DDJyj7aguFsY/1mDjc72++dL7smo3BF+f ep7gWD7p B6cIrqQ5SYW9JVtSb2olhe52gTvKCgPpIe1Ia8nh7cw3ZLMzoSQJ+Da7jktQqZr34/7E5B9gUVmKbF+7x5W2oodquXHGp/k0eBKk8GaT65VZkQpK+ZWwJfF1fNrdSXzn/4nONdI81RkK4BZsorNpWVPfhz0i3B2shQayQNSIasZ0p2Y0LKu5k1grwqM2DBN/z6vBEnVSY1YXKhTbObfaWNNPqfHuADuXnDEYphu2S7zvEGnk8SqklbIKf5t29lLe9b0qQmRv7SqefiawSENEY//26+OJ7VVMHur+SAeq7i/SOxaitR7mUNup86A6lRf2U5hpwAFZ7kAXd3zPc9BGYxyMvdq28o3PI6r3okzMqxjQVchsQdWaj0T3ZsIHcX+rCEKgtBeGghUYDywjszM/67SBjPZqihrt8bdDeR5CV3QhjXELhBf92lQeS2EVQpdng1jFy X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The main change in this commit is to walk_hugetlb_range to support walking HGM mappings, but all walk_hugetlb_range callers must be updated to use the new API and take the correct action. Listing all the changes to the callers: For s390 changes, we simply ignore HGM PTEs (we don't support s390 yet). For smaps, shared_hugetlb (and private_hugetlb, although private mappings don't support HGM) may now not be divisible by the hugepage size. The appropriate changes have been made to support analyzing HGM PTEs. For pagemap, we ignore non-leaf PTEs by treating that as if they were none PTEs. We can only end up with non-leaf PTEs if they had just been updated from a none PTE. For show_numa_map, the challenge is that, if any of a hugepage is mapped, we have to count that entire page exactly once, as the results are given in units of hugepages. To support HGM mappings, we keep track of the last page that we looked it. If the hugepage we are currently looking at is the same as the last one, then we must be looking at an HGM-mapped page that has been mapped at high-granularity, and we've already accounted for it. For DAMON, we treat non-leaf PTEs as if they were blank, for the same reason as pagemap. For hwpoison, we proactively update the logic to support the case when hpte is pointing to a subpage within the poisoned hugepage. For queue_pages_hugetlb/migration, we ignore all HGM-enabled VMAs for now. For mincore, we ignore non-leaf PTEs for the same reason as pagemap. For mprotect/prot_none_hugetlb_entry, we retry the walk when we get a non-leaf PTE. 
Signed-off-by: James Houghton --- arch/s390/mm/gmap.c | 20 ++++++++-- fs/proc/task_mmu.c | 83 +++++++++++++++++++++++++++++----------- include/linux/pagewalk.h | 10 +++-- mm/damon/vaddr.c | 42 +++++++++++++------- mm/hmm.c | 20 +++++++--- mm/memory-failure.c | 17 ++++---- mm/mempolicy.c | 12 ++++-- mm/mincore.c | 17 ++++++-- mm/mprotect.c | 18 ++++++--- mm/pagewalk.c | 20 +++++----- 10 files changed, 180 insertions(+), 79 deletions(-) diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index 74e1d873dce0..284466bf4f25 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -2626,13 +2626,25 @@ static int __s390_enable_skey_pmd(pmd_t *pmd, unsigned long addr, return 0; } -static int __s390_enable_skey_hugetlb(pte_t *pte, unsigned long addr, - unsigned long hmask, unsigned long next, +static int __s390_enable_skey_hugetlb(struct hugetlb_pte *hpte, + unsigned long addr, struct mm_walk *walk) { - pmd_t *pmd = (pmd_t *)pte; + struct hstate *h = hstate_vma(walk->vma); + pmd_t *pmd; unsigned long start, end; - struct page *page = pmd_page(*pmd); + struct page *page; + + if (huge_page_size(h) != hugetlb_pte_size(hpte)) + /* Ignore high-granularity PTEs. */ + return 0; + + if (!pte_present(huge_ptep_get(hpte->ptep))) + /* Ignore non-present PTEs. */ + return 0; + + pmd = (pmd_t *)hpte->ptep; + page = pmd_page(*pmd); /* * The write check makes sure we do not set a key on shared diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 41b5509bde0e..c353cab11eee 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -731,18 +731,28 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) } #ifdef CONFIG_HUGETLB_PAGE -static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask, - unsigned long addr, unsigned long end, - struct mm_walk *walk) +static int smaps_hugetlb_range(struct hugetlb_pte *hpte, + unsigned long addr, + struct mm_walk *walk) { struct mem_size_stats *mss = walk->private; struct vm_area_struct *vma = walk->vma; struct page *page = NULL; + pte_t pte = huge_ptep_get(hpte->ptep); - if (pte_present(*pte)) { - page = vm_normal_page(vma, addr, *pte); - } else if (is_swap_pte(*pte)) { - swp_entry_t swpent = pte_to_swp_entry(*pte); + if (pte_present(pte)) { + /* We only care about leaf-level PTEs. */ + if (!hugetlb_pte_present_leaf(hpte, pte)) + /* + * The only case where hpte is not a leaf is that + * it was originally none, but it was split from + * under us. It was originally none, so exclude it. 
+ */ + return 0; + + page = vm_normal_page(vma, addr, pte); + } else if (is_swap_pte(pte)) { + swp_entry_t swpent = pte_to_swp_entry(pte); if (is_pfn_swap_entry(swpent)) page = pfn_swap_entry_to_page(swpent); @@ -751,9 +761,9 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask, int mapcount = page_mapcount(page); if (mapcount >= 2) - mss->shared_hugetlb += huge_page_size(hstate_vma(vma)); + mss->shared_hugetlb += hugetlb_pte_size(hpte); else - mss->private_hugetlb += huge_page_size(hstate_vma(vma)); + mss->private_hugetlb += hugetlb_pte_size(hpte); } return 0; } @@ -1572,22 +1582,31 @@ static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned long end, #ifdef CONFIG_HUGETLB_PAGE /* This function walks within one hugetlb entry in the single call */ -static int pagemap_hugetlb_range(pte_t *ptep, unsigned long hmask, - unsigned long addr, unsigned long end, +static int pagemap_hugetlb_range(struct hugetlb_pte *hpte, + unsigned long addr, struct mm_walk *walk) { struct pagemapread *pm = walk->private; struct vm_area_struct *vma = walk->vma; u64 flags = 0, frame = 0; int err = 0; - pte_t pte; + unsigned long hmask = hugetlb_pte_mask(hpte); + unsigned long end = addr + hugetlb_pte_size(hpte); + pte_t pte = huge_ptep_get(hpte->ptep); + struct page *page; if (vma->vm_flags & VM_SOFTDIRTY) flags |= PM_SOFT_DIRTY; - pte = huge_ptep_get(ptep); if (pte_present(pte)) { - struct page *page = pte_page(pte); + /* + * We raced with this PTE being split, which can only happen if + * it was blank before. Treat it is as if it were blank. + */ + if (!hugetlb_pte_present_leaf(hpte, pte)) + return 0; + + page = pte_page(pte); if (!PageAnon(page)) flags |= PM_FILE; @@ -1868,10 +1887,16 @@ static struct page *can_gather_numa_stats_pmd(pmd_t pmd, } #endif +struct show_numa_map_private { + struct numa_maps *md; + struct page *last_page; +}; + static int gather_pte_stats(pmd_t *pmd, unsigned long addr, unsigned long end, struct mm_walk *walk) { - struct numa_maps *md = walk->private; + struct show_numa_map_private *priv = walk->private; + struct numa_maps *md = priv->md; struct vm_area_struct *vma = walk->vma; spinlock_t *ptl; pte_t *orig_pte; @@ -1883,6 +1908,7 @@ static int gather_pte_stats(pmd_t *pmd, unsigned long addr, struct page *page; page = can_gather_numa_stats_pmd(*pmd, vma, addr); + priv->last_page = page; if (page) gather_stats(page, md, pmd_dirty(*pmd), HPAGE_PMD_SIZE/PAGE_SIZE); @@ -1896,6 +1922,7 @@ static int gather_pte_stats(pmd_t *pmd, unsigned long addr, orig_pte = pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl); do { struct page *page = can_gather_numa_stats(*pte, vma, addr); + priv->last_page = page; if (!page) continue; gather_stats(page, md, pte_dirty(*pte), 1); @@ -1906,19 +1933,25 @@ static int gather_pte_stats(pmd_t *pmd, unsigned long addr, return 0; } #ifdef CONFIG_HUGETLB_PAGE -static int gather_hugetlb_stats(pte_t *pte, unsigned long hmask, - unsigned long addr, unsigned long end, struct mm_walk *walk) +static int gather_hugetlb_stats(struct hugetlb_pte *hpte, unsigned long addr, + struct mm_walk *walk) { - pte_t huge_pte = huge_ptep_get(pte); + struct show_numa_map_private *priv = walk->private; + pte_t huge_pte = huge_ptep_get(hpte->ptep); struct numa_maps *md; struct page *page; - if (!pte_present(huge_pte)) + if (!hugetlb_pte_present_leaf(hpte, huge_pte)) + return 0; + + page = compound_head(pte_page(huge_pte)); + if (priv->last_page == page) + /* we've already accounted for this page */ return 0; - page = pte_page(huge_pte); + priv->last_page = page; 
- md = walk->private; + md = priv->md; gather_stats(page, md, pte_dirty(huge_pte), 1); return 0; } @@ -1948,9 +1981,15 @@ static int show_numa_map(struct seq_file *m, void *v) struct file *file = vma->vm_file; struct mm_struct *mm = vma->vm_mm; struct mempolicy *pol; + char buffer[64]; int nid; + struct show_numa_map_private numa_map_private; + + numa_map_private.md = md; + numa_map_private.last_page = NULL; + if (!mm) return 0; @@ -1980,7 +2019,7 @@ static int show_numa_map(struct seq_file *m, void *v) seq_puts(m, " huge"); /* mmap_lock is held by m_start */ - walk_page_vma(vma, &show_numa_ops, md); + walk_page_vma(vma, &show_numa_ops, &numa_map_private); if (!md->pages) goto out; diff --git a/include/linux/pagewalk.h b/include/linux/pagewalk.h index 27a6df448ee5..f4bddad615c2 100644 --- a/include/linux/pagewalk.h +++ b/include/linux/pagewalk.h @@ -3,6 +3,7 @@ #define _LINUX_PAGEWALK_H #include +#include struct mm_walk; @@ -31,6 +32,10 @@ struct mm_walk; * ptl after dropping the vma lock, or else revalidate * those items after re-acquiring the vma lock and before * accessing them. + * In the presence of high-granularity hugetlb entries, + * @hugetlb_entry is called only for leaf-level entries + * (hstate-level entries are ignored if they are not + * leaves). * @test_walk: caller specific callback function to determine whether * we walk over the current vma or not. Returning 0 means * "do page table walk over the current vma", returning @@ -58,9 +63,8 @@ struct mm_walk_ops { unsigned long next, struct mm_walk *walk); int (*pte_hole)(unsigned long addr, unsigned long next, int depth, struct mm_walk *walk); - int (*hugetlb_entry)(pte_t *pte, unsigned long hmask, - unsigned long addr, unsigned long next, - struct mm_walk *walk); + int (*hugetlb_entry)(struct hugetlb_pte *hpte, + unsigned long addr, struct mm_walk *walk); int (*test_walk)(unsigned long addr, unsigned long next, struct mm_walk *walk); int (*pre_vma)(unsigned long start, unsigned long end, diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c index 9d92c5eb3a1f..2383f647f202 100644 --- a/mm/damon/vaddr.c +++ b/mm/damon/vaddr.c @@ -330,11 +330,12 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr, } #ifdef CONFIG_HUGETLB_PAGE -static void damon_hugetlb_mkold(pte_t *pte, struct mm_struct *mm, +static void damon_hugetlb_mkold(struct hugetlb_pte *hpte, pte_t entry, + struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr) { bool referenced = false; - pte_t entry = huge_ptep_get(pte); + pte_t entry = huge_ptep_get(hpte->ptep); struct folio *folio = pfn_folio(pte_pfn(entry)); folio_get(folio); @@ -342,12 +343,12 @@ static void damon_hugetlb_mkold(pte_t *pte, struct mm_struct *mm, if (pte_young(entry)) { referenced = true; entry = pte_mkold(entry); - set_huge_pte_at(mm, addr, pte, entry); + set_huge_pte_at(mm, addr, hpte->ptep, entry); } #ifdef CONFIG_MMU_NOTIFIER if (mmu_notifier_clear_young(mm, addr, - addr + huge_page_size(hstate_vma(vma)))) + addr + hugetlb_pte_size(hpte))) referenced = true; #endif /* CONFIG_MMU_NOTIFIER */ @@ -358,20 +359,26 @@ static void damon_hugetlb_mkold(pte_t *pte, struct mm_struct *mm, folio_put(folio); } -static int damon_mkold_hugetlb_entry(pte_t *pte, unsigned long hmask, - unsigned long addr, unsigned long end, +static int damon_mkold_hugetlb_entry(struct hugetlb_pte *hpte, + unsigned long addr, struct mm_walk *walk) { - struct hstate *h = hstate_vma(walk->vma); spinlock_t *ptl; pte_t entry; - ptl = huge_pte_lock(h, walk->mm, pte); - entry = huge_ptep_get(pte); + ptl = 
hugetlb_pte_lock(hpte); + entry = huge_ptep_get(hpte->ptep); if (!pte_present(entry)) goto out; - damon_hugetlb_mkold(pte, walk->mm, walk->vma, addr); + if (!hugetlb_pte_present_leaf(hpte, entry)) + /* + * We raced with someone splitting a blank PTE. Treat this PTE + * as if it were blank. + */ + goto out; + + damon_hugetlb_mkold(hpte, entry, walk->mm, walk->vma, addr); out: spin_unlock(ptl); @@ -484,8 +491,8 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr, } #ifdef CONFIG_HUGETLB_PAGE -static int damon_young_hugetlb_entry(pte_t *pte, unsigned long hmask, - unsigned long addr, unsigned long end, +static int damon_young_hugetlb_entry(struct hugetlb_pte *hpte, + unsigned long addr, struct mm_walk *walk) { struct damon_young_walk_private *priv = walk->private; @@ -494,11 +501,18 @@ static int damon_young_hugetlb_entry(pte_t *pte, unsigned long hmask, spinlock_t *ptl; pte_t entry; - ptl = huge_pte_lock(h, walk->mm, pte); - entry = huge_ptep_get(pte); + ptl = hugetlb_pte_lock(hpte); + entry = huge_ptep_get(hpte->ptep); if (!pte_present(entry)) goto out; + if (!hugetlb_pte_present_leaf(hpte, entry)) + /* + * We raced with someone splitting a blank PTE. Treat this PTE + * as if it were blank. + */ + goto out; + folio = pfn_folio(pte_pfn(entry)); folio_get(folio); diff --git a/mm/hmm.c b/mm/hmm.c index 6a151c09de5e..d3e40cfdd4cb 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -468,8 +468,8 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end, #endif #ifdef CONFIG_HUGETLB_PAGE -static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask, - unsigned long start, unsigned long end, +static int hmm_vma_walk_hugetlb_entry(struct hugetlb_pte *hpte, + unsigned long start, struct mm_walk *walk) { unsigned long addr = start, i, pfn; @@ -479,16 +479,24 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask, unsigned int required_fault; unsigned long pfn_req_flags; unsigned long cpu_flags; + unsigned long hmask = hugetlb_pte_mask(hpte); + unsigned int order = hpte->shift - PAGE_SHIFT; + unsigned long end = start + hugetlb_pte_size(hpte); spinlock_t *ptl; pte_t entry; - ptl = huge_pte_lock(hstate_vma(vma), walk->mm, pte); - entry = huge_ptep_get(pte); + ptl = hugetlb_pte_lock(hpte); + entry = huge_ptep_get(hpte->ptep); + + if (!hugetlb_pte_present_leaf(hpte, entry)) { + spin_unlock(ptl); + return -EAGAIN; + } i = (start - range->start) >> PAGE_SHIFT; pfn_req_flags = range->hmm_pfns[i]; cpu_flags = pte_to_hmm_pfn_flags(range, entry) | - hmm_pfn_flags_order(huge_page_order(hstate_vma(vma))); + hmm_pfn_flags_order(order); required_fault = hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, cpu_flags); if (required_fault) { @@ -605,7 +613,7 @@ int hmm_range_fault(struct hmm_range *range) * in pfns. All entries < last in the pfn array are set to their * output, and all >= are still at their input values. 
*/ - } while (ret == -EBUSY); + } while (ret == -EBUSY || ret == -EAGAIN); return ret; } EXPORT_SYMBOL(hmm_range_fault); diff --git a/mm/memory-failure.c b/mm/memory-failure.c index c77a9e37e27e..e7e56298d305 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -641,6 +641,7 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift, unsigned long poisoned_pfn, struct to_kill *tk) { unsigned long pfn = 0; + unsigned long base_pages_poisoned = (1UL << shift) / PAGE_SIZE; if (pte_present(pte)) { pfn = pte_pfn(pte); @@ -651,7 +652,8 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift, pfn = swp_offset_pfn(swp); } - if (!pfn || pfn != poisoned_pfn) + if (!pfn || pfn < poisoned_pfn || + pfn >= poisoned_pfn + base_pages_poisoned) return 0; set_to_kill(tk, addr, shift); @@ -717,16 +719,15 @@ static int hwpoison_pte_range(pmd_t *pmdp, unsigned long addr, } #ifdef CONFIG_HUGETLB_PAGE -static int hwpoison_hugetlb_range(pte_t *ptep, unsigned long hmask, - unsigned long addr, unsigned long end, - struct mm_walk *walk) +static int hwpoison_hugetlb_range(struct hugetlb_pte *hpte, + unsigned long addr, + struct mm_walk *walk) { struct hwp_walk *hwp = walk->private; - pte_t pte = huge_ptep_get(ptep); - struct hstate *h = hstate_vma(walk->vma); + pte_t pte = huge_ptep_get(hpte->ptep); - return check_hwpoisoned_entry(pte, addr, huge_page_shift(h), - hwp->pfn, &hwp->tk); + return check_hwpoisoned_entry(pte, addr & hugetlb_pte_mask(hpte), + hpte->shift, hwp->pfn, &hwp->tk); } #else #define hwpoison_hugetlb_range NULL diff --git a/mm/mempolicy.c b/mm/mempolicy.c index d3558248a0f0..e5859ed34e90 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -558,8 +558,8 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr, return addr != end ? -EIO : 0; } -static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask, - unsigned long addr, unsigned long end, +static int queue_pages_hugetlb(struct hugetlb_pte *hpte, + unsigned long addr, struct mm_walk *walk) { int ret = 0; @@ -570,8 +570,12 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask, spinlock_t *ptl; pte_t entry; - ptl = huge_pte_lock(hstate_vma(walk->vma), walk->mm, pte); - entry = huge_ptep_get(pte); + /* We don't migrate high-granularity HugeTLB mappings for now. */ + if (hugetlb_hgm_enabled(walk->vma)) + return -EINVAL; + + ptl = hugetlb_pte_lock(hpte); + entry = huge_ptep_get(hpte->ptep); if (!pte_present(entry)) goto unlock; page = pte_page(entry); diff --git a/mm/mincore.c b/mm/mincore.c index a085a2aeabd8..0894965b3944 100644 --- a/mm/mincore.c +++ b/mm/mincore.c @@ -22,18 +22,29 @@ #include #include "swap.h" -static int mincore_hugetlb(pte_t *pte, unsigned long hmask, unsigned long addr, - unsigned long end, struct mm_walk *walk) +static int mincore_hugetlb(struct hugetlb_pte *hpte, unsigned long addr, + struct mm_walk *walk) { #ifdef CONFIG_HUGETLB_PAGE unsigned char present; + unsigned long end = addr + hugetlb_pte_size(hpte); unsigned char *vec = walk->private; + pte_t pte = huge_ptep_get(hpte->ptep); /* * Hugepages under user process are always in RAM and never * swapped out, but theoretically it needs to be checked. */ - present = pte && !huge_pte_none(huge_ptep_get(pte)); + present = !huge_pte_none(pte); + + /* + * If the pte is present but not a leaf, we raced with someone + * splitting it. For someone to have split it, it must have been + * huge_pte_none before, so treat it as such. 
+ */ + if (pte_present(pte) && !hugetlb_pte_present_leaf(hpte, pte)) + present = false; + for (; addr != end; vec++, addr += PAGE_SIZE) *vec = present; walk->private = vec; diff --git a/mm/mprotect.c b/mm/mprotect.c index 71358e45a742..62d8c5f7bc92 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -543,12 +543,16 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr, 0 : -EACCES; } -static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask, - unsigned long addr, unsigned long next, +static int prot_none_hugetlb_entry(struct hugetlb_pte *hpte, + unsigned long addr, struct mm_walk *walk) { - return pfn_modify_allowed(pte_pfn(*pte), *(pgprot_t *)(walk->private)) ? - 0 : -EACCES; + pte_t pte = huge_ptep_get(hpte->ptep); + + if (!hugetlb_pte_present_leaf(hpte, pte)) + return -EAGAIN; + return pfn_modify_allowed(pte_pfn(pte), + *(pgprot_t *)(walk->private)) ? 0 : -EACCES; } static int prot_none_test(unsigned long addr, unsigned long next, @@ -591,8 +595,10 @@ mprotect_fixup(struct mmu_gather *tlb, struct vm_area_struct *vma, (newflags & VM_ACCESS_FLAGS) == 0) { pgprot_t new_pgprot = vm_get_page_prot(newflags); - error = walk_page_range(current->mm, start, end, - &prot_none_walk_ops, &new_pgprot); + do { + error = walk_page_range(current->mm, start, end, + &prot_none_walk_ops, &new_pgprot); + } while (error == -EAGAIN); if (error) return error; } diff --git a/mm/pagewalk.c b/mm/pagewalk.c index cb23f8a15c13..05ce242f8b7e 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -3,6 +3,7 @@ #include #include #include +#include /* * We want to know the real level where a entry is located ignoring any @@ -296,20 +297,21 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end, struct vm_area_struct *vma = walk->vma; struct hstate *h = hstate_vma(vma); unsigned long next; - unsigned long hmask = huge_page_mask(h); - unsigned long sz = huge_page_size(h); - pte_t *pte; const struct mm_walk_ops *ops = walk->ops; int err = 0; + struct hugetlb_pte hpte; hugetlb_vma_lock_read(vma); do { - next = hugetlb_entry_end(h, addr, end); - pte = hugetlb_walk(vma, addr & hmask, sz); - if (pte) - err = ops->hugetlb_entry(pte, hmask, addr, next, walk); - else if (ops->pte_hole) - err = ops->pte_hole(addr, next, -1, walk); + if (hugetlb_full_walk(&hpte, vma, addr)) { + next = hugetlb_entry_end(h, addr, end); + if (ops->pte_hole) + err = ops->pte_hole(addr, next, -1, walk); + } else { + err = ops->hugetlb_entry( + &hpte, addr, walk); + next = min(addr + hugetlb_pte_size(&hpte), end); + } if (err) break; } while (addr = next, addr != end); From patchwork Thu Jan 5 10:18:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089655 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9F58EC3DA7A for ; Thu, 5 Jan 2023 10:19:31 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 106AB940013; Thu, 5 Jan 2023 05:19:30 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0DE59940008; Thu, 5 Jan 2023 05:19:30 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E992A940013; Thu, 5 Jan 2023 05:19:29 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org 
Date: Thu, 5 Jan 2023 10:18:20 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-23-jthoughton@google.com>
Subject: [PATCH 22/46] mm: rmap: provide pte_order in page_vma_mapped_walk
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton

page_vma_mapped_walk callers will need this information to know how
HugeTLB pages are mapped. pte_order only applies if pte is not NULL.
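As a rough sketch of what the new field gives callers (not part of this
patch; pvmw_mapping_size() is a made-up helper for illustration), the size
of the mapping behind pvmw->pte can be derived from pte_order instead of
being assumed to be the hstate size:

/*
 * Illustrative helper, not in this series: how much address space the
 * PTE found by page_vma_mapped_walk() actually maps. With HGM this can
 * be smaller than huge_page_size(hstate_vma(vma)).
 */
static unsigned long pvmw_mapping_size(const struct page_vma_mapped_walk *pvmw)
{
	if (!pvmw->pte)
		return 0;	/* pte_order is only meaningful when pte is set. */
	return PAGE_SIZE << pvmw->pte_order;
}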
Signed-off-by: James Houghton --- include/linux/rmap.h | 1 + mm/page_vma_mapped.c | 3 +++ 2 files changed, 4 insertions(+) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index bd3504d11b15..e0557ede2951 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -378,6 +378,7 @@ struct page_vma_mapped_walk { pmd_t *pmd; pte_t *pte; spinlock_t *ptl; + unsigned int pte_order; unsigned int flags; }; diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c index 4e448cfbc6ef..08295b122ad6 100644 --- a/mm/page_vma_mapped.c +++ b/mm/page_vma_mapped.c @@ -16,6 +16,7 @@ static inline bool not_found(struct page_vma_mapped_walk *pvmw) static bool map_pte(struct page_vma_mapped_walk *pvmw) { pvmw->pte = pte_offset_map(pvmw->pmd, pvmw->address); + pvmw->pte_order = 0; if (!(pvmw->flags & PVMW_SYNC)) { if (pvmw->flags & PVMW_MIGRATION) { if (!is_swap_pte(*pvmw->pte)) @@ -177,6 +178,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw) if (!pvmw->pte) return false; + pvmw->pte_order = huge_page_order(hstate); pvmw->ptl = huge_pte_lock(hstate, mm, pvmw->pte); if (!check_pte(pvmw)) return not_found(pvmw); @@ -272,6 +274,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw) } pte_unmap(pvmw->pte); pvmw->pte = NULL; + pvmw->pte_order = 0; goto restart; } pvmw->pte++; From patchwork Thu Jan 5 10:18:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089657 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0DD83C3DA7D for ; Thu, 5 Jan 2023 10:19:33 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9B8FC940014; Thu, 5 Jan 2023 05:19:31 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 8F43E940008; Thu, 5 Jan 2023 05:19:31 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 682F8940014; Thu, 5 Jan 2023 05:19:31 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 473E8940008 for ; Thu, 5 Jan 2023 05:19:31 -0500 (EST) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 136AAAB26A for ; Thu, 5 Jan 2023 10:19:31 +0000 (UTC) X-FDA: 80320348542.20.4DBB538 Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) by imf12.hostedemail.com (Postfix) with ESMTP id 7B55C40005 for ; Thu, 5 Jan 2023 10:19:29 +0000 (UTC) Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=d8vpxkw4; spf=pass (imf12.hostedemail.com: domain of 3MKS2YwoKCHceocjpbcojibjjbgZ.Xjhgdips-hhfqVXf.jmb@flex--jthoughton.bounces.google.com designates 209.85.128.202 as permitted sender) smtp.mailfrom=3MKS2YwoKCHceocjpbcojibjjbgZ.Xjhgdips-hhfqVXf.jmb@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672913969; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; 
Date: Thu, 5 Jan 2023 10:18:21 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-24-jthoughton@google.com>
Subject: [PATCH 23/46] mm: rmap: make page_vma_mapped_walk callers use pte_order
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton

This also updates the callers' hugetlb mapcounting code to handle
mapcount properly for subpage-mapped hugetlb pages.
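Roughly, the accounting rule the rmap walkers follow after this change can
be sketched as below (illustrative helper only, not code from this series):
per-mm statistics are adjusted by the number of base pages the found PTE
maps, while the rmap itself is still tracked on the head page of the
hugepage.

/*
 * Illustrative only: when unmapping one PTE found by page_vma_mapped_walk()
 * in a HugeTLB VMA, subtract the base pages mapped by this PTE (with HGM,
 * possibly fewer than the whole hugepage) and drop the rmap on the head page.
 */
static void hgm_unmap_accounting(struct page_vma_mapped_walk *pvmw,
				 struct folio *folio, struct mm_struct *mm,
				 struct vm_area_struct *vma)
{
	hugetlb_count_sub(1UL << pvmw->pte_order, mm);
	page_remove_rmap(&folio->page, vma, /* compound = */ true);
}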
Signed-off-by: James Houghton --- mm/migrate.c | 2 +- mm/rmap.c | 17 +++++++++++++---- 2 files changed, 14 insertions(+), 5 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index 832f639fc49a..0062689f4878 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -244,7 +244,7 @@ static bool remove_migration_pte(struct folio *folio, #ifdef CONFIG_HUGETLB_PAGE if (folio_test_hugetlb(folio)) { - unsigned int shift = huge_page_shift(hstate_vma(vma)); + unsigned int shift = pvmw.pte_order + PAGE_SHIFT; pte = arch_make_huge_pte(pte, shift, vma->vm_flags); if (folio_test_anon(folio)) diff --git a/mm/rmap.c b/mm/rmap.c index 8a24b90d9531..ff7e6c770b0a 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1608,7 +1608,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, if (PageHWPoison(subpage) && !(flags & TTU_IGNORE_HWPOISON)) { pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); if (folio_test_hugetlb(folio)) { - hugetlb_count_sub(folio_nr_pages(folio), mm); + hugetlb_count_sub(1UL << pvmw.pte_order, mm); set_huge_pte_at(mm, address, pvmw.pte, pteval); } else { dec_mm_counter(mm, mm_counter(&folio->page)); @@ -1767,7 +1767,11 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, * * See Documentation/mm/mmu_notifier.rst */ - page_remove_rmap(subpage, vma, folio_test_hugetlb(folio)); + if (folio_test_hugetlb(folio)) + page_remove_rmap(&folio->page, vma, true); + else + page_remove_rmap(subpage, vma, false); + if (vma->vm_flags & VM_LOCKED) mlock_page_drain_local(); folio_put(folio); @@ -2030,7 +2034,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, } else if (PageHWPoison(subpage)) { pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); if (folio_test_hugetlb(folio)) { - hugetlb_count_sub(folio_nr_pages(folio), mm); + hugetlb_count_sub(1L << pvmw.pte_order, mm); set_huge_pte_at(mm, address, pvmw.pte, pteval); } else { dec_mm_counter(mm, mm_counter(&folio->page)); @@ -2122,7 +2126,10 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, * * See Documentation/mm/mmu_notifier.rst */ - page_remove_rmap(subpage, vma, folio_test_hugetlb(folio)); + if (folio_test_hugetlb(folio)) + page_remove_rmap(&folio->page, vma, true); + else + page_remove_rmap(subpage, vma, false); if (vma->vm_flags & VM_LOCKED) mlock_page_drain_local(); folio_put(folio); @@ -2206,6 +2213,8 @@ static bool page_make_device_exclusive_one(struct folio *folio, args->owner); mmu_notifier_invalidate_range_start(&range); + VM_BUG_ON_FOLIO(folio_test_hugetlb(folio), folio); + while (page_vma_mapped_walk(&pvmw)) { /* Unexpected PMD-mapped THP? 
*/ VM_BUG_ON_FOLIO(!pvmw.pte, folio);
From patchwork Thu Jan 5 10:18:22 2023
Date: Thu, 5 Jan 2023 10:18:22 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-25-jthoughton@google.com>
Subject: [PATCH 24/46] rmap: update hugetlb lock comment for HGM
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton

The VMA lock is used to prevent high-granularity HugeTLB mappings from
being collapsed while other threads are doing high-granularity page
table walks.

Signed-off-by: James Houghton
---
 include/linux/hugetlb.h | 12 ++++++++++++
 mm/rmap.c               |  3 ++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index b7cf45535d64..daf993fdbc38 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -156,6 +156,18 @@ struct file_region {
 #endif
 };
 
+/*
+ * The HugeTLB VMA lock is used to synchronize HugeTLB page table walks.
+ * Right now, it is only used for VM_SHARED mappings.
+ * - The read lock is held when we want to stabilize mappings (prevent PMD
+ * unsharing or MADV_COLLAPSE for high-granularity mappings).
+ * - The write lock is held when we want to free mappings (PMD unsharing and
+ * MADV_COLLAPSE for high-granularity mappings).
+ *
+ * Note: For PMD unsharing and MADV_COLLAPSE, the i_mmap_rwsem is held for
+ * writing as well, so page table walkers will also be safe if they hold
+ * i_mmap_rwsem for at least reading. See hugetlb_walk() for more information.
+ */
 struct hugetlb_vma_lock {
 	struct kref refs;
 	struct rw_semaphore rw_sema;
diff --git a/mm/rmap.c b/mm/rmap.c
index ff7e6c770b0a..076ea77010e5 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -47,7 +47,8 @@
  *
  * hugetlbfs PageHuge() take locks in this order:
  * hugetlb_fault_mutex (hugetlbfs specific page fault mutex)
- * vma_lock (hugetlb specific lock for pmd_sharing)
+ * vma_lock (hugetlb specific lock for pmd_sharing and high-granularity
+ * mapping)
  * mapping->i_mmap_rwsem (also used for hugetlb pmd sharing)
  * page->flags PG_locked (lock_page)
  */

From patchwork Thu Jan 5 10:18:23 2023
Date: Thu, 5 Jan 2023 10:18:23 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-26-jthoughton@google.com>
Subject: [PATCH 25/46] hugetlb: update page_vma_mapped to do high-granularity walks
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr .
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspamd-Queue-Id: 0E9D18000A X-Rspamd-Server: rspam09 X-Rspam-User: X-Stat-Signature: 1j5hhugpuzysoud5nty86i4yrzpbh9gb X-HE-Tag: 1672913971-862332 X-HE-Meta: U2FsdGVkX18wVVBnNkZSW+e+s3sGrdpgBTiVaUu0yQzBIHptCp3ypOWoLQ+Ood7Ek+5kq2T86xIkBAKW0U4Bx124lbmSciD7rMwokHk5An7+wH5X06xhccWKcHFgk3cngrl/PEjvvS0IN4fPJhoYQrpRzLeNbEjhr/QhNoMlslHerm3BkLytMvamBYNi9KKDn+2yVQ9qGo6IYaf5FKkSSTkbe4oUDpRAhu3RdgOUkGJLg7sDzRiHhfbMlf4ja5CUuQwfdqeMPV6L6Cg/JpiZvcT6x4TDebugLsPTGwcZ5JVORavRtE3A0l6IgyW6qVZkrm1be5F31ODGwQb87FuTIqYlfV6K3HeuYAoON8KHUS3E5O9kueryTqBvHAzgreRceBuv4yo3VbPSpphCfh7Fhb7Lcr+RmgmMb+LmJZ05QJ7EY69Fsz8XooFXG+v50AqJPN24a2abC9B+bRjIFHct6BQOXRTMfy6JBlG6F9/9R3aoNHsz6JkjZZ9aw5D8CUzpGeOMzqOzMZPkWp0POOV8rttRZMNsP2myNt1vXAuCn3w8xg9y7XX8q/a258zl8rZUxImW+FLXK4dFlwco448ktEJ+4JC+narS5Q6whVB9oaRiPKv1zmOnLTjza1I2Nm5C/N0sh1eQ3uDVa+Es13L9hc8Bp35jk9dLbSHwZUyDJ6ldyNfob4iwizyOH3kw/HvlCOLrrqHdahWG7Fz4q1VLcjkIw0fXk/lAOOeoJK3fYwEr1OlTo8+ngTkYUwl1qXZxP44lCNv4y/hI81jcADV+t1BrauziHZqOr4NKCUU0RP8qzPgyqLnZMn/p2Xcgsph+Q1NgkLLeprRhufdq9vcSbWQo3wF6hvg+AZ/dVl646gpBOitxkd4YZpWy4i52SlMb+0uXwR5vL2tXWd32+1XjUPx6rZ6NV0gwdUI3Cbrt6VZ5R/SsufcPgLqJWzPItGq+VxtUzwmQARmTsxmCwC5 KyzG3lTg HMaa89I92pnMOPgL+PeKlGXzkpyCdOToCzyEWmBNAWaegy8cNPHOdFEw9zYRtXvLgzF0/1qmSpGCd5USc85ZWfVZCLrwyQGKdy6jG5tNPzVjFsGtL3ZRIYrExUU+KeciEqmWiL46z73vlmC3pQZ12NXSSF/Ixy3X9jlSkzPZQR5Uly1NZqay9TfTV4O8goEe2Q0Qp8FH/ZQwhK8VuGfnEFh1HnAUVsxwqXeCxlj2AkEC0du3VfeAXTnwcJ0Cim2eGln873jVmJNnt/w/gdJ9qHDbsQ5m/GZfqtQJMVnx4or9f4iAFvN/BhGETH6iFispEroVhoKem7WWhAUKOJ5dXFsj69h18pYcouoF5F7zB8wwpOCyBpKVHi4uIIuBNPfK0MWJJxVtcrNS3SurWfV9N4joMprTPigYZIglqIeu6ZuFGjfl1CZCe+U/G+p3OVyLhu2X/ X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Update the HugeTLB logic to look a lot more like the PTE-mapped THP logic. When a user calls us in a loop, we will update pvmw->address to walk to each page table entry that could possibly map the hugepage containing pvmw->pfn. Make use of the new pte_order so callers know what size PTE they're getting. The !pte failure case is changed to call not_found() instead of just returning false. This should be a no-op, but if somehow the hstate-level PTE were deallocated between iterations, not_found() should be called to drop locks. Signed-off-by: James Houghton --- mm/page_vma_mapped.c | 59 +++++++++++++++++++++++++++++++------------- 1 file changed, 42 insertions(+), 17 deletions(-) diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c index 08295b122ad6..03e8a4987272 100644 --- a/mm/page_vma_mapped.c +++ b/mm/page_vma_mapped.c @@ -133,7 +133,8 @@ static void step_forward(struct page_vma_mapped_walk *pvmw, unsigned long size) * * Returns true if the page is mapped in the vma. @pvmw->pmd and @pvmw->pte point * to relevant page table entries. @pvmw->ptl is locked. @pvmw->address is - * adjusted if needed (for PTE-mapped THPs). + * adjusted if needed (for PTE-mapped THPs and high-granularity-mapped HugeTLB + * pages). * * If @pvmw->pmd is set but @pvmw->pte is not, you have found PMD-mapped page * (usually THP). 
For PTE-mapped THP, you should run page_vma_mapped_walk() in @@ -165,23 +166,47 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw) if (unlikely(is_vm_hugetlb_page(vma))) { struct hstate *hstate = hstate_vma(vma); - unsigned long size = huge_page_size(hstate); - /* The only possible mapping was handled on last iteration */ - if (pvmw->pte) - return not_found(pvmw); - /* - * All callers that get here will already hold the - * i_mmap_rwsem. Therefore, no additional locks need to be - * taken before calling hugetlb_walk(). - */ - pvmw->pte = hugetlb_walk(vma, pvmw->address, size); - if (!pvmw->pte) - return false; + struct hugetlb_pte hpte; + pte_t pteval; + + end = (pvmw->address & huge_page_mask(hstate)) + + huge_page_size(hstate); + + do { + if (pvmw->pte) { + if (pvmw->ptl) + spin_unlock(pvmw->ptl); + pvmw->ptl = NULL; + pvmw->address += PAGE_SIZE << pvmw->pte_order; + if (pvmw->address >= end) + return not_found(pvmw); + } - pvmw->pte_order = huge_page_order(hstate); - pvmw->ptl = huge_pte_lock(hstate, mm, pvmw->pte); - if (!check_pte(pvmw)) - return not_found(pvmw); + /* + * All callers that get here will already hold the + * i_mmap_rwsem. Therefore, no additional locks need to + * be taken before calling hugetlb_walk(). + */ + if (hugetlb_full_walk(&hpte, vma, pvmw->address)) + return not_found(pvmw); + +retry: + pvmw->pte = hpte.ptep; + pvmw->pte_order = hpte.shift - PAGE_SHIFT; + pvmw->ptl = hugetlb_pte_lock(&hpte); + pteval = huge_ptep_get(hpte.ptep); + if (pte_present(pteval) && !hugetlb_pte_present_leaf( + &hpte, pteval)) { + /* + * Someone split from under us, so keep + * walking. + */ + spin_unlock(pvmw->ptl); + hugetlb_full_walk_continue(&hpte, vma, + pvmw->address); + goto retry; + } + } while (!check_pte(pvmw)); return true; } From patchwork Thu Jan 5 10:18:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089660 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D4866C53210 for ; Thu, 5 Jan 2023 10:19:36 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4AF3D940017; Thu, 5 Jan 2023 05:19:35 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4619F940008; Thu, 5 Jan 2023 05:19:35 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 30087940017; Thu, 5 Jan 2023 05:19:35 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 1DD04940008 for ; Thu, 5 Jan 2023 05:19:35 -0500 (EST) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id E723380D72 for ; Thu, 5 Jan 2023 10:19:34 +0000 (UTC) X-FDA: 80320348668.26.76D328B Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) by imf27.hostedemail.com (Postfix) with ESMTP id 5F7CE4000F for ; Thu, 5 Jan 2023 10:19:33 +0000 (UTC) Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b="aZrPoex/"; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf27.hostedemail.com: domain of 3NKS2YwoKCHsisgntfgsnmfnnfkd.bnlkhmtw-lljuZbj.nqf@flex--jthoughton.bounces.google.com designates 
Date: Thu, 5 Jan 2023 10:18:24 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID:
<20230105101844.1893104-27-jthoughton@google.com> Subject: [PATCH 26/46] hugetlb: add HGM support for copy_hugetlb_page_range From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspamd-Queue-Id: 5F7CE4000F X-Rspamd-Server: rspam09 X-Rspam-User: X-Stat-Signature: q7xo9ye6u6cwsq5dfwzugcnzrutp1yaf X-HE-Tag: 1672913973-954965 X-HE-Meta: U2FsdGVkX19ZkEago6tkl9FSYvsl0hR85pehNByoSElNTGpbh8uCFEhbe1AVzUkuyO0kD3DyqmkgbM4Po/Yj4NdjL+PCoxxL0HJCcN/SA14nysZ0eUueGzEDPhgLFkU03GOees+wkUQfwyi7ThzlKSCNMajZsROdXehIq4OmajGDh9fpOWcTVPetzBIazojlq7VbIsYTYBd/35d53xyj1U4+qVbc9aNAIyLmfpf8PI5oxjbwLfNOmrH625G6ss3j5aaIs2zivS+yPUZM9paTY8P+6xzGCEAUVS2k4Kbo6tCUuDooM3OHyVzAcCFu9ad7mcu8akmvE5P1MFGZPMXnwToc9p/EjL1MioCwZy60gnfsMiR2fjh5WvCBm7Owsa83mK2L0YUcNxy83vLYVkOucdDEkP5DFp4Qd+PppKm3sx4X4Xlgpy79wAPIITx+gG4jKj5JeQzMfZM9SisPiuQepWaDyUwVsdBi5i+Y+6SnFzOoL6oTQZ+Y/fVmfjqr2t7p6Wa0Vn5bAD6vDthclfqfscft4sG3XYZHL9RuBdv8RvgF9piy8xvKcx5/BMW3QyhNm9EBaNcAB9DxA0kwR8q49pV9OQZHzUMpycOzQqqHkA29q4KYNr/Pic9C4b2HiMRwidfp8aClp5V1ghn5TvWym92tZT4eRYVd5eLDSIb1IP1G3xSdsDed16rhijpmngg71Ma/AzFC3hAPfcph6uIrfKv+4yhVFv/XzKjyo4WlaxE2GxJ+WhryspTaBfSKFpkP49cv+HEQVr0HTWJ3bkZxtWbgMmM6loy+yacTRdd+C6EYkyglIcNeYdgY4BzieNxXeQzquopPc1InlwRJ7zKR2iUwYBcvsRXJ3tlFw46FRnfWi4LkEqZGIqy6ei58JOfkV8mAwE7X2VtFW48RnJ1qG55uqqLqWvFKRsHmFDxT9wEY2a58BIT3f71v9eKLJGAN8/xwx4/QpZShAh4/uYf /ENuMhh2 +9k4+v04jwvrOTGItu+2S4Ndf5uS/uwJOlleJkr//Rwjmttl0fOnbx9Z9KjgPdxlN4K8MiyerqZD6P+faV0mDmr+eQ++0I83xsPrmWsNRww+N/S4nJNPrXKcUxWlZ4mCaHcH7rn+fwdGFJ4fNsUL5DjBj5wfDBlwe3bqA1Rkcko6RDbDASXk4UH1oK3Wi9TTLDcFfpR7haY3leoqMyxlZvHZbG2YMLnDLufvyOyNOTToDUyyy2xCatscV+ieDsC/AfMRJsQv1Fl1xfUijlqsdHflp26WZlGgeV145qEx8haDcxnHSBnVfXfLVZLHYiNX46x86ymnhVMGNSCzxIg7BZ3rYSjuE4sb6uHWccUDTyHSPP1vH7lbCNsnNYXKBz79IE8ZIXOdpEAr63Qm/903fDNgOK5BOWGw0MtNsvpRmOobJzqK4D3/DG6yUDHrJZ9d9Ct4x X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This allows fork() to work with high-granularity mappings. The page table structure is copied such that partially mapped regions will remain partially mapped in the same way for the new process. A page's reference count is incremented for *each* portion of it that is mapped in the page table. For example, if you have a PMD-mapped 1G page, the reference count and mapcount will be incremented by 512. Signed-off-by: James Houghton --- mm/hugetlb.c | 75 ++++++++++++++++++++++++++++++++++------------------ 1 file changed, 50 insertions(+), 25 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 718572444a73..21a5116f509b 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5106,7 +5106,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, struct vm_area_struct *src_vma) { pte_t *src_pte, *dst_pte, entry; - struct page *ptepage; + struct hugetlb_pte src_hpte, dst_hpte; + struct page *ptepage, *hpage; unsigned long addr; bool cow = is_cow_mapping(src_vma->vm_flags); struct hstate *h = hstate_vma(src_vma); @@ -5126,26 +5127,34 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, } else { /* * For shared mappings the vma lock must be held before - * calling hugetlb_walk() in the src vma. 
Otherwise, the - * returned ptep could go away if part of a shared pmd and - * another thread calls huge_pmd_unshare. + * calling hugetlb_full_walk() in the src vma. Otherwise, the + * returned hpte could go away if + * - part of a shared pmd and another thread calls + * - huge_pmd_unshare, or + * - another thread collapses a high-granularity mapping. */ hugetlb_vma_lock_read(src_vma); } last_addr_mask = hugetlb_mask_last_page(h); - for (addr = src_vma->vm_start; addr < src_vma->vm_end; addr += sz) { + addr = src_vma->vm_start; + while (addr < src_vma->vm_end) { spinlock_t *src_ptl, *dst_ptl; - src_pte = hugetlb_walk(src_vma, addr, sz); - if (!src_pte) { - addr |= last_addr_mask; + unsigned long hpte_sz; + + if (hugetlb_full_walk(&src_hpte, src_vma, addr)) { + addr = (addr | last_addr_mask) + sz; continue; } - dst_pte = huge_pte_alloc(dst, dst_vma, addr, sz); - if (!dst_pte) { - ret = -ENOMEM; + ret = hugetlb_full_walk_alloc(&dst_hpte, dst_vma, addr, + hugetlb_pte_size(&src_hpte)); + if (ret) break; - } + + src_pte = src_hpte.ptep; + dst_pte = dst_hpte.ptep; + + hpte_sz = hugetlb_pte_size(&src_hpte); /* * If the pagetables are shared don't copy or take references. @@ -5155,13 +5164,14 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, * another vma. So page_count of ptep page is checked instead * to reliably determine whether pte is shared. */ - if (page_count(virt_to_page(dst_pte)) > 1) { - addr |= last_addr_mask; + if (hugetlb_pte_size(&dst_hpte) == sz && + page_count(virt_to_page(dst_pte)) > 1) { + addr = (addr | last_addr_mask) + sz; continue; } - dst_ptl = huge_pte_lock(h, dst, dst_pte); - src_ptl = huge_pte_lockptr(huge_page_shift(h), src, src_pte); + dst_ptl = hugetlb_pte_lock(&dst_hpte); + src_ptl = hugetlb_pte_lockptr(&src_hpte); spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING); entry = huge_ptep_get(src_pte); again: @@ -5205,10 +5215,15 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, */ if (userfaultfd_wp(dst_vma)) set_huge_pte_at(dst, addr, dst_pte, entry); + } else if (!hugetlb_pte_present_leaf(&src_hpte, entry)) { + /* Retry the walk. */ + spin_unlock(src_ptl); + spin_unlock(dst_ptl); + continue; } else { - entry = huge_ptep_get(src_pte); ptepage = pte_page(entry); - get_page(ptepage); + hpage = compound_head(ptepage); + get_page(hpage); /* * Failing to duplicate the anon rmap is a rare case @@ -5220,25 +5235,31 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, * need to be without the pgtable locks since we could * sleep during the process. 
 		 */
-			if (!PageAnon(ptepage)) {
-				page_dup_file_rmap(ptepage, true);
-			} else if (page_try_dup_anon_rmap(ptepage, true,
+			if (!PageAnon(hpage)) {
+				page_dup_file_rmap(hpage, true);
+			} else if (page_try_dup_anon_rmap(hpage, true,
 							  src_vma)) {
 				pte_t src_pte_old = entry;
 				struct page *new;
+				if (hugetlb_pte_size(&src_hpte) != sz) {
+					put_page(hpage);
+					ret = -EINVAL;
+					break;
+				}
+
 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
 				/* Do not use reserve as it's private owned */
 				new = alloc_huge_page(dst_vma, addr, 1);
 				if (IS_ERR(new)) {
-					put_page(ptepage);
+					put_page(hpage);
 					ret = PTR_ERR(new);
 					break;
 				}
-				copy_user_huge_page(new, ptepage, addr, dst_vma,
+				copy_user_huge_page(new, hpage, addr, dst_vma,
 						    npages);
-				put_page(ptepage);
+				put_page(hpage);

 				/* Install the new huge page if src pte stable */
 				dst_ptl = huge_pte_lock(h, dst, dst_pte);
@@ -5256,6 +5277,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 					hugetlb_install_page(dst_vma, dst_pte, addr, new);
 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
+				addr += hugetlb_pte_size(&src_hpte);
 				continue;
 			}
@@ -5272,10 +5294,13 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 		}
 		set_huge_pte_at(dst, addr, dst_pte, entry);
-		hugetlb_count_add(npages, dst);
+		hugetlb_count_add(
+				hugetlb_pte_size(&dst_hpte) / PAGE_SIZE,
+				dst);
 		}
 		spin_unlock(src_ptl);
 		spin_unlock(dst_ptl);
+		addr += hugetlb_pte_size(&src_hpte);
 	}

 	if (cow) {
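A quick back-of-the-envelope check of the reference-count accounting
described in the commit message of this patch: the userspace-style
sketch below is not part of the series, and the sizes assume the usual
x86-64 1G/2M configuration; it simply computes how many refcount and
mapcount increments a PMD-mapped 1G page takes.

#include <stdio.h>

int main(void)
{
	unsigned long hugepage_size = 1UL << 30;  /* 1G hstate page */
	unsigned long mapping_size = 2UL << 20;   /* mapped with 2M PMDs */

	/* One refcount and one mapcount increment per PMD-level mapping. */
	printf("increments: %lu\n", hugepage_size / mapping_size);  /* prints 512 */
	return 0;
}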
From patchwork Thu Jan 5 10:18:25 2023
Date: Thu, 5 Jan 2023 10:18:25 +0000
Message-ID: <20230105101844.1893104-28-jthoughton@google.com>
Subject: [PATCH 27/46] hugetlb: add HGM support for move_hugetlb_page_tables
From: James Houghton
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 7DDB88000F X-Stat-Signature: a3hq911i83ks3zioerwr5u7qtj4wkihg X-HE-Tag: 1672913974-163857 X-HE-Meta: U2FsdGVkX192UnPGMWoxgKj7E0Q55rnoXKciE7wxlj4nQqXE/vZrz6Rh64Fnlu4oyLMCwmqHmlROlQGyQdC9mIkygICiv/hROzbtOBMxsHUm12n7rdfV/vr9jDnbVx8Ynrp40xSH9+NPqq5mYzDR6I7WS74wNL1FJ/FLy29dzY39SnNsbVTVyjAq+rTU8kN1/3LyfGMjy/8Sl1IDn2O5M3KiukeocMJhWCJ0AUzgQ3MTFw8PF27rHHEaSuhrM7nmGi2mcXqpariLRSmFWkpyywRBN1SzBGlEBj2C/X/YeVMhYk9rQuBHA3gCL66O8nvEF8jdTJXhTpAIsJZmiqRo74cNBW+jMF+unNwP17N5Nd5V6MpYHgTSitnih8oEBkvXheJOv+1EWz4EkOfmISiggYOGm115mIS0EMPD21smdJPL39PaaOmMf3R1JWoP+5Xf4/CuHls7+/rJ1jzTaYftBLhi/D6xji1xo3ne+YhfQOKaGs3x0awXqB907tGfbpS2alTEy/LRYfAYjTm9HyIEvG4zZpkRdxabXRdzWalQys0+TF7TcnYid2I4mknNLqmS60TE9pLSg9+plfmYTKRSH6gXuk0i+PULg0jQj9CO+8eb7r737LySXnxmxVDd/axoMxyDMrb3AuDNCST/rS6OMjm8C6JUXz3/Lg7hi8c2ItFOPeNRvQQ4QXr/Nz0F4kqhXbjg0ro1upOoYgQ548y5iJtxemL9Cq9GJEB1x9MkMqqGRYIOZalB6GO6rdFDLDcShf+n5166TB8MQP0MdSo7zytFa0/+jQoeMeNw8TZF9OrDgP991qujboPIiWWOM9aTljMHUjUivE8NKcnqzEYArFUKgbdjnBKlICY/GhOYHHjXQrPTQjgGzXcA8EP31nk6GsXAYkaHhjsfxW50ZC3lJ3lQIS8wG0o/wAOzlyjyTmegYJRv1MTT4AfUGx5o07xg6LyLFJ8qkquMKWPFSFW I7yzcFzC mTnDP2C+jJKQgwT7MBJHOtLZRnQwktZ/Mbz627hs8f9950kc3rKP1BfzYX3lZV+qRv3+bbrx31358pUI6ijDrbSAwVDlAyjTTPdjU70gIyagw5u/Wneey+U3Bzs+JrJeYjqqjb+z6gbzpZ+gciz0ioox5rijaKSzxU8Ooj9sVS/H8EXrD99o8LnYUmgxXEK+DaNKd9jDAc1JU8s4E0GsqMWuiDbLpn/1QXNF+2TV2Y1CatyUADh0iW9xjPx+hLOE1+ZHOLUPHPNKiC+kwk7NsRoraPha1M4vB/lQ5MebrpJcnByo5PAdCKoqtiautndr6SFhTyzKz+j051/5eEYMOf3qL5KNJbScfSW7TclMlCSDEN5ZyWZJlXbfJuWHV/7Qmo6XxbrkVHH2xv1wwDbRZEoLfoONmRoir/B1A2+q8RQeO/8qTKcOVhjs0Rkju0ENFV42y X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This is very similar to the support that was added to copy_hugetlb_page_range. We simply do a high-granularity walk now, and most of the rest of the code stays the same. 
Signed-off-by: James Houghton --- mm/hugetlb.c | 47 +++++++++++++++++++++++++++-------------------- 1 file changed, 27 insertions(+), 20 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 21a5116f509b..582d14a206b5 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5313,16 +5313,16 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, return ret; } -static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr, - unsigned long new_addr, pte_t *src_pte, pte_t *dst_pte) +static void move_hugetlb_pte(struct vm_area_struct *vma, unsigned long old_addr, + unsigned long new_addr, struct hugetlb_pte *src_hpte, + struct hugetlb_pte *dst_hpte) { - struct hstate *h = hstate_vma(vma); struct mm_struct *mm = vma->vm_mm; spinlock_t *src_ptl, *dst_ptl; pte_t pte; - dst_ptl = huge_pte_lock(h, mm, dst_pte); - src_ptl = huge_pte_lockptr(huge_page_shift(h), mm, src_pte); + dst_ptl = hugetlb_pte_lock(dst_hpte); + src_ptl = hugetlb_pte_lockptr(src_hpte); /* * We don't have to worry about the ordering of src and dst ptlocks @@ -5331,8 +5331,8 @@ static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr, if (src_ptl != dst_ptl) spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING); - pte = huge_ptep_get_and_clear(mm, old_addr, src_pte); - set_huge_pte_at(mm, new_addr, dst_pte, pte); + pte = huge_ptep_get_and_clear(mm, old_addr, src_hpte->ptep); + set_huge_pte_at(mm, new_addr, dst_hpte->ptep, pte); if (src_ptl != dst_ptl) spin_unlock(src_ptl); @@ -5350,9 +5350,9 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma, struct mm_struct *mm = vma->vm_mm; unsigned long old_end = old_addr + len; unsigned long last_addr_mask; - pte_t *src_pte, *dst_pte; struct mmu_notifier_range range; bool shared_pmd = false; + struct hugetlb_pte src_hpte, dst_hpte; mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, old_addr, old_end); @@ -5368,28 +5368,35 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma, /* Prevent race with file truncation */ hugetlb_vma_lock_write(vma); i_mmap_lock_write(mapping); - for (; old_addr < old_end; old_addr += sz, new_addr += sz) { - src_pte = hugetlb_walk(vma, old_addr, sz); - if (!src_pte) { - old_addr |= last_addr_mask; - new_addr |= last_addr_mask; + while (old_addr < old_end) { + if (hugetlb_full_walk(&src_hpte, vma, old_addr)) { + /* The hstate-level PTE wasn't allocated. 
+			 */
+			old_addr = (old_addr | last_addr_mask) + sz;
+			new_addr = (new_addr | last_addr_mask) + sz;
 			continue;
 		}
-		if (huge_pte_none(huge_ptep_get(src_pte)))
+
+		if (huge_pte_none(huge_ptep_get(src_hpte.ptep))) {
+			old_addr += hugetlb_pte_size(&src_hpte);
+			new_addr += hugetlb_pte_size(&src_hpte);
 			continue;
+		}

-		if (huge_pmd_unshare(mm, vma, old_addr, src_pte)) {
+		if (hugetlb_pte_size(&src_hpte) == sz &&
+		    huge_pmd_unshare(mm, vma, old_addr, src_hpte.ptep)) {
 			shared_pmd = true;
-			old_addr |= last_addr_mask;
-			new_addr |= last_addr_mask;
+			old_addr = (old_addr | last_addr_mask) + sz;
+			new_addr = (new_addr | last_addr_mask) + sz;
 			continue;
 		}

-		dst_pte = huge_pte_alloc(mm, new_vma, new_addr, sz);
-		if (!dst_pte)
+		if (hugetlb_full_walk_alloc(&dst_hpte, new_vma, new_addr,
+					    hugetlb_pte_size(&src_hpte)))
 			break;

-		move_huge_pte(vma, old_addr, new_addr, src_pte, dst_pte);
+		move_hugetlb_pte(vma, old_addr, new_addr, &src_hpte, &dst_hpte);
+		old_addr += hugetlb_pte_size(&src_hpte);
+		new_addr += hugetlb_pte_size(&src_hpte);
 	}

 	if (shared_pmd)
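The loop-stepping pattern shared by the last two patches can be
sketched as follows. The struct and helper here are simplified
stand-ins named after the series, not the kernel definitions; the point
is only that the walk advances by whatever size the high-granularity
PTE actually covers instead of a fixed hstate-sized stride.

struct hugetlb_pte {
	unsigned long size;	/* bytes mapped by this PTE level */
};

static unsigned long hugetlb_pte_size(const struct hugetlb_pte *hpte)
{
	return hpte->size;
}

static void walk_range(unsigned long start, unsigned long end,
		       const struct hugetlb_pte *(*full_walk)(unsigned long))
{
	unsigned long addr = start;

	while (addr < end) {
		const struct hugetlb_pte *hpte = full_walk(addr);

		/* ...copy or move the mapping at addr... */
		addr += hugetlb_pte_size(hpte);	/* 4K, 2M, or 1G steps */
	}
}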
From patchwork Thu Jan 5 10:18:26 2023
Date: Thu, 5 Jan 2023 10:18:26 +0000
Message-ID: <20230105101844.1893104-29-jthoughton@google.com>
Subject: [PATCH 28/46] hugetlb: add HGM support for hugetlb_fault and hugetlb_no_page
From: James Houghton
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: E4909100007 X-Rspam-User: X-Stat-Signature: 3n9un7e6h6apyfrbdw6akf767xd8xpwq X-HE-Tag: 1672913975-422413 X-HE-Meta: U2FsdGVkX18SVWiRnyUmzQCNEBs8HmxKpH2tGjFT82OkbKLBH95T1gKEaQboLgIpxe8b3gn7gvgSUtYBtobJGU8TZ0wASmfjH9mwamL/BKZ1nbvoM2pEkcx0/8WzHcS7mL/Mq3g0Kae29EMb8sEya43qGNA+L9v1C8/Pew2EG0VJMmkE7NnZzGLCub5WWBKSCWTELn4DJMTF8sboYKD07Eud74ZsKx3D32aI8XR2rY2Unp86BDFleXjMbGTNiduGjV1N2naO29dsxcyV6ihoqRc9Y8HBd2yp007WgyZRmfx8YmSbLotXHohoXraz7BcLLr7xsidv0mY51Ul8ibbEKzVbeSc316q9nNBNJq4n8Ugz4X9wrY9b4nOiyCXJmZYLx/sqxUCe7erw3aAZ17+fwY+vf2ZkgjUv8FEh+qr7X67YHbgxV30oShM0+ceivOqRM3wi1WWRbG9qSCRAlG2E+F9bVMdRxslDPiMgl5Me22kO5wQRGX0eRgbJCemydsD9fObMvqPqdhoOoAXLR4XBHpZHReK24xNigNwNx4OK49/8Ah5//HdJk0eAQWgZmfsf5e544CkvvdaRD9VY5+VPTRo6rNkrwOjeozeu/vWqFamqIEfWmOHn67llUv+1Zt3cuwki1Yp9h1yVMEw6a9Lt8LQRrzeE/94AbuOaZZYl8uJ/8hKC7TmWfs0/DYCYCF1blXfK6hXrW2JwRpGZJHV0G/jOTlCC3KHstjUFEjBq3ayGqbG4bfDPO+HA246gRPtwt56exQoOxXFci4SfCrUBkZ4adq9PA0paAcg/sqpfbLtw9sc7cL1JwGRnhkNmOEdcAQ1VUN8OX2nnqOpO9S+BnGbBSEC57L0fJnkUXajppgY+6gfY5P8S7KBc8EeRYMtK2LiGIQ4C5JwS4hiDurvIYd9aXXz3ZXVoR8rEbzfFM8Kp9HjG+qnJPXAv9YH6aJqzzlNG3iU3cBypbdHrd1n rwoJcScc +lH3XLel4sl9SYyv+5f+oBT/3u61CyTgSxYgn45Wn9AJQ13nLisx4uWMPxtGdaP+Oq6VmgsbW+nrwPv9TjaKZ5MC/7qfcWqYdT9BpIvNRBaebyjnysrvhcHnNiczbZwrvmQilYQLxO7JDjIHqzaWo8m6rpTBYOs6iJjcWwlMclSYXtpiqIfS7YwyetZPYceWKCcT0ldOM0uXE0sJpguvoi7koc1yGlhI3XRsSbAoyaxVJFvwO48j24DMRcnGG1i1d3APyL75E2uQiJJ5RfQjbwk4Cd78JFxGgb1D+nfk4vjXiKjMqwcD+EAuyP/wGDxWTxhmTc8oJb/1mGWEJeTk9Rae4SfCDbMRHjAGY735dmpqj3R/nxZyqd6F7VwtnrXhf45gzA5t867V0EyUZVs9TpMcWi++CM9Ry3hHJdJOiGb6nnBBjO8Otcs/+Lp2lcWVRyGS/ X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Update the page fault handler to support high-granularity page faults. While handling a page fault on a partially-mapped HugeTLB page, if the PTE we find with hugetlb_pte_walk is none, then we will replace it with a leaf-level PTE to map the page. To give some examples: 1. For a completely unmapped 1G page, it will be mapped with a 1G PUD. 2. For a 1G page that has its first 512M mapped, any faults on the unmapped sections will result in 2M PMDs mapping each unmapped 2M section. 3. For a 1G page that has only its first 4K mapped, a page fault on its second 4K section will get a 4K PTE to map it. Unless high-granularity mappings are created via UFFDIO_CONTINUE, it is impossible for hugetlb_fault to create high-granularity mappings. This commit does not handle hugetlb_wp right now, and it doesn't handle HugeTLB page migration and swap entries. The BUG_ON in huge_pte_alloc is removed, as it is not longer valid when HGM is possible. HGM can be disabled if the VMA lock cannot be allocated after a VMA is split, yet high-granularity mappings may still exist. Signed-off-by: James Houghton --- mm/hugetlb.c | 115 ++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 81 insertions(+), 34 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 582d14a206b5..8e690a22456a 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -117,6 +117,18 @@ enum hugetlb_level hpage_size_to_level(unsigned long sz) return HUGETLB_LEVEL_PGD; } +/* + * Find the subpage that corresponds to `addr` in `hpage`. 
+ */ +static struct page *hugetlb_find_subpage(struct hstate *h, struct page *hpage, + unsigned long addr) +{ + size_t idx = (addr & ~huge_page_mask(h))/PAGE_SIZE; + + BUG_ON(idx >= pages_per_huge_page(h)); + return &hpage[idx]; +} + static inline bool subpool_is_free(struct hugepage_subpool *spool) { if (spool->count) @@ -5926,14 +5938,14 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_area_struct *vma, * Recheck pte with pgtable lock. Returns true if pte didn't change, or * false if pte changed or is changing. */ -static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, - pte_t *ptep, pte_t old_pte) +static bool hugetlb_pte_stable(struct hstate *h, struct hugetlb_pte *hpte, + pte_t old_pte) { spinlock_t *ptl; bool same; - ptl = huge_pte_lock(h, mm, ptep); - same = pte_same(huge_ptep_get(ptep), old_pte); + ptl = hugetlb_pte_lock(hpte); + same = pte_same(huge_ptep_get(hpte->ptep), old_pte); spin_unlock(ptl); return same; @@ -5942,17 +5954,18 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, static vm_fault_t hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma, struct address_space *mapping, pgoff_t idx, - unsigned long address, pte_t *ptep, + unsigned long address, struct hugetlb_pte *hpte, pte_t old_pte, unsigned int flags) { struct hstate *h = hstate_vma(vma); vm_fault_t ret = VM_FAULT_SIGBUS; int anon_rmap = 0; unsigned long size; - struct page *page; + struct page *page, *subpage; pte_t new_pte; spinlock_t *ptl; unsigned long haddr = address & huge_page_mask(h); + unsigned long haddr_hgm = address & hugetlb_pte_mask(hpte); bool new_page, new_pagecache_page = false; u32 hash = hugetlb_fault_mutex_hash(mapping, idx); @@ -5997,7 +6010,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, * never happen on the page after UFFDIO_COPY has * correctly installed the page and returned. */ - if (!hugetlb_pte_stable(h, mm, ptep, old_pte)) { + if (!hugetlb_pte_stable(h, hpte, old_pte)) { ret = 0; goto out; } @@ -6021,7 +6034,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, * here. Before returning error, get ptl and make * sure there really is no pte entry. */ - if (hugetlb_pte_stable(h, mm, ptep, old_pte)) + if (hugetlb_pte_stable(h, hpte, old_pte)) ret = vmf_error(PTR_ERR(page)); else ret = 0; @@ -6071,7 +6084,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, unlock_page(page); put_page(page); /* See comment in userfaultfd_missing() block above */ - if (!hugetlb_pte_stable(h, mm, ptep, old_pte)) { + if (!hugetlb_pte_stable(h, hpte, old_pte)) { ret = 0; goto out; } @@ -6096,30 +6109,43 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, vma_end_reservation(h, vma, haddr); } - ptl = huge_pte_lock(h, mm, ptep); + ptl = hugetlb_pte_lock(hpte); ret = 0; - /* If pte changed from under us, retry */ - if (!pte_same(huge_ptep_get(ptep), old_pte)) + /* + * If pte changed from under us, retry. + * + * When dealing with high-granularity-mapped PTEs, it's possible that + * a non-contiguous PTE within our contiguous PTE group gets populated, + * in which case, we need to retry here. This is NOT caught here, and + * will need to be addressed when HGM is supported for architectures + * that support contiguous PTEs. 
+ */ + if (!pte_same(huge_ptep_get(hpte->ptep), old_pte)) goto backout; if (anon_rmap) hugepage_add_new_anon_rmap(page, vma, haddr); else page_dup_file_rmap(page, true); - new_pte = make_huge_pte(vma, page, ((vma->vm_flags & VM_WRITE) - && (vma->vm_flags & VM_SHARED))); + + subpage = hugetlb_find_subpage(h, page, haddr_hgm); + new_pte = make_huge_pte_with_shift(vma, subpage, + ((vma->vm_flags & VM_WRITE) + && (vma->vm_flags & VM_SHARED)), + hpte->shift); /* * If this pte was previously wr-protected, keep it wr-protected even * if populated. */ if (unlikely(pte_marker_uffd_wp(old_pte))) new_pte = huge_pte_mkuffd_wp(new_pte); - set_huge_pte_at(mm, haddr, ptep, new_pte); + set_huge_pte_at(mm, haddr_hgm, hpte->ptep, new_pte); - hugetlb_count_add(pages_per_huge_page(h), mm); + hugetlb_count_add(hugetlb_pte_size(hpte) / PAGE_SIZE, mm); if ((flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) { + WARN_ON_ONCE(hugetlb_pte_size(hpte) != huge_page_size(h)); /* Optimization, do the COW without a second fault */ - ret = hugetlb_wp(mm, vma, address, ptep, flags, page, ptl); + ret = hugetlb_wp(mm, vma, address, hpte->ptep, flags, page, ptl); } spin_unlock(ptl); @@ -6176,17 +6202,20 @@ u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx) vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, unsigned int flags) { - pte_t *ptep, entry; + pte_t entry; spinlock_t *ptl; vm_fault_t ret; u32 hash; pgoff_t idx; struct page *page = NULL; + struct page *subpage = NULL; struct page *pagecache_page = NULL; struct hstate *h = hstate_vma(vma); struct address_space *mapping; int need_wait_lock = 0; unsigned long haddr = address & huge_page_mask(h); + unsigned long haddr_hgm; + struct hugetlb_pte hpte; /* * Serialize hugepage allocation and instantiation, so that we don't @@ -6200,26 +6229,26 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, /* * Acquire vma lock before calling huge_pte_alloc and hold - * until finished with ptep. This prevents huge_pmd_unshare from - * being called elsewhere and making the ptep no longer valid. + * until finished with hpte. This prevents huge_pmd_unshare from + * being called elsewhere and making the hpte no longer valid. */ hugetlb_vma_lock_read(vma); - ptep = huge_pte_alloc(mm, vma, haddr, huge_page_size(h)); - if (!ptep) { + if (hugetlb_full_walk_alloc(&hpte, vma, address, 0)) { hugetlb_vma_unlock_read(vma); mutex_unlock(&hugetlb_fault_mutex_table[hash]); return VM_FAULT_OOM; } - entry = huge_ptep_get(ptep); + entry = huge_ptep_get(hpte.ptep); /* PTE markers should be handled the same way as none pte */ - if (huge_pte_none_mostly(entry)) + if (huge_pte_none_mostly(entry)) { /* * hugetlb_no_page will drop vma lock and hugetlb fault * mutex internally, which make us return immediately. */ - return hugetlb_no_page(mm, vma, mapping, idx, address, ptep, + return hugetlb_no_page(mm, vma, mapping, idx, address, &hpte, entry, flags); + } ret = 0; @@ -6240,7 +6269,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, * be released there. 
*/ mutex_unlock(&hugetlb_fault_mutex_table[hash]); - migration_entry_wait_huge(vma, ptep); + migration_entry_wait_huge(vma, hpte.ptep); return 0; } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) ret = VM_FAULT_HWPOISON_LARGE | @@ -6248,6 +6277,10 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, goto out_mutex; } + if (!hugetlb_pte_present_leaf(&hpte, entry)) + /* We raced with someone splitting the entry. */ + goto out_mutex; + /* * If we are going to COW/unshare the mapping later, we examine the * pending reservations for this page now. This will ensure that any @@ -6267,14 +6300,17 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, pagecache_page = find_lock_page(mapping, idx); } - ptl = huge_pte_lock(h, mm, ptep); + ptl = hugetlb_pte_lock(&hpte); /* Check for a racing update before calling hugetlb_wp() */ - if (unlikely(!pte_same(entry, huge_ptep_get(ptep)))) + if (unlikely(!pte_same(entry, huge_ptep_get(hpte.ptep)))) goto out_ptl; + /* haddr_hgm is the base address of the region that hpte maps. */ + haddr_hgm = address & hugetlb_pte_mask(&hpte); + /* Handle userfault-wp first, before trying to lock more pages */ - if (userfaultfd_wp(vma) && huge_pte_uffd_wp(huge_ptep_get(ptep)) && + if (userfaultfd_wp(vma) && huge_pte_uffd_wp(entry) && (flags & FAULT_FLAG_WRITE) && !huge_pte_write(entry)) { struct vm_fault vmf = { .vma = vma, @@ -6298,7 +6334,8 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, * pagecache_page, so here we need take the former one * when page != pagecache_page or !pagecache_page. */ - page = pte_page(entry); + subpage = pte_page(entry); + page = compound_head(subpage); if (page != pagecache_page) if (!trylock_page(page)) { need_wait_lock = 1; @@ -6309,7 +6346,9 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, if (flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) { if (!huge_pte_write(entry)) { - ret = hugetlb_wp(mm, vma, address, ptep, flags, + WARN_ON_ONCE(hugetlb_pte_size(&hpte) != + huge_page_size(h)); + ret = hugetlb_wp(mm, vma, address, hpte.ptep, flags, pagecache_page, ptl); goto out_put_page; } else if (likely(flags & FAULT_FLAG_WRITE)) { @@ -6317,9 +6356,9 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, } } entry = pte_mkyoung(entry); - if (huge_ptep_set_access_flags(vma, haddr, ptep, entry, + if (huge_ptep_set_access_flags(vma, haddr_hgm, hpte.ptep, entry, flags & FAULT_FLAG_WRITE)) - update_mmu_cache(vma, haddr, ptep); + update_mmu_cache(vma, haddr_hgm, hpte.ptep); out_put_page: if (page != pagecache_page) unlock_page(page); @@ -7523,6 +7562,9 @@ int hugetlb_full_walk(struct hugetlb_pte *hpte, /* * hugetlb_full_walk_alloc - do a high-granularity walk, potentially allocate * new PTEs. + * + * If @target_sz is 0, then only attempt to allocate the hstate-level PTE, and + * walk as far as we can go. 
+ */
 int hugetlb_full_walk_alloc(struct hugetlb_pte *hpte,
 			    struct vm_area_struct *vma,
@@ -7541,6 +7583,12 @@ int hugetlb_full_walk_alloc(struct hugetlb_pte *hpte,
 	if (!ptep)
 		return -ENOMEM;

+	if (!target_sz) {
+		WARN_ON_ONCE(hugetlb_hgm_walk_uninit(hpte, ptep, vma, addr,
+						     PAGE_SIZE, false));
+		return 0;
+	}
+
 	return hugetlb_hgm_walk_uninit(hpte, ptep, vma, addr, target_sz,
 				       true);
 }

@@ -7569,7 +7617,6 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
 			pte = (pte_t *)pmd_alloc(mm, pud, addr);
 		}
 	}
-	BUG_ON(pte && pte_present(*pte) && !pte_huge(*pte));

 	return pte;
 }
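The subpage selection added by this patch (hugetlb_find_subpage) boils
down to an index computation on the offset within the huge page. The
sketch below models it in userspace with assumed x86-64 constants and a
hypothetical 1G-aligned mapping base; it is an illustration, not the
kernel code.

#include <stdio.h>

#define PAGE_SIZE	4096UL
#define HPAGE_SIZE	(1UL << 30)		/* 1G hstate */
#define HPAGE_MASK	(~(HPAGE_SIZE - 1))

/* Offset of addr within its huge page, expressed in 4K subpages. */
static unsigned long subpage_index(unsigned long addr)
{
	return (addr & ~HPAGE_MASK) / PAGE_SIZE;
}

int main(void)
{
	unsigned long base = 0x40000000UL;	/* hypothetical 1G-aligned mapping */

	/* A fault on the second 4K section selects subpage 1 (example 3 above). */
	printf("%lu\n", subpage_index(base + PAGE_SIZE));
	return 0;
}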
From patchwork Thu Jan 5 10:18:27 2023
Date: Thu, 5 Jan 2023 10:18:27 +0000
Message-ID: <20230105101844.1893104-30-jthoughton@google.com>
Subject: [PATCH 29/46] rmap: in try_to_{migrate,unmap}_one, check head page for page flags
From: James Houghton
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Queue-Id: 97B6340019 X-Rspamd-Server: rspam01 X-Stat-Signature: xzh4o8k16rid14gudqfz3h5jcoradi83 X-HE-Tag: 1672913977-465452 X-HE-Meta: U2FsdGVkX1/VwLOnKuy/yoturTTuD3ZB8YhUv2ewPv5WcdQzzJb6mTehkViDjt3R7uFHtr0MG2Lex2b7a2Pvvwauo1nWH+Qm93gd4UNmgucsLLusc/Tl/T+h9GPOCzMpB4DiSSqqSn76FP0t9HBp0jbAsGwMAY65vM50EiySyy7dfHMeldbWgAz6P/nYh6+IwZziq7eb7UPAUkj8779ThKliKnTxvSLlfYyfU5gN52BngwZ160uLHo1yEQrVElZB4JMUzcjxX2Ad2zPby3Jj7fnxos12G/u5/JnlLqdOXAVCPCAE6+iHDxCngVNtxA72U52Y9LJUEI1kyH5qnjcc/qlJTa7hqmfg74a3bFKTtmqwL+lzUIlS+/4l/+ZKMH4Od92J7yh0nUEkG6yHmoWtveQi87FfYJ/miQx9q7kspyMdh4gYKsjVcw6hQh4jDzYebqiuIQw2kWzYy1AB5Bi9KgGIVfPsEYNRztPeoIKAQADL8OoAg0MTHja95fIqos8RvjrKhIHAN35fAqJ45fitOfqH5fH6GschTY0zX3v9KUbuf1ZnI+PNxzM1AOyTev0vPvY/1OeH1FYgvCkqWZckgDuKwHQac8dc9GrMRFH/4kNrO7KqLOMuHcgudNM7L8UMfJV/0kGlmtT8DF/qgGmKWZyX7/EB19hD+3P0gxIX+3V2Z5z21blNO/b7PDA1Lprgh+ZS2oaum3cg24kB2prnfkxLHV7M6vvvWsEZZvxTqO/VyVrqEPSWHR7QaIXYpZ9vqsbISD0VTVVx0hTxb7LzEDK1N7eBCz5/XEEf5jTckqyW4Oer9YzBZIqyeBgV39aqn2bPRWiFiv3Qkpf+1+uRo1Mi7vW3Gg81e+DRwzG2l2qBV9gnAlH+zOF4Gm+AMe0vnzvZyHINbewwMUYrOC75+EGpxhZMKVwJDQNmT46P1/tcHztFTD3m64d0HCrfZPpb4b7ERddBPKd/8JUtTnl LmHvtxfp AF2WtyWDlnKTMpTCdsfw2m590RNDdvHWtnX0FG2Ozd2kimCHlIgwxyfc3BmpTuS2Uu9VIC8TYRHXSwFKSQ4yvpaLI1HjTrhw8RZE29RMCYlihtJgUXedn0qcEUa7vbFai3fQ39yqzSYSvD8jLrX5ecd7eOPL3lVsPg2+GUJ7EJVvkN0jVZzpVbQYo6rWXYPkyRxxn8EuWhO9Vvs81YA2DAhCuQxN+lH3jVs/qnj+L8NplxrAJ9znhfQLx0K9HOWo92IOs2QKT1SPlLyMJQhBGtStlb3RvJLauOvIu4Q1AbgIAAULPT0Iwk3hInulzBOh1fDyX5g7gmf352CfVgB9ciXoNUNmiipaG1OOnroD/YFwlA8HJ1F9AqneESu0oIr1TxhzjLd6F8o5tChbrEFuxU4jOyhUcZiNHYEj9o5W2utghzkk0Ki5yNKYpPpGcnbWjWQcj X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The main complication here is that HugeTLB pages have their poison status stored in the head page as the HWPoison page flag. Because HugeTLB high-granularity mapping can create PTEs that point to subpages instead of always the head of a hugepage, we need to check the compound_head for page flags. Signed-off-by: James Houghton --- mm/rmap.c | 34 ++++++++++++++++++++++++++-------- 1 file changed, 26 insertions(+), 8 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 076ea77010e5..a6004d6b0415 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1456,10 +1456,11 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, struct mm_struct *mm = vma->vm_mm; DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0); pte_t pteval; - struct page *subpage; + struct page *subpage, *page_flags_page; bool anon_exclusive, ret = true; struct mmu_notifier_range range; enum ttu_flags flags = (enum ttu_flags)(long)arg; + bool page_poisoned; /* * When racing against e.g. zap_pte_range() on another cpu, @@ -1512,9 +1513,17 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, subpage = folio_page(folio, pte_pfn(*pvmw.pte) - folio_pfn(folio)); + /* + * We check the page flags of HugeTLB pages by checking the + * head page. + */ + page_flags_page = folio_test_hugetlb(folio) + ? 
&folio->page + : subpage; + page_poisoned = PageHWPoison(page_flags_page); address = pvmw.address; anon_exclusive = folio_test_anon(folio) && - PageAnonExclusive(subpage); + PageAnonExclusive(page_flags_page); if (folio_test_hugetlb(folio)) { bool anon = folio_test_anon(folio); @@ -1523,7 +1532,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, * The try_to_unmap() is only passed a hugetlb page * in the case where the hugetlb page is poisoned. */ - VM_BUG_ON_PAGE(!PageHWPoison(subpage), subpage); + VM_BUG_ON_FOLIO(!page_poisoned, folio); /* * huge_pmd_unshare may unmap an entire PMD page. * There is no way of knowing exactly which PMDs may @@ -1606,7 +1615,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, /* Update high watermark before we lower rss */ update_hiwater_rss(mm); - if (PageHWPoison(subpage) && !(flags & TTU_IGNORE_HWPOISON)) { + if (page_poisoned && !(flags & TTU_IGNORE_HWPOISON)) { pteval = swp_entry_to_pte(make_hwpoison_entry(subpage)); if (folio_test_hugetlb(folio)) { hugetlb_count_sub(1UL << pvmw.pte_order, mm); @@ -1632,7 +1641,9 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, mmu_notifier_invalidate_range(mm, address, address + PAGE_SIZE); } else if (folio_test_anon(folio)) { - swp_entry_t entry = { .val = page_private(subpage) }; + swp_entry_t entry = { + .val = page_private(page_flags_page) + }; pte_t swp_pte; /* * Store the swap location in the pte. @@ -1831,7 +1842,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, struct mm_struct *mm = vma->vm_mm; DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0); pte_t pteval; - struct page *subpage; + struct page *subpage, *page_flags_page; bool anon_exclusive, ret = true; struct mmu_notifier_range range; enum ttu_flags flags = (enum ttu_flags)(long)arg; @@ -1911,9 +1922,16 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, subpage = folio_page(folio, pte_pfn(*pvmw.pte) - folio_pfn(folio)); } + /* + * We check the page flags of HugeTLB pages by checking the + * head page. + */ + page_flags_page = folio_test_hugetlb(folio) + ? &folio->page + : subpage; address = pvmw.address; anon_exclusive = folio_test_anon(folio) && - PageAnonExclusive(subpage); + PageAnonExclusive(page_flags_page); if (folio_test_hugetlb(folio)) { bool anon = folio_test_anon(folio); @@ -2032,7 +2050,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, * No need to invalidate here it will synchronize on * against the special swap migration pte. 
 			 */
-		} else if (PageHWPoison(subpage)) {
+		} else if (PageHWPoison(page_flags_page)) {
 			pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
 			if (folio_test_hugetlb(folio)) {
 				hugetlb_count_sub(1L << pvmw.pte_order, mm);
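A minimal model of the head-page rule this patch applies, with stand-in
types rather than the kernel's struct page and helpers: per-hugepage
flags such as HWPoison are only meaningful on the head page, so a
subpage pointer obtained from an HGM PTE has to be folded back to the
head before the flag is tested.

#include <stdbool.h>

struct page {
	unsigned long flags;
	struct page *head;	/* points to itself for the head page */
};

static bool page_hwpoison(const struct page *page)
{
	return page->flags & 1UL;	/* pretend bit 0 is HWPoison */
}

static bool hugetlb_subpage_poisoned(const struct page *subpage)
{
	/* Check the head page, not the subpage the HGM PTE points at. */
	return page_hwpoison(subpage->head);
}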
From patchwork Thu Jan 5 10:18:28 2023
Date: Thu, 5 Jan 2023 10:18:28 +0000
Message-ID: <20230105101844.1893104-31-jthoughton@google.com>
Subject: [PATCH 30/46] hugetlb: add high-granularity migration support
From: James Houghton
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: C4EA4C000A X-Stat-Signature: x864kdwisa1oekj5m6s9ysjtd51ykur3 X-HE-Tag: 1672913978-458828 X-HE-Meta: U2FsdGVkX19dzkkH+WPRUkX+/A0ia1CNmJ+KdP7yoYgwe59KM8dnRFxu8amaQ0CO8DcZV4CrAxSZIDtI5A0mTZGQAkLEvZHnSp0m+7nNG0dQYtd8pJ1lj93wgQpsv9BkeOA9Vm8Zqm3WxToYDPqhZx41PCy1qV9NlVKVJ8lVJD7NCM3viXrzwPlOppMSkH42U5vh0eiKODG8o6Iogd0l5fba/NB0sJ4bL6K7ly6uxfOkXk9p7Ran0x0mKTc53RDIN6BLf2zVEPdFy+hLB0WjhVkGGAOfp2+oe/r9JlvLGh8j7VqdS/p9eBXrj2w24Y4AqDSfJzU1/pVJFclP+geew9jQZmXT/0T5FW75zbmzm7hnh9EjinCrjk3OTCqNfY2/A7pa6zbCy4CRKuD3xTayPqJ5UWGPIlUXO+pKSTRTo0EeH7yFbTdSUwGqeJrvkfg6LN94CCIryyBBTPzQ5iZkuwMMR5EU7JH1F+yh3bpPm/qYZqBSu0Wj3WYLOJfTKtkIU1yPnvofdC119QthyxAySpnRihNKK2LTHPTEVBzm7+dzWyHoaq6Y4u4rNFsMDKHmGry0dp6c3G3bS463SCZoVAwnG1XB5v6ImArwE6ldU+CKCkjGzAwJZKr0cmVvT4p+M2zhoiqJiI0LBe5/Qk6OUrMuvN4f8IV6VFsIhwq9jDTDB7zM++2YLMbKfFSr4nBXcCiNBMYlWRI5ptbHruyihVUCsK1zKgMnsn/kiBBFpaRoHWUK8ZsKZEtvb01H0Bgd7D2Smia2IMJZn8d1pkS2B6Yapgc3BSTpwmfIHNumPeFBJ1pyUIDIihbAP9f7gzQDjSyJi8NsHOfgwksUliAYEGgTjrL+fo2sXqxY3bOuUb/k0YNEdgjfDd9rRYDfUg9mcRTgNImIYNeTvihgbIJTZP+tlPjiS0ov5rPo/1prrT3oI8XtE2wRVPhqkMI1FEnpwn24NdOkGJ3R9vPbmP8 7KDBCffa zc+1/7gnQ30ygm5yaXwEnViZX6l7/qlMsWK0rEJIcka7lOC53RZ5cbE6XuCHbx9fBrhzuIXA6bACoZ3XLT+qadGmKAPDcU3hyS4/KWj6nkqDjOxX+38++D6iVA35TD9XacodOjJBZJPDtAo/3rqGgyk6KdENGlHIBGTlPL7aIxPzmOXzgvUI4iRMoYhI3rWXVpNjE+R930OGEelqDhfpn9SMlgY5mDk/E7qeSF6eMj7oy+uo1loSQ/yrIU4RfiWBJRXJ+3oEmjPtZuWYG27X9ihBZA3eqgaZx7JCSUBBx9VfungrB5SEFJ51etmOPHlvlJDjzilnNDXQUkUOT5XAIAvb0Yzqj2WDUc+zcOm0w53+jFvRuRyuuF0E03som4IPuZLrZ1kv59a6VLNAHZl70QITYBQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: To prevent queueing a hugepage for migration multiple times, we use last_page to keep track of the last page we saw in queue_pages_hugetlb, and if the page we're looking at is last_page, then we skip it. For the non-hugetlb cases, last_page, although unused, is still updated so that it has a consistent meaning with the hugetlb case. This commit adds a check in hugetlb_fault for high-granularity migration PTEs. 
Signed-off-by: James Houghton --- include/linux/swapops.h | 8 ++++++-- mm/hugetlb.c | 2 +- mm/mempolicy.c | 24 +++++++++++++++++++----- mm/migrate.c | 18 ++++++++++-------- 4 files changed, 36 insertions(+), 16 deletions(-) diff --git a/include/linux/swapops.h b/include/linux/swapops.h index 3a451b7afcb3..6ef80763e629 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -68,6 +68,8 @@ static inline bool is_pfn_swap_entry(swp_entry_t entry); +struct hugetlb_pte; + /* Clear all flags but only keep swp_entry_t related information */ static inline pte_t pte_swp_clear_flags(pte_t pte) { @@ -339,7 +341,8 @@ extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd, #ifdef CONFIG_HUGETLB_PAGE extern void __migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *ptep, spinlock_t *ptl); -extern void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte); +extern void migration_entry_wait_huge(struct vm_area_struct *vma, + struct hugetlb_pte *hpte); #endif /* CONFIG_HUGETLB_PAGE */ #else /* CONFIG_MIGRATION */ static inline swp_entry_t make_readable_migration_entry(pgoff_t offset) @@ -369,7 +372,8 @@ static inline void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd, #ifdef CONFIG_HUGETLB_PAGE static inline void __migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *ptep, spinlock_t *ptl) { } -static inline void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte) { } +static inline void migration_entry_wait_huge(struct vm_area_struct *vma, + struct hugetlb_pte *hpte) { } #endif /* CONFIG_HUGETLB_PAGE */ static inline int is_writable_migration_entry(swp_entry_t entry) { diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 8e690a22456a..2fb95ecafc63 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6269,7 +6269,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, * be released there. */ mutex_unlock(&hugetlb_fault_mutex_table[hash]); - migration_entry_wait_huge(vma, hpte.ptep); + migration_entry_wait_huge(vma, &hpte); return 0; } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) ret = VM_FAULT_HWPOISON_LARGE | diff --git a/mm/mempolicy.c b/mm/mempolicy.c index e5859ed34e90..6c4c3c923fa2 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -424,6 +424,7 @@ struct queue_pages { unsigned long start; unsigned long end; struct vm_area_struct *first; + struct page *last_page; }; /* @@ -475,6 +476,7 @@ static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr, flags = qp->flags; /* go to thp migration */ if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) { + qp->last_page = page; if (!vma_migratable(walk->vma) || migrate_page_add(page, qp->pagelist, flags)) { ret = 1; @@ -532,6 +534,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr, continue; if (!queue_pages_required(page, qp)) continue; + if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) { /* MPOL_MF_STRICT must be specified if we get here */ if (!vma_migratable(vma)) { @@ -539,6 +542,8 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr, break; } + qp->last_page = page; + /* * Do not abort immediately since there may be * temporary off LRU pages in the range. Still @@ -570,15 +575,22 @@ static int queue_pages_hugetlb(struct hugetlb_pte *hpte, spinlock_t *ptl; pte_t entry; - /* We don't migrate high-granularity HugeTLB mappings for now. 
*/ - if (hugetlb_hgm_enabled(walk->vma)) - return -EINVAL; - ptl = hugetlb_pte_lock(hpte); entry = huge_ptep_get(hpte->ptep); if (!pte_present(entry)) goto unlock; - page = pte_page(entry); + + if (!hugetlb_pte_present_leaf(hpte, entry)) { + ret = -EAGAIN; + goto unlock; + } + + page = compound_head(pte_page(entry)); + + /* We already queued this page with another high-granularity PTE. */ + if (page == qp->last_page) + goto unlock; + if (!queue_pages_required(page, qp)) goto unlock; @@ -605,6 +617,7 @@ static int queue_pages_hugetlb(struct hugetlb_pte *hpte, /* With MPOL_MF_MOVE, we migrate only unshared hugepage. */ if (flags & (MPOL_MF_MOVE_ALL) || (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) { + qp->last_page = page; if (isolate_hugetlb(page, qp->pagelist) && (flags & MPOL_MF_STRICT)) /* @@ -739,6 +752,7 @@ queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end, .start = start, .end = end, .first = NULL, + .last_page = NULL, }; err = walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp); diff --git a/mm/migrate.c b/mm/migrate.c index 0062689f4878..c30647b75459 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -195,6 +195,9 @@ static bool remove_migration_pte(struct folio *folio, /* pgoff is invalid for ksm pages, but they are never large */ if (folio_test_large(folio) && !folio_test_hugetlb(folio)) idx = linear_page_index(vma, pvmw.address) - pvmw.pgoff; + else if (folio_test_hugetlb(folio)) + idx = (pvmw.address & ~huge_page_mask(hstate_vma(vma)))/ + PAGE_SIZE; new = folio_page(folio, idx); #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION @@ -244,14 +247,15 @@ static bool remove_migration_pte(struct folio *folio, #ifdef CONFIG_HUGETLB_PAGE if (folio_test_hugetlb(folio)) { + struct page *hpage = folio_page(folio, 0); unsigned int shift = pvmw.pte_order + PAGE_SHIFT; pte = arch_make_huge_pte(pte, shift, vma->vm_flags); if (folio_test_anon(folio)) - hugepage_add_anon_rmap(new, vma, pvmw.address, + hugepage_add_anon_rmap(hpage, vma, pvmw.address, rmap_flags); else - page_dup_file_rmap(new, true); + page_dup_file_rmap(hpage, true); set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte); } else #endif @@ -267,7 +271,7 @@ static bool remove_migration_pte(struct folio *folio, mlock_page_drain_local(); trace_remove_migration_pte(pvmw.address, pte_val(pte), - compound_order(new)); + pvmw.pte_order); /* No need to invalidate - it was non-present before */ update_mmu_cache(vma, pvmw.address, pvmw.pte); @@ -358,12 +362,10 @@ void __migration_entry_wait_huge(struct vm_area_struct *vma, } } -void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte) +void migration_entry_wait_huge(struct vm_area_struct *vma, + struct hugetlb_pte *hpte) { - spinlock_t *ptl = huge_pte_lockptr(huge_page_shift(hstate_vma(vma)), - vma->vm_mm, pte); - - __migration_entry_wait_huge(vma, pte, ptl); + __migration_entry_wait_huge(vma, hpte->ptep, hpte->ptl); } #endif From patchwork Thu Jan 5 10:18:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089665 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7484FC3DA7A for ; Thu, 5 Jan 2023 10:19:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C9F8494001C; Thu, 5 Jan 2023 05:19:41 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 
Date: Thu, 5 Jan 2023 10:18:29 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-32-jthoughton@google.com>
Subject: [PATCH 31/46] hugetlb: sort hstates in hugetlb_init_hstates
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton

When using HugeTLB high-granularity mapping, we need to go through the supported hugepage sizes in decreasing order so that we pick the largest size that
works. Consider the case where we're faulting in a 1G hugepage for the first time: we want hugetlb_fault/hugetlb_no_page to map it with a PUD. By going through the sizes in decreasing order, we will find that PUD_SIZE works before finding out that PMD_SIZE or PAGE_SIZE work too. This commit also changes bootmem hugepages from storing hstate pointers directly to storing the hstate sizes. The hstate pointers used for boot-time-allocated hugepages become invalid after we sort the hstates. `gather_bootmem_prealloc`, called after the hstates have been sorted, now converts the size to the correct hstate. Signed-off-by: James Houghton --- include/linux/hugetlb.h | 2 +- mm/hugetlb.c | 49 ++++++++++++++++++++++++++++++++--------- 2 files changed, 40 insertions(+), 11 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index daf993fdbc38..8a664a9dd0a8 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -789,7 +789,7 @@ struct hstate { struct huge_bootmem_page { struct list_head list; - struct hstate *hstate; + unsigned long hstate_sz; }; int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list); diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 2fb95ecafc63..1e9e149587b3 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include @@ -49,6 +50,10 @@ int hugetlb_max_hstate __read_mostly; unsigned int default_hstate_idx; +/* + * After hugetlb_init_hstates is called, hstates will be sorted from largest + * to smallest. + */ struct hstate hstates[HUGE_MAX_HSTATE]; #ifdef CONFIG_CMA @@ -3347,7 +3352,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid) /* Put them into a private list first because mem_map is not up yet */ INIT_LIST_HEAD(&m->list); list_add(&m->list, &huge_boot_pages); - m->hstate = h; + m->hstate_sz = huge_page_size(h); return 1; } @@ -3362,7 +3367,7 @@ static void __init gather_bootmem_prealloc(void) list_for_each_entry(m, &huge_boot_pages, list) { struct page *page = virt_to_page(m); struct folio *folio = page_folio(page); - struct hstate *h = m->hstate; + struct hstate *h = size_to_hstate(m->hstate_sz); VM_BUG_ON(!hstate_is_gigantic(h)); WARN_ON(folio_ref_count(folio) != 1); @@ -3478,9 +3483,38 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h) kfree(node_alloc_noretry); } +static int compare_hstates_decreasing(const void *a, const void *b) +{ + unsigned long sz_a = huge_page_size((const struct hstate *)a); + unsigned long sz_b = huge_page_size((const struct hstate *)b); + + if (sz_a < sz_b) + return 1; + if (sz_a > sz_b) + return -1; + return 0; +} + +static void sort_hstates(void) +{ + unsigned long default_hstate_sz = huge_page_size(&default_hstate); + + /* Sort from largest to smallest. */ + sort(hstates, hugetlb_max_hstate, sizeof(*hstates), + compare_hstates_decreasing, NULL); + + /* + * We may have changed the location of the default hstate, so we need to + * update it. 
+ */ + default_hstate_idx = hstate_index(size_to_hstate(default_hstate_sz)); +} + static void __init hugetlb_init_hstates(void) { - struct hstate *h, *h2; + struct hstate *h; + + sort_hstates(); for_each_hstate(h) { /* oversize hugepages were init'ed in early boot */ @@ -3499,13 +3533,8 @@ static void __init hugetlb_init_hstates(void) continue; if (hugetlb_cma_size && h->order <= HUGETLB_PAGE_ORDER) continue; - for_each_hstate(h2) { - if (h2 == h) - continue; - if (h2->order < h->order && - h2->order > h->demote_order) - h->demote_order = h2->order; - } + if (h - 1 >= &hstates[0]) + h->demote_order = huge_page_order(h - 1); } } From patchwork Thu Jan 5 10:18:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089666 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B54CEC3DA7D for ; Thu, 5 Jan 2023 10:19:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 685908E0003; Thu, 5 Jan 2023 05:19:43 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 59A71940008; Thu, 5 Jan 2023 05:19:43 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3C3568E0005; Thu, 5 Jan 2023 05:19:43 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 271D28E0003 for ; Thu, 5 Jan 2023 05:19:43 -0500 (EST) Received: from smtpin13.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 02962140217 for ; Thu, 5 Jan 2023 10:19:42 +0000 (UTC) X-FDA: 80320349046.13.41A30E7 Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) by imf23.hostedemail.com (Postfix) with ESMTP id 773E6140007 for ; Thu, 5 Jan 2023 10:19:41 +0000 (UTC) Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=VKrl9diS; spf=pass (imf23.hostedemail.com: domain of 3PKS2YwoKCIMq0ov1no0vunvvnsl.jvtspu14-ttr2hjr.vyn@flex--jthoughton.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3PKS2YwoKCIMq0ov1no0vunvvnsl.jvtspu14-ttr2hjr.vyn@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672913981; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=fjKcTgCHQU9dL6HEL7BjnCeJzokWebjD5rXkoH4g/pM=; b=NDSzMCPl6duNvznpMQM8q1Gh0eDd2pgOGHkiasVZB3VuHhMrlWdJsfKWV0miTMahI161qu JU38O7k93T40xhdQY8z6FlXTTV1GHVeQHfcfr8g0W5JLg1Zl9fibAGrl4nPVgIkYlcHkEQ 6kjjGvlbbSktdULT+Y8KmzBfLfsS0MM= ARC-Authentication-Results: i=1; imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=VKrl9diS; spf=pass (imf23.hostedemail.com: domain of 3PKS2YwoKCIMq0ov1no0vunvvnsl.jvtspu14-ttr2hjr.vyn@flex--jthoughton.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3PKS2YwoKCIMq0ov1no0vunvvnsl.jvtspu14-ttr2hjr.vyn@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com 
Date: Thu, 5 Jan 2023 10:18:30 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-33-jthoughton@google.com>
Subject: [PATCH 32/46] hugetlb: add for_each_hgm_shift
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . 
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 773E6140007 X-Stat-Signature: pryeajch3epqfumoz7gd77nc37fpupbu X-Rspam-User: X-HE-Tag: 1672913981-408844 X-HE-Meta: U2FsdGVkX1+rrOY7Vmxvq6bupXVTjTxxpJ/LI7GcvldB5cxertuobAa3LMOGaLOMMIeziLozqq2w+WEzGusKisQJ9U0vso4+0XJQ1vAghHwQKGKkXrJJv93E4GAX1RE4xMkOEFkQBAPy4YxYAdhlNSjgn0lVVB9f4VyFUJiQKiB73x7aTjlqF2RXzVEpCw5LhG4Vf2tziLpMerTxRrZ8CIhgQMx3viyLw0cyrlV4AMBuj7lBRc3Sq4WJjnhYmglwt8yAaOptBdsRaZnwVxGApFAsvfwrcgwcgiMkghWIkfHG2Q+wZnzRg4vBT7q6KtEtsR191KJj2XAV2AzKcRTXVZjnB0xXJmiqk3cQhL5qXVqrU54ZT2Z4YGOwmnpwam5LuqVpzACOd16Zjgd/ETvmFlMIZprfA6dskrg/3iTxS9qUjnPo+e7rVjaA7RK6+bXbKn/Pggab25IdQ6/i038mBZK1Dlgf9OcULQcD9BjrLvPZduOv+SsxAX+aV+mjXAga9mL95nqCZY/H4wfnqSvVqdybyrVz9hiV/X3xKIQSfpuhNnJRjiqVoffRwjado9KFzyg2/TXzQKg5sy08A9zCQoF45BSTtervIjuK3xErpsp9p/ve1I31G6Xby/buyHuCtiQ4LuhDUvmNod3BjyNfqwJ8otmXplVSc3QG9hnECEOrUbKyWAnfLtW8HanlnE+2eoa4cToIQeB+XPQyo6KBsWIwpm36NUlA/Ggoew/cd4S7LThns43mQ8q1l4K2qKOVxNLczzfUqybWy1h2XevRG4cVRj4tPHCLyEAt2w62J5Wp4Z35u/xA4jRy95PliboXQqcwlUwiwiZSRmVceYHy/niDDhBKoMYdT+gik0leCmLVLYjgD3mzz0LRigrJDOMC1/Hyhf02fUx8xWguxk6F8B+0MHldFjerLHFKTLJdSzqqjvjnPy/sBx6Kf0rE6j0X18M63dLv4ShaOuU957z INhC3/n3 1qqiVJcm4JOiIju44as4VM9MUw9Jf42tLeecCIf1Fyuqji+qo7JBYVV1ZynFo/vlpkNx2AIVAPWaD3Xs8sq+MA8pNvzMNXMOxnFUlG5Qg/+Yau7VD+uxvY1dGfsGdAoOaOwYQUWSwVhppQIwd0SDWXiTLA/MOceZ+seO5e5tOfe+B6HY3nmIyAy8MRBgiWXXDM/Y7tVmR6g2I1Gnjc5L/k5lkzXxxE8Oivy4G1AOepgm+t5hN9C9GyPmlUFzQ2Be2AyW2krEyNmwLseh9c3Ze4K7pl+WY9EBypo2hM2JyrINX9jVBxje550sW6KFqrPyCfLpB1VFB7MWFrtmdFnsw5Z/H+8VCWeUp/Tnb8SSihv/JIAum49dS5zoboKj4Hzctqzj4OOmMDLNFgSX5RyeK9HajeL8vPk22afhV+1CZlCsw7lcHTveAUEb1m4zTb3w0zDkr X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This is a helper macro to loop through all the usable page sizes for a high-granularity-enabled HugeTLB VMA. Given the VMA's hstate, it will loop, in descending order, through the page sizes that HugeTLB supports for this architecture. It always includes PAGE_SIZE. This is done by looping through the hstates; however, there is no hstate for PAGE_SIZE. To handle this case, the loop intentionally goes out of bounds, and the out-of-bounds pointer is mapped to PAGE_SIZE. Signed-off-by: James Houghton --- mm/hugetlb.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 1e9e149587b3..1eef6968b1fa 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -7780,6 +7780,24 @@ bool hugetlb_hgm_enabled(struct vm_area_struct *vma) { return vma && (vma->vm_flags & VM_HUGETLB_HGM); } +/* Should only be used by the for_each_hgm_shift macro. */ +static unsigned int __shift_for_hstate(struct hstate *h) +{ + /* If h is out of bounds, we have reached the end, so give PAGE_SIZE */ + if (h >= &hstates[hugetlb_max_hstate]) + return PAGE_SHIFT; + return huge_page_shift(h); +} + +/* + * Intentionally go out of bounds. An out-of-bounds hstate will be converted to + * PAGE_SIZE. 
+ */ +#define for_each_hgm_shift(hstate, tmp_h, shift) \ + for ((tmp_h) = hstate; (shift) = __shift_for_hstate(tmp_h), \ + (tmp_h) <= &hstates[hugetlb_max_hstate]; \ + (tmp_h)++) + #endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */ /*

From patchwork Thu Jan 5 10:18:31 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13089667
Date: Thu, 5 Jan 2023 10:18:31 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-34-jthoughton@google.com>
Subject: [PATCH 33/46] hugetlb: userfaultfd: add support for high-granularity UFFDIO_CONTINUE
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . 
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: C5CFF2000B X-Stat-Signature: r476arzkcoos6i6jhseep53uj84md9zr X-HE-Tag: 1672913982-376914 X-HE-Meta: U2FsdGVkX1+e+ukSEhMZ33mOSyY/0Mhgbb5cuc2lh8CLqk1ODmN9fwVn0hy6IMDixtJIFSO2bFryhCAZw8VHdOWLJDaQFlA99f9jOxxxjToBb45RPnZcABHMAnYf6cfYXOrybohsZ9ORMw5EzbHvjw+X8pJW1EM6GmX+dXdU7QsFlsu6TcrN09+XFkA9yHUiY1ElQM762rWiF1wyZJL2osm2CmS37vtHCr9uHPhTUDJTjZnnLzOQ1KxDbYJU/u68twF8rWbN/KdhVdmyg5iTbuiSlbxlpHn+Rv1SQLS9CGLpspNXoZlNqBjQZMcJqStlJ2gjdLnQz2m5FLOQWskT37YTx6HYY5H1YxjgL5+JPC6HIAhwkJnrTZ7IZciKWaNlaKgf8ptwOTqnpkL7DPSQW8pieP1/SAMLVeGQTHReciDEnxxRblAtvD1bDxLJ39yRuq7iV/n0N30bYakm/Cz16lD+rsat/T4YrYzoNkjBpUzEm5Dm2UgWTlq+8urkiGQUxhsuDb5vdvZO3E0JADqCEhWBFk2q6j92PBPPOwRvlulIkBFE07O9IdvMPISd/fIKFxUz8BhKAr6qskbXG4/8hlpj5ltgyBkmAai8alMzPLqlZHRIzn/iGhVTGPXJGzgqt8z4mkSyDVid00Lm1djc22DNAv4c599xy+vVVA7OMUykAys61c8sR0pg4hmJ/t38LOILn4j5dnF+z0nTMYGJFRKW8+Meo1eXAwOzWFReikLXoUqA1dz6/pJAhFq3/0c0/52sZEmkuy0Da5jxSZLgn3hN6NH77/T4mC3U0SSE09qpk6YKzgsKbTay9KeXLcutr9gWsfrkFN/Bcze7EleN3ZWhM4YNtOQ3qduOAhJggKHPnRS3Bl3fjYxjqeWWaFeaBDMGTfsF/HE9qH5a9fP6DH/nYKME1byP7CMw2nPgtc9AUk0oOJclqhmVTTKxZnZ8LZf1RuWhpPIOrIS0Kux 9vsaa/Sm axJfTQwzQJ5PsJ34gaS3TsRTF1JZWxAqhQPtyrrOO+McK7UDvc7CKaSa9vGUQi1J2Kj0t4V9LqxTdaJ2IaIQXU80xWcSm+x6C6TCOnNJ921u/sIgaFN+L/e7874k781g3jyu2BxVlcfF9j9LJeEncfvO+U0HM93Cm8fVngkGIWj73+QcIRYRq+Ms2DLhGlA1fHO58MeRhBvnWXi2a5srH8dDBV6+xSXkwxcTMumUwLXx7vgBa3MM7IhwfYaRF3XD764kmFGZa5mS4bn+41VGoOoFvFBrMORbbauBXdYfQ0+xRHZ4lL2E/IOVZ06JP8i6ue7YhJvBZSu/yNzvBf9GK2/M58J7phMpcKh/rO1YGuhFCVoL3GUQLaqJmcfD3/8fPdVDSbqIDLNh/NQdXVF+gkzrcLkhSlttoGkIaSKgU7z2EGnejtcvXiEV/1BM/n9GRMxuN X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Changes here are similar to the changes made for hugetlb_no_page. Pass vmf->real_address to userfaultfd_huge_must_wait because vmf->address may be rounded down to the hugepage size, and a high-granularity page table walk would look up the wrong PTE. Also change the call to userfaultfd_must_wait in the same way for consistency. This commit introduces hugetlb_alloc_largest_pte which is used to find the appropriate PTE size to map pages with UFFDIO_CONTINUE. When MADV_SPLIT is provided, page fault events will report PAGE_SIZE-aligned address instead of huge_page_size(h)-aligned addresses, regardless of if UFFD_FEATURE_EXACT_ADDRESS is used. 
Signed-off-by: James Houghton --- fs/userfaultfd.c | 14 +++---- include/linux/hugetlb.h | 18 ++++++++- mm/hugetlb.c | 85 +++++++++++++++++++++++++++++++++-------- mm/userfaultfd.c | 40 +++++++++++-------- 4 files changed, 119 insertions(+), 38 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 15a5bf765d43..940ff63096a9 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -252,17 +252,17 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx, unsigned long flags, unsigned long reason) { - pte_t *ptep, pte; + pte_t pte; bool ret = true; + struct hugetlb_pte hpte; mmap_assert_locked(ctx->mm); - ptep = hugetlb_walk(vma, address, vma_mmu_pagesize(vma)); - if (!ptep) + if (hugetlb_full_walk(&hpte, vma, address)) goto out; ret = false; - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(hpte.ptep); /* * Lockless access: we're in a wait_event so it's ok if it @@ -531,11 +531,11 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason) spin_unlock_irq(&ctx->fault_pending_wqh.lock); if (!is_vm_hugetlb_page(vma)) - must_wait = userfaultfd_must_wait(ctx, vmf->address, vmf->flags, - reason); + must_wait = userfaultfd_must_wait(ctx, vmf->real_address, + vmf->flags, reason); else must_wait = userfaultfd_huge_must_wait(ctx, vma, - vmf->address, + vmf->real_address, vmf->flags, reason); if (is_vm_hugetlb_page(vma)) hugetlb_vma_unlock_read(vma); diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 8a664a9dd0a8..c8524ac49b24 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -224,7 +224,8 @@ unsigned long hugetlb_total_pages(void); vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, unsigned int flags); #ifdef CONFIG_USERFAULTFD -int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, pte_t *dst_pte, +int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, + struct hugetlb_pte *dst_hpte, struct vm_area_struct *dst_vma, unsigned long dst_addr, unsigned long src_addr, @@ -1292,16 +1293,31 @@ static inline enum hugetlb_level hpage_size_to_level(unsigned long sz) #ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING bool hugetlb_hgm_enabled(struct vm_area_struct *vma); +bool hugetlb_hgm_advised(struct vm_area_struct *vma); bool hugetlb_hgm_eligible(struct vm_area_struct *vma); +int hugetlb_alloc_largest_pte(struct hugetlb_pte *hpte, struct mm_struct *mm, + struct vm_area_struct *vma, unsigned long start, + unsigned long end); #else static inline bool hugetlb_hgm_enabled(struct vm_area_struct *vma) { return false; } +static inline bool hugetlb_hgm_advised(struct vm_area_struct *vma) +{ + return false; +} static inline bool hugetlb_hgm_eligible(struct vm_area_struct *vma) { return false; } +static inline +int hugetlb_alloc_largest_pte(struct hugetlb_pte *hpte, struct mm_struct *mm, + struct vm_area_struct *vma, unsigned long start, + unsigned long end) +{ + return -EINVAL; +} #endif static inline spinlock_t *huge_pte_lock(struct hstate *h, diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 1eef6968b1fa..5af6db52f34e 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5936,6 +5936,13 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_area_struct *vma, unsigned long addr, unsigned long reason) { + /* + * Don't use the hpage-aligned address if the user has explicitly + * enabled HGM. 
+ */ + if (hugetlb_hgm_advised(vma) && reason == VM_UFFD_MINOR) + haddr = address & PAGE_MASK; + u32 hash; struct vm_fault vmf = { .vma = vma, @@ -6420,7 +6427,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, * modifications for huge pages. */ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, - pte_t *dst_pte, + struct hugetlb_pte *dst_hpte, struct vm_area_struct *dst_vma, unsigned long dst_addr, unsigned long src_addr, @@ -6431,13 +6438,14 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, bool is_continue = (mode == MCOPY_ATOMIC_CONTINUE); struct hstate *h = hstate_vma(dst_vma); struct address_space *mapping = dst_vma->vm_file->f_mapping; - pgoff_t idx = vma_hugecache_offset(h, dst_vma, dst_addr); + unsigned long haddr = dst_addr & huge_page_mask(h); + pgoff_t idx = vma_hugecache_offset(h, dst_vma, haddr); unsigned long size; int vm_shared = dst_vma->vm_flags & VM_SHARED; pte_t _dst_pte; spinlock_t *ptl; int ret = -ENOMEM; - struct page *page; + struct page *page, *subpage; int writable; bool page_in_pagecache = false; @@ -6452,12 +6460,12 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, * a non-missing case. Return -EEXIST. */ if (vm_shared && - hugetlbfs_pagecache_present(h, dst_vma, dst_addr)) { + hugetlbfs_pagecache_present(h, dst_vma, haddr)) { ret = -EEXIST; goto out; } - page = alloc_huge_page(dst_vma, dst_addr, 0); + page = alloc_huge_page(dst_vma, haddr, 0); if (IS_ERR(page)) { ret = -ENOMEM; goto out; @@ -6473,13 +6481,13 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, /* Free the allocated page which may have * consumed a reservation. */ - restore_reserve_on_error(h, dst_vma, dst_addr, page); + restore_reserve_on_error(h, dst_vma, haddr, page); put_page(page); /* Allocate a temporary page to hold the copied * contents. */ - page = alloc_huge_page_vma(h, dst_vma, dst_addr); + page = alloc_huge_page_vma(h, dst_vma, haddr); if (!page) { ret = -ENOMEM; goto out; @@ -6493,14 +6501,14 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, } } else { if (vm_shared && - hugetlbfs_pagecache_present(h, dst_vma, dst_addr)) { + hugetlbfs_pagecache_present(h, dst_vma, haddr)) { put_page(*pagep); ret = -EEXIST; *pagep = NULL; goto out; } - page = alloc_huge_page(dst_vma, dst_addr, 0); + page = alloc_huge_page(dst_vma, haddr, 0); if (IS_ERR(page)) { put_page(*pagep); ret = -ENOMEM; @@ -6548,7 +6556,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, page_in_pagecache = true; } - ptl = huge_pte_lock(h, dst_mm, dst_pte); + ptl = hugetlb_pte_lock(dst_hpte); ret = -EIO; if (PageHWPoison(page)) @@ -6560,7 +6568,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, * page backing it, then access the page. 
*/ ret = -EEXIST; - if (!huge_pte_none_mostly(huge_ptep_get(dst_pte))) + if (!huge_pte_none_mostly(huge_ptep_get(dst_hpte->ptep))) goto out_release_unlock; if (page_in_pagecache) @@ -6577,7 +6585,10 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, else writable = dst_vma->vm_flags & VM_WRITE; - _dst_pte = make_huge_pte(dst_vma, page, writable); + subpage = hugetlb_find_subpage(h, page, dst_addr); + + _dst_pte = make_huge_pte_with_shift(dst_vma, subpage, writable, + dst_hpte->shift); /* * Always mark UFFDIO_COPY page dirty; note that this may not be * extremely important for hugetlbfs for now since swapping is not @@ -6590,12 +6601,12 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, if (wp_copy) _dst_pte = huge_pte_mkuffd_wp(_dst_pte); - set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte); + set_huge_pte_at(dst_mm, dst_addr, dst_hpte->ptep, _dst_pte); - hugetlb_count_add(pages_per_huge_page(h), dst_mm); + hugetlb_count_add(hugetlb_pte_size(dst_hpte) / PAGE_SIZE, dst_mm); /* No need to invalidate - it was non-present before */ - update_mmu_cache(dst_vma, dst_addr, dst_pte); + update_mmu_cache(dst_vma, dst_addr, dst_hpte->ptep); spin_unlock(ptl); if (!is_continue) @@ -7780,6 +7791,18 @@ bool hugetlb_hgm_enabled(struct vm_area_struct *vma) { return vma && (vma->vm_flags & VM_HUGETLB_HGM); } +bool hugetlb_hgm_advised(struct vm_area_struct *vma) +{ + /* + * Right now, the only way for HGM to be enabled is if a user + * explicitly enables it via MADV_SPLIT, but in the future, there + * may be cases where it gets enabled automatically. + * + * Provide hugetlb_hgm_advised() now for call sites where care that the + * user explicitly enabled HGM. + */ + return hugetlb_hgm_enabled(vma); +} /* Should only be used by the for_each_hgm_shift macro. */ static unsigned int __shift_for_hstate(struct hstate *h) { @@ -7798,6 +7821,38 @@ static unsigned int __shift_for_hstate(struct hstate *h) (tmp_h) <= &hstates[hugetlb_max_hstate]; \ (tmp_h)++) +/* + * Find the HugeTLB PTE that maps as much of [start, end) as possible with a + * single page table entry. It is returned in @hpte. 
+ */ +int hugetlb_alloc_largest_pte(struct hugetlb_pte *hpte, struct mm_struct *mm, + struct vm_area_struct *vma, unsigned long start, + unsigned long end) +{ + struct hstate *h = hstate_vma(vma), *tmp_h; + unsigned int shift; + unsigned long sz; + int ret; + + for_each_hgm_shift(h, tmp_h, shift) { + sz = 1UL << shift; + + if (!IS_ALIGNED(start, sz) || start + sz > end) + continue; + goto found; + } + return -EINVAL; +found: + ret = hugetlb_full_walk_alloc(hpte, vma, start, sz); + if (ret) + return ret; + + if (hpte->shift > shift) + return -EEXIST; + + return 0; +} + #endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */ /* diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 65ad172add27..2b233d31be24 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -320,14 +320,16 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, { int vm_shared = dst_vma->vm_flags & VM_SHARED; ssize_t err; - pte_t *dst_pte; unsigned long src_addr, dst_addr; long copied; struct page *page; - unsigned long vma_hpagesize; + unsigned long vma_hpagesize, target_pagesize; pgoff_t idx; u32 hash; struct address_space *mapping; + bool use_hgm = hugetlb_hgm_advised(dst_vma) && + mode == MCOPY_ATOMIC_CONTINUE; + struct hstate *h = hstate_vma(dst_vma); /* * There is no default zero huge page for all huge page sizes as @@ -345,12 +347,13 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, copied = 0; page = NULL; vma_hpagesize = vma_kernel_pagesize(dst_vma); + target_pagesize = use_hgm ? PAGE_SIZE : vma_hpagesize; /* - * Validate alignment based on huge page size + * Validate alignment based on the targeted page size. */ err = -EINVAL; - if (dst_start & (vma_hpagesize - 1) || len & (vma_hpagesize - 1)) + if (dst_start & (target_pagesize - 1) || len & (target_pagesize - 1)) goto out_unlock; retry: @@ -381,13 +384,14 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, } while (src_addr < src_start + len) { + struct hugetlb_pte hpte; BUG_ON(dst_addr >= dst_start + len); /* * Serialize via vma_lock and hugetlb_fault_mutex. - * vma_lock ensures the dst_pte remains valid even - * in the case of shared pmds. fault mutex prevents - * races with other faulting threads. + * vma_lock ensures the hpte.ptep remains valid even + * in the case of shared pmds and page table collapsing. + * fault mutex prevents races with other faulting threads. 
*/ idx = linear_page_index(dst_vma, dst_addr); mapping = dst_vma->vm_file->f_mapping; @@ -395,23 +399,28 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, mutex_lock(&hugetlb_fault_mutex_table[hash]); hugetlb_vma_lock_read(dst_vma); - err = -ENOMEM; - dst_pte = huge_pte_alloc(dst_mm, dst_vma, dst_addr, vma_hpagesize); - if (!dst_pte) { + if (use_hgm) + err = hugetlb_alloc_largest_pte(&hpte, dst_mm, dst_vma, + dst_addr, + dst_start + len); + else + err = hugetlb_full_walk_alloc(&hpte, dst_vma, dst_addr, + vma_hpagesize); + if (err) { hugetlb_vma_unlock_read(dst_vma); mutex_unlock(&hugetlb_fault_mutex_table[hash]); goto out_unlock; } if (mode != MCOPY_ATOMIC_CONTINUE && - !huge_pte_none_mostly(huge_ptep_get(dst_pte))) { + !huge_pte_none_mostly(huge_ptep_get(hpte.ptep))) { err = -EEXIST; hugetlb_vma_unlock_read(dst_vma); mutex_unlock(&hugetlb_fault_mutex_table[hash]); goto out_unlock; } - err = hugetlb_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma, + err = hugetlb_mcopy_atomic_pte(dst_mm, &hpte, dst_vma, dst_addr, src_addr, mode, &page, wp_copy); @@ -423,6 +432,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, if (unlikely(err == -ENOENT)) { mmap_read_unlock(dst_mm); BUG_ON(!page); + WARN_ON_ONCE(hpte.shift != huge_page_shift(h)); err = copy_huge_page_from_user(page, (const void __user *)src_addr, @@ -440,9 +450,9 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, BUG_ON(page); if (!err) { - dst_addr += vma_hpagesize; - src_addr += vma_hpagesize; - copied += vma_hpagesize; + dst_addr += hugetlb_pte_size(&hpte); + src_addr += hugetlb_pte_size(&hpte); + copied += hugetlb_pte_size(&hpte); if (fatal_signal_pending(current)) err = -EINTR; From patchwork Thu Jan 5 10:18:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089668 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9341CC3DA7D for ; Thu, 5 Jan 2023 10:19:47 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id EE9CF94001D; Thu, 5 Jan 2023 05:19:45 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id DFC38940008; Thu, 5 Jan 2023 05:19:45 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CC52494001D; Thu, 5 Jan 2023 05:19:45 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id AB6BF940008 for ; Thu, 5 Jan 2023 05:19:45 -0500 (EST) Received: from smtpin07.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 8F478120C72 for ; Thu, 5 Jan 2023 10:19:45 +0000 (UTC) X-FDA: 80320349130.07.45DCB28 Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) by imf26.hostedemail.com (Postfix) with ESMTP id 0F413140003 for ; Thu, 5 Jan 2023 10:19:43 +0000 (UTC) Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=lJAnXXDa; spf=pass (imf26.hostedemail.com: domain of 3P6S2YwoKCIYt3ry4qr3yxqyyqvo.mywvsx47-wwu5kmu.y1q@flex--jthoughton.bounces.google.com designates 209.85.219.201 as permitted sender) 
Date: Thu, 5 Jan 2023 10:18:32 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: 
<20230105101844.1893104-35-jthoughton@google.com>
Subject: [PATCH 34/46] hugetlb: userfaultfd: when using MADV_SPLIT, round addresses to PAGE_SIZE
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton

MADV_SPLIT enables HugeTLB HGM, which allows for UFFDIO_CONTINUE in PAGE_SIZE chunks. If page fault events reported a huge-page-aligned address, userspace would be completely unable to take advantage of HGM unless it also knew to enable UFFD_FEATURE_EXACT_ADDRESS. To make that mistake harder to make, always report a usable (PAGE_SIZE-aligned) address instead of requiring userspace to provide UFFD_FEATURE_EXACT_ADDRESS.

Signed-off-by: James Houghton --- mm/hugetlb.c | 31 +++++++++++++++---------------- 1 file changed, 15 insertions(+), 16 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 5af6db52f34e..5b6215e03fe1 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5936,28 +5936,27 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_area_struct *vma, unsigned long addr, unsigned long reason) { + u32 hash; + struct vm_fault vmf; + /* * Don't use the hpage-aligned address if the user has explicitly * enabled HGM.
*/ if (hugetlb_hgm_advised(vma) && reason == VM_UFFD_MINOR) - haddr = address & PAGE_MASK; - - u32 hash; - struct vm_fault vmf = { - .vma = vma, - .address = haddr, - .real_address = addr, - .flags = flags, + haddr = addr & PAGE_MASK; - /* - * Hard to debug if it ends up being - * used by a callee that assumes - * something about the other - * uninitialized fields... same as in - * memory.c - */ - }; + vmf.vma = vma; + vmf.address = haddr; + vmf.real_address = addr; + vmf.flags = flags; + /* + * Hard to debug if it ends up being + * used by a callee that assumes + * something about the other + * uninitialized fields... same as in + * memory.c + */ /* * vma_lock and hugetlb_fault_mutex must be dropped before handling From patchwork Thu Jan 5 10:18:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089686 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 848D5C3DA7D for ; Thu, 5 Jan 2023 10:25:29 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2AFC98E0005; Thu, 5 Jan 2023 05:25:29 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 25F6F8E0002; Thu, 5 Jan 2023 05:25:29 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 100D78E0005; Thu, 5 Jan 2023 05:25:29 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 008C38E0002 for ; Thu, 5 Jan 2023 05:25:28 -0500 (EST) Received: from smtpin03.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id CF53F80D79 for ; Thu, 5 Jan 2023 10:25:28 +0000 (UTC) X-FDA: 80320363536.03.F3B7192 Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) by imf24.hostedemail.com (Postfix) with ESMTP id 3F74B180002 for ; Thu, 5 Jan 2023 10:25:27 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=CQ2ZAMvC; spf=pass (imf24.hostedemail.com: domain of 3QKS2YwoKCIcu4sz5rs4zyrzzrwp.nzxwty58-xxv6lnv.z2r@flex--jthoughton.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3QKS2YwoKCIcu4sz5rs4zyrzzrwp.nzxwty58-xxv6lnv.z2r@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1672914327; a=rsa-sha256; cv=none; b=Ccc++ingVWBglsMlTNqmjiyz87smQntgE2izBPJgp0bdg+vkk92VDjgFZl4XbrR6L3sAb5 QZwtGp967r74a0tY8EMUDsb9i1PTIEyIS9SKX5zFumT6EdXiYoVRTxrNQQukJ8ZyqO9M6L +SYY2wptlLS2gbRME7aczOFJTw6atic= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=CQ2ZAMvC; spf=pass (imf24.hostedemail.com: domain of 3QKS2YwoKCIcu4sz5rs4zyrzzrwp.nzxwty58-xxv6lnv.z2r@flex--jthoughton.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3QKS2YwoKCIcu4sz5rs4zyrzzrwp.nzxwty58-xxv6lnv.z2r@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672914327; h=from:from:sender:reply-to:subject:subject:date:date: 
Date: Thu, 5 Jan 2023 10:18:33 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-36-jthoughton@google.com>
Subject: [PATCH 35/46] hugetlb: add MADV_COLLAPSE for hugetlb
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . 
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Queue-Id: 3F74B180002 X-Rspamd-Server: rspam01 X-Stat-Signature: p6ajsxojy73n8d6h9ccicrh3uxokcodm X-HE-Tag: 1672914327-886644 X-HE-Meta: U2FsdGVkX1+W9rgquZVN+T1I4mrKzBlpki8RTvVMWsDkSNS6ax/DlE8NHrLp/bmmIZoVkhJ92oqbKqnLXwcddvZso38sDbSYM4A1q2Eg6kxspxzLiEw6qE4X++p3PYZDNEvNiPa2nQwuCaFch4QMABDHIM51OEizRm6tStTSxMk4tE5pIPkZF6Amu50L11FzMZf2Tyn4/3/PQsGTK3MJHI2+XwtPMX4SYQCsPfpqNuwvzhCTOeWoDqd32uUMBg9+XUsvGQ8+vppg4r7mLjfDGB438gyDwE+aMz6yqAvcmYEfdwM+5WLCsZhlq/J7/PFUKV+rnM3ilJVuvd9n48RpFir7DijnZSaKm7rM9XXBlUCfebnqq261war6QzOUOA2KXx5ic9NJc+6G+Nic8Tegfek3Sk2zQBWkdT89up5f1FOthLnrrQbnNmTcbAvP97tbp0vTf8m8B5nvtNa/FzESlH82Wdx6u8wVzFlwuJJHZ/W1GVz7dfKw+J6iwBTJeJ+atUSf2Fx0PJ02iQKzsccPmY9UpAUXX/S/nEJC+FU1hSYjKVI1mbmUaBIY+AqKEX9i3wtdX44dy9p9oLJFIojmcAT3IDsyTH4vVbsKfrOWWAMcr65Xptj0ExYsApEjc9Q4YZ5jTTnU4jXYVFSG6abvLrUb5uBtaBWFM5ZmzJtG7Lcwg+Kpq8TzjEeHir5bAQTnUKz7m0E+cfkWknD0sWTxLi/4q3IwD26t1Dk1R9GPOuz1Ehmxbs1kszln7yn9HkxRDNamaEpC783Ex2xwMTeIDYomkUMIT2n9uM7QcrwfAXuI1oLwivFkiiEII1udmFTdfpeWax+KnYlJ+p3QR62Y8AE4GRrv+9UZz34btyxu06zWGtX0+7kMR3DIYKo4jsXojKS4JRAlGjC1i/jZB+o1xVx5VNt1mDS8yt7Ite0SNKVxqb2tAJTja1Aa1gtBBOckuwLUz/38ab4OwJ5uwpX SgXFUF7j 6sim/K6Bp3gz36MZpJuWra4bSJSOR9AAQXuBby0+IqSD/kHDLLxUE3jSvu4V3wbcDGyWk8K6JIGwVzQVDzRhJIS83INLPIKta/kO3FakIRLUHhqafQXhu48lmpZf04suP6DeFxuGHF5RtL8znAaKhx8QniFH/I+6HDW/0hzoHkPTjY6w/VvMPjybv7ojY2l1cMMrjXE8O66tVcCdCynW1sTXAuZW3tScVGGsRr1SAj8bPy5M1nffdO1xDpPTgYcyjR9QndA098YnOeGgVbZTz6rXwwkvJkX2/9dWjYnTX8QhoPO51Jok16VYJ4R8yeqd9TU1ox5S6hyrHEcHb+UbxFGekRuV8xdqTHhzXBH6m4GR08Ygzlvl8N3KM0LYOzH2VtMaGH+wTI4rCfaqlkGM5pMY4LCMWEEFYauq9WAzIufBn2WPqTEEGOLf99qB7Dn+3PCFq X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This is a necessary extension to the UFFDIO_CONTINUE changes. When userspace finishes mapping an entire hugepage with UFFDIO_CONTINUE, the kernel has no mechanism to automatically collapse the page table to map the whole hugepage normally. We require userspace to inform us that they would like the mapping to be collapsed; they do this with MADV_COLLAPSE. If userspace has not mapped all of a hugepage with UFFDIO_CONTINUE, but only some, hugetlb_collapse will cause the requested range to be mapped as if it were UFFDIO_CONTINUE'd already. The effects of any UFFDIO_WRITEPROTECT calls may be undone by a call to MADV_COLLAPSE for intersecting address ranges. This commit is co-opting the same madvise mode that has been introduced to synchronously collapse THPs. The function that does THP collapsing has been renamed to madvise_collapse_thp. As with the rest of the high-granularity mapping support, MADV_COLLAPSE is only supported for shared VMAs right now. MADV_COLLAPSE has the same synchronization as huge_pmd_unshare. 
Signed-off-by: James Houghton --- include/linux/huge_mm.h | 12 +-- include/linux/hugetlb.h | 8 ++ mm/hugetlb.c | 164 ++++++++++++++++++++++++++++++++++++++++ mm/khugepaged.c | 4 +- mm/madvise.c | 18 ++++- 5 files changed, 197 insertions(+), 9 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index a1341fdcf666..5d1e3c980f74 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -218,9 +218,9 @@ void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud, int hugepage_madvise(struct vm_area_struct *vma, unsigned long *vm_flags, int advice); -int madvise_collapse(struct vm_area_struct *vma, - struct vm_area_struct **prev, - unsigned long start, unsigned long end); +int madvise_collapse_thp(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, unsigned long end); void vma_adjust_trans_huge(struct vm_area_struct *vma, unsigned long start, unsigned long end, long adjust_next); spinlock_t *__pmd_trans_huge_lock(pmd_t *pmd, struct vm_area_struct *vma); @@ -367,9 +367,9 @@ static inline int hugepage_madvise(struct vm_area_struct *vma, return -EINVAL; } -static inline int madvise_collapse(struct vm_area_struct *vma, - struct vm_area_struct **prev, - unsigned long start, unsigned long end) +static inline int madvise_collapse_thp(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, unsigned long end) { return -EINVAL; } diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index c8524ac49b24..e1baf939afb6 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -1298,6 +1298,8 @@ bool hugetlb_hgm_eligible(struct vm_area_struct *vma); int hugetlb_alloc_largest_pte(struct hugetlb_pte *hpte, struct mm_struct *mm, struct vm_area_struct *vma, unsigned long start, unsigned long end); +int hugetlb_collapse(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long start, unsigned long end); #else static inline bool hugetlb_hgm_enabled(struct vm_area_struct *vma) { @@ -1318,6 +1320,12 @@ int hugetlb_alloc_largest_pte(struct hugetlb_pte *hpte, struct mm_struct *mm, { return -EINVAL; } +static inline +int hugetlb_collapse(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + return -EINVAL; +} #endif static inline spinlock_t *huge_pte_lock(struct hstate *h, diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 5b6215e03fe1..388c46c7e77a 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -7852,6 +7852,170 @@ int hugetlb_alloc_largest_pte(struct hugetlb_pte *hpte, struct mm_struct *mm, return 0; } +static bool hugetlb_hgm_collapsable(struct vm_area_struct *vma) +{ + if (!hugetlb_hgm_eligible(vma)) + return false; + if (!vma->vm_private_data) /* vma lock required for collapsing */ + return false; + return true; +} + +/* + * Collapse the address range from @start to @end to be mapped optimally. + * + * This is only valid for shared mappings. The main use case for this function + * is following UFFDIO_CONTINUE. If a user UFFDIO_CONTINUEs an entire hugepage + * by calling UFFDIO_CONTINUE once for each 4K region, the kernel doesn't know + * to collapse the mapping after the final UFFDIO_CONTINUE. Instead, we leave + * it up to userspace to tell us to do so, via MADV_COLLAPSE. + * + * Any holes in the mapping will be filled. If there is no page in the + * pagecache for a region we're collapsing, the PTEs will be cleared. + * + * If high-granularity PTEs are uffd-wp markers, those markers will be dropped. 
+ */ +int hugetlb_collapse(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + struct hstate *h = hstate_vma(vma); + struct address_space *mapping = vma->vm_file->f_mapping; + struct mmu_notifier_range range; + struct mmu_gather tlb; + unsigned long curr = start; + int ret = 0; + struct page *hpage, *subpage; + pgoff_t idx; + bool writable = vma->vm_flags & VM_WRITE; + bool shared = vma->vm_flags & VM_SHARED; + struct hugetlb_pte hpte; + pte_t entry; + + /* + * This is only supported for shared VMAs, because we need to look up + * the page to use for any PTEs we end up creating. + */ + if (!shared) + return -EINVAL; + + /* If HGM is not enabled, there is nothing to collapse. */ + if (!hugetlb_hgm_enabled(vma)) + return 0; + + /* + * We lost the VMA lock after splitting, so we can't safely collapse. + * We could improve this in the future (like take the mmap_lock for + * writing and try again), but for now just fail with ENOMEM. + */ + if (unlikely(!hugetlb_hgm_collapsable(vma))) + return -ENOMEM; + + mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, + start, end); + mmu_notifier_invalidate_range_start(&range); + tlb_gather_mmu(&tlb, mm); + + /* + * Grab the VMA lock and mapping sem for writing. This will prevent + * concurrent high-granularity page table walks, so that we can safely + * collapse and free page tables. + * + * This is the same locking that huge_pmd_unshare requires. + */ + hugetlb_vma_lock_write(vma); + i_mmap_lock_write(vma->vm_file->f_mapping); + + while (curr < end) { + ret = hugetlb_alloc_largest_pte(&hpte, mm, vma, curr, end); + if (ret) + goto out; + + entry = huge_ptep_get(hpte.ptep); + + /* + * There is no work to do if the PTE doesn't point to page + * tables. + */ + if (!pte_present(entry)) + goto next_hpte; + if (hugetlb_pte_present_leaf(&hpte, entry)) + goto next_hpte; + + idx = vma_hugecache_offset(h, vma, curr); + hpage = find_get_page(mapping, idx); + + if (hpage && !HPageMigratable(hpage)) { + /* + * Don't collapse a mapping to a page that is pending + * a migration. Migration swap entries may have placed + * in the page table. + */ + ret = -EBUSY; + put_page(hpage); + goto out; + } + + if (hpage && PageHWPoison(hpage)) { + /* + * Don't collapse a mapping to a page that is + * hwpoisoned. + */ + ret = -EHWPOISON; + put_page(hpage); + /* + * By setting ret to -EHWPOISON, if nothing else + * happens, we will tell userspace that we couldn't + * fully collapse everything due to poison. + * + * Skip this page, and continue to collapse the rest + * of the mapping. + */ + curr = (curr & huge_page_mask(h)) + huge_page_size(h); + continue; + } + + /* + * Clear all the PTEs, and drop ref/mapcounts + * (on tlb_finish_mmu). + */ + __unmap_hugepage_range(&tlb, vma, curr, + curr + hugetlb_pte_size(&hpte), + NULL, + ZAP_FLAG_DROP_MARKER); + /* Free the PTEs. */ + hugetlb_free_pgd_range(&tlb, + curr, curr + hugetlb_pte_size(&hpte), + curr, curr + hugetlb_pte_size(&hpte)); + if (!hpage) { + huge_pte_clear(mm, curr, hpte.ptep, + hugetlb_pte_size(&hpte)); + goto next_hpte; + } + + page_dup_file_rmap(hpage, true); + + subpage = hugetlb_find_subpage(h, hpage, curr); + entry = make_huge_pte_with_shift(vma, subpage, + writable, hpte.shift); + set_huge_pte_at(mm, curr, hpte.ptep, entry); +next_hpte: + curr += hugetlb_pte_size(&hpte); + + if (curr < end) { + /* Don't hold the VMA lock for too long. 
*/ + hugetlb_vma_unlock_write(vma); + cond_resched(); + hugetlb_vma_lock_write(vma); + } + } +out: + i_mmap_unlock_write(vma->vm_file->f_mapping); + hugetlb_vma_unlock_write(vma); + tlb_finish_mmu(&tlb); + mmu_notifier_invalidate_range_end(&range); + return ret; +} + #endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */ /* diff --git a/mm/khugepaged.c b/mm/khugepaged.c index e1c7c1f357ef..cbeb7f00f1bf 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -2718,8 +2718,8 @@ static int madvise_collapse_errno(enum scan_result r) } } -int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev, - unsigned long start, unsigned long end) +int madvise_collapse_thp(struct vm_area_struct *vma, struct vm_area_struct **prev, + unsigned long start, unsigned long end) { struct collapse_control *cc; struct mm_struct *mm = vma->vm_mm; diff --git a/mm/madvise.c b/mm/madvise.c index 04ee28992e52..fec47e9f845b 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -1029,6 +1029,18 @@ static int madvise_split(struct vm_area_struct *vma, return 0; } +static int madvise_collapse(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, unsigned long end) +{ + if (is_vm_hugetlb_page(vma)) { + *prev = vma; + return hugetlb_collapse(vma->vm_mm, vma, start, end); + } + + return madvise_collapse_thp(vma, prev, start, end); +} + /* * Apply an madvise behavior to a region of a vma. madvise_update_vma * will handle splitting a vm area into separate areas, each area with its own @@ -1205,6 +1217,9 @@ madvise_behavior_valid(int behavior) #ifdef CONFIG_TRANSPARENT_HUGEPAGE case MADV_HUGEPAGE: case MADV_NOHUGEPAGE: +#endif +#if defined(CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING) || \ + defined(CONFIG_TRANSPARENT_HUGEPAGE) case MADV_COLLAPSE: #endif #ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING @@ -1398,7 +1413,8 @@ int madvise_set_anon_name(struct mm_struct *mm, unsigned long start, * MADV_NOHUGEPAGE - mark the given range as not worth being backed by * transparent huge pages so the existing pages will not be * coalesced into THP and new pages will not be allocated as THP. - * MADV_COLLAPSE - synchronously coalesce pages into new THP. + * MADV_COLLAPSE - synchronously coalesce pages into new THP, or, for HugeTLB + * pages, collapse the mapping. * MADV_DONTDUMP - the application wants to prevent pages in the given range * from being included in its core dump. * MADV_DODUMP - cancel MADV_DONTDUMP: no longer exclude from core dump. 
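From the caller's side, the error conventions documented in
hugetlb_collapse() above surface to userspace through madvise(). The
following is a hedged sketch of how a caller might interpret them, not
part of the patch; it assumes the errno values are whatever madvise()
propagates from hugetlb_collapse().

#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Sketch: interpret MADV_COLLAPSE results on a hugetlb range, based on
 * the return values documented in hugetlb_collapse() above.
 */
static int collapse_hugetlb_range(void *addr, size_t len)
{
	if (madvise(addr, len, MADV_COLLAPSE) == 0)
		return 0;

	switch (errno) {
	case EBUSY:	/* a page in the range is pending migration */
	case ENOMEM:	/* the VMA lock was lost after a split */
		return 1;	/* worth retrying later */
	case EHWPOISON:	/* a poisoned page was skipped; the rest collapsed */
		return 0;
	default:	/* e.g. EINVAL: not a shared hugetlb VMA */
		return -1;
	}
}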
From patchwork Thu Jan 5 10:18:34 2023
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13089669
Date: Thu, 5 Jan 2023 10:18:34 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-37-jthoughton@google.com>
Subject: [PATCH 36/46] hugetlb: remove huge_pte_lock and huge_pte_lockptr
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert",
 "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin,
 Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 James Houghton
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspamd-Queue-Id: AB23380017 X-Rspamd-Server: rspam09 X-Rspam-User: X-Stat-Signature: p46kso6t51ennifbdtyrfztudw145my8 X-HE-Tag: 1672913986-600316 X-HE-Meta: U2FsdGVkX1/jCOU+/4joOGJEHGG8d8aTVt2uRlJvJWKnui3EbjkK6/8KI/rn1GFFq7Jfaaga5vm79LULjasQQbz3zOzo6N98qfKHQo5RqtMRVkV3WnMpz3BVnn8J/ZsjTkruW6ExFqfL0xPbuHy6EHnMFG3lV+O3jMqN9aFPedLgs7huQ1RI8r/sKDl0j1NqyEgsU6q8m1SubbBhcwokr6D/7u3JvVY8qLipJi9YsW00U7PLVGnm65kGZLjmN4KIU/Zku0qOt0Xw+dGZIN88+2V9XOXVzjFXpIVGA3RUY/4krta+VgfvqLP4hSeNaa484cxDPiqTpwQ0sLJROJE9RCox8T9HnJ69qdj2WnAVSerwSmpu6YuVVcRWSM3x+L/nJL4QR4EIxYmmGJt/mxZFCfrVuOletyW4/eD4H0NJrRwly1XVv+tCx3devkLLdFy3jZSC5y3hnOKHqQCqIZqM//RfMRxtRcW4i4YuHHk7VGUrnZqmxfrrtJ8XBDNLuWLqAIgIvXj5C6ItqlEKA317jJ0qvWe/SCXNBUWIEDvKNof1VzNdJPLE0YuEoZA0laL2FPOUxFla/W24san94g0fVAPCWHm1RaFOTcoUsa7OytUumcah3oxVUJkD0gr6sp6BZD/Jbhvofn4XV2TiUZNHzZK1Vv00Xb4YO2CMZzSrXTynngjOLaDMa1ipSXxoAa6ItDt1ln5ApWydDlIqY2i3Mn10MYVMOPp8YJuwLfvO1rC0MYHr/s2Um3EtqPk7qnkoH7VVZ3nZXJohTzGPVuKbmFrBBtgrJDLYk2L/Lhp9LBFwngncbkBIrs7YDtUahUKoVUTzvYwM6msZZHJ5MoPt9WMM67W/5V/iBIesGILo5ITmHU+1hmfSB96iP5HM75Gbx/S7qSUa1dy3wBl0/3yEdwnUuEHHzLxMeMdC61kipBIyVyhaHuSTsd2yC6AWGZb2Z69d3ACW5omccR+D0GY dHT1O+fX c4u52fOTPxss3So3MlaDCKE/oD9z0pu3FYsVEY+FuZAt789wr2+SH5vmswPNWoMoNAkpzgT1gvcmcIlmQRfCJVQ4yHqUgr4tcl2DCL088Cg2W1ZX4BhjevQRKWgDBPPC1EoNBS+7pVR66k/PNsPEktHBLqVMNnsTnmwgrXUqJviVPXp755C4epJQLiQHqbovQhuIhQ2wQx+lJYpdi66nPxTgrqrmYTQxWFo8lmzxBHTmTFG3Y8qBzAmq4nNdCFHDGEV0Y6TrfA2MuXOqWir92/IHMYA7LX5WZOgbtg9W0+biV0bNHRVIijCQd0WV5qEkm4sJsUD4zeSHiTqwNWjGVMU4588YI2GdSWrygUwfuFBRCLc2CyBcqzlvqA9eqyyVzf7pjV/bmb0jBHmCioNO59URss4ZdvUyFKbAe6+ZmUPoI5jpJ2XiKILh1cEt6f+z47YJf X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: They are replaced with hugetlb_pte_lock{,ptr}. All callers that haven't already been replaced don't get called when using HGM, so we handle them by populating hugetlb_ptes with the standard, hstate-sized huge PTEs. Signed-off-by: James Houghton --- arch/powerpc/mm/pgtable.c | 7 +++++-- include/linux/hugetlb.h | 42 +++++++++++++++------------------------ mm/hugetlb.c | 22 +++++++++++++------- 3 files changed, 36 insertions(+), 35 deletions(-) diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index 035a0df47af0..e20d6aa9a2a6 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -258,11 +258,14 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma, #ifdef CONFIG_PPC_BOOK3S_64 struct hstate *h = hstate_vma(vma); + struct hugetlb_pte hpte; psize = hstate_get_psize(h); #ifdef CONFIG_DEBUG_VM - assert_spin_locked(huge_pte_lockptr(huge_page_shift(h), - vma->vm_mm, ptep)); + /* HGM is not supported for powerpc yet. 
*/ + hugetlb_pte_populate(&hpte, ptep, huge_page_shift(h), + hpage_size_to_level(psize)); + assert_spin_locked(hpte.ptl); #endif #else diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index e1baf939afb6..4d318bf2ced9 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -1032,14 +1032,6 @@ static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask) return modified_mask; } -static inline spinlock_t *huge_pte_lockptr(unsigned int shift, - struct mm_struct *mm, pte_t *pte) -{ - if (shift == PMD_SHIFT) - return pmd_lockptr(mm, (pmd_t *) pte); - return &mm->page_table_lock; -} - #ifndef hugepages_supported /* * Some platform decide whether they support huge pages at boot @@ -1248,12 +1240,6 @@ static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask) return 0; } -static inline spinlock_t *huge_pte_lockptr(unsigned int shift, - struct mm_struct *mm, pte_t *pte) -{ - return &mm->page_table_lock; -} - static inline void hugetlb_count_init(struct mm_struct *mm) { } @@ -1328,16 +1314,6 @@ int hugetlb_collapse(struct mm_struct *mm, struct vm_area_struct *vma, } #endif -static inline spinlock_t *huge_pte_lock(struct hstate *h, - struct mm_struct *mm, pte_t *pte) -{ - spinlock_t *ptl; - - ptl = huge_pte_lockptr(huge_page_shift(h), mm, pte); - spin_lock(ptl); - return ptl; -} - static inline spinlock_t *hugetlb_pte_lockptr(struct hugetlb_pte *hpte) { @@ -1358,8 +1334,22 @@ void hugetlb_pte_populate(struct mm_struct *mm, struct hugetlb_pte *hpte, pte_t *ptep, unsigned int shift, enum hugetlb_level level) { - __hugetlb_pte_populate(hpte, ptep, shift, level, - huge_pte_lockptr(shift, mm, ptep)); + spinlock_t *ptl; + + /* + * For contiguous HugeTLB PTEs that can contain other HugeTLB PTEs + * on the same level, the same PTL for both must be used. + * + * For some architectures that implement hugetlb_walk_step, this + * version of hugetlb_pte_populate() may not be correct to use for + * high-granularity PTEs. Instead, call __hugetlb_pte_populate() + * directly. 
+ */ + if (level == HUGETLB_LEVEL_PMD) + ptl = pmd_lockptr(mm, (pmd_t *) ptep); + else + ptl = &mm->page_table_lock; + __hugetlb_pte_populate(hpte, ptep, shift, level, ptl); } #if defined(CONFIG_HUGETLB_PAGE) && defined(CONFIG_CMA) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 388c46c7e77a..d71adc03138d 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5303,9 +5303,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, put_page(hpage); /* Install the new huge page if src pte stable */ - dst_ptl = huge_pte_lock(h, dst, dst_pte); - src_ptl = huge_pte_lockptr(huge_page_shift(h), - src, src_pte); + dst_ptl = hugetlb_pte_lock(&dst_hpte); + src_ptl = hugetlb_pte_lockptr(&src_hpte); spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING); entry = huge_ptep_get(src_pte); if (!pte_same(src_pte_old, entry)) { @@ -7383,7 +7382,8 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long saddr; pte_t *spte = NULL; pte_t *pte; - spinlock_t *ptl; + struct hugetlb_pte hpte; + struct hstate *shstate; i_mmap_lock_read(mapping); vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) { @@ -7404,7 +7404,11 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma, if (!spte) goto out; - ptl = huge_pte_lock(hstate_vma(vma), mm, spte); + shstate = hstate_vma(svma); + + hugetlb_pte_populate(mm, &hpte, spte, huge_page_shift(shstate), + hpage_size_to_level(huge_page_size(shstate))); + spin_lock(hpte.ptl); if (pud_none(*pud)) { pud_populate(mm, pud, (pmd_t *)((unsigned long)spte & PAGE_MASK)); @@ -7412,7 +7416,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma, } else { put_page(virt_to_page(spte)); } - spin_unlock(ptl); + spin_unlock(hpte.ptl); out: pte = (pte_t *)pmd_alloc(mm, pud, addr); i_mmap_unlock_read(mapping); @@ -8132,6 +8136,7 @@ void hugetlb_unshare_all_pmds(struct vm_area_struct *vma) unsigned long address, start, end; spinlock_t *ptl; pte_t *ptep; + struct hugetlb_pte hpte; if (!(vma->vm_flags & VM_MAYSHARE)) return; @@ -8156,7 +8161,10 @@ void hugetlb_unshare_all_pmds(struct vm_area_struct *vma) ptep = hugetlb_walk(vma, address, sz); if (!ptep) continue; - ptl = huge_pte_lock(h, mm, ptep); + + hugetlb_pte_populate(mm, &hpte, ptep, huge_page_shift(h), + hpage_size_to_level(sz)); + ptl = hugetlb_pte_lock(&hpte); huge_pmd_unshare(mm, vma, address, ptep); spin_unlock(ptl); } From patchwork Thu Jan 5 10:18:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089670 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id ECBE0C3DA7A for ; Thu, 5 Jan 2023 10:19:50 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1067694001E; Thu, 5 Jan 2023 05:19:50 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0B705940008; Thu, 5 Jan 2023 05:19:50 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E72EF94001E; Thu, 5 Jan 2023 05:19:49 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id C7FB1940008 for ; Thu, 5 Jan 2023 05:19:49 -0500 (EST) Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com 
Date: Thu, 5 Jan 2023 10:18:35 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-38-jthoughton@google.com>
Subject: [PATCH 37/46] hugetlb: replace make_huge_pte with make_huge_pte_with_shift
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert",
 "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin,
 Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 James Houghton

This removes the old definition of make_huge_pte; the shift must now
always be given explicitly. All callsites are cleaned up.
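For reference, the convenience wrapper being removed was equivalent to
the sketch below (reconstructed from the diff that follows; the helper
name is hypothetical and not part of the patch): it derived the shift
from the VMA's hstate and forwarded to the shift-taking variant, which
is exactly what hstate-sized callers now open-code with
huge_page_shift().

/* Sketch of the removed wrapper's behavior, for comparison only. */
static pte_t make_huge_pte_hstate(struct vm_area_struct *vma,
				  struct page *page, int writable)
{
	unsigned int shift = huge_page_shift(hstate_vma(vma));

	return make_huge_pte(vma, page, writable, shift);
}

High-granularity callers, by contrast, pass the shift of the hugetlb_pte
they are mapping (hpte->shift), as the callsite updates below show.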
Signed-off-by: James Houghton --- mm/hugetlb.c | 31 ++++++++++++------------------- 1 file changed, 12 insertions(+), 19 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d71adc03138d..10a323e6bd9c 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5069,9 +5069,9 @@ const struct vm_operations_struct hugetlb_vm_ops = { .pagesize = hugetlb_vm_op_pagesize, }; -static pte_t make_huge_pte_with_shift(struct vm_area_struct *vma, - struct page *page, int writable, - int shift) +static pte_t make_huge_pte(struct vm_area_struct *vma, + struct page *page, int writable, + int shift) { pte_t entry; @@ -5087,14 +5087,6 @@ static pte_t make_huge_pte_with_shift(struct vm_area_struct *vma, return entry; } -static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page, - int writable) -{ - unsigned int shift = huge_page_shift(hstate_vma(vma)); - - return make_huge_pte_with_shift(vma, page, writable, shift); -} - static void set_huge_ptep_writable(struct vm_area_struct *vma, unsigned long address, pte_t *ptep) { @@ -5135,10 +5127,12 @@ static void hugetlb_install_page(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr, struct page *new_page) { + struct hstate *h = hstate_vma(vma); __SetPageUptodate(new_page); hugepage_add_new_anon_rmap(new_page, vma, addr); - set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, new_page, 1)); - hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm); + set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, new_page, 1, + huge_page_shift(h))); + hugetlb_count_add(pages_per_huge_page(h), vma->vm_mm); SetHPageMigratable(new_page); } @@ -5854,7 +5848,8 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, page_remove_rmap(old_page, vma, true); hugepage_add_new_anon_rmap(new_page, vma, haddr); set_huge_pte_at(mm, haddr, ptep, - make_huge_pte(vma, new_page, !unshare)); + make_huge_pte(vma, new_page, !unshare, + huge_page_shift(h))); SetHPageMigratable(new_page); /* Make the old page be freed below */ new_page = old_page; @@ -6163,7 +6158,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, page_dup_file_rmap(page, true); subpage = hugetlb_find_subpage(h, page, haddr_hgm); - new_pte = make_huge_pte_with_shift(vma, subpage, + new_pte = make_huge_pte(vma, subpage, ((vma->vm_flags & VM_WRITE) && (vma->vm_flags & VM_SHARED)), hpte->shift); @@ -6585,8 +6580,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, subpage = hugetlb_find_subpage(h, page, dst_addr); - _dst_pte = make_huge_pte_with_shift(dst_vma, subpage, writable, - dst_hpte->shift); + _dst_pte = make_huge_pte(dst_vma, subpage, writable, dst_hpte->shift); /* * Always mark UFFDIO_COPY page dirty; note that this may not be * extremely important for hugetlbfs for now since swapping is not @@ -7999,8 +7993,7 @@ int hugetlb_collapse(struct mm_struct *mm, struct vm_area_struct *vma, page_dup_file_rmap(hpage, true); subpage = hugetlb_find_subpage(h, hpage, curr); - entry = make_huge_pte_with_shift(vma, subpage, - writable, hpte.shift); + entry = make_huge_pte(vma, subpage, writable, hpte.shift); set_huge_pte_at(mm, curr, hpte.ptep, entry); next_hpte: curr += hugetlb_pte_size(&hpte); From patchwork Thu Jan 5 10:18:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089671 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org 
Date: Thu, 5 Jan 2023 10:18:36 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-39-jthoughton@google.com>
Subject: [PATCH 38/46] mm: smaps: add stats for HugeTLB mapping size
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert",
 "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin,
 Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 James Houghton
owner-majordomo@kvack.org List-ID: When the kernel is compiled with HUGETLB_HIGH_GRANULARITY_MAPPING, smaps may provide HugetlbPudMapped, HugetlbPmdMapped, and HugetlbPteMapped. Levels that are folded will not be outputted. Signed-off-by: James Houghton --- fs/proc/task_mmu.c | 101 +++++++++++++++++++++++++++++++++------------ 1 file changed, 75 insertions(+), 26 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index c353cab11eee..af31c4d314d2 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -412,6 +412,15 @@ struct mem_size_stats { unsigned long swap; unsigned long shared_hugetlb; unsigned long private_hugetlb; +#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING +#ifndef __PAGETABLE_PUD_FOLDED + unsigned long hugetlb_pud_mapped; +#endif +#ifndef __PAGETABLE_PMD_FOLDED + unsigned long hugetlb_pmd_mapped; +#endif + unsigned long hugetlb_pte_mapped; +#endif u64 pss; u64 pss_anon; u64 pss_file; @@ -731,6 +740,35 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) } #ifdef CONFIG_HUGETLB_PAGE + +static void smaps_hugetlb_hgm_account(struct mem_size_stats *mss, + struct hugetlb_pte *hpte) +{ +#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING + unsigned long size = hugetlb_pte_size(hpte); + + switch (hpte->level) { +#ifndef __PAGETABLE_PUD_FOLDED + case HUGETLB_LEVEL_PUD: + mss->hugetlb_pud_mapped += size; + break; +#endif +#ifndef __PAGETABLE_PMD_FOLDED + case HUGETLB_LEVEL_PMD: + mss->hugetlb_pmd_mapped += size; + break; +#endif + case HUGETLB_LEVEL_PTE: + mss->hugetlb_pte_mapped += size; + break; + default: + break; + } +#else + return; +#endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */ +} + static int smaps_hugetlb_range(struct hugetlb_pte *hpte, unsigned long addr, struct mm_walk *walk) @@ -764,6 +802,8 @@ static int smaps_hugetlb_range(struct hugetlb_pte *hpte, mss->shared_hugetlb += hugetlb_pte_size(hpte); else mss->private_hugetlb += hugetlb_pte_size(hpte); + + smaps_hugetlb_hgm_account(mss, hpte); } return 0; } @@ -833,38 +873,47 @@ static void smap_gather_stats(struct vm_area_struct *vma, static void __show_smap(struct seq_file *m, const struct mem_size_stats *mss, bool rollup_mode) { - SEQ_PUT_DEC("Rss: ", mss->resident); - SEQ_PUT_DEC(" kB\nPss: ", mss->pss >> PSS_SHIFT); - SEQ_PUT_DEC(" kB\nPss_Dirty: ", mss->pss_dirty >> PSS_SHIFT); + SEQ_PUT_DEC("Rss: ", mss->resident); + SEQ_PUT_DEC(" kB\nPss: ", mss->pss >> PSS_SHIFT); + SEQ_PUT_DEC(" kB\nPss_Dirty: ", mss->pss_dirty >> PSS_SHIFT); if (rollup_mode) { /* * These are meaningful only for smaps_rollup, otherwise two of * them are zero, and the other one is the same as Pss. 
*/ - SEQ_PUT_DEC(" kB\nPss_Anon: ", + SEQ_PUT_DEC(" kB\nPss_Anon: ", mss->pss_anon >> PSS_SHIFT); - SEQ_PUT_DEC(" kB\nPss_File: ", + SEQ_PUT_DEC(" kB\nPss_File: ", mss->pss_file >> PSS_SHIFT); - SEQ_PUT_DEC(" kB\nPss_Shmem: ", + SEQ_PUT_DEC(" kB\nPss_Shmem: ", mss->pss_shmem >> PSS_SHIFT); } - SEQ_PUT_DEC(" kB\nShared_Clean: ", mss->shared_clean); - SEQ_PUT_DEC(" kB\nShared_Dirty: ", mss->shared_dirty); - SEQ_PUT_DEC(" kB\nPrivate_Clean: ", mss->private_clean); - SEQ_PUT_DEC(" kB\nPrivate_Dirty: ", mss->private_dirty); - SEQ_PUT_DEC(" kB\nReferenced: ", mss->referenced); - SEQ_PUT_DEC(" kB\nAnonymous: ", mss->anonymous); - SEQ_PUT_DEC(" kB\nLazyFree: ", mss->lazyfree); - SEQ_PUT_DEC(" kB\nAnonHugePages: ", mss->anonymous_thp); - SEQ_PUT_DEC(" kB\nShmemPmdMapped: ", mss->shmem_thp); - SEQ_PUT_DEC(" kB\nFilePmdMapped: ", mss->file_thp); - SEQ_PUT_DEC(" kB\nShared_Hugetlb: ", mss->shared_hugetlb); - seq_put_decimal_ull_width(m, " kB\nPrivate_Hugetlb: ", + SEQ_PUT_DEC(" kB\nShared_Clean: ", mss->shared_clean); + SEQ_PUT_DEC(" kB\nShared_Dirty: ", mss->shared_dirty); + SEQ_PUT_DEC(" kB\nPrivate_Clean: ", mss->private_clean); + SEQ_PUT_DEC(" kB\nPrivate_Dirty: ", mss->private_dirty); + SEQ_PUT_DEC(" kB\nReferenced: ", mss->referenced); + SEQ_PUT_DEC(" kB\nAnonymous: ", mss->anonymous); + SEQ_PUT_DEC(" kB\nLazyFree: ", mss->lazyfree); + SEQ_PUT_DEC(" kB\nAnonHugePages: ", mss->anonymous_thp); + SEQ_PUT_DEC(" kB\nShmemPmdMapped: ", mss->shmem_thp); + SEQ_PUT_DEC(" kB\nFilePmdMapped: ", mss->file_thp); + SEQ_PUT_DEC(" kB\nShared_Hugetlb: ", mss->shared_hugetlb); + seq_put_decimal_ull_width(m, " kB\nPrivate_Hugetlb: ", mss->private_hugetlb >> 10, 7); - SEQ_PUT_DEC(" kB\nSwap: ", mss->swap); - SEQ_PUT_DEC(" kB\nSwapPss: ", +#ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING +#ifndef __PAGETABLE_PUD_FOLDED + SEQ_PUT_DEC(" kB\nHugetlbPudMapped: ", mss->hugetlb_pud_mapped); +#endif +#ifndef __PAGETABLE_PMD_FOLDED + SEQ_PUT_DEC(" kB\nHugetlbPmdMapped: ", mss->hugetlb_pmd_mapped); +#endif + SEQ_PUT_DEC(" kB\nHugetlbPteMapped: ", mss->hugetlb_pte_mapped); +#endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */ + SEQ_PUT_DEC(" kB\nSwap: ", mss->swap); + SEQ_PUT_DEC(" kB\nSwapPss: ", mss->swap_pss >> PSS_SHIFT); - SEQ_PUT_DEC(" kB\nLocked: ", + SEQ_PUT_DEC(" kB\nLocked: ", mss->pss_locked >> PSS_SHIFT); seq_puts(m, " kB\n"); } @@ -880,18 +929,18 @@ static int show_smap(struct seq_file *m, void *v) show_map_vma(m, vma); - SEQ_PUT_DEC("Size: ", vma->vm_end - vma->vm_start); - SEQ_PUT_DEC(" kB\nKernelPageSize: ", vma_kernel_pagesize(vma)); - SEQ_PUT_DEC(" kB\nMMUPageSize: ", vma_mmu_pagesize(vma)); + SEQ_PUT_DEC("Size: ", vma->vm_end - vma->vm_start); + SEQ_PUT_DEC(" kB\nKernelPageSize: ", vma_kernel_pagesize(vma)); + SEQ_PUT_DEC(" kB\nMMUPageSize: ", vma_mmu_pagesize(vma)); seq_puts(m, " kB\n"); __show_smap(m, &mss, false); - seq_printf(m, "THPeligible: %d\n", + seq_printf(m, "THPeligible: %d\n", hugepage_vma_check(vma, vma->vm_flags, true, false, true)); if (arch_pkeys_enabled()) - seq_printf(m, "ProtectionKey: %8u\n", vma_pkey(vma)); + seq_printf(m, "ProtectionKey: %8u\n", vma_pkey(vma)); show_smap_vma_flags(m, vma); return 0; From patchwork Thu Jan 5 10:18:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089672 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by 
Date: Thu, 5 Jan 2023 10:18:37 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-40-jthoughton@google.com>
Subject: [PATCH 39/46] hugetlb: x86: enable high-granularity mapping
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr. David Alan Gilbert",
 "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin,
 Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 James Houghton
owner-majordomo@kvack.org List-ID:

Now that HGM is fully supported for GENERAL_HUGETLB, x86 can enable it. The x86 KVM MMU already properly handles HugeTLB HGM pages (it does a page table walk to determine which size to use in the second-stage page table instead of, for example, checking vma_mmu_pagesize, like arm64 does). We could also enable HugeTLB HGM for arm (32-bit) at this point, as it also uses GENERAL_HUGETLB and I don't see anything else that is needed for it. However, I haven't tested on arm at all, so I won't enable it.

Signed-off-by: James Houghton --- arch/x86/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 3604074a878b..3d08cd45549c 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -126,6 +126,7 @@ config X86 select ARCH_WANT_GENERAL_HUGETLB select ARCH_WANT_HUGE_PMD_SHARE select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP if X86_64 + select ARCH_WANT_HUGETLB_HIGH_GRANULARITY_MAPPING select ARCH_WANT_LD_ORPHAN_WARN select ARCH_WANTS_THP_SWAP if X86_64 select ARCH_HAS_PARANOID_L1D_FLUSH

From patchwork Thu Jan 5 10:18:38 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13089673
Date: Thu, 5 Jan 2023 10:18:38 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20230105101844.1893104-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog
Message-ID: <20230105101844.1893104-41-jthoughton@google.com>
Subject: [PATCH 40/46] docs: hugetlb: update hugetlb and userfaultfd admin-guides with HGM info
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr .
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Queue-Id: 9E3774000C X-Rspamd-Server: rspam01 X-Stat-Signature: ojfpqw4r9qnb48xjs8wgja7wroy47khc X-HE-Tag: 1672913993-23645 X-HE-Meta: U2FsdGVkX19bUmr1WytNF14q6IP7KFPZhKJHfr1ZZHze3ZNiF0+pAJlK4RkWyzZYS8ISxQj2z44R7er+4juqCriSwk76sT7Vzg2DFfWad2ne4Tqde1qtN64cXPANbC2mf9bslQZpHN/g0qgBxZTedllr92Q2G220lfbjpvyibjoBTrjijVvtB5nh/gjEOIi2GHs+3/sjK6ThOimUVlJigB8cSA1s5APVaP7XYgjsp3kTp0OLdntBKbEauM+OXkfYgata3N9XEnVfp9ty1aFBdr+SPIpvlR05mkOooexIO3/Y6xU7/qCTbto8IEucCJHHcIty81kfRRvMFFMZpWbkIuadBbFzkPmV9wY0lA9ZN1NcyCPIuNZ2OFIQOztgzSMvQgnVSmV54C38KBI1yz6WFrUHFz7fVKeDzmKjc2athWfNVl+pd7c1XLg2JQX4Ti5w7ZJO8RMQZCV2dZ8ADx07u53j05T+Z0n/xI0tyPCIUv5IHIHAFpb4BLR5iMEROesV3rxQRqiMzVng1KxDKrykmO56ZaTZlZIyl+r/ExLRywqgoPxDdh2IuLk4aMNGjl+RhV+9BX7TGaGNgnEZhcUF9yaZYP0VobnTSfd4mw41a/EZ5SJ2K+JVIhnOTZRoZ0BlEMpvBrODuJWY7KKF2/tYzfdjl48h78QzwT8/+JhIozz/pDejIg6iTV3WtNnCU6HXN9R8orWIV5iI6YfqwuYwVu0pO1Phn+HHbQQMCLfJ2RCTZn0wjIt2m3V+7MLSnkWvy5pHddHRpHh9tv1RHedptcu8v8fu/Z2j4Xulh4WrQEJv87RSLjTGO3DRPSW3v6udxEaCV2D0K0FVEGFM3Tao0podu0OLK7zQtLCElgZlIeHmoXHtcPBkC34csh/o+mPiyBybs3nOwskHhkPFAGr7BMSSvZ+HfFxSAKvpNcVixSbIqu0tgRKV3xK63DlIYC3fKys59tUgevXNyM/p3BD 9kWZXc2b YLDyejftIzXivi5+TmADDkCzNBI1QFonldhNnhwfkVGPsnJxlV7bBXaKlM9pb1zHC+DIlkXoEglxEApwmWkDygiX1Kp0sT+F/WSBgAvWGYkDi/Az2TFDU84jPh5ShkEXO/jA2atsVP/enynlRHw93DkyJ1T4N6DGIAMy7s1dsymxcIsiodOOO71S54zoHmGLeQ8cBgmZjrqHME5xf6W/+d/LUTCanSvwpuN7xGeimC1aQn5DvAB1+YaMU50jz0ei/NcuLf30c538bjBy5NJM1xOtsDVN52k1rjiCpBMtscDEpFvf89tJKgS5xbWYAp/hMNeQg4kjUF6CVDMBTZh+g7WT2xANfIBnjHTwAbEA/Ldwi34+yXyfWEPrOpl+o5+LxRBMLkuEhCiIquS4= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This includes information about how UFFD_FEATURE_MINOR_HUGETLBFS_HGM should be used and when MADV_COLLAPSE should be used with it. Signed-off-by: James Houghton --- Documentation/admin-guide/mm/hugetlbpage.rst | 4 ++++ Documentation/admin-guide/mm/userfaultfd.rst | 16 +++++++++++++++- 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst index 19f27c0d92e0..ca7db15ae768 100644 --- a/Documentation/admin-guide/mm/hugetlbpage.rst +++ b/Documentation/admin-guide/mm/hugetlbpage.rst @@ -454,6 +454,10 @@ errno set to EINVAL or exclude hugetlb pages that extend beyond the length if not hugepage aligned. For example, munmap(2) will fail if memory is backed by a hugetlb page and the length is smaller than the hugepage size. +It is possible for users to map HugeTLB pages at a higher granularity than +normal using HugeTLB high-granularity mapping (HGM). For example, when using 1G +pages on x86, a user could map that page with 4K PTEs, 2M PMDs, a combination of +the two. See Documentation/admin-guide/mm/userfaultfd.rst. Examples ======== diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst index 83f31919ebb3..19877aaad61b 100644 --- a/Documentation/admin-guide/mm/userfaultfd.rst +++ b/Documentation/admin-guide/mm/userfaultfd.rst @@ -115,6 +115,14 @@ events, except page fault notifications, may be generated: areas. ``UFFD_FEATURE_MINOR_SHMEM`` is the analogous feature indicating support for shmem virtual memory areas. 
+- ``UFFD_FEATURE_MINOR_HUGETLBFS_HGM`` indicates that the kernel supports + small-page-aligned regions for ``UFFDIO_CONTINUE`` in HugeTLB-backed + virtual memory areas. ``UFFD_FEATURE_MINOR_HUGETLBFS_HGM`` and + ``UFFD_FEATURE_EXACT_ADDRESS`` must both be specified explicitly to enable + this behavior. If ``UFFD_FEATURE_MINOR_HUGETLBFS_HGM`` is specified but + ``UFFD_FEATURE_EXACT_ADDRESS`` is not, then ``UFFDIO_API`` will fail with + ``EINVAL``. + The userland application should set the feature flags it intends to use when invoking the ``UFFDIO_API`` ioctl, to request that those features be enabled if supported. @@ -169,7 +177,13 @@ like to do to resolve it: the page cache). Userspace has the option of modifying the page's contents before resolving the fault. Once the contents are correct (modified or not), userspace asks the kernel to map the page and let the - faulting thread continue with ``UFFDIO_CONTINUE``. + faulting thread continue with ``UFFDIO_CONTINUE``. If this is done at the + base-page size in a transparent-hugepage-eligible VMA or in a HugeTLB VMA + (requires ``UFFD_FEATURE_MINOR_HUGETLBFS_HGM``), then userspace may want to + use ``MADV_COLLAPSE`` when a hugepage is fully populated to inform the kernel + that it may be able to collapse the mapping. ``MADV_COLLAPSE`` may undo + the effect of any ``UFFDIO_WRITEPROTECT`` calls on the collapsed address + range. Notes:

From patchwork Thu Jan 5 10:18:39 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13089674
Date: Thu, 5 Jan 2023 10:18:39 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20230105101844.1893104-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog
Message-ID: <20230105101844.1893104-42-jthoughton@google.com>
Subject: [PATCH 41/46] docs: proc: include information about HugeTLB HGM
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr .
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspamd-Queue-Id: AE10A4000A X-Stat-Signature: aepec7gkfedoxhmikqwdik1hhj9s8k8o X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1672913994-716103 X-HE-Meta: U2FsdGVkX1/wmVOJfjai8MKY2JuyJSV4jjyhkFvgMis1WJ+y+wk28I9H1P7rN0ljur/yyvIFSgp/tFYpYpK7uF2mExgJvWoXZie46f9wTurJjGDA08vgBp4lwikskSgsQa/oU00JX6PmGm1TLt9UTYR1JDXHaBcj3/z9OvXc5tu90xxXxSxyPKtTIi/lqkgZZ21I9y0bK9mXAS2aezs8171Q6rdivpkDzJXbAeGD/c0LuURYPHBaeza6hGn2J5+0MI7S7A3C11NjACi0Q8JmoMVBqAPzR8b37WvWU0+cLMamaA/S6xlWnngYY9/G9luOIeIqFFvMOsIE3aeQ2PO+ScEuXFVpoBrOLPIfGZaaN47OA4gffUdkdFCV4n/4nrz7X2oSWkifvtj5TCazIk8uXqaBJky01iaxKnPyG6a1L5np2o3ZaqKiEaI06y6cpp3POoCRA409I+47orLzVimCBejNWf3b+VkYLqhz9g9YKbCe0o6JwPSJ9Xr3lMciV/+unZardrcMPtZ8XhvFX153BRv31yx13Fl6+HgrFOXAwmcPZEfboJ3UMHcqAoBGAemGN7ORzvwpfD6VkbWdbJvwhbTCBOluwpKWW+D5Ibt7gD3uyRsFMF0PCBvJBCyCF0TvVQIEM6Sd/E5YktNMLsCWrlciWf4Wfx1Kv3wZHKfTJw26qhFkrXdzhVjMjNIlNgJRDfwlunJaxr5pMaYtjJQfC5GB8B87woKfRBYopYhZ3LbLPmXJSm5zAoPC7qf8u6LbNXkbbFqyo71cPyzsddZeNHphZ6L+FMGdELdfXWOX61o5p2hdQUqkcsDzZqzlULuaAaKVqGfAH0s22jopa70oUz1rSXCJN7Qh+4r4M6l0LoGnxUHMq85mRrWcZMv3ox006NDV75zP7suNpNqpCVNc/MkMFujgBIVFN9COrn1MxUgUX4eulae8bqOlzaFkyE1u9CvjJ/hN8cCLgnMrR9x S3ntqJGH bAzVbMY8e7hka4vE+IiuSB3WQj6JiXWWsQXnj1KGZKyp4E1MYRrA16Lb4RvQnFEtUWx3FYSQcO6WWKoZ25QzWAUg9Q/eWdTXm5/61TNUmZcT9i903VEXgQjYnqpxVeMFai2vcGV4YkXgPN4x8jIsrf8LwQxaQvqr9fmzwrjCbojln6j0jYh4XrbnEgvlOhNuAjFRs4q7fnS+nTP7HlhCv5uFMDD7cLV7G2M2rXno2SWYdbPMZ1AfnL/WReQ5N32djRO5PxJyrtw3qG6yAm2OL8wy0fBQ4aN1Pl3a4GFYOzaTst1WgSrJKk+69TyX9OkcmdFgCpQkDg55TXXoUB5u5fH/atfbGEZxJWAyvI7oPHQBzHr9MkkUINt60k1TCouYx/gjg1sI9FJrGiCI= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This includes the updates that have been made to smaps, specifically, the addition of Hugetlb[Pud,Pmd,Pte]Mapped. 
Signed-off-by: James Houghton --- Documentation/filesystems/proc.rst | 56 +++++++++++++++++------------- 1 file changed, 32 insertions(+), 24 deletions(-) diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst index e224b6d5b642..1fbb1310cea1 100644 --- a/Documentation/filesystems/proc.rst +++ b/Documentation/filesystems/proc.rst @@ -447,29 +447,32 @@ Memory Area, or VMA) there is a series of lines such as the following:: 08048000-080bc000 r-xp 00000000 03:02 13130 /bin/bash - Size: 1084 kB - KernelPageSize: 4 kB - MMUPageSize: 4 kB - Rss: 892 kB - Pss: 374 kB - Pss_Dirty: 0 kB - Shared_Clean: 892 kB - Shared_Dirty: 0 kB - Private_Clean: 0 kB - Private_Dirty: 0 kB - Referenced: 892 kB - Anonymous: 0 kB - LazyFree: 0 kB - AnonHugePages: 0 kB - ShmemPmdMapped: 0 kB - Shared_Hugetlb: 0 kB - Private_Hugetlb: 0 kB - Swap: 0 kB - SwapPss: 0 kB - KernelPageSize: 4 kB - MMUPageSize: 4 kB - Locked: 0 kB - THPeligible: 0 + Size: 1084 kB + KernelPageSize: 4 kB + MMUPageSize: 4 kB + Rss: 892 kB + Pss: 374 kB + Pss_Dirty: 0 kB + Shared_Clean: 892 kB + Shared_Dirty: 0 kB + Private_Clean: 0 kB + Private_Dirty: 0 kB + Referenced: 892 kB + Anonymous: 0 kB + LazyFree: 0 kB + AnonHugePages: 0 kB + ShmemPmdMapped: 0 kB + Shared_Hugetlb: 0 kB + Private_Hugetlb: 0 kB + HugetlbPudMapped: 0 kB + HugetlbPmdMapped: 0 kB + HugetlbPteMapped: 0 kB + Swap: 0 kB + SwapPss: 0 kB + KernelPageSize: 4 kB + MMUPageSize: 4 kB + Locked: 0 kB + THPeligible: 0 VmFlags: rd ex mr mw me dw The first of these lines shows the same information as is displayed for the @@ -510,10 +513,15 @@ implementation. If this is not desirable please file a bug report. "ShmemPmdMapped" shows the ammount of shared (shmem/tmpfs) memory backed by huge pages. -"Shared_Hugetlb" and "Private_Hugetlb" show the ammounts of memory backed by +"Shared_Hugetlb" and "Private_Hugetlb" show the amounts of memory backed by hugetlbfs page which is *not* counted in "RSS" or "PSS" field for historical reasons. And these are not included in {Shared,Private}_{Clean,Dirty} field. +If the kernel was compiled with ``CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING``, +"HugetlbPudMapped", "HugetlbPmdMapped", and "HugetlbPteMapped" will appear and +show the amount of HugeTLB memory mapped with PUDs, PMDs, and PTEs respectively. +See Documentation/admin-guide/mm/hugetlbpage.rst. + "Swap" shows how much would-be-anonymous memory is also used, but out on swap. 
For shmem mappings, "Swap" includes also the size of the mapped (and not From patchwork Thu Jan 5 10:18:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089675 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3BEA4C3DA7D for ; Thu, 5 Jan 2023 10:19:59 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 04709940022; Thu, 5 Jan 2023 05:19:58 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E2720940008; Thu, 5 Jan 2023 05:19:57 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C0041940022; Thu, 5 Jan 2023 05:19:57 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id ABE35940008 for ; Thu, 5 Jan 2023 05:19:57 -0500 (EST) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 77777AB30A for ; Thu, 5 Jan 2023 10:19:57 +0000 (UTC) X-FDA: 80320349634.02.C124345 Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) by imf23.hostedemail.com (Postfix) with ESMTP id DDE5E14000A for ; Thu, 5 Jan 2023 10:19:55 +0000 (UTC) Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=I8sjFLZn; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf23.hostedemail.com: domain of 3S6S2YwoKCJI5F3AG23FA92AA270.yA8749GJ-886Hwy6.AD2@flex--jthoughton.bounces.google.com designates 209.85.219.201 as permitted sender) smtp.mailfrom=3S6S2YwoKCJI5F3AG23FA92AA270.yA8749GJ-886Hwy6.AD2@flex--jthoughton.bounces.google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672913995; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=oM1fuebOiYdsKI6sxS1LfWku8C58IPteD/Sei8VpSxQ=; b=B1BSLut8G6aSKRfdjry7oB/xuLQvrOYmgYas3SYf+ygThePo6SQyzOumoPcwynE4MRAEbd vijPPRr7khZamwPr1zeIUlRPy2v+K8YSOscdtJWkqDnLOwqQ+H0mHgng62Eo62GEbdRzqa fghFWLEwSE735MiWOxKASsBn9u/mq1Y= ARC-Authentication-Results: i=1; imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=I8sjFLZn; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf23.hostedemail.com: domain of 3S6S2YwoKCJI5F3AG23FA92AA270.yA8749GJ-886Hwy6.AD2@flex--jthoughton.bounces.google.com designates 209.85.219.201 as permitted sender) smtp.mailfrom=3S6S2YwoKCJI5F3AG23FA92AA270.yA8749GJ-886Hwy6.AD2@flex--jthoughton.bounces.google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1672913995; a=rsa-sha256; cv=none; b=vcj0Cy5GwCKWFbMAge0Q8Pv5Lc734DW9rhu1TF2wxPfimoGT2PsAEm1Bh73024bin2Yp3m KzgITIyCYeLJm0yVyD+yLMXxMrmLW301QNll1BQ7aOeaWMZ2MpogKNqc12YnhMI8ovkHmJ 9qPM1Vymo2Zbd82hICPIRYIRHvcBNl4= Received: by mail-yb1-f201.google.com with SMTP id s6-20020a259006000000b00706c8bfd130so36460515ybl.11 for ; Thu, 05 Jan 2023 02:19:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to 
Date: Thu, 5 Jan 2023 10:18:40 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20230105101844.1893104-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog
Message-ID: <20230105101844.1893104-43-jthoughton@google.com>
Subject: [PATCH 42/46] selftests/vm: add HugeTLB HGM to userfaultfd selftest
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr .
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: DDE5E14000A X-Stat-Signature: 87s9mizjidt8p1z8ernpkte7cuzxd7mo X-HE-Tag: 1672913995-256110 X-HE-Meta: U2FsdGVkX19nCUCXKSlRmzL4M2TwjjLdt9/K0X9hxQdvw49d4uloN8ns0gS/u+f+8NFBh+nSd6qwv0ycGTMAo6CaBeQLkzpaWts2P5teK15FcJVfwF1KB9GRScX0UoP3j79omUCycVAkORcjNUolU22UFV5xmNVIH4WVTi56FIef9MQCpEkPCO7ERs+FCoFsmOdCouxlZj4m7ENWsB/Qnt8VPPxYIc9XB3zc0aVhRCiJETxQuKbWH1fLd61fzuN48QVuRINTtnGxMTYOGTJG2+CIRG60bpJGLs+G98a2mbXYSCI5kckfLkhfeOykH739YGmnmA+uax/4fZMtBHuvX29Yg+i1pd5FAAbXguUrRIVBZrg/3SWRqDTji1XF0oX+XfBWQxCitNxyWWMGcBEEZPkYpJ0LpGh9o4t5Cg079rDGQNHvPkAd2XpFAPMMlKh4VPv0/cdBvODJvhEuqZoY1oWSUdBp9jnGbe+VUTFMULaaonFV59wYPka2BaQP0HC5XX0eijys7GcXTP2a4qUA399WXDK7Ru+FT0G7xungMRcFpEwBn4Jm1xpdGa8qHSVRXg9nSsYjhkiUjCEBF8cZjq5b0Kduzh3unZE3WHWFgf6HemE23/yKlFlXSO8gAAEvawzAfs1JKUMig/mdHlo3tiRYI/nFOfObOGU2Y05wtLgOdF+ZOxjWhc/2xcpTTyXTy/5W4dHVB+x96uqD6IbQ2/LQe2zC2BxI/hXTY1U9dkzgtuHJkIEZ7QmMSJvp6UssoqKIRIDRceS2ifnRE14WAGuClTB65UpDHph0dsUNFLeelx3Jb0reeZICn7/ZEH4eWGpogOxbxSIIstJvUeYr0rK9A5STphojjMbI3sfQVgVOwVuejEmfTfA62JZngRIcERATnSoT5MEvBpIld9WSqd7xXqir80PgBXFALX2AErvq2QQghaXc6fZzetYb+3crU/zpA/U2NiZRCRCHawc gsDUh4uE +dwN/ymaudDtg18+xBcz08Nbt7LCY7keMJZuDKxXIuM7Wreymbvxb9BEIEcCvHu47BIecAisjn4NBVEP01uy+3Ta64eKfm1teLLYaOK43UKkGHHTdw8SJ0chcxhdlk3+HC9lVn4UkJE1jwHRKDjOd0yXwt+GIHbZfgZnZNXSSanUm151jjkynfv3YqbD42TQUN+nfnC63S26QMkXjezyUmXIPgYVTOKHsaH5Tz9LvFDroMFnUny7AnhruHeP94KCJm0FP5qGJJ3Q197daAQ3z4+/MOMF/dUziEJV6rZpHWQqVfO63p51uwi3owAPLOz0FV/LbuPWR2SZAL3M5E+W0r48Aitm5xvqlJl9xWButP9NWbv2lIApNUv/qhIJrPbhy97O1itYuBx5IB/o= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This test case behaves similarly to the regular shared HugeTLB configuration, except that it uses 4K instead of hugepages, and that we ignore the UFFDIO_COPY tests, as UFFDIO_CONTINUE is the only ioctl that supports PAGE_SIZE-aligned regions. This doesn't test MADV_COLLAPSE. Other tests are added later to exercise MADV_COLLAPSE. 
Signed-off-by: James Houghton --- tools/testing/selftests/vm/userfaultfd.c | 84 +++++++++++++++++++----- 1 file changed, 69 insertions(+), 15 deletions(-) diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c index 7f22844ed704..681c5c5f863b 100644 --- a/tools/testing/selftests/vm/userfaultfd.c +++ b/tools/testing/selftests/vm/userfaultfd.c @@ -73,9 +73,10 @@ static unsigned long nr_cpus, nr_pages, nr_pages_per_cpu, page_size, hpage_size; #define BOUNCE_POLL (1<<3) static int bounces; -#define TEST_ANON 1 -#define TEST_HUGETLB 2 -#define TEST_SHMEM 3 +#define TEST_ANON 1 +#define TEST_HUGETLB 2 +#define TEST_HUGETLB_HGM 3 +#define TEST_SHMEM 4 static int test_type; #define UFFD_FLAGS (O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY) @@ -93,6 +94,8 @@ static volatile bool test_uffdio_zeropage_eexist = true; static bool test_uffdio_wp = true; /* Whether to test uffd minor faults */ static bool test_uffdio_minor = false; +static bool test_uffdio_copy = true; + static bool map_shared; static int mem_fd; static unsigned long long *count_verify; @@ -151,7 +154,7 @@ static void usage(void) fprintf(stderr, "\nUsage: ./userfaultfd " "[hugetlbfs_file]\n\n"); fprintf(stderr, "Supported : anon, hugetlb, " - "hugetlb_shared, shmem\n\n"); + "hugetlb_shared, hugetlb_shared_hgm, shmem\n\n"); fprintf(stderr, "'Test mods' can be joined to the test type string with a ':'. " "Supported mods:\n"); fprintf(stderr, "\tsyscall - Use userfaultfd(2) (default)\n"); @@ -167,6 +170,11 @@ static void usage(void) exit(1); } +static bool test_is_hugetlb(void) +{ + return test_type == TEST_HUGETLB || test_type == TEST_HUGETLB_HGM; +} + #define _err(fmt, ...) \ do { \ int ret = errno; \ @@ -381,7 +389,7 @@ static struct uffd_test_ops *uffd_test_ops; static inline uint64_t uffd_minor_feature(void) { - if (test_type == TEST_HUGETLB && map_shared) + if (test_is_hugetlb() && map_shared) return UFFD_FEATURE_MINOR_HUGETLBFS; else if (test_type == TEST_SHMEM) return UFFD_FEATURE_MINOR_SHMEM; @@ -393,7 +401,7 @@ static uint64_t get_expected_ioctls(uint64_t mode) { uint64_t ioctls = UFFD_API_RANGE_IOCTLS; - if (test_type == TEST_HUGETLB) + if (test_is_hugetlb()) ioctls &= ~(1 << _UFFDIO_ZEROPAGE); if (!((mode & UFFDIO_REGISTER_MODE_WP) && test_uffdio_wp)) @@ -500,13 +508,16 @@ static void uffd_test_ctx_clear(void) static void uffd_test_ctx_init(uint64_t features) { unsigned long nr, cpu; + uint64_t enabled_features = features; uffd_test_ctx_clear(); uffd_test_ops->allocate_area((void **)&area_src, true); uffd_test_ops->allocate_area((void **)&area_dst, false); - userfaultfd_open(&features); + userfaultfd_open(&enabled_features); + if ((enabled_features & features) != features) + err("couldn't enable all features"); count_verify = malloc(nr_pages * sizeof(unsigned long long)); if (!count_verify) @@ -726,13 +737,16 @@ static void uffd_handle_page_fault(struct uffd_msg *msg, struct uffd_stats *stats) { unsigned long offset; + unsigned long address; if (msg->event != UFFD_EVENT_PAGEFAULT) err("unexpected msg event %u", msg->event); + address = msg->arg.pagefault.address; + if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP) { /* Write protect page faults */ - wp_range(uffd, msg->arg.pagefault.address, page_size, false); + wp_range(uffd, address, page_size, false); stats->wp_faults++; } else if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_MINOR) { uint8_t *area; @@ -751,11 +765,10 @@ static void uffd_handle_page_fault(struct uffd_msg *msg, */ area = (uint8_t *)(area_dst + - ((char 
*)msg->arg.pagefault.address - - area_dst_alias)); + ((char *)address - area_dst_alias)); for (b = 0; b < page_size; ++b) area[b] = ~area[b]; - continue_range(uffd, msg->arg.pagefault.address, page_size); + continue_range(uffd, address, page_size); stats->minor_faults++; } else { /* @@ -782,7 +795,7 @@ static void uffd_handle_page_fault(struct uffd_msg *msg, if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE) err("unexpected write fault"); - offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst; + offset = (char *)address - area_dst; offset &= ~(page_size-1); if (copy_page(uffd, offset)) @@ -1192,6 +1205,12 @@ static int userfaultfd_events_test(void) char c; struct uffd_stats stats = { 0 }; + if (!test_uffdio_copy) { + printf("Skipping userfaultfd events test " + "(test_uffdio_copy=false)\n"); + return 0; + } + printf("testing events (fork, remap, remove): "); fflush(stdout); @@ -1245,6 +1264,12 @@ static int userfaultfd_sig_test(void) char c; struct uffd_stats stats = { 0 }; + if (!test_uffdio_copy) { + printf("Skipping userfaultfd signal test " + "(test_uffdio_copy=false)\n"); + return 0; + } + printf("testing signal delivery: "); fflush(stdout); @@ -1329,6 +1354,11 @@ static int userfaultfd_minor_test(void) uffd_test_ctx_init(uffd_minor_feature()); + if (test_type == TEST_HUGETLB_HGM) + /* Enable high-granularity userfaultfd ioctls for HugeTLB */ + if (madvise(area_dst_alias, nr_pages * page_size, MADV_SPLIT)) + err("MADV_SPLIT failed"); + uffdio_register.range.start = (unsigned long)area_dst_alias; uffdio_register.range.len = nr_pages * page_size; uffdio_register.mode = UFFDIO_REGISTER_MODE_MINOR; @@ -1538,6 +1568,12 @@ static int userfaultfd_stress(void) pthread_attr_init(&attr); pthread_attr_setstacksize(&attr, 16*1024*1024); + if (!test_uffdio_copy) { + printf("Skipping userfaultfd stress test " + "(test_uffdio_copy=false)\n"); + bounces = 0; + } + while (bounces--) { printf("bounces: %d, mode:", bounces); if (bounces & BOUNCE_RANDOM) @@ -1696,6 +1732,16 @@ static void set_test_type(const char *type) uffd_test_ops = &hugetlb_uffd_test_ops; /* Minor faults require shared hugetlb; only enable here. */ test_uffdio_minor = true; + } else if (!strcmp(type, "hugetlb_shared_hgm")) { + map_shared = true; + test_type = TEST_HUGETLB_HGM; + uffd_test_ops = &hugetlb_uffd_test_ops; + /* + * HugeTLB HGM only changes UFFDIO_CONTINUE, so don't test + * UFFDIO_COPY. + */ + test_uffdio_minor = true; + test_uffdio_copy = false; } else if (!strcmp(type, "shmem")) { map_shared = true; test_type = TEST_SHMEM; @@ -1731,6 +1777,7 @@ static void parse_test_type_arg(const char *raw_type) err("Unsupported test: %s", raw_type); if (test_type == TEST_HUGETLB) + /* TEST_HUGETLB_HGM gets small pages. */ page_size = hpage_size; else page_size = sysconf(_SC_PAGE_SIZE); @@ -1813,22 +1860,29 @@ int main(int argc, char **argv) nr_cpus = x < y ? x : y; } nr_pages_per_cpu = bytes / page_size / nr_cpus; + if (test_type == TEST_HUGETLB_HGM) + /* + * `page_size` refers to the page_size we can use in + * UFFDIO_CONTINUE. We still need nr_pages to be appropriately + * aligned, so align it here. 
+ */ + nr_pages_per_cpu -= nr_pages_per_cpu % (hpage_size / page_size); if (!nr_pages_per_cpu) { _err("invalid MiB"); usage(); } + nr_pages = nr_pages_per_cpu * nr_cpus; bounces = atoi(argv[3]); if (bounces <= 0) { _err("invalid bounces"); usage(); } - nr_pages = nr_pages_per_cpu * nr_cpus; - if (test_type == TEST_SHMEM || test_type == TEST_HUGETLB) { + if (test_type == TEST_SHMEM || test_is_hugetlb()) { unsigned int memfd_flags = 0; - if (test_type == TEST_HUGETLB) + if (test_is_hugetlb()) memfd_flags = MFD_HUGETLB; mem_fd = memfd_create(argv[0], memfd_flags); if (mem_fd < 0) From patchwork Thu Jan 5 10:18:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089676 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2CECC53210 for ; Thu, 5 Jan 2023 10:20:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 54BF58E0005; Thu, 5 Jan 2023 05:19:59 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4FDD48E0002; Thu, 5 Jan 2023 05:19:59 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 39DE78E0005; Thu, 5 Jan 2023 05:19:59 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 2A32F8E0002 for ; Thu, 5 Jan 2023 05:19:59 -0500 (EST) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id F1548A0D23 for ; Thu, 5 Jan 2023 10:19:58 +0000 (UTC) X-FDA: 80320349676.24.3218045 Received: from mail-yb1-f202.google.com (mail-yb1-f202.google.com [209.85.219.202]) by imf16.hostedemail.com (Postfix) with ESMTP id 53A4E18000B for ; Thu, 5 Jan 2023 10:19:57 +0000 (UTC) Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=c7YTwst+; spf=pass (imf16.hostedemail.com: domain of 3TKS2YwoKCJM6G4BH34GBA3BB381.zB985AHK-997Ixz7.BE3@flex--jthoughton.bounces.google.com designates 209.85.219.202 as permitted sender) smtp.mailfrom=3TKS2YwoKCJM6G4BH34GBA3BB381.zB985AHK-997Ixz7.BE3@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672913997; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=EgIq+1+XdJ7ES1jkz4HOORu1ptYeWb67NT6vWcXFv0I=; b=OkE0FBZ9lDPLGlKpOvFN//EaE7uphlXkHxlK2JE81vHIF6+G708d6ViXy4AwMJfO7fMUbX kFdIEjuEpaLH09Mq33I6ZRgUnyW9RbnvED1S7WyT8aMlhhBDIk8BkgjPfvzsiSlRJBNUdx 17vUA4CUb31TbgEXxfEmDL8GgFyAm6w= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=c7YTwst+; spf=pass (imf16.hostedemail.com: domain of 3TKS2YwoKCJM6G4BH34GBA3BB381.zB985AHK-997Ixz7.BE3@flex--jthoughton.bounces.google.com designates 209.85.219.202 as permitted sender) smtp.mailfrom=3TKS2YwoKCJM6G4BH34GBA3BB381.zB985AHK-997Ixz7.BE3@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; 
t=1672913997; a=rsa-sha256; cv=none; b=w7rIkm6JPBxhrB5E0Y2I6bDWSyIpa1KQg43er3bac8cPptyRFQdTm/XrWFnFIeQo78A5R5 ZHVg3/PaEDpHPZ8FsCRwTZuOI5fb9cfTo/jVv9Ip1Kt4YEVSL6a8yAdumcLY+ZjYQOFxZc mOGsNJG7wKKDIKgneSYe5pJ6xM5ycYs= Received: by mail-yb1-f202.google.com with SMTP id l194-20020a2525cb000000b007b411fbdc13so2633140ybl.23 for ; Thu, 05 Jan 2023 02:19:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=EgIq+1+XdJ7ES1jkz4HOORu1ptYeWb67NT6vWcXFv0I=; b=c7YTwst+CXQAde/EC4BKjPkQxQslDinc3BRCYCIN4Tt+2cEbTnwk0TU3bwgaP6mee9 kEFf2wTCzpL5xUl7MHVCSLefee0s5iUkXnq/BXDX7FnSsqYa/Xt+MvBh9hZyl95WL9rl VzfwsERIDQ64g0aldVR7KIXBbKA6yB7hRzB4DLlyHtDcrEcwlH1V2Z30dYIICNHagJAF qbjJhxKV7+3WFweiPZZyzSyK5XDS2kSEUb5iNNMFi6ZL9VpJlG160jKr5BCMmceiKdCo 7Po0f1f/HHNFQPSs5wgIOpRg/gJwmw4aukeHNgtzIRiDRL7lBaiZl3zRpKSymn7/ugDK YcAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=EgIq+1+XdJ7ES1jkz4HOORu1ptYeWb67NT6vWcXFv0I=; b=i7L+6NKnsBLmZva1n8l33kX1sciIYeRsf2c8ULBAcZ0D2BHAuwTHKmAxp2FA7XEqwX dN0OFq68ryiETXgky9Uin18CpbbRN2drwiS184gpyUTPe4SRRjgnbo1DCvrvzYmAr20b YP17hsXN76jMI9UAsFGzy/rFbYbmlxoIb1cnnFUztOa1CAFst9R1Ir9pTtibdOeaM7Fb 0iE+cSdqIjt0N9wQFZBmXjDvYIfXFnV32vf8bqy89a7jqqsWuHojN+cRXtegcXM9K3c/ dsLC74t8I6YO94tr6Ej3vWNtMt87Ad0V5lZi9+rA8kvSVCOSXU+2gFecJJXsrFy4rX6j qgbQ== X-Gm-Message-State: AFqh2kpmheMyNIwQl/dJRJb9KPkPe6EcJDQu/K2avOWOgCUOSZQzcDVL npy+w8GlZoqbBbM3J7XO+Fy0JSP/PRRbEVfF X-Google-Smtp-Source: AMrXdXubDVLgqn6t3MFjyOos5j8PLQg8z932QeBEdnknXJ+cAn7ASWClpeYwjz5bduTLnJufUdF/dV8Ta+ZRR9S8 X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a81:17d6:0:b0:3ea:9ce2:cd76 with SMTP id 205-20020a8117d6000000b003ea9ce2cd76mr93735ywx.217.1672913996509; Thu, 05 Jan 2023 02:19:56 -0800 (PST) Date: Thu, 5 Jan 2023 10:18:41 +0000 In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com> Mime-Version: 1.0 References: <20230105101844.1893104-1-jthoughton@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230105101844.1893104-44-jthoughton@google.com> Subject: [PATCH 43/46] selftests/kvm: add HugeTLB HGM to KVM demand paging selftest From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . 
David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 53A4E18000B X-Stat-Signature: n4m4wzzz1jw19atdufzssmdajt7nbsnd X-HE-Tag: 1672913997-57898 X-HE-Meta: U2FsdGVkX18kSVJ+Vc/rWsWc/qWCOyhI4bcPTFo1xApLRZDNawH/dCodcum7LvUIZxmuc8lWPVh6j3qqzNTwsGxnysKKU8RnLhEr1zj/sKby8owH8TIctxVscXWL+ZvBJQ4NKZqDgUWh0juHGucsJivX5jWg/ZXDVsckHUJP3vK4RGWMPg+XHuFOFbPapHzbHPl3wCCDlDblc0GhIlQRzyBPNbZ9Priib0nPrAFedv/HXTdg7sjQi4n6IQA3u21YwjvIlNfl9+5DPvnxj8z06I7M1CEk5Iq8Nl3esWQKmXImG89KbcuD4mianWZ+2f3P7EnYAJ2L2A2btybazfF9LLE/QNSVb8rlirejpxk5mH8KKIppom6N2IKNEO3PMmfC4KEvJwKZryeePQBD4DWdM2AL7qJJHAyRN7p3NE8hj8R/03ZT6dLjqfpZk+SoWOy44ks2aCjbpjxbNS0kh/OM8lODUzeiBaRhpHZxTDw8BBX/WFruR8P3rPrvvRa+Urh4fCofSAKlk7U2/dZ6TitVd+ThcsNd14ZZMFavvSBj3E8v8hc/ocDyUnvVWQzSdC6+WoL7K/SL8b4U1pghHcRkotiz1YzMnQUY5M4crITSX09XAXUNqHVr6rbABF/mjz6SHuFJMq34/tAO33L0C91rUoGn2a8sj3x0N1vw5R2UrFxG5tHhHeNjJTR27nr5HCjviDAjqP1yjnhy/ZrPz0lrZWhMuhqkmcZej3Cajr1hktPuBS0fMY79TZ0CFJlxhO6uWT7ycq7WdJ3ZWIU71TVtFYFjIAf0XTfLDzBzNTpaMRZxOxIWDzvbMqHkJ3nDiIVzMh8l26+Ex1Jix82K5dquttvkzdAxr1ZpYp8yYA3bONCzmsNn8cTpoKX6dMjN0h5iU8jwIOg3kFv7SF2AJV25J5+cQphhUeRCZdKRA/VM9Eza+41YpgC08UntOVdNk28GoAas/d4e8IOckt5cyQe p1/mU0V6 6cNukVwgwqrPUtSNFuX0iLnDMjKaumtvtWSLq8Sktjg5NT6EPtth+8vqznob/wMxvNouX/xTUyL5ypSkEMd271lSj271w58vo0g5WftxnDTsSXbPshjEWMRwp8c+ptIB70nvMf+ivdmvOMgCqUvhJ/bPLoTjZHkPmQueHgSbxY+rPVbDQXN26CjUBROu3z87/2aIOTXhCEJLZ5tm5fOwt7/InfQXhrFyhjM3rUyj2/JR5eF+4KIKN3hmBO0N9DoB6KVWaoG/i7jX8PNSe7t0seIpKrcW5K4JE03Iq+8tbpNKiMINAV3ioz4bbdRT+DZ3jnUqxlb6H+teYUHs4an17LZR6tsGAt3f/Yv6woL184AORNgv8fiZOOefUKE2W1cqwq2CUwJEkSOGLp+cp6Qmrt5vs6Q== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This test exercises the GUP paths for HGM. MADV_COLLAPSE is not tested. 
Signed-off-by: James Houghton --- tools/testing/selftests/kvm/demand_paging_test.c | 2 +- tools/testing/selftests/kvm/include/test_util.h | 2 ++ .../selftests/kvm/include/userfaultfd_util.h | 6 +++--- tools/testing/selftests/kvm/lib/kvm_util.c | 2 +- tools/testing/selftests/kvm/lib/test_util.c | 14 ++++++++++++++ tools/testing/selftests/kvm/lib/userfaultfd_util.c | 14 +++++++++++--- 6 files changed, 32 insertions(+), 8 deletions(-) diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index b0e1fc4de9e2..e534f9c927bf 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -170,7 +170,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) uffd_descs[i] = uffd_setup_demand_paging( p->uffd_mode, p->uffd_delay, vcpu_hva, vcpu_args->pages * memstress_args.guest_page_size, - &handle_uffd_page_request); + p->src_type, &handle_uffd_page_request); } } diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h index 80d6416f3012..a2106c19a614 100644 --- a/tools/testing/selftests/kvm/include/test_util.h +++ b/tools/testing/selftests/kvm/include/test_util.h @@ -103,6 +103,7 @@ enum vm_mem_backing_src_type { VM_MEM_SRC_ANONYMOUS_HUGETLB_16GB, VM_MEM_SRC_SHMEM, VM_MEM_SRC_SHARED_HUGETLB, + VM_MEM_SRC_SHARED_HUGETLB_HGM, NUM_SRC_TYPES, }; @@ -121,6 +122,7 @@ size_t get_def_hugetlb_pagesz(void); const struct vm_mem_backing_src_alias *vm_mem_backing_src_alias(uint32_t i); size_t get_backing_src_pagesz(uint32_t i); bool is_backing_src_hugetlb(uint32_t i); +bool is_backing_src_shared_hugetlb(enum vm_mem_backing_src_type src_type); void backing_src_help(const char *flag); enum vm_mem_backing_src_type parse_backing_src_type(const char *type_name); long get_run_delay(void); diff --git a/tools/testing/selftests/kvm/include/userfaultfd_util.h b/tools/testing/selftests/kvm/include/userfaultfd_util.h index 877449c34592..d91528a58245 100644 --- a/tools/testing/selftests/kvm/include/userfaultfd_util.h +++ b/tools/testing/selftests/kvm/include/userfaultfd_util.h @@ -26,9 +26,9 @@ struct uffd_desc { pthread_t thread; }; -struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, - void *hva, uint64_t len, - uffd_handler_t handler); +struct uffd_desc *uffd_setup_demand_paging( + int uffd_mode, useconds_t delay, void *hva, uint64_t len, + enum vm_mem_backing_src_type src_type, uffd_handler_t handler); void uffd_stop_demand_paging(struct uffd_desc *uffd); diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index c88c3ace16d2..67e7223f054b 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -972,7 +972,7 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm, region->fd = -1; if (backing_src_is_shared(src_type)) region->fd = kvm_memfd_alloc(region->mmap_size, - src_type == VM_MEM_SRC_SHARED_HUGETLB); + is_backing_src_shared_hugetlb(src_type)); region->mmap_start = mmap(NULL, region->mmap_size, PROT_READ | PROT_WRITE, diff --git a/tools/testing/selftests/kvm/lib/test_util.c b/tools/testing/selftests/kvm/lib/test_util.c index 5c22fa4c2825..712a0878932e 100644 --- a/tools/testing/selftests/kvm/lib/test_util.c +++ b/tools/testing/selftests/kvm/lib/test_util.c @@ -271,6 +271,13 @@ const struct vm_mem_backing_src_alias *vm_mem_backing_src_alias(uint32_t i) */ .flag = MAP_SHARED, }, + [VM_MEM_SRC_SHARED_HUGETLB_HGM] = { + /* + * 
Identical to shared_hugetlb except for the name. + */ + .name = "shared_hugetlb_hgm", + .flag = MAP_SHARED, + }, }; _Static_assert(ARRAY_SIZE(aliases) == NUM_SRC_TYPES, "Missing new backing src types?"); @@ -289,6 +296,7 @@ size_t get_backing_src_pagesz(uint32_t i) switch (i) { case VM_MEM_SRC_ANONYMOUS: case VM_MEM_SRC_SHMEM: + case VM_MEM_SRC_SHARED_HUGETLB_HGM: return getpagesize(); case VM_MEM_SRC_ANONYMOUS_THP: return get_trans_hugepagesz(); @@ -305,6 +313,12 @@ bool is_backing_src_hugetlb(uint32_t i) return !!(vm_mem_backing_src_alias(i)->flag & MAP_HUGETLB); } +bool is_backing_src_shared_hugetlb(enum vm_mem_backing_src_type src_type) +{ + return src_type == VM_MEM_SRC_SHARED_HUGETLB || + src_type == VM_MEM_SRC_SHARED_HUGETLB_HGM; +} + static void print_available_backing_src_types(const char *prefix) { int i; diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c index 92cef20902f1..3c7178d6c4f4 100644 --- a/tools/testing/selftests/kvm/lib/userfaultfd_util.c +++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c @@ -25,6 +25,10 @@ #ifdef __NR_userfaultfd +#ifndef MADV_SPLIT +#define MADV_SPLIT 26 +#endif + static void *uffd_handler_thread_fn(void *arg) { struct uffd_desc *uffd_desc = (struct uffd_desc *)arg; @@ -108,9 +112,9 @@ static void *uffd_handler_thread_fn(void *arg) return NULL; } -struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, - void *hva, uint64_t len, - uffd_handler_t handler) +struct uffd_desc *uffd_setup_demand_paging( + int uffd_mode, useconds_t delay, void *hva, uint64_t len, + enum vm_mem_backing_src_type src_type, uffd_handler_t handler) { struct uffd_desc *uffd_desc; bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR); @@ -140,6 +144,10 @@ struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, "ioctl UFFDIO_API failed: %" PRIu64, (uint64_t)uffdio_api.api); + if (src_type == VM_MEM_SRC_SHARED_HUGETLB_HGM) + TEST_ASSERT(!madvise(hva, len, MADV_SPLIT), + "Could not enable HGM"); + uffdio_register.range.start = (uint64_t)hva; uffdio_register.range.len = len; uffdio_register.mode = uffd_mode; From patchwork Thu Jan 5 10:18:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13089677 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C1ECC3DA7A for ; Thu, 5 Jan 2023 10:20:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1C956940023; Thu, 5 Jan 2023 05:20:02 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 178F6940008; Thu, 5 Jan 2023 05:20:02 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F3577940023; Thu, 5 Jan 2023 05:20:01 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id D8352940008 for ; Thu, 5 Jan 2023 05:20:01 -0500 (EST) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 8DB031A0CF4 for ; Thu, 5 Jan 2023 10:20:01 +0000 (UTC) X-FDA: 80320349802.14.1C18BA0 Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) by imf22.hostedemail.com (Postfix) 
with ESMTP id 04F94C0007 for ; Thu, 5 Jan 2023 10:19:59 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=kI4ZTRQe; spf=pass (imf22.hostedemail.com: domain of 3T6S2YwoKCJY9J7EK67JED6EE6B4.2ECB8DKN-CCAL02A.EH6@flex--jthoughton.bounces.google.com designates 209.85.128.202 as permitted sender) smtp.mailfrom=3T6S2YwoKCJY9J7EK67JED6EE6B4.2ECB8DKN-CCAL02A.EH6@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672914000; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=rPcVmryUH+86L7TuxpN8GfuTRNIB++IYGz7ii/fh+60=; b=xawIUj2JhQBhRYXDSh6gs0hGrua2qk5chCnwqa1WLjX1c1LJw8//wshyFzAtiaduEp/ysS u9fsAIhR2ym7wWmeXeKUFJM3KgD9nifnra6PJHanJmYecxrC4RaKzreF/OPevzHYdqVZK0 +9KhcPZjNYoUG36EsNxpmNFxPDzzFVM= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=kI4ZTRQe; spf=pass (imf22.hostedemail.com: domain of 3T6S2YwoKCJY9J7EK67JED6EE6B4.2ECB8DKN-CCAL02A.EH6@flex--jthoughton.bounces.google.com designates 209.85.128.202 as permitted sender) smtp.mailfrom=3T6S2YwoKCJY9J7EK67JED6EE6B4.2ECB8DKN-CCAL02A.EH6@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1672914000; a=rsa-sha256; cv=none; b=asL/6YwuQJlCfBH72p7Sl6qbK+XtGa0ebaCQq7u7nQkKBLJ8qjW6siNRl3OyUHVoblOob5 32g59w+gWM5T4D2n0QoKsoS2dSCAGtBMCBOJaCvgY0M1OuFE0B53aeE3kguHkjIJ4M5zOS KdHd7PksWPtPGF5k7X/Tgl/qGDyV8MY= Received: by mail-yw1-f202.google.com with SMTP id 00721157ae682-460ab8a327eso378506727b3.23 for ; Thu, 05 Jan 2023 02:19:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=rPcVmryUH+86L7TuxpN8GfuTRNIB++IYGz7ii/fh+60=; b=kI4ZTRQe7Q0DNwU8Nd2EZKZhgCILp8F6DoLJQBQIeMmVpBPgy2nY9fRh+7WBj3mKcv b3mHhtPqHKj27maMD9zcK0xNFg7ZH3P+7Zrl4zxFnCf7G/kUV7ygs9iJH6H3DPnwY+7v pEVpy5jqUM6S9+jlWu94uWgwsLLyquWN42I/U3Mye5rfAyPIaLO+CcmOU47oPaODXHBZ g6tTWmWkSREPHJl4j7JvfcfZmIeSW+FeLQ5bKfcMOb/+OzwicxRoFD6mPiFPk365Ae1D C7W9a4QJVUB1YYOH+GqrZqYY061aZQSPOGcDY51qN61zXe8gTbgIaiQswond5R+j607U +9Iw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=rPcVmryUH+86L7TuxpN8GfuTRNIB++IYGz7ii/fh+60=; b=adr7C3hIw4lz3/mmZKwhIAtKhGALJr/Y3+R0/2Lv7bOihqtAIN/vnksJi0MZagv2WM vT2hZv53Wh3NmjN896cedoJl2+z/8Fs+K79nHLfvKuWkxGgkXudUOfMwitfHJItVnQMj Bw217rFEd45S/Ep0G3+75jr9ClEtWRJwcMl8jWVdctIX+Wh8ARsnrC2/wT5snn1KDSqb UgE0YhoBg1jGToENdQFB8IowgkH4S4gKT+A07vLZahKty332TtThF4D4QIAaX6/nVIJR 70y3AHWUBZS5CybjZqDlV85yW1ziHyqc5I8NqklPOhGIyask/8txX7dF700/CjSPdk6g 1jbw== X-Gm-Message-State: AFqh2kqmkUWLQqeYeS+rAfL2JxnK+cKl+J3YKzt0pTZ0+d5rOr5NC8R0 k7qwVmeMuotdoyKicYkKYm48lA5qO+dgWBSo X-Google-Smtp-Source: AMrXdXtTdSV2u8sC9W2WJTx+X9U5bRZRCXSHwwKpc/aZe4u4qkoBDtjjb5Moax2r1vBZGfpJQ/5EU4hKV8ZnomEr X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a25:7c81:0:b0:727:e539:452f with SMTP id 
x123-20020a257c81000000b00727e539452fmr5333597ybc.552.1672913999168; Thu, 05 Jan 2023 02:19:59 -0800 (PST)
Date: Thu, 5 Jan 2023 10:18:42 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20230105101844.1893104-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog
Message-ID: <20230105101844.1893104-45-jthoughton@google.com>
Subject: [PATCH 44/46] selftests/vm: add anon and shared hugetlb to migration test
From: James Houghton
To: Mike Kravetz , Muchun Song , Peter Xu
Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

Shared HugeTLB mappings are migrated best-effort. Sometimes, when the VMA lock cannot be taken for writing, migration may transiently fail. To account for that, we allow retries.
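For illustration only, not part of the patch: a sketch of the move_pages(2) call that the test wraps in its retry loop, assuming libnuma's wrapper and MPOL_MF_MOVE_ALL as the test already uses. The hypothetical helper migrate_with_retries() just shows why retrying makes sense: a positive return (pages not moved) or a -1 failure can be transient for shared HugeTLB.

#include <stdio.h>
#include <numaif.h>	/* move_pages(), MPOL_MF_MOVE_ALL; link with -lnuma */

/* Try to move the page backing 'ptr' to 'node', retrying a few times. */
static int migrate_with_retries(void *ptr, int node, int retries)
{
	int status = 0;

	while (retries--) {
		long ret = move_pages(0 /* self */, 1, &ptr, &node, &status,
				      MPOL_MF_MOVE_ALL);

		if (ret == 0)
			return 0;	/* migrated; 'status' holds the new node */
		if (ret > 0)
			fprintf(stderr, "didn't migrate %ld pages\n", ret);
		else
			perror("move_pages");
	}
	return -1;
}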
Signed-off-by: James Houghton --- tools/testing/selftests/vm/migration.c | 83 ++++++++++++++++++++++++-- 1 file changed, 79 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/vm/migration.c b/tools/testing/selftests/vm/migration.c index 1cec8425e3ca..21577a84d7e4 100644 --- a/tools/testing/selftests/vm/migration.c +++ b/tools/testing/selftests/vm/migration.c @@ -13,6 +13,7 @@ #include #include #include +#include #define TWOMEG (2<<20) #define RUNTIME (60) @@ -59,11 +60,12 @@ FIXTURE_TEARDOWN(migration) free(self->pids); } -int migrate(uint64_t *ptr, int n1, int n2) +int migrate(uint64_t *ptr, int n1, int n2, int retries) { int ret, tmp; int status = 0; struct timespec ts1, ts2; + int failed = 0; if (clock_gettime(CLOCK_MONOTONIC, &ts1)) return -1; @@ -78,6 +80,9 @@ int migrate(uint64_t *ptr, int n1, int n2) ret = move_pages(0, 1, (void **) &ptr, &n2, &status, MPOL_MF_MOVE_ALL); if (ret) { + if (++failed < retries) + continue; + if (ret > 0) printf("Didn't migrate %d pages\n", ret); else @@ -88,6 +93,7 @@ int migrate(uint64_t *ptr, int n1, int n2) tmp = n2; n2 = n1; n1 = tmp; + failed = 0; } return 0; @@ -128,7 +134,7 @@ TEST_F_TIMEOUT(migration, private_anon, 2*RUNTIME) if (pthread_create(&self->threads[i], NULL, access_mem, ptr)) perror("Couldn't create thread"); - ASSERT_EQ(migrate(ptr, self->n1, self->n2), 0); + ASSERT_EQ(migrate(ptr, self->n1, self->n2, 1), 0); for (i = 0; i < self->nthreads - 1; i++) ASSERT_EQ(pthread_cancel(self->threads[i]), 0); } @@ -158,7 +164,7 @@ TEST_F_TIMEOUT(migration, shared_anon, 2*RUNTIME) self->pids[i] = pid; } - ASSERT_EQ(migrate(ptr, self->n1, self->n2), 0); + ASSERT_EQ(migrate(ptr, self->n1, self->n2, 1), 0); for (i = 0; i < self->nthreads - 1; i++) ASSERT_EQ(kill(self->pids[i], SIGTERM), 0); } @@ -185,9 +191,78 @@ TEST_F_TIMEOUT(migration, private_anon_thp, 2*RUNTIME) if (pthread_create(&self->threads[i], NULL, access_mem, ptr)) perror("Couldn't create thread"); - ASSERT_EQ(migrate(ptr, self->n1, self->n2), 0); + ASSERT_EQ(migrate(ptr, self->n1, self->n2, 1), 0); + for (i = 0; i < self->nthreads - 1; i++) + ASSERT_EQ(pthread_cancel(self->threads[i]), 0); +} + +/* + * Tests the anon hugetlb migration entry paths. + */ +TEST_F_TIMEOUT(migration, private_anon_hugetlb, 2*RUNTIME) +{ + uint64_t *ptr; + int i; + + if (self->nthreads < 2 || self->n1 < 0 || self->n2 < 0) + SKIP(return, "Not enough threads or NUMA nodes available"); + + ptr = mmap(NULL, TWOMEG, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0); + if (ptr == MAP_FAILED) + SKIP(return, "Could not allocate hugetlb pages"); + + memset(ptr, 0xde, TWOMEG); + for (i = 0; i < self->nthreads - 1; i++) + if (pthread_create(&self->threads[i], NULL, access_mem, ptr)) + perror("Couldn't create thread"); + + ASSERT_EQ(migrate(ptr, self->n1, self->n2, 1), 0); for (i = 0; i < self->nthreads - 1; i++) ASSERT_EQ(pthread_cancel(self->threads[i]), 0); } +/* + * Tests the shared hugetlb migration entry paths. 
+ */ +TEST_F_TIMEOUT(migration, shared_hugetlb, 2*RUNTIME) +{ + uint64_t *ptr; + int i; + int fd; + unsigned long sz; + struct statfs filestat; + + if (self->nthreads < 2 || self->n1 < 0 || self->n2 < 0) + SKIP(return, "Not enough threads or NUMA nodes available"); + + fd = memfd_create("tmp_hugetlb", MFD_HUGETLB); + if (fd < 0) + SKIP(return, "Couldn't create hugetlb memfd"); + + if (fstatfs(fd, &filestat) < 0) + SKIP(return, "Couldn't fstatfs hugetlb file"); + + sz = filestat.f_bsize; + + if (ftruncate(fd, sz)) + SKIP(return, "Couldn't allocate hugetlb pages"); + ptr = mmap(NULL, sz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if (ptr == MAP_FAILED) + SKIP(return, "Could not map hugetlb pages"); + + memset(ptr, 0xde, sz); + for (i = 0; i < self->nthreads - 1; i++) + if (pthread_create(&self->threads[i], NULL, access_mem, ptr)) + perror("Couldn't create thread"); + + ASSERT_EQ(migrate(ptr, self->n1, self->n2, 10), 0); + for (i = 0; i < self->nthreads - 1; i++) { + ASSERT_EQ(pthread_cancel(self->threads[i]), 0); + pthread_join(self->threads[i], NULL); + } + ftruncate(fd, 0); + close(fd); +} + TEST_HARNESS_MAIN From patchwork Thu Jan 5 10:18:43 2023
Date: Thu, 5 Jan 2023 10:18:43 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-46-jthoughton@google.com>
Subject: [PATCH 45/46] selftests/vm: add hugetlb HGM test to migration selftest
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
This is mostly the same as the shared HugeTLB case, but instead of mapping the hugepage with a single regular page fault, we map it with many small UFFDIO_CONTINUE operations. We also verify that the contents are unchanged after the migration; they would differ if the post-migration PTEs pointed to the wrong page.
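To make that concrete, here is a condensed sketch of the flow the new map_at_high_granularity() helper follows. It is illustrative, not the patch itself: MADV_SPLIT comes from earlier in this series (the value 26 is assumed) and 4K-granularity UFFDIO_CONTINUE on hugetlb only works with HGM applied; the caller is assumed to have already populated the hugetlb file (e.g. with fallocate()) and mapped it at mem, and the function name continue_in_4k_chunks is made up. Error handling is trimmed.

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <linux/userfaultfd.h>

#ifndef MADV_SPLIT
#define MADV_SPLIT 26	/* assumed value; introduced by this series */
#endif

/* Map 'len' bytes of an already-populated hugetlb mapping 4K at a time. */
static int continue_in_4k_chunks(char *mem, size_t len)
{
	int uffd = syscall(__NR_userfaultfd, O_CLOEXEC);
	struct uffdio_api api = { .api = UFFD_API };
	struct uffdio_register reg = {
		.range = { .start = (unsigned long)mem, .len = len },
		.mode = UFFDIO_REGISTER_MODE_MISSING |
			UFFDIO_REGISTER_MODE_MINOR,
	};
	size_t off, pagesize = getpagesize();
	int ret = -1;

	if (uffd < 0)
		return -1;
	if (ioctl(uffd, UFFDIO_API, &api))
		goto out;
	/* Ask for PAGE_SIZE-granularity mappings on this hugetlb VMA. */
	if (madvise(mem, len, MADV_SPLIT))
		goto out;
	if (ioctl(uffd, UFFDIO_REGISTER, &reg))
		goto out;

	/* Install PTEs for each 4K slice of the hugepage(s). */
	for (off = 0; off < len; off += pagesize) {
		struct uffdio_continue cont = {
			.range = { .start = (unsigned long)mem + off,
				   .len = pagesize },
		};

		if (ioctl(uffd, UFFDIO_CONTINUE, &cont))
			goto out;
	}
	ret = 0;
out:
	close(uffd);
	return ret;
}

Registering MINOR (rather than relying on MISSING faults) is what makes UFFDIO_CONTINUE meaningful here: the page cache already holds the data, and each ioctl only installs page-table entries for one 4K slice of the hugepage.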
Signed-off-by: James Houghton --- tools/testing/selftests/vm/migration.c | 146 +++++++++++++++++++++++++ 1 file changed, 146 insertions(+) diff --git a/tools/testing/selftests/vm/migration.c b/tools/testing/selftests/vm/migration.c index 21577a84d7e4..1fb3607accab 100644 --- a/tools/testing/selftests/vm/migration.c +++ b/tools/testing/selftests/vm/migration.c @@ -14,12 +14,21 @@ #include #include #include +#include +#include +#include +#include +#include #define TWOMEG (2<<20) #define RUNTIME (60) #define ALIGN(x, a) (((x) + (a - 1)) & (~((a) - 1))) +#ifndef MADV_SPLIT +#define MADV_SPLIT 26 +#endif + FIXTURE(migration) { pthread_t *threads; @@ -265,4 +274,141 @@ TEST_F_TIMEOUT(migration, shared_hugetlb, 2*RUNTIME) close(fd); } +#ifdef __NR_userfaultfd +static int map_at_high_granularity(char *mem, size_t length) +{ + int i; + int ret; + int uffd = syscall(__NR_userfaultfd, 0); + struct uffdio_api api; + struct uffdio_register reg; + int pagesize = getpagesize(); + + if (uffd < 0) { + perror("couldn't create uffd"); + return uffd; + } + + api.api = UFFD_API; + api.features = 0; + + ret = ioctl(uffd, UFFDIO_API, &api); + if (ret || api.api != UFFD_API) { + perror("UFFDIO_API failed"); + goto out; + } + + if (madvise(mem, length, MADV_SPLIT) == -1) { + perror("MADV_SPLIT failed"); + goto out; + } + + reg.range.start = (unsigned long)mem; + reg.range.len = length; + + reg.mode = UFFDIO_REGISTER_MODE_MISSING | UFFDIO_REGISTER_MODE_MINOR; + + ret = ioctl(uffd, UFFDIO_REGISTER, ®); + if (ret) { + perror("UFFDIO_REGISTER failed"); + goto out; + } + + /* UFFDIO_CONTINUE each 4K segment of the 2M page. */ + for (i = 0; i < length/pagesize; ++i) { + struct uffdio_continue cont; + + cont.range.start = (unsigned long long)mem + i * pagesize; + cont.range.len = pagesize; + cont.mode = 0; + ret = ioctl(uffd, UFFDIO_CONTINUE, &cont); + if (ret) { + fprintf(stderr, "UFFDIO_CONTINUE failed " + "for %llx -> %llx: %d\n", + cont.range.start, + cont.range.start + cont.range.len, + errno); + goto out; + } + } + ret = 0; +out: + close(uffd); + return ret; +} +#else +static int map_at_high_granularity(char *mem, size_t length) +{ + fprintf(stderr, "Userfaultfd missing\n"); + return -1; +} +#endif /* __NR_userfaultfd */ + +/* + * Tests the high-granularity hugetlb migration entry paths. + */ +TEST_F_TIMEOUT(migration, shared_hugetlb_hgm, 2*RUNTIME) +{ + uint64_t *ptr; + int i; + int fd; + unsigned long sz; + struct statfs filestat; + + if (self->nthreads < 2 || self->n1 < 0 || self->n2 < 0) + SKIP(return, "Not enough threads or NUMA nodes available"); + + fd = memfd_create("tmp_hugetlb", MFD_HUGETLB); + if (fd < 0) + SKIP(return, "Couldn't create hugetlb memfd"); + + if (fstatfs(fd, &filestat) < 0) + SKIP(return, "Couldn't fstatfs hugetlb file"); + + sz = filestat.f_bsize; + + if (ftruncate(fd, sz)) + SKIP(return, "Couldn't allocate hugetlb pages"); + + if (fallocate(fd, 0, 0, sz) < 0) { + perror("fallocate failed"); + SKIP(return, "fallocate failed"); + } + + ptr = mmap(NULL, sz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if (ptr == MAP_FAILED) + SKIP(return, "Could not allocate hugetlb pages"); + + /* + * We have to map_at_high_granularity before we memset, otherwise + * memset will map everything at the hugepage size. + */ + if (map_at_high_granularity((char *)ptr, sz) < 0) + SKIP(return, "Could not map HugeTLB range at high granularity"); + + /* Populate the page we're migrating. 
*/ + for (i = 0; i < sz/sizeof(*ptr); ++i) + ptr[i] = i; + + for (i = 0; i < self->nthreads - 1; i++) + if (pthread_create(&self->threads[i], NULL, access_mem, ptr)) + perror("Couldn't create thread"); + + ASSERT_EQ(migrate(ptr, self->n1, self->n2, 10), 0); + for (i = 0; i < self->nthreads - 1; i++) { + ASSERT_EQ(pthread_cancel(self->threads[i]), 0); + pthread_join(self->threads[i], NULL); + } + + /* Check that the contents didn't change. */ + for (i = 0; i < sz/sizeof(*ptr); ++i) { + ASSERT_EQ(ptr[i], i); + if (ptr[i] != i) + break; + } + + ftruncate(fd, 0); + close(fd); +} + TEST_HARNESS_MAIN From patchwork Thu Jan 5 10:18:44 2023
Date: Thu, 5 Jan 2023 10:18:44 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
References: <20230105101844.1893104-1-jthoughton@google.com>
Message-ID: <20230105101844.1893104-47-jthoughton@google.com>
Subject: [PATCH 46/46] selftests/vm: add HGM UFFDIO_CONTINUE and hwpoison tests
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, Zach O'Keefe, Manish Mishra, Naoya Horiguchi, Dr. David Alan Gilbert, Matthew Wilcox (Oracle), Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
This tests that high-granularity CONTINUEs at all sizes work (exercising contiguous PTE sizes for arm64, when support is added). It also tests that collapse and hwpoison behave correctly (although we aren't yet testing high-granularity poison). The test uses UFFD_FEATURE_EVENT_FORK + UFFDIO_REGISTER_MODE_WP to force the kernel to copy page tables on fork(), exercising the changes to copy_hugetlb_page_range(). A condensed, illustrative sketch of that fork-event handling appears after the end of the new test file below.

Signed-off-by: James Houghton --- tools/testing/selftests/vm/Makefile | 1 + tools/testing/selftests/vm/hugetlb-hgm.c | 455 +++++++++++++++++++++++ 2 files changed, 456 insertions(+) create mode 100644 tools/testing/selftests/vm/hugetlb-hgm.c diff --git a/tools/testing/selftests/vm/Makefile b/tools/testing/selftests/vm/Makefile index 89c14e41bd43..4aa4ca75a471 100644 --- a/tools/testing/selftests/vm/Makefile +++ b/tools/testing/selftests/vm/Makefile @@ -32,6 +32,7 @@ TEST_GEN_FILES += compaction_test TEST_GEN_FILES += gup_test TEST_GEN_FILES += hmm-tests TEST_GEN_FILES += hugetlb-madvise +TEST_GEN_FILES += hugetlb-hgm TEST_GEN_FILES += hugepage-mmap TEST_GEN_FILES += hugepage-mremap TEST_GEN_FILES += hugepage-shm diff --git a/tools/testing/selftests/vm/hugetlb-hgm.c b/tools/testing/selftests/vm/hugetlb-hgm.c new file mode 100644 index 000000000000..616bc40164bf --- /dev/null +++ b/tools/testing/selftests/vm/hugetlb-hgm.c @@ -0,0 +1,455 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Test uncommon cases in HugeTLB high-granularity mapping: + * 1. Test all supported high-granularity page sizes (with MADV_COLLAPSE). + * 2. Test MADV_HWPOISON behavior. 
+ */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define PAGE_MASK ~(4096 - 1) + +#ifndef MADV_COLLAPSE +#define MADV_COLLAPSE 25 +#endif + +#ifndef MADV_SPLIT +#define MADV_SPLIT 26 +#endif + +#define PREFIX " ... " +#define ERROR_PREFIX " !!! " + +enum test_status { + TEST_PASSED = 0, + TEST_FAILED = 1, + TEST_SKIPPED = 2, +}; + +static char *status_to_str(enum test_status status) +{ + switch (status) { + case TEST_PASSED: + return "TEST_PASSED"; + case TEST_FAILED: + return "TEST_FAILED"; + case TEST_SKIPPED: + return "TEST_SKIPPED"; + default: + return "TEST_???"; + } +} + +int userfaultfd(int flags) +{ + return syscall(__NR_userfaultfd, flags); +} + +int map_range(int uffd, char *addr, uint64_t length) +{ + struct uffdio_continue cont = { + .range = (struct uffdio_range) { + .start = (uint64_t)addr, + .len = length, + }, + .mode = 0, + .mapped = 0, + }; + + if (ioctl(uffd, UFFDIO_CONTINUE, &cont) < 0) { + perror(ERROR_PREFIX "UFFDIO_CONTINUE failed"); + return -1; + } + return 0; +} + +int check_equal(char *mapping, size_t length, char value) +{ + size_t i; + + for (i = 0; i < length; ++i) + if (mapping[i] != value) { + printf(ERROR_PREFIX "mismatch at %p (%d != %d)\n", + &mapping[i], mapping[i], value); + return -1; + } + + return 0; +} + +int test_continues(int uffd, char *primary_map, char *secondary_map, size_t len, + bool verify) +{ + size_t offset = 0; + unsigned char iter = 0; + unsigned long pagesize = getpagesize(); + uint64_t size; + + for (size = len/2; size >= pagesize; + offset += size, size /= 2) { + iter++; + memset(secondary_map + offset, iter, size); + printf(PREFIX "UFFDIO_CONTINUE: %p -> %p = %d%s\n", + primary_map + offset, + primary_map + offset + size, + iter, + verify ? " (and verify)" : ""); + if (map_range(uffd, primary_map + offset, size)) + return -1; + if (verify && check_equal(primary_map + offset, size, iter)) + return -1; + } + return 0; +} + +int verify_contents(char *map, size_t len, bool last_4k_zero) +{ + size_t offset = 0; + int i = 0; + uint64_t size; + + for (size = len/2; size > 4096; offset += size, size /= 2) + if (check_equal(map + offset, size, ++i)) + return -1; + + if (last_4k_zero) + /* expect the last 4K to be zero. */ + if (check_equal(map + len - 4096, 4096, 0)) + return -1; + + return 0; +} + +int test_collapse(char *primary_map, size_t len, bool hwpoison) +{ + printf(PREFIX "collapsing %p -> %p\n", primary_map, primary_map + len); + if (madvise(primary_map, len, MADV_COLLAPSE) < 0) { + if (errno == EHWPOISON && hwpoison) { + /* this is expected for the hwpoison test. 
*/ + printf(PREFIX "could not collapse due to poison\n"); + return 0; + } + perror(ERROR_PREFIX "collapse failed"); + return -1; + } + + printf(PREFIX "verifying %p -> %p\n", primary_map, primary_map + len); + return verify_contents(primary_map, len, true); +} + +static void *sigbus_addr; +bool was_mceerr; +bool got_sigbus; + +void sigbus_handler(int signo, siginfo_t *info, void *context) +{ + got_sigbus = true; + was_mceerr = info->si_code == BUS_MCEERR_AR; + sigbus_addr = info->si_addr; + + pthread_exit(NULL); +} + +void *access_mem(void *addr) +{ + volatile char *ptr = addr; + + *ptr; + return NULL; +} + +int test_sigbus(char *addr, bool poison) +{ + int ret = 0; + pthread_t pthread; + + sigbus_addr = (void *)0xBADBADBAD; + was_mceerr = false; + got_sigbus = false; + ret = pthread_create(&pthread, NULL, &access_mem, addr); + if (ret) { + printf(ERROR_PREFIX "failed to create thread: %s\n", + strerror(ret)); + return ret; + } + + pthread_join(pthread, NULL); + if (!got_sigbus) { + printf(ERROR_PREFIX "didn't get a SIGBUS\n"); + return -1; + } else if (sigbus_addr != addr) { + printf(ERROR_PREFIX "got incorrect sigbus address: %p vs %p\n", + sigbus_addr, addr); + return -1; + } else if (poison && !was_mceerr) { + printf(ERROR_PREFIX "didn't get an MCEERR?\n"); + return -1; + } + return 0; +} + +void *read_from_uffd_thd(void *arg) +{ + int uffd = *(int *)arg; + struct uffd_msg msg; + /* opened without O_NONBLOCK */ + if (read(uffd, &msg, sizeof(msg)) != sizeof(msg)) + printf(ERROR_PREFIX "reading uffd failed\n"); + + return NULL; +} + +int read_event_from_uffd(int *uffd, pthread_t *pthread) +{ + int ret = 0; + + ret = pthread_create(pthread, NULL, &read_from_uffd_thd, (void *)uffd); + if (ret) { + printf(ERROR_PREFIX "failed to create thread: %s\n", + strerror(ret)); + return ret; + } + return 0; +} + +enum test_status test_hwpoison(char *primary_map, size_t len) +{ + const unsigned long pagesize = getpagesize(); + const int num_poison_checks = 512; + unsigned long bytes_per_check = len/num_poison_checks; + int i; + + printf(PREFIX "poisoning %p -> %p\n", primary_map, primary_map + len); + if (madvise(primary_map, len, MADV_HWPOISON) < 0) { + perror(ERROR_PREFIX "MADV_HWPOISON failed"); + return TEST_SKIPPED; + } + + printf(PREFIX "checking that it was poisoned " + "(%d addresses within %p -> %p)\n", + num_poison_checks, primary_map, primary_map + len); + + if (pagesize > bytes_per_check) + bytes_per_check = pagesize; + + for (i = 0; i < len; i += bytes_per_check) + if (test_sigbus(primary_map + i, true) < 0) + return TEST_FAILED; + /* check very last byte, because we left it unmapped */ + if (test_sigbus(primary_map + len - 1, true)) + return TEST_FAILED; + + return TEST_PASSED; +} + +int test_fork(int uffd, char *primary_map, size_t len) +{ + int status; + int ret = 0; + pid_t pid; + pthread_t uffd_thd; + + /* + * UFFD_FEATURE_EVENT_FORK will put fork event on the userfaultfd, + * which we must read, otherwise we block fork(). Setup a thread to + * read that event now. + * + * Page fault events should result in a SIGBUS, so we expect only a + * single event from the uffd (the fork event). + */ + if (read_event_from_uffd(&uffd, &uffd_thd)) + return -1; + + pid = fork(); + + if (!pid) { + /* + * Because we have UFFDIO_REGISTER_MODE_WP and + * UFFD_FEATURE_EVENT_FORK, the page tables should be copied + * exactly. + * + * Check that everything except that last 4K has correct + * contents, and then check that the last 4K gets a SIGBUS. 
+ */ + printf(PREFIX "child validating...\n"); + ret = verify_contents(primary_map, len, false) || + test_sigbus(primary_map + len - 1, false); + ret = 0; + exit(ret ? 1 : 0); + } else { + /* wait for the child to finish. */ + waitpid(pid, &status, 0); + ret = WEXITSTATUS(status); + if (!ret) { + printf(PREFIX "parent validating...\n"); + /* Same check as the child. */ + ret = verify_contents(primary_map, len, false) || + test_sigbus(primary_map + len - 1, false); + ret = 0; + } + } + + pthread_join(uffd_thd, NULL); + return ret; + +} + +enum test_status +test_hgm(int fd, size_t hugepagesize, size_t len, bool hwpoison) +{ + int uffd; + char *primary_map, *secondary_map; + struct uffdio_api api; + struct uffdio_register reg; + struct sigaction new, old; + enum test_status status = TEST_SKIPPED; + + if (ftruncate(fd, len) < 0) { + perror(ERROR_PREFIX "ftruncate failed"); + return status; + } + + uffd = userfaultfd(O_CLOEXEC); + if (uffd < 0) { + perror(ERROR_PREFIX "uffd not created"); + return status; + } + + primary_map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if (primary_map == MAP_FAILED) { + perror(ERROR_PREFIX "mmap for primary mapping failed"); + goto close_uffd; + } + secondary_map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if (secondary_map == MAP_FAILED) { + perror(ERROR_PREFIX "mmap for secondary mapping failed"); + goto unmap_primary; + } + + printf(PREFIX "primary mapping: %p\n", primary_map); + printf(PREFIX "secondary mapping: %p\n", secondary_map); + + api.api = UFFD_API; + api.features = UFFD_FEATURE_SIGBUS | UFFD_FEATURE_EXACT_ADDRESS | + UFFD_FEATURE_EVENT_FORK; + if (ioctl(uffd, UFFDIO_API, &api) == -1) { + perror(ERROR_PREFIX "UFFDIO_API failed"); + goto out; + } + + if (madvise(primary_map, len, MADV_SPLIT)) { + perror(ERROR_PREFIX "MADV_SPLIT failed"); + goto out; + } + + reg.range.start = (unsigned long)primary_map; + reg.range.len = len; + /* + * Register with UFFDIO_REGISTER_MODE_WP to force fork() to copy page + * tables (also need UFFD_FEATURE_EVENT_FORK, which we have). + */ + reg.mode = UFFDIO_REGISTER_MODE_MINOR | UFFDIO_REGISTER_MODE_MISSING | + UFFDIO_REGISTER_MODE_WP; + reg.ioctls = 0; + if (ioctl(uffd, UFFDIO_REGISTER, ®) == -1) { + perror(ERROR_PREFIX "register failed"); + goto out; + } + + new.sa_sigaction = &sigbus_handler; + new.sa_flags = SA_SIGINFO; + if (sigaction(SIGBUS, &new, &old) < 0) { + perror(ERROR_PREFIX "could not setup SIGBUS handler"); + goto out; + } + + status = TEST_FAILED; + + if (test_continues(uffd, primary_map, secondary_map, len, !hwpoison)) + goto done; + if (hwpoison) { + /* test_hwpoison can fail with TEST_SKIPPED. 
*/ + enum test_status new_status = test_hwpoison(primary_map, len); + + if (new_status != TEST_PASSED) { + status = new_status; + goto done; + } + } else if (test_fork(uffd, primary_map, len)) + goto done; + if (test_collapse(primary_map, len, hwpoison)) + goto done; + + status = TEST_PASSED; + +done: + if (ftruncate(fd, 0) < 0) { + perror(ERROR_PREFIX "ftruncate back to 0 failed"); + status = TEST_FAILED; + } + +out: + munmap(secondary_map, len); +unmap_primary: + munmap(primary_map, len); +close_uffd: + close(uffd); + return status; +} + +int main(void) +{ + int fd; + struct statfs file_stat; + size_t hugepagesize; + size_t len; + + fd = memfd_create("hugetlb_tmp", MFD_HUGETLB); + if (fd < 0) { + perror(ERROR_PREFIX "could not open hugetlbfs file"); + return -1; + } + + memset(&file_stat, 0, sizeof(file_stat)); + if (fstatfs(fd, &file_stat)) { + perror(ERROR_PREFIX "fstatfs failed"); + goto close; + } + if (file_stat.f_type != HUGETLBFS_MAGIC) { + printf(ERROR_PREFIX "not hugetlbfs file\n"); + goto close; + } + + hugepagesize = file_stat.f_bsize; + len = 2 * hugepagesize; + printf("HGM regular test...\n"); + printf("HGM regular test: %s\n", + status_to_str(test_hgm(fd, hugepagesize, len, false))); + printf("HGM hwpoison test...\n"); + printf("HGM hwpoison test: %s\n", + status_to_str(test_hgm(fd, hugepagesize, len, true))); +close: + close(fd); + + return 0; +}
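As noted in the patch 46 description, here is a condensed sketch of the fork() handling that test_fork() relies on. It is illustrative rather than lifted from the patch: region and len stand for an already-mapped hugetlb (or shmem) range registrable with MINOR mode, the helper names drain_fork_event and fork_with_copied_page_tables are made up, error handling is trimmed, and MINOR+WP registration on hugetlb assumes a kernel new enough to support both.

#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <linux/userfaultfd.h>

/* Consume the single UFFD_EVENT_FORK message so fork() can complete. */
static void *drain_fork_event(void *arg)
{
	int uffd = *(int *)arg;
	struct uffd_msg msg;

	if (read(uffd, &msg, sizeof(msg)) == sizeof(msg) &&
	    msg.event == UFFD_EVENT_FORK)
		close((int)msg.arg.fork.ufd);	/* child's uffd; unused here */
	return NULL;
}

static int fork_with_copied_page_tables(char *region, size_t len)
{
	int uffd = syscall(__NR_userfaultfd, O_CLOEXEC);
	struct uffdio_api api = { .api = UFFD_API,
				  .features = UFFD_FEATURE_EVENT_FORK };
	struct uffdio_register reg = {
		.range = { .start = (unsigned long)region, .len = len },
		.mode = UFFDIO_REGISTER_MODE_MINOR | UFFDIO_REGISTER_MODE_WP,
	};
	pthread_t reader;
	pid_t pid;

	if (uffd < 0 || ioctl(uffd, UFFDIO_API, &api) ||
	    ioctl(uffd, UFFDIO_REGISTER, &reg))
		return -1;

	pthread_create(&reader, NULL, drain_fork_event, &uffd);

	pid = fork();
	if (!pid) {
		/* Page tables were copied, so this read sees a present PTE. */
		volatile char c = region[0];

		(void)c;
		_exit(0);
	}
	waitpid(pid, NULL, 0);
	pthread_join(reader, NULL);
	close(uffd);
	return 0;
}

Per the commit message, the combination of UFFD_FEATURE_EVENT_FORK and WP-mode registration is what makes fork() copy the page tables; the reader thread exists because the resulting UFFD_EVENT_FORK message must be consumed or the fork()ing process blocks, which is exactly why test_fork() starts its uffd reader before forking.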