From patchwork Tue Sep 3 14:25:19 2024
X-Patchwork-Submitter: Steven Sistare
X-Patchwork-Id: 13788802
From: Steve Sistare <steven.sistare@oracle.com>
To: linux-mm@kvack.org
Cc: Vivek Kasireddy, Muchun Song, Andrew Morton, Matthew Wilcox, Peter Xu,
 David Hildenbrand, Jason Gunthorpe, Steve Sistare
Subject: [PATCH V1 3/5] mm/hugetlb: fix memfd_pin_folios resv_huge_pages leak
Date: Tue, 3 Sep 2024 07:25:19 -0700
Message-Id: <1725373521-451395-4-git-send-email-steven.sistare@oracle.com>
In-Reply-To: <1725373521-451395-1-git-send-email-steven.sistare@oracle.com>
References: <1725373521-451395-1-git-send-email-steven.sistare@oracle.com>
memfd_pin_folios followed by unpin_folios leaves resv_huge_pages elevated
if the pages were not already faulted in.
During a normal page fault, resv_huge_pages is consumed here:

  hugetlb_fault()
    alloc_hugetlb_folio()
      dequeue_hugetlb_folio_vma()
        dequeue_hugetlb_folio_nodemask()
          dequeue_hugetlb_folio_node_exact()
            free_huge_pages--
      resv_huge_pages--

During memfd_pin_folios, the page is created by calling
alloc_hugetlb_folio_nodemask instead of alloc_hugetlb_folio, and
resv_huge_pages is not modified:

  memfd_alloc_folio()
    alloc_hugetlb_folio_nodemask()
      dequeue_hugetlb_folio_nodemask()
        dequeue_hugetlb_folio_node_exact()
          free_huge_pages--

alloc_hugetlb_folio_nodemask has other callers that must not modify
resv_huge_pages.  Therefore, to fix, define an alternate version of
alloc_hugetlb_folio_nodemask for this call site that adjusts
resv_huge_pages.

Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Vivek Kasireddy
---
 include/linux/hugetlb.h | 10 ++++++++++
 mm/hugetlb.c            | 17 +++++++++++++++++
 mm/memfd.c              |  9 ++++-----
 3 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 45bf05a..3ddd69b 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -695,6 +695,9 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
 				nodemask_t *nmask, gfp_t gfp_mask,
 				bool allow_alloc_fallback);
+struct folio *alloc_hugetlb_folio_reserve(struct hstate *h, int preferred_nid,
+				nodemask_t *nmask, gfp_t gfp_mask);
+
 int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
 			pgoff_t idx);
 void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
@@ -1062,6 +1065,13 @@ static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 }
 
 static inline struct folio *
+alloc_hugetlb_folio_reserve(struct hstate *h, int preferred_nid,
+			nodemask_t *nmask, gfp_t gfp_mask)
+{
+	return NULL;
+}
+
+static inline struct folio *
 alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
 			nodemask_t *nmask, gfp_t gfp_mask,
 			bool allow_alloc_fallback)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index aaf508b..c2d44a1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2564,6 +2564,23 @@ struct folio *alloc_buddy_hugetlb_folio_with_mpol(struct hstate *h,
 	return folio;
 }
 
+struct folio *alloc_hugetlb_folio_reserve(struct hstate *h, int preferred_nid,
+		nodemask_t *nmask, gfp_t gfp_mask)
+{
+	struct folio *folio;
+
+	spin_lock_irq(&hugetlb_lock);
+	folio = dequeue_hugetlb_folio_nodemask(h, gfp_mask, preferred_nid,
+					       nmask);
+	if (folio) {
+		VM_BUG_ON(!h->resv_huge_pages);
+		h->resv_huge_pages--;
+	}
+
+	spin_unlock_irq(&hugetlb_lock);
+	return folio;
+}
+
 /* folio migration callback function */
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
 		nodemask_t *nmask, gfp_t gfp_mask, bool allow_alloc_fallback)
diff --git a/mm/memfd.c b/mm/memfd.c
index e7b7c52..bfe0e71 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -82,11 +82,10 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
 
 		gfp_mask = htlb_alloc_mask(hstate_file(memfd));
 		gfp_mask &= ~(__GFP_HIGHMEM | __GFP_MOVABLE);
-		folio = alloc_hugetlb_folio_nodemask(hstate_file(memfd),
-						     numa_node_id(),
-						     NULL,
-						     gfp_mask,
-						     false);
+		folio = alloc_hugetlb_folio_reserve(hstate_file(memfd),
+						    numa_node_id(),
+						    NULL,
+						    gfp_mask);
 		if (folio && folio_try_get(folio)) {
 			err = hugetlb_add_to_page_cache(folio, memfd->f_mapping,