From patchwork Tue Nov 29 19:35:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13059086
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton,
	Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin,
	Muchun Song, Mike Kravetz, David Hildenbrand
Subject: [PATCH 01/10] mm/hugetlb: Let vma_offset_start() to return start
Date: Tue, 29 Nov 2022 14:35:17 -0500
Message-Id: <20221129193526.3588187-2-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>

Despite its name, vma_offset_start() does not return "the start address
of the range" but rather the offset that should be added to
vma->vm_start.

Make it return the actual start vaddr instead. This also helps all the
callers: whenever the return value is used, it is ultimately added to
vma->vm_start anyway.
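
For illustration only, here is a minimal userspace sketch of the old and
new return values. It is not kernel code: struct vma_model, the helper
names offset_start_old()/offset_start_new() and the PAGE_SHIFT value of
12 are assumptions made up for this example; only the arithmetic mirrors
the patch.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

typedef unsigned long pgoff_t;

/* Hypothetical stand-in for the two vm_area_struct fields used here. */
struct vma_model {
	unsigned long vm_start;	/* first virtual address of the mapping */
	pgoff_t vm_pgoff;	/* file offset of vm_start, in pages */
};

/* Old behavior: byte offset relative to vm_start. */
static unsigned long offset_start_old(const struct vma_model *vma,
				      pgoff_t start)
{
	if (vma->vm_pgoff < start)
		return (start - vma->vm_pgoff) << PAGE_SHIFT;
	return 0;
}

/* New behavior: absolute start virtual address. */
static unsigned long offset_start_new(const struct vma_model *vma,
				      pgoff_t start)
{
	unsigned long offset = 0;

	if (vma->vm_pgoff < start)
		offset = (start - vma->vm_pgoff) << PAGE_SHIFT;

	return vma->vm_start + offset;
}

/* Returns 1 when "vm_start + old" matches "new" for the given start. */
static int offsets_agree(const struct vma_model *vma, pgoff_t start)
{
	return vma->vm_start + offset_start_old(vma, start) ==
	       offset_start_new(vma, start);
}
```

With vm_start = 0x40000000 and vm_pgoff = 0x200, a start of 0x400 gives
an old return value of 0x200000 and a new one of 0x40200000, so callers
no longer need to add vma->vm_start themselves.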

Reviewed-by: Mike Kravetz
Signed-off-by: Peter Xu
Reviewed-by: David Hildenbrand
---
 fs/hugetlbfs/inode.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 790d2727141a..fdb16246f46e 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -412,10 +412,12 @@ static bool hugetlb_vma_maps_page(struct vm_area_struct *vma,
  */
 static unsigned long vma_offset_start(struct vm_area_struct *vma, pgoff_t start)
 {
+	unsigned long offset = 0;
+
 	if (vma->vm_pgoff < start)
-		return (start - vma->vm_pgoff) << PAGE_SHIFT;
-	else
-		return 0;
+		offset = (start - vma->vm_pgoff) << PAGE_SHIFT;
+
+	return vma->vm_start + offset;
 }
 
 static unsigned long vma_offset_end(struct vm_area_struct *vma, pgoff_t end)
@@ -457,7 +459,7 @@ static void hugetlb_unmap_file_folio(struct hstate *h,
 		v_start = vma_offset_start(vma, start);
 		v_end = vma_offset_end(vma, end);
 
-		if (!hugetlb_vma_maps_page(vma, vma->vm_start + v_start, page))
+		if (!hugetlb_vma_maps_page(vma, v_start, page))
 			continue;
 
 		if (!hugetlb_vma_trylock_write(vma)) {
@@ -473,8 +475,8 @@ static void hugetlb_unmap_file_folio(struct hstate *h,
 			break;
 		}
 
-		unmap_hugepage_range(vma, vma->vm_start + v_start, v_end,
-				NULL, ZAP_FLAG_DROP_MARKER);
+		unmap_hugepage_range(vma, v_start, v_end, NULL,
+				     ZAP_FLAG_DROP_MARKER);
 		hugetlb_vma_unlock_write(vma);
 	}
 
@@ -507,10 +509,9 @@ static void hugetlb_unmap_file_folio(struct hstate *h,
 		 */
 		v_start = vma_offset_start(vma, start);
 		v_end = vma_offset_end(vma, end);
-		if (hugetlb_vma_maps_page(vma, vma->vm_start + v_start, page))
-			unmap_hugepage_range(vma, vma->vm_start + v_start,
-						v_end, NULL,
-						ZAP_FLAG_DROP_MARKER);
+		if (hugetlb_vma_maps_page(vma, v_start, page))
+			unmap_hugepage_range(vma, v_start, v_end, NULL,
+					     ZAP_FLAG_DROP_MARKER);
 
 		kref_put(&vma_lock->refs, hugetlb_vma_lock_release);
 		hugetlb_vma_unlock_write(vma);
@@ -540,8 +541,7 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
 		v_start = vma_offset_start(vma, start);
 		v_end = vma_offset_end(vma, end);
 
-		unmap_hugepage_range(vma, vma->vm_start + v_start, v_end,
-				     NULL, zap_flags);
+		unmap_hugepage_range(vma, v_start, v_end, NULL, zap_flags);
 
 		/*
 		 * Note that vma lock only exists for shared/non-private

From patchwork Tue Nov 29 19:35:18 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13059085
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton,
	Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin,
	Muchun Song, Mike Kravetz, David Hildenbrand
Subject: [PATCH 02/10] mm/hugetlb: Don't wait for migration entry during follow page
Date: Tue, 29 Nov 2022 14:35:18 -0500
Message-Id: <20221129193526.3588187-3-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>

That's what the code does with !hugetlb pages, so logically we should do
the same for hugetlb: a migration entry is then also treated as no page.

This is probably also the last piece of follow_page code that may sleep;
the previous one should be removed in cf994dd8af27 ("mm/gup: remove
FOLL_MIGRATION", 2022-11-16).

Signed-off-by: Peter Xu
Reviewed-by: Mike Kravetz
Reviewed-by: David Hildenbrand
---
 mm/hugetlb.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9d97c9a2a15d..dfe677fadaf8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6234,7 +6234,6 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	if (WARN_ON_ONCE(flags & FOLL_PIN))
 		return NULL;
 
-retry:
 	pte = huge_pte_offset(mm, haddr, huge_page_size(h));
 	if (!pte)
 		return NULL;
@@ -6257,16 +6256,6 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 			page = NULL;
 			goto out;
 		}
-	} else {
-		if (is_hugetlb_entry_migration(entry)) {
-			spin_unlock(ptl);
-			__migration_entry_wait_huge(pte, ptl);
-			goto retry;
-		}
-		/*
-		 * hwpoisoned entry is treated as no_page_table in
-		 * follow_page_mask().
-		 */
 	}
 out:
 	spin_unlock(ptl);

From patchwork Tue Nov 29 19:35:19 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13059087
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton,
	Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin,
	Muchun Song, Mike Kravetz, David Hildenbrand
Subject: [PATCH 03/10] mm/hugetlb: Document huge_pte_offset usage
Date: Tue, 29 Nov 2022 14:35:19 -0500
Message-Id: <20221129193526.3588187-4-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>

huge_pte_offset() is potentially a pgtable walker, looking up the pte_t*
for a hugetlb address.

Normally, it's always safe to walk a generic pgtable as long as the mmap
lock is held for either read or write, because that guarantees the
pgtable pages stay valid during the walk.

But that's not true for hugetlbfs, especially for shared mappings:
hugetlbfs can have its pgtable freed by pmd unsharing. This means that
even with the mmap lock held for the current mm, the PMD pgtable page
can still go away from under us if pmd unsharing is possible during the
walk.

So we have two ways to make it safe even for a shared mapping:

  (1) If the hugetlb vma lock is held for either read or write, it's
      okay because pmd unshare cannot happen at all.

  (2) If the i_mmap_rwsem lock is held for either read or write, it's
      okay because even if pmd unshare can happen, the pgtable page
      cannot be freed from under us.

Document it.

Signed-off-by: Peter Xu
---
 include/linux/hugetlb.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 551834cd5299..81efd9b9baa2 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -192,6 +192,38 @@ extern struct list_head huge_boot_pages;
 
 pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
 			unsigned long addr, unsigned long sz);
+/*
+ * huge_pte_offset(): Walk the hugetlb pgtable until the last level PTE.
+ * Returns the pte_t* if found, or NULL if the address is not mapped.
+ *
+ * Since this function will walk all the pgtable pages (including not only
+ * high-level pgtable page, but also PUD entry that can be unshared
+ * concurrently for VM_SHARED), the caller of this function should be
+ * responsible of its thread safety. One can follow this rule:
+ *
+ *  (1) For private mappings: pmd unsharing is not possible, so it'll
+ *      always be safe if we're with the mmap sem for either read or write.
+ *      This is normally always the case, IOW we don't need to do anything
+ *      special.
+ *
+ *  (2) For shared mappings: pmd unsharing is possible (so the PUD-ranged
+ *      pgtable page can go away from under us! It can be done by a pmd
+ *      unshare with a follow up munmap() on the other process), then we
+ *      need either:
+ *
+ *     (2.1) hugetlb vma lock read or write held, to make sure pmd unshare
+ *           won't happen upon the range (it also makes sure the pte_t we
+ *           read is the right and stable one), or,
+ *
+ *     (2.2) hugetlb mapping i_mmap_rwsem lock held read or write, to make
+ *           sure even if unshare happened the racy unmap() will wait until
+ *           i_mmap_rwsem is released.
+ *
+ * Option (2.1) is the safest, which guarantees pte stability from pmd
+ * sharing pov, until the vma lock released. Option (2.2) doesn't protect
+ * a concurrent pmd unshare, but it makes sure the pgtable page is safe to
+ * access.
+ */
 pte_t *huge_pte_offset(struct mm_struct *mm,
 		       unsigned long addr, unsigned long sz);
 unsigned long hugetlb_mask_last_page(struct hstate *h);

From patchwork Tue Nov 29 19:35:20 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13059088
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton,
	Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin,
	Muchun Song, Mike Kravetz, David Hildenbrand
Subject: [PATCH 04/10] mm/hugetlb: Move swap entry handling into vma lock when faulted
Date: Tue, 29 Nov 2022 14:35:20 -0500
Message-Id: <20221129193526.3588187-5-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>

hugetlb_fault() used to have a special path that handled swap entries at
its entrance using huge_pte_offset(). That's unsafe because
huge_pte_offset() on a pmd-sharable range can access freed pgtables when
no lock protects the pgtable from being freed after pmd unshare.

The simplest way to make it safe is to move the swap entry handling to
after the vma lock is taken.
We may now need to take the fault mutex on either migration or hwpoison entries (and the vma lock too, but that one is really needed); however, neither of them is a hot path.

Note that the vma lock cannot be released in hugetlb_fault() when the migration entry is detected, because migration_entry_wait_huge() uses the pgtable page again (by taking the pgtable lock), so that access also needs to be protected by the vma lock.

Modify migration_entry_wait_huge() so that it must be called with the vma read lock held, and properly release the lock in __migration_entry_wait_huge().

Signed-off-by: Peter Xu
Reviewed-by: Mike Kravetz
---
 include/linux/swapops.h |  6 ++++--
 mm/hugetlb.c            | 32 +++++++++++++++-----------------
 mm/migrate.c            | 25 +++++++++++++++++++++----
 3 files changed, 40 insertions(+), 23 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 27ade4f22abb..09b22b169a71 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -335,7 +335,8 @@ extern void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
 extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 					unsigned long address);
 #ifdef CONFIG_HUGETLB_PAGE
-extern void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl);
+extern void __migration_entry_wait_huge(struct vm_area_struct *vma,
+					pte_t *ptep, spinlock_t *ptl);
 extern void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte);
 #endif /* CONFIG_HUGETLB_PAGE */
 #else /* CONFIG_MIGRATION */
@@ -364,7 +365,8 @@ static inline void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
 static inline void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 					 unsigned long address) { }
 #ifdef CONFIG_HUGETLB_PAGE
-static inline void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl) { }
+static inline void __migration_entry_wait_huge(struct vm_area_struct *vma,
+					       pte_t *ptep, spinlock_t *ptl) { }
 static inline void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte) { }
 #endif /* CONFIG_HUGETLB_PAGE */
 static inline int is_writable_migration_entry(swp_entry_t entry)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index dfe677fadaf8..776e34ccf029 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5826,22 +5826,6 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	int need_wait_lock = 0;
 	unsigned long haddr = address & huge_page_mask(h);
 
-	ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
-	if (ptep) {
-		/*
-		 * Since we hold no locks, ptep could be stale. That is
-		 * OK as we are only making decisions based on content and
-		 * not actually modifying content here.
-		 */
-		entry = huge_ptep_get(ptep);
-		if (unlikely(is_hugetlb_entry_migration(entry))) {
-			migration_entry_wait_huge(vma, ptep);
-			return 0;
-		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
-			return VM_FAULT_HWPOISON_LARGE |
-				VM_FAULT_SET_HINDEX(hstate_index(h));
-	}
-
 	/*
 	 * Serialize hugepage allocation and instantiation, so that we don't
 	 * get spurious allocation failures if two CPUs race to instantiate
@@ -5888,8 +5872,22 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * fault, and is_hugetlb_entry_(migration|hwpoisoned) check will
 	 * properly handle it.
 	 */
-	if (!pte_present(entry))
+	if (!pte_present(entry)) {
+		if (unlikely(is_hugetlb_entry_migration(entry))) {
+			/*
+			 * Release fault lock first because the vma lock is
+			 * needed to guard the huge_pte_lockptr() later in
+			 * migration_entry_wait_huge(). The vma lock will
+			 * be released there.
+			 */
+			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+			migration_entry_wait_huge(vma, ptep);
+			return 0;
+		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
+			ret = VM_FAULT_HWPOISON_LARGE |
+			    VM_FAULT_SET_HINDEX(hstate_index(h));
 		goto out_mutex;
+	}
 
 	/*
 	 * If we are going to COW/unshare the mapping later, we examine the
diff --git a/mm/migrate.c b/mm/migrate.c
index 267ad0d073ae..c13c828d34f3 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -326,24 +326,41 @@ void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 }
 
 #ifdef CONFIG_HUGETLB_PAGE
-void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl)
+void __migration_entry_wait_huge(struct vm_area_struct *vma,
+				 pte_t *ptep, spinlock_t *ptl)
 {
 	pte_t pte;
 
+	/*
+	 * The vma read lock must be taken, which will be released before
+	 * the function returns. It makes sure the pgtable page (along
+	 * with its spin lock) not be freed in parallel.
+	 */
+	hugetlb_vma_assert_locked(vma);
+
 	spin_lock(ptl);
 	pte = huge_ptep_get(ptep);
 
-	if (unlikely(!is_hugetlb_entry_migration(pte)))
+	if (unlikely(!is_hugetlb_entry_migration(pte))) {
 		spin_unlock(ptl);
-	else
+		hugetlb_vma_unlock_read(vma);
+	} else {
+		/*
+		 * If migration entry existed, safe to release vma lock
+		 * here because the pgtable page won't be freed without the
+		 * pgtable lock released. See comment right above pgtable
+		 * lock release in migration_entry_wait_on_locked().
+		 */
+		hugetlb_vma_unlock_read(vma);
 		migration_entry_wait_on_locked(pte_to_swp_entry(pte), NULL, ptl);
+	}
 }
 
 void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte)
 {
 	spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma), vma->vm_mm, pte);
 
-	__migration_entry_wait_huge(pte, ptl);
+	__migration_entry_wait_huge(vma, pte, ptl);
 }
 #endif

From patchwork Tue Nov 29 19:35:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13059089 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB1E9C4332F for ; Tue, 29 Nov 2022 19:35:42 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1924C8E0001; Tue, 29 Nov 2022 14:35:42 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 11B186B0080; Tue, 29 Nov 2022 14:35:42 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DEB228E0001; Tue, 29 Nov 2022 14:35:41 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id C49B26B007E for ; Tue, 29 Nov 2022 14:35:41 -0500 (EST) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id A04B3C0B9D for ; Tue, 29 Nov 2022 19:35:41 +0000 (UTC) X-FDA: 80187484482.30.D99FDE3 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf13.hostedemail.com (Postfix) with ESMTP id 311462000D for ; Tue, 29 Nov 2022 19:35:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669750540; h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=U2GgktAyXhDRBB8fn5rbcYoK/H0Sll/0lbzArKYhQzY=; b=HOXLYMVvhzztEaPIwe4aXANqPDSJyGxCzx5yIJu7YQPpDhqU0aG0azJHaxV0ifm3yaxoVL 8mbRB9uzCSomu6jCnX1j07MV+XwZJQDxFm+YjPzYngTDBW8WENpX0D26bXy8Y9GD+uImFl lbPn2T0j+HMv1vaT1/xMOYYTTCzOLd4= Received: from mail-qt1-f199.google.com (mail-qt1-f199.google.com [209.85.160.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-488-iYJmZuj3NUuudXzOfc-Sqg-1; Tue, 29 Nov 2022 14:35:39 -0500 X-MC-Unique: iYJmZuj3NUuudXzOfc-Sqg-1 Received: by mail-qt1-f199.google.com with SMTP id bz20-20020a05622a1e9400b003a646e03748so22976960qtb.12 for ; Tue, 29 Nov 2022 11:35:39 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=U2GgktAyXhDRBB8fn5rbcYoK/H0Sll/0lbzArKYhQzY=; b=KlIsmrDQxuVazXO86dkSWUSsnGYOvCZPjCuYcwyKwZ+2vqdnjD5pkPKbvMBpyaPdlQ sfMPZ+zGYfKkynwNAzzZpgecrLcYuU5IiCGem/fV2lmNCO3LKpUgevqYWvLmCjERfemn nBClb+MaLq/8BzAJSC1wC/XjFdSh7Ml1CEER3VxL5Piap1zSAGNMPo5yenwTAW6Fq8UZ p3gjK2cbhdfQAXAvHoe9sXGEeJ3FKl8EEu5fZUmgEMqCDpymVoSnWZJzuhpeag7qTE2a iyXVOEJ4xQDcImA8LIBDJ3XUm/8K4q6+qaRFTD9KddQjQwUiDpnaxOVgov7MU85jv0q2 4qdQ== X-Gm-Message-State: ANoB5pmgdZPBtu19PQgYXL1BeN40uU5NaeyJQrNruP+BmWUEwx75b9CQ recvqMZMFJ+ZAri54lL6P46kgxJasMDB3MWQXtj8uytzzR96TBpMsmtqVZVoX/4S1ASAjdhQ24q 7zkGPrdh3kYWwNPugelm7zvt3OajlgAhmhtB8wzKECuVjUffQ2LgMf1Jr0Uhy X-Received: by 2002:ad4:5a12:0:b0:4c6:cfb3:461f with SMTP id ei18-20020ad45a12000000b004c6cfb3461fmr29830227qvb.18.1669750538400; Tue, 29 Nov 2022 11:35:38 -0800 (PST) X-Google-Smtp-Source: AA0mqf5sUOLP+bu5vmUyeu4eR0d5tdrBlXwCznX/b9tlkmnKZ8SzN/XLnce+u+p7KI/YvBLKOKGVTQ== X-Received: by 
2002:ad4:5a12:0:b0:4c6:cfb3:461f with SMTP id ei18-20020ad45a12000000b004c6cfb3461fmr29830188qvb.18.1669750538086; Tue, 29 Nov 2022 11:35:38 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. [70.31.27.79]) by smtp.gmail.com with ESMTPSA id n1-20020a05620a294100b006fa16fe93bbsm11313013qkp.15.2022.11.29.11.35.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 29 Nov 2022 11:35:37 -0800 (PST) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: James Houghton , Jann Horn , peterx@redhat.com, Andrew Morton , Andrea Arcangeli , Rik van Riel , Nadav Amit , Miaohe Lin , Muchun Song , Mike Kravetz , David Hildenbrand Subject: [PATCH 05/10] mm/hugetlb: Make userfaultfd_huge_must_wait() safe to pmd unshare Date: Tue, 29 Nov 2022 14:35:21 -0500 Message-Id: <20221129193526.3588187-6-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com> References: <20221129193526.3588187-1-peterx@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-type: text/plain ARC-Authentication-Results: i=1; imf13.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=HOXLYMVv; spf=pass (imf13.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1669750541; a=rsa-sha256; cv=none; b=1ln2iUkYIiQo0UVYlPQOPlXLcoziZGrJ4NKP6jy76OsLLRtEV0f6E1CHr4Csl2XvHCHaGd yfpQJPbjwd7Zyky2IrA82ABLrw/XtskEnDuqscc7Ce5tItW3jbYKDlfJ4RtbnA9vozSpTb /w/r5BVeK9qZ/KbVL9tzW6e8W4/EnBw= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1669750541; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=U2GgktAyXhDRBB8fn5rbcYoK/H0Sll/0lbzArKYhQzY=; b=soBIUWiq3t2Woie4Fa+xSzOMyR3y3aUAumeVxuHPoCn+fF/KuqLbu+xp3XqhvoKVByLctZ iBdmc1GADPWiHOjMomZqr1bX8zdSxbNC51s2MxTlbbPcSOLaezBOBPcd5Ydgq+ZsW81TWX ZZvZq+deaF4vaBSOFmii599fcfa+3dE= Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=HOXLYMVv; spf=pass (imf13.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 311462000D X-Stat-Signature: fkkummurdci3u7irxk89rstj6nkwiapb X-Rspam-User: X-HE-Tag: 1669750540-816601 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

Take the hugetlb walker lock to make userfaultfd_huge_must_wait() safe against pmd unshare; here that means taking the vma lock directly.
Signed-off-by: Peter Xu
Reviewed-by: David Hildenbrand
Reviewed-by: Mike Kravetz
---
 fs/userfaultfd.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 07c81ab3fd4d..a602f008dde5 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -376,7 +376,8 @@ static inline unsigned int userfaultfd_get_blocking_state(unsigned int flags)
  */
 vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 {
-	struct mm_struct *mm = vmf->vma->vm_mm;
+	struct vm_area_struct *vma = vmf->vma;
+	struct mm_struct *mm = vma->vm_mm;
 	struct userfaultfd_ctx *ctx;
 	struct userfaultfd_wait_queue uwq;
 	vm_fault_t ret = VM_FAULT_SIGBUS;
@@ -403,7 +404,7 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 	 */
 	mmap_assert_locked(mm);
 
-	ctx = vmf->vma->vm_userfaultfd_ctx.ctx;
+	ctx = vma->vm_userfaultfd_ctx.ctx;
 	if (!ctx)
 		goto out;
 
@@ -493,6 +494,13 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 
 	blocking_state = userfaultfd_get_blocking_state(vmf->flags);
 
+	/*
+	 * This stablizes pgtable for hugetlb on e.g. pmd unsharing. Need
+	 * to be before setting current state.
+	 */
+	if (is_vm_hugetlb_page(vma))
+		hugetlb_vma_lock_read(vma);
+
 	spin_lock_irq(&ctx->fault_pending_wqh.lock);
 	/*
 	 * After the __add_wait_queue the uwq is visible to userland
@@ -507,13 +515,15 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 	set_current_state(blocking_state);
 	spin_unlock_irq(&ctx->fault_pending_wqh.lock);
 
-	if (!is_vm_hugetlb_page(vmf->vma))
+	if (!is_vm_hugetlb_page(vma))
 		must_wait = userfaultfd_must_wait(ctx, vmf->address, vmf->flags,
 						  reason);
 	else
-		must_wait = userfaultfd_huge_must_wait(ctx, vmf->vma,
+		must_wait = userfaultfd_huge_must_wait(ctx, vma,
 						       vmf->address,
 						       vmf->flags, reason);
+	if (is_vm_hugetlb_page(vma))
+		hugetlb_vma_unlock_read(vma);
 	mmap_read_unlock(mm);
 
 	if (likely(must_wait && !READ_ONCE(ctx->released))) {

From patchwork Tue Nov 29 19:35:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13059090 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9C2C0C4167B for ; Tue, 29 Nov 2022 19:35:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 38C828E0003; Tue, 29 Nov 2022 14:35:44 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 363906B007E; Tue, 29 Nov 2022 14:35:44 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 07C736B0081; Tue, 29 Nov 2022 14:35:43 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id E1DDB6B007E for ; Tue, 29 Nov 2022 14:35:43 -0500 (EST) Received: from smtpin03.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id B160140BA9 for ; Tue, 29 Nov 2022
19:35:43 +0000 (UTC) X-FDA: 80187484566.03.E827BBE Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf14.hostedemail.com (Postfix) with ESMTP id 5E02810000D for ; Tue, 29 Nov 2022 19:35:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669750542; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4wYg4N/GqmDYcLEMWI8FN15PYOVGEp4Ds74Kweju+do=; b=QcCROOXUwDOBPnXzg8qdciMd6/cbeY4lGWg8c0ISdqw5xaBxb0JJVEaHG0i+SZKyONuCRA NH0fwFPKRWzFSxC5LE8O80MPaR3O93jb5byBK01q06kb1fgsPpe977O2EXQKcxyU+ws+YP ttK6o4wiW+SPI7gmv2TWox4zOYuDfGU= Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com [209.85.222.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-150-0011wSP_OXCdz56AUxDhLA-1; Tue, 29 Nov 2022 14:35:40 -0500 X-MC-Unique: 0011wSP_OXCdz56AUxDhLA-1 Received: by mail-qk1-f200.google.com with SMTP id w4-20020a05620a444400b006fa24b2f394so31142693qkp.15 for ; Tue, 29 Nov 2022 11:35:40 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4wYg4N/GqmDYcLEMWI8FN15PYOVGEp4Ds74Kweju+do=; b=W652PuXpHwEI3nkxy3yynMB0DYdXtI8Oy76iEWP7SgYT2KlX/rKOTNiSOXjXZAviW3 Z4MNnejmVwQ4U7f5wcuBXxX+L10Xx3NAltVf2pkENwIvG9YARVjuWPIukhWlFcGjjFHq q7WFkXCU0ioOlVt8TuovqcXBjg+gQnXZSgCkCxwP7IevGsBYJ6LQZilQdgitUhaY8h3g /EEIvR/n5llrNtNKgHwHJmB8wEquPomtdwrsKR2xlrOdwClOEDzh7GI8VKW0kULdyd+D XmhfZidyHoGPktrv3n0G1x2WgA3es4uPn+dtJf3Z+LTCTLHBsHJ6V+plwt1wWohxklg8 u9EA== X-Gm-Message-State: ANoB5plXoZePjqkHiDeIRRx4aLwqrWnfHfBG2WV+n9tiX7Qgmv8vCxvo 
MUVOwOSEuu2bYHI9crNJw2oHHm52z5k/z/oEnuKftNd571i7ECGMM/TqqI0RvEvyDr5mVv7zgoK SqGmik8FGj+NWrnODL76FwbshXncUKhyhqCbIrwwAkdqJzSDdz8CnpEff3N50 X-Received: by 2002:a05:6214:207:b0:4c6:4ac0:12c1 with SMTP id i7-20020a056214020700b004c64ac012c1mr37457121qvt.111.1669750539690; Tue, 29 Nov 2022 11:35:39 -0800 (PST) X-Google-Smtp-Source: AA0mqf4u/49M+6rzccmqGiecb+5YeMty5DWCqGQppJOYyST0NA7r4AdqewGNpJUPOi2AgkzWJM2jYA== X-Received: by 2002:a05:6214:207:b0:4c6:4ac0:12c1 with SMTP id i7-20020a056214020700b004c64ac012c1mr37457088qvt.111.1669750539374; Tue, 29 Nov 2022 11:35:39 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. [70.31.27.79]) by smtp.gmail.com with ESMTPSA id n1-20020a05620a294100b006fa16fe93bbsm11313013qkp.15.2022.11.29.11.35.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 29 Nov 2022 11:35:38 -0800 (PST) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: James Houghton , Jann Horn , peterx@redhat.com, Andrew Morton , Andrea Arcangeli , Rik van Riel , Nadav Amit , Miaohe Lin , Muchun Song , Mike Kravetz , David Hildenbrand Subject: [PATCH 06/10] mm/hugetlb: Make hugetlb_follow_page_mask() safe to pmd unshare Date: Tue, 29 Nov 2022 14:35:22 -0500 Message-Id: <20221129193526.3588187-7-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com> References: <20221129193526.3588187-1-peterx@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-type: text/plain ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1669750543; a=rsa-sha256; cv=none; b=kKAn/gaHXvbI3OySq1BlFowesQ1jw68CuIS8N85OCI4bUxDeisTlwGXChuTviIGJ85CMZv Cjx+ICSQruyVu+aUQPJH5JCnJKlMVYatFEoYRt7yod1IzG2gbeMApwp4yH+4rB9O7JDxhM M5wt7RfnWF9PKt3ybGPBOJ1oyvZF9io= ARC-Authentication-Results: i=1; imf14.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=QcCROOXU; dmarc=pass (policy=none) 
header.from=redhat.com; spf=pass (imf14.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1669750543; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=4wYg4N/GqmDYcLEMWI8FN15PYOVGEp4Ds74Kweju+do=; b=KL0eof9uI/6eG8mw3brxEnKnBAINuVzE8U/mPWJ9PoAFzb0XfeYLlI8zNR4GGzRluiiV8p Iom5ro7XJ4B6uRFSRFdxpum+AVBiSYiobrIUiX6h1rbwnnDlMrTzsd4TGdc+3ndj6wYkz6 8q1IwGM+1iz8o7LEjYn0DpZYcZP1AWQ= X-Rspamd-Queue-Id: 5E02810000D Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=QcCROOXU; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf14.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com X-Rspamd-Server: rspam12 X-Rspam-User: X-Stat-Signature: jt3ky1n8418p9zicbk45az39hc96kipt X-HE-Tag: 1669750543-735882 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Since hugetlb_follow_page_mask() walks the pgtable, it needs the vma lock to make sure the pgtable page will not be freed concurrently. 
Signed-off-by: Peter Xu
Acked-by: David Hildenbrand
Reviewed-by: Mike Kravetz
---
 mm/hugetlb.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 776e34ccf029..d6bb1d22f1c4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6232,9 +6232,10 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	if (WARN_ON_ONCE(flags & FOLL_PIN))
 		return NULL;
 
+	hugetlb_vma_lock_read(vma);
 	pte = huge_pte_offset(mm, haddr, huge_page_size(h));
 	if (!pte)
-		return NULL;
+		goto out_unlock;
 
 	ptl = huge_pte_lock(h, mm, pte);
 	entry = huge_ptep_get(pte);
@@ -6257,6 +6258,8 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	}
 out:
 	spin_unlock(ptl);
+out_unlock:
+	hugetlb_vma_unlock_read(vma);
 	return page;
 }
 

From patchwork Tue Nov 29 19:35:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13059091 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B9E61C4332F for ; Tue, 29 Nov 2022 19:35:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6C0BA8E0002; Tue, 29 Nov 2022 14:35:44 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 671176B0080; Tue, 29 Nov 2022 14:35:44 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 22AC28E0002; Tue, 29 Nov 2022 14:35:44 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 0F7D76B007E for ; Tue, 29 Nov 2022 14:35:44 -0500 (EST) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id B7BD1A04ED for ; Tue, 29 Nov 2022 19:35:43 +0000
(UTC) X-FDA: 80187484566.04.A727999 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf15.hostedemail.com (Postfix) with ESMTP id 5426AA000B for ; Tue, 29 Nov 2022 19:35:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669750542; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wQQtMmya3oIPlns9VTTG82x3u2HOgOUNM9IGJVSq3FI=; b=hDzQzunNQivxT0zrqpMmGf4I/gtlG3d7uEDEn3Fb8TVriI8dqv/LdVDO8LiRhh8SGQawzY LsXF73Ebn/2SE6B8WrwfP3zfUL2Kp5CCmwdoDswZiuePEHDMqHDsVWeYhWZZciG16Ffgrn vQLK9PEFBBPjsvjBPMaeLLNwXSsq2ZM= Received: from mail-qk1-f199.google.com (mail-qk1-f199.google.com [209.85.222.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-323-Adv-nwoIM7m9XNJ58lLKWA-1; Tue, 29 Nov 2022 14:35:41 -0500 X-MC-Unique: Adv-nwoIM7m9XNJ58lLKWA-1 Received: by mail-qk1-f199.google.com with SMTP id bk30-20020a05620a1a1e00b006fb2378c857so31469532qkb.18 for ; Tue, 29 Nov 2022 11:35:41 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=wQQtMmya3oIPlns9VTTG82x3u2HOgOUNM9IGJVSq3FI=; b=xlv4499FbudocaeiORGifK9uX11VZ/91Uyf+N8NfqTZbwrkXQ175dkjP5PFpfDkA4k ODdm4q2NvXDCH5YI6H84RjFvcczSHCxNWMJtJhZTjt1fgm1L+gS5W6ESMZylvO4tSh1j TR5gsVayKsaJKGyMWk7MPMySHISON8+PE75jxPeeb4jACQ4OxOLWh81IorJpURGS+I6U g9+6KJyq2lV5WTXo9LntFMkLECcn9+Ol86YAz5fCT0SiCHRGvRh+D1DvVdU/+6nd3Dmq +gpq8hDwwH0Iqo5OlokXiS6OS5KN4hpkOkSjTw1wSN8Zp1rK+jjdqesTZ9HgAsOc1tbT uBgw== X-Gm-Message-State: ANoB5pmFmKuvckkaaGRgANAQ5+Drm/lJrgBg/PxBYaNj2E+LgfOV/oQZ 
j/5Wr5qL1/NhU+PmlEJOO+QhtUkfJRgi6OmRUwcJUpYpi8IAXeCxnvN43q5TVYMe3JdqtBTIYF/ AKp/XVk6vjipJRMRimpTqKGDhl/gDusCbFeuYyweKxKW39z1So9QGbqzNcJ4k X-Received: by 2002:a05:622a:4891:b0:3a5:280a:3c9c with SMTP id fc17-20020a05622a489100b003a5280a3c9cmr37951529qtb.282.1669750540896; Tue, 29 Nov 2022 11:35:40 -0800 (PST) X-Google-Smtp-Source: AA0mqf7D66AT3COOJUVWGxZ68VmKCBmOjMI4vwPiGugqyqG7QsmIEqLv5iYXBuBUQGMJ4PIx2eAMkQ== X-Received: by 2002:a05:622a:4891:b0:3a5:280a:3c9c with SMTP id fc17-20020a05622a489100b003a5280a3c9cmr37951490qtb.282.1669750540508; Tue, 29 Nov 2022 11:35:40 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. [70.31.27.79]) by smtp.gmail.com with ESMTPSA id n1-20020a05620a294100b006fa16fe93bbsm11313013qkp.15.2022.11.29.11.35.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 29 Nov 2022 11:35:40 -0800 (PST) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: James Houghton , Jann Horn , peterx@redhat.com, Andrew Morton , Andrea Arcangeli , Rik van Riel , Nadav Amit , Miaohe Lin , Muchun Song , Mike Kravetz , David Hildenbrand Subject: [PATCH 07/10] mm/hugetlb: Make follow_hugetlb_page() safe to pmd unshare Date: Tue, 29 Nov 2022 14:35:23 -0500 Message-Id: <20221129193526.3588187-8-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com> References: <20221129193526.3588187-1-peterx@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-type: text/plain ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1669750543; a=rsa-sha256; cv=none; b=lvVwJl/wpQF/R9n9u3DOfcGyLy6nS0XLxSSMhqo1V8XU7wP7N29sOZRbuwgkzyH+wUlzAq LWqodYM6YXzGM5LURrxBcq48wxFwSbR0b+8lc6yqi9ESGpY6BNceZoBaEJ//F18sz7TLB3 nCA0WCloyWzcehCxR0Zzgo3f1Bksuic= ARC-Authentication-Results: i=1; imf15.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=hDzQzunN; dmarc=pass (policy=none) 
header.from=redhat.com; spf=pass (imf15.hostedemail.com: domain of peterx@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=peterx@redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1669750543; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=wQQtMmya3oIPlns9VTTG82x3u2HOgOUNM9IGJVSq3FI=; b=QDg+AHI6+hKMqwwmJmbJ1r7wK6xEqHOz1u0U7kO7bdn1lKT9ASH8OuPbwFgG6reHpZeVlp 33nbJXK/Ji7teP0PIDOz5OjWZBEIBU6sOIWjyQur9lKDmi4abXiICsfXiQ2hkFHNJC10DO kHZ5ZNOFnVFqnuWIX3hxFhbS27v5fAA= X-Rspamd-Queue-Id: 5426AA000B X-Rspam-User: Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=hDzQzunN; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf15.hostedemail.com: domain of peterx@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=peterx@redhat.com X-Rspamd-Server: rspam09 X-Stat-Signature: rw5d6ku8d51qz5eixqnptb9kzufif5no X-HE-Tag: 1669750543-6846 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

Since follow_hugetlb_page() walks the pgtable, it needs the vma lock to make sure the pgtable page will not be freed concurrently.

Signed-off-by: Peter Xu
Acked-by: David Hildenbrand
Reviewed-by: Mike Kravetz
---
 mm/hugetlb.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d6bb1d22f1c4..df645a5824e3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6290,6 +6290,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			break;
 		}
 
+		hugetlb_vma_lock_read(vma);
 		/*
 		 * Some archs (sparc64, sh*) have multiple pte_ts to
 		 * each hugepage. We have to make sure we get the
@@ -6314,6 +6315,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		    !hugetlbfs_pagecache_present(h, vma, vaddr)) {
 			if (pte)
 				spin_unlock(ptl);
+			hugetlb_vma_unlock_read(vma);
 			remainder = 0;
 			break;
 		}
@@ -6335,6 +6337,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 
 			if (pte)
 				spin_unlock(ptl);
+			hugetlb_vma_unlock_read(vma);
+
 			if (flags & FOLL_WRITE)
 				fault_flags |= FAULT_FLAG_WRITE;
 			else if (unshare)
@@ -6394,6 +6398,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			remainder -= pages_per_huge_page(h);
 			i += pages_per_huge_page(h);
 			spin_unlock(ptl);
+			hugetlb_vma_unlock_read(vma);
 			continue;
 		}
 
@@ -6421,6 +6426,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			if (WARN_ON_ONCE(!try_grab_folio(pages[i], refs,
 							 flags))) {
 				spin_unlock(ptl);
+				hugetlb_vma_unlock_read(vma);
 				remainder = 0;
 				err = -ENOMEM;
 				break;
@@ -6432,6 +6438,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		i += refs;
 
 		spin_unlock(ptl);
+		hugetlb_vma_unlock_read(vma);
 	}
 	*nr_pages = remainder;
 	/*

From patchwork Tue Nov 29 19:35:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13059092 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 223BCC433FE for ; Tue, 29 Nov 2022 19:35:47 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 560F48E0005; Tue, 29 Nov 2022 14:35:46 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 511BE6B0080; Tue, 29 Nov 2022 14:35:46 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2EEBD6B0081; Tue, 29 Nov 2022 14:35:46 -0500 (EST) X-Delivered-To:
linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 1C0A26B007E for ; Tue, 29 Nov 2022 14:35:46 -0500 (EST) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id CD60FAB708 for ; Tue, 29 Nov 2022 19:35:45 +0000 (UTC) X-FDA: 80187484650.15.5383F48 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf12.hostedemail.com (Postfix) with ESMTP id 5E3C640014 for ; Tue, 29 Nov 2022 19:35:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669750544; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tlgp91ck5zE/ROfq0OY9j8l1q/C3T3fwuEiiEDoRd8M=; b=PLmp9qAAVoXZ/DLCrhHauwbfIJ8eucPnqbLK/nRBJIKfSy2xRYyOfG9cpnvH4yNsOCxf60 ONlKONJCwUJht1zh2iZ82SSNfzcMDGnyIEYpeBPBqEMkp7MCCQCSIvZDQqthMmEHJpg43b 9OyN7osSfnd1FAHdBbWORvnupPB2HUQ= Received: from mail-qk1-f197.google.com (mail-qk1-f197.google.com [209.85.222.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-372-hYxXIdGFNsisqjHvMTJYag-1; Tue, 29 Nov 2022 14:35:43 -0500 X-MC-Unique: hYxXIdGFNsisqjHvMTJYag-1 Received: by mail-qk1-f197.google.com with SMTP id de43-20020a05620a372b00b006fae7e5117fso30768615qkb.6 for ; Tue, 29 Nov 2022 11:35:43 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tlgp91ck5zE/ROfq0OY9j8l1q/C3T3fwuEiiEDoRd8M=; b=HgAM9ezP9m/1bF9tZSqTPfU+9+MN/6MA4bqHGwgxgViruFTHbmpdXZKFVMsgubrHEk 
y/hHGUreI/wg8fqDZQ5/xdqSkSjMRdpQ+3CQXxgteHJssuKk3tIg3+xzHef/6EqUPSr+ fqwCYK57I408Llxt5MRh3ujiPQd5ah4mF+txeo2YU2j9Bgj1cO8m6mHgDWwzv5Cdem8m O5Jd65UVFij6PLHLtNNhVaY91acdL34l98i3Ak48lHX24OFuQbwKOrPwFkkU67ZlUFtY 64aLn/XZYyzJiTWpRLXeTf9cW7/3oywD6vTs9P6xzJ1NKW9XlEB7sa9pYERJrm7Dty7S +eUg== X-Gm-Message-State: ANoB5plEp42nVpQ1p3YvoQwFg/mUrrPhvnBi/Y9KMY4OqXtSqjovJljc kPkJYC0LI/dCbETYCiiykYZjtp5rVF7sKDU29nXXBWlCDNk5oNIRwBtbgxMeMUjg8SK/gszU2tR ZeEfdmH+fwf4B1IyJ62Aa5mHuiPLtyBZyZinal8qOL4yWMxplt8wXvnlKh+Y9 X-Received: by 2002:a05:6214:3607:b0:4c6:fb3e:4993 with SMTP id nv7-20020a056214360700b004c6fb3e4993mr12852452qvb.110.1669750542569; Tue, 29 Nov 2022 11:35:42 -0800 (PST) X-Google-Smtp-Source: AA0mqf4xmfsUZVgeiYSekMVU4CvYl69LMLxbBsqa8kFerZrz4PcOvOhvTZpsuJGzn4pvxETRAceDKA== X-Received: by 2002:a05:6214:3607:b0:4c6:fb3e:4993 with SMTP id nv7-20020a056214360700b004c6fb3e4993mr12852419qvb.110.1669750542285; Tue, 29 Nov 2022 11:35:42 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. 
[70.31.27.79]) by smtp.gmail.com with ESMTPSA id n1-20020a05620a294100b006fa16fe93bbsm11313013qkp.15.2022.11.29.11.35.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 29 Nov 2022 11:35:41 -0800 (PST) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: James Houghton , Jann Horn , peterx@redhat.com, Andrew Morton , Andrea Arcangeli , Rik van Riel , Nadav Amit , Miaohe Lin , Muchun Song , Mike Kravetz , David Hildenbrand Subject: [PATCH 08/10] mm/hugetlb: Make walk_hugetlb_range() safe to pmd unshare Date: Tue, 29 Nov 2022 14:35:24 -0500 Message-Id: <20221129193526.3588187-9-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com> References: <20221129193526.3588187-1-peterx@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-type: text/plain ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1669750545; a=rsa-sha256; cv=none; b=ww37ECfSUbnnr1fupMrzWqBpTeT1tUKMiUDBQ4U7sw8ZC+SK+zhAyfXvP9PLh6yOMWwsVG QeyfvM/s/GJmegiGdtbmWzWKQGZGUqdBS2G8R3RTMMSqizzH78Egs5sfbCucx9hYzT0aEw 5Z6s1/XA4yXin+ZykWEw9xex/s+X+q8= ARC-Authentication-Results: i=1; imf12.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=PLmp9qAA; spf=pass (imf12.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1669750545; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=tlgp91ck5zE/ROfq0OY9j8l1q/C3T3fwuEiiEDoRd8M=; b=B/ZRh6W+AdRKSvmFhhYSf+ZfJLGVmqEEGDbE1x2KGhEU7+F0YHTx45fhUakXakfNDTsINb 
O/+G7BhL+QPedu+Nbmpfcd1SbjdarGMFr/x7QzUYDpXrxZi+jQBZ5iea9iGFXfy9ATSHMX lRtFtU4ZUojMEVGB7MfdJbdhR/O5VTc= X-Stat-Signature: fzmrabo9az71j8qciwyzqotq8yeduoiq X-Rspamd-Queue-Id: 5E3C640014 X-Rspam-User: Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=PLmp9qAA; spf=pass (imf12.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspamd-Server: rspam05 X-HE-Tag: 1669750545-290768 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Since walk_hugetlb_range() walks the pgtable, it needs the vma lock to make sure the pgtable page will not be freed concurrently. Signed-off-by: Peter Xu Acked-by: David Hildenbrand Reviewed-by: Mike Kravetz --- mm/pagewalk.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/pagewalk.c b/mm/pagewalk.c index 7f1c9b274906..d98564a7be57 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -302,6 +302,7 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end, const struct mm_walk_ops *ops = walk->ops; int err = 0; + hugetlb_vma_lock_read(vma); do { next = hugetlb_entry_end(h, addr, end); pte = huge_pte_offset(walk->mm, addr & hmask, sz); @@ -314,6 +315,7 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end, if (err) break; } while (addr = next, addr != end); + hugetlb_vma_unlock_read(vma); return err; } From patchwork Tue Nov 29 19:35:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13059093 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton, Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin, Muchun Song, Mike Kravetz, David Hildenbrand
Subject: [PATCH 09/10] mm/hugetlb: Make page_vma_mapped_walk() safe to pmd unshare
Date: Tue, 29 Nov 2022 14:35:25 -0500
Message-Id: <20221129193526.3588187-10-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>

Since page_vma_mapped_walk() walks the pgtable, it needs the vma lock
to make sure the pgtable page will not be freed concurrently.

Signed-off-by: Peter Xu
Acked-by: David Hildenbrand
Reviewed-by: Mike Kravetz
---
 include/linux/rmap.h | 4 ++++
 mm/page_vma_mapped.c | 5 ++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index bd3504d11b15..a50d18bb86aa 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 
 /*
  * The anon_vma heads a list of private "related" vmas, to scan if
@@ -408,6 +409,9 @@ static inline void page_vma_mapped_walk_done(struct page_vma_mapped_walk *pvmw)
 		pte_unmap(pvmw->pte);
 	if (pvmw->ptl)
 		spin_unlock(pvmw->ptl);
+	/* This needs to be after unlock of the spinlock */
+	if (is_vm_hugetlb_page(pvmw->vma))
+		hugetlb_vma_unlock_read(pvmw->vma);
 }
 
 bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw);
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 93e13fc17d3c..f94ec78b54ff 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -169,10 +169,13 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 		if (pvmw->pte)
 			return not_found(pvmw);
 
+		hugetlb_vma_lock_read(vma);
 		/* when pud is not present, pte will be NULL */
 		pvmw->pte = huge_pte_offset(mm, pvmw->address, size);
-		if (!pvmw->pte)
+		if (!pvmw->pte) {
+			hugetlb_vma_unlock_read(vma);
 			return false;
+		}
 
 		pvmw->ptl = huge_pte_lock(hstate, mm, pvmw->pte);
 		if (!check_pte(pvmw))

From patchwork Tue Nov 29 19:35:26 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13059094
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton, Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin, Muchun Song, Mike Kravetz, David Hildenbrand
Subject: [PATCH 10/10] mm/hugetlb: Introduce hugetlb_walk()
Date: Tue, 29 Nov 2022 14:35:26 -0500
Message-Id: <20221129193526.3588187-11-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>

huge_pte_offset() is the main walker function for hugetlb pgtables. The
name does not really reflect what it does, though.

Instead of renaming it, introduce a wrapper function called hugetlb_walk()
which will use huge_pte_offset() inside. Assert on the locks when walking
the pgtable.

Note that the vma lock assertion is a no-op for private mappings.
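For illustration only, the contract that hugetlb_walk() asserts can be
modelled in userspace C. This is a minimal sketch, not kernel code:
struct toy_vma, toy_vma_lock_read(), toy_vma_unlock_read(),
toy_pte_offset() and toy_hugetlb_walk() are hypothetical stand-ins, the
locks are reduced to counters, and a warning counter replaces
WARN_ON_ONCE() (so it counts every violation rather than only the first).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these are kernel types. */
struct toy_vma {
	bool shareable;          /* models __vma_shareable_flags_pmd() */
	int vma_lock_depth;      /* models the hugetlb vma lock being held */
	int mapping_lock_depth;  /* models i_mmap_rwsem being held */
	int *entries;            /* toy pgtable; 0 means "not mapped" */
	size_t nr_entries;
	unsigned int warnings;   /* stands in for WARN_ON_ONCE() firing */
};

/* Like the real helpers, locking is a no-op for non-shareable vmas. */
static void toy_vma_lock_read(struct toy_vma *vma)
{
	if (vma->shareable)
		vma->vma_lock_depth++;
}

static void toy_vma_unlock_read(struct toy_vma *vma)
{
	if (vma->shareable)
		vma->vma_lock_depth--;
}

/* Raw walker: returns the entry, or NULL when nothing is mapped. */
static int *toy_pte_offset(struct toy_vma *vma, size_t idx)
{
	if (idx >= vma->nr_entries || vma->entries[idx] == 0)
		return NULL;
	return &vma->entries[idx];
}

/*
 * Checked wrapper: complain when a shareable vma is walked with
 * neither lock held, then fall through to the raw walker.
 */
static int *toy_hugetlb_walk(struct toy_vma *vma, size_t idx)
{
	if (vma->shareable &&
	    vma->vma_lock_depth == 0 && vma->mapping_lock_depth == 0)
		vma->warnings++;
	return toy_pte_offset(vma, idx);
}
```

A caller in this model takes the read lock before walking and drops it
afterwards. Walking a shareable toy_vma without either lock bumps
warnings but still returns the entry, mirroring how the assertion
reports the unsafe walk rather than preventing it.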
Signed-off-by: Peter Xu
Reviewed-by: Mike Kravetz
---
 fs/hugetlbfs/inode.c    |  4 +---
 fs/userfaultfd.c        |  6 ++----
 include/linux/hugetlb.h | 37 +++++++++++++++++++++++++++++++++++++
 mm/hugetlb.c            | 34 ++++++++++++++--------------------
 mm/page_vma_mapped.c    |  2 +-
 mm/pagewalk.c           |  4 +---
 6 files changed, 56 insertions(+), 31 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index fdb16246f46e..48f1a8ad2243 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -388,9 +388,7 @@ static bool hugetlb_vma_maps_page(struct vm_area_struct *vma,
 {
 	pte_t *ptep, pte;
 
-	ptep = huge_pte_offset(vma->vm_mm, addr,
-			huge_page_size(hstate_vma(vma)));
-
+	ptep = hugetlb_walk(vma, addr, huge_page_size(hstate_vma(vma)));
 	if (!ptep)
 		return false;
 
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index a602f008dde5..f31fe1a9f4c5 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -237,14 +237,12 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx,
 					 unsigned long flags,
 					 unsigned long reason)
 {
-	struct mm_struct *mm = ctx->mm;
 	pte_t *ptep, pte;
 	bool ret = true;
 
-	mmap_assert_locked(mm);
-
-	ptep = huge_pte_offset(mm, address, vma_mmu_pagesize(vma));
+	mmap_assert_locked(ctx->mm);
+	ptep = hugetlb_walk(vma, address, vma_mmu_pagesize(vma));
 
 	if (!ptep)
 		goto out;
 
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 81efd9b9baa2..1a51c45fdf2e 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -196,6 +196,11 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
  * huge_pte_offset(): Walk the hugetlb pgtable until the last level PTE.
  * Returns the pte_t* if found, or NULL if the address is not mapped.
  *
+ * IMPORTANT: we should normally not directly call this function, instead
+ * this is only a common interface to implement arch-specific walker.
+ * Please consider using the hugetlb_walk() helper to make sure of the
+ * correct locking is satisfied.
+ *
  * Since this function will walk all the pgtable pages (including not only
  * high-level pgtable page, but also PUD entry that can be unshared
  * concurrently for VM_SHARED), the caller of this function should be
@@ -1229,4 +1234,36 @@ bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr);
 #define flush_hugetlb_tlb_range(vma, addr, end) flush_tlb_range(vma, addr, end)
 #endif
 
+static inline bool
+__vma_shareable_flags_pmd(struct vm_area_struct *vma)
+{
+	return vma->vm_flags & (VM_MAYSHARE | VM_SHARED) &&
+		vma->vm_private_data;
+}
+
+/*
+ * Safe version of huge_pte_offset() to check the locks. See comments
+ * above huge_pte_offset().
+ */
+static inline pte_t *
+hugetlb_walk(struct vm_area_struct *vma, unsigned long addr, unsigned long sz)
+{
+#if defined(CONFIG_ARCH_WANT_HUGE_PMD_SHARE) && defined(CONFIG_LOCKDEP)
+	struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
+
+	/*
+	 * If pmd sharing possible, locking needed to safely walk the
+	 * hugetlb pgtables. More information can be found at the comment
+	 * above huge_pte_offset() in the same file.
+	 *
+	 * NOTE: lockdep_is_held() is only defined with CONFIG_LOCKDEP.
+	 */
+	if (__vma_shareable_flags_pmd(vma))
+		WARN_ON_ONCE(!lockdep_is_held(&vma_lock->rw_sema) &&
+			     !lockdep_is_held(
+				 &vma->vm_file->f_mapping->i_mmap_rwsem));
+#endif
+	return huge_pte_offset(vma->vm_mm, addr, sz);
+}
+
 #endif /* _LINUX_HUGETLB_H */
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index df645a5824e3..05867e82b467 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4816,7 +4816,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	} else {
 		/*
 		 * For shared mappings the vma lock must be held before
-		 * calling huge_pte_offset in the src vma. Otherwise, the
+		 * calling hugetlb_walk() in the src vma. Otherwise, the
 		 * returned ptep could go away if part of a shared pmd and
 		 * another thread calls huge_pmd_unshare.
 		 */
@@ -4826,7 +4826,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	last_addr_mask = hugetlb_mask_last_page(h);
 	for (addr = src_vma->vm_start; addr < src_vma->vm_end; addr += sz) {
 		spinlock_t *src_ptl, *dst_ptl;
-		src_pte = huge_pte_offset(src, addr, sz);
+		src_pte = hugetlb_walk(src_vma, addr, sz);
 		if (!src_pte) {
 			addr |= last_addr_mask;
 			continue;
@@ -5030,7 +5030,7 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
 	hugetlb_vma_lock_write(vma);
 	i_mmap_lock_write(mapping);
 	for (; old_addr < old_end; old_addr += sz, new_addr += sz) {
-		src_pte = huge_pte_offset(mm, old_addr, sz);
+		src_pte = hugetlb_walk(vma, old_addr, sz);
 		if (!src_pte) {
 			old_addr |= last_addr_mask;
 			new_addr |= last_addr_mask;
@@ -5093,7 +5093,7 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
 	last_addr_mask = hugetlb_mask_last_page(h);
 	address = start;
 	for (; address < end; address += sz) {
-		ptep = huge_pte_offset(mm, address, sz);
+		ptep = hugetlb_walk(vma, address, sz);
 		if (!ptep) {
 			address |= last_addr_mask;
 			continue;
@@ -5406,7 +5406,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 			mutex_lock(&hugetlb_fault_mutex_table[hash]);
 			hugetlb_vma_lock_read(vma);
 			spin_lock(ptl);
-			ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
+			ptep = hugetlb_walk(vma, haddr, huge_page_size(h));
 			if (likely(ptep &&
 				   pte_same(huge_ptep_get(ptep), pte)))
 				goto retry_avoidcopy;
@@ -5444,7 +5444,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * before the page tables are altered
 	 */
 	spin_lock(ptl);
-	ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
+	ptep = hugetlb_walk(vma, haddr, huge_page_size(h));
 	if (likely(ptep && pte_same(huge_ptep_get(ptep), pte))) {
 		/* Break COW or unshare */
 		huge_ptep_clear_flush(vma, haddr, ptep);
@@ -5841,7 +5841,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * until finished with ptep. This prevents huge_pmd_unshare from
 	 * being called elsewhere and making the ptep no longer valid.
 	 *
-	 * ptep could have already be assigned via huge_pte_offset. That
+	 * ptep could have already be assigned via hugetlb_walk(). That
 	 * is OK, as huge_pte_alloc will return the same value unless
 	 * something has changed.
 	 */
@@ -6233,7 +6233,7 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 		return NULL;
 
 	hugetlb_vma_lock_read(vma);
-	pte = huge_pte_offset(mm, haddr, huge_page_size(h));
+	pte = hugetlb_walk(vma, haddr, huge_page_size(h));
 	if (!pte)
 		goto out_unlock;
 
@@ -6298,8 +6298,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		 *
 		 * Note that page table lock is not held when pte is null.
 		 */
-		pte = huge_pte_offset(mm, vaddr & huge_page_mask(h),
-				      huge_page_size(h));
+		pte = hugetlb_walk(vma, vaddr & huge_page_mask(h),
+				   huge_page_size(h));
 		if (pte)
 			ptl = huge_pte_lock(h, mm, pte);
 		absent = !pte || huge_pte_none(huge_ptep_get(pte));
@@ -6485,7 +6485,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	last_addr_mask = hugetlb_mask_last_page(h);
 	for (; address < end; address += psize) {
 		spinlock_t *ptl;
-		ptep = huge_pte_offset(mm, address, psize);
+		ptep = hugetlb_walk(vma, address, psize);
 		if (!ptep) {
 			address |= last_addr_mask;
 			continue;
@@ -6863,12 +6863,6 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
 		*end = ALIGN(*end, PUD_SIZE);
 }
 
-static bool __vma_shareable_flags_pmd(struct vm_area_struct *vma)
-{
-	return vma->vm_flags & (VM_MAYSHARE | VM_SHARED) &&
-		vma->vm_private_data;
-}
-
 void hugetlb_vma_lock_read(struct vm_area_struct *vma)
 {
 	if (__vma_shareable_flags_pmd(vma)) {
@@ -7034,8 +7028,8 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 
 		saddr = page_table_shareable(svma, vma, addr, idx);
 		if (saddr) {
-			spte = huge_pte_offset(svma->vm_mm, saddr,
-					       vma_mmu_pagesize(svma));
+			spte = hugetlb_walk(svma, saddr,
+					    vma_mmu_pagesize(svma));
 			if (spte) {
 				get_page(virt_to_page(spte));
 				break;
@@ -7394,7 +7388,7 @@ void hugetlb_unshare_all_pmds(struct vm_area_struct *vma)
 	hugetlb_vma_lock_write(vma);
 	i_mmap_lock_write(vma->vm_file->f_mapping);
 	for (address = start; address < end; address += PUD_SIZE) {
-		ptep = huge_pte_offset(mm, address, sz);
+		ptep = hugetlb_walk(vma, address, sz);
 		if (!ptep)
 			continue;
 		ptl = huge_pte_lock(h, mm, ptep);
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index f94ec78b54ff..bb782dea4b42 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -171,7 +171,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 
 		hugetlb_vma_lock_read(vma);
 		/* when pud is not present, pte will be NULL */
-		pvmw->pte = huge_pte_offset(mm, pvmw->address, size);
+		pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
 		if (!pvmw->pte) {
 			hugetlb_vma_unlock_read(vma);
 			return false;
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index d98564a7be57..cb23f8a15c13 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -305,13 +305,11 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end,
 	hugetlb_vma_lock_read(vma);
 	do {
 		next = hugetlb_entry_end(h, addr, end);
-		pte = huge_pte_offset(walk->mm, addr & hmask, sz);
-
+		pte = hugetlb_walk(vma, addr & hmask, sz);
 		if (pte)
 			err = ops->hugetlb_entry(pte, hmask, addr, next, walk);
 		else if (ops->pte_hole)
 			err = ops->pte_hole(addr, next, -1, walk);
-
 		if (err)
 			break;
 	} while (addr = next, addr != end);