From patchwork Wed Sep 14 22:18:02 2022
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 12976586
From: Mike Kravetz <mike.kravetz@oracle.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Muchun Song, Miaohe Lin, David Hildenbrand, Sven Schnelle,
 Michal Hocko, Peter Xu, Naoya Horiguchi, Aneesh Kumar K.V,
 Andrea Arcangeli, Kirill A. Shutemov, Davidlohr Bueso,
 Prakash Sangappa, James Houghton, Mina Almasry, Pasha Tatashin,
 Axel Rasmussen, Ray Fucillo, Andrew Morton, Mike Kravetz
Subject: [PATCH v2 1/9] hugetlbfs: revert use i_mmap_rwsem to address page
 fault/truncate race
Date: Wed, 14 Sep 2022 15:18:02 -0700
Message-Id: <20220914221810.95771-2-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20220914221810.95771-1-mike.kravetz@oracle.com>
References: <20220914221810.95771-1-mike.kravetz@oracle.com>

Commit c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing
synchronization") added code to take i_mmap_rwsem in read mode for the
duration of fault processing.  The use of i_mmap_rwsem to prevent
fault/truncate races depends on this.  However, this has been shown to
cause performance/scaling issues.  As a result, that code will be
reverted.
Since the use of i_mmap_rwsem to address page fault/truncate races
depends on this, it must also be reverted.  In a subsequent patch, code
will be added to detect the fault/truncate race and back out operations
as required.

Signed-off-by: Mike Kravetz
Reviewed-by: Miaohe Lin
---
 fs/hugetlbfs/inode.c | 30 +++++++++---------------------
 mm/hugetlb.c         | 22 +++++++++++-----------
 2 files changed, 20 insertions(+), 32 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index f7a5b5124d8a..a32031e751d1 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -419,9 +419,10 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
  * In this case, we first scan the range and release found pages.
  * After releasing pages, hugetlb_unreserve_pages cleans up region/reserve
  * maps and global counts.  Page faults can not race with truncation
- * in this routine.  hugetlb_no_page() holds i_mmap_rwsem and prevents
- * page faults in the truncated range by checking i_size.  i_size is
- * modified while holding i_mmap_rwsem.
+ * in this routine.  hugetlb_no_page() prevents page faults in the
+ * truncated range.  It checks i_size before allocation, and again after
+ * with the page table lock for the page held.  The same lock must be
+ * acquired to unmap a page.
  * hole punch is indicated if end is not LLONG_MAX
  * In the hole punch case we scan the range and release found pages.
  * Only when releasing a page is the associated region/reserve map
@@ -451,16 +452,8 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
 			u32 hash = 0;
 
 			index = folio->index;
-			if (!truncate_op) {
-				/*
-				 * Only need to hold the fault mutex in the
-				 * hole punch case.  This prevents races with
-				 * page faults.  Races are not possible in the
-				 * case of truncation.
-				 */
-				hash = hugetlb_fault_mutex_hash(mapping, index);
-				mutex_lock(&hugetlb_fault_mutex_table[hash]);
-			}
+			hash = hugetlb_fault_mutex_hash(mapping, index);
+			mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 			/*
 			 * If folio is mapped, it was faulted in after being
@@ -504,8 +497,7 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
 			}
 			folio_unlock(folio);
-			if (!truncate_op)
-				mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 		}
 		folio_batch_release(&fbatch);
 		cond_resched();
@@ -543,8 +535,8 @@ static void hugetlb_vmtruncate(struct inode *inode, loff_t offset)
 	BUG_ON(offset & ~huge_page_mask(h));
 	pgoff = offset >> PAGE_SHIFT;
 
-	i_mmap_lock_write(mapping);
 	i_size_write(inode, offset);
+	i_mmap_lock_write(mapping);
 	if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))
 		hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0,
 				      ZAP_FLAG_DROP_MARKER);
@@ -703,11 +695,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		/* addr is the offset within the file (zero based) */
 		addr = index * hpage_size;
 
-		/*
-		 * fault mutex taken here, protects against fault path
-		 * and hole punch.  inode_lock previously taken protects
-		 * against truncation.
-		 */
+		/* mutex taken here, fault path and hole punch */
 		hash = hugetlb_fault_mutex_hash(mapping, index);
 		mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c6b53bcf823d..6c97b97aa252 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5559,17 +5559,15 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	}
 
 	/*
-	 * We can not race with truncation due to holding i_mmap_rwsem.
-	 * i_size is modified when holding i_mmap_rwsem, so check here
-	 * once for faults beyond end of file.
+	 * Use page lock to guard against racing truncation
+	 * before we get page_table_lock.
 	 */
-	size = i_size_read(mapping->host) >> huge_page_shift(h);
-	if (idx >= size)
-		goto out;
-
 	new_page = false;
 	page = find_lock_page(mapping, idx);
 	if (!page) {
+		size = i_size_read(mapping->host) >> huge_page_shift(h);
+		if (idx >= size)
+			goto out;
+
 		/* Check for page in userfault range */
 		if (userfaultfd_missing(vma)) {
 			ret = hugetlb_handle_userfault(vma, mapping, idx,
@@ -5665,6 +5663,10 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	}
 
 	ptl = huge_pte_lock(h, mm, ptep);
+	size = i_size_read(mapping->host) >> huge_page_shift(h);
+	if (idx >= size)
+		goto backout;
+
 	ret = 0;
 	/* If pte changed from under us, retry */
 	if (!pte_same(huge_ptep_get(ptep), old_pte))
@@ -5773,10 +5775,8 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 
 	/*
 	 * Acquire i_mmap_rwsem before calling huge_pte_alloc and hold
-	 * until finished with ptep.  This serves two purposes:
-	 * 1) It prevents huge_pmd_unshare from being called elsewhere
-	 *    and making the ptep no longer valid.
-	 * 2) It synchronizes us with i_size modifications during truncation.
+	 * until finished with ptep.  This prevents huge_pmd_unshare from
+	 * being called elsewhere and making the ptep no longer valid.
 	 *
 	 * ptep could have already be assigned via huge_pte_offset.
That * is OK, as huge_pte_alloc will return the same value unless From patchwork Wed Sep 14 22:18:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Kravetz X-Patchwork-Id: 12976588 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA5EFECAAD3 for ; Wed, 14 Sep 2022 22:18:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 65B9D80007; Wed, 14 Sep 2022 18:18:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 60BC2940007; Wed, 14 Sep 2022 18:18:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3ED4D80007; Wed, 14 Sep 2022 18:18:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 240E2940009 for ; Wed, 14 Sep 2022 18:18:42 -0400 (EDT) Received: from smtpin01.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id D48C7A03C0 for ; Wed, 14 Sep 2022 22:18:41 +0000 (UTC) X-FDA: 79912106442.01.C4CBAE8 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) by imf17.hostedemail.com (Postfix) with ESMTP id 74CB240096 for ; Wed, 14 Sep 2022 22:18:41 +0000 (UTC) Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 28EMA2PU000788; Wed, 14 Sep 2022 22:18:26 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2022-7-12; bh=WWgFa5uvH4cxJCWBo6D7vjTXo0sJrhLbYOZnN6Af1kM=; 
b=Kko7SIoA+IIeIrBt1b+QlvmLxy8+khTSidnYyUxPK6eIOftFtdQehjRQkCZ2+INUnQWY U5adHVLIOVTEihsBS4Y/U+KN1auWV/hDoqefSC9FOpyK9RyUuPOGhkxvcNzUlNDiJhMs a846rObNLAuLIQhkZ72vtdGVWCWyrKmrX9e4ZMXDQ8Mln1sBAeyBKG1bBl5MqpgKT5Lv Xp+x0RxXyray2na+gJkQD90p0eNDRcrQoYbX47vHy1kquC36Lc0UozD8E8UR1kSR0Kmu A9gih2zd+YX6CtWhVLcZvT4Grf8jSCXu7AXR8xngrjV2emFh0Zc5trQqSSRIuQ4Act6z fg== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 3jjxyf3jn1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 14 Sep 2022 22:18:26 +0000 Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.5/8.17.1.5) with ESMTP id 28ELrNe6006549; Wed, 14 Sep 2022 22:18:25 GMT Received: from nam11-bn8-obe.outbound.protection.outlook.com (mail-bn8nam11lp2169.outbound.protection.outlook.com [104.47.58.169]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 3jjy2bjuf4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 14 Sep 2022 22:18:25 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=oEhAZ8aUK7pbhgyuahZi2B3/lEIHd6L3qE6Efo3abq2tcAReL74dDItAZz24azsKmDM82SdhBT33gwFU9QpDpM22Mzj7VojMcwL4xb9V73uwswaURXn3FuHz7c9jH9jFoMDTOxrggjapcvhQvxOv1YIIupZQ4oXZBaw3IBe2WKplLFItuCAxLo4NT1xjJGIiyEy8kyt8i8/se3nK04mIWn+aTj7WTqx3QyjouyRCbCuVV5p/qrGHagDjLpVVG8td1w/5yToseYxTcGMBQWeSUVoqOBZmBV7lRwu8Xj6P/QpvgvKUGjCSjUQfSV5x4p2bmIhpAk4WJSyXdilK4Fn4Kw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=WWgFa5uvH4cxJCWBo6D7vjTXo0sJrhLbYOZnN6Af1kM=; 
b=WTqzQ9zAZkug+MRRZUmNGsFixN4HDGnGcQQqDlPvwbz43gB3B4tZ8skyKZiPeC9SLEXUj/0A3RzygE+zaMdXZFgZ2htfkA/e/YVks1CsyRZ2qMUfnV7mGQahsbYglpwvmCZLVlbOSm8cfcZhZKdT1tyKELMQYh/uleaJJrt2s5FwD2ekoP/fSKwH+VDh6OrNw/F3+p3L8fKReOTEb4aOnpkNEmTVjXgDS4D7w29wUp8wBAENzv/ceJTsk1HJrF7ZnHWYExSvt/WjET34B2KMcPcGXTa74bQYGS0nrOCe3qg+Ut1BeTunTVbUUSaqcBv1UFOm6r34HKOPw4QG8biXSA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=WWgFa5uvH4cxJCWBo6D7vjTXo0sJrhLbYOZnN6Af1kM=; b=DqpXcWI4rU3rLaWzAXHSALgMM0eLKCEtk5OZUy8EqSUIE+kWtBWbpwidcwuHksXYASFkPJS5SsXi05REweBXdAIB9oSo81+LpDGxkeTCvJ5/Um/++1IP2B8xVCtpyXD3d6D4VsntmGd+xdF1+nljaNZ0QBjciqxW372eNkxA7ck= Received: from BY5PR10MB4196.namprd10.prod.outlook.com (2603:10b6:a03:20d::23) by SA2PR10MB4745.namprd10.prod.outlook.com (2603:10b6:806:11b::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5612.23; Wed, 14 Sep 2022 22:18:23 +0000 Received: from BY5PR10MB4196.namprd10.prod.outlook.com ([fe80::e9d2:a804:e53a:779a]) by BY5PR10MB4196.namprd10.prod.outlook.com ([fe80::e9d2:a804:e53a:779a%6]) with mapi id 15.20.5612.022; Wed, 14 Sep 2022 22:18:23 +0000 From: Mike Kravetz To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Muchun Song , Miaohe Lin , David Hildenbrand , Sven Schnelle , Michal Hocko , Peter Xu , Naoya Horiguchi , "Aneesh Kumar K . V" , Andrea Arcangeli , "Kirill A . 
Shutemov" , Davidlohr Bueso , Prakash Sangappa , James Houghton , Mina Almasry , Pasha Tatashin , Axel Rasmussen , Ray Fucillo , Andrew Morton , Mike Kravetz Subject: [PATCH v2 2/9] hugetlbfs: revert use i_mmap_rwsem for more pmd sharing synchronization Date: Wed, 14 Sep 2022 15:18:03 -0700 Message-Id: <20220914221810.95771-3-mike.kravetz@oracle.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220914221810.95771-1-mike.kravetz@oracle.com> References: <20220914221810.95771-1-mike.kravetz@oracle.com> X-ClientProxiedBy: MW4PR04CA0312.namprd04.prod.outlook.com (2603:10b6:303:82::17) To BY5PR10MB4196.namprd10.prod.outlook.com (2603:10b6:a03:20d::23) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: BY5PR10MB4196:EE_|SA2PR10MB4745:EE_ X-MS-Office365-Filtering-Correlation-Id: 6443825b-dd8c-4e61-6ea1-08da969f09be X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: Lo5UJ63f2XZtY8xEQxrekAbBjEUG4EVI69sZIXmYp6y/8KwszMQWYmyE95il8gZguYqXwgqzFYzJFzvVzH1YBcCcN0leq9Zu7q+u6k4b609YIJy0m3Ty61n12zSFuRSj6mEJ/RFgy/82eAHEda8eWpiGnz3MFE/JVBWYlWt49SofhFhtvYUN0f9INNgeIqLMyViYkQXrzW5DsyLfkfp9CtDcsWg16ki8IUqMeBF0l7OfpRtmV2o/iRKoUaBM1IJVyYoZD4/WhZ0Pe4Ejb/s0TkG0jDqAamB6AhkW4x4pEYLmShk/cmkqOFe8Pq4ZV4n7bI+gWTKLoIE5gA/++4RGs5PEd+TnqVo2/OcqE3pej546bgTxzCnthT4gzI7PJCp+sBHJ+x3u3K4ztkPLbYuQJYzKRXipAN8Ypj2jY21brL8XyeC9GdWEOvvpilUmIxZhLUZrA0i7izhp5XgrF1xskmrdgWsUXRT18+hTlvRfQ0AWWLQdp/J8WX4q5wOBb3i0/hcRYRvQaLQvRuYt6gMwnjYkmkhXMXypDX1y6PcbB31+hSprpCO8qiKdrzSk9R7qMDDHego7z8BsXvNQIquv8l127ELwsv2QkuEnvtvlxuhmS3+YjxnRDS6wu0BRgrEPj9UiNy/80xz4lpaXsLBLpS+mNEKu2R1rmzI3fvJ83b0o97aZNGBJONjm4VAUr1lsNNP5WAKAV0hw5pjm7/b5AA== X-Forefront-Antispam-Report: 
CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:BY5PR10MB4196.namprd10.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230022)(376002)(346002)(39860400002)(136003)(366004)(396003)(451199015)(478600001)(38100700002)(54906003)(316002)(36756003)(6506007)(8676002)(4326008)(66946007)(83380400001)(186003)(2616005)(1076003)(5660300002)(41300700001)(107886003)(66476007)(66556008)(86362001)(8936002)(6666004)(6486002)(7416002)(2906002)(26005)(30864003)(6512007)(44832011);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: J1Ymtx/Ci0K3NDy6iwzUctn2M4kd81dSEpUttHF/UDXnUZ0kxS1QapZJRWv5P7dF8StBPZa2r2haYQEnF8cJhl/hA3NqyLv6fh/F2aVTxYaNZ1i0c8bIAYEw0SmmCx6LVEzB+RdQ8jQfIdZV0uQ4IlfA3fz2Um7bSzYV3osOlce36NviJktJ7ef9hZ2THrkz97k1kiEivNsaaUScUJ9XscmyELiTIrHmMBiwJm6OxS4I0hPbhIhavAO9zqzGbn1fOml6SyHE8Dc7FDvyRO7pV3DKQiacoeHDAR5eDKA9GyKK3cppl0CP28rkGust0hmLwiN2t9MOCz7yhBu3htdwhVzj6PKeGt2/PhnbSDNALLYWtXVOy4EC3YooF/YgbQ+tJhMik5ltb/LnCzW3a9QE0bM+7CB6362o+3fPBbGS/0ZOzeT0+xuJDoLnBhNgNRJEyvtFWPHkTMbh2TE0Q59ucn7VMgfIBi1w54R6fLWt4z7sk4CQFeRobSaHsIRm3edaI8YO7+AEMX5DeaN8g5H48+pNkctnGFcGuRNbanzfc88x3SwDQHmmRiEeg1hpxC/xQ5xp+WkvRTt9f17kH73Pw/b9tgA06Nsc1Q/IqHkWdX+Iyb5gJ7aUXEUxmhoC+BikF8/Qg8goKQGrB0qnHMhcjC3HTrWZYB/hlu+Smn1eSwXwC9U/PhEcbpUlHvLgZLE+uit4Qn4tzFxxn5e62oTi1t/JaXlFxirPTb1xccEi+h9P2MlvpPxKd47ugNo+ed4DIGOIqizYKVVYQBkpbIAailtNL2USXhSaMykzDy3f4LtiCGGalUefhlBOw0TA6lheD/6mBhdsJjFCodSQexITLisF4Nnz5K7LfijIsiAFxiC+LQJWnS12QNOU5Lgc1lVTqhBra82zYePQxoXNzDdq00IpYqVxjoy4d51oUi+wcqxO5bpJjUAQuKn0Rh7HsV4n+bRzu2OkCXVXuw7VFpHR1TXzs9S48OY/zJhbHIiSh2OVG+YwnsvM/ocaqpEQ3tNEjs5e4gynCdTlQbZkdbd1UZyK+JWaoJLc6zq3b1ZZeHDaM4AyEiL+IAeXTbV7mUtXgsIewMApFkpuf1V7+75GDn+Kt/YqlemWuEMdLSxf7+oDnDQ4Z88Sfd+W/Caq00kKDjbmlIi5dV9KMCVUFk+aeS9q3HzI898J7jEf/3ZXBaQH2DrUuqZO/Nlp27gJhCq7uvuI6HIoM0zeG3s1Sm6RJiVeSPBDHiZJFI9mBF9OclEdaWNBP0PBVjn6Xj2oYwzfNeoOuy07xBs8fl/RKW+7UUTgzDs+gQv3pEhtz9caew4ZhXASj4SA018asQhVy78ZM88u/Uq7qx+/1CV9azM9TMPPr2WQQ2+6Huh9EoPAnobsOm5KnF2rUETDRRkF2YPDhkuljw7kE1GXa7k
1SPLiwnEdVqdItHKzHszktaqarm4lcGvT2cHJOuFM2HCMYak+LABHx5R8Qnvdr7iqtOxI2IL9bMUVemCTdM9cCPt8MCCDtKTAWN6Mi73pLG2fLVfU1KaO9UjHtOeIrj8+cfTsW/ip7QP+dZSANGOWMHZpaG7KNd3kZWbiEGFPh4zTPql5kyu7rU7WsP3SHsbk5mhY1A== X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: 6443825b-dd8c-4e61-6ea1-08da969f09be X-MS-Exchange-CrossTenant-AuthSource: BY5PR10MB4196.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 14 Sep 2022 22:18:23.4725 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: oBIoKe0s4962hSjH+K3zMf4B/C2dE0ttVnF5qEO3JkQiuV9GhDHFyR6KATlhUGyAqPnJ0jf777z/IkKHUpzwFw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA2PR10MB4745 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.528,FMLib:17.11.122.1 definitions=2022-09-14_09,2022-09-14_04,2022-06-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 adultscore=0 malwarescore=0 spamscore=0 bulkscore=0 mlxscore=0 mlxlogscore=999 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2208220000 definitions=main-2209140108 X-Proofpoint-GUID: sjQciN9cBpPl3dePeAIQlRszQPesb11b X-Proofpoint-ORIG-GUID: sjQciN9cBpPl3dePeAIQlRszQPesb11b ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1663193921; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=WWgFa5uvH4cxJCWBo6D7vjTXo0sJrhLbYOZnN6Af1kM=; b=2vFFM7U/zL/hiTeySzAbZATabpZdqFTmH5LlTMpqM7lCRYgYZ9/tIQzBrEWwZCmFZPvRfu s31MwNzZXuf71+SmxJTSmNu0WltFGholg4V8zTLPq7UioxLJUIFo0t6rWfVERppMsEYunA 
nwEYMBPKtSoHxbF6XQEk13vTvO8GFBY= ARC-Authentication-Results: i=2; imf17.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2022-7-12 header.b=Kko7SIoA; dkim=pass header.d=oracle.onmicrosoft.com header.s=selector2-oracle-onmicrosoft-com header.b=DqpXcWI4; dmarc=pass (policy=none) header.from=oracle.com; spf=pass (imf17.hostedemail.com: domain of mike.kravetz@oracle.com designates 205.220.177.32 as permitted sender) smtp.mailfrom=mike.kravetz@oracle.com; arc=pass ("microsoft.com:s=arcselector9901:i=1") ARC-Seal: i=2; s=arc-20220608; d=hostedemail.com; t=1663193921; a=rsa-sha256; cv=pass; b=cO35p25NBz71RODdB2MRhwxqVUJQlJzSZzowK2hLeClBjzK8bBVtcVcEMgRKFNz2ySmE57 lY6L5Xbh5j0GX9walLYAsMqh/eyFig2O3PK1HKws6T81C6Beh4Bsg5D2Fq6ZpaftyTmK+l DgDnrdRG+EzrhA7dx6HnXakZ/Q7XSh0= Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2022-7-12 header.b=Kko7SIoA; dkim=pass header.d=oracle.onmicrosoft.com header.s=selector2-oracle-onmicrosoft-com header.b=DqpXcWI4; dmarc=pass (policy=none) header.from=oracle.com; spf=pass (imf17.hostedemail.com: domain of mike.kravetz@oracle.com designates 205.220.177.32 as permitted sender) smtp.mailfrom=mike.kravetz@oracle.com; arc=pass ("microsoft.com:s=arcselector9901:i=1") X-Rspamd-Server: rspam03 X-Rspam-User: X-Rspamd-Queue-Id: 74CB240096 X-Stat-Signature: 4uyrydodi51c8w8cf7xxzkrba9miqqy6 X-HE-Tag: 1663193921-296988 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Commit c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") added code to take i_mmap_rwsem in read mode for the duration of fault processing. However, this has been shown to cause performance/scaling issues. Revert the code and go back to only taking the semaphore in huge_pmd_share during the fault path. 
Keep the code that takes i_mmap_rwsem in write mode before calling try_to_unmap as this is required if huge_pmd_unshare is called.

NOTE: Reverting this code does expose the following race condition.

Faulting thread                                 Unsharing thread
...                                                  ...
ptep = huge_pte_offset()
      or
ptep = huge_pte_alloc()
...
                                                i_mmap_lock_write
                                                lock page table
ptep invalid   <------------------------        huge_pmd_unshare()
Could be in a previously                        unlock_page_table
sharing process or worse                        i_mmap_unlock_write
...
ptl = huge_pte_lock(ptep)
get/update pte
set_pte_at(pte, ptep)

It is unknown if the above race was ever experienced by a user.  It was discovered via code inspection when initially addressed.  In subsequent patches, a new synchronization mechanism will be added to coordinate pmd sharing and eliminate this race.

Signed-off-by: Mike Kravetz
Reviewed-by: Miaohe Lin
---
 fs/hugetlbfs/inode.c |  2 --
 mm/hugetlb.c         | 77 +++++++-------------------------------
 mm/rmap.c            |  8 +----
 mm/userfaultfd.c     | 11 ++-----
 4 files changed, 15 insertions(+), 83 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index a32031e751d1..dfb735a91bbb 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -467,9 +467,7 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
 		if (unlikely(folio_mapped(folio))) {
 			BUG_ON(truncate_op);
 
-			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 			i_mmap_lock_write(mapping);
-			mutex_lock(&hugetlb_fault_mutex_table[hash]);
 			hugetlb_vmdelete_list(&mapping->i_mmap,
 				index * pages_per_huge_page(h),
 				(index + 1) * pages_per_huge_page(h),
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6c97b97aa252..00fba195a439 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4769,7 +4769,6 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	struct hstate *h = hstate_vma(src_vma);
 	unsigned long sz = huge_page_size(h);
 	unsigned long npages = pages_per_huge_page(h);
-	struct address_space *mapping = src_vma->vm_file->f_mapping;
 	struct mmu_notifier_range range;
 	unsigned long last_addr_mask;
 	int ret = 0;
@@ -4781,14 +4780,6 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 		mmu_notifier_invalidate_range_start(&range);
 		mmap_assert_write_locked(src);
 		raw_write_seqcount_begin(&src->write_protect_seq);
-	} else {
-		/*
-		 * For shared mappings i_mmap_rwsem must be held to call
-		 * huge_pte_alloc, otherwise the returned ptep could go
-		 * away if part of a shared pmd and another thread calls
-		 * huge_pmd_unshare.
-		 */
-		i_mmap_lock_read(mapping);
 	}
 
 	last_addr_mask = hugetlb_mask_last_page(h);
@@ -4935,8 +4926,6 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	if (cow) {
 		raw_write_seqcount_end(&src->write_protect_seq);
 		mmu_notifier_invalidate_range_end(&range);
-	} else {
-		i_mmap_unlock_read(mapping);
 	}
 
 	return ret;
@@ -5345,29 +5334,8 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * may get SIGKILLed if it later faults.
 	 */
 	if (outside_reserve) {
-		struct address_space *mapping = vma->vm_file->f_mapping;
-		pgoff_t idx;
-		u32 hash;
-
 		put_page(old_page);
-
-		/*
-		 * Drop hugetlb_fault_mutex and i_mmap_rwsem before
-		 * unmapping.  unmapping needs to hold i_mmap_rwsem
-		 * in write mode.  Dropping i_mmap_rwsem in read mode
-		 * here is OK as COW mappings do not interact with
-		 * PMD sharing.
-		 *
-		 * Reacquire both after unmap operation.
-		 */
-		idx = vma_hugecache_offset(h, vma, haddr);
-		hash = hugetlb_fault_mutex_hash(mapping, idx);
-		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
-		i_mmap_unlock_read(mapping);
-
 		unmap_ref_private(mm, vma, old_page, haddr);
-
-		i_mmap_lock_read(mapping);
-		mutex_lock(&hugetlb_fault_mutex_table[hash]);
 		spin_lock(ptl);
 		ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
 		if (likely(ptep &&
@@ -5522,9 +5490,7 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_area_struct *vma,
 	 */
 	hash = hugetlb_fault_mutex_hash(mapping, idx);
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
-	i_mmap_unlock_read(mapping);
 	ret = handle_userfault(&vmf, reason);
-	i_mmap_lock_read(mapping);
 	mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 	return ret;
@@ -5759,11 +5725,6 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
 	if (ptep) {
-		/*
-		 * Since we hold no locks, ptep could be stale.  That is
-		 * OK as we are only making decisions based on content and
-		 * not actually modifying content here.
-		 */
 		entry = huge_ptep_get(ptep);
 		if (unlikely(is_hugetlb_entry_migration(entry))) {
 			migration_entry_wait_huge(vma, ptep);
@@ -5771,31 +5732,20 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
 			return VM_FAULT_HWPOISON_LARGE |
 				VM_FAULT_SET_HINDEX(hstate_index(h));
+	} else {
+		ptep = huge_pte_alloc(mm, vma, haddr, huge_page_size(h));
+		if (!ptep)
+			return VM_FAULT_OOM;
 	}
 
-	/*
-	 * Acquire i_mmap_rwsem before calling huge_pte_alloc and hold
-	 * until finished with ptep.  This prevents huge_pmd_unshare from
-	 * being called elsewhere and making the ptep no longer valid.
-	 *
-	 * ptep could have already be assigned via huge_pte_offset.  That
-	 * is OK, as huge_pte_alloc will return the same value unless
-	 * something has changed.
-	 */
 	mapping = vma->vm_file->f_mapping;
-	i_mmap_lock_read(mapping);
-	ptep = huge_pte_alloc(mm, vma, haddr, huge_page_size(h));
-	if (!ptep) {
-		i_mmap_unlock_read(mapping);
-		return VM_FAULT_OOM;
-	}
+	idx = vma_hugecache_offset(h, vma, haddr);
 
 	/*
 	 * Serialize hugepage allocation and instantiation, so that we don't
 	 * get spurious allocation failures if two CPUs race to instantiate
 	 * the same page in the page cache.
 	 */
-	idx = vma_hugecache_offset(h, vma, haddr);
 	hash = hugetlb_fault_mutex_hash(mapping, idx);
 	mutex_lock(&hugetlb_fault_mutex_table[hash]);
@@ -5860,7 +5810,6 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 			put_page(pagecache_page);
 		}
 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
-		i_mmap_unlock_read(mapping);
 		return handle_userfault(&vmf, VM_UFFD_WP);
 	}
@@ -5904,7 +5853,6 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	}
 out_mutex:
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
-	i_mmap_unlock_read(mapping);
 	/*
 	 * Generally it's safe to hold refcount during waiting page lock. But
 	 * here we just wait to defer the next page fault to avoid busy loop and
@@ -6744,12 +6692,10 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
  * Search for a shareable pmd page for hugetlb. In any case calls pmd_alloc()
  * and returns the corresponding pte. While this is not necessary for the
  * !shared pmd case because we can allocate the pmd later as well, it makes the
- * code much cleaner.
- *
- * This routine must be called with i_mmap_rwsem held in at least read mode if
- * sharing is possible.  For hugetlbfs, this prevents removal of any page
- * table entries associated with the address space.  This is important as we
- * are setting up sharing based on existing page table entries (mappings).
+ * code much cleaner.  pmd allocation is essential for the shared case because
+ * pud has to be populated inside the same i_mmap_rwsem section - otherwise
+ * racing tasks could either miss the sharing (see huge_pte_offset) or select a
+ * bad pmd for sharing.
  */
 pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 		      unsigned long addr, pud_t *pud)
@@ -6763,7 +6709,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 	pte_t *pte;
 	spinlock_t *ptl;
 
-	i_mmap_assert_locked(mapping);
+	i_mmap_lock_read(mapping);
 	vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
 		if (svma == vma)
 			continue;
@@ -6793,6 +6739,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 	spin_unlock(ptl);
 out:
 	pte = (pte_t *)pmd_alloc(mm, pud, addr);
+	i_mmap_unlock_read(mapping);
 	return pte;
 }
@@ -6803,7 +6750,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
  * indicated by page_count > 1, unmap is achieved by clearing pud and
  * decrementing the ref count. If count == 1, the pte page is not shared.
  *
- * Called with page table lock held and i_mmap_rwsem held in write mode.
+ * Called with page table lock held.
  *
 * returns: 1 successfully unmapped a shared pte page
 *	    0 the underlying pte page is not shared, or it is the last user
diff --git a/mm/rmap.c b/mm/rmap.c
index 08d552ea4ceb..d17d68a9b15b 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -23,10 +23,9 @@
 *  inode->i_rwsem	(while writing or truncating, not reading or faulting)
 *    mm->mmap_lock
 *      mapping->invalidate_lock (in filemap_fault)
- *        page->flags PG_locked (lock_page)   * (see hugetlbfs below)
+ *        page->flags PG_locked (lock_page)
 *          hugetlbfs_i_mmap_rwsem_key (in huge_pmd_share)
 *            mapping->i_mmap_rwsem
- *              hugetlb_fault_mutex (hugetlbfs specific page fault mutex)
 *              anon_vma->rwsem
 *                mm->page_table_lock or pte_lock
 *                  swap_lock (in swap_duplicate, swap_info_get)
@@ -45,11 +44,6 @@
 *   anon_vma->rwsem,mapping->i_mmap_rwsem      (memory_failure, collect_procs_anon)
 *     ->tasklist_lock
 *       pte map lock
- *
- * * hugetlbfs PageHuge() pages take locks in this order:
- *         mapping->i_mmap_rwsem
- *           hugetlb_fault_mutex (hugetlbfs specific page fault mutex)
- *             page->flags PG_locked (lock_page)
 */
 
 #include
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 9c035be2148b..0fdbd2c05587 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -379,14 +379,10 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 		BUG_ON(dst_addr >= dst_start + len);
 
 		/*
-		 * Serialize via i_mmap_rwsem and hugetlb_fault_mutex.
-		 * i_mmap_rwsem ensures the dst_pte remains valid even
-		 * in the case of shared pmds.  fault mutex prevents
-		 * races with other faulting threads.
+		 * Serialize via hugetlb_fault_mutex.
 		 */
-		mapping = dst_vma->vm_file->f_mapping;
-		i_mmap_lock_read(mapping);
 		idx = linear_page_index(dst_vma, dst_addr);
+		mapping = dst_vma->vm_file->f_mapping;
 		hash = hugetlb_fault_mutex_hash(mapping, idx);
 		mutex_lock(&hugetlb_fault_mutex_table[hash]);
@@ -394,7 +390,6 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 		dst_pte = huge_pte_alloc(dst_mm, dst_vma, dst_addr, vma_hpagesize);
 		if (!dst_pte) {
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
-			i_mmap_unlock_read(mapping);
 			goto out_unlock;
 		}
@@ -402,7 +397,6 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 		    !huge_pte_none_mostly(huge_ptep_get(dst_pte))) {
 			err = -EEXIST;
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
-			i_mmap_unlock_read(mapping);
 			goto out_unlock;
 		}
@@ -411,7 +405,6 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 				wp_copy);
 
 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
-		i_mmap_unlock_read(mapping);
 
 		cond_resched();

From patchwork Wed Sep 14 22:18:04 2022
From: Mike Kravetz
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Muchun Song, Miaohe Lin, David Hildenbrand, Sven Schnelle, Michal Hocko,
 Peter Xu, Naoya Horiguchi, Aneesh Kumar K.V, Andrea Arcangeli,
 Kirill A. Shutemov, Davidlohr Bueso, Prakash Sangappa, James Houghton,
 Mina Almasry, Pasha Tatashin, Axel Rasmussen, Ray Fucillo, Andrew Morton,
 Mike Kravetz
Subject: [PATCH v2 3/9] hugetlb: rename remove_huge_page to hugetlb_delete_from_page_cache
Date: Wed, 14 Sep 2022 15:18:04 -0700
Message-Id: <20220914221810.95771-4-mike.kravetz@oracle.com>
In-Reply-To: <20220914221810.95771-1-mike.kravetz@oracle.com>
References: <20220914221810.95771-1-mike.kravetz@oracle.com>
remove_huge_page removes a hugetlb page from the page cache.  Change to hugetlb_delete_from_page_cache as it is a more descriptive name.

huge_add_to_page_cache is global in scope, but only deals with hugetlb pages.  For consistency and clarity, rename to hugetlb_add_to_page_cache.
Signed-off-by: Mike Kravetz
Reviewed-by: Miaohe Lin
---
 fs/hugetlbfs/inode.c    | 21 ++++++++++-----------
 include/linux/hugetlb.h |  2 +-
 mm/hugetlb.c            |  8 ++++----
 3 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index dfb735a91bbb..edd69cc43ca5 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -364,7 +364,7 @@ static int hugetlbfs_write_end(struct file *file, struct address_space *mapping,
 	return -EINVAL;
 }
 
-static void remove_huge_page(struct page *page)
+static void hugetlb_delete_from_page_cache(struct page *page)
 {
 	ClearPageDirty(page);
 	ClearPageUptodate(page);
@@ -478,15 +478,14 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
 			folio_lock(folio);
 			/*
 			 * We must free the huge page and remove from page
-			 * cache (remove_huge_page) BEFORE removing the
-			 * region/reserve map (hugetlb_unreserve_pages).  In
-			 * rare out of memory conditions, removal of the
-			 * region/reserve map could fail.  Correspondingly,
-			 * the subpool and global reserve usage count can need
-			 * to be adjusted.
+			 * cache BEFORE removing the region/reserve map
+			 * (hugetlb_unreserve_pages).  In rare out of memory
+			 * conditions, removal of the region/reserve map could
+			 * fail.  Correspondingly, the subpool and global
+			 * reserve usage count can need to be adjusted.
 			 */
 			VM_BUG_ON(HPageRestoreReserve(&folio->page));
-			remove_huge_page(&folio->page);
+			hugetlb_delete_from_page_cache(&folio->page);
 			freed++;
 			if (!truncate_op) {
 				if (unlikely(hugetlb_unreserve_pages(inode,
@@ -723,7 +722,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		}
 		clear_huge_page(page, addr, pages_per_huge_page(h));
 		__SetPageUptodate(page);
-		error = huge_add_to_page_cache(page, mapping, index);
+		error = hugetlb_add_to_page_cache(page, mapping, index);
 		if (unlikely(error)) {
 			restore_reserve_on_error(h, &pseudo_vma, addr, page);
 			put_page(page);
@@ -735,7 +734,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		SetHPageMigratable(page);
 		/*
-		 * unlock_page because locked by huge_add_to_page_cache()
+		 * unlock_page because locked by hugetlb_add_to_page_cache()
 		 * put_page() due to reference from alloc_huge_page()
 		 */
 		unlock_page(page);
@@ -980,7 +979,7 @@ static int hugetlbfs_error_remove_page(struct address_space *mapping,
 	struct inode *inode = mapping->host;
 	pgoff_t index = page->index;
 
-	remove_huge_page(page);
+	hugetlb_delete_from_page_cache(page);
 	if (unlikely(hugetlb_unreserve_pages(inode, index, index + 1, 1)))
 		hugetlb_fix_reserve_counts(inode);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 890f7b6a2eff..0ce916d1afca 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -665,7 +665,7 @@ struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
 				nodemask_t *nmask, gfp_t gfp_mask);
 struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma,
 				unsigned long address);
-int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
+int hugetlb_add_to_page_cache(struct page *page, struct address_space *mapping,
 			pgoff_t idx);
 void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
 				unsigned long address, struct page *page);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 00fba195a439..eb38ae3e7a83 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5429,7 +5429,7 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
 	return page != NULL;
 }
 
-int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
+int hugetlb_add_to_page_cache(struct page *page, struct address_space *mapping,
 			pgoff_t idx)
 {
 	struct folio *folio = page_folio(page);
@@ -5568,7 +5568,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 		new_page = true;
 
 		if (vma->vm_flags & VM_MAYSHARE) {
-			int err = huge_add_to_page_cache(page, mapping, idx);
+			int err = hugetlb_add_to_page_cache(page, mapping, idx);
 			if (err) {
 				/*
 				 * err can't be -EEXIST which implies someone
@@ -5980,11 +5980,11 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 		/*
 		 * Serialization between remove_inode_hugepages() and
-		 * huge_add_to_page_cache() below happens through the
+		 * hugetlb_add_to_page_cache() below happens through the
 		 * hugetlb_fault_mutex_table that here must be hold by
 		 * the caller.
 		 */
-		ret = huge_add_to_page_cache(page, mapping, idx);
+		ret = hugetlb_add_to_page_cache(page, mapping, idx);
 		if (ret)
 			goto out_release_nounlock;
 		page_in_pagecache = true;

From patchwork Wed Sep 14 22:18:05 2022
From: Mike Kravetz
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Muchun Song, Miaohe Lin, David Hildenbrand, Sven Schnelle, Michal Hocko,
 Peter Xu, Naoya Horiguchi, Aneesh Kumar K.V, Andrea Arcangeli,
 Kirill A. Shutemov, Davidlohr Bueso, Prakash Sangappa, James Houghton,
 Mina Almasry, Pasha Tatashin, Axel Rasmussen, Ray Fucillo, Andrew Morton,
 Mike Kravetz
Subject: [PATCH v2 4/9] hugetlb: create remove_inode_single_folio to remove single file folio
Date: Wed, 14 Sep 2022 15:18:05 -0700
Message-Id: <20220914221810.95771-5-mike.kravetz@oracle.com>
In-Reply-To: <20220914221810.95771-1-mike.kravetz@oracle.com>
References: <20220914221810.95771-1-mike.kravetz@oracle.com>
Create the new routine remove_inode_single_folio that will remove a single
folio from a file.  This is refactored code from remove_inode_hugepages.
It checks for the uncommon case in which the folio is still mapped and
unmaps it.

No functional change.  This refactoring will be put to use and expanded
upon in subsequent patches.

Signed-off-by: Mike Kravetz
Reviewed-by: Miaohe Lin
---
 fs/hugetlbfs/inode.c | 105 ++++++++++++++++++++++++++-----------------
 1 file changed, 63 insertions(+), 42 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index edd69cc43ca5..7112a9a9f54d 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -411,6 +411,60 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
 	}
 }
 
+/*
+ * Called with hugetlb fault mutex held.
+ * Returns true if page was actually removed, false otherwise.
+ */
+static bool remove_inode_single_folio(struct hstate *h, struct inode *inode,
+					struct address_space *mapping,
+					struct folio *folio, pgoff_t index,
+					bool truncate_op)
+{
+	bool ret = false;
+
+	/*
+	 * If folio is mapped, it was faulted in after being
+	 * unmapped in caller.  Unmap (again) while holding
+	 * the fault mutex.  The mutex will prevent faults
+	 * until we finish removing the folio.
+	 */
+	if (unlikely(folio_mapped(folio))) {
+		i_mmap_lock_write(mapping);
+		hugetlb_vmdelete_list(&mapping->i_mmap,
+			index * pages_per_huge_page(h),
+			(index + 1) * pages_per_huge_page(h),
+			ZAP_FLAG_DROP_MARKER);
+		i_mmap_unlock_write(mapping);
+	}
+
+	folio_lock(folio);
+	/*
+	 * After locking page, make sure mapping is the same.
+	 * We could have raced with page fault populate and
+	 * backout code.
+	 */
+	if (folio_mapping(folio) == mapping) {
+		/*
+		 * We must remove the folio from page cache before removing
+		 * the region/reserve map (hugetlb_unreserve_pages).  In
+		 * rare out of memory conditions, removal of the region/reserve
+		 * map could fail.  Correspondingly, the subpool and global
+		 * reserve usage count can need to be adjusted.
+		 */
+		VM_BUG_ON(HPageRestoreReserve(&folio->page));
+		hugetlb_delete_from_page_cache(&folio->page);
+		ret = true;
+		if (!truncate_op) {
+			if (unlikely(hugetlb_unreserve_pages(inode, index,
+							index + 1, 1)))
+				hugetlb_fix_reserve_counts(inode);
+		}
+	}
+
+	folio_unlock(folio);
+	return ret;
+}
+
 /*
  * remove_inode_hugepages handles two distinct cases: truncation and hole
  * punch.  There are subtle differences in operation for each case.
@@ -418,11 +472,10 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
  * truncation is indicated by end of range being LLONG_MAX
  *	In this case, we first scan the range and release found pages.
  *	After releasing pages, hugetlb_unreserve_pages cleans up region/reserve
- *	maps and global counts.  Page faults can not race with truncation
- *	in this routine.  hugetlb_no_page() prevents page faults in the
- *	truncated range.  It checks i_size before allocation, and again after
- *	with the page table lock for the page held.  The same lock must be
- *	acquired to unmap a page.
+ *	maps and global counts.  Page faults can race with truncation.
+ *	During faults, hugetlb_no_page() checks i_size before page allocation,
+ *	and again after obtaining page table lock.  It will 'back out'
+ *	allocations in the truncated range.
  * hole punch is indicated if end is not LLONG_MAX
  *	In the hole punch case we scan the range and release found pages.
  *	Only when releasing a page is the associated region/reserve map
@@ -456,44 +509,12 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
 			mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 			/*
-			 * If folio is mapped, it was faulted in after being
-			 * unmapped in caller.  Unmap (again) now after taking
-			 * the fault mutex.  The mutex will prevent faults
-			 * until we finish removing the folio.
-			 *
-			 * This race can only happen in the hole punch case.
-			 * Getting here in a truncate operation is a bug.
+			 * Remove folio that was part of folio_batch.
 			 */
-			if (unlikely(folio_mapped(folio))) {
-				BUG_ON(truncate_op);
-
-				i_mmap_lock_write(mapping);
-				hugetlb_vmdelete_list(&mapping->i_mmap,
-					index * pages_per_huge_page(h),
-					(index + 1) * pages_per_huge_page(h),
-					ZAP_FLAG_DROP_MARKER);
-				i_mmap_unlock_write(mapping);
-			}
-
-			folio_lock(folio);
-			/*
-			 * We must free the huge page and remove from page
-			 * cache BEFORE removing the region/reserve map
-			 * (hugetlb_unreserve_pages).  In rare out of memory
-			 * conditions, removal of the region/reserve map could
-			 * fail.  Correspondingly, the subpool and global
-			 * reserve usage count can need to be adjusted.
-			 */
-			VM_BUG_ON(HPageRestoreReserve(&folio->page));
-			hugetlb_delete_from_page_cache(&folio->page);
-			freed++;
-			if (!truncate_op) {
-				if (unlikely(hugetlb_unreserve_pages(inode,
-						index, index + 1, 1)))
-					hugetlb_fix_reserve_counts(inode);
-			}
-
-			folio_unlock(folio);
+			if (remove_inode_single_folio(h, inode, mapping, folio,
+							index, truncate_op))
+				freed++;
+
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 		}
 		folio_batch_release(&fbatch);

From patchwork Wed Sep 14 22:18:06 2022
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 12976589
From: Mike Kravetz
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Muchun Song, Miaohe Lin, David Hildenbrand, Sven Schnelle,
 Michal Hocko, Peter Xu, Naoya Horiguchi, Aneesh Kumar K. V,
 Andrea Arcangeli, Kirill A. Shutemov, Davidlohr Bueso,
 Prakash Sangappa, James Houghton, Mina Almasry, Pasha Tatashin,
 Axel Rasmussen, Ray Fucillo, Andrew Morton, Mike Kravetz
Subject: [PATCH v2 5/9] hugetlb: rename vma_shareable() and refactor code
Date: Wed, 14 Sep 2022 15:18:06 -0700
Message-Id: <20220914221810.95771-6-mike.kravetz@oracle.com>
In-Reply-To: <20220914221810.95771-1-mike.kravetz@oracle.com>
References: <20220914221810.95771-1-mike.kravetz@oracle.com>
Rename the routine vma_shareable to vma_addr_pmd_shareable as it is
checking a specific address within the vma.  Refactor code to check if
an aligned range is shareable as this will be needed in a subsequent
patch.
Signed-off-by: Mike Kravetz
Reviewed-by: Miaohe Lin
---
 mm/hugetlb.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index eb38ae3e7a83..8117bc299c46 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6639,26 +6639,33 @@ static unsigned long page_table_shareable(struct vm_area_struct *svma,
 	return saddr;
 }
 
-static bool vma_shareable(struct vm_area_struct *vma, unsigned long addr)
+static bool __vma_aligned_range_pmd_shareable(struct vm_area_struct *vma,
+				unsigned long start, unsigned long end)
 {
-	unsigned long base = addr & PUD_MASK;
-	unsigned long end = base + PUD_SIZE;
-
 	/*
 	 * check on proper vm_flags and page table alignment
 	 */
-	if (vma->vm_flags & VM_MAYSHARE && range_in_vma(vma, base, end))
+	if (vma->vm_flags & VM_MAYSHARE && range_in_vma(vma, start, end))
 		return true;
 	return false;
 }
 
+static bool vma_addr_pmd_shareable(struct vm_area_struct *vma,
+				unsigned long addr)
+{
+	unsigned long start = addr & PUD_MASK;
+	unsigned long end = start + PUD_SIZE;
+
+	return __vma_aligned_range_pmd_shareable(vma, start, end);
+}
+
 bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr)
 {
 #ifdef CONFIG_USERFAULTFD
 	if (uffd_disable_huge_pmd_share(vma))
 		return false;
 #endif
-	return vma_shareable(vma, addr);
+	return vma_addr_pmd_shareable(vma, addr);
 }
 
 /*

From patchwork Wed Sep 14 22:18:07 2022
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 12976591
From: Mike Kravetz
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Muchun Song, Miaohe Lin, David Hildenbrand, Sven Schnelle,
 Michal Hocko, Peter Xu, Naoya Horiguchi, Aneesh Kumar K. V,
 Andrea Arcangeli, Kirill A. Shutemov, Davidlohr Bueso,
 Prakash Sangappa, James Houghton, Mina Almasry, Pasha Tatashin,
 Axel Rasmussen, Ray Fucillo, Andrew Morton, Mike Kravetz
Subject: [PATCH v2 6/9] hugetlb: add vma based lock for pmd sharing
Date: Wed, 14 Sep 2022 15:18:07 -0700
Message-Id: <20220914221810.95771-7-mike.kravetz@oracle.com>
In-Reply-To: <20220914221810.95771-1-mike.kravetz@oracle.com>
References: <20220914221810.95771-1-mike.kravetz@oracle.com>
xBSOHUDtQKEGj1DvthAIYyNzfvnqUjtofJWiMi1Qa+41QkkUlDWthfd04aHpNbZZgXY6VzRZ+hfIILGklSosrMx/VqvHpGTa0ku16JWEGo/+W96PsjeB3Pt+9YPLqhDYcum9ydREywIVMRahI9buQzKkfztyfmbTb/KlbXuHDfeaqjD5EwtAUrIynUCvSMzjJCLDu+ec6bkSAKrfis0GDZRtJv8nRPh2cihAQJFyfMHycOQApqS76mCA3nY557M2f72Sidn30f2ymYyrnLCAI3Gu3KG4RPl4yxNeGCu+CEAaL+272vcKuva6FzjJtcVVldcarCkXaja/wrb6+F6j1HVDLFlnzb6GxtRY4XVHZE9HiYOWcS0zG4a9E+4ezlUx+s03ZwatHrmTkp0XNZu99FEObzIDFHCP0yWq+uyBIphyX+Ott8RfTzxNiDn6UNvJZ+h/ZestEsHcDRgul4LJlbJ0i4DNVNe9bzryFCUW7BMVNGyRizXOQ8wndMme0VDOaBPYoHr0Un7f5XKewVOAV1g1K/LMAVY19GYo2krwPXunqTPYVvwTLnlMG5TVuhU5EKwdrWMeqjQEPpOpR/WoYOzLqEKNBadRgastskcLL8cbWjnFwj588UpL2RXzJKUQPyhYdJ1sRctAOWoOo327tbJ2o+q97SyMBmeG0uP782HrvGB9zw6xSkXq89spQLDdF2HxOuP4teEp3JUaktkc2g== X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:BY5PR10MB4196.namprd10.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230022)(376002)(346002)(39860400002)(136003)(366004)(396003)(451199015)(478600001)(38100700002)(54906003)(316002)(36756003)(6506007)(8676002)(4326008)(66946007)(83380400001)(186003)(2616005)(1076003)(5660300002)(41300700001)(107886003)(66476007)(66556008)(86362001)(8936002)(6666004)(6486002)(7416002)(2906002)(26005)(30864003)(6512007)(44832011);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
aQO5un1b747IiC7g6OG5JYowu0RhUsdKXct//tg/YMYRdT/mdMRGS3HLlKsz3x9lJDBX3ZuohVtsTBJCAkOLnY+jVDyhFbMNqJuYVbK5L1MWQwHraZhaZ6YfAutdR7xngmcs7lUIKqj8xDtueUZYBz5SGk6mpGbczTqg72j4vbLFDayW7TpuhvuK3tnMTiOpIevLEFyJfjpkTxa3vY3oHC5LdE8a2TBRKH/4qiMjILNx/xUh7NXGTySzzZvzNSlJB9Lf8QiOf/pqKZxCu8cjFcSe73FxSmuCukyZfmHYrQtRExuA9AloYKRZQCcfHfkgSyxih7ex5mdM+DXcm5PyhS+44VBQVTS0lreKV6JC7VgP983gVQ3dRWiNMjMzERuRtCjsdB1XeUAzNnqjnun5RnIpF1I5Y3cEjP7BYgD4TjpJs/mneG9pA7o6c1qKK9OyYoKq0q/kGA/o97evfVD5vYaXer6fNAby9I3u56noxHq1GBePTCZIaxLvO9++C2a5w2yB7DsEi5YM1Y9X3Zc0KWeT+ESFkYdvyDsh9L6oFHwOx7VH5jT/W/yioQ7KfwUfEUG2fsCS6YqCWF+jeRIhdHgIUNlIWGpHTtmcZyQGOfFIf6cn3t5Z0p8zUnMGFvYpHeqgL8NvCqcIWsAiEn+MHcCJRSFNkUsjYTM0aCzD5j0bMwNjubxrLOTKEEEjDOzYRoecrgDs4Hjzs8Gp4rPiXal0FurhBm2BRrrWxAoBKjIMfw2T1mw3MzhNc6+tiVzQhTM1EygkvrOyOcAD+89HFLyvMlEila1SnMf+mi4VzILEDemskRP4jwxihCaoTfVWBdbTijMB+BbY8PoVHttYlRqnF4VbFtWqCftkEWnbCwRDtupDRqN3vqo7rfJq7dUsX8V4A7Eoa9dQESSMd33p3J350K693q9s2cpdoFuga5VRg2mQ1JJHEZMkMsxpYdTdznDYt8c4KgWeVLDwvref1Jc1ZXANxdf9iU2RDr0lyUBHuRXH0XxPABvErI/SGfh/3qDurXH/6J6OVlJ1rE6QF5a82sZ1vPXIkEwh4aLzSNiUBGF3qt9GIaJbP9EsaTJ8iXHAK/CifFeGUi4DstIVpvQbIPZ10YXTJAdTkDXa4/WnYdirZDRys7OpKTwi5B8p3K5G+7DMUD4uQPaxB2N7pstnwzLrBmoeHc57yspZxbEvmu2h9DWsUifnpGdWAFDo53PAbyZ0oOx2EcItnGPxckCNa1u8mF+ImutV2cJzIZkfRyTmgN2njQ4DOd8l2P8huCIdxfQsoy0ifNMJOBqdWRP+xord2N568ccUx7s+xasOAM3uOxI0xC5WVXLicIL4xTKtCpivgQnRKopnFp41ytTqBQ3ypj0EFJViyPT/QcmEj/sbC+fnj00wXG0vytKFK93BQM1N/PVDtR6yKXPdIigpR71qRkP106AWpb00e4nL2oaTflK94SWiUXhaq8lfvlZrQiZGndn16AnQYmyEi55ylBv+niifOAlrhwyS5dmdAV6+PZ/wZEXuDJAVdckHPsE/op/eKJFaQ74nkB5ng3LkmSkKXuVinfafvFhWn7u7sEXXq8cehUScXW9rfIlMwUsvEInq2opI0uZ8qu4F5Q== X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: d9512851-c283-4b64-a786-08da969f0f25 X-MS-Exchange-CrossTenant-AuthSource: BY5PR10MB4196.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 14 Sep 2022 22:18:32.5531 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted 
X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 9vi7q5mpDObA1xUfYTqoVJuf5OV0npv5x4GbDvDY2qemtmPVIAvY2qwnhTdrcO2CENwuparx6RTcgVlDa4PO1A== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA2PR10MB4745 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.528,FMLib:17.11.122.1 definitions=2022-09-14_09,2022-09-14_04,2022-06-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxscore=0 spamscore=0 adultscore=0 malwarescore=0 suspectscore=0 bulkscore=0 mlxlogscore=999 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2208220000 definitions=main-2209140108 X-Proofpoint-ORIG-GUID: XxWoKxguxz7mDk4owwO2p38FnMNla3FA X-Proofpoint-GUID: XxWoKxguxz7mDk4owwO2p38FnMNla3FA ARC-Seal: i=2; s=arc-20220608; d=hostedemail.com; t=1663193927; a=rsa-sha256; cv=pass; b=XzHpF97fOvPvhLCc+fa5Ga8WFpJn1/Oo/qgKyk4DqvejHJs0sv0xy50c/AEQkt5yolk75U M5JeNnl/+m7kCW5SptbE2CJ4zT6P9e4pJ+p01bAQhoQvAfeBZhzMS8RBjFbNiHV5fE/+Wg HlWyUaiC802A6PKH5UGbF9Rf+iSNj9s= ARC-Authentication-Results: i=2; imf31.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2022-7-12 header.b=ohwo0sPR; dkim=pass header.d=oracle.onmicrosoft.com header.s=selector2-oracle-onmicrosoft-com header.b=ZQTVXYrE; dmarc=pass (policy=none) header.from=oracle.com; arc=pass ("microsoft.com:s=arcselector9901:i=1"); spf=pass (imf31.hostedemail.com: domain of mike.kravetz@oracle.com designates 205.220.165.32 as permitted sender) smtp.mailfrom=mike.kravetz@oracle.com ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1663193927; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; 
bh=Qv/NQPpFdTbOef+QCh3XKlQ9j1lu0LB1XJ6P0Mxj338=; b=hhn0qzsurw5FGBYzQV3mnWj9wggyESZfZq9HLGF9fojgA/4BIj3wqzzmdb0aSa/C2wgpVp 3oFR6NtfcplSG1uI88Z3iR7eEQWoZ+C7csZJgv+4bJStFedBg+IYmCXKuaiTyQUVTQ21nX ZvZYltaoYd0e6teW06q6d0qtlpgdGaA= X-Stat-Signature: wnxezgf5cgqfp4qspbyr4h5cczkyb5c7 X-Rspamd-Queue-Id: 6B891200B5 X-Rspam-User: Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2022-7-12 header.b=ohwo0sPR; dkim=pass header.d=oracle.onmicrosoft.com header.s=selector2-oracle-onmicrosoft-com header.b=ZQTVXYrE; dmarc=pass (policy=none) header.from=oracle.com; arc=pass ("microsoft.com:s=arcselector9901:i=1"); spf=pass (imf31.hostedemail.com: domain of mike.kravetz@oracle.com designates 205.220.165.32 as permitted sender) smtp.mailfrom=mike.kravetz@oracle.com X-Rspamd-Server: rspam02 X-HE-Tag: 1663193927-776809 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Allocate a new hugetlb_vma_lock structure and hang off vm_private_data for synchronization use by vmas that could be involved in pmd sharing. This data structure contains a rw semaphore that is the primary tool used for synchronization. This new structure is ref counted, so that it can exist when NOT attached to a vma. This is only helpful in resolving lock ordering issues where code may need to obtain the vma_lock while there are no guarantees the vma may go away. By obtaining a ref on the structure, it can be guaranteed that at least the rw semaphore will not go away. Only add infrastructure for the new lock here. Actual use will be added in subsequent patches. 
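The lifetime rule described above can be illustrated with a small userspace sketch (this is not kernel code: a plain counter stands in for struct kref, and the names are illustrative only). The point is that a holder who took an extra reference keeps the structure alive even after the vma drops its own reference:

```c
#include <stdlib.h>

/*
 * Userspace sketch of the refcounted lock lifetime: the structure is
 * only freed when the last reference is dropped, so its semaphore can
 * outlive the structure's attachment to the vma.
 */
struct vma_lock_sketch {
	int refs;    /* models struct kref (not atomic: sketch only) */
	int readers; /* crude stand-in for struct rw_semaphore */
};

static struct vma_lock_sketch *vma_lock_alloc(void)
{
	struct vma_lock_sketch *l = calloc(1, sizeof(*l));

	if (l)
		l->refs = 1; /* initial reference held by the vma */
	return l;
}

static void vma_lock_get(struct vma_lock_sketch *l)
{
	l->refs++; /* e.g. taken before the vma may be torn down */
}

/* Returns 1 when the final reference dropped and the struct was freed. */
static int vma_lock_put(struct vma_lock_sketch *l)
{
	if (--l->refs > 0)
		return 0;
	free(l);
	return 1;
}
```

A caller that did vma_lock_get() before the vma detaches can still safely use the semaphore and then vma_lock_put(); only the final put frees the structure.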
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Miaohe Lin
---
 include/linux/hugetlb.h |  43 ++++++++-
 kernel/fork.c           |   6 +-
 mm/hugetlb.c            | 202 ++++++++++++++++++++++++++++++++++++----
 mm/rmap.c               |   8 +-
 4 files changed, 235 insertions(+), 24 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 0ce916d1afca..6a1bd172f943 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -114,6 +114,12 @@ struct file_region {
 #endif
 };
 
+struct hugetlb_vma_lock {
+	struct kref refs;
+	struct rw_semaphore rw_sema;
+	struct vm_area_struct *vma;
+};
+
 extern struct resv_map *resv_map_alloc(void);
 void resv_map_release(struct kref *ref);
 
@@ -126,7 +132,7 @@ struct hugepage_subpool *hugepage_new_subpool(struct hstate *h, long max_hpages,
 						long min_hpages);
 void hugepage_put_subpool(struct hugepage_subpool *spool);
 
-void reset_vma_resv_huge_pages(struct vm_area_struct *vma);
+void hugetlb_dup_vma_private(struct vm_area_struct *vma);
 void clear_vma_resv_huge_pages(struct vm_area_struct *vma);
 int hugetlb_sysctl_handler(struct ctl_table *, int, void *, size_t *, loff_t *);
 int hugetlb_overcommit_handler(struct ctl_table *, int, void *, size_t *,
@@ -214,6 +220,14 @@ struct page *follow_huge_pud(struct mm_struct *mm, unsigned long address,
 struct page *follow_huge_pgd(struct mm_struct *mm, unsigned long address,
 				pgd_t *pgd, int flags);
 
+void hugetlb_vma_lock_read(struct vm_area_struct *vma);
+void hugetlb_vma_unlock_read(struct vm_area_struct *vma);
+void hugetlb_vma_lock_write(struct vm_area_struct *vma);
+void hugetlb_vma_unlock_write(struct vm_area_struct *vma);
+int hugetlb_vma_trylock_write(struct vm_area_struct *vma);
+void hugetlb_vma_assert_locked(struct vm_area_struct *vma);
+void hugetlb_vma_lock_release(struct kref *kref);
+
 int pmd_huge(pmd_t pmd);
 int pud_huge(pud_t pud);
 unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
@@ -225,7 +239,7 @@ void hugetlb_unshare_all_pmds(struct vm_area_struct *vma);
 
 #else /* !CONFIG_HUGETLB_PAGE */
 
-static inline void reset_vma_resv_huge_pages(struct vm_area_struct *vma)
+static inline void hugetlb_dup_vma_private(struct vm_area_struct *vma)
 {
 }
 
@@ -336,6 +350,31 @@ static inline int prepare_hugepage_range(struct file *file,
 	return -EINVAL;
 }
 
+static inline void hugetlb_vma_lock_read(struct vm_area_struct *vma)
+{
+}
+
+static inline void hugetlb_vma_unlock_read(struct vm_area_struct *vma)
+{
+}
+
+static inline void hugetlb_vma_lock_write(struct vm_area_struct *vma)
+{
+}
+
+static inline void hugetlb_vma_unlock_write(struct vm_area_struct *vma)
+{
+}
+
+static inline int hugetlb_vma_trylock_write(struct vm_area_struct *vma)
+{
+	return 1;
+}
+
+static inline void hugetlb_vma_assert_locked(struct vm_area_struct *vma)
+{
+}
+
 static inline int pmd_huge(pmd_t pmd)
 {
 	return 0;
diff --git a/kernel/fork.c b/kernel/fork.c
index b3399184706c..e85e923537a2 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -677,12 +677,10 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 		}
 
 		/*
-		 * Clear hugetlb-related page reserves for children. This only
-		 * affects MAP_PRIVATE mappings. Faults generated by the child
-		 * are not guaranteed to succeed, even if read-only
+		 * Copy/update hugetlb private vma information.
 		 */
 		if (is_vm_hugetlb_page(tmp))
-			reset_vma_resv_huge_pages(tmp);
+			hugetlb_dup_vma_private(tmp);
 
 		/* Link the vma into the MT */
 		mas.index = tmp->vm_start;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8117bc299c46..616be891b798 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -90,6 +90,8 @@ struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
 
 /* Forward declaration */
 static int hugetlb_acct_memory(struct hstate *h, long delta);
+static void hugetlb_vma_lock_free(struct vm_area_struct *vma);
+static void hugetlb_vma_lock_alloc(struct vm_area_struct *vma);
 
 static inline bool subpool_is_free(struct hugepage_subpool *spool)
 {
@@ -858,7 +860,7 @@ __weak unsigned long vma_mmu_pagesize(struct vm_area_struct *vma)
 * faults in a MAP_PRIVATE mapping. Only the process that called mmap()
 * is guaranteed to have their future faults succeed.
 *
- * With the exception of reset_vma_resv_huge_pages() which is called at fork(),
+ * With the exception of hugetlb_dup_vma_private() which is called at fork(),
 * the reserve counters are updated with the hugetlb_lock held. It is safe
 * to reset the VMA at fork() time as it is not in use yet and there is no
 * chance of the global counters getting corrupted as a result of the values.
@@ -1005,12 +1007,20 @@ static int is_vma_resv_set(struct vm_area_struct *vma, unsigned long flag)
 	return (get_vma_private_data(vma) & flag) != 0;
 }
 
-/* Reset counters to 0 and clear all HPAGE_RESV_* flags */
-void reset_vma_resv_huge_pages(struct vm_area_struct *vma)
+void hugetlb_dup_vma_private(struct vm_area_struct *vma)
 {
 	VM_BUG_ON_VMA(!is_vm_hugetlb_page(vma), vma);
+	/*
+	 * Clear vm_private_data
+	 * - For MAP_PRIVATE mappings, this is the reserve map which does
+	 *   not apply to children. Faults generated by the children are
+	 *   not guaranteed to succeed, even if read-only.
+	 * - For shared mappings this is a per-vma semaphore that may be
+	 *   allocated in a subsequent call to hugetlb_vm_op_open.
+	 */
+	vma->vm_private_data = (void *)0;
 	if (!(vma->vm_flags & VM_MAYSHARE))
-		vma->vm_private_data = (void *)0;
+		return;
 }
 
 /*
@@ -1041,7 +1051,7 @@ void clear_vma_resv_huge_pages(struct vm_area_struct *vma)
 		kref_put(&reservations->refs, resv_map_release);
 	}
 
-	reset_vma_resv_huge_pages(vma);
+	hugetlb_dup_vma_private(vma);
 }
 
 /* Returns true if the VMA has associated reserve pages */
@@ -4622,16 +4632,21 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma)
 		resv_map_dup_hugetlb_cgroup_uncharge_info(resv);
 		kref_get(&resv->refs);
 	}
+
+	hugetlb_vma_lock_alloc(vma);
 }
 
 static void hugetlb_vm_op_close(struct vm_area_struct *vma)
 {
 	struct hstate *h = hstate_vma(vma);
-	struct resv_map *resv = vma_resv_map(vma);
+	struct resv_map *resv;
 	struct hugepage_subpool *spool = subpool_vma(vma);
 	unsigned long reserve, start, end;
 	long gbl_reserve;
 
+	hugetlb_vma_lock_free(vma);
+
+	resv = vma_resv_map(vma);
 	if (!resv || !is_vma_resv_set(vma, HPAGE_RESV_OWNER))
 		return;
 
@@ -6438,6 +6453,11 @@ bool hugetlb_reserve_pages(struct inode *inode,
 		return false;
 	}
 
+	/*
+	 * vma specific semaphore used for pmd sharing synchronization
+	 */
+	hugetlb_vma_lock_alloc(vma);
+
 	/*
 	 * Only apply hugepage reservation if asked. At fault time, an
 	 * attempt will be made for VM_NORESERVE to allocate a page
@@ -6461,12 +6481,11 @@ bool hugetlb_reserve_pages(struct inode *inode,
 		resv_map = inode_resv_map(inode);
 
 		chg = region_chg(resv_map, from, to, &regions_needed);
-
 	} else {
 		/* Private mapping. */
 		resv_map = resv_map_alloc();
 		if (!resv_map)
-			return false;
+			goto out_err;
 
 		chg = to - from;
 
@@ -6561,6 +6580,7 @@ bool hugetlb_reserve_pages(struct inode *inode,
 		hugetlb_cgroup_uncharge_cgroup_rsvd(hstate_index(h),
 					chg * pages_per_huge_page(h), h_cg);
 out_err:
+	hugetlb_vma_lock_free(vma);
 	if (!vma || vma->vm_flags & VM_MAYSHARE)
 		/* Only call region_abort if the region_chg succeeded but the
 		 * region_add failed or didn't run.
@@ -6640,14 +6660,34 @@ static unsigned long page_table_shareable(struct vm_area_struct *svma,
 }
 
 static bool __vma_aligned_range_pmd_shareable(struct vm_area_struct *vma,
-				unsigned long start, unsigned long end)
+				unsigned long start, unsigned long end,
+				bool check_vma_lock)
 {
+#ifdef CONFIG_USERFAULTFD
+	if (uffd_disable_huge_pmd_share(vma))
+		return false;
+#endif
 	/*
 	 * check on proper vm_flags and page table alignment
 	 */
-	if (vma->vm_flags & VM_MAYSHARE && range_in_vma(vma, start, end))
-		return true;
-	return false;
+	if (!(vma->vm_flags & VM_MAYSHARE))
+		return false;
+	if (check_vma_lock && !vma->vm_private_data)
+		return false;
+	if (!range_in_vma(vma, start, end))
+		return false;
+	return true;
+}
+
+static bool vma_pmd_shareable(struct vm_area_struct *vma)
+{
+	unsigned long start = ALIGN(vma->vm_start, PUD_SIZE),
+		      end = ALIGN_DOWN(vma->vm_end, PUD_SIZE);
+
+	if (start >= end)
+		return false;
+
+	return __vma_aligned_range_pmd_shareable(vma, start, end, false);
 }
 
 static bool vma_addr_pmd_shareable(struct vm_area_struct *vma,
@@ -6656,15 +6696,11 @@ static bool vma_addr_pmd_shareable(struct vm_area_struct *vma,
 	unsigned long start = addr & PUD_MASK;
 	unsigned long end = start + PUD_SIZE;
 
-	return __vma_aligned_range_pmd_shareable(vma, start, end);
+	return __vma_aligned_range_pmd_shareable(vma, start, end, true);
 }
 
 bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr)
 {
-#ifdef CONFIG_USERFAULTFD
-	if (uffd_disable_huge_pmd_share(vma))
-		return false;
-#endif
 	return vma_addr_pmd_shareable(vma, addr);
 }
 
@@ -6695,6 +6731,130 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
 		*end = ALIGN(*end, PUD_SIZE);
 }
 
+static bool __vma_shareable_flags_pmd(struct vm_area_struct *vma)
+{
+	return vma->vm_flags & (VM_MAYSHARE | VM_SHARED) &&
+		vma->vm_private_data;
+}
+
+void hugetlb_vma_lock_read(struct vm_area_struct *vma)
+{
+	if (__vma_shareable_flags_pmd(vma)) {
+		struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
+
+		down_read(&vma_lock->rw_sema);
+	}
+}
+
+void hugetlb_vma_unlock_read(struct vm_area_struct *vma)
+{
+	if (__vma_shareable_flags_pmd(vma)) {
+		struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
+
+		up_read(&vma_lock->rw_sema);
+	}
+}
+
+void hugetlb_vma_lock_write(struct vm_area_struct *vma)
+{
+	if (__vma_shareable_flags_pmd(vma)) {
+		struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
+
+		down_write(&vma_lock->rw_sema);
+	}
+}
+
+void hugetlb_vma_unlock_write(struct vm_area_struct *vma)
+{
+	if (__vma_shareable_flags_pmd(vma)) {
+		struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
+
+		up_write(&vma_lock->rw_sema);
+	}
+}
+
+int hugetlb_vma_trylock_write(struct vm_area_struct *vma)
+{
+	struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
+
+	if (!__vma_shareable_flags_pmd(vma))
+		return 1;
+
+	return down_write_trylock(&vma_lock->rw_sema);
+}
+
+void hugetlb_vma_assert_locked(struct vm_area_struct *vma)
+{
+	if (__vma_shareable_flags_pmd(vma)) {
+		struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
+
+		lockdep_assert_held(&vma_lock->rw_sema);
+	}
+}
+
+void hugetlb_vma_lock_release(struct kref *kref)
+{
+	struct hugetlb_vma_lock *vma_lock = container_of(kref,
+			struct hugetlb_vma_lock, refs);
+
+	kfree(vma_lock);
+}
+
+static void hugetlb_vma_lock_free(struct vm_area_struct *vma)
+{
+	/*
+	 * Only present in sharable vmas. See comment in
+	 * __unmap_hugepage_range_final about how VM_SHARED could
+	 * be set without VM_MAYSHARE. As a result, we need to
+	 * check if either is set in the free path.
+	 */
+	if (!vma || !(vma->vm_flags & (VM_MAYSHARE | VM_SHARED)))
+		return;
+
+	if (vma->vm_private_data) {
+		struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
+
+		/*
+		 * vma_lock structure may or not be released, but it
+		 * certainly will no longer be attached to vma so clear
+		 * pointer.
+		 */
+		vma_lock->vma = NULL;
+		kref_put(&vma_lock->refs, hugetlb_vma_lock_release);
+		vma->vm_private_data = NULL;
+	}
+}
+
+static void hugetlb_vma_lock_alloc(struct vm_area_struct *vma)
+{
+	struct hugetlb_vma_lock *vma_lock;
+
+	/* Only establish in (flags) sharable vmas */
+	if (!vma || !(vma->vm_flags & VM_MAYSHARE))
+		return;
+
+	/* Should never get here with non-NULL vm_private_data */
+	if (vma->vm_private_data)
+		return;
+
+	/* Check size/alignment for pmd sharing possible */
+	if (!vma_pmd_shareable(vma))
+		return;
+
+	vma_lock = kmalloc(sizeof(*vma_lock), GFP_KERNEL);
+	if (!vma_lock)
+		/*
+		 * If we can not allocate structure, then vma can not
+		 * participate in pmd sharing.
+		 */
+		return;
+
+	kref_init(&vma_lock->refs);
+	init_rwsem(&vma_lock->rw_sema);
+	vma_lock->vma = vma;
+	vma->vm_private_data = vma_lock;
+}
+
 /*
  * Search for a shareable pmd page for hugetlb. In any case calls pmd_alloc()
  * and returns the corresponding pte. While this is not necessary for the
@@ -6781,6 +6941,14 @@ int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
 }
 
 #else /* !CONFIG_ARCH_WANT_HUGE_PMD_SHARE */
+static void hugetlb_vma_lock_free(struct vm_area_struct *vma)
+{
+}
+
+static void hugetlb_vma_lock_alloc(struct vm_area_struct *vma)
+{
+}
+
 pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 		      unsigned long addr, pud_t *pud)
 {
diff --git a/mm/rmap.c b/mm/rmap.c
index d17d68a9b15b..744faaef0489 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -24,7 +24,7 @@
 * mm->mmap_lock
 *   mapping->invalidate_lock (in filemap_fault)
 *     page->flags PG_locked (lock_page)
- *       hugetlbfs_i_mmap_rwsem_key (in huge_pmd_share)
+ *       hugetlbfs_i_mmap_rwsem_key (in huge_pmd_share, see hugetlbfs below)
 *         mapping->i_mmap_rwsem
 *           anon_vma->rwsem
 *             mm->page_table_lock or pte_lock
@@ -44,6 +44,12 @@
 *   anon_vma->rwsem,mapping->i_mmap_rwsem (memory_failure, collect_procs_anon)
 *     ->tasklist_lock
 *       pte map lock
+ *
+ * hugetlbfs PageHuge() take locks in this order:
+ *	hugetlb_fault_mutex (hugetlbfs specific page fault mutex)
+ *	vma_lock (hugetlb specific lock for pmd_sharing)
+ *	mapping->i_mmap_rwsem (also used for hugetlb pmd sharing)
+ *	page->flags PG_locked (lock_page)
 */
 
 #include

From patchwork Wed Sep 14 22:18:08 2022
From: Mike Kravetz <mike.kravetz@oracle.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Muchun Song, Miaohe Lin, David Hildenbrand, Sven Schnelle, Michal Hocko,
	Peter Xu, Naoya Horiguchi, "Aneesh Kumar K . V", Andrea Arcangeli,
	"Kirill A . Shutemov", Davidlohr Bueso, Prakash Sangappa,
	James Houghton, Mina Almasry, Pasha Tatashin, Axel Rasmussen,
	Ray Fucillo, Andrew Morton, Mike Kravetz
Subject: [PATCH v2 7/9] hugetlb: create hugetlb_unmap_file_folio to unmap single file folio
Date: Wed, 14 Sep 2022 15:18:08 -0700
Message-Id: <20220914221810.95771-8-mike.kravetz@oracle.com>
In-Reply-To: <20220914221810.95771-1-mike.kravetz@oracle.com>
References: <20220914221810.95771-1-mike.kravetz@oracle.com>

Create the new routine hugetlb_unmap_file_folio that will unmap a single
file folio.  This is refactored code from hugetlb_vmdelete_list.  It is
modified to do locking within the routine itself and check whether the
page is mapped within a specific vma before unmapping.

This refactoring will be put to use and expanded upon in a subsequent
patch adding vma specific locking.
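The helpers this patch adds convert a file page offset range into per-vma address bounds. A small userspace sketch of that arithmetic (assuming a 4 KiB page size; the struct and names are illustrative, not kernel API) shows the asymmetry the patch carries: the start helper returns an offset relative to vm_start, while the end helper returns an absolute address clamped to vm_end:

```c
/*
 * Userspace model of the vma_offset_start()/vma_offset_end()
 * calculations.  SHIFT is an assumed PAGE_SHIFT of 12 (4 KiB pages).
 */
#define SHIFT 12UL /* assumption: PAGE_SHIFT */

struct vma_sketch {
	unsigned long pgoff; /* file page offset where the vma begins */
	unsigned long start; /* vm_start */
	unsigned long end;   /* vm_end */
};

/* Offset into the vma (relative to vm_start) where file page 'start' maps. */
static unsigned long offset_start(const struct vma_sketch *v, unsigned long start)
{
	return v->pgoff < start ? (start - v->pgoff) << SHIFT : 0;
}

/* Absolute end address, clamped to vm_end; end == 0 means "to end of file". */
static unsigned long offset_end(const struct vma_sketch *v, unsigned long end)
{
	unsigned long t_end;

	if (!end)
		return v->end;

	t_end = ((end - v->pgoff) << SHIFT) + v->start;
	return t_end > v->end ? v->end : t_end;
}
```

A caller unmapping file pages [start, end) would then operate on [vm_start + offset_start(...), offset_end(...)), mirroring how the refactored routine invokes unmap_hugepage_range.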
Signed-off-by: Mike Kravetz
Reviewed-by: Miaohe Lin
---
 fs/hugetlbfs/inode.c | 123 +++++++++++++++++++++++++++++++++----------
 1 file changed, 94 insertions(+), 29 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 7112a9a9f54d..3bb1772fce2f 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -371,6 +371,94 @@ static void hugetlb_delete_from_page_cache(struct page *page)
 	delete_from_page_cache(page);
 }
 
+/*
+ * Called with i_mmap_rwsem held for inode based vma maps.  This makes
+ * sure vma (and vm_mm) will not go away.  We also hold the hugetlb fault
+ * mutex for the page in the mapping.  So, we can not race with page being
+ * faulted into the vma.
+ */
+static bool hugetlb_vma_maps_page(struct vm_area_struct *vma,
+				unsigned long addr, struct page *page)
+{
+	pte_t *ptep, pte;
+
+	ptep = huge_pte_offset(vma->vm_mm, addr,
+			huge_page_size(hstate_vma(vma)));
+
+	if (!ptep)
+		return false;
+
+	pte = huge_ptep_get(ptep);
+	if (huge_pte_none(pte) || !pte_present(pte))
+		return false;
+
+	if (pte_page(pte) == page)
+		return true;
+
+	return false;
+}
+
+/*
+ * Can vma_offset_start/vma_offset_end overflow on 32-bit arches?
+ * No, because the interval tree returns us only those vmas
+ * which overlap the truncated area starting at pgoff,
+ * and no vma on a 32-bit arch can span beyond the 4GB.
+ */
+static unsigned long vma_offset_start(struct vm_area_struct *vma, pgoff_t start)
+{
+	if (vma->vm_pgoff < start)
+		return (start - vma->vm_pgoff) << PAGE_SHIFT;
+	else
+		return 0;
+}
+
+static unsigned long vma_offset_end(struct vm_area_struct *vma, pgoff_t end)
+{
+	unsigned long t_end;
+
+	if (!end)
+		return vma->vm_end;
+
+	t_end = ((end - vma->vm_pgoff) << PAGE_SHIFT) + vma->vm_start;
+	if (t_end > vma->vm_end)
+		t_end = vma->vm_end;
+	return t_end;
+}
+
+/*
+ * Called with hugetlb fault mutex held.  Therefore, no more mappings to
+ * this folio can be created while executing the routine.
+ */
+static void hugetlb_unmap_file_folio(struct hstate *h,
+					struct address_space *mapping,
+					struct folio *folio, pgoff_t index)
+{
+	struct rb_root_cached *root = &mapping->i_mmap;
+	struct page *page = &folio->page;
+	struct vm_area_struct *vma;
+	unsigned long v_start;
+	unsigned long v_end;
+	pgoff_t start, end;
+
+	start = index * pages_per_huge_page(h);
+	end = (index + 1) * pages_per_huge_page(h);
+
+	i_mmap_lock_write(mapping);
+
+	vma_interval_tree_foreach(vma, root, start, end - 1) {
+		v_start = vma_offset_start(vma, start);
+		v_end = vma_offset_end(vma, end);
+
+		if (!hugetlb_vma_maps_page(vma, vma->vm_start + v_start, page))
+			continue;
+
+		unmap_hugepage_range(vma, vma->vm_start + v_start, v_end,
+				NULL, ZAP_FLAG_DROP_MARKER);
+	}
+
+	i_mmap_unlock_write(mapping);
+}
+
 static void
 hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
 		      zap_flags_t zap_flags)
@@ -383,30 +471,13 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
 	 * an inclusive "last".
 	 */
 	vma_interval_tree_foreach(vma, root, start, end ? end - 1 : ULONG_MAX) {
-		unsigned long v_offset;
+		unsigned long v_start;
 		unsigned long v_end;
 
-		/*
-		 * Can the expression below overflow on 32-bit arches?
-		 * No, because the interval tree returns us only those vmas
-		 * which overlap the truncated area starting at pgoff,
-		 * and no vma on a 32-bit arch can span beyond the 4GB.
-		 */
-		if (vma->vm_pgoff < start)
-			v_offset = (start - vma->vm_pgoff) << PAGE_SHIFT;
-		else
-			v_offset = 0;
-
-		if (!end)
-			v_end = vma->vm_end;
-		else {
-			v_end = ((end - vma->vm_pgoff) << PAGE_SHIFT)
-					+ vma->vm_start;
-			if (v_end > vma->vm_end)
-				v_end = vma->vm_end;
-		}
+		v_start = vma_offset_start(vma, start);
+		v_end = vma_offset_end(vma, end);
 
-		unmap_hugepage_range(vma, vma->vm_start + v_offset, v_end,
+		unmap_hugepage_range(vma, vma->vm_start + v_start, v_end,
 				     NULL, zap_flags);
 	}
 }
@@ -428,14 +499,8 @@ static bool remove_inode_single_folio(struct hstate *h, struct inode *inode,
 	 * the fault mutex.  The mutex will prevent faults
 	 * until we finish removing the folio.
 	 */
-	if (unlikely(folio_mapped(folio))) {
-		i_mmap_lock_write(mapping);
-		hugetlb_vmdelete_list(&mapping->i_mmap,
-			index * pages_per_huge_page(h),
-			(index + 1) * pages_per_huge_page(h),
-			ZAP_FLAG_DROP_MARKER);
-		i_mmap_unlock_write(mapping);
-	}
+	if (unlikely(folio_mapped(folio)))
+		hugetlb_unmap_file_folio(h, mapping, folio, index);
 
 	folio_lock(folio);
 	/*

From patchwork Wed Sep 14 22:18:09 2022
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 12976594
From: Mike Kravetz
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Muchun Song, Miaohe Lin, David Hildenbrand, Sven Schnelle,
    Michal Hocko, Peter Xu, Naoya Horiguchi, "Aneesh Kumar K . V",
    Andrea Arcangeli, "Kirill A . Shutemov", Davidlohr Bueso,
    Prakash Sangappa, James Houghton, Mina Almasry, Pasha Tatashin,
    Axel Rasmussen, Ray Fucillo, Andrew Morton, Mike Kravetz
Subject: [PATCH v2 8/9] hugetlb: use new vma_lock for pmd sharing synchronization
Date: Wed, 14 Sep 2022 15:18:09 -0700
Message-Id: <20220914221810.95771-9-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20220914221810.95771-1-mike.kravetz@oracle.com>
References: <20220914221810.95771-1-mike.kravetz@oracle.com>
MIME-Version: 1.0
The new hugetlb vma lock is used to address this race:

Faulting thread                                 Unsharing thread
...                                                  ...
ptep = huge_pte_offset()
      or
ptep = huge_pte_alloc()
...
                                                i_mmap_lock_write
                                                lock page table
ptep invalid   <------------------------        huge_pmd_unshare()
Could be in a previously                        unlock_page_table
sharing process or worse                        i_mmap_unlock_write
...

The vma_lock is used as follows:
- During fault processing.  The lock is acquired in read mode before
  doing a page table lock and allocation (huge_pte_alloc).  The lock is
  held until code is finished with the page table entry (ptep).
- The lock must be held in write mode whenever huge_pmd_unshare is
  called.

Lock ordering issues come into play when unmapping a page from all
vmas mapping the page.  The i_mmap_rwsem must be held to search for the
vmas, and the vma lock must be held before calling unmap which will
call huge_pmd_unshare.  This is done today in:
- try_to_migrate_one and try_to_unmap_one for page migration and
  memory error handling.
  In these routines we 'try' to obtain the vma lock and fail to unmap
  if unsuccessful.  Calling routines already deal with the failure of
  unmapping.
- hugetlb_vmdelete_list for truncation and hole punch.  This routine
  also tries to acquire the vma lock.  If it fails, it skips the
  unmapping.  However, we can not have file truncation or hole punch
  fail because of contention.  After hugetlb_vmdelete_list, truncation
  and hole punch call remove_inode_hugepages.  remove_inode_hugepages
  checks for mapped pages and calls hugetlb_unmap_file_folio to unmap
  them.  hugetlb_unmap_file_folio is designed to drop locks and
  reacquire them in the correct order to guarantee unmap success.

Signed-off-by: Mike Kravetz
---
 fs/hugetlbfs/inode.c |  66 +++++++++++++++++++++++++++-
 mm/hugetlb.c         | 102 +++++++++++++++++++++++++++++++++++++++----
 mm/memory.c          |   2 +
 mm/rmap.c            | 100 +++++++++++++++++++++++++++---------------
 mm/userfaultfd.c     |   9 +++-
 5 files changed, 233 insertions(+), 46 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 3bb1772fce2f..009ae539b9b2 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -434,6 +434,7 @@ static void hugetlb_unmap_file_folio(struct hstate *h,
 					struct folio *folio, pgoff_t index)
 {
 	struct rb_root_cached *root = &mapping->i_mmap;
+	struct hugetlb_vma_lock *vma_lock;
 	struct page *page = &folio->page;
 	struct vm_area_struct *vma;
 	unsigned long v_start;
@@ -444,7 +445,8 @@ static void hugetlb_unmap_file_folio(struct hstate *h,
 	end = (index + 1) * pages_per_huge_page(h);
 
 	i_mmap_lock_write(mapping);
-
+retry:
+	vma_lock = NULL;
 	vma_interval_tree_foreach(vma, root, start, end - 1) {
 		v_start = vma_offset_start(vma, start);
 		v_end = vma_offset_end(vma, end);
@@ -452,11 +454,63 @@ static void hugetlb_unmap_file_folio(struct hstate *h,
 		if (!hugetlb_vma_maps_page(vma, vma->vm_start + v_start, page))
 			continue;
 
+		if (!hugetlb_vma_trylock_write(vma)) {
+			vma_lock = vma->vm_private_data;
+			/*
+			 * If we can not get vma lock, we need to drop
+			 * immap_sema and take locks in order.  First,
+			 * take a ref on the vma_lock structure so that
+			 * we can be guaranteed it will not go away when
+			 * dropping immap_sema.
+			 */
+			kref_get(&vma_lock->refs);
+			break;
+		}
+
 		unmap_hugepage_range(vma, vma->vm_start + v_start, v_end,
 				NULL, ZAP_FLAG_DROP_MARKER);
+		hugetlb_vma_unlock_write(vma);
 	}
 
 	i_mmap_unlock_write(mapping);
+
+	if (vma_lock) {
+		/*
+		 * Wait on vma_lock.  We know it is still valid as we have
+		 * a reference.  We must 'open code' vma locking as we do
+		 * not know if vma_lock is still attached to vma.
+		 */
+		down_write(&vma_lock->rw_sema);
+		i_mmap_lock_write(mapping);
+
+		vma = vma_lock->vma;
+		if (!vma) {
+			/*
+			 * If lock is no longer attached to vma, then just
+			 * unlock, drop our reference and retry looking for
+			 * other vmas.
+			 */
+			up_write(&vma_lock->rw_sema);
+			kref_put(&vma_lock->refs, hugetlb_vma_lock_release);
+			goto retry;
+		}
+
+		/*
+		 * vma_lock is still attached to vma.  Check to see if vma
+		 * still maps page and if so, unmap.
+		 */
+		v_start = vma_offset_start(vma, start);
+		v_end = vma_offset_end(vma, end);
+		if (hugetlb_vma_maps_page(vma, vma->vm_start + v_start, page))
+			unmap_hugepage_range(vma, vma->vm_start + v_start,
+					v_end, NULL,
+					ZAP_FLAG_DROP_MARKER);
+
+		kref_put(&vma_lock->refs, hugetlb_vma_lock_release);
+		hugetlb_vma_unlock_write(vma);
+
+		goto retry;
+	}
 }
 
 static void
@@ -474,11 +528,21 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
 		unsigned long v_start;
 		unsigned long v_end;
 
+		if (!hugetlb_vma_trylock_write(vma))
+			continue;
+
 		v_start = vma_offset_start(vma, start);
 		v_end = vma_offset_end(vma, end);
 
 		unmap_hugepage_range(vma, vma->vm_start + v_start, v_end,
 				     NULL, zap_flags);
+
+		/*
+		 * Note that vma lock only exists for shared/non-private
+		 * vmas.  Therefore, lock is not held when calling
+		 * unmap_hugepage_range for private vmas.
+		 */
+		hugetlb_vma_unlock_write(vma);
 	}
 }
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 616be891b798..e8cbc0f7cdaa 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4795,6 +4795,14 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 		mmu_notifier_invalidate_range_start(&range);
 		mmap_assert_write_locked(src);
 		raw_write_seqcount_begin(&src->write_protect_seq);
+	} else {
+		/*
+		 * For shared mappings the vma lock must be held before
+		 * calling huge_pte_offset in the src vma.  Otherwise, the
+		 * returned ptep could go away if part of a shared pmd and
+		 * another thread calls huge_pmd_unshare.
+		 */
+		hugetlb_vma_lock_read(src_vma);
 	}
 
 	last_addr_mask = hugetlb_mask_last_page(h);
@@ -4941,6 +4949,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	if (cow) {
 		raw_write_seqcount_end(&src->write_protect_seq);
 		mmu_notifier_invalidate_range_end(&range);
+	} else {
+		hugetlb_vma_unlock_read(src_vma);
 	}
 
 	return ret;
@@ -4999,6 +5009,7 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
 	mmu_notifier_invalidate_range_start(&range);
 	last_addr_mask = hugetlb_mask_last_page(h);
 	/* Prevent race with file truncation */
+	hugetlb_vma_lock_write(vma);
 	i_mmap_lock_write(mapping);
 	for (; old_addr < old_end; old_addr += sz, new_addr += sz) {
 		src_pte = huge_pte_offset(mm, old_addr, sz);
@@ -5030,6 +5041,7 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
 		flush_tlb_range(vma, old_end - len, old_end);
 	mmu_notifier_invalidate_range_end(&range);
 	i_mmap_unlock_write(mapping);
+	hugetlb_vma_unlock_write(vma);
 
 	return len + old_addr - old_end;
 }
@@ -5349,8 +5361,29 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 		 * may get SIGKILLed if it later faults.
 		 */
 		if (outside_reserve) {
+			struct address_space *mapping = vma->vm_file->f_mapping;
+			pgoff_t idx;
+			u32 hash;
+
 			put_page(old_page);
+
+			/*
+			 * Drop hugetlb_fault_mutex and vma_lock before
+			 * unmapping.  unmapping needs to hold vma_lock
+			 * in write mode.  Dropping vma_lock in read mode
+			 * here is OK as COW mappings do not interact with
+			 * PMD sharing.
+			 *
+			 * Reacquire both after unmap operation.
+			 */
+			idx = vma_hugecache_offset(h, vma, haddr);
+			hash = hugetlb_fault_mutex_hash(mapping, idx);
+			hugetlb_vma_unlock_read(vma);
+			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+
 			unmap_ref_private(mm, vma, old_page, haddr);
+
+			mutex_lock(&hugetlb_fault_mutex_table[hash]);
+			hugetlb_vma_lock_read(vma);
 			spin_lock(ptl);
 			ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
 			if (likely(ptep &&
@@ -5499,14 +5532,16 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_area_struct *vma,
 	};
 
 	/*
-	 * hugetlb_fault_mutex and i_mmap_rwsem must be
+	 * vma_lock and hugetlb_fault_mutex must be
 	 * dropped before handling userfault.  Reacquire
 	 * after handling fault to make calling code simpler.
 	 */
+	hugetlb_vma_unlock_read(vma);
 	hash = hugetlb_fault_mutex_hash(mapping, idx);
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 	ret = handle_userfault(&vmf, reason);
 	mutex_lock(&hugetlb_fault_mutex_table[hash]);
+	hugetlb_vma_lock_read(vma);
 
 	return ret;
 }
@@ -5740,6 +5775,11 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 
 	ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
 	if (ptep) {
+		/*
+		 * Since we hold no locks, ptep could be stale.  That is
+		 * OK as we are only making decisions based on content and
+		 * not actually modifying content here.
+		 */
 		entry = huge_ptep_get(ptep);
 		if (unlikely(is_hugetlb_entry_migration(entry))) {
 			migration_entry_wait_huge(vma, ptep);
@@ -5747,23 +5787,35 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
 			return VM_FAULT_HWPOISON_LARGE |
 				VM_FAULT_SET_HINDEX(hstate_index(h));
-	} else {
-		ptep = huge_pte_alloc(mm, vma, haddr, huge_page_size(h));
-		if (!ptep)
-			return VM_FAULT_OOM;
 	}
 
-	mapping = vma->vm_file->f_mapping;
-	idx = vma_hugecache_offset(h, vma, haddr);
-
 	/*
 	 * Serialize hugepage allocation and instantiation, so that we don't
 	 * get spurious allocation failures if two CPUs race to instantiate
 	 * the same page in the page cache.
 	 */
+	mapping = vma->vm_file->f_mapping;
+	idx = vma_hugecache_offset(h, vma, haddr);
 	hash = hugetlb_fault_mutex_hash(mapping, idx);
 	mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
+	/*
+	 * Acquire vma lock before calling huge_pte_alloc and hold
+	 * until finished with ptep.  This prevents huge_pmd_unshare from
+	 * being called elsewhere and making the ptep no longer valid.
+	 *
+	 * ptep could have already be assigned via huge_pte_offset.  That
+	 * is OK, as huge_pte_alloc will return the same value unless
+	 * something has changed.
+	 */
+	hugetlb_vma_lock_read(vma);
+	ptep = huge_pte_alloc(mm, vma, haddr, huge_page_size(h));
+	if (!ptep) {
+		hugetlb_vma_unlock_read(vma);
+		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+		return VM_FAULT_OOM;
+	}
+
 	entry = huge_ptep_get(ptep);
 	/* PTE markers should be handled the same way as none pte */
 	if (huge_pte_none_mostly(entry)) {
@@ -5824,6 +5876,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 			unlock_page(pagecache_page);
 			put_page(pagecache_page);
 		}
+		hugetlb_vma_unlock_read(vma);
 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 		return handle_userfault(&vmf, VM_UFFD_WP);
 	}
@@ -5867,6 +5920,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		put_page(pagecache_page);
 	}
 out_mutex:
+	hugetlb_vma_unlock_read(vma);
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 	/*
 	 * Generally it's safe to hold refcount during waiting page lock. But
@@ -6329,8 +6383,9 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	flush_cache_range(vma, range.start, range.end);
 
 	mmu_notifier_invalidate_range_start(&range);
-	last_addr_mask = hugetlb_mask_last_page(h);
+	hugetlb_vma_lock_write(vma);
 	i_mmap_lock_write(vma->vm_file->f_mapping);
+	last_addr_mask = hugetlb_mask_last_page(h);
 	for (; address < end; address += psize) {
 		spinlock_t *ptl;
 		ptep = huge_pte_offset(mm, address, psize);
@@ -6429,6 +6484,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	 * See Documentation/mm/mmu_notifier.rst
 	 */
 	i_mmap_unlock_write(vma->vm_file->f_mapping);
+	hugetlb_vma_unlock_write(vma);
 	mmu_notifier_invalidate_range_end(&range);
 
 	return pages << h->order;
@@ -6930,6 +6986,7 @@ int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
 	pud_t *pud = pud_offset(p4d, addr);
 
 	i_mmap_assert_write_locked(vma->vm_file->f_mapping);
+	hugetlb_vma_assert_locked(vma);
 	BUG_ON(page_count(virt_to_page(ptep)) == 0);
 	if (page_count(virt_to_page(ptep)) == 1)
 		return 0;
@@ -6941,6 +6998,31 @@ int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
 }
 
 #else /* !CONFIG_ARCH_WANT_HUGE_PMD_SHARE */
+void hugetlb_vma_lock_read(struct vm_area_struct *vma)
+{
+}
+
+void hugetlb_vma_unlock_read(struct vm_area_struct *vma)
+{
+}
+
+void hugetlb_vma_lock_write(struct vm_area_struct *vma)
+{
+}
+
+void hugetlb_vma_unlock_write(struct vm_area_struct *vma)
+{
+}
+
+int hugetlb_vma_trylock_write(struct vm_area_struct *vma)
+{
+	return 1;
+}
+
+void hugetlb_vma_assert_locked(struct vm_area_struct *vma)
+{
+}
+
 static void hugetlb_vma_lock_free(struct vm_area_struct *vma)
 {
 }
@@ -7318,6 +7400,7 @@ void hugetlb_unshare_all_pmds(struct vm_area_struct *vma)
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm,
 				start, end);
 	mmu_notifier_invalidate_range_start(&range);
+	hugetlb_vma_lock_write(vma);
 	i_mmap_lock_write(vma->vm_file->f_mapping);
 	for (address = start; address < end; address += PUD_SIZE) {
 		ptep = huge_pte_offset(mm, address, sz);
@@ -7329,6 +7412,7 @@ void hugetlb_unshare_all_pmds(struct vm_area_struct *vma)
 	}
 	flush_hugetlb_tlb_range(vma, start, end);
 	i_mmap_unlock_write(vma->vm_file->f_mapping);
+	hugetlb_vma_unlock_write(vma);
 	/*
 	 * No need to call mmu_notifier_invalidate_range(), see
 	 * Documentation/mm/mmu_notifier.rst.
diff --git a/mm/memory.c b/mm/memory.c
index c4c3c2fd4f45..118e5f023597 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1685,10 +1685,12 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 			if (vma->vm_file) {
 				zap_flags_t zap_flags = details ?
 				    details->zap_flags : 0;
+				hugetlb_vma_lock_write(vma);
 				i_mmap_lock_write(vma->vm_file->f_mapping);
 				__unmap_hugepage_range_final(tlb, vma, start, end,
 							     NULL, zap_flags);
 				i_mmap_unlock_write(vma->vm_file->f_mapping);
+				hugetlb_vma_unlock_write(vma);
 			}
 		} else
 			unmap_page_range(tlb, vma, start, end, details);
diff --git a/mm/rmap.c b/mm/rmap.c
index 744faaef0489..2ec925e5fa6a 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1554,24 +1554,39 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			 * To call huge_pmd_unshare, i_mmap_rwsem must be
 			 * held in write mode.  Caller needs to explicitly
 			 * do this outside rmap routines.
+			 *
+			 * We also must hold hugetlb vma_lock in write mode.
+			 * Lock order dictates acquiring vma_lock BEFORE
+			 * i_mmap_rwsem.  We can only try lock here and fail
+			 * if unsuccessful.
 			 */
-			VM_BUG_ON(!anon && !(flags & TTU_RMAP_LOCKED));
-			if (!anon && huge_pmd_unshare(mm, vma, address, pvmw.pte)) {
-				flush_tlb_range(vma, range.start, range.end);
-				mmu_notifier_invalidate_range(mm, range.start,
-							      range.end);
-
-				/*
-				 * The ref count of the PMD page was dropped
-				 * which is part of the way map counting
-				 * is done for shared PMDs.  Return 'true'
-				 * here.  When there is no other sharing,
-				 * huge_pmd_unshare returns false and we will
-				 * unmap the actual page and drop map count
-				 * to zero.
-				 */
-				page_vma_mapped_walk_done(&pvmw);
-				break;
+			if (!anon) {
+				VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
+				if (!hugetlb_vma_trylock_write(vma)) {
+					page_vma_mapped_walk_done(&pvmw);
+					ret = false;
+					break;
+				}
+				if (huge_pmd_unshare(mm, vma, address, pvmw.pte)) {
+					hugetlb_vma_unlock_write(vma);
+					flush_tlb_range(vma,
+						range.start, range.end);
+					mmu_notifier_invalidate_range(mm,
+						range.start, range.end);
+					/*
+					 * The ref count of the PMD page was
+					 * dropped which is part of the way map
+					 * counting is done for shared PMDs.
+					 * Return 'true' here.  When there is
+					 * no other sharing, huge_pmd_unshare
+					 * returns false and we will unmap the
+					 * actual page and drop map count
+					 * to zero.
+					 */
+					page_vma_mapped_walk_done(&pvmw);
+					break;
+				}
+				hugetlb_vma_unlock_write(vma);
 			}
 			pteval = huge_ptep_clear_flush(vma, address, pvmw.pte);
 		} else {
@@ -1929,26 +1944,41 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 			 * To call huge_pmd_unshare, i_mmap_rwsem must be
 			 * held in write mode.  Caller needs to explicitly
 			 * do this outside rmap routines.
+			 *
+			 * We also must hold hugetlb vma_lock in write mode.
+			 * Lock order dictates acquiring vma_lock BEFORE
+			 * i_mmap_rwsem.  We can only try lock here and
+			 * fail if unsuccessful.
 			 */
-			VM_BUG_ON(!anon && !(flags & TTU_RMAP_LOCKED));
-			if (!anon && huge_pmd_unshare(mm, vma, address, pvmw.pte)) {
-				flush_tlb_range(vma, range.start, range.end);
-				mmu_notifier_invalidate_range(mm, range.start,
-							      range.end);
-
-				/*
-				 * The ref count of the PMD page was dropped
-				 * which is part of the way map counting
-				 * is done for shared PMDs.  Return 'true'
-				 * here.  When there is no other sharing,
-				 * huge_pmd_unshare returns false and we will
-				 * unmap the actual page and drop map count
-				 * to zero.
-				 */
-				page_vma_mapped_walk_done(&pvmw);
-				break;
+			if (!anon) {
+				VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
+				if (!hugetlb_vma_trylock_write(vma)) {
+					page_vma_mapped_walk_done(&pvmw);
+					ret = false;
+					break;
+				}
+				if (huge_pmd_unshare(mm, vma, address, pvmw.pte)) {
+					hugetlb_vma_unlock_write(vma);
+					flush_tlb_range(vma,
+						range.start, range.end);
+					mmu_notifier_invalidate_range(mm,
+						range.start, range.end);
+
+					/*
+					 * The ref count of the PMD page was
+					 * dropped which is part of the way map
+					 * counting is done for shared PMDs.
+					 * Return 'true' here.  When there is
+					 * no other sharing, huge_pmd_unshare
+					 * returns false and we will unmap the
+					 * actual page and drop map count
+					 * to zero.
+					 */
+					page_vma_mapped_walk_done(&pvmw);
+					break;
+				}
+				hugetlb_vma_unlock_write(vma);
 			}
-
 			/* Nuke the hugetlb page table entry */
 			pteval = huge_ptep_clear_flush(vma, address, pvmw.pte);
 		} else {
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 0fdbd2c05587..e24e8a47ce8a 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -379,16 +379,21 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 		BUG_ON(dst_addr >= dst_start + len);
 
 		/*
-		 * Serialize via hugetlb_fault_mutex.
+		 * Serialize via vma_lock and hugetlb_fault_mutex.
+		 * vma_lock ensures the dst_pte remains valid even
+		 * in the case of shared pmds.  fault mutex prevents
+		 * races with other faulting threads.
 		 */
 		idx = linear_page_index(dst_vma, dst_addr);
 		mapping = dst_vma->vm_file->f_mapping;
 		hash = hugetlb_fault_mutex_hash(mapping, idx);
 		mutex_lock(&hugetlb_fault_mutex_table[hash]);
+		hugetlb_vma_lock_read(dst_vma);
 
 		err = -ENOMEM;
 		dst_pte = huge_pte_alloc(dst_mm, dst_vma, dst_addr, vma_hpagesize);
 		if (!dst_pte) {
+			hugetlb_vma_unlock_read(dst_vma);
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 			goto out_unlock;
 		}
@@ -396,6 +401,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 		if (mode != MCOPY_ATOMIC_CONTINUE &&
 		    !huge_pte_none_mostly(huge_ptep_get(dst_pte))) {
 			err = -EEXIST;
+			hugetlb_vma_unlock_read(dst_vma);
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 			goto out_unlock;
 		}
@@ -404,6 +410,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 					       dst_addr, src_addr, mode, &page,
 					       wp_copy);
 
+		hugetlb_vma_unlock_read(dst_vma);
 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 
 		cond_resched();

From patchwork Wed Sep 14 22:18:10 2022
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 12976593
From: Mike Kravetz
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Muchun Song, Miaohe Lin, David Hildenbrand, Sven Schnelle,
 Michal Hocko, Peter Xu, Naoya Horiguchi, "Aneesh Kumar K. V",
 Andrea Arcangeli, "Kirill A. Shutemov", Davidlohr Bueso,
 Prakash Sangappa, James Houghton, Mina Almasry, Pasha Tatashin,
 Axel Rasmussen, Ray Fucillo, Andrew Morton, Mike Kravetz
Subject: [PATCH v2 9/9] hugetlb: clean up code checking for fault/truncation races
Date: Wed, 14 Sep 2022 15:18:10 -0700
Message-Id: <20220914221810.95771-10-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20220914221810.95771-1-mike.kravetz@oracle.com>
References: <20220914221810.95771-1-mike.kravetz@oracle.com>
With the new hugetlb vma lock in place, it can also be used to handle
page fault races with file truncation.  The lock is taken at the
beginning of the fault path in read mode.  During truncation, it is
taken in write mode for each vma which has the file mapped.  The file's
size (i_size) is modified before taking the vma lock to unmap.

How are races handled?
The page fault code checks i_size early in processing after taking the
vma lock.  If the fault is beyond i_size, the fault is aborted.  If the
fault is not beyond i_size, the fault will continue and a new page will
be added to the file.  It could be that truncation code modifies i_size
after the check in fault code.  That is OK, as truncation code will soon
remove the page.  The truncation code will wait until the fault is
finished, as it must obtain the vma lock in write mode.

This patch cleans up/removes late checks in the fault paths that try to
back out pages racing with truncation.  As noted above, we just let the
truncation code remove the pages.

Signed-off-by: Mike Kravetz
---
 fs/hugetlbfs/inode.c | 31 ++++++++++++-------------------
 mm/hugetlb.c         | 27 ++++++---------------------
 2 files changed, 18 insertions(+), 40 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 009ae539b9b2..ed57a029eab0 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -568,26 +568,19 @@ static bool remove_inode_single_folio(struct hstate *h, struct inode *inode,
     folio_lock(folio);

     /*
-     * After locking page, make sure mapping is the same.
-     * We could have raced with page fault populate and
-     * backout code.
+     * We must remove the folio from page cache before removing
+     * the region/ reserve map (hugetlb_unreserve_pages).  In
+     * rare out of memory conditions, removal of the region/reserve
+     * map could fail.  Correspondingly, the subpool and global
+     * reserve usage count can need to be adjusted.
      */
-    if (folio_mapping(folio) == mapping) {
-        /*
-         * We must remove the folio from page cache before removing
-         * the region/ reserve map (hugetlb_unreserve_pages).  In
-         * rare out of memory conditions, removal of the region/reserve
-         * map could fail.  Correspondingly, the subpool and global
-         * reserve usage count can need to be adjusted.
-         */
-        VM_BUG_ON(HPageRestoreReserve(&folio->page));
-        hugetlb_delete_from_page_cache(&folio->page);
-        ret = true;
-        if (!truncate_op) {
-            if (unlikely(hugetlb_unreserve_pages(inode, index,
-                        index + 1, 1)))
-                hugetlb_fix_reserve_counts(inode);
-        }
+    VM_BUG_ON(HPageRestoreReserve(&folio->page));
+    hugetlb_delete_from_page_cache(&folio->page);
+    ret = true;
+    if (!truncate_op) {
+        if (unlikely(hugetlb_unreserve_pages(inode, index,
+                    index + 1, 1)))
+            hugetlb_fix_reserve_counts(inode);
     }

     folio_unlock(folio);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e8cbc0f7cdaa..2207300791e5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5561,6 +5561,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
     spinlock_t *ptl;
     unsigned long haddr = address & huge_page_mask(h);
     bool new_page, new_pagecache_page = false;
+    bool reserve_alloc = false;

     /*
      * Currently, we are forced to kill the process in the event the
@@ -5616,6 +5617,8 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
         clear_huge_page(page, address, pages_per_huge_page(h));
         __SetPageUptodate(page);
         new_page = true;
+        if (HPageRestoreReserve(page))
+            reserve_alloc = true;

         if (vma->vm_flags & VM_MAYSHARE) {
             int err = hugetlb_add_to_page_cache(page, mapping, idx);
@@ -5679,10 +5682,6 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
     }

     ptl = huge_pte_lock(h, mm, ptep);
-    size = i_size_read(mapping->host) >> huge_page_shift(h);
-    if (idx >= size)
-        goto backout;
-
     ret = 0;
     /* If pte changed from under us, retry */
     if (!pte_same(huge_ptep_get(ptep), old_pte))
@@ -5726,10 +5725,10 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 backout:
     spin_unlock(ptl);
 backout_unlocked:
-    unlock_page(page);
-    /* restore reserve for newly allocated pages not in page cache */
     if (new_page && !new_pagecache_page)
         restore_reserve_on_error(h, vma, haddr, page);
+
+    unlock_page(page);
     put_page(page);
     goto out;
 }
@@ -6061,26 +6060,12 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,

     ptl = huge_pte_lock(h, dst_mm, dst_pte);

-    /*
-     * Recheck the i_size after holding PT lock to make sure not
-     * to leave any page mapped (as page_mapped()) beyond the end
-     * of the i_size (remove_inode_hugepages() is strict about
-     * enforcing that).  If we bail out here, we'll also leave a
-     * page in the radix tree in the vm_shared case beyond the end
-     * of the i_size, but remove_inode_hugepages() will take care
-     * of it as soon as we drop the hugetlb_fault_mutex_table.
-     */
-    size = i_size_read(mapping->host) >> huge_page_shift(h);
-    ret = -EFAULT;
-    if (idx >= size)
-        goto out_release_unlock;
-
-    ret = -EEXIST;
     /*
      * We allow to overwrite a pte marker: consider when both MISSING|WP
      * registered, we firstly wr-protect a none pte which has no page cache
      * page backing it, then access the page.
      */
+    ret = -EEXIST;
     if (!huge_pte_none_mostly(huge_ptep_get(dst_pte)))
         goto out_release_unlock;