From patchwork Thu Mar 28 23:47:03 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 10876107
From: Mike Kravetz
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Davidlohr Bueso
Cc: Joonsoo Kim, Michal Hocko, Naoya Horiguchi, "Kirill A . Shutemov",
 Andrew Morton, Mike Kravetz
Subject: [PATCH v2 1/2] huegtlbfs: on restore reserve error path retain subpool reservation
Date: Thu, 28 Mar 2019 16:47:03 -0700
Message-Id: <20190328234704.27083-2-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190328234704.27083-1-mike.kravetz@oracle.com>
References: <20190328234704.27083-1-mike.kravetz@oracle.com>

When a huge page is allocated, PagePrivate() is set if the allocation
consumed a reservation.  When freeing a huge page, PagePrivate() is
checked.  If set, it indicates the reservation should be restored.
PagePrivate() being set at free huge page time mostly happens on error
paths.

When huge page reservations are created, a check is made to determine if
the mapping is associated with an explicitly mounted filesystem.  If so,
pages are also reserved within the filesystem.  The default action when
freeing a huge page is to decrement the usage count in any associated
explicitly mounted filesystem.  However, if the reservation is to be
restored, the reservation/use count within the filesystem should not be
decremented.  Otherwise, a subsequent page allocation and free for the
same mapping location will cause the filesystem usage to go 'negative'.
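
The accounting drift described above can be modeled in user space.  What
follows is a minimal sketch, not kernel code: struct toy_subpool and every
toy_* function are hypothetical names made up for illustration, the
reservation count is folded into the same struct for brevity, and the
subpool minimum-size handling in free_huge_page() is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct toy_subpool {
	long used_pages;	/* pages charged to the mounted filesystem */
	long resv_pages;	/* outstanding reservations for the mapping */
};

/* Creating a reservation also charges one page to the filesystem. */
static void toy_reserve(struct toy_subpool *sp)
{
	sp->used_pages++;
	sp->resv_pages++;
}

/* Allocation consumes the reservation; the filesystem was charged earlier. */
static void toy_alloc_reserved(struct toy_subpool *sp)
{
	assert(sp->resv_pages > 0);
	sp->resv_pages--;
}

/*
 * Free one page.  restore_reserve plays the role of PagePrivate() being
 * set at free time.  With patched == false the filesystem charge is always
 * dropped; with patched == true it is kept whenever the reservation is
 * being restored.
 */
static void toy_free(struct toy_subpool *sp, bool restore_reserve,
		     bool patched)
{
	if (!(patched && restore_reserve))
		sp->used_pages--;
	if (restore_reserve)
		sp->resv_pages++;
}

/* Reserve, hit an error path, then allocate and free the same location. */
long toy_final_usage(bool patched)
{
	struct toy_subpool sp = { 0, 0 };

	toy_reserve(&sp);
	toy_alloc_reserved(&sp);
	toy_free(&sp, true, patched);	/* error path: reservation restored */
	toy_alloc_reserved(&sp);
	toy_free(&sp, false, patched);	/* ordinary free later on */
	return sp.used_pages;
}
```

In this model toy_final_usage(false) ends at -1, while
toy_final_usage(true) ends at 0.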

Filesystem                         Size  Used Avail Use% Mounted on
nodev                              4.0G -4.0M  4.1G    - /opt/hugepool

To fix, when freeing a huge page do not adjust filesystem usage if
PagePrivate() is set to indicate the reservation should be restored.

Signed-off-by: Mike Kravetz
---
 mm/hugetlb.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f79ae4e42159..8651d6a602f9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1268,12 +1268,23 @@ void free_huge_page(struct page *page)
 	ClearPagePrivate(page);
 
 	/*
-	 * A return code of zero implies that the subpool will be under its
-	 * minimum size if the reservation is not restored after page is free.
-	 * Therefore, force restore_reserve operation.
+	 * If PagePrivate() was set on page, page allocation consumed a
+	 * reservation.  If the page was associated with a subpool, there
+	 * would have been a page reserved in the subpool before allocation
+	 * via hugepage_subpool_get_pages().  Since we are 'restoring' the
+	 * reservation, do not call hugepage_subpool_put_pages() as this will
+	 * remove the reserved page from the subpool.
 	 */
-	if (hugepage_subpool_put_pages(spool, 1) == 0)
-		restore_reserve = true;
+	if (!restore_reserve) {
+		/*
+		 * A return code of zero implies that the subpool will be
+		 * under its minimum size if the reservation is not restored
+		 * after page is free.  Therefore, force restore_reserve
+		 * operation.
+		 */
+		if (hugepage_subpool_put_pages(spool, 1) == 0)
+			restore_reserve = true;
+	}
 
 	spin_lock(&hugetlb_lock);
 	clear_page_huge_active(page);

From patchwork Thu Mar 28 23:47:04 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 10876111
From: Mike Kravetz
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Davidlohr Bueso
Cc: Joonsoo Kim, Michal Hocko, Naoya Horiguchi, "Kirill A . Shutemov",
 Andrew Morton, Mike Kravetz
Subject: [PATCH v2 2/2] hugetlb: use same fault hash key for shared and private mappings
Date: Thu, 28 Mar 2019 16:47:04 -0700
Message-Id: <20190328234704.27083-3-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190328234704.27083-1-mike.kravetz@oracle.com>
References: <20190328234704.27083-1-mike.kravetz@oracle.com>

hugetlb uses a fault mutex hash table to prevent concurrent page faults
on the same pages.  The key for shared and private mappings is different.
Shared keys off address_space and file index.  Private keys off mm and
virtual address.  Consider a private mapping of a populated hugetlbfs
file.  A write fault will first map the page from the file and then do a
COW to map a writable page.

Hugetlbfs hole punch uses the fault mutex to prevent mappings of file
pages.  It uses the address_space file index key.  However, private
mappings will use a different key and could temporarily map the file
page before COW.  This causes problems (BUG) for the hole punch code as
it expects the mutex to prevent additional uses/mappings of the page.

There seems to be another potential COW issue/race with this approach of
different private and shared keys, as noted in commit 8382d914ebf7 ("mm,
hugetlb: improve page-fault scalability").
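
The key mismatch described above can be illustrated with a small
user-space sketch.  It is a model under stated assumptions, not kernel
code: struct toy_key and the toy_* functions are hypothetical names, 2 MB
huge pages (shift 21) are assumed, and the keys are compared directly
instead of being hashed.  In the kernel the key is hashed to select a
mutex, so two different keys may still collide by chance, but nothing
guarantees they do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two-word hash key; not kernel code. */
struct toy_key {
	uintptr_t word0;
	uintptr_t word1;
};

/* Key selection that depends on the mapping type, as described above. */
static struct toy_key toy_old_key(bool shared, const void *mapping,
				  unsigned long idx, const void *mm,
				  unsigned long address,
				  unsigned int huge_shift)
{
	struct toy_key key;

	if (shared) {
		key.word0 = (uintptr_t)mapping;
		key.word1 = idx;
	} else {
		key.word0 = (uintptr_t)mm;
		key.word1 = address >> huge_shift;
	}
	return key;
}

/* Single key for every mapping: address_space plus file index. */
static struct toy_key toy_new_key(const void *mapping, unsigned long idx)
{
	struct toy_key key;

	key.word0 = (uintptr_t)mapping;
	key.word1 = idx;
	return key;
}

static bool toy_same_key(struct toy_key a, struct toy_key b)
{
	return a.word0 == b.word0 && a.word1 == b.word1;
}

/*
 * Does a private-mapping fault on file page idx use the same key as a
 * hole punch of that page?  The hole punch side always keys off the
 * address_space and file index.
 */
bool toy_private_fault_matches_hole_punch(bool patched)
{
	static char toy_mapping_obj, toy_mm_obj;	/* distinct dummies */
	const unsigned int huge_shift = 21;		/* assumes 2 MB pages */
	const unsigned long idx = 3;
	const unsigned long address = 0x40000000UL + (idx << huge_shift);
	struct toy_key punch, fault;

	if (patched) {
		punch = toy_new_key(&toy_mapping_obj, idx);
		fault = toy_new_key(&toy_mapping_obj, idx);
	} else {
		punch = toy_old_key(true, &toy_mapping_obj, idx,
				    &toy_mm_obj, 0, huge_shift);
		fault = toy_old_key(false, &toy_mapping_obj, idx,
				    &toy_mm_obj, address, huge_shift);
	}
	return toy_same_key(punch, fault);
}
```

With patched == false the two keys differ, so the private fault and the
hole punch are not guaranteed to take the same mutex; with patched ==
true they are identical.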

Since every hugetlb mapping (even anon and private) is actually a file
mapping, just use the address_space index key for all mappings.  This
results in potentially more hash collisions.  However, this should not
be the common case.

Signed-off-by: Mike Kravetz
Reviewed-by: Davidlohr Bueso
---
 fs/hugetlbfs/inode.c    |  7 ++-----
 include/linux/hugetlb.h |  4 +---
 mm/hugetlb.c            | 22 ++++++----------------
 mm/userfaultfd.c        |  3 +--
 4 files changed, 10 insertions(+), 26 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index ec32fece5e1e..6189ba80b57b 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -440,9 +440,7 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
 			u32 hash;
 
 			index = page->index;
-			hash = hugetlb_fault_mutex_hash(h, current->mm,
-							&pseudo_vma,
-							mapping, index, 0);
+			hash = hugetlb_fault_mutex_hash(h, mapping, index, 0);
 			mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 			/*
@@ -639,8 +637,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		addr = index * hpage_size;
 
 		/* mutex taken here, fault path and hole punch */
-		hash = hugetlb_fault_mutex_hash(h, mm, &pseudo_vma, mapping,
-						index, addr);
+		hash = hugetlb_fault_mutex_hash(h, mapping, index, addr);
 		mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 		/* See if already present in mapping to avoid alloc/free */
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ea35263eb76b..3bc0d02649fe 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -123,9 +123,7 @@ void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason);
 void free_huge_page(struct page *page);
 void hugetlb_fix_reserve_counts(struct inode *inode);
 extern struct mutex *hugetlb_fault_mutex_table;
-u32 hugetlb_fault_mutex_hash(struct hstate *h, struct mm_struct *mm,
-				struct vm_area_struct *vma,
-				struct address_space *mapping,
+u32 hugetlb_fault_mutex_hash(struct hstate *h, struct address_space *mapping,
 				pgoff_t idx, unsigned long address);
 
 pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8651d6a602f9..4409a87434f1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3837,8 +3837,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 			 * handling userfault.  Reacquire after handling
 			 * fault to make calling code simpler.
 			 */
-			hash = hugetlb_fault_mutex_hash(h, mm, vma, mapping,
-							idx, haddr);
+			hash = hugetlb_fault_mutex_hash(h, mapping, idx, haddr);
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 			ret = handle_userfault(&vmf, VM_UFFD_MISSING);
 			mutex_lock(&hugetlb_fault_mutex_table[hash]);
@@ -3946,21 +3945,14 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 }
 
 #ifdef CONFIG_SMP
-u32 hugetlb_fault_mutex_hash(struct hstate *h, struct mm_struct *mm,
-			    struct vm_area_struct *vma,
-			    struct address_space *mapping,
+u32 hugetlb_fault_mutex_hash(struct hstate *h, struct address_space *mapping,
 			    pgoff_t idx, unsigned long address)
 {
 	unsigned long key[2];
 	u32 hash;
 
-	if (vma->vm_flags & VM_SHARED) {
-		key[0] = (unsigned long) mapping;
-		key[1] = idx;
-	} else {
-		key[0] = (unsigned long) mm;
-		key[1] = address >> huge_page_shift(h);
-	}
+	key[0] = (unsigned long) mapping;
+	key[1] = idx;
 
 	hash = jhash2((u32 *)&key, sizeof(key)/sizeof(u32), 0);
 
@@ -3971,9 +3963,7 @@ u32 hugetlb_fault_mutex_hash(struct hstate *h, struct mm_struct *mm,
  * For uniprocesor systems we always use a single mutex, so just
  * return 0 and avoid the hashing overhead.
  */
-u32 hugetlb_fault_mutex_hash(struct hstate *h, struct mm_struct *mm,
-			    struct vm_area_struct *vma,
-			    struct address_space *mapping,
+u32 hugetlb_fault_mutex_hash(struct hstate *h, struct address_space *mapping,
 			    pgoff_t idx, unsigned long address)
 {
 	return 0;
@@ -4018,7 +4008,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * get spurious allocation failures if two CPUs race to instantiate
 	 * the same page in the page cache.
 	 */
-	hash = hugetlb_fault_mutex_hash(h, mm, vma, mapping, idx, haddr);
+	hash = hugetlb_fault_mutex_hash(h, mapping, idx, haddr);
 	mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 	entry = huge_ptep_get(ptep);
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index d59b5a73dfb3..9932d5755e4c 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -271,8 +271,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 		 */
 		idx = linear_page_index(dst_vma, dst_addr);
 		mapping = dst_vma->vm_file->f_mapping;
-		hash = hugetlb_fault_mutex_hash(h, dst_mm, dst_vma, mapping,
-								idx, dst_addr);
+		hash = hugetlb_fault_mutex_hash(h, mapping, idx, dst_addr);
 		mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 		err = -ENOMEM;