From patchwork Wed Jun 23 08:51:02 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bin Wang
X-Patchwork-Id: 12339299
From: wangbin
To: ,
CC: , , ,
Subject: [PATCH v2] mm: hugetlb: add hwcrp_hugepages to record memory failure on hugetlbfs
Date: Wed, 23 Jun 2021 16:51:02 +0800
Message-ID: <20210623085102.2458-1-wangbin224@huawei.com>
X-Mailer: git-send-email 2.29.2.windows.3

From: Bin Wang

In the current hugetlbfs memory failure handler, the reserved huge page
count is used to record the number of huge pages with hwpoison. There
are two problems:

1. We call hugetlb_fix_reserve_counts() to adjust the reserved count in
hugetlbfs_error_remove_page(). But this function is only called if
hugetlb_unreserve_pages() fails, and hugetlb_unreserve_pages() fails
only if the kmalloc() in region_del() fails, which is almost impossible.
As a result, the reserved count is not corrected as expected when a
memory failure occurs.

2. The reserved count is designed to show the number of huge pages
reserved at mmap() time. So even if the first issue is fixed, the count
remains confusing because we cannot tell whether it reflects hwpoisoned
pages or reserved pages.

This patch adds a hardware-corrupted huge page count (hwcrp_hugepages)
to record memory failures on hugetlbfs instead of the reserved count.

Signed-off-by: Bin Wang
---
 fs/hugetlbfs/inode.c    |  3 +--
 include/linux/hugetlb.h |  3 +++
 mm/hugetlb.c            | 30 ++++++++++++++++++++++++++++++
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 926eeb9bf4eb..ffb6e7b6756b 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -986,8 +986,7 @@ static int hugetlbfs_error_remove_page(struct address_space *mapping,
 	pgoff_t index = page->index;
 
 	remove_huge_page(page);
-	if (unlikely(hugetlb_unreserve_pages(inode, index, index + 1, 1)))
-		hugetlb_fix_reserve_counts(inode);
+	hugetlb_fix_hwcrp_counts(page);
 
 	return 0;
 }
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index f7ca1a3870ea..1d5bada80aa5 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -171,6 +171,7 @@ void putback_active_hugepage(struct page *page);
 void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason);
 void free_huge_page(struct page *page);
 void hugetlb_fix_reserve_counts(struct inode *inode);
+void hugetlb_fix_hwcrp_counts(struct page *page);
 extern struct mutex *hugetlb_fault_mutex_table;
 u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx);
 
@@ -602,12 +603,14 @@ struct hstate {
 	unsigned long free_huge_pages;
 	unsigned long resv_huge_pages;
 	unsigned long surplus_huge_pages;
+	unsigned long hwcrp_huge_pages;
 	unsigned long nr_overcommit_huge_pages;
 	struct list_head hugepage_activelist;
 	struct list_head hugepage_freelists[MAX_NUMNODES];
 	unsigned int nr_huge_pages_node[MAX_NUMNODES];
 	unsigned int free_huge_pages_node[MAX_NUMNODES];
 	unsigned int surplus_huge_pages_node[MAX_NUMNODES];
+	unsigned int hwcrp_huge_pages_node[MAX_NUMNODES];
 #ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
 	unsigned int nr_free_vmemmap_pages;
 #endif
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 760b5fb836b8..3e6385381db7 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -763,6 +763,15 @@ void hugetlb_fix_reserve_counts(struct inode *inode)
 		pr_warn("hugetlb: Huge Page Reserved count may go negative.\n");
 }
 
+void hugetlb_fix_hwcrp_counts(struct page *page)
+{
+	struct hstate *h = &default_hstate;
+	int nid = page_to_nid(page);
+
+	h->hwcrp_huge_pages++;
+	h->hwcrp_huge_pages_node[nid]++;
+}
+
 /*
  * Count and return the number of huge pages in the reserve map
  * that intersect with the range [f, t).
@@ -3293,12 +3302,30 @@ static ssize_t surplus_hugepages_show(struct kobject *kobj,
 }
 HSTATE_ATTR_RO(surplus_hugepages);
 
+static ssize_t hwcrp_hugepages_show(struct kobject *kobj,
+					struct kobj_attribute *attr, char *buf)
+{
+	struct hstate *h;
+	unsigned long hwcrp_huge_pages;
+	int nid;
+
+	h = kobj_to_hstate(kobj, &nid);
+	if (nid == NUMA_NO_NODE)
+		hwcrp_huge_pages = h->hwcrp_huge_pages;
+	else
+		hwcrp_huge_pages = h->hwcrp_huge_pages_node[nid];
+
+	return sysfs_emit(buf, "%lu\n", hwcrp_huge_pages);
+}
+HSTATE_ATTR_RO(hwcrp_hugepages);
+
 static struct attribute *hstate_attrs[] = {
 	&nr_hugepages_attr.attr,
 	&nr_overcommit_hugepages_attr.attr,
 	&free_hugepages_attr.attr,
 	&resv_hugepages_attr.attr,
 	&surplus_hugepages_attr.attr,
+	&hwcrp_hugepages_attr.attr,
 #ifdef CONFIG_NUMA
 	&nr_hugepages_mempolicy_attr.attr,
 #endif
@@ -3368,6 +3395,7 @@ static struct attribute *per_node_hstate_attrs[] = {
 	&nr_hugepages_attr.attr,
 	&free_hugepages_attr.attr,
 	&surplus_hugepages_attr.attr,
+	&hwcrp_hugepages_attr.attr,
 	NULL,
 };
 
@@ -3862,11 +3890,13 @@ void hugetlb_report_meminfo(struct seq_file *m)
 			   "HugePages_Free:    %5lu\n"
 			   "HugePages_Rsvd:    %5lu\n"
 			   "HugePages_Surp:    %5lu\n"
+			   "HugePages_Hwcrp:   %5lu\n"
 			   "Hugepagesize:   %8lu kB\n",
 			   count,
 			   h->free_huge_pages,
 			   h->resv_huge_pages,
 			   h->surplus_huge_pages,
+			   h->hwcrp_huge_pages,
 			   huge_page_size(h) / SZ_1K);
 	}
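
For reviewers: below is a minimal user-space sketch of how the new
counters could be consumed once this patch is applied. It is
illustrative only and not part of the patch. The "HugePages_Hwcrp:"
field name comes from the hugetlb_report_meminfo() hunk above, and the
sysfs file name follows from HSTATE_ATTR_RO(hwcrp_hugepages); the
hugepages-2048kB directory is an assumption based on the default 2 MB
huge page size and would differ on other configurations.

/*
 * Illustrative sketch (not part of the patch): read the hwcrp counters
 * from /proc/meminfo and from the default hstate's sysfs directory.
 */
#include <stdio.h>
#include <string.h>

static long read_counter(const char *path)
{
	FILE *f = fopen(path, "r");
	long val = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

int main(void)
{
	char line[256];
	FILE *meminfo = fopen("/proc/meminfo", "r");

	/* Global count, as printed by hugetlb_report_meminfo(). */
	if (meminfo) {
		while (fgets(line, sizeof(line), meminfo)) {
			if (!strncmp(line, "HugePages_Hwcrp:", 16))
				printf("meminfo: %s", line);
		}
		fclose(meminfo);
	}

	/* Per-hstate count from hwcrp_hugepages_show() via sysfs
	 * (assumes 2 MB default huge pages). */
	printf("sysfs:   %ld\n", read_counter(
		"/sys/kernel/mm/hugepages/hugepages-2048kB/hwcrp_hugepages"));
	return 0;
}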