From patchwork Thu Oct 28 21:36:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12591221 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C9882C433EF for ; Thu, 28 Oct 2021 21:36:08 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6B890610CA for ; Thu, 28 Oct 2021 21:36:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6B890610CA Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id B68C16B0072; Thu, 28 Oct 2021 17:36:07 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B183E6B0073; Thu, 28 Oct 2021 17:36:07 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A2ECA940007; Thu, 28 Oct 2021 17:36:07 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0219.hostedemail.com [216.40.44.219]) by kanga.kvack.org (Postfix) with ESMTP id 7DA716B0072 for ; Thu, 28 Oct 2021 17:36:07 -0400 (EDT) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 05BC931ED0 for ; Thu, 28 Oct 2021 21:36:07 +0000 (UTC) X-FDA: 78747154374.15.709AE75 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf27.hostedemail.com (Postfix) with ESMTP id 92126700024C for ; Thu, 28 Oct 2021 21:36:06 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 3670160FF2; Thu, 28 Oct 2021 21:36:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=linux-foundation.org; s=korg; t=1635456965; bh=jxWwOHabNnscVRy+r9kwuAbCzrielvvANPWmTQgJG2M=; h=Date:From:To:Subject:In-Reply-To:From; b=Wcg1c+P1+TuytgC/ZMbnWsQRbXL6nwTrL1/EZnjBmoXN1YRT1PCfDLu3GnJIii2W/ xqPjXLNo9vhAaYm4usQcJzI6noEWQDo3mcPIrdciN4Vsdetf3g/leZwnpyVSE7EYcr fKubFsnjk9W0y32YZhspBIQNjvjzeBrZFxiovHq0= Date: Thu, 28 Oct 2021 14:36:04 -0700 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, shakeelb@google.com, torvalds@linux-foundation.org, vvs@virtuozzo.com Subject: [patch 01/11] memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT Message-ID: <20211028213604.wxCte4LAl%akpm@linux-foundation.org> In-Reply-To: <20211028143506.5f5d5e2cd1f768a1da864844@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Stat-Signature: 4rnup8c4xp73ip6yfquxg3tpu6d7b5pk Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Wcg1c+P1; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 92126700024C X-HE-Tag: 1635456966-995319 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Shakeel Butt Subject: memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT commit 5c1f4e690eec ("mm/vmalloc: switch to bulk allocator in __vmalloc_area_node()") switched to bulk page allocator for order 0 allocation backing vmalloc. However bulk page allocator does not support __GFP_ACCOUNT allocations and there are several users of kvmalloc(__GFP_ACCOUNT). For now make __GFP_ACCOUNT allocations bypass bulk page allocator. 
In the future, if there is a workload that can be significantly improved by the bulk page allocator with __GFP_ACCOUNT support, we can revisit the decision. Link: https://lkml.kernel.org/r/20211014151607.2171970-1-shakeelb@google.com Fixes: 5c1f4e690eec ("mm/vmalloc: switch to bulk allocator in __vmalloc_area_node()") Signed-off-by: Shakeel Butt Reported-by: Vasily Averin Tested-by: Vasily Averin Acked-by: David Hildenbrand Acked-by: Michal Hocko Acked-by: Roman Gushchin Acked-by: Johannes Weiner Signed-off-by: Andrew Morton --- mm/page_alloc.c | 4 ++++ 1 file changed, 4 insertions(+) --- a/mm/page_alloc.c~memcg-page_alloc-skip-bulk-allocator-for-__gfp_account +++ a/mm/page_alloc.c @@ -5223,6 +5223,10 @@ unsigned long __alloc_pages_bulk(gfp_t g if (unlikely(page_array && nr_pages - nr_populated == 0)) goto out; + /* Bulk allocator does not support memcg accounting. */ + if (memcg_kmem_enabled() && (gfp & __GFP_ACCOUNT)) + goto failed; + /* Use the single page allocator for one page. */ if (nr_pages - nr_populated == 1) goto failed; From patchwork Thu Oct 28 21:36:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12591223 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 732DAC433F5 for ; Thu, 28 Oct 2021 21:36:11 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0B55E610CA for ; Thu, 28 Oct 2021 21:36:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 0B55E610CA Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id A7E966B0073; Thu, 28 Oct
2021 17:36:10 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A2DDA6B0074; Thu, 28 Oct 2021 17:36:10 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 91CC3940007; Thu, 28 Oct 2021 17:36:10 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0195.hostedemail.com [216.40.44.195]) by kanga.kvack.org (Postfix) with ESMTP id 6C5046B0073 for ; Thu, 28 Oct 2021 17:36:10 -0400 (EDT) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id DF8C61830451D for ; Thu, 28 Oct 2021 21:36:09 +0000 (UTC) X-FDA: 78747154458.25.502D3E1 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf10.hostedemail.com (Postfix) with ESMTP id AE4496001E54 for ; Thu, 28 Oct 2021 21:36:01 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 55BAF60FE3; Thu, 28 Oct 2021 21:36:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1635456968; bh=84gMrMkS11wA2Rk+CJ1QmnKRVd4oIENPgFHHYXhN+rA=; h=Date:From:To:Subject:In-Reply-To:From; b=VMN4m3C4Url1lvCikMOZE/mPmbO53vmzk4UoekmcnVF3/yqZhc+6BYjHKGMmQPykI TRcZJe9JtUhVIc1rxauLCDLa1XILkzXG4dYTMc9TDr/bPzIqiJD/Smfx6LASkgg1pq 8C2Yyu5Pxg9e3iiCpqoo7YwIL6J6tA9LGRxnw/X4= Date: Thu, 28 Oct 2021 14:36:07 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, osalvador@suse.de, peterx@redhat.com, shy828301@gmail.com, stable@vger.kernel.org, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 02/11] mm: hwpoison: remove the unnecessary THP check Message-ID: <20211028213607.dE5Qz5QgJ%akpm@linux-foundation.org> In-Reply-To: <20211028143506.5f5d5e2cd1f768a1da864844@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam02 
X-Rspamd-Queue-Id: AE4496001E54 X-Stat-Signature: 7asabg719s668q4znr43bik7xc6p3m87 Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=VMN4m3C4; dmarc=none; spf=pass (imf10.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1635456961-553615 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yang Shi Subject: mm: hwpoison: remove the unnecessary THP check When handling a THP, hwpoison checked whether the THP was in the allocation or free stage, since hwpoison may mistreat it as a hugetlb page. After commit 415c64c1453a ("mm/memory-failure: split thp earlier in memory error handling") the problem has been fixed, so this check is no longer needed. Remove it. The side effect of the removal is that hwpoison may report an unsplit THP instead of an unknown error for shmem THP. That does not seem like a big deal. The following patch, "mm: filemap: check if THP has hwpoisoned subpage for PMD page fault", depends on this one; it fixes shmem THPs with hwpoisoned subpage(s) being wrongly PMD mapped. So this patch needs to be backported to -stable as well. Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com Signed-off-by: Yang Shi Suggested-by: Naoya Horiguchi Acked-by: Naoya Horiguchi Cc: Hugh Dickins Cc: Kirill A. Shutemov Cc: Matthew Wilcox Cc: Oscar Salvador Cc: Peter Xu Cc: Signed-off-by: Andrew Morton --- mm/memory-failure.c | 14 -------------- 1 file changed, 14 deletions(-) --- a/mm/memory-failure.c~mm-hwpoison-remove-the-unnecessary-thp-check +++ a/mm/memory-failure.c @@ -1147,20 +1147,6 @@ static int __get_hwpoison_page(struct pa if (!HWPoisonHandlable(head)) return -EBUSY; - if (PageTransHuge(head)) { - /* - * Non anonymous thp exists only in allocation/free time.
We - * can't handle such a case correctly, so let's give it up. - * This should be better than triggering BUG_ON when kernel - * tries to touch the "partially handled" page. - */ - if (!PageAnon(head)) { - pr_err("Memory failure: %#lx: non anonymous thp\n", - page_to_pfn(page)); - return 0; - } - } - if (get_page_unless_zero(head)) { if (head == compound_head(page)) return 1; From patchwork Thu Oct 28 21:36:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12591225 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BE05AC433FE for ; Thu, 28 Oct 2021 21:36:14 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4AA0760FE3 for ; Thu, 28 Oct 2021 21:36:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4AA0760FE3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id E28556B0074; Thu, 28 Oct 2021 17:36:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DB074940007; Thu, 28 Oct 2021 17:36:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C78266B0078; Thu, 28 Oct 2021 17:36:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id A17746B0074 for ; Thu, 28 Oct 2021 17:36:13 -0400 (EDT) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 3296D8249980 for ; Thu, 
28 Oct 2021 21:36:13 +0000 (UTC) X-FDA: 78747154626.02.0E47CB0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf16.hostedemail.com (Postfix) with ESMTP id 42401F00009D for ; Thu, 28 Oct 2021 21:36:07 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 736BE61056; Thu, 28 Oct 2021 21:36:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1635456971; bh=dJQQEMdt0xvUwN3UWZ9lg3BwGuIrm9dAROXC/APbYn8=; h=Date:From:To:Subject:In-Reply-To:From; b=HyCt3UQCrx1xij2iby1bOWqV6EajwQ7nXUiAxpHhd4g6fhMlIll7z5cetWzlxhMNw Y2Dy0K1HVQLXp672aH6HJYGrbFXS3uA/vdYzhVDMcNCs/UBj1LfJD5X1NEm5XuqQWB ALAoY5s4brvOlSinkfCA7nbFUX97Y/SWx9dO1my0= Date: Thu, 28 Oct 2021 14:36:11 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, osalvador@suse.de, peterx@redhat.com, shy828301@gmail.com, stable@vger.kernel.org, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 03/11] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault Message-ID: <20211028213611.-fbyoks-F%akpm@linux-foundation.org> In-Reply-To: <20211028143506.5f5d5e2cd1f768a1da864844@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 42401F00009D X-Stat-Signature: 8hxre94xnwitkssuqbx4mjegb6saqmoh Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=HyCt3UQC; dmarc=none; spf=pass (imf16.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1635456967-552827 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yang Shi Subject: mm: filemap: check if THP has hwpoisoned subpage for PMD page fault When 
handling a shmem page fault, the THP with a corrupted subpage could be PMD mapped if certain conditions are satisfied. But the kernel is supposed to send SIGBUS when trying to map a hwpoisoned page. There are two paths which may do a PMD map: fault around and regular fault. Before commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths") things were even worse in the fault around path. The THP could be PMD mapped as long as the VMA fit, regardless of which subpage was accessed and corrupted. After this commit, as long as the head page is not corrupted the THP could be PMD mapped. In the regular fault path the THP could be PMD mapped as long as the corrupted page is not accessed and the VMA fits. This loophole could be fixed by iterating over every subpage to check whether any of them is hwpoisoned, but that is somewhat costly in the page fault path. So introduce a new page flag called HasHWPoisoned on the first tail page. It indicates that the THP has hwpoisoned subpage(s). It is set if any subpage of the THP is found hwpoisoned by memory failure, after the refcount is bumped successfully, and is cleared when the THP is freed or split. The soft offline path doesn't need this since the soft offline handler just marks a subpage hwpoisoned when the subpage is migrated successfully. But shmem THP didn't get split then migrated at all. Link: https://lkml.kernel.org/r/20211020210755.23964-3-shy828301@gmail.com Fixes: 800d8c63b2e9 ("shmem: add huge pages support") Signed-off-by: Yang Shi Reviewed-by: Naoya Horiguchi Suggested-by: Kirill A.
Shutemov Cc: Hugh Dickins Cc: Matthew Wilcox Cc: Oscar Salvador Cc: Peter Xu Cc: Signed-off-by: Andrew Morton --- include/linux/page-flags.h | 23 +++++++++++++++++++++++ mm/huge_memory.c | 2 ++ mm/memory-failure.c | 14 ++++++++++++++ mm/memory.c | 9 +++++++++ mm/page_alloc.c | 4 +++- 5 files changed, 51 insertions(+), 1 deletion(-) --- a/include/linux/page-flags.h~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault +++ a/include/linux/page-flags.h @@ -171,6 +171,15 @@ enum pageflags { /* Compound pages. Stored in first tail page's flags */ PG_double_map = PG_workingset, +#ifdef CONFIG_MEMORY_FAILURE + /* + * Compound pages. Stored in first tail page's flags. + * Indicates that at least one subpage is hwpoisoned in the + * THP. + */ + PG_has_hwpoisoned = PG_mappedtodisk, +#endif + /* non-lru isolated movable page */ PG_isolated = PG_reclaim, @@ -668,6 +677,20 @@ PAGEFLAG_FALSE(DoubleMap) TESTSCFLAG_FALSE(DoubleMap) #endif +#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSPARENT_HUGEPAGE) +/* + * PageHasHWPoisoned indicates that at least one subpage is hwpoisoned in the + * compound page. + * + * This flag is set by hwpoison handler. Cleared by THP split or free page. + */ +PAGEFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND) + TESTSCFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND) +#else +PAGEFLAG_FALSE(HasHWPoisoned) + TESTSCFLAG_FALSE(HasHWPoisoned) +#endif + /* * Check if a page is currently marked HWPoisoned. 
Note that this check is * best effort only and inherently racy: there is no way to synchronize with --- a/mm/huge_memory.c~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault +++ a/mm/huge_memory.c @@ -2426,6 +2426,8 @@ static void __split_huge_page(struct pag /* lock lru list/PageCompound, ref frozen by page_ref_freeze */ lruvec = lock_page_lruvec(head); + ClearPageHasHWPoisoned(head); + for (i = nr - 1; i >= 1; i--) { __split_huge_page_tail(head, i, lruvec, list); /* Some pages can be beyond EOF: drop them from page cache */ --- a/mm/memory.c~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault +++ a/mm/memory.c @@ -3907,6 +3907,15 @@ vm_fault_t do_set_pmd(struct vm_fault *v return ret; /* + * Just backoff if any subpage of a THP is corrupted otherwise + * the corrupted page may mapped by PMD silently to escape the + * check. This kind of THP just can be PTE mapped. Access to + * the corrupted subpage should trigger SIGBUS as expected. + */ + if (unlikely(PageHasHWPoisoned(page))) + return ret; + + /* * Archs like ppc64 need additional space to store information * related to pte entry. Use the preallocated table for that. */ --- a/mm/memory-failure.c~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault +++ a/mm/memory-failure.c @@ -1694,6 +1694,20 @@ try_again: } if (PageTransHuge(hpage)) { + /* + * The flag must be set after the refcount is bumped + * otherwise it may race with THP split. + * And the flag can't be set in get_hwpoison_page() since + * it is called by soft offline too and it is just called + * for !MF_COUNT_INCREASE. So here seems to be the best + * place. + * + * Don't need care about the above error handling paths for + * get_hwpoison_page() since they handle either free page + * or unhandlable page. The refcount is bumped iff the + * page is a valid handlable page. 
+ */ + SetPageHasHWPoisoned(hpage); if (try_to_split_thp_page(p, "Memory Failure") < 0) { action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); res = -EBUSY; --- a/mm/page_alloc.c~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault +++ a/mm/page_alloc.c @@ -1312,8 +1312,10 @@ static __always_inline bool free_pages_p VM_BUG_ON_PAGE(compound && compound_order(page) != order, page); - if (compound) + if (compound) { ClearPageDoubleMap(page); + ClearPageHasHWPoisoned(page); + } for (i = 1; i < (1 << order); i++) { if (compound) bad += free_tail_pages_check(page, page + i); From patchwork Thu Oct 28 21:36:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12591227 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16D4EC433F5 for ; Thu, 28 Oct 2021 21:36:18 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BE8DA60FF2 for ; Thu, 28 Oct 2021 21:36:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org BE8DA60FF2 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 6AD816B0071; Thu, 28 Oct 2021 17:36:17 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 65D1C6B0075; Thu, 28 Oct 2021 17:36:17 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 54C396B0078; Thu, 28 Oct 2021 17:36:17 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0221.hostedemail.com [216.40.44.221]) by kanga.kvack.org (Postfix) with ESMTP id 
2FAEE6B0071 for ; Thu, 28 Oct 2021 17:36:17 -0400 (EDT) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id B9BA2183459E7 for ; Thu, 28 Oct 2021 21:36:16 +0000 (UTC) X-FDA: 78747154752.28.8C43BA1 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf20.hostedemail.com (Postfix) with ESMTP id E1D10D0000B8 for ; Thu, 28 Oct 2021 21:36:09 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id C68C360FE3; Thu, 28 Oct 2021 21:36:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1635456975; bh=9zwswZBiMPrrgKB8qJCiqcB5jYZFrqPCXDpgvEHIAYY=; h=Date:From:To:Subject:In-Reply-To:From; b=Ps2q+6wGwZgSEk1jL+qQ1SJskDOSVNNtK7+QIutis9n6BNexJGM92Mr1KkXpE1Pu7 //xqI57ki24E1CaYTZ/vXNdwaivfo7vtdhYLPgmRLoPog2dM9CDLRp8XPAy6aPuhaY 1ey5PZWpFGYpj3aDNp4seaFjonZ5bLwvHIRg5wKE= Date: Thu, 28 Oct 2021 14:36:14 -0700 From: Andrew Morton To: akpm@linux-foundation.org, christian.brauner@ubuntu.com, christian@brauner.io, david@redhat.com, fweimer@redhat.com, guro@fb.com, hannes@cmpxchg.org, hch@infradead.org, jannh@google.com, jengelh@inai.de, linux-mm@kvack.org, luto@kernel.org, mhocko@suse.com, minchan@kernel.org, mm-commits@vger.kernel.org, oleg@redhat.com, riel@surriel.com, rientjes@google.com, shakeelb@google.com, surenb@google.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 04/11] mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap Message-ID: <20211028213614.GOA2nllUX%akpm@linux-foundation.org> In-Reply-To: <20211028143506.5f5d5e2cd1f768a1da864844@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Stat-Signature: z4a7rq4rt1zz5kw36tjm8bw4zhojcbk4 Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Ps2q+6wG; spf=pass (imf20.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) 
smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: E1D10D0000B8 X-HE-Tag: 1635456969-592723 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Suren Baghdasaryan Subject: mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap A race between process_mrelease and exit_mmap, where free_pgtables is called while __oom_reap_task_mm is in progress, leads to a kernel crash during the pte_offset_map_lock call. The oom-reaper avoids this race by setting the MMF_OOM_VICTIM flag and causing exit_mmap to take and release mmap_write_lock, blocking it until the oom-reaper releases mmap_read_lock. Reusing MMF_OOM_VICTIM for process_mrelease would be the simplest way to fix this race; however, that would be considered a hack. Fix this race by elevating mm->mm_users and preventing exit_mmap from executing until process_mrelease is finished. The patch slightly refactors the code to adapt for a possible mmget_not_zero failure. This fix has a considerable negative impact on process_mrelease performance and will likely need later optimization.
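For illustration only (this is not part of the patch): the "take a reference unless the count has already dropped to zero" step described above can be sketched in userspace C11 roughly as follows. The names demo_mm, demo_get_not_zero and demo_put are hypothetical stand-ins, and the sketch uses <stdatomic.h> rather than the kernel's own atomic helpers, so treat it as a model of the idea rather than the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mm_users counter; not the kernel's mm_struct. */
struct demo_mm {
	atomic_int users;
};

/* Take a reference only if the count is still non-zero (teardown not started). */
static bool demo_get_not_zero(struct demo_mm *mm)
{
	int old = atomic_load(&mm->users);

	while (old != 0) {
		/* On failure, old is refreshed with the current value and we retry. */
		if (atomic_compare_exchange_weak(&mm->users, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool demo_put(struct demo_mm *mm)
{
	int old = atomic_fetch_sub(&mm->users, 1);

	assert(old > 0);
	return old == 1;
}
```

In this model, a caller that gets false from demo_get_not_zero must not touch the address space at all, which mirrors why the patch only sets reap when the reference was obtained and only calls mmput when mm is non-NULL.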
Link: https://lkml.kernel.org/r/20211022014658.263508-1-surenb@google.com Fixes: 884a7e5964e0 ("mm: introduce process_mrelease system call") Signed-off-by: Suren Baghdasaryan Acked-by: Michal Hocko Cc: David Rientjes Cc: Matthew Wilcox (Oracle) Cc: Johannes Weiner Cc: Roman Gushchin Cc: Rik van Riel Cc: Minchan Kim Cc: Christian Brauner Cc: Christoph Hellwig Cc: Oleg Nesterov Cc: David Hildenbrand Cc: Jann Horn Cc: Shakeel Butt Cc: Andy Lutomirski Cc: Christian Brauner Cc: Florian Weimer Cc: Jan Engelhardt Signed-off-by: Andrew Morton --- mm/oom_kill.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) --- a/mm/oom_kill.c~mm-prevent-a-race-between-process_mrelease-and-exit_mmap +++ a/mm/oom_kill.c @@ -1150,7 +1150,7 @@ SYSCALL_DEFINE2(process_mrelease, int, p struct task_struct *task; struct task_struct *p; unsigned int f_flags; - bool reap = true; + bool reap = false; struct pid *pid; long ret = 0; @@ -1177,15 +1177,15 @@ SYSCALL_DEFINE2(process_mrelease, int, p goto put_task; } - mm = p->mm; - mmgrab(mm); - - /* If the work has been done already, just exit with success */ - if (test_bit(MMF_OOM_SKIP, &mm->flags)) - reap = false; - else if (!task_will_free_mem(p)) { - reap = false; - ret = -EINVAL; + if (mmget_not_zero(p->mm)) { + mm = p->mm; + if (task_will_free_mem(p)) + reap = true; + else { + /* Error only if the work has not been done already */ + if (!test_bit(MMF_OOM_SKIP, &mm->flags)) + ret = -EINVAL; + } } task_unlock(p); @@ -1201,7 +1201,8 @@ SYSCALL_DEFINE2(process_mrelease, int, p mmap_read_unlock(mm); drop_mm: - mmdrop(mm); + if (mm) + mmput(mm); put_task: put_task_struct(task); put_pid: From patchwork Thu Oct 28 21:36:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12591229 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org 
(mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 965B4C433FE for ; Thu, 28 Oct 2021 21:36:21 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4A89960FE3 for ; Thu, 28 Oct 2021 21:36:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4A89960FE3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id EC4086B0075; Thu, 28 Oct 2021 17:36:20 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E74106B0078; Thu, 28 Oct 2021 17:36:20 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D3B746B007B; Thu, 28 Oct 2021 17:36:20 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0199.hostedemail.com [216.40.44.199]) by kanga.kvack.org (Postfix) with ESMTP id AFD926B0075 for ; Thu, 28 Oct 2021 17:36:20 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 430D330B48 for ; Thu, 28 Oct 2021 21:36:20 +0000 (UTC) X-FDA: 78747154920.06.AF798F9 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf20.hostedemail.com (Postfix) with ESMTP id 76930D0000B8 for ; Thu, 28 Oct 2021 21:36:13 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 6137360FF2; Thu, 28 Oct 2021 21:36:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1635456978; bh=OjR2vCQN6oERDvMoeORfA7FJfBGqjKGPXWFtPG6j5xM=; h=Date:From:To:Subject:In-Reply-To:From; b=nZB/XIgZuttk3TUtLZnvTRnvujYr1DYoUWxpd+g6SOs74eoTdXtZ9ui8ecIBzLU2m 7iaK/bDygOplYdYIeOgJJEydBl5HimMe7R25D7pmBguCiuZFqBJohzdIhA4iQQ+KAl yXgmGTwPtSU87AGoeE2khgOuW9PwOeDlWCX+LGQ0= Date: Thu, 28 Oct 
2021 14:36:17 -0700 From: Andrew Morton To: akpm@linux-foundation.org, gautham.ananthakrishna@oracle.com, gechangwei@live.cn, ghe@suse.com, jlbec@evilplan.org, joseph.qi@linux.alibaba.com, junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org, piaojun@huawei.com, rajesh.sivaramasubramaniom@oracle.com, stable@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 05/11] ocfs2: fix race between searching chunks and release journal_head from buffer_head Message-ID: <20211028213617.Qwy6hYti3%akpm@linux-foundation.org> In-Reply-To: <20211028143506.5f5d5e2cd1f768a1da864844@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 76930D0000B8 X-Stat-Signature: 11brgpk3u3nqj7w8dxjstaeqz86kgzbu Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="nZB/XIgZ"; dmarc=none; spf=pass (imf20.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1635456973-510991 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Gautham Ananthakrishna Subject: ocfs2: fix race between searching chunks and release journal_head from buffer_head Encountered a race between ocfs2_test_bg_bit_allocatable() and jbd2_journal_put_journal_head() resulting in the below vmcore. 
PID: 106879  TASK: ffff880244ba9c00  CPU: 2  COMMAND: "loop3"
 0 [ffff8802435ff1c0] panic at ffffffff816ed175
 1 [ffff8802435ff240] oops_end at ffffffff8101a7c9
 2 [ffff8802435ff270] no_context at ffffffff8106eccf
 3 [ffff8802435ff2e0] __bad_area_nosemaphore at ffffffff8106ef9d
 4 [ffff8802435ff330] bad_area_nosemaphore at ffffffff8106f143
 5 [ffff8802435ff340] __do_page_fault at ffffffff8106f80b
 6 [ffff8802435ff3a0] do_page_fault at ffffffff8106fc2f
 7 [ffff8802435ff3e0] page_fault at ffffffff816fd667
   [exception RIP: ocfs2_block_group_find_clear_bits+316]
   RIP: ffffffffc11ef6fc  RSP: ffff8802435ff498  RFLAGS: 00010206
   RAX: 0000000000003918  RBX: 0000000000000001  RCX: 0000000000000018
   RDX: 0000000000003918  RSI: 0000000000000000  RDI: ffff880060194040
   RBP: ffff8802435ff4f8   R8: ffffffffff000000   R9: ffffffffffffffff
   R10: ffff8802435ff730  R11: ffff8802a94e5800  R12: 0000000000000007
   R13: 0000000000007e00  R14: 0000000000003918  R15: ffff88017c973a28
   ORIG_RAX: ffffffffffffffff  CS: e030  SS: e02b
 8 [ffff8802435ff490] ocfs2_block_group_find_clear_bits at ffffffffc11ef680 [ocfs2]
 9 [ffff8802435ff500] ocfs2_cluster_group_search at ffffffffc11ef916 [ocfs2]
10 [ffff8802435ff580] ocfs2_search_chain at ffffffffc11f0fb6 [ocfs2]
11 [ffff8802435ff660] ocfs2_claim_suballoc_bits at ffffffffc11f1b1b [ocfs2]
12 [ffff8802435ff6f0] __ocfs2_claim_clusters at ffffffffc11f32cb [ocfs2]
13 [ffff8802435ff770] ocfs2_claim_clusters at ffffffffc11f5caf [ocfs2]
14 [ffff8802435ff780] ocfs2_local_alloc_slide_window at ffffffffc11cc0db [ocfs2]
15 [ffff8802435ff820] ocfs2_reserve_local_alloc_bits at ffffffffc11ce53f [ocfs2]
16 [ffff8802435ff890] ocfs2_reserve_clusters_with_limit at ffffffffc11f59b5 [ocfs2]
17 [ffff8802435ff8e0] ocfs2_reserve_clusters at ffffffffc11f5c88 [ocfs2]
18 [ffff8802435ff8f0] ocfs2_lock_refcount_allocators at ffffffffc11dc169 [ocfs2]
19 [ffff8802435ff960] ocfs2_make_clusters_writable at ffffffffc11e4274 [ocfs2]
20 [ffff8802435ffa50] ocfs2_replace_cow at ffffffffc11e4df1 [ocfs2]
21 [ffff8802435ffac0] ocfs2_refcount_cow at ffffffffc11e54b1 [ocfs2]
22 [ffff8802435ffb80] ocfs2_file_write_iter at ffffffffc11bf8f4 [ocfs2]
23 [ffff8802435ffcd0] lo_rw_aio at ffffffff814a1b5d
24 [ffff8802435ffd80] loop_queue_work at ffffffff814a2802
25 [ffff8802435ffe60] kthread_worker_fn at ffffffff810a80d2
26 [ffff8802435ffec0] kthread at ffffffff810a7afb
27 [ffff8802435fff50] ret_from_fork at ffffffff816f7da1

When ocfs2_test_bg_bit_allocatable() called bh2jh(bg_bh), bg_bh->b_private was NULL because jbd2_journal_put_journal_head() had raced and released the journal head from the buffer head. Take the bit lock for the 'BH_JournalHead' bit to fix this race. Link: https://lkml.kernel.org/r/1634820718-6043-1-git-send-email-gautham.ananthakrishna@oracle.com Signed-off-by: Gautham Ananthakrishna Reviewed-by: Joseph Qi Cc: Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Changwei Ge Cc: Gang He Cc: Jun Piao Cc: Signed-off-by: Andrew Morton --- fs/ocfs2/suballoc.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) --- a/fs/ocfs2/suballoc.c~ocfs2-race-between-searching-chunks-and-release-journal_head-from-buffer_head +++ a/fs/ocfs2/suballoc.c @@ -1251,7 +1251,7 @@ static int ocfs2_test_bg_bit_allocatable { struct ocfs2_group_desc *bg = (struct ocfs2_group_desc *) bg_bh->b_data; struct journal_head *jh; - int ret; + int ret = 1; if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap)) return 0; @@ -1259,14 +1259,18 @@ static int ocfs2_test_bg_bit_allocatable if (!buffer_jbd(bg_bh)) return 1; - jh = bh2jh(bg_bh); - spin_lock(&jh->b_state_lock); - bg = (struct ocfs2_group_desc *) jh->b_committed_data; - if (bg) - ret = !ocfs2_test_bit(nr, (unsigned long
*)bg->bg_bitmap); + else + ret = 1; + spin_unlock(&jh->b_state_lock); + } + jbd_unlock_bh_journal_head(bg_bh); return ret; } From patchwork Thu Oct 28 21:36:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12591231 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7541EC433FE for ; Thu, 28 Oct 2021 21:36:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 28F7E60FE3 for ; Thu, 28 Oct 2021 21:36:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 28F7E60FE3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id C52896B0078; Thu, 28 Oct 2021 17:36:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C01E36B007B; Thu, 28 Oct 2021 17:36:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AF16B6B007D; Thu, 28 Oct 2021 17:36:23 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0110.hostedemail.com [216.40.44.110]) by kanga.kvack.org (Postfix) with ESMTP id 896276B0078 for ; Thu, 28 Oct 2021 17:36:23 -0400 (EDT) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 0C8482D23E for ; Thu, 28 Oct 2021 21:36:23 +0000 (UTC) X-FDA: 78747155046.07.EE3B6E1 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf15.hostedemail.com (Postfix) with ESMTP id 65CA5D0004AC for ; Thu, 28 Oct 2021 21:36:15 +0000 (UTC) Received: by 
mail.kernel.org (Postfix) with ESMTPSA id 8757B61100; Thu, 28 Oct 2021 21:36:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1635456981; bh=IjMgySs6iaDJWIXfdPuYVV+8o6w8AxU6pCSF6E7pZbU=; h=Date:From:To:Subject:In-Reply-To:From; b=Y+ubBOHWBAJ3MqyHZ6igQYUL0wxSU65M7H/03SqKGe055VOsJCjE4dwLdXakT7aC8 7on8tm9IXUJuco3dSAfhrldMIlMWCrhtasivVGZDaA24xuGAMDHsUMYiRbxWK+kUk2 b8UeNAWUFqa5PenBkyXW6mcFDEq0KoykaTfODI/s= Date: Thu, 28 Oct 2021 14:36:21 -0700 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, dvyukov@google.com, jordy@pwning.systems, keescook@chromium.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, rppt@kernel.org, torvalds@linux-foundation.org Subject: [patch 06/11] mm/secretmem: avoid letting secretmem_users drop to zero Message-ID: <20211028213621.YTZcxbpZE%akpm@linux-foundation.org> In-Reply-To: <20211028143506.5f5d5e2cd1f768a1da864844@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Stat-Signature: 6czifwoff7o7fmiunxc6876q87ytbmhq Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Y+ubBOHW; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 65CA5D0004AC X-HE-Tag: 1635456975-772294 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Kees Cook Subject: mm/secretmem: avoid letting secretmem_users drop to zero Quoting Dmitry: "refcount_inc() needs to be done before fd_install(). After fd_install() finishes, the fd can be used by userspace and we can have secret data in memory before the refcount_inc(). 
A straightforward misuse where a user will predict the returned fd in another thread before the syscall returns and will use it to store secret data is somewhat dubious because such a user just shoots themself in the foot. But a more interesting misuse would be to close the predicted fd and decrement the refcount before the corresponding refcount_inc, this way one can briefly drop the refcount to zero while there are other users of secretmem." Move fd_install() after refcount_inc(). Link: https://lkml.kernel.org/r/20211021154046.880251-1-keescook@chromium.org Link: https://lore.kernel.org/lkml/CACT4Y+b1sW6-Hkn8HQYw_SsT7X3tp-CJNh2ci0wG3ZnQz9jjig@mail.gmail.com Fixes: 9a436f8ff631 ("PM: hibernate: disable when there are active secretmem users") Signed-off-by: Kees Cook Reported-by: Dmitry Vyukov Reviewed-by: Dmitry Vyukov Reviewed-by: David Hildenbrand Reviewed-by: Jordy Zomer Cc: Mike Rapoport Signed-off-by: Andrew Morton --- mm/secretmem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/secretmem.c~mm-secretmem-avoid-letting-secretmem_users-drop-to-zero +++ a/mm/secretmem.c @@ -218,8 +218,8 @@ SYSCALL_DEFINE1(memfd_secret, unsigned i file->f_flags |= O_LARGEFILE; - fd_install(fd, file); atomic_inc(&secretmem_users); + fd_install(fd, file); return fd; err_put_fd: From patchwork Thu Oct 28 21:36:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12591233 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A67B2C433F5 for ; Thu, 28 Oct 2021 21:36:27 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 59FBB610E5 for ; Thu, 28 Oct 2021 21:36:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 
59FBB610E5 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 045126B007B; Thu, 28 Oct 2021 17:36:27 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id F372F6B007D; Thu, 28 Oct 2021 17:36:26 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E262C940007; Thu, 28 Oct 2021 17:36:26 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0155.hostedemail.com [216.40.44.155]) by kanga.kvack.org (Postfix) with ESMTP id BE4956B007B for ; Thu, 28 Oct 2021 17:36:26 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 5C6EA30B48 for ; Thu, 28 Oct 2021 21:36:26 +0000 (UTC) X-FDA: 78747155172.09.B04EBF8 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf30.hostedemail.com (Postfix) with ESMTP id 969E0E0019BA for ; Thu, 28 Oct 2021 21:36:14 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 889F1610C8; Thu, 28 Oct 2021 21:36:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1635456984; bh=WVhw8Y976ZTLMdEnvihPl2gK6wkVzy91Q+hCljqUI/4=; h=Date:From:To:Subject:In-Reply-To:From; b=J0NoVrz+HT0Ws9gk0ZrTHSj+YjLqZCqq0wX8TbNznbISt4ZXQI9oC8L24FpZ4Zwhb IQ5ARdkV2OTwPlTJPmhKNCWKaWTYFDms2FuBo5wn3nVhhOdEdPkmVIHQrPxnxFC7Kp S1jAm9j/6Ea8sRAff5wbURcGJvMvU7a743hz5RYA= Date: Thu, 28 Oct 2021 14:36:24 -0700 From: Andrew Morton To: akpm@linux-foundation.org, chenwandun@huawei.com, edumazet@google.com, guohanjun@huawei.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, npiggin@gmail.com, shakeelb@google.com, torvalds@linux-foundation.org, urezki@gmail.com, wangkefeng.wang@huawei.com Subject: [patch 07/11] mm/vmalloc: fix numa spreading 
for large hash tables Message-ID: <20211028213624.ioyXk3qpi%akpm@linux-foundation.org> In-Reply-To: <20211028143506.5f5d5e2cd1f768a1da864844@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Stat-Signature: ztm4xo8eeq973o55tiwh63ngr3nfh8qj X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 969E0E0019BA Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=J0NoVrz+; dmarc=none; spf=pass (imf30.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1635456974-460156 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Chen Wandun Subject: mm/vmalloc: fix numa spreading for large hash tables Eric Dumazet reported strange NUMA spreading in [1] and found that commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings") introduced the issue [2]. Comparing the page allocation path before and after that commit shows the difference: before: alloc_large_system_hash __vmalloc __vmalloc_node(..., NUMA_NO_NODE, ...) __vmalloc_node_range __vmalloc_area_node alloc_page /* nid is NUMA_NO_NODE, so the alloc_page branch is chosen */ alloc_pages_current alloc_page_interleave /* can be confirmed by printing the policy mode */ after: alloc_large_system_hash __vmalloc __vmalloc_node(..., NUMA_NO_NODE, ...) __vmalloc_node_range __vmalloc_area_node alloc_pages_node /* nid is chosen by numa_mem_id() */ __alloc_pages_node(nid, ....) So after commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings"), memory is allocated on the current node instead of being interleaved across nodes.
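The difference between the two paths can be illustrated with a toy userspace model. This is only a sketch under stated assumptions, not kernel code: every toy_* name is hypothetical, the round-robin counter merely stands in for an interleave memory policy, and "local node" stands in for what numa_mem_id() would return.

```c
#include <assert.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical model state; this is not a kernel structure. */
struct toy_numa {
	int nr_nodes;        /* number of memory nodes in the model */
	int local_node;      /* node of the CPU doing the allocation */
	int interleave_next; /* next node for the round-robin policy */
};

/* Models the regressed path: NUMA_NO_NODE collapses to the local node. */
static int toy_pick_node_local(const struct toy_numa *n, int nid)
{
	return nid == NUMA_NO_NODE ? n->local_node : nid;
}

/* Models the fixed path: NUMA_NO_NODE defers to an interleave policy. */
static int toy_pick_node_policy(struct toy_numa *n, int nid)
{
	int node;

	if (nid != NUMA_NO_NODE)
		return nid; /* an explicit node request is always honoured */

	node = n->interleave_next;
	n->interleave_next = (n->interleave_next + 1) % n->nr_nodes;
	return node;
}

/*
 * Simulate nr_pages order-0 allocations and return how many of them
 * landed on 'node'. use_policy selects the fixed (1) or regressed (0)
 * node selection.
 */
static int toy_pages_on_node(int nr_nodes, int local_node, int nid,
			     int nr_pages, int use_policy, int node)
{
	struct toy_numa n = { nr_nodes, local_node, 0 };
	int count = 0;

	assert(nr_nodes > 0 && local_node >= 0 && local_node < nr_nodes);

	for (int i = 0; i < nr_pages; i++) {
		int picked = use_policy ? toy_pick_node_policy(&n, nid)
					: toy_pick_node_local(&n, nid);
		if (picked == node)
			count++;
	}
	return count;
}
```

In this model a NUMA_NO_NODE request puts every page on the local node when the node is resolved up front, and spreads the pages evenly when the choice is left to the policy, which mirrors the before/after behaviour described above.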
[1] https://lore.kernel.org/linux-mm/CANn89iL6AAyWhfxdHO+jaT075iOa3XcYn9k6JJc7JR2XYn6k_Q@mail.gmail.com/ [2] https://lore.kernel.org/linux-mm/CANn89iLofTR=AK-QOZY87RdUZENCZUT4O6a0hvhu3_EwRMerOg@mail.gmail.com/ Link: https://lkml.kernel.org/r/20211021080744.874701-2-chenwandun@huawei.com Fixes: 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings") Signed-off-by: Chen Wandun Reported-by: Eric Dumazet Cc: Shakeel Butt Cc: Nicholas Piggin Cc: Kefeng Wang Cc: Hanjun Guo Cc: Uladzislau Rezki Signed-off-by: Andrew Morton --- mm/vmalloc.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) --- a/mm/vmalloc.c~mm-vmalloc-fix-numa-spreading-for-large-hash-tables +++ a/mm/vmalloc.c @@ -2816,6 +2816,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid, unsigned int order, unsigned int nr_pages, struct page **pages) { unsigned int nr_allocated = 0; + struct page *page; + int i; /* * For order-0 pages we make use of bulk allocator, if @@ -2823,7 +2825,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid, * to fails, fallback to a single page allocator that is * more permissive. */ - if (!order) { + if (!order && nid != NUMA_NO_NODE) { while (nr_allocated < nr_pages) { unsigned int nr, nr_pages_request; @@ -2848,7 +2850,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid, if (nr != nr_pages_request) break; } - } else + } else if (order) /* * Compound pages required for remap_vmalloc_page if * high-order pages. @@ -2856,11 +2858,12 @@ vm_area_alloc_pages(gfp_t gfp, int nid, gfp |= __GFP_COMP; /* High-order pages or fallback path if "bulk" fails. 
*/ - while (nr_allocated < nr_pages) { - struct page *page; - int i; - page = alloc_pages_node(nid, gfp, order); + while (nr_allocated < nr_pages) { + if (nid == NUMA_NO_NODE) + page = alloc_pages(gfp, order); + else + page = alloc_pages_node(nid, gfp, order); if (unlikely(!page)) break; From patchwork Thu Oct 28 21:36:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12591235 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 24253C433EF for ; Thu, 28 Oct 2021 21:36:31 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BC0E2610E5 for ; Thu, 28 Oct 2021 21:36:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org BC0E2610E5 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 65FC06B007E; Thu, 28 Oct 2021 17:36:30 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 60E9E6B0080; Thu, 28 Oct 2021 17:36:30 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4B020940007; Thu, 28 Oct 2021 17:36:30 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0201.hostedemail.com [216.40.44.201]) by kanga.kvack.org (Postfix) with ESMTP id 2528A6B007E for ; Thu, 28 Oct 2021 17:36:30 -0400 (EDT) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id B752E183E3986 for ; Thu, 28 Oct 2021 21:36:29 +0000 (UTC) X-FDA: 78747155298.25.6127A40 Received: from 
mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf10.hostedemail.com (Postfix) with ESMTP id 493606001E56 for ; Thu, 28 Oct 2021 21:36:21 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id C6D0460FE3; Thu, 28 Oct 2021 21:36:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1635456988; bh=aeSrNUgYe3Oyyq9OiXG7KE2T7As8gX/jaK/M4JcUfrg=; h=Date:From:To:Subject:In-Reply-To:From; b=DLAQWprQzdIAs+t5/prJn1PQoiRK1UztNC77SOoNdv0td6MYZWGrRsj5GatqRngX7 3/ILjtbVbmdORpYsL+iXQzpRSt3hIH/88DKTk5jK1iuFkPhnrrKTjnB582pJQHCOBR e06GZB/LxZX8jJqoZFXUEsLXSuxEH1Zu5jXEaS7g= Date: Thu, 28 Oct 2021 14:36:27 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, rongwei.wang@linux.alibaba.com, shy828301@gmail.com, song@kernel.org, stable@vger.kernel.org, torvalds@linux-foundation.org, william.kucharski@oracle.com, willy@infradead.org, xuyu@linux.alibaba.com Subject: [patch 08/11] mm, thp: bail out early in collapse_file for writeback page Message-ID: <20211028213627.gvE1DZ9-z%akpm@linux-foundation.org> In-Reply-To: <20211028143506.5f5d5e2cd1f768a1da864844@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 493606001E56 Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=DLAQWprQ; spf=pass (imf10.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Stat-Signature: 3gj4rs7rsbcgptzh4kszm5tnmsh3jzwd X-Rspamd-Server: rspam06 X-HE-Tag: 1635456981-486057 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Rongwei Wang Subject: mm, thp: bail out early in collapse_file for writeback page Currently collapse_file does not 
explicitly check PG_writeback, instead, page_has_private and try_to_release_page are used to filter writeback pages. This does not work for xfs with blocksize equal to or larger than pagesize, because in such case xfs has no page->private. This makes collapse_file bail out early for writeback page. Otherwise, xfs end_page_writeback will panic as follows. page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:ffff0003f88c86a8 index:0x0 pfn:0x84ef32 aops:xfs_address_space_operations [xfs] ino:30000b7 dentry name:"libtest.so" flags: 0x57fffe0000008027(locked|referenced|uptodate|active|writeback) raw: 57fffe0000008027 ffff80001b48bc28 ffff80001b48bc28 ffff0003f88c86a8 raw: 0000000000000000 0000000000000000 00000000ffffffff ffff0000c3e9a000 page dumped because: VM_BUG_ON_PAGE(((unsigned int) page_ref_count(page) + 127u <= 127u)) page->mem_cgroup:ffff0000c3e9a000 ------------[ cut here ]------------ kernel BUG at include/linux/mm.h:1212! Internal error: Oops - BUG: 0 [#1] SMP Modules linked in: BUG: Bad page state in process khugepaged pfn:84ef32 xfs(E) page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:0 index:0x0 pfn:0x84ef32 libcrc32c(E) rfkill(E) aes_ce_blk(E) crypto_simd(E) ... CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Tainted: ... 
pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--) pc : end_page_writeback+0x1c0/0x214 lr : end_page_writeback+0x1c0/0x214 sp : ffff800011ce3cc0 x29: ffff800011ce3cc0 x28: 0000000000000000 x27: ffff000c04608040 x26: 0000000000000000 x25: ffff000c04608040 x24: 0000000000001000 x23: ffff0003f88c8530 x22: 0000000000001000 x21: ffff0003f88c8530 x20: 0000000000000000 x19: fffffe00201bcc80 x18: 0000000000000030 x17: 0000000000000000 x16: 0000000000000000 x15: ffff000c018f9760 x14: ffffffffffffffff x13: ffff8000119d72b0 x12: ffff8000119d6ee3 x11: ffff8000117b69b8 x10: 00000000ffff8000 x9 : ffff800010617534 x8 : 0000000000000000 x7 : ffff8000114f69b8 x6 : 000000000000000f x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000400 x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: end_page_writeback+0x1c0/0x214 iomap_finish_page_writeback+0x13c/0x204 iomap_finish_ioend+0xe8/0x19c iomap_writepage_end_bio+0x38/0x50 bio_endio+0x168/0x1ec blk_update_request+0x278/0x3f0 blk_mq_end_request+0x34/0x15c virtblk_request_done+0x38/0x74 [virtio_blk] blk_done_softirq+0xc4/0x110 __do_softirq+0x128/0x38c __irq_exit_rcu+0x118/0x150 irq_exit+0x1c/0x30 __handle_domain_irq+0x8c/0xf0 gic_handle_irq+0x84/0x108 el1_irq+0xcc/0x180 arch_cpu_idle+0x18/0x40 default_idle_call+0x4c/0x1a0 cpuidle_idle_call+0x168/0x1e0 do_idle+0xb4/0x104 cpu_startup_entry+0x30/0x9c secondary_start_kernel+0x104/0x180 Code: d4210000 b0006161 910c8021 94013f4d (d4210000) ---[ end trace 4a88c6a074082f8c ]--- Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt Link: https://lkml.kernel.org/r/20211022023052.33114-1-rongwei.wang@linux.alibaba.com Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS") Signed-off-by: Rongwei Wang Signed-off-by: Xu Yu Suggested-by: Yang Shi Reviewed-by: Matthew Wilcox (Oracle) Reviewed-by: Yang Shi Acked-by: Kirill A. 
Shutemov Cc: Song Liu Cc: William Kucharski Cc: Hugh Dickins Cc: Mike Kravetz Cc: Signed-off-by: Andrew Morton --- mm/khugepaged.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) --- a/mm/khugepaged.c~mm-thp-bail-out-early-in-collapse_file-for-writeback-page +++ a/mm/khugepaged.c @@ -1763,6 +1763,10 @@ static void collapse_file(struct mm_stru filemap_flush(mapping); result = SCAN_FAIL; goto xa_unlocked; + } else if (PageWriteback(page)) { + xas_unlock_irq(&xas); + result = SCAN_FAIL; + goto xa_unlocked; } else if (trylock_page(page)) { get_page(page); xas_unlock_irq(&xas); @@ -1798,7 +1802,8 @@ static void collapse_file(struct mm_stru goto out_unlock; } - if (!is_shmem && PageDirty(page)) { + if (!is_shmem && (PageDirty(page) || + PageWriteback(page))) { /* * khugepaged only works on read-only fd, so this * page is dirty because it hasn't been flushed From patchwork Thu Oct 28 21:36:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12591237 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F0B0C433FE for ; Thu, 28 Oct 2021 21:36:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 15B33610CA for ; Thu, 28 Oct 2021 21:36:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 15B33610CA Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id B40F86B0081; Thu, 28 Oct 2021 17:36:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id ACA42940007; Thu, 28 Oct 2021 17:36:33 -0400 (EDT) X-Delivered-To: 
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 96B486B0083; Thu, 28 Oct 2021 17:36:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0116.hostedemail.com [216.40.44.116]) by kanga.kvack.org (Postfix) with ESMTP id 6F2A36B0081 for ; Thu, 28 Oct 2021 17:36:33 -0400 (EDT) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 029181842B101 for ; Thu, 28 Oct 2021 21:36:33 +0000 (UTC) X-FDA: 78747155466.27.A85C60C Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf23.hostedemail.com (Postfix) with ESMTP id 8E6A3900038E for ; Thu, 28 Oct 2021 21:36:23 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 18571610E7; Thu, 28 Oct 2021 21:36:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1635456991; bh=JirqGpfpO/qxbnSaugdDW0snCqT4yJU1fdf6OO76ogw=; h=Date:From:To:Subject:In-Reply-To:From; b=0NwSL3j0dMAyVzWv1O9sb9AlC9NOKG0rddvU+x36ION1TylEa6Cub4nlcdANOOe2R 0MRnnzm2vvI0lLtCndhT+MdAyhDqadG+pipJ4E34l1c2MCt9pdbcouEVkNi1PqKjv+ 839stn4Z8cM8reBABe5JKJMhL0Mm7pBywrR6YjWU= Date: Thu, 28 Oct 2021 14:36:30 -0700 From: Andrew Morton To: akpm@linux-foundation.org, andrea.righi@canonical.com, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, shy828301@gmail.com, songliubraving@fb.com, stable@vger.kernel.org, sunhao.th@gmail.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 09/11] mm: khugepaged: skip huge page collapse for special files Message-ID: <20211028213630.X6Y5NAeme%akpm@linux-foundation.org> In-Reply-To: <20211028143506.5f5d5e2cd1f768a1da864844@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 8E6A3900038E X-Stat-Signature: prckrqf3ifqbrp943pc87x6pej3tdrge Authentication-Results: imf23.hostedemail.com; dkim=pass 
header.d=linux-foundation.org header.s=korg header.b=0NwSL3j0; dmarc=none; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1635456983-751356 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yang Shi Subject: mm: khugepaged: skip huge page collapse for special files The read-only THP for filesystems will collapse THP for files opened readonly and mapped with VM_EXEC. The intended usecase is to avoid TLB misses for large text segments. But it doesn't restrict the file types so a THP could be collapsed for a non-regular file, for example, block device, if it is opened readonly and mapped with EXEC permission. This may cause bugs, like [1] and [2]. This is definitely not the intended usecase, so just collapse THP for regular files in order to close the attack surface. [1] https://lore.kernel.org/lkml/CACkBjsYwLYLRmX8GpsDpMthagWOjWWrNxqY6ZLNQVr6yx+f5vA@mail.gmail.com/ [2] https://lore.kernel.org/linux-mm/000000000000c6a82505ce284e4c@google.com/ [shy828301@gmail.com: fix vm_file check] Link: https://lkml.kernel.org/r/CAHbLzkqTW9U3VvTu1Ki5v_cLRC9gHW+znBukg_ycergE0JWj-A@mail.gmail.com Link: https://lkml.kernel.org/r/20211027195221.3825-1-shy828301@gmail.com Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS") Signed-off-by: Hugh Dickins Signed-off-by: Yang Shi Reported-by: Hao Sun Reported-by: syzbot+aae069be1de40fb11825@syzkaller.appspotmail.com Cc: Matthew Wilcox Cc: Kirill A. 
Shutemov Cc: Song Liu Cc: Andrea Righi Cc: Signed-off-by: Andrew Morton --- mm/khugepaged.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) --- a/mm/khugepaged.c~mm-khugepaged-skip-huge-page-collapse-for-special-files +++ a/mm/khugepaged.c @@ -445,22 +445,25 @@ static bool hugepage_vma_check(struct vm if (!transhuge_vma_enabled(vma, vm_flags)) return false; + if (vma->vm_file && !IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - + vma->vm_pgoff, HPAGE_PMD_NR)) + return false; + /* Enabled via shmem mount options or sysfs settings. */ - if (shmem_file(vma->vm_file) && shmem_huge_enabled(vma)) { - return IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff, - HPAGE_PMD_NR); - } + if (shmem_file(vma->vm_file)) + return shmem_huge_enabled(vma); /* THP settings require madvise. */ if (!(vm_flags & VM_HUGEPAGE) && !khugepaged_always()) return false; - /* Read-only file mappings need to be aligned for THP to work. */ + /* Only regular file is valid */ if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file && - !inode_is_open_for_write(vma->vm_file->f_inode) && (vm_flags & VM_EXEC)) { - return IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff, - HPAGE_PMD_NR); + struct inode *inode = vma->vm_file->f_inode; + + return !inode_is_open_for_write(inode) && + S_ISREG(inode->i_mode); } if (!vma->anon_vma || vma->vm_ops) From patchwork Thu Oct 28 21:36:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12591239 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA86CC433FE for ; Thu, 28 Oct 2021 21:36:36 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8D2F7610CA for ; Thu, 28 Oct 2021 21:36:36 +0000 
(UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 8D2F7610CA Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 28F8A6B0083; Thu, 28 Oct 2021 17:36:36 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 23E2C6B0085; Thu, 28 Oct 2021 17:36:36 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 106986B0087; Thu, 28 Oct 2021 17:36:36 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0018.hostedemail.com [216.40.44.18]) by kanga.kvack.org (Postfix) with ESMTP id DA9136B0083 for ; Thu, 28 Oct 2021 17:36:35 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 7A2ED8249980 for ; Thu, 28 Oct 2021 21:36:35 +0000 (UTC) X-FDA: 78747155592.01.4B3CA69 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf24.hostedemail.com (Postfix) with ESMTP id 22AF2B00081F for ; Thu, 28 Oct 2021 21:36:35 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 270DB610D2; Thu, 28 Oct 2021 21:36:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1635456994; bh=X6fs3f2fSOBDq+IJaZt2fhaL5UXblQ955KaLJm/Tiuc=; h=Date:From:To:Subject:In-Reply-To:From; b=Ofseem+jB3XecyPHSmkd5sZ3dV1NoNSEGSb6Rf70npQDCpugki2qgB31JizKr8doJ QQRd1mCdejtCE1N8N3bbzJZsl8cPahQXs+bECNX8jxnWuzHgvD+ubqnZvcp/ddA/hP GKLLJ7k7Foda+pEHcXeFhES1ADGRm7Q6D31VGq9U= Date: Thu, 28 Oct 2021 14:36:33 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 10/11] mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()' Message-ID: 
<20211028213633.VtZv_vHJo%akpm@linux-foundation.org> In-Reply-To: <20211028143506.5f5d5e2cd1f768a1da864844@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 22AF2B00081F X-Stat-Signature: 34jaut4seoow634afd79ykobbcx9sup9 Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Ofseem+j; dmarc=none; spf=pass (imf24.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1635456995-995310 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()' The KUnit test cases for 'damon_split_regions_of()' expect the number of regions after calling the function to equal the requested number ('nr_sub'). However, the requested number is only an upper limit, because the function randomly decides the size of each sub-region. This commit fixes the wrong expectation.
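Why the request is only an upper limit can be seen in a simplified userspace model of random splitting. This is a sketch under stated assumptions rather than the kernel's implementation: the toy_* names, the small LCG used as a random source, and the "stop once the region is too small" rule are all chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Tiny deterministic LCG standing in for a kernel random source. */
static uint32_t toy_next(uint32_t *state)
{
	*state = *state * 1664525u + 1013904223u;
	return *state >> 16;
}

/* Returns a pseudo-random value in [lo, hi). */
static unsigned long toy_rand_range(uint32_t *state, unsigned long lo,
				    unsigned long hi)
{
	return lo + toy_next(state) % (hi - lo);
}

/*
 * Model of splitting one region of 'size' units into at most 'nr_subs'
 * sub-regions of random size. Returns the number of regions afterwards.
 */
static unsigned int toy_split_count(unsigned long size, unsigned int nr_subs,
				    unsigned long min_region, uint32_t seed)
{
	unsigned long sz_region = size;
	unsigned int count = 1;

	assert(min_region > 0 && nr_subs > 0);

	for (unsigned int i = 0;
	     i + 1 < nr_subs && sz_region > 2 * min_region; i++) {
		/* Pick 10%..90% of the current region, aligned down. */
		unsigned long sz_sub =
			toy_rand_range(&seed, 1, 10) * sz_region / 10;

		sz_sub -= sz_sub % min_region;
		if (sz_sub == 0 || sz_sub >= sz_region)
			continue; /* attempt is used up, but nothing is split */

		count++;
		sz_region = sz_sub; /* keep splitting the left part */
	}
	return count;
}

/* Returns 1 if 1 <= count <= nr_subs holds for seeds 0..nr_seeds-1. */
static int toy_bound_holds(unsigned long size, unsigned int nr_subs,
			   unsigned long min_region, uint32_t nr_seeds)
{
	for (uint32_t seed = 0; seed < nr_seeds; seed++) {
		unsigned int n = toy_split_count(size, nr_subs,
						 min_region, seed);
		if (n < 1 || n > nr_subs)
			return 0;
	}
	return 1;
}
```

Because a randomly chosen piece can become too small to split again, the loop may end before nr_subs regions exist, so only a "less than or equal" check is safe in this model.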
Link: https://lkml.kernel.org/r/20211028090628.14948-1-sj@kernel.org Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests") Signed-off-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/core-test.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/mm/damon/core-test.h~mm-damon-core-test-fix-wrong-expectations-for-damon_split_regions_of +++ a/mm/damon/core-test.h @@ -219,14 +219,14 @@ static void damon_test_split_regions_of( r = damon_new_region(0, 22); damon_add_region(r, t); damon_split_regions_of(c, t, 2); - KUNIT_EXPECT_EQ(test, damon_nr_regions(t), 2u); + KUNIT_EXPECT_LE(test, damon_nr_regions(t), 2u); damon_free_target(t); t = damon_new_target(42); r = damon_new_region(0, 220); damon_add_region(r, t); damon_split_regions_of(c, t, 4); - KUNIT_EXPECT_EQ(test, damon_nr_regions(t), 4u); + KUNIT_EXPECT_LE(test, damon_nr_regions(t), 4u); damon_free_target(t); damon_destroy_ctx(c); } From patchwork Thu Oct 28 21:36:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12591241 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 044FCC433EF for ; Thu, 28 Oct 2021 21:36:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id AEE16610C8 for ; Thu, 28 Oct 2021 21:36:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org AEE16610C8 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 502106B0087; Thu, 28 Oct 2021 17:36:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 48AAB6B0088; Thu, 28 Oct 2021 17:36:39 -0400 (EDT) 
X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2DB106B0089; Thu, 28 Oct 2021 17:36:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0046.hostedemail.com [216.40.44.46]) by kanga.kvack.org (Postfix) with ESMTP id 038416B0087 for ; Thu, 28 Oct 2021 17:36:38 -0400 (EDT) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 7E1E28249980 for ; Thu, 28 Oct 2021 21:36:38 +0000 (UTC) X-FDA: 78747155676.19.4321E48 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf13.hostedemail.com (Postfix) with ESMTP id D7CDF1049B49 for ; Thu, 28 Oct 2021 21:36:31 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 0FC14610E7; Thu, 28 Oct 2021 21:36:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1635456997; bh=uVXX96cpryr4jYBl0p51ywZQ2aOWyCxUngA8ubIxcxU=; h=Date:From:To:Subject:In-Reply-To:From; b=SpB1huqAqkpz9HwrImHJg9jZTsqnAi7c8pE4mG9AsPn3uwJazUqHLKufoqaDESAqI Zvd/Lt4Sx/P3FREid55dt+gd/gp7Z4zHq+MK5XEUyurHUbcJM0WYeaoCYaql/kPq/5 xIuQuIK0o9/A1fEt6Ti9c4maV4oqdQxkk4vAavEQ= Date: Thu, 28 Oct 2021 14:36:36 -0700 From: Andrew Morton To: akpm@linux-foundation.org, davidcomponentone@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, shuah@kernel.org, torvalds@linux-foundation.org, zealci@zte.com.cn, ziy@nvidia.com Subject: [patch 11/11] tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer Message-ID: <20211028213636.ffTjwzUhv%akpm@linux-foundation.org> In-Reply-To: <20211028143506.5f5d5e2cd1f768a1da864844@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: D7CDF1049B49 X-Stat-Signature: j7jo8f3nj4jrmndaft4tpy5p7zsokbpy Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=SpB1huqA; 
dmarc=none; spf=pass (imf13.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1635456991-222785 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Yang Subject: tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer The coccinelle check reports: "./tools/testing/selftests/vm/split_huge_page_test.c:344:36-42: ERROR: application of sizeof to pointer" Use strlen() to fix it, since sizeof applied to a pointer yields the size of the pointer rather than the length of the string. Link: https://lkml.kernel.org/r/20211012030116.184027-1-davidcomponentone@gmail.com Signed-off-by: David Yang Reported-by: Zeal Robot Cc: Zi Yan Cc: Shuah Khan Signed-off-by: Andrew Morton --- tools/testing/selftests/vm/split_huge_page_test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/tools/testing/selftests/vm/split_huge_page_test.c~fix-application-of-sizeof-to-pointer +++ a/tools/testing/selftests/vm/split_huge_page_test.c @@ -341,7 +341,7 @@ void split_file_backed_thp(void) } /* write something to the file, so a file-backed THP can be allocated */ - num_written = write(fd, tmpfs_loc, sizeof(tmpfs_loc)); + num_written = write(fd, tmpfs_loc, strlen(tmpfs_loc) + 1); close(fd); if (num_written < 1) {