From patchwork Wed Aug 3 02:52:43 2022
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 12935137
From: Yin Fengwei
To: linux-mm@kvack.org, naoya.horiguchi@nec.com, linmiaohe@huawei.com,
 willy@infradead.org
Cc: aaron.lu@intel.com, tony.luck@intel.com, qiuxu.zhuo@intel.com,
 fengwei.yin@intel.com
Subject: [RFC PATCH] mm/memory-failure: release private data before split THP
Date: Wed, 3 Aug 2022 10:52:43 +0800
Message-Id: <20220803025243.155798-1-fengwei.yin@intel.com>

If private data is attached to a THP, the refcount of the THP is
raised, which blocks the THP split. That in turn can leave the memory
failure unrecovered.

Release the private data attached to the THP before splitting it, to
increase the chance that the split succeeds.

The issue was hit during HW error injection testing with a 5.18 kernel
and xfs as rootfs: the test got killed and a system reboot was required
to re-run it. The issue was tracked down to a THP split failure that
left the memory failure unhandled.

The page dump showed:

[ 1785.433075] page:0000000025f9530b refcount:18 mapcount:0 mapping:000000008162eea7 index:0xa10 pfn:0x2f0200
[ 1785.443954] head:0000000025f9530b order:4 compound_mapcount:0 compound_pincount:0
[ 1785.452408] memcg:ff4247f2d28e9000
[ 1785.456304] aops:xfs_address_space_operations ino:8555182 dentry name:"baseos-filenames.solvx"
[ 1785.466612] flags: 0x1000000000012036(referenced|uptodate|lru|active|private|head|node=0|zone=2)
[ 1785.476514] raw: 1000000000012036 ffb9460f8bc07c08 ffb9460f8bc08408 ff4247f22e6299f8
[ 1785.485268] raw: 0000000000000a10 ff4247f194ade900 00000012ffffffff ff4247f2d28e9000

The dump suggests that the error was injected into a large xfs folio
with private data attached (the "private" flag is set).

With the private data released before the THP split, the test case
could be run successfully many times without rebooting the system.

Signed-off-by: Yin Fengwei
Reviewed-by: Aaron Lu
Reviewed-by: Naoya Horiguchi
---
 mm/memory-failure.c | 9 +++++++++
 1 file changed, 9 insertions(+)

base-commit: 9de1f9c8ca5100a02a2e271bdbde36202e251b4b

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index da39ec8afca8..08e21973b120 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1484,7 +1484,16 @@ static int identify_page_state(unsigned long pfn, struct page *p,
 
 static int try_to_split_thp_page(struct page *page, const char *msg)
 {
+	struct page *head = compound_head(page);
+
 	lock_page(page);
+	/*
+	 * If thp page has private data attached, thp split will fail.
+	 * Release private data before split thp.
+	 */
+	if (page_has_private(head))
+		try_to_release_page(head, GFP_KERNEL);
+
 	if (unlikely(split_huge_page(page))) {
 		unsigned long pfn = page_to_pfn(page);
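
Note on the refcount arithmetic: the numbers in the page dump (order:4,
refcount:18, mapcount:0) can be replayed in a small user-space model.
This is only a sketch, not kernel code. All model_* names are
hypothetical, and the split condition used here (mapcount must equal
refcount minus one page-cache reference per base page minus the
caller's own pin) is an assumption about how the split check counts
references for a page-cache THP. It also assumes that attaching private
data holds exactly one extra reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a page-cache THP; not kernel code. */
struct model_folio {
	unsigned int order;	/* folio spans 1 << order base pages */
	int refcount;
	int mapcount;
	bool has_private;	/* filesystem private data attached */
};

/* Assumption: the page cache holds one reference per base page. */
static int model_extra_pins(const struct model_folio *f)
{
	return 1 << f->order;
}

/* Assumption: a split needs the caller's pin to be the only other ref. */
static bool model_can_split(const struct model_folio *f)
{
	return f->mapcount == f->refcount - model_extra_pins(f) - 1;
}

/* Assumption: releasing private data drops the one reference it held. */
static bool model_release_private(struct model_folio *f)
{
	if (!f->has_private)
		return false;
	f->has_private = false;
	f->refcount--;
	return true;
}

/* Replay the dump: order:4, refcount:18, mapcount:0, private set. */
static bool model_replay_dump(void)
{
	struct model_folio f = {
		.order = 4, .refcount = 18, .mapcount = 0, .has_private = true,
	};

	assert(!model_can_split(&f));	/* 18 - 16 - 1 = 1, not 0 */
	model_release_private(&f);
	return model_can_split(&f);	/* 17 - 16 - 1 = 0 */
}
```

Under these assumptions, the folio in the dump carries exactly one
reference more than the split tolerates, and dropping the private data
removes it. In the real kernel the release can fail, which is why the
commit message speaks only of increasing the chance of a successful
split.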