From patchwork Thu Nov 1 09:10:55 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Baoquan He <bhe@redhat.com>
X-Patchwork-Id: 10663703
Date: Thu, 1 Nov 2018 17:10:55 +0800
From: Baoquan He <bhe@redhat.com>
To: mhocko@suse.com, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Memory hotplug failed to offline on bare metal system of multiple nodes
Message-ID: <20181101091055.GA15166@MiWiFi-R3L-srv>
Content-Disposition: inline
User-Agent: Mutt/1.9.1 (2017-09-22)

Hi,

A hot removal failure was hit on a bare metal system with 8 nodes, where
node1~7 are all hotpluggable and 'movable_node' is set. When checking the
value of /sys/devices/system/node/node1/memory*/removable, some of them
are 0, namely un-removable, and a back trace is always seen.

Bisecting points at the culprit commit:

  15c30bc09085 ("mm, memory_hotplug: make has_unmovable_pages more robust")

Reverting it fixes the failure, and node1~7 can be hot removed and hot
added again. From the log of commit 15c30bc09085, it is meant to fix a
movablecore setting issue, in which node_data is allocated first in
initmem_init() and we then try to mark it as movable in mm_init(). We may
need to think further about how to fix that issue without breaking bare
metal systems.

I haven't figured out why the above commit causes those memory blocks in
the MOVABLE zone to be not removable. Still checking.

The tested revert patch is attached in this mail.

Thanks
Baoquan

From 6644aefdf0f2499f7c7c3f30c7c31e791fe3c05a Mon Sep 17 00:00:00 2001
From: Baoquan He <bhe@redhat.com>
Date: Thu, 1 Nov 2018 11:52:41 +0800
Subject: [PATCH] mm, memory_hotplug: memory block failed to offline

On bare metal with multiple nodes, hot removing a memory board fails on
the hotpluggable nodes since some memory blocks can't be offlined.
Checking the node memory attributes shows that not all memory blocks are
removable even though they are in the MOVABLE zone. The trace below is
always triggered by the check.

CPU: 60 PID: 4944 Comm: cat Not tainted 4.19.0+ #1
Hardware name: 9008/IT91SMUB, BIOS BLXSV512 03/22/2018
RIP: 0010:has_unmovable_pages+0x154/0x170
Code: 98 49 09 c5 eb c8 8b 43 30 25 80 00 00 f0 3d 00 00 00 f0 75 b9 48 8b 4b 28 b8 01 00 00 00 d3 e0 83 e8 01 48 98 49 01 c5 eb a4 <0f> 0b e9 49 ff ff ff 31 c0 e9 42 ff ff ff 0f 1f 40 00 66 2e 0f 1f
RSP: 0018:ffffc9000e6d3d48 EFLAGS: 00010246
RAX: 0000000000000001 RBX: ffffea043fbda2c0 RCX: 0000000000000000
RDX: dead0000000000ff RSI: 0000000010fef600 RDI: ffffea043fbda2c0
RBP: 0000000010fef600 R08: 0000000000000001 R09: ffff880e4c4918c0
R10: ffff880e5a4a3d40 R11: 0000000000000001 R12: 0000000000001140
R13: 000000000000008b R14: 0000000000000001 R15: 0000000000000001
FS:  00007f97b6805540(0000) GS:ffff880e7d980000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000557f1bf44000 CR3: 0000000e27c06001 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 is_mem_section_removable+0x76/0x100
 show_mem_removable+0x6e/0xa0
 dev_attr_show+0x1c/0x40
 sysfs_kf_seq_show+0x9f/0x120
 seq_read+0x153/0x410
 __vfs_read+0x36/0x190
 vfs_read+0x8a/0x140
 ksys_read+0x4f/0xb0
 do_syscall_64+0x55/0x1a0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f97b630d2a5
Code: fe ff ff 50 48 8d 3d 02 df 09 00 e8 75 11 02 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 75 64 2d 00 8b 00 85 c0 75 0f 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 53 c3 66 90 41 54 49 89 d4 55 48 89 f5 53 89
RSP: 002b:00007ffc8ce8e448 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f97b630d2a5
RDX: 0000000000020000 RSI: 0000557f1bf44000 RDI: 0000000000000003
RBP: 0000557f1bf44000 R08: 0000000000000003 R09: 000000000000007b
R10: 0000557f1bf3e010 R11: 0000000000000246 R12: 0000557f1bf44000
R13: 0000000000000003 R14: 0000000000000fff R15: 0000000000020000
---[ end trace aa042f77d15c548c ]---

Bisecting
points to the commit below as the culprit:

  15c30bc09085 ("mm, memory_hotplug: make has_unmovable_pages more robust")

Reverting it fixes the offline failure, and hot removal also succeeds.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/page_alloc.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a919ba5..b48b5eb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7760,12 +7760,11 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 	unsigned long pfn, iter, found;
 
 	/*
-	 * TODO we could make this much more efficient by not checking every
-	 * page in the range if we know all of them are in MOVABLE_ZONE and
-	 * that the movable zone guarantees that pages are migratable but
-	 * the later is not the case right now unfortunatelly. E.g. movablecore
-	 * can still lead to having bootmem allocations in zone_movable.
+	 * For avoiding noise data, lru_add_drain_all() should be called
+	 * If ZONE_MOVABLE, the zone never contains unmovable pages
 	 */
+	if (zone_idx(zone) == ZONE_MOVABLE)
+		return false;
 
 	/*
 	 * CMA allocations (alloc_contig_range) really need to mark isolate
@@ -7786,7 +7785,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 		page = pfn_to_page(check);
 
 		if (PageReserved(page))
-			goto unmovable;
+			return true;
 
 		/*
 		 * Hugepages are not in LRU lists, but they're movable.
@@ -7796,7 +7795,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 		if (PageHuge(page)) {
 
 			if (!hugepage_migration_supported(page_hstate(page)))
-				goto unmovable;
+				return true;
 
 			iter = round_up(iter + 1, 1<[...] count)
-			goto unmovable;
+			return true;
 	}
 	return false;
-unmovable:
-	WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE);
-	return true;
 }
 
 #if (defined(CONFIG_MEMORY_ISOLATION) && defined(CONFIG_COMPACTION)) || defined(CONFIG_CMA)
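
For anyone who wants to repeat the removable check described in this mail, a minimal shell sketch follows. It assumes the sysfs layout quoted above (/sys/devices/system/node/nodeN/memory*/removable); the function name list_unremovable is hypothetical and only for illustration. On an affected kernel, reading these files may be what triggers the warning shown in the trace.

```shell
#!/bin/sh
# Sketch only: list memory blocks under one node directory whose
# 'removable' attribute reads 0. 'list_unremovable' is a hypothetical
# helper name, not an existing tool.
# Usage: list_unremovable /sys/devices/system/node/node1
list_unremovable() {
    node_dir=$1
    for f in "$node_dir"/memory*/removable; do
        # Skip when the glob matched nothing or the file is unreadable.
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" = "0" ]; then
            # Print the block directory name, e.g. "memory42".
            basename "$(dirname "$f")"
        fi
    done
}
```

Calling list_unremovable on node1 through node7 should print nothing when every block in the node reports itself as removable.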