From patchwork Mon Oct 1 18:56:25 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Bringmann
X-Patchwork-Id: 10622655
Subject: [PATCH] migration/mm: Add WARN_ON to try_offline_node
From: Michael Bringmann
To: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, mwb@linux.vnet.ibm.com
Cc: Michael Ellerman, Nathan Fontenot, Nicholas Piggin, Kees Cook,
 Thiago Jung Bauermann, Russell Currey, Mauricio Faria de Oliveira,
 Christophe Leroy, Andrew Morton, Michal Hocko, Pavel Tatashin,
 Dan Williams, Oscar Salvador, YASUAKI ISHIMATSU, Mathieu Malaterre,
 Juliet Kim, Tyrel Datwyler, Thomas Falcon
Date: Mon, 01 Oct 2018 13:56:25 -0500
User-Agent: StGit/0.18-105-g416a
Message-Id: <20181001185616.11427.35521.stgit@ltcalpine2-lp9.aus.stglabs.ibm.com>

In some LPAR migration scenarios, device-tree modifications are
made to the affinity of the memory in the system.  For instance,
memory may be installed on nodes 0,3 on a source system and on
nodes 0,2 on a target system.  Node 2 may not have been
initialized/allocated on the target system.

After migration, if an RTAS PRRN memory remove is made to a
memory block that was in node 3 on the source system, then
try_offline_node tries to remove it from node 2 on the target.
The NODE_DATA(2) block would not be initialized on the target,
and there is no validation check in the current code to prevent
the use of a NULL pointer.

A similar problem of moving memory to an uninitialized node has
also been observed on systems where multiple PRRN events occur
prior to a complete update of the device-tree.

Call traces such as the following may be observed:

pseries-hotplug-mem: Attempting to update LMB, drc index 80000002
Offlined Pages 4096
...
Oops: Kernel access of bad area, sig: 11 [#1]
...
Workqueue: pseries hotplug workque pseries_hp_work_fn
...
NIP [c0000000002bc088] try_offline_node+0x48/0x1e0
LR [c0000000002e0b84] remove_memory+0xb4/0xf0
Call Trace:
[c0000002bbee7a30] [c0000002bbee7a70] 0xc0000002bbee7a70 (unreliable)
[c0000002bbee7a70] [c0000000002e0b84] remove_memory+0xb4/0xf0
[c0000002bbee7ab0] [c000000000097784] dlpar_remove_lmb+0xb4/0x160
[c0000002bbee7af0] [c000000000097f38] dlpar_memory+0x328/0xcb0
[c0000002bbee7ba0] [c0000000000906d0] handle_dlpar_errorlog+0xc0/0x130
[c0000002bbee7c10] [c0000000000907d4] pseries_hp_work_fn+0x94/0xa0
[c0000002bbee7c40] [c0000000000e1cd0] process_one_work+0x1a0/0x4e0
[c0000002bbee7cd0] [c0000000000e21b0] worker_thread+0x1a0/0x610
[c0000002bbee7d80] [c0000000000ea458] kthread+0x128/0x150
[c0000002bbee7e30] [c00000000000982c] ret_from_kernel_thread+0x5c/0xb0

This patch adds a check for an uninitialized node (a NULL
NODE_DATA(nid) pointer) at the beginning of try_offline_node.
If the check fails, the routine warns and exits.

Another patch is being developed for powerpc to track the node ID
to which an LMB belongs, so that we can remove the LMB from there
instead of from the nid as currently interpreted from the device
tree.

Signed-off-by: Michael Bringmann
Reviewed-by: Kees Cook
---
 mm/memory_hotplug.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 38d94b7..e48a4d0 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1831,10 +1831,16 @@ static int check_and_unmap_cpu_on_node(pg_data_t *pgdat)
 void try_offline_node(int nid)
 {
 	pg_data_t *pgdat = NODE_DATA(nid);
-	unsigned long start_pfn = pgdat->node_start_pfn;
-	unsigned long end_pfn = start_pfn + pgdat->node_spanned_pages;
+	unsigned long start_pfn;
+	unsigned long end_pfn;
 	unsigned long pfn;
 
+	if (WARN_ON(pgdat == NULL))
+		return;
+
+	start_pfn = pgdat->node_start_pfn;
+	end_pfn = start_pfn + pgdat->node_spanned_pages;
+
 	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
 		unsigned long section_nr = pfn_to_section_nr(pfn);
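
For readers who want to see the failure mode outside the kernel, the
following is a minimal userspace sketch of the guard this patch adds.
It is not kernel code. The names struct pgdat_sketch, node_data,
warn_on, MAX_NODES and try_offline_node_sketch are hypothetical
stand-ins for pg_data_t, NODE_DATA(), WARN_ON() and try_offline_node,
and the sketch returns an int only so that the outcome can be checked,
whereas the kernel function returns void. The node table assumes a
target system on which node 0 is initialized and node 2 is not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_NODES 4	/* hypothetical; stands in for the kernel's node limit */

/* Hypothetical, heavily simplified stand-in for pg_data_t. */
struct pgdat_sketch {
	unsigned long node_start_pfn;
	unsigned long node_spanned_pages;
};

static struct pgdat_sketch node0 = { 0UL, 4096UL };

/* Only node 0 is initialized; node 2 is named by the device tree but NULL. */
static struct pgdat_sketch *node_data[MAX_NODES] = { &node0, NULL, NULL, NULL };

/* Stand-in for WARN_ON(): report the condition and hand it back. */
static int warn_on(int cond, const char *expr)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", expr);
	return cond;
}

/*
 * Returns 0 if the node's PFN range could be read, -1 if the node has
 * no data. The int return value is for illustration only.
 */
int try_offline_node_sketch(int nid)
{
	struct pgdat_sketch *pgdat;
	unsigned long start_pfn;
	unsigned long end_pfn;

	assert(nid >= 0 && nid < MAX_NODES);
	pgdat = node_data[nid];

	/* Check before the first dereference, as the patch does. */
	if (warn_on(pgdat == NULL, "pgdat == NULL"))
		return -1;

	start_pfn = pgdat->node_start_pfn;
	end_pfn = start_pfn + pgdat->node_spanned_pages;
	printf("node %d spans pfn [%lu, %lu)\n", nid, start_pfn, end_pfn);
	return 0;
}
```

In this sketch, try_offline_node_sketch(2) prints a warning and returns
-1 instead of dereferencing a NULL pointer, while
try_offline_node_sketch(0) reads the range and returns 0. Without the
check, the two assignments from pgdat would be the point of the fault,
which matches the ordering problem the diff fixes by moving them below
the WARN_ON.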