From patchwork Thu Jul 12 23:52:40 2018
X-Patchwork-Submitter: Baoquan He
X-Patchwork-Id: 10522429
Date: Fri, 13 Jul 2018 07:52:40 +0800
From: Baoquan He
To: Michal Hocko
Cc: Chao Fan, Dou Liyang, akpm@linux-foundation.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, x86@kernel.org, yasu.isimatu@gmail.com,
 keescook@chromium.org,
 indou.takao@jp.fujitsu.com, caoj.fnst@cn.fujitsu.com, vbabka@suse.cz,
 mgorman@techsingularity.net
Subject: Re: Bug report about KASLR and ZONE_MOVABLE
Message-ID: <20180712235240.GH2070@MiWiFi-R3L-srv>
References: <20180711094244.GA2019@localhost.localdomain>
 <20180711104158.GE2070@MiWiFi-R3L-srv>
 <20180711104944.GG1969@MiWiFi-R3L-srv>
 <20180711124008.GF2070@MiWiFi-R3L-srv>
 <72721138-ba6a-32c9-3489-f2060f40a4c9@cn.fujitsu.com>
 <20180712060115.GD6742@localhost.localdomain>
 <20180712123228.GK32648@dhcp22.suse.cz>
In-Reply-To: <20180712123228.GK32648@dhcp22.suse.cz>

Hi Michal,

On 07/12/18 at 02:32pm, Michal Hocko wrote:
> On Thu 12-07-18 14:01:15, Chao Fan wrote:
> > On Thu, Jul 12, 2018 at 01:49:49PM +0800, Dou Liyang wrote:
> > >Hi Baoquan,
> > >
> > >At 07/11/2018 08:40 PM, Baoquan He wrote:
> > >> Please try this v3 patch:
> > >> >>From 9850d3de9c02e570dc7572069a9749a8add4c4c7 Mon Sep 17 00:00:00 2001
> > >> From: Baoquan He
> > >> Date: Wed, 11 Jul 2018 20:31:51 +0800
> > >> Subject: [PATCH v3] mm, page_alloc: find movable zone after kernel text
> > >>
> > >> In find_zone_movable_pfns_for_nodes(), when try to find the starting
> > >> PFN movable zone begins in each node, kernel text position is not
> > >> considered.
> > >> KASLR may put kernel after which movable zone begins.
> > >>
> > >> Fix it by finding movable zone after kernel text on that node.
> > >>
> > >> Signed-off-by: Baoquan He
> > >
> > >
> > >You fix this in the _zone_init side_. This may make the 'kernelcore=' or
> > >'movablecore=' failed if the KASLR puts the kernel back the tail of the
> > >last node, or more.
> >
> > I think it may not fail.
> > There is a 'restart' to do another pass.
> >
> > >
> > >Due to we have fix the mirror memory in KASLR side, and Chao is trying
> > >to fix the 'movable_node' in KASLR side. Have you had a chance to fix
> > >this in the KASLR side.
> > >
> >
> > I think it's better to fix here, but not KASLR side.
> > Cause much more code will be change if doing it in KASLR side.
> > Since we didn't parse 'kernelcore' in compressed code, and you can see
> > the distribution of ZONE_MOVABLE need so much code, so we do not need
> > to do so much job in KASLR side. But here, several lines will be OK.
>
> I am not able to find the beginning of the email thread right now. Could
> you summarize what is the actual problem please?

So far the bug has only been found on x86.

When "kernelcore=" or "movablecore=" is added to the kernel command
line, kernelcore memory is spread evenly among the nodes. However, this
is only correct when KASLR is not enabled, in which case the kernel sits
at 16M on x86. With KASLR enabled, the kernel can be placed at a random
position anywhere from 16M to 64T.

Consider a scenario with 10 nodes, each with 20G of memory, booted with
"kernelcore=50%". Each node is then expected to provide 10G for
kernelcore and 10G for the movable area. But this does not take the
kernel's position into consideration. E.g. if the kernel is placed at
15G into the 2nd node, namely node1, we still assume node1 has 10G for
kernelcore and 10G for movable, while in fact only the 5G after the
kernel is available for movable.

I made a v4 patch which may fix it.

From dbcac3631863aed556dc2c4ff1839772dfd02d18 Mon Sep 17 00:00:00 2001
From: Baoquan He
Date: Fri, 13 Jul 2018 07:49:29 +0800
Subject: [PATCH v4] mm, page_alloc: find movable zone after kernel text

In find_zone_movable_pfns_for_nodes(), when trying to find the PFN at
which the movable zone begins in each node, the kernel text position is
not considered. KASLR may put the kernel after the point where the
movable zone begins.

Fix it by finding the movable zone after the kernel text on that node.

Signed-off-by: Baoquan He
---
 mm/page_alloc.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1521100f1e63..5bc1a47dafda 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6547,7 +6547,7 @@ static unsigned long __init early_calculate_totalpages(void)
 static void __init find_zone_movable_pfns_for_nodes(void)
 {
 	int i, nid;
-	unsigned long usable_startpfn;
+	unsigned long usable_startpfn, kernel_endpfn, arch_startpfn;
 	unsigned long kernelcore_node, kernelcore_remaining;
 	/* save the state before borrow the nodemask */
 	nodemask_t saved_node_state = node_states[N_MEMORY];
@@ -6649,8 +6649,9 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 	if (!required_kernelcore || required_kernelcore >= totalpages)
 		goto out;
 
+	kernel_endpfn = PFN_UP(__pa_symbol(_end));
 	/* usable_startpfn is the lowest possible pfn ZONE_MOVABLE can be at */
-	usable_startpfn = arch_zone_lowest_possible_pfn[movable_zone];
+	arch_startpfn = arch_zone_lowest_possible_pfn[movable_zone];
 
 restart:
 	/* Spread kernelcore memory as evenly as possible throughout nodes */
@@ -6659,6 +6660,16 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 		unsigned long start_pfn, end_pfn;
 
 		/*
+		 * KASLR may put kernel near tail of node memory,
+		 * start after kernel on that node to find PFN
+		 * at which zone begins.
+		 */
+		if (pfn_to_nid(kernel_endpfn) == nid)
+			usable_startpfn = max(arch_startpfn, kernel_endpfn);
+		else
+			usable_startpfn = arch_startpfn;
+
+		/*
 		 * Recalculate kernelcore_node if the division per node
 		 * now exceeds what is necessary to satisfy the requested
 		 * amount of memory for the kernel
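
To make the node1 numbers from the scenario above concrete, here is a
small userspace sketch of the per-node start PFN choice. It is only an
illustration, not kernel code: struct node_range, PAGES_PER_GB (which
assumes 4K pages) and all sketch_* helpers are hypothetical names made
up for this example, and sketch_pfn_to_nid() is a simple linear lookup
standing in for the kernel's pfn_to_nid().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one NUMA node's PFN range: [start_pfn, end_pfn). */
struct node_range {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

#define PAGES_PER_GB (1UL << 18)	/* assumption: 4K pages */

/* Sketch of pfn_to_nid(): index of the node containing pfn, or -1. */
static int sketch_pfn_to_nid(const struct node_range *nodes, size_t nr,
			     unsigned long pfn)
{
	for (size_t i = 0; i < nr; i++)
		if (pfn >= nodes[i].start_pfn && pfn < nodes[i].end_pfn)
			return (int)i;
	return -1;
}

/*
 * Mirrors the choice made in the last hunk: on the node that holds the
 * end of the kernel image, start after the kernel; on every other node
 * fall back to the arch-provided lowest PFN.
 */
static unsigned long sketch_usable_startpfn(const struct node_range *nodes,
					    size_t nr, int nid,
					    unsigned long arch_startpfn,
					    unsigned long kernel_endpfn)
{
	if (sketch_pfn_to_nid(nodes, nr, kernel_endpfn) == nid)
		return kernel_endpfn > arch_startpfn ?
		       kernel_endpfn : arch_startpfn;
	return arch_startpfn;
}

/* Pages of node nid that lie at or above the usable start PFN. */
static unsigned long sketch_movable_room(const struct node_range *nodes,
					 size_t nr, int nid,
					 unsigned long arch_startpfn,
					 unsigned long kernel_endpfn)
{
	unsigned long start = sketch_usable_startpfn(nodes, nr, nid,
						     arch_startpfn,
						     kernel_endpfn);
	const struct node_range *n = &nodes[nid];

	assert(n->start_pfn <= n->end_pfn);
	if (start < n->start_pfn)
		start = n->start_pfn;
	if (start >= n->end_pfn)
		return 0;
	return n->end_pfn - start;
}
```

With 10 nodes of 20 * PAGES_PER_GB pages each, arch_startpfn taken as 0
for simplicity, and kernel_endpfn at 15 * PAGES_PER_GB past the start of
node1, sketch_movable_room() gives 5 * PAGES_PER_GB pages for node1,
i.e. the 5G mentioned above, and the full 20G for node0.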