From patchwork Wed Jul 11 10:41:58 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Baoquan He
X-Patchwork-Id: 10519307
Date: Wed, 11 Jul 2018 18:41:58 +0800
From: Baoquan He
To: Chao Fan, akpm@linux-foundation.org, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, yasu.isimatu@gmail.com,
 keescook@chromium.org, indou.takao@jp.fujitsu.com,
 caoj.fnst@cn.fujitsu.com, douly.fnst@cn.fujitsu.com, mhocko@suse.com,
 vbabka@suse.cz, mgorman@techsingularity.net
Subject: Re: Bug report about KASLR and ZONE_MOVABLE
Message-ID: <20180711104158.GE2070@MiWiFi-R3L-srv>
References: <20180711094244.GA2019@localhost.localdomain>
In-Reply-To: <20180711094244.GA2019@localhost.localdomain>
User-Agent: Mutt/1.9.1 (2017-09-22)

On 07/11/18 at 05:42pm, Chao Fan wrote:
> Hi all,
>
> I found there is a BUG about KASLR and ZONE_MOVABLE.
>
> When users use 'kernelcore=' parameter without 'movable_node',
> movable memory is evenly distributed to all nodes. The size of
> ZONE_MOVABLE depends on the kernel parameter 'kernelcore=' and
> 'movablecore='.
> But sometimes, KASLR may put the uncompressed kernel to the
> tail position of a node, which will cause the kernel memory
> to be set as ZONE_MOVABLE. This region can not be offlined.
>
> Here is a very simple test in my qemu-kvm machine, there is
> only one node:
>
> The command line:
> [root@localhost ~]# cat /proc/cmdline
> BOOT_IMAGE=/vmlinuz-4.18.0-rc3+ root=/dev/mapper/fedora_localhost--live-root
> ro resume=/dev/mapper/fedora_localhost--live-swap
> rd.lvm.lv=fedora_localhost-live/root rd.lvm.lv=fedora_localhost-live/swap
> console=ttyS0 earlyprintk=ttyS0,115200n8 memblock=debug kernelcore=50%
>
> I use 'kernelcore=50%' here.
>
> Here is my early print result, I print the random_addr after KASLR chooses
> physical memory:
> early console in extract_kernel
> input_data: 0x000000000266b3b1
> input_len: 0x00000000007d8802
> output: 0x0000000001000000
> output_len: 0x0000000001e15698
> kernel_total_size: 0x0000000001a8b000
> trampoline_32bit: 0x000000000009d000
> booted via startup_32()
> Physical KASLR using RDRAND RDTSC...
> random_addr: 0x000000012f000000
> Virtual KASLR using RDRAND RDTSC...
>
> The address for kernel is 0x000000012f000000
>
> Here is the log of ZONE:
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
> [    0.000000]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
> [    0.000000]   Normal   [mem 0x0000000100000000-0x00000001f57fffff]
> [    0.000000]   Device   empty
> [    0.000000] Movable zone start for each node
> [    0.000000]   Node 0: 0x000000011b000000
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000000001000-0x000000000009efff]
> [    0.000000]   node   0: [mem 0x0000000000100000-0x00000000bffd6fff]
> [    0.000000]   node   0: [mem 0x0000000100000000-0x00000001f57fffff]
> [    0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x00000001f57fffff]
>
> Only one node in my machine, ZONE_MOVABLE begins from 0x000000011b000000,
> which is lower than 0x000000012f000000.
> So KASLR put the kernel to the ZONE_MOVABLE.
> Try to solve this problem, I think there should be a new tactic in function
> find_zone_movable_pfns_for_nodes() of mm/page_alloc.c.
> If kernel is uncompressed
> in a tail position, then just set the memory after the kernel as ZONE_MOVABLE,
> at the same time, memory in other nodes will be set as ZONE_MOVABLE.

Hmm, it's an issue and worth fixing. Otherwise the size of the movable
area will be smaller than we expect when "kernelcore=" or "movablecore="
is added.

Add a check in find_zone_movable_pfns_for_nodes(), and use max() to pick
the start of the movable area between the first PFN after '_etext' and
start_pfn. Note that '_etext' is a virtual address, so it has to be
converted to a physical PFN before it is compared with start_pfn. It
will go to label 'restart' to calculate the 2nd round if not satisfied.

Hi Chao,

Could you check if below patch works for you?

From ab6e47c6a78d1a4ccb577b995b7b386f3149732f Mon Sep 17 00:00:00 2001
From: Baoquan He
Date: Wed, 11 Jul 2018 18:30:04 +0800
Subject: [PATCH] mm, page_alloc: find movable zone after kernel text

In find_zone_movable_pfns_for_nodes(), when trying to find the PFN at
which the movable zone begins in each node, the kernel text position is
not considered. KASLR may place the kernel after the point where the
movable zone begins.

Fix it by starting the movable zone after the kernel text.

Signed-off-by: Baoquan He
---
 mm/page_alloc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1521100..fe346b4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6678,6 +6678,8 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 			unsigned long size_pages;
 
 			start_pfn = max(start_pfn, zone_movable_pfn[nid]);
+			/* KASLR may put kernel after 'start_pfn', start after kernel */
+			start_pfn = max(start_pfn, PFN_UP(__pa_symbol(_etext)));
 			if (start_pfn >= end_pfn)
 				continue;
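
For reference, the arithmetic behind the report can be checked in user
space with the rough sketch below. It is not kernel code: the helper
names are hypothetical, 4 KiB pages are assumed, and kernel_total_size
from the early print is used as a stand-in for the end of the kernel
image, since the value of '_etext' does not appear in the logs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12			/* assumption: 4 KiB pages */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Round a physical address down to its page frame number. */
static inline uint64_t pfn_down(uint64_t phys)
{
	return phys >> PAGE_SHIFT;
}

/* Round a physical address up to the next page frame number. */
static inline uint64_t pfn_up(uint64_t phys)
{
	return (phys + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* First PFN past an image loaded at 'load_phys' that is 'size' bytes long. */
static inline uint64_t kernel_end_pfn(uint64_t load_phys, uint64_t size)
{
	assert(load_phys + size >= load_phys);	/* reject wrap-around */
	return pfn_up(load_phys + size);
}

/*
 * True if the load address is at or above the movable zone start.
 * The end of the node is not checked here.
 */
static inline bool kernel_in_movable(uint64_t load_phys,
				     uint64_t movable_start_pfn)
{
	return pfn_down(load_phys) >= movable_start_pfn;
}

/* Push the movable zone start past the kernel image if they overlap. */
static inline uint64_t movable_start_after_kernel(uint64_t movable_start_pfn,
						  uint64_t kend_pfn)
{
	return movable_start_pfn > kend_pfn ? movable_start_pfn : kend_pfn;
}
```

With the values from the logs above, the movable zone starts at PFN
0x11b000 (0x000000011b000000 >> 12), so kernel_in_movable(0x12f000000,
0x11b000) is true. kernel_end_pfn(0x12f000000, 0x1a8b000) gives
0x130a8b, and movable_start_after_kernel(0x11b000, 0x130a8b) returns
0x130a8b, which is still below the end of node 0 (PFN 0x1f5800).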