From patchwork Wed Jan 8 17:24:59 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Woodhouse
X-Patchwork-Id: 11324197
From: David Woodhouse
To: Xen-devel
Cc: Stefano Stabellini, Julien Grall, Wei Liu, Konrad Rzeszutek Wilk,
 George Dunlap, Andrew Cooper, paul@xen.org, Ian Jackson, Jan Beulich,
 Roger Pau Monné
Date: Wed, 8 Jan 2020 17:24:59 +0000
Message-Id: <20200108172500.1419665-2-dwmw2@infradead.org>
In-Reply-To: <20200108172500.1419665-1-dwmw2@infradead.org>
References: <20200108172500.1419665-1-dwmw2@infradead.org>
Subject: [Xen-devel] [RFC PATCH 2/3] x86/boot: Reserve live update boot memory

From: David Woodhouse

For live update to work, it will need a region of memory that can be
given to the boot allocator while it parses the state information from
the previous Xen and works out which of the other pages of memory it
can consume.

Reserve that like the crashdump region, and accept it on the command
line. Use only that region for early boot, and register the remaining
RAM (all of it for now, until the real live update happens) later.

Signed-off-by: David Woodhouse
---
 xen/arch/x86/setup.c | 114 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 107 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 47e065e5fe..650d70c1fc 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -678,6 +678,41 @@ static unsigned int __init copy_bios_e820(struct e820entry *map, unsigned int li
     return n;
 }
 
+static unsigned long lu_bootmem_start, lu_bootmem_size, lu_data;
+
+static int __init parse_liveupdate(const char *str)
+{
+    const char *cur;
+    lu_bootmem_size = parse_size_and_unit(cur = str, &str);
+    if (!lu_bootmem_size || cur == str)
+        return -EINVAL;
+
+    if (!*str) {
+        printk("Live update size 0x%lx\n", lu_bootmem_size);
+        return 0;
+    }
+    if (*str != '@')
+        return -EINVAL;
+    lu_bootmem_start = parse_size_and_unit(cur = str + 1, &str);
+    if (!lu_bootmem_start || cur == str)
+        return -EINVAL;
+
+    printk("Live update area 0x%lx-0x%lx (0x%lx)\n", lu_bootmem_start,
+           lu_bootmem_start + lu_bootmem_size, lu_bootmem_size);
+
+    if (!*str)
+        return 0;
+    if (*str != ':')
+        return -EINVAL;
+    lu_data = simple_strtoull(cur = str + 1, &str, 0);
+    if (!lu_data || cur == str)
+        return -EINVAL;
+
+    printk("Live update data at 0x%lx\n", lu_data);
+    return 0;
+}
+custom_param("liveupdate", parse_liveupdate);
+
 void __init noreturn __start_xen(unsigned long mbi_p)
 {
     char *memmap_type = NULL;
@@ -687,7 +722,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     module_t *mod;
     unsigned long nr_pages, raw_max_page, modules_headroom, module_map[1];
     int i, j, e820_warn = 0, bytes = 0;
-    bool acpi_boot_table_init_done = false, relocated = false;
+    bool acpi_boot_table_init_done = false, relocated = false, lu_reserved = false;
     int ret;
     struct ns16550_defaults ns16550 = {
         .data_bits = 8,
@@ -980,6 +1015,22 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     set_kexec_crash_area_size((u64)nr_pages << PAGE_SHIFT);
     kexec_reserve_area(&boot_e820);
 
+    if ( lu_bootmem_start )
+    {
+        /* XX: Check it's in usable memory first */
+        reserve_e820_ram(&boot_e820, lu_bootmem_start, lu_bootmem_start + lu_bootmem_size);
+
+        /* Since it will already be out of the e820 map by the time the first
+         * loop over physical memory, map it manually already. */
+        set_pdx_range(lu_bootmem_start >> PAGE_SHIFT,
+                      (lu_bootmem_start + lu_bootmem_size) >> PAGE_SHIFT);
+        map_pages_to_xen((unsigned long)__va(lu_bootmem_start),
+                         maddr_to_mfn(lu_bootmem_start),
+                         PFN_DOWN(lu_bootmem_size), PAGE_HYPERVISOR);
+
+        lu_reserved = true;
+    }
+
     initial_images = mod;
     nr_initial_images = mbi->mods_count;
 
@@ -1204,6 +1255,16 @@ void __init noreturn __start_xen(unsigned long mbi_p)
             printk("New Xen image base address: %#lx\n", xen_phys_start);
         }
 
+        /* Is the region suitable for the live update bootmem region? */
+        if ( lu_bootmem_size && ! lu_bootmem_start && e < limit )
+        {
+            end = consider_modules(s, e, lu_bootmem_size, mod, mbi->mods_count + relocated, -1);
+            if ( end )
+            {
+                e = lu_bootmem_start = end - lu_bootmem_size;
+            }
+        }
+
         /* Is the region suitable for relocating the multiboot modules? */
         for ( j = mbi->mods_count - 1; j >= 0; j-- )
         {
@@ -1267,6 +1328,15 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     if ( !xen_phys_start )
         panic("Not enough memory to relocate Xen\n");
 
+    if ( lu_bootmem_start )
+    {
+        if ( !lu_reserved )
+            reserve_e820_ram(&boot_e820, lu_bootmem_start, lu_bootmem_start + lu_bootmem_size);
+        printk("LU bootmem: 0x%lx - 0x%lx\n", lu_bootmem_start, lu_bootmem_start + lu_bootmem_size);
+        init_boot_pages(lu_bootmem_start, lu_bootmem_start + lu_bootmem_size);
+        lu_reserved = true;
+    }
+
     /* This needs to remain in sync with xen_in_range(). */
     reserve_e820_ram(&boot_e820, __pa(_stext), __pa(__2M_rwdata_end));
 
@@ -1278,8 +1348,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         xenheap_max_mfn(PFN_DOWN(highmem_start - 1));
 
     /*
-     * Walk every RAM region and map it in its entirety (on x86/64, at least)
-     * and notify it to the boot allocator.
+     * Walk every RAM region and map it in its entirety and (unless in
+     * live update mode) notify it to the boot allocator.
      */
     for ( i = 0; i < boot_e820.nr_map; i++ )
     {
@@ -1329,6 +1399,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                 printk(XENLOG_WARNING "Ignoring inaccessible memory range"
                                       " %013"PRIx64"-%013"PRIx64"\n",
                        s, e);
+                reserve_e820_ram(&boot_e820, s, e);
                 continue;
             }
             map_e = e;
@@ -1336,6 +1407,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
             printk(XENLOG_WARNING "Ignoring inaccessible memory range"
                                   " %013"PRIx64"-%013"PRIx64"\n",
                    e, map_e);
+            reserve_e820_ram(&boot_e820, e, map_e);
         }
 
         set_pdx_range(s >> PAGE_SHIFT, e >> PAGE_SHIFT);
@@ -1346,7 +1418,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                      ARRAY_SIZE(l2_identmap) << L2_PAGETABLE_SHIFT);
 
         /* Pass mapped memory to allocator /before/ creating new mappings. */
-        init_boot_pages(s, min(map_s, e));
+        if ( !lu_reserved)
+            init_boot_pages(s, min(map_s, e));
+
         s = map_s;
         if ( s < map_e )
         {
@@ -1354,7 +1428,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
             map_s = (s + mask) & ~mask;
             map_e &= ~mask;
-            init_boot_pages(map_s, map_e);
+            if ( !lu_reserved)
+                init_boot_pages(map_s, map_e);
         }
 
         if ( map_s > map_e )
@@ -1370,7 +1445,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
             {
                 map_pages_to_xen((unsigned long)__va(map_e), maddr_to_mfn(map_e),
                                  PFN_DOWN(end - map_e), PAGE_HYPERVISOR);
-                init_boot_pages(map_e, end);
+                if ( !lu_reserved)
+                    init_boot_pages(map_e, end);
                 map_e = end;
             }
         }
@@ -1385,7 +1461,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         {
             map_pages_to_xen((unsigned long)__va(s), maddr_to_mfn(s),
                              PFN_DOWN(map_s - s), PAGE_HYPERVISOR);
-            init_boot_pages(s, map_s);
+            if ( !lu_reserved)
+                init_boot_pages(s, map_s);
         }
     }
 
@@ -1483,6 +1560,29 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
     numa_initmem_init(0, raw_max_page);
 
+    if ( lu_bootmem_start )
+    {
+        unsigned long limit = virt_to_mfn(HYPERVISOR_VIRT_END - 1);
+        uint64_t mask = PAGE_SIZE - 1;
+
+        for ( i = 0; i < boot_e820.nr_map; i++ )
+        {
+            uint64_t s, e;
+
+            if ( boot_e820.map[i].type != E820_RAM )
+                continue;
+            s = (boot_e820.map[i].addr + mask) & ~mask;
+            e = (boot_e820.map[i].addr + boot_e820.map[i].size) & ~mask;
+            s = max_t(uint64_t, s, 1<<20);
+            if ( PFN_DOWN(s) > limit )
+                continue;
+            if ( PFN_DOWN(e) > limit )
+                e = pfn_to_paddr(limit);
+
+            init_boot_pages(s, e);
+        }
+    }
+
     if ( max_page - 1 > virt_to_mfn(HYPERVISOR_VIRT_END - 1) )
     {
         unsigned long limit = virt_to_mfn(HYPERVISOR_VIRT_END - 1);
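
For illustration only, the grammar that parse_liveupdate() accepts appears to be liveupdate=<size>[@<start>[:<data>]]. The user-space sketch below mirrors that control flow so the accept/reject cases can be tried outside the hypervisor. It is not part of the patch: struct lu_args, parse_size() and parse_liveupdate_arg() are hypothetical names, and parse_size() is only a stand-in for Xen's parse_size_and_unit(), whose suffix handling (in particular for values with no unit suffix) may differ.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical container for the three values the patch keeps in globals. */
struct lu_args {
    unsigned long long size;   /* corresponds to lu_bootmem_size */
    unsigned long long start;  /* corresponds to lu_bootmem_start */
    unsigned long long data;   /* corresponds to lu_data */
};

/*
 * Stand-in for parse_size_and_unit() (an assumption, not Xen's code):
 * parses a number with an optional K/M/G suffix and treats a value
 * without a suffix as a byte count.
 */
static unsigned long long parse_size(const char *s, const char **end)
{
    char *e;
    unsigned long long v = strtoull(s, &e, 0);

    switch (*e) {
    case 'G': case 'g': v <<= 10; /* fall through */
    case 'M': case 'm': v <<= 10; /* fall through */
    case 'K': case 'k': v <<= 10; e++; break;
    default: break;
    }
    *end = e;
    return v;
}

/* Mirrors the flow of parse_liveupdate(): size, then "@start", then ":data". */
static int parse_liveupdate_arg(const char *str, struct lu_args *out)
{
    const char *cur;
    char *e;

    out->size = out->start = out->data = 0;

    cur = str;
    out->size = parse_size(cur, &str);
    if (!out->size || cur == str)
        return -EINVAL;          /* size is mandatory and must be non-zero */
    if (!*str)
        return 0;                /* size only: start is chosen later */
    if (*str != '@')
        return -EINVAL;

    cur = str + 1;
    out->start = parse_size(cur, &str);
    if (!out->start || cur == str)
        return -EINVAL;
    if (!*str)
        return 0;
    if (*str != ':')
        return -EINVAL;

    cur = str + 1;
    out->data = strtoull(cur, &e, 0);
    if (!out->data || e == cur)
        return -EINVAL;
    return 0;
}
```

Under these assumptions, "256M" yields a size with no fixed start (the case the patch handles by picking a region via consider_modules()), "256M@0x100000000:0x1ff000" fills in all three fields, and "256M@" or "256M#" are rejected with -EINVAL.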