From patchwork Tue Sep 10 10:39:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?SsO8cmdlbiBHcm/Dnw==?= X-Patchwork-Id: 13798319 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C09B6EB64DE for ; Tue, 10 Sep 2024 10:40:27 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.795338.1204677 (Exim 4.92) (envelope-from ) id 1snyIF-0000o0-HP; Tue, 10 Sep 2024 10:40:19 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 795338.1204677; Tue, 10 Sep 2024 10:40:19 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1snyIF-0000nt-Du; Tue, 10 Sep 2024 10:40:19 +0000 Received: by outflank-mailman (input) for mailman id 795338; Tue, 10 Sep 2024 10:40:18 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1snyIE-0005DK-24 for xen-devel@lists.xenproject.org; Tue, 10 Sep 2024 10:40:18 +0000 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 112246cb-6f61-11ef-99a1-01e77a169b0f; Tue, 10 Sep 2024 12:40:16 +0200 (CEST) Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id ECE461FCA1; Tue, 10 Sep 2024 10:40:15 +0000 (UTC) Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 8F149132CB; Tue, 10 Sep 2024 10:40:15 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id qQOPIQ8i4GbwYwAAD6G6ig (envelope-from ); Tue, 10 Sep 2024 10:40:15 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 112246cb-6f61-11ef-99a1-01e77a169b0f DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1725964816; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=emLEtDCHOyRNeSPAQA99QwKawrV9ff6zxw6QsKyQz4s=; b=Zly8IT/Tx7rrkOELPq2/w436dUI2daOA8jVa9PDBtoDS3qSMXqg/oXSlJZtPSRYB8nmZlI 4Hhf3jU973rikjt/K2TsrK5WTekGaTClHpA6V8qucU8H7ks9uEQrHZE6cyTLP2ZsJTvZDP lCkUD2K8DizIbgsa5q56S4cKmp3WX/E= Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.com header.s=susede1 header.b=qkyPntwh DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1725964815; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=emLEtDCHOyRNeSPAQA99QwKawrV9ff6zxw6QsKyQz4s=; b=qkyPntwhQzj0CgrV1YR5cPvBvhVkpsVq0Vs1QXm+uH7Pz1Rt25SYBcrlHTpvuhOtRvW85S HaxKTv7qaPqWeIv580vrhRFZukJEqVTznqkJxvRX6ZgHPqXoxQcq5INwhlXX+z8+9wYxhz kYHTw7yl0Knjqf94QWDyCahBtQMxloc= From: Juergen Gross To: linux-kernel@vger.kernel.org, x86@kernel.org Cc: Juergen Gross , Boris Ostrovsky , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , xen-devel@lists.xenproject.org, =?utf-8?q?Marek_Marczykowski-G=C3=B3recki?= Subject: [PATCH v3 7/7] xen: tolerate ACPI NVS memory overlapping with Xen allocated memory Date: Tue, 10 Sep 2024 12:39:32 +0200 Message-ID: <20240910103932.7634-8-jgross@suse.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240910103932.7634-1-jgross@suse.com> References: <20240910103932.7634-1-jgross@suse.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: ECE461FCA1 X-Spamd-Result: default: False [-5.51 / 50.00]; BAYES_HAM(-3.00)[100.00%]; DWL_DNSWL_MED(-2.00)[suse.com:dkim]; NEURAL_HAM_LONG(-1.00)[-1.000]; MID_CONTAINS_FROM(1.00)[]; R_DKIM_ALLOW(-0.20)[suse.com:s=susede1]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; DKIM_SIGNED(0.00)[suse.com:s=susede1]; FUZZY_BLOCKED(0.00)[rspamd.com]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; TO_DN_SOME(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; RCVD_TLS_ALL(0.00)[]; RCPT_COUNT_SEVEN(0.00)[11]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; R_RATELIMIT(0.00)[to_ip_from(RLkdkdrsxe9hqhhs5ask8616i6)]; RCVD_VIA_SMTP_AUTH(0.00)[]; DKIM_TRACE(0.00)[suse.com:+]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.com:email,suse.com:dkim,suse.com:mid,imap1.dmz-prg2.suse.org:helo,imap1.dmz-prg2.suse.org:rdns] X-Rspamd-Server: rspamd2.dmz-prg2.suse.org X-Rspamd-Action: no action In order to minimize required special handling for running as Xen PV dom0, the memory layout is modified to match that of the host. This requires to have only RAM at the locations where Xen allocated memory is living. Unfortunately there seem to be some machines, where ACPI NVS is located at 64 MB, resulting in a conflict with the loaded kernel or the initial page tables built by Xen. Avoid this conflict by swapping the ACPI NVS area in the memory map with unused RAM. This is possible via modification of the dom0 P2M map. Accesses to the ACPI NVS area are done either for saving and restoring it across suspend operations (this will work the same way as before), or by ACPI code when NVS memory is referenced from other ACPI tables. The latter case is handled by a Xen specific indirection of acpi_os_ioremap(). While the E820 map can (and should) be modified right away, the P2M map can be updated only after memory allocation is working, as the P2M map might need to be extended. Fixes: 808fdb71936c ("xen: check for kernel memory conflicting with memory layout") Signed-off-by: Juergen Gross Tested-by: Marek Marczykowski-Górecki Reviewed-by: Jan Beulich --- V2: - remap helpers split off into other patch V3: - adjust commit message (Jan Beulich) --- arch/x86/xen/setup.c | 92 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 91 insertions(+), 1 deletion(-) diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c index 1114e49937da..c3db71d96c43 100644 --- a/arch/x86/xen/setup.c +++ b/arch/x86/xen/setup.c @@ -495,6 +495,8 @@ void __init xen_remap_memory(void) set_pte_mfn(buf, mfn_save, PAGE_KERNEL); pr_info("Remapped %ld page(s)\n", remapped); + + xen_do_remap_nonram(); } static unsigned long __init xen_get_pages_limit(void) @@ -625,14 +627,102 @@ phys_addr_t __init xen_find_free_area(phys_addr_t size) return 0; } +/* + * Swap a non-RAM E820 map entry with RAM above ini_nr_pages. + * Note that the E820 map is modified accordingly, but the P2M map isn't yet. + * The adaption of the P2M must be deferred until page allocation is possible. + */ +static void __init xen_e820_swap_entry_with_ram(struct e820_entry *swap_entry) +{ + struct e820_entry *entry; + unsigned int mapcnt; + phys_addr_t mem_end = PFN_PHYS(ini_nr_pages); + phys_addr_t swap_addr, swap_size, entry_end; + + swap_addr = PAGE_ALIGN_DOWN(swap_entry->addr); + swap_size = PAGE_ALIGN(swap_entry->addr - swap_addr + swap_entry->size); + entry = xen_e820_table.entries; + + for (mapcnt = 0; mapcnt < xen_e820_table.nr_entries; mapcnt++) { + entry_end = entry->addr + entry->size; + if (entry->type == E820_TYPE_RAM && entry->size >= swap_size && + entry_end - swap_size >= mem_end) { + /* Reduce RAM entry by needed space (whole pages). */ + entry->size -= swap_size; + + /* Add new entry at the end of E820 map. */ + entry = xen_e820_table.entries + + xen_e820_table.nr_entries; + xen_e820_table.nr_entries++; + + /* Fill new entry (keep size and page offset). */ + entry->type = swap_entry->type; + entry->addr = entry_end - swap_size + + swap_addr - swap_entry->addr; + entry->size = swap_entry->size; + + /* Convert old entry to RAM, align to pages. */ + swap_entry->type = E820_TYPE_RAM; + swap_entry->addr = swap_addr; + swap_entry->size = swap_size; + + /* Remember PFN<->MFN relation for P2M update. */ + xen_add_remap_nonram(swap_addr, entry_end - swap_size, + swap_size); + + /* Order E820 table and merge entries. */ + e820__update_table(&xen_e820_table); + + return; + } + + entry++; + } + + xen_raw_console_write("No suitable area found for required E820 entry remapping action\n"); + BUG(); +} + +/* + * Look for non-RAM memory types in a specific guest physical area and move + * those away if possible (ACPI NVS only for now). + */ +static void __init xen_e820_resolve_conflicts(phys_addr_t start, + phys_addr_t size) +{ + struct e820_entry *entry; + unsigned int mapcnt; + phys_addr_t end; + + if (!size) + return; + + end = start + size; + entry = xen_e820_table.entries; + + for (mapcnt = 0; mapcnt < xen_e820_table.nr_entries; mapcnt++) { + if (entry->addr >= end) + return; + + if (entry->addr + entry->size > start && + entry->type == E820_TYPE_NVS) + xen_e820_swap_entry_with_ram(entry); + + entry++; + } +} + /* * Check for an area in physical memory to be usable for non-movable purposes. - * An area is considered to usable if the used E820 map lists it to be RAM. + * An area is considered to usable if the used E820 map lists it to be RAM or + * some other type which can be moved to higher PFNs while keeping the MFNs. * In case the area is not usable, crash the system with an error message. */ void __init xen_chk_is_e820_usable(phys_addr_t start, phys_addr_t size, const char *component) { + xen_e820_resolve_conflicts(start, size); + if (!xen_is_e820_reserved(start, size)) return;