From patchwork Fri Nov 6 08:31:43 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiao Guangrong X-Patchwork-Id: 7567401 Return-Path: X-Original-To: patchwork-kvm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 4105DC05C6 for ; Fri, 6 Nov 2015 08:38:17 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 1B6FA20742 for ; Fri, 6 Nov 2015 08:38:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 778B3206A5 for ; Fri, 6 Nov 2015 08:38:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1032290AbbKFIiL (ORCPT ); Fri, 6 Nov 2015 03:38:11 -0500 Received: from mga02.intel.com ([134.134.136.20]:37275 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1032054AbbKFIiK (ORCPT ); Fri, 6 Nov 2015 03:38:10 -0500 Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga101.jf.intel.com with ESMTP; 06 Nov 2015 00:38:10 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.20,251,1444719600"; d="scan'208";a="844511845" Received: from xiao.sh.intel.com ([10.239.159.53]) by fmsmga002.fm.intel.com with ESMTP; 06 Nov 2015 00:37:56 -0800 Subject: Re: [PATCH v7 25/35] nvdimm acpi: init the resource used by NVDIMM ACPI To: Igor Mammedov References: <1446455617-129562-1-git-send-email-guangrong.xiao@linux.intel.com> <1446455617-129562-26-git-send-email-guangrong.xiao@linux.intel.com> <20151105105816.4945e0c4@nial.brq.redhat.com> <563B2C43.2030407@linux.intel.com> <20151105140326.18b21c32@nial.brq.redhat.com> <563B5AB3.6040605@linux.intel.com> <20151105154906.659779b5@nial.brq.redhat.com> Cc: pbonzini@redhat.com, gleb@kernel.org, mtosatti@redhat.com, stefanha@redhat.com, mst@redhat.com, rth@twiddle.net, ehabkost@redhat.com, dan.j.williams@intel.com, kvm@vger.kernel.org, qemu-devel@nongnu.org, vsementsov@virtuozzo.com, eblake@redhat.com From: Xiao Guangrong Message-ID: <563C656F.3030808@linux.intel.com> Date: Fri, 6 Nov 2015 16:31:43 +0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: <20151105154906.659779b5@nial.brq.redhat.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On 11/05/2015 10:49 PM, Igor Mammedov wrote: > On Thu, 5 Nov 2015 21:33:39 +0800 > Xiao Guangrong wrote: > >> >> >> On 11/05/2015 09:03 PM, Igor Mammedov wrote: >>> On Thu, 5 Nov 2015 18:15:31 +0800 >>> Xiao Guangrong wrote: >>> >>>> >>>> >>>> On 11/05/2015 05:58 PM, Igor Mammedov wrote: >>>>> On Mon, 2 Nov 2015 17:13:27 +0800 >>>>> Xiao Guangrong wrote: >>>>> >>>>>> A page staring from 0xFF00000 and IO port 0x0a18 - 0xa1b in guest are >>>>> ^^ missing one 0??? >>>>> >>>>>> reserved for NVDIMM ACPI emulation, refer to docs/specs/acpi_nvdimm.txt >>>>>> for detailed design >>>>>> >>>>>> A parameter, 'nvdimm-support', is introduced for PIIX4_PM and ICH9-LPC >>>>>> that controls if nvdimm support is enabled, it is true on default and >>>>>> it is false on 2.4 and its earlier version to keep compatibility >>>>>> >>>>>> Signed-off-by: Xiao Guangrong >>>>> [...] >>>>> >>>>>> @@ -33,6 +33,15 @@ >>>>>> */ >>>>>> #define MIN_NAMESPACE_LABEL_SIZE (128UL << 10) >>>>>> >>>>>> +/* >>>>>> + * A page staring from 0xFF00000 and IO port 0x0a18 - 0xa1b in guest are >>>>> ^^^ missing 0 or value in define below has an extra 0 >>>>> >>>>>> + * reserved for NVDIMM ACPI emulation, refer to docs/specs/acpi_nvdimm.txt >>>>>> + * for detailed design. >>>>>> + */ >>>>>> +#define NVDIMM_ACPI_MEM_BASE 0xFF000000ULL >>>>> it still maps RAM at arbitrary place, >>>>> that's the reason why VMGenID patches hasn't been merged for >>>>> more than several months. >>>>> I'm not against of using (more exactly I'm for it) direct mapping >>>>> but we should reach consensus when and how to use it first. >>>>> >>>>> I'd wouldn't use addresses below 4G as it may be used firmware or >>>>> legacy hardware and I won't bet that 0xFF000000ULL won't conflict >>>>> with anything. >>>>> An alternative place to allocate reserve from could be high memory. >>>>> For pc we have "reserved-memory-end" which currently makes sure >>>>> that hotpluggable memory range isn't used by firmware. >>>>> >>>>> How about making API that allows to map additional memory >>>>> ranges before reserved-memory-end and pushes it up as mappings are >>>>> added. >>>> >>>> That what i did in the v1/v2 versions, however, as you noticed, using 64-bit >>>> address in ACPI in QEMU is not a easy work - we can not simply make >>>> SSDT.rev = 2 to apply 64 bit address, the reason i have documented in >>>> v3's changelog: >>>> >>>> 3) we figure out a unused memory hole below 4G that is 0xFF00000 ~ >>>> 0xFFF00000, this range is large enough for NVDIMM ACPI as build 64-bit >>>> ACPI SSDT/DSDT table will break windows XP. >>>> BTW, only make SSDT.rev = 2 can not work since the width is only depended >>>> on DSDT.rev based on 19.6.28 DefinitionBlock (Declare Definition Block) >>>> in ACPI spec: >>>> | Note: For compatibility with ACPI versions before ACPI 2.0, the bit >>>> | width of Integer objects is dependent on the ComplianceRevision of the DSDT. >>>> | If the ComplianceRevision is less than 2, all integers are restricted to 32 >>>> | bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets >>>> | the global integer width for all integers, including integers in SSDTs. >>>> 4) use the lowest ACPI spec version to document AML terms. >>>> >>>> The only way introducing 64 bit address is adding XSDT support that what >>>> Michael did before, however, it seems it need great efforts to do it as >>>> it will break OVMF. It's a long term workload. :( >>> to enable 64-bit integers in AML it's sufficient to change DSDT revision to 2, >>> I already have a patch that switches DSDT/SSDT to rev2. >>> Tests show it doesn't break WindowsXP (which is rev1) and uses 64-bit integers >>> on linux & later Windows versions. >> >> Great, i remembered i did the similar test (directly change DSDT to rev2) and it >> caused winXP blue screen. Could you please tell me where i can find your patch? > https://github.com/imammedo/qemu/commits/mhpt_table_v2 > following changes revision: > pc: acpi: bump DSDT/SSDT compliance revision to v2 > and here is user: > acpi: memhp: simplify MCRS() using 64-bit math > > when writing ASL one shall make sure that only XP supported > features are in global scope, which is evaluated when tables > are loaded and features of rev2 and higher are inside methods. > That way XP doesn't crash as far as it doesn't evaluate unsupported > opcode and one can guard those opcodes checking _REV object if neccesary. > Really a good study case to me, i tried your patch and moved the 64 bit staffs to the private method, it worked. :) Igor, is this the API you want? > >>>> >>>> The luck thing is, the ACPI part is not ABI, we can move it to the high >>>> memory if QEMU supports XSDT is ready in future development. >>> But mapped control region is ABI and we can't change it if we find out later >>> that it breaks something. >> >> But the ACPI code is completely built by QEMU, which is transparent to guest >> and guest should not depend on it, no? > unfortunately no, think about cross-version migration. It makes sense. Stupid me. :( --- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 6bf569a..aba29df 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1291,6 +1291,38 @@ FWCfgState *xen_load_linux(PCMachineState *pcms, return fw_cfg; } +static void pc_reserve_high_memory_init(PCMachineState *pcms, + uint64_t base, uint64_t align) +{ + pcms->reserve_high_memory.current_addr = ROUND_UP(base, align); +} + +static uint64_t +pc_reserve_high_memory_end(PCMachineState *pcms, int64_t align) +{ + return ROUND_UP(pcms->reserve_high_memory.current_addr, align); +} + +uint64_t pc_reserve_high_memory(PCMachineState *pcms, uint64_t size, + int64_t align, Error **errp) +{ + uint64_t end_addr, current_addr = pcms->reserve_high_memory.current_addr; + + if (!current_addr) { + error_setg(errp, "reserved high memory is not initialized."); + return 0; + } + + end_addr = pc_reserve_high_memory_end(pcms, align) + size; + if (current_addr > end_addr) { + error_setg(errp, "reserved high memory is not enough."); + return 0; + } + + pcms->reserve_high_memory.current_addr = end_addr; + return end_addr; +} + FWCfgState *pc_memory_init(PCMachineState *pcms, MemoryRegion *system_memory, MemoryRegion *rom_memory, @@ -1379,6 +1411,8 @@ FWCfgState *pc_memory_init(PCMachineState *pcms, "hotplug-memory", hotplug_mem_size); memory_region_add_subregion(system_memory, pcms->hotplug_memory.base, &pcms->hotplug_memory.mr); + pc_reserve_high_memory_init(pcms, pcms->hotplug_memory.base + + hotplug_mem_size, 1ULL << 30); } /* Initialize PC system firmware */ @@ -1403,7 +1437,7 @@ FWCfgState *pc_memory_init(PCMachineState *pcms, uint64_t res_mem_end = pcms->hotplug_memory.base; if (!pcmc->broken_reserved_end) { - res_mem_end += memory_region_size(&pcms->hotplug_memory.mr); + res_mem_end = pc_reserve_high_memory_end(pcms, 0x1ULL << 30); } *val = cpu_to_le64(ROUND_UP(res_mem_end, 0x1ULL << 30)); fw_cfg_add_file(fw_cfg, "etc/reserved-memory-end", val, sizeof(*val)); diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h index 47162cf..fae3fea 100644 --- a/include/hw/i386/pc.h +++ b/include/hw/i386/pc.h @@ -20,6 +20,11 @@ #define HPET_INTCAP "hpet-intcap" +struct PCReserveHighMemory { + uint64_t current_addr; +}; +typedef struct PCReserveHighMemory PCReserveHighMemory; + /** * PCMachineState: * @acpi_dev: link to ACPI PM device that performs ACPI hotplug handling @@ -41,6 +46,7 @@ struct PCMachineState { OnOffAuto smm; bool enforce_aligned_dimm; ram_addr_t below_4g_mem_size, above_4g_mem_size; + PCReserveHighMemory reserve_high_memory; }; #define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device" @@ -175,6 +181,9 @@ PcGuestInfo *pc_guest_info_init(PCMachineState *pcms); void pc_set_legacy_acpi_data_size(void); +uint64_t pc_reserve_high_memory(PCMachineState *pcms, uint64_t size, + int64_t align, Error **errp); + #define PCI_HOST_PROP_PCI_HOLE_START "pci-hole-start" #define PCI_HOST_PROP_PCI_HOLE_END "pci-hole-end" #define PCI_HOST_PROP_PCI_HOLE64_START "pci-hole64-start"