[v7,25/35] nvdimm acpi: init the resource used by NVDIMM ACPI
diff mbox

Message ID 563C656F.3030808@linux.intel.com
State New
Headers show

Commit Message

Xiao Guangrong Nov. 6, 2015, 8:31 a.m. UTC
On 11/05/2015 10:49 PM, Igor Mammedov wrote:
> On Thu, 5 Nov 2015 21:33:39 +0800
> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>
>>
>>
>> On 11/05/2015 09:03 PM, Igor Mammedov wrote:
>>> On Thu, 5 Nov 2015 18:15:31 +0800
>>> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>>>
>>>>
>>>>
>>>> On 11/05/2015 05:58 PM, Igor Mammedov wrote:
>>>>> On Mon,  2 Nov 2015 17:13:27 +0800
>>>>> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>>>>>
>>>>>> A page staring from 0xFF00000 and IO port 0x0a18 - 0xa1b in guest are
>>>>>                                   ^^ missing one 0???
>>>>>
>>>>>> reserved for NVDIMM ACPI emulation, refer to docs/specs/acpi_nvdimm.txt
>>>>>> for detailed design
>>>>>>
>>>>>> A parameter, 'nvdimm-support', is introduced for PIIX4_PM and ICH9-LPC
>>>>>> that controls if nvdimm support is enabled, it is true on default and
>>>>>> it is false on 2.4 and its earlier version to keep compatibility
>>>>>>
>>>>>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>>>>> [...]
>>>>>
>>>>>> @@ -33,6 +33,15 @@
>>>>>>      */
>>>>>>     #define MIN_NAMESPACE_LABEL_SIZE      (128UL << 10)
>>>>>>
>>>>>> +/*
>>>>>> + * A page staring from 0xFF00000 and IO port 0x0a18 - 0xa1b in guest are
>>>>>                                     ^^^ missing 0 or value in define below has an extra 0
>>>>>
>>>>>> + * reserved for NVDIMM ACPI emulation, refer to docs/specs/acpi_nvdimm.txt
>>>>>> + * for detailed design.
>>>>>> + */
>>>>>> +#define NVDIMM_ACPI_MEM_BASE          0xFF000000ULL
>>>>> it still maps RAM at arbitrary place,
>>>>> that's the reason why VMGenID patches hasn't been merged for
>>>>> more than several months.
>>>>> I'm not against of using (more exactly I'm for it) direct mapping
>>>>> but we should reach consensus when and how to use it first.
>>>>>
>>>>> I'd wouldn't use addresses below 4G as it may be used firmware or
>>>>> legacy hardware and I won't bet that 0xFF000000ULL won't conflict
>>>>> with anything.
>>>>> An alternative place to allocate reserve from could be high memory.
>>>>> For pc we have "reserved-memory-end" which currently makes sure
>>>>> that hotpluggable memory range isn't used by firmware.
>>>>>
>>>>> How about making API that allows to map additional memory
>>>>> ranges before reserved-memory-end and pushes it up as mappings are
>>>>> added.
>>>>
>>>> That what i did in the v1/v2 versions, however, as you noticed, using 64-bit
>>>> address in ACPI in QEMU is not a easy work - we can not simply make
>>>> SSDT.rev = 2 to apply 64 bit address, the reason i have documented in
>>>> v3's changelog:
>>>>
>>>>      3) we figure out a unused memory hole below 4G that is 0xFF00000 ~
>>>>         0xFFF00000, this range is large enough for NVDIMM ACPI as build 64-bit
>>>>         ACPI SSDT/DSDT table will break windows XP.
>>>>         BTW, only make SSDT.rev = 2 can not work since the width is only depended
>>>>         on DSDT.rev based on 19.6.28 DefinitionBlock (Declare Definition Block)
>>>>         in ACPI spec:
>>>> | Note: For compatibility with ACPI versions before ACPI 2.0, the bit
>>>> | width of Integer objects is dependent on the ComplianceRevision of the DSDT.
>>>> | If the ComplianceRevision is less than 2, all integers are restricted to 32
>>>> | bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets
>>>> | the global integer width for all integers, including integers in SSDTs.
>>>>      4) use the lowest ACPI spec version to document AML terms.
>>>>
>>>> The only way introducing 64 bit address is adding XSDT support that what
>>>> Michael did before, however, it seems it need great efforts to do it as
>>>> it will break OVMF. It's a long term workload. :(
>>> to enable 64-bit integers in AML it's sufficient to change DSDT revision to 2,
>>> I already have a patch that switches DSDT/SSDT to rev2.
>>> Tests show it doesn't break WindowsXP (which is rev1) and uses 64-bit integers
>>> on linux & later Windows versions.
>>
>> Great, i remembered i did the similar test (directly change DSDT to rev2) and it
>> caused winXP blue screen. Could you please tell me where i can find your patch?
> https://github.com/imammedo/qemu/commits/mhpt_table_v2
> following changes revision:
>   pc: acpi: bump DSDT/SSDT compliance revision to v2
> and here is user:
>   acpi: memhp: simplify MCRS() using 64-bit math
>
> when writing ASL one shall make sure that only XP supported
> features are in global scope, which is evaluated when tables
> are loaded and features of rev2 and higher are inside methods.
> That way XP doesn't crash as far as it doesn't evaluate unsupported
> opcode and one can guard those opcodes checking _REV object if neccesary.
>

Really a good study case to me, i tried your patch and moved the 64 bit
staffs to the private method, it worked. :)

Igor, is this the API you want?



>
>>>>
>>>> The luck thing is, the ACPI part is not ABI, we can move it to the high
>>>> memory if QEMU supports XSDT is ready in future development.
>>> But mapped control region is ABI and we can't change it if we find out later
>>> that it breaks something.
>>
>> But the ACPI code is completely built by QEMU, which is transparent to guest
>> and guest should not depend on it, no?
> unfortunately no, think about cross-version migration.

It makes sense. Stupid me. :(


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Xiao Guangrong Nov. 6, 2015, 8:56 a.m. UTC | #1
On 11/06/2015 04:31 PM, Xiao Guangrong wrote:
>
>
> On 11/05/2015 10:49 PM, Igor Mammedov wrote:
>> On Thu, 5 Nov 2015 21:33:39 +0800
>> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>>
>>>
>>>
>>> On 11/05/2015 09:03 PM, Igor Mammedov wrote:
>>>> On Thu, 5 Nov 2015 18:15:31 +0800
>>>> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On 11/05/2015 05:58 PM, Igor Mammedov wrote:
>>>>>> On Mon,  2 Nov 2015 17:13:27 +0800
>>>>>> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>>>>>>
>>>>>>> A page staring from 0xFF00000 and IO port 0x0a18 - 0xa1b in guest are
>>>>>>                                   ^^ missing one 0???
>>>>>>
>>>>>>> reserved for NVDIMM ACPI emulation, refer to docs/specs/acpi_nvdimm.txt
>>>>>>> for detailed design
>>>>>>>
>>>>>>> A parameter, 'nvdimm-support', is introduced for PIIX4_PM and ICH9-LPC
>>>>>>> that controls if nvdimm support is enabled, it is true on default and
>>>>>>> it is false on 2.4 and its earlier version to keep compatibility
>>>>>>>
>>>>>>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>>>>>> [...]
>>>>>>
>>>>>>> @@ -33,6 +33,15 @@
>>>>>>>      */
>>>>>>>     #define MIN_NAMESPACE_LABEL_SIZE      (128UL << 10)
>>>>>>>
>>>>>>> +/*
>>>>>>> + * A page staring from 0xFF00000 and IO port 0x0a18 - 0xa1b in guest are
>>>>>>                                     ^^^ missing 0 or value in define below has an extra 0
>>>>>>
>>>>>>> + * reserved for NVDIMM ACPI emulation, refer to docs/specs/acpi_nvdimm.txt
>>>>>>> + * for detailed design.
>>>>>>> + */
>>>>>>> +#define NVDIMM_ACPI_MEM_BASE          0xFF000000ULL
>>>>>> it still maps RAM at arbitrary place,
>>>>>> that's the reason why VMGenID patches hasn't been merged for
>>>>>> more than several months.
>>>>>> I'm not against of using (more exactly I'm for it) direct mapping
>>>>>> but we should reach consensus when and how to use it first.
>>>>>>
>>>>>> I'd wouldn't use addresses below 4G as it may be used firmware or
>>>>>> legacy hardware and I won't bet that 0xFF000000ULL won't conflict
>>>>>> with anything.
>>>>>> An alternative place to allocate reserve from could be high memory.
>>>>>> For pc we have "reserved-memory-end" which currently makes sure
>>>>>> that hotpluggable memory range isn't used by firmware.
>>>>>>
>>>>>> How about making API that allows to map additional memory
>>>>>> ranges before reserved-memory-end and pushes it up as mappings are
>>>>>> added.
>>>>>
>>>>> That what i did in the v1/v2 versions, however, as you noticed, using 64-bit
>>>>> address in ACPI in QEMU is not a easy work - we can not simply make
>>>>> SSDT.rev = 2 to apply 64 bit address, the reason i have documented in
>>>>> v3's changelog:
>>>>>
>>>>>      3) we figure out a unused memory hole below 4G that is 0xFF00000 ~
>>>>>         0xFFF00000, this range is large enough for NVDIMM ACPI as build 64-bit
>>>>>         ACPI SSDT/DSDT table will break windows XP.
>>>>>         BTW, only make SSDT.rev = 2 can not work since the width is only depended
>>>>>         on DSDT.rev based on 19.6.28 DefinitionBlock (Declare Definition Block)
>>>>>         in ACPI spec:
>>>>> | Note: For compatibility with ACPI versions before ACPI 2.0, the bit
>>>>> | width of Integer objects is dependent on the ComplianceRevision of the DSDT.
>>>>> | If the ComplianceRevision is less than 2, all integers are restricted to 32
>>>>> | bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets
>>>>> | the global integer width for all integers, including integers in SSDTs.
>>>>>      4) use the lowest ACPI spec version to document AML terms.
>>>>>
>>>>> The only way introducing 64 bit address is adding XSDT support that what
>>>>> Michael did before, however, it seems it need great efforts to do it as
>>>>> it will break OVMF. It's a long term workload. :(
>>>> to enable 64-bit integers in AML it's sufficient to change DSDT revision to 2,
>>>> I already have a patch that switches DSDT/SSDT to rev2.
>>>> Tests show it doesn't break WindowsXP (which is rev1) and uses 64-bit integers
>>>> on linux & later Windows versions.
>>>
>>> Great, i remembered i did the similar test (directly change DSDT to rev2) and it
>>> caused winXP blue screen. Could you please tell me where i can find your patch?
>> https://github.com/imammedo/qemu/commits/mhpt_table_v2
>> following changes revision:
>>   pc: acpi: bump DSDT/SSDT compliance revision to v2
>> and here is user:
>>   acpi: memhp: simplify MCRS() using 64-bit math
>>
>> when writing ASL one shall make sure that only XP supported
>> features are in global scope, which is evaluated when tables
>> are loaded and features of rev2 and higher are inside methods.
>> That way XP doesn't crash as far as it doesn't evaluate unsupported
>> opcode and one can guard those opcodes checking _REV object if neccesary.
>>
>
> Really a good study case to me, i tried your patch and moved the 64 bit
> staffs to the private method, it worked. :)
>
> Igor, is this the API you want?

BTW, after move the control region to 4G above, is it acceptable to reserve enough
memory containing whole FIT and dsm memory as we did it in previous version?
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Igor Mammedov Nov. 9, 2015, 11:13 a.m. UTC | #2
On Fri, 6 Nov 2015 16:31:43 +0800
Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:

> 
> 
> On 11/05/2015 10:49 PM, Igor Mammedov wrote:
> > On Thu, 5 Nov 2015 21:33:39 +0800
> > Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
> >
> >>
> >>
> >> On 11/05/2015 09:03 PM, Igor Mammedov wrote:
> >>> On Thu, 5 Nov 2015 18:15:31 +0800
> >>> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
> >>>
> >>>>
> >>>>
> >>>> On 11/05/2015 05:58 PM, Igor Mammedov wrote:
> >>>>> On Mon,  2 Nov 2015 17:13:27 +0800
> >>>>> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
> >>>>>
> >>>>>> A page staring from 0xFF00000 and IO port 0x0a18 - 0xa1b in guest are
> >>>>>                                   ^^ missing one 0???
> >>>>>
> >>>>>> reserved for NVDIMM ACPI emulation, refer to docs/specs/acpi_nvdimm.txt
> >>>>>> for detailed design
> >>>>>>
> >>>>>> A parameter, 'nvdimm-support', is introduced for PIIX4_PM and ICH9-LPC
> >>>>>> that controls if nvdimm support is enabled, it is true on default and
> >>>>>> it is false on 2.4 and its earlier version to keep compatibility
> >>>>>>
> >>>>>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> >>>>> [...]
> >>>>>
> >>>>>> @@ -33,6 +33,15 @@
> >>>>>>      */
> >>>>>>     #define MIN_NAMESPACE_LABEL_SIZE      (128UL << 10)
> >>>>>>
> >>>>>> +/*
> >>>>>> + * A page staring from 0xFF00000 and IO port 0x0a18 - 0xa1b in guest are
> >>>>>                                     ^^^ missing 0 or value in define below has an extra 0
> >>>>>
> >>>>>> + * reserved for NVDIMM ACPI emulation, refer to docs/specs/acpi_nvdimm.txt
> >>>>>> + * for detailed design.
> >>>>>> + */
> >>>>>> +#define NVDIMM_ACPI_MEM_BASE          0xFF000000ULL
> >>>>> it still maps RAM at arbitrary place,
> >>>>> that's the reason why VMGenID patches hasn't been merged for
> >>>>> more than several months.
> >>>>> I'm not against of using (more exactly I'm for it) direct mapping
> >>>>> but we should reach consensus when and how to use it first.
> >>>>>
> >>>>> I'd wouldn't use addresses below 4G as it may be used firmware or
> >>>>> legacy hardware and I won't bet that 0xFF000000ULL won't conflict
> >>>>> with anything.
> >>>>> An alternative place to allocate reserve from could be high memory.
> >>>>> For pc we have "reserved-memory-end" which currently makes sure
> >>>>> that hotpluggable memory range isn't used by firmware.
> >>>>>
> >>>>> How about making API that allows to map additional memory
> >>>>> ranges before reserved-memory-end and pushes it up as mappings are
> >>>>> added.
[...]

> 
> Really a good study case to me, i tried your patch and moved the 64 bit
> staffs to the private method, it worked. :)
> 
> Igor, is this the API you want?

Lets get ack from Michael on the idea of RAM mapping before
"reserved-memory-end" first.
If he rejects it then there isn't any other way except of switching
to MMIO instead.

> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 6bf569a..aba29df 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1291,6 +1291,38 @@ FWCfgState *xen_load_linux(PCMachineState *pcms,
>       return fw_cfg;
>   }
> 
> +static void pc_reserve_high_memory_init(PCMachineState *pcms,
> +                                        uint64_t base, uint64_t align)
> +{
> +    pcms->reserve_high_memory.current_addr = ROUND_UP(base, align);
> +}
> +
> +static uint64_t
> +pc_reserve_high_memory_end(PCMachineState *pcms, int64_t align)
> +{
> +    return ROUND_UP(pcms->reserve_high_memory.current_addr, align);
> +}
> +
> +uint64_t pc_reserve_high_memory(PCMachineState *pcms, uint64_t size,
> +                                int64_t align, Error **errp)
> +{
> +    uint64_t end_addr, current_addr = pcms->reserve_high_memory.current_addr;
> +
> +    if (!current_addr) {
> +        error_setg(errp, "reserved high memory is not initialized.");
> +        return 0;
> +    }
> +
> +    end_addr = pc_reserve_high_memory_end(pcms, align) + size;
> +    if (current_addr > end_addr) {
> +        error_setg(errp, "reserved high memory is not enough.");
> +        return 0;
> +    }
> +
> +    pcms->reserve_high_memory.current_addr = end_addr;
> +    return end_addr;
> +}
> +
>   FWCfgState *pc_memory_init(PCMachineState *pcms,
>                              MemoryRegion *system_memory,
>                              MemoryRegion *rom_memory,
> @@ -1379,6 +1411,8 @@ FWCfgState *pc_memory_init(PCMachineState *pcms,
>                              "hotplug-memory", hotplug_mem_size);
>           memory_region_add_subregion(system_memory, pcms->hotplug_memory.base,
>                                       &pcms->hotplug_memory.mr);
> +        pc_reserve_high_memory_init(pcms, pcms->hotplug_memory.base +
> +                                    hotplug_mem_size, 1ULL << 30);
>       }
> 
>       /* Initialize PC system firmware */
> @@ -1403,7 +1437,7 @@ FWCfgState *pc_memory_init(PCMachineState *pcms,
>           uint64_t res_mem_end = pcms->hotplug_memory.base;
> 
>           if (!pcmc->broken_reserved_end) {
> -            res_mem_end += memory_region_size(&pcms->hotplug_memory.mr);
> +            res_mem_end = pc_reserve_high_memory_end(pcms, 0x1ULL << 30);
>           }
>           *val = cpu_to_le64(ROUND_UP(res_mem_end, 0x1ULL << 30));
>           fw_cfg_add_file(fw_cfg, "etc/reserved-memory-end", val, sizeof(*val));
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 47162cf..fae3fea 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -20,6 +20,11 @@
> 
>   #define HPET_INTCAP "hpet-intcap"
> 
> +struct PCReserveHighMemory {
> +    uint64_t current_addr;
> +};
> +typedef struct PCReserveHighMemory PCReserveHighMemory;
> +
>   /**
>    * PCMachineState:
>    * @acpi_dev: link to ACPI PM device that performs ACPI hotplug handling
> @@ -41,6 +46,7 @@ struct PCMachineState {
>       OnOffAuto smm;
>       bool enforce_aligned_dimm;
>       ram_addr_t below_4g_mem_size, above_4g_mem_size;
> +    PCReserveHighMemory reserve_high_memory;
>   };
> 
>   #define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device"
> @@ -175,6 +181,9 @@ PcGuestInfo *pc_guest_info_init(PCMachineState *pcms);
> 
>   void pc_set_legacy_acpi_data_size(void);
> 
> +uint64_t pc_reserve_high_memory(PCMachineState *pcms, uint64_t size,
> +                                int64_t align, Error **errp);
> +
>   #define PCI_HOST_PROP_PCI_HOLE_START   "pci-hole-start"
>   #define PCI_HOST_PROP_PCI_HOLE_END     "pci-hole-end"
>   #define PCI_HOST_PROP_PCI_HOLE64_START "pci-hole64-start"
> 
> 
> >
> >>>>
> >>>> The luck thing is, the ACPI part is not ABI, we can move it to the high
> >>>> memory if QEMU supports XSDT is ready in future development.
> >>> But mapped control region is ABI and we can't change it if we find out later
> >>> that it breaks something.
> >>
> >> But the ACPI code is completely built by QEMU, which is transparent to guest
> >> and guest should not depend on it, no?
> > unfortunately no, think about cross-version migration.
> 
> It makes sense. Stupid me. :(
> 
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xiao Guangrong Nov. 11, 2015, 3:01 a.m. UTC | #3
On 11/09/2015 07:13 PM, Igor Mammedov wrote:
> On Fri, 6 Nov 2015 16:31:43 +0800
> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>
>>
>>
>> On 11/05/2015 10:49 PM, Igor Mammedov wrote:
>>> On Thu, 5 Nov 2015 21:33:39 +0800
>>> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>>>
>>>>
>>>>
>>>> On 11/05/2015 09:03 PM, Igor Mammedov wrote:
>>>>> On Thu, 5 Nov 2015 18:15:31 +0800
>>>>> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On 11/05/2015 05:58 PM, Igor Mammedov wrote:
>>>>>>> On Mon,  2 Nov 2015 17:13:27 +0800
>>>>>>> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>>>>>>>
>>>>>>>> A page staring from 0xFF00000 and IO port 0x0a18 - 0xa1b in guest are
>>>>>>>                                    ^^ missing one 0???
>>>>>>>
>>>>>>>> reserved for NVDIMM ACPI emulation, refer to docs/specs/acpi_nvdimm.txt
>>>>>>>> for detailed design
>>>>>>>>
>>>>>>>> A parameter, 'nvdimm-support', is introduced for PIIX4_PM and ICH9-LPC
>>>>>>>> that controls if nvdimm support is enabled, it is true on default and
>>>>>>>> it is false on 2.4 and its earlier version to keep compatibility
>>>>>>>>
>>>>>>>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>>>>>>> [...]
>>>>>>>
>>>>>>>> @@ -33,6 +33,15 @@
>>>>>>>>       */
>>>>>>>>      #define MIN_NAMESPACE_LABEL_SIZE      (128UL << 10)
>>>>>>>>
>>>>>>>> +/*
>>>>>>>> + * A page staring from 0xFF00000 and IO port 0x0a18 - 0xa1b in guest are
>>>>>>>                                      ^^^ missing 0 or value in define below has an extra 0
>>>>>>>
>>>>>>>> + * reserved for NVDIMM ACPI emulation, refer to docs/specs/acpi_nvdimm.txt
>>>>>>>> + * for detailed design.
>>>>>>>> + */
>>>>>>>> +#define NVDIMM_ACPI_MEM_BASE          0xFF000000ULL
>>>>>>> it still maps RAM at arbitrary place,
>>>>>>> that's the reason why VMGenID patches hasn't been merged for
>>>>>>> more than several months.
>>>>>>> I'm not against of using (more exactly I'm for it) direct mapping
>>>>>>> but we should reach consensus when and how to use it first.
>>>>>>>
>>>>>>> I'd wouldn't use addresses below 4G as it may be used firmware or
>>>>>>> legacy hardware and I won't bet that 0xFF000000ULL won't conflict
>>>>>>> with anything.
>>>>>>> An alternative place to allocate reserve from could be high memory.
>>>>>>> For pc we have "reserved-memory-end" which currently makes sure
>>>>>>> that hotpluggable memory range isn't used by firmware.
>>>>>>>
>>>>>>> How about making API that allows to map additional memory
>>>>>>> ranges before reserved-memory-end and pushes it up as mappings are
>>>>>>> added.
> [...]
>
>>
>> Really a good study case to me, i tried your patch and moved the 64 bit
>> staffs to the private method, it worked. :)
>>
>> Igor, is this the API you want?
>
> Lets get ack from Michael on the idea of RAM mapping before
> "reserved-memory-end" first.
> If he rejects it then there isn't any other way except of switching
> to MMIO instead.

Michael, what's your idea?

BTW, this is the reason why we prefer to reserve memory space just in case
if you missed the thread:
       http://marc.info/?l=qemu-devel&m=144530844718146&w=2
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch
diff mbox

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 6bf569a..aba29df 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1291,6 +1291,38 @@  FWCfgState *xen_load_linux(PCMachineState *pcms,
      return fw_cfg;
  }

+static void pc_reserve_high_memory_init(PCMachineState *pcms,
+                                        uint64_t base, uint64_t align)
+{
+    pcms->reserve_high_memory.current_addr = ROUND_UP(base, align);
+}
+
+static uint64_t
+pc_reserve_high_memory_end(PCMachineState *pcms, int64_t align)
+{
+    return ROUND_UP(pcms->reserve_high_memory.current_addr, align);
+}
+
+uint64_t pc_reserve_high_memory(PCMachineState *pcms, uint64_t size,
+                                int64_t align, Error **errp)
+{
+    uint64_t end_addr, current_addr = pcms->reserve_high_memory.current_addr;
+
+    if (!current_addr) {
+        error_setg(errp, "reserved high memory is not initialized.");
+        return 0;
+    }
+
+    end_addr = pc_reserve_high_memory_end(pcms, align) + size;
+    if (current_addr > end_addr) {
+        error_setg(errp, "reserved high memory is not enough.");
+        return 0;
+    }
+
+    pcms->reserve_high_memory.current_addr = end_addr;
+    return end_addr;
+}
+
  FWCfgState *pc_memory_init(PCMachineState *pcms,
                             MemoryRegion *system_memory,
                             MemoryRegion *rom_memory,
@@ -1379,6 +1411,8 @@  FWCfgState *pc_memory_init(PCMachineState *pcms,
                             "hotplug-memory", hotplug_mem_size);
          memory_region_add_subregion(system_memory, pcms->hotplug_memory.base,
                                      &pcms->hotplug_memory.mr);
+        pc_reserve_high_memory_init(pcms, pcms->hotplug_memory.base +
+                                    hotplug_mem_size, 1ULL << 30);
      }

      /* Initialize PC system firmware */
@@ -1403,7 +1437,7 @@  FWCfgState *pc_memory_init(PCMachineState *pcms,
          uint64_t res_mem_end = pcms->hotplug_memory.base;

          if (!pcmc->broken_reserved_end) {
-            res_mem_end += memory_region_size(&pcms->hotplug_memory.mr);
+            res_mem_end = pc_reserve_high_memory_end(pcms, 0x1ULL << 30);
          }
          *val = cpu_to_le64(ROUND_UP(res_mem_end, 0x1ULL << 30));
          fw_cfg_add_file(fw_cfg, "etc/reserved-memory-end", val, sizeof(*val));
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 47162cf..fae3fea 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -20,6 +20,11 @@ 

  #define HPET_INTCAP "hpet-intcap"

+struct PCReserveHighMemory {
+    uint64_t current_addr;
+};
+typedef struct PCReserveHighMemory PCReserveHighMemory;
+
  /**
   * PCMachineState:
   * @acpi_dev: link to ACPI PM device that performs ACPI hotplug handling
@@ -41,6 +46,7 @@  struct PCMachineState {
      OnOffAuto smm;
      bool enforce_aligned_dimm;
      ram_addr_t below_4g_mem_size, above_4g_mem_size;
+    PCReserveHighMemory reserve_high_memory;
  };

  #define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device"
@@ -175,6 +181,9 @@  PcGuestInfo *pc_guest_info_init(PCMachineState *pcms);

  void pc_set_legacy_acpi_data_size(void);

+uint64_t pc_reserve_high_memory(PCMachineState *pcms, uint64_t size,
+                                int64_t align, Error **errp);
+
  #define PCI_HOST_PROP_PCI_HOLE_START   "pci-hole-start"
  #define PCI_HOST_PROP_PCI_HOLE_END     "pci-hole-end"
  #define PCI_HOST_PROP_PCI_HOLE64_START "pci-hole64-start"