Message ID | ea43ecd8e920088a783f81a7b2aeee8715657dc6.1613025709.git.isaku.yamahata@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ACPI related fixes to comform the spec better | expand |
On Wed, 10 Feb 2021 22:46:43 -0800 Isaku Yamahata <isaku.yamahata@intel.com> wrote: > Declare PNP0C01 device to reserve MMCONFIG region to conform to the > spec better and play nice with guest BIOSes/OSes. > > According to PCI Firmware Specification[0], MMCONFIG region must be > reserved by declaring a motherboard resource. It's optional to reserve > the region in memory map by Int 15 E820h or EFIGetMemoryMap. > Guest Linux checks if the MMCFG region is reserved by bios memory map > or ACPI resource. If it's not reserved, Linux falls back to legacy PCI > configuration access. > > TDVF [1] [2] doesn't reserve MMCONFIG the region in memory map. > On the other hand OVMF reserves it in memory map without declaring a > motherboard resource. With memory map reservation, linux guest uses > MMCONFIG region. However it doesn't comply to PCI Firmware > specification. > > [0] PCI Firmware specification Revision 3.2 > 4.1.2 MCFG Table Description table 4-2 NOTE 2 > If the operating system does not natively comprehend reserving the > MMCFG region, The MMCFG region must e reserved by firmware. ... > For most systems, the mortheroard resource would appear at the root > of the ACPI namespace (under \_SB)... > The resource can optionally be returned in Int15 E820h or > EFIGetMemoryMap as reserved memory but must always be reported > through ACPI as a motherboard resource > > [1] TDX: Intel Trust Domain Extension > https://software.intel.com/content/www/us/en/develop/articles/intel-trust-domain-extensions.html > [2] TDX Virtual Firmware > https://github.com/tianocore/edk2-staging/tree/TDVF > > The change to DSDT is as follows. > @@ -68,32 +68,51 @@ > > If ((CDW3 != Local0)) > { > CDW1 |= 0x10 > } > > CDW3 = Local0 > } > Else > { > CDW1 |= 0x04 > } > > Return (Arg3) > } > } > + > + Device (DRAC) > + { > + Name (_HID, "PNP0C01" /* System Board */) // _HID: Hardware ID > + Name (RBUF, ResourceTemplate () > + { > + QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite, > + 0x0000000000000000, // Granularity > + 0x00000000B0000000, // Range Minimum > + 0x00000000BFFFFFFF, // Range Maximum > + 0x0000000000000000, // Translation Offset > + 0x0000000010000000, // Length > + ,, , AddressRangeMemory, TypeStatic) > + }) > + Method (_CRS, 0, Serialized) // _CRS: Current Resource Settings > + { > + Return (RBUF) /* \_SB_.DRAC.RBUF */ > + } > + } > } > > Scope (_SB) > { > Device (HPET) > { > Name (_HID, EisaId ("PNP0103") /* HPET System Timer */) // _HID: Hardware ID > Name (_UID, Zero) // _UID: Unique ID > OperationRegion (HPTM, SystemMemory, 0xFED00000, 0x0400) > Field (HPTM, DWordAcc, Lock, Preserve) > { > VEND, 32, > PRD, 32 > } > > Method (_STA, 0, NotSerialized) // _STA: Status > > Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> > --- > hw/i386/acpi-build.c | 46 +++++++++++++++++++++++++++++++++++++++++++- > 1 file changed, 45 insertions(+), 1 deletion(-) > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > index 00cc119362..e369908b1a 100644 > --- a/hw/i386/acpi-build.c > +++ b/hw/i386/acpi-build.c > @@ -1072,6 +1072,46 @@ static void build_q35_pci0_int(Aml *table) > aml_append(table, sb_scope); > } > > +static Aml *build_q35_dram_controller(void) > +{ > + AcpiMcfgInfo mcfg; build_dsdt() already calls acpi_get_mcfg(), I suggest to cache it there and pass to build_q35_dram_controller as an argument. > + Aml *dev; > + Aml *rbuf; > + Aml *resource_template; > + Aml *rbuf_name; > + Aml *crs; > + > + if (!acpi_get_mcfg(&mcfg)) { > + return NULL; > + } > + > + /* DRAM controller */ > + dev = aml_device("DRAC"); > + aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C01"))); > + > + resource_template = aml_resource_template(); > + aml_append(resource_template, > + aml_qword_memory(AML_POS_DECODE, > + AML_MIN_FIXED, > + AML_MAX_FIXED, > + AML_NON_CACHEABLE, > + AML_READ_WRITE, > + 0x0000000000000000, > + mcfg.base, > + mcfg.base + mcfg.size - 1, s/mcfg.base + mcfg.size - 1/mcfg.base/ > + 0x0000000000000000, > + mcfg.size)); > + rbuf = aml_name_decl("RBUF", resource_template); > + aml_append(dev, rbuf); > + > + crs = aml_method("_CRS", 0, AML_SERIALIZED); > + rbuf_name = aml_name("RBUF"); > + aml_append(crs, aml_return(rbuf_name)); > + aml_append(dev, crs); > + > + return dev; > +} > + > static void build_q35_isa_bridge(Aml *table) > { > Aml *dev; > @@ -1212,7 +1252,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, > Range *pci_hole, Range *pci_hole64, MachineState *machine) > { > CrsRangeEntry *entry; > - Aml *dsdt, *sb_scope, *scope, *dev, *method, *field, *pkg, *crs; > + Aml *dsdt, *sb_scope, *scope, *dev, *method, *field, *pkg, *crs, *drac; maybe limit drac to if (misc->is_piix4) { ... } else { Aml *drac scope > CrsRangeSet crs_range_set; > PCMachineState *pcms = PC_MACHINE(machine); > PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine); > @@ -1256,6 +1296,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, > aml_append(dev, aml_name_decl("_UID", aml_int(0))); > aml_append(dev, build_q35_osc_method()); > aml_append(sb_scope, dev); > + drac = build_q35_dram_controller(); > + if (drac) { > + aml_append(sb_scope, drac); > + } mmcfg isn't optional for q35, is it? I'd simplify to: aml_append(sb_scope, build_q35_dram_controller(mmcfg_info)); or put all of it under condition if it's optional if(foo) aml_append(sb_scope, build_q35_dram_controller(mmcfg_info)) > if (pm->smi_on_cpuhp) { > /* reserve SMI block resources, IO ports 0xB2, 0xB3 */
On Fri, Feb 12, 2021 at 04:40:38PM +0100, Igor Mammedov <imammedo@redhat.com> wrote: > On Wed, 10 Feb 2021 22:46:43 -0800 > Isaku Yamahata <isaku.yamahata@intel.com> wrote: > > > Declare PNP0C01 device to reserve MMCONFIG region to conform to the > > spec better and play nice with guest BIOSes/OSes. > > > > According to PCI Firmware Specification[0], MMCONFIG region must be > > reserved by declaring a motherboard resource. It's optional to reserve > > the region in memory map by Int 15 E820h or EFIGetMemoryMap. > > Guest Linux checks if the MMCFG region is reserved by bios memory map > > or ACPI resource. If it's not reserved, Linux falls back to legacy PCI > > configuration access. > > > > TDVF [1] [2] doesn't reserve MMCONFIG the region in memory map. > > On the other hand OVMF reserves it in memory map without declaring a > > motherboard resource. With memory map reservation, linux guest uses > > MMCONFIG region. However it doesn't comply to PCI Firmware > > specification. > > > > [0] PCI Firmware specification Revision 3.2 > > 4.1.2 MCFG Table Description table 4-2 NOTE 2 > > If the operating system does not natively comprehend reserving the > > MMCFG region, The MMCFG region must e reserved by firmware. ... > > For most systems, the mortheroard resource would appear at the root > > of the ACPI namespace (under \_SB)... > > The resource can optionally be returned in Int15 E820h or > > EFIGetMemoryMap as reserved memory but must always be reported > > through ACPI as a motherboard resource > > > > [1] TDX: Intel Trust Domain Extension > > https://software.intel.com/content/www/us/en/develop/articles/intel-trust-domain-extensions.html > > [2] TDX Virtual Firmware > > https://github.com/tianocore/edk2-staging/tree/TDVF > > > > The change to DSDT is as follows. > > @@ -68,32 +68,51 @@ > > > > If ((CDW3 != Local0)) > > { > > CDW1 |= 0x10 > > } > > > > CDW3 = Local0 > > } > > Else > > { > > CDW1 |= 0x04 > > } > > > > Return (Arg3) > > } > > } > > + > > + Device (DRAC) > > + { > > + Name (_HID, "PNP0C01" /* System Board */) // _HID: Hardware ID > > + Name (RBUF, ResourceTemplate () > > + { > > + QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite, > > + 0x0000000000000000, // Granularity > > + 0x00000000B0000000, // Range Minimum > > + 0x00000000BFFFFFFF, // Range Maximum > > + 0x0000000000000000, // Translation Offset > > + 0x0000000010000000, // Length > > + ,, , AddressRangeMemory, TypeStatic) > > + }) > > + Method (_CRS, 0, Serialized) // _CRS: Current Resource Settings > > + { > > + Return (RBUF) /* \_SB_.DRAC.RBUF */ > > + } > > + } > > } > > > > Scope (_SB) > > { > > Device (HPET) > > { > > Name (_HID, EisaId ("PNP0103") /* HPET System Timer */) // _HID: Hardware ID > > Name (_UID, Zero) // _UID: Unique ID > > OperationRegion (HPTM, SystemMemory, 0xFED00000, 0x0400) > > Field (HPTM, DWordAcc, Lock, Preserve) > > { > > VEND, 32, > > PRD, 32 > > } > > > > Method (_STA, 0, NotSerialized) // _STA: Status > > > > Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> > > --- > > hw/i386/acpi-build.c | 46 +++++++++++++++++++++++++++++++++++++++++++- > > 1 file changed, 45 insertions(+), 1 deletion(-) > > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > > index 00cc119362..e369908b1a 100644 > > --- a/hw/i386/acpi-build.c > > +++ b/hw/i386/acpi-build.c > > @@ -1072,6 +1072,46 @@ static void build_q35_pci0_int(Aml *table) > > aml_append(table, sb_scope); > > } > > > > +static Aml *build_q35_dram_controller(void) > > +{ > > + AcpiMcfgInfo mcfg; > build_dsdt() already calls acpi_get_mcfg(), > I suggest to cache it there and pass to build_q35_dram_controller > as an argument. Sure. > > + Aml *dev; > > + Aml *rbuf; > > + Aml *resource_template; > > + Aml *rbuf_name; > > + Aml *crs; > > + > > + if (!acpi_get_mcfg(&mcfg)) { > > + return NULL; > > + } > > + > > + /* DRAM controller */ > > + dev = aml_device("DRAC"); > > + aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C01"))); > > + > > + resource_template = aml_resource_template(); > > + aml_append(resource_template, > > + aml_qword_memory(AML_POS_DECODE, > > + AML_MIN_FIXED, > > + AML_MAX_FIXED, > > + AML_NON_CACHEABLE, > > + AML_READ_WRITE, > > + 0x0000000000000000, > > + mcfg.base, > > + mcfg.base + mcfg.size - 1, > s/mcfg.base + mcfg.size - 1/mcfg.base/ AddressMaximum needs to be the highest address of the region. Not base address. By disassemble/assembl it, iasl complains as follows. Also there are similar examples in acpi-build.c. iasl drm-1.dsl Intel ACPI Component Architecture ASL+ Optimizing Compiler/Disassembler version 20190509 Copyright (c) 2000 - 2019 Intel Corporation drm-1.dsl 23: 0x10000000, // Length = 256M Error 6049 - ^ Length is larger than Min/Max window ASL Input: drm-1.dsl - 1000 bytes 6 keywords 37 source lines Compilation failed. 1 Errors, 0 Warnings, 0 Remarks No AML files were generated due to compiler error(s) > > + 0x0000000000000000, > > + mcfg.size)); > > + rbuf = aml_name_decl("RBUF", resource_template); > > + aml_append(dev, rbuf); > > + > > + crs = aml_method("_CRS", 0, AML_SERIALIZED); > > + rbuf_name = aml_name("RBUF"); > > + aml_append(crs, aml_return(rbuf_name)); > > + aml_append(dev, crs); > > + > > + return dev; > > +} > > + > > static void build_q35_isa_bridge(Aml *table) > > { > > Aml *dev; > > @@ -1212,7 +1252,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, > > Range *pci_hole, Range *pci_hole64, MachineState *machine) > > { > > CrsRangeEntry *entry; > > - Aml *dsdt, *sb_scope, *scope, *dev, *method, *field, *pkg, *crs; > > + Aml *dsdt, *sb_scope, *scope, *dev, *method, *field, *pkg, *crs, *drac; > maybe limit drac to > if (misc->is_piix4) { > ... } else { > Aml *drac > scope ok. > > CrsRangeSet crs_range_set; > > PCMachineState *pcms = PC_MACHINE(machine); > > PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine); > > @@ -1256,6 +1296,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, > > aml_append(dev, aml_name_decl("_UID", aml_int(0))); > > aml_append(dev, build_q35_osc_method()); > > aml_append(sb_scope, dev); > > + drac = build_q35_dram_controller(); > > + if (drac) { > > + aml_append(sb_scope, drac); > > + } > mmcfg isn't optional for q35, is it? > I'd simplify to: > aml_append(sb_scope, build_q35_dram_controller(mmcfg_info)); > or put all of it under condition if it's optional > if(foo) aml_append(sb_scope, build_q35_dram_controller(mmcfg_info)) It's optional. If MCFG isn't setup after reset, PCIE_BASE_ADDR_UNMAPPED is returned.
On Fri, 12 Feb 2021 12:51:57 -0800 Isaku Yamahata <isaku.yamahata@gmail.com> wrote: > On Fri, Feb 12, 2021 at 04:40:38PM +0100, > Igor Mammedov <imammedo@redhat.com> wrote: > > > On Wed, 10 Feb 2021 22:46:43 -0800 > > Isaku Yamahata <isaku.yamahata@intel.com> wrote: > > > > > Declare PNP0C01 device to reserve MMCONFIG region to conform to the > > > spec better and play nice with guest BIOSes/OSes. > > > > > > According to PCI Firmware Specification[0], MMCONFIG region must be > > > reserved by declaring a motherboard resource. It's optional to reserve > > > the region in memory map by Int 15 E820h or EFIGetMemoryMap. > > > Guest Linux checks if the MMCFG region is reserved by bios memory map > > > or ACPI resource. If it's not reserved, Linux falls back to legacy PCI > > > configuration access. > > > > > > TDVF [1] [2] doesn't reserve MMCONFIG the region in memory map. > > > On the other hand OVMF reserves it in memory map without declaring a > > > motherboard resource. With memory map reservation, linux guest uses > > > MMCONFIG region. However it doesn't comply to PCI Firmware > > > specification. > > > > > > [0] PCI Firmware specification Revision 3.2 > > > 4.1.2 MCFG Table Description table 4-2 NOTE 2 > > > If the operating system does not natively comprehend reserving the > > > MMCFG region, The MMCFG region must e reserved by firmware. ... > > > For most systems, the mortheroard resource would appear at the root > > > of the ACPI namespace (under \_SB)... > > > The resource can optionally be returned in Int15 E820h or > > > EFIGetMemoryMap as reserved memory but must always be reported > > > through ACPI as a motherboard resource > > > > > > [1] TDX: Intel Trust Domain Extension > > > https://software.intel.com/content/www/us/en/develop/articles/intel-trust-domain-extensions.html > > > [2] TDX Virtual Firmware > > > https://github.com/tianocore/edk2-staging/tree/TDVF > > > > > > The change to DSDT is as follows. > > > @@ -68,32 +68,51 @@ > > > > > > If ((CDW3 != Local0)) > > > { > > > CDW1 |= 0x10 > > > } > > > > > > CDW3 = Local0 > > > } > > > Else > > > { > > > CDW1 |= 0x04 > > > } > > > > > > Return (Arg3) > > > } > > > } > > > + > > > + Device (DRAC) > > > + { > > > + Name (_HID, "PNP0C01" /* System Board */) // _HID: Hardware ID > > > + Name (RBUF, ResourceTemplate () > > > + { > > > + QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite, > > > + 0x0000000000000000, // Granularity > > > + 0x00000000B0000000, // Range Minimum > > > + 0x00000000BFFFFFFF, // Range Maximum > > > + 0x0000000000000000, // Translation Offset > > > + 0x0000000010000000, // Length > > > + ,, , AddressRangeMemory, TypeStatic) > > > + }) > > > + Method (_CRS, 0, Serialized) // _CRS: Current Resource Settings > > > + { > > > + Return (RBUF) /* \_SB_.DRAC.RBUF */ > > > + } > > > + } > > > } > > > > > > Scope (_SB) > > > { > > > Device (HPET) > > > { > > > Name (_HID, EisaId ("PNP0103") /* HPET System Timer */) // _HID: Hardware ID > > > Name (_UID, Zero) // _UID: Unique ID > > > OperationRegion (HPTM, SystemMemory, 0xFED00000, 0x0400) > > > Field (HPTM, DWordAcc, Lock, Preserve) > > > { > > > VEND, 32, > > > PRD, 32 > > > } > > > > > > Method (_STA, 0, NotSerialized) // _STA: Status > > > > > > Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> > > > --- > > > hw/i386/acpi-build.c | 46 +++++++++++++++++++++++++++++++++++++++++++- > > > 1 file changed, 45 insertions(+), 1 deletion(-) > > > > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > > > index 00cc119362..e369908b1a 100644 > > > --- a/hw/i386/acpi-build.c > > > +++ b/hw/i386/acpi-build.c > > > @@ -1072,6 +1072,46 @@ static void build_q35_pci0_int(Aml *table) > > > aml_append(table, sb_scope); > > > } > > > > > > +static Aml *build_q35_dram_controller(void) > > > +{ > > > + AcpiMcfgInfo mcfg; > > build_dsdt() already calls acpi_get_mcfg(), > > I suggest to cache it there and pass to build_q35_dram_controller > > as an argument. > > Sure. > > > > > + Aml *dev; > > > + Aml *rbuf; > > > + Aml *resource_template; > > > + Aml *rbuf_name; > > > + Aml *crs; > > > + > > > + if (!acpi_get_mcfg(&mcfg)) { > > > + return NULL; > > > + } > > > + > > > + /* DRAM controller */ > > > + dev = aml_device("DRAC"); > > > + aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C01"))); > > > + > > > + resource_template = aml_resource_template(); > > > + aml_append(resource_template, > > > + aml_qword_memory(AML_POS_DECODE, > > > + AML_MIN_FIXED, > > > + AML_MAX_FIXED, > > > + AML_NON_CACHEABLE, > > > + AML_READ_WRITE, > > > + 0x0000000000000000, > > > + mcfg.base, > > > + mcfg.base + mcfg.size - 1, > > s/mcfg.base + mcfg.size - 1/mcfg.base/ > > AddressMaximum needs to be the highest address of the region. > Not base address. By disassemble/assembl it, iasl complains as follows. > Also there are similar examples in acpi-build.c. I mostly clean up all places to use the same base address for min/max, but probably something were left behind. spec says: acpi 6.3: 19.6.110 QWordMemory AddressMaximum evaluates to a 64-bit integer that specifies the highest possible base address of the Memory range. The value must have ‘0’ in all bits where the corresponding bit in AddressGranularity is ‘1’. For bridge devices which translate addresses, this is the address on the secondary bus. The 64-bit field DescriptorName ._MAX is automatically created to refer to this portion of the resource descriptor. > iasl drm-1.dsl > > Intel ACPI Component Architecture > ASL+ Optimizing Compiler/Disassembler version 20190509 > Copyright (c) 2000 - 2019 Intel Corporation > > drm-1.dsl 23: 0x10000000, // Length = 256M > Error 6049 - ^ Length is larger than Min/Max window > > ASL Input: drm-1.dsl - 1000 bytes 6 keywords 37 source lines > > Compilation failed. 1 Errors, 0 Warnings, 0 Remarks > No AML files were generated due to compiler error(s) > > > > > + 0x0000000000000000, > > > + mcfg.size)); > > > + rbuf = aml_name_decl("RBUF", resource_template); > > > + aml_append(dev, rbuf); > > > + > > > + crs = aml_method("_CRS", 0, AML_SERIALIZED); > > > + rbuf_name = aml_name("RBUF"); > > > + aml_append(crs, aml_return(rbuf_name)); > > > + aml_append(dev, crs); > > > + > > > + return dev; > > > +} > > > + > > > static void build_q35_isa_bridge(Aml *table) > > > { > > > Aml *dev; > > > @@ -1212,7 +1252,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, > > > Range *pci_hole, Range *pci_hole64, MachineState *machine) > > > { > > > CrsRangeEntry *entry; > > > - Aml *dsdt, *sb_scope, *scope, *dev, *method, *field, *pkg, *crs; > > > + Aml *dsdt, *sb_scope, *scope, *dev, *method, *field, *pkg, *crs, *drac; > > maybe limit drac to > > if (misc->is_piix4) { > > ... } else { > > Aml *drac > > scope > > ok. > > > > > CrsRangeSet crs_range_set; > > > PCMachineState *pcms = PC_MACHINE(machine); > > > PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine); > > > @@ -1256,6 +1296,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, > > > aml_append(dev, aml_name_decl("_UID", aml_int(0))); > > > aml_append(dev, build_q35_osc_method()); > > > aml_append(sb_scope, dev); > > > + drac = build_q35_dram_controller(); > > > + if (drac) { > > > + aml_append(sb_scope, drac); > > > + } > > mmcfg isn't optional for q35, is it? > > I'd simplify to: > > aml_append(sb_scope, build_q35_dram_controller(mmcfg_info)); > > or put all of it under condition if it's optional > > if(foo) aml_append(sb_scope, build_q35_dram_controller(mmcfg_info)) > > It's optional. If MCFG isn't setup after reset, PCIE_BASE_ADDR_UNMAPPED > is returned. ok
On Mon, Feb 15, 2021 at 01:48:32PM +0100, Igor Mammedov <imammedo@redhat.com> wrote: > On Fri, 12 Feb 2021 12:51:57 -0800 > Isaku Yamahata <isaku.yamahata@gmail.com> wrote: > > > On Fri, Feb 12, 2021 at 04:40:38PM +0100, > > Igor Mammedov <imammedo@redhat.com> wrote: > > > > > On Wed, 10 Feb 2021 22:46:43 -0800 > > > Isaku Yamahata <isaku.yamahata@intel.com> wrote: > > > > > > > + Aml *dev; > > > > + Aml *rbuf; > > > > + Aml *resource_template; > > > > + Aml *rbuf_name; > > > > + Aml *crs; > > > > + > > > > + if (!acpi_get_mcfg(&mcfg)) { > > > > + return NULL; > > > > + } > > > > + > > > > + /* DRAM controller */ > > > > + dev = aml_device("DRAC"); > > > > + aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C01"))); > > > > + > > > > + resource_template = aml_resource_template(); > > > > + aml_append(resource_template, > > > > + aml_qword_memory(AML_POS_DECODE, > > > > + AML_MIN_FIXED, > > > > + AML_MAX_FIXED, > > > > + AML_NON_CACHEABLE, > > > > + AML_READ_WRITE, > > > > + 0x0000000000000000, > > > > + mcfg.base, > > > > + mcfg.base + mcfg.size - 1, > > > s/mcfg.base + mcfg.size - 1/mcfg.base/ > > > > AddressMaximum needs to be the highest address of the region. > > Not base address. By disassemble/assembl it, iasl complains as follows. > > Also there are similar examples in acpi-build.c. > I mostly clean up all places to use the same base address for min/max, > but probably something were left behind. > > spec says: > > acpi 6.3: 19.6.110 QWordMemory > > AddressMaximum evaluates to a 64-bit integer that specifies the highest possible base address of the > Memory range. The value must have ‘0’ in all bits where the corresponding bit in AddressGranularity is > ‘1’. For bridge devices which translate addresses, this is the address on the secondary bus. The 64-bit > field DescriptorName ._MAX is automatically created to refer to this portion of the resource descriptor. Ok, Linux guest is happy with min=max. I conlude that it's iasl issue. Thanks,
On Tue, Feb 16, 2021 at 01:43:01AM -0800, Isaku Yamahata wrote: > On Mon, Feb 15, 2021 at 01:48:32PM +0100, > Igor Mammedov <imammedo@redhat.com> wrote: > > > On Fri, 12 Feb 2021 12:51:57 -0800 > > Isaku Yamahata <isaku.yamahata@gmail.com> wrote: > > > > > On Fri, Feb 12, 2021 at 04:40:38PM +0100, > > > Igor Mammedov <imammedo@redhat.com> wrote: > > > > > > > On Wed, 10 Feb 2021 22:46:43 -0800 > > > > Isaku Yamahata <isaku.yamahata@intel.com> wrote: > > > > > > > > > + Aml *dev; > > > > > + Aml *rbuf; > > > > > + Aml *resource_template; > > > > > + Aml *rbuf_name; > > > > > + Aml *crs; > > > > > + > > > > > + if (!acpi_get_mcfg(&mcfg)) { > > > > > + return NULL; > > > > > + } > > > > > + > > > > > + /* DRAM controller */ > > > > > + dev = aml_device("DRAC"); > > > > > + aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C01"))); > > > > > + > > > > > + resource_template = aml_resource_template(); > > > > > + aml_append(resource_template, > > > > > + aml_qword_memory(AML_POS_DECODE, > > > > > + AML_MIN_FIXED, > > > > > + AML_MAX_FIXED, > > > > > + AML_NON_CACHEABLE, > > > > > + AML_READ_WRITE, > > > > > + 0x0000000000000000, > > > > > + mcfg.base, > > > > > + mcfg.base + mcfg.size - 1, > > > > s/mcfg.base + mcfg.size - 1/mcfg.base/ > > > > > > AddressMaximum needs to be the highest address of the region. > > > Not base address. By disassemble/assembl it, iasl complains as follows. > > > Also there are similar examples in acpi-build.c. > > I mostly clean up all places to use the same base address for min/max, > > but probably something were left behind. > > > > spec says: > > > > acpi 6.3: 19.6.110 QWordMemory > > > > AddressMaximum evaluates to a 64-bit integer that specifies the highest possible base address of the > > Memory range. The value must have ‘0’ in all bits where the corresponding bit in AddressGranularity is > > ‘1’. For bridge devices which translate addresses, this is the address on the secondary bus. The 64-bit > > field DescriptorName ._MAX is automatically created to refer to this portion of the resource descriptor. > > Ok, Linux guest is happy with min=max. > I conlude that it's iasl issue. > > Thanks, OK but what about all the other places in the code that seem to use this field differently?
On Tue, Feb 16, 2021 at 08:45:48AM -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Tue, Feb 16, 2021 at 01:43:01AM -0800, Isaku Yamahata wrote: > > On Mon, Feb 15, 2021 at 01:48:32PM +0100, > > Igor Mammedov <imammedo@redhat.com> wrote: > > > > > On Fri, 12 Feb 2021 12:51:57 -0800 > > > Isaku Yamahata <isaku.yamahata@gmail.com> wrote: > > > > > > > On Fri, Feb 12, 2021 at 04:40:38PM +0100, > > > > Igor Mammedov <imammedo@redhat.com> wrote: > > > > > > > > > On Wed, 10 Feb 2021 22:46:43 -0800 > > > > > Isaku Yamahata <isaku.yamahata@intel.com> wrote: > > > > > > > > > > > + Aml *dev; > > > > > > + Aml *rbuf; > > > > > > + Aml *resource_template; > > > > > > + Aml *rbuf_name; > > > > > > + Aml *crs; > > > > > > + > > > > > > + if (!acpi_get_mcfg(&mcfg)) { > > > > > > + return NULL; > > > > > > + } > > > > > > + > > > > > > + /* DRAM controller */ > > > > > > + dev = aml_device("DRAC"); > > > > > > + aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C01"))); > > > > > > + > > > > > > + resource_template = aml_resource_template(); > > > > > > + aml_append(resource_template, > > > > > > + aml_qword_memory(AML_POS_DECODE, > > > > > > + AML_MIN_FIXED, > > > > > > + AML_MAX_FIXED, > > > > > > + AML_NON_CACHEABLE, > > > > > > + AML_READ_WRITE, > > > > > > + 0x0000000000000000, > > > > > > + mcfg.base, > > > > > > + mcfg.base + mcfg.size - 1, > > > > > s/mcfg.base + mcfg.size - 1/mcfg.base/ > > > > > > > > AddressMaximum needs to be the highest address of the region. > > > > Not base address. By disassemble/assembl it, iasl complains as follows. > > > > Also there are similar examples in acpi-build.c. > > > I mostly clean up all places to use the same base address for min/max, > > > but probably something were left behind. > > > > > > spec says: > > > > > > acpi 6.3: 19.6.110 QWordMemory > > > > > > AddressMaximum evaluates to a 64-bit integer that specifies the highest possible base address of the > > > Memory range. The value must have ‘0’ in all bits where the corresponding bit in AddressGranularity is > > > ‘1’. For bridge devices which translate addresses, this is the address on the secondary bus. The 64-bit > > > field DescriptorName ._MAX is automatically created to refer to this portion of the resource descriptor. > > > > Ok, Linux guest is happy with min=max. > > I conlude that it's iasl issue. > > > > Thanks, > > OK but what about all the other places in the code that seem to use this > field differently? Igor, what do you think? The followings are the summary of the situation. - acpi 6.4: 19.6.110 QWordMemory _MAX: maximum of base address. - acpi 6.4: 6.4.3.5 Address Space Resource Descriptors table 6.44 If _LEN > 0 and _MIF = 1 and _MAF = 1, then _LEN = _MAX - _MIN + 1 This doesn't match with the above description - iasl If _LEN > 0 and _MIF = 1 and _MAF = 1, it emits warning on _LEN != _MAX - _MIN + 1 - Linux Guest MMCONFIG check check_mcfg_resoure() uses only _MIN. doesn't use _MAX. _MAX value doesn't matter - Linux acpi code acpi_decode_space() uses _MAX to calulate range [start, end], not use _LEN. i.e. It assumes _LEN = _MAX - _MIN + 1 if _LEN > 0, _MIF = 1, _MAF = 1 - qemu code sets for [qd]word_memory_resource (except this patch) If _LEN > 0 and _MIF = 1 and _MAF = 1, then set _LEN = _MAX - _MIN + 1
On Tue, 16 Feb 2021 10:13:25 -0800 Isaku Yamahata <isaku.yamahata@gmail.com> wrote: > On Tue, Feb 16, 2021 at 08:45:48AM -0500, > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Tue, Feb 16, 2021 at 01:43:01AM -0800, Isaku Yamahata wrote: > > > On Mon, Feb 15, 2021 at 01:48:32PM +0100, > > > Igor Mammedov <imammedo@redhat.com> wrote: > > > > > > > On Fri, 12 Feb 2021 12:51:57 -0800 > > > > Isaku Yamahata <isaku.yamahata@gmail.com> wrote: > > > > > > > > > On Fri, Feb 12, 2021 at 04:40:38PM +0100, > > > > > Igor Mammedov <imammedo@redhat.com> wrote: > > > > > > > > > > > On Wed, 10 Feb 2021 22:46:43 -0800 > > > > > > Isaku Yamahata <isaku.yamahata@intel.com> wrote: > > > > > > > > > > > > > + Aml *dev; > > > > > > > + Aml *rbuf; > > > > > > > + Aml *resource_template; > > > > > > > + Aml *rbuf_name; > > > > > > > + Aml *crs; > > > > > > > + > > > > > > > + if (!acpi_get_mcfg(&mcfg)) { > > > > > > > + return NULL; > > > > > > > + } > > > > > > > + > > > > > > > + /* DRAM controller */ > > > > > > > + dev = aml_device("DRAC"); > > > > > > > + aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C01"))); > > > > > > > + > > > > > > > + resource_template = aml_resource_template(); > > > > > > > + aml_append(resource_template, > > > > > > > + aml_qword_memory(AML_POS_DECODE, > > > > > > > + AML_MIN_FIXED, > > > > > > > + AML_MAX_FIXED, > > > > > > > + AML_NON_CACHEABLE, > > > > > > > + AML_READ_WRITE, > > > > > > > + 0x0000000000000000, > > > > > > > + mcfg.base, > > > > > > > + mcfg.base + mcfg.size - 1, > > > > > > s/mcfg.base + mcfg.size - 1/mcfg.base/ > > > > > > > > > > AddressMaximum needs to be the highest address of the region. > > > > > Not base address. By disassemble/assembl it, iasl complains as follows. > > > > > Also there are similar examples in acpi-build.c. > > > > I mostly clean up all places to use the same base address for min/max, > > > > but probably something were left behind. > > > > > > > > spec says: > > > > > > > > acpi 6.3: 19.6.110 QWordMemory > > > > > > > > AddressMaximum evaluates to a 64-bit integer that specifies the highest possible base address of the > > > > Memory range. The value must have ‘0’ in all bits where the corresponding bit in AddressGranularity is > > > > ‘1’. For bridge devices which translate addresses, this is the address on the secondary bus. The 64-bit > > > > field DescriptorName ._MAX is automatically created to refer to this portion of the resource descriptor. > > > > > > Ok, Linux guest is happy with min=max. > > > I conlude that it's iasl issue. > > > > > > Thanks, > > > > OK but what about all the other places in the code that seem to use this > > field differently? > > Igor, what do you think? > The followings are the summary of the situation. > > - acpi 6.4: 19.6.110 QWordMemory > _MAX: maximum of base address. > > - acpi 6.4: 6.4.3.5 Address Space Resource Descriptors > table 6.44 > If _LEN > 0 and _MIF = 1 and _MAF = 1, then _LEN = _MAX - _MIN + 1 > This doesn't match with the above description I'd say it 19.6.110 doesn't describes whole possibilities of resource, but only subset in its current phrasing. > - iasl > If _LEN > 0 and _MIF = 1 and _MAF = 1, > it emits warning on _LEN != _MAX - _MIN + 1 so IASL is right and follows spec. Anyways, My apologies for misleading and your patch is correct. I need to add similar checks to QEMU code plus comments to avoid confusion in future. > - Linux Guest MMCONFIG check > check_mcfg_resoure() uses only _MIN. doesn't use _MAX. > _MAX value doesn't matter > > - Linux acpi code > acpi_decode_space() uses _MAX to calulate range [start, end], not use _LEN. > i.e. It assumes _LEN = _MAX - _MIN + 1 if _LEN > 0, _MIF = 1, _MAF = 1 > > - qemu code sets for [qd]word_memory_resource (except this patch) > If _LEN > 0 and _MIF = 1 and _MAF = 1, then set _LEN = _MAX - _MIN + 1
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 00cc119362..e369908b1a 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1072,6 +1072,46 @@ static void build_q35_pci0_int(Aml *table) aml_append(table, sb_scope); } +static Aml *build_q35_dram_controller(void) +{ + AcpiMcfgInfo mcfg; + Aml *dev; + Aml *rbuf; + Aml *resource_template; + Aml *rbuf_name; + Aml *crs; + + if (!acpi_get_mcfg(&mcfg)) { + return NULL; + } + + /* DRAM controller */ + dev = aml_device("DRAC"); + aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C01"))); + + resource_template = aml_resource_template(); + aml_append(resource_template, + aml_qword_memory(AML_POS_DECODE, + AML_MIN_FIXED, + AML_MAX_FIXED, + AML_NON_CACHEABLE, + AML_READ_WRITE, + 0x0000000000000000, + mcfg.base, + mcfg.base + mcfg.size - 1, + 0x0000000000000000, + mcfg.size)); + rbuf = aml_name_decl("RBUF", resource_template); + aml_append(dev, rbuf); + + crs = aml_method("_CRS", 0, AML_SERIALIZED); + rbuf_name = aml_name("RBUF"); + aml_append(crs, aml_return(rbuf_name)); + aml_append(dev, crs); + + return dev; +} + static void build_q35_isa_bridge(Aml *table) { Aml *dev; @@ -1212,7 +1252,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, Range *pci_hole, Range *pci_hole64, MachineState *machine) { CrsRangeEntry *entry; - Aml *dsdt, *sb_scope, *scope, *dev, *method, *field, *pkg, *crs; + Aml *dsdt, *sb_scope, *scope, *dev, *method, *field, *pkg, *crs, *drac; CrsRangeSet crs_range_set; PCMachineState *pcms = PC_MACHINE(machine); PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine); @@ -1256,6 +1296,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, aml_append(dev, aml_name_decl("_UID", aml_int(0))); aml_append(dev, build_q35_osc_method()); aml_append(sb_scope, dev); + drac = build_q35_dram_controller(); + if (drac) { + aml_append(sb_scope, drac); + } if (pm->smi_on_cpuhp) { /* reserve SMI block resources, IO ports 0xB2, 0xB3 */
Declare PNP0C01 device to reserve MMCONFIG region to conform to the spec better and play nice with guest BIOSes/OSes. According to PCI Firmware Specification[0], MMCONFIG region must be reserved by declaring a motherboard resource. It's optional to reserve the region in memory map by Int 15 E820h or EFIGetMemoryMap. Guest Linux checks if the MMCFG region is reserved by bios memory map or ACPI resource. If it's not reserved, Linux falls back to legacy PCI configuration access. TDVF [1] [2] doesn't reserve MMCONFIG the region in memory map. On the other hand OVMF reserves it in memory map without declaring a motherboard resource. With memory map reservation, linux guest uses MMCONFIG region. However it doesn't comply to PCI Firmware specification. [0] PCI Firmware specification Revision 3.2 4.1.2 MCFG Table Description table 4-2 NOTE 2 If the operating system does not natively comprehend reserving the MMCFG region, The MMCFG region must e reserved by firmware. ... For most systems, the mortheroard resource would appear at the root of the ACPI namespace (under \_SB)... The resource can optionally be returned in Int15 E820h or EFIGetMemoryMap as reserved memory but must always be reported through ACPI as a motherboard resource [1] TDX: Intel Trust Domain Extension https://software.intel.com/content/www/us/en/develop/articles/intel-trust-domain-extensions.html [2] TDX Virtual Firmware https://github.com/tianocore/edk2-staging/tree/TDVF The change to DSDT is as follows. @@ -68,32 +68,51 @@ If ((CDW3 != Local0)) { CDW1 |= 0x10 } CDW3 = Local0 } Else { CDW1 |= 0x04 } Return (Arg3) } } + + Device (DRAC) + { + Name (_HID, "PNP0C01" /* System Board */) // _HID: Hardware ID + Name (RBUF, ResourceTemplate () + { + QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite, + 0x0000000000000000, // Granularity + 0x00000000B0000000, // Range Minimum + 0x00000000BFFFFFFF, // Range Maximum + 0x0000000000000000, // Translation Offset + 0x0000000010000000, // Length + ,, , AddressRangeMemory, TypeStatic) + }) + Method (_CRS, 0, Serialized) // _CRS: Current Resource Settings + { + Return (RBUF) /* \_SB_.DRAC.RBUF */ + } + } } Scope (_SB) { Device (HPET) { Name (_HID, EisaId ("PNP0103") /* HPET System Timer */) // _HID: Hardware ID Name (_UID, Zero) // _UID: Unique ID OperationRegion (HPTM, SystemMemory, 0xFED00000, 0x0400) Field (HPTM, DWordAcc, Lock, Preserve) { VEND, 32, PRD, 32 } Method (_STA, 0, NotSerialized) // _STA: Status Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> --- hw/i386/acpi-build.c | 46 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 45 insertions(+), 1 deletion(-)