Message ID | 5FC3163CFD30C246ABAA99954A238FA83F3FB328@lhreml524-mbs.china.huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Invalid blob size on NVDIMM hot-add (was: RE: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support) | expand |
On 09/20/19 19:04, Shameerali Kolothum Thodi wrote: > Hi Laszlo/Igor, > > I spend some time to debug this further as I was rebasing the nvdimm > hot-add support patches on top of the ongoing pc-dimm hot add ones. > > Just to refresh the memory: > > https://patchwork.kernel.org/cover/10783589/ > > " It is observed that hot adding nvdimm will results in guest reboot > failure. EDK2 fails to build the ACPI tables on reboot. Please find > below EDK2 log on Guest reboot after nvdimm hot-add, > > ProcessCmdAddChecksum: invalid checksum range in "etc/acpi/tables" > OnRootBridgesConnected: InstallAcpiTables: Protocol Error > " > > Please find below, > >> -----Original Message----- >> From: Laszlo Ersek [mailto:lersek@redhat.com] >> Sent: 05 March 2019 12:15 >> To: Igor Mammedov <imammedo@redhat.com> >> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>; >> Auger Eric <eric.auger@redhat.com>; shannon.zhaosl@gmail.com; >> peter.maydell@linaro.org; qemu-devel@nongnu.org; qemu-arm@nongnu.org; >> xuwei (O) <xuwei5@huawei.com>; Linuxarm <linuxarm@huawei.com>; Ard >> Biesheuvel <ard.biesheuvel@linaro.org>; Leif Lindholm (Linaro >> address) <leif.lindholm@linaro.org> >> Subject: Re: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support >> >> On 03/01/19 18:39, Igor Mammedov wrote: >>> On Fri, 1 Mar 2019 14:49:45 +0100 >>> Laszlo Ersek <lersek@redhat.com> wrote: >>> >>>> On 02/28/19 15:02, Shameerali Kolothum Thodi wrote: >>>> >>>>> Ah..I missed the fact that, firmware indeed sees an update in the >>>>> blob len here (rounded or not) after reboot. So don't think x86 >>>>> has the same issue and padding is not the right solution as Igor >>>>> explained in his reply. >>>>> >>>>> I will try to debug this further. Any pointers welcome. >>>> >>>> How about this. >>>> >>>> (1) The firmware looks up the fw_cfg file called "etc/table-loader" >>>> in the fw_cfg file directory (identified by constant selector key >>>> 0x0019, FW_CFG_FILE_DIR). >>>> >>>> (2) The directory entry, once found, tells the firmware two things >>>> simultaneously. The selector key, and the size of the blob. >>>> >>>> (3) The firmware selects the selector key from step (2). >>>> >>>> (4) QEMU regenerates the ACPI payload (as a select callback). >>>> >>>> (5) The firmware reads the number of bytes from the fw_cfg blob >>>> that it learned in step (2). >>>> >>>> Here's the problem. As long as QEMU used to perform step (4) only >>>> for the purpose of refreshing PCI resources in the ACPI payload, >>>> step (4) wouldn't *resize* the blob. >>>> >>>> However, if step (4) enlarges the blob, then the byte count that >>>> step (5) uses -- from step (2) -- for reading, is obsolete. >> >>> I've thought that was a problem with IO based fw_cfg, as reading >>> size/content were separates steps and that it was solved by DMA >>> based fw_cfg file read. >> >> The DMA backend is not relevant for this question, for two reasons: >> >> (a) The question whether the fw_cfg transfer takes places with port >> IO vs. DMA is hidden from the fw_cfg client code; that code goes >> through an abstract library API. >> >> (b) While the DMA method indeed lets the firmware specify the details >> of the transfer with one action, the issue is with the number of >> bytes that the firmware requests (that is, not with *how* the >> firmware requests the transfer). The firmware has to know the size of >> the transfer before it can initiate the transfer (regardless of port >> IO vs. DMA). >> >> >> My question is: assume the firmware item in question is selected, and >> the QEMU-side select callback runs (regenerating the ACPI payload). >> Does this action update the blob size in the fw_cfg file directory as >> well? > > I think it doesn't update the blob size on select callback which is > the root cause of this issue. And the reason looks like, > qemu_ram_resize() function returns without invoking the callback to > update the blob size. > > On boot up, Qemu builds the table and exposes it to guest, > virt_acpi_setup() > acpi_add_rom_blob() > rom_add_blob() > rom_set_mr() --> mr is allocated here and ram_block used_length = HOST_PAGE_ALIGN(blob size); > fw_cfg_add_file_callback() > fw_cfg_add_bytes_callback() --> This uses the blob size passed into it. > > On select callback path, > > virt_acpi_build_update() > acpi_ram_update() > memory_region_ram_resize() > qemu_ram_resize() -->. Here the newsize gets aligned to HOST_PAGE and callback is only called used_length != newsize. > > https://github.com/qemu/qemu/blob/master/exec.c#L2180 > > Debug logs: > Initial boot: > ##QEMU_DEBUG## rom_add_blob: file etc/acpi/tables size 0x64f7 > ##QEMU_DEBUG## fw_cfg_add_bytes_callback: key 0x21 len 0x64f7 > ........ > ........ > ###UEFI#### InstallQemuFwCfgTables: "etc/table-loader" has FwCfgItem 0x27 size 0xD00 > ##QEMU_DEBUG## virt_acpi_build_update: > ##QEMU_DEBUG## acpi_ram_update: size 0x64f7 > ##QEMU_DEBUG## qemu_ram_resize: idstr /rom@etc/acpi/tables used_length 0x7000 newsize 0x7000 --> No callback. > ..... > ######UEFI###### ProcessCmdAllocate: QemuFwCfgFindFile("etc/acpi/tables"): size 0x64F7 --> UEFI get the actual size, which is fine for now. > > Hot-add nvdimms and reboot. > > root@ubuntu:/# reboot > ....... > ........ > ###UEFI#### InstallQemuFwCfgTables: "etc/table-loader" has FwCfgItem 0x27 size 0xD00 > ##QEMU_DEBUG## virt_acpi_build_update: > ##QEMU_DEBUG## acpi_ram_update: size 0x6667 --> Size changed. > ##QEMU_DEBUG## qemu_ram_resize: idstr /rom@etc/acpi/tables used_length 0x7000 newsize 0x7000 --> newsize is still aligned to 0x7000 and no callback to update. > ...... > ######UEFI###### ProcessCmdAllocate: QemuFwCfgFindFile("etc/acpi/tables"): size 0x64F7 -->UEFI still sees the old value and fails. > > This can be fixed by, > > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c index f3bd45675b..79da3fd35d 100644 > --- a/hw/arm/virt-acpi-build.c > +++ b/hw/arm/virt-acpi-build.c > @@ -854,6 +854,9 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables) > build_rsdp(tables->rsdp, tables->linker, &rsdp_data); > } > > + g_array_set_size(tables_blob, > + TARGET_PAGE_ALIGN(acpi_data_len(tables_blob))); > + > /* Cleanup memory that's no longer used. */ > g_array_free(table_offsets, true); > } > > But I am not sure this is the best to way fix this issue (Or I am > missing something here). > > Please let me know. The above QEMU patch, for virt_acpi_build(), may be necessary, but I don't think it is sufficient. For the firmware to see the updated (enlarged) blob, two things are required: (a) QEMU to update the blob size in the fw_cfg directory entry. Note: to the firmware, it is totally irrelevant if QEMU updates some *other* value or field that reflects the fresh blob size. The only thing the firmware can see is the entry in the FW_CFG_FILE_DIR blob. To illustrate the field I'm referring to, see: s->files->f[index].size = cpu_to_be32(len); in fw_cfg_add_file_callback(). See also: s->files->f[i].size = cpu_to_be32(len); in fw_cfg_modify_file(). That "size" field is what the firmware can see. Note: *all* relevant fw_cfg files must have their "size" fields updated in the directory blob (FW_CFG_FILE_DIR). I.e. the requirement applies to both the linker-loader script, and to all blobs that are referenced (by name) by the commands in the linker-loader script. (b) The firmware to re-read the size from the directory, after selecting the key for the sake of ACPI regeneration. I wrote: >> If it does, then I can work around the problem in the firmware. I can >> add a re-lookup to the code after the item selection, in order to get >> the fresh blob size from the fw_cfg file directory. Then we can use >> that size for the actual transfer. >> >> This won't help old firmware on new QEMU, but at least new firmware >> on old QEMU will not be hurt (the re-fetching of the fw_cfg file >> directory will come with a small performance penalty, but >> functionally it will be a no-op). Thus, the firmware patch in question would be: | diff --git a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c | index bc1a891dbaf1..07f70ffe158a 100644 | --- a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c | +++ b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c | @@ -975,6 +975,24 @@ InstallQemuFwCfgTables ( | ORDERED_COLLECTION *SeenPointers; | ORDERED_COLLECTION_ENTRY *SeenPointerEntry, *SeenPointerEntry2; | | + Status = QemuFwCfgFindFile ("etc/table-loader", &FwCfgItem, &FwCfgSize); | + if (EFI_ERROR (Status)) { | + return Status; | + } | + | + // | + // By selecting "FwCfgItem", ask QEMU to regenerate the ACPI payload, with | + // all PCI devices decoding their resources. Note: further selections | + // of the same key will not repeat the patching. | + // | + EnablePciDecoding (&OriginalPciAttributes, &OriginalPciAttributesCount); | + QemuFwCfgSelectItem (FwCfgItem); | + RestorePciDecoding (OriginalPciAttributes, OriginalPciAttributesCount); | + | + // | + // The size of the script may have changed, possibly due to platform devices | + // having been hot-plugged before platform reset. Re-read the size. | + // | Status = QemuFwCfgFindFile ("etc/table-loader", &FwCfgItem, &FwCfgSize); | if (EFI_ERROR (Status)) { | return Status; | @@ -989,10 +1007,8 @@ InstallQemuFwCfgTables ( | if (LoaderStart == NULL) { | return EFI_OUT_OF_RESOURCES; | } | - EnablePciDecoding (&OriginalPciAttributes, &OriginalPciAttributesCount); | QemuFwCfgSelectItem (FwCfgItem); | QemuFwCfgReadBytes (FwCfgSize, LoaderStart); | - RestorePciDecoding (OriginalPciAttributes, OriginalPciAttributesCount); | LoaderEnd = LoaderStart + FwCfgSize / sizeof *LoaderEntry; | | AllocationsRestrictedTo32Bit = NULL; But, again, this only makes sense if QEMU implements (a). Thanks Laszlo
> -----Original Message----- > From: Laszlo Ersek [mailto:lersek@redhat.com] > Sent: 24 September 2019 16:53 > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>; > Igor Mammedov <imammedo@redhat.com> > Cc: Auger Eric <eric.auger@redhat.com>; shannon.zhaosl@gmail.com; > peter.maydell@linaro.org; qemu-devel@nongnu.org; qemu-arm@nongnu.org; > xuwei (O) <xuwei5@huawei.com>; Linuxarm <linuxarm@huawei.com>; Ard > Biesheuvel <ard.biesheuvel@linaro.org>; Leif Lindholm (Linaro address) > <leif.lindholm@linaro.org> > Subject: Re: Invalid blob size on NVDIMM hot-add (was: RE: [RFC PATCH 0/4] > ARM virt: ACPI memory hotplug support) > > On 09/20/19 19:04, Shameerali Kolothum Thodi wrote: > > Hi Laszlo/Igor, > > > > I spend some time to debug this further as I was rebasing the nvdimm > > hot-add support patches on top of the ongoing pc-dimm hot add ones. > > > > Just to refresh the memory: > > > > https://patchwork.kernel.org/cover/10783589/ > > > > " It is observed that hot adding nvdimm will results in guest reboot > > failure. EDK2 fails to build the ACPI tables on reboot. Please find > > below EDK2 log on Guest reboot after nvdimm hot-add, > > > > ProcessCmdAddChecksum: invalid checksum range in "etc/acpi/tables" > > OnRootBridgesConnected: InstallAcpiTables: Protocol Error > > " > > > > Please find below, > > > >> -----Original Message----- > >> From: Laszlo Ersek [mailto:lersek@redhat.com] > >> Sent: 05 March 2019 12:15 > >> To: Igor Mammedov <imammedo@redhat.com> > >> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>; > >> Auger Eric <eric.auger@redhat.com>; shannon.zhaosl@gmail.com; > >> peter.maydell@linaro.org; qemu-devel@nongnu.org; > qemu-arm@nongnu.org; > >> xuwei (O) <xuwei5@huawei.com>; Linuxarm <linuxarm@huawei.com>; Ard > >> Biesheuvel <ard.biesheuvel@linaro.org>; Leif Lindholm (Linaro > >> address) <leif.lindholm@linaro.org> > >> Subject: Re: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support > >> > >> On 03/01/19 18:39, Igor Mammedov wrote: > >>> On Fri, 1 Mar 2019 14:49:45 +0100 > >>> Laszlo Ersek <lersek@redhat.com> wrote: > >>> > >>>> On 02/28/19 15:02, Shameerali Kolothum Thodi wrote: > >>>> > >>>>> Ah..I missed the fact that, firmware indeed sees an update in the > >>>>> blob len here (rounded or not) after reboot. So don't think x86 > >>>>> has the same issue and padding is not the right solution as Igor > >>>>> explained in his reply. > >>>>> > >>>>> I will try to debug this further. Any pointers welcome. > >>>> > >>>> How about this. > >>>> > >>>> (1) The firmware looks up the fw_cfg file called "etc/table-loader" > >>>> in the fw_cfg file directory (identified by constant selector key > >>>> 0x0019, FW_CFG_FILE_DIR). > >>>> > >>>> (2) The directory entry, once found, tells the firmware two things > >>>> simultaneously. The selector key, and the size of the blob. > >>>> > >>>> (3) The firmware selects the selector key from step (2). > >>>> > >>>> (4) QEMU regenerates the ACPI payload (as a select callback). > >>>> > >>>> (5) The firmware reads the number of bytes from the fw_cfg blob > >>>> that it learned in step (2). > >>>> > >>>> Here's the problem. As long as QEMU used to perform step (4) only > >>>> for the purpose of refreshing PCI resources in the ACPI payload, > >>>> step (4) wouldn't *resize* the blob. > >>>> > >>>> However, if step (4) enlarges the blob, then the byte count that > >>>> step (5) uses -- from step (2) -- for reading, is obsolete. > >> > >>> I've thought that was a problem with IO based fw_cfg, as reading > >>> size/content were separates steps and that it was solved by DMA > >>> based fw_cfg file read. > >> > >> The DMA backend is not relevant for this question, for two reasons: > >> > >> (a) The question whether the fw_cfg transfer takes places with port > >> IO vs. DMA is hidden from the fw_cfg client code; that code goes > >> through an abstract library API. > >> > >> (b) While the DMA method indeed lets the firmware specify the details > >> of the transfer with one action, the issue is with the number of > >> bytes that the firmware requests (that is, not with *how* the > >> firmware requests the transfer). The firmware has to know the size of > >> the transfer before it can initiate the transfer (regardless of port > >> IO vs. DMA). > >> > >> > >> My question is: assume the firmware item in question is selected, and > >> the QEMU-side select callback runs (regenerating the ACPI payload). > >> Does this action update the blob size in the fw_cfg file directory as > >> well? > > > > I think it doesn't update the blob size on select callback which is > > the root cause of this issue. And the reason looks like, > > qemu_ram_resize() function returns without invoking the callback to > > update the blob size. > > > > On boot up, Qemu builds the table and exposes it to guest, > > virt_acpi_setup() > > acpi_add_rom_blob() > > rom_add_blob() > > rom_set_mr() --> mr is allocated here and ram_block > used_length = HOST_PAGE_ALIGN(blob size); > > fw_cfg_add_file_callback() > > fw_cfg_add_bytes_callback() --> This uses the blob size > passed into it. > > > > On select callback path, > > > > virt_acpi_build_update() > > acpi_ram_update() > > memory_region_ram_resize() > > qemu_ram_resize() -->. Here the newsize gets aligned to HOST_PAGE > and callback is only called used_length != newsize. > > > > https://github.com/qemu/qemu/blob/master/exec.c#L2180 > > > > Debug logs: > > Initial boot: > > ##QEMU_DEBUG## rom_add_blob: file etc/acpi/tables size 0x64f7 > > ##QEMU_DEBUG## fw_cfg_add_bytes_callback: key 0x21 len 0x64f7 > > ........ > > ........ > > ###UEFI#### InstallQemuFwCfgTables: "etc/table-loader" has FwCfgItem > 0x27 size 0xD00 > > ##QEMU_DEBUG## virt_acpi_build_update: > > ##QEMU_DEBUG## acpi_ram_update: size 0x64f7 > > ##QEMU_DEBUG## qemu_ram_resize: idstr /rom@etc/acpi/tables > used_length 0x7000 newsize 0x7000 --> No callback. > > ..... > > ######UEFI###### ProcessCmdAllocate: > QemuFwCfgFindFile("etc/acpi/tables"): size 0x64F7 --> UEFI get the actual size, > which is fine for now. > > > > Hot-add nvdimms and reboot. > > > > root@ubuntu:/# reboot > > ....... > > ........ > > ###UEFI#### InstallQemuFwCfgTables: "etc/table-loader" has FwCfgItem > 0x27 size 0xD00 > > ##QEMU_DEBUG## virt_acpi_build_update: > > ##QEMU_DEBUG## acpi_ram_update: size 0x6667 --> Size changed. > > ##QEMU_DEBUG## qemu_ram_resize: idstr /rom@etc/acpi/tables > used_length 0x7000 newsize 0x7000 --> newsize is still aligned to 0x7000 and > no callback to update. > > ...... > > ######UEFI###### ProcessCmdAllocate: > QemuFwCfgFindFile("etc/acpi/tables"): size 0x64F7 -->UEFI still sees the old > value and fails. > > > > This can be fixed by, > > > > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c index > f3bd45675b..79da3fd35d 100644 > > --- a/hw/arm/virt-acpi-build.c > > +++ b/hw/arm/virt-acpi-build.c > > @@ -854,6 +854,9 @@ void virt_acpi_build(VirtMachineState *vms, > AcpiBuildTables *tables) > > build_rsdp(tables->rsdp, tables->linker, &rsdp_data); > > } > > > > + g_array_set_size(tables_blob, > > + > TARGET_PAGE_ALIGN(acpi_data_len(tables_blob))); > > + > > /* Cleanup memory that's no longer used. */ > > g_array_free(table_offsets, true); > > } > > > > But I am not sure this is the best to way fix this issue (Or I am > > missing something here). > > > > Please let me know. > > The above QEMU patch, for virt_acpi_build(), may be necessary, but I > don't think it is sufficient. > > For the firmware to see the updated (enlarged) blob, two things are > required: > > (a) QEMU to update the blob size in the fw_cfg directory entry. > > Note: to the firmware, it is totally irrelevant if QEMU updates some > *other* value or field that reflects the fresh blob size. The only thing > the firmware can see is the entry in the FW_CFG_FILE_DIR blob. > > To illustrate the field I'm referring to, see: > > s->files->f[index].size = cpu_to_be32(len); > > in fw_cfg_add_file_callback(). > > See also: > > s->files->f[i].size = cpu_to_be32(len); > > in fw_cfg_modify_file(). > > That "size" field is what the firmware can see. Agree. And what I am trying to illustrate above is why this update is not happening. The Qemu path on select key from firmware is(I think), fw_cfg_select() e->select_cb() -->Callback registered in virt_acpi_setup() --> acpi_add_rom_blob() ..... -> fw_cfg_add_file_callback() virt_acpi_build_update() acpi_ram_update() memory_region_ram_resize() qemu_ram_resize() block->resized() --> Callback registered in virt_acpi_setup() -> acpi_add_rom_blob()-> rom_add_blob() -> rom_set_mr() fw_cfg_resized() fw_cfg_modify_file() s->files->f[i].size = cpu_to_be32(len); Updates the size for firmware But as I was trying to explain above, the callback function fw_cfg_resized() is not invoked in qemu_ram_resize()because while adding the mr region(rom_set_mr()), it uses the aligned size. And as a result if the updated/modified blob size is not greater than the aligned size the qemu_ram_resize() returns immediately. int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp) { assert(block); newsize = HOST_PAGE_ALIGN(newsize); if (block->used_length == newsize) { return 0; } ... if (block->resized) { block->resized(block->idstr, newsize, block->host); ---> registered callback, fw_cfg_resized() } return 0; } > Note: *all* relevant fw_cfg files must have their "size" fields updated > in the directory blob (FW_CFG_FILE_DIR). I.e. the requirement applies to > both the linker-loader script, and to all blobs that are referenced (by > name) by the commands in the linker-loader script. > > > (b) The firmware to re-read the size from the directory, after selecting > the key for the sake of ACPI regeneration. Hmm...I am not sure this is required. In my testing with the above fix I mentioned, I can see that firmware is reading the correct(modified) size on select. I will double check this though. Thanks, Shameer > I wrote: > > >> If it does, then I can work around the problem in the firmware. I can > >> add a re-lookup to the code after the item selection, in order to get > >> the fresh blob size from the fw_cfg file directory. Then we can use > >> that size for the actual transfer. > >> > >> This won't help old firmware on new QEMU, but at least new firmware > >> on old QEMU will not be hurt (the re-fetching of the fw_cfg file > >> directory will come with a small performance penalty, but > >> functionally it will be a no-op). > > Thus, the firmware patch in question would be: > > | diff --git a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > | index bc1a891dbaf1..07f70ffe158a 100644 > | --- a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > | +++ b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > | @@ -975,6 +975,24 @@ InstallQemuFwCfgTables ( > | ORDERED_COLLECTION *SeenPointers; > | ORDERED_COLLECTION_ENTRY *SeenPointerEntry, *SeenPointerEntry2; > | > | + Status = QemuFwCfgFindFile ("etc/table-loader", &FwCfgItem, > &FwCfgSize); > | + if (EFI_ERROR (Status)) { > | + return Status; > | + } > | + > | + // > | + // By selecting "FwCfgItem", ask QEMU to regenerate the ACPI payload, > with > | + // all PCI devices decoding their resources. Note: further selections > | + // of the same key will not repeat the patching. > | + // > | + EnablePciDecoding (&OriginalPciAttributes, &OriginalPciAttributesCount); > | + QemuFwCfgSelectItem (FwCfgItem); > | + RestorePciDecoding (OriginalPciAttributes, OriginalPciAttributesCount); > | + > | + // > | + // The size of the script may have changed, possibly due to platform > devices > | + // having been hot-plugged before platform reset. Re-read the size. > | + // > | Status = QemuFwCfgFindFile ("etc/table-loader", &FwCfgItem, > &FwCfgSize); > | if (EFI_ERROR (Status)) { > | return Status; > | @@ -989,10 +1007,8 @@ InstallQemuFwCfgTables ( > | if (LoaderStart == NULL) { > | return EFI_OUT_OF_RESOURCES; > | } > | - EnablePciDecoding (&OriginalPciAttributes, &OriginalPciAttributesCount); > | QemuFwCfgSelectItem (FwCfgItem); > | QemuFwCfgReadBytes (FwCfgSize, LoaderStart); > | - RestorePciDecoding (OriginalPciAttributes, OriginalPciAttributesCount); > | LoaderEnd = LoaderStart + FwCfgSize / sizeof *LoaderEntry; > | > | AllocationsRestrictedTo32Bit = NULL; > > But, again, this only makes sense if QEMU implements (a). > > Thanks > Laszlo
> -----Original Message----- > From: Shameerali Kolothum Thodi > Sent: 24 September 2019 17:39 > To: 'Laszlo Ersek' <lersek@redhat.com>; Igor Mammedov > <imammedo@redhat.com> > Cc: Auger Eric <eric.auger@redhat.com>; shannon.zhaosl@gmail.com; > peter.maydell@linaro.org; qemu-devel@nongnu.org; qemu-arm@nongnu.org; > xuwei (O) <xuwei5@huawei.com>; Linuxarm <linuxarm@huawei.com>; Ard > Biesheuvel <ard.biesheuvel@linaro.org>; Leif Lindholm (Linaro address) > <leif.lindholm@linaro.org> > Subject: RE: Invalid blob size on NVDIMM hot-add (was: RE: [RFC PATCH 0/4] > ARM virt: ACPI memory hotplug support) [...] > > > >>>> How about this. > > >>>> > > >>>> (1) The firmware looks up the fw_cfg file called "etc/table-loader" > > >>>> in the fw_cfg file directory (identified by constant selector key > > >>>> 0x0019, FW_CFG_FILE_DIR). > > >>>> > > >>>> (2) The directory entry, once found, tells the firmware two things > > >>>> simultaneously. The selector key, and the size of the blob. > > >>>> > > >>>> (3) The firmware selects the selector key from step (2). > > >>>> > > >>>> (4) QEMU regenerates the ACPI payload (as a select callback). > > >>>> > > >>>> (5) The firmware reads the number of bytes from the fw_cfg blob > > >>>> that it learned in step (2). > > >>>> > > >>>> Here's the problem. As long as QEMU used to perform step (4) only > > >>>> for the purpose of refreshing PCI resources in the ACPI payload, > > >>>> step (4) wouldn't *resize* the blob. > > >>>> > > >>>> However, if step (4) enlarges the blob, then the byte count that > > >>>> step (5) uses -- from step (2) -- for reading, is obsolete. > > >> > > >>> I've thought that was a problem with IO based fw_cfg, as reading > > >>> size/content were separates steps and that it was solved by DMA > > >>> based fw_cfg file read. > > >> > > >> The DMA backend is not relevant for this question, for two reasons: > > >> > > >> (a) The question whether the fw_cfg transfer takes places with port > > >> IO vs. DMA is hidden from the fw_cfg client code; that code goes > > >> through an abstract library API. > > >> > > >> (b) While the DMA method indeed lets the firmware specify the details > > >> of the transfer with one action, the issue is with the number of > > >> bytes that the firmware requests (that is, not with *how* the > > >> firmware requests the transfer). The firmware has to know the size of > > >> the transfer before it can initiate the transfer (regardless of port > > >> IO vs. DMA). > > >> > > >> > > >> My question is: assume the firmware item in question is selected, and > > >> the QEMU-side select callback runs (regenerating the ACPI payload). > > >> Does this action update the blob size in the fw_cfg file directory as > > >> well? > > > > > > I think it doesn't update the blob size on select callback which is > > > the root cause of this issue. And the reason looks like, > > > qemu_ram_resize() function returns without invoking the callback to > > > update the blob size. > > > > > > On boot up, Qemu builds the table and exposes it to guest, > > > virt_acpi_setup() > > > acpi_add_rom_blob() > > > rom_add_blob() > > > rom_set_mr() --> mr is allocated here and ram_block > > used_length = HOST_PAGE_ALIGN(blob size); > > > fw_cfg_add_file_callback() > > > fw_cfg_add_bytes_callback() --> This uses the blob size > > passed into it. > > > > > > On select callback path, > > > > > > virt_acpi_build_update() > > > acpi_ram_update() > > > memory_region_ram_resize() > > > qemu_ram_resize() -->. Here the newsize gets aligned to > HOST_PAGE > > and callback is only called used_length != newsize. > > > > > > https://github.com/qemu/qemu/blob/master/exec.c#L2180 > > > > > > Debug logs: > > > Initial boot: > > > ##QEMU_DEBUG## rom_add_blob: file etc/acpi/tables size 0x64f7 > > > ##QEMU_DEBUG## fw_cfg_add_bytes_callback: key 0x21 len 0x64f7 > > > ........ > > > ........ > > > ###UEFI#### InstallQemuFwCfgTables: "etc/table-loader" has FwCfgItem > > 0x27 size 0xD00 > > > ##QEMU_DEBUG## virt_acpi_build_update: > > > ##QEMU_DEBUG## acpi_ram_update: size 0x64f7 > > > ##QEMU_DEBUG## qemu_ram_resize: idstr /rom@etc/acpi/tables > > used_length 0x7000 newsize 0x7000 --> No callback. > > > ..... > > > ######UEFI###### ProcessCmdAllocate: > > QemuFwCfgFindFile("etc/acpi/tables"): size 0x64F7 --> UEFI get the actual > size, > > which is fine for now. > > > > > > Hot-add nvdimms and reboot. > > > > > > root@ubuntu:/# reboot > > > ....... > > > ........ > > > ###UEFI#### InstallQemuFwCfgTables: "etc/table-loader" has FwCfgItem > > 0x27 size 0xD00 > > > ##QEMU_DEBUG## virt_acpi_build_update: > > > ##QEMU_DEBUG## acpi_ram_update: size 0x6667 --> Size changed. > > > ##QEMU_DEBUG## qemu_ram_resize: idstr /rom@etc/acpi/tables > > used_length 0x7000 newsize 0x7000 --> newsize is still aligned to 0x7000 > and > > no callback to update. > > > ...... > > > ######UEFI###### ProcessCmdAllocate: > > QemuFwCfgFindFile("etc/acpi/tables"): size 0x64F7 -->UEFI still sees the old > > value and fails. > > > > > > This can be fixed by, > > > > > > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c index > > f3bd45675b..79da3fd35d 100644 > > > --- a/hw/arm/virt-acpi-build.c > > > +++ b/hw/arm/virt-acpi-build.c > > > @@ -854,6 +854,9 @@ void virt_acpi_build(VirtMachineState *vms, > > AcpiBuildTables *tables) > > > build_rsdp(tables->rsdp, tables->linker, &rsdp_data); > > > } > > > > > > + g_array_set_size(tables_blob, > > > + > > TARGET_PAGE_ALIGN(acpi_data_len(tables_blob))); > > > + > > > /* Cleanup memory that's no longer used. */ > > > g_array_free(table_offsets, true); > > > } > > > > > > But I am not sure this is the best to way fix this issue (Or I am > > > missing something here). > > > > > > Please let me know. > > > > The above QEMU patch, for virt_acpi_build(), may be necessary, but I > > don't think it is sufficient. > > > > For the firmware to see the updated (enlarged) blob, two things are > > required: > > > > (a) QEMU to update the blob size in the fw_cfg directory entry. > > > > Note: to the firmware, it is totally irrelevant if QEMU updates some > > *other* value or field that reflects the fresh blob size. The only thing > > the firmware can see is the entry in the FW_CFG_FILE_DIR blob. > > > > To illustrate the field I'm referring to, see: > > > > s->files->f[index].size = cpu_to_be32(len); > > > > in fw_cfg_add_file_callback(). > > > > See also: > > > > s->files->f[i].size = cpu_to_be32(len); > > > > in fw_cfg_modify_file(). > > > > That "size" field is what the firmware can see. > > Agree. And what I am trying to illustrate above is why this update is not > happening. > > The Qemu path on select key from firmware is(I think), > > fw_cfg_select() > e->select_cb() -->Callback registered in virt_acpi_setup() --> > acpi_add_rom_blob() ..... -> fw_cfg_add_file_callback() > virt_acpi_build_update() > acpi_ram_update() > memory_region_ram_resize() > qemu_ram_resize() > block->resized() --> Callback registered in virt_acpi_setup() -> > acpi_add_rom_blob()-> rom_add_blob() -> rom_set_mr() > fw_cfg_resized() > fw_cfg_modify_file() > s->files->f[i].size = cpu_to_be32(len); Updates the size for > firmware > > But as I was trying to explain above, the callback function fw_cfg_resized() is > not > invoked in qemu_ram_resize()because while adding the mr > region(rom_set_mr()), > it uses the aligned size. And as a result if the updated/modified blob size is not > greater > than the aligned size the qemu_ram_resize() returns immediately. > > int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp) > { > assert(block); > > newsize = HOST_PAGE_ALIGN(newsize); > > if (block->used_length == newsize) { > return 0; > } > ... > > if (block->resized) { > block->resized(block->idstr, newsize, block->host); ---> registered > callback, fw_cfg_resized() > } > return 0; > } Hi Igor, Could you please take a look at the above Qemu side problem I am trying to explain here. Please let me if it is not clear or more details are required. > > Note: *all* relevant fw_cfg files must have their "size" fields updated > > in the directory blob (FW_CFG_FILE_DIR). I.e. the requirement applies to > > both the linker-loader script, and to all blobs that are referenced (by > > name) by the commands in the linker-loader script. > > > > > > (b) The firmware to re-read the size from the directory, after selecting > > the key for the sake of ACPI regeneration. > > Hmm...I am not sure this is required. In my testing with the above fix I > mentioned, > I can see that firmware is reading the correct(modified) size on select. > I will double check this though. Hi Laszlo, Right. I think the reason it works for me without your patch is when firmware selects the first item("etc/table-loader"), the Qemu side runs the callback and try to update all the acpi ram sizes including " etc/acpi/tables" which is the one that matters in this specific case. This is the log I have with prints added to both firmware and Qemu, OnRootBridgesConnected: root bridges have been connected, installing ACPI tables #UEFI# InstallQemuFwCfgTables: Find file "etc/table-loader" #UEFI# QemuFwCfgFindFile: Select QemuFwCfgItemFileDir for etc/table-loader #UEFI# QemuFwCfgSelectItem: Select Item 0x19 #QEMU# fw_cfg_select: key 0x19 No callback #UEFI# QemuFwCfgFindFile: FileSize 0xD00 FileSelect 0x27 FName etc/table-loader #UEFI# QemuFwCfgSelectItem: Select Item 0x27 #QEMU# fw_cfg_select: key 0x27 Invoking callback... #QEMU# virt_acpi_build_update: Updating ACPI tables... #QEMU# acpi_ram_update: mr name /rom@etc/acpi/tables size 0x64f7 #QEMU# qemu_ram_resize: newsize 0x7000(used_length == newsize). Returning #QEMU# acpi_ram_update: mr name /rom@etc/acpi/rsdp size 0x24 #QEMU# qemu_ram_resize: newsize 0x1000(used_length == newsize). Returning #QEMU# acpi_ram_update: mr name /rom@etc/table-loader size 0xd00 #QEMU# qemu_ram_resize: newsize 0x1000(used_length == newsize). Returning #UEFI# QemuFwCfgFindFile: Select QemuFwCfgItemFileDir for etc/acpi/rsdp #UEFI# QemuFwCfgSelectItem: Select Item 0x19 #QEMU# fw_cfg_select: key 0x19 No callback #UEFI# QemuFwCfgFindFile: FileSize 0x24 FileSelect 0x22 FName etc/acpi/rsdp #UEFI# QemuFwCfgSelectItem: Select Item 0x22 #QEMU# fw_cfg_select: key 0x22 Invoking callback... #QEMU# virt_acpi_build_update: No update required ..... #UEFI# QemuFwCfgFindFile: Select QemuFwCfgItemFileDir for etc/acpi/tables #UEFI# QemuFwCfgSelectItem: Select Item 0x19 #QEMU# fw_cfg_select: key 0x19 No callback #UEFI# QemuFwCfgFindFile: FileSize 0x64F7 FileSelect 0x23 FName etc/acpi/tables #UEFI# QemuFwCfgSelectItem: Select Item 0x23 #QEMU# fw_cfg_select: key 0x23 Invoking callback... #QEMU# virt_acpi_build_update: No update required InstallQemuFwCfgTables: installed 9 tables So with my fix with the tables_blob size align, qemu_ram_resize() calls the fw_cfg_modify_file() and updates the " etc/acpi/tables" size when firmware selects the " etc/table-loader" item. But I think, your below patch is still required in case " etc/table-loader" is changed by Qemu. Thanks, Shameer > Thanks, > Shameer > > > I wrote: > > > > >> If it does, then I can work around the problem in the firmware. I can > > >> add a re-lookup to the code after the item selection, in order to get > > >> the fresh blob size from the fw_cfg file directory. Then we can use > > >> that size for the actual transfer. > > >> > > >> This won't help old firmware on new QEMU, but at least new firmware > > >> on old QEMU will not be hurt (the re-fetching of the fw_cfg file > > >> directory will come with a small performance penalty, but > > >> functionally it will be a no-op). > > > > Thus, the firmware patch in question would be: > > > > | diff --git a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > > b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > > | index bc1a891dbaf1..07f70ffe158a 100644 > > | --- a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > > | +++ b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > > | @@ -975,6 +975,24 @@ InstallQemuFwCfgTables ( > > | ORDERED_COLLECTION *SeenPointers; > > | ORDERED_COLLECTION_ENTRY *SeenPointerEntry, > *SeenPointerEntry2; > > | > > | + Status = QemuFwCfgFindFile ("etc/table-loader", &FwCfgItem, > > &FwCfgSize); > > | + if (EFI_ERROR (Status)) { > > | + return Status; > > | + } > > | + > > | + // > > | + // By selecting "FwCfgItem", ask QEMU to regenerate the ACPI payload, > > with > > | + // all PCI devices decoding their resources. Note: further selections > > | + // of the same key will not repeat the patching. > > | + // > > | + EnablePciDecoding (&OriginalPciAttributes, > &OriginalPciAttributesCount); > > | + QemuFwCfgSelectItem (FwCfgItem); > > | + RestorePciDecoding (OriginalPciAttributes, OriginalPciAttributesCount); > > | + > > | + // > > | + // The size of the script may have changed, possibly due to platform > > devices > > | + // having been hot-plugged before platform reset. Re-read the size. > > | + // > > | Status = QemuFwCfgFindFile ("etc/table-loader", &FwCfgItem, > > &FwCfgSize); > > | if (EFI_ERROR (Status)) { > > | return Status; > > | @@ -989,10 +1007,8 @@ InstallQemuFwCfgTables ( > > | if (LoaderStart == NULL) { > > | return EFI_OUT_OF_RESOURCES; > > | } > > | - EnablePciDecoding (&OriginalPciAttributes, > &OriginalPciAttributesCount); > > | QemuFwCfgSelectItem (FwCfgItem); > > | QemuFwCfgReadBytes (FwCfgSize, LoaderStart); > > | - RestorePciDecoding (OriginalPciAttributes, OriginalPciAttributesCount); > > | LoaderEnd = LoaderStart + FwCfgSize / sizeof *LoaderEntry; > > | > > | AllocationsRestrictedTo32Bit = NULL; > > > > But, again, this only makes sense if QEMU implements (a). > > > > Thanks > > Laszlo
On 09/26/19 13:37, Shameerali Kolothum Thodi wrote: >>> Note: *all* relevant fw_cfg files must have their "size" fields updated >>> in the directory blob (FW_CFG_FILE_DIR). I.e. the requirement applies to >>> both the linker-loader script, and to all blobs that are referenced (by >>> name) by the commands in the linker-loader script. >>> >>> >>> (b) The firmware to re-read the size from the directory, after selecting >>> the key for the sake of ACPI regeneration. >> >> Hmm...I am not sure this is required. In my testing with the above fix I >> mentioned, >> I can see that firmware is reading the correct(modified) size on select. >> I will double check this though. > > Hi Laszlo, > > Right. I think the reason it works for me without your patch is when firmware > selects the first item("etc/table-loader"), the Qemu side runs the callback > and try to update all the acpi ram sizes including " etc/acpi/tables" which is the > one that matters in this specific case. This is a good explanation. Currently, the *only* action that occurs in the firmware before selecting "etc/table-loader", is fetching the selector key, and size, of "etc/table-loader". Therefore, if the ACPI re-generation does not affect the size of "etc/table-loader", just the size of "etc/acpi/tables" -- and QEMU updates the size in the fw_cfg directory for "etc/acpi/tables" --, then there is indeed no need for updating the firmware. I was just not sure if QEMU could guarantee this invariant (i.e. that the size of "etc/table-loader" would not change). After all, if you modify the ACPI tables in the "etc/acpi/tables" blob, possibly even add new tables to it, then you will likely have to append new *commands*, for those additional ACPI objects / tables, to the linker-loader script in "etc/table-loader". > So with my fix with the tables_blob size align, qemu_ram_resize() calls the > fw_cfg_modify_file() and updates the " etc/acpi/tables" size when firmware > selects the " etc/table-loader" item. Right, that's good. > But I think, your below patch is still required in case " etc/table-loader" is > changed by Qemu. Exactly. It really depends on whether the platform devices hot-plugged before platform reset require new commands in the linker/loader script. Thanks Laszlo
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c index f3bd45675b..79da3fd35d 100644 --- a/hw/arm/virt-acpi-build.c +++ b/hw/arm/virt-acpi-build.c @@ -854,6 +854,9 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables) build_rsdp(tables->rsdp, tables->linker, &rsdp_data); } + g_array_set_size(tables_blob, + TARGET_PAGE_ALIGN(acpi_data_len(tables_blob))); + /* Cleanup memory that's no longer used. */ g_array_free(table_offsets, true); }