Invalid blob size on NVDIMM hot-add (was: RE: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support)

Hi Laszlo/Igor,

I spend some time to debug this further as I was rebasing the nvdimm
hot-add support patches on top of the ongoing pc-dimm hot add ones.

Just to refresh the memory:

https://patchwork.kernel.org/cover/10783589/

" It is observed that hot adding nvdimm will results in guest reboot
failure. EDK2 fails to build the ACPI tables on reboot. Please find
below EDK2 log on Guest reboot after nvdimm hot-add,

ProcessCmdAddChecksum: invalid checksum range in "etc/acpi/tables"
OnRootBridgesConnected: InstallAcpiTables: Protocol Error
"

Please find below,

> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: 05 March 2019 12:15
> To: Igor Mammedov <imammedo@redhat.com>
> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> Auger Eric <eric.auger@redhat.com>; shannon.zhaosl@gmail.com;
> peter.maydell@linaro.org; qemu-devel@nongnu.org; qemu-arm@nongnu.org;
> xuwei (O) <xuwei5@huawei.com>; Linuxarm <linuxarm@huawei.com>; Ard
> Biesheuvel <ard.biesheuvel@linaro.org>; Leif Lindholm (Linaro address)
> <leif.lindholm@linaro.org>
> Subject: Re: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support
> 
> On 03/01/19 18:39, Igor Mammedov wrote:
> > On Fri, 1 Mar 2019 14:49:45 +0100
> > Laszlo Ersek <lersek@redhat.com> wrote:
> >
> >> On 02/28/19 15:02, Shameerali Kolothum Thodi wrote:
> >>
> >>> Ah..I missed the fact that, firmware indeed sees an update in the blob len
> here
> >>> (rounded or not) after reboot. So don’t think x86 has the same issue and
> padding
> >>> is not the right solution as Igor explained in his reply.
> >>>
> >>> I will try to debug this further. Any pointers welcome.
> >>
> >> How about this.
> >>
> >> (1) The firmware looks up the fw_cfg file called "etc/table-loader" in
> >> the fw_cfg file directory (identified by constant selector key 0x0019,
> >> FW_CFG_FILE_DIR).
> >>
> >> (2) The directory entry, once found, tells the firmware two things
> >> simultaneously. The selector key, and the size of the blob.
> >>
> >> (3) The firmware selects the selector key from step (2).
> >>
> >> (4) QEMU regenerates the ACPI payload (as a select callback).
> >>
> >> (5) The firmware reads the number of bytes from the fw_cfg blob that it
> >> learned in step (2).
> >>
> >> Here's the problem. As long as QEMU used to perform step (4) only for
> >> the purpose of refreshing PCI resources in the ACPI payload, step (4)
> >> wouldn't *resize* the blob.
> >>
> >> However, if step (4) enlarges the blob, then the byte count that step
> >> (5) uses -- from step (2) -- for reading, is obsolete.
> 
> > I've thought that was a problem with IO based fw_cfg, as reading
> size/content
> > were separates steps and that it was solved by DMA based fw_cfg file read.
> 
> The DMA backend is not relevant for this question, for two reasons:
> 
> (a) The question whether the fw_cfg transfer takes places with port IO
> vs. DMA is hidden from the fw_cfg client code; that code goes through an
> abstract library API.
> 
> (b) While the DMA method indeed lets the firmware specify the details of
> the transfer with one action, the issue is with the number of bytes that
> the firmware requests (that is, not with *how* the firmware requests the
> transfer). The firmware has to know the size of the transfer before it
> can initiate the transfer (regardless of port IO vs. DMA).
> 
> 
> My question is: assume the firmware item in question is selected, and
> the QEMU-side select callback runs (regenerating the ACPI payload). Does
> this action update the blob size in the fw_cfg file directory as well?

I think it doesn’t update the blob size on select callback which is the root
cause of this issue. And the reason looks like, qemu_ram_resize() function
returns without invoking the callback to update the blob size.

On boot up, Qemu builds the table and exposes it to guest,
      virt_acpi_setup()
        acpi_add_rom_blob()
          rom_add_blob()
            rom_set_mr()  --> mr is allocated here and ram_block used_length = HOST_PAGE_ALIGN(blob size);
            fw_cfg_add_file_callback()
              fw_cfg_add_bytes_callback() --> This uses the blob size passed into it.

On select callback path,

virt_acpi_build_update()
   acpi_ram_update()
    memory_region_ram_resize()
      qemu_ram_resize() -->. Here the newsize gets aligned to HOST_PAGE and callback is only called used_length != newsize.

https://github.com/qemu/qemu/blob/master/exec.c#L2180

Debug logs:
Initial boot:
##QEMU_DEBUG## rom_add_blob: file etc/acpi/tables size 0x64f7
##QEMU_DEBUG## fw_cfg_add_bytes_callback: key 0x21 len 0x64f7
........
........
###UEFI#### InstallQemuFwCfgTables: "etc/table-loader" has FwCfgItem 0x27 size 0xD00
##QEMU_DEBUG## virt_acpi_build_update:
##QEMU_DEBUG## acpi_ram_update: size 0x64f7
##QEMU_DEBUG## qemu_ram_resize: idstr /rom@etc/acpi/tables used_length  0x7000 newsize 0x7000 --> No callback.
.....
######UEFI###### ProcessCmdAllocate: QemuFwCfgFindFile("etc/acpi/tables"): size 0x64F7 --> UEFI get the actual size, which is fine for now.

Hot-add nvdimms and reboot.

root@ubuntu:/# reboot
.......
........
###UEFI#### InstallQemuFwCfgTables: "etc/table-loader" has FwCfgItem 0x27 size 0xD00
##QEMU_DEBUG## virt_acpi_build_update:
##QEMU_DEBUG## acpi_ram_update: size 0x6667 --> Size changed.
##QEMU_DEBUG## qemu_ram_resize: idstr /rom@etc/acpi/tables used_length  0x7000 newsize 0x7000 --> newsize is still aligned to 0x7000 and no callback to update.
......
######UEFI###### ProcessCmdAllocate: QemuFwCfgFindFile("etc/acpi/tables"): size 0x64F7 -->UEFI still sees the old value and fails.

This can be fixed by,

But I am not sure this is the best to way fix this issue (Or I am missing something here).

Please let me know.

Thanks,
Shameer

> If it does, then I can work around the problem in the firmware. I can
> add a re-lookup to the code after the item selection, in order to get
> the fresh blob size from the fw_cfg file directory. Then we can use that
> size for the actual transfer.
> 
> This won't help old firmware on new QEMU, but at least new firmware on
> old QEMU will not be hurt (the re-fetching of the fw_cfg file directory
> will come with a small performance penalty, but functionally it will be
> a no-op).
> 
> Thanks
> Laszlo

Message ID	5FC3163CFD30C246ABAA99954A238FA83F3FB328@lhreml524-mbs.china.huawei.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=SQ5Z=XP=nongnu.org=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@kernel.org> Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9027476 for <patchwork-qemu-devel@patchwork.kernel.org>; Fri, 20 Sep 2019 17:05:25 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6FE79207FC for <patchwork-qemu-devel@patchwork.kernel.org>; Fri, 20 Sep 2019 17:05:25 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6FE79207FC Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=huawei.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:33882 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>) id 1iBMLU-00038I-Cy for patchwork-qemu-devel@patchwork.kernel.org; Fri, 20 Sep 2019 13:05:24 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:33375) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from <shameerali.kolothum.thodi@huawei.com>) id 1iBMKl-0002Kw-64 for qemu-devel@nongnu.org; Fri, 20 Sep 2019 13:04:41 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from <shameerali.kolothum.thodi@huawei.com>) id 1iBMKj-0001xH-DR for qemu-devel@nongnu.org; Fri, 20 Sep 2019 13:04:39 -0400 Received: from lhrrgout.huawei.com ([185.176.76.210]:45441 helo=huawei.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from <shameerali.kolothum.thodi@huawei.com>) id 1iBMKf-0001mh-I5; Fri, 20 Sep 2019 13:04:33 -0400 Received: from lhreml707-cah.china.huawei.com (unknown [172.18.7.106]) by Forcepoint Email with ESMTP id 6707092E712AEA7118DC; Fri, 20 Sep 2019 18:04:23 +0100 (IST) Received: from LHREML524-MBS.china.huawei.com ([169.254.2.65]) by lhreml707-cah.china.huawei.com ([10.201.108.48]) with mapi id 14.03.0415.000; Fri, 20 Sep 2019 18:04:15 +0100 From: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> To: Laszlo Ersek <lersek@redhat.com>, Igor Mammedov <imammedo@redhat.com> Subject: Invalid blob size on NVDIMM hot-add (was: RE: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support) Thread-Topic: Invalid blob size on NVDIMM hot-add (was: RE: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support) Thread-Index: AdVv0yEPLdplA0GwRKqg4qaBRPehIQ== Date: Fri, 20 Sep 2019 17:04:14 +0000 Message-ID: <5FC3163CFD30C246ABAA99954A238FA83F3FB328@lhreml524-mbs.china.huawei.com> Accept-Language: en-GB, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.202.227.237] Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 MIME-Version: 1.0 X-CFilter-Loop: Reflected X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 185.176.76.210 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: <qemu-devel.nongnu.org> List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>, <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe> List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel> List-Post: <mailto:qemu-devel@nongnu.org> List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help> List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>, <mailto:qemu-devel-request@nongnu.org?subject=subscribe> Cc: "peter.maydell@linaro.org" <peter.maydell@linaro.org>, "shannon.zhaosl@gmail.com" <shannon.zhaosl@gmail.com>, Ard Biesheuvel <ard.biesheuvel@linaro.org>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, "Leif Lindholm \(Linaro address\)" <leif.lindholm@linaro.org>, Linuxarm <linuxarm@huawei.com>, Auger Eric <eric.auger@redhat.com>, "qemu-arm@nongnu.org" <qemu-arm@nongnu.org>, "xuwei \(O\)" <xuwei5@huawei.com> Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>
Series	Invalid blob size on NVDIMM hot-add (was: RE: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support) \| expand Invalid blob size on NVDIMM hot-add (was: RE: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support)

Invalid blob size on NVDIMM hot-add (was: RE: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support)

Commit Message

Comments

Patch