diff mbox series

[RFC,v4,18/36] i386/tdx: Skip BIOS shadowing setup

Message ID 20220512031803.3315890-19-xiaoyao.li@intel.com (mailing list archive)
State New, archived
Headers show
Series TDX QEMU support | expand

Commit Message

Xiaoyao Li May 12, 2022, 3:17 a.m. UTC
TDX guest cannot go to real mode, so just skip the setup of isa-bios.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 hw/i386/x86.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

Comments

Gerd Hoffmann May 24, 2022, 7:08 a.m. UTC | #1
On Thu, May 12, 2022 at 11:17:45AM +0800, Xiaoyao Li wrote:
> TDX guest cannot go to real mode, so just skip the setup of isa-bios.

Does isa-bios setup cause any actual problems?
(same question for patch #19).

"is not needed" IMHO isn't a good enough reason to special-case tdx
here.

take care,
  Gerd
Xiaoyao Li May 26, 2022, 2:48 a.m. UTC | #2
On 5/24/2022 3:08 PM, Gerd Hoffmann wrote:
> On Thu, May 12, 2022 at 11:17:45AM +0800, Xiaoyao Li wrote:
>> TDX guest cannot go to real mode, so just skip the setup of isa-bios.
> 
> Does isa-bios setup cause any actual problems?
> (same question for patch #19).

It causes mem_region split and mem_slot deletion on KVM.

TDVF marks pages starting from 0x800000 as TEMP_MEM and TD_HOB, which 
are TD's private memory and are TDH_MEM_PAGE_ADD'ed to TD via 
KVM_TDX_INIT_MEM_REGION

However, if isa-bios and pc.rom are not skipped, the memory_region 
initialization of them is after KVM_TDX_INIT_MEM_REGION in 
tdx_machine_done_notify(). (I didn't figure out why this order though)

And the it causes memory region split that splits
	[0, ram_below_4g)
to
	[0, 0xc0 000),
	[0xc0 000, 0xe0 000),
	[0xe0 000, 0x100 000),
	[0x100 000, ram_below_4g)

which causes mem_slot deletion on KVM. On KVM side, we lose the page 
content when mem_slot deletion. Thus, the we lose the content of TD HOB.

Yes, the better solution seems to be ensure KVM_TDX_INIT_MEM_REGION is 
called after all the mem region is settled down. But I haven't figured 
out the reason why the isa-bios and pc.rom initialization happens after
machine_init_done_notifier

on the other hand, to keep isa-bios and pc.rom, we need additional work 
to copy the content from the end_of_4G to end_of_1M.

I'm not sure if isa-bios and pc.rom are needed from people on TD guest, 
so I just skip them for simplicity,

> "is not needed" IMHO isn't a good enough reason to special-case tdx
> here.
> 
> take care,
>    Gerd
>
Gerd Hoffmann May 30, 2022, 11:49 a.m. UTC | #3
On Thu, May 26, 2022 at 10:48:56AM +0800, Xiaoyao Li wrote:
> On 5/24/2022 3:08 PM, Gerd Hoffmann wrote:
> > On Thu, May 12, 2022 at 11:17:45AM +0800, Xiaoyao Li wrote:
> > > TDX guest cannot go to real mode, so just skip the setup of isa-bios.
> > 
> > Does isa-bios setup cause any actual problems?
> > (same question for patch #19).
> 
> It causes mem_region split and mem_slot deletion on KVM.
> 
> TDVF marks pages starting from 0x800000 as TEMP_MEM and TD_HOB, which are
> TD's private memory and are TDH_MEM_PAGE_ADD'ed to TD via
> KVM_TDX_INIT_MEM_REGION
> 
> However, if isa-bios and pc.rom are not skipped, the memory_region
> initialization of them is after KVM_TDX_INIT_MEM_REGION in
> tdx_machine_done_notify(). (I didn't figure out why this order though)
> 
> And the it causes memory region split that splits
> 	[0, ram_below_4g)
> to
> 	[0, 0xc0 000),
> 	[0xc0 000, 0xe0 000),
> 	[0xe0 000, 0x100 000),
> 	[0x100 000, ram_below_4g)
> 
> which causes mem_slot deletion on KVM. On KVM side, we lose the page content
> when mem_slot deletion.  Thus, the we lose the content of TD HOB.

Hmm, removing and re-creating memory slots shouldn't cause page content
go away.   I'm wondering what the *real* problem is?  Maybe you loose
tdx-specific state, i.e. this removes TDH_MEM_PAGE_ADD changes?

> Yes, the better solution seems to be ensure KVM_TDX_INIT_MEM_REGION is
> called after all the mem region is settled down.

Yes, especially if tdx can't tolerate memory slots coming and going.

> But I haven't figured out the reason why the isa-bios and pc.rom
> initialization happens after machine_init_done_notifier

Probably happens when a flatview is created from the address space.

Maybe that is delayed somehow for machine creation, so all the address
space updates caused by device creation don't lead to lots of flatviews
being created and thrown away.

> on the other hand, to keep isa-bios and pc.rom, we need additional work to
> copy the content from the end_of_4G to end_of_1M.

There is no need for copying, end_of_1M is a alias memory region for
end_of_4G, so the backing storage is the same.

> I'm not sure if isa-bios and pc.rom are needed from people on TD guest, so I
> just skip them for simplicity,

Given that TDX guests start in 32bit mode not in real mode everything
should work fine without isa-bios.

I'd prefer to avoid creating a special case for tdx though.  Should make
long-term maintenance a bit easier when this is not needed.

take care,
  Gerd
Xiaoyao Li July 29, 2022, 7:14 a.m. UTC | #4
On 5/30/2022 7:49 PM, Gerd Hoffmann wrote:
> On Thu, May 26, 2022 at 10:48:56AM +0800, Xiaoyao Li wrote:
>> On 5/24/2022 3:08 PM, Gerd Hoffmann wrote:
>>> On Thu, May 12, 2022 at 11:17:45AM +0800, Xiaoyao Li wrote:
>>>> TDX guest cannot go to real mode, so just skip the setup of isa-bios.
>>>
>>> Does isa-bios setup cause any actual problems?
>>> (same question for patch #19).
>>
>> It causes mem_region split and mem_slot deletion on KVM.
>>
>> TDVF marks pages starting from 0x800000 as TEMP_MEM and TD_HOB, which are
>> TD's private memory and are TDH_MEM_PAGE_ADD'ed to TD via
>> KVM_TDX_INIT_MEM_REGION
>>
>> However, if isa-bios and pc.rom are not skipped, the memory_region
>> initialization of them is after KVM_TDX_INIT_MEM_REGION in
>> tdx_machine_done_notify(). (I didn't figure out why this order though)
>>
>> And the it causes memory region split that splits
>> 	[0, ram_below_4g)
>> to
>> 	[0, 0xc0 000),
>> 	[0xc0 000, 0xe0 000),
>> 	[0xe0 000, 0x100 000),
>> 	[0x100 000, ram_below_4g)
>>
>> which causes mem_slot deletion on KVM. On KVM side, we lose the page content
>> when mem_slot deletion.  Thus, the we lose the content of TD HOB.
> 
> Hmm, removing and re-creating memory slots shouldn't cause page content
> go away.   I'm wondering what the *real* problem is?  Maybe you loose
> tdx-specific state, i.e. this removes TDH_MEM_PAGE_ADD changes?
> 
>> Yes, the better solution seems to be ensure KVM_TDX_INIT_MEM_REGION is
>> called after all the mem region is settled down.
> 
> Yes, especially if tdx can't tolerate memory slots coming and going.

Actually, only the private memory that is assumed as already-accepted 
via SEAMALL(TDH.MEM.PAGE.ADD) in the point of view of TDVF cannot 
tolerate being removed. TDVF assumes those memory has initialized 
content and can be accessed directly. In other words, QEMU needs to 
always calls SEAMALL(TDH.MEM.PAGE.ADD) to "add" those memory before TDVF 
runs.

>> But I haven't figured out the reason why the isa-bios and pc.rom
>> initialization happens after machine_init_done_notifier
> 
> Probably happens when a flatview is created from the address space.
> 
> Maybe that is delayed somehow for machine creation, so all the address
> space updates caused by device creation don't lead to lots of flatviews
> being created and thrown away.

sorry for the late response.

I did some tracing for this, and the result differs for q35 machine type 
and pc machine type.

- For q35, the memslot update for isa-bios/pc.rom happens when 
mc->reset() that is triggered via

   qdev_machine_creation_done()
     -> qemu_system_reset(SHUTDOWN_CASE_NONE);

It's surely later than TDX's machine_init_done_notify callback which 
initializes the part of private memory via KVM_TDX_INIT_MEM_REGION

- For pc machine type, the memslot update happens in i440fx_init(), 
which is earlier than TDX's machine_init_done_notify callback

I haven't fully understand in what condition will QEMU carry out the 
memslot update yet. I will keep learning and try to come up a solution 
to ensure TDX's machine_init_done_notify callback executed after all the 
memslot settle down.

>> on the other hand, to keep isa-bios and pc.rom, we need additional work to
>> copy the content from the end_of_4G to end_of_1M.
> 
> There is no need for copying, end_of_1M is a alias memory region for
> end_of_4G, so the backing storage is the same.

It is a reason that current alias approach cannot work for TDX. Because 
in TDX a private page can be only mapped to one gpa. So for simplicity, 
I will just skip isa-bios shadowing for TDX instead of implementing a 
non-alias + memcpy approach.

For pc.rom in next patch, I don't have strong reason to skip it. But I 
will keep it in next version to make whole TDX patches work for q35 
machine type until I think up a good solution to ensure the memslot 
update happens before TDX's machine_init_done_notify callback.

>> I'm not sure if isa-bios and pc.rom are needed from people on TD guest, so I
>> just skip them for simplicity,
> 
> Given that TDX guests start in 32bit mode not in real mode everything
> should work fine without isa-bios.
> 
> I'd prefer to avoid creating a special case for tdx though.  Should make
> long-term maintenance a bit easier when this is not needed.
> 
> take care,
>    Gerd
>
Gerd Hoffmann Aug. 16, 2022, 7:13 a.m. UTC | #5
On Fri, Jul 29, 2022 at 03:14:02PM +0800, Xiaoyao Li wrote:
> On 5/30/2022 7:49 PM, Gerd Hoffmann wrote:
> > On Thu, May 26, 2022 at 10:48:56AM +0800, Xiaoyao Li wrote:
> > > On 5/24/2022 3:08 PM, Gerd Hoffmann wrote:
> > > > On Thu, May 12, 2022 at 11:17:45AM +0800, Xiaoyao Li wrote:
> > > > > TDX guest cannot go to real mode, so just skip the setup of isa-bios.
> > > > 
> > > > Does isa-bios setup cause any actual problems?
> > > > (same question for patch #19).

> > There is no need for copying, end_of_1M is a alias memory region for
> > end_of_4G, so the backing storage is the same.
> 
> It is a reason that current alias approach cannot work for TDX. Because in
> TDX a private page can be only mapped to one gpa.

Ok, so memory aliasing not being supported by TDX is the underlying
reason.

> So for simplicity, I will
> just skip isa-bios shadowing for TDX instead of implementing a non-alias +
> memcpy approach.

Makes sense given that tdx wouldn't use the mapping below 1M anyway.
A comment explaining the tdx aliasing restriction would be good to make
clear why the special case for tdx exists.

take care,
  Gerd
Gerd Hoffmann Aug. 16, 2022, 7:16 a.m. UTC | #6
Hi,

> I did some tracing for this, and the result differs for q35 machine type and
> pc machine type.
> 
> - For q35, the memslot update for isa-bios/pc.rom happens when mc->reset()
> that is triggered via
> 
>   qdev_machine_creation_done()
>     -> qemu_system_reset(SHUTDOWN_CASE_NONE);
> 
> It's surely later than TDX's machine_init_done_notify callback which
> initializes the part of private memory via KVM_TDX_INIT_MEM_REGION
> 
> - For pc machine type, the memslot update happens in i440fx_init(), which is
> earlier than TDX's machine_init_done_notify callback
> 
> I haven't fully understand in what condition will QEMU carry out the memslot
> update yet. I will keep learning and try to come up a solution to ensure
> TDX's machine_init_done_notify callback executed after all the memslot
> settle down.

My guess would be the rom shadowing initialization being slightly
different in 'pc' and 'q35'.

take care,
  Gerd
diff mbox series

Patch

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index fdf6af2f6add..17f2252296c5 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1138,17 +1138,19 @@  void x86_bios_rom_init(MachineState *ms, const char *default_firmware,
     }
     g_free(filename);
 
-    /* map the last 128KB of the BIOS in ISA space */
-    isa_bios_size = MIN(bios_size, 128 * KiB);
-    isa_bios = g_malloc(sizeof(*isa_bios));
-    memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
-                             bios_size - isa_bios_size, isa_bios_size);
-    memory_region_add_subregion_overlap(rom_memory,
-                                        0x100000 - isa_bios_size,
-                                        isa_bios,
-                                        1);
-    if (!isapc_ram_fw) {
-        memory_region_set_readonly(isa_bios, true);
+    if (!is_tdx_vm()) {
+        /* map the last 128KB of the BIOS in ISA space */
+        isa_bios_size = MIN(bios_size, 128 * KiB);
+        isa_bios = g_malloc(sizeof(*isa_bios));
+        memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
+                                bios_size - isa_bios_size, isa_bios_size);
+        memory_region_add_subregion_overlap(rom_memory,
+                                            0x100000 - isa_bios_size,
+                                            isa_bios,
+                                            1);
+        if (!isapc_ram_fw) {
+            memory_region_set_readonly(isa_bios, true);
+        }
     }
 
     /* map all the bios at the top of memory */