Message ID | 20240921085712.28902-1-graf@amazon.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] target-i386: Walk NPT in guest real mode | expand |
On 9/21/24 01:57, Alexander Graf wrote: > When translating virtual to physical address with a guest CPU that > supports nested paging (NPT), we need to perform every page table walk > access indirectly through the NPT, which we correctly do. > > However, we treat real mode (no page table walk) special: In that case, > we currently just skip any walks and translate VA -> PA. With NPT > enabled, we also need to then perform NPT walk to do GVA -> GPA -> HPA > which we fail to do so far. > > The net result of that is that TCG VMs with NPT enabled that execute > real mode code (like SeaBIOS) end up with GPA==HPA mappings which means > the guest accesses host code and data. This typically shows as failure > to boot guests. > > This patch changes the page walk logic for NPT enabled guests so that we > always perform a GVA -> GPA translation and then skip any logic that > requires an actual PTE. > > That way, all remaining logic to walk the NPT stays and we successfully > walk the NPT in real mode. > > Fixes: fe441054bb3f0 ("target-i386: Add NPT support") > > Signed-off-by: Alexander Graf <graf@amazon.com> > Reported-by: Eduard Vlad <evlad@amazon.de> > Reviewed-by: Richard Henderson <richard.henderson@linaro.org> > > --- > > v1 -> v2: > > - Remove hack where we fake a PTE and instead just set the > corresponding resolved variables and jump straight to the > stage2 code. > > v2 -> v3: > > - Fix comment Thanks, queued. r~
On 21/09/2024 09:57, Alexander Graf wrote: > When translating virtual to physical address with a guest CPU that > supports nested paging (NPT), we need to perform every page table walk > access indirectly through the NPT, which we correctly do. > > However, we treat real mode (no page table walk) special: In that case, > we currently just skip any walks and translate VA -> PA. With NPT > enabled, we also need to then perform NPT walk to do GVA -> GPA -> HPA > which we fail to do so far. > > The net result of that is that TCG VMs with NPT enabled that execute > real mode code (like SeaBIOS) end up with GPA==HPA mappings which means > the guest accesses host code and data. This typically shows as failure > to boot guests. > > This patch changes the page walk logic for NPT enabled guests so that we > always perform a GVA -> GPA translation and then skip any logic that > requires an actual PTE. > > That way, all remaining logic to walk the NPT stays and we successfully > walk the NPT in real mode. > > Fixes: fe441054bb3f0 ("target-i386: Add NPT support") > > Signed-off-by: Alexander Graf <graf@amazon.com> > Reported-by: Eduard Vlad <evlad@amazon.de> > Reviewed-by: Richard Henderson <richard.henderson@linaro.org> > > --- > > v1 -> v2: > > - Remove hack where we fake a PTE and instead just set the > corresponding resolved variables and jump straight to the > stage2 code. > > v2 -> v3: > > - Fix comment > --- > target/i386/tcg/sysemu/excp_helper.c | 14 ++++++++++++-- > 1 file changed, 12 insertions(+), 2 deletions(-) > > diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c > index 8fb05b1f53..24dd6935f9 100644 > --- a/target/i386/tcg/sysemu/excp_helper.c > +++ b/target/i386/tcg/sysemu/excp_helper.c > @@ -298,7 +298,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, > /* combine pde and pte nx, user and rw protections */ > ptep &= pte ^ PG_NX_MASK; > page_size = 4096; > - } else { > + } else if (pg_mode) { > /* > * Page table level 2 > */ > @@ -343,6 +343,15 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, > ptep &= pte | PG_NX_MASK; > page_size = 4096; > rsvd_mask = 0; > + } else { > + /* > + * No paging (real mode), let's tentatively resolve the address as 1:1 > + * here, but conditionally still perform an NPT walk on it later. > + */ > + page_size = 0x40000000; > + paddr = in->addr; > + prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC; > + goto stage2; > } > > do_check_protect: > @@ -420,6 +429,7 @@ do_check_protect_pse36: > > /* merge offset within page */ > paddr = (pte & PG_ADDRESS_MASK & ~(page_size - 1)) | (addr & (page_size - 1)); > +stage2: > > /* > * Note that NPT is walked (for both paging structures and final guest > @@ -562,7 +572,7 @@ static bool get_physical_address(CPUX86State *env, vaddr addr, > addr = (uint32_t)addr; > } > > - if (likely(env->cr[0] & CR0_PG_MASK)) { > + if (likely(env->cr[0] & CR0_PG_MASK || use_stage2)) { > in.cr3 = env->cr[3]; > in.mmu_idx = mmu_idx; > in.ptw_idx = use_stage2 ? MMU_NESTED_IDX : MMU_PHYS_IDX; Hi Alex, This commit appears to break my WinXP boot test: with this patch applied, attempting to boot WinXP from CDROM fails with SeaBIOS getting stuck early in a boot loop. It is possible to reproduce the issue easily with: ./build/qemu-system-x86_64 -cdrom winxp.iso -boot d ATB, Mark.
Hey Mark, On 05.11.24 23:54, Mark Cave-Ayland wrote: > > Hi Alex, > > This commit appears to break my WinXP boot test: with this patch > applied, attempting > to boot WinXP from CDROM fails with SeaBIOS getting stuck early in a > boot loop. It is > possible to reproduce the issue easily with: > > ./build/qemu-system-x86_64 -cdrom winxp.iso -boot d Thanks a lot for the report! It also seems to break NetBSD (https://gitlab.com/qemu-project/qemu/-/issues/2654). I'll have a look tomorrow to root cause and will send a patch to fix. Alex Amazon Web Services Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B Sitz: Berlin Ust-ID: DE 365 538 597
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c index 8fb05b1f53..24dd6935f9 100644 --- a/target/i386/tcg/sysemu/excp_helper.c +++ b/target/i386/tcg/sysemu/excp_helper.c @@ -298,7 +298,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, /* combine pde and pte nx, user and rw protections */ ptep &= pte ^ PG_NX_MASK; page_size = 4096; - } else { + } else if (pg_mode) { /* * Page table level 2 */ @@ -343,6 +343,15 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, ptep &= pte | PG_NX_MASK; page_size = 4096; rsvd_mask = 0; + } else { + /* + * No paging (real mode), let's tentatively resolve the address as 1:1 + * here, but conditionally still perform an NPT walk on it later. + */ + page_size = 0x40000000; + paddr = in->addr; + prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC; + goto stage2; } do_check_protect: @@ -420,6 +429,7 @@ do_check_protect_pse36: /* merge offset within page */ paddr = (pte & PG_ADDRESS_MASK & ~(page_size - 1)) | (addr & (page_size - 1)); +stage2: /* * Note that NPT is walked (for both paging structures and final guest @@ -562,7 +572,7 @@ static bool get_physical_address(CPUX86State *env, vaddr addr, addr = (uint32_t)addr; } - if (likely(env->cr[0] & CR0_PG_MASK)) { + if (likely(env->cr[0] & CR0_PG_MASK || use_stage2)) { in.cr3 = env->cr[3]; in.mmu_idx = mmu_idx; in.ptw_idx = use_stage2 ? MMU_NESTED_IDX : MMU_PHYS_IDX;