[RFC,0/6] KVM: X86: Add and use shadow page with level promoted or acting as pae_root

Message ID 20211210092508.7185-1-jiangshanlai@gmail.com (mailing list archive)

Message

Lai Jiangshan Dec. 10, 2021, 9:25 a.m. UTC
From: Lai Jiangshan <laijs@linux.alibaba.com>

(Request For Help for testing on AMD machine with 32 bit L1 hypervisor,
see information below)

KVM handles root pages specially for these cases:

direct mmu (non-paging for 32 bit guest):
	gCR0_PG=0
shadow mmu (shadow paging for 32 bit guest):
	gCR0_PG=1,gEFER_LMA=0,gCR4_PSE=0
	gCR0_PG=1,gEFER_LMA=0,gCR4_PSE=1
direct mmu (NPT for 32bit host):
	hEFER_LMA=0
shadow nested NPT (for 32bit L1 hypervisor):
	gCR0_PG=1,gEFER_LMA=0,gCR4_PSE=0,hEFER_LMA=0
	gCR0_PG=1,gEFER_LMA=0,gCR4_PSE=1,hEFER_LMA=0
	gCR0_PG=1,gEFER_LMA=0,gCR4_PSE={0|1},hEFER_LMA=1,hCR4_LA57={0|1}
shadow nested NPT (for 64bit L1 hypervisor):
	gEFER_LMA=1,gCR4_LA57=0,hEFER_LMA=1,hCR4_LA57=1

These cases either use special roots, or match the condition
((mmu->shadow_root_level > mmu->root_level) && !mmu->direct_map)
(referred to below as level promotion), or both.
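
As a hedged sketch only (not code from this series; the helper name is
hypothetical, the fields are the struct kvm_mmu members named above),
the level promotion condition reads:

static bool mmu_is_level_promoted(struct kvm_mmu *mmu)
{
	/* Shadow paging where the shadow tree is deeper than the guest's tree. */
	return !mmu->direct_map && mmu->shadow_root_level > mmu->root_level;
}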

All of the cases use special roots except the last one.
Many of the cases, including the last one, involve level promotion.

When special roots are used, the root page is not backed by a
kvm_mmu_page, so it must be treated specially.  Not every place takes
this into account, and Sean has been adding code to check for these
special roots.
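
The kind of guard this forces on callers can be sketched as follows
(a hedged illustration, not code from this series; root_has_sp() is a
hypothetical name, to_shadow_page() is the existing helper in
mmu_internal.h):

static bool root_has_sp(struct kvm_mmu *mmu)
{
	/* Special roots such as pae_root are bare pages without a kvm_mmu_page. */
	return to_shadow_page(mmu->root_hpa) != NULL;
}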

When level promotion is in effect, KVM always handles it silently.

Both treatments cause problems or complications; see the changelog
of each patch.

These patches were written while I was reviewing all usage of
shadow_root_level and root_level.  Some of the resulting patches have
already been sent and accepted.  Patches 3-6 are rather complicated, so
they were held back.  Patch 1 and patch 2 were sent earlier: patch 1 was
rejected, but I still think it is good; patch 2 was said to be accepted,
but it has not shown up in kvm/queue.  Patches 3-6 conflict with patches
1-2, so patches 1-2 are included here as well.

Another reason patches 3-6 were held back is that they have not been
tested with the shadow-NPT cases listed above: I don't have a guest
image that can act as a 32 bit L1 hypervisor, nor do I have access to an
AMD machine with 5 level paging.  I'm a bit reluctant to ask for these
resources, so I'm sending the patches in the hope that someone can test
and adapt them.  At the very least, the series provides some food for
thought and reveals problems in the existing code and in the AMD cases.
( *Request For Help* here.)

These patches have been tested with all of the cases except the
shadow-NPT ones, and code coverage is believed to be more than 95%:
hundreds of lines of shadow-NPT-specific code are removed and replaced
with common role.pae_root and role.level_promoted code, only 8 lines of
code are added specifically for shadow-NPT, and only 2 lines of code are
not covered by my tests.
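
The idea behind the two role bits can be sketched like this (the layout
is illustrative only, not the exact definition from patches 4 and 5):

union kvm_mmu_page_role {
	u32 word;
	struct {
		unsigned level:4;
		/* ... existing role bits ... */
		unsigned level_promoted:1;	/* shadow level above the guest's root level */
		unsigned pae_root:1;		/* page acts as the 4-entry PAE root */
	};
};

With such bits in the role, a promoted-level root or a PAE root is just
another shadow page found through the usual role-based hash lookup,
instead of a bare page that every path has to special-case.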

Sean also found the problem with the last case listed above and asked
questions in a reply[1] to one of my emails; I hope this patchset can
serve as my answer to his questions about that complicated case.

If the special roots are removed and the PAE page is write-protected,
some further cleanups become possible.

[1]: https://lore.kernel.org/lkml/YbFY533IT3XSIqAK@google.com/

Lai Jiangshan (6):
  KVM: X86: Check root_level only in fast_pgd_switch()
  KVM: X86: Walk shadow page starting with shadow_root_level
  KVM: X86: Add arguement gfn and role to kvm_mmu_alloc_page()
  KVM: X86: Introduce role.level_promoted
  KVM: X86: Alloc pae_root shadow page
  KVM: X86: Use level_promoted and pae_root shadow page for 32bit guests

 arch/x86/include/asm/kvm_host.h |   9 +-
 arch/x86/kvm/mmu/mmu.c          | 440 ++++++++++----------------------
 arch/x86/kvm/mmu/mmu_audit.c    |  26 +-
 arch/x86/kvm/mmu/paging_tmpl.h  |  15 +-
 arch/x86/kvm/mmu/tdp_mmu.h      |   7 +-
 5 files changed, 164 insertions(+), 333 deletions(-)

Comments

Maxim Levitsky Dec. 10, 2021, 10:27 a.m. UTC | #1
On Fri, 2021-12-10 at 17:25 +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
> 
> (Request For Help for testing on AMD machine with 32 bit L1 hypervisor,
> see information below)
> 
> [... full cover letter trimmed ...]


I have a 32 bit VM which can run another 32 bit VM (both it and the nested VM are running the mainline kernel).
I'll test this patch series soon.

I also have seabios hacked to use PAE instead of no paging, which I usually use for my 32 bit guests,
so I can make it switch to SMM+PAE paging mode to test it.

Best regards,
	Maxim Levitsky