[v2,00/13] KVM: selftests: Morph max_guest_mem to mmu_stress

Message ID 20240911204158.2034295-1-seanjc@google.com (mailing list archive)

Message

Sean Christopherson Sept. 11, 2024, 8:41 p.m. UTC
Marc/Oliver,

I would love a sanity check on patches 2 and 3 before I file a bug against
gcc.  The code is pretty darn simple, so I don't think I've misdiagnosed the
problem, but I've also been second guessing myself _because_ it's so simple;
it seems super unlikely that no one else would have run into this before.

On to the patches...

The main purpose of this series is to convert the max_guest_memory_test into
a more generic mmu_stress_test.  The patches were originally posted as part
of a KVM x86/mmu series to test the x86/mmu changes, hence the v2.

The basic gist of the "conversion" is to have the test do mprotect() on
guest memory while vCPUs are accessing said memory, e.g. to verify KVM and
mmu_notifiers are working as intended.
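
As a rough userspace-only illustration of the mprotect() transitions involved
(this is not code from the series; the helper name is hypothetical and no vCPUs
or KVM are involved), a single page can be flipped to read-only and back like so:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from mmu_stress_test.c: map one anonymous
 * page, write to it, make it read-only with mprotect(), confirm the data is
 * still readable, then restore write access and write again.
 * Returns 0 on success, -1 on any failure.
 */
int mprotect_round_trip(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	volatile uint64_t *mem;
	void *map;
	int ret = -1;

	if (page_size <= 0)
		return -1;

	map = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;

	mem = map;
	mem[0] = 0x1234;

	/* Reads must keep working once the mapping is read-only. */
	if (mprotect(map, (size_t)page_size, PROT_READ))
		goto out;
	if (mem[0] != 0x1234)
		goto out;

	/* Writes are only safe again after PROT_WRITE is restored. */
	if (mprotect(map, (size_t)page_size, PROT_READ | PROT_WRITE))
		goto out;
	mem[0] = 0x5678;
	ret = (mem[0] == 0x5678) ? 0 : -1;
out:
	munmap(map, (size_t)page_size);
	return ret;
}
```

The test does the equivalent on memory that backs a running guest, which is
what makes the mmu_notifier interaction interesting.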

Patches 1-4 are a somewhat unexpected side quest that I can (arguably should)
post separately if that would make things easier.  The original plan was that
patch 2 would be a single patch, but things snowballed.

Patch 2 reworks vcpu_get_reg() to return a value instead of using an
out-param.  This is the entire motivation for including these patches;
having to define a variable just to bump the program counter on arm64
annoyed me.
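
To make the ergonomic difference concrete, here is a minimal sketch using
stand-in types and helpers (everything prefixed fake_ is hypothetical and is
not the selftests API); it only contrasts the out-param and return-value
calling styles when skipping a 4-byte arm64 instruction:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a vCPU; purely illustrative, not a selftests type. */
struct fake_vcpu {
	uint64_t pc;
};

/* Out-param style: the caller must declare a variable to receive the value. */
void fake_get_reg_outparam(struct fake_vcpu *vcpu, uint64_t *val)
{
	*val = vcpu->pc;
}

/* Return-value style: the result can be used directly in an expression. */
uint64_t fake_get_reg(struct fake_vcpu *vcpu)
{
	return vcpu->pc;
}

void fake_set_reg(struct fake_vcpu *vcpu, uint64_t val)
{
	vcpu->pc = val;
}

/* Bump the PC past one instruction using the out-param getter. */
void skip_insn_outparam(struct fake_vcpu *vcpu)
{
	uint64_t pc;

	fake_get_reg_outparam(vcpu, &pc);
	fake_set_reg(vcpu, pc + 4);
}

/* Same operation with the value-returning getter; no local needed. */
void skip_insn(struct fake_vcpu *vcpu)
{
	fake_set_reg(vcpu, fake_get_reg(vcpu) + 4);
}
```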

Patch 4 adds hardening to vcpu_{g,s}et_reg() to detect potential truncation,
as KVM's uAPI allows for registers greater than the 64 bits that are supported
in the "outer" selftests APIs (vcpu_set_reg() takes a u64, vcpu_get_reg()
now returns a u64).
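
A sketch of the kind of check this implies, assuming the register-size
encoding from KVM's uAPI (size in bytes is 1 << the 4-bit field at bit 52 of
the register ID).  The EX_-prefixed macros are local copies for illustration
and reg_fits_in_u64() is a hypothetical helper, not the assertion in patch 4:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Local copies of the KVM uAPI register-size encoding, renamed with an EX_
 * prefix so they cannot clash with the real definitions in <linux/kvm.h>.
 */
#define EX_KVM_REG_SIZE_SHIFT	52
#define EX_KVM_REG_SIZE_MASK	0x00f0000000000000ULL
#define EX_KVM_REG_SIZE_U64	0x0030000000000000ULL
#define EX_KVM_REG_SIZE_U128	0x0040000000000000ULL

/* Size of the register in bytes, decoded from the register ID. */
#define EX_KVM_REG_SIZE(id) \
	(1ULL << (((id) & EX_KVM_REG_SIZE_MASK) >> EX_KVM_REG_SIZE_SHIFT))

/* Hypothetical helper: true if the register's value fits in a u64. */
bool reg_fits_in_u64(uint64_t id)
{
	return EX_KVM_REG_SIZE(id) <= sizeof(uint64_t);
}
```

With a check like this, a 128-bit register ID is rejected up front instead of
being silently truncated to its low 64 bits.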

Patch 1 is a change to KVM's uAPI headers to move the KVM_REG_SIZE
definition to common code so that the selftests side of things doesn't
need #ifdefs to implement the hardening in patch 4.

Patch 3 is the truly unexpected part.  With the vcpu_get_reg() rework,
arm64's vpmu_counter_test fails when compiled with gcc-13, and on gcc-11
with an added "noinline".  AFAICT, the failure doesn't actually have
anything to do with vcpu_get_reg(); I suspect the largely unrelated change
just happened to run afoul of a latent gcc bug.

Pending a sanity check, I will file a gcc bug.  In the meantime, I am
hoping to fudge around the issue in KVM selftests so that the vcpu_get_reg()
cleanup isn't blocked, and because the hack-a-fix is arguably a cleanup
on its own.

v2:
 - Rebase onto kvm/next.
 - Add the aforementioned vcpu_get_reg() changes/disaster.
 - Actually add arm64 support for the fancy mprotect() testcase (I did this
   before v1, but managed to forget to include the changes when posting).
 - Emit "mov %rax, (%rax)" on x86. [James]
 - Add a comment to explain the fancy mprotect() vs. vCPUs logic.
 - Drop the KVM x86 patches (applied and/or will be handled separately).

v1: https://lore.kernel.org/all/20240809194335.1726916-1-seanjc@google.com

Sean Christopherson (13):
  KVM: Move KVM_REG_SIZE() definition to common uAPI header
  KVM: selftests: Return a value from vcpu_get_reg() instead of using an
    out-param
  KVM: selftests: Fudge around an apparent gcc bug in arm64's PMU test
  KVM: selftests: Assert that vcpu_{g,s}et_reg() won't truncate
  KVM: selftests: Check for a potential unhandled exception iff KVM_RUN
    succeeded
  KVM: selftests: Rename max_guest_memory_test to mmu_stress_test
  KVM: selftests: Only muck with SREGS on x86 in mmu_stress_test
  KVM: selftests: Compute number of extra pages needed in
    mmu_stress_test
  KVM: selftests: Enable mmu_stress_test on arm64
  KVM: selftests: Use vcpu_arch_put_guest() in mmu_stress_test
  KVM: selftests: Precisely limit the number of guest loops in
    mmu_stress_test
  KVM: selftests: Add a read-only mprotect() phase to mmu_stress_test
  KVM: selftests: Verify KVM correctly handles mprotect(PROT_READ)

 arch/arm64/include/uapi/asm/kvm.h             |   3 -
 arch/riscv/include/uapi/asm/kvm.h             |   3 -
 include/uapi/linux/kvm.h                      |   4 +
 tools/testing/selftests/kvm/Makefile          |   3 +-
 .../selftests/kvm/aarch64/aarch32_id_regs.c   |  10 +-
 .../selftests/kvm/aarch64/debug-exceptions.c  |   4 +-
 .../selftests/kvm/aarch64/hypercalls.c        |   6 +-
 .../testing/selftests/kvm/aarch64/psci_test.c |   6 +-
 .../selftests/kvm/aarch64/set_id_regs.c       |  18 +-
 .../kvm/aarch64/vpmu_counter_access.c         |  27 ++-
 .../testing/selftests/kvm/include/kvm_util.h  |  10 +-
 .../selftests/kvm/lib/aarch64/processor.c     |   8 +-
 tools/testing/selftests/kvm/lib/kvm_util.c    |   3 +-
 .../selftests/kvm/lib/riscv/processor.c       |  66 +++----
 ..._guest_memory_test.c => mmu_stress_test.c} | 161 ++++++++++++++++--
 .../testing/selftests/kvm/riscv/arch_timer.c  |   2 +-
 .../testing/selftests/kvm/riscv/ebreak_test.c |   2 +-
 .../selftests/kvm/riscv/sbi_pmu_test.c        |   2 +-
 tools/testing/selftests/kvm/s390x/resets.c    |   2 +-
 tools/testing/selftests/kvm/steal_time.c      |   3 +-
 20 files changed, 236 insertions(+), 107 deletions(-)
 rename tools/testing/selftests/kvm/{max_guest_memory_test.c => mmu_stress_test.c} (60%)


base-commit: 15e1c3d65975524c5c792fcd59f7d89f00402261

Comments

Andrew Jones Sept. 12, 2024, 11:48 a.m. UTC | #1
I gave this test a try on riscv, but it appears to hang in
rendezvous_with_vcpus(). My platform is QEMU, so maybe I was just too
impatient. Anyway, I haven't read the test yet, so I don't even know
what it's doing. It's possible it's trying to do something not yet
supported on riscv. I'll add investigating that to my TODO, but I'm
not sure when I'll get to it.

As for this series, another patch (or a sneaky change to one
of the patches...) should add 

 #include "ucall_common.h"

to mmu_stress_test.c since it's not there yet despite using get_ucall().
Building riscv failed because of that.

Thanks,
drew

Sean Christopherson Sept. 12, 2024, 2:03 p.m. UTC | #2
On Thu, Sep 12, 2024, Andrew Jones wrote:
> I gave this test a try on riscv, but it appears to hang in
> rendezvous_with_vcpus(). My platform is QEMU, so maybe I was just too
> impatient.

Try running with " -m 1 -s 1", which tells the test to use only 1GiB of memory.
That should run quite quickly, even in an emulator.

> Anyway, I haven't read the test yet, so I don't even know what it's doing.
> It's possible it's trying to do something not yet supported on riscv. I'll
> add investigating that to my TODO, but I'm not sure when I'll get to it.
> 
> As for this series, another patch (or a sneaky change to one
> of the patches...) should add 
> 
>  #include "ucall_common.h"
> 
> to mmu_stress_test.c since it's not there yet despite using get_ucall().
> Building riscv failed because of that.

Roger that.

Thanks!