[2/3] KVM: selftests: Introduce kvm_vm_dead_free

Message ID 20241107094000.70705-3-eric.auger@redhat.com (mailing list archive)
State New
Series KVM: arm64: Handle KVM_REQ_VM_DEAD in vgic_init test

Commit Message

Eric Auger Nov. 7, 2024, 9:38 a.m. UTC
In case a KVM_REQ_VM_DEAD request was sent to a VM, subsequent
KVM ioctls will fail and cause test failure. This now happens
with an aarch64 vgic test where the kvm_vm_free() fails. Let's
add a new kvm_vm_dead_free() helper that does all the deallocation
besides the KVM_SET_USER_MEMORY_REGION2 ioctl.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 .../testing/selftests/kvm/include/kvm_util.h  |  1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    | 25 ++++++++++++++-----
 2 files changed, 20 insertions(+), 6 deletions(-)

Comments

Sean Christopherson Nov. 7, 2024, 5:55 p.m. UTC | #1
On Thu, Nov 07, 2024, Eric Auger wrote:
> In case a KVM_REQ_VM_DEAD request was sent to a VM, subsequent
> KVM ioctls will fail and cause test failure. This now happens
> with an aarch64 vgic test where the kvm_vm_free() fails. Let's
> add a new kvm_vm_dead_free() helper that does all the deallocation
> besides the KVM_SET_USER_MEMORY_REGION2 ioctl.

Please no.  I don't want to bleed the kvm->vm_dead behavior all over selftests.
The hack in __TEST_ASSERT_VM_VCPU_IOCTL() is there purely to provide users with
a more helpful error message, it is most definitely not intended to be an "official"
way to detect and react to the VM being dead.

IMO, tests that intentionally result in a dead VM should assert that subsequent
VM/vCPU ioctls return -EIO, and that's all.  Attempting to gracefully free
resources adds complexity and pollutes the core selftests APIs, with very little
benefit.
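
E.g. a dead-VM assertion could be as simple as this (rough sketch, not
code from this series; the assert_vm_dead() name is made up, and it
leans on the stock __vm_ioctl() wrapper, which returns the raw ioctl()
result and leaves errno set on failure):

#include <errno.h>

#include "kvm_util.h"
#include "test_util.h"

/* Sketch: once a test has deliberately gotten its VM marked dead, the
 * only thing worth asserting is that subsequent VM ioctls fail with
 * EIO.  Helper name and choice of ioctl are illustrative. */
static void assert_vm_dead(struct kvm_vm *vm)
{
	int ret = __vm_ioctl(vm, KVM_CHECK_EXTENSION,
			     (void *)(unsigned long)KVM_CAP_USER_MEMORY);

	TEST_ASSERT(ret == -1 && errno == EIO,
		    "Expected EIO on dead VM, got ret=%d, errno=%d", ret, errno);
}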

Marking a VM dead should be a _very_ rare event; it's not something that I think
we should encourage, i.e. we shouldn't make it easier to deal with.  Ideally,
use of kvm_vm_dead() should be limited to things like sev_vm_move_enc_context_from(),
where KVM needs to prevent accessing the source VM to protect the host.  IMO, the
vGIC case and x86's enter_smm() are hacks.  E.g. I don't see any reason why the
enter_smm() case can't synthesize a triple fault.
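
E.g. something like the existing one-liner (a sketch; exact placement
in the enter_smm() error path is illustrative):

	/* Kill only the guest with a synthesized triple fault instead
	 * of marking the whole VM dead; reuses the existing
	 * KVM_REQ_TRIPLE_FAULT machinery. */
	kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);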

If memory utilization is a concern for the vgic_init test (which seems unlikely),
I would much rather solve that problem by expanding KVM selftests support for the
selftests harness so that each testcase runs as a child process.

https://lore.kernel.org/all/ZjUwqEXPA5QVItyX@google.com
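
FWIW, the kselftest harness already runs each TEST() in a forked child,
so per-testcase isolation comes for free; rough sketch (testcase name
and body are placeholders):

#include "kselftest_harness.h"

/* Each TEST() runs in its own child process under the harness, so a
 * testcase that deliberately kills its VM can simply exit and let the
 * kernel reclaim everything, without poisoning later testcases. */
TEST(vgic_init_example)
{
	/* create the VM, trigger the dead-VM condition, assert on EIO */
}

TEST_HARNESS_MAIN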
Mark Brown Nov. 7, 2024, 6:17 p.m. UTC | #2
On Thu, Nov 07, 2024 at 09:55:52AM -0800, Sean Christopherson wrote:

> If memory utilization is a concern for the vgic_init test (which seems unlikely),
> I would much rather solve that problem by expanding KVM selftests support for the
> selftests harness so that each testcase runs as a child process.

> https://lore.kernel.org/all/ZjUwqEXPA5QVItyX@google.com

That would in general be useful: having individual test cases be visible
is helpful from a UI and triage point of view, and having them all run
even if one of them has failed is even more so.
Oliver Upton Nov. 7, 2024, 7:08 p.m. UTC | #3
On Thu, Nov 07, 2024 at 09:55:52AM -0800, Sean Christopherson wrote:
> On Thu, Nov 07, 2024, Eric Auger wrote:
> > In case a KVM_REQ_VM_DEAD request was sent to a VM, subsequent
> > KVM ioctls will fail and cause test failure. This now happens
> > with an aarch64 vgic test where the kvm_vm_free() fails. Let's
> > add a new kvm_vm_dead_free() helper that does all the deallocation
> > besides the KVM_SET_USER_MEMORY_REGION2 ioctl.
> 
> Please no.  I don't want to bleed the kvm->vm_dead behavior all over selftests.
> The hack in __TEST_ASSERT_VM_VCPU_IOCTL() is there purely to provide users with
> a more helpful error message, it is most definitely not intended to be an "official"
> way to detect and react to the VM being dead.
> 
> IMO, tests that intentionally result in a dead VM should assert that subsequent
> VM/vCPU ioctls return -EIO, and that's all.  Attempting to gracefully free
> resources adds complexity and pollutes the core selftests APIs, with very little
> benefit.

Encouraging tests to explicitly leak resources to fudge around assertions
in the selftests library seems off to me.

IMO, the better approach would be to provide a helper that gives the
impression of freeing the VM but implicitly leaks it, paired with some
reasoning for it.

diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index bc7c242480d6..75ad05c3c429 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -426,6 +426,7 @@ void vm_enable_dirty_ring(struct kvm_vm *vm, uint32_t ring_size);
 const char *vm_guest_mode_string(uint32_t i);
 
 void kvm_vm_free(struct kvm_vm *vmp);
+void __kvm_vm_dead_free(struct kvm_vm *vmp);
 void kvm_vm_restart(struct kvm_vm *vmp);
 void kvm_vm_release(struct kvm_vm *vmp);
 void kvm_vm_elf_load(struct kvm_vm *vm, const char *filename);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index a2b7df5f1d39..34ed397d7811 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -771,6 +771,21 @@ void kvm_vm_free(struct kvm_vm *vmp)
 	free(vmp);
 }
 
+/*
+ * For use with the *extremely* uncommon case that a test expects a VM to be
+ * marked as dead.
+ *
+ * Deliberately leak the VM to avoid hacking up kvm_vm_free() to cope with a
+ * dead VM while giving the outward impression of 'doing the right thing'.
+ */
+void __kvm_vm_dead_free(struct kvm_vm *vmp)
+{
+	if (!vmp)
+		return;
+
+	TEST_ASSERT(vm_dead(vmp), "VM should be marked dead");
+}
+
 int kvm_memfd_alloc(size_t size, bool hugepages)
 {
 	int memfd_flags = MFD_CLOEXEC;

> Marking a VM dead should be a _very_ rare event; it's not something that I think
> we should encourage, i.e. we shouldn't make it easier to deal with.  Ideally,
> use of kvm_vm_dead() should be limited to things like sev_vm_move_enc_context_from(),
> where KVM needs to prevent accessing the source VM to protect the host.  IMO, the
> vGIC case and x86's enter_smm() are hacks.  E.g. I don't see any reason why the
> enter_smm() case can't synthesize a triple fault.

The VGIC case is at least better than the alternative of slapping
bandaids all over the shop to cope with a half-baked VM and ensure we
tear it down correctly. Userspace is far up shit creek at the point the
VM is marked as dead, so I don't see any value in hobbling along
afterwards.
Sean Christopherson Nov. 7, 2024, 7:56 p.m. UTC | #4
On Thu, Nov 07, 2024, Oliver Upton wrote:
> On Thu, Nov 07, 2024 at 09:55:52AM -0800, Sean Christopherson wrote:
> > On Thu, Nov 07, 2024, Eric Auger wrote:
> > > In case a KVM_REQ_VM_DEAD request was sent to a VM, subsequent
> > > KVM ioctls will fail and cause test failure. This now happens
> > > with an aarch64 vgic test where the kvm_vm_free() fails. Let's
> > > add a new kvm_vm_dead_free() helper that does all the deallocation
> > > besides the KVM_SET_USER_MEMORY_REGION2 ioctl.
> > 
> > Please no.  I don't want to bleed the kvm->vm_dead behavior all over selftests.
> > The hack in __TEST_ASSERT_VM_VCPU_IOCTL() is there purely to provide users with
> > a more helpful error message, it is most definitely not intended to be an "official"
> > way to detect and react to the VM being dead.
> > 
> > IMO, tests that intentionally result in a dead VM should assert that subsequent
> > VM/vCPU ioctls return -EIO, and that's all.  Attempting to gracefully free
> > resources adds complexity and pollutes the core selftests APIs, with very little
> > benefit.
> 
> Encouraging tests to explicitly leak resources to fudge around assertions
> in the selftests library seems off to me.

I don't disagree, but I really, really don't want to add vm_dead().

> IMO, the better approach would be to provide a helper that gives the
> impression of freeing the VM but implicitly leaks it, paired with some
> reasoning for it.

Actually, duh.  There's no need to manually delete KVM memslots for *any* VM,
dead or alive.  Just skip that unconditionally when freeing the VM, and then the
vGIC test just needs to assert on -EIO instead of -ENXIO/-EBUSY.

---
From: Sean Christopherson <seanjc@google.com>
Date: Thu, 7 Nov 2024 11:39:59 -0800
Subject: [PATCH] KVM: selftests: Don't bother deleting memslots in KVM when
 freeing VMs

When freeing a VM, don't call into KVM to manually remove each memslot;
simply clean up and free any userspace assets associated with the memory
region.  KVM is ultimately responsible for ensuring kernel resources are
freed when the VM is destroyed, deleting memslots one-by-one is
unnecessarily slow, and unless a test is already leaking the VM fd, the
VM will be destroyed when kvm_vm_release() is called.

Not deleting KVM's memslot also allows cleaning up dead VMs without having
to care whether or not the to-be-freed VM is dead or alive.

Reported-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 tools/testing/selftests/kvm/lib/kvm_util.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 6b3161a0990f..33fefeb3ca44 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -720,9 +720,6 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
 	rb_erase(&region->hva_node, &vm->regions.hva_tree);
 	hash_del(&region->slot_node);
 
-	region->region.memory_size = 0;
-	vm_ioctl(vm, KVM_SET_USER_MEMORY_REGION2, &region->region);
-
 	sparsebit_free(&region->unused_phy_pages);
 	sparsebit_free(&region->protected_phy_pages);
 	ret = munmap(region->mmap_start, region->mmap_size);
@@ -1197,7 +1194,12 @@ void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa)
  */
 void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot)
 {
-	__vm_mem_region_delete(vm, memslot2region(vm, slot));
+	struct userspace_mem_region *region = memslot2region(vm, slot);
+
+	region->region.memory_size = 0;
+	vm_ioctl(vm, KVM_SET_USER_MEMORY_REGION2, &region->region);
+
+	__vm_mem_region_delete(vm, region);
 }
 
 void vm_guest_mem_fallocate(struct kvm_vm *vm, uint64_t base, uint64_t size,

base-commit: c88416ba074a8913cf6d61b789dd834bbca6681c
Oliver Upton Nov. 7, 2024, 8:12 p.m. UTC | #5
On Thu, Nov 07, 2024 at 11:56:32AM -0800, Sean Christopherson wrote:
> On Thu, Nov 07, 2024, Oliver Upton wrote:
> > On Thu, Nov 07, 2024 at 09:55:52AM -0800, Sean Christopherson wrote:
> > > On Thu, Nov 07, 2024, Eric Auger wrote:
> > > > In case a KVM_REQ_VM_DEAD request was sent to a VM, subsequent
> > > > KVM ioctls will fail and cause test failure. This now happens
> > > > with an aarch64 vgic test where the kvm_vm_free() fails. Let's
> > > > add a new kvm_vm_dead_free() helper that does all the deallocation
> > > > besides the KVM_SET_USER_MEMORY_REGION2 ioctl.
> > > 
> > > Please no.  I don't want to bleed the kvm->vm_dead behavior all over selftests.
> > > The hack in __TEST_ASSERT_VM_VCPU_IOCTL() is there purely to provide users with
> > > a more helpful error message, it is most definitely not intended to be an "official"
> > > way to detect and react to the VM being dead.
> > > 
> > > IMO, tests that intentionally result in a dead VM should assert that subsequent
> > > VM/vCPU ioctls return -EIO, and that's all.  Attempting to gracefully free
> > > resources adds complexity and pollutes the core selftests APIs, with very little
> > > benefit.
> > 
> > Encouraging tests to explicitly leak resources to fudge around assertions
> > in the selftests library seems off to me.
> 
> I don't disagree, but I really, really don't want to add vm_dead().

It'd still be valuable to test that the VM is properly dead and
subsequent ioctls also return EIO, but I understand the hesitation.

> > IMO, the better approach would be to provide a helper that gives the
> > impression of freeing the VM but implicitly leaks it, paired with some
> > reasoning for it.
> 
> Actually, duh.  There's no need to manually delete KVM memslots for *any* VM,
> dead or alive.  Just skip that unconditionally when freeing the VM, and then the
> > vGIC test just needs to assert on -EIO instead of -ENXIO/-EBUSY.

Yeah, that'd tighten up the assertions a bit more to the exact ioctl
where we expect the VM to go sideways.

> ---
> From: Sean Christopherson <seanjc@google.com>
> Date: Thu, 7 Nov 2024 11:39:59 -0800
> Subject: [PATCH] KVM: selftests: Don't bother deleting memslots in KVM when
>  freeing VMs
> 
> When freeing a VM, don't call into KVM to manually remove each memslot;
> simply clean up and free any userspace assets associated with the memory
> region.  KVM is ultimately responsible for ensuring kernel resources are
> freed when the VM is destroyed, deleting memslots one-by-one is
> unnecessarily slow, and unless a test is already leaking the VM fd, the
> VM will be destroyed when kvm_vm_release() is called.
> 
> Not deleting KVM's memslot also allows cleaning up dead VMs without having
> to care whether or not the to-be-freed VM is dead or alive.

Can you add a comment to kvm_vm_free() about why we want to avoid ioctls
in that helper? It'd help discourage this situation from happening again
in the future in the unlikely case someone wants to park an ioctl there.
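
E.g. something along these lines above kvm_vm_free() (wording
illustrative, not the comment that actually landed):

/*
 * Destroys and frees the VM pointed to by vmp.
 *
 * Do not add ioctls on this path: freeing must also work for VMs that
 * have been marked dead (KVM_REQ_VM_DEAD), where every VM/vCPU ioctl
 * fails with EIO.  Unless the test leaked the fd, KVM destroys the VM
 * and frees all kernel-side resources when kvm_vm_release() closes it.
 */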

> Reported-by: Eric Auger <eric.auger@redhat.com>
> Reported-by: Mark Brown <broonie@kernel.org>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

I'm assuming you want to take this, happy to grab it otherwise.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>

> > > Marking a VM dead should be a _very_ rare event; it's not something that I think
> > > we should encourage, i.e. we shouldn't make it easier to deal with.  Ideally,
> > > use of kvm_vm_dead() should be limited to things like sev_vm_move_enc_context_from(),
> > > where KVM needs to prevent accessing the source VM to protect the host.  IMO, the
> > > vGIC case and x86's enter_smm() are hacks.  E.g. I don't see any reason why the
> > > enter_smm() case can't synthesize a triple fault.
> > 
> > The VGIC case is at least better than the alternative of slapping
> > bandaids all over the shop to cope with a half-baked VM and ensure we
> > tear it down correctly. Userspace is far up shit creek at the point the
> > VM is marked as dead, so I don't see any value in hobbling along
> > afterwards.
> 
> Again, I don't disagree, but I don't want to normalize shooting the VM on errors.

Definitely not. It is very much a break-glass situation where this is
even somewhat OK.
Sean Christopherson Nov. 7, 2024, 8:26 p.m. UTC | #6
On Thu, Nov 07, 2024, Oliver Upton wrote:
> On Thu, Nov 07, 2024 at 11:56:32AM -0800, Sean Christopherson wrote:
> > ---
> > From: Sean Christopherson <seanjc@google.com>
> > Date: Thu, 7 Nov 2024 11:39:59 -0800
> > Subject: [PATCH] KVM: selftests: Don't bother deleting memslots in KVM when
> >  freeing VMs
> > 
> > When freeing a VM, don't call into KVM to manually remove each memslot;
> > simply clean up and free any userspace assets associated with the memory
> > region.  KVM is ultimately responsible for ensuring kernel resources are
> > freed when the VM is destroyed, deleting memslots one-by-one is
> > unnecessarily slow, and unless a test is already leaking the VM fd, the
> > VM will be destroyed when kvm_vm_release() is called.
> > 
> > Not deleting KVM's memslot also allows cleaning up dead VMs without having
> > to care whether or not the to-be-freed VM is dead or alive.
> 
> Can you add a comment to kvm_vm_free() about why we want to avoid ioctls
> in that helper? It'd help discourage this situation from happening again
> in the future in the unlikely case someone wants to park an ioctl there.
> 
> > Reported-by: Eric Auger <eric.auger@redhat.com>
> > Reported-by: Mark Brown <broonie@kernel.org>
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> 
> I'm assuming you want to take this, happy to grab it otherwise.

You take it.  Unless my git foo is off the rails, this needs to go into 6.12,
along with a fix for the vGIC test.  That, and I already sent Paolo a pull request
for rc7; I don't want to overwork myself ;-)
Eric Auger Nov. 8, 2024, 8:55 a.m. UTC | #7
Hi Sean, Oliver,

On 11/7/24 21:12, Oliver Upton wrote:
> On Thu, Nov 07, 2024 at 11:56:32AM -0800, Sean Christopherson wrote:
>> On Thu, Nov 07, 2024, Oliver Upton wrote:
>>> On Thu, Nov 07, 2024 at 09:55:52AM -0800, Sean Christopherson wrote:
>>>> On Thu, Nov 07, 2024, Eric Auger wrote:
>>>>> In case a KVM_REQ_VM_DEAD request was sent to a VM, subsequent
>>>>> KVM ioctls will fail and cause test failure. This now happens
>>>>> with an aarch64 vgic test where the kvm_vm_free() fails. Let's
>>>>> add a new kvm_vm_dead_free() helper that does all the deallocation
>>>>> besides the KVM_SET_USER_MEMORY_REGION2 ioctl.
>>>> Please no.  I don't want to bleed the kvm->vm_dead behavior all over selftests.
>>>> The hack in __TEST_ASSERT_VM_VCPU_IOCTL() is there purely to provide users with
>>>> a more helpful error message, it is most definitely not intended to be an "official"
>>>> way to detect and react to the VM being dead.
>>>>
>>>> IMO, tests that intentionally result in a dead VM should assert that subsequent
>>>> VM/vCPU ioctls return -EIO, and that's all.  Attempting to gracefully free
>>>> resources adds complexity and pollutes the core selftests APIs, with very little
>>>> benefit.
>>> Encouraging tests to explicitly leak resources to fudge around assertions
>>> in the selftests library seems off to me.
>> I don't disagree, but I really, really don't want to add vm_dead().
> It'd still be valuable to test that the VM is properly dead and
> subsequent ioctls also return EIO, but I understand the hesitation.
>
>>> IMO, the better approach would be to provide a helper that gives the
>>> impression of freeing the VM but implicitly leaks it, paired with some
>>> reasoning for it.
>> Actually, duh.  There's no need to manually delete KVM memslots for *any* VM,
>> dead or alive.  Just skip that unconditionally when freeing the VM, and then the
>> vGIC test just needs to assert on -EIO instead of -ENXIO/-EBUSY.
> Yeah, that'd tighten up the assertions a bit more to the exact ioctl
> where we expect the VM to go sideways.
>
>> ---
>> From: Sean Christopherson <seanjc@google.com>
>> Date: Thu, 7 Nov 2024 11:39:59 -0800
>> Subject: [PATCH] KVM: selftests: Don't bother deleting memslots in KVM when
>>  freeing VMs
>>
>> When freeing a VM, don't call into KVM to manually remove each memslot;
>> simply clean up and free any userspace assets associated with the memory
>> region.  KVM is ultimately responsible for ensuring kernel resources are
>> freed when the VM is destroyed, deleting memslots one-by-one is
>> unnecessarily slow, and unless a test is already leaking the VM fd, the
>> VM will be destroyed when kvm_vm_release() is called.
>>
>> Not deleting KVM's memslot also allows cleaning up dead VMs without having
>> to care whether or not the to-be-freed VM is dead or alive.
> Can you add a comment to kvm_vm_free() about why we want to avoid ioctls
> in that helper? It'd help discourage this situation from happening again
> in the future in the unlikely case someone wants to park an ioctl there.
>
>> Reported-by: Eric Auger <eric.auger@redhat.com>
>> Reported-by: Mark Brown <broonie@kernel.org>
>> Signed-off-by: Sean Christopherson <seanjc@google.com>
> I'm assuming you want to take this, happy to grab it otherwise.
>
> Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric
>
>>>> Marking a VM dead should be a _very_ rare event; it's not something that I think
>>>> we should encourage, i.e. we shouldn't make it easier to deal with.  Ideally,
>>>> use of kvm_vm_dead() should be limited to things like sev_vm_move_enc_context_from(),
>>>> where KVM needs to prevent accessing the source VM to protect the host.  IMO, the
>>>> vGIC case and x86's enter_smm() are hacks.  E.g. I don't see any reason why the
>>>> enter_smm() case can't synthesize a triple fault.
>>> The VGIC case is at least better than the alternative of slapping
>>> bandaids all over the shop to cope with a half-baked VM and ensure we
>>> tear it down correctly. Userspace is far up shit creek at the point the
>>> VM is marked as dead, so I don't see any value in hobbling along
>>> afterwards.
>> Again, I don't disagree, but I don't want to normalize shooting the VM on errors.
> Definitely not. It is very much a break-glass situation where this is
> even somewhat OK.
>
Eric Auger Nov. 8, 2024, 9 a.m. UTC | #8
Hi Oliver,

On 11/7/24 21:12, Oliver Upton wrote:
> On Thu, Nov 07, 2024 at 11:56:32AM -0800, Sean Christopherson wrote:
>> On Thu, Nov 07, 2024, Oliver Upton wrote:
>>> On Thu, Nov 07, 2024 at 09:55:52AM -0800, Sean Christopherson wrote:
>>>> On Thu, Nov 07, 2024, Eric Auger wrote:
>>>>> In case a KVM_REQ_VM_DEAD request was sent to a VM, subsequent
>>>>> KVM ioctls will fail and cause test failure. This now happens
>>>>> with an aarch64 vgic test where the kvm_vm_free() fails. Let's
>>>>> add a new kvm_vm_dead_free() helper that does all the deallocation
>>>>> besides the KVM_SET_USER_MEMORY_REGION2 ioctl.
>>>> Please no.  I don't want to bleed the kvm->vm_dead behavior all over selftests.
>>>> The hack in __TEST_ASSERT_VM_VCPU_IOCTL() is there purely to provide users with
>>>> a more helpful error message, it is most definitely not intended to be an "official"
>>>> way to detect and react to the VM being dead.
>>>>
>>>> IMO, tests that intentionally result in a dead VM should assert that subsequent
>>>> VM/vCPU ioctls return -EIO, and that's all.  Attempting to gracefully free
>>>> resources adds complexity and pollutes the core selftests APIs, with very little
>>>> benefit.
>>> Encouraging tests to explicitly leak resources to fudge around assertions
>>> in the selftests library seems off to me.
>> I don't disagree, but I really, really don't want to add vm_dead().
> It'd still be valuable to test that the VM is properly dead and
> subsequent ioctls also return EIO, but I understand the hesitation.

Currently the vgic_test does not check that the VM is dead; it just
checks the first expected errno according to the uapi documentation.
Besides, AFAIK the latter has not been updated to reflect the new
VM-dead implementation.

Eric
>
>>> IMO, the better approach would be to provide a helper that gives the
>>> impression of freeing the VM but implicitly leaks it, paired with some
>>> reasoning for it.
>> Actually, duh.  There's no need to manually delete KVM memslots for *any* VM,
>> dead or alive.  Just skip that unconditionally when freeing the VM, and then the
>> vGIC test just needs to assert on -EIO instead of -ENXIO/-EBUSY.
> Yeah, that'd tighten up the assertions a bit more to the exact ioctl
> where we expect the VM to go sideways.
>
>> ---
>> From: Sean Christopherson <seanjc@google.com>
>> Date: Thu, 7 Nov 2024 11:39:59 -0800
>> Subject: [PATCH] KVM: selftests: Don't bother deleting memslots in KVM when
>>  freeing VMs
>>
>> When freeing a VM, don't call into KVM to manually remove each memslot;
>> simply clean up and free any userspace assets associated with the memory
>> region.  KVM is ultimately responsible for ensuring kernel resources are
>> freed when the VM is destroyed, deleting memslots one-by-one is
>> unnecessarily slow, and unless a test is already leaking the VM fd, the
>> VM will be destroyed when kvm_vm_release() is called.
>>
>> Not deleting KVM's memslot also allows cleaning up dead VMs without having
>> to care whether or not the to-be-freed VM is dead or alive.
> Can you add a comment to kvm_vm_free() about why we want to avoid ioctls
> in that helper? It'd help discourage this situation from happening again
> in the future in the unlikely case someone wants to park an ioctl there.
>
>> Reported-by: Eric Auger <eric.auger@redhat.com>
>> Reported-by: Mark Brown <broonie@kernel.org>
>> Signed-off-by: Sean Christopherson <seanjc@google.com>
> I'm assuming you want to take this, happy to grab it otherwise.
>
> Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
>
>>>> Marking a VM dead should be a _very_ rare event; it's not something that I think
>>>> we should encourage, i.e. we shouldn't make it easier to deal with.  Ideally,
>>>> use of kvm_vm_dead() should be limited to things like sev_vm_move_enc_context_from(),
>>>> where KVM needs to prevent accessing the source VM to protect the host.  IMO, the
>>>> vGIC case and x86's enter_smm() are hacks.  E.g. I don't see any reason why the
>>>> enter_smm() case can't synthesize a triple fault.
>>> The VGIC case is at least better than the alternative of slapping
>>> bandaids all over the shop to cope with a half-baked VM and ensure we
>>> tear it down correctly. Userspace is far up shit creek at the point the
>>> VM is marked as dead, so I don't see any value in hobbling along
>>> afterwards.
>> Again, I don't disagree, but I don't want to normalize shooting the VM on errors.
> Definitely not. It is very much a break-glass situation where this is
> even somewhat OK.
>
Oliver Upton Nov. 8, 2024, 5:18 p.m. UTC | #9
On Fri, Nov 08, 2024 at 10:00:06AM +0100, Eric Auger wrote:
> Hi Oliver,
> 
> On 11/7/24 21:12, Oliver Upton wrote:
> > On Thu, Nov 07, 2024 at 11:56:32AM -0800, Sean Christopherson wrote:
> >> On Thu, Nov 07, 2024, Oliver Upton wrote:
> >>> On Thu, Nov 07, 2024 at 09:55:52AM -0800, Sean Christopherson wrote:
> >>>> On Thu, Nov 07, 2024, Eric Auger wrote:
> >>>>> In case a KVM_REQ_VM_DEAD request was sent to a VM, subsequent
> >>>>> KVM ioctls will fail and cause test failure. This now happens
> >>>>> with an aarch64 vgic test where the kvm_vm_free() fails. Let's
> >>>>> add a new kvm_vm_dead_free() helper that does all the deallocation
> >>>>> besides the KVM_SET_USER_MEMORY_REGION2 ioctl.
> >>>> Please no.  I don't want to bleed the kvm->vm_dead behavior all over selftests.
> >>>> The hack in __TEST_ASSERT_VM_VCPU_IOCTL() is there purely to provide users with
> >>>> a more helpful error message, it is most definitely not intended to be an "official"
> >>>> way to detect and react to the VM being dead.
> >>>>
> >>>> IMO, tests that intentionally result in a dead VM should assert that subsequent
> >>>> VM/vCPU ioctls return -EIO, and that's all.  Attempting to gracefully free
> >>>> resources adds complexity and pollutes the core selftests APIs, with very little
> >>>> benefit.
> >>> Encouraging tests to explicitly leak resources to fudge around assertions
> >>> in the selftests library seems off to me.
> >> I don't disagree, but I really, really don't want to add vm_dead().
> > It'd still be valuable to test that the VM is properly dead and
> > subsequent ioctls also return EIO, but I understand the hesitation.
> 
> Currently the vgic_test does not check that the VM is dead; it just
> checks the first expected errno according to the uapi documentation.
> Besides, AFAIK the latter has not been updated to reflect the new
> VM-dead implementation.

Ah yep, of course. We'll only return EIO for subsequent ioctls. It'd be
nice to assert that is the case, but it isn't _that_ big of a deal.

Patch

diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index 90424bfe33bd..8be893b2c6a2 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -437,6 +437,7 @@ void vm_enable_dirty_ring(struct kvm_vm *vm, uint32_t ring_size);
 const char *vm_guest_mode_string(uint32_t i);
 
 void kvm_vm_free(struct kvm_vm *vmp);
+void kvm_vm_dead_free(struct kvm_vm *vmp);
 void kvm_vm_restart(struct kvm_vm *vmp);
 void kvm_vm_release(struct kvm_vm *vmp);
 void kvm_vm_elf_load(struct kvm_vm *vm, const char *filename);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index a2b7df5f1d39..befbbe989d73 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -712,7 +712,8 @@ void kvm_vm_release(struct kvm_vm *vmp)
 }
 
 static void __vm_mem_region_delete(struct kvm_vm *vm,
-				   struct userspace_mem_region *region)
+				   struct userspace_mem_region *region,
+				   bool dead)
 {
 	int ret;
 
@@ -720,8 +721,10 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
 	rb_erase(&region->hva_node, &vm->regions.hva_tree);
 	hash_del(&region->slot_node);
 
-	region->region.memory_size = 0;
-	vm_ioctl(vm, KVM_SET_USER_MEMORY_REGION2, &region->region);
+	if (!dead) {
+		region->region.memory_size = 0;
+		vm_ioctl(vm, KVM_SET_USER_MEMORY_REGION2, &region->region);
+	}
 
 	sparsebit_free(&region->unused_phy_pages);
 	sparsebit_free(&region->protected_phy_pages);
@@ -742,7 +745,7 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
 /*
  * Destroys and frees the VM pointed to by vmp.
  */
-void kvm_vm_free(struct kvm_vm *vmp)
+static void __kvm_vm_free(struct kvm_vm *vmp, bool dead)
 {
 	int ctr;
 	struct hlist_node *node;
@@ -759,7 +762,7 @@ void kvm_vm_free(struct kvm_vm *vmp)
 
 	/* Free userspace_mem_regions. */
 	hash_for_each_safe(vmp->regions.slot_hash, ctr, node, region, slot_node)
-		__vm_mem_region_delete(vmp, region);
+		__vm_mem_region_delete(vmp, region, dead);
 
 	/* Free sparsebit arrays. */
 	sparsebit_free(&vmp->vpages_valid);
@@ -771,6 +774,16 @@ void kvm_vm_free(struct kvm_vm *vmp)
 	free(vmp);
 }
 
+void kvm_vm_free(struct kvm_vm *vmp)
+{
+	__kvm_vm_free(vmp, false);
+}
+
+void kvm_vm_dead_free(struct kvm_vm *vmp)
+{
+	__kvm_vm_free(vmp, true);
+}
+
 int kvm_memfd_alloc(size_t size, bool hugepages)
 {
 	int memfd_flags = MFD_CLOEXEC;
@@ -1197,7 +1210,7 @@ void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa)
  */
 void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot)
 {
-	__vm_mem_region_delete(vm, memslot2region(vm, slot));
+	__vm_mem_region_delete(vm, memslot2region(vm, slot), false);
 }
 
 void vm_guest_mem_fallocate(struct kvm_vm *vm, uint64_t base, uint64_t size,