Message ID | 20250226213844.3826821-1-kbusch@meta.com (mailing list archive) |
---|---|
Headers | show |
Series | kvm/x86: vhost task creation failure handling | expand |
Hi Keith There are some error messages from qemu output when I tested this series of patches with the virtio-net regression test. It can reproduced by boot up a guest with vhost device after applied your patches. Error messages: Qemu output: qemu-kvm: -netdev {"id": "idoejzv8", "type": "tap", "vhost": true, "vhostfd": "16", "fd": "10"}: vhost_set_owner failed: Cannot allocate memory qemu-kvm: -netdev {"id": "idoejzv8", "type": "tap", "vhost": true, "vhostfd": "16", "fd": "10"}: vhost-net requested but could not be initialized My tests based on this commit: # git log -1 commit dd83757f6e686a2188997cb58b5975f744bb7786 (HEAD -> master, origin/master, origin/HEAD) Merge: 102c16a1f9a9 eb54d2695b57 Author: Linus Torvalds <torvalds@linux-foundation.org> Date: Wed Feb 26 16:55:30 2025 -0800 Merge tag 'bcachefs-2025-02-26' of git://evilpiepirate.org/bcachefs Thanks Lei On Thu, Feb 27, 2025 at 5:40 AM Keith Busch <kbusch@meta.com> wrote: > > From: Keith Busch <kbusch@kernel.org> > > The suggestion from Sean appears to be successful, so sending out a new > version for consideration. > > Background: > > The crosvm VMM might send signals to its threads that have entered > KVM_RUN. The signal specifically is SIGRTRMIN from here: > > https://github.com/google/crosvm/blob/main/src/crosvm/sys/linux/vcpu.rs#L651 > > If this happens to occur when the huge page recovery is trying to create > its vhost task, that will fail with ERESTARTNOINTR. Once this happens, > all KVM_RUN calls will fail with ENOMEM despite memory not being the > problem. > > This series propogates the error up so we can distinguish that from the > current defaulting to ENOMEM and replaces the call_once since we need to > be able to call it repeatedly due to this condition. > > Changes from the v1 (prefixed as an "RFC", really) patch: > > Instead of using a VM-wide mutex, update the call_once pattern to > complete only if what it calls is successful (from Sean). > > Keith Busch (1): > vhost: return task creation error instead of NULL > > Sean Christopherson (1): > kvm: retry nx_huge_page_recovery_thread creation > > arch/x86/kvm/mmu/mmu.c | 12 +++++------- > drivers/vhost/vhost.c | 2 +- > include/linux/call_once.h | 16 +++++++++++----- > kernel/vhost_task.c | 4 ++-- > 4 files changed, 19 insertions(+), 15 deletions(-) > > -- > 2.43.5 > >
On Thu, Feb 27, 2025 at 06:21:34PM +0800, Lei Yang wrote: > Hi Keith > > There are some error messages from qemu output when I tested this > series of patches with the virtio-net regression test. It can > reproduced by boot up a guest with vhost device after applied your > patches. > Error messages: > Qemu output: > qemu-kvm: -netdev {"id": "idoejzv8", "type": "tap", "vhost": true, > "vhostfd": "16", "fd": "10"}: vhost_set_owner failed: Cannot allocate > memory > qemu-kvm: -netdev {"id": "idoejzv8", "type": "tap", "vhost": true, > "vhostfd": "16", "fd": "10"}: vhost-net requested but could not be > initialized *facepalm* I accidently left the "!" in the condition: @@ -666,7 +666,7 @@ static struct vhost_worker *vhost_worker_create(struct vhost_dev *dev) vtsk = vhost_task_create(vhost_run_work_list, vhost_worker_killed, worker, name); - if (!vtsk) + if (!IS_ERR(vtsk)) goto free_worker;
From: Keith Busch <kbusch@kernel.org> The suggestion from Sean appears to be successful, so sending out a new version for consideration. Background: The crosvm VMM might send signals to its threads that have entered KVM_RUN. The signal specifically is SIGRTRMIN from here: https://github.com/google/crosvm/blob/main/src/crosvm/sys/linux/vcpu.rs#L651 If this happens to occur when the huge page recovery is trying to create its vhost task, that will fail with ERESTARTNOINTR. Once this happens, all KVM_RUN calls will fail with ENOMEM despite memory not being the problem. This series propogates the error up so we can distinguish that from the current defaulting to ENOMEM and replaces the call_once since we need to be able to call it repeatedly due to this condition. Changes from the v1 (prefixed as an "RFC", really) patch: Instead of using a VM-wide mutex, update the call_once pattern to complete only if what it calls is successful (from Sean). Keith Busch (1): vhost: return task creation error instead of NULL Sean Christopherson (1): kvm: retry nx_huge_page_recovery_thread creation arch/x86/kvm/mmu/mmu.c | 12 +++++------- drivers/vhost/vhost.c | 2 +- include/linux/call_once.h | 16 +++++++++++----- kernel/vhost_task.c | 4 ++-- 4 files changed, 19 insertions(+), 15 deletions(-)