diff mbox series

[V9,1/8] accel/kvm: Extract common KVM vCPU {creation, parking} code

Message ID 20240519210620.228342-2-salil.mehta@huawei.com (mailing list archive)
State New, archived
Series Add architecture agnostic code to support vCPU Hotplug

Commit Message

Salil Mehta May 19, 2024, 9:06 p.m. UTC
KVM vCPU creation is done once during vCPU realization, when the QEMU vCPU
thread is spawned. This is currently common to all architectures.

Hot-unplug of a vCPU destroys the vCPU object in QOM, but the corresponding KVM
vCPU object in the host KVM is not destroyed because KVM doesn't support vCPU
removal. Therefore, its representative KVM vCPU object/context in QEMU is
parked.

Refactor the architecture-common logic so that some APIs can be reused by the
vCPU hotplug code of architectures like ARM, Loongson, etc. Update the new/old
APIs with trace events instead of DPRINTF. No functional change is intended here.
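
The parking described above can be pictured with a small standalone sketch. It
is not the QEMU code: the names park_vcpu, take_parked_vcpu and struct
parked_vcpu are hypothetical stand-ins, and a plain singly linked list replaces
QEMU's QLIST macros. It only shows the idea of keeping the fd of an unplugged
vCPU around and handing it back on a later lookup by vCPU id.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's struct KVMParkedVcpu. */
struct parked_vcpu {
    unsigned long vcpu_id;
    int kvm_fd;
    struct parked_vcpu *next;
};

/* Head of the list of parked vCPUs (assumed global, as in this sketch only). */
static struct parked_vcpu *parked_head;

/* Remember the fd of an unplugged vCPU instead of closing it. */
void park_vcpu(unsigned long vcpu_id, int kvm_fd)
{
    struct parked_vcpu *p;

    assert(kvm_fd >= 0);            /* only valid fds can be parked */

    p = calloc(1, sizeof(*p));
    if (!p) {
        abort();
    }
    p->vcpu_id = vcpu_id;
    p->kvm_fd = kvm_fd;
    p->next = parked_head;          /* insert at the head of the list */
    parked_head = p;
}

/* Remove a parked entry and return its fd, or -ENOENT if none matches. */
int take_parked_vcpu(unsigned long vcpu_id)
{
    struct parked_vcpu **pp;

    for (pp = &parked_head; *pp; pp = &(*pp)->next) {
        if ((*pp)->vcpu_id == vcpu_id) {
            struct parked_vcpu *p = *pp;
            int fd = p->kvm_fd;

            *pp = p->next;          /* unlink before freeing */
            free(p);
            return fd;
        }
    }
    return -ENOENT;
}
```

A second lookup for the same id fails with -ENOENT because the entry is
consumed by the first one, which is what lets a caller tell "parked" apart
from "needs to be created".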

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
---
 accel/kvm/kvm-all.c    | 64 ++++++++++++++++++++++++++++++++----------
 accel/kvm/kvm-cpus.h   | 14 +++++++++
 accel/kvm/trace-events |  5 +++-
 3 files changed, 67 insertions(+), 16 deletions(-)

Comments

Nicholas Piggin May 20, 2024, 8:04 a.m. UTC | #1
On Mon May 20, 2024 at 7:06 AM AEST, Salil Mehta wrote:
> KVM vCPU creation is done once during the vCPU realization when Qemu vCPU thread
> is spawned. This is common to all the architectures as of now.
>
> Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the
> corresponding KVM vCPU object in the Host KVM is not destroyed as KVM doesn't
> support vCPU removal. Therefore, its representative KVM vCPU object/context in
> Qemu is parked.
>
> Refactor architecture common logic so that some APIs could be reused by vCPU
> Hotplug code of some architectures likes ARM, Loongson etc. Update new/old APIs
> with trace events instead of DPRINTF. No functional change is intended here.

This is a nice cleanup and helps with ppc hotplug as well.

Has there been any architecture code posted yet?

Just a few minor things:

>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Tested-by: Xianglai Li <lixianglai@loongson.cn>
> Tested-by: Miguel Luis <miguel.luis@oracle.com>
> Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
> Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> ---
>  accel/kvm/kvm-all.c    | 64 ++++++++++++++++++++++++++++++++----------
>  accel/kvm/kvm-cpus.h   | 14 +++++++++
>  accel/kvm/trace-events |  5 +++-
>  3 files changed, 67 insertions(+), 16 deletions(-)
>
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index c0be9f5eed..9cd7d69bde 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -128,6 +128,7 @@ static QemuMutex kml_slots_lock;
>  #define kvm_slots_unlock()  qemu_mutex_unlock(&kml_slots_lock)
>  
>  static void kvm_slot_init_dirty_bitmap(KVMSlot *mem);
> +static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id);
>  
>  static inline void kvm_resample_fd_remove(int gsi)
>  {
> @@ -340,14 +341,53 @@ err:
>      return ret;
>  }
>  
> +void kvm_park_vcpu(CPUState *cpu)
> +{
> +    struct KVMParkedVcpu *vcpu;
> +
> +    trace_kvm_park_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
> +
> +    vcpu = g_malloc0(sizeof(*vcpu));
> +    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
> +    vcpu->kvm_fd = cpu->kvm_fd;
> +    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
> +}

Could you move kvm_get_vcpu up here so it's next to kvm_park_vcpu, and
then you don't need to forward declare it. Call it kvm_unpark_vcpu() for
symmetry with park.
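
A minimal sketch of that layout, under stated assumptions: park_vcpu,
unpark_vcpu, create_vcpu and fake_kvm_create_vcpu are hypothetical names, a
fixed-size table stands in for the parked list, and a counter stands in for the
KVM_CREATE_VCPU ioctl. The point is only that the unpark helper is defined
ahead of its caller, so no forward declaration is needed, and that the caller
falls back to creation when nothing is parked.

```c
#include <assert.h>
#include <errno.h>

#define MAX_PARKED 8

/* Hypothetical parked-vCPU table; a slot is free when in_use is 0. */
struct parked_slot {
    int in_use;
    unsigned long vcpu_id;
    int kvm_fd;
};

static struct parked_slot parked[MAX_PARKED];
static int next_fake_fd = 100;      /* stand-in for fds handed out by KVM */

/* Park: record the fd under its vCPU id. Returns 0, or -ENOSPC if full. */
int park_vcpu(unsigned long vcpu_id, int kvm_fd)
{
    assert(kvm_fd >= 0);
    for (int i = 0; i < MAX_PARKED; i++) {
        if (!parked[i].in_use) {
            parked[i] = (struct parked_slot){ 1, vcpu_id, kvm_fd };
            return 0;
        }
    }
    return -ENOSPC;
}

/* Unpark: defined right after park and before its caller below. */
int unpark_vcpu(unsigned long vcpu_id)
{
    for (int i = 0; i < MAX_PARKED; i++) {
        if (parked[i].in_use && parked[i].vcpu_id == vcpu_id) {
            parked[i].in_use = 0;
            return parked[i].kvm_fd;
        }
    }
    return -ENOENT;
}

/* Stub standing in for the KVM_CREATE_VCPU ioctl; not a real KVM call. */
int fake_kvm_create_vcpu(unsigned long vcpu_id)
{
    (void)vcpu_id;
    return next_fake_fd++;
}

/* Reuse a parked vCPU if one exists, otherwise create a new one. */
int create_vcpu(unsigned long vcpu_id)
{
    int fd = unpark_vcpu(vcpu_id);

    if (fd < 0) {
        fd = fake_kvm_create_vcpu(vcpu_id);
    }
    return fd;
}
```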

Thanks,
Nick
Salil Mehta May 20, 2024, 10:48 a.m. UTC | #2
Hi Nick,

>  From: Nicholas Piggin <npiggin@gmail.com>
>  Sent: Monday, May 20, 2024 9:04 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org
>  
>  On Mon May 20, 2024 at 7:06 AM AEST, Salil Mehta wrote:
>  > KVM vCPU creation is done once during the vCPU realization when Qemu
>  > vCPU thread is spawned. This is common to all the architectures as of now.
>  >
>  > Hot-unplug of vCPU results in destruction of the vCPU object in QOM
>  > but the corresponding KVM vCPU object in the Host KVM is not destroyed
>  > as KVM doesn't support vCPU removal. Therefore, its representative KVM
>  > vCPU object/context in Qemu is parked.
>  >
>  > Refactor architecture common logic so that some APIs could be reused
>  > by vCPU Hotplug code of some architectures likes ARM, Loongson etc.
>  > Update new/old APIs with trace events instead of DPRINTF. No functional
>  > change is intended here.
>  
>  This is a nice cleanup and helps with ppc hotplug as well.


Thanks and glad that it is of help.


>  
>  Has there been any architecture code posted yet?


Yes, I created one in April/early May. It contains the arch-specific patch-set
stacked on top of the arch-agnostic patch-set V8. The arch-specific patch-set
contains patches from the RFC V2 I floated last year.

https://github.com/salil-mehta/qemu/commits/virt-cpuhp-armv8/rfc-v3-rc1/


The complete code of RFC V2 was posted last year. That version has no clear
separation between the architecture-agnostic and architecture-specific
patch-sets. It can be found at the link below.

https://lore.kernel.org/qemu-devel/20230926100436.28284-1-salil.mehta@huawei.com/


I shall be floating RFC V3 soon, which will have only the ARM
architecture-specific part. It will also include TCG code, but there are some
TCG issues still to be fixed and we are working on them. Hence, we have
disabled TCG by default for now.


Thanks
Salil.

>  
>  Just a few minor thing:
>  
>  >
>  > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  > Reviewed-by: Gavin Shan <gshan@redhat.com>
>  > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
>  > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>  > Tested-by: Xianglai Li <lixianglai@loongson.cn>
>  > Tested-by: Miguel Luis <miguel.luis@oracle.com>
>  > Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
>  > Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
>  > ---
>  >  accel/kvm/kvm-all.c    | 64 ++++++++++++++++++++++++++++++++----------
>  >  accel/kvm/kvm-cpus.h   | 14 +++++++++
>  >  accel/kvm/trace-events |  5 +++-
>  >  3 files changed, 67 insertions(+), 16 deletions(-)
>  >
>  > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>  > index c0be9f5eed..9cd7d69bde 100644
>  > --- a/accel/kvm/kvm-all.c
>  > +++ b/accel/kvm/kvm-all.c
>  > @@ -128,6 +128,7 @@ static QemuMutex kml_slots_lock;
>  >  #define kvm_slots_unlock()  qemu_mutex_unlock(&kml_slots_lock)
>  >
>  >  static void kvm_slot_init_dirty_bitmap(KVMSlot *mem);
>  > +static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id);
>  >
>  >  static inline void kvm_resample_fd_remove(int gsi)
>  >  {
>  > @@ -340,14 +341,53 @@ err:
>  >      return ret;
>  >  }
>  >
>  > +void kvm_park_vcpu(CPUState *cpu)
>  > +{
>  > +    struct KVMParkedVcpu *vcpu;
>  > +
>  > +    trace_kvm_park_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>  > +
>  > +    vcpu = g_malloc0(sizeof(*vcpu));
>  > +    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
>  > +    vcpu->kvm_fd = cpu->kvm_fd;
>  > +    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
>  > +}
>  
>  Could you move kvm_get_vcpu up here so it's next to kvm_park_vcpu, and
>  then you don't need to forward declare it. Call it kvm_unpark_vcpu() for
>  symmetry with park.
>  
>  Thanks,
>  Nick
Salil Mehta May 20, 2024, 11:27 a.m. UTC | #3
>  From: Nicholas Piggin <npiggin@gmail.com>
>  Sent: Monday, May 20, 2024 9:04 AM
>  
>  On Mon May 20, 2024 at 7:06 AM AEST, Salil Mehta wrote:
>  > KVM vCPU creation is done once during the vCPU realization when Qemu
>  > vCPU thread is spawned. This is common to all the architectures as of now.
>  >
>  > Hot-unplug of vCPU results in destruction of the vCPU object in QOM
>  > but the corresponding KVM vCPU object in the Host KVM is not destroyed
>  > as KVM doesn't support vCPU removal. Therefore, its representative KVM
>  > vCPU object/context in Qemu is parked.
>  >
>  > Refactor architecture common logic so that some APIs could be reused
>  > by vCPU Hotplug code of some architectures likes ARM, Loongson etc.
>  > Update new/old APIs with trace events instead of DPRINTF. No functional
>  > change is intended here.
>  

[...]

>  > +void kvm_park_vcpu(CPUState *cpu)
>  > +{
>  > +    struct KVMParkedVcpu *vcpu;
>  > +
>  > +    trace_kvm_park_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>  > +
>  > +    vcpu = g_malloc0(sizeof(*vcpu));
>  > +    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
>  > +    vcpu->kvm_fd = cpu->kvm_fd;
>  > +    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
>  > +}
>  
>  Could you move kvm_get_vcpu up here so it's next to kvm_park_vcpu, and
>  then you don't need to forward declare it. Call it kvm_unpark_vcpu() for
>  symmetry with park.


Sorry, I missed this part earlier; Phil also suggested the same. Yes, I can.


Thanks
Salil.

>  
>  Thanks,
>  Nick

Patch

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index c0be9f5eed..9cd7d69bde 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -128,6 +128,7 @@  static QemuMutex kml_slots_lock;
 #define kvm_slots_unlock()  qemu_mutex_unlock(&kml_slots_lock)
 
 static void kvm_slot_init_dirty_bitmap(KVMSlot *mem);
+static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id);
 
 static inline void kvm_resample_fd_remove(int gsi)
 {
@@ -340,14 +341,53 @@  err:
     return ret;
 }
 
+void kvm_park_vcpu(CPUState *cpu)
+{
+    struct KVMParkedVcpu *vcpu;
+
+    trace_kvm_park_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
+
+    vcpu = g_malloc0(sizeof(*vcpu));
+    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
+    vcpu->kvm_fd = cpu->kvm_fd;
+    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
+}
+
+int kvm_create_vcpu(CPUState *cpu)
+{
+    unsigned long vcpu_id = kvm_arch_vcpu_id(cpu);
+    KVMState *s = kvm_state;
+    int kvm_fd;
+
+    /* check if the KVM vCPU already exists but is parked */
+    kvm_fd = kvm_get_vcpu(s, vcpu_id);
+    if (kvm_fd < 0) {
+        /* vCPU not parked: create a new KVM vCPU */
+        kvm_fd = kvm_vm_ioctl(s, KVM_CREATE_VCPU, vcpu_id);
+        if (kvm_fd < 0) {
+            error_report("KVM_CREATE_VCPU IOCTL failed for vCPU %lu", vcpu_id);
+            return kvm_fd;
+        }
+    }
+
+    trace_kvm_create_vcpu(cpu->cpu_index, vcpu_id, kvm_fd);
+
+    cpu->kvm_fd = kvm_fd;
+    cpu->kvm_state = s;
+    cpu->vcpu_dirty = true;
+    cpu->dirty_pages = 0;
+    cpu->throttle_us_per_full = 0;
+
+    return 0;
+}
+
 static int do_kvm_destroy_vcpu(CPUState *cpu)
 {
     KVMState *s = kvm_state;
     long mmap_size;
-    struct KVMParkedVcpu *vcpu = NULL;
     int ret = 0;
 
-    trace_kvm_destroy_vcpu();
+    trace_kvm_destroy_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
 
     ret = kvm_arch_destroy_vcpu(cpu);
     if (ret < 0) {
@@ -373,10 +413,7 @@  static int do_kvm_destroy_vcpu(CPUState *cpu)
         }
     }
 
-    vcpu = g_malloc0(sizeof(*vcpu));
-    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
-    vcpu->kvm_fd = cpu->kvm_fd;
-    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
+    kvm_park_vcpu(cpu);
 err:
     return ret;
 }
@@ -397,6 +434,8 @@  static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
         if (cpu->vcpu_id == vcpu_id) {
             int kvm_fd;
 
+            trace_kvm_get_vcpu(vcpu_id);
+
             QLIST_REMOVE(cpu, node);
             kvm_fd = cpu->kvm_fd;
             g_free(cpu);
@@ -404,7 +443,7 @@  static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
         }
     }
 
-    return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
+    return -ENOENT;
 }
 
 int kvm_init_vcpu(CPUState *cpu, Error **errp)
@@ -415,19 +454,14 @@  int kvm_init_vcpu(CPUState *cpu, Error **errp)
 
     trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
 
-    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
+    ret = kvm_create_vcpu(cpu);
     if (ret < 0) {
-        error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
+        error_setg_errno(errp, -ret,
+                         "kvm_init_vcpu: kvm_create_vcpu failed (%lu)",
                          kvm_arch_vcpu_id(cpu));
         goto err;
     }
 
-    cpu->kvm_fd = ret;
-    cpu->kvm_state = s;
-    cpu->vcpu_dirty = true;
-    cpu->dirty_pages = 0;
-    cpu->throttle_us_per_full = 0;
-
     mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
     if (mmap_size < 0) {
         ret = mmap_size;
diff --git a/accel/kvm/kvm-cpus.h b/accel/kvm/kvm-cpus.h
index ca40add32c..ba5bea5bca 100644
--- a/accel/kvm/kvm-cpus.h
+++ b/accel/kvm/kvm-cpus.h
@@ -22,5 +22,19 @@  bool kvm_supports_guest_debug(void);
 int kvm_insert_breakpoint(CPUState *cpu, int type, vaddr addr, vaddr len);
 int kvm_remove_breakpoint(CPUState *cpu, int type, vaddr addr, vaddr len);
 void kvm_remove_all_breakpoints(CPUState *cpu);
+/**
+ * kvm_create_vcpu - Gets a parked KVM vCPU or creates a KVM vCPU
+ * @cpu: QOM CPUState object for which KVM vCPU has to be fetched/created.
+ *
+ * @returns: 0 on success, negative errno on failure.
+ */
+int kvm_create_vcpu(CPUState *cpu);
 
+/**
+ * kvm_park_vcpu - Park QEMU KVM vCPU context
+ * @cpu: QOM CPUState object for which QEMU KVM vCPU context has to be parked.
+ *
+ * @returns: none
+ */
+void kvm_park_vcpu(CPUState *cpu);
 #endif /* KVM_CPUS_H */
diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events
index 681ccb667d..96f4caf18a 100644
--- a/accel/kvm/trace-events
+++ b/accel/kvm/trace-events
@@ -9,6 +9,10 @@  kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p"
 kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s"
 kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s"
 kvm_init_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
+kvm_create_vcpu(int cpu_index, unsigned long arch_cpu_id, int kvm_fd) "index: %d, id: %lu, kvm fd: %d"
+kvm_get_vcpu(unsigned long arch_cpu_id) "id: %lu"
+kvm_destroy_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
+kvm_park_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
 kvm_irqchip_commit_routes(void) ""
 kvm_irqchip_add_msi_route(char *name, int vector, int virq) "dev %s vector %d virq %d"
 kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d"
@@ -25,7 +29,6 @@  kvm_dirty_ring_reaper(const char *s) "%s"
 kvm_dirty_ring_reap(uint64_t count, int64_t t) "reaped %"PRIu64" pages (took %"PRIi64" us)"
 kvm_dirty_ring_reaper_kick(const char *reason) "%s"
 kvm_dirty_ring_flush(int finished) "%d"
-kvm_destroy_vcpu(void) ""
 kvm_failed_get_vcpu_mmap_size(void) ""
 kvm_cpu_exec(void) ""
 kvm_interrupt_exit_request(void) ""