[v3,2/4] nvdimm acpi: introduce fit buffer
diff mbox

Message ID 1477672540-27952-3-git-send-email-guangrong.xiao@linux.intel.com
State New
Headers show

Commit Message

Xiao Guangrong Oct. 28, 2016, 4:35 p.m. UTC
The buffer is used to save the FIT info for all the presented nvdimm
devices which is updated after the nvdimm device is plugged or
unplugged. In the later patch, it will be used to construct NVDIMM
ACPI _FIT method which reflects the presented nvdimm devices after
nvdimm hotplug

As FIT buffer can not completely mapped into guest address space,
OSPM will exit to QEMU multiple times, however, there is the race
condition - FIT may be changed during these multiple exits, so that
some rules are introduced:
1) the user should hold the @lock to access the buffer and
2) mark @dirty whenever the buffer is updated.

@dirty is cleared for the first time OSPM gets fit buffer, if
dirty is detected in the later access, OSPM will restart the
access

As fit should be updated after nvdimm device is successfully realized
so that a new hotplug callback, post_hotplug, is introduced

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 hw/acpi/nvdimm.c        | 59 +++++++++++++++++++++++++++++++++++--------------
 hw/core/hotplug.c       | 11 +++++++++
 hw/core/qdev.c          | 20 +++++++++++++----
 hw/i386/acpi-build.c    |  2 +-
 hw/i386/pc.c            | 19 ++++++++++++++++
 include/hw/hotplug.h    | 10 +++++++++
 include/hw/mem/nvdimm.h | 26 +++++++++++++++++++++-
 7 files changed, 124 insertions(+), 23 deletions(-)

Comments

Stefan Hajnoczi Nov. 1, 2016, 3:17 p.m. UTC | #1
On Sat, Oct 29, 2016 at 12:35:38AM +0800, Xiao Guangrong wrote:
> The buffer is used to save the FIT info for all the presented nvdimm
> devices which is updated after the nvdimm device is plugged or
> unplugged. In the later patch, it will be used to construct NVDIMM
> ACPI _FIT method which reflects the presented nvdimm devices after
> nvdimm hotplug
> 
> As FIT buffer can not completely mapped into guest address space,
> OSPM will exit to QEMU multiple times, however, there is the race
> condition - FIT may be changed during these multiple exits, so that
> some rules are introduced:
> 1) the user should hold the @lock to access the buffer and

I don't understand the purpose of the QEMUMutex lock.  Don't all threads
calling nvdimm/acpi code hold the QEMU global mutex (main loop and vcpu
threads)?

> -static void nvdimm_build_nfit(GSList *device_list, GArray *table_offsets,
> +static void nvdimm_init_fit_buffer(NvdimmFitBuffer *fit_buf)
> +{
> +    qemu_mutex_init(&fit_buf->lock);
> +    fit_buf->fit = g_array_new(false, true /* clear */, 1);

Is it possible to call nvdimm_build_device_structure() here?  That way
we don't duplicate the g_array_new() details.

> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 5783442..d835e62 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -945,10 +945,21 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
>                  goto child_realize_fail;
>              }
>          }
> +
>          if (dev->hotplugged) {
>              device_reset(dev);
>          }
>          dev->pending_deleted_event = false;
> +        dev->realized = value;
> +
> +        if (hotplug_ctrl) {
> +            hotplug_handler_post_plug(hotplug_ctrl, dev, &local_err);
> +        }
> +
> +        if (local_err != NULL) {
> +            dev->realized = value;

dev->realized = value was already set above.

In order to preserve semantics you would need a bool old_value =
dev->realized which you can restore in post_realize_fail.  QEMU current
does not assign dev->realized = value when there is a failure!
Igor Mammedov Nov. 1, 2016, 3:25 p.m. UTC | #2
On Tue, 1 Nov 2016 15:17:01 +0000
Stefan Hajnoczi <stefanha@gmail.com> wrote:

> On Sat, Oct 29, 2016 at 12:35:38AM +0800, Xiao Guangrong wrote:
> > The buffer is used to save the FIT info for all the presented nvdimm
> > devices which is updated after the nvdimm device is plugged or
> > unplugged. In the later patch, it will be used to construct NVDIMM
> > ACPI _FIT method which reflects the presented nvdimm devices after
> > nvdimm hotplug
> > 
> > As FIT buffer can not completely mapped into guest address space,
> > OSPM will exit to QEMU multiple times, however, there is the race
> > condition - FIT may be changed during these multiple exits, so that
> > some rules are introduced:
> > 1) the user should hold the @lock to access the buffer and  
> 
> I don't understand the purpose of the QEMUMutex lock.  Don't all threads
> calling nvdimm/acpi code hold the QEMU global mutex (main loop and vcpu
> threads)?
> 
> > -static void nvdimm_build_nfit(GSList *device_list, GArray *table_offsets,
> > +static void nvdimm_init_fit_buffer(NvdimmFitBuffer *fit_buf)
> > +{
> > +    qemu_mutex_init(&fit_buf->lock);
> > +    fit_buf->fit = g_array_new(false, true /* clear */, 1);  
> 
> Is it possible to call nvdimm_build_device_structure() here?  That way
> we don't duplicate the g_array_new() details.
> 
> > diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> > index 5783442..d835e62 100644
> > --- a/hw/core/qdev.c
> > +++ b/hw/core/qdev.c
> > @@ -945,10 +945,21 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
> >                  goto child_realize_fail;
> >              }
> >          }
> > +
> >          if (dev->hotplugged) {
> >              device_reset(dev);
> >          }
> >          dev->pending_deleted_event = false;
> > +        dev->realized = value;
> > +
> > +        if (hotplug_ctrl) {
> > +            hotplug_handler_post_plug(hotplug_ctrl, dev, &local_err);
> > +        }
> > +
> > +        if (local_err != NULL) {
> > +            dev->realized = value;  
> 
> dev->realized = value was already set above.
> 
> In order to preserve semantics you would need a bool old_value =
> dev->realized which you can restore in post_realize_fail.  QEMU current
> does not assign dev->realized = value when there is a failure!

Stefan,

as agreed on
 "Re: [Qemu-devel] [PULL 00/47] virtio, pc: fixes and features" thread
this series would be merged as is and then as follow up author
will fixup all issues with it.

For this patch it means a patch on top of Michael pull req
to drop lock and drop hotplug_handler_post_plug() code in favor
of existing hotplug_handler_plug().
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xiao Guangrong Nov. 1, 2016, 4 p.m. UTC | #3
On 11/01/2016 11:25 PM, Igor Mammedov wrote:
> On Tue, 1 Nov 2016 15:17:01 +0000
> Stefan Hajnoczi <stefanha@gmail.com> wrote:
>
>> On Sat, Oct 29, 2016 at 12:35:38AM +0800, Xiao Guangrong wrote:
>>> The buffer is used to save the FIT info for all the presented nvdimm
>>> devices which is updated after the nvdimm device is plugged or
>>> unplugged. In the later patch, it will be used to construct NVDIMM
>>> ACPI _FIT method which reflects the presented nvdimm devices after
>>> nvdimm hotplug
>>>
>>> As FIT buffer can not completely mapped into guest address space,
>>> OSPM will exit to QEMU multiple times, however, there is the race
>>> condition - FIT may be changed during these multiple exits, so that
>>> some rules are introduced:
>>> 1) the user should hold the @lock to access the buffer and
>>
>> I don't understand the purpose of the QEMUMutex lock.  Don't all threads
>> calling nvdimm/acpi code hold the QEMU global mutex (main loop and vcpu
>> threads)?
>>
>>> -static void nvdimm_build_nfit(GSList *device_list, GArray *table_offsets,
>>> +static void nvdimm_init_fit_buffer(NvdimmFitBuffer *fit_buf)
>>> +{
>>> +    qemu_mutex_init(&fit_buf->lock);
>>> +    fit_buf->fit = g_array_new(false, true /* clear */, 1);
>>
>> Is it possible to call nvdimm_build_device_structure() here?  That way
>> we don't duplicate the g_array_new() details.
>>
>>> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>>> index 5783442..d835e62 100644
>>> --- a/hw/core/qdev.c
>>> +++ b/hw/core/qdev.c
>>> @@ -945,10 +945,21 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
>>>                  goto child_realize_fail;
>>>              }
>>>          }
>>> +
>>>          if (dev->hotplugged) {
>>>              device_reset(dev);
>>>          }
>>>          dev->pending_deleted_event = false;
>>> +        dev->realized = value;
>>> +
>>> +        if (hotplug_ctrl) {
>>> +            hotplug_handler_post_plug(hotplug_ctrl, dev, &local_err);
>>> +        }
>>> +
>>> +        if (local_err != NULL) {
>>> +            dev->realized = value;
>>
>> dev->realized = value was already set above.
>>
>> In order to preserve semantics you would need a bool old_value =
>> dev->realized which you can restore in post_realize_fail.  QEMU current
>> does not assign dev->realized = value when there is a failure!
>
> Stefan,
>
> as agreed on
>  "Re: [Qemu-devel] [PULL 00/47] virtio, pc: fixes and features" thread
> this series would be merged as is and then as follow up author
> will fixup all issues with it.
>
> For this patch it means a patch on top of Michael pull req
> to drop lock and drop hotplug_handler_post_plug() code in favor
> of existing hotplug_handler_plug().
>

Thank you, Igor, Michael and Stefan. I will do it asap after
this patchset is merged. :)


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch
diff mbox

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index b8a2e62..5f728a6 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -348,8 +348,9 @@  static void nvdimm_build_structure_dcr(GArray *structures, DeviceState *dev)
                                          (DSM) in DSM Spec Rev1.*/);
 }
 
-static GArray *nvdimm_build_device_structure(GSList *device_list)
+static GArray *nvdimm_build_device_structure(void)
 {
+    GSList *device_list = nvdimm_get_plugged_device_list();
     GArray *structures = g_array_new(false, true /* clear */, 1);
 
     for (; device_list; device_list = device_list->next) {
@@ -367,28 +368,58 @@  static GArray *nvdimm_build_device_structure(GSList *device_list)
         /* build NVDIMM Control Region Structure. */
         nvdimm_build_structure_dcr(structures, dev);
     }
+    g_slist_free(device_list);
 
     return structures;
 }
 
-static void nvdimm_build_nfit(GSList *device_list, GArray *table_offsets,
+static void nvdimm_init_fit_buffer(NvdimmFitBuffer *fit_buf)
+{
+    qemu_mutex_init(&fit_buf->lock);
+    fit_buf->fit = g_array_new(false, true /* clear */, 1);
+}
+
+static void nvdimm_build_fit_buffer(NvdimmFitBuffer *fit_buf)
+{
+    qemu_mutex_lock(&fit_buf->lock);
+    g_array_free(fit_buf->fit, true);
+    fit_buf->fit = nvdimm_build_device_structure();
+    fit_buf->dirty = true;
+    qemu_mutex_unlock(&fit_buf->lock);
+}
+
+void nvdimm_acpi_hotplug(AcpiNVDIMMState *state)
+{
+    nvdimm_build_fit_buffer(&state->fit_buf);
+}
+
+static void nvdimm_build_nfit(AcpiNVDIMMState *state, GArray *table_offsets,
                               GArray *table_data, BIOSLinker *linker)
 {
-    GArray *structures = nvdimm_build_device_structure(device_list);
+    NvdimmFitBuffer *fit_buf = &state->fit_buf;
     unsigned int header;
 
+    qemu_mutex_lock(&fit_buf->lock);
+
+    /* NVDIMM device is not plugged? */
+    if (!fit_buf->fit->len) {
+        goto exit;
+    }
+
     acpi_add_table(table_offsets, table_data);
 
     /* NFIT header. */
     header = table_data->len;
     acpi_data_push(table_data, sizeof(NvdimmNfitHeader));
     /* NVDIMM device structures. */
-    g_array_append_vals(table_data, structures->data, structures->len);
+    g_array_append_vals(table_data, fit_buf->fit->data, fit_buf->fit->len);
 
     build_header(linker, table_data,
                  (void *)(table_data->data + header), "NFIT",
-                 sizeof(NvdimmNfitHeader) + structures->len, 1, NULL, NULL);
-    g_array_free(structures, true);
+                 sizeof(NvdimmNfitHeader) + fit_buf->fit->len, 1, NULL, NULL);
+
+exit:
+    qemu_mutex_unlock(&fit_buf->lock);
 }
 
 struct NvdimmDsmIn {
@@ -771,6 +802,8 @@  void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
     acpi_data_push(state->dsm_mem, sizeof(NvdimmDsmIn));
     fw_cfg_add_file(fw_cfg, NVDIMM_DSM_MEM_FILE, state->dsm_mem->data,
                     state->dsm_mem->len);
+
+    nvdimm_init_fit_buffer(&state->fit_buf);
 }
 
 #define NVDIMM_COMMON_DSM       "NCAL"
@@ -1045,25 +1078,17 @@  static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
 }
 
 void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
-                       BIOSLinker *linker, GArray *dsm_dma_arrea,
+                       BIOSLinker *linker, AcpiNVDIMMState *state,
                        uint32_t ram_slots)
 {
-    GSList *device_list;
-
-    device_list = nvdimm_get_plugged_device_list();
-
-    /* NVDIMM device is plugged. */
-    if (device_list) {
-        nvdimm_build_nfit(device_list, table_offsets, table_data, linker);
-        g_slist_free(device_list);
-    }
+    nvdimm_build_nfit(state, table_offsets, table_data, linker);
 
     /*
      * NVDIMM device is allowed to be plugged only if there is available
      * slot.
      */
     if (ram_slots) {
-        nvdimm_build_ssdt(table_offsets, table_data, linker, dsm_dma_arrea,
+        nvdimm_build_ssdt(table_offsets, table_data, linker, state->dsm_mem,
                           ram_slots);
     }
 }
diff --git a/hw/core/hotplug.c b/hw/core/hotplug.c
index 17ac986..ab34c19 100644
--- a/hw/core/hotplug.c
+++ b/hw/core/hotplug.c
@@ -35,6 +35,17 @@  void hotplug_handler_plug(HotplugHandler *plug_handler,
     }
 }
 
+void hotplug_handler_post_plug(HotplugHandler *plug_handler,
+                               DeviceState *plugged_dev,
+                               Error **errp)
+{
+    HotplugHandlerClass *hdc = HOTPLUG_HANDLER_GET_CLASS(plug_handler);
+
+    if (hdc->post_plug) {
+        hdc->post_plug(plug_handler, plugged_dev, errp);
+    }
+}
+
 void hotplug_handler_unplug_request(HotplugHandler *plug_handler,
                                     DeviceState *plugged_dev,
                                     Error **errp)
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 5783442..d835e62 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -945,10 +945,21 @@  static void device_set_realized(Object *obj, bool value, Error **errp)
                 goto child_realize_fail;
             }
         }
+
         if (dev->hotplugged) {
             device_reset(dev);
         }
         dev->pending_deleted_event = false;
+        dev->realized = value;
+
+        if (hotplug_ctrl) {
+            hotplug_handler_post_plug(hotplug_ctrl, dev, &local_err);
+        }
+
+        if (local_err != NULL) {
+            dev->realized = value;
+            goto post_realize_fail;
+        }
     } else if (!value && dev->realized) {
         Error **local_errp = NULL;
         QLIST_FOREACH(bus, &dev->child_bus, sibling) {
@@ -965,13 +976,14 @@  static void device_set_realized(Object *obj, bool value, Error **errp)
         }
         dev->pending_deleted_event = true;
         DEVICE_LISTENER_CALL(unrealize, Reverse, dev);
-    }
 
-    if (local_err != NULL) {
-        goto fail;
+        if (local_err != NULL) {
+            goto fail;
+        }
+
+        dev->realized = value;
     }
 
-    dev->realized = value;
     return;
 
 child_realize_fail:
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 6ae4769..bc49958 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2767,7 +2767,7 @@  void acpi_build(AcpiBuildTables *tables, MachineState *machine)
     }
     if (pcms->acpi_nvdimm_state.is_enabled) {
         nvdimm_build_acpi(table_offsets, tables_blob, tables->linker,
-                          pcms->acpi_nvdimm_state.dsm_mem, machine->ram_slots);
+                          &pcms->acpi_nvdimm_state, machine->ram_slots);
     }
 
     /* Add tables supplied by user (if any) */
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 93ff49c..3be6304 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1706,6 +1706,16 @@  out:
     error_propagate(errp, local_err);
 }
 
+static void pc_dimm_post_plug(HotplugHandler *hotplug_dev,
+                              DeviceState *dev, Error **errp)
+{
+    PCMachineState *pcms = PC_MACHINE(hotplug_dev);
+
+    if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {
+        nvdimm_acpi_hotplug(&pcms->acpi_nvdimm_state);
+    }
+}
+
 static void pc_dimm_unplug_request(HotplugHandler *hotplug_dev,
                                    DeviceState *dev, Error **errp)
 {
@@ -1966,6 +1976,14 @@  static void pc_machine_device_plug_cb(HotplugHandler *hotplug_dev,
     }
 }
 
+static void pc_machine_device_post_plug_cb(HotplugHandler *hotplug_dev,
+                                           DeviceState *dev, Error **errp)
+{
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+        pc_dimm_post_plug(hotplug_dev, dev, errp);
+    }
+}
+
 static void pc_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
                                                 DeviceState *dev, Error **errp)
 {
@@ -2300,6 +2318,7 @@  static void pc_machine_class_init(ObjectClass *oc, void *data)
     mc->reset = pc_machine_reset;
     hc->pre_plug = pc_machine_device_pre_plug_cb;
     hc->plug = pc_machine_device_plug_cb;
+    hc->post_plug = pc_machine_device_post_plug_cb;
     hc->unplug_request = pc_machine_device_unplug_request_cb;
     hc->unplug = pc_machine_device_unplug_cb;
     nc->nmi_monitor_handler = x86_nmi;
diff --git a/include/hw/hotplug.h b/include/hw/hotplug.h
index c0db869..10ca5b6 100644
--- a/include/hw/hotplug.h
+++ b/include/hw/hotplug.h
@@ -47,6 +47,7 @@  typedef void (*hotplug_fn)(HotplugHandler *plug_handler,
  * @parent: Opaque parent interface.
  * @pre_plug: pre plug callback called at start of device.realize(true)
  * @plug: plug callback called at end of device.realize(true).
+ * @post_pug: post plug callback called after device is successfully plugged.
  * @unplug_request: unplug request callback.
  *                  Used as a means to initiate device unplug for devices that
  *                  require asynchronous unplug handling.
@@ -61,6 +62,7 @@  typedef struct HotplugHandlerClass {
     /* <public> */
     hotplug_fn pre_plug;
     hotplug_fn plug;
+    hotplug_fn post_plug;
     hotplug_fn unplug_request;
     hotplug_fn unplug;
 } HotplugHandlerClass;
@@ -83,6 +85,14 @@  void hotplug_handler_pre_plug(HotplugHandler *plug_handler,
                               DeviceState *plugged_dev,
                               Error **errp);
 
+/**
+ * hotplug_handler_post_plug:
+ *
+ * Call #HotplugHandlerClass.post_plug callback of @plug_handler.
+ */
+void hotplug_handler_post_plug(HotplugHandler *plug_handler,
+                               DeviceState *plugged_dev,
+                               Error **errp);
 
 /**
  * hotplug_handler_unplug_request:
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index 63a2b20..33cd421 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -98,12 +98,35 @@  typedef struct NVDIMMClass NVDIMMClass;
 #define NVDIMM_ACPI_IO_BASE     0x0a18
 #define NVDIMM_ACPI_IO_LEN      4
 
+/*
+ * The buffer, @fit, saves the FIT info for all the presented NVDIMM
+ * devices which is updated after the NVDIMM device is plugged or
+ * unplugged.
+ *
+ * Rules to use the buffer:
+ *    1) the user should hold the @lock to access the buffer.
+ *    2) mark @dirty whenever the buffer is updated.
+ *
+ * These rules preserve NVDIMM ACPI _FIT method to read incomplete
+ * or obsolete fit info if fit update happens during multiple RFIT
+ * calls.
+ */
+struct NvdimmFitBuffer {
+    QemuMutex lock;
+    GArray *fit;
+    bool dirty;
+};
+typedef struct NvdimmFitBuffer NvdimmFitBuffer;
+
 struct AcpiNVDIMMState {
     /* detect if NVDIMM support is enabled. */
     bool is_enabled;
 
     /* the data of the fw_cfg file NVDIMM_DSM_MEM_FILE. */
     GArray *dsm_mem;
+
+    NvdimmFitBuffer fit_buf;
+
     /* the IO region used by OSPM to transfer control to QEMU. */
     MemoryRegion io_mr;
 };
@@ -112,6 +135,7 @@  typedef struct AcpiNVDIMMState AcpiNVDIMMState;
 void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
                             FWCfgState *fw_cfg, Object *owner);
 void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
-                       BIOSLinker *linker, GArray *dsm_dma_arrea,
+                       BIOSLinker *linker, AcpiNVDIMMState *state,
                        uint32_t ram_slots);
+void nvdimm_acpi_hotplug(AcpiNVDIMMState *state);
 #endif