
[v3,3/4] nvdimm acpi: introduce _FIT

Message ID 1477672540-27952-4-git-send-email-guangrong.xiao@linux.intel.com
State New, archived

Commit Message

Xiao Guangrong Oct. 28, 2016, 4:35 p.m. UTC
_FIT is required for hotplug support: the guest queries it for the updated
device info when a hotplug event is received.

Because the FIT buffer is not completely mapped into the guest address
space, QEMU reserves a new function, Read FIT (UUID
648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, handle 0x10000, function index 0x1),
to read the FIT buffer piece by piece. The pieces are concatenated before
_FIT returns.

Refer to docs/specs/acpi_nvdimm.txt for the detailed design.

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
---
 docs/specs/acpi_nvdimm.txt |  58 ++++++++++++-
 hw/acpi/nvdimm.c           | 204 ++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 257 insertions(+), 5 deletions(-)

Comments

Igor Mammedov Nov. 1, 2016, 4:24 p.m. UTC | #1
On Sat, 29 Oct 2016 00:35:39 +0800
Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:

> _FIT is required for hotplug support, guest will inquire the updated
> device info from it if a hotplug event is received
> 
> As FIT buffer is not completely mapped into guest address space, so a
> new function, Read FIT whose UUID is UUID
> 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, handle 0x10000, function index
> is 0x1, is reserved by QEMU to read the piece of FIT buffer. The buffer
> is concatenated before _FIT return
> 
> Refer to docs/specs/acpi-nvdimm.txt for detailed design
> 
> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> ---
>  docs/specs/acpi_nvdimm.txt |  58 ++++++++++++-
>  hw/acpi/nvdimm.c           | 204 ++++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 257 insertions(+), 5 deletions(-)
> 
> diff --git a/docs/specs/acpi_nvdimm.txt b/docs/specs/acpi_nvdimm.txt
> index 0fdd251..4aa5e3d 100644
> --- a/docs/specs/acpi_nvdimm.txt
> +++ b/docs/specs/acpi_nvdimm.txt
> @@ -127,6 +127,58 @@ _DSM process diagram:
>   | result from the page     |      |              |
>   +--------------------------+      +--------------+
>  
> - _FIT implementation
> - -------------------
> - TODO (will fill it when nvdimm hotplug is introduced)
> +Device Handle Reservation
> +-------------------------
> +As we mentioned above, byte 0 ~ byte 3 in the DSM memory save NVDIMM device
> +handle. The handle is completely QEMU internal thing, the values in range
> +[0, 0xFFFF] indicate nvdimm device (O means nvdimm root device named NVDR),
> +other values are reserved by other purpose.
> +
> +Current reserved handle:
> +0x10000 is reserved for QEMU internal DSM function called on the root
> +device.
The part above should go in the section where 'handle' is defined, i.e. earlier in the file:

   ACPI writes _DSM Input Data (based on the offset in the page):
   [0x0 - 0x3]: 4 bytes, NVDIMM Device Handle, 0 is reserved for NVDIMM
                Root device.


> +QEMU internal use only _DSM function
> +------------------------------------
> +UUID, 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, is reserved for QEMU internal
> +DSM function.
> +
> +There is the function introduced by QEMU and only used by QEMU internal.
> +
> +1) Read FIT
UUID 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62 is reserved for Read_FIT DSM function
(private QEMU function)

> +   As we only reserved one page for NVDIMM ACPI it is impossible to map the
> +   whole FIT data to guest's address space. This function is used by _FIT
> +   method to read a piece of FIT data from QEMU.
 The _FIT method uses the Read_FIT function to fetch the NFIT structures blob from QEMU
 in 1-page increments, which are then concatenated and returned as the _FIT method result.
 

> +
> +   Input parameters:
> +   Arg0 – UUID {set to 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62}
> +   Arg1 – Revision ID (set to 1)
> +   Arg2 - Function Index, 0x1
> +   Arg3 - A package containing a buffer whose layout is as follows:
> +
> +   +----------+-------------+-------------+-----------------------------------+
> +   |  Filed   | Byte Length | Byte Offset | Description                       |
         ^ field,   s/Byte//,    s/Byte//
> +   +----------+-------------+-------------+-----------------------------------+
> +   | offset   |     4       |    0        | the offset of FIT buffer          |
offset in QEMU's NFIT structures blob to read from

> +   +----------+-------------+-------------+-----------------------------------+
> +
> +   Output:
> +   +----------+-------------+-------------+-----------------------------------+
> +   |  Filed   | Byte Length | Byte Offset | Description                       |
> +   +----------+-------------+-------------+-----------------------------------+
> +   |          |             |             | return status codes               |
> +   |          |             |             |   0x100 indicates fit has been    |
> +   | status   |     4       |    0        |   updated                         |
0x100 - error: the NFIT was updated before the read by _FIT had completed

> +   |          |             |             | other follows Chapter 3 in DSM    |
s/other follows/other codes follow/

> +   |          |             |             | Spec Rev1                         |
> +   +----------+-------------+-------------+-----------------------------------+
> +   | fit data |  Varies     |    4        | FIT data                          |
> +   |          |             |             |                                   |
> +   +----------+-------------+-------------+-----------------------------------+
What does "Varies" mean? Reading this, how would I know how much data Read_FIT should read
from the shared page?
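
One possible reading, based on the RFIT helper in this patch (it takes the size of the buffer returned by _DSM and subtracts the 4-byte status): the FIT data length is implied by the buffer size rather than stored in a field. A minimal C sketch of that interpretation follows; `ReadFitView`, `load_le32` and `read_fit_parse` are hypothetical names for illustration, not QEMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one Read_FIT result; not a QEMU type. */
typedef struct {
    uint32_t status;     /* 0 = success, 0x100 = FIT changed during read */
    const uint8_t *fit;  /* FIT data, starting at byte offset 4 */
    size_t fit_len;      /* implied length: buffer size minus 4 */
} ReadFitView;

/* Decode a little-endian 32-bit value independent of host endianness. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Split a returned buffer into status and FIT data.
 * Returns 0 on success, -1 if the buffer cannot hold the status field.
 */
static int read_fit_parse(const uint8_t *buf, size_t buf_len, ReadFitView *out)
{
    assert(buf != NULL && out != NULL);
    if (buf_len < 4) {
        return -1;
    }
    out->status = load_le32(buf);
    out->fit = buf + 4;
    out->fit_len = buf_len - 4;
    return 0;
}
```

Under this reading a result of exactly 4 bytes means "no more data"; an explicit length field in the output table would make that visible to the reader of the spec.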

> +
> +   The FIT offset is maintained by the caller itself,
This sentence is probably unnecessary; otherwise, name the caller (for example, OSPM).

> current offset plugs
                 ^^^^?

> +   the length returned by the function is the next offset we should read.
> +   When all the FIT data has been read out, zero length is returned.
> +
> +   If it returns 0x100, OSPM should restart to read FIT (read from offset 0
> +   again).
[...]
That's all for the doc part; I'll do the code part later.
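
For illustration, the read protocol described in the doc text (advance the offset by the returned length, stop on zero length, restart from offset 0 on status 0x100) could be sketched in C as below. This is not the AML from the patch: the callback type, `fit_read_all`, the `DEMO_` constants and the simulated backend `demo_read_fit` are all hypothetical, and treating any other non-zero status as a hard error is a choice made only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_FIT_SUCCESS  0u              /* hypothetical names */
#define DEMO_FIT_CHANGED  0x100u
#define DEMO_CHUNK_MAX    (4096u - 8u)    /* assumed per-call payload limit */

/* Hypothetical backend: fills dst with up to cap bytes, returns a status. */
typedef uint32_t (*read_fit_fn)(void *opaque, uint32_t offset,
                                uint8_t *dst, uint32_t cap, uint32_t *len);

/* Returns the number of bytes assembled in out, or -1 on error. */
static long fit_read_all(read_fit_fn read_fit, void *opaque,
                         uint8_t *out, size_t out_cap)
{
    uint8_t chunk[DEMO_CHUNK_MAX];
    uint32_t offset = 0;

    for (;;) {
        uint32_t len = 0;
        uint32_t status = read_fit(opaque, offset, chunk, sizeof(chunk), &len);

        if (status == DEMO_FIT_CHANGED) {
            offset = 0;             /* FIT updated mid-read: start over */
            continue;
        }
        if (status != DEMO_FIT_SUCCESS) {
            return -1;
        }
        if (len == 0) {
            return (long)offset;    /* zero length: all data was read */
        }
        if (len > sizeof(chunk) || len > out_cap - offset) {
            return -1;              /* would overflow a buffer */
        }
        memcpy(out + offset, chunk, len);
        offset += len;              /* next offset = current offset plus length */
    }
}

/* Simulated QEMU side, for demonstration only. */
typedef struct {
    const uint8_t *data;
    uint32_t size;
    int dirty_on_call;  /* report "changed" on this call number; 0 = never */
    int calls;
} DemoFit;

static uint32_t demo_read_fit(void *opaque, uint32_t offset,
                              uint8_t *dst, uint32_t cap, uint32_t *len)
{
    DemoFit *f = opaque;

    f->calls++;
    if (offset > f->size) {
        return 3;                   /* Invalid Input Parameters */
    }
    if (offset != 0 && f->calls == f->dirty_on_call) {
        return DEMO_FIT_CHANGED;
    }
    *len = f->size - offset < cap ? f->size - offset : cap;
    memcpy(dst, f->data + offset, *len);
    return DEMO_FIT_SUCCESS;
}
```

A caller would pass `demo_read_fit` and a `DemoFit` to `fit_read_all`; a restart triggered by 0x100 simply discards what was assembled so far, which is why the doc text asks OSPM to read from offset 0 again.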
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Hajnoczi Nov. 1, 2016, 4:41 p.m. UTC | #2
On Sat, Oct 29, 2016 at 12:35:39AM +0800, Xiao Guangrong wrote:
> +1) Read FIT
> +   As we only reserved one page for NVDIMM ACPI it is impossible to map the
> +   whole FIT data to guest's address space. This function is used by _FIT
> +   method to read a piece of FIT data from QEMU.
> +
> +   Input parameters:
> +   Arg0 – UUID {set to 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62}
> +   Arg1 – Revision ID (set to 1)
> +   Arg2 - Function Index, 0x1
> +   Arg3 - A package containing a buffer whose layout is as follows:
> +
> +   +----------+-------------+-------------+-----------------------------------+
> +   |  Filed   | Byte Length | Byte Offset | Description                       |

s/Filed/Field/

The same applies below too.

> +   +----------+-------------+-------------+-----------------------------------+
> +   | offset   |     4       |    0        | the offset of FIT buffer          |
> +   +----------+-------------+-------------+-----------------------------------+

s/offset of FIT buffer/offset into FIT buffer/

> +
> +   Output:
> +   +----------+-------------+-------------+-----------------------------------+
> +   |  Filed   | Byte Length | Byte Offset | Description                       |
> +   +----------+-------------+-------------+-----------------------------------+
> +   |          |             |             | return status codes               |
> +   |          |             |             |   0x100 indicates fit has been    |
> +   | status   |     4       |    0        |   updated                         |
> +   |          |             |             | other follows Chapter 3 in DSM    |
> +   |          |             |             | Spec Rev1                         |
> +   +----------+-------------+-------------+-----------------------------------+
> +   | fit data |  Varies     |    4        | FIT data                          |
> +   |          |             |             |                                   |
> +   +----------+-------------+-------------+-----------------------------------+
> +
> +   The FIT offset is maintained by the caller itself, current offset plugs

s/plugs/plus/

> +struct NvdimmFuncReadFITIn {
> +    uint32_t offset; /* the offset of FIT buffer. */

s/offset of FIT buffer/offset into FIT buffer/
Igor Mammedov Nov. 2, 2016, 1:56 p.m. UTC | #3
On Sat, 29 Oct 2016 00:35:39 +0800
Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:

> _FIT is required for hotplug support, guest will inquire the updated
> device info from it if a hotplug event is received
> 
> As FIT buffer is not completely mapped into guest address space, so a
> new function, Read FIT whose UUID is UUID
> 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, handle 0x10000, function index
> is 0x1, is reserved by QEMU to read the piece of FIT buffer. The buffer
> is concatenated before _FIT return
> 
> Refer to docs/specs/acpi-nvdimm.txt for detailed design
> 
> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> ---
[...]

> diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> index 5f728a6..fc1a012 100644
> --- a/hw/acpi/nvdimm.c
> +++ b/hw/acpi/nvdimm.c
> @@ -496,6 +496,22 @@ typedef struct NvdimmFuncSetLabelDataIn NvdimmFuncSetLabelDataIn;
>  QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncSetLabelDataIn) +
>                    offsetof(NvdimmDsmIn, arg3) > 4096);
>  
> +struct NvdimmFuncReadFITIn {
> +    uint32_t offset; /* the offset of FIT buffer. */
> +} QEMU_PACKED;
> +typedef struct NvdimmFuncReadFITIn NvdimmFuncReadFITIn;
> +QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncReadFITIn) +
> +                  offsetof(NvdimmDsmIn, arg3) > 4096);
> +
> +struct NvdimmFuncReadFITOut {
> +    /* the size of buffer filled by QEMU. */
> +    uint32_t len;
> +    uint32_t func_ret_status; /* return status code. */
> +    uint8_t fit[0]; /* the FIT data. */
> +} QEMU_PACKED;
> +typedef struct NvdimmFuncReadFITOut NvdimmFuncReadFITOut;
> +QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncReadFITOut) > 4096);
> +
>  static void
>  nvdimm_dsm_function0(uint32_t supported_func, hwaddr dsm_mem_addr)
>  {
> @@ -516,6 +532,74 @@ nvdimm_dsm_no_payload(uint32_t func_ret_status, hwaddr dsm_mem_addr)
>      cpu_physical_memory_write(dsm_mem_addr, &out, sizeof(out));
>  }
>  
> +#define NVDIMM_QEMU_RSVD_HANDLE_ROOT 0x10000
> +
> +/* Read FIT data, defined in docs/specs/acpi_nvdimm.txt. */
> +static void nvdimm_dsm_func_read_fit(AcpiNVDIMMState *state, NvdimmDsmIn *in,
> +                                     hwaddr dsm_mem_addr)
> +{
> +    NvdimmFitBuffer *fit_buf = &state->fit_buf;
> +    NvdimmFuncReadFITIn *read_fit;
> +    NvdimmFuncReadFITOut *read_fit_out;
> +    GArray *fit;
> +    uint32_t read_len = 0, func_ret_status;
> +    int size;
> +
> +    read_fit = (NvdimmFuncReadFITIn *)in->arg3;
> +    le32_to_cpus(&read_fit->offset);
I'd prefer that you not do the conversion in place; just do
offset = le32_to_cpu(read_fit->offset);
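
A minimal stand-alone sketch of the difference, assuming nothing about QEMU internals: `demo_le32_to_cpu` is a hypothetical portable stand-in for le32_to_cpu(), and `DemoReadFitIn` only mirrors the layout of NvdimmFuncReadFITIn from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the patch's input struct; the field holds little-endian bytes. */
struct DemoReadFitIn {
    uint32_t offset;
};

/* Hypothetical stand-in for le32_to_cpu(): works on any host endianness. */
static uint32_t demo_le32_to_cpu(uint32_t raw)
{
    uint8_t b[4];

    memcpy(b, &raw, sizeof(b));
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* The const pointer documents that the input is never rewritten. */
static uint32_t demo_get_offset(const struct DemoReadFitIn *in)
{
    return demo_le32_to_cpu(in->offset);
}
```

With a local copy, the function can be called any number of times on the same input and still return the same value, whereas an in-place swap on a big-endian host changes the stored bytes on every call.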

> +
> +    qemu_mutex_lock(&fit_buf->lock);
> +    fit = fit_buf->fit;
> +
> +    nvdimm_debug("Read FIT: offset %#x FIT size %#x Dirty %s.\n",
> +                 read_fit->offset, fit->len, fit_buf->dirty ? "Yes" : "No");
As a follow-up patch, replace nvdimm_debug() with trace events.

> +
> +    if (read_fit->offset > fit->len) {
> +        func_ret_status = 3 /* Invalid Input Parameters */;
This should be a macro instead of a magic value.

> +        goto exit;
> +    }
> +
> +    /* It is the first time to read FIT. */
> +    if (!read_fit->offset) {
> +        fit_buf->dirty = false;
> +    } else if (fit_buf->dirty) { /* FIT has been changed during RFIT. */
> +        func_ret_status = 0x100 /* fit changed */;
This should be a macro instead of a magic value.

> +        goto exit;
> +    }
> +
> +    func_ret_status = 0 /* Success */;
> +    read_len = MIN(fit->len - read_fit->offset,
> +                   4096 - sizeof(NvdimmFuncReadFITOut));
> +
> +exit:
> +    size = sizeof(NvdimmFuncReadFITOut) + read_len;
> +    read_fit_out = g_malloc(size);
> +
> +    read_fit_out->len = cpu_to_le32(size);
> +    read_fit_out->func_ret_status = cpu_to_le32(func_ret_status);
> +    memcpy(read_fit_out->fit, fit->data + read_fit->offset, read_len);
> +
> +    cpu_physical_memory_write(dsm_mem_addr, read_fit_out, size);
> +
> +    g_free(read_fit_out);
> +    qemu_mutex_unlock(&fit_buf->lock);
> +}
> +
> +static void nvdimm_dsm_reserved_root(AcpiNVDIMMState *state, NvdimmDsmIn *in,
> +                                     hwaddr dsm_mem_addr)
> +{
> +    switch (in->function) {
> +    case 0x0:
> +        nvdimm_dsm_function0(0x1 | 1 << 1 /* Read FIT */, dsm_mem_addr);
> +        return;
> +    case 0x1 /*Read FIT */:
> +        nvdimm_dsm_func_read_fit(state, in, dsm_mem_addr);
> +        return;
> +    }
> +
> +    nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr);
This should be a macro instead of a magic value.

> +}
> +
>  static void nvdimm_dsm_root(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
>  {
>      /*
> @@ -742,6 +826,7 @@ nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
>  static void
>  nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
>  {
> +    AcpiNVDIMMState *state = opaque;
>      NvdimmDsmIn *in;
>      hwaddr dsm_mem_addr = val;
>  
> @@ -769,6 +854,11 @@ nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
>          goto exit;
>      }
>  
> +    if (in->handle == NVDIMM_QEMU_RSVD_HANDLE_ROOT) {
> +        nvdimm_dsm_reserved_root(state, in, dsm_mem_addr);
> +        goto exit;
> +    }
> +
>       /* Handle 0 is reserved for NVDIMM Root Device. */
>      if (!in->handle) {
>          nvdimm_dsm_root(in, dsm_mem_addr);
> @@ -821,9 +911,13 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
>  #define NVDIMM_DSM_OUT_BUF_SIZE "RLEN"
>  #define NVDIMM_DSM_OUT_BUF      "ODAT"
>  
> +#define NVDIMM_DSM_RFIT_STATUS  "RSTA"
> +
> +#define NVDIMM_QEMU_RSVD_UUID   "648B9CF2-CDA1-4312-8AD9-49C4AF32BD62"
> +
>  static void nvdimm_build_common_dsm(Aml *dev)
>  {
> -    Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem;
> +    Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem, *elsectx2;
>      Aml *elsectx, *unsupport, *unpatched, *expected_uuid, *uuid_invalid;
>      Aml *pckg, *pckg_index, *pckg_buf, *field, *dsm_out_buf, *dsm_out_buf_size;
>      uint8_t byte_list[1];
> @@ -912,9 +1006,15 @@ static void nvdimm_build_common_dsm(Aml *dev)
>                 /* UUID for NVDIMM Root Device */, expected_uuid));
>      aml_append(method, ifctx);
>      elsectx = aml_else();
> -    aml_append(elsectx, aml_store(
> +    ifctx = aml_if(aml_equal(handle, aml_int(NVDIMM_QEMU_RSVD_HANDLE_ROOT)));
> +    aml_append(ifctx, aml_store(aml_touuid(NVDIMM_QEMU_RSVD_UUID
> +               /* UUID for QEMU internal use */), expected_uuid));
> +    aml_append(elsectx, ifctx);
> +    elsectx2 = aml_else();
> +    aml_append(elsectx2, aml_store(
>                 aml_touuid("4309AC30-0D11-11E4-9191-0800200C9A66")
>                 /* UUID for NVDIMM Devices */, expected_uuid));
> +    aml_append(elsectx, elsectx2);
>      aml_append(method, elsectx);
>  
>      uuid_invalid = aml_lnot(aml_equal(uuid, expected_uuid));
> @@ -994,6 +1094,105 @@ static void nvdimm_build_device_dsm(Aml *dev, uint32_t handle)
>      aml_append(dev, method);
>  }
>  
> +static void nvdimm_build_fit(Aml *dev)
nvdimm_build_fit_method()

> +{
> +    Aml *method, *pkg, *buf, *buf_size, *offset, *call_result;
> +    Aml *whilectx, *ifcond, *ifctx, *elsectx, *fit;
> +
> +    buf = aml_local(0);
> +    buf_size = aml_local(1);
> +    fit = aml_local(2);
> +
> +    aml_append(dev, aml_create_dword_field(aml_buffer(4, NULL),
> +               aml_int(0), NVDIMM_DSM_RFIT_STATUS));
It doesn't have to be a buffer: it's an internal ASL integer object,
so it could just be a named variable.

I'd also move it into the _FIT method instead of making it device-global.

If it works, try passing it as an argument to RFIT
(RefOf/DerefOf may help here),
or make the status the return value instead of the buffer and pass the buffer by reference.

Alternatively, you can return a buffer from RFIT with the status field included
and check/discard the status value there.

> +
> +    /* build helper function, RFIT. */
> +    method = aml_method("RFIT", 1, AML_SERIALIZED);
> +    aml_append(method, aml_create_dword_field(aml_buffer(4, NULL),
> +                                              aml_int(0), "OFST"));
> +
> +    /* prepare input package. */
> +    pkg = aml_package(1);
> +    aml_append(method, aml_store(aml_arg(0), aml_name("OFST")));
> +    aml_append(pkg, aml_name("OFST"));
> +
> +    /* call Read_FIT function. */
> +    call_result = aml_call5(NVDIMM_COMMON_DSM,
> +                            aml_touuid(NVDIMM_QEMU_RSVD_UUID),
> +                            aml_int(1) /* Revision 1 */,
> +                            aml_int(0x1) /* Read FIT */,
> +                            pkg, aml_int(NVDIMM_QEMU_RSVD_HANDLE_ROOT));
> +    aml_append(method, aml_store(call_result, buf));
> +
> +    /* handle _DSM result. */
> +    aml_append(method, aml_create_dword_field(buf,
> +               aml_int(0) /* offset at byte 0 */, "STAU"));
> +
> +    aml_append(method, aml_store(aml_name("STAU"),
> +                                 aml_name(NVDIMM_DSM_RFIT_STATUS)));
> +
> +     /* if something is wrong during _DSM. */
> +    ifcond = aml_equal(aml_int(0 /* Success */), aml_name("STAU"));
> +    ifctx = aml_if(aml_lnot(ifcond));
> +    aml_append(ifctx, aml_return(aml_buffer(0, NULL)));
> +    aml_append(method, ifctx);
> +
> +    aml_append(method, aml_store(aml_sizeof(buf), buf_size));
> +    aml_append(method, aml_subtract(buf_size,
> +                                    aml_int(4) /* the size of "STAU" */,
> +                                    buf_size));
> +
> +    /* if we read the end of fit. */
> +    ifctx = aml_if(aml_equal(buf_size, aml_int(0)));
> +    aml_append(ifctx, aml_return(aml_buffer(0, NULL)));
> +    aml_append(method, ifctx);
> +
> +    aml_append(method, aml_store(aml_shiftleft(buf_size, aml_int(3)),
> +                                 buf_size));
There is no need to convert bytes to bits and store the result in the same variable;
it creates confusion.

> +    aml_append(method, aml_create_field(buf,
> +                            aml_int(4 * BITS_PER_BYTE), /* offset at byte 4.*/
> +                            buf_size, "BUFF"));
Just inline the conversion in aml_create_field().


> +    aml_append(method, aml_return(aml_name("BUFF")));
> +    aml_append(dev, method);
> +
> +    /* build _FIT. */
> +    method = aml_method("_FIT", 0, AML_SERIALIZED);
> +    offset = aml_local(3);
> +
> +    aml_append(method, aml_store(aml_buffer(0, NULL), fit));
> +    aml_append(method, aml_store(aml_int(0), offset));
> +
> +    whilectx = aml_while(aml_int(1));
> +    aml_append(whilectx, aml_store(aml_call1("RFIT", offset), buf));
> +    aml_append(whilectx, aml_store(aml_sizeof(buf), buf_size));
> +
> +    /*
> +     * if fit buffer was changed during RFIT, read from the beginning
> +     * again.
> +     */
> +    ifctx = aml_if(aml_equal(aml_name(NVDIMM_DSM_RFIT_STATUS),
> +                             aml_int(0x100 /* fit changed */)));
> +    aml_append(ifctx, aml_store(aml_buffer(0, NULL), fit));
> +    aml_append(ifctx, aml_store(aml_int(0), offset));
> +    aml_append(whilectx, ifctx);
> +
> +    elsectx = aml_else();
> +
> +    /* finish fit read if no data is read out. */
> +    ifctx = aml_if(aml_equal(buf_size, aml_int(0)));
> +    aml_append(ifctx, aml_return(fit));
> +    aml_append(elsectx, ifctx);
> +
> +    /* update the offset. */
> +    aml_append(elsectx, aml_add(offset, buf_size, offset));
> +    /* append the data we read out to the fit buffer. */
> +    aml_append(elsectx, aml_concatenate(fit, buf, fit));
> +    aml_append(whilectx, elsectx);
> +    aml_append(method, whilectx);
> +
> +    aml_append(dev, method);
> +}
> +
>  static void nvdimm_build_nvdimm_devices(Aml *root_dev, uint32_t ram_slots)
>  {
>      uint32_t slot;
> @@ -1052,6 +1251,7 @@ static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
>  
>      /* 0 is reserved for root device. */
>      nvdimm_build_device_dsm(dev, 0);
> +    nvdimm_build_fit(dev);
>  
>      nvdimm_build_nvdimm_devices(dev, ram_slots);
>  

Xiao Guangrong Nov. 2, 2016, 3:40 p.m. UTC | #4
On 11/02/2016 12:24 AM, Igor Mammedov wrote:
> On Sat, 29 Oct 2016 00:35:39 +0800
> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>
>> _FIT is required for hotplug support, guest will inquire the updated
>> device info from it if a hotplug event is received
>>
>> As FIT buffer is not completely mapped into guest address space, so a
>> new function, Read FIT whose UUID is UUID
>> 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, handle 0x10000, function index
>> is 0x1, is reserved by QEMU to read the piece of FIT buffer. The buffer
>> is concatenated before _FIT return
>>
>> Refer to docs/specs/acpi-nvdimm.txt for detailed design
>>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> ---
>>  docs/specs/acpi_nvdimm.txt |  58 ++++++++++++-
>>  hw/acpi/nvdimm.c           | 204 ++++++++++++++++++++++++++++++++++++++++++++-
>>  2 files changed, 257 insertions(+), 5 deletions(-)
>>
>> diff --git a/docs/specs/acpi_nvdimm.txt b/docs/specs/acpi_nvdimm.txt
>> index 0fdd251..4aa5e3d 100644
>> --- a/docs/specs/acpi_nvdimm.txt
>> +++ b/docs/specs/acpi_nvdimm.txt
>> @@ -127,6 +127,58 @@ _DSM process diagram:
>>   | result from the page     |      |              |
>>   +--------------------------+      +--------------+
>>
>> - _FIT implementation
>> - -------------------
>> - TODO (will fill it when nvdimm hotplug is introduced)
>> +Device Handle Reservation
>> +-------------------------
>> +As we mentioned above, byte 0 ~ byte 3 in the DSM memory save NVDIMM device
>> +handle. The handle is completely QEMU internal thing, the values in range
>> +[0, 0xFFFF] indicate nvdimm device (O means nvdimm root device named NVDR),
>> +other values are reserved by other purpose.
>> +
>> +Current reserved handle:
>> +0x10000 is reserved for QEMU internal DSM function called on the root
>> +device.
> Above part should go to section where 'handle' is defined, i.e. earlier in the file:
>
>    ACPI writes _DSM Input Data (based on the offset in the page):
>    [0x0 - 0x3]: 4 bytes, NVDIMM Device Handle, 0 is reserved for NVDIMM
>                 Root device.
>

Okay.

>
>> +QEMU internal use only _DSM function
>> +------------------------------------
>> +UUID, 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, is reserved for QEMU internal
>> +DSM function.
>> +
>> +There is the function introduced by QEMU and only used by QEMU internal.
>> +
>> +1) Read FIT
> UUID 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62 is reserved for Read_FIT DSM function
> (private QEMU function)
>

okay.

>> +   As we only reserved one page for NVDIMM ACPI it is impossible to map the
>> +   whole FIT data to guest's address space. This function is used by _FIT
>> +   method to read a piece of FIT data from QEMU.
>  _FIT method uses Read_FIT function to fetch NFIT structures blob from QEMU
>  in 1 page sized increments which are then concatenated and returned as _FIT method result.
>

okay.

>
>> +
>> +   Input parameters:
>> +   Arg0 – UUID {set to 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62}
>> +   Arg1 – Revision ID (set to 1)
>> +   Arg2 - Function Index, 0x1
>> +   Arg3 - A package containing a buffer whose layout is as follows:
>> +
>> +   +----------+-------------+-------------+-----------------------------------+
>> +   |  Filed   | Byte Length | Byte Offset | Description                       |
>          ^ field,   s/Byte//,    s/Byte//
>> +   +----------+-------------+-------------+-----------------------------------+
>> +   | offset   |     4       |    0        | the offset of FIT buffer          |
> offset in QEMU's NFIT structures blob to read from

okay.

>
>> +   +----------+-------------+-------------+-----------------------------------+
>> +
>> +   Output:
>> +   +----------+-------------+-------------+-----------------------------------+
>> +   |  Filed   | Byte Length | Byte Offset | Description                       |
>> +   +----------+-------------+-------------+-----------------------------------+
>> +   |          |             |             | return status codes               |
>> +   |          |             |             |   0x100 indicates fit has been    |
>> +   | status   |     4       |    0        |   updated                         |
> 0x100 - error caused by NFIT update while read by _FIT wasn't completed

okay.

>
>> +   |          |             |             | other follows Chapter 3 in DSM    |
> s/other follows/other codes follow/

okay.

>
>> +   |          |             |             | Spec Rev1                         |
>> +   +----------+-------------+-------------+-----------------------------------+
>> +   | fit data |  Varies     |    4        | FIT data                          |
>> +   |          |             |             |                                   |
>> +   +----------+-------------+-------------+-----------------------------------+
> what does "Varies" mean, how would I know reading this how much data Read_FIT should read
> from shared page?

I mean that all bytes in the buffer returned by this function, other than the 'status'
field, are used as FIT data. Maybe it should have a 'length' field?

>
>> +
>> +   The FIT offset is maintained by the caller itself,
> probably is not necessary sentence, or specify a caller (for example OSPM)
>

Okay.

>> current offset plugs
>                  ^^^^?

Typo. should be plus.

>
>> +   the length returned by the function is the next offset we should read.
>> +   When all the FIT data has been read out, zero length is returned.
>> +
>> +   If it returns 0x100, OSPM should restart to read FIT (read from offset 0
>> +   again).
> [...]
> that's all for doc part, I'll do the code part later.
>

Thank you, Igor.

Xiao Guangrong Nov. 2, 2016, 3:42 p.m. UTC | #5
On 11/02/2016 12:41 AM, Stefan Hajnoczi wrote:
> On Sat, Oct 29, 2016 at 12:35:39AM +0800, Xiao Guangrong wrote:
>> +1) Read FIT
>> +   As we only reserved one page for NVDIMM ACPI it is impossible to map the
>> +   whole FIT data to guest's address space. This function is used by _FIT
>> +   method to read a piece of FIT data from QEMU.
>> +
>> +   Input parameters:
>> +   Arg0 – UUID {set to 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62}
>> +   Arg1 – Revision ID (set to 1)
>> +   Arg2 - Function Index, 0x1
>> +   Arg3 - A package containing a buffer whose layout is as follows:
>> +
>> +   +----------+-------------+-------------+-----------------------------------+
>> +   |  Filed   | Byte Length | Byte Offset | Description                       |
>
> s/Filed/Field/
>
> The same applies below too.

Will fix.

>
>> +   +----------+-------------+-------------+-----------------------------------+
>> +   | offset   |     4       |    0        | the offset of FIT buffer          |
>> +   +----------+-------------+-------------+-----------------------------------+
>
> s/offset of FIT buffer/offset into FIT buffer/

will fix.

>
>> +
>> +   Output:
>> +   +----------+-------------+-------------+-----------------------------------+
>> +   |  Filed   | Byte Length | Byte Offset | Description                       |
>> +   +----------+-------------+-------------+-----------------------------------+
>> +   |          |             |             | return status codes               |
>> +   |          |             |             |   0x100 indicates fit has been    |
>> +   | status   |     4       |    0        |   updated                         |
>> +   |          |             |             | other follows Chapter 3 in DSM    |
>> +   |          |             |             | Spec Rev1                         |
>> +   +----------+-------------+-------------+-----------------------------------+
>> +   | fit data |  Varies     |    4        | FIT data                          |
>> +   |          |             |             |                                   |
>> +   +----------+-------------+-------------+-----------------------------------+
>> +
>> +   The FIT offset is maintained by the caller itself, current offset plugs
>
> s/plugs/plus/

Yes, indeed.

>
>> +struct NvdimmFuncReadFITIn {
>> +    uint32_t offset; /* the offset of FIT buffer. */
>
> s/offset of FIT buffer/offset into FIT buffer/

will fix.

Thank you, Stefan!


Xiao Guangrong Nov. 2, 2016, 3:54 p.m. UTC | #6
On 11/02/2016 09:56 PM, Igor Mammedov wrote:
> On Sat, 29 Oct 2016 00:35:39 +0800
> Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
>
>> _FIT is required for hotplug support, guest will inquire the updated
>> device info from it if a hotplug event is received
>>
>> As FIT buffer is not completely mapped into guest address space, so a
>> new function, Read FIT whose UUID is UUID
>> 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, handle 0x10000, function index
>> is 0x1, is reserved by QEMU to read the piece of FIT buffer. The buffer
>> is concatenated before _FIT return
>>
>> Refer to docs/specs/acpi-nvdimm.txt for detailed design
>>
>> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
>> ---
> [...]
>
>> diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
>> index 5f728a6..fc1a012 100644
>> --- a/hw/acpi/nvdimm.c
>> +++ b/hw/acpi/nvdimm.c
>> @@ -496,6 +496,22 @@ typedef struct NvdimmFuncSetLabelDataIn NvdimmFuncSetLabelDataIn;
>>  QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncSetLabelDataIn) +
>>                    offsetof(NvdimmDsmIn, arg3) > 4096);
>>
>> +struct NvdimmFuncReadFITIn {
>> +    uint32_t offset; /* the offset of FIT buffer. */
>> +} QEMU_PACKED;
>> +typedef struct NvdimmFuncReadFITIn NvdimmFuncReadFITIn;
>> +QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncReadFITIn) +
>> +                  offsetof(NvdimmDsmIn, arg3) > 4096);
>> +
>> +struct NvdimmFuncReadFITOut {
>> +    /* the size of buffer filled by QEMU. */
>> +    uint32_t len;
>> +    uint32_t func_ret_status; /* return status code. */
>> +    uint8_t fit[0]; /* the FIT data. */
>> +} QEMU_PACKED;
>> +typedef struct NvdimmFuncReadFITOut NvdimmFuncReadFITOut;
>> +QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncReadFITOut) > 4096);
>> +
>>  static void
>>  nvdimm_dsm_function0(uint32_t supported_func, hwaddr dsm_mem_addr)
>>  {
>> @@ -516,6 +532,74 @@ nvdimm_dsm_no_payload(uint32_t func_ret_status, hwaddr dsm_mem_addr)
>>      cpu_physical_memory_write(dsm_mem_addr, &out, sizeof(out));
>>  }
>>
>> +#define NVDIMM_QEMU_RSVD_HANDLE_ROOT 0x10000
>> +
>> +/* Read FIT data, defined in docs/specs/acpi_nvdimm.txt. */
>> +static void nvdimm_dsm_func_read_fit(AcpiNVDIMMState *state, NvdimmDsmIn *in,
>> +                                     hwaddr dsm_mem_addr)
>> +{
>> +    NvdimmFitBuffer *fit_buf = &state->fit_buf;
>> +    NvdimmFuncReadFITIn *read_fit;
>> +    NvdimmFuncReadFITOut *read_fit_out;
>> +    GArray *fit;
>> +    uint32_t read_len = 0, func_ret_status;
>> +    int size;
>> +
>> +    read_fit = (NvdimmFuncReadFITIn *)in->arg3;
>> +    le32_to_cpus(&read_fit->offset);
> I'd prefer if you'd not do inplace conversion, just do
> offset = le32_to_cpu(read_fit->offset);

okay.

>
>> +
>> +    qemu_mutex_lock(&fit_buf->lock);
>> +    fit = fit_buf->fit;
>> +
>> +    nvdimm_debug("Read FIT: offset %#x FIT size %#x Dirty %s.\n",
>> +                 read_fit->offset, fit->len, fit_buf->dirty ? "Yes" : "No");
> as a follow-up patch, replace nvdimm_debug() with trace events

I will do it as a separate patch.

>
>> +
>> +    if (read_fit->offset > fit->len) {
>> +        func_ret_status = 3 /* Invalid Input Parameters */;
> should be macros instead of magic value

Yes.

>
>> +        goto exit;
>> +    }
>> +
>> +    /* It is the first time to read FIT. */
>> +    if (!read_fit->offset) {
>> +        fit_buf->dirty = false;
>> +    } else if (fit_buf->dirty) { /* FIT has been changed during RFIT. */
>> +        func_ret_status = 0x100 /* fit changed */;
> should be macros instead of magic value

okay.

>
>> +        goto exit;
>> +    }
>> +
>> +    func_ret_status = 0 /* Success */;
>> +    read_len = MIN(fit->len - read_fit->offset,
>> +                   4096 - sizeof(NvdimmFuncReadFITOut));
>> +
>> +exit:
>> +    size = sizeof(NvdimmFuncReadFITOut) + read_len;
>> +    read_fit_out = g_malloc(size);
>> +
>> +    read_fit_out->len = cpu_to_le32(size);
>> +    read_fit_out->func_ret_status = cpu_to_le32(func_ret_status);
>> +    memcpy(read_fit_out->fit, fit->data + read_fit->offset, read_len);
>> +
>> +    cpu_physical_memory_write(dsm_mem_addr, read_fit_out, size);
>> +
>> +    g_free(read_fit_out);
>> +    qemu_mutex_unlock(&fit_buf->lock);
>> +}
>> +
>> +static void nvdimm_dsm_reserved_root(AcpiNVDIMMState *state, NvdimmDsmIn *in,
>> +                                     hwaddr dsm_mem_addr)
>> +{
>> +    switch (in->function) {
>> +    case 0x0:
>> +        nvdimm_dsm_function0(0x1 | 1 << 1 /* Read FIT */, dsm_mem_addr);
>> +        return;
>> +    case 0x1 /*Read FIT */:
>> +        nvdimm_dsm_func_read_fit(state, in, dsm_mem_addr);
>> +        return;
>> +    }
>> +
>> +    nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr);
> should be macros instead of magic value

Okay.
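
For illustration only (not part of the patch): a minimal C sketch of what
named status codes could look like, together with the status decision and
chunk-length computation from nvdimm_dsm_func_read_fit(). The macro and
function names are hypothetical; only the numeric values (0, 1, 3, 0x100,
a 4096-byte page and an 8-byte reply header) come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical macro names; the numeric values are the ones in the patch. */
#define READ_FIT_RET_SUCCESS      0      /* Success */
#define READ_FIT_RET_UNSUPPORTED  1      /* Not Supported */
#define READ_FIT_RET_INVALID      3      /* Invalid Input Parameters */
#define READ_FIT_RET_FIT_CHANGED  0x100  /* FIT changed during the read */

#define DSM_PAGE_SIZE  4096u
#define READ_FIT_HDR   8u    /* len + func_ret_status, 4 bytes each */

/* Status decision only; clearing the dirty flag at offset 0 is left out. */
static uint32_t read_fit_status(uint32_t offset, uint32_t fit_len, bool dirty)
{
    if (offset > fit_len) {
        return READ_FIT_RET_INVALID;
    }
    if (offset != 0 && dirty) {
        return READ_FIT_RET_FIT_CHANGED;
    }
    return READ_FIT_RET_SUCCESS;
}

/* Number of FIT bytes that fit into one reply page, starting at offset. */
static uint32_t read_fit_chunk_len(uint32_t offset, uint32_t fit_len)
{
    uint32_t left, room = DSM_PAGE_SIZE - READ_FIT_HDR;

    assert(offset <= fit_len);   /* the caller checks the status first */
    left = fit_len - offset;
    return left < room ? left : room;
}
```

With names like these, each call site documents itself and the trailing
comments next to the literals are no longer needed.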

>
>> +}
>> +
>>  static void nvdimm_dsm_root(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
>>  {
>>      /*
>> @@ -742,6 +826,7 @@ nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
>>  static void
>>  nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
>>  {
>> +    AcpiNVDIMMState *state = opaque;
>>      NvdimmDsmIn *in;
>>      hwaddr dsm_mem_addr = val;
>>
>> @@ -769,6 +854,11 @@ nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
>>          goto exit;
>>      }
>>
>> +    if (in->handle == NVDIMM_QEMU_RSVD_HANDLE_ROOT) {
>> +        nvdimm_dsm_reserved_root(state, in, dsm_mem_addr);
>> +        goto exit;
>> +    }
>> +
>>       /* Handle 0 is reserved for NVDIMM Root Device. */
>>      if (!in->handle) {
>>          nvdimm_dsm_root(in, dsm_mem_addr);
>> @@ -821,9 +911,13 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
>>  #define NVDIMM_DSM_OUT_BUF_SIZE "RLEN"
>>  #define NVDIMM_DSM_OUT_BUF      "ODAT"
>>
>> +#define NVDIMM_DSM_RFIT_STATUS  "RSTA"
>> +
>> +#define NVDIMM_QEMU_RSVD_UUID   "648B9CF2-CDA1-4312-8AD9-49C4AF32BD62"
>> +
>>  static void nvdimm_build_common_dsm(Aml *dev)
>>  {
>> -    Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem;
>> +    Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem, *elsectx2;
>>      Aml *elsectx, *unsupport, *unpatched, *expected_uuid, *uuid_invalid;
>>      Aml *pckg, *pckg_index, *pckg_buf, *field, *dsm_out_buf, *dsm_out_buf_size;
>>      uint8_t byte_list[1];
>> @@ -912,9 +1006,15 @@ static void nvdimm_build_common_dsm(Aml *dev)
>>                 /* UUID for NVDIMM Root Device */, expected_uuid));
>>      aml_append(method, ifctx);
>>      elsectx = aml_else();
>> -    aml_append(elsectx, aml_store(
>> +    ifctx = aml_if(aml_equal(handle, aml_int(NVDIMM_QEMU_RSVD_HANDLE_ROOT)));
>> +    aml_append(ifctx, aml_store(aml_touuid(NVDIMM_QEMU_RSVD_UUID
>> +               /* UUID for QEMU internal use */), expected_uuid));
>> +    aml_append(elsectx, ifctx);
>> +    elsectx2 = aml_else();
>> +    aml_append(elsectx2, aml_store(
>>                 aml_touuid("4309AC30-0D11-11E4-9191-0800200C9A66")
>>                 /* UUID for NVDIMM Devices */, expected_uuid));
>> +    aml_append(elsectx, elsectx2);
>>      aml_append(method, elsectx);
>>
>>      uuid_invalid = aml_lnot(aml_equal(uuid, expected_uuid));
>> @@ -994,6 +1094,105 @@ static void nvdimm_build_device_dsm(Aml *dev, uint32_t handle)
>>      aml_append(dev, method);
>>  }
>>
>> +static void nvdimm_build_fit(Aml *dev)
> nvdimm_build_fit_method()

okay.

>
>> +{
>> +    Aml *method, *pkg, *buf, *buf_size, *offset, *call_result;
>> +    Aml *whilectx, *ifcond, *ifctx, *elsectx, *fit;
>> +
>> +    buf = aml_local(0);
>> +    buf_size = aml_local(1);
>> +    fit = aml_local(2);
>> +
>> +    aml_append(dev, aml_create_dword_field(aml_buffer(4, NULL),
>> +               aml_int(0), NVDIMM_DSM_RFIT_STATUS));
> it doesn't have to be buffer as it's internal ASL integer object
> so it could be just named variable.

Let me try.

>
> I'd also move it to _FIT method instead of making it device global

We cannot, as it is used in both the _FIT method and the RFIT method.

>
> and if it could work try to pass it as argument to RFIT
> RefOf/DerefOf may help here
> or make return value instead of buffer and pass buffer as reference.
>

Let me try.

> Alternatively you can return buffer from RFIT with status field included
> and check/discard status value there.
>

As we cannot create a named object in a while-loop, it is not easy to check
the status in _FIT.

>> +
>> +    /* build helper function, RFIT. */
>> +    method = aml_method("RFIT", 1, AML_SERIALIZED);
>> +    aml_append(method, aml_create_dword_field(aml_buffer(4, NULL),
>> +                                              aml_int(0), "OFST"));
>> +
>> +    /* prepare input package. */
>> +    pkg = aml_package(1);
>> +    aml_append(method, aml_store(aml_arg(0), aml_name("OFST")));
>> +    aml_append(pkg, aml_name("OFST"));
>> +
>> +    /* call Read_FIT function. */
>> +    call_result = aml_call5(NVDIMM_COMMON_DSM,
>> +                            aml_touuid(NVDIMM_QEMU_RSVD_UUID),
>> +                            aml_int(1) /* Revision 1 */,
>> +                            aml_int(0x1) /* Read FIT */,
>> +                            pkg, aml_int(NVDIMM_QEMU_RSVD_HANDLE_ROOT));
>> +    aml_append(method, aml_store(call_result, buf));
>> +
>> +    /* handle _DSM result. */
>> +    aml_append(method, aml_create_dword_field(buf,
>> +               aml_int(0) /* offset at byte 0 */, "STAU"));
>> +
>> +    aml_append(method, aml_store(aml_name("STAU"),
>> +                                 aml_name(NVDIMM_DSM_RFIT_STATUS)));
>> +
>> +     /* if something is wrong during _DSM. */
>> +    ifcond = aml_equal(aml_int(0 /* Success */), aml_name("STAU"));
>> +    ifctx = aml_if(aml_lnot(ifcond));
>> +    aml_append(ifctx, aml_return(aml_buffer(0, NULL)));
>> +    aml_append(method, ifctx);
>> +
>> +    aml_append(method, aml_store(aml_sizeof(buf), buf_size));
>> +    aml_append(method, aml_subtract(buf_size,
>> +                                    aml_int(4) /* the size of "STAU" */,
>> +                                    buf_size));
>> +
>> +    /* if we read the end of fit. */
>> +    ifctx = aml_if(aml_equal(buf_size, aml_int(0)));
>> +    aml_append(ifctx, aml_return(aml_buffer(0, NULL)));
>> +    aml_append(method, ifctx);
>> +
>> +    aml_append(method, aml_store(aml_shiftleft(buf_size, aml_int(3)),
>> +                                 buf_size));
> there isn't need to convert bytes to bits and store it in the same variable
> it creates confusion

Okay, I will introduce a new variable named buf_size_bits.

>
>> +    aml_append(method, aml_create_field(buf,
>> +                            aml_int(4 * BITS_PER_BYTE), /* offset at byte 4.*/
>> +                            buf_size, "BUFF"));
> just inline conversion in aml_create_field()

okay.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Igor Mammedov Nov. 3, 2016, 9:15 a.m. UTC | #7
On Wed, 2 Nov 2016 23:40:56 +0800
Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:

> On 11/02/2016 12:24 AM, Igor Mammedov wrote:
> > On Sat, 29 Oct 2016 00:35:39 +0800
> > Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
> >  
> >> _FIT is required for hotplug support, guest will inquire the updated
> >> device info from it if a hotplug event is received
> >>
> >> As FIT buffer is not completely mapped into guest address space, so a
> >> new function, Read FIT whose UUID is UUID
> >> 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, handle 0x10000, function index
> >> is 0x1, is reserved by QEMU to read the piece of FIT buffer. The buffer
> >> is concatenated before _FIT return
> >>
> >> Refer to docs/specs/acpi-nvdimm.txt for detailed design
> >>
> >> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> >> ---
> >>  docs/specs/acpi_nvdimm.txt |  58 ++++++++++++-
> >>  hw/acpi/nvdimm.c           | 204 ++++++++++++++++++++++++++++++++++++++++++++-
> >>  2 files changed, 257 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/docs/specs/acpi_nvdimm.txt b/docs/specs/acpi_nvdimm.txt
> >> index 0fdd251..4aa5e3d 100644
> >> --- a/docs/specs/acpi_nvdimm.txt
> >> +++ b/docs/specs/acpi_nvdimm.txt
> >> @@ -127,6 +127,58 @@ _DSM process diagram:
> >>   | result from the page     |      |              |
> >>   +--------------------------+      +--------------+
> >>
> >> - _FIT implementation
> >> - -------------------
> >> - TODO (will fill it when nvdimm hotplug is introduced)
> >> +Device Handle Reservation
> >> +-------------------------
> >> +As we mentioned above, byte 0 ~ byte 3 in the DSM memory save NVDIMM device
> >> +handle. The handle is completely QEMU internal thing, the values in range
> >> +[0, 0xFFFF] indicate nvdimm device (O means nvdimm root device named NVDR),
> >> +other values are reserved by other purpose.
> >> +
> >> +Current reserved handle:
> >> +0x10000 is reserved for QEMU internal DSM function called on the root
> >> +device.  
> > Above part should go to section where 'handle' is defined, i.e. earlier in the file:
> >
> >    ACPI writes _DSM Input Data (based on the offset in the page):
> >    [0x0 - 0x3]: 4 bytes, NVDIMM Device Handle, 0 is reserved for NVDIMM
> >                 Root device.
> >  
> 
> Okay.
> 
> >  
> >> +QEMU internal use only _DSM function
> >> +------------------------------------
> >> +UUID, 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, is reserved for QEMU internal
> >> +DSM function.
> >> +
> >> +There is the function introduced by QEMU and only used by QEMU internal.
> >> +
> >> +1) Read FIT  
> > UUID 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62 is reserved for Read_FIT DSM function
> > (private QEMU function)
> >  
> 
> okay.
> 
> >> +   As we only reserved one page for NVDIMM ACPI it is impossible to map the
> >> +   whole FIT data to guest's address space. This function is used by _FIT
> >> +   method to read a piece of FIT data from QEMU.  
> >  _FIT method uses Read_FIT function to fetch NFIT structures blob from QEMU
> >  in 1 page sized increments which are then concatenated and returned as _FIT method result.
> >  
> 
> okay.
> 
> >  
> >> +
> >> +   Input parameters:
> >> +   Arg0 – UUID {set to 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62}
> >> +   Arg1 – Revision ID (set to 1)
> >> +   Arg2 - Function Index, 0x1
> >> +   Arg3 - A package containing a buffer whose layout is as follows:
> >> +
> >> +   +----------+-------------+-------------+-----------------------------------+
> >> +   |  Filed   | Byte Length | Byte Offset | Description                       |  
> >          ^ field,   s/Byte//,    s/Byte//  
> >> +   +----------+-------------+-------------+-----------------------------------+
> >> +   | offset   |     4       |    0        | the offset of FIT buffer          |  
> > offset in QEMU's NFIT structures blob to read from  
> 
> okay.
> 
> >  
> >> +   +----------+-------------+-------------+-----------------------------------+
> >> +
> >> +   Output:
> >> +   +----------+-------------+-------------+-----------------------------------+
> >> +   |  Filed   | Byte Length | Byte Offset | Description                       |
> >> +   +----------+-------------+-------------+-----------------------------------+
> >> +   |          |             |             | return status codes               |
> >> +   |          |             |             |   0x100 indicates fit has been    |
> >> +   | status   |     4       |    0        |   updated                         |  
> > 0x100 - error caused by NFIT update while read by _FIT wasn't completed  
> 
> okay.
> 
> >  
> >> +   |          |             |             | other follows Chapter 3 in DSM    |  
> > s/other follows/other codes follow/  
> 
> okay.
> 
> >  
> >> +   |          |             |             | Spec Rev1                         |
> >> +   +----------+-------------+-------------+-----------------------------------+
> >> +   | fit data |  Varies     |    4        | FIT data                          |
> >> +   |          |             |             |                                   |
> >> +   +----------+-------------+-------------+-----------------------------------+  
> > what does "Varies" mean, how would I know reading this how much data Read_FIT should read
> > from shared page?  
> 
> I mean other bytes in the buffer returned by this function except the 'status' field
> will be used as fit data. Maybe it has a 'length' field?
You fill in the length in C and read it in the AML code, so it is part of the
output buffer format; document it properly here.
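
To make the point concrete, here is a minimal C sketch of the reply as it is
laid out in the DSM page, following the field order of NvdimmFuncReadFITOut
in the patch: length at byte 0, status at byte 4, FIT data from byte 8.
build_read_fit_reply() and put_le32() are hypothetical helpers written for
this illustration, not QEMU functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/*
 * Fill one reply page:
 *   [0 - 3]  total size of the reply (8 + number of FIT bytes)
 *   [4 - 7]  status code
 *   [8 - ..] FIT data
 * Returns the total size written.
 */
static uint32_t build_read_fit_reply(uint8_t *page, uint32_t status,
                                     const uint8_t *fit, uint32_t fit_len)
{
    uint32_t size = 8 + fit_len;

    assert(size <= 4096);        /* the reply must fit into one page */
    put_le32(page, size);
    put_le32(page + 4, status);
    if (fit_len) {
        memcpy(page + 8, fit, fit_len);
    }
    return size;
}
```

A reader of the page gets the number of FIT bytes as the length field minus
8, which is the piece of information the "Varies" row leaves out.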

> 
> >  
> >> +
> >> +   The FIT offset is maintained by the caller itself,  
> > probably is not necessary sentence, or specify a caller (for example OSPM)
> >  
> 
> Okay.
> 
> >> current offset plugs  
> >                  ^^^^?  
> 
> Typo. should be plus.
> 
> >  
> >> +   the length returned by the function is the next offset we should read.
> >> +   When all the FIT data has been read out, zero length is returned.
> >> +
> >> +   If it returns 0x100, OSPM should restart to read FIT (read from offset 0
> >> +   again).  
> > [...]
> > that's all for doc part, I'll do the code part later.
> >  
> 
> Thank you, Igor.
> 
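
For reference, the read protocol described in the doc text above can be
modelled in plain C as below. This is only a sketch under stated
assumptions: struct fake_fit and fake_read_fit() stand in for QEMU's side,
read_whole_fit() stands in for the _FIT caller, and all of these names are
hypothetical. The 4088-byte chunk limit is the 4096-byte page minus the
8-byte reply header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_MAX    4088u   /* 4096-byte page minus the 8-byte header */
#define FIT_CHANGED  0x100u

/* Stand-in for QEMU's FIT buffer state (hypothetical). */
struct fake_fit {
    const uint8_t *data;
    uint32_t len;
    bool dirty;
};

/* Returns the status; *got is the number of FIT bytes copied to out. */
static uint32_t fake_read_fit(struct fake_fit *f, uint32_t offset,
                              uint8_t *out, uint32_t *got)
{
    *got = 0;
    if (offset > f->len) {
        return 3;                    /* Invalid Input Parameters */
    }
    if (offset == 0) {
        f->dirty = false;            /* a fresh read starts here */
    } else if (f->dirty) {
        return FIT_CHANGED;
    }
    *got = f->len - offset < CHUNK_MAX ? f->len - offset : CHUNK_MAX;
    memcpy(out, f->data + offset, *got);
    return 0;                        /* Success */
}

/* Caller loop: returns the number of bytes read, or -1 on error. */
static long read_whole_fit(struct fake_fit *f, uint8_t *dst, uint32_t cap)
{
    uint8_t chunk[CHUNK_MAX];
    uint32_t offset = 0, got, status;

    for (;;) {
        status = fake_read_fit(f, offset, chunk, &got);
        if (status == FIT_CHANGED) {
            offset = 0;              /* restart from the beginning */
            continue;
        }
        if (status != 0) {
            return -1;
        }
        if (got == 0) {
            return (long)offset;     /* zero length: everything was read */
        }
        assert(offset + got <= cap);
        memcpy(dst + offset, chunk, got);
        offset += got;               /* next offset = current + length */
    }
}
```

The sketch retries without bound when the FIT keeps changing; a real caller
might want to cap the number of restarts.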

Igor Mammedov Nov. 3, 2016, 9:22 a.m. UTC | #8
On Wed, 2 Nov 2016 23:54:05 +0800
Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:

> On 11/02/2016 09:56 PM, Igor Mammedov wrote:
> > On Sat, 29 Oct 2016 00:35:39 +0800
> > Xiao Guangrong <guangrong.xiao@linux.intel.com> wrote:
> >  
> >> _FIT is required for hotplug support, guest will inquire the updated
> >> device info from it if a hotplug event is received
> >>
> >> As FIT buffer is not completely mapped into guest address space, so a
> >> new function, Read FIT whose UUID is UUID
> >> 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, handle 0x10000, function index
> >> is 0x1, is reserved by QEMU to read the piece of FIT buffer. The buffer
> >> is concatenated before _FIT return
> >>
> >> Refer to docs/specs/acpi-nvdimm.txt for detailed design
> >>
> >> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
> >> ---  
> > [...]
> >  
> >> diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> >> index 5f728a6..fc1a012 100644
> >> --- a/hw/acpi/nvdimm.c
> >> +++ b/hw/acpi/nvdimm.c
> >> @@ -496,6 +496,22 @@ typedef struct NvdimmFuncSetLabelDataIn NvdimmFuncSetLabelDataIn;
> >>  QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncSetLabelDataIn) +
> >>                    offsetof(NvdimmDsmIn, arg3) > 4096);
> >>
> >> +struct NvdimmFuncReadFITIn {
> >> +    uint32_t offset; /* the offset of FIT buffer. */
> >> +} QEMU_PACKED;
> >> +typedef struct NvdimmFuncReadFITIn NvdimmFuncReadFITIn;
> >> +QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncReadFITIn) +
> >> +                  offsetof(NvdimmDsmIn, arg3) > 4096);
> >> +
> >> +struct NvdimmFuncReadFITOut {
> >> +    /* the size of buffer filled by QEMU. */
> >> +    uint32_t len;
> >> +    uint32_t func_ret_status; /* return status code. */
> >> +    uint8_t fit[0]; /* the FIT data. */
> >> +} QEMU_PACKED;
> >> +typedef struct NvdimmFuncReadFITOut NvdimmFuncReadFITOut;
> >> +QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncReadFITOut) > 4096);
> >> +
> >>  static void
> >>  nvdimm_dsm_function0(uint32_t supported_func, hwaddr dsm_mem_addr)
> >>  {
> >> @@ -516,6 +532,74 @@ nvdimm_dsm_no_payload(uint32_t func_ret_status, hwaddr dsm_mem_addr)
> >>      cpu_physical_memory_write(dsm_mem_addr, &out, sizeof(out));
> >>  }
> >>
> >> +#define NVDIMM_QEMU_RSVD_HANDLE_ROOT 0x10000
> >> +
> >> +/* Read FIT data, defined in docs/specs/acpi_nvdimm.txt. */
> >> +static void nvdimm_dsm_func_read_fit(AcpiNVDIMMState *state, NvdimmDsmIn *in,
> >> +                                     hwaddr dsm_mem_addr)
> >> +{
> >> +    NvdimmFitBuffer *fit_buf = &state->fit_buf;
> >> +    NvdimmFuncReadFITIn *read_fit;
> >> +    NvdimmFuncReadFITOut *read_fit_out;
> >> +    GArray *fit;
> >> +    uint32_t read_len = 0, func_ret_status;
> >> +    int size;
> >> +
> >> +    read_fit = (NvdimmFuncReadFITIn *)in->arg3;
> >> +    le32_to_cpus(&read_fit->offset);  
> > I'd prefer if you'd not do inplace conversion, just do
> > offset = le32_to_cpu(read_fit->offset);  
> 
> okay.
> 
> >  
> >> +
> >> +    qemu_mutex_lock(&fit_buf->lock);
> >> +    fit = fit_buf->fit;
> >> +
> >> +    nvdimm_debug("Read FIT: offset %#x FIT size %#x Dirty %s.\n",
> >> +                 read_fit->offset, fit->len, fit_buf->dirty ? "Yes" : "No");  
> > > as a follow-up patch, replace nvdimm_debug() with trace events
> 
> I will do it as a separate patch.
> 
> >  
> >> +
> >> +    if (read_fit->offset > fit->len) {
> >> +        func_ret_status = 3 /* Invalid Input Parameters */;  
> > should be macros instead of magic value  
> 
> Yes.
> 
> >  
> >> +        goto exit;
> >> +    }
> >> +
> >> +    /* It is the first time to read FIT. */
> >> +    if (!read_fit->offset) {
> >> +        fit_buf->dirty = false;
> >> +    } else if (fit_buf->dirty) { /* FIT has been changed during RFIT. */
> >> +        func_ret_status = 0x100 /* fit changed */;  
> > should be macros instead of magic value  
> 
> okay.
> 
> >  
> >> +        goto exit;
> >> +    }
> >> +
> >> +    func_ret_status = 0 /* Success */;
> >> +    read_len = MIN(fit->len - read_fit->offset,
> >> +                   4096 - sizeof(NvdimmFuncReadFITOut));
> >> +
> >> +exit:
> >> +    size = sizeof(NvdimmFuncReadFITOut) + read_len;
> >> +    read_fit_out = g_malloc(size);
> >> +
> >> +    read_fit_out->len = cpu_to_le32(size);
> >> +    read_fit_out->func_ret_status = cpu_to_le32(func_ret_status);
> >> +    memcpy(read_fit_out->fit, fit->data + read_fit->offset, read_len);
> >> +
> >> +    cpu_physical_memory_write(dsm_mem_addr, read_fit_out, size);
> >> +
> >> +    g_free(read_fit_out);
> >> +    qemu_mutex_unlock(&fit_buf->lock);
> >> +}
> >> +
> >> +static void nvdimm_dsm_reserved_root(AcpiNVDIMMState *state, NvdimmDsmIn *in,
> >> +                                     hwaddr dsm_mem_addr)
> >> +{
> >> +    switch (in->function) {
> >> +    case 0x0:
> >> +        nvdimm_dsm_function0(0x1 | 1 << 1 /* Read FIT */, dsm_mem_addr);
> >> +        return;
> >> +    case 0x1 /*Read FIT */:
> >> +        nvdimm_dsm_func_read_fit(state, in, dsm_mem_addr);
> >> +        return;
> >> +    }
> >> +
> >> +    nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr);  
> > should be macros instead of magic value  
> 
> Okay.
> 
> >  
> >> +}
> >> +
> >>  static void nvdimm_dsm_root(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
> >>  {
> >>      /*
> >> @@ -742,6 +826,7 @@ nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
> >>  static void
> >>  nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
> >>  {
> >> +    AcpiNVDIMMState *state = opaque;
> >>      NvdimmDsmIn *in;
> >>      hwaddr dsm_mem_addr = val;
> >>
> >> @@ -769,6 +854,11 @@ nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
> >>          goto exit;
> >>      }
> >>
> >> +    if (in->handle == NVDIMM_QEMU_RSVD_HANDLE_ROOT) {
> >> +        nvdimm_dsm_reserved_root(state, in, dsm_mem_addr);
> >> +        goto exit;
> >> +    }
> >> +
> >>       /* Handle 0 is reserved for NVDIMM Root Device. */
> >>      if (!in->handle) {
> >>          nvdimm_dsm_root(in, dsm_mem_addr);
> >> @@ -821,9 +911,13 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
> >>  #define NVDIMM_DSM_OUT_BUF_SIZE "RLEN"
> >>  #define NVDIMM_DSM_OUT_BUF      "ODAT"
> >>
> >> +#define NVDIMM_DSM_RFIT_STATUS  "RSTA"
> >> +
> >> +#define NVDIMM_QEMU_RSVD_UUID   "648B9CF2-CDA1-4312-8AD9-49C4AF32BD62"
> >> +
> >>  static void nvdimm_build_common_dsm(Aml *dev)
> >>  {
> >> -    Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem;
> >> +    Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem, *elsectx2;
> >>      Aml *elsectx, *unsupport, *unpatched, *expected_uuid, *uuid_invalid;
> >>      Aml *pckg, *pckg_index, *pckg_buf, *field, *dsm_out_buf, *dsm_out_buf_size;
> >>      uint8_t byte_list[1];
> >> @@ -912,9 +1006,15 @@ static void nvdimm_build_common_dsm(Aml *dev)
> >>                 /* UUID for NVDIMM Root Device */, expected_uuid));
> >>      aml_append(method, ifctx);
> >>      elsectx = aml_else();
> >> -    aml_append(elsectx, aml_store(
> >> +    ifctx = aml_if(aml_equal(handle, aml_int(NVDIMM_QEMU_RSVD_HANDLE_ROOT)));
> >> +    aml_append(ifctx, aml_store(aml_touuid(NVDIMM_QEMU_RSVD_UUID
> >> +               /* UUID for QEMU internal use */), expected_uuid));
> >> +    aml_append(elsectx, ifctx);
> >> +    elsectx2 = aml_else();
> >> +    aml_append(elsectx2, aml_store(
> >>                 aml_touuid("4309AC30-0D11-11E4-9191-0800200C9A66")
> >>                 /* UUID for NVDIMM Devices */, expected_uuid));
> >> +    aml_append(elsectx, elsectx2);
> >>      aml_append(method, elsectx);
> >>
> >>      uuid_invalid = aml_lnot(aml_equal(uuid, expected_uuid));
> >> @@ -994,6 +1094,105 @@ static void nvdimm_build_device_dsm(Aml *dev, uint32_t handle)
> >>      aml_append(dev, method);
> >>  }
> >>
> >> +static void nvdimm_build_fit(Aml *dev)  
> > nvdimm_build_fit_method()  
> 
> okay.
> 
> >  
> >> +{
> >> +    Aml *method, *pkg, *buf, *buf_size, *offset, *call_result;
> >> +    Aml *whilectx, *ifcond, *ifctx, *elsectx, *fit;
> >> +
> >> +    buf = aml_local(0);
> >> +    buf_size = aml_local(1);
> >> +    fit = aml_local(2);
> >> +
> >> +    aml_append(dev, aml_create_dword_field(aml_buffer(4, NULL),
> >> +               aml_int(0), NVDIMM_DSM_RFIT_STATUS));  
> > it doesn't have to be buffer as it's internal ASL integer object
> > so it could be just named variable.  
> 
> Let me try.
> 
> >
> > I'd also move it to _FIT method instead of making it device global  
> 
> We can not as it is used both in _FIT method and RFIT method.
> 
> >
> > and if it could work try to pass it as argument to RFIT
> > RefOf/DerefOf may help here
> > or make return value instead of buffer and pass buffer as reference.
> >  
> 
> Let me try.
> 
> > Alternatively you can return buffer from RFIT with status field included
> > and check/discard status value there.
> >  
> 
> As we can not create name object in a while-loop, it is not easy to check
> the status in _FIT.
index() might help there. Even better, create a separate method that extracts
a dword from a buffer and call it from every other place that does the same
thing, instead of creating a bunch of fields.
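
The idea, modelled in C rather than AML: one bounds-checked helper that
pulls a dword out of a buffer, reused wherever a status or length is
needed. buf_dword() and rfit_payload() are hypothetical names for this
sketch; the layout (status dword at byte 0, FIT bytes from byte 4) is the
one RFIT sees in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little-endian dword at byte offset off; the range must be valid. */
static uint32_t buf_dword(const uint8_t *buf, size_t len, size_t off)
{
    assert(len >= 4 && off <= len - 4);
    return (uint32_t)buf[off] |
           (uint32_t)buf[off + 1] << 8 |
           (uint32_t)buf[off + 2] << 16 |
           (uint32_t)buf[off + 3] << 24;
}

/*
 * Split an RFIT-style result into its status and its FIT payload.
 * Returns the payload length; *data points just past the status dword.
 */
static size_t rfit_payload(const uint8_t *buf, size_t len,
                           uint32_t *status, const uint8_t **data)
{
    *status = buf_dword(buf, len, 0);
    *data = buf + 4;
    return len - 4;
}
```

The caller can then check and discard the status in one place, without a
device-global status object.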

> 
> >> +
> >> +    /* build helper function, RFIT. */
> >> +    method = aml_method("RFIT", 1, AML_SERIALIZED);
> >> +    aml_append(method, aml_create_dword_field(aml_buffer(4, NULL),
> >> +                                              aml_int(0), "OFST"));
> >> +
> >> +    /* prepare input package. */
> >> +    pkg = aml_package(1);
> >> +    aml_append(method, aml_store(aml_arg(0), aml_name("OFST")));
> >> +    aml_append(pkg, aml_name("OFST"));
> >> +
> >> +    /* call Read_FIT function. */
> >> +    call_result = aml_call5(NVDIMM_COMMON_DSM,
> >> +                            aml_touuid(NVDIMM_QEMU_RSVD_UUID),
> >> +                            aml_int(1) /* Revision 1 */,
> >> +                            aml_int(0x1) /* Read FIT */,
> >> +                            pkg, aml_int(NVDIMM_QEMU_RSVD_HANDLE_ROOT));
> >> +    aml_append(method, aml_store(call_result, buf));
> >> +
> >> +    /* handle _DSM result. */
> >> +    aml_append(method, aml_create_dword_field(buf,
> >> +               aml_int(0) /* offset at byte 0 */, "STAU"));
> >> +
> >> +    aml_append(method, aml_store(aml_name("STAU"),
> >> +                                 aml_name(NVDIMM_DSM_RFIT_STATUS)));
> >> +
> >> +     /* if something is wrong during _DSM. */
> >> +    ifcond = aml_equal(aml_int(0 /* Success */), aml_name("STAU"));
> >> +    ifctx = aml_if(aml_lnot(ifcond));
> >> +    aml_append(ifctx, aml_return(aml_buffer(0, NULL)));
> >> +    aml_append(method, ifctx);
> >> +
> >> +    aml_append(method, aml_store(aml_sizeof(buf), buf_size));
> >> +    aml_append(method, aml_subtract(buf_size,
> >> +                                    aml_int(4) /* the size of "STAU" */,
> >> +                                    buf_size));
> >> +
> >> +    /* if we read the end of fit. */
> >> +    ifctx = aml_if(aml_equal(buf_size, aml_int(0)));
> >> +    aml_append(ifctx, aml_return(aml_buffer(0, NULL)));
> >> +    aml_append(method, ifctx);
> >> +
> >> +    aml_append(method, aml_store(aml_shiftleft(buf_size, aml_int(3)),
> >> +                                 buf_size));  
> > there isn't need to convert bytes to bits and store it in the same variable
> > it creates confusion  
> 
> Okay, i will introduce a new variable named buf_size_bits.
> 
> >  
> >> +    aml_append(method, aml_create_field(buf,
> >> +                            aml_int(4 * BITS_PER_BYTE), /* offset at byte 4.*/
> >> +                            buf_size, "BUFF"));  
> > just inline conversion in aml_create_field()  
> 
> okay.
> 


Patch

diff --git a/docs/specs/acpi_nvdimm.txt b/docs/specs/acpi_nvdimm.txt
index 0fdd251..4aa5e3d 100644
--- a/docs/specs/acpi_nvdimm.txt
+++ b/docs/specs/acpi_nvdimm.txt
@@ -127,6 +127,58 @@  _DSM process diagram:
  | result from the page     |      |              |
  +--------------------------+      +--------------+
 
- _FIT implementation
- -------------------
- TODO (will fill it when nvdimm hotplug is introduced)
+Device Handle Reservation
+-------------------------
+As we mentioned above, byte 0 ~ byte 3 in the DSM memory save NVDIMM device
+handle. The handle is completely QEMU internal thing, the values in range
+[0, 0xFFFF] indicate nvdimm device (O means nvdimm root device named NVDR),
+other values are reserved by other purpose.
+
+Current reserved handle:
+0x10000 is reserved for QEMU internal DSM function called on the root
+device.
+
+QEMU internal use only _DSM function
+------------------------------------
+UUID, 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, is reserved for QEMU internal
+DSM function.
+
+There is the function introduced by QEMU and only used by QEMU internal.
+
+1) Read FIT
+   As we only reserved one page for NVDIMM ACPI it is impossible to map the
+   whole FIT data to guest's address space. This function is used by _FIT
+   method to read a piece of FIT data from QEMU.
+
+   Input parameters:
+   Arg0 – UUID {set to 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62}
+   Arg1 – Revision ID (set to 1)
+   Arg2 - Function Index, 0x1
+   Arg3 - A package containing a buffer whose layout is as follows:
+
+   +----------+-------------+-------------+-----------------------------------+
+   |  Filed   | Byte Length | Byte Offset | Description                       |
+   +----------+-------------+-------------+-----------------------------------+
+   | offset   |     4       |    0        | the offset of FIT buffer          |
+   +----------+-------------+-------------+-----------------------------------+
+
+   Output:
+   +----------+-------------+-------------+-----------------------------------+
+   |  Filed   | Byte Length | Byte Offset | Description                       |
+   +----------+-------------+-------------+-----------------------------------+
+   |          |             |             | return status codes               |
+   |          |             |             |   0x100 indicates fit has been    |
+   | status   |     4       |    0        |   updated                         |
+   |          |             |             | other follows Chapter 3 in DSM    |
+   |          |             |             | Spec Rev1                         |
+   +----------+-------------+-------------+-----------------------------------+
+   | fit data |  Varies     |    4        | FIT data                          |
+   |          |             |             |                                   |
+   +----------+-------------+-------------+-----------------------------------+
+
+   The FIT offset is maintained by the caller itself, current offset plugs
+   the length returned by the function is the next offset we should read.
+   When all the FIT data has been read out, zero length is returned.
+
+   If it returns 0x100, OSPM should restart to read FIT (read from offset 0
+   again).
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 5f728a6..fc1a012 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -496,6 +496,22 @@  typedef struct NvdimmFuncSetLabelDataIn NvdimmFuncSetLabelDataIn;
 QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncSetLabelDataIn) +
                   offsetof(NvdimmDsmIn, arg3) > 4096);
 
+struct NvdimmFuncReadFITIn {
+    uint32_t offset; /* the offset of FIT buffer. */
+} QEMU_PACKED;
+typedef struct NvdimmFuncReadFITIn NvdimmFuncReadFITIn;
+QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncReadFITIn) +
+                  offsetof(NvdimmDsmIn, arg3) > 4096);
+
+struct NvdimmFuncReadFITOut {
+    /* the size of buffer filled by QEMU. */
+    uint32_t len;
+    uint32_t func_ret_status; /* return status code. */
+    uint8_t fit[0]; /* the FIT data. */
+} QEMU_PACKED;
+typedef struct NvdimmFuncReadFITOut NvdimmFuncReadFITOut;
+QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncReadFITOut) > 4096);
+
 static void
 nvdimm_dsm_function0(uint32_t supported_func, hwaddr dsm_mem_addr)
 {
@@ -516,6 +532,74 @@  nvdimm_dsm_no_payload(uint32_t func_ret_status, hwaddr dsm_mem_addr)
     cpu_physical_memory_write(dsm_mem_addr, &out, sizeof(out));
 }
 
+#define NVDIMM_QEMU_RSVD_HANDLE_ROOT 0x10000
+
+/* Read FIT data, defined in docs/specs/acpi_nvdimm.txt. */
+static void nvdimm_dsm_func_read_fit(AcpiNVDIMMState *state, NvdimmDsmIn *in,
+                                     hwaddr dsm_mem_addr)
+{
+    NvdimmFitBuffer *fit_buf = &state->fit_buf;
+    NvdimmFuncReadFITIn *read_fit;
+    NvdimmFuncReadFITOut *read_fit_out;
+    GArray *fit;
+    uint32_t read_len = 0, func_ret_status;
+    int size;
+
+    read_fit = (NvdimmFuncReadFITIn *)in->arg3;
+    le32_to_cpus(&read_fit->offset);
+
+    qemu_mutex_lock(&fit_buf->lock);
+    fit = fit_buf->fit;
+
+    nvdimm_debug("Read FIT: offset %#x FIT size %#x Dirty %s.\n",
+                 read_fit->offset, fit->len, fit_buf->dirty ? "Yes" : "No");
+
+    if (read_fit->offset > fit->len) {
+        func_ret_status = 3 /* Invalid Input Parameters */;
+        goto exit;
+    }
+
+    /* It is the first time to read FIT. */
+    if (!read_fit->offset) {
+        fit_buf->dirty = false;
+    } else if (fit_buf->dirty) { /* FIT has been changed during RFIT. */
+        func_ret_status = 0x100 /* fit changed */;
+        goto exit;
+    }
+
+    func_ret_status = 0 /* Success */;
+    read_len = MIN(fit->len - read_fit->offset,
+                   4096 - sizeof(NvdimmFuncReadFITOut));
+
+exit:
+    size = sizeof(NvdimmFuncReadFITOut) + read_len;
+    read_fit_out = g_malloc(size);
+
+    read_fit_out->len = cpu_to_le32(size);
+    read_fit_out->func_ret_status = cpu_to_le32(func_ret_status);
+    memcpy(read_fit_out->fit, fit->data + read_fit->offset, read_len);
+
+    cpu_physical_memory_write(dsm_mem_addr, read_fit_out, size);
+
+    g_free(read_fit_out);
+    qemu_mutex_unlock(&fit_buf->lock);
+}
+
+static void nvdimm_dsm_reserved_root(AcpiNVDIMMState *state, NvdimmDsmIn *in,
+                                     hwaddr dsm_mem_addr)
+{
+    switch (in->function) {
+    case 0x0:
+        nvdimm_dsm_function0(0x1 | 1 << 1 /* Read FIT */, dsm_mem_addr);
+        return;
+    case 0x1 /* Read FIT */:
+        nvdimm_dsm_func_read_fit(state, in, dsm_mem_addr);
+        return;
+    }
+
+    nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr);
+}
+
 static void nvdimm_dsm_root(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
 {
     /*
@@ -742,6 +826,7 @@  nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
 static void
 nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
 {
+    AcpiNVDIMMState *state = opaque;
     NvdimmDsmIn *in;
     hwaddr dsm_mem_addr = val;
 
@@ -769,6 +854,11 @@  nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
         goto exit;
     }
 
+    if (in->handle == NVDIMM_QEMU_RSVD_HANDLE_ROOT) {
+        nvdimm_dsm_reserved_root(state, in, dsm_mem_addr);
+        goto exit;
+    }
+
      /* Handle 0 is reserved for NVDIMM Root Device. */
     if (!in->handle) {
         nvdimm_dsm_root(in, dsm_mem_addr);
@@ -821,9 +911,13 @@  void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
 #define NVDIMM_DSM_OUT_BUF_SIZE "RLEN"
 #define NVDIMM_DSM_OUT_BUF      "ODAT"
 
+#define NVDIMM_DSM_RFIT_STATUS  "RSTA"
+
+#define NVDIMM_QEMU_RSVD_UUID   "648B9CF2-CDA1-4312-8AD9-49C4AF32BD62"
+
 static void nvdimm_build_common_dsm(Aml *dev)
 {
-    Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem;
+    Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem, *elsectx2;
     Aml *elsectx, *unsupport, *unpatched, *expected_uuid, *uuid_invalid;
     Aml *pckg, *pckg_index, *pckg_buf, *field, *dsm_out_buf, *dsm_out_buf_size;
     uint8_t byte_list[1];
@@ -912,9 +1006,15 @@  static void nvdimm_build_common_dsm(Aml *dev)
                /* UUID for NVDIMM Root Device */, expected_uuid));
     aml_append(method, ifctx);
     elsectx = aml_else();
-    aml_append(elsectx, aml_store(
+    ifctx = aml_if(aml_equal(handle, aml_int(NVDIMM_QEMU_RSVD_HANDLE_ROOT)));
+    aml_append(ifctx, aml_store(aml_touuid(NVDIMM_QEMU_RSVD_UUID
+               /* UUID for QEMU internal use */), expected_uuid));
+    aml_append(elsectx, ifctx);
+    elsectx2 = aml_else();
+    aml_append(elsectx2, aml_store(
                aml_touuid("4309AC30-0D11-11E4-9191-0800200C9A66")
                /* UUID for NVDIMM Devices */, expected_uuid));
+    aml_append(elsectx, elsectx2);
     aml_append(method, elsectx);
 
     uuid_invalid = aml_lnot(aml_equal(uuid, expected_uuid));
@@ -994,6 +1094,105 @@  static void nvdimm_build_device_dsm(Aml *dev, uint32_t handle)
     aml_append(dev, method);
 }
 
+static void nvdimm_build_fit(Aml *dev)
+{
+    Aml *method, *pkg, *buf, *buf_size, *offset, *call_result;
+    Aml *whilectx, *ifcond, *ifctx, *elsectx, *fit;
+
+    buf = aml_local(0);
+    buf_size = aml_local(1);
+    fit = aml_local(2);
+
+    aml_append(dev, aml_create_dword_field(aml_buffer(4, NULL),
+               aml_int(0), NVDIMM_DSM_RFIT_STATUS));
+
+    /* build helper function, RFIT. */
+    method = aml_method("RFIT", 1, AML_SERIALIZED);
+    aml_append(method, aml_create_dword_field(aml_buffer(4, NULL),
+                                              aml_int(0), "OFST"));
+
+    /* prepare input package. */
+    pkg = aml_package(1);
+    aml_append(method, aml_store(aml_arg(0), aml_name("OFST")));
+    aml_append(pkg, aml_name("OFST"));
+
+    /* call Read_FIT function. */
+    call_result = aml_call5(NVDIMM_COMMON_DSM,
+                            aml_touuid(NVDIMM_QEMU_RSVD_UUID),
+                            aml_int(1) /* Revision 1 */,
+                            aml_int(0x1) /* Read FIT */,
+                            pkg, aml_int(NVDIMM_QEMU_RSVD_HANDLE_ROOT));
+    aml_append(method, aml_store(call_result, buf));
+
+    /* handle _DSM result. */
+    aml_append(method, aml_create_dword_field(buf,
+               aml_int(0) /* offset at byte 0 */, "STAU"));
+
+    aml_append(method, aml_store(aml_name("STAU"),
+                                 aml_name(NVDIMM_DSM_RFIT_STATUS)));
+
+    /* if something went wrong during _DSM. */
+    ifcond = aml_equal(aml_int(0 /* Success */), aml_name("STAU"));
+    ifctx = aml_if(aml_lnot(ifcond));
+    aml_append(ifctx, aml_return(aml_buffer(0, NULL)));
+    aml_append(method, ifctx);
+
+    aml_append(method, aml_store(aml_sizeof(buf), buf_size));
+    aml_append(method, aml_subtract(buf_size,
+                                    aml_int(4) /* the size of "STAU" */,
+                                    buf_size));
+
+    /* if we have reached the end of the FIT. */
+    ifctx = aml_if(aml_equal(buf_size, aml_int(0)));
+    aml_append(ifctx, aml_return(aml_buffer(0, NULL)));
+    aml_append(method, ifctx);
+
+    aml_append(method, aml_store(aml_shiftleft(buf_size, aml_int(3)),
+                                 buf_size));
+    aml_append(method, aml_create_field(buf,
+                            aml_int(4 * BITS_PER_BYTE), /* offset at byte 4. */
+                            buf_size, "BUFF"));
+    aml_append(method, aml_return(aml_name("BUFF")));
+    aml_append(dev, method);
+
+    /* build _FIT. */
+    method = aml_method("_FIT", 0, AML_SERIALIZED);
+    offset = aml_local(3);
+
+    aml_append(method, aml_store(aml_buffer(0, NULL), fit));
+    aml_append(method, aml_store(aml_int(0), offset));
+
+    whilectx = aml_while(aml_int(1));
+    aml_append(whilectx, aml_store(aml_call1("RFIT", offset), buf));
+    aml_append(whilectx, aml_store(aml_sizeof(buf), buf_size));
+
+    /*
+     * if fit buffer was changed during RFIT, read from the beginning
+     * again.
+     */
+    ifctx = aml_if(aml_equal(aml_name(NVDIMM_DSM_RFIT_STATUS),
+                             aml_int(0x100 /* fit changed */)));
+    aml_append(ifctx, aml_store(aml_buffer(0, NULL), fit));
+    aml_append(ifctx, aml_store(aml_int(0), offset));
+    aml_append(whilectx, ifctx);
+
+    elsectx = aml_else();
+
+    /* finish fit read if no data is read out. */
+    ifctx = aml_if(aml_equal(buf_size, aml_int(0)));
+    aml_append(ifctx, aml_return(fit));
+    aml_append(elsectx, ifctx);
+
+    /* update the offset. */
+    aml_append(elsectx, aml_add(offset, buf_size, offset));
+    /* append the data we read out to the fit buffer. */
+    aml_append(elsectx, aml_concatenate(fit, buf, fit));
+    aml_append(whilectx, elsectx);
+    aml_append(method, whilectx);
+
+    aml_append(dev, method);
+}
+
 static void nvdimm_build_nvdimm_devices(Aml *root_dev, uint32_t ram_slots)
 {
     uint32_t slot;
@@ -1052,6 +1251,7 @@  static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
 
     /* 0 is reserved for root device. */
     nvdimm_build_device_dsm(dev, 0);
+    nvdimm_build_fit(dev);
 
     nvdimm_build_nvdimm_devices(dev, ram_slots);