diff mbox series

[v3,1/3] msix/hmp: add hmp interface to dump MSI-X info

Message ID 20210714004754.22243-2-dongli.zhang@oracle.com (mailing list archive)
State New, archived
Headers show
Series To add HMP interface to dump PCI MSI-X table/PBA | expand

Commit Message

Dongli Zhang July 14, 2021, 12:47 a.m. UTC
This patch is to add the HMP interface to dump MSI-X table and PBA, in
order to help diagnose the loss of IRQ issue in VM (e.g., if an MSI-X
vector is erroneously masked permanently). Here is the example with
vhost-scsi:

(qemu) info msix /machine/peripheral/vscsi0
Msg L.Addr Msg U.Addr Msg Data   Vect Ctrl
0xfee00000 0x00000000 0x00004041 0x00000000
0xfee00000 0x00000000 0x00004051 0x00000000
0xfee00000 0x00000000 0x00004061 0x00000000
0xfee00000 0x00000000 0x00004071 0x00000000
0xfee01000 0x00000000 0x000040b1 0x00000000
0xfee02000 0x00000000 0x000040c1 0x00000000
0xfee03000 0x00000000 0x000040d1 0x00000000

MSI-X PBA
0 0 0 0 0 0 0

Since the number of MSI-X entries is not determined and might be very
large, it is sometimes inappropriate to dump via QMP.

Therefore, this patch dumps MSI-X information only via HMP, which is
similar to the implementation of hmp_info_mem().

Cc: Jason Wang <jasowang@redhat.com>
Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
Changed since v1/v2:
  - Add msix_present() to pci-stub.c to avoid build error

 hmp-commands-info.hx   | 13 +++++++++
 hw/pci/msix.c          | 63 ++++++++++++++++++++++++++++++++++++++++++
 hw/pci/pci-stub.c      |  5 ++++
 include/hw/pci/msix.h  |  2 ++
 include/monitor/hmp.h  |  1 +
 softmmu/qdev-monitor.c | 25 +++++++++++++++++
 6 files changed, 109 insertions(+)

Comments

Markus Armbruster July 14, 2021, 5:46 a.m. UTC | #1
Dongli Zhang <dongli.zhang@oracle.com> writes:

> This patch is to add the HMP interface to dump MSI-X table and PBA, in
> order to help diagnose the loss of IRQ issue in VM (e.g., if an MSI-X
> vector is erroneously masked permanently). Here is the example with
> vhost-scsi:
>
> (qemu) info msix /machine/peripheral/vscsi0
> Msg L.Addr Msg U.Addr Msg Data   Vect Ctrl
> 0xfee00000 0x00000000 0x00004041 0x00000000
> 0xfee00000 0x00000000 0x00004051 0x00000000
> 0xfee00000 0x00000000 0x00004061 0x00000000
> 0xfee00000 0x00000000 0x00004071 0x00000000
> 0xfee01000 0x00000000 0x000040b1 0x00000000
> 0xfee02000 0x00000000 0x000040c1 0x00000000
> 0xfee03000 0x00000000 0x000040d1 0x00000000
>
> MSI-X PBA
> 0 0 0 0 0 0 0
>
> Since the number of MSI-X entries is not determined and might be very
> large, it is sometimes inappropriate to dump via QMP.

Why?  What makes HMP different?

> Therefore, this patch dumps MSI-X information only via HMP, which is
> similar to the implementation of hmp_info_mem().
>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: Joe Jin <joe.jin@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Acked-by: Jason Wang <jasowang@redhat.com>
Dongli Zhang July 14, 2021, 6:17 a.m. UTC | #2
Hi Markus,

On 7/13/21 10:46 PM, Markus Armbruster wrote:
> Dongli Zhang <dongli.zhang@oracle.com> writes:
> 
>> This patch is to add the HMP interface to dump MSI-X table and PBA, in
>> order to help diagnose the loss of IRQ issue in VM (e.g., if an MSI-X
>> vector is erroneously masked permanently). Here is the example with
>> vhost-scsi:
>>
>> (qemu) info msix /machine/peripheral/vscsi0
>> Msg L.Addr Msg U.Addr Msg Data   Vect Ctrl
>> 0xfee00000 0x00000000 0x00004041 0x00000000
>> 0xfee00000 0x00000000 0x00004051 0x00000000
>> 0xfee00000 0x00000000 0x00004061 0x00000000
>> 0xfee00000 0x00000000 0x00004071 0x00000000
>> 0xfee01000 0x00000000 0x000040b1 0x00000000
>> 0xfee02000 0x00000000 0x000040c1 0x00000000
>> 0xfee03000 0x00000000 0x000040d1 0x00000000
>>
>> MSI-X PBA
>> 0 0 0 0 0 0 0
>>
>> Since the number of MSI-X entries is not determined and might be very
>> large, it is sometimes inappropriate to dump via QMP.
> 
> Why?  What makes HMP different?

Here are two reasons.

1. The size of MSI-X table is nondeterministic and might be very large, e.g.,
the PCI_MSIX_FLAGS_QSIZE is 0x07FF. The "info tlb" (which is a table and similar
to MSI-X) and "info lapic" also only support hmp.

2. The [PATCH 3/3] of this patchset support device specific data, the
definitional of which varies depending on each device type (so far only
virtio-pci supports the interface).

Thank you very much!

Dongli Zhang

> 
>> Therefore, this patch dumps MSI-X information only via HMP, which is
>> similar to the implementation of hmp_info_mem().
>>
>> Cc: Jason Wang <jasowang@redhat.com>
>> Cc: Joe Jin <joe.jin@oracle.com>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Acked-by: Jason Wang <jasowang@redhat.com>
>
Markus Armbruster July 14, 2021, 9:42 a.m. UTC | #3
Dongli Zhang <dongli.zhang@oracle.com> writes:

> Hi Markus,
>
> On 7/13/21 10:46 PM, Markus Armbruster wrote:
>> Dongli Zhang <dongli.zhang@oracle.com> writes:
>> 
>>> This patch is to add the HMP interface to dump MSI-X table and PBA, in
>>> order to help diagnose the loss of IRQ issue in VM (e.g., if an MSI-X
>>> vector is erroneously masked permanently). Here is the example with
>>> vhost-scsi:
>>>
>>> (qemu) info msix /machine/peripheral/vscsi0
>>> Msg L.Addr Msg U.Addr Msg Data   Vect Ctrl
>>> 0xfee00000 0x00000000 0x00004041 0x00000000
>>> 0xfee00000 0x00000000 0x00004051 0x00000000
>>> 0xfee00000 0x00000000 0x00004061 0x00000000
>>> 0xfee00000 0x00000000 0x00004071 0x00000000
>>> 0xfee01000 0x00000000 0x000040b1 0x00000000
>>> 0xfee02000 0x00000000 0x000040c1 0x00000000
>>> 0xfee03000 0x00000000 0x000040d1 0x00000000
>>>
>>> MSI-X PBA
>>> 0 0 0 0 0 0 0
>>>
>>> Since the number of MSI-X entries is not determined and might be very
>>> large, it is sometimes inappropriate to dump via QMP.
>> 
>> Why?  What makes HMP different?
>
> Here are two reasons.
>
> 1. The size of MSI-X table is nondeterministic and might be very large, e.g.,
> the PCI_MSIX_FLAGS_QSIZE is 0x07FF. The "info tlb" (which is a table and similar
> to MSI-X) and "info lapic" also only support hmp.
>
> 2. The [PATCH 3/3] of this patchset support device specific data, the
> definitional of which varies depending on each device type (so far only
> virtio-pci supports the interface).
>
> Thank you very much!
>
> Dongli Zhang
>
>>
>>> Therefore, this patch dumps MSI-X information only via HMP, which is
>>> similar to the implementation of hmp_info_mem().

I think you're mixing up valid and invalid arguments :)  Let me try to
pick them apart.

1a. "Command output may be too large for QMP, therefore provide only the
HMP command" makes no sense.

Both QMP and HMP are not meant for bulk data transfer.  They are control
plane, not data plane.  To illustrate what that means, consider a backup
feature.  The commands to control the backup task are QMP (and HMP, if
desired), but the actual data transfer uses some other channel, so it
doesn't clog the QMP pipes.

If the data is too bulky for QMP, then it's too bulky for HMP, too.

1b. "info tlb" and "info lapic" are HMP only because they are debugging
commands for humans.  Debugging is not necessarily done by humans only.
Sometimes, humans use programs to help them debug, and these programs
would be better off with QMP commands.  For the HMP-only debugging
commands, we decided providing for (largely hypothetical) debugging
scripts wasn't worthwhile.  A similar argument could probably be made
for "info msix".

2. "Output is in part specific to Devices, therefore provide only the
HMP command" is also iffy.  Yes, hacking up a bunch of monitor_printf()s
is probably easier than specifying a QAPI schema.   "It's work" is no
excuse.  "It's more work than it's worth" can be one.  But then we're
back at argument 1b.

Your commit message's first sentence suggests this is indeed just for
debugging.  If this is the case, consider replacing

    Since the number of MSI-X entries is not determined and might be very
    large, it is sometimes inappropriate to dump via QMP.

    Therefore, this patch dumps MSI-X information only via HMP, which is
    similar to the implementation of hmp_info_mem().

by

    Since this is just for debugging by humans, provide the command only
    in HMP, not in QMP.
Dongli Zhang July 14, 2021, 3:42 p.m. UTC | #4
Hi Markus,

On 7/14/21 2:42 AM, Markus Armbruster wrote:
> Dongli Zhang <dongli.zhang@oracle.com> writes:
> 
>> Hi Markus,
>>
>> On 7/13/21 10:46 PM, Markus Armbruster wrote:
>>> Dongli Zhang <dongli.zhang@oracle.com> writes:
>>>
>>>> This patch is to add the HMP interface to dump MSI-X table and PBA, in
>>>> order to help diagnose the loss of IRQ issue in VM (e.g., if an MSI-X
>>>> vector is erroneously masked permanently). Here is the example with
>>>> vhost-scsi:
>>>>
>>>> (qemu) info msix /machine/peripheral/vscsi0
>>>> Msg L.Addr Msg U.Addr Msg Data   Vect Ctrl
>>>> 0xfee00000 0x00000000 0x00004041 0x00000000
>>>> 0xfee00000 0x00000000 0x00004051 0x00000000
>>>> 0xfee00000 0x00000000 0x00004061 0x00000000
>>>> 0xfee00000 0x00000000 0x00004071 0x00000000
>>>> 0xfee01000 0x00000000 0x000040b1 0x00000000
>>>> 0xfee02000 0x00000000 0x000040c1 0x00000000
>>>> 0xfee03000 0x00000000 0x000040d1 0x00000000
>>>>
>>>> MSI-X PBA
>>>> 0 0 0 0 0 0 0
>>>>
>>>> Since the number of MSI-X entries is not determined and might be very
>>>> large, it is sometimes inappropriate to dump via QMP.
>>>
>>> Why?  What makes HMP different?
>>
>> Here are two reasons.
>>
>> 1. The size of MSI-X table is nondeterministic and might be very large, e.g.,
>> the PCI_MSIX_FLAGS_QSIZE is 0x07FF. The "info tlb" (which is a table and similar
>> to MSI-X) and "info lapic" also only support hmp.
>>
>> 2. The [PATCH 3/3] of this patchset support device specific data, the
>> definitional of which varies depending on each device type (so far only
>> virtio-pci supports the interface).
>>
>> Thank you very much!
>>
>> Dongli Zhang
>>
>>>
>>>> Therefore, this patch dumps MSI-X information only via HMP, which is
>>>> similar to the implementation of hmp_info_mem().
> 
> I think you're mixing up valid and invalid arguments :)  Let me try to
> pick them apart.
> 
> 1a. "Command output may be too large for QMP, therefore provide only the
> HMP command" makes no sense.
> 
> Both QMP and HMP are not meant for bulk data transfer.  They are control
> plane, not data plane.  To illustrate what that means, consider a backup
> feature.  The commands to control the backup task are QMP (and HMP, if
> desired), but the actual data transfer uses some other channel, so it
> doesn't clog the QMP pipes.
> 
> If the data is too bulky for QMP, then it's too bulky for HMP, too.
> 
> 1b. "info tlb" and "info lapic" are HMP only because they are debugging
> commands for humans.  Debugging is not necessarily done by humans only.
> Sometimes, humans use programs to help them debug, and these programs
> would be better off with QMP commands.  For the HMP-only debugging
> commands, we decided providing for (largely hypothetical) debugging
> scripts wasn't worthwhile.  A similar argument could probably be made
> for "info msix".
> 
> 2. "Output is in part specific to Devices, therefore provide only the
> HMP command" is also iffy.  Yes, hacking up a bunch of monitor_printf()s
> is probably easier than specifying a QAPI schema.   "It's work" is no
> excuse.  "It's more work than it's worth" can be one.  But then we're
> back at argument 1b.

Thank you very much for the explanation!

> 
> Your commit message's first sentence suggests this is indeed just for
> debugging.  If this is the case, consider replacing
> 
>     Since the number of MSI-X entries is not determined and might be very
>     large, it is sometimes inappropriate to dump via QMP.
> 
>     Therefore, this patch dumps MSI-X information only via HMP, which is
>     similar to the implementation of hmp_info_mem().
> 
> by
> 
>     Since this is just for debugging by humans, provide the command only
>     in HMP, not in QMP.
> 

Yes, this is for debugging purpose. The primary objective is to confirm a
specific vector is not erroneously masked permanently, when a specific device
queue is stuck.

I will replace commit message and send v4.

Thank you very much!

Dongli Zhang
diff mbox series

Patch

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 27206ac049..ce5c550d44 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -221,6 +221,19 @@  SRST
     Show PCI information.
 ERST
 
+    {
+        .name       = "msix",
+        .args_type  = "dev:s",
+        .params     = "dev",
+        .help       = "dump MSI-X information",
+        .cmd        = hmp_info_msix,
+    },
+
+SRST
+  ``info msix`` *dev*
+    Dump MSI-X information for device *dev*.
+ERST
+
 #if defined(TARGET_I386) || defined(TARGET_SH4) || defined(TARGET_SPARC) || \
     defined(TARGET_PPC) || defined(TARGET_XTENSA) || defined(TARGET_M68K)
     {
diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index ae9331cd0b..4b4ec87eee 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -22,6 +22,7 @@ 
 #include "sysemu/xen.h"
 #include "migration/qemu-file-types.h"
 #include "migration/vmstate.h"
+#include "monitor/monitor.h"
 #include "qemu/range.h"
 #include "qapi/error.h"
 #include "trace.h"
@@ -669,3 +670,65 @@  const VMStateDescription vmstate_msix = {
         VMSTATE_END_OF_LIST()
     }
 };
+
+static void msix_dump_table(Monitor *mon, PCIDevice *dev)
+{
+    int vector;
+    uint32_t val;
+    uint8_t *table_entry;
+
+    monitor_printf(mon, "Msg L.Addr ");
+    monitor_printf(mon, "Msg U.Addr ");
+    monitor_printf(mon, "Msg Data   ");
+    monitor_printf(mon, "Vect Ctrl\n");
+
+    for (vector = 0; vector < dev->msix_entries_nr; vector++) {
+        table_entry = dev->msix_table + vector * PCI_MSIX_ENTRY_SIZE;
+
+        val = pci_get_long(table_entry + PCI_MSIX_ENTRY_LOWER_ADDR);
+        monitor_printf(mon, "0x%08x ", val);
+
+        val = pci_get_long(table_entry + PCI_MSIX_ENTRY_UPPER_ADDR);
+        monitor_printf(mon, "0x%08x ", val);
+
+        val = pci_get_long(table_entry + PCI_MSIX_ENTRY_DATA);
+        monitor_printf(mon, "0x%08x ", val);
+
+        val = pci_get_long(table_entry + PCI_MSIX_ENTRY_VECTOR_CTRL);
+        monitor_printf(mon, "0x%08x\n", val);
+    }
+
+    monitor_printf(mon, "\n");
+}
+
+static void msix_dump_pba(Monitor *mon, PCIDevice *dev)
+{
+    int vector;
+
+    monitor_printf(mon, "MSI-X PBA\n");
+
+    for (vector = 0; vector < dev->msix_entries_nr; vector++) {
+        monitor_printf(mon, "%d ", !!msix_is_pending(dev, vector));
+
+        if (vector % 16 == 15) {
+            monitor_printf(mon, "\n");
+        }
+    }
+
+    if (vector % 16 != 15) {
+        monitor_printf(mon, "\n");
+    }
+
+    monitor_printf(mon, "\n");
+}
+
+void msix_dump_info(Monitor *mon, PCIDevice *dev, Error **errp)
+{
+    if (!msix_present(dev)) {
+        error_setg(errp, "MSI-X not available");
+        return;
+    }
+
+    msix_dump_table(mon, dev);
+    msix_dump_pba(mon, dev);
+}
diff --git a/hw/pci/pci-stub.c b/hw/pci/pci-stub.c
index 3a027c42e4..8191d49d56 100644
--- a/hw/pci/pci-stub.c
+++ b/hw/pci/pci-stub.c
@@ -91,3 +91,8 @@  MSIMessage msix_get_message(PCIDevice *dev, unsigned int vector)
 {
     g_assert_not_reached();
 }
+
+void msix_dump_info(Monitor *mon, PCIDevice *dev, Error **errp)
+{
+     monitor_printf(mon, "PCI devices not supported\n");
+}
diff --git a/include/hw/pci/msix.h b/include/hw/pci/msix.h
index 4c4a60c739..10a4500295 100644
--- a/include/hw/pci/msix.h
+++ b/include/hw/pci/msix.h
@@ -47,6 +47,8 @@  int msix_set_vector_notifiers(PCIDevice *dev,
                               MSIVectorPollNotifier poll_notifier);
 void msix_unset_vector_notifiers(PCIDevice *dev);
 
+void msix_dump_info(Monitor *mon, PCIDevice *dev, Error **errp);
+
 extern const VMStateDescription vmstate_msix;
 
 #define VMSTATE_MSIX_TEST(_field, _state, _test) {                   \
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index 3baa1058e2..97c040a3c8 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -36,6 +36,7 @@  void hmp_info_irq(Monitor *mon, const QDict *qdict);
 void hmp_info_pic(Monitor *mon, const QDict *qdict);
 void hmp_info_rdma(Monitor *mon, const QDict *qdict);
 void hmp_info_pci(Monitor *mon, const QDict *qdict);
+void hmp_info_msix(Monitor *mon, const QDict *qdict);
 void hmp_info_tpm(Monitor *mon, const QDict *qdict);
 void hmp_info_iothreads(Monitor *mon, const QDict *qdict);
 void hmp_quit(Monitor *mon, const QDict *qdict);
diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
index 721dec2d82..7837a17d0d 100644
--- a/softmmu/qdev-monitor.c
+++ b/softmmu/qdev-monitor.c
@@ -19,6 +19,7 @@ 
 
 #include "qemu/osdep.h"
 #include "hw/sysbus.h"
+#include "hw/pci/msix.h"
 #include "monitor/hmp.h"
 #include "monitor/monitor.h"
 #include "monitor/qdev.h"
@@ -1005,3 +1006,27 @@  bool qmp_command_available(const QmpCommand *cmd, Error **errp)
     }
     return true;
 }
+
+void hmp_info_msix(Monitor *mon, const QDict *qdict)
+{
+    const char *name = qdict_get_str(qdict, "dev");
+    DeviceState *dev = find_device_state(name, NULL);
+    PCIDevice *pci_dev;
+    Error *err = NULL;
+
+    if (!dev) {
+        error_setg(&err, "Device %s not found", name);
+        goto exit;
+    }
+
+    if (!object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
+        error_setg(&err, "Not a PCI device");
+        goto exit;
+    }
+
+    pci_dev = PCI_DEVICE(dev);
+    msix_dump_info(mon, pci_dev, &err);
+
+exit:
+    hmp_handle_error(mon, err);
+}