[vhost,v2,00/10] vdpa/mlx5: Parallelize device suspend/resume

Message ID 20240816090159.1967650-1-dtatulea@nvidia.com (mailing list archive)
Message

Dragos Tatulea Aug. 16, 2024, 9:01 a.m. UTC
This series parallelizes the mlx5_vdpa device suspend and resume
operations through the firmware async API. The purpose is to reduce live
migration downtime.

The series starts with changing the VQ suspend and resume commands
to the async API. After that, the switch is made to issue multiple
commands of the same type in parallel.
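
The issue-all-then-wait pattern described above can be sketched in plain C. This is only an illustrative userspace model, not code from the series: `struct cmd_batch`, `batch_issue_all()`, `batch_complete()` and `batch_first_error()` are hypothetical names, and the firmware is replaced by direct calls to the completion function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one in-flight async command. */
struct cmd_slot {
	int err;	/* completion status, 0 on success */
	int done;	/* set once the completion has run */
};

/* Hypothetical batch of commands of the same type. */
struct cmd_batch {
	struct cmd_slot *slots;
	size_t num;
	size_t pending;	/* issued but not yet completed */
};

/* Issue phase: put every command in flight before waiting on any. */
static void batch_issue_all(struct cmd_batch *b)
{
	for (size_t i = 0; i < b->num; i++) {
		b->slots[i].err = 0;
		b->slots[i].done = 0;
	}
	b->pending = b->num;
}

/* Completion: may arrive in any order; records the per-command status. */
static void batch_complete(struct cmd_batch *b, size_t idx, int err)
{
	assert(idx < b->num);
	if (b->slots[idx].done)
		return;
	b->slots[idx].err = err;
	b->slots[idx].done = 1;
	b->pending--;
}

/* Collect phase: first non-zero status in slot order, or 0. */
static int batch_first_error(const struct cmd_batch *b)
{
	for (size_t i = 0; i < b->num; i++)
		if (b->slots[i].err)
			return b->slots[i].err;
	return 0;
}
```

In this model the caller waits until `pending` reaches zero and only then inspects the per-command statuses, so one slow or failed command does not serialize the others.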

Then, an additional improvement is added: keep the notifiers enabled
during suspend but make them a NOP. Upon resume, make sure that the link
state is forwarded. This shaves around 30 ms of constant time per device.
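
The notifier idea can be illustrated with a small model. This is a sketch under assumptions, not the driver's code: `struct dev_model` and all function names below are hypothetical, and the "forwarding" is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model, not the mlx5_vdpa structures. */
struct dev_model {
	bool suspended;
	bool hw_link_up;	/* what the hardware currently reports */
	bool reported_link_up;	/* what was last forwarded */
	int forwarded_events;	/* number of link changes forwarded */
};

/* Forward the link state only if it differs from what was reported. */
static void forward_link_state(struct dev_model *d)
{
	if (d->reported_link_up != d->hw_link_up) {
		d->reported_link_up = d->hw_link_up;
		d->forwarded_events++;
	}
}

/* Notifier: stays registered during suspend but does nothing then. */
static void link_event(struct dev_model *d, bool up)
{
	assert(d);
	d->hw_link_up = up;
	if (d->suspended)
		return;
	forward_link_state(d);
}

static void dev_suspend(struct dev_model *d)
{
	d->suspended = true;
}

/* Resume: catch up on any link change that was ignored while suspended. */
static void dev_resume(struct dev_model *d)
{
	d->suspended = false;
	forward_link_state(d);
}
```

The point of the model is that suspend and resume only flip a flag, so no register/unregister cost is paid on either path, while a link change that happens during suspend is still delivered once on resume.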

Finally, use parallel VQ suspend and resume during the CVQ MQ command.

For 1 vDPA device x 32 VQs (16 VQPs), on a large VM (256 GB RAM, 32 CPUs
x 2 threads per core), the improvements are:

+-------------------+--------+--------+-----------+
| operation         | Before | After  | Reduction |
|-------------------+--------+--------+-----------|
| mlx5_vdpa_suspend | 37 ms  | 2.5 ms |     14x   |
| mlx5_vdpa_resume  | 16 ms  | 5 ms   |      3x   |
+-------------------+--------+--------+-----------+
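
For reference, the Reduction column is the Before time divided by the After time. A minimal sketch of that arithmetic follows; the helper names are illustrative and not part of the series.

```c
#include <assert.h>

/* Ratio of the old duration to the new one; both in milliseconds. */
static double reduction_factor(double before_ms, double after_ms)
{
	assert(after_ms > 0.0);
	return before_ms / after_ms;
}

/* Absolute time saved per operation, in milliseconds. */
static double saved_ms(double before_ms, double after_ms)
{
	return before_ms - after_ms;
}
```

With the figures from the table, 37 / 2.5 = 14.8 and 16 / 5 = 3.2, which the table reports rounded down as 14x and 3x. The absolute savings are 34.5 ms for suspend and 11 ms for resume.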

---
v2:
- Changed to parallel VQ suspend/resume during CVQ MQ command.
  Support added in the last 2 patches.
- Made the fw async command more generic and moved it to resources.c.
  Did that because the following series (parallel mkey ops) needs this
  code as well.
  Dropped Acked-by from Eugenio on modified patches.
- Fixed kfree -> kvfree.
- Removed extra newline caught during review.
- As discussed in v1, the series can be pulled completely into
  the vhost tree [0]. The mlx5_core patch was reviewed by Tariq, who is
  also a maintainer for mlx5_core.

[0] - https://lore.kernel.org/virtualization/6582792d-8db2-4bc0-bf3a-248fe5c8fc56@nvidia.com/T/#maefabb2fde5adfb322d16ca16ae64d540f75b7d2

Dragos Tatulea (10):
  net/mlx5: Support throttled commands from async API
  vdpa/mlx5: Introduce error logging function
  vdpa/mlx5: Introduce async fw command wrapper
  vdpa/mlx5: Use async API for vq query command
  vdpa/mlx5: Use async API for vq modify commands
  vdpa/mlx5: Parallelize device suspend
  vdpa/mlx5: Parallelize device resume
  vdpa/mlx5: Keep notifiers during suspend but ignore
  vdpa/mlx5: Small improvement for change_num_qps()
  vdpa/mlx5: Parallelize VQ suspend/resume for CVQ MQ command

 drivers/net/ethernet/mellanox/mlx5/core/cmd.c |  21 +-
 drivers/vdpa/mlx5/core/mlx5_vdpa.h            |  22 +
 drivers/vdpa/mlx5/core/resources.c            |  73 ++++
 drivers/vdpa/mlx5/net/mlx5_vnet.c             | 396 +++++++++++-------
 4 files changed, 361 insertions(+), 151 deletions(-)

Comments

Lei Yang Sept. 2, 2024, 10:03 a.m. UTC | #1
Hi Dragos

QE tested this series with a Mellanox NIC. It failed with [1] when
booting the guest, and the host dmesg also printed the messages in [2].
The bug can be reproduced by booting a guest with a vhost-vdpa device.

[1] (qemu) qemu-kvm: vhost VQ 1 ring restore failed: -1: Operation not permitted (1)
qemu-kvm: vhost VQ 0 ring restore failed: -1: Operation not permitted (1)
qemu-kvm: unable to start vhost net: 5: falling back on userspace virtio
qemu-kvm: vhost_set_features failed: Device or resource busy (16)
qemu-kvm: unable to start vhost net: 16: falling back on userspace virtio

[2] Host dmesg:
[ 1406.187977] mlx5_core 0000:0d:00.2: mlx5_vdpa_compat_reset:3267:(pid 8506): performing device reset
[ 1406.189221] mlx5_core 0000:0d:00.2: mlx5_vdpa_compat_reset:3267:(pid 8506): performing device reset
[ 1406.190354] mlx5_core 0000:0d:00.2: mlx5_vdpa_show_mr_leaks:573:(pid 8506) warning: mkey still alive after resource delete: mr: 000000000c5ccca2, mkey: 0x40000000, refcount: 2
[ 1471.538487] mlx5_core 0000:0d:00.2: cb_timeout_handler:938:(pid 428): cmd[13]: MODIFY_GENERAL_OBJECT(0xa01) Async, timeout. Will cause a leak of a command resource
[ 1471.539486] mlx5_core 0000:0d:00.2: cb_timeout_handler:938:(pid 428): cmd[12]: MODIFY_GENERAL_OBJECT(0xa01) Async, timeout. Will cause a leak of a command resource
[ 1471.540351] mlx5_core 0000:0d:00.2: modify_virtqueues:1617:(pid 8511) error: modify vq 0 failed, state: 0 -> 0, err: 0
[ 1471.541433] mlx5_core 0000:0d:00.2: modify_virtqueues:1617:(pid 8511) error: modify vq 1 failed, state: 0 -> 0, err: -110
[ 1471.542388] mlx5_core 0000:0d:00.2: mlx5_vdpa_set_status:3203:(pid 8511) warning: failed to resume VQs
[ 1471.549778] mlx5_core 0000:0d:00.2: mlx5_vdpa_show_mr_leaks:573:(pid 8511) warning: mkey still alive after resource delete: mr: 000000000c5ccca2, mkey: 0x40000000, refcount: 2
[ 1512.929854] mlx5_core 0000:0d:00.2: mlx5_vdpa_compat_reset:3267:(pid 8565): performing device reset
[ 1513.100290] mlx5_core 0000:0d:00.2: mlx5_vdpa_show_mr_leaks:573:(pid 8565) warning: mkey still alive after resource delete: mr: 000000000c5ccca2, mkey: 0x40000000, refcount: 2

Thanks
Lei




Dragos Tatulea Sept. 2, 2024, 11:05 a.m. UTC | #2
Hi Lei,

Can you provide more details about the qemu version and the vdpa device
options used?

Also, which FW version are you using? There is a relevant bug in FW
22.41.1000 which was fixed in the latest FW (22.42.1000). Did you
encounter any FW syndromes in the host dmesg log?

Thanks,
Dragos
Lei Yang Sept. 3, 2024, 7:40 a.m. UTC | #3
On Mon, Sep 2, 2024 at 7:05 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:

Hi Dragos

> Can you provide more details about the qemu version and the vdpa device
> options used?
>
> Also, which FW version are you using? There is a relevant bug in FW
> 22.41.1000 which was fixed in the latest FW (22.42.1000). Did you
> encounter any FW syndromes in the host dmesg log?

The problem went away after I updated the firmware to version
22.42.1000. I then ran the regression tests with a Mellanox NIC, and
everything works well.

Tested-by: Lei Yang <leiyang@redhat.com>
Dragos Tatulea Sept. 3, 2024, 7:47 a.m. UTC | #4
On 03.09.24 09:40, Lei Yang wrote:
> The problem went away after I updated the firmware to version
> 22.42.1000. I then ran the regression tests with a Mellanox NIC, and
> everything works well.
> 
> Tested-by: Lei Yang <leiyang@redhat.com>
Good to hear. Thanks for the quick reaction.

Thanks,
Dragos
Eugenio Pérez Sept. 3, 2024, 8:10 a.m. UTC | #5
On Tue, Sep 3, 2024 at 9:48 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > The problem went away after I updated the firmware to version
> > 22.42.1000. I then ran the regression tests with a Mellanox NIC, and
> > everything works well.
> >
> > Tested-by: Lei Yang <leiyang@redhat.com>
> Good to hear. Thanks for the quick reaction.
>

Is it possible to add a check so that the async path is not used with old FW?
Dragos Tatulea Sept. 3, 2024, 8:16 a.m. UTC | #6
On 03.09.24 10:10, Eugenio Perez Martin wrote:
> 
> Is it possible to add a check so that the async path is not used with old FW?
> 
Unfortunately not; otherwise the check would already be there.

Note that this affects only FW version 22.41.1000. Older versions are not
affected because they do not support VQ resume.

Thanks,
Dragos