mbox series

[v3,0/3] vdpa: Return -EIO if device ack is VIRTIO_NET_ERR

Message ID cover.1688438055.git.yin31149@gmail.com (mailing list archive)
Headers show
Series vdpa: Return -EIO if device ack is VIRTIO_NET_ERR | expand

Message

Hawkins Jiawei July 4, 2023, 3:34 a.m. UTC
According to VirtIO standard, "The class, command and
command-specific-data are set by the driver,
and the device sets the ack byte.
There is little it can do except issue a diagnostic
if ack is not VIRTIO_NET_OK."

Therefore, QEMU should stop sending the queued SVQ commands and
cancel the device startup if the device's ack is not VIRTIO_NET_OK.

Yet the problem is that, vhost_vdpa_net_load_x() returns 1 based on
`*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
As a result, net->nc->info->load() also returns 1, this makes
vhost_net_start_one() incorrectly assume the device state is
successfully loaded by vhost_vdpa_net_load() and return 0, instead of
goto `fail` label to cancel the device startup, as vhost_net_start_one()
only cancels the device startup when net->nc->info->load() returns a
negative value.

This patchset fixes this problem by returning -EIO when the device's
ack is not VIRTIO_NET_OK.

Changelog
=========
v3:
 - split the fixes suggested by Eugenio
 - return -EIO suggested by Michael

v2: https://lore.kernel.org/all/69010e9ebb5e3729aef595ed92840f43e48e53e5.1687875592.git.yin31149@gmail.com/
 - fix the same bug in vhost_vdpa_net_load_offloads()

v1: https://lore.kernel.org/all/cover.1686746406.git.yin31149@gmail.com/

Hawkins Jiawei (3):
  vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac()
  vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq()
  vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_offloads()

 net/vhost-vdpa.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

Comments

Lei Yang July 5, 2023, 7:59 a.m. UTC | #1
Hello Hawkins

QE can help test this series before  it is merged into master, I would
like to know what test steps can cover this series related scenario?

Thanks
Lei

On Tue, Jul 4, 2023 at 11:36 AM Hawkins Jiawei <yin31149@gmail.com> wrote:
>
> According to VirtIO standard, "The class, command and
> command-specific-data are set by the driver,
> and the device sets the ack byte.
> There is little it can do except issue a diagnostic
> if ack is not VIRTIO_NET_OK."
>
> Therefore, QEMU should stop sending the queued SVQ commands and
> cancel the device startup if the device's ack is not VIRTIO_NET_OK.
>
> Yet the problem is that, vhost_vdpa_net_load_x() returns 1 based on
> `*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
> As a result, net->nc->info->load() also returns 1, this makes
> vhost_net_start_one() incorrectly assume the device state is
> successfully loaded by vhost_vdpa_net_load() and return 0, instead of
> goto `fail` label to cancel the device startup, as vhost_net_start_one()
> only cancels the device startup when net->nc->info->load() returns a
> negative value.
>
> This patchset fixes this problem by returning -EIO when the device's
> ack is not VIRTIO_NET_OK.
>
> Changelog
> =========
> v3:
>  - split the fixes suggested by Eugenio
>  - return -EIO suggested by Michael
>
> v2: https://lore.kernel.org/all/69010e9ebb5e3729aef595ed92840f43e48e53e5.1687875592.git.yin31149@gmail.com/
>  - fix the same bug in vhost_vdpa_net_load_offloads()
>
> v1: https://lore.kernel.org/all/cover.1686746406.git.yin31149@gmail.com/
>
> Hawkins Jiawei (3):
>   vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac()
>   vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq()
>   vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_offloads()
>
>  net/vhost-vdpa.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
>
> --
> 2.25.1
>
>
Hawkins Jiawei July 5, 2023, 11:03 a.m. UTC | #2
On 2023/7/5 15:59, Lei Yang wrote:
> Hello Hawkins
>
> QE can help test this series before  it is merged into master, I would
> like to know what test steps can cover this series related scenario?
>

Hi, I would like to suggest the following steps to test this patch series:

1.  Modify the QEMU source code to make the device return a
VIRTIO_NET_ERR for the CVQ command. Please apply the patch
provided below:

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 373609216f..58ade6d4e0 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -642,7 +642,7 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState
*s, const VirtIONet *n)
      if (virtio_vdev_has_feature(&n->parent_obj,
VIRTIO_NET_F_CTRL_MAC_ADDR)) {
          ssize_t dev_written = vhost_vdpa_net_load_cmd(s,
VIRTIO_NET_CTRL_MAC,

VIRTIO_NET_CTRL_MAC_ADDR_SET,
-                                                  n->mac, sizeof(n->mac));
+                                                  n->mac,
sizeof(n->mac) - 1);
          if (unlikely(dev_written < 0)) {
              return dev_written;
          }

2. Start QEMU with the vdpa device in default state.
Without the patch series, QEMU should not trigger any errors or warnings.
With the series applied, QEMU should trigger the warning like
"qemu-system-x86_64: unable to start vhost net: 5: falling back on
userspace virtio".

Thanks!


> Thanks
> Lei
>
> On Tue, Jul 4, 2023 at 11:36 AM Hawkins Jiawei <yin31149@gmail.com> wrote:
>>
>> According to VirtIO standard, "The class, command and
>> command-specific-data are set by the driver,
>> and the device sets the ack byte.
>> There is little it can do except issue a diagnostic
>> if ack is not VIRTIO_NET_OK."
>>
>> Therefore, QEMU should stop sending the queued SVQ commands and
>> cancel the device startup if the device's ack is not VIRTIO_NET_OK.
>>
>> Yet the problem is that, vhost_vdpa_net_load_x() returns 1 based on
>> `*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
>> As a result, net->nc->info->load() also returns 1, this makes
>> vhost_net_start_one() incorrectly assume the device state is
>> successfully loaded by vhost_vdpa_net_load() and return 0, instead of
>> goto `fail` label to cancel the device startup, as vhost_net_start_one()
>> only cancels the device startup when net->nc->info->load() returns a
>> negative value.
>>
>> This patchset fixes this problem by returning -EIO when the device's
>> ack is not VIRTIO_NET_OK.
>>
>> Changelog
>> =========
>> v3:
>>   - split the fixes suggested by Eugenio
>>   - return -EIO suggested by Michael
>>
>> v2: https://lore.kernel.org/all/69010e9ebb5e3729aef595ed92840f43e48e53e5.1687875592.git.yin31149@gmail.com/
>>   - fix the same bug in vhost_vdpa_net_load_offloads()
>>
>> v1: https://lore.kernel.org/all/cover.1686746406.git.yin31149@gmail.com/
>>
>> Hawkins Jiawei (3):
>>    vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac()
>>    vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq()
>>    vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_offloads()
>>
>>   net/vhost-vdpa.c | 15 +++++++++++----
>>   1 file changed, 11 insertions(+), 4 deletions(-)
>>
>> --
>> 2.25.1
>>
>>
>
Lei Yang July 6, 2023, 11:03 a.m. UTC | #3
On Wed, Jul 5, 2023 at 7:03 PM Hawkins Jiawei <yin31149@gmail.com> wrote:
>
> On 2023/7/5 15:59, Lei Yang wrote:
> > Hello Hawkins
> >
> > QE can help test this series before  it is merged into master, I would
> > like to know what test steps can cover this series related scenario?
> >
>
> Hi, I would like to suggest the following steps to test this patch series:
>
> 1.  Modify the QEMU source code to make the device return a
> VIRTIO_NET_ERR for the CVQ command. Please apply the patch
> provided below:
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index 373609216f..58ade6d4e0 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -642,7 +642,7 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState
> *s, const VirtIONet *n)
>       if (virtio_vdev_has_feature(&n->parent_obj,
> VIRTIO_NET_F_CTRL_MAC_ADDR)) {
>           ssize_t dev_written = vhost_vdpa_net_load_cmd(s,
> VIRTIO_NET_CTRL_MAC,
>
> VIRTIO_NET_CTRL_MAC_ADDR_SET,
> -                                                  n->mac, sizeof(n->mac));
> +                                                  n->mac,
> sizeof(n->mac) - 1);
>           if (unlikely(dev_written < 0)) {
>               return dev_written;
>           }
>
> 2. Start QEMU with the vdpa device in default state.
> Without the patch series, QEMU should not trigger any errors or warnings.
> With the series applied, QEMU should trigger the warning like
> "qemu-system-x86_64: unable to start vhost net: 5: falling back on
> userspace virtio".

Based on the above steps, QE tests it without the above patch first,
it will not trigger any errors or warnings. Then QE manually applied
the above patch, boot guest again, it can trigger this warning:
qemu-system-x86_64: unable to start vhost net: 5: falling back on
userspace virtio, this is an expected result.

Tested-by: Lei Yang <leiyang@redhat.com>

BR
Lei

>
> Thanks!
>
>
> > Thanks
> > Lei
> >
> > On Tue, Jul 4, 2023 at 11:36 AM Hawkins Jiawei <yin31149@gmail.com> wrote:
> >>
> >> According to VirtIO standard, "The class, command and
> >> command-specific-data are set by the driver,
> >> and the device sets the ack byte.
> >> There is little it can do except issue a diagnostic
> >> if ack is not VIRTIO_NET_OK."
> >>
> >> Therefore, QEMU should stop sending the queued SVQ commands and
> >> cancel the device startup if the device's ack is not VIRTIO_NET_OK.
> >>
> >> Yet the problem is that, vhost_vdpa_net_load_x() returns 1 based on
> >> `*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
> >> As a result, net->nc->info->load() also returns 1, this makes
> >> vhost_net_start_one() incorrectly assume the device state is
> >> successfully loaded by vhost_vdpa_net_load() and return 0, instead of
> >> goto `fail` label to cancel the device startup, as vhost_net_start_one()
> >> only cancels the device startup when net->nc->info->load() returns a
> >> negative value.
> >>
> >> This patchset fixes this problem by returning -EIO when the device's
> >> ack is not VIRTIO_NET_OK.
> >>
> >> Changelog
> >> =========
> >> v3:
> >>   - split the fixes suggested by Eugenio
> >>   - return -EIO suggested by Michael
> >>
> >> v2: https://lore.kernel.org/all/69010e9ebb5e3729aef595ed92840f43e48e53e5.1687875592.git.yin31149@gmail.com/
> >>   - fix the same bug in vhost_vdpa_net_load_offloads()
> >>
> >> v1: https://lore.kernel.org/all/cover.1686746406.git.yin31149@gmail.com/
> >>
> >> Hawkins Jiawei (3):
> >>    vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac()
> >>    vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq()
> >>    vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_offloads()
> >>
> >>   net/vhost-vdpa.c | 15 +++++++++++----
> >>   1 file changed, 11 insertions(+), 4 deletions(-)
> >>
> >> --
> >> 2.25.1
> >>
> >>
> >
>
Michael Tokarev Aug. 5, 2023, 6:15 a.m. UTC | #4
04.07.2023 06:34, Hawkins Jiawei wrote:
> According to VirtIO standard, "The class, command and
> command-specific-data are set by the driver,
> and the device sets the ack byte.
> There is little it can do except issue a diagnostic
> if ack is not VIRTIO_NET_OK."
> 
> Therefore, QEMU should stop sending the queued SVQ commands and
> cancel the device startup if the device's ack is not VIRTIO_NET_OK.
> 
> Yet the problem is that, vhost_vdpa_net_load_x() returns 1 based on
> `*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
> As a result, net->nc->info->load() also returns 1, this makes
> vhost_net_start_one() incorrectly assume the device state is
> successfully loaded by vhost_vdpa_net_load() and return 0, instead of
> goto `fail` label to cancel the device startup, as vhost_net_start_one()
> only cancels the device startup when net->nc->info->load() returns a
> negative value.
> 
> This patchset fixes this problem by returning -EIO when the device's
> ack is not VIRTIO_NET_OK.
> 
> Changelog
> =========
> v3:
>   - split the fixes suggested by Eugenio
>   - return -EIO suggested by Michael
> 
> v2: https://lore.kernel.org/all/69010e9ebb5e3729aef595ed92840f43e48e53e5.1687875592.git.yin31149@gmail.com/
>   - fix the same bug in vhost_vdpa_net_load_offloads()
> 
> v1: https://lore.kernel.org/all/cover.1686746406.git.yin31149@gmail.com/
> 
> Hawkins Jiawei (3):
>    vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac()
>    vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq()
>    vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_offloads()

Hi!

I don't remember why, but this patch series is marked as "check later" in
my qemu-stable-to-apply email folder.  Does it make sense to back-port this
series to stable-8.0?

6f34807116 vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_offloads()
f45fd95ec9 vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq()
b479bc3c9d vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac()

Patch 6f34807116 also needs

b58d3686a0 vdpa: Add vhost_vdpa_net_load_offloads()

for 8.0.

Thanks,

/mjt
Hawkins Jiawei Aug. 5, 2023, 9:28 a.m. UTC | #5
On 2023/8/5 14:15, Michael Tokarev wrote:
> 04.07.2023 06:34, Hawkins Jiawei wrote:
>> According to VirtIO standard, "The class, command and
>> command-specific-data are set by the driver,
>> and the device sets the ack byte.
>> There is little it can do except issue a diagnostic
>> if ack is not VIRTIO_NET_OK."
>>
>> Therefore, QEMU should stop sending the queued SVQ commands and
>> cancel the device startup if the device's ack is not VIRTIO_NET_OK.
>>
>> Yet the problem is that, vhost_vdpa_net_load_x() returns 1 based on
>> `*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
>> As a result, net->nc->info->load() also returns 1, this makes
>> vhost_net_start_one() incorrectly assume the device state is
>> successfully loaded by vhost_vdpa_net_load() and return 0, instead of
>> goto `fail` label to cancel the device startup, as vhost_net_start_one()
>> only cancels the device startup when net->nc->info->load() returns a
>> negative value.
>>
>> This patchset fixes this problem by returning -EIO when the device's
>> ack is not VIRTIO_NET_OK.
>>
>> Changelog
>> =========
>> v3:
>>   - split the fixes suggested by Eugenio
>>   - return -EIO suggested by Michael
>>
>> v2:
>> https://lore.kernel.org/all/69010e9ebb5e3729aef595ed92840f43e48e53e5.1687875592.git.yin31149@gmail.com/
>>   - fix the same bug in vhost_vdpa_net_load_offloads()
>>
>> v1: https://lore.kernel.org/all/cover.1686746406.git.yin31149@gmail.com/
>>
>> Hawkins Jiawei (3):
>>    vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac()
>>    vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq()
>>    vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_offloads()
>
> Hi!
>
> I don't remember why, but this patch series is marked as "check later" in
> my qemu-stable-to-apply email folder.  Does it make sense to back-port this
> series to stable-8.0?
>
> 6f34807116 vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in
> _load_offloads()
> f45fd95ec9 vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq()
> b479bc3c9d vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac()
>

Hi Michael,

Yes, this bug exists in stable-8.0, so it makes sense to back-port this
series.

Commit f45fd95ec9("vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in
_load_mq()") and
commit b479bc3c9d("vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in
_load_mac()") can be back-ported directly.

> Patch 6f34807116 also needs
>
> b58d3686a0 vdpa: Add vhost_vdpa_net_load_offloads()

As you point out, patch 6f34807116("vdpa: Return -EIO if device ack is
VIRTIO_NET_ERR in _load_offloads()") is a fix to the commit
b58d3686a0("vdpa: Add vhost_vdpa_net_load_offloads()"), which was
introduced by patch series "Vhost-vdpa Shadow Virtqueue Offloads
support" at [1].

This mentioned patch series introduces a new feature for QEMU and
has not been merged into stable-8.0 yet, so I think we do not need to
apply the
patch 6f34807116("vdpa: Return -EIO if device ack is
VIRTIO_NET_ERR in _load_offloads()") to stable-8.0.

Sorry for not mentioning this information in the cover letter.

Thanks!

[1]. https://lore.kernel.org/all/cover.1685704856.git.yin31149@gmail.com/

>
> for 8.0.
>
> Thanks,
>
> /mjt
Michael Tokarev Aug. 5, 2023, 9:53 a.m. UTC | #6
05.08.2023 12:28, Hawkins Jiawei wrote:
..
>> I don't remember why, but this patch series is marked as "check later" in
>> my qemu-stable-to-apply email folder.  Does it make sense to back-port this
>> series to stable-8.0?
>>
>> 6f34807116 vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in
>> _load_offloads()
>> f45fd95ec9 vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq()
>> b479bc3c9d vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac()
>>
> 
> Hi Michael,
> 
> Yes, this bug exists in stable-8.0, so it makes sense to back-port this
> series.
> 
> Commit f45fd95ec9("vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in
> _load_mq()") and
> commit b479bc3c9d("vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in
> _load_mac()") can be back-ported directly.
> 
>> Patch 6f34807116 also needs
>>
>> b58d3686a0 vdpa: Add vhost_vdpa_net_load_offloads()
> 
> As you point out, patch 6f34807116("vdpa: Return -EIO if device ack is
> VIRTIO_NET_ERR in _load_offloads()") is a fix to the commit
> b58d3686a0("vdpa: Add vhost_vdpa_net_load_offloads()"), which was
> introduced by patch series "Vhost-vdpa Shadow Virtqueue Offloads
> support" at [1].
> 
> This mentioned patch series introduces a new feature for QEMU and
> has not been merged into stable-8.0 yet, so I think we do not need to
> apply the
> patch 6f34807116("vdpa: Return -EIO if device ack is
> VIRTIO_NET_ERR in _load_offloads()") to stable-8.0.

Ok, this makes sense, thank you for making it clear.

> Sorry for not mentioning this information in the cover letter.

That's okay.  I too didn't pick it up in time, so fixing this now :

/mjt