diff mbox series

vdpa: Increase out buffer size for CVQ commands

Message ID 20230622010651.22698-1-yin31149@gmail.com (mailing list archive)
State New, archived
Headers show
Series vdpa: Increase out buffer size for CVQ commands | expand

Commit Message

Hawkins Jiawei June 22, 2023, 1:06 a.m. UTC
According to the VirtIO standard, "Since there are no guarantees,
it can use a hash filter or silently switch to
allmulti or promiscuous mode if it is given too many addresses."
To achive this, QEMU ignores MAC addresses and marks `mac_table.x_overflow`
in the device internal state in virtio_net_handle_mac()
if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses
for the filter table.

However, the problem is that QEMU never marks the `mac_table.x_overflow`
for the vdpa device internal state when the guest sets more than
`MAC_TABLE_ENTRIES` MAC addresses.

To be more specific, currently QEMU offers a buffer size of
vhost_vdpa_net_cvq_cmd_len() for CVQ commands, which represents the size of
VIRTIO_NET_CTRL_MAC_TABLE_SET command with a maximum `MAC_TABLE_ENTRIES`
MAC addresses.

Consequently, if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses,
QEMU truncates the CVQ command data and copies this incomplete command
into the out buffer. In this situation, virtio_net_handle_mac() fails the
integrity check and returns VIRTIO_NET_ERR instead of marking
`mac_table.x_overflow` and returning VIRTIO_NET_OK, since the copied
CVQ command in the buffer is incomplete and flawed.

This patch solves this problem by increasing the buffer size to
vhost_vdpa_net_cvq_cmd_page_len(), which represents the size of the buffer
that is allocated and mmaped. Therefore, everything should work correctly
as long as the guest sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() -
sizeof(struct virtio_net_ctrl_hdr)
- 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses.

Considering the highly unlikely scenario for the guest setting more than
that number of MAC addresses for the filter table, this patch should
work fine for the majority of cases. If there is a need for more than thoes
entries, we can increase the value for vhost_vdpa_net_cvq_cmd_page_len()
in the future, mapping more than one page for command output.

Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
---
 net/vhost-vdpa.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Comments

Eugenio Perez Martin June 25, 2023, 10:48 a.m. UTC | #1
On Thu, Jun 22, 2023 at 3:07 AM Hawkins Jiawei <yin31149@gmail.com> wrote:
>
> According to the VirtIO standard, "Since there are no guarantees,
> it can use a hash filter or silently switch to
> allmulti or promiscuous mode if it is given too many addresses."
> To achive this, QEMU ignores MAC addresses and marks `mac_table.x_overflow`
> in the device internal state in virtio_net_handle_mac()
> if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses
> for the filter table.
>
> However, the problem is that QEMU never marks the `mac_table.x_overflow`
> for the vdpa device internal state when the guest sets more than
> `MAC_TABLE_ENTRIES` MAC addresses.
>
> To be more specific, currently QEMU offers a buffer size of
> vhost_vdpa_net_cvq_cmd_len() for CVQ commands, which represents the size of
> VIRTIO_NET_CTRL_MAC_TABLE_SET command with a maximum `MAC_TABLE_ENTRIES`
> MAC addresses.
>
> Consequently, if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses,
> QEMU truncates the CVQ command data and copies this incomplete command
> into the out buffer. In this situation, virtio_net_handle_mac() fails the
> integrity check and returns VIRTIO_NET_ERR instead of marking
> `mac_table.x_overflow` and returning VIRTIO_NET_OK, since the copied
> CVQ command in the buffer is incomplete and flawed.
>
> This patch solves this problem by increasing the buffer size to
> vhost_vdpa_net_cvq_cmd_page_len(), which represents the size of the buffer
> that is allocated and mmaped. Therefore, everything should work correctly
> as long as the guest sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() -
> sizeof(struct virtio_net_ctrl_hdr)
> - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses.
>
> Considering the highly unlikely scenario for the guest setting more than
> that number of MAC addresses for the filter table, this patch should
> work fine for the majority of cases. If there is a need for more than thoes
> entries, we can increase the value for vhost_vdpa_net_cvq_cmd_page_len()
> in the future, mapping more than one page for command output.
>
> Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device")
> Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
> ---
>  net/vhost-vdpa.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index 5a72204899..ecfa8852b5 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -784,9 +784,18 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
>      };
>      ssize_t dev_written = -EINVAL;
>
> +    /*
> +     * This code truncates the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command
> +     * and prevents QEMU from marking `mac_table.x_overflow` in the device
> +     * internal state in virtio_net_handle_mac() if the guest sets more than
> +     * `(vhost_vdpa_net_cvq_cmd_page_len() - sizeof(struct virtio_net_ctrl_hdr)
> +     * - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses for
> +     * filter table.
> +     * However, this situation is considered rare, so it is acceptable.

I think we can also fix this situation. If it fits in one page, we can
still send the same page to the device and virtio_net_handle_ctrl_iov.
If it does not fit in one page, we can clear all mac filters with
VIRTIO_NET_CTRL_RX_ALLUNI and / or VIRTIO_NET_CTRL_RX_ALLMULTI.

Thanks!

> +     */
>      out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0,
>                               s->cvq_cmd_out_buffer,
> -                             vhost_vdpa_net_cvq_cmd_len());
> +                             vhost_vdpa_net_cvq_cmd_page_len());
>      if (*(uint8_t *)s->cvq_cmd_out_buffer == VIRTIO_NET_CTRL_ANNOUNCE) {
>          /*
>           * Guest announce capability is emulated by qemu, so don't forward to
> --
> 2.25.1
>
Hawkins Jiawei June 26, 2023, 8:26 a.m. UTC | #2
On 2023/6/25 18:48, Eugenio Perez Martin wrote:
> On Thu, Jun 22, 2023 at 3:07 AM Hawkins Jiawei <yin31149@gmail.com> wrote:
>>
>> According to the VirtIO standard, "Since there are no guarantees,
>> it can use a hash filter or silently switch to
>> allmulti or promiscuous mode if it is given too many addresses."
>> To achive this, QEMU ignores MAC addresses and marks `mac_table.x_overflow`
>> in the device internal state in virtio_net_handle_mac()
>> if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses
>> for the filter table.
>>
>> However, the problem is that QEMU never marks the `mac_table.x_overflow`
>> for the vdpa device internal state when the guest sets more than
>> `MAC_TABLE_ENTRIES` MAC addresses.
>>
>> To be more specific, currently QEMU offers a buffer size of
>> vhost_vdpa_net_cvq_cmd_len() for CVQ commands, which represents the size of
>> VIRTIO_NET_CTRL_MAC_TABLE_SET command with a maximum `MAC_TABLE_ENTRIES`
>> MAC addresses.
>>
>> Consequently, if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses,
>> QEMU truncates the CVQ command data and copies this incomplete command
>> into the out buffer. In this situation, virtio_net_handle_mac() fails the
>> integrity check and returns VIRTIO_NET_ERR instead of marking
>> `mac_table.x_overflow` and returning VIRTIO_NET_OK, since the copied
>> CVQ command in the buffer is incomplete and flawed.
>>
>> This patch solves this problem by increasing the buffer size to
>> vhost_vdpa_net_cvq_cmd_page_len(), which represents the size of the buffer
>> that is allocated and mmaped. Therefore, everything should work correctly
>> as long as the guest sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() -
>> sizeof(struct virtio_net_ctrl_hdr)
>> - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses.
>>
>> Considering the highly unlikely scenario for the guest setting more than
>> that number of MAC addresses for the filter table, this patch should
>> work fine for the majority of cases. If there is a need for more than thoes
>> entries, we can increase the value for vhost_vdpa_net_cvq_cmd_page_len()
>> in the future, mapping more than one page for command output.
>>
>> Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device")
>> Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
>> ---
>>   net/vhost-vdpa.c | 11 ++++++++++-
>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>> index 5a72204899..ecfa8852b5 100644
>> --- a/net/vhost-vdpa.c
>> +++ b/net/vhost-vdpa.c
>> @@ -784,9 +784,18 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
>>       };
>>       ssize_t dev_written = -EINVAL;
>>
>> +    /*
>> +     * This code truncates the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command
>> +     * and prevents QEMU from marking `mac_table.x_overflow` in the device
>> +     * internal state in virtio_net_handle_mac() if the guest sets more than
>> +     * `(vhost_vdpa_net_cvq_cmd_page_len() - sizeof(struct virtio_net_ctrl_hdr)
>> +     * - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses for
>> +     * filter table.
>> +     * However, this situation is considered rare, so it is acceptable.
>
> I think we can also fix this situation. If it fits in one page, we can
> still send the same page to the device and virtio_net_handle_ctrl_iov.
> If it does not fit in one page, we can clear all mac filters with
> VIRTIO_NET_CTRL_RX_ALLUNI and / or VIRTIO_NET_CTRL_RX_ALLMULTI.

Hi Eugenio,

Thank you for your review.

However, it appears that the approach may not resolve the situation.

When the CVQ command exceeds one page,
vhost_vdpa_net_handle_ctrl_avail() truncates the command, resulting in a
malformed SVQ command being received by the hardware, which in turn
triggers an error acknowledgement to QEMU.

So if CVQ command exceeds one page, vhost_vdpa_net_handle_ctrl_avail()
should not update the device internal state because the SVQ command
fails.(Please correct me if I am wrong)

It appears that my commit message and comments did not take this into
account. I will refactor them in the v2 patch..

Thanks!


>
> Thanks!
>
>> +     */
>>       out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0,
>>                                s->cvq_cmd_out_buffer,
>> -                             vhost_vdpa_net_cvq_cmd_len());
>> +                             vhost_vdpa_net_cvq_cmd_page_len());
>>       if (*(uint8_t *)s->cvq_cmd_out_buffer == VIRTIO_NET_CTRL_ANNOUNCE) {
>>           /*
>>            * Guest announce capability is emulated by qemu, so don't forward to
>> --
>> 2.25.1
>>
>
Eugenio Perez Martin June 26, 2023, 9:08 a.m. UTC | #3
On Mon, Jun 26, 2023 at 10:26 AM Hawkins Jiawei <yin31149@gmail.com> wrote:
>
> On 2023/6/25 18:48, Eugenio Perez Martin wrote:
> > On Thu, Jun 22, 2023 at 3:07 AM Hawkins Jiawei <yin31149@gmail.com> wrote:
> >>
> >> According to the VirtIO standard, "Since there are no guarantees,
> >> it can use a hash filter or silently switch to
> >> allmulti or promiscuous mode if it is given too many addresses."
> >> To achive this, QEMU ignores MAC addresses and marks `mac_table.x_overflow`
> >> in the device internal state in virtio_net_handle_mac()
> >> if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses
> >> for the filter table.
> >>
> >> However, the problem is that QEMU never marks the `mac_table.x_overflow`
> >> for the vdpa device internal state when the guest sets more than
> >> `MAC_TABLE_ENTRIES` MAC addresses.
> >>
> >> To be more specific, currently QEMU offers a buffer size of
> >> vhost_vdpa_net_cvq_cmd_len() for CVQ commands, which represents the size of
> >> VIRTIO_NET_CTRL_MAC_TABLE_SET command with a maximum `MAC_TABLE_ENTRIES`
> >> MAC addresses.
> >>
> >> Consequently, if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses,
> >> QEMU truncates the CVQ command data and copies this incomplete command
> >> into the out buffer. In this situation, virtio_net_handle_mac() fails the
> >> integrity check and returns VIRTIO_NET_ERR instead of marking
> >> `mac_table.x_overflow` and returning VIRTIO_NET_OK, since the copied
> >> CVQ command in the buffer is incomplete and flawed.
> >>
> >> This patch solves this problem by increasing the buffer size to
> >> vhost_vdpa_net_cvq_cmd_page_len(), which represents the size of the buffer
> >> that is allocated and mmaped. Therefore, everything should work correctly
> >> as long as the guest sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() -
> >> sizeof(struct virtio_net_ctrl_hdr)
> >> - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses.
> >>
> >> Considering the highly unlikely scenario for the guest setting more than
> >> that number of MAC addresses for the filter table, this patch should
> >> work fine for the majority of cases. If there is a need for more than thoes
> >> entries, we can increase the value for vhost_vdpa_net_cvq_cmd_page_len()
> >> in the future, mapping more than one page for command output.
> >>
> >> Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device")
> >> Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
> >> ---
> >>   net/vhost-vdpa.c | 11 ++++++++++-
> >>   1 file changed, 10 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> >> index 5a72204899..ecfa8852b5 100644
> >> --- a/net/vhost-vdpa.c
> >> +++ b/net/vhost-vdpa.c
> >> @@ -784,9 +784,18 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
> >>       };
> >>       ssize_t dev_written = -EINVAL;
> >>
> >> +    /*
> >> +     * This code truncates the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command
> >> +     * and prevents QEMU from marking `mac_table.x_overflow` in the device
> >> +     * internal state in virtio_net_handle_mac() if the guest sets more than
> >> +     * `(vhost_vdpa_net_cvq_cmd_page_len() - sizeof(struct virtio_net_ctrl_hdr)
> >> +     * - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses for
> >> +     * filter table.
> >> +     * However, this situation is considered rare, so it is acceptable.
> >
> > I think we can also fix this situation. If it fits in one page, we can
> > still send the same page to the device and virtio_net_handle_ctrl_iov.
> > If it does not fit in one page, we can clear all mac filters with
> > VIRTIO_NET_CTRL_RX_ALLUNI and / or VIRTIO_NET_CTRL_RX_ALLMULTI.
>
> Hi Eugenio,
>
> Thank you for your review.
>
> However, it appears that the approach may not resolve the situation.
>
> When the CVQ command exceeds one page,
> vhost_vdpa_net_handle_ctrl_avail() truncates the command, resulting in a
> malformed SVQ command being received by the hardware, which in turn
> triggers an error acknowledgement to QEMU.
>

If that happens we can sanitize the copied cmd buffer. Let's start
with an overflowed unicast table first.

To configure the device we need to transform the command to
VIRTIO_NET_CTRL_RX_ALLUNI, as we cannot copy all the table to the out
cmd page. If that succeeds, we can continue to register that in the
VirtIONet struct.

 We can copy the first MAC_TABLE_ENTRIES + 1 entries and set entries =
(MAC_TABLE_ENTRIES + 1), and then use that buffer to call
virtio_net_handle_ctrl_iov. That sets VirtIONet.uni_overflow = true
and VirtIONet.all_uni = false.

We need to handle the multicast addresses in a similar way, but we can
find cases where only multicast addresses overflow.

In future series we can improve the situation:
* Allocating bigger out buffers so more mac addresses fit in it.
* Mapping also guest pages in CVQ, so the real device is able to read
the full table but VirtIONet only stores the first MAC_TABLE_ENTRIES
or .uni_overflow = 1.

But I think it should be enough at this point.

Thanks!

> So if CVQ command exceeds one page, vhost_vdpa_net_handle_ctrl_avail()
> should not update the device internal state because the SVQ command
> fails.(Please correct me if I am wrong)
>
> It appears that my commit message and comments did not take this into
> account. I will refactor them in the v2 patch..
>
> Thanks!
>
>
> >
> > Thanks!
> >
> >> +     */
> >>       out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0,
> >>                                s->cvq_cmd_out_buffer,
> >> -                             vhost_vdpa_net_cvq_cmd_len());
> >> +                             vhost_vdpa_net_cvq_cmd_page_len());
> >>       if (*(uint8_t *)s->cvq_cmd_out_buffer == VIRTIO_NET_CTRL_ANNOUNCE) {
> >>           /*
> >>            * Guest announce capability is emulated by qemu, so don't forward to
> >> --
> >> 2.25.1
> >>
> >
>
Hawkins Jiawei June 27, 2023, 8:53 a.m. UTC | #4
On 2023/6/26 17:08, Eugenio Perez Martin wrote:
> On Mon, Jun 26, 2023 at 10:26 AM Hawkins Jiawei <yin31149@gmail.com> wrote:
>>
>> On 2023/6/25 18:48, Eugenio Perez Martin wrote:
>>> On Thu, Jun 22, 2023 at 3:07 AM Hawkins Jiawei <yin31149@gmail.com> wrote:
>>>>
>>>> According to the VirtIO standard, "Since there are no guarantees,
>>>> it can use a hash filter or silently switch to
>>>> allmulti or promiscuous mode if it is given too many addresses."
>>>> To achive this, QEMU ignores MAC addresses and marks `mac_table.x_overflow`
>>>> in the device internal state in virtio_net_handle_mac()
>>>> if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses
>>>> for the filter table.
>>>>
>>>> However, the problem is that QEMU never marks the `mac_table.x_overflow`
>>>> for the vdpa device internal state when the guest sets more than
>>>> `MAC_TABLE_ENTRIES` MAC addresses.
>>>>
>>>> To be more specific, currently QEMU offers a buffer size of
>>>> vhost_vdpa_net_cvq_cmd_len() for CVQ commands, which represents the size of
>>>> VIRTIO_NET_CTRL_MAC_TABLE_SET command with a maximum `MAC_TABLE_ENTRIES`
>>>> MAC addresses.
>>>>
>>>> Consequently, if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses,
>>>> QEMU truncates the CVQ command data and copies this incomplete command
>>>> into the out buffer. In this situation, virtio_net_handle_mac() fails the
>>>> integrity check and returns VIRTIO_NET_ERR instead of marking
>>>> `mac_table.x_overflow` and returning VIRTIO_NET_OK, since the copied
>>>> CVQ command in the buffer is incomplete and flawed.
>>>>
>>>> This patch solves this problem by increasing the buffer size to
>>>> vhost_vdpa_net_cvq_cmd_page_len(), which represents the size of the buffer
>>>> that is allocated and mmaped. Therefore, everything should work correctly
>>>> as long as the guest sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() -
>>>> sizeof(struct virtio_net_ctrl_hdr)
>>>> - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses.
>>>>
>>>> Considering the highly unlikely scenario for the guest setting more than
>>>> that number of MAC addresses for the filter table, this patch should
>>>> work fine for the majority of cases. If there is a need for more than thoes
>>>> entries, we can increase the value for vhost_vdpa_net_cvq_cmd_page_len()
>>>> in the future, mapping more than one page for command output.
>>>>
>>>> Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device")
>>>> Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
>>>> ---
>>>>    net/vhost-vdpa.c | 11 ++++++++++-
>>>>    1 file changed, 10 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>>> index 5a72204899..ecfa8852b5 100644
>>>> --- a/net/vhost-vdpa.c
>>>> +++ b/net/vhost-vdpa.c
>>>> @@ -784,9 +784,18 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
>>>>        };
>>>>        ssize_t dev_written = -EINVAL;
>>>>
>>>> +    /*
>>>> +     * This code truncates the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command
>>>> +     * and prevents QEMU from marking `mac_table.x_overflow` in the device
>>>> +     * internal state in virtio_net_handle_mac() if the guest sets more than
>>>> +     * `(vhost_vdpa_net_cvq_cmd_page_len() - sizeof(struct virtio_net_ctrl_hdr)
>>>> +     * - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses for
>>>> +     * filter table.
>>>> +     * However, this situation is considered rare, so it is acceptable.
>>>
>>> I think we can also fix this situation. If it fits in one page, we can
>>> still send the same page to the device and virtio_net_handle_ctrl_iov.
>>> If it does not fit in one page, we can clear all mac filters with
>>> VIRTIO_NET_CTRL_RX_ALLUNI and / or VIRTIO_NET_CTRL_RX_ALLMULTI.
>>
>> Hi Eugenio,
>>
>> Thank you for your review.
>>
>> However, it appears that the approach may not resolve the situation.
>>
>> When the CVQ command exceeds one page,
>> vhost_vdpa_net_handle_ctrl_avail() truncates the command, resulting in a
>> malformed SVQ command being received by the hardware, which in turn
>> triggers an error acknowledgement to QEMU.
>>
>
> If that happens we can sanitize the copied cmd buffer. Let's start
> with an overflowed unicast table first.
>
> To configure the device we need to transform the command to
> VIRTIO_NET_CTRL_RX_ALLUNI, as we cannot copy all the table to the out
> cmd page. If that succeeds, we can continue to register that in the
> VirtIONet struct.
>
>   We can copy the first MAC_TABLE_ENTRIES + 1 entries and set entries =
> (MAC_TABLE_ENTRIES + 1), and then use that buffer to call
> virtio_net_handle_ctrl_iov. That sets VirtIONet.uni_overflow = true
> and VirtIONet.all_uni = false.
>
> We need to handle the multicast addresses in a similar way, but we can
> find cases where only multicast addresses overflow.
>
> In future series we can improve the situation:
> * Allocating bigger out buffers so more mac addresses fit in it.
> * Mapping also guest pages in CVQ, so the real device is able to read
> the full table but VirtIONet only stores the first MAC_TABLE_ENTRIES
> or .uni_overflow = 1.
>
> But I think it should be enough at this point.

Yes, I think this is a good method to update the device internal state
to align with the VirtIO standard when the VIRTIO_NET_CTRL_MAC_TABLE_SET
CVQ command does not fit in one page.

But it seems that we should update the device internal state only when
the hardware successfully receives this CVQ command. Once the hardware
set the VIRTIO_NET_ERR to this CVQ command, we should not change the
device internal state.

To be more specific, there are two scenarios to consider:

- If the CVQ command fits within a single page, the function
vhost_vdpa_net_handle_ctrl_avail() can offer a buffer with the fully
copied CVQ command. In this case, everything should work smoothly.

- If the CVQ command does not fit within a single page, the function
vhost_vdpa_net_handle_ctrl_avail() can only provide a buffer with the
truncated CVQ command. When this buffer is sent to the vpda device, the
device fails to accept this CVQ command and keeps its state unchanged
due to the malformed CVQ. As a result, the device sets the ack
VIRTIO_NET_ERR.

   Consequently, vhost_vdpa_net_handle_ctrl_avail() should immediately
return and keep the device internal state unchanged because the vdpa
device returns VIRTIO_NET_ERR, indicating that the hardware did not
modify the state in this CVQ command.

Therefore, it appears that whenever the VIRTIO_NET_CTRL_MAC_TABLE_SET
CVQ command does not fit in one page, we should not update the device
internal state, because vdpa hardware always fails to accept this CVQ
command due to truncation.

Thanks!



>
> Thanks!
>
>> So if CVQ command exceeds one page, vhost_vdpa_net_handle_ctrl_avail()
>> should not update the device internal state because the SVQ command
>> fails.(Please correct me if I am wrong)
>>
>> It appears that my commit message and comments did not take this into
>> account. I will refactor them in the v2 patch..
>>
>> Thanks!
>>
>>
>>>
>>> Thanks!
>>>
>>>> +     */
>>>>        out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0,
>>>>                                 s->cvq_cmd_out_buffer,
>>>> -                             vhost_vdpa_net_cvq_cmd_len());
>>>> +                             vhost_vdpa_net_cvq_cmd_page_len());
>>>>        if (*(uint8_t *)s->cvq_cmd_out_buffer == VIRTIO_NET_CTRL_ANNOUNCE) {
>>>>            /*
>>>>             * Guest announce capability is emulated by qemu, so don't forward to
>>>> --
>>>> 2.25.1
>>>>
>>>
>>
>
Michael S. Tsirkin July 10, 2023, 6:52 p.m. UTC | #5
On Mon, Jun 26, 2023 at 04:26:04PM +0800, Hawkins Jiawei wrote:
> It appears that my commit message and comments did not take this into
> account. I will refactor them in the v2 patch..

does not look like you ever sent v2.
Hawkins Jiawei July 11, 2023, 1:48 a.m. UTC | #6
On 2023/7/11 2:52, Michael S. Tsirkin wrote:
> On Mon, Jun 26, 2023 at 04:26:04PM +0800, Hawkins Jiawei wrote:
>> It appears that my commit message and comments did not take this into
>> account. I will refactor them in the v2 patch..
>
> does not look like you ever sent v2.
>

Sorry for not mentioning that I have moved the patch to the patch series
titled "Vhost-vdpa Shadow Virtqueue _F_CTRL_RX commands support" at [1].
The reason for this move is that the bug in question should not be
triggered until the VIRTIO_NET_CTRL_MAC_TABLE_SET command is exposed by
this patch series.

I will take care of this in my future patch series.

[1].https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg01577.html

Thanks!
Michael Tokarev July 12, 2023, 10:45 a.m. UTC | #7
11.07.2023 04:48, Hawkins Jiawei wrote:
..
> Sorry for not mentioning that I have moved the patch to the patch series
> titled "Vhost-vdpa Shadow Virtqueue _F_CTRL_RX commands support" at [1].
> The reason for this move is that the bug in question should not be
> triggered until the VIRTIO_NET_CTRL_MAC_TABLE_SET command is exposed by
> this patch series.

Does this mean this particular change is not supposed to be applied to -stable,
as the other change which exposes the bug isn't in any stable series?

Thanks,

/mjt
Hawkins Jiawei July 12, 2023, 2:04 p.m. UTC | #8
在 2023/7/12 18:45, Michael Tokarev 写道:
> 11.07.2023 04:48, Hawkins Jiawei wrote:
> ..
>> Sorry for not mentioning that I have moved the patch to the patch series
>> titled "Vhost-vdpa Shadow Virtqueue _F_CTRL_RX commands support" at [1].
>> The reason for this move is that the bug in question should not be
>> triggered until the VIRTIO_NET_CTRL_MAC_TABLE_SET command is exposed by
>> this patch series.
>
> Does this mean this particular change is not supposed to be applied to
> -stable,
> as the other change which exposes the bug isn't in any stable series?

Yes, you are right.

This bug is related to the VIRTIO_NET_CTRL_MAC_TABLE_SET command in SVQ,
and this command is not exposed in SVQ in any stable branch, so we do
not need to apply the patch to the -stable branch.

Thanks!

>
> Thanks,
>
> /mjt
diff mbox series

Patch

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 5a72204899..ecfa8852b5 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -784,9 +784,18 @@  static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
     };
     ssize_t dev_written = -EINVAL;
 
+    /*
+     * This code truncates the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command
+     * and prevents QEMU from marking `mac_table.x_overflow` in the device
+     * internal state in virtio_net_handle_mac() if the guest sets more than
+     * `(vhost_vdpa_net_cvq_cmd_page_len() - sizeof(struct virtio_net_ctrl_hdr)
+     * - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses for
+     * filter table.
+     * However, this situation is considered rare, so it is acceptable.
+     */
     out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0,
                              s->cvq_cmd_out_buffer,
-                             vhost_vdpa_net_cvq_cmd_len());
+                             vhost_vdpa_net_cvq_cmd_page_len());
     if (*(uint8_t *)s->cvq_cmd_out_buffer == VIRTIO_NET_CTRL_ANNOUNCE) {
         /*
          * Guest announce capability is emulated by qemu, so don't forward to