Message ID | 20230622010651.22698-1-yin31149@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | vdpa: Increase out buffer size for CVQ commands | expand |
On Thu, Jun 22, 2023 at 3:07 AM Hawkins Jiawei <yin31149@gmail.com> wrote: > > According to the VirtIO standard, "Since there are no guarantees, > it can use a hash filter or silently switch to > allmulti or promiscuous mode if it is given too many addresses." > To achive this, QEMU ignores MAC addresses and marks `mac_table.x_overflow` > in the device internal state in virtio_net_handle_mac() > if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses > for the filter table. > > However, the problem is that QEMU never marks the `mac_table.x_overflow` > for the vdpa device internal state when the guest sets more than > `MAC_TABLE_ENTRIES` MAC addresses. > > To be more specific, currently QEMU offers a buffer size of > vhost_vdpa_net_cvq_cmd_len() for CVQ commands, which represents the size of > VIRTIO_NET_CTRL_MAC_TABLE_SET command with a maximum `MAC_TABLE_ENTRIES` > MAC addresses. > > Consequently, if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses, > QEMU truncates the CVQ command data and copies this incomplete command > into the out buffer. In this situation, virtio_net_handle_mac() fails the > integrity check and returns VIRTIO_NET_ERR instead of marking > `mac_table.x_overflow` and returning VIRTIO_NET_OK, since the copied > CVQ command in the buffer is incomplete and flawed. > > This patch solves this problem by increasing the buffer size to > vhost_vdpa_net_cvq_cmd_page_len(), which represents the size of the buffer > that is allocated and mmaped. Therefore, everything should work correctly > as long as the guest sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() - > sizeof(struct virtio_net_ctrl_hdr) > - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses. > > Considering the highly unlikely scenario for the guest setting more than > that number of MAC addresses for the filter table, this patch should > work fine for the majority of cases. If there is a need for more than thoes > entries, we can increase the value for vhost_vdpa_net_cvq_cmd_page_len() > in the future, mapping more than one page for command output. > > Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device") > Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> > --- > net/vhost-vdpa.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c > index 5a72204899..ecfa8852b5 100644 > --- a/net/vhost-vdpa.c > +++ b/net/vhost-vdpa.c > @@ -784,9 +784,18 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq, > }; > ssize_t dev_written = -EINVAL; > > + /* > + * This code truncates the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command > + * and prevents QEMU from marking `mac_table.x_overflow` in the device > + * internal state in virtio_net_handle_mac() if the guest sets more than > + * `(vhost_vdpa_net_cvq_cmd_page_len() - sizeof(struct virtio_net_ctrl_hdr) > + * - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses for > + * filter table. > + * However, this situation is considered rare, so it is acceptable. I think we can also fix this situation. If it fits in one page, we can still send the same page to the device and virtio_net_handle_ctrl_iov. If it does not fit in one page, we can clear all mac filters with VIRTIO_NET_CTRL_RX_ALLUNI and / or VIRTIO_NET_CTRL_RX_ALLMULTI. Thanks! > + */ > out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0, > s->cvq_cmd_out_buffer, > - vhost_vdpa_net_cvq_cmd_len()); > + vhost_vdpa_net_cvq_cmd_page_len()); > if (*(uint8_t *)s->cvq_cmd_out_buffer == VIRTIO_NET_CTRL_ANNOUNCE) { > /* > * Guest announce capability is emulated by qemu, so don't forward to > -- > 2.25.1 >
On 2023/6/25 18:48, Eugenio Perez Martin wrote: > On Thu, Jun 22, 2023 at 3:07 AM Hawkins Jiawei <yin31149@gmail.com> wrote: >> >> According to the VirtIO standard, "Since there are no guarantees, >> it can use a hash filter or silently switch to >> allmulti or promiscuous mode if it is given too many addresses." >> To achive this, QEMU ignores MAC addresses and marks `mac_table.x_overflow` >> in the device internal state in virtio_net_handle_mac() >> if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses >> for the filter table. >> >> However, the problem is that QEMU never marks the `mac_table.x_overflow` >> for the vdpa device internal state when the guest sets more than >> `MAC_TABLE_ENTRIES` MAC addresses. >> >> To be more specific, currently QEMU offers a buffer size of >> vhost_vdpa_net_cvq_cmd_len() for CVQ commands, which represents the size of >> VIRTIO_NET_CTRL_MAC_TABLE_SET command with a maximum `MAC_TABLE_ENTRIES` >> MAC addresses. >> >> Consequently, if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses, >> QEMU truncates the CVQ command data and copies this incomplete command >> into the out buffer. In this situation, virtio_net_handle_mac() fails the >> integrity check and returns VIRTIO_NET_ERR instead of marking >> `mac_table.x_overflow` and returning VIRTIO_NET_OK, since the copied >> CVQ command in the buffer is incomplete and flawed. >> >> This patch solves this problem by increasing the buffer size to >> vhost_vdpa_net_cvq_cmd_page_len(), which represents the size of the buffer >> that is allocated and mmaped. Therefore, everything should work correctly >> as long as the guest sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() - >> sizeof(struct virtio_net_ctrl_hdr) >> - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses. >> >> Considering the highly unlikely scenario for the guest setting more than >> that number of MAC addresses for the filter table, this patch should >> work fine for the majority of cases. If there is a need for more than thoes >> entries, we can increase the value for vhost_vdpa_net_cvq_cmd_page_len() >> in the future, mapping more than one page for command output. >> >> Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device") >> Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> >> --- >> net/vhost-vdpa.c | 11 ++++++++++- >> 1 file changed, 10 insertions(+), 1 deletion(-) >> >> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c >> index 5a72204899..ecfa8852b5 100644 >> --- a/net/vhost-vdpa.c >> +++ b/net/vhost-vdpa.c >> @@ -784,9 +784,18 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq, >> }; >> ssize_t dev_written = -EINVAL; >> >> + /* >> + * This code truncates the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command >> + * and prevents QEMU from marking `mac_table.x_overflow` in the device >> + * internal state in virtio_net_handle_mac() if the guest sets more than >> + * `(vhost_vdpa_net_cvq_cmd_page_len() - sizeof(struct virtio_net_ctrl_hdr) >> + * - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses for >> + * filter table. >> + * However, this situation is considered rare, so it is acceptable. > > I think we can also fix this situation. If it fits in one page, we can > still send the same page to the device and virtio_net_handle_ctrl_iov. > If it does not fit in one page, we can clear all mac filters with > VIRTIO_NET_CTRL_RX_ALLUNI and / or VIRTIO_NET_CTRL_RX_ALLMULTI. Hi Eugenio, Thank you for your review. However, it appears that the approach may not resolve the situation. When the CVQ command exceeds one page, vhost_vdpa_net_handle_ctrl_avail() truncates the command, resulting in a malformed SVQ command being received by the hardware, which in turn triggers an error acknowledgement to QEMU. So if CVQ command exceeds one page, vhost_vdpa_net_handle_ctrl_avail() should not update the device internal state because the SVQ command fails.(Please correct me if I am wrong) It appears that my commit message and comments did not take this into account. I will refactor them in the v2 patch.. Thanks! > > Thanks! > >> + */ >> out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0, >> s->cvq_cmd_out_buffer, >> - vhost_vdpa_net_cvq_cmd_len()); >> + vhost_vdpa_net_cvq_cmd_page_len()); >> if (*(uint8_t *)s->cvq_cmd_out_buffer == VIRTIO_NET_CTRL_ANNOUNCE) { >> /* >> * Guest announce capability is emulated by qemu, so don't forward to >> -- >> 2.25.1 >> >
On Mon, Jun 26, 2023 at 10:26 AM Hawkins Jiawei <yin31149@gmail.com> wrote: > > On 2023/6/25 18:48, Eugenio Perez Martin wrote: > > On Thu, Jun 22, 2023 at 3:07 AM Hawkins Jiawei <yin31149@gmail.com> wrote: > >> > >> According to the VirtIO standard, "Since there are no guarantees, > >> it can use a hash filter or silently switch to > >> allmulti or promiscuous mode if it is given too many addresses." > >> To achive this, QEMU ignores MAC addresses and marks `mac_table.x_overflow` > >> in the device internal state in virtio_net_handle_mac() > >> if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses > >> for the filter table. > >> > >> However, the problem is that QEMU never marks the `mac_table.x_overflow` > >> for the vdpa device internal state when the guest sets more than > >> `MAC_TABLE_ENTRIES` MAC addresses. > >> > >> To be more specific, currently QEMU offers a buffer size of > >> vhost_vdpa_net_cvq_cmd_len() for CVQ commands, which represents the size of > >> VIRTIO_NET_CTRL_MAC_TABLE_SET command with a maximum `MAC_TABLE_ENTRIES` > >> MAC addresses. > >> > >> Consequently, if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses, > >> QEMU truncates the CVQ command data and copies this incomplete command > >> into the out buffer. In this situation, virtio_net_handle_mac() fails the > >> integrity check and returns VIRTIO_NET_ERR instead of marking > >> `mac_table.x_overflow` and returning VIRTIO_NET_OK, since the copied > >> CVQ command in the buffer is incomplete and flawed. > >> > >> This patch solves this problem by increasing the buffer size to > >> vhost_vdpa_net_cvq_cmd_page_len(), which represents the size of the buffer > >> that is allocated and mmaped. Therefore, everything should work correctly > >> as long as the guest sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() - > >> sizeof(struct virtio_net_ctrl_hdr) > >> - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses. > >> > >> Considering the highly unlikely scenario for the guest setting more than > >> that number of MAC addresses for the filter table, this patch should > >> work fine for the majority of cases. If there is a need for more than thoes > >> entries, we can increase the value for vhost_vdpa_net_cvq_cmd_page_len() > >> in the future, mapping more than one page for command output. > >> > >> Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device") > >> Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> > >> --- > >> net/vhost-vdpa.c | 11 ++++++++++- > >> 1 file changed, 10 insertions(+), 1 deletion(-) > >> > >> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c > >> index 5a72204899..ecfa8852b5 100644 > >> --- a/net/vhost-vdpa.c > >> +++ b/net/vhost-vdpa.c > >> @@ -784,9 +784,18 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq, > >> }; > >> ssize_t dev_written = -EINVAL; > >> > >> + /* > >> + * This code truncates the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command > >> + * and prevents QEMU from marking `mac_table.x_overflow` in the device > >> + * internal state in virtio_net_handle_mac() if the guest sets more than > >> + * `(vhost_vdpa_net_cvq_cmd_page_len() - sizeof(struct virtio_net_ctrl_hdr) > >> + * - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses for > >> + * filter table. > >> + * However, this situation is considered rare, so it is acceptable. > > > > I think we can also fix this situation. If it fits in one page, we can > > still send the same page to the device and virtio_net_handle_ctrl_iov. > > If it does not fit in one page, we can clear all mac filters with > > VIRTIO_NET_CTRL_RX_ALLUNI and / or VIRTIO_NET_CTRL_RX_ALLMULTI. > > Hi Eugenio, > > Thank you for your review. > > However, it appears that the approach may not resolve the situation. > > When the CVQ command exceeds one page, > vhost_vdpa_net_handle_ctrl_avail() truncates the command, resulting in a > malformed SVQ command being received by the hardware, which in turn > triggers an error acknowledgement to QEMU. > If that happens we can sanitize the copied cmd buffer. Let's start with an overflowed unicast table first. To configure the device we need to transform the command to VIRTIO_NET_CTRL_RX_ALLUNI, as we cannot copy all the table to the out cmd page. If that succeeds, we can continue to register that in the VirtIONet struct. We can copy the first MAC_TABLE_ENTRIES + 1 entries and set entries = (MAC_TABLE_ENTRIES + 1), and then use that buffer to call virtio_net_handle_ctrl_iov. That sets VirtIONet.uni_overflow = true and VirtIONet.all_uni = false. We need to handle the multicast addresses in a similar way, but we can find cases where only multicast addresses overflow. In future series we can improve the situation: * Allocating bigger out buffers so more mac addresses fit in it. * Mapping also guest pages in CVQ, so the real device is able to read the full table but VirtIONet only stores the first MAC_TABLE_ENTRIES or .uni_overflow = 1. But I think it should be enough at this point. Thanks! > So if CVQ command exceeds one page, vhost_vdpa_net_handle_ctrl_avail() > should not update the device internal state because the SVQ command > fails.(Please correct me if I am wrong) > > It appears that my commit message and comments did not take this into > account. I will refactor them in the v2 patch.. > > Thanks! > > > > > > Thanks! > > > >> + */ > >> out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0, > >> s->cvq_cmd_out_buffer, > >> - vhost_vdpa_net_cvq_cmd_len()); > >> + vhost_vdpa_net_cvq_cmd_page_len()); > >> if (*(uint8_t *)s->cvq_cmd_out_buffer == VIRTIO_NET_CTRL_ANNOUNCE) { > >> /* > >> * Guest announce capability is emulated by qemu, so don't forward to > >> -- > >> 2.25.1 > >> > > >
On 2023/6/26 17:08, Eugenio Perez Martin wrote: > On Mon, Jun 26, 2023 at 10:26 AM Hawkins Jiawei <yin31149@gmail.com> wrote: >> >> On 2023/6/25 18:48, Eugenio Perez Martin wrote: >>> On Thu, Jun 22, 2023 at 3:07 AM Hawkins Jiawei <yin31149@gmail.com> wrote: >>>> >>>> According to the VirtIO standard, "Since there are no guarantees, >>>> it can use a hash filter or silently switch to >>>> allmulti or promiscuous mode if it is given too many addresses." >>>> To achive this, QEMU ignores MAC addresses and marks `mac_table.x_overflow` >>>> in the device internal state in virtio_net_handle_mac() >>>> if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses >>>> for the filter table. >>>> >>>> However, the problem is that QEMU never marks the `mac_table.x_overflow` >>>> for the vdpa device internal state when the guest sets more than >>>> `MAC_TABLE_ENTRIES` MAC addresses. >>>> >>>> To be more specific, currently QEMU offers a buffer size of >>>> vhost_vdpa_net_cvq_cmd_len() for CVQ commands, which represents the size of >>>> VIRTIO_NET_CTRL_MAC_TABLE_SET command with a maximum `MAC_TABLE_ENTRIES` >>>> MAC addresses. >>>> >>>> Consequently, if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses, >>>> QEMU truncates the CVQ command data and copies this incomplete command >>>> into the out buffer. In this situation, virtio_net_handle_mac() fails the >>>> integrity check and returns VIRTIO_NET_ERR instead of marking >>>> `mac_table.x_overflow` and returning VIRTIO_NET_OK, since the copied >>>> CVQ command in the buffer is incomplete and flawed. >>>> >>>> This patch solves this problem by increasing the buffer size to >>>> vhost_vdpa_net_cvq_cmd_page_len(), which represents the size of the buffer >>>> that is allocated and mmaped. Therefore, everything should work correctly >>>> as long as the guest sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() - >>>> sizeof(struct virtio_net_ctrl_hdr) >>>> - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses. >>>> >>>> Considering the highly unlikely scenario for the guest setting more than >>>> that number of MAC addresses for the filter table, this patch should >>>> work fine for the majority of cases. If there is a need for more than thoes >>>> entries, we can increase the value for vhost_vdpa_net_cvq_cmd_page_len() >>>> in the future, mapping more than one page for command output. >>>> >>>> Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device") >>>> Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> >>>> --- >>>> net/vhost-vdpa.c | 11 ++++++++++- >>>> 1 file changed, 10 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c >>>> index 5a72204899..ecfa8852b5 100644 >>>> --- a/net/vhost-vdpa.c >>>> +++ b/net/vhost-vdpa.c >>>> @@ -784,9 +784,18 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq, >>>> }; >>>> ssize_t dev_written = -EINVAL; >>>> >>>> + /* >>>> + * This code truncates the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command >>>> + * and prevents QEMU from marking `mac_table.x_overflow` in the device >>>> + * internal state in virtio_net_handle_mac() if the guest sets more than >>>> + * `(vhost_vdpa_net_cvq_cmd_page_len() - sizeof(struct virtio_net_ctrl_hdr) >>>> + * - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses for >>>> + * filter table. >>>> + * However, this situation is considered rare, so it is acceptable. >>> >>> I think we can also fix this situation. If it fits in one page, we can >>> still send the same page to the device and virtio_net_handle_ctrl_iov. >>> If it does not fit in one page, we can clear all mac filters with >>> VIRTIO_NET_CTRL_RX_ALLUNI and / or VIRTIO_NET_CTRL_RX_ALLMULTI. >> >> Hi Eugenio, >> >> Thank you for your review. >> >> However, it appears that the approach may not resolve the situation. >> >> When the CVQ command exceeds one page, >> vhost_vdpa_net_handle_ctrl_avail() truncates the command, resulting in a >> malformed SVQ command being received by the hardware, which in turn >> triggers an error acknowledgement to QEMU. >> > > If that happens we can sanitize the copied cmd buffer. Let's start > with an overflowed unicast table first. > > To configure the device we need to transform the command to > VIRTIO_NET_CTRL_RX_ALLUNI, as we cannot copy all the table to the out > cmd page. If that succeeds, we can continue to register that in the > VirtIONet struct. > > We can copy the first MAC_TABLE_ENTRIES + 1 entries and set entries = > (MAC_TABLE_ENTRIES + 1), and then use that buffer to call > virtio_net_handle_ctrl_iov. That sets VirtIONet.uni_overflow = true > and VirtIONet.all_uni = false. > > We need to handle the multicast addresses in a similar way, but we can > find cases where only multicast addresses overflow. > > In future series we can improve the situation: > * Allocating bigger out buffers so more mac addresses fit in it. > * Mapping also guest pages in CVQ, so the real device is able to read > the full table but VirtIONet only stores the first MAC_TABLE_ENTRIES > or .uni_overflow = 1. > > But I think it should be enough at this point. Yes, I think this is a good method to update the device internal state to align with the VirtIO standard when the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command does not fit in one page. But it seems that we should update the device internal state only when the hardware successfully receives this CVQ command. Once the hardware set the VIRTIO_NET_ERR to this CVQ command, we should not change the device internal state. To be more specific, there are two scenarios to consider: - If the CVQ command fits within a single page, the function vhost_vdpa_net_handle_ctrl_avail() can offer a buffer with the fully copied CVQ command. In this case, everything should work smoothly. - If the CVQ command does not fit within a single page, the function vhost_vdpa_net_handle_ctrl_avail() can only provide a buffer with the truncated CVQ command. When this buffer is sent to the vpda device, the device fails to accept this CVQ command and keeps its state unchanged due to the malformed CVQ. As a result, the device sets the ack VIRTIO_NET_ERR. Consequently, vhost_vdpa_net_handle_ctrl_avail() should immediately return and keep the device internal state unchanged because the vdpa device returns VIRTIO_NET_ERR, indicating that the hardware did not modify the state in this CVQ command. Therefore, it appears that whenever the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command does not fit in one page, we should not update the device internal state, because vdpa hardware always fails to accept this CVQ command due to truncation. Thanks! > > Thanks! > >> So if CVQ command exceeds one page, vhost_vdpa_net_handle_ctrl_avail() >> should not update the device internal state because the SVQ command >> fails.(Please correct me if I am wrong) >> >> It appears that my commit message and comments did not take this into >> account. I will refactor them in the v2 patch.. >> >> Thanks! >> >> >>> >>> Thanks! >>> >>>> + */ >>>> out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0, >>>> s->cvq_cmd_out_buffer, >>>> - vhost_vdpa_net_cvq_cmd_len()); >>>> + vhost_vdpa_net_cvq_cmd_page_len()); >>>> if (*(uint8_t *)s->cvq_cmd_out_buffer == VIRTIO_NET_CTRL_ANNOUNCE) { >>>> /* >>>> * Guest announce capability is emulated by qemu, so don't forward to >>>> -- >>>> 2.25.1 >>>> >>> >> >
On Mon, Jun 26, 2023 at 04:26:04PM +0800, Hawkins Jiawei wrote: > It appears that my commit message and comments did not take this into > account. I will refactor them in the v2 patch.. does not look like you ever sent v2.
On 2023/7/11 2:52, Michael S. Tsirkin wrote: > On Mon, Jun 26, 2023 at 04:26:04PM +0800, Hawkins Jiawei wrote: >> It appears that my commit message and comments did not take this into >> account. I will refactor them in the v2 patch.. > > does not look like you ever sent v2. > Sorry for not mentioning that I have moved the patch to the patch series titled "Vhost-vdpa Shadow Virtqueue _F_CTRL_RX commands support" at [1]. The reason for this move is that the bug in question should not be triggered until the VIRTIO_NET_CTRL_MAC_TABLE_SET command is exposed by this patch series. I will take care of this in my future patch series. [1].https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg01577.html Thanks!
11.07.2023 04:48, Hawkins Jiawei wrote: .. > Sorry for not mentioning that I have moved the patch to the patch series > titled "Vhost-vdpa Shadow Virtqueue _F_CTRL_RX commands support" at [1]. > The reason for this move is that the bug in question should not be > triggered until the VIRTIO_NET_CTRL_MAC_TABLE_SET command is exposed by > this patch series. Does this mean this particular change is not supposed to be applied to -stable, as the other change which exposes the bug isn't in any stable series? Thanks, /mjt
在 2023/7/12 18:45, Michael Tokarev 写道: > 11.07.2023 04:48, Hawkins Jiawei wrote: > .. >> Sorry for not mentioning that I have moved the patch to the patch series >> titled "Vhost-vdpa Shadow Virtqueue _F_CTRL_RX commands support" at [1]. >> The reason for this move is that the bug in question should not be >> triggered until the VIRTIO_NET_CTRL_MAC_TABLE_SET command is exposed by >> this patch series. > > Does this mean this particular change is not supposed to be applied to > -stable, > as the other change which exposes the bug isn't in any stable series? Yes, you are right. This bug is related to the VIRTIO_NET_CTRL_MAC_TABLE_SET command in SVQ, and this command is not exposed in SVQ in any stable branch, so we do not need to apply the patch to the -stable branch. Thanks! > > Thanks, > > /mjt
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c index 5a72204899..ecfa8852b5 100644 --- a/net/vhost-vdpa.c +++ b/net/vhost-vdpa.c @@ -784,9 +784,18 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq, }; ssize_t dev_written = -EINVAL; + /* + * This code truncates the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command + * and prevents QEMU from marking `mac_table.x_overflow` in the device + * internal state in virtio_net_handle_mac() if the guest sets more than + * `(vhost_vdpa_net_cvq_cmd_page_len() - sizeof(struct virtio_net_ctrl_hdr) + * - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses for + * filter table. + * However, this situation is considered rare, so it is acceptable. + */ out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0, s->cvq_cmd_out_buffer, - vhost_vdpa_net_cvq_cmd_len()); + vhost_vdpa_net_cvq_cmd_page_len()); if (*(uint8_t *)s->cvq_cmd_out_buffer == VIRTIO_NET_CTRL_ANNOUNCE) { /* * Guest announce capability is emulated by qemu, so don't forward to
According to the VirtIO standard, "Since there are no guarantees, it can use a hash filter or silently switch to allmulti or promiscuous mode if it is given too many addresses." To achive this, QEMU ignores MAC addresses and marks `mac_table.x_overflow` in the device internal state in virtio_net_handle_mac() if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses for the filter table. However, the problem is that QEMU never marks the `mac_table.x_overflow` for the vdpa device internal state when the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses. To be more specific, currently QEMU offers a buffer size of vhost_vdpa_net_cvq_cmd_len() for CVQ commands, which represents the size of VIRTIO_NET_CTRL_MAC_TABLE_SET command with a maximum `MAC_TABLE_ENTRIES` MAC addresses. Consequently, if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses, QEMU truncates the CVQ command data and copies this incomplete command into the out buffer. In this situation, virtio_net_handle_mac() fails the integrity check and returns VIRTIO_NET_ERR instead of marking `mac_table.x_overflow` and returning VIRTIO_NET_OK, since the copied CVQ command in the buffer is incomplete and flawed. This patch solves this problem by increasing the buffer size to vhost_vdpa_net_cvq_cmd_page_len(), which represents the size of the buffer that is allocated and mmaped. Therefore, everything should work correctly as long as the guest sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() - sizeof(struct virtio_net_ctrl_hdr) - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses. Considering the highly unlikely scenario for the guest setting more than that number of MAC addresses for the filter table, this patch should work fine for the majority of cases. If there is a need for more than thoes entries, we can increase the value for vhost_vdpa_net_cvq_cmd_page_len() in the future, mapping more than one page for command output. Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device") Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> --- net/vhost-vdpa.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)