Message ID | 20250211071922.2311873-2-junnan01.wu@samsung.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | vsock suspend/resume fix | expand |
You need to update the title now that you're moving also queued_replies. On Tue, Feb 11, 2025 at 03:19:21PM +0800, Junnan Wu wrote: >When executing suspend to ram twice in a row, >the `rx_buf_nr` and `rx_buf_max_nr` increase to three times vq->num_free. >Then after virtqueue_get_buf and `rx_buf_nr` decreased >in function virtio_transport_rx_work, >the condition to fill rx buffer >(rx_buf_nr < rx_buf_max_nr / 2) will never be met. > >It is because that `rx_buf_nr` and `rx_buf_max_nr` >are initialized only in virtio_vsock_probe(), >but they should be reset whenever virtqueues are recreated, >like after a suspend/resume. > >Move the `rx_buf_nr` and `rx_buf_max_nr` initialization in >virtio_vsock_vqs_init(), so we are sure that they are properly >initialized, every time we initialize the virtqueues, either when we >load the driver or after a suspend/resume. >At the same time, also move `queued_replies`. Why? As I mentioned the commit description should explain why the changes are being made for both reviewers and future references to this patch. The rest LGTM. Stefano > >Fixes: bd50c5dc182b ("vsock/virtio: add support for device suspend/resume") >Co-developed-by: Ying Gao <ying01.gao@samsung.com> >Signed-off-by: Ying Gao <ying01.gao@samsung.com> >Signed-off-by: Junnan Wu <junnan01.wu@samsung.com> >--- > net/vmw_vsock/virtio_transport.c | 10 +++++++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > >diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c >index b58c3818f284..f0e48e6911fc 100644 >--- a/net/vmw_vsock/virtio_transport.c >+++ b/net/vmw_vsock/virtio_transport.c >@@ -670,6 +670,13 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock) > }; > int ret; > >+ mutex_lock(&vsock->rx_lock); >+ vsock->rx_buf_nr = 0; >+ vsock->rx_buf_max_nr = 0; >+ mutex_unlock(&vsock->rx_lock); >+ >+ atomic_set(&vsock->queued_replies, 0); >+ > ret = virtio_find_vqs(vdev, VSOCK_VQ_MAX, vsock->vqs, vqs_info, NULL); > if (ret < 0) > return ret; >@@ -779,9 +786,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev) > > vsock->vdev = vdev; > >- vsock->rx_buf_nr = 0; >- vsock->rx_buf_max_nr = 0; >- atomic_set(&vsock->queued_replies, 0); > > mutex_init(&vsock->tx_lock); > mutex_init(&vsock->rx_lock); >-- >2.34.1 >
>You need to update the title now that you're moving also queued_replies. > Well, I will update the title in V3 version. >On Tue, Feb 11, 2025 at 03:19:21PM +0800, Junnan Wu wrote: >>When executing suspend to ram twice in a row, >>the `rx_buf_nr` and `rx_buf_max_nr` increase to three times vq->num_free. >>Then after virtqueue_get_buf and `rx_buf_nr` decreased >>in function virtio_transport_rx_work, >>the condition to fill rx buffer >>(rx_buf_nr < rx_buf_max_nr / 2) will never be met. >> >>It is because that `rx_buf_nr` and `rx_buf_max_nr` >>are initialized only in virtio_vsock_probe(), >>but they should be reset whenever virtqueues are recreated, >>like after a suspend/resume. >> >>Move the `rx_buf_nr` and `rx_buf_max_nr` initialization in >>virtio_vsock_vqs_init(), so we are sure that they are properly >>initialized, every time we initialize the virtqueues, either when we >>load the driver or after a suspend/resume. >>At the same time, also move `queued_replies`. > >Why? > >As I mentioned the commit description should explain why the changes are >being made for both reviewers and future references to this patch. > After your kindly remind, I have double checked all locations where `queued_replies` used, and we think for order to prevent erroneous atomic load operations on the `queued_replies` in the virtio_transport_send_pkt_work() function which may disrupt the scheduling of vsock->rx_work when transmitting reply-required socket packets, this atomic variable must undergo synchronized initialization alongside the preceding two variables after a suspend/resume. If we reach agreement on it, I will add this description in V3 version. BRs Junnan Wu >The rest LGTM. > >Stefano > >> >>Fixes: bd50c5dc182b ("vsock/virtio: add support for device suspend/resume") >>Co-developed-by: Ying Gao <ying01.gao@samsung.com> >>Signed-off-by: Ying Gao <ying01.gao@samsung.com> >>Signed-off-by: Junnan Wu <junnan01.wu@samsung.com> >>--- >> net/vmw_vsock/virtio_transport.c | 10 +++++++--- >> 1 file changed, 7 insertions(+), 3 deletions(-) >> >>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c >>index b58c3818f284..f0e48e6911fc 100644 >>--- a/net/vmw_vsock/virtio_transport.c >>+++ b/net/vmw_vsock/virtio_transport.c >>@@ -670,6 +670,13 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock) >> }; >> int ret; >> >>+ mutex_lock(&vsock->rx_lock); >>+ vsock->rx_buf_nr = 0; >>+ vsock->rx_buf_max_nr = 0; >>+ mutex_unlock(&vsock->rx_lock); >>+ >>+ atomic_set(&vsock->queued_replies, 0); >>+ >> ret = virtio_find_vqs(vdev, VSOCK_VQ_MAX, vsock->vqs, vqs_info, NULL); >> if (ret < 0) >> return ret; >>@@ -779,9 +786,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev) >> >> vsock->vdev = vdev; >> >>- vsock->rx_buf_nr = 0; >>- vsock->rx_buf_max_nr = 0; >>- atomic_set(&vsock->queued_replies, 0); >> >> mutex_init(&vsock->tx_lock); >> mutex_init(&vsock->rx_lock); >>-- >>2.34.1 >>
On Thu, Feb 13, 2025 at 09:28:05AM +0800, Junnan Wu wrote: >>You need to update the title now that you're moving also queued_replies. >> > >Well, I will update the title in V3 version. > >>On Tue, Feb 11, 2025 at 03:19:21PM +0800, Junnan Wu wrote: >>>When executing suspend to ram twice in a row, >>>the `rx_buf_nr` and `rx_buf_max_nr` increase to three times vq->num_free. >>>Then after virtqueue_get_buf and `rx_buf_nr` decreased >>>in function virtio_transport_rx_work, >>>the condition to fill rx buffer >>>(rx_buf_nr < rx_buf_max_nr / 2) will never be met. >>> >>>It is because that `rx_buf_nr` and `rx_buf_max_nr` >>>are initialized only in virtio_vsock_probe(), >>>but they should be reset whenever virtqueues are recreated, >>>like after a suspend/resume. >>> >>>Move the `rx_buf_nr` and `rx_buf_max_nr` initialization in >>>virtio_vsock_vqs_init(), so we are sure that they are properly >>>initialized, every time we initialize the virtqueues, either when we >>>load the driver or after a suspend/resume. >>>At the same time, also move `queued_replies`. >> >>Why? >> >>As I mentioned the commit description should explain why the changes are >>being made for both reviewers and future references to this patch. >> > >After your kindly remind, I have double checked all locations where `queued_replies` >used, and we think for order to prevent erroneous atomic load operations >on the `queued_replies` in the virtio_transport_send_pkt_work() function >which may disrupt the scheduling of vsock->rx_work >when transmitting reply-required socket packets, >this atomic variable must undergo synchronized initialization >alongside the preceding two variables after a suspend/resume. Yes, that was my concern! > >If we reach agreement on it, I will add this description in V3 version. Yes, please, I just wanted to point out that we need to add an explanation in the commit description. And in the title, in this case though listing all the variables would get too long, so you can do something like that: vsock/virtio: fix variables initialization during resuming Thanks, Stefano > >BRs >Junnan Wu > >>The rest LGTM. >> >>Stefano >> >>> >>>Fixes: bd50c5dc182b ("vsock/virtio: add support for device suspend/resume") >>>Co-developed-by: Ying Gao <ying01.gao@samsung.com> >>>Signed-off-by: Ying Gao <ying01.gao@samsung.com> >>>Signed-off-by: Junnan Wu <junnan01.wu@samsung.com> >>>--- >>> net/vmw_vsock/virtio_transport.c | 10 +++++++--- >>> 1 file changed, 7 insertions(+), 3 deletions(-) >>> >>>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c >>>index b58c3818f284..f0e48e6911fc 100644 >>>--- a/net/vmw_vsock/virtio_transport.c >>>+++ b/net/vmw_vsock/virtio_transport.c >>>@@ -670,6 +670,13 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock) >>> }; >>> int ret; >>> >>>+ mutex_lock(&vsock->rx_lock); >>>+ vsock->rx_buf_nr = 0; >>>+ vsock->rx_buf_max_nr = 0; >>>+ mutex_unlock(&vsock->rx_lock); >>>+ >>>+ atomic_set(&vsock->queued_replies, 0); >>>+ >>> ret = virtio_find_vqs(vdev, VSOCK_VQ_MAX, vsock->vqs, vqs_info, NULL); >>> if (ret < 0) >>> return ret; >>>@@ -779,9 +786,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev) >>> >>> vsock->vdev = vdev; >>> >>>- vsock->rx_buf_nr = 0; >>>- vsock->rx_buf_max_nr = 0; >>>- atomic_set(&vsock->queued_replies, 0); >>> >>> mutex_init(&vsock->tx_lock); >>> mutex_init(&vsock->rx_lock); >>>-- >>>2.34.1 >>> >
On Thu, 13 Feb 2025 at 10:25, Stefano Garzarella <sgarzare@redhat.com> wrote: > > On Thu, Feb 13, 2025 at 09:28:05AM +0800, Junnan Wu wrote: > >>You need to update the title now that you're moving also queued_replies. > >> > > > >Well, I will update the title in V3 version. > > > >>On Tue, Feb 11, 2025 at 03:19:21PM +0800, Junnan Wu wrote: > >>>When executing suspend to ram twice in a row, > >>>the `rx_buf_nr` and `rx_buf_max_nr` increase to three times vq->num_free. > >>>Then after virtqueue_get_buf and `rx_buf_nr` decreased > >>>in function virtio_transport_rx_work, > >>>the condition to fill rx buffer > >>>(rx_buf_nr < rx_buf_max_nr / 2) will never be met. > >>> > >>>It is because that `rx_buf_nr` and `rx_buf_max_nr` > >>>are initialized only in virtio_vsock_probe(), > >>>but they should be reset whenever virtqueues are recreated, > >>>like after a suspend/resume. > >>> > >>>Move the `rx_buf_nr` and `rx_buf_max_nr` initialization in > >>>virtio_vsock_vqs_init(), so we are sure that they are properly > >>>initialized, every time we initialize the virtqueues, either when we > >>>load the driver or after a suspend/resume. > >>>At the same time, also move `queued_replies`. > >> > >>Why? > >> > >>As I mentioned the commit description should explain why the changes are > >>being made for both reviewers and future references to this patch. > >> > > > >After your kindly remind, I have double checked all locations where `queued_replies` > >used, and we think for order to prevent erroneous atomic load operations > >on the `queued_replies` in the virtio_transport_send_pkt_work() function > >which may disrupt the scheduling of vsock->rx_work > >when transmitting reply-required socket packets, > >this atomic variable must undergo synchronized initialization > >alongside the preceding two variables after a suspend/resume. > > Yes, that was my concern! > > > > >If we reach agreement on it, I will add this description in V3 version. > > Yes, please, I just wanted to point out that we need to add an > explanation in the commit description. > > And in the title, in this case though listing all the variables would > get too long, so you can do something like that: > > vsock/virtio: fix variables initialization during resuming I forgot to mention that IMHO it's better to split this series. This first patch (this one) seems ready, without controversy, and it's a real fix, so for me v3 should be a version ready to be merged. While the other patch is more controversial and especially not a fix but more of a new feature, so I don't think it makes sense to continue to have these two patches in a single series. Thanks, Stefano
On Thu, 13 Feb 2025 at 15:47, Stefano Garzarella <sgarzare@redhat.com> wrote: >I forgot to mention that IMHO it's better to split this series. >This first patch (this one) seems ready, without controversy, and it's >a real fix, so for me v3 should be a version ready to be merged. > >While the other patch is more controversial and especially not a fix >but more of a new feature, so I don't think it makes sense to continue >to have these two patches in a single series. > >Thanks, >Stefano Well, I agree with you that these two patches should be splited. And I will send v3 version of the first patch individually. And according to our discussion, the second one can be ignored, until we find a suitable way to deal the scenario I metionded. Thanks, Junnan
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index b58c3818f284..f0e48e6911fc 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -670,6 +670,13 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock) }; int ret; + mutex_lock(&vsock->rx_lock); + vsock->rx_buf_nr = 0; + vsock->rx_buf_max_nr = 0; + mutex_unlock(&vsock->rx_lock); + + atomic_set(&vsock->queued_replies, 0); + ret = virtio_find_vqs(vdev, VSOCK_VQ_MAX, vsock->vqs, vqs_info, NULL); if (ret < 0) return ret; @@ -779,9 +786,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev) vsock->vdev = vdev; - vsock->rx_buf_nr = 0; - vsock->rx_buf_max_nr = 0; - atomic_set(&vsock->queued_replies, 0); mutex_init(&vsock->tx_lock); mutex_init(&vsock->rx_lock);