diff mbox series

[net,1/2] vsock/virtio: initialize rx_buf_nr and rx_buf_max_nr when resuming

Message ID 20250211071922.2311873-2-junnan01.wu@samsung.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series vsock suspend/resume fix | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 13 of 13 maintainers
netdev/build_clang success Errors and warnings before: 2 this patch: 2
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 22 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-02-12--06-00 (tests: 889)

Commit Message

Junnan Wu Feb. 11, 2025, 7:19 a.m. UTC
When executing suspend to ram twice in a row,
the `rx_buf_nr` and `rx_buf_max_nr` increase to three times vq->num_free.
Then after virtqueue_get_buf and `rx_buf_nr` decreased
in function virtio_transport_rx_work,
the condition to fill rx buffer
(rx_buf_nr < rx_buf_max_nr / 2) will never be met.

It is because that `rx_buf_nr` and `rx_buf_max_nr`
are initialized only in virtio_vsock_probe(),
but they should be reset whenever virtqueues are recreated,
like after a suspend/resume.

Move the `rx_buf_nr` and `rx_buf_max_nr` initialization in
virtio_vsock_vqs_init(), so we are sure that they are properly
initialized, every time we initialize the virtqueues, either when we
load the driver or after a suspend/resume.
At the same time, also move `queued_replies`.

Fixes: bd50c5dc182b ("vsock/virtio: add support for device suspend/resume")
Co-developed-by: Ying Gao <ying01.gao@samsung.com>
Signed-off-by: Ying Gao <ying01.gao@samsung.com>
Signed-off-by: Junnan Wu <junnan01.wu@samsung.com>
---
 net/vmw_vsock/virtio_transport.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Comments

Stefano Garzarella Feb. 11, 2025, 8:57 a.m. UTC | #1
You need to update the title now that you're moving also queued_replies.

On Tue, Feb 11, 2025 at 03:19:21PM +0800, Junnan Wu wrote:
>When executing suspend to ram twice in a row,
>the `rx_buf_nr` and `rx_buf_max_nr` increase to three times vq->num_free.
>Then after virtqueue_get_buf and `rx_buf_nr` decreased
>in function virtio_transport_rx_work,
>the condition to fill rx buffer
>(rx_buf_nr < rx_buf_max_nr / 2) will never be met.
>
>It is because that `rx_buf_nr` and `rx_buf_max_nr`
>are initialized only in virtio_vsock_probe(),
>but they should be reset whenever virtqueues are recreated,
>like after a suspend/resume.
>
>Move the `rx_buf_nr` and `rx_buf_max_nr` initialization in
>virtio_vsock_vqs_init(), so we are sure that they are properly
>initialized, every time we initialize the virtqueues, either when we
>load the driver or after a suspend/resume.
>At the same time, also move `queued_replies`.

Why?

As I mentioned the commit description should explain why the changes are 
being made for both reviewers and future references to this patch.

The rest LGTM.

Stefano

>
>Fixes: bd50c5dc182b ("vsock/virtio: add support for device suspend/resume")
>Co-developed-by: Ying Gao <ying01.gao@samsung.com>
>Signed-off-by: Ying Gao <ying01.gao@samsung.com>
>Signed-off-by: Junnan Wu <junnan01.wu@samsung.com>
>---
> net/vmw_vsock/virtio_transport.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>index b58c3818f284..f0e48e6911fc 100644
>--- a/net/vmw_vsock/virtio_transport.c
>+++ b/net/vmw_vsock/virtio_transport.c
>@@ -670,6 +670,13 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock)
> 	};
> 	int ret;
>
>+	mutex_lock(&vsock->rx_lock);
>+	vsock->rx_buf_nr = 0;
>+	vsock->rx_buf_max_nr = 0;
>+	mutex_unlock(&vsock->rx_lock);
>+
>+	atomic_set(&vsock->queued_replies, 0);
>+
> 	ret = virtio_find_vqs(vdev, VSOCK_VQ_MAX, vsock->vqs, vqs_info, NULL);
> 	if (ret < 0)
> 		return ret;
>@@ -779,9 +786,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
>
> 	vsock->vdev = vdev;
>
>-	vsock->rx_buf_nr = 0;
>-	vsock->rx_buf_max_nr = 0;
>-	atomic_set(&vsock->queued_replies, 0);
>
> 	mutex_init(&vsock->tx_lock);
> 	mutex_init(&vsock->rx_lock);
>-- 
>2.34.1
>
Junnan Wu Feb. 13, 2025, 1:28 a.m. UTC | #2
>You need to update the title now that you're moving also queued_replies.
>

Well, I will update the title in V3 version.

>On Tue, Feb 11, 2025 at 03:19:21PM +0800, Junnan Wu wrote:
>>When executing suspend to ram twice in a row,
>>the `rx_buf_nr` and `rx_buf_max_nr` increase to three times vq->num_free.
>>Then after virtqueue_get_buf and `rx_buf_nr` decreased
>>in function virtio_transport_rx_work,
>>the condition to fill rx buffer
>>(rx_buf_nr < rx_buf_max_nr / 2) will never be met.
>>
>>It is because that `rx_buf_nr` and `rx_buf_max_nr`
>>are initialized only in virtio_vsock_probe(),
>>but they should be reset whenever virtqueues are recreated,
>>like after a suspend/resume.
>>
>>Move the `rx_buf_nr` and `rx_buf_max_nr` initialization in
>>virtio_vsock_vqs_init(), so we are sure that they are properly
>>initialized, every time we initialize the virtqueues, either when we
>>load the driver or after a suspend/resume.
>>At the same time, also move `queued_replies`.
>
>Why?
>
>As I mentioned the commit description should explain why the changes are 
>being made for both reviewers and future references to this patch.
>

After your kindly remind, I have double checked all locations where `queued_replies`
used, and we think for order to prevent erroneous atomic load operations 
on the `queued_replies` in the virtio_transport_send_pkt_work() function
which may disrupt the scheduling of vsock->rx_work
when transmitting reply-required socket packets,
this atomic variable must undergo synchronized initialization
alongside the preceding two variables after a suspend/resume.

If we reach agreement on it, I will add this description in V3 version.

BRs
Junnan Wu

>The rest LGTM.
>
>Stefano
>
>>
>>Fixes: bd50c5dc182b ("vsock/virtio: add support for device suspend/resume")
>>Co-developed-by: Ying Gao <ying01.gao@samsung.com>
>>Signed-off-by: Ying Gao <ying01.gao@samsung.com>
>>Signed-off-by: Junnan Wu <junnan01.wu@samsung.com>
>>---
>> net/vmw_vsock/virtio_transport.c | 10 +++++++---
>> 1 file changed, 7 insertions(+), 3 deletions(-)
>>
>>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>>index b58c3818f284..f0e48e6911fc 100644
>>--- a/net/vmw_vsock/virtio_transport.c
>>+++ b/net/vmw_vsock/virtio_transport.c
>>@@ -670,6 +670,13 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock)
>> 	};
>> 	int ret;
>>
>>+	mutex_lock(&vsock->rx_lock);
>>+	vsock->rx_buf_nr = 0;
>>+	vsock->rx_buf_max_nr = 0;
>>+	mutex_unlock(&vsock->rx_lock);
>>+
>>+	atomic_set(&vsock->queued_replies, 0);
>>+
>> 	ret = virtio_find_vqs(vdev, VSOCK_VQ_MAX, vsock->vqs, vqs_info, NULL);
>> 	if (ret < 0)
>> 		return ret;
>>@@ -779,9 +786,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
>>
>> 	vsock->vdev = vdev;
>>
>>-	vsock->rx_buf_nr = 0;
>>-	vsock->rx_buf_max_nr = 0;
>>-	atomic_set(&vsock->queued_replies, 0);
>>
>> 	mutex_init(&vsock->tx_lock);
>> 	mutex_init(&vsock->rx_lock);
>>-- 
>>2.34.1
>>
Stefano Garzarella Feb. 13, 2025, 9:25 a.m. UTC | #3
On Thu, Feb 13, 2025 at 09:28:05AM +0800, Junnan Wu wrote:
>>You need to update the title now that you're moving also queued_replies.
>>
>
>Well, I will update the title in V3 version.
>
>>On Tue, Feb 11, 2025 at 03:19:21PM +0800, Junnan Wu wrote:
>>>When executing suspend to ram twice in a row,
>>>the `rx_buf_nr` and `rx_buf_max_nr` increase to three times vq->num_free.
>>>Then after virtqueue_get_buf and `rx_buf_nr` decreased
>>>in function virtio_transport_rx_work,
>>>the condition to fill rx buffer
>>>(rx_buf_nr < rx_buf_max_nr / 2) will never be met.
>>>
>>>It is because that `rx_buf_nr` and `rx_buf_max_nr`
>>>are initialized only in virtio_vsock_probe(),
>>>but they should be reset whenever virtqueues are recreated,
>>>like after a suspend/resume.
>>>
>>>Move the `rx_buf_nr` and `rx_buf_max_nr` initialization in
>>>virtio_vsock_vqs_init(), so we are sure that they are properly
>>>initialized, every time we initialize the virtqueues, either when we
>>>load the driver or after a suspend/resume.
>>>At the same time, also move `queued_replies`.
>>
>>Why?
>>
>>As I mentioned the commit description should explain why the changes are
>>being made for both reviewers and future references to this patch.
>>
>
>After your kindly remind, I have double checked all locations where `queued_replies`
>used, and we think for order to prevent erroneous atomic load operations
>on the `queued_replies` in the virtio_transport_send_pkt_work() function
>which may disrupt the scheduling of vsock->rx_work
>when transmitting reply-required socket packets,
>this atomic variable must undergo synchronized initialization
>alongside the preceding two variables after a suspend/resume.

Yes, that was my concern!

>
>If we reach agreement on it, I will add this description in V3 version.

Yes, please, I just wanted to point out that we need to add an 
explanation in the commit description.

And in the title, in this case though listing all the variables would 
get too long, so you can do something like that:

     vsock/virtio: fix variables initialization during resuming

Thanks,
Stefano

>
>BRs
>Junnan Wu
>
>>The rest LGTM.
>>
>>Stefano
>>
>>>
>>>Fixes: bd50c5dc182b ("vsock/virtio: add support for device suspend/resume")
>>>Co-developed-by: Ying Gao <ying01.gao@samsung.com>
>>>Signed-off-by: Ying Gao <ying01.gao@samsung.com>
>>>Signed-off-by: Junnan Wu <junnan01.wu@samsung.com>
>>>---
>>> net/vmw_vsock/virtio_transport.c | 10 +++++++---
>>> 1 file changed, 7 insertions(+), 3 deletions(-)
>>>
>>>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>>>index b58c3818f284..f0e48e6911fc 100644
>>>--- a/net/vmw_vsock/virtio_transport.c
>>>+++ b/net/vmw_vsock/virtio_transport.c
>>>@@ -670,6 +670,13 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock)
>>> 	};
>>> 	int ret;
>>>
>>>+	mutex_lock(&vsock->rx_lock);
>>>+	vsock->rx_buf_nr = 0;
>>>+	vsock->rx_buf_max_nr = 0;
>>>+	mutex_unlock(&vsock->rx_lock);
>>>+
>>>+	atomic_set(&vsock->queued_replies, 0);
>>>+
>>> 	ret = virtio_find_vqs(vdev, VSOCK_VQ_MAX, vsock->vqs, vqs_info, NULL);
>>> 	if (ret < 0)
>>> 		return ret;
>>>@@ -779,9 +786,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
>>>
>>> 	vsock->vdev = vdev;
>>>
>>>-	vsock->rx_buf_nr = 0;
>>>-	vsock->rx_buf_max_nr = 0;
>>>-	atomic_set(&vsock->queued_replies, 0);
>>>
>>> 	mutex_init(&vsock->tx_lock);
>>> 	mutex_init(&vsock->rx_lock);
>>>--
>>>2.34.1
>>>
>
Stefano Garzarella Feb. 13, 2025, 2:47 p.m. UTC | #4
On Thu, 13 Feb 2025 at 10:25, Stefano Garzarella <sgarzare@redhat.com> wrote:
>
> On Thu, Feb 13, 2025 at 09:28:05AM +0800, Junnan Wu wrote:
> >>You need to update the title now that you're moving also queued_replies.
> >>
> >
> >Well, I will update the title in V3 version.
> >
> >>On Tue, Feb 11, 2025 at 03:19:21PM +0800, Junnan Wu wrote:
> >>>When executing suspend to ram twice in a row,
> >>>the `rx_buf_nr` and `rx_buf_max_nr` increase to three times vq->num_free.
> >>>Then after virtqueue_get_buf and `rx_buf_nr` decreased
> >>>in function virtio_transport_rx_work,
> >>>the condition to fill rx buffer
> >>>(rx_buf_nr < rx_buf_max_nr / 2) will never be met.
> >>>
> >>>It is because that `rx_buf_nr` and `rx_buf_max_nr`
> >>>are initialized only in virtio_vsock_probe(),
> >>>but they should be reset whenever virtqueues are recreated,
> >>>like after a suspend/resume.
> >>>
> >>>Move the `rx_buf_nr` and `rx_buf_max_nr` initialization in
> >>>virtio_vsock_vqs_init(), so we are sure that they are properly
> >>>initialized, every time we initialize the virtqueues, either when we
> >>>load the driver or after a suspend/resume.
> >>>At the same time, also move `queued_replies`.
> >>
> >>Why?
> >>
> >>As I mentioned the commit description should explain why the changes are
> >>being made for both reviewers and future references to this patch.
> >>
> >
> >After your kindly remind, I have double checked all locations where `queued_replies`
> >used, and we think for order to prevent erroneous atomic load operations
> >on the `queued_replies` in the virtio_transport_send_pkt_work() function
> >which may disrupt the scheduling of vsock->rx_work
> >when transmitting reply-required socket packets,
> >this atomic variable must undergo synchronized initialization
> >alongside the preceding two variables after a suspend/resume.
>
> Yes, that was my concern!
>
> >
> >If we reach agreement on it, I will add this description in V3 version.
>
> Yes, please, I just wanted to point out that we need to add an
> explanation in the commit description.
>
> And in the title, in this case though listing all the variables would
> get too long, so you can do something like that:
>
>      vsock/virtio: fix variables initialization during resuming

I forgot to mention that IMHO it's better to split this series.
This first patch (this one) seems ready, without controversy, and it's
a real fix, so for me v3 should be a version ready to be merged.

While the other patch is more controversial and especially not a fix
but more of a new feature, so I don't think it makes sense to continue
to have these two patches in a single series.

Thanks,
Stefano
Junnan Wu Feb. 14, 2025, 1:02 a.m. UTC | #5
On Thu, 13 Feb 2025 at 15:47, Stefano Garzarella <sgarzare@redhat.com> wrote:
>I forgot to mention that IMHO it's better to split this series.
>This first patch (this one) seems ready, without controversy, and it's
>a real fix, so for me v3 should be a version ready to be merged.
>
>While the other patch is more controversial and especially not a fix
>but more of a new feature, so I don't think it makes sense to continue
>to have these two patches in a single series.
>
>Thanks,
>Stefano

Well, I agree with you that these two patches should be splited.
And I will send v3 version of the first patch individually.

And according to our discussion, the second one can be ignored, until
we find a suitable way to deal the scenario I metionded.

Thanks,
Junnan
diff mbox series

Patch

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index b58c3818f284..f0e48e6911fc 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -670,6 +670,13 @@  static int virtio_vsock_vqs_init(struct virtio_vsock *vsock)
 	};
 	int ret;
 
+	mutex_lock(&vsock->rx_lock);
+	vsock->rx_buf_nr = 0;
+	vsock->rx_buf_max_nr = 0;
+	mutex_unlock(&vsock->rx_lock);
+
+	atomic_set(&vsock->queued_replies, 0);
+
 	ret = virtio_find_vqs(vdev, VSOCK_VQ_MAX, vsock->vqs, vqs_info, NULL);
 	if (ret < 0)
 		return ret;
@@ -779,9 +786,6 @@  static int virtio_vsock_probe(struct virtio_device *vdev)
 
 	vsock->vdev = vdev;
 
-	vsock->rx_buf_nr = 0;
-	vsock->rx_buf_max_nr = 0;
-	atomic_set(&vsock->queued_replies, 0);
 
 	mutex_init(&vsock->tx_lock);
 	mutex_init(&vsock->rx_lock);