[net] vsock: improve tap delivery accuracy

Message ID	20230502174404.668749-1-xiyou.wangcong@gmail.com (mailing list archive)
State	Changes Requested
Delegated to:	Netdev Maintainers
Headers	Return-Path: <netdev-owner@vger.kernel.org>
	From: Cong Wang <xiyou.wangcong@gmail.com>
	To: netdev@vger.kernel.org
	Cc: virtualization@lists.linux-foundation.org, kvm@vger.kernel.org, Cong Wang <cong.wang@bytedance.com>, Stefan Hajnoczi <stefanha@redhat.com>, Stefano Garzarella <sgarzare@redhat.com>, Bobby Eshleman <bobby.eshleman@bytedance.com>
	Subject: [Patch net] vsock: improve tap delivery accuracy
	Date: Tue, 2 May 2023 10:44:04 -0700
	Message-Id: <20230502174404.668749-1-xiyou.wangcong@gmail.com>
	MIME-Version: 1.0
	Content-Transfer-Encoding: 8bit
	Precedence: bulk
Series	[net] vsock: improve tap delivery accuracy

Context	Check	Description
netdev/series_format	success	Single patches do not need cover letters
netdev/tree_selection	success	Clearly marked for net
netdev/fixes_present	success	Fixes tag present in non-next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers	fail	3 blamed authors not CCed: ggarcia@deic.uab.cat jhansen@vmware.com davem@davemloft.net; 6 maintainers not CCed: kuba@kernel.org ggarcia@deic.uab.cat jhansen@vmware.com davem@davemloft.net pabeni@redhat.com edumazet@google.com
netdev/build_clang	success	Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	Fixes tag looks correct
netdev/build_allmodconfig_warn	success	Errors and warnings before: 8 this patch: 8
netdev/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 17 lines checked
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0

Cong Wang May 2, 2023, 5:44 p.m. UTC

From: Cong Wang <cong.wang@bytedance.com>

When virtqueue_add_sgs() fails, the skb is put back on the send queue,
so we should not deliver a copy to the tap device in this case. Move
virtio_transport_deliver_tap_pkt() down, after all possible failures.

Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
 net/vmw_vsock/virtio_transport.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
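
The effect described in the commit message can be illustrated with a toy
model. This is only a sketch, not the kernel code: struct toy_ring,
toy_ring_add(), toy_tap_deliver() and toy_send_work() are hypothetical
stand-ins for the virtqueue, virtqueue_add_sgs(),
virtio_transport_deliver_tap_pkt() and the send work loop. It assumes a
packet that fails to enqueue stays pending and is retried later.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the virtqueue plus some counters. */
struct toy_ring {
	int free_slots;	/* descriptors still available */
	int sent;	/* packets accepted by the ring */
	int tap_copies;	/* packets mirrored to the monitor */
};

/* Models virtqueue_add_sgs(): fails when the ring is full. */
static int toy_ring_add(struct toy_ring *r)
{
	if (r->free_slots == 0)
		return -1;
	r->free_slots--;
	r->sent++;
	return 0;
}

/* Models virtio_transport_deliver_tap_pkt(). */
static void toy_tap_deliver(struct toy_ring *r)
{
	r->tap_copies++;
}

/* Try to send `pending` packets; returns how many remain queued. */
static int toy_send_work(struct toy_ring *r, int pending, bool tap_first)
{
	while (pending > 0) {
		if (tap_first)
			toy_tap_deliver(r);	/* old order: tap before enqueue */

		if (toy_ring_add(r) < 0)
			break;	/* packet stays queued and is retried later */

		if (!tap_first)
			toy_tap_deliver(r);	/* proposed order: tap after success */

		pending--;
	}
	return pending;
}
```

In this model, sending three packets through a ring with two free slots and
then retrying the leftover packet yields four tap copies with the old order
and three with the proposed order.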

Bobby Eshleman April 16, 2023, 4:49 a.m. UTC | #1

On Tue, May 02, 2023 at 04:14:18PM -0400, Stefan Hajnoczi wrote:
> On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
> > From: Cong Wang <cong.wang@bytedance.com>
> > 
> > When virtqueue_add_sgs() fails, the skb is put back to send queue,
> > we should not deliver the copy to tap device in this case. So we
> > need to move virtio_transport_deliver_tap_pkt() down after all
> > possible failures.
> > 
> > Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
> > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > Cc: Stefano Garzarella <sgarzare@redhat.com>
> > Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
> > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > ---
> >  net/vmw_vsock/virtio_transport.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > index e95df847176b..055678628c07 100644
> > --- a/net/vmw_vsock/virtio_transport.c
> > +++ b/net/vmw_vsock/virtio_transport.c
> > @@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> >  		if (!skb)
> >  			break;
> >  
> > -		virtio_transport_deliver_tap_pkt(skb);
> > -		reply = virtio_vsock_skb_reply(skb);
> > -
> >  		sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
> >  		sgs[out_sg++] = &hdr;
> >  		if (skb->len > 0) {
> > @@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> >  			break;
> >  		}
> >  
> > +		virtio_transport_deliver_tap_pkt(skb);
> > +		reply = virtio_vsock_skb_reply(skb);
> 
> I don't remember the reason for the ordering, but I'm pretty sure it was
> deliberate. Probably because the payload buffers could be freed as soon
> as virtqueue_add_sgs() is called.
> 
> If that's no longer true with Bobby's skbuff code, then maybe it's safe
> to monitor packets after they have been sent.
> 
> Stefan

Hey Stefan,

Unfortunately, skbuff doesn't change that behavior.

If I understand correctly, the problem flow you are describing
would be something like this:

Thread 0 			Thread 1
guest:virtqueue_add_sgs()[@send_pkt_work]

				host:vhost_vq_get_desc()[@handle_tx_kick]
				host:vhost_add_used()
				host:vhost_signal()
				guest:virtqueue_get_buf()[@tx_work]
				guest:consume_skb()

guest:deliver_tap_pkt()[@send_pkt_work]
^ use-after-free

Which I guess is possible because the receiver can consume the new
scatterlist during the processing kicked off for a previous batch?
(doesn't have to wait for the subsequent kick)

Best,
Bobby

Bobby Eshleman April 16, 2023, 6:40 a.m. UTC | #2

On Wed, May 03, 2023 at 09:39:13AM -0400, Stefan Hajnoczi wrote:
> On Sun, Apr 16, 2023 at 04:49:00AM +0000, Bobby Eshleman wrote:
> > On Tue, May 02, 2023 at 04:14:18PM -0400, Stefan Hajnoczi wrote:
> > > On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
> > > > From: Cong Wang <cong.wang@bytedance.com>
> > > > 
> > > > When virtqueue_add_sgs() fails, the skb is put back to send queue,
> > > > we should not deliver the copy to tap device in this case. So we
> > > > need to move virtio_transport_deliver_tap_pkt() down after all
> > > > possible failures.
> > > > 
> > > > Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
> > > > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > > > Cc: Stefano Garzarella <sgarzare@redhat.com>
> > > > Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
> > > > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > > > ---
> > > >  net/vmw_vsock/virtio_transport.c | 5 ++---
> > > >  1 file changed, 2 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > > > index e95df847176b..055678628c07 100644
> > > > --- a/net/vmw_vsock/virtio_transport.c
> > > > +++ b/net/vmw_vsock/virtio_transport.c
> > > > @@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > > >  		if (!skb)
> > > >  			break;
> > > >  
> > > > -		virtio_transport_deliver_tap_pkt(skb);
> > > > -		reply = virtio_vsock_skb_reply(skb);
> > > > -
> > > >  		sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
> > > >  		sgs[out_sg++] = &hdr;
> > > >  		if (skb->len > 0) {
> > > > @@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > > >  			break;
> > > >  		}
> > > >  
> > > > +		virtio_transport_deliver_tap_pkt(skb);
> > > > +		reply = virtio_vsock_skb_reply(skb);
> > > 
> > > I don't remember the reason for the ordering, but I'm pretty sure it was
> > > deliberate. Probably because the payload buffers could be freed as soon
> > > as virtqueue_add_sgs() is called.
> > > 
> > > If that's no longer true with Bobby's skbuff code, then maybe it's safe
> > > to monitor packets after they have been sent.
> > > 
> > > Stefan
> > 
> > Hey Stefan,
> > 
> > Unfortunately, skbuff doesn't change that behavior.
> > 
> > If I understand correctly, the problem flow you are describing
> > would be something like this:
> > 
> > Thread 0 			Thread 1
> > guest:virtqueue_add_sgs()[@send_pkt_work]
> > 
> > 				host:vhost_vq_get_desc()[@handle_tx_kick]
> > 				host:vhost_add_used()
> > 				host:vhost_signal()
> > 				guest:virtqueue_get_buf()[@tx_work]
> > 				guest:consume_skb()
> > 
> > guest:deliver_tap_pkt()[@send_pkt_work]
> > ^ use-after-free
> > 
> > Which I guess is possible because the receiver can consume the new
> > scatterlist during the processing kicked off for a previous batch?
> > (doesn't have to wait for the subsequent kick)
> 
> Yes, drivers must assume that the device completes request before
> virtqueue_add_sgs() returns. For example, the device is allowed to poll
> the virtqueue memory and may see the new descriptors immediately.
> 
> I haven't audited the current vsock code path to determine whether it's
> possible to reach consume_skb() before deliver_tap_pkt() returns, so I
> can't say whether it's safe or not.
> 

I see, thanks for the clarification.

Best,
Bobby

Bobby Eshleman April 16, 2023, 6:57 a.m. UTC | #3

On Wed, May 03, 2023 at 09:38:50AM +0200, Stefano Garzarella wrote:
> On Sun, Apr 16, 2023 at 04:49:00AM +0000, Bobby Eshleman wrote:
> > On Tue, May 02, 2023 at 04:14:18PM -0400, Stefan Hajnoczi wrote:
> > > On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
> > > > From: Cong Wang <cong.wang@bytedance.com>
> > > >
> > > > When virtqueue_add_sgs() fails, the skb is put back to send queue,
> > > > we should not deliver the copy to tap device in this case. So we
> > > > need to move virtio_transport_deliver_tap_pkt() down after all
> > > > possible failures.
> > > >
> > > > Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
> > > > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > > > Cc: Stefano Garzarella <sgarzare@redhat.com>
> > > > Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
> > > > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > > > ---
> > > >  net/vmw_vsock/virtio_transport.c | 5 ++---
> > > >  1 file changed, 2 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > > > index e95df847176b..055678628c07 100644
> > > > --- a/net/vmw_vsock/virtio_transport.c
> > > > +++ b/net/vmw_vsock/virtio_transport.c
> > > > @@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > > >  		if (!skb)
> > > >  			break;
> > > >
> > > > -		virtio_transport_deliver_tap_pkt(skb);
> > > > -		reply = virtio_vsock_skb_reply(skb);
> > > > -
> > > >  		sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
> > > >  		sgs[out_sg++] = &hdr;
> > > >  		if (skb->len > 0) {
> > > > @@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > > >  			break;
> > > >  		}
> > > >
> > > > +		virtio_transport_deliver_tap_pkt(skb);
> 
> I would move only the virtio_transport_deliver_tap_pkt(),
> virtio_vsock_skb_reply() is not related.
> 
> > > > +		reply = virtio_vsock_skb_reply(skb);
> > > 
> > > I don't remember the reason for the ordering, but I'm pretty sure it was
> > > deliberate. Probably because the payload buffers could be freed as soon
> > > as virtqueue_add_sgs() is called.
> > > 
> > > If that's no longer true with Bobby's skbuff code, then maybe it's safe
> > > to monitor packets after they have been sent.
> > > 
> > > Stefan
> > 
> > Hey Stefan,
> > 
> > Unfortunately, skbuff doesn't change that behavior.
> > 
> > If I understand correctly, the problem flow you are describing
> > would be something like this:
> > 
> > Thread 0 			Thread 1
> > guest:virtqueue_add_sgs()[@send_pkt_work]
> > 
> > 				host:vhost_vq_get_desc()[@handle_tx_kick]
> > 				host:vhost_add_used()
> > 				host:vhost_signal()
> > 				guest:virtqueue_get_buf()[@tx_work]
> > 				guest:consume_skb()
> > 
> > guest:deliver_tap_pkt()[@send_pkt_work]
> > ^ use-after-free
> > 
> > Which I guess is possible because the receiver can consume the new
> > scatterlist during the processing kicked off for a previous batch?
> > (doesn't have to wait for the subsequent kick)
> 
> This is true, but both `send_pkt_work` and `tx_work` hold `tx_lock`, so can
> they really go in parallel?
> 

Oh good point, the tx_lock synchronizes it:

Thread 0 			Thread 1
guest:virtqueue_add_sgs()[@send_pkt_work]

				host:vhost_vq_get_desc()[@handle_tx_kick]
				host:vhost_add_used()
				host:vhost_signal()
				guest:mutex_lock()[@tx_work]
guest:deliver_tap_pkt()[@send_pkt_work]
guest:mutex_unlock()
				guest:virtqueue_get_buf()[@tx_work]
				guest:consume_skb()


I'm pretty sure this should be safe.

Best,
Bobby
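
The interleaving above can be replayed deterministically in a small model.
This is a single-threaded sketch under the assumption stated in the thread
that both send_pkt_work and tx_work take tx_lock; it is not an audit of the
real code path. All names (struct toy_vsock, toy_tx_work_step(),
toy_send_pkt_work(), the honor_lock switch) are hypothetical, and the lock
is modeled as a plain flag rather than a real mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; these names are illustrative only. */
struct toy_vsock {
	bool tx_lock_held;	/* stands in for the tx_lock mutex */
	bool skb_on_ring;	/* buffer visible to the device */
	bool skb_used;		/* device has marked the buffer used */
	bool skb_freed;		/* completion path ran consume_skb() */
	bool tap_saw_freed;	/* tap read the skb after it was freed */
};

/*
 * Completion path (tx_work in the thread). When honor_lock is set it
 * cannot make progress while the sender holds the lock.
 */
static bool toy_tx_work_step(struct toy_vsock *v, bool honor_lock)
{
	if (honor_lock && v->tx_lock_held)
		return false;	/* would block in mutex_lock() */
	if (v->skb_used && !v->skb_freed) {
		v->skb_freed = true;	/* virtqueue_get_buf() + consume_skb() */
		return true;
	}
	return false;
}

/* Send path with the proposed order: enqueue first, tap afterwards. */
static void toy_send_pkt_work(struct toy_vsock *v, bool honor_lock)
{
	v->tx_lock_held = true;		/* mutex_lock(&tx_lock) */
	v->skb_on_ring = true;		/* virtqueue_add_sgs() succeeded */

	/* Worst case: the device completes the buffer immediately... */
	v->skb_used = true;
	/* ...and the completion work tries to run right away. */
	toy_tx_work_step(v, honor_lock);

	/* Tap delivery happens here; record whether the skb was gone. */
	if (v->skb_freed)
		v->tap_saw_freed = true;

	v->tx_lock_held = false;	/* mutex_unlock(&tx_lock) */
	toy_tx_work_step(v, honor_lock);	/* completion may now proceed */
}
```

With honor_lock set, the model frees the buffer only after the tap step; with
it cleared, the tap step observes a freed buffer, which is the use-after-free
from the earlier diagram.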

Simon Horman May 2, 2023, 8:02 p.m. UTC | #4

On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
> 
> When virtqueue_add_sgs() fails, the skb is put back to send queue,
> we should not deliver the copy to tap device in this case. So we
> need to move virtio_transport_deliver_tap_pkt() down after all
> possible failures.
> 
> Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: Stefano Garzarella <sgarzare@redhat.com>
> Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>

Stefan Hajnoczi May 2, 2023, 8:14 p.m. UTC | #5

On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
> 
> When virtqueue_add_sgs() fails, the skb is put back to send queue,
> we should not deliver the copy to tap device in this case. So we
> need to move virtio_transport_deliver_tap_pkt() down after all
> possible failures.
> 
> Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: Stefano Garzarella <sgarzare@redhat.com>
> Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> ---
>  net/vmw_vsock/virtio_transport.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> index e95df847176b..055678628c07 100644
> --- a/net/vmw_vsock/virtio_transport.c
> +++ b/net/vmw_vsock/virtio_transport.c
> @@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
>  		if (!skb)
>  			break;
>  
> -		virtio_transport_deliver_tap_pkt(skb);
> -		reply = virtio_vsock_skb_reply(skb);
> -
>  		sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
>  		sgs[out_sg++] = &hdr;
>  		if (skb->len > 0) {
> @@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
>  			break;
>  		}
>  
> +		virtio_transport_deliver_tap_pkt(skb);
> +		reply = virtio_vsock_skb_reply(skb);

I don't remember the reason for the ordering, but I'm pretty sure it was
deliberate. Probably because the payload buffers could be freed as soon
as virtqueue_add_sgs() is called.

If that's no longer true with Bobby's skbuff code, then maybe it's safe
to monitor packets after they have been sent.

Stefan

Stefano Garzarella May 3, 2023, 7:38 a.m. UTC | #6

On Sun, Apr 16, 2023 at 04:49:00AM +0000, Bobby Eshleman wrote:
>On Tue, May 02, 2023 at 04:14:18PM -0400, Stefan Hajnoczi wrote:
>> On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
>> > From: Cong Wang <cong.wang@bytedance.com>
>> >
>> > When virtqueue_add_sgs() fails, the skb is put back to send queue,
>> > we should not deliver the copy to tap device in this case. So we
>> > need to move virtio_transport_deliver_tap_pkt() down after all
>> > possible failures.
>> >
>> > Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
>> > Cc: Stefan Hajnoczi <stefanha@redhat.com>
>> > Cc: Stefano Garzarella <sgarzare@redhat.com>
>> > Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
>> > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
>> > ---
>> >  net/vmw_vsock/virtio_transport.c | 5 ++---
>> >  1 file changed, 2 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>> > index e95df847176b..055678628c07 100644
>> > --- a/net/vmw_vsock/virtio_transport.c
>> > +++ b/net/vmw_vsock/virtio_transport.c
>> > @@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
>> >  		if (!skb)
>> >  			break;
>> >
>> > -		virtio_transport_deliver_tap_pkt(skb);
>> > -		reply = virtio_vsock_skb_reply(skb);
>> > -
>> >  		sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
>> >  		sgs[out_sg++] = &hdr;
>> >  		if (skb->len > 0) {
>> > @@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
>> >  			break;
>> >  		}
>> >
>> > +		virtio_transport_deliver_tap_pkt(skb);

I would move only virtio_transport_deliver_tap_pkt();
virtio_vsock_skb_reply() is not related.

>> > +		reply = virtio_vsock_skb_reply(skb);
>>
>> I don't remember the reason for the ordering, but I'm pretty sure it was
>> deliberate. Probably because the payload buffers could be freed as soon
>> as virtqueue_add_sgs() is called.
>>
>> If that's no longer true with Bobby's skbuff code, then maybe it's safe
>> to monitor packets after they have been sent.
>>
>> Stefan
>
>Hey Stefan,
>
>Unfortunately, skbuff doesn't change that behavior.
>
>If I understand correctly, the problem flow you are describing
>would be something like this:
>
>Thread 0 			Thread 1
>guest:virtqueue_add_sgs()[@send_pkt_work]
>
>				host:vhost_vq_get_desc()[@handle_tx_kick]
>				host:vhost_add_used()
>				host:vhost_signal()
>				guest:virtqueue_get_buf()[@tx_work]
>				guest:consume_skb()
>
>guest:deliver_tap_pkt()[@send_pkt_work]
>^ use-after-free
>
>Which I guess is possible because the receiver can consume the new
>scatterlist during the processing kicked off for a previous batch?
>(doesn't have to wait for the subsequent kick)

This is true, but both `send_pkt_work` and `tx_work` hold `tx_lock`, so 
can they really go in parallel?

Thanks,
Stefano

Stefan Hajnoczi May 3, 2023, 1:39 p.m. UTC | #7

On Sun, Apr 16, 2023 at 04:49:00AM +0000, Bobby Eshleman wrote:
> On Tue, May 02, 2023 at 04:14:18PM -0400, Stefan Hajnoczi wrote:
> > On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
> > > From: Cong Wang <cong.wang@bytedance.com>
> > > 
> > > When virtqueue_add_sgs() fails, the skb is put back to send queue,
> > > we should not deliver the copy to tap device in this case. So we
> > > need to move virtio_transport_deliver_tap_pkt() down after all
> > > possible failures.
> > > 
> > > Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
> > > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > > Cc: Stefano Garzarella <sgarzare@redhat.com>
> > > Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
> > > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > > ---
> > >  net/vmw_vsock/virtio_transport.c | 5 ++---
> > >  1 file changed, 2 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > > index e95df847176b..055678628c07 100644
> > > --- a/net/vmw_vsock/virtio_transport.c
> > > +++ b/net/vmw_vsock/virtio_transport.c
> > > @@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > >  		if (!skb)
> > >  			break;
> > >  
> > > -		virtio_transport_deliver_tap_pkt(skb);
> > > -		reply = virtio_vsock_skb_reply(skb);
> > > -
> > >  		sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
> > >  		sgs[out_sg++] = &hdr;
> > >  		if (skb->len > 0) {
> > > @@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > >  			break;
> > >  		}
> > >  
> > > +		virtio_transport_deliver_tap_pkt(skb);
> > > +		reply = virtio_vsock_skb_reply(skb);
> > 
> > I don't remember the reason for the ordering, but I'm pretty sure it was
> > deliberate. Probably because the payload buffers could be freed as soon
> > as virtqueue_add_sgs() is called.
> > 
> > If that's no longer true with Bobby's skbuff code, then maybe it's safe
> > to monitor packets after they have been sent.
> > 
> > Stefan
> 
> Hey Stefan,
> 
> Unfortunately, skbuff doesn't change that behavior.
> 
> If I understand correctly, the problem flow you are describing
> would be something like this:
> 
> Thread 0 			Thread 1
> guest:virtqueue_add_sgs()[@send_pkt_work]
> 
> 				host:vhost_vq_get_desc()[@handle_tx_kick]
> 				host:vhost_add_used()
> 				host:vhost_signal()
> 				guest:virtqueue_get_buf()[@tx_work]
> 				guest:consume_skb()
> 
> guest:deliver_tap_pkt()[@send_pkt_work]
> ^ use-after-free
> 
> Which I guess is possible because the receiver can consume the new
> scatterlist during the processing kicked off for a previous batch?
> (doesn't have to wait for the subsequent kick)

Yes, drivers must assume that the device can complete a request before
virtqueue_add_sgs() returns. For example, the device is allowed to poll
the virtqueue memory and may see the new descriptors immediately.

I haven't audited the current vsock code path to determine whether it's
possible to reach consume_skb() before deliver_tap_pkt() returns, so I
can't say whether it's safe or not.

Stefan
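
One common way to respect the rule above, when no lock protects the buffer,
is to read anything still needed from it before handing it to the device.
The following is a minimal sketch of that pattern only. struct toy_buf,
toy_publish_and_complete() and toy_send_snapshot_first() are hypothetical
names, not kernel APIs, and the model assumes the worst case where the
buffer is released as soon as it is published.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical buffer type; not the kernel's struct sk_buff. */
struct toy_buf {
	bool is_reply;
	int len;
};

/*
 * Models the worst case: the device completes the buffer at once and the
 * completion path releases it before the sender continues.
 */
static void toy_publish_and_complete(struct toy_buf **bufp)
{
	free(*bufp);
	*bufp = NULL;
}

/* Returns the reply flag, read before the buffer is handed over. */
static bool toy_send_snapshot_first(struct toy_buf **bufp)
{
	bool reply = (*bufp)->is_reply;	/* snapshot while we still own it */

	toy_publish_and_complete(bufp);	/* ownership passes to the device */
	return reply;			/* safe: uses the local copy only */
}
```

After the call the caller's pointer is NULL in this model, so any later
access has to go through the snapshot rather than the buffer.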
