diff mbox series

[RFC,2/4] vhost/vsock: split packets to send using multiple buffers

Message ID 20190404105838.101559-3-sgarzare@redhat.com (mailing list archive)
State New, archived
Headers show
Series vsock/virtio: optimizations to increase the throughput | expand

Commit Message

Stefano Garzarella April 4, 2019, 10:58 a.m. UTC
If the packets to sent to the guest are bigger than the buffer
available, we can split them, using multiple buffers and fixing
the length in the packet header.
This is safe since virtio-vsock supports only stream sockets.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 drivers/vhost/vsock.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

Comments

Stefan Hajnoczi April 5, 2019, 8:13 a.m. UTC | #1
On Thu, Apr 04, 2019 at 12:58:36PM +0200, Stefano Garzarella wrote:
> @@ -139,8 +139,18 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
>  			break;
>  		}
>  
> -		len = iov_length(&vq->iov[out], in);
> -		iov_iter_init(&iov_iter, READ, &vq->iov[out], in, len);
> +		payload_len = pkt->len - pkt->off;
> +		iov_len = iov_length(&vq->iov[out], in);
> +		iov_iter_init(&iov_iter, READ, &vq->iov[out], in, iov_len);
> +
> +		/* If the packet is greater than the space available in the
> +		 * buffer, we split it using multiple buffers.
> +		 */
> +		if (payload_len > iov_len - sizeof(pkt->hdr))

Integer underflow.  iov_len is controlled by the guest and therefore
untrusted.  Please validate iov_len before assuming it's larger than
sizeof(pkt->hdr).

> -		vhost_add_used(vq, head, sizeof(pkt->hdr) + pkt->len);
> +		vhost_add_used(vq, head, sizeof(pkt->hdr) + payload_len);
>  		added = true;
>  
> +		pkt->off += payload_len;
> +
> +		/* If we didn't send all the payload we can requeue the packet
> +		 * to send it with the next available buffer.
> +		 */
> +		if (pkt->off < pkt->len) {
> +			spin_lock_bh(&vsock->send_pkt_list_lock);
> +			list_add(&pkt->list, &vsock->send_pkt_list);
> +			spin_unlock_bh(&vsock->send_pkt_list_lock);
> +			continue;

The virtio_transport_deliver_tap_pkt() call is skipped.  Packet capture
should see the exact packets that are delivered.  I think this patch
will present one large packet instead of several smaller packets that
were actually delivered.
Stefano Garzarella April 5, 2019, 9:36 a.m. UTC | #2
On Fri, Apr 05, 2019 at 09:13:56AM +0100, Stefan Hajnoczi wrote:
> On Thu, Apr 04, 2019 at 12:58:36PM +0200, Stefano Garzarella wrote:
> > @@ -139,8 +139,18 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
> >  			break;
> >  		}
> >  
> > -		len = iov_length(&vq->iov[out], in);
> > -		iov_iter_init(&iov_iter, READ, &vq->iov[out], in, len);
> > +		payload_len = pkt->len - pkt->off;
> > +		iov_len = iov_length(&vq->iov[out], in);
> > +		iov_iter_init(&iov_iter, READ, &vq->iov[out], in, iov_len);
> > +
> > +		/* If the packet is greater than the space available in the
> > +		 * buffer, we split it using multiple buffers.
> > +		 */
> > +		if (payload_len > iov_len - sizeof(pkt->hdr))
> 
> Integer underflow.  iov_len is controlled by the guest and therefore
> untrusted.  Please validate iov_len before assuming it's larger than
> sizeof(pkt->hdr).
> 

Okay, I'll do it!

> > -		vhost_add_used(vq, head, sizeof(pkt->hdr) + pkt->len);
> > +		vhost_add_used(vq, head, sizeof(pkt->hdr) + payload_len);
> >  		added = true;
> >  
> > +		pkt->off += payload_len;
> > +
> > +		/* If we didn't send all the payload we can requeue the packet
> > +		 * to send it with the next available buffer.
> > +		 */
> > +		if (pkt->off < pkt->len) {
> > +			spin_lock_bh(&vsock->send_pkt_list_lock);
> > +			list_add(&pkt->list, &vsock->send_pkt_list);
> > +			spin_unlock_bh(&vsock->send_pkt_list_lock);
> > +			continue;
> 
> The virtio_transport_deliver_tap_pkt() call is skipped.  Packet capture
> should see the exact packets that are delivered.  I think this patch
> will present one large packet instead of several smaller packets that
> were actually delivered.

I'll modify virtio_transport_build_skb() to take care of pkt->off
and reading the payload size from the virtio_vsock_hdr.
Otherwise, should I introduce another field in virtio_vsock_pkt to store
the payload size?

Thanks,
Stefano
Stefan Hajnoczi April 8, 2019, 9:28 a.m. UTC | #3
On Fri, Apr 05, 2019 at 11:36:08AM +0200, Stefano Garzarella wrote:
> On Fri, Apr 05, 2019 at 09:13:56AM +0100, Stefan Hajnoczi wrote:
> > On Thu, Apr 04, 2019 at 12:58:36PM +0200, Stefano Garzarella wrote:
> > > -		vhost_add_used(vq, head, sizeof(pkt->hdr) + pkt->len);
> > > +		vhost_add_used(vq, head, sizeof(pkt->hdr) + payload_len);
> > >  		added = true;
> > >  
> > > +		pkt->off += payload_len;
> > > +
> > > +		/* If we didn't send all the payload we can requeue the packet
> > > +		 * to send it with the next available buffer.
> > > +		 */
> > > +		if (pkt->off < pkt->len) {
> > > +			spin_lock_bh(&vsock->send_pkt_list_lock);
> > > +			list_add(&pkt->list, &vsock->send_pkt_list);
> > > +			spin_unlock_bh(&vsock->send_pkt_list_lock);
> > > +			continue;
> > 
> > The virtio_transport_deliver_tap_pkt() call is skipped.  Packet capture
> > should see the exact packets that are delivered.  I think this patch
> > will present one large packet instead of several smaller packets that
> > were actually delivered.
> 
> I'll modify virtio_transport_build_skb() to take care of pkt->off
> and reading the payload size from the virtio_vsock_hdr.
> Otherwise, should I introduce another field in virtio_vsock_pkt to store
> the payload size?

I don't remember the details but I trust you'll pick a good way of doing
it.

Stefan
diff mbox series

Patch

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index bb5fc0e9fbc2..9951b7e661f6 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -94,7 +94,7 @@  vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
 		struct iov_iter iov_iter;
 		unsigned out, in;
 		size_t nbytes;
-		size_t len;
+		size_t iov_len, payload_len;
 		int head;
 
 		spin_lock_bh(&vsock->send_pkt_list_lock);
@@ -139,8 +139,18 @@  vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
 			break;
 		}
 
-		len = iov_length(&vq->iov[out], in);
-		iov_iter_init(&iov_iter, READ, &vq->iov[out], in, len);
+		payload_len = pkt->len - pkt->off;
+		iov_len = iov_length(&vq->iov[out], in);
+		iov_iter_init(&iov_iter, READ, &vq->iov[out], in, iov_len);
+
+		/* If the packet is greater than the space available in the
+		 * buffer, we split it using multiple buffers.
+		 */
+		if (payload_len > iov_len - sizeof(pkt->hdr))
+			payload_len = iov_len - sizeof(pkt->hdr);
+
+		/* Set the correct length in the header */
+		pkt->hdr.len = cpu_to_le32(payload_len);
 
 		nbytes = copy_to_iter(&pkt->hdr, sizeof(pkt->hdr), &iov_iter);
 		if (nbytes != sizeof(pkt->hdr)) {
@@ -149,16 +159,29 @@  vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
 			break;
 		}
 
-		nbytes = copy_to_iter(pkt->buf, pkt->len, &iov_iter);
-		if (nbytes != pkt->len) {
+		nbytes = copy_to_iter(pkt->buf + pkt->off, payload_len,
+				      &iov_iter);
+		if (nbytes != payload_len) {
 			virtio_transport_free_pkt(pkt);
 			vq_err(vq, "Faulted on copying pkt buf\n");
 			break;
 		}
 
-		vhost_add_used(vq, head, sizeof(pkt->hdr) + pkt->len);
+		vhost_add_used(vq, head, sizeof(pkt->hdr) + payload_len);
 		added = true;
 
+		pkt->off += payload_len;
+
+		/* If we didn't send all the payload we can requeue the packet
+		 * to send it with the next available buffer.
+		 */
+		if (pkt->off < pkt->len) {
+			spin_lock_bh(&vsock->send_pkt_list_lock);
+			list_add(&pkt->list, &vsock->send_pkt_list);
+			spin_unlock_bh(&vsock->send_pkt_list_lock);
+			continue;
+		}
+
 		if (pkt->reply) {
 			int val;