From patchwork Fri Jul 27 06:27:57 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rusty Russell X-Patchwork-Id: 1247121 Return-Path: X-Original-To: patchwork-kvm@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork1.kernel.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by patchwork1.kernel.org (Postfix) with ESMTP id 9B43E3FDCA for ; Fri, 27 Jul 2012 07:49:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752279Ab2G0Htj (ORCPT ); Fri, 27 Jul 2012 03:49:39 -0400 Received: from ozlabs.org ([203.10.76.45]:54503 "EHLO ozlabs.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751909Ab2G0Hsn (ORCPT ); Fri, 27 Jul 2012 03:48:43 -0400 Received: by ozlabs.org (Postfix, from userid 1011) id ACF7C2C0099; Fri, 27 Jul 2012 17:48:40 +1000 (EST) From: Rusty Russell To: Paolo Bonzini , Boaz Harrosh Cc: Wang Sen , linux-scsi@vger.kernel.org, JBottomley@parallels.com, stefanha@linux.vnet.ibm.com, mc@linux.vnet.ibm.com, linux-kernel@vger.kernel.org, "kvm\@vger.kernel.org" Subject: Re: virtio(-scsi) vs. chained sg_lists (was Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list) In-Reply-To: <501140A3.9090908@redhat.com> References: <1343204966-23560-1-git-send-email-senwang@linux.vnet.ibm.com> <500FB1DE.1000100@redhat.com> <500FBAE8.2050107@panasas.com> <500FBF37.50603@redhat.com> <500FE7D2.7070101@panasas.com> <500FEB63.3000709@redhat.com> <500FF412.3090600@panasas.com> <50100014.2010109@redhat.com> <50101091.5090909@panasas.com> <50103043.5050508@redhat.com> <50104614.3080002@panasas.com> <501051DF.5040907@redhat.com> <50105F60.8050707@panasas.com> <5010F07E.7050506@redhat.com> <5010F831.9030300@panasas.com> <5010F896.8090409@redhat.com> <501140A3.9090908@redhat.com> User-Agent: Notmuch/0.12 (http://notmuchmail.org) Emacs/23.3.1 (i686-pc-linux-gnu) Date: Fri, 27 Jul 2012 15:57:57 +0930 Message-ID: <874notoh02.fsf@rustcorp.com.au> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org On Thu, 26 Jul 2012 15:05:39 +0200, Paolo Bonzini wrote: > Il 26/07/2012 09:58, Paolo Bonzini ha scritto: > > > >> > Please CC me on the "convert to sg copy-less" patches, It looks interesting > > Sure. > > Well, here is the gist of it (note it won't apply on any public tree, > hence no SoB yet). It should be split in multiple changesets and you > can make more simplifications on top of it, because > virtio_scsi_target_state is not anymore variable-sized, but that's > secondary. ISTR starting on such a patch years ago, but the primitives to manipulate a chained sg_list were nonexistent, so I dropped it, waiting for it to be fully-baked or replaced. That hasn't happened: > + /* Remove scatterlist terminator, we will tack more items soon. */ > + vblk->sg[num + out - 1].page_link &= ~0x2; I hate this interface: > +int virtqueue_add_buf_sg(struct virtqueue *_vq, > + struct scatterlist *sg_out, > + unsigned int out, > + struct scatterlist *sg_in, > + unsigned int in, > + void *data, > + gfp_t gfp) The point of chained scatterlists is they're self-terminated, so the in & out counts should be calculated. Counting them is not *that* bad, since we're about to read them all anyway. (Yes, the chained scatterlist stuff is complete crack, but I lost that debate years ago.) Here's my variant. Networking, console and block seem OK, at least (ie. it booted!). From: Rusty Russell Subject: virtio: use chained scatterlists. Rather than handing a scatterlist[] and out and in numbers to virtqueue_add_buf(), hand two separate ones which can be chained. I shall refrain from ranting about what a disgusting hack chained scatterlists are. I'll just note that this doesn't make things simpler (see diff). The scatterlists we use can be too large for the stack, so we put them in our device struct and reuse them. But in many cases we don't want to pay the cost of sg_init_table() as we don't know how many elements we'll have and we'd have to initialize the entire table. This means we have two choices: carefully reset the end markers after we call virtqueue_add_buf(), which we do in virtio_net for the xmit path where it's easy and we want to be optimal. Elsewhere we implement a helper to unset the end markers after we've filled the array. Signed-off-by: Rusty Russell --- drivers/block/virtio_blk.c | 37 +++++++++++++----- drivers/char/hw_random/virtio-rng.c | 2 - drivers/char/virtio_console.c | 6 +-- drivers/net/virtio_net.c | 67 ++++++++++++++++++--------------- drivers/virtio/virtio_balloon.c | 6 +-- drivers/virtio/virtio_ring.c | 71 ++++++++++++++++++++++-------------- include/linux/virtio.h | 5 +- net/9p/trans_virtio.c | 46 +++++++++++++++++++++-- 8 files changed, 159 insertions(+), 81 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -100,6 +100,14 @@ static void blk_done(struct virtqueue *v spin_unlock_irqrestore(vblk->disk->queue->queue_lock, flags); } +static void sg_unset_end_markers(struct scatterlist *sg, unsigned int num) +{ + unsigned int i; + + for (i = 0; i < num; i++) + sg[i].page_link &= ~0x02; +} + static bool do_req(struct request_queue *q, struct virtio_blk *vblk, struct request *req) { @@ -140,6 +148,7 @@ static bool do_req(struct request_queue } } + /* We layout out scatterlist in a single array, out then in. */ sg_set_buf(&vblk->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr)); /* @@ -151,17 +160,8 @@ static bool do_req(struct request_queue if (vbr->req->cmd_type == REQ_TYPE_BLOCK_PC) sg_set_buf(&vblk->sg[out++], vbr->req->cmd, vbr->req->cmd_len); + /* This marks the end of the sg list at vblk->sg[out]. */ num = blk_rq_map_sg(q, vbr->req, vblk->sg + out); - - if (vbr->req->cmd_type == REQ_TYPE_BLOCK_PC) { - sg_set_buf(&vblk->sg[num + out + in++], vbr->req->sense, SCSI_SENSE_BUFFERSIZE); - sg_set_buf(&vblk->sg[num + out + in++], &vbr->in_hdr, - sizeof(vbr->in_hdr)); - } - - sg_set_buf(&vblk->sg[num + out + in++], &vbr->status, - sizeof(vbr->status)); - if (num) { if (rq_data_dir(vbr->req) == WRITE) { vbr->out_hdr.type |= VIRTIO_BLK_T_OUT; @@ -172,7 +172,22 @@ static bool do_req(struct request_queue } } - if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr, GFP_ATOMIC)<0) { + if (vbr->req->cmd_type == REQ_TYPE_BLOCK_PC) { + sg_set_buf(&vblk->sg[out + in++], vbr->req->sense, + SCSI_SENSE_BUFFERSIZE); + sg_set_buf(&vblk->sg[out + in++], &vbr->in_hdr, + sizeof(vbr->in_hdr)); + } + + sg_set_buf(&vblk->sg[out + in++], &vbr->status, + sizeof(vbr->status)); + + sg_unset_end_markers(vblk->sg, out+in); + sg_mark_end(&vblk->sg[out-1]); + sg_mark_end(&vblk->sg[out+in-1]); + + if (virtqueue_add_buf(vblk->vq, vblk->sg, vblk->sg+out, vbr, GFP_ATOMIC) + < 0) { mempool_free(vbr, vblk->pool); return false; } diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c --- a/drivers/char/hw_random/virtio-rng.c +++ b/drivers/char/hw_random/virtio-rng.c @@ -47,7 +47,7 @@ static void register_buffer(u8 *buf, siz sg_init_one(&sg, buf, size); /* There should always be room for one buffer. */ - if (virtqueue_add_buf(vq, &sg, 0, 1, buf, GFP_KERNEL) < 0) + if (virtqueue_add_buf(vq, NULL, &sg, buf, GFP_KERNEL) < 0) BUG(); virtqueue_kick(vq); diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -392,7 +392,7 @@ static int add_inbuf(struct virtqueue *v sg_init_one(sg, buf->buf, buf->size); - ret = virtqueue_add_buf(vq, sg, 0, 1, buf, GFP_ATOMIC); + ret = virtqueue_add_buf(vq, NULL, sg, buf, GFP_ATOMIC); virtqueue_kick(vq); return ret; } @@ -457,7 +457,7 @@ static ssize_t __send_control_msg(struct vq = portdev->c_ovq; sg_init_one(sg, &cpkt, sizeof(cpkt)); - if (virtqueue_add_buf(vq, sg, 1, 0, &cpkt, GFP_ATOMIC) >= 0) { + if (virtqueue_add_buf(vq, sg, NULL, &cpkt, GFP_ATOMIC) >= 0) { virtqueue_kick(vq); while (!virtqueue_get_buf(vq, &len)) cpu_relax(); @@ -506,7 +506,7 @@ static ssize_t send_buf(struct port *por reclaim_consumed_buffers(port); sg_init_one(sg, in_buf, in_count); - ret = virtqueue_add_buf(out_vq, sg, 1, 0, in_buf, GFP_ATOMIC); + ret = virtqueue_add_buf(out_vq, sg, NULL, in_buf, GFP_ATOMIC); /* Tell Host to go! */ virtqueue_kick(out_vq); diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -376,11 +376,11 @@ static int add_recvbuf_small(struct virt skb_put(skb, MAX_PACKET_LEN); hdr = skb_vnet_hdr(skb); + sg_init_table(vi->rx_sg, 2); sg_set_buf(vi->rx_sg, &hdr->hdr, sizeof hdr->hdr); - skb_to_sgvec(skb, vi->rx_sg + 1, 0, skb->len); - err = virtqueue_add_buf(vi->rvq, vi->rx_sg, 0, 2, skb, gfp); + err = virtqueue_add_buf(vi->rvq, NULL, vi->rx_sg, skb, gfp); if (err < 0) dev_kfree_skb(skb); @@ -393,6 +393,8 @@ static int add_recvbuf_big(struct virtne char *p; int i, err, offset; + sg_init_table(vi->rx_sg, MAX_SKB_FRAGS + 1); + /* page in vi->rx_sg[MAX_SKB_FRAGS + 1] is list tail */ for (i = MAX_SKB_FRAGS + 1; i > 1; --i) { first = get_a_page(vi, gfp); @@ -425,8 +427,8 @@ static int add_recvbuf_big(struct virtne /* chain first in list head */ first->private = (unsigned long)list; - err = virtqueue_add_buf(vi->rvq, vi->rx_sg, 0, MAX_SKB_FRAGS + 2, - first, gfp); + + err = virtqueue_add_buf(vi->rvq, NULL, vi->rx_sg, first, gfp); if (err < 0) give_pages(vi, first); @@ -444,7 +446,7 @@ static int add_recvbuf_mergeable(struct sg_init_one(vi->rx_sg, page_address(page), PAGE_SIZE); - err = virtqueue_add_buf(vi->rvq, vi->rx_sg, 0, 1, page, gfp); + err = virtqueue_add_buf(vi->rvq, NULL, vi->rx_sg, page, gfp); if (err < 0) give_pages(vi, page); @@ -581,6 +583,7 @@ static int xmit_skb(struct virtnet_info { struct skb_vnet_hdr *hdr = skb_vnet_hdr(skb); const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest; + int ret; pr_debug("%s: xmit %p %pM\n", vi->dev->name, skb, dest); @@ -620,8 +623,16 @@ static int xmit_skb(struct virtnet_info sg_set_buf(vi->tx_sg, &hdr->hdr, sizeof hdr->hdr); hdr->num_sg = skb_to_sgvec(skb, vi->tx_sg + 1, 0, skb->len) + 1; - return virtqueue_add_buf(vi->svq, vi->tx_sg, hdr->num_sg, - 0, skb, GFP_ATOMIC); + + ret = virtqueue_add_buf(vi->svq, vi->tx_sg, NULL, skb, GFP_ATOMIC); + + /* + * An optimization: clear the end bit set by skb_to_sgvec, so + * we can simply re-use vi->tx_sg[] next time. + */ + vi->tx_sg[hdr->num_sg-1].page_link &= ~0x02; + + return ret; } static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) @@ -757,32 +768,31 @@ static int virtnet_open(struct net_devic * never fail unless improperly formated. */ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, - struct scatterlist *data, int out, int in) + struct scatterlist *cmdsg) { - struct scatterlist *s, sg[VIRTNET_SEND_COMMAND_SG_MAX + 2]; + struct scatterlist in[1], out[2]; struct virtio_net_ctrl_hdr ctrl; virtio_net_ctrl_ack status = ~0; unsigned int tmp; - int i; /* Caller should know better */ - BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ) || - (out + in > VIRTNET_SEND_COMMAND_SG_MAX)); + BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ)); - out++; /* Add header */ - in++; /* Add return status */ - + /* Prepend header to output. */ + sg_init_table(out, 2); ctrl.class = class; ctrl.cmd = cmd; + sg_set_buf(&out[0], &ctrl, sizeof(ctrl)); + if (cmdsg) + sg_chain(out, 2, cmdsg); + else + sg_mark_end(&out[0]); - sg_init_table(sg, out + in); + /* Status response. */ + sg_init_one(in, &status, sizeof(status)); - sg_set_buf(&sg[0], &ctrl, sizeof(ctrl)); - for_each_sg(data, s, out + in - 2, i) - sg_set_buf(&sg[i + 1], sg_virt(s), s->length); - sg_set_buf(&sg[out + in - 1], &status, sizeof(status)); - BUG_ON(virtqueue_add_buf(vi->cvq, sg, out, in, vi, GFP_ATOMIC) < 0); + BUG_ON(virtqueue_add_buf(vi->cvq, out, in, vi, GFP_ATOMIC) < 0); virtqueue_kick(vi->cvq); @@ -800,8 +810,7 @@ static void virtnet_ack_link_announce(st { rtnl_lock(); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_ANNOUNCE, - VIRTIO_NET_CTRL_ANNOUNCE_ACK, NULL, - 0, 0)) + VIRTIO_NET_CTRL_ANNOUNCE_ACK, NULL)) dev_warn(&vi->dev->dev, "Failed to ack link announce.\n"); rtnl_unlock(); } @@ -839,16 +848,14 @@ static void virtnet_set_rx_mode(struct n sg_init_one(sg, &promisc, sizeof(promisc)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, - VIRTIO_NET_CTRL_RX_PROMISC, - sg, 1, 0)) + VIRTIO_NET_CTRL_RX_PROMISC, sg)) dev_warn(&dev->dev, "Failed to %sable promisc mode.\n", promisc ? "en" : "dis"); sg_init_one(sg, &allmulti, sizeof(allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, - VIRTIO_NET_CTRL_RX_ALLMULTI, - sg, 1, 0)) + VIRTIO_NET_CTRL_RX_ALLMULTI, sg)) dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n", allmulti ? "en" : "dis"); @@ -887,7 +894,7 @@ static void virtnet_set_rx_mode(struct n if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MAC, VIRTIO_NET_CTRL_MAC_TABLE_SET, - sg, 2, 0)) + sg)) dev_warn(&dev->dev, "Failed to set MAC fitler table.\n"); kfree(buf); @@ -901,7 +908,7 @@ static int virtnet_vlan_rx_add_vid(struc sg_init_one(&sg, &vid, sizeof(vid)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN, - VIRTIO_NET_CTRL_VLAN_ADD, &sg, 1, 0)) + VIRTIO_NET_CTRL_VLAN_ADD, &sg)) dev_warn(&dev->dev, "Failed to add VLAN ID %d.\n", vid); return 0; } @@ -914,7 +921,7 @@ static int virtnet_vlan_rx_kill_vid(stru sg_init_one(&sg, &vid, sizeof(vid)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN, - VIRTIO_NET_CTRL_VLAN_DEL, &sg, 1, 0)) + VIRTIO_NET_CTRL_VLAN_DEL, &sg)) dev_warn(&dev->dev, "Failed to kill VLAN ID %d.\n", vid); return 0; } diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c --- a/drivers/virtio/virtio_balloon.c +++ b/drivers/virtio/virtio_balloon.c @@ -102,7 +102,7 @@ static void tell_host(struct virtio_ball sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns); /* We should always be able to add one buffer to an empty queue. */ - if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0) + if (virtqueue_add_buf(vq, &sg, NULL, vb, GFP_KERNEL) < 0) BUG(); virtqueue_kick(vq); @@ -246,7 +246,7 @@ static void stats_handle_request(struct if (!virtqueue_get_buf(vq, &len)) return; sg_init_one(&sg, vb->stats, sizeof(vb->stats)); - if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0) + if (virtqueue_add_buf(vq, &sg, NULL, vb, GFP_KERNEL) < 0) BUG(); virtqueue_kick(vq); } @@ -331,7 +331,7 @@ static int init_vqs(struct virtio_balloo * use it to signal us later. */ sg_init_one(&sg, vb->stats, sizeof vb->stats); - if (virtqueue_add_buf(vb->stats_vq, &sg, 1, 0, vb, GFP_KERNEL) + if (virtqueue_add_buf(vb->stats_vq, &sg, NULL, vb, GFP_KERNEL) < 0) BUG(); virtqueue_kick(vb->stats_vq); diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -121,35 +121,41 @@ struct vring_virtqueue #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq) +/* This doesn't have the counter that for_each_sg() has */ +#define foreach_sg(sglist, i) \ + for (i = (sglist); i; i = sg_next(i)) + /* Set up an indirect table of descriptors and add it to the queue. */ static int vring_add_indirect(struct vring_virtqueue *vq, - struct scatterlist sg[], - unsigned int out, - unsigned int in, + unsigned int num, + const struct scatterlist *out, + const struct scatterlist *in, gfp_t gfp) { + const struct scatterlist *sg; struct vring_desc *desc; unsigned head; int i; - desc = kmalloc((out + in) * sizeof(struct vring_desc), gfp); + desc = kmalloc(num * sizeof(struct vring_desc), gfp); if (!desc) return -ENOMEM; /* Transfer entries from the sg list into the indirect page */ - for (i = 0; i < out; i++) { + i = 0; + foreach_sg(out, sg) { desc[i].flags = VRING_DESC_F_NEXT; desc[i].addr = sg_phys(sg); desc[i].len = sg->length; desc[i].next = i+1; - sg++; + i++; } - for (; i < (out + in); i++) { + foreach_sg(in, sg) { desc[i].flags = VRING_DESC_F_NEXT|VRING_DESC_F_WRITE; desc[i].addr = sg_phys(sg); desc[i].len = sg->length; desc[i].next = i+1; - sg++; + i++; } /* Last one doesn't continue. */ @@ -171,12 +177,21 @@ static int vring_add_indirect(struct vri return head; } +static unsigned int count_sg(const struct scatterlist *sg) +{ + unsigned int count = 0; + const struct scatterlist *i; + + foreach_sg(sg, i) + count++; + return count; +} + /** * virtqueue_add_buf - expose buffer to other end * @vq: the struct virtqueue we're talking about. - * @sg: the description of the buffer(s). - * @out_num: the number of sg readable by other side - * @in_num: the number of sg which are writable (after readable ones) + * @out: the description of the output buffer(s). + * @in: the description of the input buffer(s). * @data: the token identifying the buffer. * @gfp: how to do memory allocations (if necessary). * @@ -189,20 +204,23 @@ static int vring_add_indirect(struct vri * we can put an entire sg[] array inside a single queue entry. */ int virtqueue_add_buf(struct virtqueue *_vq, - struct scatterlist sg[], - unsigned int out, - unsigned int in, + const struct scatterlist *out, + const struct scatterlist *in, void *data, gfp_t gfp) { struct vring_virtqueue *vq = to_vvq(_vq); - unsigned int i, avail, uninitialized_var(prev); + unsigned int i, avail, uninitialized_var(prev), num; + const struct scatterlist *sg; int head; START_USE(vq); BUG_ON(data == NULL); + num = count_sg(out) + count_sg(in); + BUG_ON(num == 0); + #ifdef DEBUG { ktime_t now = ktime_get(); @@ -218,18 +236,17 @@ int virtqueue_add_buf(struct virtqueue * /* If the host supports indirect descriptor tables, and we have multiple * buffers, then go indirect. FIXME: tune this threshold */ - if (vq->indirect && (out + in) > 1 && vq->num_free) { - head = vring_add_indirect(vq, sg, out, in, gfp); + if (vq->indirect && num > 1 && vq->num_free) { + head = vring_add_indirect(vq, num, out, in, gfp); if (likely(head >= 0)) goto add_head; } - BUG_ON(out + in > vq->vring.num); - BUG_ON(out + in == 0); + BUG_ON(num > vq->vring.num); - if (vq->num_free < out + in) { + if (vq->num_free < num) { pr_debug("Can't add buf len %i - avail = %i\n", - out + in, vq->num_free); + num, vq->num_free); /* FIXME: for historical reasons, we force a notify here if * there are outgoing parts to the buffer. Presumably the * host should service the ring ASAP. */ @@ -240,22 +257,24 @@ int virtqueue_add_buf(struct virtqueue * } /* We're about to use some buffers from the free list. */ - vq->num_free -= out + in; + vq->num_free -= num; head = vq->free_head; - for (i = vq->free_head; out; i = vq->vring.desc[i].next, out--) { + + i = vq->free_head; + foreach_sg(out, sg) { vq->vring.desc[i].flags = VRING_DESC_F_NEXT; vq->vring.desc[i].addr = sg_phys(sg); vq->vring.desc[i].len = sg->length; prev = i; - sg++; + i = vq->vring.desc[i].next; } - for (; in; i = vq->vring.desc[i].next, in--) { + foreach_sg(in, sg) { vq->vring.desc[i].flags = VRING_DESC_F_NEXT|VRING_DESC_F_WRITE; vq->vring.desc[i].addr = sg_phys(sg); vq->vring.desc[i].len = sg->length; prev = i; - sg++; + i = vq->vring.desc[i].next; } /* Last one doesn't continue. */ vq->vring.desc[prev].flags &= ~VRING_DESC_F_NEXT; diff --git a/include/linux/virtio.h b/include/linux/virtio.h --- a/include/linux/virtio.h +++ b/include/linux/virtio.h @@ -26,9 +26,8 @@ struct virtqueue { }; int virtqueue_add_buf(struct virtqueue *vq, - struct scatterlist sg[], - unsigned int out_num, - unsigned int in_num, + const struct scatterlist *out, + const struct scatterlist *in, void *data, gfp_t gfp); diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c --- a/net/9p/trans_virtio.c +++ b/net/9p/trans_virtio.c @@ -244,6 +244,14 @@ pack_sg_list_p(struct scatterlist *sg, i return index - start; } +static void sg_unset_end_markers(struct scatterlist *sg, unsigned int num) +{ + unsigned int i; + + for (i = 0; i < num; i++) + sg[i].page_link &= ~0x02; +} + /** * p9_virtio_request - issue a request * @client: client instance issuing the request @@ -258,6 +266,7 @@ p9_virtio_request(struct p9_client *clie int in, out; unsigned long flags; struct virtio_chan *chan = client->trans; + const struct scatterlist *outsg = NULL, *insg = NULL; p9_debug(P9_DEBUG_TRANS, "9p debug: virtio request\n"); @@ -268,12 +277,21 @@ req_retry: /* Handle out VirtIO ring buffers */ out = pack_sg_list(chan->sg, 0, VIRTQUEUE_NUM, req->tc->sdata, req->tc->size); + if (out) { + sg_unset_end_markers(chan->sg, out-1); + sg_mark_end(&chan->sg[out-1]); + outsg = chan->sg; + } in = pack_sg_list(chan->sg, out, VIRTQUEUE_NUM, req->rc->sdata, req->rc->capacity); + if (in) { + sg_unset_end_markers(chan->sg+out, in-1); + sg_mark_end(&chan->sg[out+in-1]); + insg = chan->sg+out; + } - err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc, - GFP_ATOMIC); + err = virtqueue_add_buf(chan->vq, outsg, insg, req->tc, GFP_ATOMIC); if (err < 0) { if (err == -ENOSPC) { chan->ring_bufs_avail = 0; @@ -355,6 +377,7 @@ p9_virtio_zc_request(struct p9_client *c int in_nr_pages = 0, out_nr_pages = 0; struct page **in_pages = NULL, **out_pages = NULL; struct virtio_chan *chan = client->trans; + struct scatterlist *insg = NULL, *outsg = NULL; p9_debug(P9_DEBUG_TRANS, "virtio request\n"); @@ -402,6 +425,13 @@ req_retry_pinned: if (out_pages) out += pack_sg_list_p(chan->sg, out, VIRTQUEUE_NUM, out_pages, out_nr_pages, uodata, outlen); + + if (out) { + sg_unset_end_markers(chan->sg, out-1); + sg_mark_end(&chan->sg[out-1]); + outsg = chan->sg; + } + /* * Take care of in data * For example TREAD have 11. @@ -414,9 +446,13 @@ req_retry_pinned: if (in_pages) in += pack_sg_list_p(chan->sg, out + in, VIRTQUEUE_NUM, in_pages, in_nr_pages, uidata, inlen); + if (in) { + sg_unset_end_markers(chan->sg+out, in-1); + sg_mark_end(&chan->sg[out+in-1]); + insg = chan->sg + out; + } - err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc, - GFP_ATOMIC); + err = virtqueue_add_buf(chan->vq, outsg, insg, req->tc, GFP_ATOMIC); if (err < 0) { if (err == -ENOSPC) { chan->ring_bufs_avail = 0;