Message ID | 20190621174553.28862-3-suwan.kim027@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | usbip: Implement SG support to vhci |
On Sat, 22 Jun 2019, Suwan Kim wrote:

> There are bugs on vhci with usb 3.0 storage device. Originally, vhci
> doesn't supported SG. So, USB storage driver on vhci divides SG list
> into multiple URBs and it causes buffer overflow on the xhci of the
> server. So we need to add SG support to vhci

It doesn't cause buffer overflow. The problem was that a transfer got
terminated too early because the transfer length for one of the URBs
was not divisible by the maxpacket size.

> In this patch, vhci basically support SG and it sends each SG list
> entry to the stub driver. Then, the stub driver sees total length of
> the buffer and allocates SG table and pages according to the total
> buffer length calling sgl_alloc(). After the stub driver receives
> completed URB, it again sends each SG list entry to the vhci.
>
> If HCD of server doesn't support SG, the stub driver allocates
> big buffer using kmalloc() instead of using sgl_alloc() which
> allocates SG list and pages.

You might be better off not using kmalloc. It's easier for the kernel
to allocate a bunch of small buffers than a single big one. Then you
can create a separate URB for each buffer.

> Alan fixed vhci bug with the USB 3.0 storage device by modifying
> USB storage driver.
> ("usb-storage: Set virt_boundary_mask to avoid SG overflows")
> But the fundamental solution of it is to add SG support to vhci.
>
> This patch works well with the USB 3.0 storage devices without Alan's
> patch, and we can revert Alan's patch if it causes some troubles.

These last two paragraphs don't need to be in the patch description.

> Signed-off-by: Suwan Kim <suwan.kim027@gmail.com>
> ---

I'm not sufficiently familiar with the usbip drivers to review most of
this. However...

> diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c
> index be87c8a63e24..cc93c1a87a3e 100644
> --- a/drivers/usb/usbip/vhci_hcd.c
> +++ b/drivers/usb/usbip/vhci_hcd.c
> @@ -696,7 +696,8 @@ static int vhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
>  	}
>  	vdev = &vhci_hcd->vdev[portnum-1];
>  
> -	if (!urb->transfer_buffer && urb->transfer_buffer_length) {
> +	if (!urb->transfer_buffer && !urb->num_sgs &&
> +	    urb->transfer_buffer_length) {
>  		dev_dbg(dev, "Null URB transfer buffer\n");
>  		return -EINVAL;
>  	}
> @@ -1142,6 +1143,11 @@ static int vhci_setup(struct usb_hcd *hcd)
>  		hcd->speed = HCD_USB3;
>  		hcd->self.root_hub->speed = USB_SPEED_SUPER;
>  	}
> +
> +	/* support sg */
> +	hcd->self.sg_tablesize = ~0;
> +	hcd->self.no_sg_constraint = 1;

You probably shouldn't do this, for two reasons. First, sg_tablesize
of the server's HCD may be smaller than ~0. If the client's value is
larger than the server's, a transfer could be accepted on the client
but then fail on the server because the SG list was too big.

Also, you may want to restrict the size of SG transfers even further,
so that you don't have to allocate a tremendous amount of memory all at
once on the server. An SG transfer can be quite large. I don't know
what a reasonable limit would be -- 16 perhaps?

Alan Stern
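The early-termination problem described in the reply above can be illustrated in plain C. A bulk transfer ends when a packet shorter than maxpacket is seen, so when one SG list is split across several URBs, every URB except the last needs a length that is a multiple of maxpacket. What follows is only a minimal userspace sketch under that assumption, not code from the patch or from usb-storage; the function name `first_short_urb` is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not kernel code): given the lengths of the URBs
 * that one logical transfer was split into, return the index of the
 * first non-final URB whose length is not a multiple of maxpacket.
 * Such a URB ends in a short packet, which terminates the transfer
 * early. Returns -1 if no non-final URB has that problem.
 */
int first_short_urb(const unsigned int *urb_len, size_t n,
		    unsigned int maxpacket)
{
	size_t i;

	if (maxpacket == 0)
		return -1;

	/* The final URB is allowed to end in a short packet. */
	for (i = 0; i + 1 < n; i++) {
		if (urb_len[i] % maxpacket != 0)
			return (int)i;
	}
	return -1;
}
```

With a maxpacket of 1024 (the SuperSpeed bulk value), a split of 4096, 1536 and 512 bytes is flagged at index 1, because 1536 leaves a 512-byte short packet in the middle of the transfer.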
On Fri, Jun 21, 2019 at 04:05:24PM -0400, Alan Stern wrote:
> On Sat, 22 Jun 2019, Suwan Kim wrote:
>
> > There are bugs on vhci with usb 3.0 storage device. Originally, vhci
> > doesn't supported SG. So, USB storage driver on vhci divides SG list
> > into multiple URBs and it causes buffer overflow on the xhci of the
> > server. So we need to add SG support to vhci
>
> It doesn't cause buffer overflow. The problem was that a transfer got
> terminated too early because the transfer length for one of the URBs
> was not divisible by the maxpacket size.

Oh.. I misunderstood the problem. I will rewrite the problem
situation.

> > In this patch, vhci basically support SG and it sends each SG list
> > entry to the stub driver. Then, the stub driver sees total length of
> > the buffer and allocates SG table and pages according to the total
> > buffer length calling sgl_alloc(). After the stub driver receives
> > completed URB, it again sends each SG list entry to the vhci.
> >
> > If HCD of server doesn't support SG, the stub driver allocates
> > big buffer using kmalloc() instead of using sgl_alloc() which
> > allocates SG list and pages.
>
> You might be better off not using kmalloc. It's easier for the kernel
> to allocate a bunch of small buffers than a single big one. Then you
> can create a separate URB for each buffer.

Ok. I will implement it as usb_sg_init() does and send v2 patch
including the logic of submitting separate URBs.

> > Alan fixed vhci bug with the USB 3.0 storage device by modifying
> > USB storage driver.
> > ("usb-storage: Set virt_boundary_mask to avoid SG overflows")
> > But the fundamental solution of it is to add SG support to vhci.
> >
> > This patch works well with the USB 3.0 storage devices without Alan's
> > patch, and we can revert Alan's patch if it causes some troubles.
>
> These last two paragraphs don't need to be in the patch description.

I will remove these paragraphs in v2 patch.

> > Signed-off-by: Suwan Kim <suwan.kim027@gmail.com>
> > ---
>
> I'm not sufficiently familiar with the usbip drivers to review most of
> this. However...
>
> > diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c
> > index be87c8a63e24..cc93c1a87a3e 100644
> > --- a/drivers/usb/usbip/vhci_hcd.c
> > +++ b/drivers/usb/usbip/vhci_hcd.c
> > @@ -696,7 +696,8 @@ static int vhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
> >  	}
> >  	vdev = &vhci_hcd->vdev[portnum-1];
> >  
> > -	if (!urb->transfer_buffer && urb->transfer_buffer_length) {
> > +	if (!urb->transfer_buffer && !urb->num_sgs &&
> > +	    urb->transfer_buffer_length) {
> >  		dev_dbg(dev, "Null URB transfer buffer\n");
> >  		return -EINVAL;
> >  	}
> > @@ -1142,6 +1143,11 @@ static int vhci_setup(struct usb_hcd *hcd)
> >  		hcd->speed = HCD_USB3;
> >  		hcd->self.root_hub->speed = USB_SPEED_SUPER;
> >  	}
> > +
> > +	/* support sg */
> > +	hcd->self.sg_tablesize = ~0;
> > +	hcd->self.no_sg_constraint = 1;
>
> You probably shouldn't do this, for two reasons. First, sg_tablesize
> of the server's HCD may be smaller than ~0. If the client's value is
> larger than the server's, a transfer could be accepted on the client
> but then fail on the server because the SG list was too big.
>
> Also, you may want to restrict the size of SG transfers even further,
> so that you don't have to allocate a tremendous amount of memory all at
> once on the server. An SG transfer can be quite large. I don't know
> what a reasonable limit would be -- 16 perhaps?

Is there any reason why you think that 16 is ok? Or Can I set this
value as the smallest value of all HC? I think that sg_tablesize
cannot be a variable value because vhci interacts with different
machines and all machines has different sg_tablesize value.

Regards

Suwan Kim
On Mon, 24 Jun 2019, Suwan Kim wrote:

> > > +	hcd->self.sg_tablesize = ~0;
> > > +	hcd->self.no_sg_constraint = 1;
> >
> > You probably shouldn't do this, for two reasons. First, sg_tablesize
> > of the server's HCD may be smaller than ~0. If the client's value is
> > larger than the server's, a transfer could be accepted on the client
> > but then fail on the server because the SG list was too big.

On the other hand, I don't know of any examples where an HCD has
sg_tablesize set to anything other than 0 or ~0. vhci-hcd might end up
being the only one.

> > Also, you may want to restrict the size of SG transfers even further,
> > so that you don't have to allocate a tremendous amount of memory all at
> > once on the server. An SG transfer can be quite large. I don't know
> > what a reasonable limit would be -- 16 perhaps?
>
> Is there any reason why you think that 16 is ok? Or Can I set this
> value as the smallest value of all HC? I think that sg_tablesize
> cannot be a variable value because vhci interacts with different
> machines and all machines has different sg_tablesize value.

I didn't have any good reason for picking 16. Using the smallest value
of all the HCDs seems like a good idea.

Alan Stern
On Mon, Jun 24, 2019 at 01:24:15PM -0400, Alan Stern wrote:
> On Mon, 24 Jun 2019, Suwan Kim wrote:
>
> > > > +	hcd->self.sg_tablesize = ~0;
> > > > +	hcd->self.no_sg_constraint = 1;
> > >
> > > You probably shouldn't do this, for two reasons. First, sg_tablesize
> > > of the server's HCD may be smaller than ~0. If the client's value is
> > > larger than the server's, a transfer could be accepted on the client
> > > but then fail on the server because the SG list was too big.
>
> On the other hand, I don't know of any examples where an HCD has
> sg_tablesize set to anything other than 0 or ~0. vhci-hcd might end up
> being the only one.
>
> > > Also, you may want to restrict the size of SG transfers even further,
> > > so that you don't have to allocate a tremendous amount of memory all at
> > > once on the server. An SG transfer can be quite large. I don't know
> > > what a reasonable limit would be -- 16 perhaps?
> >
> > Is there any reason why you think that 16 is ok? Or Can I set this
> > value as the smallest value of all HC? I think that sg_tablesize
> > cannot be a variable value because vhci interacts with different
> > machines and all machines has different sg_tablesize value.
>
> I didn't have any good reason for picking 16. Using the smallest value
> of all the HCDs seems like a good idea.

I also have not seen an HCD with a value other than ~0 or 0 except for
whci which uses 2048, but is not 2048 the maximum value of sg_tablesize?
If so, ~0 is the minimum value of sg_tablesize that supports SG. Then
can vhci use ~0 if we don't consider memory pressure of the server?

If all of the HCDs supporting SG have ~0 as sg_tablesize value, I
think that whether we use an HCD locally or remotely, the degree of
memory pressure is same in both local and remote usage.

Regards

Suwan Kim
On Fri, 5 Jul 2019, Suwan Kim wrote:

> On Mon, Jun 24, 2019 at 01:24:15PM -0400, Alan Stern wrote:
> > On Mon, 24 Jun 2019, Suwan Kim wrote:
> >
> > > > > +	hcd->self.sg_tablesize = ~0;
> > > > > +	hcd->self.no_sg_constraint = 1;
> > > >
> > > > You probably shouldn't do this, for two reasons. First, sg_tablesize
> > > > of the server's HCD may be smaller than ~0. If the client's value is
> > > > larger than the server's, a transfer could be accepted on the client
> > > > but then fail on the server because the SG list was too big.
> >
> > On the other hand, I don't know of any examples where an HCD has
> > sg_tablesize set to anything other than 0 or ~0. vhci-hcd might end up
> > being the only one.
> >
> > > > Also, you may want to restrict the size of SG transfers even further,
> > > > so that you don't have to allocate a tremendous amount of memory all at
> > > > once on the server. An SG transfer can be quite large. I don't know
> > > > what a reasonable limit would be -- 16 perhaps?
> > >
> > > Is there any reason why you think that 16 is ok? Or Can I set this
> > > value as the smallest value of all HC? I think that sg_tablesize
> > > cannot be a variable value because vhci interacts with different
> > > machines and all machines has different sg_tablesize value.
> >
> > I didn't have any good reason for picking 16. Using the smallest value
> > of all the HCDs seems like a good idea.
>
> I also have not seen an HCD with a value other than ~0 or 0 except for
> whci which uses 2048, but is not 2048 the maximum value of sg_tablesize?
> If so, ~0 is the minimum value of sg_tablesize that supports SG. Then
> can vhci use ~0 if we don't consider memory pressure of the server?
>
> If all of the HCDs supporting SG have ~0 as sg_tablesize value, I
> think that whether we use an HCD locally or remotely, the degree of
> memory pressure is same in both local and remote usage.

You have a lot of leeway. For example, there's no reason a single SG
transfer on the client has to correspond to a single SG transfer on the
host. In theory the client's vhci-hcd can break a large SG transfer up
into a bunch of smaller pieces and send them to the host one by one,
then reassemble the results to complete the original transfer. That
way the memory pressure on the host would be a lot smaller than on the
client.

Alan Stern
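The idea of breaking one large SG transfer into smaller pieces can be sketched in userspace C. This is only an illustration under stated assumptions, not a proposal for vhci-hcd: `struct sg_piece` and `split_sg_transfer` are hypothetical names, SG entries are modelled as a plain array of byte lengths, and the sketch ignores the separate requirement that every piece but the last end on a maxpacket boundary.

```c
#include <stddef.h>

/* Hypothetical description of one piece of a split SG transfer. */
struct sg_piece {
	size_t first;	/* index of the first SG entry in this piece */
	size_t nents;	/* number of SG entries in this piece */
	size_t bytes;	/* total payload bytes in this piece */
};

/*
 * Split nents SG entries into pieces of at most max_ents entries each.
 * Returns the number of pieces written to out[], or 0 if max_ents is 0
 * or out[] (capacity out_cap) is too small.
 */
size_t split_sg_transfer(const unsigned int *ent_len, size_t nents,
			 size_t max_ents, struct sg_piece *out,
			 size_t out_cap)
{
	size_t npieces = 0;
	size_t i = 0;

	if (max_ents == 0)
		return 0;

	while (i < nents) {
		struct sg_piece p = { i, 0, 0 };
		size_t j;

		if (npieces == out_cap)
			return 0;

		/* Take up to max_ents consecutive entries for this piece. */
		for (j = 0; j < max_ents && i < nents; j++, i++) {
			p.nents++;
			p.bytes += ent_len[i];
		}
		out[npieces++] = p;
	}
	return npieces;
}
```

Five 4096-byte entries with a limit of two entries per piece yield three pieces of 8192, 8192 and 4096 bytes. The host then only needs buffers for one piece at a time, which is the memory-pressure argument made above.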
On Thu, Jul 04, 2019 at 09:41:04PM -0400, Alan Stern wrote:
> On Fri, 5 Jul 2019, Suwan Kim wrote:
>
> > On Mon, Jun 24, 2019 at 01:24:15PM -0400, Alan Stern wrote:
> > > On Mon, 24 Jun 2019, Suwan Kim wrote:
> > >
> > > > > > +	hcd->self.sg_tablesize = ~0;
> > > > > > +	hcd->self.no_sg_constraint = 1;
> > > > >
> > > > > You probably shouldn't do this, for two reasons. First, sg_tablesize
> > > > > of the server's HCD may be smaller than ~0. If the client's value is
> > > > > larger than the server's, a transfer could be accepted on the client
> > > > > but then fail on the server because the SG list was too big.
> > >
> > > On the other hand, I don't know of any examples where an HCD has
> > > sg_tablesize set to anything other than 0 or ~0. vhci-hcd might end up
> > > being the only one.
> > >
> > > > > Also, you may want to restrict the size of SG transfers even further,
> > > > > so that you don't have to allocate a tremendous amount of memory all at
> > > > > once on the server. An SG transfer can be quite large. I don't know
> > > > > what a reasonable limit would be -- 16 perhaps?
> > > >
> > > > Is there any reason why you think that 16 is ok? Or Can I set this
> > > > value as the smallest value of all HC? I think that sg_tablesize
> > > > cannot be a variable value because vhci interacts with different
> > > > machines and all machines has different sg_tablesize value.
> > >
> > > I didn't have any good reason for picking 16. Using the smallest value
> > > of all the HCDs seems like a good idea.
> >
> > I also have not seen an HCD with a value other than ~0 or 0 except for
> > whci which uses 2048, but is not 2048 the maximum value of sg_tablesize?
> > If so, ~0 is the minimum value of sg_tablesize that supports SG. Then
> > can vhci use ~0 if we don't consider memory pressure of the server?
> >
> > If all of the HCDs supporting SG have ~0 as sg_tablesize value, I
> > think that whether we use an HCD locally or remotely, the degree of
> > memory pressure is same in both local and remote usage.
>
> You have a lot of leeway. For example, there's no reason a single SG
> transfer on the client has to correspond to a single SG transfer on the
> host. In theory the client's vhci-hcd can break a large SG transfer up
> into a bunch of smaller pieces and send them to the host one by one,
> then reassemble the results to complete the original transfer. That
> way the memory pressure on the host would be a lot smaller than on the
> client.

Thank you for the feedback, Alan. I understood your comment. It seems
to be a good idea to use sg_tablesize to alleviate the memory pressure
of the host. But I think 16 is too small for USB 3.0 device because
USB 3.0 storage device in my machine usually uses more than 30 SG
entries. So, I will set sg_tablesize to 32.

Regards

Suwan Kim
diff --git a/drivers/usb/usbip/stub_rx.c b/drivers/usb/usbip/stub_rx.c
index 97b09a42a10c..798854b9bbc8 100644
--- a/drivers/usb/usbip/stub_rx.c
+++ b/drivers/usb/usbip/stub_rx.c
@@ -7,6 +7,7 @@
 #include <linux/kthread.h>
 #include <linux/usb.h>
 #include <linux/usb/hcd.h>
+#include <linux/scatterlist.h>
 
 #include "usbip_common.h"
 #include "stub.h"
@@ -446,7 +447,11 @@ static void stub_recv_cmd_submit(struct stub_device *sdev,
 	struct stub_priv *priv;
 	struct usbip_device *ud = &sdev->ud;
 	struct usb_device *udev = sdev->udev;
+	struct scatterlist *sgl;
+	int nents;
 	int pipe = get_pipe(sdev, pdu);
+	unsigned long long buf_len;
+	int use_sg;
 
 	if (pipe == -1)
 		return;
@@ -455,6 +460,10 @@ static void stub_recv_cmd_submit(struct stub_device *sdev,
 	if (!priv)
 		return;
 
+	buf_len = (unsigned long long)pdu->u.cmd_submit.transfer_buffer_length;
+	/* Check if the server's HCD supprot sg */
+	use_sg = pdu->u.cmd_submit.num_sgs && udev->bus->sg_tablesize;
+
 	/* setup a urb */
 	if (usb_pipeisoc(pipe))
 		priv->urb = usb_alloc_urb(pdu->u.cmd_submit.number_of_packets,
@@ -468,13 +477,22 @@ static void stub_recv_cmd_submit(struct stub_device *sdev,
 	}
 
 	/* allocate urb transfer buffer, if needed */
-	if (pdu->u.cmd_submit.transfer_buffer_length > 0) {
-		priv->urb->transfer_buffer =
-			kzalloc(pdu->u.cmd_submit.transfer_buffer_length,
-				GFP_KERNEL);
-		if (!priv->urb->transfer_buffer) {
-			usbip_event_add(ud, SDEV_EVENT_ERROR_MALLOC);
-			return;
+	if (buf_len > 0) {
+		if (use_sg) {
+			sgl = sgl_alloc(buf_len, GFP_KERNEL, &nents);
+			if (!sgl) {
+				usbip_event_add(ud, SDEV_EVENT_ERROR_MALLOC);
+				return;
+			}
+			priv->urb->sg = sgl;
+			priv->urb->num_sgs = nents;
+			priv->urb->transfer_buffer = NULL;
+		} else {
+			priv->urb->transfer_buffer = kzalloc(buf_len, GFP_KERNEL);
+			if (!priv->urb->transfer_buffer) {
+				usbip_event_add(ud, SDEV_EVENT_ERROR_MALLOC);
+				return;
+			}
 		}
 	}
 
diff --git a/drivers/usb/usbip/stub_tx.c b/drivers/usb/usbip/stub_tx.c
index f0ec41a50cbc..ece129cd0b28 100644
--- a/drivers/usb/usbip/stub_tx.c
+++ b/drivers/usb/usbip/stub_tx.c
@@ -5,6 +5,7 @@
 
 #include <linux/kthread.h>
 #include <linux/socket.h>
+#include <linux/scatterlist.h>
 
 #include "usbip_common.h"
 #include "stub.h"
@@ -13,11 +14,21 @@ static void stub_free_priv_and_urb(struct stub_priv *priv)
 {
 	struct urb *urb = priv->urb;
 
-	kfree(urb->setup_packet);
-	urb->setup_packet = NULL;
+	if (urb->setup_packet) {
+		kfree(urb->setup_packet);
+		urb->setup_packet = NULL;
+	}
+
+	if (urb->transfer_buffer) {
+		kfree(urb->transfer_buffer);
+		urb->transfer_buffer = NULL;
+	}
 
-	kfree(urb->transfer_buffer);
-	urb->transfer_buffer = NULL;
+	if (urb->num_sgs) {
+		sgl_free(urb->sg);
+		urb->sg = NULL;
+		urb->num_sgs = 0;
+	}
 
 	list_del(&priv->list);
 	kmem_cache_free(stub_priv_cache, priv);
@@ -161,13 +172,16 @@ static int stub_send_ret_submit(struct stub_device *sdev)
 		struct usbip_header pdu_header;
 		struct usbip_iso_packet_descriptor *iso_buffer = NULL;
 		struct kvec *iov = NULL;
+		struct scatterlist *sg;
 		int iovnum = 0;
+		int i;
 
 		txsize = 0;
 		memset(&pdu_header, 0, sizeof(pdu_header));
 		memset(&msg, 0, sizeof(msg));
 
-		if (urb->actual_length > 0 && !urb->transfer_buffer) {
+		if (urb->actual_length > 0 && !urb->transfer_buffer &&
+		    !urb->num_sgs) {
 			dev_err(&sdev->udev->dev,
 				"urb: actual_length %d transfer_buffer null\n",
 				urb->actual_length);
@@ -176,6 +190,8 @@ static int stub_send_ret_submit(struct stub_device *sdev)
 
 		if (usb_pipetype(urb->pipe) == PIPE_ISOCHRONOUS)
 			iovnum = 2 + urb->number_of_packets;
+		else if (usb_pipein(urb->pipe) && urb->actual_length > 0 && urb->num_sgs)
+			iovnum = 1 + urb->num_sgs;
 		else
 			iovnum = 2;
 
@@ -203,9 +219,30 @@ static int stub_send_ret_submit(struct stub_device *sdev)
 		if (usb_pipein(urb->pipe) &&
 		    usb_pipetype(urb->pipe) != PIPE_ISOCHRONOUS &&
 		    urb->actual_length > 0) {
-			iov[iovnum].iov_base = urb->transfer_buffer;
-			iov[iovnum].iov_len = urb->actual_length;
-			iovnum++;
+			if (urb->num_sgs) {
+				unsigned int copy = urb->actual_length;
+				int size;
+
+				for_each_sg(urb->sg, sg, urb->num_sgs ,i) {
+					if (copy == 0)
+						break;
+
+					if (copy < sg->length)
+						size = copy;
+					else
+						size = sg->length;
+
+					iov[iovnum].iov_base = sg_virt(sg);
+					iov[iovnum].iov_len = size;
+
+					iovnum++;
+					copy -= size;
+				}
+			} else {
+				iov[iovnum].iov_base = urb->transfer_buffer;
+				iov[iovnum].iov_len = urb->actual_length;
+				iovnum++;
+			}
 			txsize += urb->actual_length;
 		} else if (usb_pipein(urb->pipe) &&
 			   usb_pipetype(urb->pipe) == PIPE_ISOCHRONOUS) {
diff --git a/drivers/usb/usbip/usbip_common.c b/drivers/usb/usbip/usbip_common.c
index 45da3e01c7b0..56b2a1fbe0bf 100644
--- a/drivers/usb/usbip/usbip_common.c
+++ b/drivers/usb/usbip/usbip_common.c
@@ -365,6 +365,7 @@ static void usbip_pack_cmd_submit(struct usbip_header *pdu, struct urb *urb,
 		spdu->start_frame = urb->start_frame;
 		spdu->number_of_packets = urb->number_of_packets;
 		spdu->interval = urb->interval;
+		spdu->num_sgs = urb->num_sgs;
 	} else {
 		urb->transfer_flags = spdu->transfer_flags;
 		urb->transfer_buffer_length = spdu->transfer_buffer_length;
@@ -434,6 +435,7 @@ static void correct_endian_cmd_submit(struct usbip_header_cmd_submit *pdu,
 {
 	if (send) {
 		pdu->transfer_flags = cpu_to_be32(pdu->transfer_flags);
+		pdu->num_sgs = cpu_to_be32(pdu->num_sgs);
 
 		cpu_to_be32s(&pdu->transfer_buffer_length);
 		cpu_to_be32s(&pdu->start_frame);
@@ -441,6 +443,7 @@ static void correct_endian_cmd_submit(struct usbip_header_cmd_submit *pdu,
 		cpu_to_be32s(&pdu->interval);
 	} else {
 		pdu->transfer_flags = be32_to_cpu(pdu->transfer_flags);
+		pdu->num_sgs = be32_to_cpu(pdu->num_sgs);
 
 		be32_to_cpus(&pdu->transfer_buffer_length);
 		be32_to_cpus(&pdu->start_frame);
@@ -680,8 +683,12 @@ EXPORT_SYMBOL_GPL(usbip_pad_iso);
 /* some members of urb must be substituted before. */
 int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb)
 {
-	int ret;
+	struct scatterlist *sg;
+	int ret = 0;
+	int recv;
 	int size;
+	int copy;
+	int i;
 
 	if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) {
 		/* the direction of urb must be OUT. */
@@ -712,14 +719,49 @@ int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb)
 		}
 	}
 
-	ret = usbip_recv(ud->tcp_socket, urb->transfer_buffer, size);
-	if (ret != size) {
-		dev_err(&urb->dev->dev, "recv xbuf, %d\n", ret);
-		if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) {
-			usbip_event_add(ud, SDEV_EVENT_ERROR_TCP);
-		} else {
-			usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
-			return -EPIPE;
+	if (urb->num_sgs) {
+		copy = size;
+		for_each_sg(urb->sg, sg, urb->num_sgs, i) {
+			int recv_size;
+
+			if (copy < sg->length)
+				recv_size = copy;
+			else
+				recv_size = sg->length;
+
+			recv = usbip_recv(ud->tcp_socket, sg_virt(sg), recv_size);
+			if (recv != recv_size) {
+				dev_err(&urb->dev->dev, "recv xbuf, %d\n", ret);
+				if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) {
+					usbip_event_add(ud, SDEV_EVENT_ERROR_TCP);
+				} else {
+					usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
+					return -EPIPE;
+				}
+			}
+			copy -= recv;
+			ret += recv;
+		}
+
+		if (ret != size) {
+			dev_err(&urb->dev->dev, "recv xbuf, %d\n", ret);
+			if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) {
+				usbip_event_add(ud, SDEV_EVENT_ERROR_TCP);
+			} else {
+				usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
+				return -EPIPE;
+			}
+		}
+	} else {
+		ret = usbip_recv(ud->tcp_socket, urb->transfer_buffer, size);
+		if (ret != size) {
+			dev_err(&urb->dev->dev, "recv xbuf, %d\n", ret);
+			if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) {
+				usbip_event_add(ud, SDEV_EVENT_ERROR_TCP);
+			} else {
+				usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
+				return -EPIPE;
+			}
 		}
 	}
 
diff --git a/drivers/usb/usbip/usbip_common.h b/drivers/usb/usbip/usbip_common.h
index bf8afe9b5883..b395946c4d03 100644
--- a/drivers/usb/usbip/usbip_common.h
+++ b/drivers/usb/usbip/usbip_common.h
@@ -143,6 +143,7 @@ struct usbip_header_basic {
  * struct usbip_header_cmd_submit - USBIP_CMD_SUBMIT packet header
  * @transfer_flags: URB flags
  * @transfer_buffer_length: the data size for (in) or (out) transfer
+ * @num_sgs: the number of scatter gather list of URB
  * @start_frame: initial frame for isochronous or interrupt transfers
  * @number_of_packets: number of isochronous packets
  * @interval: maximum time for the request on the server-side host controller
@@ -151,6 +152,7 @@ struct usbip_header_basic {
 struct usbip_header_cmd_submit {
 	__u32 transfer_flags;
 	__s32 transfer_buffer_length;
+	__u32 num_sgs;
 
 	/* it is difficult for usbip to sync frames (reserved only?) */
 	__s32 start_frame;
diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c
index be87c8a63e24..cc93c1a87a3e 100644
--- a/drivers/usb/usbip/vhci_hcd.c
+++ b/drivers/usb/usbip/vhci_hcd.c
@@ -696,7 +696,8 @@ static int vhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
 	}
 	vdev = &vhci_hcd->vdev[portnum-1];
 
-	if (!urb->transfer_buffer && urb->transfer_buffer_length) {
+	if (!urb->transfer_buffer && !urb->num_sgs &&
+	    urb->transfer_buffer_length) {
 		dev_dbg(dev, "Null URB transfer buffer\n");
 		return -EINVAL;
 	}
@@ -1142,6 +1143,11 @@ static int vhci_setup(struct usb_hcd *hcd)
 		hcd->speed = HCD_USB3;
 		hcd->self.root_hub->speed = USB_SPEED_SUPER;
 	}
+
+	/* support sg */
+	hcd->self.sg_tablesize = ~0;
+	hcd->self.no_sg_constraint = 1;
+
 	return 0;
 }
 
diff --git a/drivers/usb/usbip/vhci_tx.c b/drivers/usb/usbip/vhci_tx.c
index 2fa26d0578d7..3472180f5af8 100644
--- a/drivers/usb/usbip/vhci_tx.c
+++ b/drivers/usb/usbip/vhci_tx.c
@@ -5,6 +5,7 @@
 
 #include <linux/kthread.h>
 #include <linux/slab.h>
+#include <linux/scatterlist.h>
 
 #include "usbip_common.h"
 #include "vhci.h"
@@ -51,12 +52,13 @@ static struct vhci_priv *dequeue_from_priv_tx(struct vhci_device *vdev)
 static int vhci_send_cmd_submit(struct vhci_device *vdev)
 {
 	struct vhci_priv *priv = NULL;
-
+	struct scatterlist *sg;
 	struct msghdr msg;
-	struct kvec iov[3];
+	struct kvec *iov;
 	size_t txsize;
-
 	size_t total_size = 0;
+	int iovnum;
+	int i;
 
 	while ((priv = dequeue_from_priv_tx(vdev)) != NULL) {
 		int ret;
@@ -72,18 +74,41 @@ static int vhci_send_cmd_submit(struct vhci_device *vdev)
 		usbip_dbg_vhci_tx("setup txdata urb seqnum %lu\n",
 				  priv->seqnum);
 
+		if (urb->num_sgs && usb_pipeout(urb->pipe))
+			iovnum = 2 + urb->num_sgs;
+		else
+			iovnum = 3;
+
+		iov = kzalloc(iovnum * sizeof(*iov), GFP_KERNEL);
+		if (!iov) {
+			usbip_event_add(&vdev->ud,
+					SDEV_EVENT_ERROR_MALLOC);
+			return -ENOMEM;
+		}
+
 		/* 1. setup usbip_header */
 		setup_cmd_submit_pdu(&pdu_header, urb);
 		usbip_header_correct_endian(&pdu_header, 1);
+		iovnum = 0;
 
-		iov[0].iov_base = &pdu_header;
-		iov[0].iov_len = sizeof(pdu_header);
+		iov[iovnum].iov_base = &pdu_header;
+		iov[iovnum].iov_len = sizeof(pdu_header);
 		txsize += sizeof(pdu_header);
+		iovnum++;
 
 		/* 2. setup transfer buffer */
 		if (!usb_pipein(urb->pipe) && urb->transfer_buffer_length > 0) {
-			iov[1].iov_base = urb->transfer_buffer;
-			iov[1].iov_len = urb->transfer_buffer_length;
+			if (urb->num_sgs && !usb_endpoint_xfer_isoc(&urb->ep->desc)) {
+				for_each_sg(urb->sg, sg, urb->num_sgs ,i) {
+					iov[iovnum].iov_base = sg_virt(sg);
+					iov[iovnum].iov_len = sg->length;
+					iovnum++;
+				}
+			} else {
+				iov[iovnum].iov_base = urb->transfer_buffer;
+				iov[iovnum].iov_len = urb->transfer_buffer_length;
+				iovnum++;
+			}
 			txsize += urb->transfer_buffer_length;
 		}
 
@@ -93,25 +118,29 @@ static int vhci_send_cmd_submit(struct vhci_device *vdev)
 
 			iso_buffer = usbip_alloc_iso_desc_pdu(urb, &len);
 			if (!iso_buffer) {
+				kfree(iov);
 				usbip_event_add(&vdev->ud,
 						SDEV_EVENT_ERROR_MALLOC);
 				return -1;
 			}
 
-			iov[2].iov_base = iso_buffer;
-			iov[2].iov_len = len;
+			iov[iovnum].iov_base = iso_buffer;
+			iov[iovnum].iov_len = len;
+			iovnum++;
 			txsize += len;
 		}
 
-		ret = kernel_sendmsg(vdev->ud.tcp_socket, &msg, iov, 3, txsize);
+		ret = kernel_sendmsg(vdev->ud.tcp_socket, &msg, iov, iovnum, txsize);
 		if (ret != txsize) {
 			pr_err("sendmsg failed!, ret=%d for %zd\n", ret,
 			       txsize);
+			kfree(iov);
 			kfree(iso_buffer);
 			usbip_event_add(&vdev->ud, VDEV_EVENT_ERROR_TCP);
 			return -1;
 		}
 
+		kfree(iov);
 		kfree(iso_buffer);
 		usbip_dbg_vhci_tx("send txdata\n");
 
There are bugs on vhci with usb 3.0 storage device. Originally, vhci
doesn't supported SG. So, USB storage driver on vhci divides SG list
into multiple URBs and it causes buffer overflow on the xhci of the
server. So we need to add SG support to vhci

In this patch, vhci basically support SG and it sends each SG list
entry to the stub driver. Then, the stub driver sees total length of
the buffer and allocates SG table and pages according to the total
buffer length calling sgl_alloc(). After the stub driver receives
completed URB, it again sends each SG list entry to the vhci.

If HCD of server doesn't support SG, the stub driver allocates
big buffer using kmalloc() instead of using sgl_alloc() which
allocates SG list and pages.

Alan fixed vhci bug with the USB 3.0 storage device by modifying
USB storage driver.
("usb-storage: Set virt_boundary_mask to avoid SG overflows")
But the fundamental solution of it is to add SG support to vhci.

This patch works well with the USB 3.0 storage devices without Alan's
patch, and we can revert Alan's patch if it causes some troubles.

Signed-off-by: Suwan Kim <suwan.kim027@gmail.com>
---
 drivers/usb/usbip/stub_rx.c      | 32 +++++++++++++----
 drivers/usb/usbip/stub_tx.c      | 53 +++++++++++++++++++++++-----
 drivers/usb/usbip/usbip_common.c | 60 +++++++++++++++++++++++++++-----
 drivers/usb/usbip/usbip_common.h |  2 ++
 drivers/usb/usbip/vhci_hcd.c     |  8 ++++-
 drivers/usb/usbip/vhci_tx.c      | 49 ++++++++++++++++++++------
 6 files changed, 169 insertions(+), 35 deletions(-)
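
The description says the stub driver "sends each SG list entry to the vhci" once the URB completes. The corresponding loop in stub_send_ret_submit() clamps each entry to the bytes that were actually transferred. The following is a rough userspace analogue of that loop, offered only as a sketch: it uses the POSIX `struct iovec` in place of the kernel's `struct kvec`, and `struct seg` and `fill_iov` are hypothetical stand-ins for the scatterlist and the in-kernel code.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct seg {
	void *base;
	size_t len;
};

/*
 * Fill iov[] from segs[] until actual_length bytes are covered.
 * The last entry used is truncated to the remaining byte count and
 * later segments are skipped. iov[] must have room for nsegs entries.
 * Returns the number of iov entries filled.
 */
size_t fill_iov(const struct seg *segs, size_t nsegs,
		size_t actual_length, struct iovec *iov)
{
	size_t remaining = actual_length;
	size_t n = 0;
	size_t i;

	for (i = 0; i < nsegs && remaining > 0; i++) {
		size_t size = remaining < segs[i].len ? remaining : segs[i].len;

		iov[n].iov_base = segs[i].base;
		iov[n].iov_len = size;
		n++;
		remaining -= size;
	}
	return n;
}
```

For three 4-byte segments and an actual length of 6, only two entries are filled, with lengths 4 and 2, so the receiver is never sent bytes beyond what the device returned.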