Message ID | 20220402233914.3625405-4-m.grzeschik@pengutronix.de (mailing list archive) |
---|---|
State | Superseded |
Series | usb: gadget: uvc: fixes and improvements |
Hi Michael,

Thank you for the patch.

On Sun, Apr 03, 2022 at 01:39:12AM +0200, Michael Grzeschik wrote:
> Likewise to the uvcvideo hostside driver, this patch is changing the
> simple workqueue to an async_wq with higher priority. This ensures that
> the worker will not be scheduled away while the video stream is handled.
>
> Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
> ---
>  drivers/usb/gadget/function/uvc.h       | 1 +
>  drivers/usb/gadget/function/uvc_v4l2.c  | 2 +-
>  drivers/usb/gadget/function/uvc_video.c | 9 +++++++--
>  3 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/usb/gadget/function/uvc.h b/drivers/usb/gadget/function/uvc.h
> index c3607a32b98624..ab537acdae3184 100644
> --- a/drivers/usb/gadget/function/uvc.h
> +++ b/drivers/usb/gadget/function/uvc.h
> @@ -86,6 +86,7 @@ struct uvc_video {
>  	struct usb_ep *ep;
>
>  	struct work_struct pump;
> +	struct workqueue_struct *async_wq;
>
>  	/* Frame parameters */
>  	u8 bpp;
> diff --git a/drivers/usb/gadget/function/uvc_v4l2.c b/drivers/usb/gadget/function/uvc_v4l2.c
> index a2c78690c5c288..9b1488f7abd736 100644
> --- a/drivers/usb/gadget/function/uvc_v4l2.c
> +++ b/drivers/usb/gadget/function/uvc_v4l2.c
> @@ -170,7 +170,7 @@ uvc_v4l2_qbuf(struct file *file, void *fh, struct v4l2_buffer *b)
>  		return ret;
>
>  	if (uvc->state == UVC_STATE_STREAMING)
> -		schedule_work(&video->pump);
> +		queue_work(video->async_wq, &video->pump);
>
>  	return ret;
>  }
> diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c
> index 7f59a0c4740209..b1075e23a61010 100644
> --- a/drivers/usb/gadget/function/uvc_video.c
> +++ b/drivers/usb/gadget/function/uvc_video.c
> @@ -269,7 +269,7 @@ uvc_video_complete(struct usb_ep *ep, struct usb_request *req)
>  	spin_unlock_irqrestore(&video->req_lock, flags);
>
>  	if (uvc->state == UVC_STATE_STREAMING)
> -		schedule_work(&video->pump);
> +		queue_work(video->async_wq, &video->pump);
>  }
>
>  static int
> @@ -469,7 +469,7 @@ int uvcg_video_enable(struct uvc_video *video, int enable)
>
>  	video->req_int_count = 0;
>
> -	schedule_work(&video->pump);
> +	queue_work(video->async_wq, &video->pump);
>
>  	return ret;
>  }
> @@ -483,6 +483,11 @@ int uvcg_video_init(struct uvc_video *video, struct uvc_device *uvc)
>  	spin_lock_init(&video->req_lock);
>  	INIT_WORK(&video->pump, uvcg_video_pump);
>
> +	/* Allocate a stream specific work queue for asynchronous tasks. */

You can drop the "stream" here. The gadget driver handles a single
stream.

> +	video->async_wq = alloc_workqueue("uvcvideo", WQ_UNBOUND | WQ_HIGHPRI, 0);

Unless I'm mistaken, an unbound work queue means that multiple CPUs will
handle tasks in parallel. Is that safe ?

> +	if (!video->async_wq)
> +		return -EINVAL;

No need to destroy the work queue somewhere ?

> +
>  	video->uvc = uvc;
>  	video->fcc = V4L2_PIX_FMT_YUYV;
>  	video->bpp = 16;

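Editorial note: the question about destroying the work queue can be illustrated with a small sketch. It is written as userspace C so that it compiles on its own; the `alloc_workqueue()` and `destroy_workqueue()` bodies below are simplified stand-ins for the kernel functions of the same name, the flag values are placeholders, and `sketch_video_init()`, `sketch_video_cleanup()`, `sketch_demo()` and `live_queues` are hypothetical names that do not appear in the patch. The sketch also returns -ENOMEM, the usual kernel convention for a failed allocation; that is a suggestion here, not a point raised in the review.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Userspace stand-ins so the sketch compiles outside the kernel. In the
 * kernel these come from <linux/workqueue.h>; the stubs only model
 * "allocate" and "free" and use a simplified, non-variadic signature.
 */
struct workqueue_struct { int unused; };
#define WQ_UNBOUND 0x1u /* placeholder value, not the kernel's */
#define WQ_HIGHPRI 0x2u /* placeholder value, not the kernel's */

static int live_queues; /* hypothetical leak counter for the demo */

static struct workqueue_struct *alloc_workqueue(const char *name,
						unsigned int flags,
						int max_active)
{
	struct workqueue_struct *wq = calloc(1, sizeof(*wq));

	(void)name; (void)flags; (void)max_active;
	if (wq)
		live_queues++;
	return wq;
}

static void destroy_workqueue(struct workqueue_struct *wq)
{
	free(wq);
	live_queues--;
}

/* Reduced to the one field this sketch needs. */
struct uvc_video { struct workqueue_struct *async_wq; };

/* Init path: report allocation failure as -ENOMEM rather than -EINVAL. */
static int sketch_video_init(struct uvc_video *video)
{
	video->async_wq = alloc_workqueue("uvcvideo", WQ_UNBOUND | WQ_HIGHPRI, 0);
	if (!video->async_wq)
		return -ENOMEM;
	return 0;
}

/* Hypothetical teardown hook: the posted patch has no counterpart. */
static void sketch_video_cleanup(struct uvc_video *video)
{
	if (video->async_wq) {
		destroy_workqueue(video->async_wq);
		video->async_wq = NULL;
	}
}

/* Init followed by cleanup should leave no queue allocated. */
static int sketch_demo(void)
{
	struct uvc_video video = { NULL };
	int ret = sketch_video_init(&video);

	assert(ret == 0 || ret == -ENOMEM);
	sketch_video_cleanup(&video);
	return live_queues;
}
```

Where the real teardown call would belong in the gadget driver is not settled in this thread, so the cleanup function above only shows the pairing of allocation and destruction.
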
Hi Michael,

Thanks for this change it improves the performance with the DWC3
controller on QCOM chips in an Android 5.10 kernel. I haven't tested the
scatter/gather path, so memcpy was used here via
uvc_video_encode_isoc(). I was able to get around 30% improvement (fps
on host side). I did modify the alloc to only set the WQ_HIGHPRI flag.

On Tue, Apr 19, 2022 at 11:46:57PM +0300, Laurent Pinchart wrote:
> Hi Michael,
>
> Thank you for the patch.
>
> On Sun, Apr 03, 2022 at 01:39:12AM +0200, Michael Grzeschik wrote:
> > Likewise to the uvcvideo hostside driver, this patch is changing the
> > simple workqueue to an async_wq with higher priority. This ensures that
> > the worker will not be scheduled away while the video stream is handled.
> >
> > Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
> > +	video->async_wq = alloc_workqueue("uvcvideo", WQ_UNBOUND | WQ_HIGHPRI, 0);
>
> Unless I'm mistaken, an unbound work queue means that multiple CPUs will
> handle tasks in parallel. Is that safe ?

I found that with the WQ_UNBOUND flag I didn't see any performance
improvement to the baseline, perhaps related to cpu caching or
scheduling delays. I didn't notice any stability problems or concurrent
execution. Do you see any benefit to keeping the WQ_UNBOUND flag?

>
> > +	if (!video->async_wq)
> > +		return -EINVAL;
>
> --
> Regards,
>
> Laurent Pinchart

Thanks,

Dan

Hi Dan,
Hi Laurent,

On Fri, Apr 29, 2022 at 01:51:48PM -0500, Dan Vacura wrote:
>Thanks for this change it improves the performance with the DWC3
>controller on QCOM chips in an Android 5.10 kernel. I haven't tested the
>scatter/gather path, so memcpy was used here via
>uvc_video_encode_isoc(). I was able to get around 30% improvement (fps
>on host side). I did modify the alloc to only set the WQ_HIGHPRI flag.
>
>On Tue, Apr 19, 2022 at 11:46:57PM +0300, Laurent Pinchart wrote:
>> Thank you for the patch.
>>
>> On Sun, Apr 03, 2022 at 01:39:12AM +0200, Michael Grzeschik wrote:
>> > Likewise to the uvcvideo hostside driver, this patch is changing the
>> > simple workqueue to an async_wq with higher priority. This ensures that
>> > the worker will not be scheduled away while the video stream is handled.
>> >
>> > Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
>> > +	video->async_wq = alloc_workqueue("uvcvideo", WQ_UNBOUND | WQ_HIGHPRI, 0);
>>
>> Unless I'm mistaken, an unbound work queue means that multiple CPUs will
>> handle tasks in parallel. Is that safe ?
>
>I found that with the WQ_UNBOUND flag I didn't see any performance
>improvement to the baseline, perhaps related to cpu caching or
>scheduling delays. I didn't notice any stability problems or concurrent
>execution. Do you see any benefit to keeping the WQ_UNBOUND flag?

I actually copied this from drivers/media/usb/uvc/uvc_driver.c ,
which is also allocating the workqueue with WQ_UNBOUND.

Look into drivers/media/usb/uvc/uvc_driver.c + 486

stream->async_wq = alloc_workqueue("uvcvideo", WQ_UNBOUND | WQ_HIGHPRI,

In my tests, continuous streaming did not trigger any errors. In fact if
this would be unsafe, the issue would probably trigger early, numerous
and obvious on multicore cpus.

However, some users seem to have seen recent issues on unplugging the
cable while streaming. I have to check if this could be related.

>> > +	if (!video->async_wq)
>> > +		return -EINVAL;
>>
>> --
>> Regards,
>>
>> Laurent Pinchart
>
>Thanks,
>
>Dan
>

Hi Dan,

On Fri, Apr 29, 2022 at 10:01:37PM +0200, Michael Grzeschik wrote:
>On Fri, Apr 29, 2022 at 01:51:48PM -0500, Dan Vacura wrote:
>>Thanks for this change it improves the performance with the DWC3
>>controller on QCOM chips in an Android 5.10 kernel. I haven't tested the
>>scatter/gather path, so memcpy was used here via
>>uvc_video_encode_isoc(). I was able to get around 30% improvement (fps
>>on host side). I did modify the alloc to only set the WQ_HIGHPRI flag.

I forgot to ask you to try the WQ_CPU_INTENSIVE flag. It would be
interesting to know whether you see any further improvement.

Regards,
Michael

Hi Michael,

On Mon, May 02, 2022 at 11:00:03AM +0200, Michael Grzeschik wrote:
> Hi Dan,
>
> On Fri, Apr 29, 2022 at 10:01:37PM +0200, Michael Grzeschik wrote:
> > On Fri, Apr 29, 2022 at 01:51:48PM -0500, Dan Vacura wrote:
> > > Thanks for this change it improves the performance with the DWC3
> > > controller on QCOM chips in an Android 5.10 kernel. I haven't tested the
> > > scatter/gather path, so memcpy was used here via
> > > uvc_video_encode_isoc(). I was able to get around 30% improvement (fps
> > > on host side). I did modify the alloc to only set the WQ_HIGHPRI flag.
>
> I missed to ask you to try the WQ_CPU_INTENSIVE flag. It would be
> interesting if you can see further improvement.

I had some time to test this flag and I couldn't find any discernible
difference with it set or not.

Regards,

Dan

Hi Michael,

On Fri, Apr 29, 2022 at 10:01:37PM +0200, Michael Grzeschik wrote:
> Hi Dan,
> Hi Laurent,
>
> On Fri, Apr 29, 2022 at 01:51:48PM -0500, Dan Vacura wrote:
> > Thanks for this change it improves the performance with the DWC3
> > controller on QCOM chips in an Android 5.10 kernel. I haven't tested the
> > scatter/gather path, so memcpy was used here via
> > uvc_video_encode_isoc(). I was able to get around 30% improvement (fps
> > on host side). I did modify the alloc to only set the WQ_HIGHPRI flag.
> >
> > On Tue, Apr 19, 2022 at 11:46:57PM +0300, Laurent Pinchart wrote:
> >> Thank you for the patch.
> >>
> >> On Sun, Apr 03, 2022 at 01:39:12AM +0200, Michael Grzeschik wrote:
> >> > Likewise to the uvcvideo hostside driver, this patch is changing the
> >> > simple workqueue to an async_wq with higher priority. This ensures that
> >> > the worker will not be scheduled away while the video stream is handled.
> >> >
> >> > Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
> >> > +	video->async_wq = alloc_workqueue("uvcvideo", WQ_UNBOUND | WQ_HIGHPRI, 0);
> >>
> >> Unless I'm mistaken, an unbound work queue means that multiple CPUs will
> >> handle tasks in parallel. Is that safe ?
> >
> > I found that with the WQ_UNBOUND flag I didn't see any performance
> > improvement to the baseline, perhaps related to cpu caching or
> > scheduling delays. I didn't notice any stability problems or concurrent
> > execution. Do you see any benefit to keeping the WQ_UNBOUND flag?
>
> I actually copied this from drivers/media/usb/uvc/uvc_driver.c ,
> which is also allocating the workqueue with WQ_UNBOUND.
>
> Look into drivers/media/usb/uvc/uvc_driver.c + 486
>
> stream->async_wq = alloc_workqueue("uvcvideo", WQ_UNBOUND | WQ_HIGHPRI,

Just for the record, as a newer version of this patch has been merged,
the host-side uvcvideo driver is specifically made to handle multiple
work items in parallel. Each work item will essentially perform one or
multiple memcpy operations, with the size and offset calculated by the
code that dispatches the work items.

As Lucas separately commented, the UVC gadget driver has a single
work_struct, so there can't be any concurrency. We seem to be safe for
now.

> In my tests, continuous streaming did not trigger any errors. In fact if
> this would be unsafe, the issue would probably trigger early, numerous
> and obvious on multicore cpus.
>
> However, some users seem to have seen recent issues on unplugging the
> cable while streaming. I have to check if this could be related.
>
> >> > +	if (!video->async_wq)
> >> > +		return -EINVAL;

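Editorial note: the argument that a single work_struct rules out concurrent runs of the pump worker can be sketched with a userspace toy model. This is not kernel code. `toy_work`, `toy_queue_work()`, `toy_run_pending()`, `toy_pump()` and `toy_demo()` are hypothetical names, and the model mimics only one behaviour of queue_work(): it returns false when the work item is already pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct work_struct. */
struct toy_work {
	bool pending;                    /* models the "pending" state */
	void (*func)(struct toy_work *); /* work function, e.g. the pump */
	int runs;                        /* how often func was executed */
};

/* Models queue_work(): refuse to queue an item that is already pending. */
static bool toy_queue_work(struct toy_work *work)
{
	if (work->pending)
		return false;
	work->pending = true;
	return true;
}

/* Models a worker picking up the item: clear pending, then run it once. */
static void toy_run_pending(struct toy_work *work)
{
	assert(work->func != NULL);
	if (!work->pending)
		return;
	work->pending = false;
	work->func(work);
}

/* Stand-in for the pump worker: only counts its invocations. */
static void toy_pump(struct toy_work *work)
{
	work->runs++;
}

/*
 * Queue the same item from three call sites before the worker runs,
 * then let the worker run twice. Returns the number of executions.
 */
static int toy_demo(void)
{
	struct toy_work pump = { .pending = false, .func = toy_pump, .runs = 0 };
	int accepted = 0;

	accepted += toy_queue_work(&pump); /* e.g. from uvc_v4l2_qbuf() */
	accepted += toy_queue_work(&pump); /* e.g. from uvc_video_complete() */
	accepted += toy_queue_work(&pump); /* e.g. from uvcg_video_enable() */
	assert(accepted == 1);             /* only the first request is queued */

	toy_run_pending(&pump);
	toy_run_pending(&pump);            /* nothing pending: no second run */
	return pump.runs;
}
```

The model leaves out one detail: in the kernel the pending state is cleared shortly before the work function runs, so the item can be queued again while it executes. As far as the workqueue documentation describes it, the same work item is still not executed by two workers at the same time, which is what the single work_struct argument relies on.
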
diff --git a/drivers/usb/gadget/function/uvc.h b/drivers/usb/gadget/function/uvc.h
index c3607a32b98624..ab537acdae3184 100644
--- a/drivers/usb/gadget/function/uvc.h
+++ b/drivers/usb/gadget/function/uvc.h
@@ -86,6 +86,7 @@ struct uvc_video {
 	struct usb_ep *ep;
 
 	struct work_struct pump;
+	struct workqueue_struct *async_wq;
 
 	/* Frame parameters */
 	u8 bpp;
diff --git a/drivers/usb/gadget/function/uvc_v4l2.c b/drivers/usb/gadget/function/uvc_v4l2.c
index a2c78690c5c288..9b1488f7abd736 100644
--- a/drivers/usb/gadget/function/uvc_v4l2.c
+++ b/drivers/usb/gadget/function/uvc_v4l2.c
@@ -170,7 +170,7 @@ uvc_v4l2_qbuf(struct file *file, void *fh, struct v4l2_buffer *b)
 		return ret;
 
 	if (uvc->state == UVC_STATE_STREAMING)
-		schedule_work(&video->pump);
+		queue_work(video->async_wq, &video->pump);
 
 	return ret;
 }
diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c
index 7f59a0c4740209..b1075e23a61010 100644
--- a/drivers/usb/gadget/function/uvc_video.c
+++ b/drivers/usb/gadget/function/uvc_video.c
@@ -269,7 +269,7 @@ uvc_video_complete(struct usb_ep *ep, struct usb_request *req)
 	spin_unlock_irqrestore(&video->req_lock, flags);
 
 	if (uvc->state == UVC_STATE_STREAMING)
-		schedule_work(&video->pump);
+		queue_work(video->async_wq, &video->pump);
 }
 
 static int
@@ -469,7 +469,7 @@ int uvcg_video_enable(struct uvc_video *video, int enable)
 
 	video->req_int_count = 0;
 
-	schedule_work(&video->pump);
+	queue_work(video->async_wq, &video->pump);
 
 	return ret;
 }
@@ -483,6 +483,11 @@ int uvcg_video_init(struct uvc_video *video, struct uvc_device *uvc)
 	spin_lock_init(&video->req_lock);
 	INIT_WORK(&video->pump, uvcg_video_pump);
 
+	/* Allocate a stream specific work queue for asynchronous tasks. */
+	video->async_wq = alloc_workqueue("uvcvideo", WQ_UNBOUND | WQ_HIGHPRI, 0);
+	if (!video->async_wq)
+		return -EINVAL;
+
 	video->uvc = uvc;
 	video->fcc = V4L2_PIX_FMT_YUYV;
 	video->bpp = 16;

Like the uvcvideo host-side driver, this patch changes the simple
workqueue to an async_wq with higher priority. This helps keep the
worker from being scheduled away while the video stream is handled.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
---
 drivers/usb/gadget/function/uvc.h       | 1 +
 drivers/usb/gadget/function/uvc_v4l2.c  | 2 +-
 drivers/usb/gadget/function/uvc_video.c | 9 +++++++--
 3 files changed, 9 insertions(+), 3 deletions(-)