Message ID | 1357996822-13072-1-git-send-email-holler@ahsoftware.de (mailing list archive) |
---|---|
State | New, archived |
Hi Alexander,

On Sat, Jan 12, 2013 at 5:20 AM, Alexander Holler <holler@ahsoftware.de> wrote:
> When a device was disconnected the driver may hang at waiting for urbs it never
> will get. Fix this by using a timeout while waiting for the used semaphore.

The code used to be this way, but it used to cause nasty shutdown hangs:
http://git.plugable.com/gitphp/index.php?p=udlfb&a=commitdiff&h=1dd39a65001deb5a84088dfabb788d3274fbb6b6

Which is why the code is the way it is today.

Can you say under what situations you're hitting hangs on device
disconnect? Have you tested extensively to confirm no shutdown hangs
with your patch?

Stepping back, there was another recent patch from the community to
udlfb to work around issues of sleeping in the wrong context. The fix
involved introducing another scheduled workitem. This slows everything
down when it's in the main path, and isn't really desirable if we can
avoid it.

Another option to eliminate all these problems -- long considered but
never implemented -- is to get rid of all semaphores and potential
sleeps in udlfb entirely. That would require a strategy to throttle
rendering in some way other than by waiting in kernel (without some
throttling strategy, the USB bus can be a bottleneck which can flood
the system with rendered but untransmitted pixels).

Options might be:

1) When transfer buffers are full, keep track of dirty rectangles for
the rest and pick up where we left off the next time we're entered
(avoiding flooding by potentially having pixels in the dirty regions
be written over multiple times before we get to rendering them once)

2) If we "bet" on page-fault-based defio dirty pixel detection, we
could allocate buffers dynamically but increase the scheduling time to
transfer as our outstanding buffer count grows, and reduce the latency
only when the buffer count goes down (again, pixels will be
potentially rendered many times before being transferred once, avoiding
flooding).

Any other ideas on the specific or general case are welcome. Also
note that udlfb is being largely superseded by the udl DRM driver -
so any decisions here should also be considered in that codebase.

In any case, thanks for giving the DisplayLink USB 2.0 graphics
drivers attention - it's much appreciated!

Bernie Thompson
http://plugable.com/
--
To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12.01.2013 23:22, Bernie Thompson wrote:
> Hi Alexander,
>
> On Sat, Jan 12, 2013 at 5:20 AM, Alexander Holler <holler@ahsoftware.de> wrote:
>> When a device was disconnected the driver may hang at waiting for urbs it never
>> will get. Fix this by using a timeout while waiting for the used semaphore.
>
> The code used to be this way, but it used to cause nasty shutdown hangs:
> http://git.plugable.com/gitphp/index.php?p=udlfb&a=commitdiff&h=1dd39a65001deb5a84088dfabb788d3274fbb6b6
>
> Which is why the code is the way it is today.
>
> Can you say under what situations you're hitting hangs on device
> disconnect? Have you tested extensively to confirm no shutdown hangs
> with your patch?
>

The driver hangs here most of the time (about 2 out of 3 disconnects).
It is easy to see when the device gets attached again, because nothing
happens if the driver already hangs (in addition, a shutdown isn't
possible).

I didn't test it extensively, but without the patch the driver isn't
usable here. Maybe my previous patch which moves damages to a workqueue
is the reason that urbs are more likely to go missing, but the problem
already existed because an urb might get missed on disconnect.

I don't know what problems existed before, maybe people just had a
problem with the BUG_ON(ret). If the _interruptible_ wait is really
needed, it could make sense to implement a down_timeout_interruptible()
for semaphores.

> Stepping back, there was another recent patch from the community to
> udlfb to work around issues of sleeping in the wrong context. The fix
> involved introducing another scheduled workitem. This slows everything
> down when it's in the main path, and isn't really desirable if we can
> avoid it.

Do you mean the one I've recently posted? It is needed, at least for 3.7
(I don't know since when those "schedule while atomic" messages appear).
It might slow down refreshes, but it is needed, at least until someone
gets rid of those semaphores or removes the spinlocks in upper layers (as
Alan Cox suggested with the "I am crap" helper for printk).

Maybe using WQ_HIGHPRI for the workqueue with the damages will speed
things up. More optimizations might be doable too (e.g. combining
multiple queued damages).

> Another option to eliminate all these problems -- long considered but
> never implemented -- is to get rid of all semaphores and potential
> sleeps in udlfb entirely. That would require a strategy to throttle
> rendering in some way other than by waiting in kernel (without some
> throttling strategy, the USB bus can be a bottleneck which can flood
> the system with rendered but untransmitted pixels).
>
> Options might be:
>
> 1) When transfer buffers are full, keep track of dirty rectangles for
> the rest and pick up where we left off the next time we're entered
> (avoiding flooding by potentially having pixels in the dirty regions
> be written over multiple times before we get to rendering them once)
>
> 2) If we "bet" on page-fault-based defio dirty pixel detection, we
> could allocate buffers dynamically but increase the scheduling time to
> transfer as our outstanding buffer count grows, and reduce the latency
> only when the buffer count goes down (again, pixels will be
> potentially rendered many times before being transferred once, avoiding
> flooding).
>
> Any other ideas on the specific or general case are welcome. Also
> note that udlfb is being largely superseded by the udl DRM driver -
> so any decisions here should also be considered in that codebase.
>
> In any case, thanks for giving the DisplayLink USB 2.0 graphics
> drivers attention - it's much appreciated!

Thanks for the suggestions, but I don't feel the need to spend a lot of
time here. I just wanted to use the console with the device and a kernel
3.7.x, and neither udlfb nor udl currently worked (and I'm pretty sure
I've used one of them some time before, likely udlfb).

Btw, to see the console again after a disconnect and connect, I'm
currently using the following (necessary) quick&dirty hack:

---------
 	/* if clients still have us open, will be freed on last close */
-	if (dev->fb_count == 0)
+//	if (dev->fb_count == 0)
 		schedule_delayed_work(&dev->free_framebuffer_work, 0);
---------

Without that the framebuffer will never get unregistered (because just
unlinking it doesn't remove the fb-console, which counts as one client),
with the result that the new one (after connecting the device again)
will not get the console.

Regards,

Alexander
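For reference, option 2 from the quoted message (stretch the transfer scheduling delay as the outstanding buffer count grows, and shorten it again only when the count drops) could look roughly like the sketch below. It is a standalone C illustration, not driver code: `throttle_delay_ms()` is a hypothetical helper, and the linear growth with a clamp is an assumed policy that the thread does not specify.

```c
#include <assert.h>

/*
 * Hypothetical throttling policy: the delay before the next deferred
 * transfer grows linearly with the number of buffers still in flight
 * and is clamped to max_ms. Because the delay depends only on the
 * current count, it shrinks again as soon as buffers complete.
 */
static unsigned int throttle_delay_ms(unsigned int outstanding,
				      unsigned int base_ms,
				      unsigned int max_ms)
{
	unsigned long long delay;

	assert(base_ms <= max_ms);

	/* 64-bit intermediate so the multiplication cannot overflow */
	delay = (unsigned long long)base_ms * (outstanding + 1ULL);
	return delay > max_ms ? max_ms : (unsigned int)delay;
}
```

With a base of 20 ms and a cap of 200 ms, an idle device would be serviced after 20 ms, while a backlog of nine or more buffers would push the delay to the cap, giving repeatedly redrawn pixels time to settle before they are transferred.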
On 13.01.2013 13:05, Alexander Holler wrote:
> On 12.01.2013 23:22, Bernie Thompson wrote:

> I didn't test it extensively, but without the patch the driver isn't
> usable here. Maybe my previous patch which moves damages to a workqueue

To add some more explanation: I'm currently only testing it with a
statically linked udlfb (for fbcon), as that is what I'm mainly using the
device for (with otherwise headless boxes). When udlfb is a module, I
don't see those "schedule while atomic" messages (I don't know why), but
having a console only after the modules have been loaded isn't always an
option.

Regards,

Alexander
diff --git a/drivers/video/udlfb.c b/drivers/video/udlfb.c
index 86d449e..cc4a8d1 100644
--- a/drivers/video/udlfb.c
+++ b/drivers/video/udlfb.c
@@ -1832,8 +1832,8 @@ static void dlfb_free_urb_list(struct dlfb_data *dev)
 	/* keep waiting and freeing, until we've got 'em all */
 	while (count--) {
 
-		/* Getting interrupted means a leak, but ok at disconnect */
-		ret = down_interruptible(&dev->urbs.limit_sem);
+		/* Timeout likely occurs at disconnect (resulting in a leak) */
+		ret = down_timeout(&dev->urbs.limit_sem, GET_URB_TIMEOUT);
 		if (ret)
 			break;
 
When a device was disconnected the driver may hang at waiting for urbs it never
will get. Fix this by using a timeout while waiting for the used semaphore.

There is still a memory leak if a timeout happens, but at least the driver
now continues its disconnect routine.

Cc: <stable@vger.kernel.org>
Signed-off-by: Alexander Holler <holler@ahsoftware.de>
---
 drivers/video/udlfb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
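The behaviour the patch aims for (stop waiting after a bounded time and accept a leak, rather than block forever on a buffer that will never come back) can be illustrated with a userspace analogue. The sketch below uses POSIX `sem_timedwait()` in place of the kernel's `down_timeout()`; `struct buf_pool`, `pool_down_timeout()` and `pool_free_all()` are hypothetical names, and none of this is the driver's actual code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <time.h>

/* Hypothetical stand-in for the driver's urb bookkeeping. */
struct buf_pool {
	sem_t limit_sem;	/* counts buffers that have been returned */
	int count;		/* buffers originally allocated */
};

/* Wait at most timeout_ms for one buffer; 0 on success, -1 on timeout. */
static int pool_down_timeout(sem_t *sem, long timeout_ms)
{
	struct timespec deadline;

	assert(timeout_ms >= 0);

	/* sem_timedwait() takes an absolute CLOCK_REALTIME deadline */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec += 1;
		deadline.tv_nsec -= 1000000000L;
	}

	while (sem_timedwait(sem, &deadline) == -1) {
		if (errno != EINTR)
			return -1;	/* ETIMEDOUT or a real error */
	}
	return 0;
}

/*
 * Reclaim every buffer that comes back in time. Returns how many never
 * did, i.e. how many are leaked, instead of hanging on them.
 */
static int pool_free_all(struct buf_pool *pool, long timeout_ms)
{
	int leaked = pool->count;

	while (leaked > 0) {
		if (pool_down_timeout(&pool->limit_sem, timeout_ms) != 0)
			break;	/* give up: the rest is lost */
		leaked--;	/* a real driver would free the buffer here */
	}
	pool->count = 0;
	return leaked;
}
```

Initialising the semaphore to 3 for a pool of 4 models one buffer lost at disconnect: `pool_free_all()` reclaims three, times out once, and reports one leaked buffer. Note that the timed wait here retries on `EINTR`, so this analogue says nothing about the signal-handling difference between `down_interruptible()` and `down_timeout()` discussed in the thread.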