Message ID | 1359139768-32294-2-git-send-email-holler@ahsoftware.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, 25 Jan 2013 19:49:27 +0100 Alexander Holler <holler@ahsoftware.de> wrote: > When a device was disconnected the driver may hang at waiting for urbs it never > will get. Fix this by using a timeout while waiting for the used semaphore. > > There is still a memory leak if a timeout happens, but at least the driver > now continues his disconnect routine. > > ... > > --- a/drivers/video/udlfb.c > +++ b/drivers/video/udlfb.c > @@ -1832,8 +1832,9 @@ static void dlfb_free_urb_list(struct dlfb_data *dev) > /* keep waiting and freeing, until we've got 'em all */ > while (count--) { > > - /* Getting interrupted means a leak, but ok at disconnect */ > - ret = down_interruptible(&dev->urbs.limit_sem); > + /* Timeout likely occurs at disconnect (resulting in a leak) */ > + ret = down_timeout_killable(&dev->urbs.limit_sem, > + FREE_URB_TIMEOUT); > if (ret) > break; This is rather a hack. Do you have an understanding of the underlying bug? Why is the driver waiting for things which will never happen? -- To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 29.01.2013 01:22, schrieb Andrew Morton: > On Fri, 25 Jan 2013 19:49:27 +0100 > Alexander Holler <holler@ahsoftware.de> wrote: > >> When a device was disconnected the driver may hang at waiting for urbs it never >> will get. Fix this by using a timeout while waiting for the used semaphore. >> >> There is still a memory leak if a timeout happens, but at least the driver >> now continues his disconnect routine. >> >> ... >> >> --- a/drivers/video/udlfb.c >> +++ b/drivers/video/udlfb.c >> @@ -1832,8 +1832,9 @@ static void dlfb_free_urb_list(struct dlfb_data *dev) >> /* keep waiting and freeing, until we've got 'em all */ >> while (count--) { >> >> - /* Getting interrupted means a leak, but ok at disconnect */ >> - ret = down_interruptible(&dev->urbs.limit_sem); >> + /* Timeout likely occurs at disconnect (resulting in a leak) */ >> + ret = down_timeout_killable(&dev->urbs.limit_sem, >> + FREE_URB_TIMEOUT); >> if (ret) >> break; > > This is rather a hack. Do you have an understanding of the underlying > bug? Why is the driver waiting for things which will never happen? It is a hack, but I don't want to rewrite the whole driver. There is already an attempt to do so, udl. The hack is a quick solution which doesn't make something worse, just better. The only drawback might be the few additional bytes for the implementation of down_timeout_killable(), but I thought such should be already available, and wondered it wasn't. Fixing the underlying problem (including removing the leaks) would just end up in another rewrite and I prefer to have a look at udl, maybe fixing the current problems to use such a device as console there. I've only posted this small patch series, because some (longterm) stable kernels don't have udl, and smscufx, which is in large parts identical to udlfb might want that patch too. Just in case: I don't know anything about smscufx and have discovered the similarities between those drivers only when I did a git grep to search for something I fixed with a previous patch. Regards, Alexander -- To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 29.01.2013 01:56, schrieb Alexander Holler: > Am 29.01.2013 01:22, schrieb Andrew Morton: >> On Fri, 25 Jan 2013 19:49:27 +0100 >> Alexander Holler <holler@ahsoftware.de> wrote: >> >>> When a device was disconnected the driver may hang at waiting for urbs it never >>> will get. Fix this by using a timeout while waiting for the used semaphore. >>> >>> There is still a memory leak if a timeout happens, but at least the driver >>> now continues his disconnect routine. >>> >>> ... >>> >>> --- a/drivers/video/udlfb.c >>> +++ b/drivers/video/udlfb.c >>> @@ -1832,8 +1832,9 @@ static void dlfb_free_urb_list(struct dlfb_data *dev) >>> /* keep waiting and freeing, until we've got 'em all */ >>> while (count--) { >>> >>> - /* Getting interrupted means a leak, but ok at disconnect */ >>> - ret = down_interruptible(&dev->urbs.limit_sem); >>> + /* Timeout likely occurs at disconnect (resulting in a leak) */ >>> + ret = down_timeout_killable(&dev->urbs.limit_sem, >>> + FREE_URB_TIMEOUT); >>> if (ret) >>> break; >> >> This is rather a hack. Do you have an understanding of the underlying >> bug? Why is the driver waiting for things which will never happen? To add a bit more explanation: I've experienced that bug after moving the fb-damage-handling into a workqueue (to make the driver usable as console). This likely has increased the possibility that an urb gets missed when the usb-stack calls the (usb-)disconnect function of the driver. But I don't know as I couldn't use the driver before (as fbcon) so I don't really have a comparison. What currently happens here is something like that: fb -> damage -> workload which sends urb and waits for answer device disconnect -> dlfb_usb_disconnect() -> stall (no answer to the above urb) I don't know why the disconnect waits for all urbs. The code looks like it does that just to free the allocated memory. As I'm not very familiar with the usb-stack, I would have to read up about the urb-handling to find out how to free the memory otherwise. As the previous comment in the code suggests that urbs already got missed (on shutdown) before, I assume that even without my patch, which moved the damage into a workqueue, the problem could occur which then prevents a shutdown as there is no timeout. As I've experienced that problem not only on disconnect, but on shutdown too (no shutdown was possible), I have to assume, that the previous used down_interruptible() didn't get a signal on shutdown (if the driver is used as fbcon), therefor I consider the timeout as necessary. Regards, Alexander -- To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 29.01.2013 11:35, schrieb Alexander Holler: > Am 29.01.2013 01:56, schrieb Alexander Holler: >> Am 29.01.2013 01:22, schrieb Andrew Morton: >>> On Fri, 25 Jan 2013 19:49:27 +0100 >>> Alexander Holler <holler@ahsoftware.de> wrote: >>> >>>> When a device was disconnected the driver may hang at waiting for >>>> urbs it never >>>> will get. Fix this by using a timeout while waiting for the used >>>> semaphore. >>>> >>>> There is still a memory leak if a timeout happens, but at least the >>>> driver >>>> now continues his disconnect routine. >>>> >>>> ... >>>> >>>> --- a/drivers/video/udlfb.c >>>> +++ b/drivers/video/udlfb.c >>>> @@ -1832,8 +1832,9 @@ static void dlfb_free_urb_list(struct >>>> dlfb_data *dev) >>>> /* keep waiting and freeing, until we've got 'em all */ >>>> while (count--) { >>>> >>>> - /* Getting interrupted means a leak, but ok at disconnect */ >>>> - ret = down_interruptible(&dev->urbs.limit_sem); >>>> + /* Timeout likely occurs at disconnect (resulting in a >>>> leak) */ >>>> + ret = down_timeout_killable(&dev->urbs.limit_sem, >>>> + FREE_URB_TIMEOUT); >>>> if (ret) >>>> break; >>> >>> This is rather a hack. Do you have an understanding of the underlying >>> bug? Why is the driver waiting for things which will never happen? > > To add a bit more explanation: > > I've experienced that bug after moving the fb-damage-handling into a > workqueue (to make the driver usable as console). This likely has > increased the possibility that an urb gets missed when the usb-stack > calls the (usb-)disconnect function of the driver. But I don't know as I > couldn't use the driver before (as fbcon) so I don't really have a > comparison. > > What currently happens here is something like that: > > fb -> damage -> workload which sends urb and waits for answer > device disconnect -> dlfb_usb_disconnect() -> stall (no answer to the > above urb) > > I don't know why the disconnect waits for all urbs. The code looks like > it does that just to free the allocated memory. As I'm not very familiar > with the usb-stack, I would have to read up about the urb-handling to > find out how to free the memory otherwise. > > As the previous comment in the code suggests that urbs already got > missed (on shutdown) before, I assume that even without my patch, which > moved the damage into a workqueue, the problem could occur which then > prevents a shutdown as there is no timeout. As I've experienced that > problem not only on disconnect, but on shutdown too (no shutdown was > possible), I have to assume, that the previous used down_interruptible() > didn't get a signal on shutdown (if the driver is used as fbcon), > therefor I consider the timeout as necessary. To explain the problem on shutdown a bit further, I think the following happens (usb and driver are statically linked and started by the kernel): shutdown -> kill signal -> usb stack shuts down -> udlfb waits (forever) for a kill or an urb which it doesn't get. Maybe the sequence is different if the usb-stack and udlfb are used as a module and/or udlfb is used only for X/fb. I'm not sure what actually does shut down the usb-stack in such a case, but maybe more than one kill signal might be thrown around. Regards, Alexander -- To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 29.01.2013 12:11, schrieb Alexander Holler: > > To explain the problem on shutdown a bit further, I think the following > happens (usb and driver are statically linked and started by the kernel): > > shutdown -> kill signal -> usb stack shuts down -> udlfb waits (forever) > for a kill or an urb which it doesn't get. Having a second look at what I've written above, I'm not even sure if the kernel sends one or more fatal signals on shutdown at all. I've just assumed it because otherwise down_interruptible() wouldn't have worked before (it would have stalled on shutdown too (if an urb got missed), not only on disconnect). Sounds like an interesting question I should read about (if/when fatal signals are issued by the kernel). ;) > Maybe the sequence is different if the usb-stack and udlfb are used as a > module and/or udlfb is used only for X/fb. I'm not sure what actually > does shut down the usb-stack in such a case, but maybe more than one > kill signal might be thrown around. Regards, Alexander -- To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/video/udlfb.c b/drivers/video/udlfb.c index 86d449e..db6cc66 100644 --- a/drivers/video/udlfb.c +++ b/drivers/video/udlfb.c @@ -1832,8 +1832,9 @@ static void dlfb_free_urb_list(struct dlfb_data *dev) /* keep waiting and freeing, until we've got 'em all */ while (count--) { - /* Getting interrupted means a leak, but ok at disconnect */ - ret = down_interruptible(&dev->urbs.limit_sem); + /* Timeout likely occurs at disconnect (resulting in a leak) */ + ret = down_timeout_killable(&dev->urbs.limit_sem, + FREE_URB_TIMEOUT); if (ret) break;
When a device was disconnected the driver may hang at waiting for urbs it never will get. Fix this by using a timeout while waiting for the used semaphore. There is still a memory leak if a timeout happens, but at least the driver now continues his disconnect routine. Cc: <stable@vger.kernel.org> Signed-off-by: Alexander Holler <holler@ahsoftware.de> --- drivers/video/udlfb.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)