Message ID | 1448050742-10777-1-git-send-email-adi@adirat.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Jiri Kosina |
On Fri, 20 Nov 2015 22:19:02 +0200
Ioan-Adrian Ratiu <adi@adirat.com> wrote:

> The critical section protected by usbhid->lock in hid_ctrl() is too
> big and because of this it causes a recursive deadlock. "Too big" means
> the case statement and the call to hid_input_report() do not need to be
> protected by the spinlock (no URB operations are done inside them).
>
> The deadlock happens because in certain rare cases drivers try to grab
> the lock while handling the ctrl irq which grabs the lock before them
> as described above. For example newer wacom tablets like 056a:033c try
> to reschedule proximity reads from wacom_intuos_schedule_prox_event()
> calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report()
> which tries to grab the usbhid lock already held by hid_ctrl().
>
> There are two ways to get out of this deadlock:
>     1. Make the drivers work "around" the ctrl critical region, in the
>     wacom case for ex. by delaying the scheduling of the proximity read
>     request itself to a workqueue.
>     2. Shrink the critical region so the usbhid lock protects only the
>     instructions which modify usbhid state, calling hid_input_report()
>     with the spinlock unlocked, allowing the device driver to grab the
>     lock first, finish and then grab the lock afterwards in hid_ctrl().
>
> This patch implements the 2nd solution.
>
> Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com>
> ---
>  drivers/hid/usbhid/hid-core.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
> index 36712e9..5dd426f 100644
> --- a/drivers/hid/usbhid/hid-core.c
> +++ b/drivers/hid/usbhid/hid-core.c
> @@ -477,8 +477,6 @@ static void hid_ctrl(struct urb *urb)
>  	struct usbhid_device *usbhid = hid->driver_data;
>  	int unplug = 0, status = urb->status;
>
> -	spin_lock(&usbhid->lock);
> -
>  	switch (status) {
>  	case 0: /* success */
>  		if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_IN)
> @@ -498,6 +496,8 @@ static void hid_ctrl(struct urb *urb)
>  		hid_warn(urb->dev, "ctrl urb status %d received\n", status);
>  	}
>
> +	spin_lock(&usbhid->lock);
> +
>  	if (unplug) {
>  		usbhid->ctrltail = usbhid->ctrlhead;
>  	} else {

Hello again

Can this please be merged in the 4.4? There are quite a few people who have
their tablets deadlock and don't know how to patch their kernels so are stuck
waiting for a new release.

The severity of this issue is much bigger than I initially thought. Since I've
posted on the wacom mailing list that I have a fix for this deadlock I've
received lots of email from people complaining of the same problem on a wide
range of tablets.

Some of those people know how to patch a kernel, some found this patch on the
mailing list and tested it and confirmed that it works on the wacom mailing
list (you can verify the deadlock + fix on that mailing list).

Thank you,
Adrian
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 20 Nov 2015, Ioan-Adrian Ratiu wrote:

> The critical section protected by usbhid->lock in hid_ctrl() is too
> big and because of this it causes a recursive deadlock. "Too big" means
> the case statement and the call to hid_input_report() do not need to be
> protected by the spinlock (no URB operations are done inside them).
>
> The deadlock happens because in certain rare cases drivers try to grab
> the lock while handling the ctrl irq which grabs the lock before them
> as described above. For example newer wacom tablets like 056a:033c try
> to reschedule proximity reads from wacom_intuos_schedule_prox_event()
> calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report()
> which tries to grab the usbhid lock already held by hid_ctrl().
>
> There are two ways to get out of this deadlock:
>     1. Make the drivers work "around" the ctrl critical region, in the
>     wacom case for ex. by delaying the scheduling of the proximity read
>     request itself to a workqueue.
>     2. Shrink the critical region so the usbhid lock protects only the
>     instructions which modify usbhid state, calling hid_input_report()
>     with the spinlock unlocked, allowing the device driver to grab the
>     lock first, finish and then grab the lock afterwards in hid_ctrl().
>
> This patch implements the 2nd solution.
>
> Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com>

Applied to for-4.5/core branch. Thanks,
Hi Jiri,

On Tue, Dec 1, 2015 at 8:36 AM, Jiri Kosina <jikos@kernel.org> wrote:
> On Fri, 20 Nov 2015, Ioan-Adrian Ratiu wrote:
>
>> The critical section protected by usbhid->lock in hid_ctrl() is too
>> big and because of this it causes a recursive deadlock. "Too big" means
>> the case statement and the call to hid_input_report() do not need to be
>> protected by the spinlock (no URB operations are done inside them).
>>
>> The deadlock happens because in certain rare cases drivers try to grab
>> the lock while handling the ctrl irq which grabs the lock before them
>> as described above. For example newer wacom tablets like 056a:033c try
>> to reschedule proximity reads from wacom_intuos_schedule_prox_event()
>> calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report()
>> which tries to grab the usbhid lock already held by hid_ctrl().
>>
>> There are two ways to get out of this deadlock:
>>     1. Make the drivers work "around" the ctrl critical region, in the
>>     wacom case for ex. by delaying the scheduling of the proximity read
>>     request itself to a workqueue.
>>     2. Shrink the critical region so the usbhid lock protects only the
>>     instructions which modify usbhid state, calling hid_input_report()
>>     with the spinlock unlocked, allowing the device driver to grab the
>>     lock first, finish and then grab the lock afterwards in hid_ctrl().
>>
>> This patch implements the 2nd solution.
>>
>> Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com>
>
> Applied to for-4.5/core branch. Thanks,

Since wacom_intuos_schedule_prox_event() was introduced on March 5,
2015, the call was first used in kernel 3.19. Shouldn't we back-port
this patch to other stable versions?

Thank you.

Ping
On Sun, Nov 29, 2015 at 2:29 AM, Ioan-Adrian Ratiu <adi@adirat.com> wrote:
> On Fri, 20 Nov 2015 22:19:02 +0200
> Ioan-Adrian Ratiu <adi@adirat.com> wrote:
>
>> The critical section protected by usbhid->lock in hid_ctrl() is too
>> big and because of this it causes a recursive deadlock. "Too big" means
>> the case statement and the call to hid_input_report() do not need to be
>> protected by the spinlock (no URB operations are done inside them).
>>
>> The deadlock happens because in certain rare cases drivers try to grab
>> the lock while handling the ctrl irq which grabs the lock before them
>> as described above. For example newer wacom tablets like 056a:033c try
>> to reschedule proximity reads from wacom_intuos_schedule_prox_event()
>> calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report()
>> which tries to grab the usbhid lock already held by hid_ctrl().
>>
>> There are two ways to get out of this deadlock:
>>     1. Make the drivers work "around" the ctrl critical region, in the
>>     wacom case for ex. by delaying the scheduling of the proximity read
>>     request itself to a workqueue.
>>     2. Shrink the critical region so the usbhid lock protects only the
>>     instructions which modify usbhid state, calling hid_input_report()
>>     with the spinlock unlocked, allowing the device driver to grab the
>>     lock first, finish and then grab the lock afterwards in hid_ctrl().
>>
>> This patch implements the 2nd solution.
>>
>> Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com>
>> ---
>>  drivers/hid/usbhid/hid-core.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
>> index 36712e9..5dd426f 100644
>> --- a/drivers/hid/usbhid/hid-core.c
>> +++ b/drivers/hid/usbhid/hid-core.c
>> @@ -477,8 +477,6 @@ static void hid_ctrl(struct urb *urb)
>>  	struct usbhid_device *usbhid = hid->driver_data;
>>  	int unplug = 0, status = urb->status;
>>
>> -	spin_lock(&usbhid->lock);
>> -
>>  	switch (status) {
>>  	case 0: /* success */
>>  		if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_IN)
>> @@ -498,6 +496,8 @@ static void hid_ctrl(struct urb *urb)
>>  		hid_warn(urb->dev, "ctrl urb status %d received\n", status);
>>  	}
>>
>> +	spin_lock(&usbhid->lock);
>> +
>>  	if (unplug) {
>>  		usbhid->ctrltail = usbhid->ctrlhead;
>>  	} else {
>
> Hello again
>
> Can this please be merged in the 4.4? There are quite a few people who have
> their tablets deadlock and don't know how to patch their kernels so are stuck
> waiting for a new release.
>
> The severity of this issue is much bigger than I initially thought. Since I've
> posted on the wacom mailing list that I have a fix for this deadlock I've
> received lots of email from people complaining of the same problem on a wide
> range of tablets.
>
> Some of those people know how to patch a kernel, some found this patch on the
> mailing list and tested it and confirmed that it works on the wacom mailing
> list (you can verify the deadlock + fix on that mailing list).
>
> Thank you,
> Adrian

On Tue, Dec 1, 2015 at 4:00 PM, Ping Cheng <pinglinux@gmail.com> wrote:
> Hi Jiri,
>
> Since wacom_intuos_schedule_prox_event() was introduced on March 5,
> 2015, the call was first used in kernel 3.19. Shouldn't we back-port
> this patch to other stable versions?
>
> Thank you.
>
> Ping

I've been prompted to resurrect this question after fielding yet another
question from a user encountering a locked-up system. We received a
surprising number of reports of people encountering this issue in the
past several weeks and that's only going to get worse as 4.4 works its
way into distros. Integrating back even just to 4.4 instead of 3.19
would be appreciated since this issue has (so far) only ever been
encountered with the new Intuos tablets that were added in 4.4...

Jason
---
Now instead of four in the eights place /
you’ve got three, ‘Cause you added one /
(That is to say, eight) to the two, /
But you can’t take seven from three, /
So you look at the sixty-fours....
On Wed, 20 Jan 2016, Jason Gerecke wrote:

> I've been prompted to resurrect this question after fielding yet another
> question from a user encountering a locked-up system. We received a
> surprising number of reports of people encountering this issue in the
> past several weeks and that's only going to get worse as 4.4 works its
> way into distros. Integrating back even just to 4.4 instead of 3.19
> would be appreciated since this issue has (so far) only ever been
> encountered with the new Intuos tablets that were added in 4.4...

I am in favor of pushing this to -stable. As you guys seem to be the ones
able to test it with your hardware, would you guys (Ioan-Adrian? Jason?)
be willing to prepare the backport and submit it once it passes your
testing?

Thanks,
diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
index 36712e9..5dd426f 100644
--- a/drivers/hid/usbhid/hid-core.c
+++ b/drivers/hid/usbhid/hid-core.c
@@ -477,8 +477,6 @@ static void hid_ctrl(struct urb *urb)
 	struct usbhid_device *usbhid = hid->driver_data;
 	int unplug = 0, status = urb->status;
 
-	spin_lock(&usbhid->lock);
-
 	switch (status) {
 	case 0: /* success */
 		if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_IN)
@@ -498,6 +496,8 @@ static void hid_ctrl(struct urb *urb)
 		hid_warn(urb->dev, "ctrl urb status %d received\n", status);
 	}
 
+	spin_lock(&usbhid->lock);
+
 	if (unplug) {
 		usbhid->ctrltail = usbhid->ctrlhead;
 	} else {
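For readers who want to see the effect of moving the spin_lock() call without kernel hardware at hand, the reordering can be modeled in userspace. The following is only a sketch under stated assumptions: a pthread mutex stands in for usbhid->lock, pthread_mutex_trylock() is used so that the model reports the recursive acquisition instead of hanging, and every name (struct model_dev, model_submit_report(), model_ctrl(), model_would_deadlock(), MODEL_FIFO_SIZE) is hypothetical and not part of the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

#define MODEL_FIFO_SIZE 256 /* hypothetical queue size, chosen for the model */

/* Hypothetical userspace stand-in for struct usbhid_device. */
struct model_dev {
	pthread_mutex_t lock;   /* models usbhid->lock */
	unsigned int ctrlhead;  /* models the control queue head index */
	unsigned int ctrltail;  /* models the control queue tail index */
	bool would_deadlock;    /* set when the lock was already held */
};

/* Models a driver callback that ends up in usbhid_submit_report(). */
static void model_submit_report(struct model_dev *d)
{
	/* trylock, so the model records the problem instead of spinning forever */
	if (pthread_mutex_trylock(&d->lock) == EBUSY) {
		d->would_deadlock = true;
		return;
	}
	d->ctrlhead = (d->ctrlhead + 1) % MODEL_FIFO_SIZE;
	pthread_mutex_unlock(&d->lock);
}

/* Models hid_ctrl(): patched == false is the old lock placement. */
static void model_ctrl(struct model_dev *d, bool patched)
{
	if (!patched)
		pthread_mutex_lock(&d->lock);

	/* stands in for the switch statement and hid_input_report() */
	model_submit_report(d);

	if (patched)
		pthread_mutex_lock(&d->lock);

	/* the queue bookkeeping stays under the lock in both variants */
	d->ctrltail = (d->ctrltail + 1) % MODEL_FIFO_SIZE;
	pthread_mutex_unlock(&d->lock);
}

/* Returns true when the chosen lock placement re-acquires a held lock. */
static bool model_would_deadlock(bool patched)
{
	struct model_dev d = { .ctrlhead = 0, .ctrltail = 0, .would_deadlock = false };
	int rc = pthread_mutex_init(&d.lock, NULL);

	assert(rc == 0);
	(void)rc;
	model_ctrl(&d, patched);
	pthread_mutex_destroy(&d.lock);
	return d.would_deadlock;
}
```

Calling model_would_deadlock(false) reports the recursive acquisition of the old ordering, while model_would_deadlock(true) does not. The model says nothing about interrupt context or SMP behaviour; it only illustrates the ordering of lock and callback.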
The critical section protected by usbhid->lock in hid_ctrl() is too
big and because of this it causes a recursive deadlock. "Too big" means
the case statement and the call to hid_input_report() do not need to be
protected by the spinlock (no URB operations are done inside them).

The deadlock happens because in certain rare cases drivers try to grab
the lock while handling the ctrl irq which grabs the lock before them
as described above. For example newer wacom tablets like 056a:033c try
to reschedule proximity reads from wacom_intuos_schedule_prox_event()
calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report()
which tries to grab the usbhid lock already held by hid_ctrl().

There are two ways to get out of this deadlock:
    1. Make the drivers work "around" the ctrl critical region, in the
    wacom case for ex. by delaying the scheduling of the proximity read
    request itself to a workqueue.
    2. Shrink the critical region so the usbhid lock protects only the
    instructions which modify usbhid state, calling hid_input_report()
    with the spinlock unlocked, allowing the device driver to grab the
    lock first, finish and then grab the lock afterwards in hid_ctrl().

This patch implements the 2nd solution.

Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com>
---
 drivers/hid/usbhid/hid-core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
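The first option in the description, which this patch does not implement, could look roughly like the userspace model below. It is a sketch only, assuming the driver records the request while the handler holds the lock and a separate worker submits it later; in a real driver that role would presumably fall to a work item, and none of the names here (struct defer_dev, defer_submit_report(), defer_driver_event(), defer_ctrl(), defer_run_work(), defer_model()) exist in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the device state. */
struct defer_dev {
	pthread_mutex_t lock;   /* models usbhid->lock */
	bool work_pending;      /* models a queued work item */
	int reports_submitted;  /* counts requests that reached the queue */
	bool would_deadlock;    /* set when the lock was already held */
};

/* Models usbhid_submit_report(): needs the lock to touch the queue. */
static void defer_submit_report(struct defer_dev *d)
{
	if (pthread_mutex_trylock(&d->lock) == EBUSY) {
		d->would_deadlock = true;
		return;
	}
	d->reports_submitted++;
	pthread_mutex_unlock(&d->lock);
}

/* Models the driver event handler: only records that work is needed. */
static void defer_driver_event(struct defer_dev *d)
{
	d->work_pending = true;
}

/* Models the unpatched hid_ctrl(): callback runs with the lock held. */
static void defer_ctrl(struct defer_dev *d)
{
	pthread_mutex_lock(&d->lock);
	defer_driver_event(d);
	pthread_mutex_unlock(&d->lock);
}

/* Models the worker that runs after the handler has returned. */
static void defer_run_work(struct defer_dev *d)
{
	if (!d->work_pending)
		return;
	d->work_pending = false;
	defer_submit_report(d);
}

/* Returns the number of submitted reports, or -1 on a recursive lock. */
static int defer_model(void)
{
	struct defer_dev d = { .work_pending = false, .reports_submitted = 0,
			       .would_deadlock = false };
	int rc = pthread_mutex_init(&d.lock, NULL);

	assert(rc == 0);
	(void)rc;
	defer_ctrl(&d);
	defer_run_work(&d);
	pthread_mutex_destroy(&d.lock);
	return d.would_deadlock ? -1 : d.reports_submitted;
}
```

The trade-off against the second option is that every affected driver would need its own deferral, whereas shrinking the critical region fixes the ordering once in hid_ctrl().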