diff mbox

ACPI: fix Thunderbolt hotplug

Message ID 1461760019-12930-1-git-send-email-prarit@redhat.com (mailing list archive)
State Superseded, archived
Delegated to: Rafael Wysocki
Headers show

Commit Message

Prarit Bhargava April 27, 2016, 12:26 p.m. UTC
Rafael, this patch is in the acpica.git tree as 7a3bd2d ("Dispatcher: Update
thread ID for recursive method calls").  I've had many positive testing
results from hardware vendors and users with this patch and this resolves
many of the problems seen here:

https://bugzilla.kernel.org/show_bug.cgi?id=115121

This does not fix the problems with the TB docking station.  Although this
patch will also be required, the docking station issues require a FW update.
Updated FW should be coming soon to resolve those problems.

P.

----8<----

The following hung task trace is seen when hotplugging
an ethernet dongle in a Thunderbolt port on Linux.

INFO: task kworker/0:4:1468 blocked for more than 120 seconds.
      Tainted: G        W       4.6.0-rc1+ #1
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 kworker/0:4     D ffff8802a265ba38 13344  1468      2 0x00000000
 Workqueue: kacpid acpi_os_execute_deferred
 ffff8802a265ba38 ffff8802a265ba00 ffffffff81130200 ffffffff81e0d580
 ffff88029e5eb340 ffff8802a265c000 ffff88029d69d000 ffff88029e5eb340
 ffffffff818c1b8d ffff8802b64e8758 ffff8802a265ba50 ffffffff818bdfcc
 Call Trace:
 [<ffffffff81130200>] ? test_callback+0x10/0x30
 [<ffffffff818c1b8d>] ? __down_timeout+0x5d/0xd0
 [<ffffffff818bdfcc>] schedule+0x3c/0x90
 [<ffffffff818c2d60>] schedule_timeout+0x210/0x360
 [<ffffffff8103fc89>] ? sched_clock+0x9/0x10
 [<ffffffff810ee51c>] ? local_clock+0x1c/0x20
 [<ffffffff81110c16>] ? mark_held_locks+0x76/0xa0
 [<ffffffff818c3cfc>] ? _raw_spin_unlock_irq+0x2c/0x40
 [<ffffffff818c1b8d>] ? __down_timeout+0x5d/0xd0
 [<ffffffff81110d35>] ? trace_hardirqs_on_caller+0xf5/0x1b0
 [<ffffffff818c1b8d>] ? __down_timeout+0x5d/0xd0
 [<ffffffff818c1bac>] __down_timeout+0x7c/0xd0
 [<ffffffff818c44b2>] ? _raw_spin_lock_irqsave+0x82/0x90
 [<ffffffff8110b87c>] down_timeout+0x4c/0x60
 [<ffffffff814e3a9c>] acpi_os_wait_semaphore+0xaa/0x16a
 [<ffffffff81510f37>] acpi_ex_system_wait_mutex+0x81/0xfa
 [<ffffffff814f8796>] acpi_ds_begin_method_execution+0x25a/0x373
 [<ffffffff814f8cea>] acpi_ds_call_control_method+0x107/0x2e0
 [<ffffffff8151ed60>] acpi_ps_parse_aml+0x177/0x495
 [<ffffffff8151fa46>] acpi_ps_execute_method+0x1f7/0x2b9
 [<ffffffff81516c9a>] acpi_ns_evaluate+0x2ee/0x435
 [<ffffffff814ff84a>] acpi_ev_asynch_execute_gpe_method+0xbd/0x159
 [<ffffffff814e2a69>] acpi_os_execute_deferred+0x17/0x23
 [<ffffffff810d1fc2>] process_one_work+0x242/0x700
 [<ffffffff810d1f3a>] ? process_one_work+0x1ba/0x700
 [<ffffffff810d24ce>] worker_thread+0x4e/0x490
 [<ffffffff810d2480>] ? process_one_work+0x700/0x700
 [<ffffffff810d2480>] ? process_one_work+0x700/0x700
 [<ffffffff810d98b1>] kthread+0x101/0x120
 [<ffffffff81110d35>] ? trace_hardirqs_on_caller+0xf5/0x1b0
 [<ffffffff818c4832>] ret_from_fork+0x22/0x50
 [<ffffffff810d97b0>] ? kthread_create_on_node+0x250/0x250
 2 locks held by kworker/0:4/1468:
 #0:  ("kacpid"){.+.+.+}, at: [<ffffffff810d1f3a>] process_one_work+0x1ba/0x700
 #1:  ((&dpc->work)){+.+.+.}, at: [<ffffffff810d1f3a>] process_one_work+0x1ba/0x700

The issue appears to be that the kworker thread attempts to acquire the
_E42 method's mutex twice when executing  acpi_ps_execute_method() and
recursing through the entry method.

The current code does take the possiblity of this recursion into account,
however, it is only for the case where the walk_state has been populated.

This can be fixed by setting the thread id in the !walk_state case to
allow for recursion.

Cc: Robert Moore <robert.moore@intel.com>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: linux-acpi@vger.kernel.org
Cc: Mario_Limonciello@dell.com
Cc: stable@vger.kernel.org
Cc: devel@acpica.org
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
---
 drivers/acpi/acpica/dsmethod.c |    3 +++
 1 file changed, 3 insertions(+)

Comments

Rafael J. Wysocki April 27, 2016, 12:33 p.m. UTC | #1
On Wed, Apr 27, 2016 at 2:26 PM, Prarit Bhargava <prarit@redhat.com> wrote:
> Rafael, this patch is in the acpica.git tree as 7a3bd2d ("Dispatcher: Update
> thread ID for recursive method calls").  I've had many positive testing
> results from hardware vendors and users with this patch and this resolves
> many of the problems seen here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=115121
>
> This does not fix the problems with the TB docking station.  Although this
> patch will also be required, the docking station issues require a FW update.
> Updated FW should be coming soon to resolve those problems.

Lv is going to send me patches for the current ACPICA release shortly.

I'll pick up this one from that series.  Hopefully, that's not a problem.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mario Limonciello April 27, 2016, 1:07 p.m. UTC | #2
PiBPbiBXZWQsIEFwciAyNywgMjAxNiBhdCAyOjI2IFBNLCBQcmFyaXQgQmhhcmdhdmEgPHByYXJp
dEByZWRoYXQuY29tPiB3cm90ZToNCj4gPiBSYWZhZWwsIHRoaXMgcGF0Y2ggaXMgaW4gdGhlIGFj
cGljYS5naXQgdHJlZSBhcyA3YTNiZDJkICgiRGlzcGF0Y2hlcjoNCj4gPiBVcGRhdGUgdGhyZWFk
IElEIGZvciByZWN1cnNpdmUgbWV0aG9kIGNhbGxzIikuICBJJ3ZlIGhhZCBtYW55IHBvc2l0aXZl
DQo+ID4gdGVzdGluZyByZXN1bHRzIGZyb20gaGFyZHdhcmUgdmVuZG9ycyBhbmQgdXNlcnMgd2l0
aCB0aGlzIHBhdGNoIGFuZA0KPiA+IHRoaXMgcmVzb2x2ZXMgbWFueSBvZiB0aGUgcHJvYmxlbXMg
c2VlbiBoZXJlOg0KPiA+DQo+ID4gaHR0cHM6Ly9idWd6aWxsYS5rZXJuZWwub3JnL3Nob3dfYnVn
LmNnaT9pZD0xMTUxMjENCj4gPg0KPiA+IFRoaXMgZG9lcyBub3QgZml4IHRoZSBwcm9ibGVtcyB3
aXRoIHRoZSBUQiBkb2NraW5nIHN0YXRpb24uICBBbHRob3VnaA0KPiA+IHRoaXMgcGF0Y2ggd2ls
bCBhbHNvIGJlIHJlcXVpcmVkLCB0aGUgZG9ja2luZyBzdGF0aW9uIGlzc3VlcyByZXF1aXJlIGEg
RlcgdXBkYXRlLg0KPiA+IFVwZGF0ZWQgRlcgc2hvdWxkIGJlIGNvbWluZyBzb29uIHRvIHJlc29s
dmUgdGhvc2UgcHJvYmxlbXMuDQo+IA0KPiBMdiBpcyBnb2luZyB0byBzZW5kIG1lIHBhdGNoZXMg
Zm9yIHRoZSBjdXJyZW50IEFDUElDQSByZWxlYXNlIHNob3J0bHkuDQo+IA0KPiBJJ2xsIHBpY2sg
dXAgdGhpcyBvbmUgZnJvbSB0aGF0IHNlcmllcy4gIEhvcGVmdWxseSwgdGhhdCdzIG5vdCBhIHBy
b2JsZW0uDQo+IA0KDQpSYWZhZWwsDQoNCkFueSBjaGFuY2UgdGhpcyBjYW4gc3RpbGwgbWFrZSA0
LjYgYW5kIGFsc28gd291bGQgYmUgYXBwcm9wcmlhdGUgZm9yIHN0YWJsZT8gIA0K
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andy Lutomirski April 28, 2016, 6:38 p.m. UTC | #3
On 04/27/2016 05:26 AM, Prarit Bhargava wrote:
> Rafael, this patch is in the acpica.git tree as 7a3bd2d ("Dispatcher: Update
> thread ID for recursive method calls").  I've had many positive testing
> results from hardware vendors and users with this patch and this resolves
> many of the problems seen here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=115121
>
> This does not fix the problems with the TB docking station.  Although this
> patch will also be required, the docking station issues require a FW update.
> Updated FW should be coming soon to resolve those problems.
>

No kidding!?!  Will that update be installable from Linux?  If so, could 
you cc me on that?

> P.
>
> ----8<----
>
> The following hung task trace is seen when hotplugging
> an ethernet dongle in a Thunderbolt port on Linux.
>
> INFO: task kworker/0:4:1468 blocked for more than 120 seconds.
>       Tainted: G        W       4.6.0-rc1+ #1
>  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>  kworker/0:4     D ffff8802a265ba38 13344  1468      2 0x00000000
>  Workqueue: kacpid acpi_os_execute_deferred
>  ffff8802a265ba38 ffff8802a265ba00 ffffffff81130200 ffffffff81e0d580
>  ffff88029e5eb340 ffff8802a265c000 ffff88029d69d000 ffff88029e5eb340
>  ffffffff818c1b8d ffff8802b64e8758 ffff8802a265ba50 ffffffff818bdfcc
>  Call Trace:
>  [<ffffffff81130200>] ? test_callback+0x10/0x30
>  [<ffffffff818c1b8d>] ? __down_timeout+0x5d/0xd0
>  [<ffffffff818bdfcc>] schedule+0x3c/0x90
>  [<ffffffff818c2d60>] schedule_timeout+0x210/0x360
>  [<ffffffff8103fc89>] ? sched_clock+0x9/0x10
>  [<ffffffff810ee51c>] ? local_clock+0x1c/0x20
>  [<ffffffff81110c16>] ? mark_held_locks+0x76/0xa0
>  [<ffffffff818c3cfc>] ? _raw_spin_unlock_irq+0x2c/0x40
>  [<ffffffff818c1b8d>] ? __down_timeout+0x5d/0xd0
>  [<ffffffff81110d35>] ? trace_hardirqs_on_caller+0xf5/0x1b0
>  [<ffffffff818c1b8d>] ? __down_timeout+0x5d/0xd0
>  [<ffffffff818c1bac>] __down_timeout+0x7c/0xd0
>  [<ffffffff818c44b2>] ? _raw_spin_lock_irqsave+0x82/0x90
>  [<ffffffff8110b87c>] down_timeout+0x4c/0x60
>  [<ffffffff814e3a9c>] acpi_os_wait_semaphore+0xaa/0x16a
>  [<ffffffff81510f37>] acpi_ex_system_wait_mutex+0x81/0xfa
>  [<ffffffff814f8796>] acpi_ds_begin_method_execution+0x25a/0x373
>  [<ffffffff814f8cea>] acpi_ds_call_control_method+0x107/0x2e0
>  [<ffffffff8151ed60>] acpi_ps_parse_aml+0x177/0x495
>  [<ffffffff8151fa46>] acpi_ps_execute_method+0x1f7/0x2b9
>  [<ffffffff81516c9a>] acpi_ns_evaluate+0x2ee/0x435
>  [<ffffffff814ff84a>] acpi_ev_asynch_execute_gpe_method+0xbd/0x159
>  [<ffffffff814e2a69>] acpi_os_execute_deferred+0x17/0x23
>  [<ffffffff810d1fc2>] process_one_work+0x242/0x700
>  [<ffffffff810d1f3a>] ? process_one_work+0x1ba/0x700
>  [<ffffffff810d24ce>] worker_thread+0x4e/0x490
>  [<ffffffff810d2480>] ? process_one_work+0x700/0x700
>  [<ffffffff810d2480>] ? process_one_work+0x700/0x700
>  [<ffffffff810d98b1>] kthread+0x101/0x120
>  [<ffffffff81110d35>] ? trace_hardirqs_on_caller+0xf5/0x1b0
>  [<ffffffff818c4832>] ret_from_fork+0x22/0x50
>  [<ffffffff810d97b0>] ? kthread_create_on_node+0x250/0x250
>  2 locks held by kworker/0:4/1468:
>  #0:  ("kacpid"){.+.+.+}, at: [<ffffffff810d1f3a>] process_one_work+0x1ba/0x700
>  #1:  ((&dpc->work)){+.+.+.}, at: [<ffffffff810d1f3a>] process_one_work+0x1ba/0x700
>
> The issue appears to be that the kworker thread attempts to acquire the
> _E42 method's mutex twice when executing  acpi_ps_execute_method() and
> recursing through the entry method.
>
> The current code does take the possiblity of this recursion into account,
> however, it is only for the case where the walk_state has been populated.
>
> This can be fixed by setting the thread id in the !walk_state case to
> allow for recursion.

Tested-by: Andy Lutomirski <luto@kernel.org> # On a Dell XPS 13 9350

I don't care at all what variant of this patch is applied, but it would 
be nice to get it in for 4.6.  This fixes hotplug for me and I suspect 
it also fixes some occasional suspend/resume failures I've seen.

--Andy


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mario Limonciello April 28, 2016, 6:44 p.m. UTC | #4
> -----Original Message-----
> From: Andy Lutomirski [mailto:luto@kernel.org]
> Sent: Thursday, April 28, 2016 1:39 PM
> To: Prarit Bhargava <prarit@redhat.com>; linux-kernel@vger.kernel.org
> Cc: Robert Moore <robert.moore@intel.com>; Lv Zheng
> <lv.zheng@intel.com>; Rafael J. Wysocki <rafael.j.wysocki@intel.com>; Len
> Brown <lenb@kernel.org>; linux-acpi@vger.kernel.org; Limonciello, Mario
> <Mario_Limonciello@Dell.com>; stable@vger.kernel.org; devel@acpica.org
> Subject: Re: [PATCH] ACPI: fix Thunderbolt hotplug
> 
> On 04/27/2016 05:26 AM, Prarit Bhargava wrote:
> > Rafael, this patch is in the acpica.git tree as 7a3bd2d ("Dispatcher:
> > Update thread ID for recursive method calls").  I've had many positive
> > testing results from hardware vendors and users with this patch and
> > this resolves many of the problems seen here:
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=115121
> >
> > This does not fix the problems with the TB docking station.  Although
> > this patch will also be required, the docking station issues require a FW
> update.
> > Updated FW should be coming soon to resolve those problems.
> >
> 
> No kidding!?!  Will that update be installable from Linux?  If so, could you cc
> me on that?
> 


This patch does let the dock hotplug properly after the BIOS fix is applied.

For the dock devices not showing up on the other side of the thunderbolt bridge:
It's going to be a system firmware update for both the XPS 9350 and Precision 5510.  It will get distributed to LVFS and can be loaded up with fwupd/fwupdate.  
I don't yet have an ETA that I can share other than "Next BIOS release"

After the update, set Thunderbolt Security to "No Security" and devices should be available.

Thanks,

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andy Lutomirski April 28, 2016, 6:46 p.m. UTC | #5
On Thu, Apr 28, 2016 at 11:44 AM,  <Mario_Limonciello@dell.com> wrote:
>
>
>> -----Original Message-----
>> From: Andy Lutomirski [mailto:luto@kernel.org]
>> Sent: Thursday, April 28, 2016 1:39 PM
>> To: Prarit Bhargava <prarit@redhat.com>; linux-kernel@vger.kernel.org
>> Cc: Robert Moore <robert.moore@intel.com>; Lv Zheng
>> <lv.zheng@intel.com>; Rafael J. Wysocki <rafael.j.wysocki@intel.com>; Len
>> Brown <lenb@kernel.org>; linux-acpi@vger.kernel.org; Limonciello, Mario
>> <Mario_Limonciello@Dell.com>; stable@vger.kernel.org; devel@acpica.org
>> Subject: Re: [PATCH] ACPI: fix Thunderbolt hotplug
>>
>> On 04/27/2016 05:26 AM, Prarit Bhargava wrote:
>> > Rafael, this patch is in the acpica.git tree as 7a3bd2d ("Dispatcher:
>> > Update thread ID for recursive method calls").  I've had many positive
>> > testing results from hardware vendors and users with this patch and
>> > this resolves many of the problems seen here:
>> >
>> > https://bugzilla.kernel.org/show_bug.cgi?id=115121
>> >
>> > This does not fix the problems with the TB docking station.  Although
>> > this patch will also be required, the docking station issues require a FW
>> update.
>> > Updated FW should be coming soon to resolve those problems.
>> >
>>
>> No kidding!?!  Will that update be installable from Linux?  If so, could you cc
>> me on that?
>>
>
>
> This patch does let the dock hotplug properly after the BIOS fix is applied.
>
> For the dock devices not showing up on the other side of the thunderbolt bridge:
> It's going to be a system firmware update for both the XPS 9350 and Precision 5510.  It will get distributed to LVFS and can be loaded up with fwupd/fwupdate.
> I don't yet have an ETA that I can share other than "Next BIOS release"
>
> After the update, set Thunderbolt Security to "No Security" and devices should be available.

Great!

Does this mean that I won't need to boot into Windows and run the
little Intel tool to update the Alpine Ridge firmware?  I.e. will the
LVFS / EFI executable do that part?

--Andy

>
> Thanks,
>
Mario Limonciello April 28, 2016, 6:49 p.m. UTC | #6
> -----Original Message-----

> From: Andy Lutomirski [mailto:luto@amacapital.net]

> Sent: Thursday, April 28, 2016 1:47 PM

> To: Limonciello, Mario <Mario_Limonciello@Dell.com>

> Cc: Andrew Lutomirski <luto@kernel.org>; Prarit Bhargava

> <prarit@redhat.com>; linux-kernel@vger.kernel.org; Robert Moore

> <robert.moore@intel.com>; Lv Zheng <lv.zheng@intel.com>; Rafael J.

> Wysocki <rafael.j.wysocki@intel.com>; Len Brown <lenb@kernel.org>; Linux

> ACPI <linux-acpi@vger.kernel.org>; stable <stable@vger.kernel.org>;

> devel@acpica.org

> Subject: Re: [PATCH] ACPI: fix Thunderbolt hotplug

> 

> On Thu, Apr 28, 2016 at 11:44 AM,  <Mario_Limonciello@dell.com> wrote:

> >

> >

> >> -----Original Message-----

> >> From: Andy Lutomirski [mailto:luto@kernel.org]

> >> Sent: Thursday, April 28, 2016 1:39 PM

> >> To: Prarit Bhargava <prarit@redhat.com>; linux-kernel@vger.kernel.org

> >> Cc: Robert Moore <robert.moore@intel.com>; Lv Zheng

> >> <lv.zheng@intel.com>; Rafael J. Wysocki <rafael.j.wysocki@intel.com>;

> >> Len Brown <lenb@kernel.org>; linux-acpi@vger.kernel.org; Limonciello,

> >> Mario <Mario_Limonciello@Dell.com>; stable@vger.kernel.org;

> >> devel@acpica.org

> >> Subject: Re: [PATCH] ACPI: fix Thunderbolt hotplug

> >>

> >> On 04/27/2016 05:26 AM, Prarit Bhargava wrote:

> >> > Rafael, this patch is in the acpica.git tree as 7a3bd2d ("Dispatcher:

> >> > Update thread ID for recursive method calls").  I've had many

> >> > positive testing results from hardware vendors and users with this

> >> > patch and this resolves many of the problems seen here:

> >> >

> >> > https://bugzilla.kernel.org/show_bug.cgi?id=115121

> >> >

> >> > This does not fix the problems with the TB docking station.

> >> > Although this patch will also be required, the docking station

> >> > issues require a FW

> >> update.

> >> > Updated FW should be coming soon to resolve those problems.

> >> >

> >>

> >> No kidding!?!  Will that update be installable from Linux?  If so,

> >> could you cc me on that?

> >>

> >

> >

> > This patch does let the dock hotplug properly after the BIOS fix is applied.

> >

> > For the dock devices not showing up on the other side of the thunderbolt

> bridge:

> > It's going to be a system firmware update for both the XPS 9350 and

> Precision 5510.  It will get distributed to LVFS and can be loaded up with

> fwupd/fwupdate.

> > I don't yet have an ETA that I can share other than "Next BIOS release"

> >

> > After the update, set Thunderbolt Security to "No Security" and devices

> should be available.

> 

> Great!

> 

> Does this mean that I won't need to boot into Windows and run the little

> Intel tool to update the Alpine Ridge firmware?  I.e. will the LVFS / EFI

> executable do that part?

> 


Unfortunately that's a separate problem.  Working towards a few different solutions on that too, but nothing to share right now.
diff mbox

Patch

diff --git a/drivers/acpi/acpica/dsmethod.c b/drivers/acpi/acpica/dsmethod.c
index 1982310..93799db 100644
--- a/drivers/acpi/acpica/dsmethod.c
+++ b/drivers/acpi/acpica/dsmethod.c
@@ -428,6 +428,9 @@  acpi_ds_begin_method_execution(struct acpi_namespace_node *method_node,
 				obj_desc->method.mutex->mutex.
 				    original_sync_level =
 				    obj_desc->method.mutex->mutex.sync_level;
+
+				obj_desc->method.mutex->mutex.thread_id =
+					acpi_os_get_thread_id();
 			}
 		}