From patchwork Tue Jun 23 02:20:29 2009 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Rui X-Patchwork-Id: 31903 Received: from vger.kernel.org (vger.kernel.org [209.132.176.167]) by demeter.kernel.org (8.14.2/8.14.2) with ESMTP id n5N2JnmS025009 for ; Tue, 23 Jun 2009 02:19:49 GMT Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752005AbZFWCTp (ORCPT ); Mon, 22 Jun 2009 22:19:45 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1752269AbZFWCTp (ORCPT ); Mon, 22 Jun 2009 22:19:45 -0400 Received: from mga02.intel.com ([134.134.136.20]:20146 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752005AbZFWCTo (ORCPT ); Mon, 22 Jun 2009 22:19:44 -0400 Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga101.jf.intel.com with ESMTP; 22 Jun 2009 19:10:29 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.42,272,1243839600"; d="scan'208";a="424446537" Received: from rzhang-dt.sh.intel.com (HELO [10.239.36.94]) ([10.239.36.94]) by orsmga002.jf.intel.com with ESMTP; 22 Jun 2009 19:26:58 -0700 Subject: Re: [PATCH 2/5] fix a deadlock in hotplug case From: Zhang Rui To: Bjorn Helgaas Cc: "lenb@kernel.org" , "linux-acpi@vger.kernel.org" In-Reply-To: <200906221554.07792.bjorn.helgaas@hp.com> References: <1245641478-31805-1-git-send-email-rui.zhang@intel.com> <1245641478-31805-2-git-send-email-rui.zhang@intel.com> <200906221554.07792.bjorn.helgaas@hp.com> Date: Tue, 23 Jun 2009 10:20:29 +0800 Message-Id: <1245723629.15520.52.camel@rzhang-dt> Mime-Version: 1.0 X-Mailer: Evolution 2.22.1 (2.22.1-2.fc9) Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org On Tue, 2009-06-23 at 05:54 +0800, Bjorn Helgaas wrote: > On Sunday 21 June 2009 09:31:15 pm Zhang Rui wrote: > > we used to run the hotplug code in keventd_wq. > > But when hot removing the ACPI battery device, > > power_supply_unregister invokes flush_scheduled_work. > > This causes a deadlock. > > Introduce a new workqueue for hotplug in this patch. > > http://bugzilla.kernel.org/show_bug.cgi?id=13533 > > Can you be specific about how the deadlock occurs? That will > make it easier to make sure we don't re-introduce the deadlock > if this design is ever changed in the future. And I will probably > try to change this design, so this is really a self-serving > request :-) > > I suspect the deadlock is something like this: > > > acpi_bus_notify > blocking_notifier_call_chain > acpi_dock_notifier_call > acpi_os_hotplug_execute(acpi_dock_deferred_cb, ...) > __acpi_os_execute(..., acpi_dock_deferred_cb, ...) > schedule_work() > queue_work(keventd_wq, ) > > > worker_thread(keventd_wq) > acpi_os_execute_hp_deferred > acpi_dock_deferred_cb > dock_notify > hotplug_dock_devices > dock_remove_acpi_device > acpi_bus_trim > acpi_bus_remove > device_release_driver > acpi_device_remove > acpi_battery_remove > sysfs_remove_battery > power_supply_unregister > flush_scheduled_work > flush_workqueue(keventd_wq) > > Now we're waiting for keventd_wq to be flushed, but it won't be > considered flushed until acpi_os_execute_hp_deferred() completes. > exactly. > This sort of analysis is obviously way too long and too specific > for a source code comment, but it would be nice to have it in the > bugzilla, at least. > > And maybe the changelog could mention that flush_scheduled_work() > applies to keventd_wq, just to close the loop there. > > The source code comment: > > > +/* > > + * run the hotplug code in a seperate workqueue > > + * to avoid the deadlock issue > > + */ > > should be more specific about what "the deadlock issue" is. > Maybe something like: > > /* > * We can't run hotplug code in keventd_wq because the hotplug > * code may call driver .remove() functions, which often use > * flush_scheduled_work() to wait for keventd_wq to be flushed. > */ > > And it might make more sense to put the comment where the queue > is used, i.e., in __acpi_os_execute(), rather than at the definition. > good point. Refreshed patch attached. we used to run the hotplug code in keventd_wq. But when hot removing the ACPI battery device, power_supply_unregister invokes flush_scheduled_work. This causes a deadlock. i.e 1. When dock is unplugged, all the hotplug code is run on kevent_wq. 2. the hotplug code removes all the child devices of dock device. 3. removing the child device may invoke flush_scheduled_work 4. flush_scheduled_work waits until all the work on kevent_wq to be finished, while this will never be true because the hotplug code is running on keventd_wq... Introduce a new workqueue for hotplug in this patch. http://bugzilla.kernel.org/show_bug.cgi?id=13533 Tested-by: Paul Martin Tested-by: Vojtech Gondzala Signed-off-by: Zhang Rui Reviewed-by: Bjorn Helgaas --- drivers/acpi/osl.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Index: linux-2.6/drivers/acpi/osl.c =================================================================== --- linux-2.6.orig/drivers/acpi/osl.c +++ linux-2.6/drivers/acpi/osl.c @@ -79,6 +79,7 @@ static acpi_osd_handler acpi_irq_handler static void *acpi_irq_context; static struct workqueue_struct *kacpid_wq; static struct workqueue_struct *kacpi_notify_wq; +static struct workqueue_struct *kacpi_hotplug_wq; struct acpi_res_list { resource_size_t start; @@ -192,8 +193,10 @@ acpi_status acpi_os_initialize1(void) { kacpid_wq = create_singlethread_workqueue("kacpid"); kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify"); + kacpi_hotplug_wq = create_singlethread_workqueue("kacpi_hotplug"); BUG_ON(!kacpid_wq); BUG_ON(!kacpi_notify_wq); + BUG_ON(!kacpi_hotplug_wq); return AE_OK; } @@ -206,6 +209,7 @@ acpi_status acpi_os_terminate(void) destroy_workqueue(kacpid_wq); destroy_workqueue(kacpi_notify_wq); + destroy_workqueue(kacpi_hotplug_wq); return AE_OK; } @@ -716,6 +720,7 @@ static acpi_status __acpi_os_execute(acp acpi_status status = AE_OK; struct acpi_os_dpc *dpc; struct workqueue_struct *queue; + work_func_t func; int ret; ACPI_DEBUG_PRINT((ACPI_DB_EXEC, "Scheduling function [%p(%p)] for deferred execution.\n", @@ -740,15 +745,17 @@ static acpi_status __acpi_os_execute(acp dpc->function = function; dpc->context = context; - if (!hp) { - INIT_WORK(&dpc->work, acpi_os_execute_deferred); - queue = (type == OSL_NOTIFY_HANDLER) ? - kacpi_notify_wq : kacpid_wq; - ret = queue_work(queue, &dpc->work); - } else { - INIT_WORK(&dpc->work, acpi_os_execute_hp_deferred); - ret = schedule_work(&dpc->work); - } + /* + * We can't run hotplug code in keventd_wq/kacpid_wq/kacpid_notify_wq + * because the hotplug code may call driver .remove() functions, + * which invoke flush_scheduled_work/acpi_os_wait_events_complete + * to flush these workqueues. + */ + queue = hp ? kacpi_hotplug_wq : + (type == OSL_NOTIFY_HANDLER ? kacpi_notify_wq : kacpid_wq); + func = hp ? acpi_os_execute_hp_deferred : acpi_os_execute_deferred; + INIT_WORK(&dpc->work, func); + ret = queue_work(queue, &dpc->work); if (!ret) { printk(KERN_ERR PREFIX